Merge Theory for LLMs

The Challenge: Why not just average AI models?

In many real problems, multiple distinct inputs can produce the same output (think several diagnoses explaining the same symptoms). When that many-to-one structure shows up, the assumptions behind common merging tricks (including simple weight averaging) break, and you can get systematic errors.

Traditional approaches like weight-space averaging (model soups, Fisher-weighted averaging) and task arithmetic work well in some cases, but fail when non-injective mappings dominate. My research introduces Merge Theory—a set-valued framework that keeps all plausible explanations and then uses consensus + uncertainty to decide how to act.

The Core Insight: Embracing Multiple Solutions

Instead of forcing models to agree on a single answer, Merge Theory recognizes that multiple valid solutions can coexist. We formalize this through:

1. Set-Valued Predictors

Rather than outputting a single answer, our merged model returns a calibrated set of possibilities:

# Traditional approaches (can be problematic with many-to-one mappings)
def merge_weights(models):
    # Weight averaging (model soups, Fisher-weighted, etc.)
    return average_parameters([m.weights for m in models])

def merge_probabilities(models, input):
    # Simple probability averaging
    return average([m.predict(input) for m in models])

# Merge Theory approach
def merge_setvalued(models, input, threshold=0.3):
    predictions = {}
    for model in models:
        for answer, prob in model.predict_proba(input):
            if prob > threshold:
                predictions[answer] = predictions.get(answer, 0) + prob
    return {k: v for k, v in predictions.items() if v > threshold}

2. The Merge Degree - Quantifying Uncertainty

A key innovation is the merge degree - a measure of how many different inputs map to the same output. Higher merge degree = higher uncertainty.

Merge Degree Formula: m_f(y) = |f^(-1)(y)|

Where f^(-1)(y) is the set of all inputs that produce output y

More precisely, under exchangeable priors, the maximum posterior confidence equals exactly 1/m_f(y) - confidence shrinks as 1/merge degree. This gives us a principled bound: when many causes lead to the same effect, confidence in any single cause must decrease proportionally.

The Mathematics: Five Proven Guarantees

Our framework provides rigorous theoretical foundations with five key theorems:

Theorem 1: Threshold Optimality

Selecting predictions above a probability threshold λ minimizes regularized risk.

Theorem 2: Top-k Bayes Optimality

When constrained to k predictions, choosing the k highest probabilities minimizes Bayes risk.

Theorem 3: Uncertainty Bounds

Maximum confidence = 1/merge_degree under exchangeable priors.

Theorem 4: Consensus Gains

With independent reasoning chains and accuracy p > 1/2 (above chance), error decreases exponentially with agreement at rate 2(p - 1/2)².

Theorem 5: Proper Scoring & Coverage

Posterior distributions uniquely minimize all proper scoring rules, with minimal sets for target coverage.

The Algorithms: MERGE and SPREAD

Forward MERGE Algorithm

Aggregates multiple reasoning chains into semantic clusters:

def forward_merge(prompt, model, k=8):
    # Generate k reasoning chains in parallel
    chains = model.generate_batch(prompt, k, temperature=0.7)

    # Cluster semantically similar responses
    clusters = semantic_cluster(chains, similarity_threshold=0.95)

    # Score each cluster by consensus
    for cluster in clusters:
        cluster.score = log_sum_exp([
            chain.logprob/chain.length
            for chain in cluster.members
        ])

    # Calculate merge degree (uncertainty)
    merge_degree = effective_cluster_count(clusters)

    return best_cluster.answer, merge_degree

Backward SPREAD Algorithm

Enumerates high-probability explanations for observations:

def backward_spread(observation, model, k=None, threshold=None):
    explanations = []

    # Generate diverse explanations
    for temperature in [0.3, 0.5, 0.7, 0.9]:
        batch = model.explain_batch(observation, k/4, temperature)
        explanations.extend(batch)

    # Verify and score
    verified = []
    for exp in explanations:
        if verifier.check(exp, observation):
            verified.append((exp, model.score(exp)))

    # Select top-k or threshold
    if k:
        selected = top_k(verified, k)
    else:
        selected = [e for e in verified if e.score > threshold]

    return semantic_deduplicate(selected)

Real-World Applications

Medical Diagnosis

Multiple diseases presenting identical symptoms - the framework provides calibrated differential diagnoses with confidence bounds.

Multi-hop Reasoning

When multiple independent reasoning paths converge to the same conclusion, consensus voting achieves exponential error reduction (under independence and accuracy > 1/2).

Information Retrieval

Handling near-duplicates and paraphrases while maintaining diversity in search results.

Key Innovations

Non-Convexity Recognition: Traditional weight averaging assumes convex solution spaces. Set-valued approaches handle disconnected solution regions.
Consensus as Regularization: With independent models and accuracy above chance (p > 1/2), agreement between models acts as implicit regularization with exponential error reduction at rate 2(p-1/2)².
Principled Uncertainty: Merge degree provides calibrated uncertainty—maximum posterior confidence equals exactly 1/m_f(y) under exchangeable priors.

Practical Impact

Model Selection: Consensus gain metric identifies compatible models before merging
Adaptive Merging: Input-dependent weights allow dynamic expertise selection
Continual Learning: Natural extension to incremental model integration

Implementation Guidelines

GPU Optimization

Batch k=4-8 chains for parallelization
KV cache reuse across prompt variants
Early exit at 75% consensus threshold

Diversity Injection

Temperature variation: T ∈ [0.3, 0.9]
Random seed permutation
Avoid beam search (kills diversity)

Threshold Calibration

Plot risk-coverage curves on validation data to select optimal λ for target coverage α.

When to Use Merge Theory

Effective Domains

Abductive reasoning (many causes → one effect)
Medical/fault diagnosis (symptom aliasing)
Multi-hop QA (path convergence)
Mathematical problems (alternative solutions)

Limitations

Injective tasks (no merging possible)
Poor base accuracy (p ≤ 0.5)
Highly correlated chains
Infinite input spaces

Conclusion

Merge Theory provides the mathematical foundations for principled model combination, moving beyond ad-hoc merging and simple averaging approaches to handle the complex realities of non-injective mappings. By embracing set-valued predictions and leveraging consensus gains, we can build more robust and reliable AI systems.

The framework bridges formal guarantees with practical engineering, enabling reliable combination of diverse model capabilities as specialized LLMs proliferate.

How to Cite

If you use this work in your research, please cite:

Shaddelan, M. (2025). Merge Theory for LLMs: Set-Valued Inverses, Consensus Gains, and Uncertainty. Zenodo. https://doi.org/10.5281/zenodo.16884687

BibTeX

@misc{shaddelan2025merge,
  author       = {Shaddelan, Milad},
  title        = {Merge Theory for LLMs: Set-Valued Inverses,
                  Consensus Gains, and Uncertainty},
  year         = {2025},
  publisher    = {Zenodo},
  doi          = {10.5281/zenodo.16884687},
  url          = {https://doi.org/10.5281/zenodo.16884687}
}

Resources

Paper (Official Version): Zenodo DOI: 10.5281/zenodo.16884687
Full PDF: Download Paper
Code Implementation: GitHub Repository (Currently private - will be made public after testing and experiments are complete)
Contact: [email protected]

Note: The Zenodo DOI represents the official archived version of this paper. Please use it for citations. The code repository is temporarily private while experiments and testing are being finalized.

This research was presented in August 2025. The theoretical foundations establish conditions for optimal set selection, characterize uncertainty bounds, and prove convergence properties for model merging in non-injective domains.