A Mathematical Proof for Repeatable Success:
Iterative Improvement as a Deterministic Path to Probabilistic Certainty
Including Diminishing Returns and Asymptotic Ceiling Analysis
Working Paper • Revised Draft
Abstract
This paper presents a formal mathematical framework demonstrating that persistent iterative effort, combined with marginal compounding improvement, transforms even vanishingly improbable outcomes into near-certainties. We begin with an idealized model showing that an agent with an initial success probability of one in one million (p₀ = 10⁻⁶) who improves at a constant 1% per iteration reaches 50% cumulative success probability by iteration 888 and 99% by iteration 1,077. Without improvement, the same agent would need 693,147 trials to reach 50% and 4,605,168 trials to reach 99%.
Critically, we then extend the analysis to four diminishing-returns models that address the most natural objection to the idealized case: that improvement becomes harder over time. Under a mild hyperbolic decay model, 99% cumulative success is reached by iteration 1,683. Under a logistic model where per-trial probability can never exceed 15%—a severely pessimistic ceiling—cumulative success still reaches 99% by iteration 1,088. The central finding is robust: even under the harshest realistic constraints on improvement, the cumulative compounding of many imperfect attempts overwhelms the difficulty of any individual attempt. We formalize this as the Iterative Dominance Theorem and argue that the primary determinant of success is not initial probability, path selection, or talent, but the commitment to sustained iteration with any positive rate of improvement.
1. Introduction
1.1 The Persistence Paradox
Conventional wisdom in decision theory, career planning, and entrepreneurship emphasizes path selection—choosing the right strategy, market, or domain. Under this paradigm, failure is attributed to having chosen the wrong path, and the remedy is to abandon the current trajectory and search for a better one. This paper challenges that assumption. We argue, and prove mathematically, that persistent iteration with compounding improvement is sufficient to overcome arbitrarily unfavorable initial conditions.
1.2 The Diminishing Returns Objection
The most immediate critique of any compounding-improvement argument is that real-world improvement is not constant. Going from terrible to mediocre is easy; going from good to great is hard; going from great to world-class may be nearly impossible. Skill acquisition follows learning curves. Markets saturate. Low-hanging fruit gets picked. Any credible mathematical framework for persistence must grapple with this reality.
We address this objection head-on by analyzing five models spanning the full spectrum from idealized to severely pessimistic. Our central finding is that the core thesis—persistent iteration converges to success—survives even the harshest diminishing-returns assumptions. The reason is subtle but powerful: what matters for cumulative success is not that any single trial achieves high probability, but that the aggregate probability mass accumulated across all trials approaches certainty. Diminishing returns slow the per-trial trajectory but cannot prevent the cumulative sum from diverging.
1.3 Core Thesis
We formalize the following claim: Given any initial success probability p₀ > 0, any improvement trajectory that is positive for a sufficient number of iterations, and any target level below 1, there exists a finite number of iterations N at which the cumulative probability of at least one success exceeds that target. Furthermore, N is dramatically smaller than the brute-force baseline under all tested models, including those with severe diminishing returns.
2. Formal Framework
2.1 The General Model
Let p₀ ∈ (0, 1) denote the initial success probability. Let r(n) ≥ 0 denote the improvement rate at iteration n, which may vary. The per-trial success probability at iteration n is:
p(n) = min(1, p₀ · ∏ᵢ₌₀ⁿ⁻¹ (1 + r(i)))
The cumulative probability of at least one success in N trials is:
P(N) = 1 − ∏ᵢ₌₀ᴺ⁻¹ (1 − p(i))
This general framework accommodates constant improvement, decaying improvement, logistic ceilings, and the brute-force baseline (r(n) = 0 for all n) as special cases.
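The two definitions above translate almost line for line into code. The following is a minimal sketch, not the authors' implementation: the function names (`per_trial_probs`, `cumulative_success`, `trials_to_reach`) are illustrative assumptions, and the improvement rate is passed in as an arbitrary callable r(n).

```python
from typing import Callable, Iterator, Optional


def per_trial_probs(p0: float, rate: Callable[[int], float]) -> Iterator[float]:
    """Yield p(0), p(1), ... with p(n) = min(1, p0 * prod_{i<n} (1 + r(i)))."""
    growth = 1.0  # running product of (1 + r(i)) for i < n
    n = 0
    while True:
        yield min(1.0, p0 * growth)
        growth *= 1.0 + rate(n)
        n += 1


def cumulative_success(p0: float, rate: Callable[[int], float], trials: int) -> float:
    """P(N) = 1 - prod_{i<N} (1 - p(i)), with N = trials."""
    miss = 1.0  # probability that every trial so far has failed
    probs = per_trial_probs(p0, rate)
    for _ in range(trials):
        miss *= 1.0 - next(probs)
    return 1.0 - miss


def trials_to_reach(p0: float, rate: Callable[[int], float], target: float,
                    max_trials: int = 10_000_000) -> Optional[int]:
    """Smallest N with P(N) >= target, or None if not reached within max_trials."""
    miss = 1.0
    for n, p in enumerate(per_trial_probs(p0, rate), start=1):
        miss *= 1.0 - p
        if 1.0 - miss >= target:
            return n
        if n >= max_trials:
            return None
    return None


# Constant 1% improvement (the idealized case) and the brute-force baseline
constant_rate = lambda n: 0.01
no_improvement = lambda n: 0.0
```

Calling `trials_to_reach(1e-6, constant_rate, 0.5)` should land within a trial or so of the idealized 50% milestone reported later in Table 2; the exact value depends on whether iterations are counted from 0 or from 1.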
2.2 Five Models Under Analysis
We evaluate the following five models, each representing a different assumption about how improvement behaves over time:
Table 1: Model Specifications
| Model | Description | Functional Form |
|---|---|---|
| Model A | Constant r = 1% | p(n) = p₀ · 1.01ⁿ (idealized) |
| Model B | Mild decay (α = 0.001) | r(n) = 0.01 / (1 + 0.001n) |
| Model C | Steep decay (α = 0.005) | r(n) = 0.01 / (1 + 0.005n) |
| Model D | Logistic, ceiling 30% | p(n) = 0.30 / (1 + 299999 · e^(−0.01n)) |
| Model E | Logistic, ceiling 15% | p(n) = 0.15 / (1 + 149999 · e^(−0.01n)) |
Model A is the idealized baseline: constant 1% improvement forever. Model B introduces mild hyperbolic decay, where the improvement rate at iteration 1,000 has fallen to approximately 0.5%. Model C imposes steep decay, where improvement at iteration 1,000 is approximately 0.17%. Models D and E use logistic functions with hard ceilings—the per-trial probability can never exceed 30% or 15% respectively, regardless of effort. These represent domains where irreducible randomness or structural barriers cap achievable skill.
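For readers who want to check the figures in this paragraph, the functional forms of Table 1 can be written down directly. This is a sketch under the table's stated parameters; the function names are hypothetical, and the logistic constant is written as p_max/p₀ − 1, which equals 299999 and 149999 for the two ceilings.

```python
import math

P0 = 1e-6  # canonical initial probability used throughout the paper


def rate_hyperbolic(n: int, r0: float = 0.01, alpha: float = 0.001) -> float:
    """Improvement rate r(n) = r0 / (1 + alpha * n), as in Models B and C."""
    return r0 / (1.0 + alpha * n)


def p_constant(n: int, r: float = 0.01) -> float:
    """Model A: p(n) = min(1, p0 * (1 + r)^n), evaluated in log space."""
    log_p = math.log(P0) + n * math.log1p(r)
    # once the log-probability reaches 0 the cap at 1 applies
    return 1.0 if log_p >= 0.0 else math.exp(log_p)


def p_logistic(n: int, p_max: float, k: float = 0.01) -> float:
    """Models D and E: p(n) = p_max / (1 + (p_max / p0 - 1) * exp(-k * n))."""
    return p_max / (1.0 + (p_max / P0 - 1.0) * math.exp(-k * n))


# Rates at iteration 1,000 quoted in the text: about 0.5% (B) and 0.17% (C)
rate_b_1000 = rate_hyperbolic(1000, alpha=0.001)
rate_c_1000 = rate_hyperbolic(1000, alpha=0.005)
```

Both logistic curves start at p(0) = p₀ and level off at their ceilings, which is what lets them share the same canonical starting point as Models A through C.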
2.3 Key Assumptions
A1 (Independence). Each trial’s outcome is independent of all others, conditional on its probability p(i).
A2 (Non-negative improvement). r(n) ≥ 0 for all n. The agent never gets worse. (We discuss relaxation of this assumption in Section 6.)
A3 (Non-zero start). p₀ > 0, ensuring the goal is not logically impossible.
3. Central Theorems
3.1 Theorem 1: Finite Convergence (Idealized Case)
Theorem 1. For constant r > 0, the per-trial probability p(n) reaches any target τ ∈ (p₀, 1) in finite iterations:
N*(τ) = ⌈ ln(τ / p₀) / ln(1 + r) ⌉
Canonical result: For p₀ = 10⁻⁶ and r = 0.01, the per-trial probability reaches 50% at N* = 1,319 iterations.
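The closed form follows from solving the uncapped recurrence for n. A short derivation, using only the symbols defined in Section 2.1:

```latex
p_0 (1+r)^n \ge \tau
\;\iff\; n \ln(1+r) \ge \ln\!\left(\frac{\tau}{p_0}\right)
\;\iff\; n \ge \frac{\ln(\tau/p_0)}{\ln(1+r)},
\qquad\text{so}\qquad
N^*(\tau) = \left\lceil \frac{\ln(\tau/p_0)}{\ln(1+r)} \right\rceil .
```

For the canonical parameters, ln(0.5 / 10⁻⁶) ≈ 13.1224 and ln(1.01) ≈ 0.0099503, giving a ratio of about 1318.8, which rounds up to 1,319.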
3.2 Theorem 2: Cumulative Convergence (General Case)
Theorem 2. For any improvement trajectory where ∑ p(i) diverges (i.e., the sum of per-trial probabilities is unbounded), the cumulative success probability P(N) → 1 as N → ∞.
Proof sketch. We use the fact that for x ∈ [0, 1), ln(1 − x) ≤ −x. Therefore ln(∏(1 − p(i))) = ∑ ln(1 − p(i)) ≤ −∑ p(i) → −∞, which implies ∏(1 − p(i)) → 0, and thus P(N) → 1. The condition ∑ p(i) = ∞ is satisfied by all five models since p(i) grows or remains bounded away from zero. □
This theorem is critical because it establishes that cumulative success does not require the per-trial probability to reach any particular threshold. Even if each individual attempt plateaus at a low ceiling (as in Models D and E), the accumulation of many low-probability attempts still converges to certainty.
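The proof sketch implies the quantitative bound P(N) ≥ 1 − exp(−S_N), where S_N is the sum of the first N per-trial probabilities. The sketch below checks that bound numerically for the 15%-ceiling logistic curve from Table 1; the function names are illustrative assumptions.

```python
import math
from typing import Callable, Tuple


def p_model_e(n: int) -> float:
    """Model E from Table 1: logistic curve capped at 15%."""
    return 0.15 / (1.0 + 149999.0 * math.exp(-0.01 * n))


def cumulative_and_bound(p: Callable[[int], float], trials: int) -> Tuple[float, float]:
    """Return (P(N), 1 - exp(-S_N)), where S_N = sum of p(i) for i < N."""
    log_miss = 0.0  # running sum of ln(1 - p(i))
    total = 0.0     # S_N, the running sum of per-trial probabilities
    for i in range(trials):
        pi = p(i)
        log_miss += math.log1p(-pi)
        total += pi
    return 1.0 - math.exp(log_miss), 1.0 - math.exp(-total)


exact, lower_bound = cumulative_and_bound(p_model_e, 1100)
```

The exact cumulative probability should never fall below the bound, and the gap between the two stays small as long as the individual p(i) are small.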
3.3 Theorem 3: The Improvement Leverage Theorem
Theorem 3. The brute-force baseline (r = 0 for all n) requires N₀ = ⌈ ln(0.5) / ln(1 − p₀) ⌉ ≈ 693,147 iterations for 50% cumulative success. All five improvement models achieve 50% cumulative success in fewer than 4,200 iterations—a minimum reduction of 99.4%.
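The brute-force baseline has a closed form, so it can be checked without simulating hundreds of thousands of trials. A minimal sketch (the function name is an assumption, not from the paper):

```python
import math


def brute_force_trials(p0: float, target: float) -> int:
    """Smallest N with 1 - (1 - p0)^N >= target for a fixed per-trial p0."""
    # log1p keeps precision when p0 is tiny, where ln(1 - p0) is close to -p0
    return math.ceil(math.log1p(-target) / math.log1p(-p0))


n_half = brute_force_trials(1e-6, 0.5)    # baseline for 50% cumulative success
n_99 = brute_force_trials(1e-6, 0.99)     # baseline for 99% cumulative success
```

With the slowest model's 50% milestone of 4,172 iterations, the reduction is 1 − 4,172 / 693,147 ≈ 99.4%, which is the minimum reduction quoted in the theorem.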
4. Numerical Results
4.1 The Central Comparison
Table 2 presents the iteration at which each model reaches key cumulative success probability thresholds. This is the operationally meaningful metric: the number of attempts required before success becomes likely, very likely, or near-certain.
Table 2: Cumulative Success Probability Milestones Across All Models (p₀ = 10⁻⁶)
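| Cumul. P | Model A | Model B | Model C | Model D | Model E |
|---|---|---|---|---|---|
| 10% | 699 | 903 | 2,133 | 696 | 696 |
| 25% | 800 | 1,085 | 3,061 | 797 | 797 |
| 50% | 888 | 1,258 | 4,172 | 885 | 887 |
| 75% | 958 | 1,405 | 5,308 | 956 | 958 |
| 90% | 1,008 | 1,519 | 6,323 | 1,008 | 1,012 |
| 95% | 1,035 | 1,580 | 6,921 | 1,035 | 1,040 |
| 99% | 1,077 | 1,683 | 8,018 | 1,080 | 1,088 |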
4.2 The Key Insight: Cumulative Resilience
The most striking result in Table 2 is the relative stability of the cumulative success milestones across dramatically different models. Consider the 99% cumulative threshold:
Under the idealized model (A), 99% cumulative success is reached at iteration 1,077. Under the most pessimistic logistic model (E), where per-trial probability is permanently capped at 15% and can never reach even 50%, the 99% cumulative threshold is reached at iteration 1,088.
The difference is 11 iterations.
This near-equivalence is not a coincidence. It reflects a deep structural property of cumulative probability: the early iterations (where all models behave similarly, because improvement is easy and far from any ceiling) dominate the cumulative calculation. By the time diminishing returns or ceilings become binding, the cumulative probability has already accumulated most of its mass.
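The claim that the models behave similarly in the early iterations can be made explicit for the logistic curves. Writing K = p_max/p₀ − 1 and k = 0.01 (notation introduced here for convenience), a sketch of the approximation is:

```latex
p(n) = \frac{p_{\max}}{1 + K e^{-kn}}
\;\approx\; \frac{p_{\max}}{K}\, e^{kn}
\;\approx\; p_0\, e^{0.01 n}
\qquad \text{while } K e^{-kn} \gg 1,
\qquad\text{versus}\qquad
p_A(n) = p_0\,(1.01)^n = p_0\, e^{n \ln 1.01} \approx p_0\, e^{0.00995\, n}.
```

So long as the ceiling is far away, the logistic models grow by a factor of e^0.01 ≈ 1.01005 per iteration, marginally faster than Model A's 1.01. That is consistent with Models D and E reaching the early milestones in Table 2 a few iterations ahead of Model A (696 versus 699 at the 10% level). At n = 1,000 the term K·e^(−kn) is still about 13.6 for Model D and 6.8 for Model E, so the ceiling only begins to bite near the end of the approach phase.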
4.3 Summary Table: The Convergence Gap
Table 3 isolates the critical comparison between per-trial and cumulative milestones across all models, including the brute-force baseline.
Table 3: Per-Trial vs. Cumulative Milestones Across Models
| Model | p(n) = 50% | Cumul. P = 50% | Cumul. P = 99% |
|---|---|---|---|
| Model A (ideal) | 1,319 | 888 | 1,077 |
| Model B (mild decay) | 2,727 | 1,258 | 1,683 |
| Model C (steep decay) | 44,633+ | 4,172 | 8,018 |
| Model D (30% ceiling) | Never | 885 | 1,080 |
| Model E (15% ceiling) | Never | 887 | 1,088 |
| Brute force (r = 0) | Never | 693,147 | 4,605,168 |
The brute-force row demonstrates the leverage of any improvement whatsoever. Even Model C, with steep diminishing returns, requires only about 0.6% of the brute-force iterations for 50% cumulative success (4,172 vs. 693,147) and 0.17% for 99% (8,018 vs. 4,605,168).
4.4 Sensitivity to Improvement Rate
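Table 4 demonstrates the sensitivity of convergence speed to the initial improvement rate under the idealized model. For r > 0 the table reports the iteration at which the per-trial probability p(n) reaches 0.5 (Theorem 1); the 0% row instead reports the brute-force baseline for 50% cumulative success, since p(n) never changes without improvement. Even at one-tenth the canonical rate (0.1%), the reduction versus brute force remains dramatic.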
Table 4: Effect of Initial Improvement Rate on Convergence (Idealized Model, p₀ = 10⁻⁶)
| Improvement Rate (r) | Iterations to p(n) = 0.5 | Reduction vs. Brute Force |
|---|---|---|
| 0% (no improvement) | 693,147 | — |
| 0.1% | 13,129 | 98.1% |
| 0.5% | 2,632 | 99.6% |
| 1% | 1,319 | 99.8% |
| 2% | 663 | 99.9% |
| 5% | 269 | 99.96% |
| 10% | 138 | 99.98% |
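The rows of Table 4 can be regenerated from the Theorem 1 formula. The sketch below assumes, as the table appears to, that each per-trial threshold is compared against the 693,147-iteration brute-force baseline for 50% cumulative success; the function and constant names are illustrative.

```python
import math

P0 = 1e-6
BRUTE_FORCE_50 = 693_147  # brute-force baseline from Theorem 3


def iterations_to_threshold(r: float, tau: float = 0.5, p0: float = P0) -> int:
    """N*(tau) = ceil(ln(tau / p0) / ln(1 + r)) for a constant rate r > 0."""
    return math.ceil(math.log(tau / p0) / math.log1p(r))


def table4_rows(rates=(0.001, 0.005, 0.01, 0.02, 0.05, 0.10)):
    """Return (rate, iterations, reduction versus brute force) per rate."""
    rows = []
    for r in rates:
        n = iterations_to_threshold(r)
        rows.append((r, n, 1.0 - n / BRUTE_FORCE_50))
    return rows


for r, n, reduction in table4_rows():
    print(f"r = {r:.1%}: {n:,} iterations, {reduction:.2%} reduction")
```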
5. The Iterative Dominance Theorem
We now state the paper’s unifying result, which synthesizes the findings across all models:
The Iterative Dominance Theorem. For any goal with non-zero initial probability and any improvement trajectory satisfying ∑ p(i) = ∞, cumulative success probability converges to 1, and it exceeds any target below 1 after finitely many iterations. The convergence speed is dominated by the early-iteration improvement rate, not by the asymptotic behavior of the improvement function. Consequently:
(i) Diminishing returns delay but do not prevent convergence.
(ii) Hard ceilings on per-trial probability are asymptotically irrelevant to cumulative success.
(iii) Under constant-rate improvement, the logarithmic scaling N* = O(log(1/p₀)) ensures that even dramatic multiplicative increases in initial difficulty produce only additive increases in required iterations.
Implication: A goal that is 1,000× harder requires only approximately 694 additional iterations (ln 1000 / ln 1.01 ≈ 694.2) at a constant 1% improvement rate, not 1,000× more effort. The difficulty of the goal is logarithmically compressed by the compounding process.
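The additive cost of extra difficulty follows from the Theorem 1 formula. As a worked step for the constant-rate case, dividing p₀ by a factor k shifts the threshold iteration by a fixed amount (ignoring the ceiling function):

```latex
N^*\!\left(\tau;\, \tfrac{p_0}{k}\right) - N^*(\tau;\, p_0)
\;\approx\; \frac{\ln(k\tau/p_0) - \ln(\tau/p_0)}{\ln(1+r)}
\;=\; \frac{\ln k}{\ln(1+r)},
\qquad
\frac{\ln 1000}{\ln 1.01} \approx \frac{6.9078}{0.0099503} \approx 694.2 .
```

The shift does not depend on τ or on p₀ itself, which is why the figure quoted above holds, up to rounding, for any starting probability under Model A. Under the decaying-rate models the dependence on p₀ is no longer logarithmic, so this particular figure should not be carried over to them.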
6. Discussion
6.1 Why Diminishing Returns Don’t Break the Thesis
The resilience of the cumulative result to diminishing returns has a clean intuitive explanation. Diminishing returns affect the per-trial probability trajectory: how good you get at any single attempt. But cumulative success is governed by the sum of all per-trial probabilities, not the maximum, since P(N) ≈ 1 − exp(−∑ p(i)) when each p(i) is small. A thousand attempts at 0.1% each give 1 − 0.999¹⁰⁰⁰ ≈ 63.2% cumulative success, about the same as a single attempt at 63%. The diminishing-returns objection conflates per-trial performance with aggregate outcome.
This is analogous to compound interest with a declining rate. Even if the annual return drops from 10% to 2% to 0.5% over time, the portfolio keeps growing, and it grows without bound whenever the rates sum to infinity. The cumulative case is more forgiving still: under A2 the per-trial probability never falls below p₀ > 0, so as long as improvement remains non-negative (even if vanishingly small), cumulative success probability continues to accumulate toward certainty.
6.2 The Logistic Ceiling: A Feature, Not a Bug
Models D and E, where per-trial probability is permanently capped at 30% and 15% respectively, may seem like the strongest counterargument to the persistence thesis. If you can never become more than 15% likely to succeed on any given attempt, how can persistence guarantee success?
The answer is that a 15% per-trial probability, applied repeatedly, is extraordinarily powerful. After just 15 attempts at p = 0.15, the cumulative probability of at least one success is 1 − 0.85¹⁵ ≈ 91.3%. The ceiling constrains how skilled you become, but it cannot constrain how many times you try. And the mathematics of cumulative probability ensures that trying enough times at any fixed positive probability converges to certainty.
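The arithmetic behind the 15-attempt figure, and the number of attempts at the ceiling needed for 99% cumulative success, can be worked out as follows (a sketch that treats p as fixed at the ceiling from the first attempt):

```latex
1 - 0.85^{15} = 1 - e^{15 \ln 0.85} \approx 1 - e^{-2.438} \approx 0.913,
\qquad
N_{99\%} = \left\lceil \frac{\ln 0.01}{\ln 0.85} \right\rceil
         = \left\lceil \frac{-4.6052}{-0.1625} \right\rceil
         = \lceil 28.3 \rceil = 29 .
```

The same calculation at the 30% ceiling gives ⌈ln 0.01 / ln 0.70⌉ = ⌈12.9⌉ = 13 attempts. In both cases the count is small compared with the roughly 1,000 iterations the models spend approaching the ceiling.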
In fact, the logistic models reach 99% cumulative success almost as fast as the idealized model because the cumulative calculation is dominated by the approach phase (iterations 0–1,000), during which all models behave similarly.
6.3 The 1% Rule in Context
A 1% improvement per iteration is the canonical parameter, chosen for plausibility. In practice, this corresponds to: refining a sales pitch after each rejection, improving a product based on user feedback, or adjusting research methodology after each failed experiment. The key insight from the sensitivity analysis is that even 0.1% improvement—a rate so small it would be imperceptible in any single iteration—still reduces the path to success by 98% compared to brute force.
6.4 Honest Limitations
We acknowledge several constraints that bound the applicability of these results:
Negative iterations. The model assumes non-negative improvement. In practice, agents sometimes regress—bad habits form, markets shift, skills atrophy. If negative iterations are frequent enough, they could slow or prevent convergence. The model holds as long as the net trend of improvement is positive over a sufficient number of iterations.
Iteration cost. Each attempt costs time, capital, and energy. The model assumes iterations are feasible indefinitely. In practice, resource constraints may terminate the process before convergence. The practical question is whether the convergence threshold falls within the agent’s resource budget.
Opportunity cost. Persistence on one path forecloses others. The model does not incorporate the value of exploration or pivot optionality. A rational agent must weigh the expected remaining iterations to success against the expected value of alternative paths.
Measurement. In many real-world settings, the agent cannot directly observe p(n) or confirm that improvement is occurring. The model assumes improvement is real even if unobservable—a strong assumption in domains with noisy feedback.
Survivorship bias. Observed examples of persistence leading to success may overrepresent favorable conditions. The mathematical result holds regardless of empirical bias, but practitioners should be cautious about inferring real-world improvement rates from success stories.
Despite these limitations, the formal result is robust: under any improvement trajectory where per-trial probabilities sum to infinity, cumulative success converges to certainty, passing any fixed threshold below 1 in finitely many iterations.
7. Conclusion
This paper establishes a rigorous mathematical foundation for the thesis that persistence with continuous improvement is the dominant factor in achieving improbable goals. Starting from a one-in-a-million probability, a 1% initial improvement rate yields 99% cumulative success probability within approximately 1,077 iterations under ideal conditions and 1,088 iterations even under the most pessimistic ceiling model tested.
The diminishing-returns analysis—the paper’s central contribution beyond the idealized proof—demonstrates that the core result is not an artifact of unrealistic assumptions. Under mild decay, steep decay, and hard logistic ceilings, cumulative success remains achievable in a number of iterations that is vanishingly small compared to the brute-force baseline. The reason is structural: cumulative probability depends on the sum of per-trial probabilities, not their maximum, and this sum diverges under any positive improvement trajectory.
The practical conclusion is direct: the path to success is not about finding the right door. It is about knocking on any door repeatedly, learning from each attempt, and getting slightly better every time. Even if improvement slows, even if you hit a ceiling, even if any single attempt never becomes particularly likely to succeed—the mathematics of cumulative probability ensures that persistence with improvement makes success not merely possible, but asymptotically certain.
Appendix A: Notation Summary
p₀ — Initial success probability (canonical value: 10⁻⁶)
r, r(n) — Improvement rate (constant or iteration-dependent)
α — Decay parameter for hyperbolic models
p(n) — Per-trial success probability at iteration n
P(N) — Cumulative probability of ≥1 success across N trials
p_max — Logistic ceiling parameter (Models D, E)
N* — Iteration at which p(n) reaches a target threshold
N₀ — Brute-force baseline: iterations for 50% cumulative success without improvement (≈693,147)
Appendix B: Proof of Cumulative Divergence
Claim: If ∑ᵢ₌₀∞ p(i) = ∞ and p(i) ∈ [0, 1) for all i, then ∏ᵢ₌₀∞ (1 − p(i)) = 0.
Proof. For x ∈ [0, 1), the inequality ln(1 − x) ≤ −x holds. Therefore:
ln ∏ᵢ₌₀ᴺ (1 − p(i)) = ∑ᵢ₌₀ᴺ ln(1 − p(i)) ≤ −∑ᵢ₌₀ᴺ p(i)
As N → ∞, the right-hand side diverges to −∞, so the product converges to 0. Since P(N) = 1 − ∏(1 − p(i)), we have P(N) → 1. □
Appendix C: Verification of ∑ p(i) = ∞ for Each Model
Model A (Constant r):
p(i) = p₀ · (1.01)ⁱ grows geometrically until it reaches the cap of 1 imposed in Section 2.1, so the terms do not tend to zero and the sum trivially diverges.
Model B (Mild decay, α = 0.001):
The growth of p(n) is approximately p₀ · exp(∑ᵢ r₀/(1+αi)) ≈ p₀ · (1+αn)^(r₀/α). For r₀/α = 10, this is polynomial growth of order 10, and the sum of a polynomially growing sequence diverges.
Model C (Steep decay, α = 0.005):
Similarly, r₀/α = 2, so p(n) grows quadratically. The sum of a quadratically growing sequence diverges.
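The polynomial-growth approximation used for Models B and C comes from replacing the sum of rates by an integral. A sketch of that step, with r₀ = 0.01 denoting the initial rate:

```latex
\ln\frac{p(n)}{p_0}
= \sum_{i=0}^{n-1} \ln\!\left(1 + \frac{r_0}{1+\alpha i}\right)
\;\approx\; \sum_{i=0}^{n-1} \frac{r_0}{1+\alpha i}
\;\approx\; \int_0^{n} \frac{r_0}{1+\alpha x}\, dx
= \frac{r_0}{\alpha}\,\ln(1+\alpha n)
\quad\Longrightarrow\quad
p(n) \approx p_0\,(1+\alpha n)^{r_0/\alpha}.
```

Divergence of ∑ p(i) does not depend on the accuracy of this approximation: since r(n) ≥ 0 by A2, p(n) ≥ p₀ for every n, so the partial sums are at least N·p₀ and grow without bound.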
Models D and E (Logistic):
p(n) → p_max > 0 as n → ∞. The sum of a sequence converging to a positive constant diverges (comparison with ∑ p_max/2 for sufficiently large n). □