Formalization of Feedback-Efficient Intelligence
1. Definitions
- \( E \): the environment (a partially observable stochastic dynamical system)
- \( \pi \): the agent’s policy, mapping histories to actions (identified with the agent itself)
- \( \hat{M}_t \): the agent’s learned internal latent model of \( E \) at time \( t \)
- \( \delta_t \): drift error of \( \hat{M}_t \), i.e., the divergence between the true and internal prediction distributions at time \( t \)
- \( F_t \): feedback from the environment at time \( t \)
- \( U_t \): utility or task performance at time \( t \)
- \( I_t = I(F_t ; \Delta \hat{M}_t) \): mutual information between feedback and model update
- \( \eta_{t} = \frac{I_t}{\Delta U_t} \): instantaneous feedback efficiency, the feedback absorbed per unit of performance gain at time \( t \) (taking \( \Delta U_t = U_t - U_{t-1} \)); a numerical sketch of estimating \( \eta_t \) follows this list
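The sketch below assumes feedback and model updates have been discretized into small integer alphabets; the plug-in mutual-information estimator and all function names are illustrative choices, not part of the formalization:

```python
# Minimal sketch: estimate eta_t from logged samples. Assumes feedback F_t
# and model updates dM_t have been discretized into small integer alphabets;
# mutual_information is a plug-in (histogram) estimator in bits.
import numpy as np

def mutual_information(f: np.ndarray, dm: np.ndarray) -> float:
    """Plug-in estimate of I(F; dM) in bits from paired integer samples."""
    joint, _, _ = np.histogram2d(f, dm, bins=(int(f.max()) + 1, int(dm.max()) + 1))
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)   # marginal over F
    py = pxy.sum(axis=0, keepdims=True)   # marginal over dM
    nz = pxy > 0
    return float((pxy[nz] * np.log2(pxy[nz] / (px @ py)[nz])).sum())

def eta(f: np.ndarray, dm: np.ndarray, delta_u: float) -> float:
    """Instantaneous feedback efficiency: feedback bits absorbed per unit
    of performance gain delta_u."""
    return mutual_information(f, dm) / delta_u

# Synthetic log: model updates partially determined by feedback.
rng = np.random.default_rng(0)
f = rng.integers(0, 4, size=10_000)
dm = (f + rng.integers(0, 2, size=f.size)) % 4
print(f"eta_t ~ {eta(f, dm, delta_u=1.5):.3f} bits per unit utility gain")
```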
2. Axiom: Drift in Internal Representations
Unbounded Drift Axiom:
If an internal model is not continually recalibrated with external feedback, then:
\[ \lim_{t \to \infty} \delta_t = \infty \]
(under finite precision and chaotic/stochastic environments)
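The axiom can be made vivid with a toy simulation, a sketch under strong assumptions (a one-dimensional random-walk environment, a static predictor, and feedback modeled as a periodic reset of the internal estimate; none of this is prescribed by the formalization):

```python
# Toy illustration of the Unbounded Drift Axiom: a scalar internal estimate
# of a random-walk state diverges without recalibration, but stays bounded
# when external feedback periodically resets it.
import numpy as np

rng = np.random.default_rng(1)
T, k = 10_000, 10          # horizon and feedback period (illustrative values)
state = 0.0                # true environment state
model_open_loop = 0.0      # never recalibrated
model_feedback = 0.0       # recalibrated every k steps

for t in range(T):
    state += rng.normal(0.0, 0.1)   # stochastic environment dynamics
    if t % k == 0:
        model_feedback = state      # feedback: observe the true state

print(f"open-loop drift after {T} steps: {abs(state - model_open_loop):.2f}")
print(f"with feedback, drift stays near: {abs(state - model_feedback):.2f}")
```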
3. Theorem: No Self-Sufficient Intelligence
Nonzero Feedback Theorem:
For every \( \varepsilon > 0 \), no intelligent system can maintain bounded error \( \delta_t < \varepsilon \) for all \( t \) with zero cumulative feedback:
\[ \delta_t < \varepsilon \;\; \forall t \quad \Longrightarrow \quad \int_0^\infty \|F_t\| \, dt > 0 \]
This is the contrapositive of the Unbounded Drift Axiom: with zero cumulative feedback the model is never recalibrated, so \( \delta_t \) grows without bound.
4. Intelligence as Feedback Efficiency
Define the asymptotic feedback needed per unit of utility improvement:
\[ \eta(\pi) = \limsup_{t \to \infty} \frac{I(F_t ; \Delta \hat{M}_t)}{\Delta U_t} \]
Then agent \( \pi \) is more intelligent than agent \( \pi' \) over task class \( T \) if:
\[ \eta(\pi) < \eta(\pi') \]
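A hedged numerical sketch of the comparison (synthetic learning curves; `empirical_eta` is an illustrative finite-horizon surrogate for the limsup, not a quantity defined above):

```python
# Compare two agents by an empirical surrogate for eta(pi): cumulative
# feedback bits consumed per unit of cumulative utility gained. Both agents
# follow the same synthetic learning curve, but B consumes 3x the feedback.
import numpy as np

def empirical_eta(bits_per_step: np.ndarray, utility: np.ndarray) -> float:
    """Finite-horizon surrogate for eta(pi)."""
    gain = utility[-1] - utility[0]
    return float(bits_per_step.sum() / max(gain, 1e-9))

steps = np.arange(1, 1001)
utility = 1.0 - 1.0 / np.sqrt(steps)            # synthetic learning curve
eta_a = empirical_eta(np.ones(steps.size), utility)
eta_b = empirical_eta(3.0 * np.ones(steps.size), utility)
print(f"eta(A) ~ {eta_a:.0f}, eta(B) ~ {eta_b:.0f}: A is the more intelligent agent here")
```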
5. Corollary: The Singularity is Asymptotic
Even self-improving superintelligences cannot reach perfect intelligence (i.e., \( \eta = 0 \)) due to irreducible environmental entropy:
\[ \inf_{\pi \in \mathcal{A}_{\text{phys}}} \eta(\pi) > 0 \]
Here, \( \mathcal{A}_{\text{phys}} \) denotes the set of all physically realizable agents, i.e., agents embedded in environments with bounded memory, finite energy, and nonzero entropy.
6. Model Compression View
- \( D_{\mathrm{KL}}(P_{E_t} \| \hat{P}_t) \): divergence between the true and internal prediction distributions at time \( t \), where \( P_{E_t} \) is the environment's true predictive distribution and \( \hat{P}_t \) is the internal model's predictive distribution
Define a new quantity \( \xi_{t} \), the agent's compression inefficiency: how much model error remains per bit of feedback about the model at time \( t \):
\[ \xi_t = \frac{D_{\mathrm{KL}}(P_{E_t} \| \hat{P}_t)}{I(F_t ; \Delta \hat{M}_t)} \]
In contrast to \( \eta \), which quantifies feedback cost per unit of performance gain, \( \xi_{t} \) quantifies residual model error per bit of feedback. Both reflect intelligence in different regimes: outer utility versus inner alignment. A computational sketch follows.
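A minimal sketch, assuming discrete predictive distributions and a known feedback information budget (the distributions and the 2-bit budget below are illustrative):

```python
# Compute xi_t for discrete predictive distributions: residual KL divergence
# between the true distribution P_{E_t} and the model's \hat{P}_t, per bit of
# feedback information I(F_t; dM_t) absorbed into the update.
import numpy as np

def kl_divergence(p: np.ndarray, q: np.ndarray) -> float:
    """D_KL(p || q) in bits; assumes q > 0 wherever p > 0."""
    nz = p > 0
    return float((p[nz] * np.log2(p[nz] / q[nz])).sum())

def xi(p_true: np.ndarray, p_model: np.ndarray, feedback_bits: float) -> float:
    """Compression inefficiency: residual model error per bit of feedback."""
    return kl_divergence(p_true, p_model) / feedback_bits

p_env = np.array([0.5, 0.3, 0.2])    # true predictive distribution (illustrative)
p_hat = np.array([0.4, 0.35, 0.25])  # internal model's prediction (illustrative)
print(f"xi_t ~ {xi(p_env, p_hat, feedback_bits=2.0):.4f} bits of residual error per feedback bit")
```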
7. Interpretation
This reframes intelligence not as raw computational power or self-sufficiency, but as the ability to compress reality, updating latent models with minimal feedback. The lower the feedback efficiency \( \eta_{t} \), the more intelligent the agent in a task-general sense. The lower the compression inefficiency \( \xi_{t} \), the more precisely it internalizes external structure. In both views, intelligence is fundamentally defined by how efficiently error is reduced per bit of external signal.