“Plasticity as the Mirror of Empowerment” (Abel et al., 2025) defines information-theoretic measures for plasticity (how much an agent is influenced by its environment) and empowerment (how much the agent can influence its environment). Towards the end of the paper, the authors make the claim that deterministic environments result in zero plasticity. This was initially very counterintuitive to me, so in this post I explore their definitions, show why the claim is true, and provide some intuition.
The authors start by defining agents as maps from observations to actions, and environments as maps from actions to observations.
Towards the end of the paper, the authors claim that deterministic environments result in zero plasticity. This was initially very surprising to me. Even in the simplest RL problems, an agent adapts its internal state (and therefore its actions) based on observations. How could plasticity be zero? In the next section I verify this claim by writing out the math. Then, I provide intuition: when examining an agent-environment interaction solely in terms of distributions over actions and observations, a deterministic environment implies that conditioning on observations does not change the probability of actions. Given previous actions, observations become redundant in the conditioning set because they are fixed functions of those actions. Therefore, action distributions are not influenced by observations, and plasticity is zero.
This counterintuitive result is partly a consequence of the choice of abstractions: their framing can be thought of as a bird’s-eye view of agents and environments, where one has knowledge of the counterfactuals for both, i.e., “what could have been.” Now let’s go into the math:
The Zero Plasticity Proof
After lots of preliminary definitions, they introduce generalized directed information (GDI): a measure of how the past elements of one sequence of random variables influence future elements of another sequence.
This provides the foundation for understanding how actions affect observations (empowerment) and how observations affect actions (plasticity).
Concretely, for any two sequences of discrete random variables $X^n = (X_1, \ldots, X_n)$ and $Y^n = (Y_1, \ldots, Y_n)$, the GDI (in the lag-one form used for plasticity below) is

$$I(X^n \to Y^n) = \sum_{t=1}^{n} I(X^{t-1};\, Y_t \mid Y^{t-1}),$$

where the term inside the summation is the conditional mutual information, representing the reduction in uncertainty about the current outcome $Y_t$ provided by the past $X^{t-1}$, once the past $Y^{t-1}$ is already known.
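To ground the definition, here is a minimal Python sketch that computes a single conditional mutual information term $I(X; Y \mid Z)$ from an explicit joint distribution; the GDI is then just a sum of such terms with the appropriate prefixes plugged in. The dict-based representation and the function name are my own illustration, not anything from the paper.

```python
# Toy illustration of one GDI summand (my own sketch, not from Abel et al.).
from collections import defaultdict
from math import log2

def conditional_mutual_information(joint):
    """I(X; Y | Z) in bits, for a joint distribution given as {(x, y, z): p}.

    Uses I(X; Y | Z) = sum_{x,y,z} p(x,y,z) * log2( p(x,y,z) p(z) / (p(x,z) p(y,z)) ).
    """
    p_z, p_xz, p_yz = defaultdict(float), defaultdict(float), defaultdict(float)
    for (x, y, z), p in joint.items():
        p_z[z] += p
        p_xz[(x, z)] += p
        p_yz[(y, z)] += p
    return sum(
        p * log2(p * p_z[z] / (p_xz[(x, z)] * p_yz[(y, z)]))
        for (x, y, z), p in joint.items()
        if p > 0
    )
```

For plasticity, each summand takes $X = O^{t-1}$, $Y = A_t$, and $Z = A^{t-1}$.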
Then, the plasticity of an agent in an environment (written $\mathfrak{p}_n$ here) is the GDI from observations to actions,

$$\mathfrak{p}_n = I(O^n \to A^n) = \sum_{t=1}^{n} I(O^{t-1};\, A_t \mid A^{t-1}),$$

using the convention that the agent chooses $A_t$ after seeing $O^{t-1}$ (empowerment is, symmetrically, the GDI from actions to observations).
This brings us to the claim: every deterministic environment, one in which each observation is a fixed function of the action history, $O_t = e(A^t)$, yields zero plasticity for every agent. To see why, take a single term of the plasticity sum, $I(O^{t-1}; A_t \mid A^{t-1})$.
Expanding this using the definition of conditional mutual information yields:

$$I(O^{t-1}; A_t \mid A^{t-1}) = H(A_t \mid A^{t-1}) - H(A_t \mid A^{t-1}, O^{t-1}).$$

Here, $H(A_t \mid A^{t-1})$ is the uncertainty in the current action given the past actions alone, and $H(A_t \mid A^{t-1}, O^{t-1})$ is that same uncertainty once the past observations are also known.
The key insight is that conditioning on the observation history $O^{t-1}$ adds nothing once we have conditioned on the action history $A^{t-1}$: in a deterministic environment, $O^{t-1}$ is a fixed function of $A^{t-1}$, and conditioning on a deterministic function of variables already in the conditioning set cannot reduce entropy further. Hence $H(A_t \mid A^{t-1}, O^{t-1}) = H(A_t \mid A^{t-1})$, and every conditional mutual information term vanishes.
Now, recall that plasticity is the sum of these conditional mutual information terms (via GDI). Consequently:

$$\mathfrak{p}_n = \sum_{t=1}^{n} I(O^{t-1}; A_t \mid A^{t-1}) = \sum_{t=1}^{n} 0 = 0.$$
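As a sanity check, consider a two-step toy interaction (my own construction, reusing `conditional_mutual_information` from the sketch above): the agent picks $A_1$ uniformly, a deterministic “copy” environment returns $O_1 = A_1$, and the agent then adapts, choosing $A_2 = O_1$ with probability 0.9. The agent is visibly reacting to what it sees, yet the single plasticity term $I(O_1; A_2 \mid A_1)$ comes out exactly zero:

```python
# Toy check (my construction, not from the paper).
# Deterministic "copy" environment: o1 = a1.
# Agent: a1 ~ Uniform{0, 1}; then a2 = o1 with prob 0.9, else 1 - o1.
# Joint distribution over (x, y, z) = (o1, a2, a1):
joint_deterministic_env = {
    (0, 0, 0): 0.45,  # a1 = 0, o1 = 0, agent copies the observation
    (0, 1, 0): 0.05,  # a1 = 0, o1 = 0, agent deviates
    (1, 1, 1): 0.45,
    (1, 0, 1): 0.05,
}
# O_1 is a fixed function of A_1, so conditioning on it adds nothing:
print(conditional_mutual_information(joint_deterministic_env))  # -> 0.0
```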
Interpretation
This was quite counterintuitive at first. Consider the agent’s perspective: as it receives new information, it changes its internal state (e.g., its Q-function). Clearly it’s adapting, so it must be plastic, right? We must instead take a third-person perspective, viewing distributions over actions alongside a deterministic function mapping actions to observations. Plasticity measures how much conditioning on previous observations, beyond previous actions, changes the action distribution. Since the observations are deterministic functions of those actions, they carry no unique information, and the change is zero.
Crucially, adaptation to new observations is still happening, and it is visible in how the action distribution changes as the agent receives observations. However, in this specific theoretical frame, that change is only a function of the previous action distribution, because there is no observation uncertainty. Therefore, in a third-person probabilistic sense, the observations did not influence the actions. There is no plasticity (according to their definition).
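To see the role of observation uncertainty directly, flip one assumption in the toy model above (again, my own construction): keep the same adaptive agent, but make the environment noisy, returning $O_1 = A_1$ only with probability 0.9. Now the observation carries information the past action does not, and the same plasticity term becomes strictly positive:

```python
# Same adaptive agent, but a noisy environment: o1 = a1 with prob 0.9.
# (Toy contrast for illustration; not from Abel et al.)
joint_noisy_env = {}
for a1 in (0, 1):
    for o1 in (0, 1):
        p_o = 0.9 if o1 == a1 else 0.1        # stochastic environment
        for a2 in (0, 1):
            p_a2 = 0.9 if a2 == o1 else 0.1   # same adaptive agent as before
            joint_noisy_env[(o1, a2, a1)] = 0.5 * p_o * p_a2

print(conditional_mutual_information(joint_noisy_env))  # ~0.211 bits > 0
```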
This definition does not describe the physical capacity of an agent to change via new information, e.g., synaptic plasticity. Instead, it builds on the authors’ (very general) abstractions of agents and environments. It allows for a probabilistic understanding of how observations and actions interact, without considering the agent’s internal machinery. Where these probabilities “live,” and what they imply about causality and agent design, is still unclear to me. Very cool read.
Blogpost 17/100