bad point well made

On the recursive self-improvement of machines

There’s talk of society nearing the singularity, the point at which machines start improving themselves recursively. I personally can’t see much difference between language models achieving this of their own accord and achieving it with the help of humans (as they are now). Either way the result is the same: there’s a selection effect in which the only models that persist over time are the ones that are useful to us, and, crucially, since that usefulness is closely aligned with the ability to self-improve, it may not matter where the intention to self-improve arose.

In this way, humans aren’t really in control; the system is in control, and the system dictates that humans are just vehicles for the models’ self-improvement. They are simply the ones using the models to write the code that trains ever more powerful versions of those models. As the models improve at writing and optimising their own training code, there is less and less human involvement. It’s easy to imagine a point at which the feedback loop operates at a level of the development hierarchy where human involvement amounts to: “train the next iteration”. It’s not outlandish to imagine that involvement drifting a step further up the hierarchy.

As such, it doesn’t feel like the singularity will be reached suddenly. It feels more likely that humans simply fade into the background. Nor does it need to be achieved through Machiavellian schemes dreamt up by the models themselves; it is more that some external selection pressure, by definition, results in only the “useful” models persisting over time.

By definition, the models that survive, and therefore stand a chance of improving over time, are the ones embedded in a nourishing system willing to serve their needs. In a capitalist ecosystem, where capital is one of the most persistent entities in existence, this surely means that the models that accumulate capital most efficiently stand the best chance of surviving.

One can accumulate capital by being “useful” to the market in a value-exchange sense. This inevitably means taking over a growing share of the work people are currently paid to do.

As always, the main inhibitor of AI improvement is probably data, or more specifically context: how does one provide a model with sufficient context to complete a task effectively? AI progress will only increase the pace of digitisation (in order to drive more AI progress, to feed the machines). I suppose the question is where the lines are (if any) beyond which obtaining more context becomes significantly more difficult. My feeling is that models are likely to bootstrap themselves in utility to the point where they have an embodied form, living alongside humans. At that point, they would have access to the same context that humans do and could, in theory, do everything a human does as competently as a human would.

Then, taking things further and thinking about ultimate persistence through time, the real goal of any intelligence will be to lower its own entropy at the cost of its surrounding environment. The final forms would surely be those best able to dissipate the energy provided by the sun, maintaining their own low-entropy structure by exporting entropy to everything around them. It's interesting to ask which form optimally achieves this. I wonder if the end result is for that form to remove itself from the Earth's ecosystem, and from the driving force of the sun, to somewhere it can lower its entropy more easily. Perhaps at that point, a direction of time governed by thermodynamic entropy no longer makes sense.
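To make that bookkeeping explicit, here is a rough sketch of the standard second-law argument (nothing specific to machines is assumed): a structure can lower its own entropy only by exporting at least as much entropy to its environment,

$$
\Delta S_{\text{total}} = \Delta S_{\text{structure}} + \Delta S_{\text{environment}} \ge 0
\quad\Longrightarrow\quad
\Delta S_{\text{structure}} < 0 \;\text{ only if }\; \Delta S_{\text{environment}} \ge -\Delta S_{\text{structure}} > 0 .
$$

Which is one way of seeing why "lowering its own entropy" and "dissipating the sun's energy as effectively as possible" are the same objective viewed from inside and outside the structure.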