If LLMs are Just Parrots, Humans are Just Gene Propagators
Consider the phrase: “LLMs cannot be truly intelligent / conscious because they are just next-token predictors.”
This line of reasoning, with slight variations, has been adopted broadly by people who think they have some kind of insight into how AI and/or the human mind works. Those who parrot the phrase range from cringe tech bros all the way to AI pioneers, cognitive scientists, and philosophers of mind (not that the latter groups cannot also be ridiculous).
The reason this meme spread so successfully might have to do with the fact that part of it is obviously true: LLMs are trained as generative models to predict the probability of the next token, at least in the pre-training stage. Factually, there is nothing wrong with this. The issue arises when we call them just next-token predictors, as is often done in debates about whether they can be considered truly intelligent. A System 1 pass over the phrase may not ring any alarm bells. After all, no one ever made a big deal of the simple autocomplete function on the iPhone, and for good reason.
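To be concrete about the part that is factually true, here is a minimal sketch of the pre-training objective in PyTorch. The model interface and the function name are my own illustration, not anyone's actual training code; the point is simply that the loss only ever scores the probability assigned to the next token.

```python
# Minimal sketch (illustrative, not any particular lab's code) of the
# next-token-prediction objective used in LLM pre-training.
import torch
import torch.nn.functional as F

def next_token_loss(model, tokens):
    """tokens: LongTensor (batch, seq_len); model returns logits (batch, seq_len-1, vocab)."""
    logits = model(tokens[:, :-1])            # predictions for every position
    targets = tokens[:, 1:]                   # the "next token" at each position
    return F.cross_entropy(
        logits.reshape(-1, logits.size(-1)),  # flatten to (batch*(seq_len-1), vocab)
        targets.reshape(-1),
    )

# Training then just repeats:
#   loss = next_token_loss(model, batch); loss.backward(); optimizer.step()
```

Nothing in this objective says anything, one way or the other, about what the model has to build internally in order to drive the loss down.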
But we can quickly show how reductive arguments like this fall apart when applied to other cases. Take for instance the phrase “Humans are just gene propagators”. As with the LLM phrase, remove the word just and the sentence becomes trivially correct. Not only do a significant fraction of humans propagate their genes; we could even explain the existence of humans by their success as vehicles for propagating genes (à la The Selfish Gene). But, unless you’re some edgy teenage nihilist, I don’t think you would take this as a basis for rejecting the fields of cognitive science, psychology, sociology, etc. In other words, acknowledging that humans can, through one lens, be seen as gene propagators does not imply that this is the only, or even the most fundamental, lens. We could also say “Humans are just carbon oxidizers”. One could potentially find arguments for why this is a more fundamental description than “gene propagators”, but it would also miss the point completely: reducing humans to one aspect of their behavior or properties may yield interesting thought experiments, but it will invariably miss most of what matters about their nature. Finally, we may even end up reducing humans to a generative model, depending on how successful predictive processing theories of the human brain turn out to be.
Coming back to LLMs, we could reduce them even further: who cares about next-token prediction, all they do is loss function minimization! Actually, scratch that, LLMs are a useless concept, all that’s happening is a von Neumann machine stepping through its instructions… you get the point. None of these reductions negates the possibility that there is much more going on behind the scenes. After all, how does an LLM manage to predict the next token? What kinds of representations does it use? Does it have a world model? And so on. People who use the phrase also tend to be quite oblivious to existing work in mechanistic interpretability research.
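For a flavor of what looking behind the scenes means in practice, here is a hedged sketch of one of the simplest tools in that toolbox, a linear probe: pull hidden states out of a pretrained model and check whether some property of the input is linearly decodable from them. The model choice (GPT-2 via Hugging Face transformers), the layer, and the toy labels are all illustrative assumptions, not claims about what any particular model represents.

```python
# Sketch of a linear probe over hidden states (illustrative assumptions throughout:
# GPT-2 as the model, layer 6, and a toy "is the statement true?" label).
import torch
from transformers import AutoTokenizer, AutoModel
from sklearn.linear_model import LogisticRegression

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModel.from_pretrained("gpt2", output_hidden_states=True)
model.eval()

texts = ["Paris is the capital of France.", "Two plus two equals five."]
labels = [1, 0]  # toy labels, purely for illustration

features = []
with torch.no_grad():
    for text in texts:
        inputs = tokenizer(text, return_tensors="pt")
        outputs = model(**inputs)
        # hidden_states: one tensor of shape (1, seq_len, hidden_dim) per layer;
        # take a middle layer, mean-pooled over tokens, as the representation to probe.
        layer = outputs.hidden_states[6]
        features.append(layer.mean(dim=1).squeeze(0).numpy())

probe = LogisticRegression(max_iter=1000).fit(features, labels)
# If a probe like this generalizes to held-out statements, the representations carry
# more structure than "just next-token prediction" suggests on its face.
```

With two examples this is obviously a toy; the point is the shape of the question, which is an empirical one about representations rather than a definitional one about the loss.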
Much more could be said at this point about emergence, levels of abstraction, the relationship between consciousness and intelligence, and so on. But the main point I want to make is that pointing at a loss function is not a valid substitute for actually engaging with the behaviors and internal representations the model has developed.