Reflection 2024, Guesstimation 2025
1.
RL returns.
Test time compute.
Following OpenAI o1 and o3.
How should we reflect on the following?
Achieving AGI by scaling up GPT with next token prediction.
Achieving AGI by prompt engineering, e.g., CoT and ReAct, relying on LLMs’ (emergent) capabilities.
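A minimal sketch of what "prompt engineering" means here, using zero-shot chain-of-thought (CoT): the same question, with and without the reasoning cue appended. The question is illustrative, not from any specific benchmark.

```python
# Zero-shot CoT prompting, sketched as plain string construction.
# The "Let's think step by step." cue is the zero-shot CoT trigger phrase;
# everything else (question text, Q:/A: framing) is illustrative.

def plain_prompt(question: str) -> str:
    return f"Q: {question}\nA:"

def cot_prompt(question: str) -> str:
    # Appending the cue nudges the model to emit intermediate reasoning
    # before the final answer, relying on its emergent capabilities.
    return f"Q: {question}\nA: Let's think step by step."

q = ("A bat and a ball cost $1.10 in total. The bat costs $1.00 more "
     "than the ball. How much does the ball cost?")
print(plain_prompt(q))
print(cot_prompt(q))
```

The point of the technique is that nothing about the model changes; only the input string does.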
Why was there so little talk of RL in 2023?
(RLHF is inverse RL, or imitation learning, not “true” RL.)
2.
LLM progress has hit a wall.
Pre-training as we know it will end. Ilya Sutskever @ NeurIPS 2024
3.
2025 may be the year for naked swimmers.
Only when the tide goes out do you discover who's been swimming naked. Warren Buffett
AGI
Autonomous agents
Autonomous software engineering
… …
In the 3rd year of the LLM era, revenues may weigh more than visions.
Of course, it is great if you can find applications for LLMs that
do not care about mistakes, or
keep a human in the loop.
Test time compute may not help in general.
It may need a precise objective / reward function and reliable signals.
Maths and coding may be the easiest domains in the LLM era.
Yet even they are very hard.
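Why test time compute needs reliable signals can be seen in a toy best-of-n simulation (all numbers invented): sample n candidate answers and keep the one a verifier scores highest. With a reliable verifier, extra samples lift accuracy well above the base rate; with a coin-flip verifier, the extra compute buys nothing.

```python
import random

def best_of_n(n: int, p_correct: float, verifier_accuracy: float,
              rng: random.Random) -> bool:
    """Sample n candidates, each correct with prob p_correct; the verifier
    labels each candidate correctly with prob verifier_accuracy. Return
    whether the candidate with the highest verifier score is correct."""
    candidates = [rng.random() < p_correct for _ in range(n)]
    # Verifier score: the true label with prob verifier_accuracy, else flipped.
    scores = [c if rng.random() < verifier_accuracy else not c
              for c in candidates]
    best = max(range(n), key=lambda i: scores[i])
    return candidates[best]

def success_rate(n, p_correct, verifier_accuracy, trials=20000, seed=0):
    rng = random.Random(seed)
    return sum(best_of_n(n, p_correct, verifier_accuracy, rng)
               for _ in range(trials)) / trials

# Reliable verifier: 16 samples push accuracy far above the 0.3 base rate.
print(success_rate(16, 0.3, 0.95))
# Unreliable verifier: accuracy stays near the base rate despite 16x compute.
print(success_rate(16, 0.3, 0.5))
```

This is why maths and coding, where answers can be checked, are the natural first targets.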
4.
In 2025, test time compute may hit a wall.
People are building an information perpetual motion machine, by self-generation of data, self-evaluation of results, and self-improvement of models, all based on imperfect LLMs.
This is how many people are working with LLMs.
The same applies to test time compute, absent reliable external signals.
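The perpetual-motion point can be made concrete with a toy model (all numbers invented) of the self-generation / self-evaluation / self-improvement loop: the model keeps a generated sample when its own evaluator accepts it, and the next model's accuracy is taken to be the accuracy of the kept data. The loop gains information only if the evaluator can separate right from wrong independently of the generator's blind spots.

```python
def next_accuracy(acc: float, p_keep_correct: float, p_keep_wrong: float) -> float:
    """Accuracy of the kept (self-filtered) data, computed exactly."""
    kept_correct = acc * p_keep_correct          # correct samples that pass the filter
    kept_wrong = (1.0 - acc) * p_keep_wrong      # wrong samples that pass the filter
    return kept_correct / (kept_correct + kept_wrong)

def run(rounds: int, p_keep_correct: float, p_keep_wrong: float,
        acc: float = 0.7) -> list[float]:
    """Iterate the self-improvement loop, returning the accuracy trajectory."""
    history = [acc]
    for _ in range(rounds):
        acc = next_accuracy(acc, p_keep_correct, p_keep_wrong)
        history.append(acc)
    return history

# Reliable external signal: accepts 95% of correct and 5% of wrong samples.
# Accuracy climbs toward 1.0 round after round.
print(run(3, p_keep_correct=0.95, p_keep_wrong=0.05))
# Self-evaluator sharing the generator's blind spots: it accepts its own
# mistakes as readily as its successes, and the loop goes nowhere.
print(run(3, p_keep_correct=0.90, p_keep_wrong=0.90))
```

No motion without an external source of information: self-evaluation that merely mirrors the generator adds nothing.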
5.
Pre-training is only at its very beginning.
GPT with next token prediction may have hit a wall.
Lots of alternatives to explore though:
Alternatives to Transformers
Alternatives to GPT
Alternatives to next token prediction
Alternatives to self-supervised learning
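To make concrete what these alternatives would replace, here is a minimal sketch of the next token prediction objective that GPT-style pre-training optimizes: average cross-entropy of the model's predicted distribution over the next token. The toy "model" is a hand-specified bigram table (values invented), standing in for a Transformer.

```python
import math

def nll_next_token(tokens: list[str], probs: dict) -> float:
    """Average negative log-likelihood of each token given its predecessor.
    probs[prev][tok] is the model's probability of tok following prev."""
    total = 0.0
    for prev, tok in zip(tokens, tokens[1:]):
        total += -math.log(probs[prev][tok])
    return total / (len(tokens) - 1)

# Toy corpus and a hand-specified bigram "model" (values invented).
tokens = ["a", "b", "a", "b"]
probs = {"a": {"a": 0.1, "b": 0.9},
         "b": {"a": 0.8, "b": 0.2}}
print(nll_next_token(tokens, probs))
```

Every alternative listed above changes some part of this recipe: the architecture computing `probs`, the prediction target, or the loss itself.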
6.
RL may return fully. Or not.
People have finally discovered RL’s prowess for LLMs.
With test time compute.
Why not one step further?
Pre-training with RL?
Esp. for agents!
This requires resources.
This may require a paradigm shift: from generalist to specialist.
This may require another paradigm shift: from large to small.