Iterative improvements from feedback. Interact with the world; perceive the world; transform the world. Grounding and agency. Reinforcement learning is the natural and right framework.
Couldn't agree more. Your insight on applying AlphaGo's RL approach to LLMs is really spot on. The grounding and agency argument feels like the key for next-gen models. Fantastic read!
thx for more info about LLM with RL :)