Cameron R. Wolfe, Ph.D. (@cwolferesearch)


Continual learning is being positioned as a prerequisite for AGI (i.e., general systems must be adaptable).

I spent a large part of graduate school working on continual learning. Over the last month I've revisited the literature, considered its connections / relevance to LLMs, and captured everything in a long-form blog (scheduled for release tomorrow morning).

Most research on the topic of continual learning is very different from the LLM research we see today. The continual learning problem for LLMs is unique due to scaling. Prior knowledge and data are nearly infinite, which creates a lot of new considerations.

One could argue that the generality of LLMs makes continual learning easier, but it also increases the risk of catastrophic forgetting. Plus, pure model and data scale make the application of even basic techniques more difficult.

As an illustrative example, replay buffers are a common and simple technique for retaining a model's knowledge of prior data. But efficiently maintaining a replay buffer over a multi-trillion-token training corpus is an extremely complex systems problem. We might not even have access to the model's prior data.
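To make the replay idea concrete, here is a minimal toy sketch, assuming a fixed-size buffer filled by reservoir sampling (one common choice, not necessarily what any LLM pipeline uses). The names `ReservoirReplayBuffer`, `mixed_batch`, and `replay_fraction` are hypothetical and chosen for illustration only. It fits in memory and ignores every systems concern that makes this hard at multi-trillion-token scale.

```python
import random


class ReservoirReplayBuffer:
    """Fixed-size buffer holding a uniform random sample of all examples seen so far."""

    def __init__(self, capacity, seed=0):
        self.capacity = capacity
        self.items = []
        self.num_seen = 0
        self.rng = random.Random(seed)

    def add(self, example):
        # Reservoir sampling: every example seen so far ends up in the
        # buffer with equal probability capacity / num_seen.
        self.num_seen += 1
        if len(self.items) < self.capacity:
            self.items.append(example)
        else:
            j = self.rng.randrange(self.num_seen)
            if j < self.capacity:
                self.items[j] = example

    def sample(self, k):
        # Draw up to k stored examples without replacement.
        k = min(k, len(self.items))
        return self.rng.sample(self.items, k)


def mixed_batch(new_examples, buffer, replay_fraction=0.5):
    """Append replayed old examples to a batch of new ones, then store the new ones."""
    num_replay = int(len(new_examples) * replay_fraction)
    batch = list(new_examples) + buffer.sample(num_replay)
    for example in new_examples:
        buffer.add(example)
    return batch
```

Each training step on new data would call `mixed_batch(...)` so the model keeps seeing a slice of older data. The sketch assumes the old data is available at all, which, as noted above, may not hold.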

Despite these difficulties, it's possible that the tools we are currently using in LLM research naturally lend themselves to solving continual learning:

- Large-scale multi-task training is fundamental to how LLMs are trained.

- RL training naturally avoids forgetting of prior knowledge.

There are very few "free" wins at LLM scale, but RL might be one of them. Continual learning is not a completely disjoint problem from what we are already trying to solve in LLM research. Rather, continuing on the current trajectory (with modifications to consider different styles of learning) will yield natural progress.

Jan 26

12:49 AM
