Google's Alternative to RAG: Retrieval Interleaved Generation (RIG)
Wouldn't it be great if LLMs could accurately answer questions like:
- What's the current population of New York City?
- How many new COVID-19 cases were reported in the US last week?
- What percentage of the world's population lives below the poverty line?
That's exactly what Google is promising with a new approach that integrates LLMs with Data Commons, an open-source repository of public statistical data.
By grounding LLM outputs in verifiable, up-to-date data, the researchers aim to unlock a new level of reliability.
The researchers explored two methods:
1. Retrieval Interleaved Generation (RIG): The LLM is fine-tuned to interleave natural language Data Commons queries with the statistics it generates, so each figure it produces can be checked against the retrieved value (see the first sketch below).
2. Retrieval Augmented Generation (RAG): Relevant data tables are retrieved from Data Commons and added to the LLM's prompt before it answers (see the second sketch below).
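To make the RIG idea concrete, here is a minimal Python sketch of the post-processing step, assuming the model emits markers that pair a Data Commons query with its own guessed value. The marker format, the `query_data_commons` helper, and the lookup values are illustrative assumptions, not the DataGemma implementation.

```python
import re

# Placeholder lookup -- a real implementation would call the Data Commons
# API; the entry below is illustrative only.
_FAKE_DATA_COMMONS = {
    "what is the population of new york city": "8,335,897 (2022)",
}

def query_data_commons(natural_language_query: str) -> str | None:
    """Return a statistic for the query, or None if nothing is found."""
    return _FAKE_DATA_COMMONS.get(natural_language_query.lower())

# Assume the fine-tuned model interleaves markers such as
#   [DC("what is the population of New York City") -> "8.8 million"]
# pairing a Data Commons query with the model's own guessed value.
MARKER = re.compile(r'\[DC\("(?P<query>[^"]+)"\)\s*->\s*"(?P<guess>[^"]+)"\]')

def ground_with_rig(generated_text: str) -> str:
    """Swap each guessed statistic for the retrieved value when available."""
    def substitute(match: re.Match) -> str:
        retrieved = query_data_commons(match.group("query"))
        # Keep the model's guess if retrieval returns nothing.
        return retrieved if retrieved is not None else match.group("guess")
    return MARKER.sub(substitute, generated_text)

print(ground_with_rig(
    'New York City has [DC("what is the population of New York City") -> "8.8 million"] residents.'
))
# -> New York City has 8,335,897 (2022) residents.
```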
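For comparison, a minimal RAG sketch: retrieve relevant tables first, then prepend them to the prompt. `retrieve_tables` and `call_llm` are hypothetical stand-ins for the Data Commons retrieval step and the model call; they are not part of the DataGemma release.

```python
def retrieve_tables(question: str) -> list[str]:
    """Return Data Commons tables relevant to the question, serialized as text."""
    return [
        "Variable: Count_Person | Place: New York City | 2022: 8,335,897",
    ]  # placeholder row for illustration

def call_llm(prompt: str) -> str:
    """Stand-in for a call to the underlying LLM."""
    raise NotImplementedError

def answer_with_rag(question: str) -> str:
    # 1. Retrieve data tables relevant to the question.
    tables = retrieve_tables(question)
    # 2. Prepend the retrieved tables so the model can ground its answer.
    prompt = (
        "Answer using only the tables below and cite the figures you use.\n\n"
        + "\n".join(tables)
        + f"\n\nQuestion: {question}"
    )
    # 3. Generate the final, data-grounded answer.
    return call_llm(prompt)
```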
Both methods showed significant improvements in the factual accuracy of LLM outputs across a range of queries.
To learn more about DataGemma and RIG, read my latest article on LLM Watch: