Google's Alternative to RAG: Retrieval Interleaved Generation (RIG)
Wouldn't it be great if LLMs could accurately answer questions like:
- What's the current population of New York City?
- How many new COVID-19 cases were reported in the US last week?
- What percentage of the world's population lives below the poverty line?
That's exactly what Google is promising with a new approach that integrates LLMs with Data Commons, an open-source repository of public statistical data.
By grounding LLM outputs in verifiable, up-to-date data, the researchers aim to unlock a new level of reliability.
The researchers explored two methods:
1. Retrieval Interleaved Generation (RIG): The LLM is fine-tuned to interleave natural language Data Commons queries with the statistics it generates, so each figure it produces can be checked against the retrieved value (see the first sketch below).
2. Retrieval Augmented Generation (RAG): Relevant data tables are retrieved from Data Commons and added to the LLM's prompt before it answers (see the second sketch below).
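To make the RIG idea concrete, here is a minimal Python sketch of the post-processing step, assuming the model emits markers that pair a Data Commons query with its own guessed value. The marker format, the `query_data_commons` helper, and the lookup values are illustrative assumptions, not the DataGemma implementation.

```python
import re

# Placeholder lookup -- a real implementation would call the Data Commons
# API; the entry below is illustrative only.
_FAKE_DATA_COMMONS = {
    "what is the population of new york city": "8,335,897 (2022)",
}

def query_data_commons(natural_language_query: str) -> str | None:
    """Return a statistic for the query, or None if nothing is found."""
    return _FAKE_DATA_COMMONS.get(natural_language_query.lower())

# Assume the fine-tuned model interleaves markers such as
#   [DC("what is the population of New York City") -> "8.8 million"]
# pairing a Data Commons query with the model's own guessed value.
MARKER = re.compile(r'\[DC\("(?P<query>[^"]+)"\)\s*->\s*"(?P<guess>[^"]+)"\]')

def ground_with_rig(generated_text: str) -> str:
    """Swap each guessed statistic for the retrieved value when available."""
    def substitute(match: re.Match) -> str:
        retrieved = query_data_commons(match.group("query"))
        # Keep the model's guess if retrieval returns nothing.
        return retrieved if retrieved is not None else match.group("guess")
    return MARKER.sub(substitute, generated_text)

print(ground_with_rig(
    'New York City has [DC("what is the population of New York City") -> "8.8 million"] residents.'
))
# -> New York City has 8,335,897 (2022) residents.
```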
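For comparison, a minimal RAG sketch: retrieve relevant tables first, then prepend them to the prompt. `retrieve_tables` and `call_llm` are hypothetical stand-ins for the Data Commons retrieval step and the model call; they are not part of the DataGemma release.

```python
def retrieve_tables(question: str) -> list[str]:
    """Return Data Commons tables relevant to the question, serialized as text."""
    return [
        "Variable: Count_Person | Place: New York City | 2022: 8,335,897",
    ]  # placeholder row for illustration

def call_llm(prompt: str) -> str:
    """Stand-in for a call to the underlying LLM."""
    raise NotImplementedError

def answer_with_rag(question: str) -> str:
    # 1. Retrieve data tables relevant to the question.
    tables = retrieve_tables(question)
    # 2. Prepend the retrieved tables so the model can ground its answer.
    prompt = (
        "Answer using only the tables below and cite the figures you use.\n\n"
        + "\n".join(tables)
        + f"\n\nQuestion: {question}"
    )
    # 3. Generate the final, data-grounded answer.
    return call_llm(prompt)
```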
Both methods showed significant improvements in the factual accuracy of LLM outputs across a range of queries.
To learn more about DataGemma and RIG, read my latest article on LLM Watch: