Yet another Meta language model: Atlas

Tech giant Meta has released a new language model named Atlas. It is a retrieval-augmented language model with strong few-shot performance on question-answering and fact-checking tasks, Meta adds.

In the paper titled “Few-shot Learning with Retrieval Augmented Language Models,” the researchers say they evaluated the model on a variety of tasks, including MMLU, KILT, and NaturalQuestions. The model achieves 42% accuracy on NaturalQuestions using only 64 examples and outperforms PaLM (a 540B-parameter model) by 3% despite having roughly 50 times fewer parameters (11B).

Retrieval-augmented model

In the paper, the researchers explain the motivation for the model. They note that LLMs have already shown few-shot abilities, but for question answering and fact-checking, where knowledge is essential, “a massive number of parameters to store knowledge seem to be needed.”

This is where retrieval-augmented models come in, as they can perform knowledge-intensive tasks without needing as many parameters. The researchers add that they wanted to see whether these models work in few-shot settings.

“We investigate whether few-shot learning requires models to store a large amount of information in their parameters, and whether memorization can be decoupled from generalization,” the researchers say.

According to the researchers, Atlas retrieves relevant documents with a general-purpose dense retriever that uses a dual-encoder architecture based on Contriever. The retrieved documents are then processed by a sequence-to-sequence model that uses the Fusion-in-Decoder architecture.

Image: Few-shot Learning with Retrieval Augmented Language Models

The researchers investigate how different techniques for training Atlas affect its few-shot performance on tasks such as fact-checking and question answering. “We find that joint pre-training of components is crucial for few-shot performance,” the paper adds. The model performs well in both few-shot and resource-rich settings. It demonstrates SOTA results on few-shot NaturalQuestions (+2.8%), TriviaQA (+3.3%) and FEVER (+5.1%). Atlas is also very strong in the traditional full-training-set setting, establishing a new state of the art on NaturalQuestions by 8%, on TriviaQA by 9%, and on 5 KILT tasks, Meta says.

Image: Few-shot Learning with Retrieval Augmented Language Models

Architecture

The research team follows the text-to-text framework. Tasks follow this path:

  • The system receives a text query as input
  • It generates text output

For classification tasks, the query is a textual input and the model generates the “lexicalized class label”, that is, the class written out as words rather than as a numeric ID.
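To make the text-to-text framing concrete, here is a minimal sketch of how a fact-checking example could be turned into a text query and a lexicalized label. The prompt template, the label words, and the function names are assumptions chosen for illustration; the article does not give the exact format used in the paper.

```python
from typing import Optional, Tuple

# Hypothetical label vocabulary (assumption): each class ID maps to plain words.
LABEL_WORDS = {0: "supports", 1: "refutes", 2: "not enough info"}


def to_text_pair(claim: str, label_id: int) -> Tuple[str, str]:
    """Turn a classification example into (text query, text target)."""
    query = f"claim: {claim} answer:"  # hypothetical template
    target = LABEL_WORDS[label_id]     # the lexicalized class label
    return query, target


def from_generated_text(generated: str) -> Optional[int]:
    """Map generated text back to a class ID, or None if it matches no label."""
    cleaned = generated.strip().lower()
    for label_id, words in LABEL_WORDS.items():
        if cleaned == words:
            return label_id
    return None


query, target = to_text_pair("Paris is the capital of France", 0)
print(query)   # claim: Paris is the capital of France answer:
print(target)  # supports
```

Because both the input and the output are plain text, the same model can handle question answering and classification without a task-specific output layer.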

Image: Few-shot Learning with Retrieval Augmented Language Models

The model is based on two sub-models, according to the paper.

  • The retriever – The retriever is based on Contriever, an information retrieval technique that uses continuous dense embeddings.
  • Language model – The team uses the T5 sequence-to-sequence architecture with the Fusion-in-Decoder modification, which processes each document independently in the encoder.
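The Fusion-in-Decoder idea can be sketched in a few lines: each retrieved document is paired with the query and encoded on its own, and the resulting encoder outputs are concatenated into one sequence for the decoder to attend over. The `toy_encode` function below is a hypothetical stand-in for a real T5 encoder, and the `question:` / `context:` template is an assumption for illustration.

```python
from typing import List

Vector = List[float]


def toy_encode(text: str, dim: int = 4) -> List[Vector]:
    """Stand-in encoder (assumption): one deterministic vector per token."""
    return [
        [float((len(token) + i) % 7) for i in range(dim)]
        for token in text.split()
    ]


def fusion_in_decoder_inputs(query: str, documents: List[str]) -> List[Vector]:
    """Encode each (query, document) pair independently, then concatenate."""
    fused: List[Vector] = []
    for doc in documents:
        # Each document is encoded without seeing the other documents.
        encoded = toy_encode(f"question: {query} context: {doc}")
        # The decoder would attend over this single concatenated sequence.
        fused.extend(encoded)
    return fused


fused = fusion_in_decoder_inputs("who wrote hamlet", ["a b", "c d e"])
print(len(fused))  # 15
```

Encoding documents separately keeps the encoder cost linear in the number of documents, while the decoder can still combine evidence from all of them.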

For any task, from answering a question to generating articles, the model follows a similar approach. It starts by using the retriever to fetch the top-k relevant documents from a large text corpus. These documents are then passed, along with the query, to the language model, which generates the output. Both the retriever and the language model are based on pre-trained transformer networks, according to the paper.
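The retrieve-then-generate flow described above can be sketched as follows. This is a toy illustration, not Atlas itself: `embed` uses word counts as a stand-in for learned dense embeddings, the dot-product scoring is a simplification of dual-encoder scoring, and `generate` is a hypothetical callback standing in for the language model.

```python
from collections import Counter
from typing import Callable, List


def embed(text: str) -> Counter:
    """Toy stand-in (assumption) for a learned dense encoder: word counts."""
    return Counter(text.lower().split())


def dot(a: Counter, b: Counter) -> float:
    """Similarity between a query embedding and a document embedding."""
    return float(sum(a[token] * b[token] for token in a))


def retrieve_top_k(query: str, corpus: List[str], k: int) -> List[str]:
    """Return the k documents with the highest similarity to the query."""
    q = embed(query)
    ranked = sorted(corpus, key=lambda doc: dot(q, embed(doc)), reverse=True)
    return ranked[:k]


def answer(query: str, corpus: List[str], k: int,
           generate: Callable[[str, List[str]], str]) -> str:
    """Retrieve documents, then hand query and documents to the generator."""
    docs = retrieve_top_k(query, corpus, k)
    return generate(query, docs)


corpus = [
    "Paris is the capital of France",
    "The Nile is a river in Africa",
    "Berlin is the capital of Germany",
]
print(retrieve_top_k("capital of France", corpus, k=2))
# ['Paris is the capital of France', 'Berlin is the capital of Germany']
```

Since the corpus is just an argument to `retrieve_top_k`, it can be swapped or updated without touching the generator, which mirrors the updatability benefit Meta describes.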

“Atlas outperforms much larger non-augmented models on few-shot question answering (NaturalQuestions and TriviaQA) and fact-checking (FEVER), and is competitive with various very large models on a wide range of real-world exams,” adds Meta.

Meta also points to other benefits of Atlas. The retrieved passages can be inspected for better interpretability, and the corpus that Atlas retrieves from can be modified or even completely swapped out. This means Atlas can be kept up to date without needing to be retrained.

About Clara Barnard
