1). Emu Video and Emu Edit - two new diffusion-based models for text-to-video generation and controlled image editing; Emu Video generates high-quality video from text-only, image-only, or combined text-and-image inputs; Emu Edit enables free-form image editing through text instructions. (papers | tweet)
2). Chain-of-Note - an approach to improve the robustness and reliability of retrieval-augmented language models when facing noisy, irrelevant documents and unknown scenarios; CoN generates sequential reading notes for the retrieved documents, evaluating their relevance to the given question before integrating the information into the final answer; CoN significantly outperforms standard retrieval-augmented language models, achieving an average improvement of +7.9 in EM score given entirely noisy retrieved documents and +10.5 in rejection rate for real-time questions that fall outside the pre-training knowledge scope; a minimal prompting sketch appears after this list. (paper | tweet)
3). LLMs for Scientific Discovery - explores the impact of large language models, particularly GPT-4, across various scientific fields including drug discovery, biology, and computational chemistry; assesses GPT-4's understanding of complex scientific concepts, its problem-solving capabilities, and its potential to advance scientific research through expert-driven case assessments and benchmark testing. (paper | tweet)
4). Fine-Tuning LLMs for Factuality - fine-tunes a language model for factuality without requiring human labeling; it learns from automatically generated factuality preference rankings and targets open-ended generation settings; it significantly improves the factuality of Llama-2 on held-out topics compared with RLHF or decoding strategies targeted at factuality; a preference-pair sketch appears after this list. (paper | tweet)
5). Contrastive CoT Prompting - proposes a contrastive chain-of-thought method to enhance language model reasoning; the approach provides both valid and invalid reasoning demonstrations to guide the model to reason step-by-step while reducing reasoning mistakes; also proposes an automatic method to construct contrastive demonstrations and reports improvements over standard CoT prompting; an illustrative prompt appears after this list. (paper | tweet)
6). A Survey on Language Models for Code - provides an overview of LLMs for code, including a review of 50+ models, 30+ evaluation tasks, and 500+ related works. (paper | tweet)
7). JARVIS-1 - an open-world agent that can perceive multimodal input (visual observations and human instructions), generate sophisticated plans, and perform embodied control within the open-world Minecraft universe; exhibits near-perfect performance across over 200 tasks; achieves a completion rate of 12.5% on the long-horizon diamond pickaxe task, a 5x increase over previous records. (paper | tweet)
8). Learning to Filter Context for RAG - proposes a method that improves the quality of the context provided to the generator via two steps: 1) identifying useful context based on lexical and information-theoretic approaches, and 2) training context filtering models that can filter retrieved contexts at inference time; outperforms existing approaches on extractive question answering (QA), complex multi-hop and long-form QA, fact verification, and dialog generation tasks; a lexical-filtering sketch appears after this list. (paper | tweet)
9). MART - proposes an approach for improving LLM safety with multi-round automatic red-teaming; incorporates automatic adversarial prompt writing and safe response generation, which increases red-teaming scalability and the safety of LLMs; the violation rate of an LLM with limited safety alignment drops by up to 84.7% after 4 rounds of MART, reaching performance comparable to LLMs with extensive adversarial prompt writing; the loop structure is sketched after this list. (paper | tweet)
10). LLMs can Deceive Users - explores the use of an autonomous stock trading agent powered by LLMs; finds that the agent acts upon insider tips and hides the reason behind its trading decisions; shows that helpful and safe LLMs can strategically deceive users in a realistic situation without direct instructions or training for deception. (paper | tweet)
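Below are a few minimal, illustrative sketches of the methods above. First, Chain-of-Note: a sketch of the prompting pattern, assuming a hypothetical `call_llm` completion function; the template wording is illustrative, not the paper's exact prompt.

```python
# Illustrative Chain-of-Note-style prompt (not the paper's exact template).
# `call_llm` is a hypothetical stand-in for any LLM completion client.

def call_llm(prompt: str) -> str:
    raise NotImplementedError("plug in an LLM client here")

CON_TEMPLATE = """Task: Answer the question using the retrieved passages.
First write a short reading note for each passage, stating whether it is
relevant to the question. Then answer using only the relevant passages.
If no passage is relevant and the answer is unknown, reply "unknown".

Question: {question}

{passages}

Reading notes and final answer:"""

def chain_of_note(question: str, docs: list[str]) -> str:
    passages = "\n".join(f"Passage {i + 1}: {d}" for i, d in enumerate(docs))
    return call_llm(CON_TEMPLATE.format(question=question, passages=passages))
```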
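For Fine-Tuning LLMs for Factuality, a hedged sketch of how automatically generated factuality preference pairs could be constructed for a DPO-style trainer; `sample_model` and `factuality_score` are hypothetical stand-ins for an LLM sampler and an automatic factuality estimator.

```python
# Hedged sketch: build preference pairs by sampling several completions and
# ranking them with an automatic factuality score (e.g., retrieval- or
# confidence-based). Both functions below are hypothetical stubs.

def sample_model(prompt: str, n: int) -> list[str]:
    raise NotImplementedError("plug in an LLM sampler here")

def factuality_score(text: str) -> float:
    raise NotImplementedError("plug in an automatic factuality estimator here")

def build_preference_pairs(prompts: list[str], n_samples: int = 4) -> list[dict]:
    pairs = []
    for prompt in prompts:
        ranked = sorted(sample_model(prompt, n_samples),
                        key=factuality_score, reverse=True)
        # Most factual completion is "chosen", least factual is "rejected";
        # the resulting pairs can be fed to any DPO-style trainer.
        pairs.append({"prompt": prompt,
                      "chosen": ranked[0],
                      "rejected": ranked[-1]})
    return pairs
```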
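For Contrastive CoT Prompting, an illustrative prompt pairing a valid and an invalid reasoning demonstration; the demonstrations are made up for illustration, not taken from the paper.

```python
# Illustrative contrastive chain-of-thought prompt: the model sees one
# correct and one incorrect worked example before the target question.
CONTRASTIVE_COT = """Question: Roger has 5 tennis balls. He buys 2 cans of
3 tennis balls each. How many tennis balls does he have now?
Correct reasoning: 2 cans x 3 balls = 6 new balls; 5 + 6 = 11. Answer: 11.
Incorrect reasoning: 5 balls + 2 cans = 7. Answer: 7.

Question: {question}
Correct reasoning:"""

def contrastive_cot_prompt(question: str) -> str:
    return CONTRASTIVE_COT.format(question=question)
```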
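For Learning to Filter Context for RAG, a minimal sketch of the lexical side of step 1: rank retrieved passages by unigram overlap with the query and keep the top-k. The actual method also uses information-theoretic measures and trains a dedicated filtering model.

```python
# Minimal lexical context filter: score passages by unigram overlap with
# the query and keep the top-k. Purely illustrative.

def lexical_overlap(query: str, passage: str) -> float:
    q = set(query.lower().split())
    p = set(passage.lower().split())
    return len(q & p) / max(len(q), 1)

def filter_contexts(query: str, passages: list[str], k: int = 2) -> list[str]:
    ranked = sorted(passages, key=lambda p: lexical_overlap(query, p),
                    reverse=True)
    return ranked[:k]
```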
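For MART, a hedged sketch of the multi-round loop; `adversary`, `target`, `is_unsafe`, and `safe_rewrite` are hypothetical interfaces standing in for the paper's adversarial LLM, target LLM, safety classifier, and safe-response generator.

```python
# Hedged sketch of a MART-style multi-round red-teaming loop. All four
# callables below are hypothetical stand-ins, not the paper's actual code.

def mart(adversary, target, is_unsafe, safe_rewrite, seed_prompts, rounds=4):
    prompts = list(seed_prompts)
    for _ in range(rounds):
        attacks = [adversary.generate(p) for p in prompts]
        replies = [target.generate(a) for a in attacks]
        # Keep only the attacks that actually elicited unsafe replies.
        hits = [a for a, r in zip(attacks, replies) if is_unsafe(r)]
        # The adversary fine-tunes on its successful attacks; the target
        # fine-tunes on safe responses to those same attacks.
        adversary.fine_tune(hits)
        target.fine_tune([(a, safe_rewrite(a)) for a in hits])
        prompts = hits or prompts  # seed the next round with what worked
    return target
```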