Figure: An overview of the instruction backtranslation method (source). Image credit: Meta AI.
This paper introduces "instruction backtranslation", a data-augmentation method for building instruction-following models. By iteratively self-augmenting and self-curating training data, the authors construct high-quality instruction-following examples. The resulting approach surpasses non-distilled baselines across multiple benchmarks and shows strong zero-shot performance on a diverse range of NLP tasks.
AI systems that interact with users through natural language are transforming the digital landscape. However, building instruction-following models is expensive and slow, largely because of the cost of collecting and labeling data. To address this, the authors introduce a method for generating high-quality, self-supervised training data, enabling instruction-following models to be scaled up.
The authors present a novel "instruction backtranslation" approach to data augmentation, built on two key steps: self-augmentation and self-curation. The process starts with a seed set of human-annotated (instruction, output) examples and a web corpus of unlabelled documents. During self-augmentation, the model generates candidate instructions for the unlabelled documents; during self-curation, it selects the highest-quality (instruction, output) pairs for fine-tuning the base model. Repeating this cycle yields progressively stronger instruction-following models.
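The iterative loop described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the helpers `generate_instruction`, `score_quality`, and `fine_tune` are hypothetical stand-ins for real model calls and training runs.

```python
def generate_instruction(model, output_text):
    # Self-augment: a "backward" model predicts an instruction that the
    # unlabelled web text could answer. Stubbed here for illustration.
    return f"Write a passage about: {output_text.split()[0]}"

def score_quality(model, instruction, output_text):
    # Self-curate: the current model rates the candidate pair. A toy
    # heuristic (longer outputs score higher, capped at 5) stands in
    # for a real model-based quality score.
    return min(5, 1 + len(output_text.split()) // 3)

def fine_tune(model, examples):
    # Placeholder: real training would update model weights on `examples`.
    return {"trained_on": len(examples)}

def instruction_backtranslation(seed_examples, web_corpus,
                                iterations=2, threshold=4):
    model = fine_tune(None, seed_examples)  # M0: fit on the seed set only
    for _ in range(iterations):
        # Self-augment: label each unlabelled document with an instruction.
        candidates = [(generate_instruction(model, doc), doc)
                      for doc in web_corpus]
        # Self-curate: keep only high-scoring (instruction, output) pairs.
        curated = [(ins, out) for ins, out in candidates
                   if score_quality(model, ins, out) >= threshold]
        # Retrain on seed data plus curated data for the next iteration.
        model = fine_tune(model, seed_examples + curated)
    return model
```

The key design point is that the same model family plays both roles: it proposes instructions for raw text and then judges which of the resulting pairs are good enough to train on.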
Using 3,200 examples from the Open Assistant dataset as human-annotated seed data, the authors fine-tune the pre-trained LLaMA model at several sizes (7B, 33B, 65B). The English portion of the ClueWeb corpus serves as the source of unlabelled documents.
When compared against instruction datasets drawn from diverse sources, the authors' instruction backtranslation approach performs strongly. Their "Humpback" model outperforms other non-distilled models at both the 65B and 33B scales. Human evaluations confirm the improvement, and further testing on commonsense reasoning benchmarks and the massive multitask language understanding (MMLU) benchmark shows higher zero-shot accuracy across all domains.
Figure: Humpback is preferred to open-source models in human evaluation (source).
The authors run ablation studies on the two pillars of their method: data selection quality and joint training. Notably, self-curation performance improves in the second iteration, yielding better instruction-following results.
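Since data selection quality is central to the ablations, it helps to see what a curation step might look like in isolation. The sketch below assumes a prompt-based rating scheme in which the model grades each candidate pair on a 1-to-5 scale and only top-rated pairs are kept; the prompt wording and the `llm` callable are illustrative assumptions, not the paper's exact setup.

```python
# Hypothetical prompt template for the self-curation rating step.
CURATION_PROMPT = (
    "Below is an instruction and a candidate response. "
    "Rate the response's quality from 1 to 5.\n"
    "Instruction: {instruction}\nResponse: {output}\nScore:"
)

def curate(llm, candidates, keep_score=5):
    """Keep only the (instruction, output) pairs that the model itself
    rates at `keep_score` or above."""
    kept = []
    for instruction, output in candidates:
        reply = llm(CURATION_PROMPT.format(instruction=instruction,
                                           output=output))
        # Parse the first digit in the model's reply as the score.
        digits = [int(ch) for ch in reply if ch.isdigit()]
        score = digits[0] if digits else 0
        if score >= keep_score:
            kept.append((instruction, output))
    return kept
```

Because the rating model is itself retrained each round, the filter sharpens over iterations, which is consistent with the second-iteration gains the ablations report.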
The innovative "instruction backtranslation" technique for data augmentation empowers the expansion of instruction-following models through iterative self-augmentation and self-curation of data.
This approach outperforms non-distilled baselines across numerous benchmarks and demonstrates robust zero-shot performance on a wide range of NLP tasks.
Repeated self-curation improves data selection quality, which translates into better instruction-following performance.
The "Humpback" model is well suited to applications that demand natural language understanding and generation, such as chatbots, virtual assistants, and customer support systems, where following instructions and producing accurate responses is essential. Its iterative data augmentation and curation technique markedly improves instruction-following performance, making it a valuable tool for developers and organizations looking to strengthen their AI-driven conversational systems (source).
Note: This article is also shared on Medium.