Fine-Tuning Large Language Models with Synthetic Data Augmentation for Multi-Class Toxicity Detection in Gaming Chat
Oral presentation at ACL 2026 for the EEUCA 2026 Shared Task on Understanding Toxic Behavior in Gaming Communities.
Our system classifies World of Tanks chat messages into six toxicity categories using Llama 3.1 8B fine-tuned with LoRA and a calibrated 5% synthetic data augmentation. Our analysis reveals a “validation trap”: high validation performance correlates with poor transfer to the test set.
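To make the 5% augmentation concrete, here is a minimal sketch of how a small share of synthetic examples could be mixed into a training set. It is an illustration, not the system's actual pipeline: the function name `mix_synthetic`, the placeholder labels, and the reading of "5%" as the synthetic count relative to the real count are all assumptions.

```python
import random


def mix_synthetic(real, synthetic, ratio=0.05, seed=0):
    """Return the real examples plus a random sample of synthetic ones.

    Assumption: `ratio` is the synthetic count divided by the real count,
    so ratio=0.05 adds 5 synthetic examples per 100 real ones.
    """
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("ratio must be in [0, 1]")
    # Cap the request at the number of synthetic examples available.
    n_syn = min(len(synthetic), round(ratio * len(real)))
    rng = random.Random(seed)  # fixed seed keeps the mix reproducible
    sampled = rng.sample(synthetic, n_syn)
    mixed = list(real) + sampled
    rng.shuffle(mixed)
    return mixed


# Hypothetical toy data; the labels are placeholders, not the task's categories.
real = [(f"real message {i}", "class_0") for i in range(1000)]
synthetic = [(f"synthetic message {i}", "class_1") for i in range(200)]
train = mix_synthetic(real, synthetic)
```

Keeping the ratio as an explicit parameter makes it easy to compare several augmentation levels under the same seed.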
Result: 4th place out of 35 participating teams (F1-macro: 0.6234).
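For readers unfamiliar with the metric, the sketch below shows how macro-averaged F1 is computed: F1 is calculated per class and the per-class scores are averaged with equal weight, so rare categories count as much as frequent ones. The helper `macro_f1` and the toy labels are illustrative assumptions; scoring a class with no true or predicted instances as 0 is one common convention and may differ from the shared task's official scorer.

```python
def macro_f1(y_true, y_pred, labels):
    """Unweighted mean of per-class F1 scores."""
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); assumed 0 when the class never appears.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy example with three hypothetical classes.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
score = macro_f1(y_true, y_pred, labels=[0, 1, 2])
print(f"macro-F1: {score:.4f}")
```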
