Srikar Kashyap Pulipaka

NLP Researcher · Multilingual NLP · Synthetic Data · LLM Evaluation

Cummins Inc

Indiana University (MS CS)

United States

I build NLP systems for multilingual classification, synthetic data generation, and LLM evaluation.

At ACL 2026, I’m presenting work on multilingual polarization detection (SemEval Task 9, 2nd overall) and toxicity detection in gaming chat (EEUCA, 4th/35). I currently work on RAG and agentic systems at Cummins.

srikar.kashyap@gmail.com

news

Jul 01, 2026 Poster at ACL 2026: Multilingual Polarization Detection (SemEval Task 9, 2nd place).
Jul 01, 2026 Talk at ACL 2026: Toxicity Detection in Gaming Chat (EEUCA, 4th/35).
Jan 01, 2025 HICSS 2025: ML OSS vulnerability assessment pipeline published.

selected publications

  1. SemEval
    PSK at SemEval-2026 Task 9: Multilingual Polarization Detection Using Ensemble Gemma Models with Synthetic Data Augmentation
    Srikar Kashyap Pulipaka
    arXiv preprint arXiv:2605.05159, 2026. Paper website with pipeline, results, and analysis. 2nd place overall, 22 languages.
  2. EEUCA
    PSK@ EEUCA 2026: Fine-Tuning Large Language Models with Synthetic Data Augmentation for Multi-Class Toxicity Detection in Gaming Chat
    Srikar Kashyap Pulipaka
    arXiv preprint arXiv:2605.07201, 2026. Paper website with pipeline, results, and validation-trap analysis. 4th place out of 35 teams (F1-macro: 0.6234).
  3. PAN
    IUCL at PAN 2024: Using Data Augmentation for Conspiracy Theory Detection
    S Mhalgi, S K Pulipaka, and S Kübler
    In CLEF Working Notes, 2024. 1st place, F1 0.83.