Srikar Kashyap Pulipaka

NLP Researcher · Multilingual NLP · Synthetic Data · LLM Evaluation

Cummins Inc

Indiana University (MS CS)

United States

I’m an NLP researcher interested in multilingual NLP, synthetic data augmentation, and LLM evaluation. I build systems that make language models work reliably across languages and domains.

At ACL 2026, I’m presenting a poster on multilingual polarization detection (2nd place, SemEval Task 9) and giving a talk on toxicity detection in gaming chat (4th/35, EEUCA 2026).

I currently work at Cummins Inc, where I build RAG and agentic systems for enterprise knowledge retrieval. Previously, I was a research assistant at the Kelley Data Science and AI Lab at Indiana University, working on LLM fine-tuning for mental health QA and stance detection. I hold an MS in Computer Science from Indiana University.

Research interests: Multilingual text classification, synthetic data generation and quality filtering, preference optimization, low-resource NLP, retrieval-augmented generation, LLM-as-judge evaluation.

Feel free to reach out — srikar.kashyap@gmail.com.

news

Jul 01, 2026 Poster at ACL 2026: Multilingual Polarization Detection (SemEval Task 9, 2nd place).
Jul 01, 2026 Talk at ACL 2026: Toxicity Detection in Gaming Chat (EEUCA, 4th/35).
Jan 01, 2025 HICSS 2025 — ML OSS vulnerability assessment pipeline published.
Sep 01, 2024 IUCL at PAN 2024 — Conspiracy theory detection (1st place).

selected publications

  1. SemEval
    PSK at SemEval-2026 Task 9: Multilingual Polarization Detection Using Ensemble Gemma Models with Synthetic Data Augmentation
    Srikar Kashyap Pulipaka
    arXiv preprint arXiv:2605.05159, 2026. Interactive results explorer (draft). 2nd place overall, 22 languages.
  2. EEUCA
    PSK@ EEUCA 2026: Fine-Tuning Large Language Models with Synthetic Data Augmentation for Multi-Class Toxicity Detection in Gaming Chat
    Srikar Kashyap Pulipaka
    arXiv preprint arXiv:2605.07201, 2026. 4th place out of 35 teams (F1-macro: 0.6234).
  3. PAN
    IUCL at PAN 2024: Using Data Augmentation for Conspiracy Theory Detection
    S Mhalgi, S K Pulipaka, and S Kübler
    In CLEF Working Notes, 2024. 1st place, F1 0.83.