RAG for Enterprise Product Search

Built a production retrieval-augmented generation (RAG) pipeline at Cummins that answers product-related queries from internal store data.

Pipeline:

  • Extracted raw string data from SQL tables and used LLMs to generate structured Pydantic outputs, with the outputs validated by human domain experts
  • Implemented hybrid retrieval combining dense embedding similarity with BM25 keyword search, followed by cross-encoder reranking
  • Evaluated top-k retrieval accuracy using both human evaluation and automated LLM-as-judge metrics
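
To illustrate the hybrid retrieval step, here is a minimal sketch of how a dense ranking and a BM25 ranking might be merged before reranking. The project description does not say which fusion method was used; reciprocal rank fusion (RRF) is assumed here as one common choice, and the function name, the constant k=60, and the document IDs are all hypothetical.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into a single ranking.

    Each document receives 1 / (k + rank) from every list it appears in,
    so documents ranked highly by both retrievers rise to the top.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by document ID
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical top-3 results from each retriever (assumed IDs)
dense_hits = ["sku-12", "sku-07", "sku-33"]  # embedding similarity
bm25_hits = ["sku-07", "sku-45", "sku-12"]   # keyword match

fused = reciprocal_rank_fusion([dense_hits, bm25_hits])
print(fused)
```

In a pipeline like the one described, the fused candidates would then be passed to the cross-encoder, which scores each (query, document) pair to produce the final order. That step is omitted here because it depends on the specific model used.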

Tech: Python, LangChain, ChromaDB, SQL, Pydantic
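
As a rough illustration of the top-k retrieval accuracy metric mentioned in the pipeline, the sketch below computes a hit rate at k: the fraction of queries for which at least one relevant document appears in the top k results. The exact metric definition used in the project is not stated, so this is an assumption, and the function name, queries, and relevance labels are hypothetical. The labels could come from either human reviewers or an LLM judge.

```python
def hit_rate_at_k(retrieved, relevant, k):
    """Fraction of queries with at least one relevant doc in the top k.

    retrieved: dict mapping query ID to a ranked list of document IDs
    relevant:  dict mapping query ID to a set of relevant document IDs
    """
    if not retrieved:
        return 0.0
    hits = 0
    for query_id, ranked_ids in retrieved.items():
        # A query counts as a hit if any of its top-k docs is labeled relevant
        if set(ranked_ids[:k]) & relevant.get(query_id, set()):
            hits += 1
    return hits / len(retrieved)


# Hypothetical retrieval results and relevance judgments
retrieved = {
    "q1": ["a", "b", "c"],
    "q2": ["d", "e", "f"],
    "q3": ["g", "h", "i"],
    "q4": ["j", "k", "l"],
}
relevant = {"q1": {"b"}, "q2": {"x"}, "q3": {"i"}, "q4": {"j"}}

for k in (1, 2, 3):
    print(k, hit_rate_at_k(retrieved, relevant, k))
```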