Kaustubh Dholé


View My GitHub Profile

I’m a PhD candidate at Emory University’s Department of Computer Science working with Prof. Eugene Agichtein. I deal with a wide variety of problems which generally fall under Natural Language Processing & Information Retrieval.

I completed my bachelor’s at BITS Pilani, India, and spent a year and a half at the Tata Institute of Fundamental Research (TIFR, Mumbai). After that, I worked for 6 years building the AI Agent Amelia at IPsoft Amelia.ai (now SoundHound) in the wonderful cities of Bangalore & New York, collaborating with Prof. Chris Manning. I was fortunate to work with the leadership of Amelia, including Uday Chinta and Chetan Dube, and with many fellow managers of other teams. I led the R&D team of around 15-20 engineers and scientists working on diverse NLP topics: dialogue modeling (NLU and NLG), VerbNet & PropBank parsing, KB-based QA, data augmentation, semantic parsing, relation extraction & dialog retrieval, ranking, and generation, other real-life conversational AI problems (handling agent fallback, agent ambiguity, etc.), and offline AI such as building interfaces for efficient and smart data annotation and active learning. Much of the work also involved managing a team of back-end, front-end, and UX developers who built different modules of the Amelia stack.

In the summers of the past 4 years (2022 to 2025), I collaborated with several teams at Amazon: the Natural Understanding Team (now Alexa AGI) at Amazon Alexa in New York and San Jose, on multi-task learning for their LLMs and creating simulators for training LLMs; the Search Experience Science team in Seattle ⛰️; and the Stores Foundational AI team, on Pretraining and Midtraining Dataset Valuation for Reasoning Tasks.

Currently, I am focusing on:

- RAG Evaluation
- Identifying Datasets for Pretraining and Midtraining for Reasoning-Based Tasks
- Post-Training for Math Reasoning

Experience

Applied Scientist (Summers)
~1 year (22, 23, 24, 25)
📍Amazon AGI, Alexa Search Experience Science, Stores Foundational AI
AI R&D Lead (Science)
~3.5 years 2017–2021
📍IPsoft Amelia.ai (now SoundHound), Bangalore & New York
AI R&D Engineer (Science)
~3 years 2015–2017
📍IPsoft Amelia.ai (now SoundHound), Bangalore & New York
Researcher
~1.5 years
📍TIFR, Mumbai

Education

AI PhD Researcher
RAG Evaluation (Present)
📍Emory University
Masters (CS Track)
RAG, Evaluation
📍Emory University
Engineering (Hons)
Electrical Engineering
📍BITS Pilani
Areas of Interest:
Reasoning, NLG Evaluation, Retrieval, Retrieval Augmented Generation, RAG Evaluation
Other Areas I'm happy to collaborate or have coffee chats on:
Dialog Systems, Graph Neural Networks, Data Augmentation, Efficient Transformers, Privacy Preserving ML, Bigger Picture of LLMs

Publications:
The most up-to-date stuff can be found on Semantic Scholar and Google Scholar.

Workshops:
- Co-organizer of the Generation, Evaluation & Metrics Workshops GEM 2021, GEM 2022, GEM 2023, and GEM 2025.
- Co-organizer of the wisdom-of-researchers collaboration to create the largest data augmentation repository, NL-Augmenter, and a key contributor to the LLM task benchmark BIG-Bench.
Recent Mentoring/Speaking:
- Presented some of the work on RAG evaluation at the Workshop on Task Focussed IR in the Era of Generative AI at Microsoft Research, Redmond
- Gave a talk on Retrieval Augmented Generation at the University of Edinburgh in 2024, during my visit to present LLM-based reformulation at ECIR 2024, Scotland
- Intelligence Advanced Research Projects Activity, USA (IARPA) funded project BETTER: Presented IR work at IARPA Demo Day, Maryland. Check related publications.
- Mentored 5 graduate students on efficient variants of GNNs at the London Geometry & Machine Learning Summer School, 2022
- Invited as Speaker & Guest of Honour at VIT's ICAITR 2021, Mumbai. Gave a short talk on "NLP in the Past Decade"
- My Bioinformatics article was featured on Global Medical Discovery [ISSN 1929-8536] as a Key Scientific Article contributing to excellence in biomedical research.
Work in Media
Why India needs to counter AI bias and stereotypes Lokmat Times, 2025 (Full Paper Pg. 10)
AI Is Spreading Old Stereotypes to New Languages and Cultures Wired Magazine, 2025
This data set helps researchers spot harmful stereotypes in LLMs MIT Technology Review, 2025
444 Authors From 132 Institutions Release BIG-bench: A 204-Task ‘Extremely Difficult and Diverse’ Benchmark for Large Language Models Synced Technology Review, 2022
55 Researchers From 44 Institutions Propose GEM, a ‘Living Benchmark’ for NLG Synced Technology Review, 2021
Recent Lectures on Retrieval Augmented Generation (May 2024):

Video 1 Video 2 Video 3 Video 4 Video 1 Video 1
Other Projects:

Video 2 Video 2 Video 3 Video 4 Video 4

If you want to get in touch or are interested in collaborating, feel free to reach me at firstname.lastname@emory.edu (or on LinkedIn or Twitter, where I’m sometimes active).

Long ago, I used to maintain a personal blog on WordPress where, on rare occasions, I wrote mostly non-NLP stuff! You can find some of my random writings on Politics and Linguistics, some book reviews, and the occasional post from when I’ve gone backpacking! One serious piece of advice: cook this!

Test out your AI skills at LLM Quiz Time and Quiz Badminton, or check which words are common between languages (United Lexicons).

Mentoring at Amelia R&D (2015 to 2021):
I had the privilege of mentoring/managing several great individuals, particularly on NLP projects at Amelia R&D. Most of these projects lasted between 1 month and 1 year. Some are listed here, in no particular order:
R&D/Senior R&D Engineers & Scientists: Krishna Mohan Barakam, Ashish Srivastava, Aadesh Gupta, Abhinav Bhatt, Arpan Kulshreshtha, Priyank Soni, Venkatesh Magham, Anurag Kashyap, Kaustav Dutta, Ramavtar Malav, Vishwa Teja, Manjunath Hegde, Roopesh Mangal, Mohit Rohatgi, Rohit Kalra
Interns: Bhargav Sagiraju, Chandra Reddy, Pranav Kamojjhala