About me

Today, we work.
- Carnegie Mellon University

👋 Hey there! I’m Soham (so-hum). Pleased to meet you!

I’m a Data Scientist at AppZen, where I work on expense auditing, using transformers and LLMs to improve document categorization. Recently, I delivered a fraud detection model that combines Naïve Bayes, probability theory, and Llama-3.1 to identify fraudulent submissions in automated expense platforms.

I earned my Master’s from Carnegie Mellon University (CMU) in 2022, specializing in Language Technologies. My research interests include:

- LLM distillation & fine-tuning
- Multimodal Machine Learning
- LLM-based audio interactions
- Natural Language Processing
- Reinforcement Learning
- Artificial Social Intelligence (My dream is to build an agent as socially intelligent as humans!)

Research & Experience

I have presented my research at ACL, ISLS, APSIPA, and NeurIPS. My work at CMU focused on interactive intelligence, where I developed an on-device multimodal virtual teaching assistant under the guidance of Prof Carolyn Rose.

During my internship at Apple , I explored on-device LLMs, building a Jax-based online distillation framework for distributed GPUs and evaluating the TinyBERT distillation algorithm for decoder-based models.

I also collaborated with Prof Chng Eng Siong at NTU Singapore on speech and audio representations, using a curriculum-learning approach to improve model convergence.

Looking Ahead

My goal is to maximize my learning, deepen my expertise, and develop applications that enhance user experiences. I believe in the power of collaboration and meaningful technology to make everyday life better ☀️.

Happiness can be found, even in the darkest of times, if one only remembers to turn on the light.
- Albus Dumbledore, created by J. K. Rowling

Soham Dinesh Tiwari
