About me
Today, we work.
👋 Hey there! I’m Soham (so-hum). Pleased to meet you!
I’m a Data Scientist at AppZen, where I work on expense auditing, using transformers and LLMs to improve document categorization. Recently, I delivered a fraud detection model that combines Naïve Bayes, probability theory, and Llama-3.1 to identify fraudulent submissions in automated expense platforms.
I earned my Master’s from Carnegie Mellon University (CMU) in 2022, specializing in Language Technologies. My research interests include:
- LLM distillation & fine-tuning
- Multimodal Machine Learning
- LLM-based audio interactions
- Natural Language Processing
- Reinforcement Learning
- Artificial Social Intelligence (My dream is to build an agent as socially intelligent as humans!)
Research & Experience
I have presented my research at ACL, ISLS, APSIPA, and NeurIPS. My work at CMU focused on interactive intelligence, where I developed an on-device multimodal virtual teaching assistant under the guidance of Prof Carolyn Rose.
During my internship at Apple, I explored on-device LLMs, building a JAX-based online distillation framework for distributed GPUs and evaluating the TinyBERT distillation algorithm for decoder-based models.
I also collaborated with Prof Chng Eng Siong at NTU Singapore on speech and audio representations, using a curriculum-learning approach to improve model convergence.
Looking Ahead
My goal is to maximize my learning, deepen my expertise, and develop applications that enhance user experiences. I believe in the power of collaboration and meaningful technology to make everyday life better ☀️.
Happiness can be found, even in the darkest of times, if one only remembers to turn on the light.