Soumyabrata Pal

Adobe Research, India
email: soumyabratapal13 at gmail dot com, soumyabratap at adobe dot com
[Google scholar] [DBLP] [Research Statement]

About me

Currently, I am a Research Scientist at Adobe Research, Bangalore (India). Prior to this, I spent two years as a postdoc at Google Research, India, working with Dr. Prateek Jain and Dr. Karthikeyan Shanmugam. Before that, I completed my Ph.D. in the Computer Science Department (CICS) at the University of Massachusetts Amherst, advised by Dr. Arya Mazumdar. During that time, I was a Visiting Graduate Student at the University of California San Diego from May to November 2021. I also spent the summer of 2019 as a Research Intern at the Ernst & Young AI Lab in Palo Alto and Spring 2020 as an Applied Scientist Intern at Amazon Search (Berkeley). Even earlier, I graduated from the Indian Institute of Technology, Kharagpur in August 2016 with a Bachelor's degree in Electronics and Electrical Communication Engineering.

Research

My research interests are LLM Efficiency and Theoretical Machine Learning, with a focus on Non-convex Optimization and Online Learning. More specifically, I love statistical recovery/reconstruction problems under reasonable structural assumptions on the data-generating mechanism, such as sparsity, low rank, and the presence of latent clusters, among others. Nowadays, I am working on designing algorithms for offline/online/hybrid systems aimed at incorporating personalization efficiently at scale. Most of my work so far can be categorized into five topics, namely 1) Scalable Personalization via Low Rank and Sparse Decomposition, 2) Multi-agent Online Learning via Collaborative Filtering, 3) Latent Variable Models - Mixtures of Linear Regression, Linear Classifiers and Distributions, 4) Generative Models for Graph Clustering - Geometric Block Model, and 5) Active Learning for Semi-supervised Clustering - Disjoint Clusters, Overlapping Clusters and Fuzzy Clusters.

Recent News

2 New papers in NeurIPS 2023.

New papers in JMLR, ICLR 2023, AISTATS 2023.

New paper in ALT 2022!! Also started as a postdoc at Google Research!

2 new papers in NeurIPS 2021 and 1 new paper in ITCS!!

Started as a Visiting Graduate Student at UCSD from May 2021!!

PhD Dissertation

Mixture Models in Machine Learning

Preprints

Improved Algorithms for Stochastic Linear Bandits with Prior Observations via D-optimal Exploration
with Sushant Vijayan, Karthikeyan Shanmugam and Arun Sai Suggala
In Submission.

Journal Publications

Random Subgraph Detection Using Queries
with Wasim Huleihel and Arya Mazumdar
Journal of Machine Learning Research (JMLR), 2024.
Support Recovery in Mixture Models with Sparse Parameters
with Arya Mazumdar
IEEE Transactions on Information Theory, 2024. Preliminary version appeared in International Conference on Artificial Intelligence and Statistics (AISTATS), 2022.
Improved Support Recovery in One-Bit Compressed Sensing
with Namiko Matsumoto and Arya Mazumdar
IEEE Transactions on Information Theory, 2024. Preliminary version appeared in Innovations in Theoretical Computer Science (ITCS), 2022.
Community Recovery in the Geometric Block Model
with Sainyam Galhotra, Arya Mazumdar and Barna Saha
To appear, Journal of Machine Learning Research (JMLR), 2023. Shorter versions accepted in RANDOM 2019 and AAAI 2018.
Trace Reconstruction: Generalized and Parameterized
with Akshay Krishnamurthy, Arya Mazumdar and Andrew McGregor
IEEE Transactions on Information Theory, 2021. Preliminary version appeared in European Symposium on Algorithms (ESA), 2019.
Semisupervised Clustering by Queries and Locally Encodable Source Coding
with Arya Mazumdar
IEEE Transactions on Information Theory, 2021. Preliminary version appeared in Advances in Neural Information Processing Systems (NeurIPS), 2017.

Conference Publications

PromptRefine: Enhancing Few-Shot Performance on Low-Resource Indic Languages with Example Selection from Related Example Banks
with Soumya Suvra Ghosal, Koyel Mukherjee and Dinesh Manocha
North American Chapter of the Association for Computational Linguistics (NAACL), 2025.
Near-Optimal Streaming Heavy-Tailed Statistical Estimation with Clipped SGD
with Aniket Das, Dheeraj Nagaraj, Arun Suggala and Prateek Varshney
Advances in Neural Information Processing Systems (NeurIPS), 2024.
Online Matrix Completion: A Collaborative Approach with Hott Items
with Dheeraj Baby
International Conference on Machine Learning (ICML), 2024.
Private and Efficient Meta-Learning with Low Rank and Sparse Decomposition
with Prateek Varshney, Prateek Jain, Abhradeep Guha Thakurta, Gagan Madan, Gaurav Aggarwal, Pradeep Shenoy, and Gaurav Srivastava
International Conference on Artificial Intelligence and Statistics (AISTATS), 2024.
Improving Mobile Maternal and Child Health Care Programs: Collaborative Bandits for Time slot selection
with Milind Tambe, Arun Suggala, Karthikeyan Shanmugam, and Aparna Taneja
International Conference on Autonomous Agents and Multiagent Systems (AAMAS), 2024.
Blocked Collaborative Bandits: Online Collaborative Filtering with Per-Item Budget Constraints
with Arun Sai Suggala, Karthikeyan Shanmugam and Prateek Jain
Advances in Neural Information Processing Systems (NeurIPS), 2023.
Nash Regret Guarantees for Linear Bandits
with Siddharth Barman and Ayush Sawarni
Advances in Neural Information Processing Systems (NeurIPS), 2023.
Optimal Algorithms for Latent Bandits with Cluster Structure
with Arun Sai Suggala, Karthikeyan Shanmugam and Prateek Jain
International Conference on Artificial Intelligence and Statistics (AISTATS), 2023.
Online Low Rank Matrix Completion
with Prateek Jain
International Conference on Learning Representations (ICLR), 2023.
On Learning Mixture of Linear Regressions in the Non-Realizable Setting
with Abhishek Ghosh, Arya Mazumdar and Rajat Sen
International Conference on Machine Learning (ICML), 2022.
On Learning Mixture Models with Sparse Parameters
with Arya Mazumdar
International Conference on Artificial Intelligence and Statistics (AISTATS), 2022.
Lower Bounds on the Total Variation Distance Between Mixtures of Two Gaussians
with Sami Davies, Arya Mazumdar and Cyrus Rashtchian
Algorithmic Learning Theory (ALT), 2022.
Support Recovery in Universal One-bit Compressed Sensing
with Arya Mazumdar
The 13th Innovations in Theoretical Computer Science (ITCS), 2022.
Support Recovery of Sparse Signals from a Mixture of Linear Measurements
with Venkata Gandikota and Arya Mazumdar
Advances in Neural Information Processing Systems (NeurIPS), 2021.
Fuzzy Clustering with Similarity Queries
with Wasim Huleihel and Arya Mazumdar
Advances in Neural Information Processing Systems (NeurIPS), 2021.
Learning User Preferences in Non-Stationary Environments
with Wasim Huleihel and Ofer Shayevitz
International Conference on Artificial Intelligence and Statistics (AISTATS), 2021.
Recovery of sparse linear classifiers from mixture of responses
with Venkata Gandikota and Arya Mazumdar
Advances in Neural Information Processing Systems (NeurIPS), 2020.
Recovery of Sparse Signals from a Mixture of Linear Samples
with Arya Mazumdar
International Conference on Machine Learning (ICML), 2020.
High Dimensional Discrete Integration by Hashing and Optimization
with Raj Kumar Maity and Arya Mazumdar
Uncertainty in Artificial Intelligence (UAI), 2020.
Algebraic and Analytic Approaches for Parameter Learning in Mixture Models
with Akshay Krishnamurthy, Arya Mazumdar and Andrew McGregor
Algorithmic Learning Theory (ALT), 2020.
Same-Cluster Querying for Overlapping Clusters
with Wasim Huleihel, Arya Mazumdar and Muriel Medard
Advances in Neural Information Processing Systems (NeurIPS), 2019.
Sample Complexity of Learning Mixture of Sparse Linear Regressions
with Akshay Krishnamurthy, Arya Mazumdar and Andrew McGregor
Advances in Neural Information Processing Systems (NeurIPS), 2019.
Trace Reconstruction: Generalized and Parameterized
with Akshay Krishnamurthy, Arya Mazumdar and Andrew McGregor
European Symposium on Algorithms (ESA), 2019.
Connectivity in Random Annulus Graphs and the Geometric Block Model
with Sainyam Galhotra, Arya Mazumdar and Barna Saha
International Conference on Randomization and Computation (RANDOM), 2019.
The Geometric Block Model
with Sainyam Galhotra, Arya Mazumdar and Barna Saha
The Thirty-Second AAAI Conference on Artificial Intelligence (AAAI), 2018.
Semisupervised Clustering, AND-Queries and Locally Encodable Source Coding
with Arya Mazumdar
Advances in Neural Information Processing Systems (NeurIPS), 2017. Spotlight

Workshop Publications

The Geometric Block Model
with Sainyam Galhotra, Arya Mazumdar and Barna Saha
NeurIPS 2017 Workshop on Learning on Distributions, Functions, Graphs and Groups, 2017.