Soumyabrata Pal

Adobe Research, India
email: soumyabratapal13 at gmail dot com, soumyabratap at adobe dot com
[Google scholar] [DBLP] [Research Statement]

About me

Currently, I am a Research Scientist at Adobe Research, Bangalore (India). Prior to this, I spent two years as a postdoc at Google Research, India, working with Dr. Prateek Jain and Dr. Karthikeyan Shanmugam. Before that, I completed my Ph.D. in the Computer Science Department (CICS) at the University of Massachusetts Amherst, advised by Dr. Arya Mazumdar. During that time, I was a Visiting Graduate Student at the University of California San Diego from May to November 2021. I also spent the summer of 2019 as a Research Intern at the Ernst & Young AI Lab in Palo Alto and Spring 2020 as an Applied Scientist Intern at Amazon Search (Berkeley). Even earlier, I graduated from the Indian Institute of Technology, Kharagpur in August 2016 with a Bachelor's degree in Electronics and Electrical Communication Engineering.

Research

My research interests are LLM efficiency and theoretical machine learning, with a focus on non-convex optimization and online learning. In short, I enjoy statistical recovery/reconstruction problems under reasonable structural assumptions on the data-generating mechanism, such as sparsity, low rank, and the presence of latent clusters. Nowadays, I am working on designing algorithms for offline/online/hybrid systems that incorporate personalization efficiently at scale. Most of my work so far falls into five topics: 1) Scalable Personalization via Low Rank and Sparse Decomposition; 2) Multi-agent Online Learning via Collaborative Filtering; 3) Latent Variable Models - Mixtures of Linear Regression, Linear Classifiers and Distributions; 4) Generative Models for Graph Clustering - Geometric Block Model; and 5) Active Learning for Semi-supervised Clustering - Disjoint Clusters, Overlapping Clusters and Fuzzy Clusters.

Recent News

2 new papers in NeurIPS 2023.

New papers in JMLR, ICLR 2023, AISTATS 2023.

New paper in ALT 2022!! Also started as a postdoc at Google Research!

2 new papers in NeurIPS 2021 and 1 new paper in ITCS!!

Started as a Visiting Graduate Student at UCSD from May 2021!!

PhD Dissertation

  1. Mixture Models in Machine Learning

Preprints

  1. Improved Algorithms for Stochastic Linear Bandits with Prior Observations via D-optimal Exploration
    with Sushant Vijayan, Karthikeyan Shanmugam and Arun Sai Suggala
    In Submission.

Journal Publications

  1. Random Subgraph Detection Using Queries
    with Wasim Huleihel and Arya Mazumdar
    Journal of Machine Learning Research (JMLR), 2024.

  2. Support Recovery in Mixture Models with Sparse Parameters
    with Arya Mazumdar
    IEEE Transactions on Information Theory, 2024. Preliminary version appeared in International Conference on Artificial Intelligence and Statistics (AISTATS), 2022.

  3. Improved Support Recovery in One-Bit Compressed Sensing
    with Namiko Matsumoto and Arya Mazumdar
    IEEE Transactions on Information Theory, 2024. Preliminary version appeared in Innovations in Theoretical Computer Science (ITCS), 2022.

  4. Community Recovery in the Geometric Block Model
    with Sainyam Galhotra, Arya Mazumdar and Barna Saha
    To appear, Journal of Machine Learning Research (JMLR), 2023. Shorter versions accepted in RANDOM 2019 and AAAI 2018.

  5. Trace Reconstruction: Generalized and Parameterized
    with Akshay Krishnamurthy, Arya Mazumdar and Andrew McGregor
    IEEE Transactions on Information Theory, 2021. Preliminary version appeared in European Symposium on Algorithms (ESA), 2019.

  6. Semisupervised Clustering by Queries and Locally Encodable Source Coding
    with Arya Mazumdar
    IEEE Transactions on Information Theory, 2021. Preliminary version appeared in Advances in Neural Information Processing Systems (NeurIPS), 2017.

Conference Publications

  1. Near-Optimal Streaming Heavy-Tailed Statistical Estimation with Clipped SGD
    with Aniket Das, Dheeraj Nagaraj, Arun Suggala, Prateek Varshney
    Advances in Neural Information Processing Systems (NeurIPS), 2024.

  2. Online Matrix Completion: A Collaborative Approach with Hott Items
    with Dheeraj Baby
    International Conference on Machine Learning (ICML), 2024.

  3. Private and Efficient Meta-Learning with Low Rank and Sparse Decomposition
    with Prateek Varshney, Prateek Jain, Abhradeep Guha Thakurta, Gagan Madan, Gaurav Aggarwal, Pradeep Shenoy, and Gaurav Srivastava
    International Conference on Artificial Intelligence and Statistics (AISTATS), 2024.

  4. Improving Mobile Maternal and Child Health Care Programs: Collaborative Bandits for Time Slot Selection
    with Milind Tambe, Arun Suggala, Karthikeyan Shanmugam, and Aparna Taneja
    International Conference on Autonomous Agents and Multiagent Systems (AAMAS), 2024.

  5. Blocked Collaborative Bandits: Online Collaborative Filtering with Per-Item Budget Constraints
    with Arun Sai Suggala, Karthikeyan Shanmugam and Prateek Jain
    Advances in Neural Information Processing Systems (NeurIPS), 2023.

  6. Nash Regret Guarantees for Linear Bandits
    with Siddharth Barman and Ayush Sawarni
    Advances in Neural Information Processing Systems (NeurIPS), 2023.

  7. Optimal Algorithms for Latent Bandits with Cluster Structure
    with Arun Sai Suggala, Karthikeyan Shanmugam and Prateek Jain
    International Conference on Artificial Intelligence and Statistics (AISTATS), 2023.

  8. Online Low Rank Matrix Completion
    with Prateek Jain
    International Conference on Learning Representations (ICLR), 2023.

  9. On Learning Mixture of Linear Regressions in the Non-Realizable Setting
    with Abhishek Ghosh, Arya Mazumdar and Rajat Sen
    International Conference on Machine Learning (ICML), 2022.

  10. On Learning Mixture Models with Sparse Parameters
    with Arya Mazumdar
    International Conference on Artificial Intelligence and Statistics (AISTATS), 2022.

  11. Lower Bounds on the Total Variation Distance Between Mixtures of Two Gaussians
    with Sami Davies, Arya Mazumdar and Cyrus Rashtchian
    Algorithmic Learning Theory (ALT), 2022.

  12. Support Recovery in Universal One-bit Compressed Sensing
    with Arya Mazumdar
    The 13th Innovations in Theoretical Computer Science (ITCS), 2022.

  13. Support Recovery of Sparse Signals from a Mixture of Linear Measurements
    with Venkata Gandikota and Arya Mazumdar
    Advances in Neural Information Processing Systems (NeurIPS), 2021.

  14. Fuzzy Clustering with Similarity Queries
    with Wasim Huleihel and Arya Mazumdar
    Advances in Neural Information Processing Systems (NeurIPS), 2021.

  15. Learning User Preferences in Non-Stationary Environments
    with Wasim Huleihel and Ofer Shayevitz
    International Conference on Artificial Intelligence and Statistics (AISTATS), 2021.

  16. Recovery of Sparse Linear Classifiers from Mixture of Responses
    with Venkata Gandikota and Arya Mazumdar
    Advances in Neural Information Processing Systems (NeurIPS), 2020.

  17. Recovery of Sparse Signals from a Mixture of Linear Samples
    with Arya Mazumdar
    International Conference on Machine Learning (ICML), 2020.

  18. High Dimensional Discrete Integration by Hashing and Optimization
    with Raj Kumar Maity and Arya Mazumdar
    Uncertainty in Artificial Intelligence (UAI), 2020.

  19. Algebraic and Analytic Approaches for Parameter Learning in Mixture Models
    with Akshay Krishnamurthy, Arya Mazumdar and Andrew McGregor
    Algorithmic Learning Theory (ALT), 2020.

  20. Same-Cluster Querying for Overlapping Clusters
    with Wasim Huleihel, Arya Mazumdar and Muriel Medard
    Advances in Neural Information Processing Systems (NeurIPS), 2019.

  21. Sample Complexity of Learning Mixture of Sparse Linear Regressions
    with Akshay Krishnamurthy, Arya Mazumdar and Andrew McGregor
    Advances in Neural Information Processing Systems (NeurIPS), 2019.

  22. Trace Reconstruction: Generalized and Parameterized
    with Akshay Krishnamurthy, Arya Mazumdar and Andrew McGregor
    European Symposium on Algorithms (ESA), 2019.

  23. Connectivity in Random Annulus Graphs and the Geometric Block Model
    with Sainyam Galhotra, Arya Mazumdar and Barna Saha
    International Conference on Randomization and Computation (RANDOM), 2019.

  24. The Geometric Block Model
    with Sainyam Galhotra, Arya Mazumdar and Barna Saha
    The Thirty-Second AAAI Conference on Artificial Intelligence (AAAI), 2018.

  25. Semisupervised Clustering, AND-Queries and Locally Encodable Source Coding
    with Arya Mazumdar
    Advances in Neural Information Processing Systems (NeurIPS), 2017. Spotlight.

Workshop Publications

  1. The Geometric Block Model
    with Sainyam Galhotra, Arya Mazumdar and Barna Saha
    NeurIPS 2017 Workshop on Learning on Distributions, Functions, Graphs and Groups, 2017.