Sourav Chakraborty

(Generally pronounced Saw-ruv; in Bengali, Show-oo-rob.)

Official Email   •   Personal Email

I am a first-year Ph.D. student in the Computer Science department at the University of Colorado at Boulder, advised by Dr. Lijun Chen.

My research interests are reinforcement learning and human-robot interaction.

I graduated with a master's degree in computer science from the same institute. Before coming to Boulder, I worked as a software engineer at Flipkart in India, after graduating from Birla Institute of Technology, Mesra, Ranchi.

When not thinking about work, I treat myself to the world of films, books and music.

Resume  •  CV  •  LinkedIn  •  Facebook  •  Instagram

Updates
  • 2022-08: Started my Ph.D.!
  • 2022-05: Graduated with an M.S. in Computer Science with a 4.0 GPA!
  • 2022-04: Successfully defended my Master's thesis! (link / slides)
  • 2022-02: Received a Ph.D. offer from CS @ Colorado for Fall 2022!
  • 2019-08: Started my M.S. in Computer Science at CU Boulder!
  • 2016-12: Joined Flipkart as a Software Engineer, in Bangalore, India.
  • 2016-06: Completed my Bachelor's degree in engineering at BIT Mesra.
Awards & Honors
  • 2022-09: Received the Early Career Development Fellowship ($1K) from the CS department!
  • 2022-05: Won the Lloyd Botway Award for Outstanding Master's student for "outstanding academics, teaching, research and service to the department"!
  • 2022-04: Won the CU Research Expo annual award for the "work in progress" segment!
  • 2022-04: Selected as a Lead Teaching Assistant (department lead) for CS @ CU for the academic year!
Research
Incentivized Exploration in Non-stationary Stochastic Bandits.
Sourav Chakraborty
Master's Thesis, defended in April 2022.
Committee: (Advisor) Lijun Chen, Raf Frongillo, Bo Waggoner.
ProQuest link / slides

We study incentivized exploration for the multi-armed bandit (MAB) problem with non-stationary reward distributions, where the players receive compensation for exploring arms other than the greedy choice and may provide biased feedback on reward. We analyze the impact of the drifted reward feedback on two instances of non-stationary MAB environments: Piecewise-Stationary and Continuously-Changing. We show that our algorithms for both environments achieve sub-linear regret and compensation under drifted reward, and are therefore effective in incentivizing exploration. Experimental results with synthetic data are provided to complement the theoretical analysis.

Teaching
  • (New) Spring 2023: Teaching Assistant for CSCI 2270 - Data Structures.
  • Fall 2022: Teaching Assistant for CSCI 2270 - Data Structures.
  • Spring 2022: Teaching Assistant for CSCI 2270 - Data Structures.
  • Fall 2021: Instructor for CSCI 1200 - Introduction to Computational Thinking.
  • Summer 2021: Teaching Assistant for CSCI 1300 - Starting Computing.
  • Spring 2021: Teaching Assistant for CSCI 1300 - Starting Computing.
  • Fall 2020: Teaching Assistant for CSCI 1300 - Starting Computing.
  • Summer 2020: Instructor for CSCI 3022 - Introduction to Data Science with Probability & Statistics.
Selected Personal Projects

*Authors listed alphabetically

Inverse Reinforcement Learning via Maximum Entropy Formulation.
Tuhina Tripathi, Alexa Reed, Sourav Chakraborty
April, 2022
report  /  code  /  demo  /  interface-code

Final project for ASEN 5519: Decision Making Under Uncertainty. This project explores the use of Inverse Reinforcement Learning, via Maximum Entropy Formulation, in a Markov Decision Process. The concepts explored in this project were demonstrated using a grid world environment.

Incentivized Exploration for Multi-Armed Bandits under Reward Drift.
Sourav Chakraborty
September, 2020
original paper  /  code

Just playing around with the paper by Liu & Wang et al. on Incentivized Exploration for Multi-Armed Bandits under Reward Drift, where the players receive compensation for exploring arms other than the greedy choice and may provide biased feedback on reward.
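
A minimal sketch of the setting, under my own simplifying assumptions rather than the paper's or the linked repository's code: the principal recommends arms with UCB, the player is paid the gap between the greedy arm's empirical mean and the recommended arm's empirical mean whenever the two differ, and the reported feedback is the true reward plus a drift that grows with the compensation. The function name `incentivized_ucb`, the Bernoulli rewards, and the linear drift `0.5 * c` are all hypothetical choices for illustration.

```python
import math
import random
from typing import Callable, List, Tuple


def incentivized_ucb(
    true_means: List[float],
    horizon: int,
    drift: Callable[[float], float] = lambda c: 0.5 * c,
    seed: int = 0,
) -> Tuple[List[int], float]:
    """Return (pull counts per arm, total compensation paid).

    Assumptions (illustrative only):
      - rewards are Bernoulli with the given means;
      - compensation = empirical mean of the greedy arm
        minus empirical mean of the recommended arm;
      - reported feedback = true reward + drift(compensation).
    """
    rng = random.Random(seed)
    k = len(true_means)
    counts = [0] * k
    means = [0.0] * k  # empirical means built from the (possibly biased) feedback
    total_compensation = 0.0

    for t in range(1, horizon + 1):
        compensation = 0.0
        if t <= k:
            arm = t - 1  # pull each arm once to initialise, no payment
        else:
            ucb = [
                means[a] + math.sqrt(2.0 * math.log(t) / counts[a])
                for a in range(k)
            ]
            arm = max(range(k), key=lambda a: ucb[a])
            greedy = max(range(k), key=lambda a: means[a])
            if arm != greedy:
                # Pay the player for giving up the arm that currently looks best.
                compensation = means[greedy] - means[arm]
        total_compensation += compensation

        reward = 1.0 if rng.random() < true_means[arm] else 0.0
        feedback = reward + drift(compensation)  # biased report

        counts[arm] += 1
        means[arm] += (feedback - means[arm]) / counts[arm]

    return counts, total_compensation


if __name__ == "__main__":
    pulls, paid = incentivized_ucb([0.2, 0.5, 0.8], horizon=1000)
    print("pulls per arm:", pulls)
    print("total compensation:", round(paid, 3))
```

Passing `drift=lambda c: 0.0` gives the unbiased baseline, which makes it easy to compare how much the drifted feedback changes the pull counts and the total payment.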

Contextual vectorized representation of words: Soam word embeddings
*Amit Baran Roy, Sourav Chakraborty
May, 2020
report  /  code

A word embedding model implementation based on the popular skip-gram architecture. It alters the scoring algorithm to give more weight to context words that are closer to the target word in a skip-gram sliding window.
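
To give a flavour of the distance weighting (this is an illustrative sketch, not the project's actual scoring code), the snippet below generates skip-gram (target, context) pairs and attaches a weight that shrinks as the context word moves away from the target. The function name `weighted_skipgram_pairs` and the `1 / distance` decay are assumptions made for the example.

```python
from typing import List, Tuple


def weighted_skipgram_pairs(
    tokens: List[str], window: int = 2
) -> List[Tuple[str, str, float]]:
    """Return (target, context, weight) triples for a sliding window.

    Assumption: weight = 1 / distance, so adjacent words count the most.
    The real model may use a different decay.
    """
    pairs = []
    for i, target in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j == i:
                continue  # a word is not its own context
            distance = abs(i - j)
            pairs.append((target, tokens[j], 1.0 / distance))
    return pairs


if __name__ == "__main__":
    for triple in weighted_skipgram_pairs(["the", "quick", "brown", "fox"]):
        print(triple)
```

In a training loop, each weight could scale the loss (or the gradient step) for its pair, so that nearby context words pull harder on the target's embedding than distant ones.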

Solving Games using the combination of Q-learning and Regret Matching Methods
Sourav Chakraborty, Nagarajan Shanmuganathan
May, 2020
report  /  code

Counterfactual regret minimization (CFR) is well known for minimizing regret in games that have both terminal states and perfect recall. This project aims to relax those constraints by applying the local no-regret algorithm (LONR) of Kash et al., which internally uses a Q-learning-like update rule, to games that do not have terminal states or perfect recall.
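
For readers new to the area, here is a minimal sketch of plain regret matching, the building block named in the project title. It is not LONR or CFR, and it is not taken from the project code; the helper names `regret_matching_strategy` and `update_regret` are hypothetical.

```python
from typing import List, Sequence


def regret_matching_strategy(cumulative_regret: Sequence[float]) -> List[float]:
    """Play each action in proportion to its positive cumulative regret."""
    positive = [max(r, 0.0) for r in cumulative_regret]
    total = sum(positive)
    n = len(positive)
    if total <= 0.0:
        # No action has positive regret yet: fall back to uniform play.
        return [1.0 / n] * n
    return [p / total for p in positive]


def update_regret(
    cumulative_regret: Sequence[float],
    action_values: Sequence[float],
    strategy: Sequence[float],
) -> List[float]:
    """Add this round's regret: each action's value minus the expected value."""
    expected = sum(p * v for p, v in zip(strategy, action_values))
    return [r + (v - expected) for r, v in zip(cumulative_regret, action_values)]


if __name__ == "__main__":
    # Rock-paper-scissors against an opponent who always plays rock:
    # payoffs for (rock, paper, scissors) are (0, 1, -1).
    regret = [0.0, 0.0, 0.0]
    for _ in range(3):
        strategy = regret_matching_strategy(regret)
        regret = update_regret(regret, [0.0, 1.0, -1.0], strategy)
        print(strategy)
```

Against the fixed opponent in the demo, the strategy moves from uniform play to always choosing paper after a single update, since paper is the only action with positive regret.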

Occupancy Network based 3D Image Reconstruction using Single-Depth View
*Amit Baran Roy, Aparajita Singh, Sourav Chakraborty, Tanmai Gajula
Feb-April, 2020
report  /  code

The complete 3D geometry of an object was recovered from a single 2.5D depth view using deep learning techniques such as generative adversarial networks and 3D convolutional neural networks. The resolution of the final 3D voxelized output was improved by transforming the voxel representation into another representation called occupancy networks.

Last updated: Nov 14, 2022
Thanks Jon Barron!