Optimistic Thompson Sampling for Episodic Reinforcement Learning

UAI 2023

B. Hu, T. Zhang, N. Hegde, M. Schmidt.

Paper Code Poster (Upperbound) Poster (UAI)


Transfer Learning via Online System Identification.

PDF Code

In this project, we implemented the Soft Actor-Critic method on OpenAI Gym classic control environments, together with Proximal Policy Optimization and the GAE(λ) advantage estimator. We trained a universal policy with dynamic system identification, aiming to reduce the sim-to-real gap.
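As an illustration of the GAE(λ) advantage estimate mentioned above, here is a minimal Python sketch. It is not the project's code: the function name `gae_advantages`, its arguments, and the default `gamma` and `lam` values are assumptions chosen for the example, and it handles a single trajectory with no episode boundaries inside it.

```python
from typing import List, Sequence


def gae_advantages(
    rewards: Sequence[float],
    values: Sequence[float],
    last_value: float = 0.0,
    gamma: float = 0.99,
    lam: float = 0.95,
) -> List[float]:
    """Generalized Advantage Estimation for one trajectory.

    rewards[t] is the reward received after acting in state s_t.
    values[t] is the critic's estimate V(s_t), for t = 0..T-1.
    last_value is V(s_T), the bootstrap value (0.0 if s_T is terminal).
    """
    if len(rewards) != len(values):
        raise ValueError("rewards and values must have the same length")

    advantages = [0.0] * len(rewards)
    next_value = last_value
    running = 0.0
    # Walk backwards so each step reuses the discounted sum from later steps.
    for t in reversed(range(len(rewards))):
        # One-step TD error: r_t + gamma * V(s_{t+1}) - V(s_t)
        delta = rewards[t] + gamma * next_value - values[t]
        # A_t = delta_t + (gamma * lam) * A_{t+1}
        running = delta + gamma * lam * running
        advantages[t] = running
        next_value = values[t]
    return advantages
```

Setting `lam=0.0` reduces the estimate to the one-step TD error, while `lam=1.0` gives the discounted return minus the value baseline; intermediate values trade bias against variance.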


Deep exploration via randomized value functions: RLRG 2023 Spring Slides

Transformers, large language models, and the magic behind ChatGPT: Guest Lecture for CPSC 340 Recording Slides

Language models are few-shot learners: MLRG 2022 Fall Slides

Active Learning for semantic segmentation: MLRG 2022 Summer Slides

Probabilistic topic modelling Slides

Teaching Assistant

July 2017 - Present: CPSC 532M/340: Machine Learning and Data Mining

Jan. 2018 - Apr. 2018: Math 152: Linear Systems

Jan. 2017 - Apr. 2018: CPSC 121: Models of Computation

July 2017 - Aug. 2017: Math 102: Integral Calculus