Publications

On PI Controllers for Updating Lagrange Multipliers in Constrained Optimization

ICML 2024 Paper

M. Sohrabi*, J. Ramirez*, T. H. Zhang, S. Lacoste-Julien and J. Gallego-Posada

Efficient and Adaptive Posterior Sampling Algorithms for Bandits

Paper

B. Hu, Z. Huang, T. H. Zhang, M. Lécuyer, N. Hegde

From 6235149080811616882909238708 to 29: Vanilla Thompson Sampling Revisited

Opt Workshop @ NeurIPS 2023 Paper Poster

B. Hu, T. H. Zhang

Optimistic Thompson Sampling for Episodic Reinforcement Learning

UAI 2023 Paper Code Poster (Upperbound) Poster (UAI)

B. Hu, T. H. Zhang, N. Hegde, M. Schmidt

Thesis

Optimistic Thompson Sampling: Strategic Exploration in Bandits and Reinforcement Learning

Master's Thesis

Supervisory Committee: Mark Schmidt, Mathias Lécuyer

PDF Slides

Talks

Safe Reinforcement Learning from Human Feedback: RLHF reading group @Mila Slides

Deep exploration via randomized value function: RLRG 2023 Spring Slides

Transformers, large language models, and the magic behind ChatGPT: Guest Lecture for CPSC 340 Recording Slides

Language models are few-shot learners: MLRG 2022 Fall Slides

Active Learning for semantic segmentation: MLRG 2022 Summer Slides

Probabilistic topic modelling Slides

Teaching Assistant

Sept. 2023 - Dec. 2023: Mila/Polytechnique Montreal INF8245E - Machine Learning

July 2017 - Apr. 2023: UBC CPSC 532M/340: Machine Learning and Data Mining

Jan. 2018 - Apr. 2018: UBC Math 152: Linear Systems

Jan. 2017 - Apr. 2018: UBC CPSC 121: Models of Computation

July 2017 - Aug. 2017: UBC Math 102: Integral Calculus