Tianwei Ni 倪天炜
I am a PhD candidate at Mila - Quebec AI Institute and Université de Montréal,
advised by Pierre-Luc Bacon. Previously, I worked closely with
Benjamin Eysenbach at Princeton University and Aditya Mahajan at McGill University.
My research centers on reinforcement learning (RL) for effective sequential decision-making under uncertainty.
I aim to equip RL with better frameworks, algorithms, and implementations that can tackle real-world challenges beyond toy tasks.
To advance this vision of strong AI, I incorporate sequence modeling, representation learning, planning, and more -- techniques I consider essential for building a modern RL system.
News
- Sept 2024: I started my applied scientist internship at Amazon in Santa Clara, supervised by Rasool Fakoor.
- Jan 2024: One paper was accepted at ICLR as a poster. See you in Vienna!
- Sept 2023: One paper was accepted at NeurIPS as an oral. See you in New Orleans!
- Aug 2023: I passed my predoc exam and became a PhD candidate.
Highlighted Papers
Below are selected papers in reverse chronological order; please see Google Scholar for the full publication list.
Notation: * indicates equal contribution.
Do Transformer World Models Give Better Policy Gradients?
Michel Ma*, Tianwei Ni, Clement Gehring, Pierluca D'Oro*, Pierre-Luc Bacon
International Conference on Machine Learning (ICML), 2024
and ICLR 2024 Workshop on Generative Models for Decision Making (oral)
arXiv
Led by Michel and Pierluca, we develop a model-based policy gradient method for long-horizon planning.
By conditioning solely on action sequences, the world model yields better policy gradients than state-conditioned models and sometimes even the ground-truth simulator.
Bridging State and History Representations: Understanding Self-Predictive RL
Tianwei Ni, Benjamin Eysenbach, Erfan Seyedsalehi, Michel Ma, Clement Gehring, Aditya Mahajan, Pierre-Luc Bacon
International Conference on Learning Representations (ICLR), 2024
and NeurIPS 2023 Workshop on Self-Supervised Learning: Theory and Practice (oral)
arXiv / OpenReview / 1-hour Talk /
Provide a unified view of state and history representations in MDPs and POMDPs, and further investigate
the challenges, solutions, and benefits of learning self-predictive representations in standard MDPs, distracting MDPs, and sparse-reward POMDPs.
When Do Transformers Shine in RL? Decoupling Memory from Credit Assignment
Tianwei Ni, Michel Ma, Benjamin Eysenbach, Pierre-Luc Bacon
Conference on Neural Information Processing Systems (NeurIPS), 2023 (oral)
and NeurIPS 2023 Workshop on Foundation Models for Decision Making
arXiv / OpenReview / Poster /
11-min Talk / Mila Blog /
Investigate the architectural aspect of history representations in RL along two temporal dependencies -- memory and credit assignment -- with rigorous quantification of each.
Recurrent Model-Free RL Can Be a Strong Baseline for Many POMDPs
Tianwei Ni, Benjamin Eysenbach, Ruslan Salakhutdinov
International Conference on Machine Learning (ICML), 2022
project page / arXiv
/ CMU ML Blog /
Find and implement simple yet often strong baselines for many POMDPs,
including tasks from meta-RL, robust RL, generalization in RL, and temporal credit assignment.
Before starting my PhD, I was a research intern working on embodied AI, mentored by Jordi Salvador and Luca Weihs at the Allen Institute for AI (AI2).
I earned my Master's degree in Machine Learning at Carnegie Mellon University, where I
studied deep RL guided by Ben Eysenbach and Russ Salakhutdinov, and explored human-agent collaboration advised by Katia Sycara.
My research journey started with computer vision for medical images supervised by Alan Yuille at Johns Hopkins University.
I earned my Bachelor's degree in Computer Science at Peking University.
Fun fact: I have received my university education in three languages: Chinese, English, and French.
The website template is based on Jon Barron's source code.