PhD in CS, 2023
UC Berkeley
MSc in CS, 2018
Stanford University
BSc in Math, 2018
Stanford University
Updated version on Google Scholar
Project Site
Code
PDF
A benchmark of 1,800 human-annotated questions with associated RGBD pose trajectories and registered 3D scans. For categories such as spatial understanding, no VLM (as of Mar 2024) does better than a blind LLM that cannot see the scene.
Project Site
Code
PDF
A large, diverse dataset of multiview posed RGBD images, 3D scans, and more, covering 1.9k scanned and artist-generated buildings and outdoor scenes.
PDF
Code
Project Site
Enforces consistency constraints between different tasks to improve robustness to various domain shifts.
Project Site
Code
PDF
Using large pretrained visual representations improves performance and learning speed for various manipulation tasks with robotic arms, and helps with sim2real generalization for navigation.
Project Site
Code
PDF
A strong way to enforce consistency constraints across multiple tasks in multi-task learning, yielding better generalization and the ability to detect prediction failures.
Project Site
Code
PDF
A simple method to adapt and control the output of a larger network using a smaller "side" network.
Project Site
Code
PDF
Using large pretrained visual representations improves performance and learning speed for various navigation tasks.