Puru Ojha
I'm a Master's student at the International Institute of Information Technology, Hyderabad. My research interests include robot learning for long-horizon manipulation tasks, with a focus on zero-shot policy transfer, language-grounded perception, and neuro-symbolic systems that combine the generalization of deep learning with the verifiability and safety of formal methods.
Email / LinkedIn / GitHub
Research
I'm interested in robot learning for long-horizon manipulation tasks, with a focus on zero-shot policy transfer, language-grounded perception, and neuro-symbolic systems that combine the generalization of deep learning with the verifiability and safety of formal methods.
Zero-Shot Policy Transfer for Cross-Embodiment Robotic Manipulation
Developed a zero-shot policy transfer framework for cross-embodiment manipulation that lets a generalist policy (π₀) trained on Franka Panda data control a morphologically distinct uFactory xArm without any fine-tuning. Implemented a Brownian bridge diffusion model for visual domain adaptation, which translates sensory inputs from the target robot (xArm) into the source domain (Panda) so the policy can execute in real time. Built the transfer pipeline on real-world data to address generalization across different robot hardware and environments.
Sequential Rearrangement Planning with Language-Guided Graph Transformer
Designed a graph-based transformer model that predicts object removal sequences in cluttered tabletop scenes, conditioned on natural language goals. Built a large-scale simulation and training pipeline with expert A* supervision and PPO fine-tuning, achieving state-of-the-art performance and outperforming the GRN and NRP baselines.
Enhanced Transformer-Based Framework for Grounded Image Situation Recognition
Developed a transformer framework for grounded situation recognition to enable robots to follow complex natural language instructions. Improved noun grounding and verb prediction by integrating CLIP and Faster R-CNN features into a CoFormer architecture, achieving state-of-the-art performance on the SWiG dataset.
Procedural Generation of Architecturally Consistent Simulation Environment
Engineered a framework for procedural generation of complex, structured environments for benchmarking RL agents and studying sim-to-real transfer of navigational policies. Designed a graph-based algorithm to generate scalable, architecturally consistent layouts with dynamic multi-path structures for realistic simulation.
Micropapers
Squareplus: A Softplus-Like Algebraic Rectifier
A Convenient Generalization of Schlick's Bias and Gain Functions
Continuously Differentiable Exponential Linear Units
Scholars & Big Models: How Can Academics Adapt?
Recorded Talks
Radiance Fields and the Future of Generative Media, 2025
View Dependent Podcast, 2024
Bay Area Robotics Symposium, 2023
EGSR Keynote, 2021
TUM AI Lecture Series, 2020
Vision & Graphics Seminar at MIT, 2020
Academic Service
Lead Area Chair, ICCV 2025
Lead Area Chair, CVPR 2025
Area Chair, CVPR 2024
Demo Chair, CVPR 2023
Area Chair, CVPR 2022
Area Chair & Award Committee Member, CVPR 2021
Area Chair, CVPR 2019
Area Chair, CVPR 2018
Teaching
Graduate Student Instructor, CS188 Spring 2011
Graduate Student Instructor, CS188 Fall 2010
Figures, "Artificial Intelligence: A Modern Approach", 3rd Edition