Artificial Intelligence and Robotics


I am a Research Scientist with Sony AI working on Reinforcement Learning for the development of complex autonomous agents.

Previously, I was a doctoral and then a postdoctoral researcher with the Dynamic Robot Systems group at the Oxford Robotics Institute, University of Oxford. I was also part of the Autonomous Intelligent Machines and Systems programme at Oxford.

I am a recipient of the Queen Mary UK Best PhD in Robotics Award for my doctoral research in Robotics and Artificial Intelligence.

As an undergraduate, I studied Electronics Engineering at the University of Mumbai.

Education and Work

October 2017 - November 2021

University of Oxford

Doctor of Philosophy

Autonomous Intelligent Machines and Systems

Learning System-Adaptive Legged Robotic Locomotion Policies
Dr. Ioannis Havoutis and Prof. Ingmar Posner

July 2012 - June 2016

University of Mumbai

Bachelor of Engineering


Final Year Project:
Micro-controller based Low-Powered Semi-Autonomous Quadcopter

April 2023 - Present

Sony AI

Research Scientist

Research Focus:
Reinforcement Learning for the development of complex autonomous agents.

November 2021 - February 2023

University of Oxford

Postdoctoral Researcher

Dynamic Robot Systems Group

Research Focus:
Long-Horizon Motion Planning for Legged and Manipulator Robots using learned system dynamics models.

June 2018 - August 2018

Korea Advanced Institute of Science & Technology

Visiting Researcher


Learning Platform Adaptive Locomotion Policies

June 2018 - August 2018

ETH Zurich

Visiting Researcher

Robotic Systems Lab

Heterogeneous Swarm Optimization using Deep Reinforcement Learning

Selected Projects

Here are some of the projects I have worked on. Feel free to check them out.

Learning Low-Frequency Motion Control for Robust and Dynamic Robotic Locomotion - Accepted to ICRA 2023

Project Site

Video Summary

RLOC: Terrain-Aware Legged Locomotion using Reinforcement Learning and Optimal Control - Published in T-RO 2022


Video Summary

Real-time Trajectory Adaptation for Quadrupedal Locomotion using Deep Reinforcement Learning - Presented at ICRA 2021


Video Summary

Guided Constrained Policy Optimization for Dynamic Quadruped Robot Locomotion - Published in RA-L 2020


Video Summary

Field Trials

Heterogeneous Swarm Optimization using Deep Reinforcement Learning


Generative Adversarial Imitation Learning for Quadrupedal Footstep Planning



Micro-controller based Low-Powered Semi-Autonomous Agricultural Quadcopter


Development of Electronics and Navigation Software Framework for an Autonomous Mobile Robot

Internship at Rucha Yantra



I mostly used C for my Embedded Systems projects, in which I largely worked with ARM Cortex-M based micro-controllers. I used C++ for my Robotics projects and use it extensively in my current research.


I consider Python to be a brilliant prototyping tool. I use it extensively for machine learning, especially for training RL agents as part of my research. I then port most of my models to C++ for use with physical hardware.


I do not use MATLAB often, but it has been a convenient tool for basic control optimizations.


I'm familiar with the instruction sets of the Intel 8051 and 8086. I used these for some of my embedded systems projects.

Libraries and Frameworks


Definitely a great library. It was very important in a project where I developed my own shared-memory-based inter-process communication library.


I have used Eigen in every Robotics C++ project I have worked on.


It has been my go-to Deep Learning framework.

PyTorch C++

This has made it easy to port my models trained in Python to C++.


I definitely use PyTorch more than TensorFlow, but I do much of my RL training with TensorFlow-based frameworks and hence use it often.

OpenAI Baselines

I use Baselines to train RL agents with some of the widely used RL algorithms. This is mostly for prototyping, after which, in most cases, I use my own implementations of these algorithms.

Robotic Simulators


I first started RL with MuJoCo since it was widely used along with the OpenAI Gym framework.


I also tested PyBullet for training some RL agents. In fact, PyBullet was my first choice when I started training an RL policy for controlling the ANYmal quadruped.


Most RL algorithms are known to be extremely sample inefficient, so training a feasible RL policy requires very fast simulators. That is pretty much why I greatly enjoy using RaiSim, which is now my go-to simulator for RL.


I experimented with V-REP for RL. I cannot say I use it a lot, but I do like the drag-and-drop features it supports.


I use Gazebo with all of my ROS projects. Everything I do on the real robot is first tested using Gazebo.

Get In Touch

I'll be glad to connect, discuss and collaborate. This could be related to work or could be about hiking, painting, music, travelling or history. Feel free to shoot me an e-mail.


Seattle, USA