WaypointGen++
Follow-up to WaypointGen. Ongoing project extending VLM-based navigation waypoint generation. (Under Review)
I’m Kulbir Singh Ahluwalia, a Ph.D. candidate in Computer Science at the University of Illinois Urbana-Champaign (UIUC), working on natural language grounding for agricultural robots. I am fortunate to be mentored by Prof. Girish Chowdhary and Prof. Julia Hockenmaier. I hold a Master of Engineering in Robotics from the University of Maryland, where I gained hands-on experience with robotic systems and physics simulation. During a recent summer internship at EarthSense, Inc., I helped develop a natural-language-conditioned waypoint generation pipeline. My technical skills include the Robot Operating System (ROS) and implementing state-of-the-art NLP and computer vision pipelines for mobile manipulators. I also co-developed the CS-498-GC Mobile Robotics course with Prof. Chowdhary. My long-term goal is to scale up Physical AI for the benefit of humanity.
Research Topics: Mobile Robotics / Computer Vision / Deep Learning / Machine Learning / NLP / Path Planning / Decision Making / 3D Vision
We introduce WaypointGen, a 14-step pipeline that grounds natural language instructions to 2D navigation waypoints. We use a Qwen3 VLM-based filtering approach with pre-defined templates to extract the relevant geometric constraints. The method employs SLIC superpixels in a bird's-eye view (BEV) and Model Predictive Path Integral (MPPI) control for trajectory selection, demonstrating enhanced navigation capabilities for mobile manipulators in dynamic environments.
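Below is a minimal, hedged sketch of the MPPI trajectory-selection idea mentioned above; the unicycle dynamics, noise scales, and pure distance-to-waypoint cost are placeholders, not the cost terms or 14-step structure of the actual WaypointGen pipeline.

```python
import numpy as np

def mppi_select(state, waypoint, horizon=30, samples=256, lam=1.0, dt=0.1, rng=None):
    """Sample control sequences, roll them out, and return MPPI-weighted controls.

    state: (x, y, heading); waypoint: (x, y) target from the language-grounding stage.
    Hedged illustration with a unicycle model and a simple distance cost.
    """
    rng = rng or np.random.default_rng(0)
    nominal = np.zeros((horizon, 2))                    # [linear vel, angular vel]
    noise = rng.normal(0.0, [0.3, 0.5], size=(samples, horizon, 2))
    controls = nominal + noise
    costs = np.zeros(samples)
    for k in range(samples):
        x, y, th = state
        for v, w in controls[k]:
            x += v * np.cos(th) * dt
            y += v * np.sin(th) * dt
            th += w * dt
            costs[k] += np.hypot(waypoint[0] - x, waypoint[1] - y)  # stage cost: distance to waypoint
    weights = np.exp(-(costs - costs.min()) / lam)      # softmin weighting of rollouts
    weights /= weights.sum()
    return np.tensordot(weights, controls, axes=1)      # weighted average control sequence

# Example: drive a robot at the origin toward a waypoint 3 m ahead and 1 m to the left.
plan = mppi_select(state=(0.0, 0.0, 0.0), waypoint=(3.0, 1.0))
print(plan[0])  # first (v, w) command to execute before replanning
```

The softmin weighting lets low-cost rollouts dominate the averaged control sequence; only the first command is executed before replanning from the new state.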
We introduce an efficient active semantic mapping approach for horticultural robotics, using a mobile manipulator with an RGB-D camera. Probabilistic semantic octomaps are used to detect target regions of interest such as fruits, generate candidate viewpoints, and compute information gain for next-best-view planning. An efficient ray-casting strategy and a novel information gain function that accounts for semantics and occlusions are introduced for target-focused map exploration.
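The following is a hedged illustration of a next-best-view information gain of this flavor, using a dense voxel grid in place of an octomap; the occupancy thresholds and the semantic bonus term are assumptions rather than the paper's exact gain function.

```python
import numpy as np

def viewpoint_information_gain(prob_occupied, semantic_score, origin, directions,
                               max_range=20.0, step=0.5):
    """Score a candidate viewpoint by casting rays through a voxel grid.

    prob_occupied: (X, Y, Z) occupancy probabilities (0.5 = unknown).
    semantic_score: (X, Y, Z) per-voxel weight for targets of interest (e.g. fruit).
    origin is in voxel coordinates; this dense grid stands in for the octomap.
    """
    gain = 0.0
    for d in directions:
        d = d / np.linalg.norm(d)
        for t in np.arange(step, max_range, step):
            idx = tuple(np.floor(origin + t * d).astype(int))
            if any(i < 0 or i >= s for i, s in zip(idx, prob_occupied.shape)):
                break                                    # ray left the map
            p = prob_occupied[idx]
            if p > 0.7:                                  # likely occupied: ray is occluded
                break
            if abs(p - 0.5) < 0.05:                      # unknown voxel: potential new information
                gain += 1.0 + semantic_score[idx]        # extra credit near semantic targets
    return gain

# Example: an all-unknown map with no semantic targets yet.
grid = np.full((40, 40, 40), 0.5)
sem = np.zeros_like(grid)
rays = [np.array([1.0, 0.0, 0.0]), np.array([0.0, 1.0, 0.0]), np.array([0.0, 0.0, 1.0])]
print(viewpoint_information_gain(grid, sem, np.array([20.0, 20.0, 20.0]), rays))
```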
This poster presents a text-enabled FarmBot system that lets users control the robotic gardening platform via natural language. Using a custom Python wrapper built on the FarmBot REST API, natural language commands are grounded in real-time robot state and translated into executable code with a fine-tuned CodeT5 model. The system generates valid plant placement configurations that satisfy spatial constraints specified in natural language.
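A hedged sketch of the grounding step follows, assuming a fine-tuned CodeT5 checkpoint at a hypothetical local path; the prompt format and state encoding are illustrative, and the generated code would then go through the FarmBot REST API wrapper described above.

```python
from transformers import AutoTokenizer, T5ForConditionalGeneration

# Hypothetical fine-tuned checkpoint path; the real checkpoint and prompt format may differ.
MODEL_DIR = "checkpoints/codet5-farmbot"
tokenizer = AutoTokenizer.from_pretrained(MODEL_DIR)
model = T5ForConditionalGeneration.from_pretrained(MODEL_DIR)

def command_to_code(instruction, robot_state):
    """Ground a natural language command in the current robot state and emit Python."""
    prompt = f"state: {robot_state}\ninstruction: {instruction}"
    inputs = tokenizer(prompt, return_tensors="pt")
    out = model.generate(**inputs, max_new_tokens=128)
    return tokenizer.decode(out[0], skip_special_tokens=True)

code = command_to_code("plant three basil seeds 10 cm apart along the left bed",
                       robot_state={"x": 120, "y": 300, "z": 0, "tool": "seeder"})
print(code)  # reviewed, then executed via the FarmBot REST API wrapper
```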
DeepPaSTL aims to accurately forecast long-term pasture growth, tackling the challenge of estimating pasture biomass without extensive site-specific data or frequent field measurements. The approach predicts how a pasture will evolve from past observed pasture heights alone, without regular field monitoring. DeepPaSTL introduces a bi-directional ConvLSTM encoder-decoder that learns spatio-temporal pasture growth dynamics purely from spatial height measurements.
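As an illustration of the core building block, here is a minimal ConvLSTM cell in PyTorch; the channel counts, grid size, and single-cell usage below are assumptions, not the full bi-directional encoder-decoder used by DeepPaSTL.

```python
import torch
import torch.nn as nn

class ConvLSTMCell(nn.Module):
    """One ConvLSTM cell: the gates are convolutions over [input, hidden] feature maps."""
    def __init__(self, in_ch, hid_ch, kernel=3):
        super().__init__()
        self.gates = nn.Conv2d(in_ch + hid_ch, 4 * hid_ch, kernel, padding=kernel // 2)

    def forward(self, x, state):
        h, c = state
        i, f, o, g = torch.chunk(self.gates(torch.cat([x, h], dim=1)), 4, dim=1)
        i, f, o, g = torch.sigmoid(i), torch.sigmoid(f), torch.sigmoid(o), torch.tanh(g)
        c = f * c + i * g
        h = o * torch.tanh(c)
        return h, c

# Example: encode a sequence of pasture height maps (batch=1, 1 channel, 64x64 grid).
cell = ConvLSTMCell(in_ch=1, hid_ch=16)
h = torch.zeros(1, 16, 64, 64)
c = torch.zeros(1, 16, 64, 64)
for t in range(8):                      # 8 past height observations
    frame = torch.rand(1, 1, 64, 64)
    h, c = cell(frame, (h, c))
print(h.shape)                          # spatio-temporal encoding passed on to a decoder
```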
This work targets large-scale pasture monitoring for precision agriculture, deploying a team of robots to track grassland growth for optimal rotational grazing and land productivity, and addresses the lack of timely growth data in current practice. It proposes an integrated pipeline combining synthetic data generation, deep neural network-based spatiotemporal prediction, and an intermittent multi-robot deployment strategy to periodically survey evolving pastureland at low cost.
An abstract accepted at the 40th Anniversary of the IEEE Conference on Robotics and Automation (ICRA@40), 2024.
An article featuring the multispectral Fundus Eye camera prototype, as presented in Optics and Photonics News.
A conference presentation on combining reinforcement and supervised learning for non-destructive testing of fruits using near infrared spectrum data.
Co-developed with Prof. Girish Chowdhary
• Co-developed course curriculum focusing on mobile robotics, ROS2, sensor fusion, and SLAM algorithms.
• Managing coding exercises and problem sets involving Extended Kalman Filtering and odometry implementation (a hedged EKF sketch follows this list).
• Conducting office hours and helping students with ROS2 development and debugging.
• Maintaining course website and autograding infrastructure on Gradescope.
• Special Topic Fall 2025: SLAM-ing on Mars
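Below is a hedged sketch of the kind of EKF prediction/update step the coding exercises cover, for a differential-drive pose (x, y, theta); the motion and measurement models are simplified stand-ins, not the assignment's starter code.

```python
import numpy as np

def ekf_predict(mu, Sigma, v, w, dt, R):
    """Predict step for a differential-drive pose (x, y, theta) from commanded v, w."""
    x, y, th = mu
    mu_bar = np.array([x + v * np.cos(th) * dt,
                       y + v * np.sin(th) * dt,
                       th + w * dt])
    G = np.array([[1, 0, -v * np.sin(th) * dt],      # Jacobian of the motion model
                  [0, 1,  v * np.cos(th) * dt],
                  [0, 0,  1]])
    return mu_bar, G @ Sigma @ G.T + R

def ekf_update(mu_bar, Sigma_bar, z, H, Q):
    """Correct with a linear(ized) measurement z = H x + noise."""
    S = H @ Sigma_bar @ H.T + Q
    K = Sigma_bar @ H.T @ np.linalg.inv(S)           # Kalman gain
    mu = mu_bar + K @ (z - H @ mu_bar)
    Sigma = (np.eye(len(mu_bar)) - K @ H) @ Sigma_bar
    return mu, Sigma
```

In the exercises the measurement model would come from wheel odometry or landmark observations; here H and Q are left generic.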
Instructor: Dr. Svetlana Lazebnik
• Updated and verified starter code for assignments, and answered student questions during office hours and through Campuswire.
• Assessed student submissions via SpeedGrader on Canvas, and designed multimodal quiz questions, including single-choice, multiple-choice, and matching formats.
Instructor: Dr. Eric Shaffer
• Created multimodal exam questions with integrated visualizations using Python and matplotlib for assessing student understanding of scientific visualization concepts.
• Assisted students with implementation of advanced visualization algorithms including ray marching, transfer functions, and interactive widget development.
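For illustration, here is a tiny, hedged version of ray marching with a transfer function of the kind the assignments covered: front-to-back compositing along a single ray through a scalar volume. The toy transfer function and sampling scheme are assumptions, not the course's reference implementation.

```python
import numpy as np

def transfer_function(value):
    """Map a scalar sample to (rgb color, opacity); a toy ramp, not the course's exact TF."""
    a = np.clip((value - 0.3) / 0.4, 0.0, 1.0)
    return np.array([value, 0.2, 1.0 - value]), 0.1 * a

def march_ray(volume, origin, direction, step=0.5, n_steps=200):
    """Front-to-back compositing of color/opacity samples along one ray."""
    color, alpha = np.zeros(3), 0.0
    pos = np.array(origin, dtype=float)
    d = np.array(direction, dtype=float)
    d /= np.linalg.norm(d)
    for _ in range(n_steps):
        idx = tuple(np.round(pos).astype(int))
        if any(i < 0 or i >= s for i, s in zip(idx, volume.shape)):
            break
        c, a = transfer_function(volume[idx])
        color += (1.0 - alpha) * a * c                 # weight sample by remaining transparency
        alpha += (1.0 - alpha) * a
        if alpha > 0.99:                               # early ray termination
            break
        pos += step * d
    return color, alpha

vol = np.random.rand(32, 32, 32)
print(march_ray(vol, origin=(0, 16, 16), direction=(1, 0, 0)))
```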
Supervisor: Michael McGuire, Lead Computer Vision Engineer
• Key Achievement: Contributed to developing a natural language-conditioned waypoint generation pipeline for agricultural robot navigation.
• Implemented state-of-the-art NLP and CV pipelines for Mobile Manipulators, enabling natural language instruction following.
• Created an automatic labeling pipeline for large outdoor robot navigation datasets using Grounded SAM2, streamlining data processing (a hedged sketch of the flow follows this list).
• Deployed and integrated open-source vision-language models (Molmo-7B-demo, Gemma-3-27B, Qwen-2.5-VL-72B, Qwen3-30B, Llama4-Scout, Spatial-VLM) for image-space reasoning and open-world, natural-language-instruction-conditioned question answering on four-wheeled skid-steer outdoor robots.
• Enhanced ROS-based systems for real-world agricultural applications, directly supporting the advancement of Physical AI.
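A hedged outline of the auto-labeling flow referenced above: detect_boxes and segment_boxes are hypothetical stand-ins for the Grounded SAM2 text-prompted detector and mask generator (their real APIs differ), so this only shows how prompts, boxes, masks, and label files fit together.

```python
import json
from pathlib import Path

import numpy as np

def autolabel(image_dir, prompts, detect_boxes, segment_boxes, out_dir):
    """Label every image with masks for the text prompts (e.g. 'crop row', 'person').

    detect_boxes(image_path, prompts) -> list of (box, prompt, score)   # hypothetical
    segment_boxes(image_path, boxes)  -> list of binary masks           # hypothetical
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    for image_path in sorted(Path(image_dir).glob("*.jpg")):
        detections = detect_boxes(image_path, prompts)
        boxes = [d[0] for d in detections]
        masks = segment_boxes(image_path, boxes)
        record = []
        for (box, prompt, score), mask in zip(detections, masks):
            record.append({"label": prompt, "box": list(map(float, box)),
                           "score": float(score), "mask_area": int(np.sum(mask))})
        (out / f"{image_path.stem}.json").write_text(json.dumps(record, indent=2))
```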
Topics: Mobile Robotics / Computer Vision / Deep Learning / Machine Learning / NLP / Path Planning / 3D Vision
Designed an autonomous robot capable of navigating and localizing itself in a test arena using QR codes and arrows. It uses an RGB camera, IMU, optical encoders, and an ultrasonic sensor to detect, retrieve, and transport user-specified blocks. Featured video, Robot videos, Featured post
Built a segmentation network using SLIC superpixels as input. A pretrained VGG16 network had its final layers replaced by fully connected layers to classify superpixels (98% accuracy).
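A minimal, hedged sketch of that idea: SLIC superpixels from scikit-image, a pretrained VGG16 from torchvision with its final layer swapped, and per-superpixel crops as inputs. The 2-class head and bounding-box cropping are assumptions rather than the project's exact encoding.

```python
import numpy as np
import torch.nn as nn
from skimage import data
from skimage.segmentation import slic
from torchvision import models, transforms

# Pretrained VGG16 (torchvision >= 0.13 weights API) with the last classifier layer
# swapped for a small 2-class head; the class count here is just an example.
vgg = models.vgg16(weights=models.VGG16_Weights.DEFAULT)
vgg.classifier[-1] = nn.Linear(vgg.classifier[-1].in_features, 2)
vgg.eval()

prep = transforms.Compose([transforms.ToTensor(), transforms.Resize((224, 224))])

image = data.astronaut()                                 # placeholder RGB image
segments = slic(image, n_segments=150, compactness=10)   # per-pixel superpixel labels

# Classify a bounding-box crop around each superpixel; the project's exact
# superpixel-to-network encoding may have differed.
labels = {}
for sp in np.unique(segments):
    ys, xs = np.nonzero(segments == sp)
    crop = image[ys.min():ys.max() + 1, xs.min():xs.max() + 1]
    labels[sp] = vgg(prep(crop.copy()).unsqueeze(0)).argmax(dim=1).item()
print(labels)
```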
Used MobileNet to optimize cross-view image generation with a 5.7X reduction in parameters.
Simulated a Baxter robot transporting cubes between tables in Gazebo using ROS Kinetic. Waypoints were generated using Rviz and obstacles were avoided in the custom-designed Gazebo world.
Implemented the A* algorithm in a configuration space with obstacles. The TurtleBot 3 obeys non-holonomic constraints using an action set of 8 combinations of two user-defined wheel RPMs.
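A hedged sketch of that search: A* over (x, y, theta) with eight actions formed from two wheel speeds and differential-drive kinematics. The wheel radius, track width, discretization, and costs are placeholders, not the assignment's actual TurtleBot 3 values.

```python
import heapq
import numpy as np

def astar_rpm(start, goal, is_free, rpm1=5.0, rpm2=10.0,
              wheel_r=0.033, track=0.16, dt=1.0, goal_tol=0.2):
    """A* over (x, y, theta) states with 8 non-holonomic actions from two wheel speeds.

    is_free(x, y) -> bool checks the obstacle map; dimensions are placeholders.
    """
    actions = [(0, rpm1), (rpm1, 0), (rpm1, rpm1), (0, rpm2),
               (rpm2, 0), (rpm2, rpm2), (rpm1, rpm2), (rpm2, rpm1)]
    open_set = [(0.0, start, [start])]
    visited = set()
    while open_set:
        _, (x, y, th), path = heapq.heappop(open_set)
        if np.hypot(goal[0] - x, goal[1] - y) < goal_tol:
            return path
        key = (round(x, 1), round(y, 1), round(th, 1))        # coarse duplicate detection
        if key in visited:
            continue
        visited.add(key)
        for ul, ur in actions:
            vl, vr = wheel_r * ul, wheel_r * ur               # wheel rim speeds
            v, w = (vl + vr) / 2.0, (vr - vl) / track         # differential-drive kinematics
            nx = x + v * np.cos(th) * dt
            ny = y + v * np.sin(th) * dt
            nth = th + w * dt
            if not is_free(nx, ny):
                continue
            g = len(path) * dt                                # uniform step cost (simplification)
            f = g + np.hypot(goal[0] - nx, goal[1] - ny)      # heuristic: Euclidean distance
            heapq.heappush(open_set, (f, (nx, ny, nth), path + [(nx, ny, nth)]))
    return None

# Example: obstacle-free 5 m x 5 m arena.
path = astar_rpm(start=(0.5, 0.5, 0.0), goal=(2.0, 2.0),
                 is_free=lambda x, y: 0.0 <= x <= 5.0 and 0.0 <= y <= 5.0)
```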
Developed a lane-detection algorithm using the Hough transform and lane-pixel histograms. Also implemented the homography and warp-perspective functions from scratch for overlays.
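A hedged sketch of the from-scratch warp: each destination pixel is mapped back through the inverse homography and sampled from the source image (nearest-neighbor here for brevity; the original likely used bilinear interpolation).

```python
import numpy as np

def warp_perspective(src, H, out_shape):
    """Inverse-warp src into an out_shape canvas using homography H (3x3)."""
    h_out, w_out = out_shape
    Hinv = np.linalg.inv(H)
    ys, xs = np.mgrid[0:h_out, 0:w_out]
    ones = np.ones_like(xs)
    dst_pts = np.stack([xs, ys, ones], axis=-1).reshape(-1, 3).T   # homogeneous dest coords
    src_pts = Hinv @ dst_pts
    src_pts /= src_pts[2]                                          # back to inhomogeneous coords
    sx = np.round(src_pts[0]).astype(int).reshape(h_out, w_out)    # nearest-neighbor sampling
    sy = np.round(src_pts[1]).astype(int).reshape(h_out, w_out)
    valid = (sx >= 0) & (sx < src.shape[1]) & (sy >= 0) & (sy < src.shape[0])
    out = np.zeros((h_out, w_out) + src.shape[2:], dtype=src.dtype)
    out[valid] = src[sy[valid], sx[valid]]
    return out

# Example call with an identity-like homography plus a small shear and translation.
src = np.zeros((100, 100, 3), dtype=np.uint8)
H = np.array([[1.0, 0.2, 10.0], [0.0, 1.0, 5.0], [0.0, 0.0, 1.0]])
overlay = warp_perspective(src, H, out_shape=(120, 120))
```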
Developed an industrial system with UR10 robotic arms, conveyor belts, and AGVs. The system picked parts from a conveyor, disposed of faulty items, assembled orders, and delivered them using AGVs.
Simulated a 7-DOF UR5 arm using MoveIt and RViz. Calculated DH parameters and computed forward kinematics by hand, verified with Peter Corke's Robotics Toolbox. Simulation videos
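A hedged sketch of DH-based forward kinematics of the kind verified against the toolbox; the example table values are placeholders, not the arm's actual parameters.

```python
import numpy as np

def dh_transform(theta, d, a, alpha):
    """Standard DH link transform."""
    ct, st = np.cos(theta), np.sin(theta)
    ca, sa = np.cos(alpha), np.sin(alpha)
    return np.array([[ct, -st * ca,  st * sa, a * ct],
                     [st,  ct * ca, -ct * sa, a * st],
                     [0.0,      sa,       ca,      d],
                     [0.0,     0.0,      0.0,    1.0]])

def forward_kinematics(joint_angles, dh_table):
    """Chain the per-link transforms; dh_table rows are (d, a, alpha) per joint."""
    T = np.eye(4)
    for theta, (d, a, alpha) in zip(joint_angles, dh_table):
        T = T @ dh_transform(theta, d, a, alpha)
    return T                                   # base-to-end-effector pose

# Placeholder table (NOT the arm's real parameters), just to show the call pattern.
dh_table = [(0.1, 0.0, np.pi / 2), (0.0, 0.4, 0.0), (0.0, 0.35, 0.0)]
print(forward_kinematics([0.0, -np.pi / 4, np.pi / 3], dh_table))
```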
Developed controllers for two inverted pendulums on a moving cart.
Engineered a prototype to transport objects between rooms via web-based remote access with live video and gesture control. The robot featured on-board power, a custom LED light source for low-light navigation, smart device control, and a speaker for prerecorded messages.
Awarded first prize at IIT Roorkee and placed 6th out of 400 teams at IIT Bombay. Video 1 Video 2
I am fortunate to be mentored by distinguished faculty at the intersection of robotics, natural language processing, and agricultural technology:
Director: Prof. Girish Chowdhary
Focus: Agricultural robotics, field robots, autonomous systems, and machine learning for agriculture
Director: Prof. Julia Hockenmaier
Focus: Natural language processing, computational linguistics, vision and language, semantic parsing
Research Focus: Natural Language Grounding for Agricultural Robots - Advancing Physical AI for Humanity
First prize for the final-year major project in the B.Tech. (Electrical Engineering) examination, 2015-19, titled "Teleoperated Gesture-Controlled Robotic Arm".
Received a certificate of appreciation for contributions to IEEE PEC (2017, 2018).
Awarded the National Bal Shree Award for Creative Scientific Innovations by the Ministry of Human Resource Development, Govt. of India. The selection consisted of a series of hands-on scientific tests and interviews at the city, zonal, and national levels.
"Talk is cheap. Show me the code. Show me the results." - Linus Torvalds, Shivansh Patel