⭐ Extra Credit Opportunities
Important Information
- Total Available: 100 points maximum (50 + 50)
- Deadline: December 9, 2025 @ 11:00 PM CST
- Individual Work: Both options must be completed individually
- Submission: Via Gradescope (separate assignments for each option)
- Late Policy: No late submissions accepted for extra credit
- Both Options Allowed: You may complete both options for up to 100 extra credit points
📊 Choose Your Challenge
Option A: RTAB-Map 3D SLAM
50 Points
Focus: 3D Reconstruction & Visual SLAM
Key Skills:
- RGB-D sensor fusion
- Loop closure detection
- 3D map generation
- Navigation integration
Best For: Students interested in autonomous navigation and 3D mapping
Option B: SAMWISE Vision-Language
50 Points
Focus: Text-Driven Video Segmentation
Key Skills:
- Natural language processing
- Real-time video segmentation
- Object tracking
- Human-robot interaction
Best For: Students interested in AI/ML and vision-language models
Option A: RTAB-Map 3D SLAM Implementation
Implement Real-Time Appearance-Based Mapping (RTAB-Map) on the Husky + UR3 mobile manipulator for real-time 3D reconstruction and visual SLAM.
Point Breakdown (50 Points Total)
| Component | Points | Description |
|---|---|---|
| Installation & Setup | 5 | Successfully install RTAB-Map for ROS2 Humble/Jazzy |
| Basic RGB-D SLAM | 15 | Implement visual SLAM with RGB-D camera |
| Loop Closure Detection | 10 | Demonstrate at least 3 successful loop closures |
| 3D Map Generation | 10 | Create detailed 3D reconstruction and export point cloud |
| Navigation Integration | 5 | Use generated map for path planning with Nav2 |
| Documentation & Report | 5 | Technical report with analysis and video demonstrations |
Key Features to Implement
- Real-time 3D reconstruction from RGB-D camera data
- Loop closure detection for map consistency
- Graph optimization for global map alignment
- Multi-session mapping capability
- Integration with navigation stack for autonomous navigation
Installation Requirements
System Requirements
- Ubuntu 22.04 (ROS2 Humble) or Ubuntu 24.04 (ROS2 Jazzy)
- GPU recommended for real-time processing
- 8GB RAM minimum, 16GB recommended
- 10GB storage for map databases
Installation Commands
# Install RTAB-Map ROS2 package
sudo apt update
sudo apt install ros-$ROS_DISTRO-rtabmap-ros
# Install additional dependencies
sudo apt install ros-$ROS_DISTRO-depth-image-proc \
ros-$ROS_DISTRO-compressed-image-transport \
ros-$ROS_DISTRO-image-transport-plugins \
ros-$ROS_DISTRO-pcl-ros \
ros-$ROS_DISTRO-octomap-server
# Optional: Install RTAB-Map standalone GUI
sudo apt install ros-$ROS_DISTRO-rtabmap
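After installation, a typical bringup goes through the `rtabmap_launch` entry point. The topic remappings below are assumptions that match the topics recorded in the rosbag section; verify them against your Husky simulation with `ros2 topic list` before launching.

```shell
# Example bringup (sketch) -- topic names are assumptions, check them
# against your simulation with `ros2 topic list`.
ros2 launch rtabmap_launch rtabmap.launch.py \
  rgb_topic:=/camera/color/image_raw \
  depth_topic:=/camera/aligned_depth_to_color/image_raw \
  camera_info_topic:=/camera/color/camera_info \
  frame_id:=base_link \
  approx_sync:=true \
  use_sim_time:=true
```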
Required Demonstration Scenarios
Scenario 1: Indoor Mapping
- Launch Husky in Gazebo indoor environment
- Start RTAB-Map with RGB-D camera
- Teleoperate robot through environment
- Create complete loop for closure detection
- Save 3D reconstruction
Scenario 2: Object-Rich Environment
- Add multiple objects to scene
- Map environment with detailed 3D reconstruction
- Demonstrate octomap generation
- Export point cloud for analysis
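The exported point cloud can be sanity-checked without RViz if you export ASCII PLY. The parser below is a minimal sketch for that case only; binary PLY (the default in some exporters) needs a library such as open3d instead.

```python
# Sketch: parse vertex (x, y, z) triples from an ASCII PLY export such as
# cloud_map.ply, e.g. to count points or compute the map extent.
def read_ascii_ply_points(lines):
    """Return a list of (x, y, z) tuples from ASCII PLY lines."""
    lines = [ln.rstrip("\n") for ln in lines]
    count, start = 0, 0
    for i, line in enumerate(lines):
        if line.startswith("element vertex"):
            count = int(line.split()[-1])       # declared vertex count
        if line.strip() == "end_header":
            start = i + 1                       # data begins after header
            break
    return [tuple(float(v) for v in ln.split()[:3])
            for ln in lines[start:start + count]]
```

Usage: `pts = read_ascii_ply_points(open("maps/cloud_map.ply"))`, then check `len(pts)` is in a plausible range for your environment.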
Scenario 3: Multi-Session Mapping (Optional +5 bonus)
- Create initial map
- Save database
- Restart with localization mode
- Extend map to new areas
Deliverables
Required Submission Structure
rtabmap_extra_credit_[netID]/
├── launch/
│   └── rtabmap_husky.launch.py
├── src/
│   ├── rtabmap_controller.py
│   └── loop_closure_monitor.py
├── config/
│   └── rtabmap_params.yaml
├── maps/
│   ├── rtabmap.db
│   ├── map_2d.pgm
│   └── cloud_map.ply
├── results/
│   ├── trajectory.txt
│   └── loop_closures.csv
├── videos/
│   ├── mapping_demo.mp4
│   └── loop_closure_demo.mp4
├── report/
│   └── technical_report.pdf
└── README.md
Rosbag Recording
ros2 bag record -o rtabmap_demo_[netID] \
/camera/color/image_raw \
/camera/aligned_depth_to_color/image_raw \
/scan /odom /tf /tf_static \
/rtabmap/mapData /rtabmap/cloud_map /map
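A quick way to sanity-check the `results/trajectory.txt` deliverable is to compute the total path length. The sketch below assumes TUM format (one `timestamp tx ty tz qx qy qz qw` pose per line), which RTAB-Map can export; adjust the column indices if you export a different format.

```python
# Sketch: total Euclidean path length of a TUM-format trajectory file
# (timestamp tx ty tz qx qy qz qw per line; `#` lines are comments).
import math

def trajectory_length(lines):
    """Sum of distances between consecutive poses, in meters."""
    points = []
    for line in lines:
        parts = line.split()
        if len(parts) < 4 or parts[0].startswith("#"):
            continue
        points.append(tuple(float(v) for v in parts[1:4]))
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))
```

Usage: `trajectory_length(open("results/trajectory.txt"))` — the result should roughly match the distance you teleoperated.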
Option B: SAMWISE Text-Driven Video Segmentation
Integrate SAMWISE (CVPR 2025) for natural language-driven object segmentation and manipulation with the Husky + UR3 system.
Point Breakdown (50 Points Total)
| Component | Points | Description |
|---|---|---|
| Installation & Setup | 10 | Install SAMWISE with all dependencies and verify GPU |
| ROS2 Node Implementation | 25 | Create segmentation node with text query handling |
| Demonstration Tasks | 15 | Complete 3 required tasks (static, dynamic, custom) |
Optional: choose one advanced feature for a +25 point bonus:
| Advanced Feature | Points | Description |
|---|---|---|
| Multi-Object Tracking | +25 | Handle multiple simultaneous text queries |
| Action-Based Queries | +25 | Implement motion-dependent queries |
| MoveIt2 Integration | +25 | Complete autonomous pick-and-place pipeline |
| Custom Fine-Tuning | +25 | Train on robot-specific scenarios |
Note: The base 50 points include installation (10), ROS2 implementation (25), and demonstrations (15).
Advanced features are optional and not required for the 50-point extra credit.
SAMWISE Architecture
- Cross-Modal Temporal (CMT) Adapter: 4.2M parameters for temporal modeling
- Conditional Memory Encoder (CME): 0.7M parameters for tracking bias detection
- Dual Prompting Strategy: Contextual and motion prompts
- Performance: State-of-the-art on MeViS (49.5%), Ref-YouTube-VOS (69.2%), Ref-DAVIS (70.6%)
- Efficiency: Only 4.9M trainable parameters (SAM2 remains frozen)
Installation Requirements
Hardware Requirements
- NVIDIA GPU: GTX 1080 (8GB VRAM) minimum; RTX 3090 (24GB VRAM) recommended
- 16GB RAM minimum, 32GB recommended
- 20GB storage for models and datasets
Installation Commands
# Create conda environment
conda create --name samwise python=3.10 -y
conda activate samwise
# Install PyTorch with CUDA 11.8
pip install torch==2.3.1 torchvision==0.18.1 --index-url https://download.pytorch.org/whl/cu118
# Clone SAMWISE repository
git clone https://github.com/ClaudiaCuttano/SAMWISE
cd SAMWISE
pip install -r requirements.txt
# Optional: Memory-efficient attention
pip install xformers
# Download pretrained checkpoint (choose one)
wget [checkpoint_url] -O checkpoints/samwise_clip.pth # CLIP-B version
# OR
wget [checkpoint_url] -O checkpoints/samwise_roberta.pth # RoBERTa version
ROS2 Node Implementation
Required Functionality
Subscribers:
- /camera/image_raw (sensor_msgs/Image) - Camera feed
- /object_query (std_msgs/String) - Text query
Publishers:
- /samwise/segmentation (sensor_msgs/Image) - Segmentation mask overlay
- /samwise/object_centroid (geometry_msgs/PointStamped) - Object center
- /samwise/bounding_box (vision_msgs/Detection2D) - Object bounds
Core Features:
- Maintain SAM2 memory bank for temporal consistency
- Handle dynamic text query changes
- Process video stream at minimum 5 FPS
- Visualize results in RViz2
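Turning the segmentation mask into the centroid and bounding box the node publishes is a small NumPy step. The sketch below shows that reduction only; wrapping the results into `PointStamped` / `Detection2D` messages is omitted.

```python
# Sketch: reduce a binary segmentation mask to the pixel centroid and
# axis-aligned bounding box needed for /samwise/object_centroid and
# /samwise/bounding_box.
import numpy as np

def mask_to_centroid_bbox(mask):
    """Return ((cx, cy), (x_min, y_min, x_max, y_max)) in pixel
    coordinates, or None when the mask is empty (object not found)."""
    ys, xs = np.nonzero(mask)
    if xs.size == 0:
        return None
    centroid = (float(xs.mean()), float(ys.mean()))
    bbox = (int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max()))
    return centroid, bbox
```

Returning `None` for an empty mask gives the node a clean signal for "query not matched", which also matters for the occlusion handling in Task 2.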
Required Demonstration Tasks
Task 1: Static Object Manipulation (5 pts)
- Query: "Pick up the red cube"
- SAMWISE segments red cube in camera view
- UR3 moves to grasp position based on centroid
- Gripper closes on object (1.05 for close, 0.0 for open)
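Converting the segmented centroid pixel into a 3D grasp target can be done with the pinhole camera model. This is a sketch: the intrinsics `fx, fy, cx, cy` are placeholders that should come from `/camera/color/camera_info` on the actual setup, with `depth` in meters read from the aligned depth image at the centroid.

```python
# Sketch: pinhole back-projection of a pixel (u, v) with depth Z (meters)
# into the camera optical frame. Intrinsics are placeholder values; read
# the real ones from the CameraInfo message.
def pixel_to_camera_point(u, v, depth, fx, fy, cx, cy):
    """Return (X, Y, Z) in meters in the camera optical frame."""
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return (x, y, depth)
```

The resulting point still needs a TF transform from the optical frame into the UR3 planning frame before sending a grasp pose.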
Task 2: Dynamic Object Tracking (5 pts)
- Query: "Follow the moving ball"
- Husky base tracks ball centroid
- Maintains 1-meter following distance
- Handles temporary occlusions
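The 1-meter following behaviour can be sketched as a proportional controller on the range error, with a "hold position" fallback when the detection drops during occlusion. Gains and limits below are illustrative assumptions, not tuned values.

```python
# Sketch: proportional speed command for the Husky base to hold a 1 m
# following distance. kp and max_speed are illustrative, not tuned.
def follow_cmd(distance, target=1.0, kp=0.5, max_speed=0.5):
    """Return forward velocity in m/s; distance=None means target lost."""
    if distance is None:
        return 0.0                      # hold still during occlusion
    error = distance - target           # positive -> too far, drive forward
    v = kp * error
    return max(-max_speed, min(max_speed, v))
```

In the actual node this value would go into `Twist.linear.x`, with a similar term on the centroid's horizontal offset steering `Twist.angular.z`.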
Task 3: Custom Query (5 pts)
- Student-defined creative text query
- Must demonstrate unique SAMWISE capability
- Examples: "The person waving", "The tallest object", "The transparent bottle"
Deliverables
Required Submission Structure
assignment4_samwise_[netID]/
├── rosbag/
│   └── samwise_demo_[netID].db3
├── videos/
│   ├── task1_static.mp4
│   ├── task2_dynamic.mp4
│   └── task3_custom.mp4
├── src/
│   └── samwise_ros2/
│       ├── samwise_segmentation_node.py
│       └── package.xml
├── config/
│   └── samwise_params.yaml
├── report/
│   └── technical_report.pdf
└── README.md
Evaluation Criteria
- Correct segmentation accuracy (IoU > 0.6)
- Real-time performance (> 5 FPS)
- Temporal consistency across frames
- Robustness to query changes
- Code quality and documentation
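You can self-check the IoU > 0.6 criterion before submitting. The sketch below computes mask IoU with plain NumPy boolean operations; how you obtain a ground-truth mask (e.g. hand annotation of a few frames) is up to you.

```python
# Sketch: intersection-over-union between a predicted and a ground-truth
# binary mask, for self-checking the IoU > 0.6 grading criterion.
import numpy as np

def mask_iou(pred, gt):
    """IoU in [0, 1]; two empty masks count as perfect agreement."""
    pred = np.asarray(pred).astype(bool)
    gt = np.asarray(gt).astype(bool)
    union = np.logical_or(pred, gt).sum()
    if union == 0:
        return 1.0
    return np.logical_and(pred, gt).sum() / union
```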
Paper Reference
@misc{cuttano2025samwiseinfusingwisdomsam2,
title={SAMWISE: Infusing Wisdom in SAM2 for Text-Driven Video Segmentation},
author={Claudia Cuttano and Gabriele Trivigno and Gabriele Rosi and
Carlo Masone and Giuseppe Averta},
year={2025},
eprint={2411.17646},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2411.17646},
}
⚠️ Important Submission Guidelines
- Deadline: December 9, 2025 @ 11:00 PM CST (No extensions for extra credit)
- Gradescope Assignments:
- "Extra Credit - RTAB-Map" for Option A
- "Extra Credit - SAMWISE" for Option B
- File Size Limits: 500MB per submission
- Academic Integrity: Individual work only - collaboration on extra credit will result in zero points
Questions & Support
- Office Hours: Wednesdays 1:30-3:00 PM, SC 4407
- Campuswire Tags:
  - Use #extra-credit-rtabmap for RTAB-Map questions
  - Use #extra-credit-samwise for SAMWISE questions
- TA Email: ksa5@illinois.edu (use only for submission issues)