⭐ Extra Credit Opportunities

Important Information

  • Total Available: 100 points maximum (50 + 50)
  • Deadline: December 9, 2025 @ 11:00 PM CST
  • Individual Work: Both options must be completed individually
  • Submission: Via Gradescope (separate assignments for each option)
  • Late Policy: No late submissions accepted for extra credit
  • Both Options Allowed: You may complete both options for up to 100 extra credit points

📊 Choose Your Challenge

Option A: RTAB-Map 3D SLAM

50 Points

Focus: 3D Reconstruction & Visual SLAM

Key Skills:

  • RGB-D sensor fusion
  • Loop closure detection
  • 3D map generation
  • Navigation integration

Best For: Students interested in autonomous navigation and 3D mapping


Option B: SAMWISE Vision-Language

50 Points

Focus: Text-Driven Video Segmentation

Key Skills:

  • Natural language processing
  • Real-time video segmentation
  • Object tracking
  • Human-robot interaction

Best For: Students interested in AI/ML and vision-language models


Option A: RTAB-Map 3D SLAM Implementation

Implement Real-Time Appearance-Based Mapping on the Husky + UR3 mobile manipulator for real-time 3D reconstruction and visual SLAM.

Point Breakdown (50 Points Total)

  • Installation & Setup (5 pts): Successfully install RTAB-Map for ROS2 Humble/Jazzy
  • Basic RGB-D SLAM (15 pts): Implement visual SLAM with an RGB-D camera
  • Loop Closure Detection (10 pts): Demonstrate at least 3 successful loop closures
  • 3D Map Generation (10 pts): Create a detailed 3D reconstruction and export the point cloud
  • Navigation Integration (5 pts): Use the generated map for path planning with Nav2
  • Documentation & Report (5 pts): Technical report with analysis and video demonstrations

Key Features to Implement

  • Real-time 3D reconstruction from RGB-D camera data
  • Loop closure detection for map consistency
  • Graph optimization for global map alignment
  • Multi-session mapping capability
  • Integration with navigation stack for autonomous navigation

Installation Requirements

System Requirements
  • Ubuntu 22.04 (ROS2 Humble) or Ubuntu 24.04 (ROS2 Jazzy)
  • GPU recommended for real-time processing
  • 8GB RAM minimum, 16GB recommended
  • 10GB storage for map databases
Installation Commands
# Install RTAB-Map ROS2 package
sudo apt update
sudo apt install ros-$ROS_DISTRO-rtabmap-ros

# Install additional dependencies
sudo apt install ros-$ROS_DISTRO-depth-image-proc \
                 ros-$ROS_DISTRO-compressed-image-transport \
                 ros-$ROS_DISTRO-image-transport-plugins \
                 ros-$ROS_DISTRO-pcl-ros \
                 ros-$ROS_DISTRO-octomap-server

# Optional: Install RTAB-Map standalone GUI
sudo apt install ros-$ROS_DISTRO-rtabmap
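
To tie the installation to the required deliverable, here is a minimal sketch of what launch/rtabmap_husky.launch.py might look like. The topic remappings follow the camera topics listed in the rosbag recording command under Deliverables; the frame name and parameter values are assumptions to adapt to your Husky configuration, and the package names follow the current rtabmap_ros split (rtabmap_slam, rtabmap_viz) — older releases ship everything in rtabmap_ros.

```python
# Sketch of launch/rtabmap_husky.launch.py (adjust topics/frames to your setup).
from launch import LaunchDescription
from launch_ros.actions import Node


def generate_launch_description():
    remappings = [
        ('rgb/image', '/camera/color/image_raw'),
        ('depth/image', '/camera/aligned_depth_to_color/image_raw'),
        ('rgb/camera_info', '/camera/color/camera_info'),
    ]
    parameters = [{
        'frame_id': 'base_link',      # robot base frame (assumption)
        'subscribe_depth': True,      # use RGB-D input
        'approx_sync': True,          # tolerate small RGB/depth timestamp offsets
        'Reg/Force3DoF': 'true',      # planar robot: constrain to x, y, yaw
    }]
    return LaunchDescription([
        Node(package='rtabmap_slam', executable='rtabmap',
             parameters=parameters, remappings=remappings,
             arguments=['--delete_db_on_start']),
        Node(package='rtabmap_viz', executable='rtabmap_viz',
             parameters=parameters, remappings=remappings),
    ])
```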

Required Demonstration Scenarios

Scenario 1: Indoor Mapping
  1. Launch Husky in Gazebo indoor environment
  2. Start RTAB-Map with RGB-D camera
  3. Teleoperate robot through environment
  4. Create complete loop for closure detection
  5. Save 3D reconstruction
Scenario 2: Object-Rich Environment
  1. Add multiple objects to scene
  2. Map environment with detailed 3D reconstruction
  3. Demonstrate octomap generation
  4. Export point cloud for analysis
Scenario 3: Multi-Session Mapping (Optional +5 bonus)
  1. Create initial map
  2. Save database
  3. Restart with localization mode
  4. Extend map to new areas
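
For the loop-closure requirement (at least 3 closures), src/loop_closure_monitor.py can subscribe to /rtabmap/info (rtabmap_msgs/Info), whose loop-closure id field is nonzero when a closure is accepted, and count accepted closures. The counting logic is sketched below as a plain function without the rclpy wrapper; the exact field name and message semantics are assumptions to verify against your rtabmap_msgs version.

```python
def count_loop_closures(loop_closure_ids):
    """Count accepted loop closures in a stream of loop-closure id values.

    rtabmap publishes one Info message per map update; the id is 0 when no
    closure was found and the matched node id when one was accepted.
    Consecutive repeats of the same nonzero id are counted once.
    """
    count = 0
    previous = 0
    for closure_id in loop_closure_ids:
        if closure_id != 0 and closure_id != previous:
            count += 1
        previous = closure_id
    return count


# Example: two accepted closures (ids 17 and 42) across a mapping session.
ids = [0, 0, 17, 17, 0, 0, 42, 0]
print(count_loop_closures(ids))  # -> 2
```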

Deliverables

Required Submission Structure
rtabmap_extra_credit_[netID]/
├── launch/
│   └── rtabmap_husky.launch.py
├── src/
│   ├── rtabmap_controller.py
│   └── loop_closure_monitor.py
├── config/
│   └── rtabmap_params.yaml
├── maps/
│   ├── rtabmap.db
│   ├── map_2d.pgm
│   └── cloud_map.ply
├── results/
│   ├── trajectory.txt
│   └── loop_closures.csv
├── videos/
│   ├── mapping_demo.mp4
│   └── loop_closure_demo.mp4
├── report/
│   └── technical_report.pdf
└── README.md
Rosbag Recording
ros2 bag record -o rtabmap_demo_[netID] \
  /camera/color/image_raw \
  /camera/aligned_depth_to_color/image_raw \
  /scan /odom /tf /tf_static \
  /rtabmap/mapData /rtabmap/cloud_map /map

Option B: SAMWISE Text-Driven Video Segmentation

Integrate SAMWISE (CVPR 2025) for natural-language-driven object segmentation and manipulation with the Husky + UR3 system.

Point Breakdown (50 Points Total)

  • Installation & Setup (10 pts): Install SAMWISE with all dependencies and verify GPU
  • ROS2 Node Implementation (25 pts): Create segmentation node with text query handling
  • Demonstration Tasks (15 pts): Complete 3 required tasks (static, dynamic, custom)

Optional: choose one advanced feature for +25 points:
  • Multi-Object Tracking (+25 pts): Handle multiple simultaneous text queries
  • Action-Based Queries (+25 pts): Implement motion-dependent queries
  • MoveIt2 Integration (+25 pts): Complete autonomous pick-and-place pipeline
  • Custom Fine-Tuning (+25 pts): Train on robot-specific scenarios

Note: The base 50 points comprise installation (10), ROS2 implementation (25), and demonstrations (15). Advanced features are optional and not required for the full 50-point extra credit.

SAMWISE Architecture

  • Cross-Modal Temporal (CMT) Adapter: 4.2M parameters for temporal modeling
  • Conditional Memory Encoder (CME): 0.7M parameters for tracking bias detection
  • Dual Prompting Strategy: Contextual and motion prompts
  • Performance: State-of-the-art on MeViS (49.5%), Ref-YouTube-VOS (69.2%), Ref-DAVIS (70.6%)
  • Efficiency: Only 4.9M trainable parameters (SAM2 remains frozen)

Installation Requirements

Hardware Requirements
  • NVIDIA GPU: GTX 1080 (8GB VRAM) minimum, RTX 3090 (24GB) recommended
  • 16GB RAM minimum, 32GB recommended
  • 20GB storage for models and datasets
Installation Commands
# Create conda environment
conda create --name samwise python=3.10 -y
conda activate samwise

# Install PyTorch with CUDA 11.8
pip install torch==2.3.1 torchvision==0.18.1 --index-url https://download.pytorch.org/whl/cu118

# Clone SAMWISE repository
git clone https://github.com/ClaudiaCuttano/SAMWISE
cd SAMWISE
pip install -r requirements.txt

# Optional: Memory-efficient attention
pip install xformers

# Download pretrained checkpoint (choose one)
wget [checkpoint_url] -O checkpoints/samwise_clip.pth  # CLIP-B version
# OR
wget [checkpoint_url] -O checkpoints/samwise_roberta.pth  # RoBERTa version

ROS2 Node Implementation

Required Functionality
Subscribers:
  • /camera/image_raw (sensor_msgs/Image) - Camera feed
  • /object_query (std_msgs/String) - Text query
Publishers:
  • /samwise/segmentation (sensor_msgs/Image) - Segmentation mask overlay
  • /samwise/object_centroid (geometry_msgs/PointStamped) - Object center
  • /samwise/bounding_box (vision_msgs/Detection2D) - Object bounds
Core Features:
  • Maintain SAM2 memory bank for temporal consistency
  • Handle dynamic text query changes
  • Process video stream at minimum 5 FPS
  • Visualize results in RViz2
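
The /samwise/object_centroid and /samwise/bounding_box publishers both derive from the binary mask SAMWISE returns. Here is a minimal sketch of that extraction step on a row-major binary mask (pure Python; function name is illustrative, and in the actual node you would wrap the results in PointStamped / Detection2D messages):

```python
def mask_centroid_and_bbox(mask):
    """Given a binary mask (list of rows of 0/1), return the pixel centroid
    (cx, cy) and bounding box (x_min, y_min, x_max, y_max) of the foreground,
    or None when the mask is empty (e.g. the queried object is not visible)."""
    xs, ys = [], []
    for y, row in enumerate(mask):
        for x, value in enumerate(row):
            if value:
                xs.append(x)
                ys.append(y)
    if not xs:
        return None
    centroid = (sum(xs) / len(xs), sum(ys) / len(ys))
    bbox = (min(xs), min(ys), max(xs), max(ys))
    return centroid, bbox


# 4x4 mask with a 2x2 foreground square spanning pixels (1,1)-(2,2).
mask = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
print(mask_centroid_and_bbox(mask))  # -> ((1.5, 1.5), (1, 1, 2, 2))
```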

Required Demonstration Tasks

Task 1: Static Object Manipulation (5 pts)
  • Query: "Pick up the red cube"
  • SAMWISE segments red cube in camera view
  • UR3 moves to grasp position based on centroid
  • Gripper closes on the object (position command: 1.05 = closed, 0.0 = open)
Task 2: Dynamic Object Tracking (5 pts)
  • Query: "Follow the moving ball"
  • Husky base tracks ball centroid
  • Maintains 1-meter following distance
  • Handles temporary occlusions
Task 3: Custom Query (5 pts)
  • Student-defined creative text query
  • Must demonstrate unique SAMWISE capability
  • Examples: "The person waving", "The tallest object", "The transparent bottle"
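
For Task 2, one simple way to hold the 1-meter following distance is a proportional controller on the estimated range to the ball, commanding zero velocity during occlusions while the tracker re-acquires the target. A sketch, where the gain and speed clamp are illustrative values, not requirements:

```python
def follow_velocity(distance, target=1.0, gain=0.5, max_speed=0.5):
    """Forward velocity command (m/s) to keep `target` meters from the object.

    `distance` is the estimated range to the tracked object, or None when the
    object is occluded; positive output drives forward, negative backs away.
    """
    if distance is None:       # occlusion: stop and wait for re-detection
        return 0.0
    error = distance - target  # positive when too far away
    command = gain * error
    # Clamp to the robot's speed limit.
    return max(-max_speed, min(max_speed, command))


print(follow_velocity(2.0))   # too far: drive forward -> 0.5
print(follow_velocity(1.0))   # at target distance    -> 0.0
print(follow_velocity(None))  # occluded              -> 0.0
```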

Deliverables

Required Submission Structure
assignment4_samwise_[netID]/
├── rosbag/
│   └── samwise_demo_[netID].db3
├── videos/
│   ├── task1_static.mp4
│   ├── task2_dynamic.mp4
│   └── task3_custom.mp4
├── src/
│   └── samwise_ros2/
│       ├── samwise_segmentation_node.py
│       └── package.xml
├── config/
│   └── samwise_params.yaml
├── report/
│   └── technical_report.pdf
└── README.md
Evaluation Criteria
  • Correct segmentation accuracy (IoU > 0.6)
  • Real-time performance (> 5 FPS)
  • Temporal consistency across frames
  • Robustness to query changes
  • Code quality and documentation
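
Since segmentation accuracy is graded against IoU > 0.6, it is worth computing IoU on your own results before submitting. A sketch of the computation on same-sized binary masks (ground-truth masks would come from your own annotations; names are illustrative):

```python
def iou(mask_a, mask_b):
    """Intersection-over-union of two same-sized binary masks
    (lists of rows of 0/1). Returns 0.0 when both masks are empty."""
    intersection = 0
    union = 0
    for row_a, row_b in zip(mask_a, mask_b):
        for a, b in zip(row_a, row_b):
            intersection += 1 if (a and b) else 0
            union += 1 if (a or b) else 0
    return intersection / union if union else 0.0


pred = [[1, 1, 0], [1, 1, 0]]
truth = [[0, 1, 1], [0, 1, 1]]
print(iou(pred, truth))  # intersection 2, union 6 -> ~0.333 (fails the 0.6 bar)
print(iou(pred, pred))   # perfect overlap -> 1.0
```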

Paper Reference

@misc{cuttano2025samwiseinfusingwisdomsam2,
      title={SAMWISE: Infusing Wisdom in SAM2 for Text-Driven Video Segmentation},
      author={Claudia Cuttano and Gabriele Trivigno and Gabriele Rosi and
              Carlo Masone and Giuseppe Averta},
      year={2025},
      eprint={2411.17646},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2411.17646},
}

⚠️ Important Submission Guidelines

  • Deadline: December 9, 2025 @ 11:00 PM CST (No extensions for extra credit)
  • Gradescope Assignments:
    • "Extra Credit - RTAB-Map" for Option A
    • "Extra Credit - SAMWISE" for Option B
  • File Size Limits: 500MB per submission
  • Academic Integrity: Individual work only - collaboration on extra credit will result in zero points

Questions & Support

  • Office Hours: Wednesdays 1:30-3:00 PM, SC 4407
  • Campuswire Tags:
    • Use #extra-credit-rtabmap for RTAB-Map questions
    • Use #extra-credit-samwise for SAMWISE questions
  • TA Email: ksa5@illinois.edu (use only for submission issues)