Related Blog Post: For behind-the-scenes details and the full development journey, check out the companion Medium article: How I'm Building an Autonomous Pick-and-Place System with ROS 2 Jazzy and Gazebo Harmonic
The blog dives into simulation setup, robotic control, MoveIt Task Constructor, and lessons learned — perfect if you're curious about the engineering side or want to replicate the project from scratch.
This project integrates the Robotiq 2-Finger Gripper with a Universal Robots UR3 arm using ROS 2 Humble / Jazzy and Ignition Gazebo. It includes URDF models, ROS 2 control configuration, simulation launch files, MoveIt Task Constructor pick-and-place, vision-based object detection, LLM-driven task planning (Ollama), and demonstration recording for behavior cloning.
Note: This setup uses a fixed mimic joint configuration for the Robotiq gripper to support simulation in newer Gazebo (Harmonic). Only the primary `finger_joint` receives commands — the mimic joints follow automatically.
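For intuition, URDF mimic semantics mean each follower joint tracks the primary as `multiplier * primary + offset`. A minimal sketch — the joint names, multipliers, and offsets below are illustrative placeholders, not the values from this repo's URDF:

```python
# Sketch of URDF mimic-joint semantics: followers track the primary
# finger_joint as position = multiplier * primary + offset.
# Joint names and multiplier/offset values are illustrative placeholders,
# not this repo's actual URDF values.
MIMIC_JOINTS = {
    "right_outer_knuckle_joint": {"multiplier": -1.0, "offset": 0.0},
    "left_inner_knuckle_joint":  {"multiplier": -1.0, "offset": 0.0},
    "right_inner_knuckle_joint": {"multiplier": -1.0, "offset": 0.0},
    "left_inner_finger_joint":   {"multiplier":  1.0, "offset": 0.0},
    "right_inner_finger_joint":  {"multiplier":  1.0, "offset": 0.0},
}

def mimic_positions(primary: float) -> dict:
    """Positions of all follower joints for a given finger_joint command."""
    return {
        name: cfg["multiplier"] * primary + cfg["offset"]
        for name, cfg in MIMIC_JOINTS.items()
    }

# Commanding only finger_joint fully determines the gripper state:
print(mimic_positions(0.4))
```

This is why the controller configuration only exposes `finger_joint`: once its command is known, every other gripper joint's position is determined.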
Make sure you have ROS 2 Humble or ROS 2 Jazzy and Ignition Gazebo installed.
```bash
git clone https://github.com/darshmenon/UR3_ROS2_PICK_AND_PLACE.git
cd UR3_ROS2_PICK_AND_PLACE

# Set to humble or jazzy
export ROS_DISTRO=humble
```
```bash
sudo apt install ros-$ROS_DISTRO-rviz2 \
  ros-$ROS_DISTRO-joint-state-publisher \
  ros-$ROS_DISTRO-robot-state-publisher \
  ros-$ROS_DISTRO-ros2-control \
  ros-$ROS_DISTRO-ros2-controllers \
  ros-$ROS_DISTRO-controller-manager \
  ros-$ROS_DISTRO-joint-trajectory-controller \
  ros-$ROS_DISTRO-position-controllers \
  ros-$ROS_DISTRO-gz-ros2-control \
  ros-$ROS_DISTRO-ros2controlcli \
  ros-$ROS_DISTRO-moveit \
  ros-$ROS_DISTRO-moveit-ros-perception \
  ros-$ROS_DISTRO-simple-grasping \
  ros-$ROS_DISTRO-cv-bridge \
  ros-$ROS_DISTRO-tf2-ros \
  ros-$ROS_DISTRO-tf2-geometry-msgs \
  ros-$ROS_DISTRO-pcl-ros
```

Jazzy only — add these two extra packages:
```bash
sudo apt install ros-jazzy-ros-gz-sim ros-jazzy-ros-gz-bridge \
  ros-jazzy-moveit-planners-stomp
```

STOMP is not packaged for Humble, so leave it out on that distro — the planner init fails silently and is harmless.
```bash
pip3 install -r requirements.txt
```

Ollama is required for the LLM planner. Install it from https://ollama.com, then pull your preferred model:

```bash
ollama pull llama2:latest
```

Build and source the workspace:

```bash
colcon build --symlink-install
source install/setup.bash
```

This project supports MoveIt Task Constructor (MTC) for advanced pick-and-place planning.
This repo already includes a patched MTC source in src/moveit_task_constructor/ that works for both ROS 2 Humble and Jazzy — no extra cloning needed. Just build normally:
```bash
colcon build --symlink-install
```

MTC uses warehouse_ros_mongo to persist planning scenes and trajectories. MongoDB must be installed and running before launching the demo:
```bash
curl -fsSL https://www.mongodb.org/static/pgp/server-7.0.asc | \
  sudo gpg -o /usr/share/keyrings/mongodb-server-7.0.gpg --dearmor
echo "deb [ arch=amd64,arm64 signed-by=/usr/share/keyrings/mongodb-server-7.0.gpg ] https://repo.mongodb.org/apt/ubuntu jammy/mongodb-org/7.0 multiverse" | \
  sudo tee /etc/apt/sources.list.d/mongodb-org-7.0.list
sudo apt-get update && sudo apt-get install -y mongodb-org
sudo systemctl start mongod && sudo systemctl enable mongod
```

Verify it is running: `mongosh` should connect to `mongodb://127.0.0.1:27017`.
For Humble/Jazzy API differences and troubleshooting, see ur_mtc_pick_place_demo/README.md.
```bash
bash ur_mtc_pick_place_demo/scripts/robot.sh
```

Launches Gazebo + MoveIt + planning scene server + MTC demo in sequence.

```bash
# Launch the Gazebo simulation
ros2 launch ur_gazebo ur.gazebo.launch.py

# Run the point cloud pipeline
bash ur_mtc_pick_place_demo/scripts/pointcloud.sh

# View the UR3 model in RViz
ros2 launch ur_description view_ur.launch.py ur_type:=ur3

# Visualize the Robotiq 2F-85 gripper model
ros2 launch robotiq_2finger_grippers robotiq_2f_85_gripper_visualization/launch/test_2f_85_model.launch.py
```

Send a joint trajectory goal directly to the arm controller:

```bash
ros2 action send_goal /arm_controller/follow_joint_trajectory control_msgs/action/FollowJointTrajectory \
'{
  "trajectory": {
    "joint_names": [
      "shoulder_pan_joint",
      "shoulder_lift_joint",
      "elbow_joint",
      "wrist_1_joint",
      "wrist_2_joint",
      "wrist_3_joint"
    ],
    "points": [
      {
        "positions": [0.0, -1.57, 1.57, 0.0, 1.57, 0.0],
        "time_from_start": { "sec": 2, "nanosec": 0 }
      }
    ]
  }
}'
```

Run the arm + gripper loop controller:

```bash
python3 ~/UR3_ROS2_PICK_AND_PLACE/ur_system_tests/scripts/arm_gripper_loop_controller.py
```

Estimates grasp poses from the Intel D435 point cloud. Two backends:
| Backend | Method | Dependency |
|---|---|---|
| simple_grasping (primary) | PCL RANSAC → moveit_msgs/Grasp[] | ros-$ROS_DISTRO-simple-grasping |
| numpy centroid (fallback) | Colour HSV filter + centroid + height | built-in |
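The fallback backend's idea fits in a few lines of numpy: mask pixels inside an HSV range, take the mask's centroid, and read the object height from depth. This is a hedged sketch — the thresholds and return convention are assumptions, not the node's actual values:

```python
import numpy as np

def centroid_grasp(hsv_img, depth, h_lo=0, h_hi=10, s_lo=120, v_lo=70):
    """Estimate a grasp point: HSV colour mask -> pixel centroid -> height.

    hsv_img: (H, W, 3) uint8 HSV image; depth: (H, W) float32 metres.
    Threshold defaults roughly target red; the real node's values may differ.
    """
    h, s, v = hsv_img[..., 0], hsv_img[..., 1], hsv_img[..., 2]
    mask = (h >= h_lo) & (h <= h_hi) & (s >= s_lo) & (v >= v_lo)
    if not mask.any():
        return None                            # no coloured object in view
    rows, cols = np.nonzero(mask)
    cx, cy = float(cols.mean()), float(rows.mean())  # pixel centroid
    z = float(np.median(depth[mask]))          # robust depth at the object
    return cx, cy, z

# Tiny synthetic example: a "red" patch in the image centre
img = np.zeros((10, 10, 3), dtype=np.uint8)
img[4:6, 4:6] = (5, 200, 200)
dep = np.full((10, 10), 0.5, dtype=np.float32)
print(centroid_grasp(img, dep))  # -> (4.5, 4.5, 0.5)
```

The pixel centroid plus depth would then be deprojected through the camera intrinsics to a 3D grasp pose; that step is omitted here.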
```bash
ros2 launch ur_grasp grasp_detection.launch.py colour:=red
python3 testing/test_grasp.py --colour red --execute
```

```bash
source install/setup.bash
python3 ur_llm_planner/scripts/robot_gui.py
```

Features: live camera feed, preset poses, gripper control (Open/Half/Close), per-joint sliders, Pilz PTP execution.
```bash
ros2 run ur_moveit_demos custom_zigzag_motion
```

Wait at least 45 seconds after launching the simulation before running this.
```bash
chmod +x ~/UR3_ROS2_PICK_AND_PLACE/ur_mtc_pick_place_demo/scripts/robot.sh
~/UR3_ROS2_PICK_AND_PLACE/ur_mtc_pick_place_demo/scripts/robot.sh
```

This script launches the Gazebo simulation, MoveIt 2, the planning scene server, and the MTC pick-and-place demo.
Color + optional YOLO object detection + PCL-based cluster extraction from the Intel D435 camera.
```bash
ros2 launch ur_perception perception.launch.py
ros2 topic echo /detected_objects
# Annotated feed in RViz: /detection_image
```

Natural language to robot motion via a local Ollama model.
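A common pattern for this kind of planner is to prompt the model for a JSON action list and validate it before anything moves. The sketch below is a hedged illustration — the action schema, the allowed action names, and the use of Ollama's `/api/generate` endpoint are assumptions about the approach, not necessarily what `ur_llm_planner` actually does:

```python
import json
import urllib.request

# Hypothetical action vocabulary for validation; the real planner's
# schema may differ.
ALLOWED_ACTIONS = {"move_to", "open_gripper", "close_gripper", "place"}

def parse_plan(reply: str) -> list:
    """Validate the model's reply as a JSON list of action dicts."""
    plan = json.loads(reply)
    if not isinstance(plan, list):
        raise ValueError("plan must be a JSON list")
    for step in plan:
        if step.get("action") not in ALLOWED_ACTIONS:
            raise ValueError(f"unknown action: {step.get('action')}")
    return plan

def query_ollama(prompt: str, model: str = "llama2:latest") -> str:
    """Call a local Ollama server (requires `ollama serve` to be running)."""
    req = urllib.request.Request(
        "http://127.0.0.1:11434/api/generate",
        data=json.dumps({"model": model, "prompt": prompt,
                         "stream": False}).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

# Offline example of the validation step:
reply = '[{"action": "move_to", "target": "red_block"}, {"action": "close_gripper"}]'
print(parse_plan(reply))
```

To run the actual planner node: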
```bash
ros2 launch ur_llm_planner llm_planner.launch.py
ros2 topic pub --once /llm_planner/command std_msgs/msg/String \
  "{data: 'pick up the red block and place it in the left bin'}"
```

```bash
ros2 launch ur_data_collector data_collector.launch.py
```
```bash
ros2 service call /data_collector/start_recording std_srvs/srv/Trigger
ros2 service call /data_collector/stop_recording std_srvs/srv/Trigger

python3 ur_data_collector/scripts/train_bc.py \
  --data_dir ~/ur3_demos \
  --output_dir ~/bc_policy \
  --epochs 50
```

SmolVLA is a compact VLA model from HuggingFace that takes a camera image + joint states and predicts robot actions directly from a natural-language task description. This replaces hardcoded waypoints with a learned policy.
Install lerobot (requires Python >= 3.11):
```bash
python3.11 -m pip install "git+https://github.com/huggingface/lerobot.git#egg=lerobot[smolvla]"
```

Run inference against the base model:
```bash
# Terminal 1 — start simulation
ros2 launch ur_gazebo ur.gazebo.launch.py

# Terminal 2 — run SmolVLA inference
ros2 launch ur_smolvla smolvla_inference.launch.py \
  task:="pick the red block and place it in the bin"
```

Run with a fine-tuned checkpoint:
```bash
ros2 launch ur_smolvla smolvla_inference.launch.py \
  checkpoint:=/path/to/your/checkpoint \
  task:="pick the red block"
```

The inference node subscribes to /camera_head/color/image_raw + /joint_states and publishes JointTrajectory commands to /arm_controller/joint_trajectory at 10 Hz. The camera is a simulated Intel D435 mounted at 0.50 m height with a 25° downward tilt, giving a clear view of the workspace.
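A hedged sketch of the glue such a node needs: converting one predicted action (assumed here to be six absolute joint positions) into a JointTrajectory-shaped command at the 10 Hz control period. The message is built as a plain dict mirroring trajectory_msgs fields so the sketch runs without ROS:

```python
UR3_JOINTS = [
    "shoulder_pan_joint", "shoulder_lift_joint", "elbow_joint",
    "wrist_1_joint", "wrist_2_joint", "wrist_3_joint",
]
CONTROL_PERIOD_S = 0.1  # 10 Hz, as stated for the inference node

def action_to_trajectory(action, joint_limit=3.14):
    """Clamp a 6-DoF action and wrap it as a one-point trajectory dict.

    The joint limit and absolute-position action convention are
    assumptions for this sketch, not SmolVLA's guaranteed output format.
    """
    if len(action) != len(UR3_JOINTS):
        raise ValueError("expected one target per arm joint")
    clamped = [max(-joint_limit, min(joint_limit, a)) for a in action]
    return {
        "joint_names": UR3_JOINTS,
        "points": [{
            "positions": clamped,
            # reach the target within one control period
            "time_from_start": {"sec": 0,
                                "nanosec": int(CONTROL_PERIOD_S * 1e9)},
        }],
    }

# An out-of-range wrist command gets clamped before publishing:
print(action_to_trajectory([0.0, -1.57, 1.57, 0.0, 1.57, 9.9]))
```

In the real node the same dict shape maps one-to-one onto `trajectory_msgs/msg/JointTrajectory` fields.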
Workflow to fine-tune SmolVLA on your own pick-and-place demos:
- Record demonstrations with `ur_data_collector` (saves HDF5 episodes)
- Convert to LeRobot dataset format and fine-tune SmolVLA
- Point `checkpoint:=` at your fine-tuned model and run inference
Note on UR3 Adaptation: The models trained in `mujoco-ur-arm-rl` are optimized for the UR5e arm. To use them effectively on the UR3, you will need to tweak the Gymnasium environments (to account for UR3 link lengths/workspace) and retrain the model. Also ensure spawned objects aren't placed too close to the robot base, as this causes reachability issues.
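The reachability caveat can be made concrete with a crude radial check. The reach values below are approximate published figures (UR3 ≈ 0.50 m, UR5e ≈ 0.85 m) and the minimum radius is a guessed placeholder, so treat this as a sketch rather than a real workspace model:

```python
import math

# Approximate maximum reach from the manufacturer's specs
ARM_REACH_M = {"ur3": 0.50, "ur5e": 0.85}

def is_reachable(x, y, z, arm="ur3", min_radius=0.15):
    """Crude check that an object sits inside the arm's annular workspace.

    min_radius rejects spawns too close to the base (a guessed value);
    a real check would run IK against the actual kinematic model.
    """
    r = math.sqrt(x * x + y * y + z * z)
    return min_radius <= r <= ARM_REACH_M[arm]

# A spawn that works for the UR5e but lies outside UR3 reach:
print(is_reachable(0.7, 0.0, 0.1, arm="ur5e"))  # True
print(is_reachable(0.7, 0.0, 0.1, arm="ur3"))   # False
```

A filter like this in the environment's reset can rule out both failure modes the note mentions: objects beyond the UR3's shorter reach and objects spawned on top of the base.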
Run a pre-trained Soft Actor-Critic (SAC) policy trained in MuJoCo directly on the simulated UR3. Two nodes are included:
- `ur_policy_node` — basic reach policy (arm joints only, no gripper)
- `shared_arm_policy_node` — full pick-and-place policy with arm + gripper
Run with a trained model:

```bash
# Terminal 1 — start simulation
ros2 launch ur_gazebo ur.gazebo.launch.py

# Terminal 2 — run shared-arm SAC policy
ros2 run mujoco_ur_rl_ros2 shared_arm_policy_node \
  --ros-args \
  -p model_path:=/path/to/best_model.zip \
  -p object_x:=0.45 -p object_y:=0.0 -p object_z:=0.045 \
  -p drop_x:=0.45 -p drop_y:=0.2 -p drop_z:=0.025
```

Or use the bundled Gazebo launch (boots simulation + policy together):
```bash
ros2 launch mujoco_ur_rl_ros2 gazebo_shared_arm_policy.launch.py \
  model_path:=/path/to/best_model.zip \
  launch_policy:=true
```

The policy subscribes to /joint_states and publishes JointTrajectory commands to /arm_controller/joint_trajectory and /gripper_controller/joint_trajectory at 10 Hz.
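Before the SAC network can act, the node has to assemble an observation vector from /joint_states plus the object/drop parameters. A sketch with an assumed layout (the real node's ordering and feature set may differ); note /joint_states does not guarantee joint order, so re-indexing by name matters:

```python
import numpy as np

ARM_JOINTS = [
    "shoulder_pan_joint", "shoulder_lift_joint", "elbow_joint",
    "wrist_1_joint", "wrist_2_joint", "wrist_3_joint",
]

def build_observation(joint_state, object_xyz, drop_xyz):
    """Concatenate arm joint positions/velocities with task parameters.

    joint_state: {"name": [...], "position": [...], "velocity": [...]}
    as on /joint_states. The 6 + 6 + 3 + 3 = 18-dim layout is an
    assumption for this sketch, not the trained policy's actual spec.
    """
    idx = {n: i for i, n in enumerate(joint_state["name"])}
    pos = [joint_state["position"][idx[j]] for j in ARM_JOINTS]
    vel = [joint_state["velocity"][idx[j]] for j in ARM_JOINTS]
    return np.concatenate([pos, vel, object_xyz, drop_xyz]).astype(np.float32)

# Deliberately scrambled joint order, as /joint_states may deliver it:
js = {
    "name": list(reversed(ARM_JOINTS)),
    "position": [5.0, 4.0, 3.0, 2.0, 1.0, 0.0],
    "velocity": [0.0] * 6,
}
obs = build_observation(js, [0.45, 0.0, 0.045], [0.45, 0.2, 0.025])
print(obs.shape)  # (18,)
```

Keeping this layout identical between the MuJoCo training environment and the Gazebo node is what makes direct policy transfer possible.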
Training environments (for re-training or fine-tuning) are in mujoco_ur_rl_ros2/envs/:
| Env | Description |
|---|---|
| `ur_gazebo_single_arm_env.py` | Single arm at origin — matches the Gazebo layout; use this to train a policy that transfers directly |
| `ur_pick_place_env.py` | Basic pick-place, simple reward |
| `shared_arm_env.py` | Multi-arm shared policy training |
| `ur_dual_arm_env.py` | Dual-arm scene with proven phase-based reward |
Train a Gazebo-compatible policy from scratch:
```bash
# from the repo root
python3 mujoco_ur_rl_ros2/train_gazebo_single_arm.py \
  --timesteps 2000000 \
  --n-envs 8 \
  --curriculum grasp_focus
```

Resume a previous run:
```bash
python3 mujoco_ur_rl_ros2/train_gazebo_single_arm.py \
  --timesteps 2000000 \
  --n-envs 8 \
  --curriculum grasp_focus \
  --resume models/gazebo_single_arm/<run>/best_model.zip
```

The best model is saved to `models/gazebo_single_arm/<run>/best_model.zip`, with checkpoints every 100k steps. Pass that path to `shared_arm_policy_node` above.
Key hyperparameters (in `train_gazebo_single_arm.py`):

- `--curriculum grasp_focus` — starts episodes near the object, critical for learning grasps
- `ent_coef=0.1` (fixed) — prevents SAC entropy from collapsing before grasps are discovered
- `--learning-rate`, `--buffer-size`, `--batch-size` — tunable via CLI
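The grasp_focus idea — start episodes with the end effector already near the object so grasp rewards are discovered early, then widen the start distribution as training progresses — can be sketched like this. The radii and the linear schedule are assumptions, not the script's actual values:

```python
import math
import random

def curriculum_start_offset(progress, near_radius=0.05, far_radius=0.40):
    """Sample an initial end-effector offset from the object.

    progress in [0, 1]: fraction of training completed. Early episodes
    start within near_radius of the object ("grasp focus"); the sampling
    radius grows linearly toward far_radius as training progresses.
    Radii and schedule are placeholder values for illustration.
    """
    radius = near_radius + progress * (far_radius - near_radius)
    angle = random.uniform(0.0, 2.0 * math.pi)   # direction in the plane
    dist = random.uniform(0.0, radius)           # distance up to `radius`
    return (dist * math.cos(angle), dist * math.sin(angle))

random.seed(0)
print(curriculum_start_offset(0.0))  # tight around the object
print(curriculum_start_offset(1.0))  # up to the full workspace radius
```

Without a schedule like this, random exploration rarely stumbles into a grasp, which is also why the entropy coefficient is pinned rather than auto-tuned.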
```bash
ros2 launch ur_gazebo full_demo.launch.py
ros2 launch ur_gazebo full_demo.launch.py use_llm_planner:=true
```

Feel free to open pull requests or issues for improvements or bug reports.