Weakly-Supervised Depth Completion during Robotic Micromanipulation from a Monocular Microscopic Image

1The Chinese University of Hong Kong, Shenzhen
2University of Toronto

Methods for obtaining dense depth maps. Conventionally, depth sensors such as lidars are used to obtain sparse depth, which is then converted into a dense depth map via depth completion. However, in micromanipulation setups, depth sensors are unavailable, and traditional methods such as depth from focus/defocus yield depth maps with poor resolution. Our approach instead employs contact detection within a robotic micromanipulation system, coupled with deep learning, to generate dense depth maps.

Pipeline for depth completion during robotic micromanipulation. The pipeline first plans regions for contact detection; in each region, automated contact detection is performed only once, avoiding repeated experiments on regions with similar image features. The collected sparse depth data are then augmented and fed into a depth completion network, followed by a refinement step that produces the dense depth map.
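The pipeline steps above can be sketched end to end. This is a minimal, hypothetical stand-in, not the paper's implementation: a simulated cell surface replaces real contact detection, a simple jitter-based replication stands in for the two-stage augmentation, and nearest-neighbour interpolation stands in for the learned completion network and refinement.

```python
import numpy as np

H, W, TILE = 64, 64, 16  # field-of-view size and region (tile) size

def plan_regions():
    """Tile the field of view; contact detection runs once per tile."""
    return [(r + TILE // 2, c + TILE // 2)
            for r in range(0, H, TILE) for c in range(0, W, TILE)]

def detect_contact(surface, point):
    """Stand-in for automated contact detection: read depth at one point."""
    return surface[point]

def augment(points, depths, jitter=2, copies=3, seed=0):
    """Illustrative augmentation: replicate each contact point with small
    spatial jitter (the paper's two-stage method is more involved)."""
    rng = np.random.default_rng(seed)
    aug_pts, aug_d = list(points), list(depths)
    for (r, c), d in zip(points, depths):
        for _ in range(copies):
            rr = int(np.clip(r + rng.integers(-jitter, jitter + 1), 0, H - 1))
            cc = int(np.clip(c + rng.integers(-jitter, jitter + 1), 0, W - 1))
            aug_pts.append((rr, cc))
            aug_d.append(d)
    return aug_pts, aug_d

def complete(points, depths):
    """Stand-in for the completion network: nearest-neighbour fill."""
    pts = np.array(points, dtype=float)
    d = np.array(depths)
    rr, cc = np.mgrid[0:H, 0:W]
    dist = (rr[..., None] - pts[:, 0]) ** 2 + (cc[..., None] - pts[:, 1]) ** 2
    return d[dist.argmin(-1)]

# Simulated ground-truth cell surface (depth values in micrometres).
yy, xx = np.mgrid[0:H, 0:W]
surface = 5.0 * np.exp(-((yy - 32) ** 2 + (xx - 32) ** 2) / 400.0)

regions = plan_regions()                               # 1. plan regions
sparse = [detect_contact(surface, p) for p in regions] # 2. one contact per region
pts, ds = augment(regions, sparse)                     # 3. augment sparse depth
dense = complete(pts, ds)                              # 4. dense depth map
print("dense map shape:", dense.shape)
```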

Abstract

Obtaining three-dimensional information, especially z-axis depth, is crucial for robotic micromanipulation. Because depth sensors such as lidars are unavailable in micromanipulation setups, traditional depth acquisition methods such as depth from focus or depth from defocus infer depth directly from microscopic images and suffer from poor resolution. Alternatively, micromanipulation systems can obtain accurate depth information by detecting contact between an end-effector and an object (e.g., a cell). Although contact detection is highly accurate, its low efficiency means only sparse depth data can be obtained.

This paper aims to address the challenge of acquiring dense depth information during robotic cell micromanipulation. A weakly-supervised depth completion network is proposed to take cell images and sparse depth data obtained by contact detection as input to generate a dense depth map. A two-stage data augmentation method is proposed to augment the sparse depth data, and the depth map is optimized by a network refinement method.
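A common way to feed a microscope image together with sparse contact-derived depth into a completion network is to stack them channel-wise with a validity mask. The sketch below is an illustrative assumption, not the paper's architecture; `make_network_input` and its layout are hypothetical.

```python
import numpy as np

def make_network_input(image, sparse_points, sparse_depths):
    """Stack a grayscale microscope image with a sparse depth channel and a
    validity mask; zeros mark pixels with no contact measurement."""
    h, w = image.shape
    depth = np.zeros((h, w), dtype=np.float32)
    mask = np.zeros((h, w), dtype=np.float32)
    for (r, c), d in zip(sparse_points, sparse_depths):
        depth[r, c] = d
        mask[r, c] = 1.0
    return np.stack([image.astype(np.float32), depth, mask], axis=0)

img = np.zeros((8, 8))
x = make_network_input(img, [(2, 3), (5, 6)], [1.5, 2.0])
print(x.shape)  # (3, 8, 8)
```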

The experimental results show that the MAE of the depth prediction error is less than 0.3 μm, demonstrating the accuracy and effectiveness of the method. The deep learning pipeline can be seamlessly integrated into robotic micromanipulation tasks to provide accurate depth information.
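The MAE metric reported above can be computed as follows; this is a generic sketch (function name and the masking option are illustrative), with depth values in micrometres.

```python
import numpy as np

def depth_mae(pred, gt, valid_mask=None):
    """Mean absolute error between predicted and ground-truth depth maps,
    in the same units as the depth values (micrometres here). An optional
    mask restricts the error to pixels with valid ground truth."""
    err = np.abs(pred - gt)
    if valid_mask is not None:
        err = err[valid_mask]
    return float(err.mean())

pred = np.array([[1.0, 2.1], [3.2, 0.9]])
gt   = np.array([[1.1, 2.0], [3.0, 1.0]])
print(depth_mae(pred, gt))  # 0.125
```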

Video

Experimental Results

Qualitative Depth Completion Results

Qualitative depth completion results of the proposed method. The original images of HeLa cells and the corresponding depth maps are shown side by side.


BibTeX

@inproceedings{han2024weakly,
  author    = {Han Yang and Yufei Jin and Guanqian Shan and Yibin Wang and Yongbin Zheng and Jiangfan Yu and Yu Sun and Zhuoran Zhang},
  title     = {Weakly-Supervised Depth Completion during Robotic Micromanipulation from a Monocular Microscopic Image},
  booktitle = {IEEE International Conference on Robotics and Automation (ICRA)},
  year      = {2024},
}