7 months ago

Zhou et al. VoxelNet: End-to-End Learning for Point Cloud Based 3D Object Detection. arXiv 2017

Problem Definition

- Raw Point Cloud:
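Collected with a 64-beam LiDAR sensor (spinning at about 10 revolutions per second, with roughly 100,000 reflected points per revolution). Each point carries 4 values: its position (x, y, z) and its reflection intensity.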

- Problem:
Input: Raw Point Cloud
Output: 3D Bounding Box Prediction

- Dataset:
KITTI dataset (about 7,000 samples annotated with bounding boxes)
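
As a rough illustration of the input and output described above, the sketch below decodes a raw buffer into (x, y, z, intensity) points and defines a container for one predicted box. It assumes the points are stored as consecutive little-endian float32 values, four per point, which is how KITTI Velodyne scans are laid out. `parse_point_cloud` and `Box3D` are hypothetical names, and the 7-value box layout (center, size, yaw) is an assumption taken from the paper rather than from these notes.

```python
import struct
from dataclasses import dataclass


def parse_point_cloud(raw: bytes):
    """Decode little-endian float32 data into (x, y, z, intensity) tuples."""
    # Each point is 4 float32 values = 16 bytes (assumed layout).
    if len(raw) % 16 != 0:
        raise ValueError("buffer length must be a multiple of 16 bytes")
    return list(struct.iter_unpack("<4f", raw))


@dataclass
class Box3D:
    """Hypothetical container for one predicted 3D bounding box."""
    label: str      # object class, e.g. "Car"
    x: float        # box center
    y: float
    z: float
    length: float   # box size
    width: float
    height: float
    yaw: float      # rotation around the vertical axis, in radians


# Two synthetic points packed the same way a scan file would store them.
raw = struct.pack("<8f", 1.0, 2.0, 0.5, 0.25, -3.0, 4.0, 1.5, 0.75)
points = parse_point_cloud(raw)
print(len(points), points[0])
```

A real scan would be read with `open(path, "rb").read()` and passed to the same function; at roughly 100,000 points per revolution that is about 1.6 MB per scan.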

Architecture

- Step 1: Feature Learning Network
  - Input: Raw Point Cloud
  - Output: 4D Sparse Tensor (Feature Map)

- Step 2: Convolutional Middle Layers
  - Input: 4D Sparse Tensor (Feature Map)
  - Output: High-Level Feature (with reduced size)

- Step 3: Region Proposal Network
  - Input: High-Level Feature (4D)
  - Output: Class and Bounding Box
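
To make the start of Step 1 more concrete, here is a minimal sketch of the voxel grouping that happens before any features are learned. The detection range, voxel sizes, the cap of 35 points per voxel, and the 128-channel feature length follow the car-detection setup described in the paper, not these notes, so treat them as assumptions. `grid_shape` and `group_points` are hypothetical helper names. The paper samples points at random when a voxel is over the cap; this sketch simply keeps the first ones it sees.

```python
import math
from collections import defaultdict

# Assumed car-detection settings (metres, LiDAR frame).
X_RANGE = (0.0, 70.4)
Y_RANGE = (-40.0, 40.0)
Z_RANGE = (-3.0, 1.0)
VOXEL_SIZE = (0.2, 0.2, 0.4)   # voxel edge along x, y, z
MAX_POINTS_PER_VOXEL = 35      # cap on points kept per voxel
FEATURE_CHANNELS = 128         # assumed per-voxel feature length


def grid_shape():
    """Return (D, H, W): the number of voxels along z, y and x."""
    w = round((X_RANGE[1] - X_RANGE[0]) / VOXEL_SIZE[0])
    h = round((Y_RANGE[1] - Y_RANGE[0]) / VOXEL_SIZE[1])
    d = round((Z_RANGE[1] - Z_RANGE[0]) / VOXEL_SIZE[2])
    return d, h, w


def group_points(points):
    """Group (x, y, z, intensity) points by their voxel index (d, h, w)."""
    d_max, h_max, w_max = grid_shape()
    voxels = defaultdict(list)
    for x, y, z, r in points:
        in_range = (X_RANGE[0] <= x < X_RANGE[1]
                    and Y_RANGE[0] <= y < Y_RANGE[1]
                    and Z_RANGE[0] <= z < Z_RANGE[1])
        if not in_range:
            continue  # drop points outside the detection range
        # Clamp to guard against floating-point rounding at the upper edge.
        key = (
            min(math.floor((z - Z_RANGE[0]) / VOXEL_SIZE[2]), d_max - 1),
            min(math.floor((y - Y_RANGE[0]) / VOXEL_SIZE[1]), h_max - 1),
            min(math.floor((x - X_RANGE[0]) / VOXEL_SIZE[0]), w_max - 1),
        )
        if len(voxels[key]) < MAX_POINTS_PER_VOXEL:
            voxels[key].append((x, y, z, r))  # simplification: keep first points
    return dict(voxels)


points = [
    (10.05, 0.05, -0.9, 0.3),
    (10.07, 0.07, -0.9, 0.5),
    (100.0, 0.0, 0.0, 0.1),   # outside the x range, dropped
]
voxels = group_points(points)
print(grid_shape(), len(voxels))
```

Under these assumed settings the grid is 10 x 400 x 352 voxels, so the 4D sparse tensor produced by Step 1 would have shape 128 x 10 x 400 x 352, with most voxels empty. That sparsity is why only non-empty voxels are stored in the dictionary.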
