PointPillars: Fast Encoders for Object Detection from Point Clouds
2022.11.22
PointPillars paper review for a 3D object detection competition
Keywords: #3dobjectdetection #3dmodel #pointcloud
0. Abstract
- Proposal: PointPillars, a novel encoder that uses PointNets to learn a representation of point clouds organized in vertical columns (pillars), plus a lean downstream network that processes the encoded features
1. Introduction
- Differences between 2D image-based computer vision and lidar point cloud (PCD) object detection
- The point cloud is a sparse representation, while an image is dense
- The point cloud is 3D, while an image is 2D
- Past Literature
- 3D convolution → projection of the point cloud onto an image → point cloud viewed from bird’s eye view (BEV)
- BEV tends to be extremely sparse → VoxelNet and SECOND use 3D convolutional middle layers
- Proposal
- PointPillars: a method for object detection in 3D that enables end-to-end learning with only 2D convolutional layers
2. PointPillars Network
- A feature encoder network that converts a point cloud to a sparse pseudo-image
- A 2D convolutional backbone that processes the pseudo-image into a high-level representation
- A detection head that detects and regresses 3D boxes.
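The first stage above (point cloud → pillars → pseudo-image) can be sketched roughly as follows. This is a minimal NumPy illustration, not the paper's implementation: the function names `pillarize` and `scatter_to_pseudo_image`, the 8 m × 8 m range, and the 1 m pillar size are all assumptions made for the example. In particular, the per-pillar feature here is a plain channel-wise max over the raw point values, standing in for the learned PointNet that the actual encoder uses.

```python
import numpy as np


def pillarize(points, x_range=(0.0, 8.0), y_range=(0.0, 8.0), pillar_size=1.0):
    """Assign each point (x, y, z, intensity) to a pillar on the x-y grid.

    Returns the (row, col) pillar index of every in-range point, the in-range
    points themselves, and the grid shape (H, W). Ranges and pillar size are
    illustrative assumptions.
    """
    x, y = points[:, 0], points[:, 1]
    in_range = (
        (x >= x_range[0]) & (x < x_range[1])
        & (y >= y_range[0]) & (y < y_range[1])
    )
    pts = points[in_range]
    cols = ((pts[:, 0] - x_range[0]) / pillar_size).astype(np.int64)
    rows = ((pts[:, 1] - y_range[0]) / pillar_size).astype(np.int64)
    width = int(round((x_range[1] - x_range[0]) / pillar_size))
    height = int(round((y_range[1] - y_range[0]) / pillar_size))
    return rows, cols, pts, (height, width)


def scatter_to_pseudo_image(rows, cols, pts, grid_shape):
    """Build a (C, H, W) pseudo-image; empty pillars stay zero.

    Stand-in encoder (assumption): channel-wise max over the raw features of
    the points in each pillar, instead of a learned PointNet.
    """
    height, width = grid_shape
    channels = pts.shape[1]
    image = np.full((channels, height, width), -np.inf)
    for c in range(channels):
        # Unbuffered in-place max, so several points in one pillar all count.
        np.maximum.at(image[c], (rows, cols), pts[:, c])
    occupied = np.zeros((height, width), dtype=bool)
    occupied[rows, cols] = True
    image[:, ~occupied] = 0.0  # pillars without points carry no feature
    return image


# Four made-up points; the last one lies outside the x range and is dropped.
points = np.array([
    [0.5, 0.5, 1.0, 0.2],
    [0.7, 0.2, 2.0, 0.9],
    [3.5, 6.1, -1.0, 0.4],
    [9.0, 1.0, 0.0, 0.1],
])
rows, cols, kept, grid_shape = pillarize(points)
pseudo_image = scatter_to_pseudo_image(rows, cols, kept, grid_shape)
occupied_pillars = int((pseudo_image != 0).any(axis=0).sum())
```

The resulting array has the layout of an image (channels × height × width), which is why the later stages can be an ordinary 2D convolutional backbone and detection head. Most cells stay zero, which reflects the sparsity of the BEV grid noted in the introduction.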