
PointPillars: Fast Encoders for Object Detection from Point Clouds

2022.11.22

PointPillars paper review for a 3D object detection competition
Keywords: #3dobjectdetection #3dmodel #pointcloud


0. Abstract

  • Proposal: PointPillars, a novel encoder that uses PointNets to learn a representation of point clouds organized in vertical columns (pillars), followed by a lean downstream network that consumes the encoded features
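
To make the idea of pillars concrete, here is a minimal sketch of how lidar points could be binned into vertical columns on the x-y plane. The function name `group_into_pillars`, the `(x, y, z, reflectance)` tuple layout, and the 0.16 m pillar size are illustrative assumptions, not the authors' code.

```python
import math
from collections import defaultdict


def group_into_pillars(points, x_min, y_min, pillar_size):
    """Bin (x, y, z, reflectance) points into vertical columns.

    Only x and y are discretized; z is left untouched, so each pillar
    spans the full height of the scene. Returns {(ix, iy): [points]}.
    """
    pillars = defaultdict(list)
    for p in points:
        ix = math.floor((p[0] - x_min) / pillar_size)
        iy = math.floor((p[1] - y_min) / pillar_size)
        pillars[(ix, iy)].append(p)
    return dict(pillars)


# Three hypothetical lidar returns: (x, y, z, reflectance).
points = [
    (0.05, 0.10, -1.2, 0.3),
    (0.10, 0.12, 0.4, 0.5),
    (0.50, 0.10, -0.9, 0.1),
]
pillars = group_into_pillars(points, x_min=0.0, y_min=0.0, pillar_size=0.16)
print(sorted(pillars))  # the first two points share pillar (0, 0)
```

This sketch does not drop out-of-range points or cap the number of points per pillar, both of which a real pipeline would likely need.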

1. Introduction

  • Differences between 2D computer vision and lidar point cloud object detection
    1. A point cloud is a sparse representation, while an image is dense
    2. A point cloud is 3D, while an image is 2D
  • Past Literature
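    • Progression of approaches: 3D convolution → projection of the point cloud onto an image → projection of the point cloud to bird’s eye view (BEV)
    • The BEV tends to be extremely sparse; VoxelNet and SECOND learn features end to end but rely on 3D convolutional middle layers, which remain a speed bottleneck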
  • Proposal
    • PointPillars: a method for object detection in 3D that enables end-to-end learning with only 2D convolutional layers
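
The sparsity of the BEV mentioned under Past Literature can be checked by counting how many grid cells contain at least one point. The sketch below uses a tiny hand-made point set; the function name `bev_occupancy`, the ranges, and the cell size are assumptions chosen for illustration.

```python
import math


def bev_occupancy(points, x_range, y_range, cell_size):
    """Return (occupied_cells, total_cells) for a BEV grid over the given ranges."""
    nx = math.ceil((x_range[1] - x_range[0]) / cell_size)
    ny = math.ceil((y_range[1] - y_range[0]) / cell_size)
    occupied_cells = set()
    for x, y, *_ in points:
        # Ignore points that fall outside the grid.
        if x_range[0] <= x < x_range[1] and y_range[0] <= y < y_range[1]:
            ix = min(int((x - x_range[0]) / cell_size), nx - 1)
            iy = min(int((y - y_range[0]) / cell_size), ny - 1)
            occupied_cells.add((ix, iy))
    return len(occupied_cells), nx * ny


# Hypothetical points as (x, y, z); the last one lies outside the grid.
points = [(0.1, 0.1, 0.0), (0.2, 0.3, 0.5), (3.9, 3.9, 1.0), (5.0, 1.0, 0.0)]
occupied, total = bev_occupancy(points, (0.0, 4.0), (0.0, 4.0), cell_size=0.5)
sparsity = 1.0 - occupied / total
print(occupied, total, sparsity)
```

On real lidar sweeps most cells are empty, which is why running dense convolutions directly on the raw grid is wasteful.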

2. PointPillars Network

  1. A feature encoder network that converts the point cloud to a sparse pseudo-image
  2. A 2D convolutional backbone that processes the pseudo-image into a high-level representation
  3. A detection head that detects and regresses 3D boxes
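
The first stage can be sketched in a few lines of plain Python. This is a toy stand-in under stated assumptions: the learned PointNet is replaced by a fixed linear map plus ReLU with hand-picked `weights`, and the names `encode_pillar` and `scatter_to_pseudo_image` are hypothetical. It only shows the shape of the computation: per-point features, a max over the points in each pillar, and a scatter back to grid locations. The backbone and detection head would then run on the resulting pseudo-image and are not shown.

```python
def encode_pillar(pillar_points, weights):
    """Toy PointNet-style encoder: per-point linear map + ReLU,
    followed by a channel-wise max over all points in the pillar."""
    per_point = [
        [max(0.0, sum(w * v for w, v in zip(row, p))) for row in weights]
        for p in pillar_points
    ]
    return [max(channel) for channel in zip(*per_point)]


def scatter_to_pseudo_image(pillar_features, height, width, channels):
    """Write each pillar's feature vector to its (ix, iy) cell.

    Layout is image[c][iy][ix]; cells without points stay at zero.
    """
    image = [[[0.0] * width for _ in range(height)] for _ in range(channels)]
    for (ix, iy), feat in pillar_features.items():
        if 0 <= ix < width and 0 <= iy < height:
            for c in range(channels):
                image[c][iy][ix] = feat[c]
    return image


# Hypothetical non-empty pillars: {(ix, iy): [(x, y, z, reflectance), ...]}.
pillars = {
    (0, 0): [(0.05, 0.10, -1.2, 0.3), (0.10, 0.12, 0.4, 0.5)],
    (3, 0): [(0.50, 0.10, -0.9, 0.1)],
}
# Hand-picked weights (2 output channels x 4 inputs), not learned values.
weights = [
    [0.0, 0.0, 1.0, 0.0],  # channel 0 passes z through
    [0.0, 0.0, 0.0, 1.0],  # channel 1 passes reflectance through
]
features = {cell: encode_pillar(pts, weights) for cell, pts in pillars.items()}
pseudo_image = scatter_to_pseudo_image(features, height=2, width=4, channels=2)
```

Because only non-empty pillars are encoded, the cost of this stage scales with the number of occupied pillars rather than with the full grid size.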