Image Segmentation · PyTorch

suwesh committed · Commit 3f01960 (verified) · 1 Parent(s): 767c0b7

Update README.md

Files changed (1)
  1. README.md +16 -1
README.md CHANGED
@@ -2,4 +2,19 @@
 license: osl-3.0
 ---
 Abstract:
- Autonomous driving, when applied to high-speed racing rather than urban environments, presents unique challenges due to the dynamic nature of racing circuits and the need for optimal future planning at high speeds. While simulator-based approaches such as CARLA, AirSim, and TORCS provide clean environment data to models, transferring the learned models to the real world is a hard problem. In this paper, we propose leveraging LiDAR data obtained from the real world to train a deep learning network to understand and predict the future states of the perceived scenes. The proposed neural network, named Perception Pyramid Network (PPN), takes a sequence of past environment scans plus the current scan and learns how the environment evolves over a sequence of future timesteps. Because raw point clouds are poorly suited to directly capturing spatial relationships between points, the 3D point clouds obtained from LiDAR sweeps are converted into a 2D Bird's Eye View map, which encodes information in each grid cell. PPN is an encoder-decoder network that extracts these features, across both space and time, in a hierarchical fashion resembling a pyramid. The network is trained with a combination of loss functions and exhibits a real-time inference frequency of ~16 Hz on an NVIDIA Tesla P100 GPU. Implementation is available at: https://github.com/suwesh/Parallel-Perception-Network.
+ Autonomous driving, when applied to high-speed racing rather than urban environments,
+ presents challenges in scene understanding due to rapid changes in the track environment.
+ Traditional sequential network approaches may struggle to keep up with the real-time
+ knowledge and decision-making demands of an autonomous agent that covers large
+ displacements in a short time. This paper proposes a novel baseline architecture for
+ developing sophisticated models capable of true hardware-enabled parallelism,
+ achieving neural processing speeds that mirror the agent's high velocity. The proposed
+ model, named Parallel Perception Network (PPN), consists of two independent neural
+ networks, a segmentation network and a reconstruction network, running in parallel on
+ separate accelerated hardware. The model takes raw 3D point cloud data from the LiDAR
+ sensor as input and converts it into a 2D Bird's Eye View map on both devices. Each
+ network independently extracts its input features along the space and time dimensions
+ and produces outputs in parallel. Our model is trained on a system with 2 NVIDIA T4
+ GPUs with a combination of loss functions, including an edge-preservation term, and
+ shows a 1.8x speedup in model inference time compared to a sequential configuration.
+ Implementation is available at: https://github.com/suwesh/Parallel-Perception-Network.
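
Both versions of the abstract rest on the same preprocessing step: rasterizing a raw LiDAR sweep into a 2D Bird's Eye View map whose grid cells encode per-cell features. Below is a minimal PyTorch sketch of that conversion; the grid extents, 0.25 m cell size, and the occupancy-plus-max-height encoding are illustrative assumptions, not parameters taken from the linked repository.

```python
# Minimal sketch of LiDAR-to-BEV rasterization, under assumed grid parameters.
import torch

def pointcloud_to_bev(points: torch.Tensor,
                      x_range=(-50.0, 50.0),
                      y_range=(-50.0, 50.0),
                      cell_size=0.25) -> torch.Tensor:
    """points: (N, 3) tensor of LiDAR returns (x, y, z) in meters.
    Returns a (2, H, W) BEV map with an occupancy and a max-height channel."""
    h = int((y_range[1] - y_range[0]) / cell_size)
    w = int((x_range[1] - x_range[0]) / cell_size)

    # Keep only points inside the grid extents.
    mask = ((points[:, 0] >= x_range[0]) & (points[:, 0] < x_range[1]) &
            (points[:, 1] >= y_range[0]) & (points[:, 1] < y_range[1]))
    pts = points[mask]

    # Map metric coordinates to flat integer cell indices.
    ix = ((pts[:, 0] - x_range[0]) / cell_size).long()
    iy = ((pts[:, 1] - y_range[0]) / cell_size).long()
    flat = iy * w + ix

    bev = torch.zeros(2, h, w)
    occ = bev[0].view(-1)
    hgt = bev[1].view(-1)
    occ[flat] = 1.0                                    # occupancy channel
    hgt.scatter_reduce_(0, flat, pts[:, 2], reduce="amax",
                        include_self=False)            # max-height channel
    return bev
```

Calling `pointcloud_to_bev` on an (N, 3) sweep covering ±50 m yields a (2, 400, 400) map ready to feed a 2D convolutional backbone.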
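The updated abstract's 1.8x speedup comes from running the segmentation and reconstruction networks on separate GPUs at the same time. The sketch below shows one standard way to obtain such hardware-enabled parallelism in PyTorch, relying on asynchronous host-to-device copies and per-device kernel queues; the stand-in `nn.Sequential` models, shapes, and device names are hypothetical and assume a machine with two CUDA devices, as in the paper's 2x NVIDIA T4 setup.

```python
# Hedged sketch of two-device parallel inference; not the repository's code.
import torch
import torch.nn as nn

# Stand-in networks; the actual PPN architectures live in the linked repository.
seg_net = nn.Sequential(nn.Conv2d(2, 16, 3, padding=1), nn.ReLU(),
                        nn.Conv2d(16, 4, 1)).to("cuda:0").eval()
rec_net = nn.Sequential(nn.Conv2d(2, 16, 3, padding=1), nn.ReLU(),
                        nn.Conv2d(16, 2, 1)).to("cuda:1").eval()

@torch.no_grad()
def parallel_inference(bev_cpu: torch.Tensor):
    """bev_cpu: (B, 2, H, W) BEV batch in pinned CPU memory."""
    # Host-to-device copies and kernel launches are asynchronous, so both
    # GPUs process their copy of the input concurrently; we synchronize
    # only when the results are needed.
    seg_out = seg_net(bev_cpu.to("cuda:0", non_blocking=True))  # queued on GPU 0
    rec_out = rec_net(bev_cpu.to("cuda:1", non_blocking=True))  # queued on GPU 1
    torch.cuda.synchronize("cuda:0")
    torch.cuda.synchronize("cuda:1")
    return seg_out, rec_out

bev = torch.rand(1, 2, 400, 400).pin_memory()  # toy input at the sketch's BEV size
seg, rec = parallel_inference(bev)
```

Because each device drains its own kernel queue independently, wall-clock time approaches max(t_seg, t_rec) rather than t_seg + t_rec, which is why the speedup over a sequential configuration is bounded by roughly 2x.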
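The abstract also mentions training with a combination of loss functions that includes edge preservation. One common form of such a term, shown here as an assumption rather than the paper's exact formulation, penalizes the L1 distance between the finite-difference spatial gradients of prediction and target so that reconstructed BEV maps keep sharp object boundaries.

```python
# Hedged sketch of a gradient-based edge-preservation loss term.
import torch
import torch.nn.functional as F

def edge_preservation_loss(pred: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
    """pred, target: (B, C, H, W). L1 distance between spatial gradients."""
    dpx = pred[..., :, 1:] - pred[..., :, :-1]      # horizontal gradient of prediction
    dpy = pred[..., 1:, :] - pred[..., :-1, :]      # vertical gradient of prediction
    dtx = target[..., :, 1:] - target[..., :, :-1]  # horizontal gradient of target
    dty = target[..., 1:, :] - target[..., :-1, :]  # vertical gradient of target
    return F.l1_loss(dpx, dtx) + F.l1_loss(dpy, dty)
```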