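CVTNet takes as input the 2D images produced by projecting a LiDAR point cloud into multiple views, and uses cross transformers to fuse the information across those views. It extracts highly distinctive descriptors from LiDAR point clouds, which can be used for SLAM loop closure detection or global localization. The global descriptors that CVTNet generates are also invariant to the vehicle's yaw rotation, which improves place recognition accuracy when the same place is revisited from a different viewpoint. Experiments show that CVTNet reaches state-of-the-art place recognition accuracy and runs fast enough for the real-time requirements of autonomous driving.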
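To make the yaw-invariance idea concrete, here is a minimal toy sketch in plain Python. It relies on one assumption: that a yaw rotation of the sensor by a whole number of angular bins appears as a circular column shift of the range image. The helpers `shift_columns` and `toy_descriptor` are hypothetical names introduced for illustration only. They are not part of the CVTNet code base, and the pooling used here is a stand-in, not the network's actual architecture.

```python
def shift_columns(image, shift):
    """Circularly shift every row of a 2D list by `shift` columns.

    Assumption: this models a yaw rotation of the sensor by `shift`
    angular bins in a range-image style projection.
    """
    width = len(image[0])
    s = shift % width
    if s == 0:
        return [list(row) for row in image]
    return [row[-s:] + row[:-s] for row in image]


def toy_descriptor(image):
    """Build a descriptor that ignores column order.

    For each row we keep the max and the mean. Both depend only on the
    set of values in the row, so a circular column shift cannot change
    them. The row is sorted before summing so that floating-point
    rounding is identical regardless of the original column order.
    """
    desc = []
    for row in image:
        desc.append(max(row))
        desc.append(sum(sorted(row)) / len(row))
    return desc


# Toy 2 x 4 "range image" (rows = elevation bins, columns = azimuth bins).
range_image = [
    [5.0, 7.5, 2.0, 9.0],
    [1.0, 3.0, 4.5, 6.0],
]

# Simulate a yaw rotation by one azimuth bin.
rotated = shift_columns(range_image, 1)

same_descriptor = toy_descriptor(range_image) == toy_descriptor(rotated)
```

The rotated image differs from the original, yet both yield the same descriptor. A learned network needs a far more expressive mechanism than max and mean pooling, but the principle it has to satisfy is the same one shown here.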
Code: https://github.com/BIT-MJY/CVTNet
Paper: https://arxiv.org/abs/2302.01665
The abstract reads:
LiDAR-based place recognition (LPR) is one of the most crucial components of autonomous vehicles to identify previously visited places in GPS-denied environments. Most existing LPR methods use mundane representations of the input point cloud without considering different views, which may not fully exploit the information from LiDAR sensors. In this paper, we propose a cross-view transformer-based network, dubbed CVTNet, to fuse the range image views (RIVs) and bird’s eye views (BEVs) generated from the LiDAR data. It extracts correlations within the views themselves using intra-transformers and between the two different views using inter-transformers. Based on that, our proposed CVTNet generates a yaw-angle-invariant global descriptor for each laser scan end-to-end online and retrieves previously seen places by descriptor matching between the current query scan and the pre-built database. We evaluate our approach on three datasets collected with different sensor setups and environmental conditions. The experimental results show that our method outperforms the state-of-the-art LPR methods with strong robustness to viewpoint changes and long-time spans. Furthermore, our approach has a good real-time performance that can run faster than the typical LiDAR frame rate.
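The retrieval step mentioned in the abstract, matching the current query descriptor against a pre-built database, can be sketched as a brute-force nearest-neighbor search. This is an illustrative sketch only. The function names, the L2 normalization, the Euclidean distance, and the three-dimensional toy descriptors are all assumptions made for the example, and none of them is taken from the CVTNet repository.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def retrieve_top_k(query, database, k=1):
    """Return the k database entries closest to the query descriptor.

    Each result is (database_index, euclidean_distance), sorted from the
    nearest to the farthest. A brute-force scan is used for clarity; a
    large database would typically use an approximate nearest-neighbor index.
    """
    q = l2_normalize(query)
    scored = []
    for idx, desc in enumerate(database):
        d = l2_normalize(desc)
        dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(q, d)))
        scored.append((dist, idx))
    scored.sort()
    return [(idx, dist) for dist, idx in scored[:k]]


# Hypothetical global descriptors of three previously visited places.
database = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.6, 0.8, 0.0],
]

# Hypothetical descriptor of the current scan.
query = [0.5, 0.9, 0.1]

matches = retrieve_top_k(query, database, k=2)
best_index, best_distance = matches[0]
```

In a loop closure setting, the best match would normally be accepted only if its distance falls below a threshold chosen on validation data, so that scans of unvisited places are not forced onto a database entry.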