[Paper Collection] RGBD Semantic Segmentation

2023-05-16

Source: GitHub - Yangzhangcst/RGBD-semantic-segmentation: A paper list of RGBD semantic segmentation (processing)

RGBD semantic segmentation

A paper list of RGBD semantic segmentation.

Last updated: 2022/07/26

Update log

2020/May - update all recent papers and add a diagram of the history of RGBD semantic segmentation.
2020/July - update some recent papers (CVPR2020) of RGBD semantic segmentation.
2020/August - update some recent papers (ECCV2020) of RGBD semantic segmentation.
2020/October - update some recent papers (CVPR2020, WACV2020) of RGBD semantic segmentation.
2020/November - update some recent papers (ECCV2020, arXiv) and the links to papers and code for RGBD semantic segmentation.
2020/December - update some recent papers (PAMI, PRL, arXiv, ACCV) of RGBD semantic segmentation.
2021/February - update some recent papers (TMM, NeurIPS, arXiv) of RGBD semantic segmentation.
2021/April - update some recent papers (CVPR2021, ICRA2021, IEEE SPL, arXiv) of RGBD semantic segmentation.
2021/July - update some recent papers (CVPR2021, ICME2021, arXiv) of RGBD semantic segmentation.
2021/August - update some recent papers (IJCV, ICCV2021, IEEE SPL, arXiv) of RGBD semantic segmentation.
2022/January - update some recent papers (TITS, PR, IEEE SPL, arXiv) of RGBD semantic segmentation.
2022/March - update benchmark results on Cityscapes and ScanNet datasets.
2022/April - update some recent papers (CVPR, BMVC, IEEE TMM, arXiv) of RGBD semantic segmentation.
2022/May - update some recent papers of RGBD semantic segmentation.
2022/July - update some recent papers of RGBD semantic segmentation.

Datasets

The datasets used mainly in RGBD semantic segmentation are as follows; a short label-remapping sketch follows the list.

  • [NYUDv2] The NYU-Depth V2 dataset consists of 1,449 RGB-D images of indoor scenes, whose labels are usually mapped to 40 classes. The standard training and test sets contain 795 and 654 images, respectively.
  • [SUN RGB-D] The SUN RGB-D dataset contains 10,335 RGBD images with semantic labels organized into 37 categories; 5,285 images are used for training and 5,050 for testing.
  • [2D-3D-S] The Stanford 2D-3D-Semantic dataset contains 70,496 RGB and depth images with 2D annotations over 13 object categories. Areas 1, 2, 3, 4, and 6 are used for training, and Area 5 for testing.
  • [Cityscapes] Cityscapes contains a diverse set of stereo video sequences recorded in street scenes from 50 different cities, with high-quality pixel-level annotations for 5,000 frames plus a larger set of 20,000 weakly annotated frames.
  • [ScanNet] ScanNet is an RGB-D video dataset containing 2.5 million views in more than 1,500 scans, annotated with 3D camera poses, surface reconstructions, and instance-level semantic segmentations.
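As a concrete illustration of the NYUDv2 40-class convention mentioned above, here is a minimal hedged sketch of a vectorized label remapping. The array names and sizes are placeholders of this post, not the dataset's actual file layout: the real raw-to-40 mapping is the one published by Gupta et al., and the 795/654 split is defined by fixed image-id lists rather than contiguous ranges.

```python
import numpy as np

# Placeholder stand-ins: `raw_label` is an HxW map of original NYUDv2 label ids,
# and `raw_to_40` is a lookup table sending each raw id to one of the 40
# evaluation classes (with 0 kept as unlabeled/void).
raw_label = np.zeros((480, 640), dtype=np.int64)
raw_to_40 = np.zeros(895, dtype=np.int64)

# Vectorized remapping: every pixel is converted in a single indexing step.
label_40 = raw_to_40[raw_label]

# Split sizes from the standard protocol; real splits use published id lists.
train_ids = list(range(795))        # 795 training images
test_ids = list(range(795, 1449))   # 654 test images
```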

Metrics

The metrics used mainly in RGBD semantic segmentation are as follows; a short computation sketch follows the list.

  • [PixAcc] Pixel accuracy
  • [mAcc] Mean accuracy
  • [mIoU] Mean intersection over union
  • [f.w.IOU] Frequency-weighted intersection over union
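All four indexes can be derived from a single confusion matrix. Below is a minimal NumPy sketch (an addition to this post, not part of the original list) showing one common way to compute them; the function name and the `ignore_index=255` void-label convention are assumptions of the sketch.

```python
import numpy as np

def segmentation_metrics(pred, gt, num_classes, ignore_index=255):
    """Compute PixAcc, mAcc, mIoU and f.w.IOU from two integer label maps."""
    # Keep only pixels that carry a valid ground-truth label.
    mask = gt != ignore_index
    pred, gt = pred[mask], gt[mask]
    # Confusion matrix: rows are ground-truth classes, columns are predictions.
    cm = np.bincount(gt * num_classes + pred,
                     minlength=num_classes ** 2).reshape(num_classes, num_classes)
    tp = np.diag(cm)            # correctly labeled pixels per class
    gt_count = cm.sum(axis=1)   # ground-truth pixels per class
    union = gt_count + cm.sum(axis=0) - tp
    pix_acc = tp.sum() / cm.sum()
    with np.errstate(invalid="ignore"):  # classes absent from gt give NaN
        acc_per_class = tp / gt_count
        iou_per_class = tp / union
    m_acc = np.nanmean(acc_per_class)
    m_iou = np.nanmean(iou_per_class)
    freq = gt_count / gt_count.sum()
    fw_iou = np.nansum(freq * iou_per_class)
    return pix_acc, m_acc, m_iou, fw_iou

# Toy example: three classes, one of four pixels mislabeled.
gt = np.array([[0, 0], [1, 2]])
pred = np.array([[0, 1], [1, 2]])
print(segmentation_metrics(pred, gt, num_classes=3))
# -> PixAcc 0.75, mAcc ~0.833, mIoU ~0.667, f.w.IOU 0.625
```

In the toy example every index drops as the prediction diverges from the ground truth, which is the behavior the performance tables below rely on.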

Performance tables

Speed depends on the hardware (CPU, GPU, RAM, etc.), so a fair speed comparison is difficult. We therefore compare methods on four indexes: PixAcc, mAcc, mIoU, and f.w.IOU. The closer a segmentation result is to the ground truth, the higher these four indexes are.

NYUDv2

| Method | PixAcc | mAcc | mIoU | f.w.IOU | Input | Ref. from | Published | Year |
|---|---|---|---|---|---|---|---|---|
| POR | 59.1 | 28.4 | 29.1 | | RGBD | | CVPR | 2013 |
| RGBD R-CNN | 60.3 | 35.1 | 31.3 | 47 (in LSD-GF) | RGBD | | ECCV | 2014 |
| DeconvNet | 69.9 | 56.4 | 42.7 | 56 | RGB | LSD-GF | ICCV | 2015 |
| DeepLab | 68.7 | 46.9 | 36.8 | 52.5 | RGBD | STD2P | ICLR | 2015 |
| CRF-RNN | 66.3 | 48.9 | 35.4 | 51 | RGBD | STD2P | ICCV | 2015 |
| Multi-Scale CNN | 65.6 | 45.1 | 34.1 | 51.4 | RGB | LCSF-Deconv | ICCV | 2015 |
| FCN | 65.4 | 46.1 | 34 | 49.5 | RGBD | LCSF-Deconv | CVPR | 2015 |
| Mutex Constraints | 63.8 | 31.5 | | 48.5 (in LSD-GF) | RGBD | | ICCV | 2015 |
| E2S2 | 58.1 | 52.9 | 31 | 44.2 | RGBD | STD2P | ECCV | 2016 |
| BI-3000 | 58.9 | 39.3 | 27.7 | 43 | RGBD | STD2P | ECCV | 2016 |
| BI-1000 | 57.7 | 37.8 | 27.1 | 41.9 | RGBD | STD2P | ECCV | 2016 |
| LCSF-Deconv | | 47.3 | | | RGBD | | ECCV | 2016 |
| LSTM-CF | | 49.4 | | | RGBD | | ECCV | 2016 |
| CRF+RF+RFS | 73.8 | | | | RGBD | | PRL | 2016 |
| RDFNet-152 | 76 | 62.8 | 50.1 | | RGBD | | ICCV | 2017 |
| SCN-ResNet152 | | | 49.6 | | RGBD | | ICCV | 2017 |
| RDFNet-50 | 74.8 | 60.4 | 47.7 | | RGBD | | ICCV | 2017 |
| CFN(RefineNet) | | | 47.7 | | RGBD | | ICCV | 2017 |
| RefineNet-152 | 73.6 | 58.9 | 46.5 | | RGB | | CVPR | 2017 |
| LSD-GF | 71.9 | 60.7 | 45.9 | 59.3 | RGBD | | CVPR | 2017 |
| 3D-GNN | | 55.7 | 43.1 | | RGBD | | ICCV | 2017 |
| DML-Res50 | | | 40.2 | | RGB | | IJCAI | 2017 |
| STD2P | 70.1 | 53.8 | 40.1 | 55.7 | RGBD | | CVPR | 2017 |
| PBR-CNN | | | 33.2 | | RGB | | ICCBS | 2017 |
| B-SegNet | 68 | 45.8 | 32.4 | | RGB | | BMVC | 2017 |
| FC-CRF | 63.1 | 39 | 29.5 | 48.4 | RGBD | | TIP | 2017 |
| LCR | 55.6 | 31.7 | 21.8 | 39.9 | RGBD | | ICIP | 2017 |
| SegNet | 54.1 | 30.5 | 21 | 38.5 | RGBD | LCR | TPAMI | 2017 |
| D-Refine-152 | 74.1 | 59.5 | 47 | | RGB | | ICPR | 2018 |
| TRL-ResNet50 | 76.2 | 56.3 | 46.4 | | RGB | | ECCV | 2018 |
| D-CNN | | 56.3 | 43.9 | | RGBD | | ECCV | 2018 |
| RGBD-Geo | 70.3 | 51.7 | 41.2 | 54.2 | RGBD | | MTA | 2018 |
| Context | 70 | 53.6 | 40.6 | | RGB | | TPAMI | 2018 |
| DeepLab-LFOV | 70.3 | 49.6 | 39.4 | 54.7 | RGBD | STD2P | TPAMI | 2018 |
| D-depth-reg | 66.7 | 46.3 | 34.8 | 50.6 | RGBD | | PRL | 2018 |
| PU-Loop | 72.1 | | 44.5 | | RGB | | CVPR | 2018 |
| C-DCNN | 69 | 50.8 | 39.8 | | RGB | | TNNLS | 2018 |
| GAD | 84.8 | 68.7 | 59.6 | | RGB | | CVPR | 2019 |
| CTS-IM | 76.3 | | 50.6 | | RGBD | | ICIP | 2019 |
| PAP | 76.2 | 62.5 | 50.4 | | RGB | | CVPR | 2019 |
| KIL-ResNet101 | 75.1 | 58.4 | 50.2 | | RGB | | ACPR | 2019 |
| 2.5D-Conv | 75.9 | | 49.1 | | RGBD | | ICIP | 2019 |
| ACNet | | | 48.3 | | RGBD | | ICIP | 2019 |
| 3M2RNet | 76 | 63 | 48 | | RGBD | | SIC | 2019 |
| FDNet-16s | 73.9 | 60.3 | 47.4 | | RGB | | AAAI | 2019 |
| DMFNet | 74.4 | 59.3 | 46.8 | | RGBD | | IEEE Access | 2019 |
| MMAF-Net-152 | 72.2 | 59.2 | 44.8 | | RGBD | | arXiv | 2019 |
| RTJ-AA | | | 42 | | RGB | | ICRA | 2019 |
| JTRL-ResNet50 | 81.3 | 60.0 | 50.3 | | RGB | | TPAMI | 2019 |
| 3DN-Conv | | 52.4 | 39.3 | | RGB | | 3DV | 2019 |
| SGNet | 76.8 | 63.1 | 51 | | RGBD | | TIP | 2020 |
| SCN-ResNet101 | | | 48.3 | | RGBD | | TCYB | 2020 |
| RefineNet-Res152-Pool4 | 74.4 | 59.6 | 47.6 | | RGB | | TPAMI | 2020 |
| TSNet | 73.5 | 59.6 | 46.1 | | RGBD | | IEEE IS | 2020 |
| PSD-ResNet50 | 77.0 | 58.6 | 51.0 | | RGB | | CVPR | 2020 |
| Malleable 2.5D | 76.9 | | 50.9 | | RGBD | | ECCV | 2020 |
| BCMFP+SA-Gate | 77.9 | | 52.4 | | RGBD | | ECCV | 2020 |
| MTI-Net | 75.3 | 62.9 | 49.0 | | RGB | | ECCV | 2020 |
| VCD+RedNet | | 63.5 | 50.7 | | RGBD | | CVPR | 2020 |
| VCD+ACNet | | 64.4 | 51.9 | | RGBD | | CVPR | 2020 |
| SANet | 75.9 | | 50.7 | | RGB | | arXiv | 2020 |
| Zig-Zag Net (ResNet152) | 77.0 | 64.0 | 51.2 | | RGBD | | TPAMI | 2020 |
| MCN-DRM | | 56.1 | 43.1 | | RGBD | | ICNSC | 2020 |
| CANet | 76.6 | 63.8 | 51.2 | | RGBD | | ACCV | 2020 |
| CEN(ResNet152) | 77.7 | 65.0 | 52.5 | | RGBD | | NeurIPS | 2020 |
| ESANet | | | 50.5 | | RGBD | | ICRA | 2021 |
| LWM(ResNet152) | 81.46 | 65.24 | 51.51 | | RGB | | TMM | 2021 |
| GLPNet(ResNet101) | 79.1 | 66.6 | 54.6 | | RGBD | | arXiv | 2021 |
| ESOSD-Net(Xception-65) | 73.3 | 64.7 | 45.0 | | RGB | | arXiv | 2021 |
| NANet(ResNet101) | 77.9 | | 52.3 | | RGBD | | IEEE SPL | 2021 |
| InverseForm | 78.1 | | 53.1 | | RGB | | CVPR | 2021 |
| FSFNet | 77.9 | | 52.0 | | RGBD | | ICME | 2021 |
| CSNet | 77.5 | 63.6 | 51.5 | | RGBD | | ISPRS JPRS | 2021 |
| ShapeConv | 75.8 | 62.8 | 50.2 | 62.6 | RGBD | | ICCV | 2021 |
| CI-Net | 72.7 | | 42.6 | | RGB | | arXiv | 2021 |
| RGBxD | 76.7 | 63.5 | 51.1 | | RGBD | | Neurocomput. | 2021 |
| TCD(ResNet101) | 77.8 | | 53.1 | | RGBD | | IEEE SPL | 2021 |
| RAFNet-50 | 73.8 | 60.3 | 47.5 | | RGBD | | Displays | 2021 |
| RTLNet | 77.7 | | 53.1 | | RGBD | | IEEE SPL | 2021 |
| H3S-Fuse | 78.3 | | 53.5 | | RGB | | BMVC | 2021 |
| EBANet | 76.82 | | 51.51 | | RGBD | | ICCSIP | 2021 |
| CANet(ResNet101) | 77.1 | 64.6 | 51.5 | | RGBD | | PR | 2022 |
| ADSD(ResNet50) | 77.5 | 65.3 | 52.5 | | RGBD | | arXiv | 2022 |
| InvPT | | | 53.56 | | RGB | | arXiv | 2022 |
| PGDENet | 78.1 | 66.7 | 53.7 | | RGBD | | IEEE TMM | 2022 |
| CMX | 80.1 | | 56.9 | | RGBD | | arXiv | 2022 |
| RFNet | 80.1 | 64.7 | 53.5 | | RGBD | | IEEE TETCI | 2022 |
| MTF | 79.0 | 66.9 | 54.2 | | RGBD | | CVPR | 2022 |
| FRNet | 77.6 | 66.5 | 53.6 | | RGBD | | IEEE JSTSP | 2022 |
| DRD | | 51.0 | 38.2 | | RGB | | IEEE ICASSP | 2022 |
| SAMD | 74.4 | 67.2 | 52.3 | 61.9 | RGBD | | Neurocomput. | 2022 |
| BFFNet-152 | | | 47.5 | | RGBD | | IEEE ICSP | 2022 |
| MQTransformer | | | 49.18 | | RGBD | | arXiv | 2022 |
| GED | 75.9 | 62.4 | 49.4 | | RGBD | | MTA | 2022 |
| LDF | 84.8 | 68.7 | 59.6 | | RGB | | MTA | 2022 |
| PCGNet | 77.6 | | 52.1 | | RGBD | | IEEE ICMEW | 2022 |

SUN RGB-D

| Method | PixAcc | mAcc | mIoU | f.w.IOU | Input | Ref. from | Published | Year |
|---|---|---|---|---|---|---|---|---|
| FCN | 68.2 | 38.4 | 27.4 | | RGB | SegNet | CVPR | 2015 |
| DeconvNet | 66.1 | 32.3 | 22.6 | | RGB | SegNet | ICCV | 2015 |
| IFCN | 77.7 | 55.5 | 42.7 | | RGB | | arXiv | 2016 |
| CFN(RefineNet) | | | 48.1 | | RGBD | | ICCV | 2017 |
| RDFNet-152 | 81.5 | 60.1 | 47.7 | | RGBD | | ICCV | 2017 |
| RefineNet-Res152 | 80.6 | 58.5 | 45.9 | | RGB | | CVPR | 2017 |
| 3D-GNN | | 57 | 45.9 | | RGBD | | ICCV | 2017 |
| DML-Res50 | | | 42.3 | | RGB | | IJCAI | 2017 |
| HP-SPS | 75.7 | 50.1 | 38 | | RGB | | BMVC | 2017 |
| FuseNet | 76.3 | 48.3 | 37.3 | | RGBD | | ACCV | 2017 |
| LRN | 72.5 | 46.8 | 33.1 | | RGB | | arXiv | 2017 |
| SegNet | 72.6 | 44.8 | 31.8 | | RGB | MMAF-Net-152 | TPAMI | 2017 |
| B-SegNet | 71.2 | 45.9 | 30.7 | | RGB | | BMVC | 2017 |
| LSD-GF | | 58 | | | RGBD | | CVPR | 2017 |
| TRL-ResNet101 | 84.3 | 58.9 | 50.3 | | RGB | | ECCV | 2018 |
| CCF-GMA | 81.4 | 60.3 | 47.1 | | RGB | | CVPR | 2018 |
| D-Refine-152 | 80.8 | 58.9 | 46.3 | | RGB | | ICPR | 2018 |
| Context | 78.4 | 53.4 | 42.3 | | RGB | | TPAMI | 2018 |
| D-CNN | | 53.5 | 42 | | RGBD | | ECCV | 2018 |
| G-FRNet-Res101 | 75.3 | 47.5 | 36.9 | | RGB | | arXiv | 2018 |
| DeepLab-LFOV | 71.9 | 42.2 | 32.1 | | RGB | | TPAMI | 2018 |
| PU-Loop | 80.3 | | 45.1 | | RGB | | CVPR | 2018 |
| C-DCNN | 77.3 | 50 | 39.4 | | RGB | | TNNLS | 2018 |
| GAD | 85.5 | 74.9 | 54.5 | | RGB | | CVPR | 2019 |
| KIL-ResNet101 | 84.8 | 58 | 52 | | RGB | | ACPR | 2019 |
| PAP | 83.8 | 58.4 | 50.5 | | RGB | | CVPR | 2019 |
| 3M2RNet | 83.1 | 63.5 | 49.8 | | RGBD | | SIC | 2019 |
| CTS | 82.4 | | 48.5 | | RGBD | | ICIP | 2019 |
| 2.5D-Conv | 82.4 | | 48.2 | | RGBD | | ICIP | 2019 |
| ACNet | | | 48.1 | | RGBD | | ICIP | 2019 |
| MMAF-Net-152 | 81 | 58.2 | 47 | | RGBD | | arXiv | 2019 |
| LCR-RGBD | | | 42.4 | | RGBD | | CVPRW | 2019 |
| EFCN-8s | 76.9 | 53.5 | 40.7 | | RGB | | TIP | 2019 |
| DSNet | 75.6 | | 32.1 | | RGB | | ICASSP | 2019 |
| JTRL-ResNet101 | 84.8 | 59.1 | 50.8 | | RGB | | TPAMI | 2019 |
| SCN-ResNet152 | | | 50.7 | | RGBD | | TCYB | 2020 |
| SGNet | 81.8 | 60.9 | 48.5 | | RGBD | | TIP | 2020 |
| CGBNet | 82.3 | 61.3 | 48.2 | | RGB | | TIP | 2020 |
| CANet-ResNet101 | 81.9 | | 47.7 | | RGB | | arXiv | 2020 |
| RefineNet-Res152-Pool4 | 81.1 | 57.7 | 47 | | RGB | | TPAMI | 2020 |
| PSD-ResNet50 | 84.0 | 57.3 | 50.6 | | RGB | | CVPR | 2020 |
| BCMFP+SA-Gate | 82.5 | | 49.4 | | RGBD | | ECCV | 2020 |
| QGN | 82.4 | | 45.4 | | RGBD | | WACV | 2020 |
| VCD+RedNet | | 62.9 | 50.3 | | RGBD | | CVPR | 2020 |
| VCD+ACNet | | 64.1 | 51.2 | | RGBD | | CVPR | 2020 |
| SANet | 82.3 | | 51.5 | | RGB | | arXiv | 2020 |
| Zig-Zag Net (ResNet152) | 84.7 | 62.9 | 51.8 | | RGBD | | TPAMI | 2020 |
| MCN-DRM | | 54.6 | 42.8 | | RGBD | | ICNSC | 2020 |
| CANet | 82.5 | 60.5 | 49.3 | | RGBD | | ACCV | 2020 |
| CEN(ResNet152) | 83.5 | 63.2 | 51.1 | | RGBD | | NeurIPS | 2020 |
| AdapNet++ | | | 38.4 | | RGBD | | IJCV | 2020 |
| ESANet | | | 48.3 | | RGBD | | ICRA | 2021 |
| LWM(ResNet152) | 82.65 | 70.21 | 53.12 | | RGB | | TMM | 2021 |
| GLPNet(ResNet101) | 82.8 | 63.3 | 51.2 | | RGBD | | arXiv | 2021 |
| NANet(ResNet101) | 82.3 | | 48.8 | | RGBD | | IEEE SPL | 2021 |
| FSFNet | 81.8 | | 50.6 | | RGBD | | ICME | 2021 |
| CSNet | 82.0 | 63.1 | 52.8 | | RGBD | | ISPRS JPRS | 2021 |
| ShapeConv(ResNet101) | 82.0 | 58.5 | 47.6 | 71.2 | RGBD | | ICCV | 2021 |
| CI-Net | 80.7 | | 44.3 | | RGB | | arXiv | 2021 |
| RGBxD | 81.7 | 58.8 | 47.7 | | RGBD | | Neurocomput. | 2021 |
| TCD(ResNet101) | 83.1 | | 49.5 | | RGBD | | IEEE SPL | 2021 |
| RAFNet-50 | 81.3 | 59.4 | 47.2 | | RGBD | | Displays | 2021 |
| GRBNet | 81.3 | | 45.7 | | RGBD | | TITS | 2021 |
| RTLNet | 81.3 | | 45.7 | | RGBD | | IEEE SPL | 2021 |
| CANet(ResNet101) | 85.2 | | 50.6 | | RGBD | | PR | 2022 |
| ADSD(ResNet50) | 81.8 | 62.1 | 49.6 | | RGBD | | arXiv | 2022 |
| PGDENet | 87.7 | 61.7 | 51.0 | | RGBD | | IEEE TMM | 2022 |
| CMX | 83.3 | | 51.1 | | RGBD | | IEEE TMM | 2022 |
| RFNet | 87.3 | 59.0 | 50.7 | | RGBD | | IEEE TETCI | 2022 |
| MTF | 84.7 | 64.1 | 53.0 | | RGBD | | CVPR | 2022 |
| FRNet | 87.4 | 62.2 | 51.8 | | RGBD | | IEEE JSTSP | 2022 |
| DRD | | 48.9 | 39.5 | | RGB | | IEEE ICASSP | 2022 |
| SAMD | | 63.4 | | | RGBD | | Neurocomput. | 2022 |
| BFFNet-152 | 86.7 | | 44.6 | | RGBD | | IEEE ICSP | 2022 |
| LDF | 85.5 | 68.3 | 47.5 | | RGB | | MTA | 2022 |
| PCGNet | 82.1 | | 49.0 | | RGBD | | IEEE ICMEW | 2022 |

2D-3D-S

| Method | PixAcc | mAcc | mIoU | f.w.IOU | Input | Ref. from | Published | Year |
|---|---|---|---|---|---|---|---|---|
| Deeplab | 64.3 | 46.7 | 35.5 | 48.5 | RGBD | MMAF-Net-152 | ICLR | 2015 |
| D-CNN | 65.4 | | 35.9 | | RGBD | CMX | ECCV | 2018 |
| DeepLab-LFOV | 88.0 | | 42.2 | 69.8 | RGB | PU-Loop | TPAMI | 2018 |
| D-CNN | 65.4 | 55.5 | 39.5 | 49.9 | RGBD | | ECCV | 2018 |
| PU-Loop | 91.0 | | | 76.5 | RGB | | CVPR | 2018 |
| MMAF-Net-152 | 76.5 | 62.3 | 52.9 | | RGBD | | arXiv | 2019 |
| 3M2RNet | 79.8 | 75.2 | 63 | | RGBD | | SIC | 2019 |
| ShapeConv | 82.7 | | 60.6 | | RGBD | CMX | ICCV | 2021 |
| CMX | 82.6 | | 62.1 | | RGBD | | arXiv | 2022 |

Cityscapes

Benchmark Suite – Cityscapes Dataset

ScanNet

Benchmark Results - ScanNet Benchmark (2D Semantic label benchmark)

Paper list

  • [POR] Gupta, S., et al. (2013). Perceptual Organization and Recognition of Indoor Scenes from RGB-D Images. IEEE Conference on Computer Vision and Pattern Recognition: 564-571. [Paper] [Code]
  • [RGBD R-CNN] Gupta, S., et al. (2014). Learning Rich Features from RGB-D Images for Object Detection and Segmentation. European Conference on Computer Vision: 345-360. [Paper] [Code]
  • [FCN] Long, J., et al. (2015). Fully convolutional networks for semantic segmentation. IEEE Conference on Computer Vision and Pattern Recognition: 3431-3440. [Paper] [Code]
  • [CRF-RNN] Zheng, S., et al. (2015). Conditional Random Fields as Recurrent Neural Networks. IEEE International Conference on Computer Vision: 1529-1537. [Paper] [Code]
  • [Mutex Constraints] Deng, Z., et al. (2015). Semantic Segmentation of RGBD Images with Mutex Constraints. IEEE International Conference on Computer Vision: 1733-1741. [Paper] [Code]
  • [DeepLab] Chen, L., et al. (2015). Semantic Image Segmentation with Deep Convolutional Nets and Fully Connected CRFs. International Conference on Learning Representations. [Paper] [Code]
  • [Multi-Scale CNN] Eigen, D. and R. Fergus (2015). Predicting Depth, Surface Normals and Semantic Labels with a Common Multi-scale Convolutional Architecture. IEEE International Conference on Computer Vision: 2650-2658. [Paper] [Code]
  • [DeconvNet] Noh, H., et al. (2015). Learning Deconvolution Network for Semantic Segmentation. International Conference on Computer Vision: 1520-1528. [Paper] [Code]
  • [LSTM-CF] Li, Z., et al. (2016). LSTM-CF: Unifying Context Modeling and Fusion with LSTMs for RGB-D Scene Labeling. European Conference on Computer Vision: 541-557. [Paper] [Code]
  • [LCSF-Deconv] Wang, J., et al. (2016). Learning Common and Specific Features for RGB-D Semantic Segmentation with Deconvolutional Networks. European Conference on Computer Vision: 664-679. [Paper] [Code]
  • [BI] Gadde, R., et al. (2016). Superpixel Convolutional Networks using Bilateral Inceptions. European Conference on Computer Vision: 597-613. [Paper] [Code]
  • [E2S2] Caesar, H., et al. (2016). Region-Based Semantic Segmentation with End-to-End Training. European Conference on Computer Vision: 381-397. [Paper] [Code]
  • [IFCN] Shuai, B., et al. (2016). Improving Fully Convolution Network for Semantic Segmentation. arXiv:1611.08986. [Paper] [Code]
  • [CRF+RF+RFS] Thøgersen, M., et al. (2016). Segmentation of RGB-D Indoor Scenes by Stacking Random Forests and Conditional Random Fields. Pattern Recognition Letters 80, 208-215. [Paper] [Code]
  • [SegNet] Badrinarayanan, V., et al. (2017). SegNet: A Deep Convolutional EnCoder-Decoder Architecture for Image Segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence 39(12): 2481-2495. [Paper] [Code]
  • [LSD-GF] Cheng, Y., et al. (2017). Locality-Sensitive Deconvolution Networks with Gated Fusion for RGB-D Indoor Semantic Segmentation. IEEE Conference on Computer Vision and Pattern Recognition: 1475-1483. [Paper] [Code]
  • [LCR] Chu, J., et al. (2017). Learnable contextual regularization for semantic segmentation of indoor scene images. IEEE International Conference on Image Processing: 1267-1271. [Paper] [Code]
  • [RefineNet] Lin, G., et al. (2017). RefineNet: Multi-path Refinement Networks for High-Resolution Semantic Segmentation. IEEE Conference on Computer Vision and Pattern Recognition: 5168-5177. [Paper] [Code1] [Code2]
  • [FuseNet] Hazirbas, C., et al. (2017). FuseNet: Incorporating Depth into Semantic Segmentation via Fusion-Based CNN Architecture. Asian Conference on Computer Vision: 213-228. [Paper] [Code]
  • [STD2P] He, Y., et al. (2017). STD2P: RGBD Semantic Segmentation Using Spatio-Temporal Data-Driven Pooling. IEEE Conference on Computer Vision and Pattern Recognition: 7158-7167. [Paper] [Code]
  • [RDFNet] Lee, S., et al. (2017). RDFNet: RGB-D Multi-level Residual Feature Fusion for Indoor Semantic Segmentation. IEEE International Conference on Computer Vision: 4990-4999. [Paper] [Code]
  • [CFN(RefineNet)] Lin, D., et al. (2017). Cascaded Feature Network for Semantic Segmentation of RGB-D Images. IEEE International Conference on Computer Vision: 1320-1328. [Paper] [Code]
  • [3D-GNN] Qi, X., et al. (2017). 3D Graph Neural Networks for RGBD Semantic Segmentation. IEEE International Conference on Computer Vision: 5209-5218. [Paper] [Code1] [Code2]
  • [DML-Res50] Shen, T., et al. (2017). Learning Multi-level Region Consistency with Dense Multi-label Networks for Semantic Segmentation. Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence: 2708-2714. [Paper] [Code]
  • [PBR-CNN] Zhang, Y., et al. (2017). Physically-Based Rendering for Indoor Scene Understanding Using Convolutional Neural Networks. IEEE Conference on Computer Vision and Pattern Recognition: 5057-5065. [Paper] [Code]
  • [FC-CRF] Liu, F., et al. (2017). Discriminative Training of Deep Fully Connected Continuous CRFs With Task-Specific Loss. IEEE Transactions on Image Processing 26(5), 2127-2136. [Paper] [Code]
  • [HP-SPS] Park, H., et al. (2017). Superpixel-based semantic segmentation trained by statistical process control. British Machine Vision Conference. [Paper] [Code]
  • [LRN] Islam, M. A., et al. (2017). Label Refinement Network for Coarse-to-Fine Semantic Segmentation. arXiv:1703.00551. [Paper] [Code]
  • [G-FRNet-Res101] Islam, M. A., et al. (2018). Gated Feedback Refinement Network for Coarse-to-Fine Dense Semantic Image Labeling. arXiv:1806.11266 [Paper] [Code]
  • [CCF-GMA] Ding, H., et al. (2018). Context Contrasted Feature and Gated Multi-scale Aggregation for Scene Segmentation. IEEE Conference on Computer Vision and Pattern Recognition: 2393-2402. [Paper] [Code]
  • [Context] Lin, G., et al. (2018). Exploring Context with Deep Structured Models for Semantic Segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 40(6), 1352-1366. [Paper] [Code]
  • [D-Refine-152] Chang, M., et al. (2018). Depth-assisted RefineNet for Indoor Semantic Segmentation. International Conference on Pattern Recognition: 1845-1850. [Paper] [Code]
  • [D-depth-reg] Guo, Y. and T. Chen (2018). Semantic segmentation of RGBD images based on deep depth regression. Pattern Recognition Letters 109: 55-64. [Paper] [Code]
  • [RGBD-Geo] Liu, H., et al. (2018). RGB-D joint modeling with scene geometric information for indoor semantic segmentation. Multimedia Tools and Applications 77(17): 22475-22488. [Paper] [Code]
  • [D-CNN] Wang, W. and U. Neumann (2018). Depth-aware CNN for RGB-D Segmentation. European Conference on Computer Vision: 144-161. [Paper] [Code]
  • [TRL-ResNet50/101] Zhang, Z., et al. (2018). Joint Task-Recursive Learning for Semantic Segmentation and Depth Estimation. European Conference on Computer Vision. [Paper] [Code]
  • [DeepLab-LFOV] Chen, L., et al. (2018). DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs. IEEE Transactions on Pattern Analysis and Machine Intelligence, 40(4), 834-848. [Paper] [Code]
  • [PU-Loop] Kong, S. and C. Fowlkes (2018). Recurrent Scene Parsing with Perspective Understanding in the Loop. IEEE Conference on Computer Vision and Pattern Recognition: 956-965. [Paper] [Code]
  • [PAD-Net] Xu, D., et al. (2018). PAD-Net: Multi-Tasks Guided Prediction-and-Distillation Network for Simultaneous Depth Estimation and Scene Parsing. IEEE Conference on Computer Vision and Pattern Recognition: 675-684. [Paper] [Code]
  • [C-DCNN] Liu, J., et al. (2018). Collaborative Deconvolutional Neural Networks for Joint Depth Estimation and Semantic Segmentation. IEEE Transactions on Neural Networks and Learning Systems 29(11): 5655-5666. [Paper] [Code]
  • [EFCN-8s] Shuai, B., et al. (2019). Toward Achieving Robust Low-Level and High-Level Scene Parsing. IEEE Transactions on Image Processing, 28(3), 1378-1390. [Paper] [Code]
  • [3M2RNet] Fooladgar, F., and Kasaei, S. (2019). 3M2RNet: Multi-Modal Multi-Resolution Refinement Network for Semantic Segmentation. Science and Information Conference: 544-557. [Paper] [Code]
  • [RFBNet] Deng, L., et al. (2019). RFBNet: Deep Multimodal Networks with Residual Fusion Blocks for RGB-D Semantic Segmentation. arXiv:1907.00135 [Paper] [Code]
  • [MMAF-Net-152] Fooladgar, F. and S. Kasaei (2019). Multi-Modal Attention-based Fusion Model for Semantic Segmentation of RGB-Depth Images. arXiv:1912.11691. [Paper] [Code]
  • [LCR-RGBD] Giannone, G. and B. Chidlovskii (2019). Learning Common Representation from RGB and Depth Images. IEEE Conference on Computer Vision and Pattern Recognition Workshops. [Paper] [Code]
  • [ACNet] Hu, X., et al. (2019). ACNET: Attention Based Network to Exploit Complementary Features for RGBD Semantic Segmentation. IEEE International Conference on Image Processing: 1440-1444. [Paper] [Code]
  • [DSNet] Jiang, F., et al. (2019). DSNET: Accelerate Indoor Scene Semantic Segmentation. IEEE International Conference on Acoustics, Speech and Signal Processing: 3317-3321. [Paper] [Code]
  • [GAD] Jiao, J., et al. (2019). Geometry-Aware Distillation for Indoor Semantic Segmentation. IEEE Conference on Computer Vision and Pattern Recognition: 2864-2873. [Paper] [Code]
  • [RTJ-AA] Nekrasov, V., et al. (2019). Real-Time Joint Semantic Segmentation and Depth Estimation Using Asymmetric Annotations. International Conference on Robotics and Automation: 7101-7107. [Paper] [Code]
  • [CTS-IM] Xing, Y., et al. (2019). Coupling Two-Stream RGB-D Semantic Segmentation Network by Idempotent Mappings. IEEE International Conference on Image Processing: 1850-1854. [Paper] [Code]
  • [2.5D-Conv] Xing, Y. J., et al. (2019). 2.5D Convolution for RGB-D Semantic Segmentation. IEEE International Conference on Image Processing: 1410-1414. [Paper] [Code]
  • [DMFNet] Yuan, J., et al. (2019). DMFNet: Deep Multi-Modal Fusion Network for RGB-D Indoor Scene Segmentation. IEEE Access 7: 169350-169358. [Paper] [Code]
  • [PAP] Zhang, Z., et al. (2019). Pattern-Affinitive Propagation Across Depth, Surface Normal and Semantic Segmentation. IEEE Conference on Computer Vision and Pattern Recognition: 4101-4110. [Paper] [Code]
  • [KIL-ResNet101] Zhou, L., et al. (2019). KIL: Knowledge Interactiveness Learning for Joint Depth Estimation and Semantic Segmentation. Asian Conference on Pattern Recognition: 835-848. [Paper] [Code]
  • [FDNet-16s] Zhen, M., et al. (2019). Learning Fully Dense Neural Networks for Image Semantic Segmentation. The Thirty-Third AAAI Conference on Artificial Intelligence: 9283-9290. [Paper] [Code]
  • [JTRL-ResNet50/101] Zhang, Z., et al. (2019). Joint Task-Recursive Learning for RGB-D Scene Understanding. IEEE Transactions on Pattern Analysis and Machine Intelligence. [Paper] [Code]
  • [3DN-Conv] Chen, Y., et al. (2019). 3D Neighborhood Convolution: Learning Depth-Aware Features for RGB-D and RGB Semantic Segmentation. International Conference on 3D Vision. [Paper] [Code]
  • [SGNet] Chen, L.-Z., et al. (2020). Spatial Information Guided Convolution for Real-Time RGBD Semantic Segmentation. IEEE Transactions on Image Processing. [Paper] [Code]
  • [SCN-ResNet101] Lin, D., et al. (2020). SCN: Switchable Context Network for Semantic Segmentation of RGB-D Images. IEEE Transactions on Cybernetics 50(3): 1120-1131. [Paper] [Code]
  • [RefineNet-Res152-Pool4] Lin, G., et al. (2020). RefineNet: Multi-Path Refinement Networks for Dense Prediction. IEEE Transactions on Pattern Analysis and Machine Intelligence 42(5): 1228-1242. [Paper] [Code]
  • [CANet-ResNet101] Tang, Q., et al. (2020). Attention-guided Chained Context Aggregation for Semantic Segmentation. arXiv:2002.12041. [Paper] [Code]
  • [CGBNet] Ding, H., et al. (2020). Semantic Segmentation with Context Encoding and Multi-Path Decoding. IEEE Transactions on Image Processing 29: 3520-3533. [Paper] [Code]
  • [TSNet] Zhou, W., et al. (2020). TSNet: Three-stream Self-attention Network for RGB-D Indoor Semantic Segmentation. IEEE Intelligent Systems. [Paper] [Code]
  • [PSD-ResNet50] Zhou, L., et al. (2020). Pattern-Structure Diffusion for Multi-Task Learning. IEEE Conference on Computer Vision and Pattern Recognition. [Paper] [Code]
  • [Malleable 2.5D] Xing, Y., et al. (2020). Malleable 2.5D Convolution: Learning Receptive Fields along the Depth-axis for RGB-D Scene Parsing. European Conference on Computer Vision. [Paper] [Code]
  • [BCMFP+SA-Gate] Chen X., et al. (2020). Bi-directional Cross-Modality Feature Propagation with Separation-and-Aggregation Gate for RGB-D Semantic Segmentation. European Conference on Computer Vision. [Paper] [Code]
  • [MTI-Net] Vandenhende S., et al. (2020). MTI-Net: Multi-Scale Task Interaction Networks for Multi-Task Learning. European Conference on Computer Vision. [Paper] [Code]
  • [QGN] Kashyap C., et al. (2020). Quadtree Generating Networks: Efficient Hierarchical Scene Parsing with Sparse Convolutions. IEEE Winter Conference on Applications of Computer Vision. [Paper] [Code]
  • [VCD+RedNet/ACNet] Xiong, Z.-T., et al. (2020). Variational Context-Deformable ConvNets for Indoor Scene Parsing. IEEE Conference on Computer Vision and Pattern Recognition. [Paper] [Code]
  • [SANet] Yu, L., et al. (2020). Multi-layer Feature Aggregation for Deep Scene Parsing Models. arXiv:2011.02572. [Paper] [Code]
  • [Zig-Zag Net] Lin, D., et al. (2020). Zig-Zag Network for Semantic Segmentation of RGB-D Images. IEEE Transactions on Pattern Analysis and Machine Intelligence 42(10): 2642-2655. [Paper] [Code]
  • [MCN-DRM] Zheng, Z., et al. (2020). Multi-resolution Cascaded Network with Depth-similar Residual Module for Real-time Semantic Segmentation on RGB-D Images. IEEE International Conference on Networking, Sensing and Control (ICNSC). [Paper] [Code]
  • [CANet] Zhou H., et al. (2020). RGB-D Co-attention Network for Semantic Segmentation. Asian Conference on Computer Vision. [Paper] [Code]
  • [CEN] Wang, Y., et al. (2020). Deep Multimodal Fusion by Channel Exchanging. 34th Conference on Neural Information Processing Systems [Paper] [Code]
  • [Z-ACN] Wu, Z., et al. (2020). Depth-Adapted CNN for RGB-D cameras. Asian Conference on Computer Vision. [Paper]
  • [AdapNet++] Valada, A., et al. (2020). Self-Supervised Model Adaptation for Multimodal Semantic Segmentation. International Journal of Computer Vision [Paper] [Code]
  • [ESANet] Seichter, D., et al. (2021). Efficient RGB-D Semantic Segmentation for Indoor Scene Analysis. IEEE International Conference on Robotics and Automation. [Paper] [Code]
  • [LWM] Gu, Z., et al. (2021). Hard Pixel Mining for Depth Privileged Semantic Segmentation. IEEE Transactions on Multimedia. [Paper] [Code]
  • [GLPNet] Chen, S., et al. (2021). Global-Local Propagation Network for RGB-D Semantic Segmentation. arXiv:2101.10801. [Paper] [Code]
  • [ESOSD-Net] He, L., et al. (2021). SOSD-Net: Joint Semantic Object Segmentation and Depth Estimation from Monocular images. arXiv:2101.07422. [Paper] [Code]
  • [NANet] Zhang, G., et al. (2021). Non-local Aggregation for RGB-D Semantic Segmentation. IEEE Signal Processing Letters. [Paper] [Code]
  • [ARLoss] Cao, L., et al. (2021). Use square root affinity to regress labels in semantic segmentation. arXiv:2103.04990. [Paper] [Code]
  • [InverseForm] Borse, S., et al. (2021). InverseForm: A Loss Function for Structured Boundary-Aware Segmentation. IEEE Conference on Computer Vision and Pattern Recognition. [Paper] [Code]
  • [FSFNet] Su, Y., et al. (2021). Deep feature selection-and-fusion for RGB-D semantic segmentation. IEEE International Conference on Multimedia and Expo. [Paper] [Code]
  • [3D-to-2D] Liu, Z., et al. (2021). 3D-to-2D Distillation for Indoor Scene Parsing. IEEE Conference on Computer Vision and Pattern Recognition. [Paper] [Code]
  • [ATRC] Bruggemann, D., et al. (2021). Exploring Relational Context for Multi-Task Dense Prediction. International Conference on Computer Vision [Paper] [Code]
  • [CSNet] Huan L., et al. (2021). Learning deep cross-scale feature propagation for indoor semantic segmentation. ISPRS Journal of Photogrammetry and Remote Sensing [Paper] [Code]
  • [ShapeConv] Cao J., et al. (2021). ShapeConv: Shape-aware Convolutional Layer for Indoor RGB-D Semantic Segmentation. International Conference on Computer Vision [Paper] [Code]
  • [CI-Net] Gao T., et al. (2021). CI-Net: Contextual Information for Joint Semantic Segmentation and Depth Estimation. arXiv:2107.13800 [Paper]
  • [UMT] Du C., et al. (2021). Improving Multi-Modal Learning with Uni-Modal Teachers. arXiv:2106.11059 [Paper]
  • [RGBxD] Cao J., et al. (2021). RGBxD: Learning depth-weighted RGB patches for RGB-D indoor semantic segmentation. Neurocomputing [Paper]
  • [TCD] Yue Y., et al. (2021). Two-Stage Cascaded Decoder for Semantic Segmentation of RGB-D Images. IEEE Signal Processing Letters [Paper]
  • [RAFNet] Yan X., et al. (2021). RAFNet: RGB-D attention feature fusion network for indoor semantic segmentation. Displays. [Paper] [Code]
  • [MAMD] Razzaghi P., et al. (2021). Modality adaptation in multimodal data. Expert Systems with Applications. [Paper] [Code]
  • [FuseNet-3DEF] Terreran M., et al. (2022). Light deep learning models enriched with Entangled features for RGB-D semantic segmentation. Robotics and Autonomous Systems. [Paper] [Code]
  • [GRBNet] Zhou W., et al. (2021). Gated-Residual Block for Semantic Segmentation Using RGB-D Data. IEEE Transactions on Intelligent Transportation Systems. [Paper] [Code]
  • [RTLNet] Zhou W., et al. (2021). RTLNet: Recursive Triple-Path Learning Network for Scene Parsing of RGB-D Images. IEEE Signal Processing Letters. [Paper] [Code]
  • [EBANet] Wang R., et al. (2021). EBANet: Efficient Boundary-Aware Network for RGB-D Semantic Segmentation. International Conference on Cognitive Systems and Signal Processing. [Paper]
  • [CANet] Zhou H., et al. (2022). CANet: Co-attention network for RGB-D semantic segmentation. Pattern Recognition. [Paper] [Code]
  • [ADSD] Zhang Y., et al. (2022). Attention-based Dual Supervised Decoder for RGBD Semantic Segmentation. arXiv:2201.01427 [Paper] [Code]
  • [HS3] Borse S., et al. (2021). HS3: Learning with Proper Task Complexity in Hierarchically Supervised Semantic Segmentation. British Machine Vision Conference [Paper] [Code]
  • [InvPT] Ye H., et al. (2022). Inverted Pyramid Multi-task Transformer for Dense Scene Understanding. arXiv:2203.07997 [Paper] [Code]
  • [PGDENet] Zhou W., et al. (2022). PGDENet: Progressive Guided Fusion and Depth Enhancement Network for RGB-D Indoor Scene Parsing. IEEE Transactions on Multimedia [Paper] [Code]
  • [CMX] Liu X., et al. (2022). CMX: Cross-Modal Fusion for RGB-X Semantic Segmentation with Transformers. arXiv:2203.04838 [Paper] [Code]
  • [RFNet] Zhou W., et al. (2022). RFNet: Reverse Fusion Network With Attention Mechanism for RGB-D Indoor Scene Understanding. IEEE Transactions on Emerging Topics in Computational Intelligence [Paper] [Code]
  • [MTF] Wang Y., et al. (2022). Multimodal Token Fusion for Vision Transformers. IEEE Conference on Computer Vision and Pattern Recognition. [Paper] [Code]
  • [FRNet] Zhou W., et al. (2022). FRNet: Feature Reconstruction Network for RGB-D Indoor Scene Parsing. IEEE Journal of Selected Topics in Signal Processing. [Paper] [Code]
  • [DRD] Fang T., et al. (2022). Depth Removal Distillation for RGB-D Semantic Segmentation. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). [Paper]
  • [SAMD] Zhou F., et al. (2022). Scale-aware network with modality-awareness for RGB-D indoor semantic segmentation. Neurocomputing. [Paper]
  • [BFFNet-152] He Y., et al. (2022). Bimodal Feature Propagation and Fusion for Realtime Semantic Segmentation on RGB-D Images. International Conference on Intelligent Computing and Signal Processing. [Paper]
  • [MQTransformer] Xu Y., et al. (2022). Multi-Task Learning with Multi-Query Transformer for Dense Prediction. arXiv:2203.04838 [Paper]
  • [GED] Zou W., et al. (2022). RGB-D Gate-guided edge distillation for indoor semantic segmentation. Multimedia Tools and Applications. [Paper]
  • [LDF] Chen S., et al. (2022). Learning depth-aware features for indoor scene understanding. Multimedia Tools and Applications. [Paper]
  • [PCGNet] Liu H., et al. (2022). Pyramid-Context Guided Feature Fusion for RGB-D Semantic Segmentation. IEEE International Conference on Multimedia and Expo Workshops (ICMEW). [Paper]
