Shaoshuai Shi (Postdoc)

Personal Information

Publications

2024

Conference paper

H. Wang, H. Tang, L. Jiang, S. Shi, M. F. Naeem, H. Li, B. Schiele, and L. Wang

“GiT: Towards Generalist Vision Transformer through Universal Language Interface,” in Computer Vision -- ECCV 2024, Milano, Italy, 2024.

BibTeX

@inproceedings{WangECCV24,
TITLE = {{GiT}: {T}owards Generalist Vision Transformer through Universal Language Interface},
AUTHOR = {Wang, Haiyang and Tang, Hao and Jiang, Li and Shi, Shaoshuai and Naeem, Muhammad Ferjad and Li, Hongsheng and Schiele, Bernt and Wang, Liwei},
LANGUAGE = {eng},
ISBN = {978-3-031-73396-3},
DOI = {10.1007/978-3-031-73397-0_4},
PUBLISHER = {Springer},
YEAR = {2024},
MARGINALMARK = {$\bullet$},
DATE = {2024},
BOOKTITLE = {Computer Vision -- ECCV 2024},
EDITOR = {Leonardis, Ale{\v s} and Ricci, Elisa and Roth, Stefan and Russakovsky, Olga and Sattler, Torsten and Varol, G{\"u}l},
PAGES = {55--73},
SERIES = {Lecture Notes in Computer Science},
VOLUME = {15087},
ADDRESS = {Milano, Italy},
}

Endnote

%0 Conference Proceedings
%A Wang, Haiyang
%A Tang, Hao
%A Jiang, Li
%A Shi, Shaoshuai
%A Naeem, Muhammad Ferjad
%A Li, Hongsheng
%A Schiele, Bernt
%A Wang, Liwei
%+ Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
External Organizations
Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
External Organizations
External Organizations
Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
External Organizations
%T GiT: Towards Generalist Vision Transformer through Universal Language Interface
%G eng
%U http://hdl.handle.net/21.11116/0000-000F-D87B-4
%R 10.1007/978-3-031-73397-0_4
%D 2024
%B 18th European Conference on Computer Vision 
%Z date of event: 2024-09-29 - 2024-10-04
%C Milano, Italy
%B Computer Vision -- ECCV 2024
%E Leonardis, Aleš; Ricci, Elisa; Roth, Stefan; Russakovsky, Olga; Sattler, Torsten; Varol, Gül
%P 55 - 73
%I Springer
%@ 978-3-031-73396-3
%B Lecture Notes in Computer Science
%N 15087

Conference paper

L. Jiang, S. Shi, and B. Schiele

“Open-Vocabulary 3D Semantic Segmentation with Foundation Models,” in IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2024), Seattle, WA, USA, 2024.

BibTeX

@inproceedings{Jiang_CVPR24,
TITLE = {Open-Vocabulary {3D} Semantic Segmentation with Foundation Models},
AUTHOR = {Jiang, Li and Shi, Shaoshuai and Schiele, Bernt},
LANGUAGE = {eng},
ISBN = {979-8-3503-5300-6},
DOI = {10.1109/CVPR52733.2024.02011},
PUBLISHER = {IEEE},
YEAR = {2024},
MARGINALMARK = {$\bullet$},
DATE = {2024},
BOOKTITLE = {IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2024)},
PAGES = {21284--21294},
ADDRESS = {Seattle, WA, USA},
}

Endnote

%0 Conference Proceedings
%A Jiang, Li
%A Shi, Shaoshuai
%A Schiele, Bernt
%+ Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
%T Open-Vocabulary 3D Semantic Segmentation with Foundation Models
%G eng
%U http://hdl.handle.net/21.11116/0000-0010-103B-A
%R 10.1109/CVPR52733.2024.02011
%D 2024
%B 41st IEEE/CVF Conference on Computer Vision and Pattern Recognition
%Z date of event: 2024-06-17 - 2024-06-21
%C Seattle, WA, USA
%B IEEE/CVF Conference on Computer Vision and Pattern Recognition
%P 21284 - 21294
%I IEEE
%@ 979-8-3503-5300-6

Article

S. Shi, L. Jiang, D. Dai, and B. Schiele

“MTR++: Multi-Agent Motion Prediction With Symmetric Scene Modeling and Guided Intention Querying,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 46, no. 5, 2024.

BibTeX

@article{Shi2024,
TITLE = {{MTR}++: {M}ulti-Agent Motion Prediction With Symmetric Scene Modeling and Guided Intention Querying},
AUTHOR = {Shi, Shaoshuai and Jiang, Li and Dai, Dengxin and Schiele, Bernt},
LANGUAGE = {eng},
DOI = {10.1109/TPAMI.2024.3352811},
PUBLISHER = {IEEE},
ADDRESS = {Piscataway, NJ},
YEAR = {2024},
MARGINALMARK = {$\bullet$},
DATE = {2024},
JOURNAL = {IEEE Transactions on Pattern Analysis and Machine Intelligence},
VOLUME = {46},
NUMBER = {5},
PAGES = {3955--3971},
}

Endnote

%0 Journal Article
%A Shi, Shaoshuai
%A Jiang, Li
%A Dai, Dengxin
%A Schiele, Bernt
%+ Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
%T MTR++: Multi-Agent Motion Prediction With Symmetric Scene Modeling and Guided Intention Querying
%G eng
%U http://hdl.handle.net/21.11116/0000-0010-4432-9
%R 10.1109/TPAMI.2024.3352811
%7 2024-01-12
%D 2024
%J IEEE Transactions on Pattern Analysis and Machine Intelligence
%V 46
%N 5
%& 3955
%P 3955 - 3971
%I IEEE
%C Piscataway, NJ

2023

Conference paper

L. Jiang, Z. Yang, S. Shi, V. Golyanik, D. Dai, and B. Schiele

“Self-Supervised Pre-Training With Masked Shape Prediction for 3D Scene Understanding,” in IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2023), Vancouver, Canada, 2023.

BibTeX

@inproceedings{Jiang_CVPR23,
TITLE = {Self-Supervised Pre-Training With Masked Shape Prediction for {3D} Scene Understanding},
AUTHOR = {Jiang, Li and Yang, Zetong and Shi, Shaoshuai and Golyanik, Vladislav and Dai, Dengxin and Schiele, Bernt},
LANGUAGE = {eng},
ISBN = {979-8-3503-0129-8},
DOI = {10.1109/CVPR52729.2023.00119},
PUBLISHER = {IEEE},
YEAR = {2023},
MARGINALMARK = {$\bullet$},
BOOKTITLE = {IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2023)},
PAGES = {1168--1178},
ADDRESS = {Vancouver, Canada},
}

Endnote

%0 Conference Proceedings
%A Jiang, Li
%A Yang, Zetong
%A Shi, Shaoshuai
%A Golyanik, Vladislav
%A Dai, Dengxin
%A Schiele, Bernt
%+ Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
External Organizations
Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
Visual Computing and Artificial Intelligence, MPI for Informatics, Max Planck Society
Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
%T Self-Supervised Pre-Training With Masked Shape Prediction for 3D Scene Understanding
%G eng
%U http://hdl.handle.net/21.11116/0000-000D-1F66-F
%R 10.1109/CVPR52729.2023.00119
%D 2023
%B 36th IEEE/CVF Conference on Computer Vision and Pattern Recognition
%Z date of event: 2023-06-18 - 2023-06-23
%C Vancouver, Canada
%B IEEE/CVF Conference on Computer Vision and Pattern Recognition
%P 1168 - 1178
%I IEEE
%@ 979-8-3503-0129-8

Conference paper

H. Wang, C. Shi, S. Shi, M. Lei, S. Wang, D. He, B. Schiele, and L. Wang

“DSVT: Dynamic Sparse Voxel Transformer With Rotated Sets,” in IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2023), Vancouver, Canada, 2023.

BibTeX

@inproceedings{Wang_CVPR23,
TITLE = {{DSVT}: {D}ynamic Sparse Voxel Transformer With Rotated Sets},
AUTHOR = {Wang, Haiyang and Shi, Chen and Shi, Shaoshuai and Lei, Meng and Wang, Sen and He, Di and Schiele, Bernt and Wang, Liwei},
LANGUAGE = {eng},
ISBN = {979-8-3503-0129-8},
DOI = {10.1109/CVPR52729.2023.01299},
PUBLISHER = {IEEE},
YEAR = {2023},
MARGINALMARK = {$\bullet$},
DATE = {2023},
BOOKTITLE = {IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2023)},
PAGES = {13520--13529},
ADDRESS = {Vancouver, Canada},
}

Endnote

%0 Conference Proceedings
%A Wang, Haiyang
%A Shi, Chen
%A Shi, Shaoshuai
%A Lei, Meng
%A Wang, Sen
%A He, Di
%A Schiele, Bernt
%A Wang, Liwei
%+ External Organizations
External Organizations
Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
External Organizations
External Organizations
External Organizations
Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
External Organizations
%T DSVT: Dynamic Sparse Voxel Transformer With Rotated Sets
%G eng
%U http://hdl.handle.net/21.11116/0000-000D-1F6C-9
%R 10.1109/CVPR52729.2023.01299
%D 2023
%B 36th IEEE/CVF Conference on Computer Vision and Pattern Recognition
%Z date of event: 2023-06-18 - 2023-06-23
%C Vancouver, Canada
%B IEEE/CVF Conference on Computer Vision and Pattern Recognition
%P 13520 - 13529
%I IEEE
%@ 979-8-3503-0129-8

Conference paper

H. Wu, C. Wen, S. Shi, X. Li, and C. Wang

“Virtual Sparse Convolution for Multimodal 3D Object Detection,” in IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2023), Vancouver, Canada, 2023.

BibTeX

@inproceedings{Wu_CVPR23,
TITLE = {Virtual Sparse Convolution for Multimodal {3D} Object Detection},
AUTHOR = {Wu, Hai and Wen, Chenglu and Shi, Shaoshuai and Li, Xin and Wang, Cheng},
LANGUAGE = {eng},
ISBN = {979-8-3503-0129-8},
DOI = {10.1109/CVPR52729.2023.02074},
PUBLISHER = {IEEE},
YEAR = {2023},
MARGINALMARK = {$\bullet$},
BOOKTITLE = {IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2023)},
PAGES = {21653--21662},
ADDRESS = {Vancouver, Canada},
}

Endnote

%0 Conference Proceedings
%A Wu, Hai
%A Wen, Chenglu
%A Shi, Shaoshuai
%A Li, Xin
%A Wang, Cheng
%+ External Organizations
External Organizations
Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
External Organizations
External Organizations
%T Virtual Sparse Convolution for Multimodal 3D Object Detection
%G eng
%U http://hdl.handle.net/21.11116/0000-000D-DD51-F
%R 10.1109/CVPR52729.2023.02074
%D 2023
%B 36th IEEE/CVF Conference on Computer Vision and Pattern Recognition
%Z date of event: 2023-06-18 - 2023-06-23
%C Vancouver, Canada
%B IEEE/CVF Conference on Computer Vision and Pattern Recognition
%P 21653 - 21662
%I IEEE
%@ 979-8-3503-0129-8

Conference paper

B. Zhu, Z. Wang, S. Shi, H. Xu, L. Hong, and H. Li

“ConQueR: Query Contrast Voxel-DETR for 3D Object Detection,” in IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2023), Vancouver, Canada, 2023.

BibTeX

@inproceedings{Zhu_CVPR2023,
TITLE = {{ConQueR}: {Q}uery Contrast Voxel-{DETR} for {3D} Object Detection},
AUTHOR = {Zhu, Benjin and Wang, Zhe and Shi, Shaoshuai and Xu, Hang and Hong, Lanqing and Li, Hongsheng},
LANGUAGE = {eng},
ISBN = {979-8-3503-0129-8},
DOI = {10.1109/CVPR52729.2023.00897},
PUBLISHER = {IEEE},
YEAR = {2023},
MARGINALMARK = {$\bullet$},
DATE = {2023},
BOOKTITLE = {IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2023)},
PAGES = {9296--9305},
ADDRESS = {Vancouver, Canada},
}

Endnote

%0 Conference Proceedings
%A Zhu, Benjin
%A Wang, Zhe
%A Shi, Shaoshuai
%A Xu, Hang
%A Hong, Lanqing
%A Li, Hongsheng
%+ External Organizations
External Organizations
Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
External Organizations
External Organizations
External Organizations
%T ConQueR: Query Contrast Voxel-DETR for 3D Object Detection
%G eng
%U http://hdl.handle.net/21.11116/0000-000E-00B7-3
%R 10.1109/CVPR52729.2023.00897
%D 2023
%B 36th IEEE/CVF Conference on Computer Vision and Pattern Recognition
%Z date of event: 2023-06-18 - 2023-06-23
%C Vancouver, Canada
%B IEEE/CVF Conference on Computer Vision and Pattern Recognition
%P 9296 - 9305
%I IEEE
%@ 979-8-3503-0129-8

Conference paper

X. Chen, S. Shi, C. Zhang, B. Zhu, Q. Wang, K. C. Cheung, S. See, and H. Li

“TrajectoryFormer: 3D Object Tracking Transformer with Predictive Trajectory Hypotheses,” in IEEE/CVF International Conference on Computer Vision (ICCV 2023), Paris, France, 2023.

BibTeX

@inproceedings{chen_ICCV23,
TITLE = {{TrajectoryFormer}: {3D} Object Tracking Transformer with Predictive Trajectory Hypotheses},
AUTHOR = {Chen, Xuesong and Shi, Shaoshuai and Zhang, Chao and Zhu, Benjin and Wang, Qiang and Cheung, Ka Chun and See, Simon and Li, Hongsheng},
LANGUAGE = {eng},
ISBN = {979-8-3503-0718-4},
DOI = {10.1109/ICCV51070.2023.01698},
PUBLISHER = {IEEE},
YEAR = {2023},
MARGINALMARK = {$\bullet$},
DATE = {2023},
BOOKTITLE = {IEEE/CVF International Conference on Computer Vision (ICCV 2023)},
PAGES = {18481--18490},
ADDRESS = {Paris, France},
}

Endnote

%0 Conference Proceedings
%A Chen, Xuesong
%A Shi, Shaoshuai
%A Zhang, Chao
%A Zhu, Benjin
%A Wang, Qiang
%A Cheung, Ka Chun
%A See, Simon
%A Li, Hongsheng
%+ External Organizations
Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
External Organizations
External Organizations
External Organizations
External Organizations
External Organizations
External Organizations
%T TrajectoryFormer: 3D Object Tracking Transformer with Predictive Trajectory Hypotheses
%G eng
%U http://hdl.handle.net/21.11116/0000-000D-DD54-C
%R 10.1109/ICCV51070.2023.01698
%D 2023
%B IEEE/CVF International Conference on Computer Vision
%Z date of event: 2023-10-02 - 2023-10-06
%C Paris, France
%B IEEE/CVF International Conference on Computer Vision
%P 18481 - 18490
%I IEEE
%@ 979-8-3503-0718-4

Conference paper

H. Wang, H. Tang, S. Shi, A. Li, Z. Li, B. Schiele, and L. Wang

“UniTR: A Unified and Efficient Multi-Modal Transformer for Bird’s-Eye-View Representation,” in IEEE/CVF International Conference on Computer Vision (ICCV 2023), Paris, France, 2023.

BibTeX

@inproceedings{wang_ICCV23,
TITLE = {{UniTR}: {A} Unified and Efficient Multi-Modal Transformer for Bird's-Eye-View Representation},
AUTHOR = {Wang, Haiyang and Tang, Hao and Shi, Shaoshuai and Li, Aoxue and Li, Zhenguo and Schiele, Bernt and Wang, Liwei},
LANGUAGE = {eng},
ISBN = {979-8-3503-0718-4},
DOI = {10.1109/ICCV51070.2023.00625},
PUBLISHER = {IEEE},
YEAR = {2023},
MARGINALMARK = {$\bullet$},
DATE = {2023},
BOOKTITLE = {IEEE/CVF International Conference on Computer Vision (ICCV 2023)},
PAGES = {6769--6779},
ADDRESS = {Paris, France},
}

Endnote

%0 Conference Proceedings
%A Wang, Haiyang
%A Tang, Hao
%A Shi, Shaoshuai
%A Li, Aoxue
%A Li, Zhenguo
%A Schiele, Bernt
%A Wang, Liwei
%+ External Organizations
External Organizations
Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
External Organizations
External Organizations
Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
External Organizations
%T UniTR: A Unified and Efficient Multi-Modal Transformer for Bird's-Eye-View Representation
%G eng
%U http://hdl.handle.net/21.11116/0000-000D-DD59-7
%R 10.1109/ICCV51070.2023.00625
%D 2023
%B IEEE/CVF International Conference on Computer Vision
%Z date of event: 2023-10-02 - 2023-10-06
%C Paris, France
%B IEEE/CVF International Conference on Computer Vision
%P 6769 - 6779
%I IEEE
%@ 979-8-3503-0718-4

Conference paper

Z. Li, S. Shi, B. Schiele, and D. Dai

“Test-time Domain Adaptation for Monocular Depth Estimation,” in IEEE International Conference on Robotics and Automation (ICRA 2023), London, UK, 2023.

BibTeX

@inproceedings{Li_ICRA2023,
TITLE = {Test-time Domain Adaptation for Monocular Depth Estimation},
AUTHOR = {Li, Zhi and Shi, Shaoshuai and Schiele, Bernt and Dai, Dengxin},
LANGUAGE = {eng},
ISBN = {979-8-3503-2365-8},
DOI = {10.1109/ICRA48891.2023.10161304},
PUBLISHER = {IEEE},
YEAR = {2023},
MARGINALMARK = {$\bullet$},
BOOKTITLE = {IEEE International Conference on Robotics and Automation (ICRA 2023)},
PAGES = {4873--4879},
ADDRESS = {London, UK},
}

Endnote

%0 Conference Proceedings
%A Li, Zhi
%A Shi, Shaoshuai
%A Schiele, Bernt
%A Dai, Dengxin
%+ Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
%T Test-time Domain Adaptation for Monocular Depth Estimation
%G eng
%U http://hdl.handle.net/21.11116/0000-000D-C7F3-0
%R 10.1109/ICRA48891.2023.10161304
%D 2023
%B IEEE International Conference on Robotics and Automation
%Z date of event: 2023-05-29 - 2023-06-02
%C London, UK
%B IEEE International Conference on Robotics and Automation
%P 4873 - 4879
%I IEEE
%@ 979-8-3503-2365-8

Article

J. Mao, S. Shi, X. Wang, and H. Li

“3D Object Detection for Autonomous Driving: A Comprehensive Survey,” International Journal of Computer Vision, vol. 131, 2023.

BibTeX

@article{Mao23,
TITLE = {{3D} Object Detection for Autonomous Driving: {A} Comprehensive Survey},
AUTHOR = {Mao, Jiageng and Shi, Shaoshuai and Wang, Xiaogang and Li, Hongsheng},
LANGUAGE = {eng},
ISSN = {0920-5691},
DOI = {10.1007/s11263-023-01790-1},
PUBLISHER = {Kluwer Academic Publishers},
ADDRESS = {Hingham, Mass.},
YEAR = {2023},
MARGINALMARK = {$\bullet$},
DATE = {2023},
JOURNAL = {International Journal of Computer Vision},
VOLUME = {131},
PAGES = {1909--1963},
}

Endnote

%0 Journal Article
%A Mao, Jiageng
%A Shi, Shaoshuai
%A Wang, Xiaogang
%A Li, Hongsheng
%+ External Organizations
Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
External Organizations
External Organizations
%T 3D Object Detection for Autonomous Driving: A Comprehensive Survey
%G eng
%U http://hdl.handle.net/21.11116/0000-000D-3FEF-1
%R 10.1007/s11263-023-01790-1
%7 2023-04-27
%D 2023
%J International Journal of Computer Vision
%O Int. J. Comput. Vis.
%V 131
%& 1909
%P 1909 - 1963
%I Kluwer Academic Publishers
%C Hingham, Mass.
%@ 0920-5691
%U https://rdcu.be/ddRPd

2022

Conference paper

S. Shi, L. Jiang, D. Dai, and B. Schiele

“Motion Transformer with Global Intention Localization and Local Movement Refinement,” in Advances in Neural Information Processing Systems 35 (NeurIPS 2022), New Orleans, LA, USA, 2022.

BibTeX

@inproceedings{Shi_Neurips22,
TITLE = {Motion Transformer with Global Intention Localization and Local Movement Refinement},
AUTHOR = {Shi, Shaoshuai and Jiang, Li and Dai, Dengxin and Schiele, Bernt},
LANGUAGE = {eng},
PUBLISHER = {Curran Associates, Inc.},
YEAR = {2022},
BOOKTITLE = {Advances in Neural Information Processing Systems 35 (NeurIPS 2022)},
EDITOR = {Koyejo, S. and Mohamed, S. and Agarwal, A. and Belgrave, D. and Cho, K. and Oh, A.},
PAGES = {6531--6543},
ADDRESS = {New Orleans, LA, USA},
}

Endnote

%0 Conference Proceedings
%A Shi, Shaoshuai
%A Jiang, Li
%A Dai, Dengxin
%A Schiele, Bernt
%+ Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
%T Motion Transformer with Global Intention Localization and Local Movement Refinement
%G eng
%U http://hdl.handle.net/21.11116/0000-000C-1853-C
%D 2022
%B 36th Conference on Neural Information Processing Systems
%Z date of event: 2022-11-28 - 2022-12-09
%C New Orleans, LA, USA
%B Advances in Neural Information Processing Systems 35
%E Koyejo, S.; Mohamed, S.; Agarwal, A.; Belgrave, D.; Cho, K.; Oh, A.
%P 6531 - 6543
%I Curran Associates, Inc.
%U https://openreview.net/forum?id=9t-j3xDm7_Q

Conference paper

H. Wang, L. Ding, S. Dong, S. Shi, A. Li, J. Li, Z. Li, and L. Wang

“CAGroup3D: Class-Aware Grouping for 3D Object Detection on Point Clouds,” in Advances in Neural Information Processing Systems 35 (NeurIPS 2022), New Orleans, LA, USA, 2022.

BibTeX

@inproceedings{Shi_Neurips22C,
TITLE = {{CAGroup3D}: {C}lass-Aware Grouping for {3D} Object Detection on Point Clouds},
AUTHOR = {Wang, Haiyang and Ding, Lihe and Dong, Shaocong and Shi, Shaoshuai and Li, Aoxue and Li, Jianan and Li, Zhenguo and Wang, Liwei},
LANGUAGE = {eng},
PUBLISHER = {Curran Associates, Inc.},
YEAR = {2022},
BOOKTITLE = {Advances in Neural Information Processing Systems 35 (NeurIPS 2022)},
EDITOR = {Koyejo, S. and Mohamed, S. and Agarwal, A. and Belgrave, D. and Cho, K. and Oh, A.},
PAGES = {29975--29988},
ADDRESS = {New Orleans, LA, USA},
}

Endnote

%0 Conference Proceedings
%A Wang, Haiyang
%A Ding, Lihe
%A Dong, Shaocong
%A Shi, Shaoshuai
%A Li, Aoxue
%A Li, Jianan
%A Li, Zhenguo
%A Wang, Liwei
%+ External Organizations
External Organizations
External Organizations
Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
External Organizations
External Organizations
External Organizations
External Organizations
%T CAGroup3D: Class-Aware Grouping for 3D Object Detection on Point Clouds
%G eng
%U http://hdl.handle.net/21.11116/0000-000C-72EC-A
%D 2022
%B 36th Conference on Neural Information Processing Systems
%Z date of event: 2022-11-28 - 2022-12-09
%C New Orleans, LA, USA
%B Advances in Neural Information Processing Systems 35
%E Koyejo, S.; Mohamed, S.; Agarwal, A.; Belgrave, D.; Cho, K.; Oh, A.
%P 29975 - 29988
%I Curran Associates, Inc.
%U https://openreview.net/forum?id=nLKkHwYP4Au

Conference paper

J. Yang, S. Shi, R. Ding, Z. Wang, and X. Qi

“Towards Efficient 3D Object Detection with Knowledge Distillation,” in Advances in Neural Information Processing Systems 35 (NeurIPS 2022), New Orleans, LA, USA, 2022.

BibTeX

@inproceedings{Shi_Neurips22B,
TITLE = {Towards Efficient {3D} Object Detection with Knowledge Distillation},
AUTHOR = {Yang, Jihan and Shi, Shaoshuai and Ding, Runyu and Wang, Zhe and Qi, Xiaojuan},
LANGUAGE = {eng},
PUBLISHER = {Curran Associates, Inc.},
YEAR = {2022},
BOOKTITLE = {Advances in Neural Information Processing Systems 35 (NeurIPS 2022)},
EDITOR = {Koyejo, S. and Mohamed, S. and Agarwal, A. and Belgrave, D. and Cho, K. and Oh, A.},
PAGES = {21300--21313},
ADDRESS = {New Orleans, LA, USA},
}

Endnote

%0 Conference Proceedings
%A Yang, Jihan
%A Shi, Shaoshuai
%A Ding, Runyu
%A Wang, Zhe
%A Qi, Xiaojuan
%+ External Organizations
Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
External Organizations
External Organizations
External Organizations
%T Towards Efficient 3D Object Detection with Knowledge Distillation
%G eng
%U http://hdl.handle.net/21.11116/0000-000C-72E9-D
%D 2022
%B 36th Conference on Neural Information Processing Systems
%Z date of event: 2022-11-28 - 2022-12-09
%C New Orleans, LA, USA
%B Advances in Neural Information Processing Systems 35
%E Koyejo, S.; Mohamed, S.; Agarwal, A.; Belgrave, D.; Cho, K.; Oh, A.
%P 21300 - 21313
%I Curran Associates, Inc.
%U https://openreview.net/forum?id=1tnVNogPUz9

Conference paper

X. Chen, S. Shi, B. Zhu, K. C. Cheung, H. Xu, and H. Li

“MPPNet: Multi-frame Feature Intertwining with Proxy Points for 3D Temporal Object Detection,” in Computer Vision -- ECCV 2022, Tel Aviv, Israel, 2022.

BibTeX

@inproceedings{Chen_ECCV2022x,
TITLE = {{MPPNet}: {M}ulti-frame Feature Intertwining with Proxy Points for {3D} Temporal Object Detection},
AUTHOR = {Chen, Xuesong and Shi, Shaoshuai and Zhu, Benjin and Cheung, Ka Chun and Xu, Hang and Li, Hongsheng},
LANGUAGE = {eng},
ISBN = {978-3-031-20073-1},
DOI = {10.1007/978-3-031-20074-8_39},
PUBLISHER = {Springer},
YEAR = {2022},
DATE = {2022},
BOOKTITLE = {Computer Vision -- ECCV 2022},
EDITOR = {Avidan, Shai and Brostow, Gabriel and Ciss{\'e}, Moustapha and Farinella, Giovanni Maria and Hassner, Tal},
PAGES = {680--697},
SERIES = {Lecture Notes in Computer Science},
VOLUME = {13668},
ADDRESS = {Tel Aviv, Israel},
}

Endnote

%0 Conference Proceedings
%A Chen, Xuesong
%A Shi, Shaoshuai
%A Zhu, Benjin
%A Cheung, Ka Chun
%A Xu, Hang
%A Li, Hongsheng
%+ External Organizations
Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
External Organizations
External Organizations
External Organizations
External Organizations
%T MPPNet: Multi-frame Feature Intertwining with Proxy Points for 3D Temporal Object Detection
%G eng
%U http://hdl.handle.net/21.11116/0000-000C-72F6-E
%R 10.1007/978-3-031-20074-8_39
%D 2022
%B 17th European Conference on Computer Vision
%Z date of event: 2022-10-23 - 2022-10-27
%C Tel Aviv, Israel
%B Computer Vision -- ECCV 2022
%E Avidan, Shai; Brostow, Gabriel; Cissé, Moustapha; Farinella, Giovanni Maria; Hassner, Tal
%P 680 - 697
%I Springer
%@ 978-3-031-20073-1
%B Lecture Notes in Computer Science
%N 13668
%U https://rdcu.be/c33Eb

Conference paper

H. Wang, S. Shi, Z. Yang, R. Fang, Q. Qian, H. Li, B. Schiele, and L. Wang

“RBGNet: Ray-based Grouping for 3D Object Detection,” in IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2022), New Orleans, LA, USA, 2022.

BibTeX

@inproceedings{Wang_CVPR22c,
TITLE = {{RBGNet}: {R}ay-based Grouping for {3D} Object Detection},
AUTHOR = {Wang, Haiyang and Shi, Shaoshuai and Yang, Ze and Fang, Rongyao and Qian, Qi and Li, Hongsheng and Schiele, Bernt and Wang, Liwei},
LANGUAGE = {eng},
ISBN = {978-1-6654-6946-3},
DOI = {10.1109/CVPR52688.2022.00118},
PUBLISHER = {IEEE},
YEAR = {2022},
BOOKTITLE = {IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2022)},
PAGES = {1100--1109},
ADDRESS = {New Orleans, LA, USA},
}

Endnote

%0 Conference Proceedings
%A Wang, Haiyang
%A Shi, Shaoshuai
%A Yang, Ze
%A Fang, Rongyao
%A Qian, Qi
%A Li, Hongsheng
%A Schiele, Bernt
%A Wang, Liwei
%+ External Organizations
Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
External Organizations
External Organizations
External Organizations
External Organizations
Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
External Organizations
%T RBGNet: Ray-based Grouping for 3D Object Detection
%G eng
%U http://hdl.handle.net/21.11116/0000-000C-1833-0
%R 10.1109/CVPR52688.2022.00118
%D 2022
%B 35th IEEE/CVF Conference on Computer Vision and Pattern Recognition
%Z date of event: 2022-06-19 - 2022-06-24
%C New Orleans, LA, USA
%B IEEE/CVF Conference on Computer Vision and Pattern Recognition
%P 1100 - 1109
%I IEEE
%@ 978-1-6654-6946-3

Article

S. Shi, L. Jiang, J. Deng, Z. Wang, C. Guo, J. Shi, X. Wang, and H. Li

“PV-RCNN++: Point-Voxel Feature Set Abstraction With Local Vector Representation for 3D Object Detection,” International Journal of Computer Vision, vol. 131, 2022.

BibTeX

@article{Shi2022x,
TITLE = {{PV-RCNN}++: {P}oint-Voxel Feature Set Abstraction With Local Vector Representation for {3D} Object Detection},
AUTHOR = {Shi, Shaoshuai and Jiang, Li and Deng, Jiajun and Wang, Zhe and Guo, Chaoxu and Shi, Jianping and Wang, Xiaogang and Li, Hongsheng},
LANGUAGE = {eng},
ISSN = {0920-5691},
URL = {https://rdcu.be/c14JE},
DOI = {10.1007/s11263-022-01710-9},
PUBLISHER = {Springer},
ADDRESS = {New York, NY},
YEAR = {2022},
JOURNAL = {International Journal of Computer Vision},
VOLUME = {131},
PAGES = {531--551},
}

Endnote

%0 Journal Article
%A Shi, Shaoshuai
%A Jiang, Li
%A Deng, Jiajun
%A Wang, Zhe
%A Guo, Chaoxu
%A Shi, Jianping
%A Wang, Xiaogang
%A Li, Hongsheng
%+ Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
External Organizations
External Organizations
External Organizations
External Organizations
External Organizations
External Organizations
%T PV-RCNN++: Point-Voxel Feature Set Abstraction With Local Vector Representation for 3D Object Detection
%G eng
%U http://hdl.handle.net/21.11116/0000-000C-11CC-B
%R 10.1007/s11263-022-01710-9
%U https://rdcu.be/c14JE
%7 2022
%D 2022
%J International Journal of Computer Vision
%O Int. J. Comput. Vis.
%V 131
%& 531
%P 531 - 551
%I Springer
%C New York, NY
%@ 0920-5691

Paper

S. Shi, L. Jiang, D. Dai, and B. Schiele

“MTR-A: 1st Place Solution for 2022 Waymo Open Dataset Challenge -- Motion Prediction,” 2022. [Online]. Available: https://arxiv.org/abs/2209.10033.

Abstract

In this report, we present the 1st place solution for motion prediction track
in 2022 Waymo Open Dataset Challenges. We propose a novel Motion Transformer
framework for multimodal motion prediction, which introduces a small set of
novel motion query pairs for generating better multimodal future trajectories
by jointly performing the intention localization and iterative motion
refinement. A simple model ensemble strategy with non-maximum-suppression is
adopted to further boost the final performance. Our approach achieves the 1st
place on the motion prediction leaderboard of 2022 Waymo Open Dataset
Challenges, outperforming other methods with remarkable margins. Code will be
available at github.com/sshaoshuai/MTR.

BibTeX

@online{Shi2209.10033,
TITLE = {{MTR}-A: 1st Place Solution for 2022 Waymo Open Dataset Challenge -- Motion Prediction},
AUTHOR = {Shi, Shaoshuai and Jiang, Li and Dai, Dengxin and Schiele, Bernt},
LANGUAGE = {eng},
URL = {https://arxiv.org/abs/2209.10033},
EPRINT = {2209.10033},
EPRINTTYPE = {arXiv},
YEAR = {2022},
ABSTRACT = {In this report, we present the 1st place solution for motion prediction track in 2022 Waymo Open Dataset Challenges. We propose a novel Motion Transformer framework for multimodal motion prediction, which introduces a small set of novel motion query pairs for generating better multimodal future trajectories by jointly performing the intention localization and iterative motion refinement. A simple model ensemble strategy with non-maximum-suppression is adopted to further boost the final performance. Our approach achieves the 1st place on the motion prediction leaderboard of 2022 Waymo Open Dataset Challenges, outperforming other methods with remarkable margins. Code will be available at https://github.com/sshaoshuai/MTR.},
}

Endnote

%0 Report
%A Shi, Shaoshuai
%A Jiang, Li
%A Dai, Dengxin
%A Schiele, Bernt
%+ Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
%T MTR-A: 1st Place Solution for 2022 Waymo Open Dataset Challenge -- Motion Prediction
%G eng
%U http://hdl.handle.net/21.11116/0000-000C-184C-5
%U https://arxiv.org/abs/2209.10033
%D 2022
%X In this report, we present the 1st place solution for motion prediction track in 2022 Waymo Open Dataset Challenges. We propose a novel Motion Transformer framework for multimodal motion prediction, which introduces a small set of novel motion query pairs for generating better multimodal future trajectories by jointly performing the intention localization and iterative motion refinement. A simple model ensemble strategy with non-maximum-suppression is adopted to further boost the final performance. Our approach achieves the 1st place on the motion prediction leaderboard of 2022 Waymo Open Dataset Challenges, outperforming other methods with remarkable margins. Code will be available at https://github.com/sshaoshuai/MTR.
%K Computer Science, Computer Vision and Pattern Recognition, cs.CV