Yiting Xia

Dr. Yiting Xia

Address
Max-Planck-Institut für Informatik
Saarland Informatics Campus
Campus E1 4
66123 Saarbrücken
Location
E1 4 - 526
Phone
+49 681 9325 3901
Fax
+49 681 9325 5719

Personal Information

I am a tenure-track faculty member at the Max Planck Institute for Informatics (MPI-INF), where I lead the Networks and Cloud Systems research group.

Before joining MPI-INF, I was a research scientist at Facebook. I received my PhD in Computer Science from Rice University in 2018. I received my Bachelor’s degree in Telecommunications Engineering with First Class Honors in 2011 from a joint program between Beijing University of Posts and Telecommunications (BUPT) and Queen Mary University of London (QMUL).

 

For more information, please visit my personal website: https://sites.google.com/view/yitingxia

Publications

2024
De Marchi, F., Bai, W., Li, J., & Xia, Y. (2024). Rethinking Transport Protocols for Reconfigurable Data Centers: An Empirical Study. In HotOptics ’24, 1st SIGCOMM Workshop on Hot Topics in Optical Technologies and Applications in Networking. Sydney, Australia: ACM. doi:10.1145/3672201.367412
Export
BibTeX
@inproceedings{DeMarchi_HotOPTICS24, TITLE = {Rethinking Transport Protocols for Reconfigurable Data Centers: {A}n Empirical Study}, AUTHOR = {De Marchi, Federico and Bai, Wei and Li, Jialong and Xia, Yiting}, LANGUAGE = {eng}, ISBN = {979-8-4007-0716-2}, DOI = {10.1145/3672201.367412}, PUBLISHER = {ACM}, YEAR = {2024}, MARGINALMARK = {$\bullet$}, DATE = {2024}, BOOKTITLE = {HotOptics '24, 1st SIGCOMM Workshop on Hot Topics in Optical Technologies and Applications in Networking}, PAGES = {7--13}, ADDRESS = {Sydney, Australia}, }
Endnote
%0 Conference Proceedings %A De Marchi, Federico %A Bai, Wei %A Li, Jialong %A Xia, Yiting %+ Networks and Cloud Systems, MPI for Informatics, Max Planck Society External Organizations Networks and Cloud Systems, MPI for Informatics, Max Planck Society Networks and Cloud Systems, MPI for Informatics, Max Planck Society %T Rethinking Transport Protocols for Reconfigurable Data Centers: An Empirical Study : %G eng %U http://hdl.handle.net/21.11116/0000-0010-2E36-F %R 10.1145/3672201.367412 %D 2024 %B 1st SIGCOMM Workshop on Hot Topics in Optical Technologies and Applications in Networking %Z date of event: 2024-08-04 - 2024-08-08 %C Sydney, Australia %B HotOptics '24 %P 7 - 13 %I ACM %@ 979-8-4007-0716-2
Lei, Y., De Marchi, F., Joshi, R., Li, J., Chandrasekaran, B., & Xia, Y. (2024). DEMO: An Open Research Framework for Optical Data Center Networks. In ACM SIGCOMM Posters and Demos ’24. Sydney, Australia: ACM. doi:10.1145/3672202.3673712
Export
BibTeX
@inproceedings{Lei_SIGCOMM, TITLE = {{DEMO}: {A}n Open Research Framework for Optical Data Center Networks}, AUTHOR = {Lei, Yiming and De Marchi, Federico and Joshi, Raj and Li, Jialong and Chandrasekaran, Balakrishnan and Xia, Yiting}, LANGUAGE = {eng}, ISBN = {979-8-4007-0717-9}, DOI = {10.1145/3672202.3673712}, PUBLISHER = {ACM}, YEAR = {2024}, MARGINALMARK = {$\bullet$}, DATE = {2024}, BOOKTITLE = {ACM SIGCOMM Posters and Demos '24}, PAGES = {86--88}, ADDRESS = {Sydney, Australia}, }
Endnote
%0 Conference Proceedings %A Lei, Yiming %A De Marchi, Federico %A Joshi, Raj %A Li, Jialong %A Chandrasekaran, Balakrishnan %A Xia, Yiting %+ Networks and Cloud Systems, MPI for Informatics, Max Planck Society Networks and Cloud Systems, MPI for Informatics, Max Planck Society External Organizations Networks and Cloud Systems, MPI for Informatics, Max Planck Society External Organizations Networks and Cloud Systems, MPI for Informatics, Max Planck Society %T DEMO: An Open Research Framework for Optical Data Center Networks : %G eng %U http://hdl.handle.net/21.11116/0000-0010-43CF-A %R 10.1145/3672202.3673712 %D 2024 %B 38th ACM SIGCOMM %Z date of event: 2024-08-04 - 2024-08-08 %C Sydney, Australia %B ACM SIGCOMM Posters and Demos '24 %P 86 - 88 %I ACM %@ 979-8-4007-0717-9
Lei, Y., Li, J., Liu, Z., Joshi, R., & Xia, Y. (2024). Nanosecond Precision Time Synchronization for Optical Data Center Networks. Retrieved from https://arxiv.org/abs/2410.17012
(arXiv: 2410.17012)
Abstract
Optical data center networks (DCNs) are renovating the infrastructure design for the cloud in the post Moore's law era. The fact that optical DCNs rely on optical circuits of microsecond-scale durations makes nanosecond-precision time synchronization essential for the correct functioning of routing on the network fabric. However, current studies on optical DCNs neglect the fundamental need for accurate time synchronization. In this paper, we bridge the gap by developing Nanosecond Optical Synchronization (NOS), the first nanosecond-precision synchronization solution for optical DCNs general to various optical hardware. NOS builds clock propagation trees on top of the dynamically reconfigured circuits in optical DCNs, allowing switches to seek better sync parents throughout time. It predicts drifts in the tree-building process, which enables minimization of sync errors. We also tailor today's sync protocols to the needs of optical DCNs, including reducing the number of sync messages to fit into short circuit durations and correcting timestamp errors for higher sync accuracy. Our implementation on programmable switches shows 28ns sync accuracy in a 192-ToR setting.
Export
BibTeX
@online{Lei_2410.17012, TITLE = {Nanosecond Precision Time Synchronization for Optical Data Center Networks}, AUTHOR = {Lei, Yiming and Li, Jialong and Liu, Zhengqing and Joshi, Raj and Xia, Yiting}, LANGUAGE = {eng}, URL = {https://arxiv.org/abs/2410.17012}, EPRINT = {2410.17012}, EPRINTTYPE = {arXiv}, YEAR = {2024}, MARGINALMARK = {$\bullet$}, DATE = {2024}, ABSTRACT = {Optical data center networks (DCNs) are renovating the infrastructure design for the cloud in the post Moore's law era. The fact that optical DCNs rely on optical circuits of microsecond-scale durations makes nanosecond-precision time synchronization essential for the correct functioning of routing on the network fabric. However, current studies on optical DCNs neglect the fundamental need for accurate time synchronization. In this paper, we bridge the gap by developing Nanosecond Optical Synchronization (NOS), the first nanosecond-precision synchronization solution for optical DCNs general to various optical hardware. NOS builds clock propagation trees on top of the dynamically reconfigured circuits in optical DCNs, allowing switches to seek better sync parents throughout time. It predicts drifts in the tree-building process, which enables minimization of sync errors. We also tailor today's sync protocols to the needs of optical DCNs, including reducing the number of sync messages to fit into short circuit durations and correcting timestamp errors for higher sync accuracy. Our implementation on programmable switches shows 28ns sync accuracy in a 192-ToR setting.}, }
Endnote
%0 Report %A Lei, Yiming %A Li, Jialong %A Liu, Zhengqing %A Joshi, Raj %A Xia, Yiting %+ Networks and Cloud Systems, MPI for Informatics, Max Planck Society Networks and Cloud Systems, MPI for Informatics, Max Planck Society External Organizations External Organizations Networks and Cloud Systems, MPI for Informatics, Max Planck Society %T Nanosecond Precision Time Synchronization for Optical Data Center Networks : %G eng %U http://hdl.handle.net/21.11116/0000-0010-43D5-2 %U https://arxiv.org/abs/2410.17012 %D 2024 %X Optical data center networks (DCNs) are renovating the infrastructure design for the cloud in the post Moore's law era. The fact that optical DCNs rely on optical circuits of microsecond-scale durations makes nanosecond-precision time synchronization essential for the correct functioning of routing on the network fabric. However, current studies on optical DCNs neglect the fundamental need for accurate time synchronization. In this paper, we bridge the gap by developing Nanosecond Optical Synchronization (NOS), the first nanosecond-precision synchronization solution for optical DCNs general to various optical hardware. NOS builds clock propagation trees on top of the dynamically reconfigured circuits in optical DCNs, allowing switches to seek better sync parents throughout time. It predicts drifts in the tree-building process, which enables minimization of sync errors. We also tailor today's sync protocols to the needs of optical DCNs, including reducing the number of sync messages to fit into short circuit durations and correcting timestamp errors for higher sync accuracy. Our implementation on programmable switches shows 28ns sync accuracy in a 192-ToR setting. %K Computer Science, Networking and Internet Architecture, cs.NI
Xing, J., Hsu, K.-F., Xia, Y., Cai, Y., Li, Y., Zhang, Y., & Chen, A. (2024). Occam: A Programming System for Reliable Network Management. In EuroSys ’24, Nineteenth European Conference on Computer Systems. Athens, Greece: ACM. doi:10.1145/3627703.3650086
Export
BibTeX
@inproceedings{Xing_EuroSys24, TITLE = {Occam: {A} Programming System for Reliable Network Management}, AUTHOR = {Xing, Jiarong and Hsu, Kuo-Feng and Xia, Yiting and Cai, Yan and Li, Yanping and Zhang, Ying and Chen, Ang}, LANGUAGE = {eng}, ISBN = {979-8-4007-0437-6}, DOI = {10.1145/3627703.3650086}, PUBLISHER = {ACM}, YEAR = {2024}, MARGINALMARK = {$\bullet$}, DATE = {2024}, BOOKTITLE = {EuroSys '24, Nineteenth European Conference on Computer Systems}, PAGES = {148--162}, ADDRESS = {Athens, Greece}, }
Endnote
%0 Conference Proceedings %A Xing, Jiarong %A Hsu, Kuo-Feng %A Xia, Yiting %A Cai, Yan %A Li, Yanping %A Zhang, Ying %A Chen, Ang %+ External Organizations External Organizations Networks and Cloud Systems, MPI for Informatics, Max Planck Society External Organizations External Organizations External Organizations External Organizations %T Occam: A Programming System for Reliable Network Management : %G eng %U http://hdl.handle.net/21.11116/0000-0010-43CB-E %R 10.1145/3627703.3650086 %D 2024 %B Nineteenth European Conference on Computer Systems %Z date of event: 2024-04-22 - 2024-04-25 %C Athens, Greece %B EuroSys '24 %P 148 - 162 %I ACM %@ 979-8-4007-0437-6
Li, J., Tripathi, S., Rastogi, L., Lei, Y., Pan, R., & Xia, Y. (2024). Optimizing Mixture-of-Experts Inference Time Combining Model Deployment and Communication Scheduling. Retrieved from https://arxiv.org/abs/2410.17043
(arXiv: 2410.17043)
Abstract
As machine learning models scale in size and complexity, their computational requirements become a significant barrier. Mixture-of-Experts (MoE) models alleviate this issue by selectively activating relevant experts. Despite this, MoE models are hindered by high communication overhead from all-to-all operations, low GPU utilization due to the synchronous communication constraint, and complications from heterogeneous GPU environments. This paper presents Aurora, which optimizes both model deployment and all-to-all communication scheduling to address these challenges in MoE inference. Aurora achieves minimal communication times by strategically ordering token transmissions in all-to-all communications. It improves GPU utilization by colocating experts from different models on the same device, avoiding the limitations of synchronous all-to-all communication. We analyze Aurora's optimization strategies theoretically across four common GPU cluster settings: exclusive vs. colocated models on GPUs, and homogeneous vs. heterogeneous GPUs. Aurora provides optimal solutions for three cases, and for the remaining NP-hard scenario, it offers a polynomial-time sub-optimal solution with only a 1.07x degradation from the optimal. Aurora is the first approach to minimize MoE inference time via optimal model deployment and communication scheduling across various scenarios. Evaluations demonstrate that Aurora significantly accelerates inference, achieving speedups of up to 2.38x in homogeneous clusters and 3.54x in heterogeneous environments. Moreover, Aurora enhances GPU utilization by up to 1.5x compared to existing methods.
Export
BibTeX
@online{Li_2410.17043, TITLE = {Optimizing Mixture-of-Experts Inference Time Combining Model Deployment and Communication Scheduling}, AUTHOR = {Li, Jialong and Tripathi, Shreyansh and Rastogi, Lakshay and Lei, Yiming and Pan, Rui and Xia, Yiting}, LANGUAGE = {eng}, URL = {https://arxiv.org/abs/2410.17043}, EPRINT = {2410.17043}, EPRINTTYPE = {arXiv}, YEAR = {2024}, MARGINALMARK = {$\bullet$}, ABSTRACT = {As machine learning models scale in size and complexity, their computational requirements become a significant barrier. Mixture-of-Experts (MoE) models alleviate this issue by selectively activating relevant experts. Despite this, MoE models are hindered by high communication overhead from all-to-all operations, low GPU utilization due to the synchronous communication constraint, and complications from heterogeneous GPU environments. This paper presents Aurora, which optimizes both model deployment and all-to-all communication scheduling to address these challenges in MoE inference. Aurora achieves minimal communication times by strategically ordering token transmissions in all-to-all communications. It improves GPU utilization by colocating experts from different models on the same device, avoiding the limitations of synchronous all-to-all communication. We analyze Aurora's optimization strategies theoretically across four common GPU cluster settings: exclusive vs. colocated models on GPUs, and homogeneous vs. heterogeneous GPUs. Aurora provides optimal solutions for three cases, and for the remaining NP-hard scenario, it offers a polynomial-time sub-optimal solution with only a 1.07x degradation from the optimal. Aurora is the first approach to minimize MoE inference time via optimal model deployment and communication scheduling across various scenarios. Evaluations demonstrate that Aurora significantly accelerates inference, achieving speedups of up to 2.38x in homogeneous clusters and 3.54x in heterogeneous environments. Moreover, Aurora enhances GPU utilization by up to 1.5x compared to existing methods.}, }
Endnote
%0 Report %A Li, Jialong %A Tripathi, Shreyansh %A Rastogi, Lakshay %A Lei, Yiming %A Pan, Rui %A Xia, Yiting %+ Networks and Cloud Systems, MPI for Informatics, Max Planck Society External Organizations External Organizations Networks and Cloud Systems, MPI for Informatics, Max Planck Society External Organizations Networks and Cloud Systems, MPI for Informatics, Max Planck Society %T Optimizing Mixture-of-Experts Inference Time Combining Model Deployment and Communication Scheduling : %G eng %U http://hdl.handle.net/21.11116/0000-0010-43DB-C %U https://arxiv.org/abs/2410.17043 %D 2024 %8 22.10.2024 %X As machine learning models scale in size and complexity, their computational requirements become a significant barrier. Mixture-of-Experts (MoE) models alleviate this issue by selectively activating relevant experts. Despite this, MoE models are hindered by high communication overhead from all-to-all operations, low GPU utilization due to the synchronous communication constraint, and complications from heterogeneous GPU environments. This paper presents Aurora, which optimizes both model deployment and all-to-all communication scheduling to address these challenges in MoE inference. Aurora achieves minimal communication times by strategically ordering token transmissions in all-to-all communications. It improves GPU utilization by colocating experts from different models on the same device, avoiding the limitations of synchronous all-to-all communication. We analyze Aurora's optimization strategies theoretically across four common GPU cluster settings: exclusive vs. colocated models on GPUs, and homogeneous vs. heterogeneous GPUs. Aurora provides optimal solutions for three cases, and for the remaining NP-hard scenario, it offers a polynomial-time sub-optimal solution with only a 1.07x degradation from the optimal. Aurora is the first approach to minimize MoE inference time via optimal model deployment and communication scheduling across various scenarios. Evaluations demonstrate that Aurora significantly accelerates inference, achieving speedups of up to 2.38x in homogeneous clusters and 3.54x in heterogeneous environments. Moreover, Aurora enhances GPU utilization by up to 1.5x compared to existing methods. %K Computer Science, Learning, cs.LG, Computer Science, Networking and Internet Architecture, cs.NI
De Marchi, F., Li, J., Bai, W., & Xia, Y. (2024). POSTER: Opportunistic Credit-Based Transport for Reconfigurable Data Center Networks with Tidal. In ACM SIGCOMM Posters and Demos ’24. Sydney, Australia: ACM. doi:10.1145/3672202.3673714
Export
BibTeX
@inproceedings{DeMarchi_SIGCOMM, TITLE = {{POSTER}: Opportunistic Credit-Based Transport for Reconfigurable Data Center Networks with Tidal}, AUTHOR = {De Marchi, Federico and Li, Jialong and Bai, Wei and Xia, Yiting}, LANGUAGE = {eng}, ISBN = {979-8-4007-0717-9}, DOI = {10.1145/3672202.3673714}, PUBLISHER = {ACM}, YEAR = {2024}, MARGINALMARK = {$\bullet$}, DATE = {2024}, BOOKTITLE = {ACM SIGCOMM Posters and Demos '24}, PAGES = {4--6}, ADDRESS = {Sydney, Australia}, }
Endnote
%0 Conference Proceedings %A De Marchi, Federico %A Li, Jialong %A Bai, Wei %A Xia, Yiting %+ Networks and Cloud Systems, MPI for Informatics, Max Planck Society Networks and Cloud Systems, MPI for Informatics, Max Planck Society External Organizations Networks and Cloud Systems, MPI for Informatics, Max Planck Society %T POSTER: Opportunistic Credit-Based Transport for Reconfigurable Data Center Networks with Tidal : %G eng %U http://hdl.handle.net/21.11116/0000-0010-41E5-2 %R 10.1145/3672202.3673714 %D 2024 %B 38th ACM SIGCOMM %Z date of event: 2024-08-04 - 2024-08-08 %C Sydney, Australia %B ACM SIGCOMM Posters and Demos '24 %P 4 - 6 %I ACM %@ 979-8-4007-0717-9
Li, J., Gong, H., De Marchi, F., Gong, A., Lei, Y., Bai, W., & Xia, Y. (2024). Uniform-Cost Multi-Path Routing for Reconfigurable Data Center Networks. In ACM SIGCOMM ’24. Sydney, Australia: ACM. doi:10.1145/3651890.3672245
Export
BibTeX
@inproceedings{LiSIGCOMM24, TITLE = {Uniform-Cost Multi-Path Routing for Reconfigurable Data Center Networks}, AUTHOR = {Li, Jialong and Gong, Haotian and De Marchi, Federico and Gong, Aoyu and Lei, Yiming and Bai, Wei and Xia, Yiting}, LANGUAGE = {eng}, ISBN = {979-8-4007-0614-1}, DOI = {10.1145/3651890.3672245}, PUBLISHER = {ACM}, YEAR = {2024}, MARGINALMARK = {$\bullet$}, DATE = {2024}, BOOKTITLE = {ACM SIGCOMM '24}, PAGES = {433--448}, ADDRESS = {Sydney, Australia}, }
Endnote
%0 Conference Proceedings %A Li, Jialong %A Gong, Haotian %A De Marchi, Federico %A Gong, Aoyu %A Lei, Yiming %A Bai, Wei %A Xia, Yiting %+ Networks and Cloud Systems, MPI for Informatics, Max Planck Society External Organizations Networks and Cloud Systems, MPI for Informatics, Max Planck Society External Organizations External Organizations External Organizations Networks and Cloud Systems, MPI for Informatics, Max Planck Society %T Uniform-Cost Multi-Path Routing for Reconfigurable Data Center Networks : %G eng %U http://hdl.handle.net/21.11116/0000-000F-E324-8 %R 10.1145/3651890.3672245 %D 2024 %B ACM SIGCOMM Conference %Z date of event: 2024-08-04 - 2024-08-08 %C Sydney, Australia %B ACM SIGCOMM '24 %P 433 - 448 %I ACM %@ 979-8-4007-0614-1
2022
Li, J., Lei, Y., De Marchi, F., Joshi, R., Chandrasekaran, B., & Xia, Y. (2022). Hop-On Hop-Off Routing: A Fast Tour across the Optical Data Center Network for Latency-Sensitive Flow. In 6th Asia-Pacific Workshop on Networking (APNet 2022). Fuzhou, China: ACM. doi:10.1145/3542637.3542647
Export
BibTeX
@inproceedings{Li_APNet2022, TITLE = {{Hop-On Hop-Off} Routing: {A} Fast Tour across the Optical Data Center Network for Latency-Sensitive Flow}, AUTHOR = {Li, Jialong and Lei, Yiming and De Marchi, Federico and Joshi, Raj and Chandrasekaran, Balakrishnan and Xia, Yiting}, LANGUAGE = {eng}, DOI = {10.1145/3542637.3542647}, PUBLISHER = {ACM}, YEAR = {2022}, BOOKTITLE = {6th Asia-Pacific Workshop on Networking (APNet 2022)}, ADDRESS = {Fuzhou, China}, }
Endnote
%0 Conference Proceedings %A Li, Jialong %A Lei, Yiming %A De Marchi, Federico %A Joshi, Raj %A Chandrasekaran, Balakrishnan %A Xia, Yiting %+ Networks and Cloud Systems, MPI for Informatics, Max Planck Society Networks and Cloud Systems, MPI for Informatics, Max Planck Society External Organizations External Organizations External Organizations Networks and Cloud Systems, MPI for Informatics, Max Planck Society %T Hop-On Hop-Off Routing: A Fast Tour across the Optical Data Center Network for Latency-Sensitive Flow : %G eng %U http://hdl.handle.net/21.11116/0000-000C-7CD6-8 %R 10.1145/3542637.3542647 %D 2022 %B 6th Asia-Pacific Workshop on Networking %Z date of event: 2022-07-01 - 2022-07-02 %C Fuzhou, China %B 6th Asia-Pacific Workshop on Networking %I ACM
Lin, G. C., Schoenfeld, I., Thompson, M., Xia, Y., Uz-Bilgin, C., & Leech, K. (2022). “What Color Are the Fish’s Scales?” Exploring Parents’ and Children’s Natural Interactions with a Child-Friendly Virtual Agent during Storybook Reading. In Proceedings of IDC ’22. Braga, Portugal: ACM. doi:10.1145/3501712.3529734
Export
BibTeX
@inproceedings{Lin_IDC22, TITLE = {{\textquotedblleft}What Color Are the Fish{\textquoteright}s Scales?{\textquotedblright} Exploring Parents{\textquoteright} and Children{\textquoteright}s Natural Interactions with a Child-Friendly Virtual Agent during Storybook Reading}, AUTHOR = {Lin, Grace C. and Schoenfeld, Ilana and Thompson, Meredith and Xia, Yiting and Uz-Bilgin, Cigdem and Leech, Kathryn}, LANGUAGE = {eng}, ISBN = {978-1-4503-9197-9}, DOI = {10.1145/3501712.3529734}, PUBLISHER = {ACM}, YEAR = {2022}, BOOKTITLE = {Proceedings of IDC '22}, PAGES = {185--195}, ADDRESS = {Braga, Portugal}, }
Endnote
%0 Conference Proceedings %A Lin, Grace C. %A Schoenfeld, Ilana %A Thompson, Meredith %A Xia, Yiting %A Uz-Bilgin, Cigdem %A Leech, Kathryn %+ External Organizations External Organizations External Organizations External Organizations External Organizations External Organizations %T “What Color Are the Fish’s Scales?” Exploring Parents’ and Children’s Natural Interactions with a Child-Friendly Virtual Agent during Storybook Reading : %G eng %U http://hdl.handle.net/21.11116/0000-000C-181B-C %R 10.1145/3501712.3529734 %D 2022 %B 21st ACM Interaction Design and Children Conference %Z date of event: 2022-06-27 - 2022-06-30 %C Braga, Portugal %B Proceedings of IDC '22 %P 185 - 195 %I ACM %@ 978-1-4503-9197-9
Pan, R., Lei, Y., Li, J., Xie, Z., Yuan, B., & Xia, Y. (2022). Efficient Flow Scheduling in Distributed Deep Learning Training with Echelon Formation. In HotNets ’22, 21st ACM Workshop on Hot Topics in Networks. Austin, TX, USA: ACM. doi:10.1145/3563766.3564096
Export
BibTeX
@inproceedings{echelon, TITLE = {Efficient Flow Scheduling in Distributed Deep Learning Training with Echelon Formation}, AUTHOR = {Pan, Rui and Lei, Yiming and Li, Jialong and Xie, Zhiqiang and Yuan, Binhang and Xia, Yiting}, LANGUAGE = {eng}, ISBN = {978-1-4503-9899-2}, DOI = {10.1145/3563766.3564096}, PUBLISHER = {ACM}, YEAR = {2022}, BOOKTITLE = {HotNets '22, 21st ACM Workshop on Hot Topics in Networks}, PAGES = {93--100}, ADDRESS = {Austin, TX, USA}, }
Endnote
%0 Conference Proceedings %A Pan, Rui %A Lei, Yiming %A Li, Jialong %A Xie, Zhiqiang %A Yuan, Binhang %A Xia, Yiting %+ External Organizations Networks and Cloud Systems, MPI for Informatics, Max Planck Society Networks and Cloud Systems, MPI for Informatics, Max Planck Society External Organizations External Organizations Networks and Cloud Systems, MPI for Informatics, Max Planck Society %T Efficient Flow Scheduling in Distributed Deep Learning Training with Echelon Formation : %G eng %U http://hdl.handle.net/21.11116/0000-000C-1822-3 %R 10.1145/3563766.3564096 %D 2022 %B 21st ACM Workshop on Hot Topics in Networks %Z date of event: 2022-11-14 - 2022-11-15 %C Austin, TX, USA %B HotNets '22 %P 93 - 100 %I ACM %@ 978-1-4503-9899-2
2021
Ahuja, S. S., Gupta, V., Dangui, V., Bali, S., Gopalan, A., Zhong, H., … Zhang, Y. (2021). Capacity-efficient and Uncertainty-resilient Backbone Network Planning with Hose. In SIGCOMM ’21. Virtual Event, USA: ACM. doi:10.1145/3452296.3472918
Export
BibTeX
@inproceedings{Ahuja_SIGCOMM21, TITLE = {Capacity-efficient and Uncertainty-resilient Backbone Network Planning with Hose}, AUTHOR = {Ahuja, Satyajeet Singh and Gupta, Varun and Dangui, Vinayak and Bali, Soshant and Gopalan, Abishek and Zhong, Hao and Lapukhov, Petr and Xia, Yiting and Zhang, Ying}, LANGUAGE = {eng}, ISBN = {978-1-4503-8383-7}, DOI = {10.1145/3452296.3472918}, PUBLISHER = {ACM}, YEAR = {2021}, BOOKTITLE = {SIGCOMM '21}, EDITOR = {Kuipers, Fernando and Caesar, Matthew}, PAGES = {547--559}, ADDRESS = {Virtual Event, USA}, }
Endnote
%0 Conference Proceedings %A Ahuja, Satyajeet Singh %A Gupta, Varun %A Dangui, Vinayak %A Bali, Soshant %A Gopalan, Abishek %A Zhong, Hao %A Lapukhov, Petr %A Xia, Yiting %A Zhang, Ying %+ External Organizations External Organizations External Organizations External Organizations External Organizations External Organizations External Organizations Networks and Cloud Systems, MPI for Informatics, Max Planck Society External Organizations %T Capacity-efficient and Uncertainty-resilient Backbone Network Planning with Hose : %G eng %U http://hdl.handle.net/21.11116/0000-0009-74AF-0 %R 10.1145/3452296.3472918 %D 2021 %B ACM SIGCOMM Conference %Z date of event: 2021-08-23 - 2021-08-27 %C Virtual Event, USA %B SIGCOMM '21 %E Kuipers, Fernando; Caesar, Matthew %P 547 - 559 %I ACM %@ 978-1-4503-8383-7
Xia, Y., Zhang, Y., Zhong, Z., Yan, G., Lim, C., Ahuja, S. S., … Ghobadi, M. (2021). A Social Network Under Social Distancing: Risk-Driven Backbone Management During COVID-19 and Beyond. In Proceedings of the 18th USENIX Symposium on Networked Systems Design and Implementation. Virtual Event: USENIX Association.
Export
BibTeX
@inproceedings{Xia_NSDI21, TITLE = {A Social Network Under Social Distancing: {R}isk-Driven Backbone Management During {COVID}-19 and Beyond}, AUTHOR = {Xia, Yiting and Zhang, Ying and Zhong, Zhizhen and Yan, Guanqing and Lim, Chiunlin and Ahuja, Satyajeet Singh and Bali, Soshant and Nikolaidis, Alexander and Ghobadi, Kimia and Ghobadi, Manya}, LANGUAGE = {eng}, ISBN = {978-1-939133-21-2}, PUBLISHER = {USENIX Association}, YEAR = {2021}, BOOKTITLE = {Proceedings of the 18th USENIX Symposium on Networked Systems Design and Implementation}, PAGES = {217--231}, ADDRESS = {Virtual Event}, }
Endnote
%0 Conference Proceedings %A Xia, Yiting %A Zhang, Ying %A Zhong, Zhizhen %A Yan, Guanqing %A Lim, Chiunlin %A Ahuja, Satyajeet Singh %A Bali, Soshant %A Nikolaidis, Alexander %A Ghobadi, Kimia %A Ghobadi, Manya %+ Networks and Cloud Systems, MPI for Informatics, Max Planck Society External Organizations External Organizations External Organizations External Organizations External Organizations External Organizations External Organizations External Organizations External Organizations %T A Social Network Under Social Distancing: Risk-Driven Backbone Management During COVID-19 and Beyond : %G eng %U http://hdl.handle.net/21.11116/0000-0009-74AB-4 %D 2021 %B 18th USENIX Symposium on Networked Systems Design and Implementation %Z date of event: 2021-04-12 - 2021-04-14 %C Virtual Event %B Proceedings of the 18th USENIX Symposium on Networked Systems Design and Implementation %P 217 - 231 %I USENIX Association %@ 978-1-939133-21-2
Zhong, Z., Ghobadi, M., Khaddaj, A., Leach, J., Xia, Y., & Zhang, Y. (2021). ARROW: Restoration-Aware Traffic Engineering. In SIGCOMM ’21. Virtual Event, USA: ACM. doi:10.1145/3452296.3472921
Export
BibTeX
@inproceedings{Zhong_SIGCOMM21, TITLE = {{ARROW}: {R}estoration-Aware Traffic Engineering}, AUTHOR = {Zhong, Zhizhen and Ghobadi, Manya and Khaddaj, Alaa and Leach, Jonathan and Xia, Yiting and Zhang, Ying}, LANGUAGE = {eng}, ISBN = {978-1-4503-8383-7}, DOI = {10.1145/3452296.3472921}, PUBLISHER = {ACM}, YEAR = {2021}, BOOKTITLE = {SIGCOMM '21}, EDITOR = {Kuipers, Fernando and Caesar, Matthew}, PAGES = {560--579}, ADDRESS = {Virtual Event, USA}, }
Endnote
%0 Conference Proceedings %A Zhong, Zhizhen %A Ghobadi, Manya %A Khaddaj, Alaa %A Leach, Jonathan %A Xia, Yiting %A Zhang, Ying %+ External Organizations External Organizations External Organizations External Organizations Networks and Cloud Systems, MPI for Informatics, Max Planck Society External Organizations %T ARROW: Restoration-Aware Traffic Engineering : %G eng %U http://hdl.handle.net/21.11116/0000-0009-7533-A %R 10.1145/3452296.3472921 %D 2021 %B ACM SIGCOMM Conference %Z date of event: 2021-08-23 - 2021-08-27 %C Virtual Event, USA %B SIGCOMM '21 %E Kuipers, Fernando; Caesar, Matthew %P 560 - 579 %I ACM %@ 978-1-4503-8383-7

Previous publications

Xin Huang, Yiting Xia, T. S. Eugene Ng, Weaver: Efficient Coflow Scheduling in Heterogeneous Parallel Networks, IEEE International Parallel and Distributed Processing Symposium (IPDPS 2020).

Yiting Xia*, Dingming Wu, Xiaoye Steven Sun, Xin Sunny Huang, Simbarashe Dzinamarira, T. S. Eugene Ng, Masking Failures from Application Performance in Data Center Networks with Shareable Backup, 2018 ACM Special Interest Group on Data Communication (SIGCOMM 2018).

Xiaoye Steven Sun, Yiting Xia, Simbarashe Dzinamarira, Xin Sunny Huang, Dingming Wu, T. S. Eugene Ng, Republic: Data Multicast Meets Hybrid Rack-Level Interconnections in Data Centers, IEEE 26th International Conference on Network Protocols (ICNP 2018).

Yiting Xia, On Designing Convertible Data Center Network Architectures, PhD thesis, Rice University, February 2018.

Yiting Xia, Xin Sunny Huang, T. S. Eugene Ng, Stop Rerouting! Enabling ShareBackup for Failure Recovery in Data Center Networks, 16th ACM Workshop on Hot Topics in Networks (HotNets 2017).

Yiting Xia, Xiaoye Steven Sun, Simbarashe Dzinamarira, Dingming Wu, Xin Sunny Huang, T. S. Eugene Ng, A Tale of Two Topologies: Exploring Convertible Data Center Network Architectures with Flat-tree, 2017 ACM Special Interest Group on Data Communication (SIGCOMM 2017).

Yiting Xia, T. S. Eugene Ng, Flat-tree: A Convertible Data Center Network Architecture from Clos to Random Graph, 15th ACM Workshop on Hot Topics in Networks (HotNets 2016).

Dingming Wu, Xiaoye Sun, Yiting Xia, Xin Huang, T. S. Eugene Ng, HyperOptics: A High Throughput and Low Latency Multicast Architecture for Datacenters, 8th USENIX Workshop on Hot Topics in Cloud Computing (HotCloud 2016).

Yiting Xia, T. S. Eugene Ng, Xiaoye Steven Sun, Blast: Accelerating High-Performance Data Analytics Applications by Optical Multicast, 2015 IEEE International Conference on Computer Communications (INFOCOM 2015).

Yiting Xia, Mike Schlansker, T. S. Eugene Ng, Jean Tourrilhes, Enabling Topological Flexibility for Data Centers Using OmniSwitch, 7th USENIX Workshop on Hot Topics in Cloud Computing (HotCloud 2015).

Howard Wang, Yiting Xia, Keren Bergman, T. S. Eugene Ng, Kunwadee Sripanidkulchai, λgroup: Using Optics to Take Group Data Delivery in the Datacenter to the Next Degree, Rice University Technical Report TR14-01, February 2014.

Yiting Xia, T. S. Eugene Ng, A Cross-Layer SDN Control Plane for Optical Multicast-Featured Datacenters, 2014 ACM Hot Topics in Software Defined Networking (HotSDN 2014).

Payman Samadi, Howard Wang, David Calhoun, Yiting Xia, Kunwadee Sripanidkulchai, T. S. Eugene Ng, Keren Bergman, An Optical Programmable Network Architecture Supporting Iterative Multicast for Data-intensive Applications, IEEE 3rd Optical Interconnects Conference (OIC 2014).

Yiting Xia, Control Plane Design and Performance Analysis for Optical Multicast-Capable Datacenter Networks, Master’s Thesis, Rice University, April 2014.

Howard Wang, Yiting Xia, Keren Bergman, T. S. Eugene Ng, Sambit Sahu, Kunwadee Sripanidkulchai, Rethinking the Physical Layer of Data Center Networks of the Next Decade: Using Optics to Enable Efficient *-Cast Connectivity, ACM Computer Communication Review (CCR), July 2013.

Howard Wang, Yiting Xia, Kunwadee Sripanidkulchai, Sambit Sahu, T. S. Eugene Ng, Keren Bergman, End-to-End Demonstration of Optical Multicasting Over a Dynamically-Reconfigurable Hybrid Data Center Network Architecture, 2013 IEEE Photonics Society Summer Topical Meeting Series.

Wei Liang, Jingping Bi, Yiting Xia, Chengchen Hu, RPIM: Inferring BGP Routing Policies in ISP Networks, 2011 IEEE Global Telecommunications Conference (GLOBECOM 2011).

Wei Liang, Jingping Bi, Wenzhong Li, Yiting Xia, Towards Inferring Inter-Domain Routing Policies in ISP Networks, in IEICE Transactions on Communications, November 2011.

Research Interests

  • Data center network
  • Wide area network
  • Optical network
  • Software-defined network
  • Network measurement
  • Network management
  • Cloud computing
  • Data parallel systems

Honours and Awards

  • Ken Kennedy-Cray Fellowship
  • Pollard Fellowship
  • N2Women Rising Star Award in 2021

Invited Talks & Keynotes

  • Towards High-Performance and Power-Efficient Network Infrastructure for Cloud Computing, 8th Winter Seminar Series at Sharif University of Technology, April 2023
  • Towards High-Performance and Power-Efficient Network Infrastructure for Cloud Computing, Peking University, March 2023
  • Towards High-Performance and Power-Efficient Network Infrastructure for Cloud Computing, Tsinghua University, March 2023
  • Towards Deployable Optical Data Center Networks with Unified Routing, MPI-SWS Lightning Tutorials Series, July 2022
  • Towards Deployable Optical Data Center Networks with Unified Routing, Huawei InnovWave HCIO2022, June 2022
  • A Social Network under Social Distancing – Experience and Insights during COVID-19 on Risk-Driven Backbone Management, Saarland Informatics Campus Lecture Series, February 2021
  • A Social Network under Social Distancing – Experience and Insights during COVID-19 on Risk-Driven Backbone Management, MIT Computer Science & Artificial Intelligence Laboratory, October 2020

Reviewing Activity & Workshop / Conference positions

  • Co-chair N2Women Workshop 2022
  • TPC member of SOSR’21
  • TPC member of ICNP’21
  • TPC member of SOSR’19 Demo and Poster session
  • Reviewer of IEEE Journal on Selected Areas in Communications (JSAC’19)
  • Reviewer of IEEE Transactions on Cloud Computing 2019
  • TPC member of IEEE Electrical Engineering, Computer Science and Informatics (EECSI’18)
  • Reviewer of IEEE Transactions on Communications 2018
  • Reviewer of IEEE Transactions on Services Computing 2018
  • TPC member of IEEE IC on Computer Intelligent Systems and Networking (ICCISN’17)
  • Reviewer of IEEE Transactions on Parallel and Distributed Systems 2017
  • TPC member of IEEE Global Summit on Computer and Information Technology (GSCIT’16)
  • Reviewer of IEEE Transactions on Cloud Computing 2016
  • Reviewer of IEEE International Conference on Cloud Computing (CLOUD’16)
  • TPC member of IEEE International Conference on Cloud Computing and Cryptography (ICCCC’15)
  • TPC member of IEEE Global Summit on Computer and Information Technology (GSCIT’15)
  • Reviewer of IEEE International Conference on Cloud Computing (CLOUD’15)
  • External reviewer of OSA Journal of Optical Communications and Networking 2013

Teaching

At Saarland University and MPI:

  • Summer 2023: Distributed Systems; Co-lecturer
  • Summer 2022: Data Networks; Co-lecturer
  • Summer 2021: Distributed Systems; Co-lecturer
  • Winter 2021: Advanced Topics in Data Networks; Co-lecturer
  • Fall 2020: Hot Topics in Data Networks Seminar; Lecturer

At Rice University:

  • Spring 2014, Guest Lecturer, COMP429 Introduction to Computer Networks
  • Fall 2014, Guest Lecturer, COMP526 Big Data Systems
  • Fall 2014, Teaching Assistant, COMP425 Computer Systems Architecture
  • Spring 2014, Teaching Assistant, COMP429 Introduction to Computer Networks
  • Fall 2013, Teaching Assistant, COMP221 Introduction to Computer Systems
  • Spring 2013, Teaching Assistant, COMP429 Introduction to Computer Networks
  • Fall 2012, Teaching Assistant, COMP221 Introduction to Computer Systems

Recent Positions

September 2020 - present:
Tenure-track faculty, MPI-INF, Saarbrücken, Germany

April 2018 - September 2020:
Research Scientist, Facebook, Menlo Park, USA

Education

April 2014 - February 2018:
Ph.D. in Computer Science, Rice University


August 2011 - April 2014:
M.S. in Computer Science, Rice University

September 2007 - June 2011:
B.S. in Telecommunications Engineering, Beijing University of Posts and Telecommunications & Queen Mary University of London