Publications

2022

Learning to Separate Voices by Spatial Regions. [paper] [demo]
Zhongweiyang Xu, Romit Roy Choudhury.
In The Thirty-ninth International Conference on Machine Learning (ICML 2022)

Dual-path Attention is All You Need for Audio-Visual Speech Extraction. [paper]
Zhongweiyang Xu^*, Xulin Fan^*, Mark Hasegawa-Johnson.
In The Forty-eighth IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2023)

Multi-Channel Speech Enhancement for SPEAR Challenge: A Three Stage Approach. [paper]
Zhongweiyang Xu, Debottam Dutta, Xulin Fan, Mark Hasegawa-Johnson, Romit Roy Choudhury.
Ranked 2nd in the [SPEAR Challenge]

Unifying Robustness and Fidelity: A Comprehensive Study of Pretrained Generative Methods for Speech Enhancement in Adverse Conditions. [paper] [demo]
Heming Wang, Meng Yu, Hao Zhang, Chunlei Zhang, Zhongweiyang Xu, Muqiao Yang, Yixuan Zhang, Dong Yu.
(Preprint)

TaBE: Decoupling spatial and spectral processing with Taylor’s unfolding method in the beamspace domain for multi-channel speech enhancement. [paper]
Andong Li, Guochen Yu, Zhongweiyang Xu, Cunhang Fan, Xiaodong Li, Chengshi Zheng.
In Information Fusion, Volume 101, January 2024, 101976

SpatialCodec: Neural Spatial Speech Coding. [paper] [demo] [code]
Zhongweiyang Xu, Yong Xu, Vinay Kothapally, Heming Wang, Muqiao Yang, Dong Yu.
(ICASSP 2024)

uSee: Unified Speech Enhancement and Editing with Conditional Diffusion Models. [paper] [demo]
Muqiao Yang, Chunlei Zhang, Yong Xu, Zhongweiyang Xu, Heming Wang, Bhiksha Raj, Dong Yu.
(ICASSP 2024)

FoVNet: Configurable Field-of-View Speech Enhancement with Low Computation and Distortion for Smart Glasses. [paper]
Zhongweiyang Xu, Ali Aroudi, Ke Tan, Ashutosh Pandey, Jung-Suk Lee, Buye Xu, Francesco Nesta.
(INTERSPEECH 2024)

Multi-Source Music Generation with Latent Diffusion. [paper] [demo] [code]
Zhongweiyang Xu, Debottam Dutta, Yu-Lin Wei, Romit Roy Choudhury.
(NeurIPS 2025 Audio Imagination Workshop)

ArrayDPS: Unsupervised Blind Speech Separation with a Diffusion Prior [paper] [demo] [code]
Zhongweiyang Xu, Xulin Fan, Zhong-Qiu Wang, Xilin Jiang, Romit Roy Choudhury.
In The Forty-second International Conference on Machine Learning (ICML 2025)

Unsupervised Multi-channel Speech Dereverberation via Diffusion [paper]
Yulun Wu, Zhongweiyang Xu, Jianchong Chen, Zhong-Qiu Wang, Romit Roy Choudhury.
(WASPAA 2025)