Publications

2025

JOSH
Multiprocessor scheduling with testing: improved online algorithms and numerical experiments
Mingyang Gong, Jing Fan, Guohui Lin, Bing Su, Zihan Su, Xiang Zhang
Journal of Scheduling, 2025
ICML
Curriculum Learning for Biological Sequence Prediction: The Case of De Novo Peptide Sequencing
Xiang Zhang*, Jiaqi Wei*, Zijie Qiu*, Sheng Xu, Nanqing Dong, Zhiqiang Gao, Siqi Sun
ArXiv Preprint
ICML
Universal Biological Sequence Reranking for Improved De Novo Peptide Sequencing
Zijie Qiu*, Jiaqi Wei*, Xiang Zhang*, Sheng Xu, Kai Zou, Zhi Jin, Zhiqiang Gao, Nanqing Dong, Siqi Sun
ArXiv Preprint
ArXiv
MassNet: billion-scale AI-friendly mass spectral corpus enables robust de novo peptide sequencing
A Jun*, Xiang Zhang*, Xiaofan Zhang*, Jiaqi Wei*, Te Zhang, Yamin Deng, Pu Liu, Zongxiang Nie, Yi Chen, Nanqing Dong, Zhiqiang Gao, Siqi Sun, Tiannan Guo
ArXiv Preprint

2024

BioRxiv
ContraNovo: A Contrastive Learning Approach to Enhance De Novo Peptide Sequencing
Zhi Jin*, Sheng Xu*, Xiang Zhang*, Tianze Ling, Nanqing Dong, Wanli Ouyang, Zhiqiang Gao, Cheng Chang, Siqi Sun
BioRxiv, 2024
ArXiv
Cross-Modal Consistency in Multimodal Large Language Models
Xiang Zhang*, Senyu Li*, Ning Shi, Bradley Hauer, Zijun Wu, Grzegorz Kondrak, Muhammad Abdul-Mageed, Laks V.S. Lakshmanan
ArXiv, 2024
ArXiv
Autoregressive + Chain of Thought = Recurrent: Recurrence's Role in Language Models' Computability and a Revisit of Recurrent Transformer
Xiang Zhang, Muhammad Abdul-Mageed, Laks V.S. Lakshmanan
ArXiv, 2024

2023

ACL
Don't trust ChatGPT when your question is not in English: a study of multilingual abilities and types of LLMs
Xiang Zhang*, Senyu Li*, Bradley Hauer, Ning Shi, Grzegorz Kondrak
Association for Computational Linguistics (ACL), 2023
ArXiv
TTIDA: Controllable Generative Data Augmentation via Text-to-Text and Text-to-Image Models
Yuwei Yin, Jean Kaddour, Xiang Zhang, Yixin Nie, Zhenguang Liu, Lingpeng Kong, Qi Liu
ArXiv, 2023
ArXiv
Bridging the Gap Between BabelNet and HowNet: Unsupervised Sense Alignment and Sememe Prediction
Xiang Zhang, Ning Shi, Bradley Hauer, Grzegorz Kondrak
ArXiv, 2023

2022

NeurIPS
A character-level length-control algorithm for non-autoregressive sentence summarization
Puyuan Liu, Xiang Zhang, Lili Mou
Advances in Neural Information Processing Systems (NeurIPS), 2022
ArXiv
Improving HowNet-Based Chinese Word Sense Disambiguation with Translations
Xiang Zhang, Bradley Hauer, Grzegorz Kondrak
ArXiv, 2022

2021

ArXiv
Counting ability of large language models and impact of tokenization
Xiang Zhang, Juntai Cao, Chenyu You
ArXiv, 2021

2020

ArXiv
PosterGen: Aesthetic-Aware Paper-to-Poster Generation via Multi-Agent LLMs
Zhilin Zhang*, Xiang Zhang*, Jiaqi Wei, Yiwei Xu, Chenyu You
ArXiv, 2020
ArXiv
From AI for Science to Agentic Science: A Survey on Autonomous Scientific Discovery
Jiaqi Wei*, Yuejin Yang*, Xiang Zhang*, Yuhan Chen*, Xiang Zhuang*, Zhangyang Gao*, Dongzhan Zhou, Guangshuai Wang, Zhiqiang Gao, Juntai Cao, Zijie Qiu, Xuming He, Qiang Zhang, Chenyu You, Shuangjia Zheng, Ning Ding, Wanli Ouyang, Nanqing Dong, Yu Cheng, Siqi Sun, Lei Bai, Bowen Zhou
ArXiv, 2020

Preprints & Working Papers

ArXiv
Why Prompt Design Matters and Works: A Complexity Analysis of Prompt Search Space in LLMs
Xiang Zhang*, Juntai Cao*, Jiaqi Wei, Chenyu You, Dujian Ding
ArXiv Preprint
ArXiv
π-PrimeNovo: an accurate and efficient non-autoregressive deep learning model for de novo peptide sequencing
Xiang Zhang*, Tianze Ling*, Zhi Jin*, Sheng Xu*, Zhiqiang Gao, Boyan Sun, Zijie Qiu, Jiaqi Wei, Nanqing Dong, Guangshuai Wang, Guibin Wang, Leyuan Li, Muhammad Abdul-Mageed, Laks V.S. Lakshmanan, Fuchu He, Wanli Ouyang, Cheng Chang, Siqi Sun
ArXiv Preprint
ArXiv
AlignRAG: Leveraging Critique Learning for Evidence-Sensitive Retrieval-Augmented Reasoning
Jiaqi Wei, Hao Zhou, Xiang Zhang, Di Zhang, Zijie Qiu, Wei Wei, Jinzhe Li, Wanli Ouyang, Siqi Sun
ArXiv Preprint
ArXiv
Multi²: Multi-Agent Test-Time Scalable Framework for Multi-Document Processing
Juntai Cao*, Xiang Zhang*, Raymond Li, Chuyuan Li, Shafiq Joty, Giuseppe Carenini
ArXiv Preprint
ArXiv
Tokenization Constraints in LLMs: A Study of Symbolic and Arithmetic Reasoning Limits
Xiang Zhang*, Juntai Cao*, Jiaqi Wei, Yiwei Xu, Chenyu You
ArXiv Preprint