Publications
2025
JOSH
Multiprocessor scheduling with testing: improved online algorithms and numerical experiments
Journal of Scheduling, 2025
ICML
Curriculum Learning for Biological Sequence Prediction: The Case of De Novo Peptide Sequencing
ArXiv Preprint
ArXiv
MassNet: billion-scale AI-friendly mass spectral corpus enables robust de novo peptide sequencing
ArXiv Preprint
2024
ArXiv
Autoregressive + Chain of Thought = Recurrent: Recurrence's Role in Language Models' Computability and a Revisit of Recurrent Transformer
ArXiv, 2024
2023
ACL
Don't trust ChatGPT when your question is not in English: a study of multilingual abilities and types of LLMs
Association for Computational Linguistics (ACL), 2023
ArXiv
TTIDA: Controllable Generative Data Augmentation via Text-to-Text and Text-to-Image Models
ArXiv, 2023
ArXiv
Bridging the Gap Between BabelNet and HowNet: Unsupervised Sense Alignment and Sememe Prediction
ArXiv, 2023
2022
NeurIPS
A character-level length-control algorithm for non-autoregressive sentence summarization
Advances in Neural Information Processing Systems (NeurIPS), 2022
Preprints & Working Papers
ArXiv
Why Prompt Design Matters and Works: A Complexity Analysis of Prompt Search Space in LLMs
ArXiv Preprint
ArXiv
π-PrimeNovo: an accurate and efficient non-autoregressive deep learning model for de novo peptide sequencing
ArXiv Preprint
ArXiv
AlignRAG: Leveraging Critique Learning for Evidence-Sensitive Retrieval-Augmented Reasoning
ArXiv Preprint
ArXiv
Tokenization Constraints in LLMs: A Study of Symbolic and Arithmetic Reasoning Limits
ArXiv Preprint