XuanjiNovo

XuanjiNovo is a transformer-based de novo peptide sequencing framework that uses curriculum learning to improve peptide prediction accuracy. A transformer-based spectrum encoder first encodes the spectral features, and a peptide decoder then reconstructs the peptide sequence from amino acid tokens. During training, a connectionist temporal classification (CTC) based path sampling module generates candidate peptides, and a dynamic masking strategy progressively raises task difficulty from basic to advanced, so the model learns peptide reconstruction as a curriculum. During inference, the spectral features are iteratively refined through the peptide decoder, and the peptide mass constraint (PMC) module decodes the resulting probability matrix into the final predicted peptide sequence. This design enables robust learning from large-scale peptide-spectrum matches (PSMs) and improves de novo sequencing performance compared with conventional approaches.
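The three ideas named above (collapsing a CTC path, masking target tokens on a schedule, and checking a candidate against the precursor mass) can be illustrated with a small, self-contained sketch. It is not taken from the XuanjiNovo code base. The names `ctc_collapse`, `mask_ratio`, `mask_tokens`, and `within_tolerance`, the `_` blank and `<mask>` symbols, the linear schedule, and the 20 ppm tolerance are all assumptions chosen for illustration. The mass check in particular is a simple post-hoc filter, not the PMC decoding itself.

```python
import random

BLANK = "_"       # hypothetical CTC blank symbol
MASK = "<mask>"   # hypothetical mask token

# Monoisotopic residue masses in Da for a few amino acids (illustrative subset).
RESIDUE_MASS = {
    "G": 57.02146,
    "A": 71.03711,
    "S": 87.03203,
    "P": 97.05276,
    "V": 99.06841,
}
WATER = 18.01056  # mass of H2O added to the residue sum


def ctc_collapse(path):
    """Standard CTC reduction: merge consecutive repeats, then drop blanks."""
    out = []
    prev = None
    for tok in path:
        if tok != prev and tok != BLANK:
            out.append(tok)
        prev = tok
    return out


def mask_ratio(step, total_steps, start=0.0, end=1.0):
    """Assumed linear curriculum: the hidden fraction grows with training progress."""
    frac = min(max(step / total_steps, 0.0), 1.0)
    return start + (end - start) * frac


def mask_tokens(tokens, ratio, rng):
    """Replace a random subset of target tokens with MASK."""
    n_mask = round(ratio * len(tokens))
    hidden = set(rng.sample(range(len(tokens)), n_mask))
    return [MASK if i in hidden else t for i, t in enumerate(tokens)]


def peptide_mass(seq):
    """Neutral monoisotopic mass of a peptide built from RESIDUE_MASS entries."""
    return sum(RESIDUE_MASS[aa] for aa in seq) + WATER


def within_tolerance(seq, precursor_mass, ppm=20.0):
    """True if the candidate mass is within `ppm` of the precursor mass."""
    error_ppm = abs(peptide_mass(seq) - precursor_mass) / precursor_mass * 1e6
    return error_ppm <= ppm


if __name__ == "__main__":
    rng = random.Random(0)
    print(ctc_collapse(list("AA_AGG_")))
    for step in (0, 50, 100):
        print(step, mask_tokens(list("GASP"), mask_ratio(step, 100), rng))
    print(within_tolerance("GASP", peptide_mass("GASP")))
```

In this sketch an early training step hides few or no target tokens and a late step hides all of them, which is one simple way to move from an easy reconstruction task to a hard one. A schedule driven by model accuracy rather than step count would fit the word "dynamic" more closely.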