FragPipe-derived MSDT

scan Scan number of the MS/MS spectrum used for peptide identification

precursor_mz Observed m/z value of the peptide precursor ion

rt Retention time of the peptide precursor in the LC-MS run

mz_array List of fragment ion m/z values in the MS/MS spectrum

intensity_array List of corresponding ion intensities for the fragment ions

label Target/decoy label assigned to the PSM (1 for target, -1 for decoy)

charge Peptide precursor charge state(s)

ExpMass Experimental precursor mass (observed peptide mass)

retentiontime Chromatographic retention time of the peptide in the LC run

rank Rank of the PSM among all candidate matches for a given spectrum (1 = best)

isotope_errors Isotopic mass offsets allowed for precursor ion matching (e.g., 0, 1, 2)

hyperscore Similarity score between observed and theoretical spectra; higher values indicate greater similarity

delta_hyperscore Difference in hyperscore between the top and second-best matches for a given spectrum

matched_ion_num Number of fragment ions that successfully matched theoretical ions

ion_series Type(s) of fragment ions matched (e.g., b, y, or both)

unweighted_spectral_entropy Entropy-based score describing the distribution of fragment ion intensities within a spectrum; lower values indicate more concentrated (higher-quality) spectra, while higher values suggest more uniform or noisy ion distributions

delta_RT_loess Retention time deviation between observed and predicted RT after LOESS correction

precursor_sequence Peptide amino acid sequence with modifications

proteins List of protein accessions containing this peptide sequence
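The unweighted_spectral_entropy column can be illustrated with a minimal sketch. The exact formula and logarithm base used to produce the column are not stated here, so the code below assumes plain Shannon entropy (natural log) over intensities normalized to sum to 1; the function name spectral_entropy is hypothetical.

```python
import math

def spectral_entropy(intensity_array):
    """Shannon entropy (natural log) of normalized fragment intensities.

    Assumption: no intensity weighting; zero or negative peaks are ignored.
    """
    positive = [i for i in intensity_array if i > 0]
    total = sum(positive)
    if total == 0:
        return 0.0
    # p = i / total is each peak's share of the summed intensity
    return -sum((i / total) * math.log(i / total) for i in positive)

# One dominant peak: intensity is concentrated, so entropy is low
concentrated = spectral_entropy([1000.0, 1.0, 1.0, 1.0])
# Four equal peaks: intensity is uniform, so entropy reaches its maximum, ln(4)
uniform = spectral_entropy([10.0, 10.0, 10.0, 10.0])
print(concentrated < uniform)
```

Under this assumption, a spectrum with n equal peaks has the maximum possible entropy ln(n), which matches the description of higher values for more uniform ion distributions.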

Sage-derived MSDT

scan Spectrum scan number used for PSM scoring

precursor_sequence Identified peptide sequence (no modifications included)

proteins Protein accessions or headers corresponding to the peptide

label Binary label (1 = target, -1 = decoy) used for model training or FDR estimation

charge Precursor charge state

matched_peaks Number of fragment ions matched between experimental and theoretical spectra

peptide_q q-value for peptide-level FDR estimation

protein_q q-value for protein-level FDR estimation

predicted_rt Predicted retention time

ion_mobility Ion mobility value (if available from the instrument)

delta_rt Retention time deviation between predicted and observed RT

spectrum_q q-value at the PSM (spectrum) level

sage_discriminant_score Discriminant score (e.g., linear combination of features) used by Sage for classification

precursor_mz Observed precursor ion m/z

rt Retention time of precursor in LC-MS run

mz_array List of fragment ion m/z values

intensity_array List of fragment ion intensities corresponding to mz_array
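As a rough illustration of how the label and spectrum_q columns might be used together, the sketch below keeps target PSMs that pass a q-value cutoff. The 0.01 threshold, the helper name filter_psms, and the toy rows are assumptions for illustration, not values from the dataset.

```python
def filter_psms(rows, max_q=0.01):
    """Keep target PSMs (label == 1) whose spectrum_q is at or below max_q."""
    return [r for r in rows if r["label"] == 1 and r["spectrum_q"] <= max_q]

# Toy rows using a subset of the Sage-derived columns (values are made up)
rows = [
    {"scan": 101, "label": 1, "spectrum_q": 0.002, "charge": 2},
    {"scan": 102, "label": -1, "spectrum_q": 0.004, "charge": 2},  # decoy
    {"scan": 103, "label": 1, "spectrum_q": 0.080, "charge": 3},   # fails cutoff
]

confident = filter_psms(rows)
print([r["scan"] for r in confident])
```

The same predicate carries over to a DataFrame or a Parquet scan; plain dictionaries are used here only to keep the sketch dependency-free.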

To improve interoperability and support large-scale analysis, our automated processing tool (link: xxx) also converts .mgf files into the columnar Parquet format, so the data can be used directly in downstream applications such as machine learning and distributed computing.

MGF-derived MSDT

mz m/z values of fragment ions in the MS/MS spectrum

intensity Corresponding ion intensities for each m/z value

TITLE Unique identifier or description for the spectrum (often includes file name, scan number, or charge)

PEPMASS Precursor ion m/z and (optionally) intensity

CHARGE Precursor charge state (e.g., 2+, 3+)

INSTRUMENT Type or model of the mass spectrometer used to acquire the data

RTINSECONDS Retention time of the precursor ion in seconds
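The .mgf-to-Parquet conversion can be sketched as follows. This is not the processing tool's actual implementation: parse_mgf and write_parquet are hypothetical helpers, header values are kept as raw strings, and the Parquet step assumes the pyarrow package is installed.

```python
def parse_mgf(text):
    """Parse MGF text into one record per spectrum.

    Header fields (TITLE, PEPMASS, CHARGE, ...) are stored as strings;
    peak lines are collected into the mz and intensity lists.
    """
    records, current = [], None
    for raw in text.splitlines():
        line = raw.strip()
        if line == "BEGIN IONS":
            current = {"mz": [], "intensity": []}
        elif line == "END IONS":
            if current is not None:
                records.append(current)
            current = None
        elif current is None or not line:
            continue  # skip blank lines and anything outside a spectrum block
        elif "=" in line and not line[0].isdigit():
            key, value = line.split("=", 1)
            current[key.strip().upper()] = value.strip()
        else:
            parts = line.split()
            if len(parts) >= 2:  # peak line: m/z followed by intensity
                current["mz"].append(float(parts[0]))
                current["intensity"].append(float(parts[1]))
    return records

def write_parquet(records, path):
    """Write records to Parquet (assumes pyarrow is available)."""
    import pyarrow as pa
    import pyarrow.parquet as pq
    # Columns are inferred from the records; each list becomes a list column
    pq.write_table(pa.Table.from_pylist(records), path)

example = """BEGIN IONS
TITLE=run1.scan.1
PEPMASS=445.12 12345.0
CHARGE=2+
RTINSECONDS=63.5
100.1 200.0
250.2 50.0
END IONS
"""
spectra = parse_mgf(example)
print(spectra[0]["TITLE"], len(spectra[0]["mz"]))
```

Calling write_parquet(spectra, "spectra.parquet") would then store one row per spectrum, with mz and intensity as list-valued columns alongside the header fields.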