The AIPC (Artificial Intelligence Proteomics Competition) series focuses on leveraging AI to uncover the ‘dark matter’ of MS-based proteomics.
1. Competition task
Peptide-spectrum match (PSM) rescoring optimization: Participants will develop AI models to refine PSMs and improve the ranking of candidate peptide sequences using machine learning techniques and existing protein databases.
2. Participants
Students, researchers, bioinformaticians, computational biologists, AI practitioners, and anyone worldwide interested in applying AI to proteomics.
3. Sponsors
4. MSDT Dataset
Each row in the MSDT files represents one PSM.
We plan to launch two competition tracks: the Enthusiast Track and the Professional Track, each corresponding to training datasets of different scales.
The Enthusiast Track will include approximately 37 million PSMs (16 million from Bruker, 16 million from Thermo, 5 million from SCIEX), while the Professional Track will consist of around 300 million PSMs from Thermo.
A separate test dataset will be used for each track.
5. Baseline Model
We will provide a PSM-rescoring baseline to help participants learn how to work with MSDT files and train a base model on GPUs.
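To make the rescoring task concrete, here is a minimal sketch in pure Python of what a baseline rescorer might do: fit a small logistic model that separates target from decoy PSMs using a couple of per-PSM features, then use its probability as the new score. The feature names (`engine_score`, `abs_ppm_error`) and the dict layout are illustrative assumptions, not the actual MSDT schema or the official baseline.

```python
# Minimal PSM-rescoring sketch (pure Python; feature names are hypothetical,
# not the real MSDT columns). A tiny logistic model is fit by gradient
# descent to separate targets from decoys, and its probability becomes
# the rescored value for each PSM.
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_rescorer(psms, features, epochs=200, lr=0.1):
    """Fit logistic-regression weights separating targets from decoys."""
    w = [0.0] * len(features)
    b = 0.0
    for _ in range(epochs):
        for psm in psms:
            x = [psm[f] for f in features]
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
            err = p - (1.0 if psm["is_target"] else 0.0)
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
            b -= lr * err
    return w, b

def rescore(psms, features, w, b):
    """Attach a new score (model probability) to each PSM in place."""
    for psm in psms:
        x = [psm[f] for f in features]
        psm["rescored"] = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
    return psms

# Toy PSMs with two illustrative features:
psms = [
    {"engine_score": 2.0, "abs_ppm_error": 0.1, "is_target": True},
    {"engine_score": 1.8, "abs_ppm_error": 0.2, "is_target": True},
    {"engine_score": 0.3, "abs_ppm_error": 1.5, "is_target": False},
    {"engine_score": 0.5, "abs_ppm_error": 1.2, "is_target": False},
]
feats = ["engine_score", "abs_ppm_error"]
w, b = train_rescorer(psms, feats)
rescore(psms, feats, w, b)
```

Real rescorers (and the actual baseline) would use far richer features and models; the point here is only the target/decoy training signal and the score replacement.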
6. Evaluation Metrics
- Identification Quantity: In a two-species scenario, the number of unique peptides correctly identified while controlling the False Discovery Rate (FDR) at 1%.
- Evaluation Time: The model must complete processing a single file within 3 hours; otherwise, it will not be evaluated.
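The identification-quantity metric can be sketched with a standard target-decoy FDR estimate: walk down the score-sorted PSM list, estimate FDR as decoys/targets, and count unique target peptides at the deepest cutoff that keeps the estimate within 1%. The field names (`peptide`, `score`, `is_decoy`) are assumptions for illustration; the official evaluation script may differ in details such as q-value computation.

```python
# Sketch of "unique peptides at 1% FDR" via target-decoy competition.
# Field names are hypothetical; the official scorer may differ.

def unique_peptides_at_fdr(psms, fdr_threshold=0.01):
    """Count unique target peptides at the best score cutoff whose
    estimated FDR (decoys / targets) stays within the threshold."""
    ranked = sorted(psms, key=lambda p: p["score"], reverse=True)
    targets, decoys = 0, 0
    accepted, best = set(), set()
    for psm in ranked:
        if psm["is_decoy"]:
            decoys += 1
        else:
            targets += 1
            accepted.add(psm["peptide"])
        # Remember the largest accepted set still under the FDR threshold.
        if targets and decoys / targets <= fdr_threshold:
            best = set(accepted)
    return len(best)

psms = [
    {"peptide": "PEPTIDEA", "score": 0.99, "is_decoy": False},
    {"peptide": "PEPTIDEB", "score": 0.95, "is_decoy": False},
    {"peptide": "REVERSED", "score": 0.40, "is_decoy": True},
    {"peptide": "PEPTIDEC", "score": 0.30, "is_decoy": False},
]
print(unique_peptides_at_fdr(psms))  # → 2
```

Here the decoy hit caps the accepted list at the two highest-scoring targets, so two unique peptides survive at 1% FDR.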
7. Competition Rules
7.1 Team Formation:
Each team must have a minimum of one and a maximum of five members.
Code sharing between teams is strictly prohibited—violators will be disqualified.
7.2 Submission Rules:
Each team can submit up to three times per day, with a total submission limit of 100 within 100 days.
Invalid submissions will not count toward the total submission limit.
7.3 Ranking Rules:
A/B Leaderboard System:
- The test dataset is split into A leaderboard (40%) and B leaderboard (60%).
- The A leaderboard updates in real-time and displays rankings.
- The B leaderboard (used to determine the final awards) will be revealed three days after the competition ends.
Final Ranking Criteria:
- Score > Submission Count > Submission Time
- If two teams have the same Score, the team with fewer submissions ranks higher.
- If both Score and submission count are identical, the team that submitted earlier ranks higher.
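The tie-breaking order above maps directly onto a sort key: score descending, then submission count ascending, then submission time ascending. The team fields below are illustrative, not the leaderboard's actual data model.

```python
# Sketch of the final-ranking comparator: Score > Submission Count >
# Submission Time. Team fields are hypothetical examples.

teams = [
    {"name": "alpha", "score": 0.91, "submissions": 40, "time": "2025-09-30T10:00"},
    {"name": "beta",  "score": 0.91, "submissions": 25, "time": "2025-09-30T12:00"},
    {"name": "gamma", "score": 0.88, "submissions": 10, "time": "2025-08-01T09:00"},
]

# Higher score first; among equal scores, fewer submissions first;
# among equal counts, the earlier (ISO-formatted) time sorts first.
ranking = sorted(teams, key=lambda t: (-t["score"], t["submissions"], t["time"]))
print([t["name"] for t in ranking])  # → ['beta', 'alpha', 'gamma']
```

ISO-8601 timestamps sort correctly as strings, which is why the plain string comparison works for the time tiebreaker.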
Final Submission Selection:
- Each team’s leader can designate two final submissions for the B leaderboard ranking.
- If no selection is made, the highest-ranked A leaderboard submission will be used by default.
8. Anticipated Competition Duration
Jul – Oct 2025 (100 days)
9. Awards & Prizes
First Prize: $5,000 (each track)
Second Prize: $1,500 (each track)
Third Prize: $500 (each track)
10. Additional Benefits
Computing Credit Rewards: Every registered participant will receive a $15 Bohrium® computing credit.
Best Notebook Award: Participants must submit complete code using the Bohrium Notebook platform. We encourage contestants to publish relevant content in Notebook format on the Case Plaza with the tag AI4SCUP-[AIPC]. The top three notebooks with the most likes will each receive a $150 computing credit.
Internship Opportunities: Outstanding participants may be recommended for internship opportunities at relevant institutions, gaining access to top-tier research resources and networking with leading interdisciplinary experts.
Invited tours of participating institutes.