2025 Automatic Music Transcription Challenge

Join this cutting-edge challenge to develop advanced music transcription models!
Register now for the 2025 Automatic Music Transcription Challenge!
Submission window: April 1, 2025 - May 1, 2025

Summary

The 2025 Automatic Music Transcription (AMT) Challenge invites participants to develop computer programs capable of accurately transcribing synthesized audio recordings of classical music into Musical Instrument Digital Interface (MIDI) files. Each submission will process 100 recordings, each up to 20 seconds long, within a maximum time limit of 4 hours. The audio data has been synthesized to sound as realistic as possible, closely resembling natural instrumental performances. Unlike previous challenges, participants will be informed of the specific instruments present in each recording. While utilizing this information is optional, incorrect instrument identification will incur a penalty, with smaller penalties applied if the mistake involves similar instrument families. Evaluation criteria include the accuracy of instrument identification, pitch, onset, offset, and dynamics.
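
The exact scoring formula is determined by the organizers. As a rough local sanity check, however, note-level agreement between a reference MIDI file and a transcription can be measured with the open-source mir_eval library. The sketch below uses mir_eval's standard note-matching metric, which is not necessarily the official challenge metric:

    import numpy as np
    import mir_eval

    # Notes as (onset, offset) intervals in seconds plus pitches in Hz,
    # e.g. extracted from the reference and estimated MIDI files.
    ref_intervals = np.array([[0.0, 0.5], [0.5, 1.0]])
    ref_pitches = np.array([440.0, 523.25])              # A4, C5
    est_intervals = np.array([[0.01, 0.48], [0.5, 1.02]])
    est_pitches = np.array([440.0, 523.25])

    # Onsets must match within 50 ms; offsets within 20% of the note duration.
    p, r, f1, overlap = mir_eval.transcription.precision_recall_f1_overlap(
        ref_intervals, ref_pitches, est_intervals, est_pitches,
        onset_tolerance=0.05, offset_ratio=0.2)
    print(f"precision={p:.3f} recall={r:.3f} f1={f1:.3f}")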

An Online Competition

The challenge is proudly sponsored by the IEEE Technical Community on Multimedia Computing (TCMC).

Technical Details

Participants will register on the official website ai4musicians.org, where sample music files, including scores and audio recordings, will be provided to assist in model development. Contestants may use any public or proprietary data for training their models. Submissions will be open in April 2025, and each team's program will be executed on a GPU-equipped system at Purdue's Rosen Center for Advanced Computing.

Teams may submit their models once every 24 hours, and a live leaderboard will display performance results based on the sample data. Final rankings will be determined using an additional set of holdout data. A sample open-source solution will be made available to demonstrate input and output formats. Models performing worse than the sample will not be eligible for awards. Winning teams will be invited to present their solutions at the 2025 IEEE ICME Conference.

For questions and updates, participants are encouraged to join the AMT Slack Workspace.

Submission Details

Participants who have registered for the competition must follow these guidelines for submitting their models:

Repository Access

During registration, participants are required to provide a link to their code repository along with a fine-grained access token. This token is necessary for the competition backend to pull the model for execution.

Submission Branch

To submit a model, participants must create a branch titled submission in their repository. The competition backend will automatically pull from this branch to run the model.

Submissions are validated if the following conditions are met:

  • The branch titled submission exists.
  • New commits have been made to the branch since the last successful run.

Environment Configuration

Participants must include an environment.yml file in the root directory of their GitHub repository. This file is used to create a conda environment for model execution. Please ensure that the environment file correctly specifies all dependencies and versions required to run the model.
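
For reference, a minimal environment.yml might look like the following; the environment name and packages are illustrative, and you should list whatever your model actually requires:

    name: amt-submission
    channels:
      - conda-forge
    dependencies:
      - python=3.10
      - pip
      - pip:
          - pretty_midi==0.2.10
          - librosa==0.10.1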

Model Execution Requirements

To run the model, the repository must contain a file named main.py located in the root directory. This script must accept the following command-line arguments:

  • -i: Path to the input audio file (in .mp3 format).
  • -o: Path to save the output MIDI file.

Example usage:

python main.py -i input.mp3 -o output.midi

All paths and directories referenced in the main.py file should be relative to its own location. This ensures compatibility with the backend environment.
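
A minimal main.py skeleton satisfying these requirements might look like the following. The transcription step is a placeholder that writes an empty MIDI file (using the pretty_midi library, assumed to be declared in environment.yml) where your model's output would go:

    import argparse
    import pathlib

    import pretty_midi  # assumed to be declared in environment.yml

    # Resolve everything relative to this script's location, as required above.
    BASE_DIR = pathlib.Path(__file__).resolve().parent

    def main():
        parser = argparse.ArgumentParser(description="AMT Challenge entry point")
        parser.add_argument("-i", required=True, help="path to the input .mp3 file")
        parser.add_argument("-o", required=True, help="path for the output MIDI file")
        args = parser.parse_args()

        # Placeholder for the actual model: load the audio at args.i, run
        # inference, and populate the PrettyMIDI object with the result.
        transcription = pretty_midi.PrettyMIDI()
        transcription.write(args.o)

    if __name__ == "__main__":
        main()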

Input File Naming Convention

The input audio file names include the MIDI instrument codes that specify the instruments used in the recording. The codes follow the General MIDI standard as described here. The file name format is shown in the example below:

Example:

1._0_40_70.mp3

In this example, the numbers 0, 40, and 70 correspond to the MIDI instrument codes of the instruments present in the audio file. Participants may choose to use this instrument information or ignore it. However, if the model incorrectly identifies the instruments, there will be a scoring penalty. The penalty is reduced if the mistake involves instruments from similar families.
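
As an illustration, the instrument codes can be recovered from the file name with a few lines of Python. This sketch assumes the format shown above, with the piece index first and the codes separated by underscores; the instrument-name lookup uses the pretty_midi library:

    import pathlib
    import pretty_midi

    def instrument_codes(audio_path):
        """Extract General MIDI program numbers from a name like '1._0_40_70.mp3'."""
        stem = pathlib.Path(audio_path).stem   # '1._0_40_70'
        tokens = stem.split("_")[1:]           # drop the leading piece index
        return [int(t) for t in tokens]

    codes = instrument_codes("1._0_40_70.mp3")
    print(codes)  # [0, 40, 70]
    print([pretty_midi.program_to_instrument_name(c) for c in codes])
    # ['Acoustic Grand Piano', 'Violin', 'Bassoon']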

Submission Process

As long as the submission branch exists and new commits are detected, the system will automatically run the model. The backend will handle the following:

  • Pulling the latest code from the submission branch.
  • Creating a conda environment using the provided environment.yml file.
  • Executing the model using the specified command.
  • Sending email notifications to the participant, including:
    • Status updates on the model run.
    • Input and output of the model.
    • Performance statistics of the model.
    • The SLURM job output, allowing the participant to verify that the backend successfully processed their model.

Model Weights

Participants may use Git LFS (Large File Storage) for managing model weights within their repository. The backend supports Git LFS, so there are no restrictions on using large files.

Schedule

The following schedule outlines the key dates for the 2025 Automatic Music Transcription Challenge. Please refer to these dates to stay on track throughout the competition.

11/20/2024: Release of ten sample compositions from each contributing composer (available to contestants).
12/15/2024: Competition announcement.
12/31/2024: Release of ten additional compositions from each composer (not available to contestants).
01/31/2025: Registration opens.
02/01/2025: Sample solution release.
04/01/2025: Submission window opens.
05/01/2025: Submission window closes.
05/15/2025: Winner announcement.
06/2025: Presentation of winning solutions at the 2025 IEEE ICME conference.

Cash Awards

The top-performing teams will be awarded cash prizes as follows:

  • First Place: Up to $1,500 USD
  • Second Place: Up to $1,000 USD
  • Third Place: $500 USD

To receive the cash award, winners must open-source their solutions as specified in the registration agreement. The submitted code must include adequate documentation to ensure reproducibility. The organizers will conduct a thorough examination of the source code before officially announcing the winners.

Please note that cash awards will only be provided to participants from countries not subject to United States embargoes or sanctions. In some cases, the award may take the form of travel grants, covering conference registration, hotel accommodations, and airfare.

Sample Compositions

To provide participants with a clear understanding of the type of music their models will be evaluated on, we are releasing 20 sample compositions. These compositions feature a diverse range of instrumental arrangements, allowing participants to fine-tune their models for optimal performance.

The following instruments are included in the sample compositions:

  1. Piano
  2. Violin
  3. Cello
  4. Flute
  5. Bassoon
  6. Trombone
  7. Oboe
  8. Viola

Each sample composition is provided as an MP3 audio file along with its corresponding sheet music, available in both MIDI and PDF formats. These files serve as a representative dataset to assist participants in developing and testing their models.

You can access the sample compositions here: Sample Compositions Google Drive Link

Sample Solution

To assist participants in developing their models, we provide a reference implementation of the MT3 (Multi-Task Multitrack Music Transcription) model. MT3, developed by Google's Magenta team, is an advanced model designed to transcribe music involving multiple instruments simultaneously. It utilizes a Transformer-based architecture to process audio inputs and generate accurate musical notations.

To further support participants, we have created a dedicated repository that adapts MT3 for the competition environment. This repository demonstrates how to set up and run the model within the submission framework.

You can access the MT3 competition-ready implementation here:

Additional Sample Implementations

In addition to MT3, we provide implementations of two other models to offer a broader perspective on music transcription approaches.

Participants are encouraged to study these implementations as foundational models to gain insights into effective music transcription architectures. They serve as valuable starting points for developing and refining your own transcription models.

Note: To qualify as a winning submission, your model must achieve a transcription accuracy score higher than MT3. This criterion ensures that new models demonstrate meaningful progress in the field of music transcription.

Organizers

The 2025 Automatic Music Transcription Challenge is organized by leading experts from multiple institutions, dedicated to advancing research in music transcription and computational musicology.

  • Kristen Yeon-Ji Yun (yun98@purdue.edu), Department of Music, Purdue University
  • Yung-Hsiang Lu (yunglu@purdue.edu), School of Electrical and Computer Engineering, Purdue University
  • George K. Thiruvathukal (gthiruvathukal@luc.edu), Department of Computer Science, Loyola University Chicago
  • Tae Hong Park (thp@purdue.edu), Department of Music, Purdue University
  • Harry Bulow (hbulow@purdue.edu), Department of Music, Purdue University
  • Ojas Chaturvedi (ochaturv@purdue.edu), Department of Computer Science, Purdue University
  • Kayshav Bhardwaj (bhardw43@purdue.edu), Department of Liberal Arts, Purdue University

Contributing Composers

The following composers have contributed music to the challenge, helping to establish a diverse and realistic dataset for model training and evaluation.

  • Harry Bulow (hbulow@purdue.edu), Department of Music, Purdue University
  • Tae Hong Park (thp@purdue.edu), Department of Music, Purdue University
  • Hubert Howe (hubert.howe@gmail.com), Juilliard School of Music (retired)
  • Ka-Wai Yu (ka-wai.yu@utahtech.edu), Department of Music, Utah Tech University

For Contributing Composers

The 2025 challenge features invited composers to ensure controlled complexity of the musical data. The objective is to produce high-quality compositions that allow meaningful evaluation of automatic transcription models.

Composers retain the copyright of their works while granting royalty-free, non-exclusive rights to the challenge organizers for redistribution and analysis. The organizers may modify compositions to meet scoring requirements.

Each invited composer is expected to contribute between 10 and 30 pieces, each approximately 20 seconds long, spanning three difficulty levels: easy, medium, and difficult.

Guidelines for Composition:

  • Tempo: 60-90 bpm
  • Pitch Range: C2 to C7
  • Smallest rhythmic duration: sixteenth-notes/rests
  • No swing rhythms; use precise notation
  • No doubly-dotted notes
  • No trills or mordents
  • Meters: 3/4, 4/4, 6/8
  • Dynamic range: pp to ff
  • Up to three distinct instruments per composition, avoiding combinations like violin and cello due to similar timbre
  • Submit files in PDF (score), MusicXML, and MIDI formats
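
Composers who wish to sanity-check a MIDI file against these constraints can use a rough script such as the one below (a sketch using the open-source pretty_midi library; the limits are transcribed from the list above, with C2-C7 mapped to MIDI note numbers 36-96):

    import pretty_midi

    PITCH_MIN, PITCH_MAX = 36, 96   # C2..C7 as MIDI note numbers
    TEMPO_MIN, TEMPO_MAX = 60, 90   # bpm, per the guidelines above

    def check_midi(path):
        pm = pretty_midi.PrettyMIDI(path)
        _, tempi = pm.get_tempo_changes()
        for tempo in tempi:
            if not TEMPO_MIN <= tempo <= TEMPO_MAX:
                print(f"tempo {tempo:.1f} bpm outside {TEMPO_MIN}-{TEMPO_MAX}")
        for inst in pm.instruments:
            for note in inst.notes:
                if not PITCH_MIN <= note.pitch <= PITCH_MAX:
                    name = pretty_midi.note_number_to_name(note.pitch)
                    print(f"pitch {name} (MIDI {note.pitch}) outside C2-C7")

    check_midi("my_piece.mid")   # hypothetical file name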

The selected instrument set is as follows:

  1. Piano
  2. Violin
  3. Cello
  4. Flute
  5. Bassoon
  6. Trombone
  7. Oboe
  8. Viola

Frequently Asked Questions

Q: Will a leaderboard be provided?
A: Yes, a live leaderboard will display the ranking of participating teams based on the performance of their models.

Q: If my team is ranked No. 1 on the leaderboard, does that guarantee the No. 1 spot in the final winner announcement?
A: Not necessarily. The final ranking will be determined on a set of holdout data that differs from the public sample data, so the final evaluation reflects each model's ability to generalize.

Q: How can the No. 1 team on the leaderboard fail to become the final winner?
A: This can happen if a model overfits the public sample data and performs poorly on the holdout data. The final ranking reflects the model's ability to generalize beyond the public samples.

Q: Will the organizers provide training data?
A: The competition provides sample data so that participants understand the expected input and output formats. Contestants are free to use any data, public or proprietary, to train their models.

Q: Can industry professionals participate?
A: Yes, industry participation is welcome.

Q: Can industry teams collaborate with academia?
A: Yes, partnerships between industry and academic teams are encouraged.

Q: Is open-sourcing required to receive an award?
A: Yes, open-sourcing the solution is mandatory for winners. Cash awards will be provided only after the winning models are made publicly available. This requirement is stated in the registration agreement.

Q: Can I participate and receive a ranking without open-sourcing my solution?
A: Yes, non-winning participants are not required to open-source their solutions. However, a winning team that chooses not to open-source its model will not receive a cash award.

Q: What license should winners use for their solutions?
A: Winners are free to choose the license under which they release their code. The competition does not mandate a specific license.

Q: Can winners publish academic papers about their solutions?
A: Yes, winners are encouraged to publish their findings. The organizers are also exploring opportunities for a special journal issue dedicated to the competition results.

Q: Is presenting at ICME mandatory for winners?
A: No, presenting at the 2025 ICME is not mandatory, but winners are invited to share their work at the conference.