Artificial Intelligence for Music
A workshop at the 2025 IEEE International Conference on Multimedia and Expo (ICME)
Date: Monday, June 30, 2025
Workshop Summary
Music is an essential component of multimedia content. This workshop explores the dynamic intersection of artificial intelligence and music, investigating how AI is changing the music industry and education, from composition to performance, production, collaboration, and audience experience. Participants will gain insights into the ways AI can enhance creativity and enable musicians and producers to push the boundaries of their art. The workshop will also discuss AI's impacts on music education and the careers of musicians. We will cover topics such as AI-driven music composition, where algorithms generate melodies, harmonies, and even full orchestral arrangements. Computer-generated music may be combined with computer-generated video to create complete multimedia content. The workshop will discuss how AI tools can assist in sound design, remixing, and mastering, allowing for new sonic possibilities and efficiencies in music production. Additionally, the workshop will discuss the legal and ethical implications of AI in music, including questions of authorship, originality, and the role of the human artist in an increasingly automated world. This workshop is designed for AI researchers, musicians, producers, and educators interested in the current status and future of AI in music.
The organizing team will hold a competition on Automatic Music Transcription (AMT). This online competition will accept submissions worldwide, from both academia and industry. The winners will present their solutions at this ICME workshop. The competition is sponsored by the IEEE Technical Community on Multimedia Computing (TCMC) and the IEEE Computer Society. More details about the challenge will be available on the workshop website.
Call for Papers
This one-day workshop will explore the dynamic intersection of artificial intelligence and multimedia, with an emphasis on music and audio technologies. It examines how AI is transforming music creation, recognition, and education, along with the ethical and legal implications and business opportunities. We will investigate how AI is changing the music industry and education, from composition to performance, production, collaboration, and audience experience. Participants will gain insights into the technological challenges in music and how AI can enhance creativity, enabling musicians and producers to push the boundaries of their art. The workshop will cover topics such as AI-driven music composition, where algorithms generate melodies, harmonies, and even full orchestral arrangements. We will discuss how AI tools assist in sound design, remixing, and mastering, allowing for new sonic possibilities and efficiencies in music production. Additionally, we will examine AI's impact on music education and the careers of musicians, exploring advanced learning tools and teaching methods. AI technologies are increasingly adopted in the music and entertainment industry. The workshop will also discuss the legal and ethical implications of AI in music, including questions of authorship, originality, and the evolving role of human artists in an increasingly automated world. This workshop is designed for AI researchers, musicians, producers, and educators interested in the current status and future of AI in music.
Topics of Interest
Topics of interest include, but are not limited to:
- AI-Driven Music Composition and Generation
- AI in Music Practice and Performance
- AI-based Music Recognition and Transcription
- AI Applications in Sound Design
- AI-Generated Videos to Accompany Music
- AI-Generated Lyrics Based on Music
- Legal and Ethical Implications of AI in Music
- AI's Impacts on Musicians' Careers
- AI-Assisted Music Education
- Business Opportunities in AI and Music
- Music Datasets and Data Analysis
Submission Requirements
Please follow the submission requirements of ICME 2025. Papers must be no longer than 6 pages, including all text, figures, and references. This workshop follows the ICME submission process and adopts double-blind review. Authors must not identify themselves in the submitted PDF files.
Work in progress is welcome. Authors are encouraged to include descriptions of their prototype implementations. Additionally, authors are encouraged to interact with workshop attendees by including posters or demonstrations at the end of the workshop. Conceptual designs without any evidence of practical implementation are discouraged.
The authors agree that papers submitted to this workshop have not been previously published or accepted in substantially similar form. Furthermore, authors should not submit papers that have significant overlap with papers currently under review at a conference or journal.
Submit papers to CMT.
Important Dates
- Submission Deadline: April 1, 2025 (11:59PM Pacific Time)
- Notification of Acceptance: April 25, 2025
- Final Version Due: May 15, 2025
Accepted papers will be posted on the workshop website and in IEEE Xplore.
Workshop Schedule
Time | Topic |
---|---|
08:30AM | Welcome by Organizers: Yung-Hsiang Lu and Yeon-Ji Yun |
08:40AM | Keynote Speech by Zhiyao Duan. Moderator: Yeon-Ji Yun |
09:30AM | Invited Speech by Fatemeh Jamshidi. Moderator: Yeon-Ji Yun |
10:10AM | Break |
10:20AM | Invited Speech by Gus Xia. Moderator: Emmanouil Benetos |
11:00AM | Invited Speech by Geoffroy Peeters. Moderator: Emmanouil Benetos |
11:40AM | Discussion with the Morning Speakers. Moderator: Emmanouil Benetos |
12:00PM | Lunch Break |
01:00PM | Invited Speech by Emmanouil Benetos. Moderator: Zhiyao Duan |
01:40PM | Paper Presentations. Moderator: Zhiyao Duan |
03:20PM | Break |
03:30PM | Panel Discussion. Moderator: Gus Xia. Panelists: Geoffroy Peeters, Emmanouil Benetos, Zhiyao Duan, Ziyu Wang |
04:30PM | Winners of the Transcription Challenge. Moderator: Yung-Hsiang Lu |
05:00PM | Adjourn |
Invited Speakers

Geoffroy Peeters
Geoffroy Peeters is a full professor in the S2A team of the LTCI (Laboratoire Traitement et Communication de l'Information) at Télécom Paris. He received his Ph.D. in 2001 and his Habilitation in 2013 from University Paris VI, both on audio signal processing, data analysis, and machine learning. Before joining Télécom Paris, he led research on Music Information Retrieval at IRCAM (Institut de Recherche et Coordination Acoustique/Musique). His current research focuses on signal processing, machine learning, and deep learning applied to audio and music data analysis.
Self-Supervised Learning for Invariant and Equivariant Representations
Abstract: Self-supervised learning aims to apply supervised learning algorithms without the need for annotated data. It can therefore offer a solution for training ML-based systems in music, a domain where annotated data is often scarce. In this talk, we review recent advances in self-supervised learning applied to music, focusing on its two main paradigms: invariance (e.g., contrastive, masking, teacher-student, clustering, information-based, multi-modal) and equivariance. More precisely, we present our contributions: MatPac as a foundation model, Stem-JEPA for generation, PESTO for pitch, PESTO-T for tempo, and CPC for beat detection.
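For readers less familiar with the invariance paradigm mentioned in the abstract, the sketch below shows a generic contrastive (InfoNCE-style) objective on audio-clip embeddings. It is a minimal illustrative example with assumed batch shapes and a placeholder encoder, not the approach used in MatPac, Stem-JEPA, PESTO, PESTO-T, or CPC.

```python
import torch
import torch.nn.functional as F

def info_nce_loss(z1: torch.Tensor, z2: torch.Tensor, temperature: float = 0.1) -> torch.Tensor:
    """Generic contrastive (InfoNCE) loss for invariance-based self-supervision.

    z1, z2: embeddings of two augmented "views" of the same batch of audio
    clips, shape (batch, dim). Matching rows are positives; all other rows
    in the batch serve as negatives.
    """
    z1 = F.normalize(z1, dim=-1)
    z2 = F.normalize(z2, dim=-1)
    logits = z1 @ z2.t() / temperature               # (batch, batch) similarity matrix
    targets = torch.arange(z1.size(0), device=z1.device)
    return F.cross_entropy(logits, targets)          # pull positives together, push negatives apart

# Hypothetical usage: embeddings produced by any audio encoder from two
# augmentations (e.g., different crops) of the same clips.
batch, dim = 16, 128
z_view1, z_view2 = torch.randn(batch, dim), torch.randn(batch, dim)
loss = info_nce_loss(z_view1, z_view2)
```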

Zhiyao Duan
Zhiyao Duan is an associate professor in Electrical and Computer Engineering, Computer Science, and Data Science at the University of Rochester. He is also a co-founder of Violy, a company aiming to improve music education through AI. His research interests are in computer audition and its connections with computer vision, natural language processing, and augmented and virtual reality. He received a best paper award at the Sound and Music Computing (SMC) Conference in 2017, a best paper nomination at the International Society for Music Information Retrieval (ISMIR) Conference in 2017, and a CAREER award from the National Science Foundation (NSF). His work has been funded by the NSF, the National Institutes of Health, the National Institute of Justice, the New York State Center of Excellence in Data Science, and University of Rochester internal awards on AR/VR, health analytics, and data science. He is a senior area editor of IEEE Signal Processing Letters, an associate editor for the IEEE Open Journal of Signal Processing, and a guest editor for Transactions of the International Society for Music Information Retrieval. He is the President of ISMIR.

Fatemeh Jamshidi
Fatemeh Jamshidi is an Assistant Professor in the Department of Computer Science at Cal Poly Pomona. Her research spans artificial intelligence, computer science education, computer music, machine learning and deep learning in music, game AI, human-AI collaboration, as well as augmented and mixed reality. She has published in prestigious venues, including ACM SIGCSE, ISMIR, IEEE, and HCII. Fatemeh earned her Ph.D. in Computer Science and Software Engineering and a master's in Music Education from Auburn University in 2024 and 2023, respectively. During her Ph.D., she founded the Computing + Music programs, which have engaged hundreds of participants from underrepresented groups since 2018. From 2020 to 2023, she also served as the Director of the Persian Music Ensemble at Auburn University. Her long-term goal is to establish a music technology center that fosters undergraduate and graduate research in areas such as music therapy, music generation, game music, and mixed reality in music.

Gus Xia
Gus Xia is an assistant professor of Machine Learning at the Mohamed bin Zayed University of Artificial Intelligence in Masdar City, Abu Dhabi. His research includes the design of interactive intelligent systems to extend human musical creation and expression. This research lies at the intersection of machine learning, human-computer interaction, robotics, and computer music. Some representative works include interactive composition via style transfer, human-computer interactive performances, autonomous dancing robots, large-scale content-based music retrieval, haptic guidance for flute tutoring, and bio-music computing using slime mold.

Emmanouil Benetos
Emmanouil Benetos is Reader in Machine Listening and Director of Research at the School of Electronic Engineering and Computer Science of Queen Mary University of London. Within Queen Mary, he is a member of the Centre for Digital Music and the Centre for Multimodal AI, is Deputy Director of the UKRI Centre for Doctoral Training in AI and Music (AIM), and co-leads the School's Machine Listening Lab. His main area of research is computational audio analysis, also referred to as machine listening or computer audition, with applications to music, urban, everyday, and nature sounds.
Machine learning paradigms for music and audio understanding
Abstract: The area of computational audio analysis, also called machine listening, continues to evolve. Starting from methods grounded in digital signal processing and acoustics, followed by supervised machine learning methods that require large amounts of labelled data, recent approaches for learning music audio representations are fueled by advances in the broader field of artificial intelligence. The talk will outline recent research carried out at the Centre for Digital Music of Queen Mary University of London, focusing on emerging learning paradigms for making sense of music and audio data. Topics covered will include learning in the presence of limited audio data, the inclusion of other modalities such as natural language to aid learning music representations, and finally methods for learning from unlabelled audio data, with the latter serving as a first step towards the creation of music foundation models.