March 3, 2025 · Co-located with AAAI 2025

Artificial Intelligence for Music

A one-day workshop exploring the dynamic intersection of artificial intelligence and music — composition, performance, production, education, and the legal and ethical implications of AI in the music industry.

Workshop summary

This one-day workshop explores the dynamic intersection of artificial intelligence and music. It examines how AI is transforming music creation, recognition, and education, along with the ethical and legal implications and the business opportunities it creates. We will investigate how AI is changing the music industry and education, from composition to performance, production, collaboration, and audience experience. Participants will gain insights into the technological challenges in music and how AI can enhance creativity, enabling musicians and producers to push the boundaries of their art.

AI technologies are increasingly adopted in the music and entertainment industry. The workshop will cover topics such as AI-driven music composition, where algorithms generate melodies, harmonies, and even full orchestral arrangements. We will discuss how AI tools assist in sound design, remixing, and mastering, allowing for new sonic possibilities and efficiencies in music production. We will also examine AI's impact on music education and the careers of musicians, exploring advanced learning tools and teaching methods.

The workshop will also discuss the legal and ethical implications of AI in music, including questions of authorship, originality, and the evolving role of human artists in an increasingly automated world. This workshop is designed for AI researchers, musicians, producers, and educators interested in the current status and future of AI in music.

Topics

What we cover

  • AI-Driven Music Composition and Generation
  • AI in Music Practice and Performance
  • AI-based Music Recognition and Transcription
  • AI Applications in Sound Design
  • AI-Generated Videos to Accompany Music
  • AI-Generated Lyrics Based on Music
  • Legal and Ethical Implications of AI in Music
  • AI's Impacts on Musicians' Careers
  • AI-Assisted Music Education
  • Business Opportunities of AI and Music
  • Music Datasets and Data Analysis

Schedule

March 3, 2025

Time      Session
09:00 AM  Welcome by Organizers
09:10 AM  Invited Speech: Zhiyao Duan. Moderator: Kristen Yeon-Ji Yun
09:50 AM  Invited Speech: Miguel Willis. Moderator: Kristen Yeon-Ji Yun
10:30 AM  Break
10:40 AM  Paper Presentations (full list under “Accepted papers” below). Moderator: George K. Thiruvathukal
12:30 PM  Lunch Break
01:00 PM  Invited Speech: Hao-Wen (Herman) Dong. Moderator: Yung-Hsiang Lu
01:40 PM  Invited Speech: Ziyu Wang, on behalf of Gus Xia (Dr. Xia is unable to present in person). Moderator: Yung-Hsiang Lu
02:20 PM  Panel Discussion organized by Hao-Wen Dong
03:20 PM  Break
03:30 PM  Poster Presentations and Participant Interaction
04:30 PM  Open Discussion: Future of AI and Music. Moderator: Zhiyao Duan
05:00 PM  Adjourn

Invited speakers

Talks

Hao-Wen (Herman) Dong

University of Michigan · Performing Arts Technology

Generative AI for Music: Challenges and Opportunities

Generative AI has been transforming the way we interact with technology and consume content. The recent successes of LLM-based chatbots, AI assistants, text-to-image, text-to-video, and text-to-music systems have showcased how AI can augment human creativity and boost human productivity. In the next decade, generative AI technology will also reshape how we create music in the music, film, TV, podcast, and gaming industries across the entertainment, commercial, and education sectors. In the first half of this talk, I will introduce some of our recent work on the various applications of generative AI in music creation, including multitrack music generation, automatic instrumentation, and violin performance synthesis. In the second half, I will discuss the unique challenges of applying, scaling, and deploying generative AI music models in practice. Finally, I will discuss research opportunities towards controllable and interactable generative AI systems for music.

Hao-Wen (Herman) Dong is an Assistant Professor in the Performing Arts Technology Department at the University of Michigan. Herman's research aims to empower music and audio creation with machine learning. His long-term goal is to lower the barrier of entry for music composition and democratize audio content creation. He is broadly interested in music generation, audio synthesis, multimodal machine learning, and music information retrieval. Herman received his PhD in Computer Science from UC San Diego, where he worked with Julian McAuley and Taylor Berg-Kirkpatrick. His research has been recognized by the UCSD CSE Doctoral Award for Excellence in Research, KAUST Rising Stars in AI, UChicago and UCSD Rising Stars in Data Science, ICASSP Rising Stars in Signal Processing, and UCSD GPSA Interdisciplinary Research Award.

Zhiyao Duan

University of Rochester · ECE / CS / Data Science

AI Powered Interactive Music Making

Artificial Intelligence (AI) is profoundly transforming human society. As one of the most fun activities with a rich history, music making is being impacted by AI as well. Powered by AI, we can diversify the forms of music interaction, improve the productivity of musicians, lower barriers to entry for music interaction, and enlarge the music-making population. At the same time, however, generative AI poses significant challenges to current copyright regulation and economic norms. In this talk, I share my thoughts about these opportunities and challenges and introduce our recent work on real-time human–AI collaborative improvisation, Chinese–Western music style fusion, music inpainting from hand-drawn curves, software framework development for deploying music AI models on the web and to DAWs, and singing-voice deepfake detection.

Zhiyao Duan is an associate professor in Electrical and Computer Engineering, Computer Science, and Data Science at the University of Rochester. He is also a co-founder of Violy, a company aiming to improve music education through AI. His research interest is in computer audition and its connections with computer vision, natural language processing, and augmented and virtual reality. He received a best paper award at the Sound and Music Computing (SMC) Conference in 2017, a best paper nomination at ISMIR in 2017, and a CAREER award from NSF. His work has been funded by NSF, NIH, NIJ, the New York State Center of Excellence in Data Science, and University of Rochester internal awards on AR/VR, health analytics, and data science. He is a senior area editor of IEEE Signal Processing Letters, an associate editor for IEEE Open Journal of Signal Processing, and a guest editor for TISMIR. He is the President of ISMIR.

Miguel Willis

University of Pennsylvania Law School · Future of the Profession Initiative

Miguel Willis is the Innovator in Residence at the Law School's Future of the Profession Initiative (FPI), University of Pennsylvania. He concurrently serves as the Executive Director of Access to Justice Tech Fellows, a national nonprofit organization that develops summer fellowships for law students seeking to leverage technology to create equitable legal access for low-income and marginalized populations. Prior to joining FPI, Willis served as the Law School Admissions Council's (LSAC) inaugural Presidential Innovation Fellow. Willis currently serves on the advisory board of the University of Arizona James E. Rogers College of Law's Innovation for Justice (i4J) program and serves on The Legal Services Corporation's Emerging Leaders Council.

Gus Xia

MBZUAI · Machine Learning

Gus Xia is an assistant professor of Machine Learning at the Mohamed bin Zayed University of Artificial Intelligence in Masdar City, Abu Dhabi. His research includes the design of interactive intelligent systems to extend human musical creation and expression. This research lies at the intersection of machine learning, human–computer interaction, robotics, and computer music. Some representative works include interactive composition via style transfer, human–computer interactive performances, autonomous dancing robots, large-scale content-based music retrieval, haptic guidance for flute tutoring, and bio-music computing using slime mold.

Ziyu Wang

NYU / NYU Shanghai · MBZUAI (visiting)

From Imitation to Creation: When Music AI Truly Understands

Recent advances in generative AI have led to impressive achievements in music generation. Yet, a fundamental challenge remains: how can these black-box models move beyond imitating music data to truly understand human creative intent and collaborate meaningfully with humans? We argue that the missing piece is a deeper alignment between humans and AI — one that involves shared musical concepts, structured knowledge, and even the principles behind how we learn music. In this talk, I will explore various approaches to establish such alignment in generative modeling, which naturally enhances model interpretability and controllability in music generation. Ultimately, we may find that the heart of this alignment challenge lies in understanding content and style, a timeless question that resonates across art and life.

Ziyu Wang is a PhD candidate in Computer Science at the Courant Institute of Mathematical Sciences, New York University, and holds an affiliation with NYU Shanghai. Currently, he is also a visiting scholar in the Machine Learning Department at MBZUAI. His research is conducted under the supervision of Prof. Gus Xia in Music X Lab, where he explores the intersection of music and machine learning. In 2019, he earned his undergraduate degree in Mathematics from Fudan University. Beyond his academic pursuits, he is a passionate conductor, pianist, and Erhu (a traditional Chinese string instrument) player. He has previously served as the conductor of the NYU Shanghai Jazz Ensemble and as the director of the Fudan Musical Club.

Accepted papers

9 accepted contributions

  1. Evaluating Interval-based Tokenization for Pitch Representation in Symbolic Music Analysis
    Dinh-Viet-Toan Le (Université de Lille); Louis Bigo (Université de Bordeaux); Mikaela Keller (Université de Lille)
    Paper ↗ · Slides ↗
  2. Frechet Music Distance: A Metric For Generative Symbolic Music Evaluation
    Jan Retkowski; Jakub Stępniak; Mateusz Modrzejewski (Warsaw University of Technology)
    Paper ↗ · Slides ↗ · Project ↗
  3. M2M-Gen: A Multimodal Framework for Automated Background Music Generation in Japanese Manga Using Large Language Models
    Megha Sharma (Tokyo); Muhammad Haseeb (MBZUAI); Guangyu Xia (NYU Shanghai); Yoshimasa Tsuruoka (Tokyo)
    Paper ↗ · Slides ↗ · Project ↗
  4. AffectMachine-Pop: A controllable expert system for real-time pop music generation
    Kat Agres; Adyasha Dash; Phoebe Chua (NUS); Stefan Ehrlich (SETLabs Research GmbH)
    Paper ↗ · Slides ↗
  5. Understanding Unscripted Music Practice
    Christopher Raphael; Peter Miksza; Brenda Brenner (Indiana University)
    Paper ↗ · Project ↗
  6. Hidden Echoes Survive Training in Audio To Audio Generative Instrument Models
    Christopher Tralie (Ursinus College); Matt Amery (Create And Innovate UK); Ben Douglas, Ian Utz (Ursinus College)
    Paper ↗ · Slides ↗ · Project ↗
  7. Revisiting Your Memory: Reconstruction of Affect-Contextualized Memory via EEG-guided Audiovisual Generation
    Joonwoo Kwon, Heehwan Wang, Jinwoo Lee, Sooyoung Kim (SNU); Shinjae Yoo, Yuewei Lin (Brookhaven); Jiook Cha (SNU)
    Paper ↗ · Project ↗
  8. MMVA: Multimodal Matching Based on Valence and Arousal across Images, Music, and Musical Captions
    Suhwan Choi (Crabs.ai); Kyu Won Kim (Independent); Myungjoo Kang (SNU)
    Paper ↗
  9. Towards Music Industry 5.0: Perspectives on Artificial Intelligence
    Alexander Williams; Mathieu Barthet (QMUL / Aix-Marseille)
    Paper ↗ · Slides ↗ · Project ↗

Call for papers

Submission information

Submission requirements

Submissions are limited to 6 pages, excluding references. Work in progress is welcome. Authors are encouraged to include descriptions of their prototype implementations; conceptual designs without any evidence of practical implementation are discouraged. Authors are also encouraged to interact with workshop attendees by presenting posters or demonstrations at the end of the workshop.

Papers must be formatted in AAAI two-column, camera-ready style; see the AAAI-25 author kit for details. Papers must be in trouble-free, high-resolution PDF format, formatted for US Letter (8.5″ × 11″) paper, using Type 1 or TrueType fonts. AAAI submissions are anonymous and must conform to the instructions for double-blind review — authors must remove all author and affiliation information from their submission, and may replace it with paper number and keywords. Submissions may consist of up to 6 pages of technical content plus additional pages solely for references; acknowledgements should be omitted from papers submitted for review. Only PDF files are required at the time of submission for review.

Paper format

Submissions follow the AAAI 2025 Main Technical Call for Papers ↗.

Submission portal

Submit via the CMT Submission Portal ↗.

Important dates

  • Submission Deadline: November 22, 2024
  • Notification of Acceptance: December 9, 2024
  • Final Version Due: December 31, 2024

Accepted papers are posted on the workshop website.

Organizers

Workshop chairs

Yung-Hsiang Lu

Purdue University · Electrical and Computer Engineering

Yung-Hsiang Lu is a professor in the Elmore Family School of Electrical and Computer Engineering at Purdue University. He is a fellow of the IEEE and a distinguished scientist of the ACM. Yung-Hsiang has published papers on computer vision and machine learning in venues such as AI Magazine, Nature Machine Intelligence, and Computer. He is one of the editors of the book "Low-Power Computer Vision: Improve the Efficiency of Artificial Intelligence" (Chapman & Hall, 2022).

Kristen Yeon-Ji Yun

Purdue University · Music

Kristen Yeon-Ji Yun is a clinical associate professor in the Department of Music at the Patti and Rusty Rueff School of Design, Art, and Performance at Purdue University. She is the Principal Investigator of the research project "Artificial Intelligence Technology for Future Music Performers" (US National Science Foundation, IIS 2326198). Kristen is an active soloist, chamber musician, musical scholar, and clinician. She has toured many countries — Malaysia, Thailand, Germany, Mexico, Japan, China, Hong Kong, Spain, France, Italy, Taiwan, and South Korea — giving a series of successful concerts and master classes.

George K. Thiruvathukal

Loyola University Chicago · Computer Science

George K. Thiruvathukal is a professor and chairperson of Computer Science at Loyola University Chicago and a visiting computer scientist at Argonne National Laboratory. His research interests include high-performance computing and distributed systems, programming languages, software engineering, machine learning, digital humanities, and arts (primarily music). George has authored multiple books including "Software Engineering for Science" (Chapman & Hall/CRC, 2016), "Web Programming: Techniques for Integrating Python, Linux, Apache, and MySQL" (Prentice Hall, 2001), and "High-Performance Java Platform Computing" (Prentice Hall, 2000).

Benjamin Shiue-Hal Chou

Purdue University · PhD Student

Benjamin Shiue-Hal Chou is a PhD student in Electrical and Computer Engineering at Purdue University, supervised by Dr. Yung-Hsiang Lu. His research focuses on AI applications in music technology, particularly on detecting errors in music performances. Benjamin co-authored "Token Turing Machines are Efficient Vision Models" (arXiv:2409.07613, 2024). He earned his BS in Electrical Engineering from National Cheng Kung University (NCKU) in Taiwan, receiving awards such as the Outstanding Student Scholarship, Transnational Research Scholarship Grant, and the Tainan City Digital Governance Talent Award.