Keunwoo Choi

PhD · AI Researcher / Engineer · New York City · last.first@gene.com

I lead a team of MLEs to train internal large language models (LLMs) for therapeutic applications at Roche / Genentech.

Previously, I applied machine learning and deep learning techniques to music and audio AI problems such as music tagging, source separation, and transcription, as well as audio understanding and synthesis.


Experience

Senior Principal ML Scientist

Genentech / Roche, New York, USA

Leading the development of large language models.

March 2023 - Present

Adjunct Professor

DM-GY 9103/Q Deep Learning for Media

Spring 2024

AI Advisor (Former AI Scientist/Director)

Gaudio Lab, Seoul, Korea

Led an AI music/audio team covering music source separation, audio generation, and 3D audio; now advising.

March 2022 - Present

Senior Research Scientist

ByteDance AI Lab, Mountain View, USA

Large-scale deep learning systems for music and video content creation, social media, and music streaming services. Music research roadmapping for scalable and sustainable growth.

August 2020 - March 2022

Research Scientist

Spotify, New York, USA

Music content understanding and discovery for 300M+ users of the global leader in the music industry.

July 2018 - August 2020

Researcher

ETRI, Daejeon, South Korea

3D audio reproduction systems for immersive acoustic experiences.

February 2010 - June 2014

Experience: LLM

From First Hire to Team Leader

Genentech / Roche, New York, USA

I joined Genentech as the inaugural member of the LLM team, tasked with navigating the rapidly evolving landscape following ChatGPT's initial release. I actually enjoyed the uncertainty—there's something energizing about figuring out real-world constraints and goals to maximize core value. What began as an exploration in uncharted territory has evolved into leading a comprehensive LLM initiative that spans foundational research, model development, and strategic implementation across Roche.

March 2023 - Present

Foundation and Discovery

The early days were marked by profound uncertainty. Investment levels, data acquisition strategies, and training methodologies were all question marks in a field moving at breakneck speed. I identified time as the most constraining factor and adopted a learning-by-doing approach, recognizing that LLMs are inherently empirical and implicit systems.

Delivered: Complete training pipeline, data frameworks, resource management, initial model prototypes.

Earned: Deep understanding of large-scale model development, hands-on experience with large-scale training, and strategic thinking around resource constraints in rapidly evolving fields.

March 2023 - December 2023

Scaling and Leadership

The focus shifted to achieving something real—training and deploying LLMs at enterprise scale. I led critical decisions around model selection, defining desired properties, and ensuring comprehensive data coverage. Simultaneously, I transitioned into a management role while navigating the challenge of finding exceptional talent who would not only join our team but thrive and deliver in our demanding, fast-paced environment.

Delivered: Company-wide LLM objectives, evolving hiring processes, team expansion.

Earned: Leadership experience in high-uncertainty environments, advanced hiring and talent identification skills, team management capabilities, and expertise in scaling technical teams during competitive market conditions.

January 2024 - December 2024

Strategy and Framework

With a larger team in place, my responsibilities expanded to encompass strategic roadmapping for what has become a critical component of the company's AI initiatives. I establish annual goals and quarterly OKRs, supervise team members across all aspects of LLM development, and coordinate with stakeholders throughout the organization.

Delivered: Strategic roadmaps, OKR frameworks, team ownership structures.

Earned: Strategic planning expertise, cross-departmental project management skills, open-source contribution experience, advanced interviewing techniques, and leadership frameworks for distributed technical teams across continents.

January 2025 - Present

Research and Innovation: TalkPlay

Multimodal Generative Recommendation for Music - talkpl.ai

Bridging my expertise in music AI with LLM development, I initiated TalkPlay—among the first LLM-native music recommendation systems. This work explores novel research directions in conversational music discovery, pioneering generative recommendation for music through natural language understanding and multimodal token generation.

The research progression began with foundational work that earned us ISMIR 2023 best paper runner-up recognition. In 2024, we published research on dataset generation for conversational LLM music discovery and recommendation at ISMIR 2024. Building on this foundation, we released the first comprehensive TalkPlay paper in 2025, demonstrating how music recommendation can be reformulated as LLM token generation across multiple modalities.
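To make the "recommendation as token generation" idea concrete, here is a minimal sketch in Python. It is not the actual TalkPlay implementation; the track-token format, prompt layout, and function names are illustrative assumptions. The point it shows is that track IDs can live in the model's vocabulary as special tokens, so recommending a song reduces to the LLM decoding one of those tokens from the conversation.

```python
# Conceptual sketch (not the TalkPlay code): music recommendation framed as
# LLM token generation. Track IDs are serialized as special tokens, so a
# recommendation is simply the next token the model generates.

# Hypothetical special-token format; a real system's tokenization may differ.
def track_token(track_id: str) -> str:
    return f"<track:{track_id}>"

def build_prompt(chat_history: list[str], catalog: list[str]) -> str:
    """Serialize the conversation and candidate catalog into one prompt."""
    lines = ["You are a music recommender. Reply with one track token."]
    lines += [f"user: {turn}" for turn in chat_history]
    lines += ["candidates: " + " ".join(track_token(t) for t in catalog)]
    lines.append("assistant:")
    return "\n".join(lines)

def parse_recommendation(generated: str) -> str | None:
    """Map a generated track token back to a catalog track ID."""
    if generated.startswith("<track:") and generated.endswith(">"):
        return generated[len("<track:"):-1]
    return None

if __name__ == "__main__":
    prompt = build_prompt(
        ["something mellow for a rainy evening"],
        ["0a1b2c", "9f8e7d"],
    )
    print(prompt)
    # In the real pipeline an LLM would generate the next token here; this
    # only illustrates decoding a generated track token into a recommendation.
    print(parse_recommendation("<track:0a1b2c>"))
```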

2024 - Present

Education

Queen Mary University of London

PhD in Computer Science, Centre for Digital Music, London, UK
Supervisors: Mark Sandler and George Fazekas
Thesis title: Deep Neural Networks for Music Tagging
October 2014 - March 2018

New York University

Visiting Scholar, Center for Data Science, New York, USA
Supervisor: Kyunghyun Cho
June 2016 - January 2017

Seoul National University

Master's in Electrical Engineering and Computer Science, Seoul, South Korea
Supervisor: Koeng-Mo Sung
Thesis title: A NMF-Based Mono-To-Stereo Blind Upmix
March 2009 - February 2011

Seoul National University

Bachelor's in Electrical Engineering and Computer Science, Seoul, South Korea
Full tuition fee waiver scholarship (2005 - 2009)
March 2005 - February 2009

Activities

🏆 I received the Best Paper Award at ISMIR 2017, the most prestigious music information retrieval research conference.
🎖️ I was also nominated for the Best Paper Award at ISMIR 2023.

🖥 I am the creator and maintainer of Kapre, a Python package of audio preprocessing layers for TensorFlow (Keras).
⌨️ I am a contributor to librosa, Keras, torchdata, and torchaudio.

📄 I have first-authored 10+ peer-reviewed papers at venues including ISMIR, ICASSP, EUSIPCO, and ICML workshops.
📖 I review papers for NeurIPS, ICML, ICLR, ISMIR, ICASSP, AAAI, and other venues.
💡 I created and have co-organized an AI challenge for sound synthesis in 2023 and 2024.
💡 I am one of the general chairs of ISMIR 2025, with Prof. Juhan Nam and Prof. Dasaem Jeong.

📕 I co-authored a web book about music classification, which we used in our tutorial at ISMIR 2021.
📕 I also wrote another web book, 'LLMs <3 MIR' (May 2023), a tutorial on large language models for music information retrieval.

🎼 I make music as keunwoo.OOO, as a composer, lyricist, singer, and producer.
📹 I record and edit audio and videos for Hayoung Lyou, my wife and an amazing jazz composer / pianist.
✉️ I try to convert my friends into JazzBees through my tiny, personal, and non-intimidating newsletter called JazzBuzz.