I’m Michele Mancusi, a Senior Research Scientist at Sony. My work focuses on deep learning for generative models in speech, audio, and music, using Large Language Models (LLMs) and diffusion models to push the boundaries of what’s possible in audio technology.
Before joining Sony, I gained valuable experience as an intern at Microsoft and Musixmatch. At Microsoft, I worked on deep learning for unsupervised speech separation, and at Musixmatch, I focused on deep learning for singing voice detection.
I earned my Ph.D. from Sapienza University of Rome under the supervision of Prof. Emanuele Rodolà as a member of the Gladia research group. My doctoral research centered on music generation, source separation, and Natural Language Processing (NLP), contributing to advancements in the field of generative AI.
Ph.D. in Computer Science, 2024
Sapienza University of Rome
M.S. in Physics, 2019
Sapienza University of Rome
B.S. in Physics, 2016
Sapienza University of Rome