I make machines that sing, speak, play, and hear.
I am a researcher working at the intersection of audio and deep learning, currently a Music Generation intern at Telecom Paris. I have five years of experience in the field and a background in signal processing. I am also a hobbyist music producer, composer, programmer, and guitarist.
Interests
Developing audio-based machines that are transparent and human-empowering, in topics such as:
- Differentiable Digital Signal Processing
- Singing Voice
- Sound Matching
- Music/Speech Generation & Recognition
- Neural Audio Effects
- Speech Emotion, Age and Gender Recognition
Selected Works
- Interspeech’s SER Challenge (2025): Improving Speech Emotion Recognition Through Cross Modal Attention Alignment and Balanced Stacking Model
- Master’s Thesis (2024): Cross-Speaker Style Transfer for TTS with Singing Voice Conversion Data Augmentation, Style Filtering, and F0 Matching
- Interspeech’s SynData4GenAI (2024): Exploring synthetic data for cross-speaker style transfer in style representation based TTS
- GENEA (2023): Gesture Generation with Diffusion Models Aided by Speech Activity Information
Latest Thoughts
Differentiable Digital Signal Processing
Nothing written… Yet!
Algebraic DSP: An Introduction
Nothing written… Yet!
Speaking vs. Singing: What differs in our bodies?
Nothing written… Yet!