I make machines that sing, speak, play, and hear.
I am a PhD candidate working at the intersection of audio and machine learning at the Institut Polytechnique de Paris, under the supervision of Prof. Gaël Richard. I have over five years of experience in the field and a background in signal processing. I am also a hobbyist music producer and guitarist.
Interests
I develop audio-based machine systems that are transparent and human-empowering, working on topics such as:
- Singing Voice & Speech
- (Differentiable) Digital Signal Processing
- Deep/Machine Learning
Selected Works
- Interspeech’s SER Challenge (2025): Improving Speech Emotion Recognition Through Cross Modal Attention Alignment and Balanced Stacking Model
- Master’s Thesis (2024): Cross-Speaker Style Transfer for TTS with Singing Voice Conversion Data Augmentation, Style Filtering, and F0 Matching
- Interspeech’s SynData4GenAI (2024): Exploring synthetic data for cross-speaker style transfer in style representation based TTS
- GENEA (2023): Gesture Generation with Diffusion Models Aided by Speech Activity Information
Latest Thoughts
Differentiable Digital Signal Processing
Nothing written… Yet!
Algebraic DSP: An Introduction
Nothing written… Yet!
Speaking X Singing: What differs in our body?
Nothing written… Yet!