PitchSTAR
This is the accompanying page for the paper “PitchSTAR: Pitch Style Transfer with Auto-Regularized Flow Matching for Singing Voice”, currently under review. PitchSTAR is a self-supervised framework for arbitrary pitch style transfer based on flow matching. It operates on note-relative pitch modulation, which allows it to disentangle note tone from pitch techniques. PitchSTAR also uses an auto-regularization strategy that exploits the noisy inputs inherent to flow matching training. This makes it possible to condition on the full reference through a blurred cross-attention, forcing the model to capture both global and local stylistic characteristics while avoiding trivial reference copying.
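To make the idea of note-relative pitch modulation concrete, here is a minimal sketch and not the paper's implementation. It assumes the modulation is the per-frame deviation of the sung F0 from the score note, measured in semitones. The function names (`hz_to_midi`, `midi_to_hz`, `note_relative_modulation`, `apply_modulation`) and the convention of marking unvoiced frames as `None` are hypothetical choices made for illustration.

```python
import math

def hz_to_midi(f0_hz):
    """Convert a frequency in Hz to a fractional MIDI note number (A4 = 440 Hz = 69)."""
    return 69.0 + 12.0 * math.log2(f0_hz / 440.0)

def midi_to_hz(midi):
    """Convert a (fractional) MIDI note number back to Hz."""
    return 440.0 * 2.0 ** ((midi - 69.0) / 12.0)

def note_relative_modulation(f0_hz, note_midi):
    """Per-frame deviation of the sung F0 from the score note, in semitones.

    Assumption: unvoiced frames are given as f0 <= 0 and are returned as None.
    """
    out = []
    for f0, note in zip(f0_hz, note_midi):
        out.append(None if f0 <= 0 else hz_to_midi(f0) - note)
    return out

def apply_modulation(note_midi, modulation):
    """Re-apply a modulation curve to (possibly different) notes, returning F0 in Hz."""
    out = []
    for note, dev in zip(note_midi, modulation):
        out.append(0.0 if dev is None else midi_to_hz(note + dev))
    return out

# Toy example: a vibrato-like wobble around A4 (MIDI 69), last frame unvoiced.
f0 = [440.0, 445.0, 440.0, 435.0, 0.0]
notes = [69, 69, 69, 69, 69]
mod = note_relative_modulation(f0, notes)

# The same modulation applied one octave lower (A3, MIDI 57).
f0_low = apply_modulation([57] * 5, mod)
```

Under this assumed representation, the curve `mod` no longer depends on which note is sung, so the same wobble can be applied to a different melody. That is the sense in which note tone and pitch technique are separated in this sketch.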
Effect of CFG
Sound Samples
| | Condition A | Condition B | Condition C | Spectrogram |
|---|---|---|---|---|
| Sample 1 | | | | |
| Sample 2 | | | | |
| Sample 3 | | | | |
| Sample 4 | | | | |
