declare-lab/JAM-0.5

song-generation

direct-preference-optimization

liu-hanghang committed on Jul 28

Commit 786e957 (verified) · 1 Parent(s): 8c6ebd9

Update README.md

Files changed (1)

README.md +17 -1

@@ -1,3 +1,19 @@
+---
+language:
+- en
+metrics:
+- PER
+- WER
+- SongEval
+- Audio Aesthetics
+- MuQ
+- FAD
+pipeline_tag: text-to-audio
+library_name: diffusers
+tags:
+- music
+- art
+---
 # JAM: A Tiny Flow-based Song Generator with Fine-grained Controllability and Aesthetic Alignment
 JAM is a rectified flow-based model for lyrics-to-song generation that addresses the lack of fine-grained word-level controllability in existing lyrics-to-song models. Built on a compact 530M-parameter architecture with 16 LLaMA-style Transformer layers as the Diffusion Transformer (DiT) backbone, JAM enables precise vocal control that musicians desire in their workflows. Unlike previous models, JAM provides word and phoneme-level timing control, allowing musicians to specify the exact placement of each vocal sound for improved rhythmic flexibility and expressive timing.
@@ -267,4 +283,4 @@ For questions, concerns, or collaboration inquiries, please contact the Project
 For issues and questions:
 - Open an issue on GitHub
 - Check the troubleshooting section above
-- Review the configuration options for parameter tuning
+- Review the configuration options for parameter tuning
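
The block this commit adds between the `---` markers is YAML front matter at the top of README.md, which is where model card metadata such as `pipeline_tag` and `library_name` lives. As a rough sketch of how such a block separates from the Markdown body, the snippet below splits the two using only the Python standard library. `split_front_matter` is a hypothetical helper written for this illustration, not part of any Hub library, and it assumes only flat `key: value` pairs and simple `- item` lists, which is all this commit uses. A real YAML parser would be needed for anything more complex.

```python
def split_front_matter(text):
    """Split README text into (metadata dict, markdown body).

    Hypothetical helper for illustration only. It understands flat
    `key: value` pairs and `- item` lists, not general YAML.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text  # no front matter at the top of the file
    try:
        # Index of the closing `---` marker.
        end = next(i for i in range(1, len(lines)) if lines[i].strip() == "---")
    except StopIteration:
        return {}, text  # unterminated block: treat everything as markdown

    meta = {}
    current_list = None  # key whose list items are currently being collected
    for line in lines[1:end]:
        stripped = line.strip()
        if not stripped:
            continue
        if stripped.startswith("- ") and current_list is not None:
            meta[current_list].append(stripped[2:].strip())
        elif ":" in stripped:
            name, _, value = stripped.partition(":")
            name, value = name.strip(), value.strip()
            if value:
                meta[name] = value      # scalar entry
                current_list = None
            else:
                meta[name] = []         # list entry, items follow
                current_list = name

    body = "\n".join(lines[end + 1:])
    return meta, body


# Sample input copied from the lines added in this commit.
README = """---
language:
- en
metrics:
- PER
- WER
- SongEval
- Audio Aesthetics
- MuQ
- FAD
pipeline_tag: text-to-audio
library_name: diffusers
tags:
- music
- art
---
# JAM: A Tiny Flow-based Song Generator with Fine-grained Controllability and Aesthetic Alignment
"""

meta, body = split_front_matter(README)
print(meta["pipeline_tag"], meta["library_name"])
print(meta["metrics"])
```

Returning the body alongside the metadata keeps the Markdown content untouched, so the same helper can be used to check that a metadata-only commit like this one leaves the README text unchanged.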