Monday, January 30, 2023

Open source implementation of Google's MusicLM in PyTorch

MusicLM - Pytorch (wip)

Implementation of MusicLM, Google's new SOTA attention-based model for music generation, in PyTorch.

The authors are essentially using a text-conditioned AudioLM but, surprisingly, the conditioning embeddings come from MuLan, a model trained with a text-audio contrastive objective. MuLan is what will be built out in this repository; AudioLM, which lives in a separate repository, will be modified to support the music generation needs here.
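To make "text-audio contrastive" concrete, here is a minimal sketch of a generic symmetric InfoNCE-style loss over paired audio and text embeddings. It is not taken from this repository or from the MuLan paper: the function names and the temperature value are assumptions for illustration, and it is written in plain Python so it runs without dependencies. A real implementation would express the same steps as PyTorch tensor operations.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def cross_entropy_diagonal(logits):
    """Mean cross-entropy where the correct class for row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        m = max(row)  # subtract the max for numerical stability
        log_sum_exp = m + math.log(sum(math.exp(v - m) for v in row))
        total += log_sum_exp - row[i]
    return total / len(logits)


def contrastive_loss(audio_embeds, text_embeds, temperature=0.1):
    """Symmetric contrastive loss; temperature=0.1 is an assumed value.

    audio_embeds[i] and text_embeds[i] are treated as a matching pair;
    every other combination in the batch is treated as a negative.
    """
    audio = [l2_normalize(a) for a in audio_embeds]
    text = [l2_normalize(t) for t in text_embeds]
    # logits[i][j] = cosine similarity of audio i and text j, scaled by temperature
    logits = [
        [sum(x * y for x, y in zip(a, t)) / temperature for t in text]
        for a in audio
    ]
    logits_transposed = [list(col) for col in zip(*logits)]
    # Average the audio-to-text and text-to-audio directions
    return 0.5 * (
        cross_entropy_diagonal(logits) + cross_entropy_diagonal(logits_transposed)
    )


# Toy batch: two clips and two captions in a 2-dimensional embedding space
aligned = contrastive_loss([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
shuffled = contrastive_loss([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
```

The loss is low when each audio embedding sits closest to its own caption (`aligned`) and high when the pairs are mismatched (`shuffled`). Training on that signal is what produces a shared embedding space that a generator can be conditioned on.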

Appreciation

  • Stability.ai for the generous sponsorship to work on and open source cutting-edge artificial intelligence research

Citations

@inproceedings{Agostinelli2023MusicLMGM,
  title     = {MusicLM: Generating Music From Text},
  author    = {Andrea Agostinelli and Timo I. Denk and Zal{\'a}n Borsos and Jesse Engel and Mauro Verzetti and Antoine Caillon and Qingqing Huang and Aren Jansen and Adam Roberts and Marco Tagliasacchi and Matthew Sharifi and Neil Zeghidour and C. Frank},
  year      = {2023}
}

@article{Huang2022MuLanAJ,
  title     = {MuLan: A Joint Embedding of Music Audio and Natural Language},
  author    = {Qingqing Huang and Aren Jansen and Joonseok Lee and Ravi Ganti and Judith Yue Li and Daniel P. W. Ellis},
  journal   = {ArXiv},
  year      = {2022},
  volume    = {abs/2208.12415}
}


from Hacker News https://ift.tt/67koM2V
