Add pipeline tag, library name and clarify license
This PR adds the missing `pipeline_tag` and `library_name` metadata, making the model easier to discover on the Hugging Face Hub. It also clarifies the license, specifying that the code is MIT-licensed while the checkpoints are for non-commercial use only.
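For quick reference, these fields live in the YAML front matter at the top of README.md, which the Hub parses for model-card metadata. The block below simply mirrors the values added in the diff:

```yaml
---
pipeline_tag: audio-text-to-text
library_name: transformers
license: mit
---
```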
README.md
CHANGED
@@ -1,10 +1,16 @@
+---
+pipeline_tag: audio-text-to-text
+library_name: transformers
+license: mit
+---
+
 # PyTorch Implementation of Audio Flamingo 2

 **Sreyan Ghosh, Zhifeng Kong, Sonal Kumar, S Sakshi, Jaehyeon Kim, Wei Ping, Rafael Valle, Dinesh Manocha, Bryan Catanzaro**

 [[paper]](https://arxiv.org/abs/2503.03983) [[Demo website]](https://research.nvidia.com/labs/adlr/AF2/) [[GitHub]](https://github.com/NVIDIA/audio-flamingo)

-This repo contains the PyTorch implementation of [Audio Flamingo 2: An Audio-Language Model with Long-Audio Understanding and Expert Reasoning Abilities](). Audio Flamingo 2 achieves
+This repo contains the PyTorch implementation of [Audio Flamingo 2: An Audio-Language Model with Long-Audio Understanding and Expert Reasoning Abilities](https://arxiv.org/abs/2503.03983). Audio Flamingo 2 achieves state-of-the-art performance across over 20 benchmarks, using only a 3B parameter small language model. It is improved from our previous [Audio Flamingo](https://arxiv.org/abs/2402.01831).

 - We introduce two datasets, AudioSkills for expert audio reasoning, and LongAudio for long audio understanding, to advance the field of audio understanding.

@@ -34,7 +40,7 @@ Audio Flamingo 2 uses a cross-attention architecture similar to [Audio Flamingo]

 ## License

-
+The code in this repo is under MIT license. The checkpoints are for non-commercial use only (see NVIDIA OneWay Noncommercial License). They are also subject to the [Qwen Research license](https://huggingface.co/Qwen/Qwen2.5-3B/blob/main/LICENSE), the [Terms of Use](https://openai.com/policies/terms-of-use) of the data generated by OpenAI, and the original licenses accompanying each training dataset.
 - Notice: Audio Flamingo 2 is built with Qwen-2.5. Qwen is licensed under the Qwen RESEARCH LICENSE AGREEMENT, Copyright (c) Alibaba Cloud. All Rights Reserved.
