
PianoBART

This description was generated by Grok 3.

Model Details

  • Model Name: PianoBART

  • Model Type: Transformer-based model (BART architecture) for symbolic piano music generation and understanding

  • Version: 1.0

  • Release Date: August 2025

  • Developers: Zijian Zhao, Weichao Zeng, Yutong He, Fupeng He, Yiyi Wang

  • Organization: Sun Yat-sen University (SYSU)

  • License: Apache License 2.0

  • Paper: PianoBART: Symbolic Piano Music Generation and Understanding with Large-Scale Pre-Training, ICME 2024

  • arXiv: https://arxiv.org/abs/2407.03361

  • Citation:

    @INPROCEEDINGS{10688332,
      author={Liang, Xiao and Zhao, Zijian and Zeng, Weichao and He, Yutong and He, Fupeng and Wang, Yiyi and Gao, Chengying},
      booktitle={2024 IEEE International Conference on Multimedia and Expo (ICME)},
      title={PianoBART: Symbolic Piano Music Generation and Understanding with Large-Scale Pre-Training},
      year={2024},
      volume={},
      number={},
      pages={1-6},
      doi={10.1109/ICME57554.2024.10688332}
    }
    
  • Contact: [email protected]

  • Repository: https://github.com/RS2002/PianoBart

Model Description

PianoBART is a transformer-based model built on the Bidirectional and Auto-Regressive Transformers (BART) architecture, designed for symbolic piano music generation and understanding. It leverages large-scale pre-training to perform tasks such as music generation, composer classification, emotion classification, velocity prediction, and melody prediction. The model processes symbolic music data in an octuple format and is inspired by frameworks like MusicBERT and MidiBERT-Piano.

  • Architecture: BART (encoder-decoder transformer)
  • Input Format: Octuple representation of symbolic music (batch_size, sequence_length, 8) for both encoder and decoder
  • Output Format: Hidden states of dimension [batch_size, sequence_length, 1024]
  • Hidden Size: 1024
  • Training Objective: Pre-training with large-scale datasets followed by task-specific fine-tuning
  • Tasks Supported: Music generation, composer classification, emotion classification, velocity prediction, melody prediction

Training Data

The model was pre-trained and fine-tuned on the following datasets:

  • Pre-training: POP1K7, ASAP, POP909, Pianist8, EMOPIA
  • Generation: Maestro, GiantMIDI
  • Composer Classification: ASAP, Pianist8
  • Emotion Classification: EMOPIA
  • Velocity Prediction: GiantMIDI
  • Melody Prediction: POP909

For dataset preprocessing and organization, refer to the MusicBERT and MidiBERT-Piano repositories.

Usage

Installation

git clone https://huggingface.co/RS2002/PianoBART

Please ensure that the model.py and Octuple.pkl files are located in the same folder.
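
If you want to verify the vocabulary file before loading the model, the snippet below is a minimal sketch; it assumes Octuple.pkl is a plain pickled Python object (its exact contents are not documented here).

import pickle

# Inspect the event vocabulary shipped with the checkpoint.
# Assumption: Octuple.pkl is a standard pickle file; its internal structure is not documented here.
with open("Octuple.pkl", "rb") as f:
    octuple_vocab = pickle.load(f)

print(type(octuple_vocab))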

Example Code

import torch
from model import PianoBART

# Load the model
model = PianoBART.from_pretrained("RS2002/PianoBART")

# Example input: dummy octuple tokens of shape (batch_size=2, sequence_length=1024, 8)
input_ids_encoder = torch.randint(1, 10, (2, 1024, 8))
input_ids_decoder = torch.randint(1, 10, (2, 1024, 8))
# Dummy attention masks of shape (batch_size, sequence_length)
encoder_attention_mask = torch.zeros((2, 1024))
decoder_attention_mask = torch.zeros((2, 1024))

# Forward pass
output = model(input_ids_encoder, input_ids_decoder, encoder_attention_mask, decoder_attention_mask)
print(output.last_hidden_state.size())  # Output: [2, 1024, 1024]
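
For the classification-style tasks listed above (e.g., composer or emotion classification), one plausible pattern is to pool the hidden states and attach a small linear head. The sketch below is illustrative only and is not the authors' fine-tuning code; the mean-pooling strategy and the 8-way output (matching, e.g., the Pianist8 composer task) are assumptions.

import torch.nn as nn

# Hypothetical classification head on top of PianoBART hidden states (not the official fine-tuning code).
class PianoBARTClassifier(nn.Module):
    def __init__(self, backbone, hidden_size=1024, num_classes=8):
        super().__init__()
        self.backbone = backbone
        self.head = nn.Linear(hidden_size, num_classes)

    def forward(self, enc_ids, dec_ids, enc_mask, dec_mask):
        hidden = self.backbone(enc_ids, dec_ids, enc_mask, dec_mask).last_hidden_state
        pooled = hidden.mean(dim=1)  # mean-pool over the sequence dimension (assumption)
        return self.head(pooled)     # logits of shape [batch_size, num_classes]

classifier = PianoBARTClassifier(model)
logits = classifier(input_ids_encoder, input_ids_decoder,
                    encoder_attention_mask, decoder_attention_mask)
print(logits.size())  # [2, 8]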