MidiBERT-Piano

Contributions

  • Compound Word (CP) encoding generally outperforms REMI encoding (see the illustrative sketch below)
  • The BERT-based model outperforms RNN-based models on the following downstream tasks:
    • Melody extraction
    • Velocity prediction
    • Composer identification
    • Emotion classification
  • Pretraining performs much better than training the model from scratch.
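
To make the encoding difference concrete, the minimal Python sketch below contrasts a REMI-style flat token stream with a CP-style compound token for a single note. The attribute names and values are hypothetical placeholders for illustration, not this repo's actual vocabulary.

```python
# Illustrative only: attribute names/values are hypothetical, not this repo's vocabulary.
note = {"bar": 1, "position": 4, "pitch": 60, "duration": 8, "velocity": 64}

# REMI-style: every attribute becomes its own token, so each note expands
# into several sequence positions.
remi_tokens = [
    f"Bar_{note['bar']}",
    f"Position_{note['position']}",
    f"Pitch_{note['pitch']}",
    f"Duration_{note['duration']}",
    f"Velocity_{note['velocity']}",
]

# CP-style: the same attributes are grouped into one compound token (a tuple of
# sub-tokens), so the sequence is shorter and each attribute family can keep
# its own embedding and prediction head.
cp_tokens = [
    (f"Bar_{note['bar']}",
     f"Position_{note['position']}",
     f"Pitch_{note['pitch']}",
     f"Duration_{note['duration']}",
     f"Velocity_{note['velocity']}"),
]

print(len(remi_tokens), "REMI tokens vs", len(cp_tokens), "CP token for one note")
```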

Future work

  • Implement other pretraining methods to further boost performance and robustness. Personally, I think the recent GLM paper is worth trying; a sketch of its span-infilling idea follows below.
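
As a rough illustration of the GLM idea, here is a minimal, hypothetical sketch of GLM-style autoregressive blank infilling on a token sequence. The `MASK`/`SOP` symbols and the span-sampling scheme are assumptions made for this example, not GLM's or this repo's implementation (GLM additionally shuffles span order and uses 2D positional encodings, omitted here).

```python
import random

MASK, SOP = "[MASK]", "[SOP]"  # hypothetical placeholder symbols


def glm_style_corrupt(tokens, span_len=3, n_spans=2, seed=0):
    """Replace a few non-overlapping spans with [MASK]; return (input, targets)."""
    rng = random.Random(seed)
    # sample span starts on a coarse grid so spans never overlap
    grid = list(range(0, len(tokens) - span_len + 1, span_len))
    starts = sorted(rng.sample(grid, n_spans))
    corrupted, targets, prev = [], [], 0
    for s in starts:
        corrupted += list(tokens[prev:s]) + [MASK]
        # each masked span would later be generated autoregressively,
        # prefixed by a span-start token
        targets += [SOP] + list(tokens[s:s + span_len])
        prev = s + span_len
    corrupted += list(tokens[prev:])
    return corrupted, targets


seq = [f"tok{i}" for i in range(12)]
corrupted, targets = glm_style_corrupt(seq)
print(corrupted)
print(targets)
```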