sander-wood committed on
Commit
f20d896
·
1 Parent(s): 0c98af9

Update README.md

  1. README.md +200 -0
README.md CHANGED
@@ -1,3 +1,203 @@
1
  ---
2
  license: mit
3
  ---
1
  ---
2
  license: mit
3
+ language: en
4
+ widget:
5
+ - text: "This is a traditional Irish dance music."
6
+ inference:
7
+   parameters:
8
+     top_p: 0.9
9
+     max_length: 1024
10
+     do_sample: True
11
  ---
12
+ # text-to-music
13
+
14
+ ## Model description
15
+
16
+ This language-music model is [BART-base](https://huggingface.co/facebook/bart-base) fine-tuned on 282,870 English text-music pairs, where all scores are represented in ABC notation. It was introduced in the paper [Exploring the Efficacy of Pre-trained Checkpoints in Text-to-Music Generation Task](https://arxiv.org/abs/2211.11216) by Wu et al. and released in [this repository](https://github.com/sander-wood/text-to-music).
17
+
18
+ It can generate complete and semantically consistent sheet music directly from natural-language descriptions. To the best of our knowledge, this is the first model to achieve text-conditional symbolic music generation that is trained on real text-music pairs, with the music generated entirely by the model, without any hand-crafted rules.
19
+
20
+ ## Intended uses & limitations
21
+
22
+ You can use this model for text-conditional music generation. All scores generated by this model can be written on one stave (for vocal solo or instrumental solo) in standard classical notation, and are in a variety of styles, e.g., blues, classical, folk, jazz, pop, and world music. We recommend using the script in [this repository](https://github.com/sander-wood/text-to-music) for inference. The generated tunes are in ABC notation, and can be converted to MIDI or audio on [this website](https://www.mandolintab.net/abcconverter.php), or using [this software](https://sourceforge.net/projects/easyabc/).
23
+
24
+ The model's creativity is limited: it does not perform well on tasks that require a high degree of creativity (e.g., melody style transfer), and it is sensitive to its input. For more information, please check [our paper](https://arxiv.org/abs/2211.11216).
25
+
26
+ ### How to use
27
+
28
+ Here is how to use this model in PyTorch:
29
+
+ ```python
+ import torch
+ from samplings import top_p_sampling, temperature_sampling
+ from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
+
+ tokenizer = AutoTokenizer.from_pretrained('sander-wood/text-to-music')
+ model = AutoModelForSeq2SeqLM.from_pretrained('sander-wood/text-to-music')
+
+ max_length = 1024
+ top_p = 0.9
+ temperature = 1.0
+
+ text = "This is a traditional Irish dance music."
+ input_ids = tokenizer(text,
+                       return_tensors='pt',
+                       truncation=True,
+                       max_length=max_length)['input_ids']
+
+ decoder_start_token_id = model.config.decoder_start_token_id
+ eos_token_id = model.config.eos_token_id
+
+ # Decoding starts from the decoder start token.
+ decoder_input_ids = torch.tensor([[decoder_start_token_id]])
+
+ for t_idx in range(max_length):
+     # Re-run the model on everything generated so far (no gradients needed).
+     with torch.no_grad():
+         outputs = model(input_ids=input_ids,
+                         decoder_input_ids=decoder_input_ids)
+     # Next-token distribution at the last decoder position.
+     logits = outputs.logits[0][-1]
+     probs = torch.softmax(logits, dim=-1).numpy()
+     # Top-p (nucleus) filtering first, then temperature sampling.
+     sampled_id = temperature_sampling(probs=top_p_sampling(probs,
+                                                            top_p=top_p,
+                                                            return_probs=True),
+                                       temperature=temperature)
+     decoder_input_ids = torch.cat((decoder_input_ids, torch.tensor([[sampled_id]])), 1)
+     if sampled_id == eos_token_id:
+         # End of tune: prepend the ABC reference-number field and print.
+         tune = "X:1\n"
+         tune += tokenizer.decode(decoder_input_ids[0], skip_special_tokens=True)
+         print(tune)
+         break
+ ```
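The loop above relies on `top_p_sampling` and `temperature_sampling` from the `samplings` package. As a rough illustration of what those two steps do to a next-token distribution, here is a pure-Python sketch. The helper names `top_p_filter`, `apply_temperature`, and `sample` are hypothetical, and this is not the package's actual implementation; it only shows the general technique (nucleus filtering, then temperature rescaling) on a toy distribution.

```python
import random

def top_p_filter(probs, top_p=0.9):
    """Keep the smallest set of most-probable tokens whose total mass
    reaches top_p, zero out the rest, and renormalize."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = [], 0.0
    for i in order:
        kept.append(i)
        mass += probs[i]
        if mass >= top_p:
            break
    filtered = [0.0] * len(probs)
    for i in kept:
        filtered[i] = probs[i] / mass
    return filtered

def apply_temperature(probs, temperature=1.0):
    """Rescale probabilities as p ** (1 / T) and renormalize, which is
    equivalent to dividing the logits by T before the softmax."""
    scaled = [p ** (1.0 / temperature) if p > 0 else 0.0 for p in probs]
    total = sum(scaled)
    return [s / total for s in scaled]

def sample(probs, rng=random):
    """Draw one token index according to the given probabilities."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]

# Toy distribution over four tokens (illustrative values, not model output).
probs = [0.5, 0.3, 0.15, 0.05]
nucleus = top_p_filter(probs, top_p=0.9)   # drops the least likely token
sharp = apply_temperature(nucleus, 0.5)    # T < 1 sharpens the distribution
token = sample(sharp, random.Random(0))
print(nucleus, sharp, token)
```

With `temperature = 1.0`, as in the snippet above, the temperature step leaves the filtered distribution unchanged, so only the top-p cutoff shapes the sampling.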
72
+
73
+ ### Generation Examples
74
+ Here are some examples generated by this model without cherry-picking.
75
+ ```
76
+ ######################## INPUT TEXT ########################
77
+
78
+ This is a traditional Irish dance music.
79
+ Note Length-1/8
80
+ Meter-6/8
81
+ Key-D
82
+
83
+ ####################### OUTPUT TUNES #######################
84
+
85
+ X:1
86
+ L:1/8
87
+ M:6/8
88
+ K:D
89
+ A | BEE BEE | Bdf edB | BAF FEF | DFA BAF | BEE BEE | Bdf edB | BAF DAF | FED E2 :: A |
90
+ Bef gfe | faf edB | BAF FEF | DFA BAF | Bef gfe | faf edB | BAF DAF | FED E2 :|
91
+
92
+ X:2
93
+ L:1/8
94
+ M:6/8
95
+ K:D
96
+ A |: DED F2 A | d2 f ecA | G2 B F2 A | E2 F GFE | DED F2 A | d2 f ecA | Bgf edc |1 d3 d2 A :|2
97
+ d3 d2 a || a2 f d2 e | f2 g agf | g2 e c2 d | e2 f gfe | fed gfe | agf bag | fed cde | d3 d2 a |
98
+ agf fed | Adf agf | gfe ecA | Ace gfe | fed gfe | agf bag | fed cde | d3 d2 ||
99
+
100
+ X:3
101
+ L:1/8
102
+ M:6/8
103
+ K:D
104
+ BEE BEE | Bdf edB | BAF FEF | DFA dBA | BEE BEE | Bdf edB | BAF FEF |1 DED DFA :|2 DED D2 e |:
105
+ faf edB | BAF DFA | BAF FEF | DFA dBA | faf edB | BAF DFA | BdB AFA |1 DED D2 e :|2 DED DFA ||
106
+ ```
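In the example above, the control lines in the input text (`Note Length-1/8`, `Meter-6/8`, `Key-D`) show up as the `L:`, `M:`, and `K:` header fields of every generated tune. A small sketch of how one might check that correspondence on a generated tune follows. The helper names and the mapping from control-line names to ABC fields are assumptions inferred from this single example, not something the model card or the paper specifies.

```python
# Assumed mapping from control-line names in the prompt to ABC header fields.
FIELD_NAMES = {"Note Length": "L", "Meter": "M", "Key": "K"}

def requested_fields(text):
    """Extract 'Name-value' control lines from the input text (hypothetical helper)."""
    requested = {}
    for line in text.splitlines():
        name, sep, value = line.partition("-")
        if sep and name.strip() in FIELD_NAMES:
            requested[FIELD_NAMES[name.strip()]] = value.strip()
    return requested

def parse_abc_headers(tune):
    """Collect the leading 'X:value' style header fields of an ABC tune.
    In ABC notation the K: field closes the header, so parsing stops there."""
    headers = {}
    for line in tune.splitlines():
        line = line.strip()
        if len(line) >= 2 and line[0].isalpha() and line[1] == ":":
            headers[line[0]] = line[2:].strip()
            if line[0] == "K":
                break
        elif line:
            break  # first non-header line: the tune body has started
    return headers

text = ("This is a traditional Irish dance music.\n"
        "Note Length-1/8\nMeter-6/8\nKey-D")
tune = "X:1\nL:1/8\nM:6/8\nK:D\nA | BEE BEE | Bdf edB | BAF FEF | DFA BAF |"

requested = requested_fields(text)
headers = parse_abc_headers(tune)
# Requested fields whose value differs from the generated header.
mismatches = {k: (v, headers.get(k)) for k, v in requested.items()
              if headers.get(k) != v}
print(headers, mismatches)
```

An empty `mismatches` dictionary means the generated header agrees with every control line found in the prompt.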
107
+
108
+ ```
109
+ ######################## INPUT TEXT ########################
110
+
111
+ This is a jazz-swing lead sheet with chord and vocal.
112
+
113
+ ####################### OUTPUT TUNES #######################
114
+
115
+ X:1
116
+ L:1/8
117
+ M:4/4
118
+ K:F
119
+ "F" CFG |"F" A6 z G |"Fm7" A3 G"Bb7" A3 G |"F" A6 z G |"F7" A4"Eb7" G4 |"F" F6 z F |
120
+ "Dm" A3 G"Dm/C" A3 G |"Bb" A2"Gm" B2"C7" G3 G |"F" F8- |"Dm7""G7" F6 z2 |"C" C4 C3 C |
121
+ "C7" C2 B,2"F" C4 |"F" C4 C3 C |"Dm" D2 C2"Dm/C" D4 |"Bb" D4 D3 D |"Bb" D2 C2"C7" D4 |"F" C8- |
122
+ "F" C4"Gm" z C"C7" FG |"F" A6 z G |"Fm7" A3 G"Bb7" A3 G |"F" A6 z G |"F7" A4"Eb7" G4 |"F" F6 z F |
123
+ "Dm" A3 G"Dm/C" A3 G |"Bb" A2"Gm" B2"C7" G3 G |"F" F8- |"F" F6 z2 |]
124
+
125
+ X:2
126
+ L:1/4
127
+ M:4/4
128
+ K:F
129
+ "^A""F" A3 A |"Am7" A2"D7" A2 |"Gm7" G2"C7" G A |"F" F4 |"F" A3 A |"Am7" A2"D7" A2 |"Gm7" G2"C7" G A |
130
+ "F" F4 |"Gm" B3 B |"Am7" B2"D7" B2 |"Gm" B2"D7" B A |"Gm7" G4 |"F" A3 A |"Am7" A2"D7" A2 |
131
+ "Gm7" G2"C7" G A |"F" F4 |"Bb7" F3 G |"F" A2 A2 |"Gm" B2"C7" B2 |"F" c2"D7" c c |"Gm7" c2"C7" B2 |
132
+ "F" A2"F7" A2 |"Bb" B2"F" B A |"Bb" B2"F" B A |"Gm" B2"F" B A |"Gm7" B2"F" B A |"Gm7" B2"F" B A |
133
+ "C7" B2 c2 |"F""Bb7" A4 |"F""Bb7" z4 |]
134
+
135
+ X:3
136
+ L:1/4
137
+ M:4/4
138
+ K:Bb
139
+ B, ||"Gm""^A1" G,2 B, D |"D7" ^F A2 G/=F/ |"Gm" G2"Cm7" B c |"F7" A2 G =F |"Bb" D2 F A |
140
+ "Cm7" c e2 d/c/ |"Gm7" B3/2 G/-"C7" G2- |"F7" G2 z B, |"Gm""^B" G,2 B, D |"D7" ^F A2 G/=F/ |
141
+ "Gm" G2"Cm7" B c |"F7" A2 G =F |"Bb" D2 F A |"Cm7" c e2 d/c/ |"Gm7" B3/2 G/-"C7" G2- |"F7" G2 z2 ||
142
+ "^C""F7""^A2" F4- | F E D C |"Bb" D2 F B | d3 c/B/ |"F" A2"Cm7" G2 |"D7" ^F2 G2 |"Gm" B3"C7" A |
143
+ "F7" G4 ||"F7""^A3" F4- | F E D C |"Bb" D2 F B | d3 c/B/ |"F" A2"Cm7" G2 |"D7" ^F2 G2 |"Gm" B3 A |
144
+ "C7" G4 ||"^B""Gm""^C" B2 c B |"Cm" c B c B |"Gm7" c2 B A |"C7" B3 A |"Bb" B2 c B |"G7" d c B A |
145
+ "Cm" G2 A G |"F7" F2 z G ||"^C""F7" F F3 |"Bb" D D3 |"Cm" E E3 |"D7" ^F F3 |"Gm" G2 A B |"C7" d3 d |
146
+ "Gm" d3 d |"D7" d3 B, ||"^D""Gm" G,2 B, D |"D7" ^F A2 G/=F/ |"Gm" G2"Cm7" B c |"F7" A2 G =F |
147
+ "Bb" D2 F A |"Cm7" c e2 d/c/ |"Gm7" B3/2 G/-"C7" G2- |"F7" G2 z2 |]
148
+ ```
149
+
150
+ ```
151
+ ######################## INPUT TEXT ########################
152
+
153
+ This is a Chinese folk song from the Jiangnan region. It was created during the Qianlong era (1735-1796) of the Qing dynasty. Over time, many regional variations were created, and the song gained popularity both in China and abroad. One version of the song describes a custom of giving jasmine flowers, popular in the southern Yangtze delta region of China.
154
+
155
+ ####################### OUTPUT TUNES #######################
156
+
157
+ X:1
158
+ L:1/8
159
+ Q:1/4=100
160
+ M:2/4
161
+ K:C
162
+ "^Slow" DA A2 | GA c2- | c2 G2 | c2 GF | GA/G/ F2 | E2 DC | DA A2 | GA c2- | c2 GA | cd- d2 |
163
+ cA c2- | c2 GA | cd- d2 | cA c2- | c2 GA | c2 A2 | c2 d2 | cA c2- | c2 c2 | A2 G2 | F2 AG | F2 ED |
164
+ CA,/C/ D2- | D2 CD | F2 A2 | G2 ED | CG A2 | G2 FD | CA,/C/ D2- | D2 CD | F2 A2 | G2 ED |
165
+ CG A2 | G2 FD | CA,/C/ D2- | D2 z2 :|
166
+
167
+ X:2
168
+ L:1/8
169
+ Q:1/4=100
170
+ M:2/4
171
+ K:C
172
+ "^ MDolce" Ac de | d2 AG | cA cd | A2 AG | E2 ED | CD E2- | E2 z2 | EG ed | c2 AG | cA cd |
173
+ A2 AG | E2 ED | CD E2- | E2 z2 |"^ howeveroda" Ac de | d2 AG | cA cd | A2 AG | E2 ED | CD E2- |
174
+ E2 z2 | A2 cA | GA E2- | E2 z2 | GA cd | e2 ed | cd e2- | e2 z2 | ge d2 | cd c2- | c2 z2 |
175
+ Ac de | d2 AG | cA cd | A2 AG | E2 ED | CD E2- | E2 z2 | EG ed | c2 AG | cA cd | A2 AG | E2 ED |
176
+ CD E2- | E2 z2 |"^DDtisata" Ac de | d2 AG | cA cd | A2 AG | E2 ED | CD E2- | E2 z2 | A2 cA |
177
+ GA E2- | E2 z2 | GA cd | e2 ed | cd e2- | e2 z2 | ge d2 | cd c2- | c2 z2 | Ac de | d2 AG |
178
+ cA cd | A2 AG | E2 ED | CD E2- | E2 z2 | Ac de | d2 AG | cA cd | A2 AG | E2 ED | CD E2- | E2 z2 |
179
+ Ac de | d2 AG | cA cd | A2 AG | E2 ED | CD E2- | E2 z2 |"^ Easy" Ac de | d2 AG | cA cd |
180
+ A2 AG | E2 ED | CD E2- | E2 z2 | Ac de | d2 AG | cA cd | A2 AG | E2 ED | CD E2- | E2 z2 |]
181
+
182
+ X:3
183
+ L:1/8
184
+ Q:1/4=60
185
+ M:4/4
186
+ K:C
187
+ "^S books defe.." AA A2 cdcc | AcAG A4- | A8 | A,4 CD C2 | A,4 cdcA | A2 GA- A4- | A2 GA A2 AA |
188
+ AG E2 D2 C2 | D6 ED | C2 D4 C2 | D2 C2 D4 | C2 A,2 CD C2 | A,4 cdcA | A2 GA- A4- | A2 GA A2 AA |
189
+ AG E2 D2 C2 | D6 z2 |]
190
+ ```
191
+
192
+ ### BibTeX entry and citation info
193
+
194
+ ```bibtex
195
+ @misc{wu2022exploring,
196
+ title={Exploring the Efficacy of Pre-trained Checkpoints in Text-to-Music Generation Task},
197
+ author={Shangda Wu and Maosong Sun},
198
+ year={2022},
199
+ eprint={2211.11216},
200
+ archivePrefix={arXiv},
201
+ primaryClass={cs.SD}
202
+ }
203
+ ```