phtran committed on
Commit
23438b4
1 Parent(s): a6c649f

Push model using huggingface_hub.

29fcf21b9e6e4fb5b1ba0bb1efd4197e_vocab.txt ADDED
@@ -0,0 +1,1023 @@
1
+ ##s
2
+ the
3
+ a
4
+ ##t
5
+ to
6
+ and
7
+ i
8
+ of
9
+ ##'
10
+ ##ed
11
+ in
12
+ ##d
13
+ ##ing
14
+ ##n
15
+ ##e
16
+ it
17
+ that
18
+ you
19
+ ##y
20
+ ##er
21
+ ##r
22
+ for
23
+ ##m
24
+ is
25
+ he
26
+ ##re
27
+ was
28
+ be
29
+ ##p
30
+ ##ly
31
+ so
32
+ we
33
+ ##a
34
+ ##g
35
+ ##o
36
+ c
37
+ ##b
38
+ ##u
39
+ on
40
+ have
41
+ but
42
+ ##ll
43
+ with
44
+ re
45
+ ##or
46
+ s
47
+ ##al
48
+ do
49
+ know
50
+ ##ar
51
+ they
52
+ not
53
+ as
54
+ this
55
+ ##in
56
+ ##le
57
+ e
58
+ are
59
+ like
60
+ ##c
61
+ uh
62
+ ##ri
63
+ me
64
+ his
65
+ at
66
+ ##l
67
+ ##es
68
+ de
69
+ yeah
70
+ can
71
+ ##k
72
+ or
73
+ my
74
+ all
75
+ had
76
+ there
77
+ will
78
+ one
79
+ ##il
80
+ no
81
+ what
82
+ ##en
83
+ ##ck
84
+ b
85
+ f
86
+ ##ce
87
+ ##ch
88
+ ##i
89
+ by
90
+ she
91
+ from
92
+ an
93
+ ##ic
94
+ ##ur
95
+ ##ve
96
+ ##w
97
+ ##ter
98
+ ##la
99
+ if
100
+ just
101
+ ##th
102
+ ##li
103
+
104
+ her
105
+ um
106
+ ##on
107
+ ##ation
108
+ w
109
+ would
110
+ ##f
111
+ ##te
112
+ st
113
+ go
114
+ ##ir
115
+ ##it
116
+ out
117
+ ##ro
118
+ pa
119
+ were
120
+ g
121
+ t
122
+ ##ion
123
+ think
124
+ ##an
125
+ right
126
+ about
127
+ ##se
128
+ ##lo
129
+ ##ent
130
+ up
131
+ ##ment
132
+ ##ate
133
+ when
134
+ ##h
135
+ ##ne
136
+ don
137
+ has
138
+ also
139
+ more
140
+ see
141
+ okay
142
+ their
143
+ your
144
+ ##ge
145
+ who
146
+ well
147
+ co
148
+ which
149
+ some
150
+ se
151
+ time
152
+ ba
153
+ said
154
+ con
155
+ ##ers
156
+ ra
157
+ ##us
158
+ ##de
159
+ ##ra
160
+ him
161
+ our
162
+ been
163
+ fa
164
+ po
165
+ pro
166
+ ##et
167
+ ##x
168
+ la
169
+ ##id
170
+ ##ver
171
+ oh
172
+ ma
173
+ ##v
174
+ now
175
+ ##age
176
+ two
177
+ ##ld
178
+ mo
179
+ how
180
+ ##tion
181
+ people
182
+ ##ive
183
+ other
184
+ ##ng
185
+ ##ity
186
+ ##z
187
+ ##ist
188
+ very
189
+ get
190
+ any
191
+ un
192
+ ro
193
+ ##is
194
+ work
195
+ mean
196
+ them
197
+ lo
198
+ ##vi
199
+ because
200
+ ##ies
201
+ ##ul
202
+ ##as
203
+ ##ad
204
+ ##mp
205
+ bo
206
+ ##-
207
+ then
208
+ good
209
+ ##el
210
+ ##nd
211
+ li
212
+ man
213
+ dis
214
+ could
215
+ ho
216
+ ##at
217
+ ##ol
218
+ bu
219
+ te
220
+ ha
221
+ ##est
222
+ ##me
223
+ say
224
+ ##ru
225
+ ##ke
226
+ sp
227
+ k
228
+ ##able
229
+ su
230
+ sa
231
+ di
232
+ fi
233
+ ##ance
234
+ really
235
+ over
236
+ even
237
+ ##ry
238
+ us
239
+ ca
240
+ ##ow
241
+ ##ho
242
+ into
243
+ ##ence
244
+ ##mo
245
+ mi
246
+ ##one
247
+ ##qu
248
+ ##ut
249
+ ##lu
250
+ o
251
+ ##ty
252
+ after
253
+ want
254
+ new
255
+ take
256
+ p
257
+ look
258
+ pre
259
+ ##sh
260
+ day
261
+ should
262
+ th
263
+ need
264
+ cha
265
+ ##co
266
+ much
267
+ where
268
+ d
269
+ ##ant
270
+ fe
271
+ da
272
+ make
273
+ ##om
274
+ did
275
+ le
276
+ ##un
277
+ only
278
+ ##im
279
+ these
280
+ ##ff
281
+ ##ti
282
+ ##ish
283
+ ex
284
+ ##ted
285
+ first
286
+ ##he
287
+ ##ig
288
+ vi
289
+ ri
290
+ en
291
+ com
292
+ ##ated
293
+ than
294
+ ##ma
295
+ way
296
+ ##um
297
+ ##ct
298
+ ##end
299
+ ##ight
300
+ here
301
+ ta
302
+ car
303
+ part
304
+ come
305
+ ##ia
306
+ off
307
+ sc
308
+ ah
309
+ ##am
310
+ tra
311
+ yes
312
+ back
313
+ ##ture
314
+ ##ful
315
+ pri
316
+ ##ction
317
+ ##ine
318
+ three
319
+ ##ard
320
+ let
321
+ ##pe
322
+ little
323
+ down
324
+ ##mb
325
+ si
326
+ dr
327
+ mr
328
+ going
329
+ comp
330
+ ##po
331
+ m
332
+ sta
333
+ gra
334
+ ##day
335
+ many
336
+ ##ian
337
+ ##ta
338
+ long
339
+ pi
340
+ too
341
+ app
342
+ kind
343
+ ##ous
344
+ ##ci
345
+ ga
346
+ ##ten
347
+ ##nt
348
+ before
349
+ may
350
+ got
351
+ ##man
352
+ ##tic
353
+ ##ition
354
+ ##cu
355
+ ##ugh
356
+ ##tra
357
+ n
358
+ ##ward
359
+ give
360
+ every
361
+ hi
362
+ ##ting
363
+ exp
364
+ those
365
+ hu
366
+ ##ot
367
+ something
368
+ lot
369
+ still
370
+ ne
371
+ ##na
372
+ ##ise
373
+ ##pp
374
+ most
375
+ gu
376
+ state
377
+ actually
378
+ such
379
+ bi
380
+ never
381
+ ##tain
382
+ great
383
+ through
384
+ al
385
+ ##no
386
+ mar
387
+ year
388
+ ##ach
389
+ ##les
390
+ school
391
+ ##ally
392
+ ##ial
393
+ ##ha
394
+ old
395
+ made
396
+ ##ary
397
+ ar
398
+ years
399
+ help
400
+ per
401
+ ##ving
402
+ ##ical
403
+ ##ther
404
+ does
405
+ ##ac
406
+ ##ca
407
+ must
408
+ ##di
409
+ own
410
+ ru
411
+ things
412
+ hand
413
+ thing
414
+ high
415
+ last
416
+ ##go
417
+ sh
418
+ under
419
+ four
420
+ place
421
+ ##ations
422
+ sure
423
+ ##mi
424
+ ##nce
425
+ am
426
+ ##for
427
+ ##ness
428
+ name
429
+ five
430
+ ##ound
431
+ op
432
+ cons
433
+ ph
434
+ same
435
+ ##row
436
+ ##ven
437
+ ##ph
438
+ ##ite
439
+ pe
440
+ ##j
441
+ sha
442
+ friend
443
+ wi
444
+ call
445
+ european
446
+ h
447
+ ##ect
448
+ ##ress
449
+ live
450
+ ##port
451
+ mhm
452
+ house
453
+ ##ie
454
+ ##ni
455
+ plan
456
+ jo
457
+ play
458
+ ##side
459
+ va
460
+ ##min
461
+ ##ious
462
+ life
463
+ du
464
+ ti
465
+ six
466
+ men
467
+ again
468
+ thank
469
+ talk
470
+ ##par
471
+ home
472
+ ##op
473
+ both
474
+ why
475
+ put
476
+ another
477
+ ##nc
478
+ being
479
+ ##mit
480
+ came
481
+ ##led
482
+ fo
483
+ end
484
+ member
485
+ ##ative
486
+ thought
487
+ tri
488
+ ##iv
489
+ ##our
490
+ ##red
491
+ went
492
+ ##lic
493
+ find
494
+ pu
495
+ ##land
496
+ start
497
+ far
498
+ eu
499
+ imp
500
+ always
501
+ ju
502
+ wa
503
+ person
504
+ singapore
505
+ ##ap
506
+ show
507
+ chi
508
+ ten
509
+ eight
510
+ while
511
+ point
512
+ y
513
+ ja
514
+ ya
515
+ ##ling
516
+ ##ctor
517
+ use
518
+ acc
519
+ world
520
+ pay
521
+ read
522
+ ##va
523
+ ##vo
524
+ change
525
+ u
526
+ pl
527
+ sw
528
+ war
529
+ might
530
+ ##nk
531
+ ##ments
532
+ ##and
533
+ different
534
+ dec
535
+ ##cent
536
+ ste
537
+ better
538
+ fun
539
+ month
540
+ ##ship
541
+ ##ton
542
+ tell
543
+ twenty
544
+ commission
545
+ exc
546
+ miss
547
+ ##if
548
+ love
549
+ money
550
+ found
551
+ hundred
552
+ ##gg
553
+ add
554
+ real
555
+ ##ities
556
+ na
557
+ pass
558
+ didn
559
+ v
560
+ feel
561
+ week
562
+ win
563
+ ##ible
564
+ try
565
+ upon
566
+ ##ba
567
+ interest
568
+ inter
569
+ ##son
570
+ ##line
571
+ ob
572
+ boy
573
+ big
574
+ used
575
+ seven
576
+ away
577
+ family
578
+ ##less
579
+ ki
580
+ ##ber
581
+ around
582
+ turn
583
+ anything
584
+ care
585
+ young
586
+ guess
587
+ happen
588
+ course
589
+ agree
590
+ support
591
+ conf
592
+ ##ual
593
+ number
594
+ trans
595
+ ##ating
596
+ mister
597
+ hard
598
+ watch
599
+ ##ft
600
+ next
601
+ sea
602
+ open
603
+ without
604
+ ##duc
605
+ ##gra
606
+ ##ak
607
+ cap
608
+ cre
609
+ ##hi
610
+ government
611
+ vo
612
+ between
613
+ each
614
+ ve
615
+ though
616
+ country
617
+ few
618
+ once
619
+ '
620
+ head
621
+ free
622
+ mu
623
+ maybe
624
+ act
625
+ night
626
+ thousand
627
+ face
628
+ uhhuh
629
+ keep
630
+ nine
631
+ close
632
+ case
633
+ che
634
+ against
635
+ done
636
+ ever
637
+ law
638
+ believe
639
+ public
640
+ room
641
+ sub
642
+ order
643
+ important
644
+ ##ient
645
+ el
646
+ children
647
+ second
648
+ bri
649
+ business
650
+ hope
651
+ move
652
+ ##fa
653
+ however
654
+ follow
655
+ able
656
+ word
657
+ yet
658
+ fla
659
+ stand
660
+ ##ize
661
+ je
662
+ service
663
+ nothing
664
+ report
665
+ called
666
+ grow
667
+ continue
668
+ issue
669
+ since
670
+ book
671
+ lu
672
+ qui
673
+ develop
674
+ gen
675
+ certain
676
+ ##light
677
+ cor
678
+ small
679
+ took
680
+ question
681
+ whole
682
+ problem
683
+ side
684
+ child
685
+ full
686
+ best
687
+ mm
688
+ probably
689
+ ##fi
690
+ qua
691
+ sur
692
+ market
693
+ left
694
+ everything
695
+ during
696
+ understand
697
+ ##ook
698
+ ##wa
699
+ cent
700
+ water
701
+ quite
702
+ leave
703
+ himself
704
+ ##ip
705
+ near
706
+ saw
707
+ together
708
+ large
709
+ having
710
+ already
711
+ invest
712
+ pretty
713
+ direct
714
+ hour
715
+ fact
716
+ ##way
717
+ run
718
+ bra
719
+ clear
720
+ fra
721
+ area
722
+ union
723
+ enough
724
+ consider
725
+ lead
726
+ remain
727
+ president
728
+ system
729
+ def
730
+ stuff
731
+ food
732
+ job
733
+ heard
734
+ err
735
+ mind
736
+ rest
737
+ speak
738
+ asked
739
+ ##ator
740
+ half
741
+ father
742
+ ##com
743
+ less
744
+ arm
745
+ human
746
+ ##ency
747
+ matter
748
+ group
749
+ girl
750
+ current
751
+ main
752
+ ##ttle
753
+ later
754
+ learn
755
+ strong
756
+ sign
757
+ check
758
+ light
759
+ else
760
+ true
761
+ term
762
+ ##qui
763
+ minute
764
+ spec
765
+ return
766
+ answer
767
+ reason
768
+ count
769
+ shall
770
+ communi
771
+ travel
772
+ wait
773
+ provide
774
+ low
775
+ mother
776
+ expect
777
+ cause
778
+ line
779
+ general
780
+ ##lf
781
+ getting
782
+ parliament
783
+ bank
784
+ company
785
+ stop
786
+ ##cause
787
+ power
788
+ gi
789
+ europe
790
+ moment
791
+ among
792
+ walk
793
+ allow
794
+ idea
795
+ office
796
+ town
797
+ cannot
798
+ countries
799
+ become
800
+ appear
801
+ present
802
+ bring
803
+ least
804
+ almost
805
+ kids
806
+ remember
807
+ include
808
+ short
809
+ sometimes
810
+ game
811
+ level
812
+ exactly
813
+ particular
814
+ social
815
+ land
816
+ woman
817
+ north
818
+ nice
819
+ concern
820
+ sort
821
+ effect
822
+ national
823
+ several
824
+ safe
825
+ until
826
+ further
827
+ cost
828
+ wonder
829
+ whether
830
+ either
831
+ future
832
+ pra
833
+ council
834
+ knew
835
+ common
836
+ south
837
+ making
838
+ morning
839
+ process
840
+ situation
841
+ white
842
+ result
843
+ suppose
844
+ employ
845
+ political
846
+ program
847
+ along
848
+ women
849
+ ski
850
+ court
851
+ please
852
+ shi
853
+ possible
854
+ protect
855
+ experience
856
+ definitely
857
+ require
858
+ account
859
+ myself
860
+ black
861
+ example
862
+ america
863
+ thirty
864
+ student
865
+ view
866
+ product
867
+ wife
868
+ health
869
+ major
870
+ difficult
871
+ death
872
+ visit
873
+ across
874
+ receive
875
+ voice
876
+ citizen
877
+ regard
878
+ author
879
+ treat
880
+ especially
881
+ local
882
+ taking
883
+ information
884
+ seemed
885
+ success
886
+ ##ability
887
+ break
888
+ whatever
889
+ security
890
+ address
891
+ felt
892
+ fifty
893
+ million
894
+ third
895
+ usually
896
+ gonna
897
+ brother
898
+ began
899
+ period
900
+ east
901
+ economic
902
+ increase
903
+ financial
904
+ respect
905
+ enjoy
906
+ christ
907
+ education
908
+ brought
909
+ organ
910
+ parents
911
+ policy
912
+ round
913
+ became
914
+ region
915
+ lady
916
+ discuss
917
+ single
918
+ early
919
+ couple
920
+ type
921
+ itself
922
+ serve
923
+ measure
924
+ husband
925
+ ##ified
926
+ music
927
+ ground
928
+ companies
929
+ street
930
+ behind
931
+ value
932
+ therefore
933
+ police
934
+ complete
935
+ john
936
+ daughter
937
+ affect
938
+ perhaps
939
+ international
940
+ themselves
941
+ improve
942
+ condition
943
+ hotel
944
+ deliver
945
+ sense
946
+ relation
947
+ sorry
948
+ credit
949
+ effort
950
+ instead
951
+ york
952
+ united
953
+ partner
954
+ spoke
955
+ strange
956
+ everybody
957
+ horse
958
+ depend
959
+ subject
960
+ project
961
+ approach
962
+ involve
963
+ listen
964
+ draw
965
+ computer
966
+ married
967
+ record
968
+ happy
969
+ sudden
970
+ represent
971
+ somebody
972
+ correct
973
+ serious
974
+ decision
975
+ society
976
+ including
977
+ college
978
+ english
979
+ attack
980
+ perform
981
+ cross
982
+ accept
983
+ control
984
+ flow
985
+ although
986
+ drink
987
+ front
988
+ wrong
989
+ twi
990
+ according
991
+ slow
992
+ peace
993
+ amount
994
+ object
995
+ movie
996
+ benefit
997
+ yup
998
+ challenge
999
+ private
1000
+ church
1001
+ wood
1002
+ field
1003
+ above
1004
+ ensure
1005
+ immediate
1006
+ figure
1007
+ foreign
1008
+ available
1009
+ insurance
1010
+ proposal
1011
+ doubt
1012
+ strength
1013
+ difference
1014
+ stood
1015
+ implement
1016
+ economy
1017
+ detail
1018
+ umhum
1019
+ restaurant
1020
+ collect
1021
+ global
1022
+ broke
1023
+ ##q
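A reading note on the list above: the entries appear to follow the WordPiece-style convention in which a `##` prefix marks a piece that continues the previous word, while the `tokenizer.vocab` file added in the same commit lists what look like the same pieces in SentencePiece notation, where `▁` marks the start of a word. The sketch below illustrates that mapping under this assumption. The function names are hypothetical and are not part of NeMo or SentencePiece, and special tokens such as `<unk>` (which has no counterpart in `vocab.txt`) are not handled.

```python
WORD_START = "\u2581"  # the "▁" word-start marker seen in tokenizer.vocab


def to_wordpiece_notation(piece):
    """Map a SentencePiece-style piece to the '##' notation seen in vocab.txt."""
    if piece.startswith(WORD_START):
        # Word-initial piece: drop the marker, keep the text as-is.
        return piece[len(WORD_START):]
    # Continuation piece: mark it with the '##' prefix.
    return "##" + piece


def join_pieces(pieces):
    """Rebuild space-separated text from a sequence of '##'-notation pieces."""
    words = []
    for piece in pieces:
        if piece.startswith("##"):
            if words:
                words[-1] += piece[2:]    # glue onto the previous word
            else:
                words.append(piece[2:])   # stray continuation at the start
        else:
            words.append(piece)           # a new word begins here
    return " ".join(words)


print(to_wordpiece_notation("\u2581the"))           # word-initial piece
print(to_wordpiece_notation("s"))                   # continuation piece
print(join_pieces(["the", "friend", "##s", "talk", "##ed"]))
```

Under this reading, `friend` followed by `##s` decodes to the single word `friends`, whereas `friend` followed by `s` would be two separate words.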
977d4e24975b431ebb44f2dfcdea8778_tokenizer.model ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:d9b04033136c5d0413047fe94d2f0ab6cb088d292014bb076ee1700bdac545a9
3
+ size 260411
README.md ADDED
@@ -0,0 +1,85 @@
1
+ ---
2
+ {}
3
+ ---
4
+
5
+ ## Model Overview
6
+
7
+ <DESCRIBE IN ONE LINE THE MODEL AND ITS USE>
8
+
9
+ ## NVIDIA NeMo: Training
10
+
11
+ To train, fine-tune, or play with the model, you will need to install [NVIDIA NeMo](https://github.com/NVIDIA/NeMo). We recommend installing it after you have installed the latest PyTorch version.
12
+ ```shell
13
+ pip install "nemo_toolkit[all]"
14
+ ```
15
+
16
+ ## How to Use this Model
17
+
18
+ The model is available for use in the NeMo toolkit [1], and can be used as a pre-trained checkpoint for inference or for fine-tuning on another dataset.
19
+
20
+ ### Automatically instantiate the model
21
+
22
+ ```python
23
+ import nemo.collections.asr as nemo_asr
24
+ asr_model = nemo_asr.models.ASRModel.from_pretrained("phtran/stt_en_conformer_ctc_small")
25
+ ```
26
+
27
+ ### Transcribing using Python
28
+ First, download a sample audio file:
29
+ ```shell
30
+ wget https://dldata-public.s3.us-east-2.amazonaws.com/2086-149220-0033.wav
31
+ ```
32
+ Then transcribe it:
33
+ ```python
34
+ asr_model.transcribe(['2086-149220-0033.wav'])
35
+ ```
36
+
37
+ ### Transcribing many audio files
38
+
39
+ ```shell
40
+ python [NEMO_GIT_FOLDER]/examples/asr/transcribe_speech.py pretrained_name="phtran/stt_en_conformer_ctc_small" audio_dir="<DIRECTORY CONTAINING AUDIO FILES>"
41
+ ```
42
+
43
+ ### Input
44
+
45
+ This model accepts 16000 Hz (16 kHz) mono-channel audio (WAV files) as input.
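Before transcribing, it can help to confirm that a file matches the input format described above. The following is a minimal sketch using only Python's standard-library `wave` module. The helper name `check_wav` and the two constants are illustrative assumptions, not part of NeMo, and the check only covers uncompressed PCM WAV files that `wave` can open.

```python
import wave

EXPECTED_RATE_HZ = 16000  # 16 kHz, as stated in the Input section
EXPECTED_CHANNELS = 1     # mono


def check_wav(path):
    """Return (ok, sample_rate, channels) for a PCM WAV file.

    `ok` is True only when the file is 16 kHz and single-channel.
    """
    with wave.open(path, "rb") as wav_file:
        sample_rate = wav_file.getframerate()
        channels = wav_file.getnchannels()
    ok = sample_rate == EXPECTED_RATE_HZ and channels == EXPECTED_CHANNELS
    return ok, sample_rate, channels
```

For example, `check_wav("2086-149220-0033.wav")` could be called on the sample file downloaded earlier. A file that fails the check would need to be resampled or downmixed with an audio tool of your choice before being passed to `asr_model.transcribe`.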
46
+
47
+ ### Output
48
+
49
+ This model provides transcribed speech as a string for a given audio sample.
50
+
51
+ ## Model Architecture
52
+
53
+ <ADD SOME INFORMATION ABOUT THE ARCHITECTURE>
54
+
55
+ ## Training
56
+
57
+ <ADD INFORMATION ABOUT HOW THE MODEL WAS TRAINED - HOW MANY EPOCHS, AMOUNT OF COMPUTE ETC>
58
+
59
+ ### Datasets
60
+
61
+ <LIST THE NAME AND SPLITS OF DATASETS USED TO TRAIN THIS MODEL (ALONG WITH LANGUAGE AND ANY ADDITIONAL INFORMATION)>
62
+
63
+ ## Performance
64
+
65
+ <LIST THE SCORES OF THE MODEL -
66
+ OR
67
+ USE THE Hugging Face Evaluate LIBRARY TO UPLOAD METRICS>
68
+
69
+ ## Limitations
70
+
71
+ <DECLARE ANY POTENTIAL LIMITATIONS OF THE MODEL>
72
+
73
+ E.g.:
74
+ Since this model was trained on publicly available speech datasets, its performance might degrade for speech that includes technical terms or vernacular the model has not been trained on. The model might also perform worse on accented speech.
75
+
76
+
77
+ ## References
78
+
79
+ <ADD ANY REFERENCES HERE AS NEEDED>
80
+
81
+ [1] [NVIDIA NeMo Toolkit](https://github.com/NVIDIA/NeMo)
82
+
83
+
84
+ ## Original Model Name: ABC
85
+ ## Repo ID: nvidia/ABC_XYZ
cf241f7e4d904eaea46bb96f21dd0b1d_tokenizer.vocab ADDED
@@ -0,0 +1,1024 @@
1
+ <unk> 0
2
+ s -3.0263
3
+ ▁the -3.41766
4
+ ▁a -3.9762
5
+ t -4.01176
6
+ ▁to -4.02202
7
+ ▁and -4.04939
8
+ ▁i -4.14288
9
+ ▁of -4.25654
10
+ ' -4.32966
11
+ ed -4.41167
12
+ ▁in -4.45237
13
+ d -4.45877
14
+ ing -4.53241
15
+ n -4.56881
16
+ e -4.62574
17
+ ▁it -4.65761
18
+ ▁that -4.66105
19
+ ▁you -4.66842
20
+ y -4.80029
21
+ er -4.92503
22
+ r -4.94797
23
+ ▁for -5.00422
24
+ m -5.00985
25
+ ▁is -5.04904
26
+ ▁he -5.13475
27
+ re -5.21796
28
+ ▁was -5.22137
29
+ ▁be -5.25517
30
+ p -5.27288
31
+ ly -5.28375
32
+ ▁so -5.31924
33
+ ▁we -5.36619
34
+ a -5.41184
35
+ g -5.43206
36
+ o -5.43275
37
+ ▁c -5.44933
38
+ b -5.47847
39
+ u -5.49905
40
+ ▁on -5.50807
41
+ ▁have -5.5109
42
+ ▁but -5.51464
43
+ ll -5.51628
44
+ ▁with -5.51689
45
+ ▁re -5.54855
46
+ or -5.5861
47
+ ▁s -5.58872
48
+ al -5.60085
49
+ ▁do -5.61746
50
+ ▁know -5.62175
51
+ ar -5.64978
52
+ ▁they -5.65081
53
+ ▁not -5.65161
54
+ ▁as -5.66379
55
+ ▁this -5.68369
56
+ in -5.69316
57
+ le -5.70926
58
+ ▁e -5.71441
59
+ ▁are -5.73661
60
+ ▁like -5.77899
61
+ c -5.81055
62
+ ▁uh -5.86416
63
+ ri -5.86672
64
+ ▁me -5.87717
65
+ ▁his -5.88288
66
+ ▁at -5.88837
67
+ l -5.99474
68
+ es -6.00344
69
+ ▁de -6.00677
70
+ ▁yeah -6.02199
71
+ ▁can -6.05347
72
+ k -6.06257
73
+ ▁or -6.06696
74
+ ▁my -6.06989
75
+ ▁all -6.07977
76
+ ▁had -6.0971
77
+ ▁there -6.10531
78
+ ▁will -6.10972
79
+ ▁one -6.11344
80
+ il -6.11446
81
+ ▁no -6.12262
82
+ ▁what -6.12458
83
+ en -6.13206
84
+ ck -6.13865
85
+ ▁b -6.14598
86
+ ▁f -6.1574
87
+ ce -6.17153
88
+ ch -6.17679
89
+ i -6.18348
90
+ ▁by -6.18665
91
+ ▁she -6.19154
92
+ ▁from -6.19541
93
+ ▁an -6.19568
94
+ ic -6.20819
95
+ ur -6.20979
96
+ ve -6.22601
97
+ w -6.23968
98
+ ter -6.24495
99
+ la -6.2526
100
+ ▁if -6.25995
101
+ ▁just -6.2716
102
+ th -6.27748
103
+ li -6.28385
104
+ ▁ -6.2845
105
+ ▁her -6.28703
106
+ ▁um -6.29534
107
+ on -6.31318
108
+ ation -6.31753
109
+ ▁w -6.32975
110
+ ▁would -6.33558
111
+ f -6.33633
112
+ te -6.36164
113
+ ▁st -6.37187
114
+ ▁go -6.38758
115
+ ir -6.40928
116
+ it -6.41129
117
+ ▁out -6.41964
118
+ ro -6.42389
119
+ ▁pa -6.42399
120
+ ▁were -6.43243
121
+ ▁g -6.43733
122
+ ▁t -6.43754
123
+ ion -6.44863
124
+ ▁think -6.45619
125
+ an -6.4642
126
+ ▁right -6.46855
127
+ ▁about -6.4767
128
+ se -6.48153
129
+ lo -6.48217
130
+ ent -6.48888
131
+ ▁up -6.49777
132
+ ment -6.49927
133
+ ate -6.52341
134
+ ▁when -6.53144
135
+ h -6.53238
136
+ ne -6.53604
137
+ ▁don -6.53936
138
+ ▁has -6.55882
139
+ ▁also -6.56619
140
+ ▁more -6.56684
141
+ ▁see -6.5687
142
+ ▁okay -6.57173
143
+ ▁their -6.57539
144
+ ▁your -6.57664
145
+ ge -6.60217
146
+ ▁who -6.60997
147
+ ▁well -6.61219
148
+ ▁co -6.61255
149
+ ▁which -6.61274
150
+ ▁some -6.61714
151
+ ▁se -6.61804
152
+ ▁time -6.62142
153
+ ▁ba -6.62355
154
+ ▁said -6.63244
155
+ ▁con -6.64083
156
+ ers -6.64156
157
+ ▁ra -6.65025
158
+ us -6.65176
159
+ de -6.66205
160
+ ra -6.66792
161
+ ▁him -6.68277
162
+ ▁our -6.68791
163
+ ▁been -6.69063
164
+ ▁fa -6.69408
165
+ ▁po -6.70744
166
+ ▁pro -6.71192
167
+ et -6.71725
168
+ x -6.71916
169
+ ▁la -6.73553
170
+ id -6.74282
171
+ ver -6.74689
172
+ ▁oh -6.74706
173
+ ▁ma -6.75825
174
+ v -6.76115
175
+ ▁now -6.76235
176
+ age -6.76427
177
+ ▁two -6.76715
178
+ ld -6.76771
179
+ ▁mo -6.76887
180
+ ▁how -6.77155
181
+ tion -6.7937
182
+ ▁people -6.79744
183
+ ive -6.80143
184
+ ▁other -6.80534
185
+ ng -6.80715
186
+ ity -6.81352
187
+ z -6.81663
188
+ ist -6.83718
189
+ ▁very -6.84516
190
+ ▁get -6.85506
191
+ ▁any -6.86148
192
+ ▁un -6.86614
193
+ ▁ro -6.86987
194
+ is -6.87234
195
+ ▁work -6.88255
196
+ ▁mean -6.88295
197
+ ▁them -6.88553
198
+ ▁lo -6.88958
199
+ vi -6.89581
200
+ ▁because -6.89751
201
+ ies -6.89879
202
+ ul -6.90015
203
+ as -6.90824
204
+ ad -6.91023
205
+ mp -6.91547
206
+ ▁bo -6.9169
207
+ - -6.91865
208
+ ▁then -6.91935
209
+ ▁good -6.92813
210
+ el -6.93225
211
+ nd -6.93325
212
+ ▁li -6.936
213
+ ▁man -6.93624
214
+ ▁dis -6.94602
215
+ ▁could -6.95497
216
+ ▁ho -6.96697
217
+ at -6.96766
218
+ ol -6.97041
219
+ ▁bu -6.97092
220
+ ▁te -6.97702
221
+ ▁ha -6.98348
222
+ est -6.98549
223
+ me -6.99011
224
+ ▁say -6.99224
225
+ ru -6.99512
226
+ ke -6.99659
227
+ ▁sp -7.00212
228
+ ▁k -7.01427
229
+ able -7.01431
230
+ ▁su -7.01529
231
+ ▁sa -7.02578
232
+ ▁di -7.02821
233
+ ▁fi -7.03155
234
+ ance -7.03657
235
+ ▁really -7.03735
236
+ ▁over -7.03941
237
+ ▁even -7.05293
238
+ ry -7.05582
239
+ ▁us -7.05738
240
+ ▁ca -7.06337
241
+ ow -7.06468
242
+ ho -7.07761
243
+ ▁into -7.07966
244
+ ence -7.08333
245
+ mo -7.08787
246
+ ▁mi -7.08789
247
+ one -7.08829
248
+ qu -7.09051
249
+ ut -7.09374
250
+ lu -7.09945
251
+ ▁o -7.10004
252
+ ty -7.10264
253
+ ▁after -7.10287
254
+ ▁want -7.10504
255
+ ▁new -7.10788
256
+ ▁take -7.11305
257
+ ▁p -7.12212
258
+ ▁look -7.1312
259
+ ▁pre -7.13218
260
+ sh -7.13355
261
+ ▁day -7.14166
262
+ ▁should -7.14538
263
+ ▁th -7.15043
264
+ ▁need -7.15097
265
+ ▁cha -7.15336
266
+ co -7.15733
267
+ ▁much -7.16213
268
+ ▁where -7.16628
269
+ ▁d -7.16716
270
+ ant -7.16962
271
+ ▁fe -7.1718
272
+ ▁da -7.17488
273
+ ▁make -7.18448
274
+ om -7.18916
275
+ ▁did -7.19359
276
+ ▁le -7.19823
277
+ un -7.19886
278
+ ▁only -7.20642
279
+ im -7.20754
280
+ ▁these -7.20762
281
+ ff -7.20976
282
+ ti -7.21232
283
+ ish -7.21261
284
+ ▁ex -7.21338
285
+ ted -7.21551
286
+ ▁first -7.23735
287
+ he -7.24304
288
+ ig -7.24652
289
+ ▁vi -7.25989
290
+ ▁ri -7.26283
291
+ ▁en -7.27274
292
+ ▁com -7.27526
293
+ ated -7.27628
294
+ ▁than -7.28094
295
+ ma -7.28164
296
+ ▁way -7.28836
297
+ um -7.29041
298
+ ct -7.29154
299
+ end -7.29294
300
+ ight -7.29556
301
+ ▁here -7.31492
302
+ ▁ta -7.3183
303
+ ▁car -7.32546
304
+ ▁part -7.32842
305
+ ▁come -7.3299
306
+ ia -7.33231
307
+ ▁off -7.33717
308
+ ▁sc -7.33728
309
+ ▁ah -7.34222
310
+ am -7.34737
311
+ ▁tra -7.34964
312
+ ▁yes -7.35933
313
+ ▁back -7.35995
314
+ ture -7.3603
315
+ ful -7.36402
316
+ ▁pri -7.37566
317
+ ction -7.37631
318
+ ine -7.38027
319
+ ▁three -7.38034
320
+ ard -7.38252
321
+ ▁let -7.38935
322
+ pe -7.3907
323
+ ▁little -7.39645
324
+ ▁down -7.40305
325
+ mb -7.40994
326
+ ▁si -7.41456
327
+ ▁dr -7.41553
328
+ ▁mr -7.41781
329
+ ▁going -7.4185
330
+ ▁comp -7.42245
331
+ po -7.43274
332
+ ▁m -7.43719
333
+ ▁sta -7.43858
334
+ ▁gra -7.44028
335
+ day -7.44115
336
+ ▁many -7.44476
337
+ ian -7.44484
338
+ ta -7.44597
339
+ ▁long -7.44993
340
+ ▁pi -7.45578
341
+ ▁too -7.45677
342
+ ▁app -7.46234
343
+ ▁kind -7.46504
344
+ ous -7.46577
345
+ ci -7.47054
346
+ ▁ga -7.47277
347
+ ten -7.47296
348
+ nt -7.47766
349
+ ▁before -7.48153
350
+ ▁may -7.48629
351
+ ▁got -7.48682
352
+ man -7.48816
353
+ tic -7.49072
354
+ ition -7.49114
355
+ cu -7.49165
356
+ ugh -7.49891
357
+ tra -7.50018
358
+ ▁n -7.5028
359
+ ward -7.51083
360
+ ▁give -7.51228
361
+ ▁every -7.51262
362
+ ▁hi -7.51464
363
+ ting -7.51719
364
+ ▁exp -7.51776
365
+ ▁those -7.52665
366
+ ▁hu -7.53098
367
+ ot -7.5321
368
+ ▁something -7.53239
369
+ ▁lot -7.53632
370
+ ▁still -7.53767
371
+ ▁ne -7.53773
372
+ na -7.54076
373
+ ise -7.54493
374
+ pp -7.54774
375
+ ▁most -7.54882
376
+ ▁gu -7.55125
377
+ ▁state -7.55449
378
+ ▁actually -7.55843
379
+ ▁such -7.5601
380
+ ▁bi -7.56029
381
+ ▁never -7.56307
382
+ tain -7.56875
383
+ ▁great -7.57037
384
+ ▁through -7.57153
385
+ ▁al -7.57598
386
+ no -7.57722
387
+ ▁mar -7.58161
388
+ ▁year -7.58339
389
+ ach -7.58457
390
+ les -7.58859
391
+ ▁school -7.59211
392
+ ally -7.59229
393
+ ial -7.59368
394
+ ha -7.595
395
+ ▁old -7.59934
396
+ ▁made -7.59943
397
+ ary -7.60185
398
+ ▁ar -7.60606
399
+ ▁years -7.60819
400
+ ▁help -7.61816
401
+ ▁per -7.61937
402
+ ving -7.62696
403
+ ical -7.63029
404
+ ther -7.63423
405
+ ▁does -7.63932
406
+ ac -7.6418
407
+ ca -7.64436
408
+ ▁must -7.64511
409
+ di -7.65192
410
+ ▁own -7.6538
411
+ ▁ru -7.66118
412
+ ▁things -7.666
413
+ ▁hand -7.67196
414
+ ▁thing -7.67317
415
+ ▁high -7.67524
416
+ ▁last -7.67624
417
+ go -7.67763
418
+ ▁sh -7.68286
419
+ ▁under -7.68578
420
+ ▁four -7.68755
421
+ ▁place -7.68929
422
+ ations -7.6929
423
+ ▁sure -7.69793
424
+ mi -7.70033
425
+ nce -7.70056
426
+ ▁am -7.70431
427
+ for -7.70653
428
+ ness -7.70694
429
+ ▁name -7.72033
430
+ ▁five -7.72376
431
+ ound -7.72643
432
+ ▁op -7.73952
433
+ ▁cons -7.74034
434
+ ▁ph -7.74449
435
+ ▁same -7.74932
436
+ row -7.74959
437
+ ven -7.74973
438
+ ph -7.75637
439
+ ite -7.75888
440
+ ▁pe -7.75936
441
+ j -7.7608
442
+ ▁sha -7.76587
443
+ ▁friend -7.76655
444
+ ▁wi -7.76686
445
+ ▁call -7.77014
446
+ ▁european -7.77369
447
+ ▁h -7.77635
448
+ ect -7.77656
449
+ ress -7.77762
450
+ ▁live -7.77843
451
+ port -7.78443
452
+ ▁mhm -7.79247
453
+ ▁house -7.79419
454
+ ie -7.79563
455
+ ni -7.79695
456
+ ▁plan -7.79856
457
+ ▁jo -7.7989
458
+ ▁play -7.80456
459
+ side -7.80475
460
+ ▁va -7.8092
461
+ min -7.81213
462
+ ious -7.81221
463
+ ▁life -7.81311
464
+ ▁du -7.81412
465
+ ▁ti -7.81425
466
+ ▁six -7.81679
467
+ ▁men -7.8204
468
+ ▁again -7.82504
469
+ ▁thank -7.82596
470
+ ▁talk -7.82781
471
+ par -7.82904
472
+ ▁home -7.83486
473
+ op -7.84443
474
+ ▁both -7.84762
475
+ ▁why -7.85314
476
+ ▁put -7.85821
477
+ ▁another -7.85998
478
+ nc -7.8612
479
+ ▁being -7.86229
480
+ mit -7.8639
481
+ ▁came -7.86521
482
+ led -7.86547
483
+ ▁fo -7.86784
484
+ ▁end -7.87541
485
+ ▁member -7.87601
486
+ ative -7.87978
487
+ ▁thought -7.88889
488
+ ▁tri -7.8906
489
+ iv -7.89618
490
+ our -7.90019
491
+ red -7.90031
492
+ ▁went -7.90437
493
+ lic -7.90907
494
+ ▁find -7.90992
495
+ ▁pu -7.90997
496
+ land -7.91316
497
+ ▁start -7.91694
498
+ ▁far -7.92023
499
+ ▁eu -7.9263
500
+ ▁imp -7.92665
501
+ ▁always -7.93151
502
+ ▁ju -7.93233
503
+ ▁wa -7.93349
504
+ ▁person -7.93371
505
+ ▁singapore -7.93402
506
+ ap -7.93525
507
+ ▁show -7.93527
508
+ ▁chi -7.93562
509
+ ▁ten -7.9425
510
+ ▁eight -7.94355
511
+ ▁while -7.9474
512
+ ▁point -7.94801
513
+ ▁y -7.94862
514
+ ▁ja -7.95103
515
+ ▁ya -7.95249
516
+ ling -7.95384
517
+ ctor -7.954
518
+ ▁use -7.95713
519
+ ▁acc -7.9612
520
+ ▁world -7.96149
521
+ ▁pay -7.96624
522
+ ▁read -7.96903
523
+ va -7.97052
524
+ vo -7.97293
525
+ ▁change -7.97626
526
+ ▁u -7.97726
527
+ ▁pl -7.98219
528
+ ▁sw -7.98502
529
+ ▁war -7.98626
530
+ ▁might -7.98774
531
+ nk -7.98824
532
+ ments -7.9957
533
+ and -7.99676
534
+ ▁different -7.99873
535
+ ▁dec -8.00827
536
+ cent -8.01161
537
+ ▁ste -8.0118
538
+ ▁better -8.01318
539
+ ▁fun -8.01416
540
+ ▁month -8.01456
541
+ ship -8.01656
542
+ ton -8.01695
543
+ ▁tell -8.02101
544
+ ▁twenty -8.02348
545
+ ▁commission -8.02641
546
+ ▁exc -8.02751
547
+ ▁miss -8.02755
548
+ if -8.02762
549
+ ▁love -8.02776
550
+ ▁money -8.02894
551
+ ▁found -8.02991
552
+ ▁hundred -8.03069
553
+ gg -8.03105
554
+ ▁add -8.03162
555
+ ▁real -8.03171
556
+ ities -8.03677
557
+ ▁na -8.03825
558
+ ▁pass -8.0427
559
+ ▁didn -8.04393
560
+ ▁v -8.0483
561
+ ▁feel -8.04974
562
+ ▁week -8.05084
563
+ ▁win -8.0563
564
+ ible -8.06954
565
+ ▁try -8.07298
566
+ ▁upon -8.0737
567
+ ba -8.07431
568
+ ▁interest -8.0749
569
+ ▁inter -8.07675
570
+ son -8.08404
571
+ line -8.08887
572
+ ▁ob -8.08934
573
+ ▁boy -8.0899
574
+ ▁big -8.09538
575
+ ▁used -8.09782
576
+ ▁seven -8.09997
577
+ ▁away -8.10552
578
+ ▁family -8.1072
579
+ less -8.10829
580
+ ▁ki -8.10881
581
+ ber -8.1112
582
+ ▁around -8.11357
583
+ ▁turn -8.11442
584
+ ▁anything -8.11519
585
+ ▁care -8.11883
586
+ ▁young -8.12192
587
+ ▁guess -8.1225
588
+ ▁happen -8.12512
589
+ ▁course -8.12644
590
+ ▁agree -8.12835
591
+ ▁support -8.13074
592
+ ▁conf -8.13164
593
+ ual -8.1355
594
+ ▁number -8.14447
595
+ ▁trans -8.15084
596
+ ating -8.15272
597
+ ▁mister -8.15316
598
+ ▁hard -8.15454
599
+ ▁watch -8.15696
600
+ ft -8.15788
601
+ ▁next -8.15965
602
+ ▁sea -8.161
603
+ ▁open -8.16351
604
+ ▁without -8.16441
605
+ duc -8.16665
606
+ gra -8.1671
607
+ ak -8.16803
608
+ ▁cap -8.17043
609
+ ▁cre -8.17626
610
+ hi -8.17739
611
+ ▁government -8.17883
612
+ ▁vo -8.17982
613
+ ▁between -8.18458
614
+ ▁each -8.1848
615
+ ▁ve -8.18887
616
+ ▁though -8.1907
617
+ ▁country -8.19281
618
+ ▁few -8.19797
619
+ ▁once -8.19862
620
+ ▁' -8.19912
621
+ ▁head -8.20153
622
+ ▁free -8.20175
623
+ ▁mu -8.20296
624
+ ▁maybe -8.2058
625
+ ▁act -8.20809
626
+ ▁night -8.21235
627
+ ▁thousand -8.21592
628
+ ▁face -8.21699
629
+ ▁uhhuh -8.21787
630
+ ▁keep -8.21985
631
+ ▁nine -8.22633
632
+ ▁close -8.22707
633
+ ▁case -8.22822
634
+ ▁che -8.23344
635
+ ▁against -8.23536
636
+ ▁done -8.23629
637
+ ▁ever -8.23893
638
+ ▁law -8.24006
639
+ ▁believe -8.24459
640
+ ▁public -8.24677
641
+ ▁room -8.24682
642
+ ▁sub -8.24708
643
+ ▁order -8.24887
644
+ ▁important -8.2519
645
+ ient -8.25677
646
+ ▁el -8.25703
647
+ ▁children -8.25762
648
+ ▁second -8.25816
649
+ ▁bri -8.25859
650
+ ▁business -8.25869
651
+ ▁hope -8.259
652
+ ▁move -8.2649
653
+ fa -8.27045
654
+ ▁however -8.27612
655
+ ▁follow -8.27871
656
+ ▁able -8.27923
657
+ ▁word -8.28135
658
+ ▁yet -8.28793
659
+ ▁fla -8.28909
660
+ ▁stand -8.28963
661
+ ize -8.2908
662
+ ▁je -8.29324
663
+ ▁service -8.2939
664
+ ▁nothing -8.29923
665
+ ▁report -8.30061
666
+ ▁called -8.30077
667
+ ▁grow -8.304
668
+ ▁continue -8.30432
669
+ ▁issue -8.30586
670
+ ▁since -8.30646
671
+ ▁book -8.30884
672
+ ▁lu -8.31129
673
+ ▁qui -8.3221
674
+ ▁develop -8.32443
675
+ ▁gen -8.32549
676
+ ▁certain -8.32734
677
+ light -8.3287
678
+ ▁cor -8.33096
679
+ ▁small -8.33923
680
+ ▁took -8.34293
681
+ ▁question -8.34314
682
+ ▁whole -8.34616
683
+ ▁problem -8.35451
684
+ ▁side -8.35748
685
+ ▁child -8.35765
686
+ ▁full -8.35956
687
+ ▁best -8.36001
688
+ ▁mm -8.36002
689
+ ▁probably -8.36158
690
+ fi -8.36244
691
+ ▁qua -8.36276
692
+ ▁sur -8.36299
693
+ ▁market -8.36613
694
+ ▁left -8.36633
695
+ ▁everything -8.36649
696
+ ▁during -8.36694
697
+ ▁understand -8.36833
698
+ ook -8.36868
699
+ wa -8.37134
700
+ ▁cent -8.37409
701
+ ▁water -8.37583
702
+ ▁quite -8.37962
703
+ ▁leave -8.38036
704
+ ▁himself -8.38307
705
+ ip -8.38461
706
+ ▁near -8.38513
707
+ ▁saw -8.38643
708
+ ▁together -8.38781
709
+ ▁large -8.39035
710
+ ▁having -8.39267
711
+ ▁already -8.39628
712
+ ▁invest -8.39904
713
+ ▁pretty -8.40008
714
+ ▁direct -8.40244
715
+ ▁hour -8.40644
716
+ ▁fact -8.40997
717
+ way -8.41625
718
+ ▁run -8.42466
719
+ ▁bra -8.42567
720
+ ▁clear -8.43217
721
+ ▁fra -8.43417
722
+ ▁area -8.43486
723
+ ▁union -8.43572
724
+ ▁enough -8.43701
725
+ ▁consider -8.43816
726
+ ▁lead -8.44282
727
+ ▁remain -8.4432
728
+ ▁president -8.44349
729
+ ▁system -8.44359
730
+ ▁def -8.44528
731
+ ▁stuff -8.44543
732
+ ▁food -8.44681
733
+ ▁job -8.44745
734
+ ▁heard -8.4497
735
+ ▁err -8.4517
736
+ ▁mind -8.45972
737
+ ▁rest -8.46496
738
+ ▁speak -8.47285
739
+ ▁asked -8.47311
740
+ ator -8.47485
741
+ ▁half -8.47916
742
+ ▁father -8.48547
743
+ com -8.48932
744
+ ▁less -8.48994
745
+ ▁arm -8.49058
746
+ ▁human -8.49272
747
+ ency -8.496
748
+ ▁matter -8.4986
749
+ ▁group -8.50052
750
+ ▁girl -8.50055
751
+ ▁current -8.50726
752
+ ▁main -8.50751
753
+ ttle -8.5106
754
+ ▁later -8.516
755
+ ▁learn -8.51682
756
+ ▁strong -8.52364
757
+ ▁sign -8.5245
758
+ ▁check -8.52607
759
+ ▁light -8.52923
760
+ ▁else -8.5308
761
+ ▁true -8.53455
762
+ ▁term -8.53734
763
+ qui -8.53768
764
+ ▁minute -8.53859
765
+ ▁spec -8.54202
766
+ ▁return -8.55529
767
+ ▁answer -8.55575
768
+ ▁reason -8.55948
769
+ ▁count -8.56043
770
+ ▁shall -8.56069
771
+ ▁communi -8.56213
772
+ ▁travel -8.56939
773
+ ▁wait -8.56987
774
+ ▁provide -8.57123
775
+ ▁low -8.57206
776
+ ▁mother -8.57245
777
+ ▁expect -8.57439
778
+ ▁cause -8.57557
779
+ ▁line -8.5759
780
+ ▁general -8.57882
781
+ lf -8.58031
782
+ ▁getting -8.58089
783
+ ▁parliament -8.59679
784
+ ▁bank -8.6064
785
+ ▁company -8.6065
786
+ ▁stop -8.6096
787
+ cause -8.61016
788
+ ▁power -8.61078
789
+ ▁gi -8.61387
790
+ ▁europe -8.61412
791
+ ▁moment -8.61667
792
+ ▁among -8.62169
793
+ ▁walk -8.62234
794
+ ▁allow -8.62354
795
+ ▁idea -8.62409
796
+ ▁office -8.63983
797
+ ▁town -8.64192
798
+ ▁cannot -8.64504
799
+ ▁countries -8.65571
800
+ ▁become -8.65711
801
+ ▁appear -8.66066
802
+ ▁present -8.66197
803
+ ▁bring -8.66817
804
+ ▁least -8.67734
805
+ ▁almost -8.67744
806
+ ▁kids -8.68149
807
+ ▁remember -8.68482
808
+ ▁include -8.69362
809
+ ▁short -8.69588
810
+ ▁sometimes -8.70013
811
+ ▁game -8.70183
812
+ ▁level -8.70238
813
+ ▁exactly -8.71213
814
+ ▁particular -8.71547
815
+ ▁social -8.71627
816
+ ▁land -8.7235
817
+ ▁woman -8.72472
818
+ ▁north -8.73017
819
+ ▁nice -8.73061
820
+ ▁concern -8.73119
821
+ ▁sort -8.73497
822
+ ▁effect -8.7364
823
+ ▁national -8.73674
824
+ ▁several -8.74399
825
+ ▁safe -8.74482
826
+ ▁until -8.75167
827
+ ▁further -8.75344
828
+ ▁cost -8.75806
829
+ ▁wonder -8.7592
830
+ ▁whether -8.75952
831
+ ▁either -8.76064
832
+ ▁future -8.76157
833
+ ▁pra -8.76177
834
+ ▁council -8.76626
835
+ ▁knew -8.76688
836
+ ▁common -8.76738
837
+ ▁south -8.76883
838
+ ▁making -8.77378
839
+ ▁morning -8.78209
840
+ ▁process -8.78873
841
+ ▁situation -8.79142
842
+ ▁white -8.80433
843
+ ▁result -8.80686
844
+ ▁suppose -8.8105
845
+ ▁employ -8.8111
846
+ ▁political -8.8127
847
+ ▁program -8.81279
848
+ ▁along -8.81438
849
+ ▁women -8.81511
850
+ ▁ski -8.81555
851
+ ▁court -8.81868
852
+ ▁please -8.81922
853
+ ▁shi -8.82138
854
+ ▁possible -8.82444
855
+ ▁protect -8.82701
856
+ ▁experience -8.83437
857
+ ▁definitely -8.8378
858
+ ▁require -8.84286
859
+ ▁account -8.8439
860
+ ▁myself -8.84545
861
+ ▁black -8.84754
862
+ ▁example -8.84764
863
+ ▁america -8.85202
864
+ ▁thirty -8.86407
865
+ ▁student -8.87088
866
+ ▁view -8.87089
867
+ ▁product -8.87095
868
+ ▁wife -8.87413
869
+ ▁health -8.87932
870
+ ▁major -8.88405
871
+ ▁difficult -8.88448
872
+ ▁death -8.88658
873
+ ▁visit -8.88982
874
+ ▁across -8.89176
875
+ ▁receive -8.90186
876
+ ▁voice -8.90391
877
+ ▁citizen -8.90606
878
+ ▁regard -8.91305
879
+ ▁author -8.91561
880
+ ▁treat -8.91734
881
+ ▁especially -8.91817
882
+ ▁local -8.92759
883
+ ▁taking -8.93424
884
+ ▁information -8.93488
885
+ ▁seemed -8.93623
886
+ ▁success -8.94708
887
+ ability -8.953
888
+ ▁break -8.95426
889
+ ▁whatever -8.95489
890
+ ▁security -8.95574
891
+ ▁address -8.9605
892
+ ▁felt -8.96094
893
+ ▁fifty -8.96728
894
+ ▁million -8.96729
895
+ ▁third -8.9674
896
+ ▁usually -8.9677
897
+ ▁gonna -8.9766
898
+ ▁brother -8.98787
899
+ ▁began -8.98976
900
+ ▁period -8.98995
901
+ ▁east -8.99025
902
+ ▁economic -8.99115
903
+ ▁increase -8.99307
904
+ ▁financial -8.9945
905
+ ▁respect -8.99809
906
+ ▁enjoy -8.99966
907
+ ▁christ -9.00516
908
+ ▁education -9.00598
909
+ ▁brought -9.00633
910
+ ▁organ -9.01726
911
+ ▁parents -9.01827
912
+ ▁policy -9.01966
913
+ ▁round -9.02048
914
+ ▁became -9.02163
915
+ ▁region -9.02293
916
+ ▁lady -9.02323
917
+ ▁discuss -9.0259
918
+ ▁single -9.03166
919
+ ▁early -9.03705
920
+ ▁couple -9.04362
921
+ ▁type -9.04527
922
+ ▁itself -9.04776
923
+ ▁serve -9.04968
924
+ ▁measure -9.05259
925
+ ▁husband -9.05591
926
+ ified -9.05626
927
+ ▁music -9.05632
928
+ ▁ground -9.05671
929
+ ▁companies -9.0573
930
+ ▁street -9.05869
931
+ ▁behind -9.06227
932
+ ▁value -9.06366
933
+ ▁therefore -9.06474
934
+ ▁police -9.06516
935
+ ▁complete -9.06974
936
+ ▁john -9.07328
937
+ ▁daughter -9.08041
938
+ ▁affect -9.08141
939
+ ▁perhaps -9.08236
940
+ ▁international -9.08363
941
+ ▁themselves -9.08495
942
+ ▁improve -9.08864
943
+ ▁condition -9.08867
944
+ ▁hotel -9.09553
945
+ ▁deliver -9.09771
946
+ ▁sense -9.09928
947
+ ▁relation -9.10001
948
+ ▁sorry -9.10303
949
+ ▁credit -9.10319
950
+ ▁effort -9.11405
951
+ ▁instead -9.11694
952
+ ▁york -9.12056
953
+ ▁united -9.1221
954
+ ▁partner -9.12363
955
+ ▁spoke -9.12654
956
+ ▁strange -9.12692
957
+ ▁everybody -9.15216
958
+ ▁horse -9.15579
959
+ ▁depend -9.16273
960
+ ▁subject -9.16932
961
+ ▁project -9.17118
962
+ ▁approach -9.17183
963
+ ▁involve -9.1723
964
+ ▁listen -9.17885
965
+ ▁draw -9.17972
966
+ ▁computer -9.18042
967
+ ▁married -9.18112
968
+ ▁record -9.18132
969
+ ▁happy -9.18541
970
+ ▁sudden -9.18597
971
+ ▁represent -9.18688
972
+ ▁somebody -9.1921
973
+ ▁correct -9.19762
974
+ ▁serious -9.19998
975
+ ▁decision -9.20599
976
+ ▁society -9.21113
977
+ ▁including -9.2118
978
+ ▁college -9.21197
979
+ ▁english -9.21531
980
+ ▁attack -9.21725
981
+ ▁perform -9.2217
982
+ ▁cross -9.22394
983
+ ▁accept -9.23292
984
+ ▁control -9.23361
985
+ ▁flow -9.23439
986
+ ▁although -9.23706
987
+ ▁drink -9.2433
988
+ ▁front -9.24346
989
+ ▁wrong -9.24545
990
+ ▁twi -9.24584
991
+ ▁according -9.24899
992
+ ▁slow -9.25313
993
+ ▁peace -9.25821
994
+ ▁amount -9.2596
995
+ ▁object -9.26398
996
+ ▁movie -9.27258
997
+ ▁benefit -9.27843
998
+ ▁yup -9.28352
999
+ ▁challenge -9.28654
1000
+ ▁private -9.28793
1001
+ ▁church -9.29014
1002
+ ▁wood -9.29369
1003
+ ▁field -9.29393
1004
+ ▁above -9.29623
1005
+ ▁ensure -9.30575
1006
+ ▁immediate -9.30949
1007
+ ▁figure -9.31677
1008
+ ▁foreign -9.32009
1009
+ ▁available -9.32064
1010
+ ▁insurance -9.32202
1011
+ ▁proposal -9.32661
1012
+ ▁doubt -9.32706
1013
+ ▁strength -9.33368
1014
+ ▁difference -9.33387
1015
+ ▁stood -9.33846
1016
+ ▁implement -9.34025
1017
+ ▁economy -9.34174
1018
+ ▁detail -9.34179
1019
+ ▁umhum -9.34831
1020
+ ▁restaurant -9.35371
1021
+ ▁collect -9.35827
1022
+ ▁global -9.36609
1023
+ ▁broke -9.36914
1024
+ q -9.69201
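Each vocabulary entry above pairs a SentencePiece piece with a score. The following is a minimal sketch of reading such entries, assuming the piece and score are whitespace-separated (the actual delimiter is not visible in this rendered diff) and that the score is a natural-log probability, as is usual for unigram models. The sample lines are copied from the listing; `parse_vocab` and `is_word_start` are hypothetical helper names.

```python
import math

# A few entries copied from the listing above (diff "+" prefix removed).
SAMPLE = """\
▁week -8.05084
ible -8.06954
▁government -8.17883
q -9.69201
"""

def parse_vocab(text):
    """Return (piece, score) pairs; splits on the last run of whitespace."""
    entries = []
    for line in text.splitlines():
        if not line.strip():
            continue
        piece, score = line.rsplit(None, 1)
        entries.append((piece, float(score)))
    return entries

def is_word_start(piece):
    """SentencePiece marks word-initial pieces with U+2581."""
    return piece.startswith("\u2581")

entries = parse_vocab(SAMPLE)
for piece, score in entries:
    kind = "word-initial" if is_word_start(piece) else "continuation"
    # exp(score) is a probability only under the natural-log assumption.
    print(f"{piece!r:16} {kind:13} score={score:.5f} p={math.exp(score):.2e}")
```

Pieces without the leading marker, such as `ible`, continue the current word, so a word like "possible" could be assembled from a word-initial piece followed by continuation pieces.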
model_config.yaml ADDED
@@ -0,0 +1,1209 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
+ sample_rate: 16000
+ log_prediction: true
+ ctc_reduction: mean_batch
+ train_ds:
+   manifest_filepath: /data/NeMo_ASR_SET/English/v2.0/train/tarred_audio_manifest.json
+   sample_rate: 16000
+   batch_size: 64
+   shuffle: true
+   num_workers: 8
+   pin_memory: true
+   use_start_end_token: false
+   trim_silence: false
+   max_duration: 20.0
+   min_duration: 0.1
+   shuffle_n: 2048
+   is_tarred: true
+   tarred_audio_filepaths: /data/NeMo_ASR_SET/English/v2.0/train/audio__OP_0..4095_CL_.tar
+ validation_ds:
+   manifest_filepath:
+   - /data/ASR/LibriSpeech/librispeech_withsp2/manifests/librivox-dev-other.json
+   - /data/ASR/LibriSpeech/librispeech_withsp2/manifests/librivox-dev-clean.json
+   - /data/ASR/LibriSpeech/librispeech_withsp2/manifests/librivox-test-other.json
+   - /data/ASR/LibriSpeech/librispeech_withsp2/manifests/librivox-test-clean.json
+   sample_rate: 16000
+   batch_size: 64
+   shuffle: false
+   num_workers: 8
+   pin_memory: true
+   use_start_end_token: false
+   is_tarred: false
+   tarred_audio_filepaths: na
+ test_ds:
+   manifest_filepath:
+   - /data/ASR/LibriSpeech/librispeech_withsp2/manifests/librivox-test-other.json
+   - /data/ASR/LibriSpeech/librispeech_withsp2/manifests/librivox-dev-clean.json
+   - /data/ASR/LibriSpeech/librispeech_withsp2/manifests/librivox-dev-other.json
+   - /data/ASR/LibriSpeech/librispeech_withsp2/manifests/librivox-test-clean.json
+   sample_rate: 16000
+   batch_size: 64
+   shuffle: false
+   num_workers: 8
+   pin_memory: true
+   use_start_end_token: false
+   is_tarred: false
+   tarred_audio_filepaths: na
+ tokenizer:
+   dir: /tokenizers/NeMo_ASR_SET/English/asr_set_2.0/tokenizer_spe_unigram_v1024/
+   type: bpe
+   model_path: nemo:977d4e24975b431ebb44f2dfcdea8778_tokenizer.model
+   vocab_path: nemo:29fcf21b9e6e4fb5b1ba0bb1efd4197e_vocab.txt
+   spe_tokenizer_vocab: nemo:cf241f7e4d904eaea46bb96f21dd0b1d_tokenizer.vocab
+ preprocessor:
+   _target_: nemo.collections.asr.modules.AudioToMelSpectrogramPreprocessor
+   sample_rate: 16000
+   normalize: per_feature
+   window_size: 0.025
+   window_stride: 0.01
+   window: hann
+   features: 80
+   n_fft: 512
+   log: true
+   frame_splicing: 1
+   dither: 1.0e-05
+   pad_to: 0
+   pad_value: 0.0
+ spec_augment:
+   _target_: nemo.collections.asr.modules.SpectrogramAugmentation
+   freq_masks: 2
+   time_masks: 5
+   freq_width: 27
+   time_width: 0.05
+ encoder:
+   _target_: nemo.collections.asr.modules.ConformerEncoder
+   feat_in: 80
+   feat_out: -1
+   n_layers: 16
+   d_model: 176
+   subsampling: striding
+   subsampling_factor: 4
+   subsampling_conv_channels: 176
+   ff_expansion_factor: 4
+   self_attention_model: rel_pos
+   n_heads: 4
+   att_context_size:
+   - -1
+   - -1
+   xscaling: true
+   untie_biases: true
+   pos_emb_max_len: 5000
+   conv_kernel_size: 31
+   dropout: 0.1
+   dropout_emb: 0.0
+   dropout_att: 0.1
+ decoder:
+   _target_: nemo.collections.asr.modules.ConvASRDecoder
+   feat_in: 176
+   num_classes: 1024
+   vocabulary:
99
+ - <unk>
100
+ - s
101
+ - ▁the
102
+ - ▁a
103
+ - t
104
+ - ▁to
105
+ - ▁and
106
+ - ▁i
107
+ - ▁of
108
+ - ''''
109
+ - ed
110
+ - ▁in
111
+ - d
112
+ - ing
113
+ - 'n'
114
+ - e
115
+ - ▁it
116
+ - ▁that
117
+ - ▁you
118
+ - 'y'
119
+ - er
120
+ - r
121
+ - ▁for
122
+ - m
123
+ - ▁is
124
+ - ▁he
125
+ - re
126
+ - ▁was
127
+ - ▁be
128
+ - p
129
+ - ly
130
+ - ▁so
131
+ - ▁we
132
+ - a
133
+ - g
134
+ - o
135
+ - ▁c
136
+ - b
137
+ - u
138
+ - ▁on
139
+ - ▁have
140
+ - ▁but
141
+ - ll
142
+ - ▁with
143
+ - ▁re
144
+ - or
145
+ - ▁s
146
+ - al
147
+ - ▁do
148
+ - ▁know
149
+ - ar
150
+ - ▁they
151
+ - ▁not
152
+ - ▁as
153
+ - ▁this
154
+ - in
155
+ - le
156
+ - ▁e
157
+ - ▁are
158
+ - ▁like
159
+ - c
160
+ - ▁uh
161
+ - ri
162
+ - ▁me
163
+ - ▁his
164
+ - ▁at
165
+ - l
166
+ - es
167
+ - ▁de
168
+ - ▁yeah
169
+ - ▁can
170
+ - k
171
+ - ▁or
172
+ - ▁my
173
+ - ▁all
174
+ - ▁had
175
+ - ▁there
176
+ - ▁will
177
+ - ▁one
178
+ - il
179
+ - ▁no
180
+ - ▁what
181
+ - en
182
+ - ck
183
+ - ▁b
184
+ - ▁f
185
+ - ce
186
+ - ch
187
+ - i
188
+ - ▁by
189
+ - ▁she
190
+ - ▁from
191
+ - ▁an
192
+ - ic
193
+ - ur
194
+ - ve
195
+ - w
196
+ - ter
197
+ - la
198
+ - ▁if
199
+ - ▁just
200
+ - th
201
+ - li
202
+ - ▁
203
+ - ▁her
204
+ - ▁um
205
+ - 'on'
206
+ - ation
207
+ - ▁w
208
+ - ▁would
209
+ - f
210
+ - te
211
+ - ▁st
212
+ - ▁go
213
+ - ir
214
+ - it
215
+ - ▁out
216
+ - ro
217
+ - ▁pa
218
+ - ▁were
219
+ - ▁g
220
+ - ▁t
221
+ - ion
222
+ - ▁think
223
+ - an
224
+ - ▁right
225
+ - ▁about
226
+ - se
227
+ - lo
228
+ - ent
229
+ - ▁up
230
+ - ment
231
+ - ate
232
+ - ▁when
233
+ - h
234
+ - ne
235
+ - ▁don
236
+ - ▁has
237
+ - ▁also
238
+ - ▁more
239
+ - ▁see
240
+ - ▁okay
241
+ - ▁their
242
+ - ▁your
243
+ - ge
244
+ - ▁who
245
+ - ▁well
246
+ - ▁co
247
+ - ▁which
248
+ - ▁some
249
+ - ▁se
250
+ - ▁time
251
+ - ▁ba
252
+ - ▁said
253
+ - ▁con
254
+ - ers
255
+ - ▁ra
256
+ - us
257
+ - de
258
+ - ra
259
+ - ▁him
260
+ - ▁our
261
+ - ▁been
262
+ - ▁fa
263
+ - ▁po
264
+ - ▁pro
265
+ - et
266
+ - x
267
+ - ▁la
268
+ - id
269
+ - ver
270
+ - ▁oh
271
+ - ▁ma
272
+ - v
273
+ - ▁now
274
+ - age
275
+ - ▁two
276
+ - ld
277
+ - ▁mo
278
+ - ▁how
279
+ - tion
280
+ - ▁people
281
+ - ive
282
+ - ▁other
283
+ - ng
284
+ - ity
285
+ - z
286
+ - ist
287
+ - ▁very
288
+ - ▁get
289
+ - ▁any
290
+ - ▁un
291
+ - ▁ro
292
+ - is
293
+ - ▁work
294
+ - ▁mean
295
+ - ▁them
296
+ - ▁lo
297
+ - vi
298
+ - ▁because
299
+ - ies
300
+ - ul
301
+ - as
302
+ - ad
303
+ - mp
304
+ - ▁bo
305
+ - '-'
306
+ - ▁then
307
+ - ▁good
308
+ - el
309
+ - nd
310
+ - ▁li
311
+ - ▁man
312
+ - ▁dis
313
+ - ▁could
314
+ - ▁ho
315
+ - at
316
+ - ol
317
+ - ▁bu
318
+ - ▁te
319
+ - ▁ha
320
+ - est
321
+ - me
322
+ - ▁say
323
+ - ru
324
+ - ke
325
+ - ▁sp
326
+ - ▁k
327
+ - able
328
+ - ▁su
329
+ - ▁sa
330
+ - ▁di
331
+ - ▁fi
332
+ - ance
333
+ - ▁really
334
+ - ▁over
335
+ - ▁even
336
+ - ry
337
+ - ▁us
338
+ - ▁ca
339
+ - ow
340
+ - ho
341
+ - ▁into
342
+ - ence
343
+ - mo
344
+ - ▁mi
345
+ - one
346
+ - qu
347
+ - ut
348
+ - lu
349
+ - ▁o
350
+ - ty
351
+ - ▁after
352
+ - ▁want
353
+ - ▁new
354
+ - ▁take
355
+ - ▁p
356
+ - ▁look
357
+ - ▁pre
358
+ - sh
359
+ - ▁day
360
+ - ▁should
361
+ - ▁th
362
+ - ▁need
363
+ - ▁cha
364
+ - co
365
+ - ▁much
366
+ - ▁where
367
+ - ▁d
368
+ - ant
369
+ - ▁fe
370
+ - ▁da
371
+ - ▁make
372
+ - om
373
+ - ▁did
374
+ - ▁le
375
+ - un
376
+ - ▁only
377
+ - im
378
+ - ▁these
379
+ - ff
380
+ - ti
381
+ - ish
382
+ - ▁ex
383
+ - ted
384
+ - ▁first
385
+ - he
386
+ - ig
387
+ - ▁vi
388
+ - ▁ri
389
+ - ▁en
390
+ - ▁com
391
+ - ated
392
+ - ▁than
393
+ - ma
394
+ - ▁way
395
+ - um
396
+ - ct
397
+ - end
398
+ - ight
399
+ - ▁here
400
+ - ▁ta
401
+ - ▁car
402
+ - ▁part
403
+ - ▁come
404
+ - ia
405
+ - ▁off
406
+ - ▁sc
407
+ - ▁ah
408
+ - am
409
+ - ▁tra
410
+ - ▁yes
411
+ - ▁back
412
+ - ture
413
+ - ful
414
+ - ▁pri
415
+ - ction
416
+ - ine
417
+ - ▁three
418
+ - ard
419
+ - ▁let
420
+ - pe
421
+ - ▁little
422
+ - ▁down
423
+ - mb
424
+ - ▁si
425
+ - ▁dr
426
+ - ▁mr
427
+ - ▁going
428
+ - ▁comp
429
+ - po
430
+ - ▁m
431
+ - ▁sta
432
+ - ▁gra
433
+ - day
434
+ - ▁many
435
+ - ian
436
+ - ta
437
+ - ▁long
438
+ - ▁pi
439
+ - ▁too
440
+ - ▁app
441
+ - ▁kind
442
+ - ous
443
+ - ci
444
+ - ▁ga
445
+ - ten
446
+ - nt
447
+ - ▁before
448
+ - ▁may
449
+ - ▁got
450
+ - man
451
+ - tic
452
+ - ition
453
+ - cu
454
+ - ugh
455
+ - tra
456
+ - ▁n
457
+ - ward
458
+ - ▁give
459
+ - ▁every
460
+ - ▁hi
461
+ - ting
462
+ - ▁exp
463
+ - ▁those
464
+ - ▁hu
465
+ - ot
466
+ - ▁something
467
+ - ▁lot
468
+ - ▁still
469
+ - ▁ne
470
+ - na
471
+ - ise
472
+ - pp
473
+ - ▁most
474
+ - ▁gu
475
+ - ▁state
476
+ - ▁actually
477
+ - ▁such
478
+ - ▁bi
479
+ - ▁never
480
+ - tain
481
+ - ▁great
482
+ - ▁through
483
+ - ▁al
484
+ - 'no'
485
+ - ▁mar
486
+ - ▁year
487
+ - ach
488
+ - les
489
+ - ▁school
490
+ - ally
491
+ - ial
492
+ - ha
493
+ - ▁old
494
+ - ▁made
495
+ - ary
496
+ - ▁ar
497
+ - ▁years
498
+ - ▁help
499
+ - ▁per
500
+ - ving
501
+ - ical
502
+ - ther
503
+ - ▁does
504
+ - ac
505
+ - ca
506
+ - ▁must
507
+ - di
508
+ - ▁own
509
+ - ▁ru
510
+ - ▁things
511
+ - ▁hand
512
+ - ▁thing
513
+ - ▁high
514
+ - ▁last
515
+ - go
516
+ - ▁sh
517
+ - ▁under
518
+ - ▁four
519
+ - ▁place
520
+ - ations
521
+ - ▁sure
522
+ - mi
523
+ - nce
524
+ - ▁am
525
+ - for
526
+ - ness
527
+ - ▁name
528
+ - ▁five
529
+ - ound
530
+ - ▁op
531
+ - ▁cons
532
+ - ▁ph
533
+ - ▁same
534
+ - row
535
+ - ven
536
+ - ph
537
+ - ite
538
+ - ▁pe
539
+ - j
540
+ - ▁sha
541
+ - ▁friend
542
+ - ▁wi
543
+ - ▁call
544
+ - ▁european
545
+ - ▁h
546
+ - ect
547
+ - ress
548
+ - ▁live
549
+ - port
550
+ - ▁mhm
551
+ - ▁house
552
+ - ie
553
+ - ni
554
+ - ▁plan
555
+ - ▁jo
556
+ - ▁play
557
+ - side
558
+ - ▁va
559
+ - min
560
+ - ious
561
+ - ▁life
562
+ - ▁du
563
+ - ▁ti
564
+ - ▁six
565
+ - ▁men
566
+ - ▁again
567
+ - ▁thank
568
+ - ▁talk
569
+ - par
570
+ - ▁home
571
+ - op
572
+ - ▁both
573
+ - ▁why
574
+ - ▁put
575
+ - ▁another
576
+ - nc
577
+ - ▁being
578
+ - mit
579
+ - ▁came
580
+ - led
581
+ - ▁fo
582
+ - ▁end
583
+ - ▁member
584
+ - ative
585
+ - ▁thought
586
+ - ▁tri
587
+ - iv
588
+ - our
589
+ - red
590
+ - ▁went
591
+ - lic
592
+ - ▁find
593
+ - ▁pu
594
+ - land
595
+ - ▁start
596
+ - ▁far
597
+ - ▁eu
598
+ - ▁imp
599
+ - ▁always
600
+ - ▁ju
601
+ - ▁wa
602
+ - ▁person
603
+ - ▁singapore
604
+ - ap
605
+ - ▁show
606
+ - ▁chi
607
+ - ▁ten
608
+ - ▁eight
609
+ - ▁while
610
+ - ▁point
611
+ - ▁y
612
+ - ▁ja
613
+ - ▁ya
614
+ - ling
615
+ - ctor
616
+ - ▁use
617
+ - ▁acc
618
+ - ▁world
619
+ - ▁pay
620
+ - ▁read
621
+ - va
622
+ - vo
623
+ - ▁change
624
+ - ▁u
625
+ - ▁pl
626
+ - ▁sw
627
+ - ▁war
628
+ - ▁might
629
+ - nk
630
+ - ments
631
+ - and
632
+ - ▁different
633
+ - ▁dec
634
+ - cent
635
+ - ▁ste
636
+ - ▁better
637
+ - ▁fun
638
+ - ▁month
639
+ - ship
640
+ - ton
641
+ - ▁tell
642
+ - ▁twenty
643
+ - ▁commission
644
+ - ▁exc
645
+ - ▁miss
646
+ - if
647
+ - ▁love
648
+ - ▁money
649
+ - ▁found
650
+ - ▁hundred
651
+ - gg
652
+ - ▁add
653
+ - ▁real
654
+ - ities
655
+ - ▁na
656
+ - ▁pass
657
+ - ▁didn
658
+ - ▁v
659
+ - ▁feel
660
+ - ▁week
661
+ - ▁win
662
+ - ible
663
+ - ▁try
664
+ - ▁upon
665
+ - ba
666
+ - ▁interest
667
+ - ▁inter
668
+ - son
669
+ - line
670
+ - ▁ob
671
+ - ▁boy
672
+ - ▁big
673
+ - ▁used
674
+ - ▁seven
675
+ - ▁away
676
+ - ▁family
677
+ - less
678
+ - ▁ki
679
+ - ber
680
+ - ▁around
681
+ - ▁turn
682
+ - ▁anything
683
+ - ▁care
684
+ - ▁young
685
+ - ▁guess
686
+ - ▁happen
687
+ - ▁course
688
+ - ▁agree
689
+ - ▁support
690
+ - ▁conf
691
+ - ual
692
+ - ▁number
693
+ - ▁trans
694
+ - ating
695
+ - ▁mister
696
+ - ▁hard
697
+ - ▁watch
698
+ - ft
699
+ - ▁next
700
+ - ▁sea
701
+ - ▁open
702
+ - ▁without
703
+ - duc
704
+ - gra
705
+ - ak
706
+ - ▁cap
707
+ - ▁cre
708
+ - hi
709
+ - ▁government
710
+ - ▁vo
711
+ - ▁between
712
+ - ▁each
713
+ - ▁ve
714
+ - ▁though
715
+ - ▁country
716
+ - ▁few
717
+ - ▁once
718
+ - ▁'
719
+ - ▁head
720
+ - ▁free
721
+ - ▁mu
722
+ - ▁maybe
723
+ - ▁act
724
+ - ▁night
725
+ - ▁thousand
726
+ - ▁face
727
+ - ▁uhhuh
728
+ - ▁keep
729
+ - ▁nine
730
+ - ▁close
731
+ - ▁case
732
+ - ▁che
733
+ - ▁against
734
+ - ▁done
735
+ - ▁ever
736
+ - ▁law
737
+ - ▁believe
738
+ - ▁public
739
+ - ▁room
740
+ - ▁sub
741
+ - ▁order
742
+ - ▁important
743
+ - ient
744
+ - ▁el
745
+ - ▁children
746
+ - ▁second
747
+ - ▁bri
748
+ - ▁business
749
+ - ▁hope
750
+ - ▁move
751
+ - fa
752
+ - ▁however
753
+ - ▁follow
754
+ - ▁able
755
+ - ▁word
756
+ - ▁yet
757
+ - ▁fla
758
+ - ▁stand
759
+ - ize
760
+ - ▁je
761
+ - ▁service
762
+ - ▁nothing
763
+ - ▁report
764
+ - ▁called
765
+ - ▁grow
766
+ - ▁continue
767
+ - ▁issue
768
+ - ▁since
769
+ - ▁book
770
+ - ▁lu
771
+ - ▁qui
772
+ - ▁develop
773
+ - ▁gen
774
+ - ▁certain
775
+ - light
776
+ - ▁cor
777
+ - ▁small
778
+ - ▁took
779
+ - ▁question
780
+ - ▁whole
781
+ - ▁problem
782
+ - ▁side
783
+ - ▁child
784
+ - ▁full
785
+ - ▁best
786
+ - ▁mm
787
+ - ▁probably
788
+ - fi
789
+ - ▁qua
790
+ - ▁sur
791
+ - ▁market
792
+ - ▁left
793
+ - ▁everything
794
+ - ▁during
795
+ - ▁understand
796
+ - ook
797
+ - wa
798
+ - ▁cent
799
+ - ▁water
800
+ - ▁quite
801
+ - ▁leave
802
+ - ▁himself
803
+ - ip
804
+ - ▁near
805
+ - ▁saw
806
+ - ▁together
807
+ - ▁large
808
+ - ▁having
809
+ - ▁already
810
+ - ▁invest
811
+ - ▁pretty
812
+ - ▁direct
813
+ - ▁hour
814
+ - ▁fact
815
+ - way
816
+ - ▁run
817
+ - ▁bra
818
+ - ▁clear
819
+ - ▁fra
820
+ - ▁area
821
+ - ▁union
822
+ - ▁enough
823
+ - ▁consider
824
+ - ▁lead
825
+ - ▁remain
826
+ - ▁president
827
+ - ▁system
828
+ - ▁def
829
+ - ▁stuff
830
+ - ▁food
831
+ - ▁job
832
+ - ▁heard
833
+ - ▁err
834
+ - ▁mind
835
+ - ▁rest
836
+ - ▁speak
837
+ - ▁asked
838
+ - ator
839
+ - ▁half
840
+ - ▁father
841
+ - com
842
+ - ▁less
843
+ - ▁arm
844
+ - ▁human
845
+ - ency
846
+ - ▁matter
847
+ - ▁group
848
+ - ▁girl
849
+ - ▁current
850
+ - ▁main
851
+ - ttle
852
+ - ▁later
853
+ - ▁learn
854
+ - ▁strong
855
+ - ▁sign
856
+ - ▁check
857
+ - ▁light
858
+ - ▁else
859
+ - ▁true
860
+ - ▁term
861
+ - qui
862
+ - ▁minute
863
+ - ▁spec
864
+ - ▁return
865
+ - ▁answer
866
+ - ▁reason
867
+ - ▁count
868
+ - ▁shall
869
+ - ▁communi
870
+ - ▁travel
871
+ - ▁wait
872
+ - ▁provide
873
+ - ▁low
874
+ - ▁mother
875
+ - ▁expect
876
+ - ▁cause
877
+ - ▁line
878
+ - ▁general
879
+ - lf
880
+ - ▁getting
881
+ - ▁parliament
882
+ - ▁bank
883
+ - ▁company
884
+ - ▁stop
885
+ - cause
886
+ - ▁power
887
+ - ▁gi
888
+ - ▁europe
889
+ - ▁moment
890
+ - ▁among
891
+ - ▁walk
892
+ - ▁allow
893
+ - ▁idea
894
+ - ▁office
895
+ - ▁town
896
+ - ▁cannot
897
+ - ▁countries
898
+ - ▁become
899
+ - ▁appear
900
+ - ▁present
901
+ - ▁bring
902
+ - ▁least
903
+ - ▁almost
904
+ - ▁kids
905
+ - ▁remember
906
+ - ▁include
907
+ - ▁short
908
+ - ▁sometimes
909
+ - ▁game
910
+ - ▁level
911
+ - ▁exactly
912
+ - ▁particular
913
+ - ▁social
914
+ - ▁land
915
+ - ▁woman
916
+ - ▁north
917
+ - ▁nice
918
+ - ▁concern
919
+ - ▁sort
920
+ - ▁effect
921
+ - ▁national
922
+ - ▁several
923
+ - ▁safe
924
+ - ▁until
925
+ - ▁further
926
+ - ▁cost
927
+ - ▁wonder
928
+ - ▁whether
929
+ - ▁either
930
+ - ▁future
931
+ - ▁pra
932
+ - ▁council
933
+ - ▁knew
934
+ - ▁common
935
+ - ▁south
936
+ - ▁making
937
+ - ▁morning
938
+ - ▁process
939
+ - ▁situation
940
+ - ▁white
941
+ - ▁result
942
+ - ▁suppose
943
+ - ▁employ
944
+ - ▁political
945
+ - ▁program
946
+ - ▁along
947
+ - ▁women
948
+ - ▁ski
949
+ - ▁court
950
+ - ▁please
951
+ - ▁shi
952
+ - ▁possible
953
+ - ▁protect
954
+ - ▁experience
955
+ - ▁definitely
956
+ - ▁require
957
+ - ▁account
958
+ - ▁myself
959
+ - ▁black
960
+ - ▁example
961
+ - ▁america
962
+ - ▁thirty
963
+ - ▁student
964
+ - ▁view
965
+ - ▁product
966
+ - ▁wife
967
+ - ▁health
968
+ - ▁major
969
+ - ▁difficult
970
+ - ▁death
971
+ - ▁visit
972
+ - ▁across
973
+ - ▁receive
974
+ - ▁voice
975
+ - ▁citizen
976
+ - ▁regard
977
+ - ▁author
978
+ - ▁treat
979
+ - ▁especially
980
+ - ▁local
981
+ - ▁taking
982
+ - ▁information
983
+ - ▁seemed
984
+ - ▁success
985
+ - ability
986
+ - ▁break
987
+ - ▁whatever
988
+ - ▁security
989
+ - ▁address
990
+ - ▁felt
991
+ - ▁fifty
992
+ - ▁million
993
+ - ▁third
994
+ - ▁usually
995
+ - ▁gonna
996
+ - ▁brother
997
+ - ▁began
998
+ - ▁period
999
+ - ▁east
1000
+ - ▁economic
1001
+ - ▁increase
1002
+ - ▁financial
1003
+ - ▁respect
1004
+ - ▁enjoy
1005
+ - ▁christ
1006
+ - ▁education
1007
+ - ▁brought
1008
+ - ▁organ
1009
+ - ▁parents
1010
+ - ▁policy
1011
+ - ▁round
1012
+ - ▁became
1013
+ - ▁region
1014
+ - ▁lady
1015
+ - ▁discuss
1016
+ - ▁single
1017
+ - ▁early
1018
+ - ▁couple
1019
+ - ▁type
1020
+ - ▁itself
1021
+ - ▁serve
1022
+ - ▁measure
1023
+ - ▁husband
1024
+ - ified
1025
+ - ▁music
1026
+ - ▁ground
1027
+ - ▁companies
1028
+ - ▁street
1029
+ - ▁behind
1030
+ - ▁value
1031
+ - ▁therefore
1032
+ - ▁police
1033
+ - ▁complete
1034
+ - ▁john
1035
+ - ▁daughter
1036
+ - ▁affect
1037
+ - ▁perhaps
1038
+ - ▁international
1039
+ - ▁themselves
1040
+ - ▁improve
1041
+ - ▁condition
1042
+ - ▁hotel
1043
+ - ▁deliver
1044
+ - ▁sense
1045
+ - ▁relation
1046
+ - ▁sorry
1047
+ - ▁credit
1048
+ - ▁effort
1049
+ - ▁instead
1050
+ - ▁york
1051
+ - ▁united
1052
+ - ▁partner
1053
+ - ▁spoke
1054
+ - ▁strange
1055
+ - ▁everybody
1056
+ - ▁horse
1057
+ - ▁depend
1058
+ - ▁subject
1059
+ - ▁project
1060
+ - ▁approach
1061
+ - ▁involve
1062
+ - ▁listen
1063
+ - ▁draw
1064
+ - ▁computer
1065
+ - ▁married
1066
+ - ▁record
1067
+ - ▁happy
1068
+ - ▁sudden
1069
+ - ▁represent
1070
+ - ▁somebody
1071
+ - ▁correct
1072
+ - ▁serious
1073
+ - ▁decision
1074
+ - ▁society
1075
+ - ▁including
1076
+ - ▁college
1077
+ - ▁english
1078
+ - ▁attack
1079
+ - ▁perform
1080
+ - ▁cross
1081
+ - ▁accept
1082
+ - ▁control
1083
+ - ▁flow
1084
+ - ▁although
1085
+ - ▁drink
1086
+ - ▁front
1087
+ - ▁wrong
1088
+ - ▁twi
1089
+ - ▁according
1090
+ - ▁slow
1091
+ - ▁peace
1092
+ - ▁amount
1093
+ - ▁object
1094
+ - ▁movie
1095
+ - ▁benefit
1096
+ - ▁yup
1097
+ - ▁challenge
1098
+ - ▁private
1099
+ - ▁church
1100
+ - ▁wood
1101
+ - ▁field
1102
+ - ▁above
1103
+ - ▁ensure
1104
+ - ▁immediate
1105
+ - ▁figure
1106
+ - ▁foreign
1107
+ - ▁available
1108
+ - ▁insurance
1109
+ - ▁proposal
1110
+ - ▁doubt
1111
+ - ▁strength
1112
+ - ▁difference
1113
+ - ▁stood
1114
+ - ▁implement
1115
+ - ▁economy
1116
+ - ▁detail
1117
+ - ▁umhum
1118
+ - ▁restaurant
1119
+ - ▁collect
1120
+ - ▁global
1121
+ - ▁broke
1122
+ - q
+ optim:
+   name: adamw
+   lr: 2.0
+   betas:
+   - 0.9
+   - 0.98
+   weight_decay: 0
+   sched:
+     name: NoamAnnealing
+     d_model: 176
+     warmup_steps: 10000
+     warmup_ratio: null
+     min_lr: 1.0e-06
+ target: nemo.collections.asr.models.ctc_bpe_models.EncDecCTCModelBPE
+ nemo_version: 2.0.0
+ decoding:
+   strategy: greedy_batch
+   preserve_alignments: null
+   compute_timestamps: null
+   word_seperator: ' '
+   ctc_timestamp_type: all
+   batch_dim_index: 0
+   greedy:
+     preserve_alignments: false
+     compute_timestamps: false
+     preserve_frame_confidence: false
+     confidence_method_cfg:
+       name: entropy
+       entropy_type: tsallis
+       alpha: 0.33
+       entropy_norm: exp
+       temperature: DEPRECATED
+   beam:
+     beam_size: 4
+     search_type: default
+     preserve_alignments: false
+     compute_timestamps: false
+     return_best_hypothesis: true
+     beam_alpha: 1.0
+     beam_beta: 0.0
+     kenlm_path: null
+     flashlight_cfg:
+       lexicon_path: null
+       boost_path: null
+       beam_size_token: 16
+       beam_threshold: 20.0
+       unk_weight: -.inf
+       sil_weight: 0.0
+     pyctcdecode_cfg:
+       beam_prune_logp: -10.0
+       token_min_logp: -5.0
+       prune_history: false
+       hotwords: null
+       hotword_weight: 10.0
+   wfst:
+     beam_size: 4
+     search_type: riva
+     return_best_hypothesis: true
+     preserve_alignments: false
+     compute_timestamps: false
+     decoding_mode: nbest
+     open_vocabulary_decoding: false
+     beam_width: 10.0
+     lm_weight: 1.0
+     device: cuda
+     arpa_lm_path: null
+     wfst_lm_path: null
+     riva_decoding_cfg: {}
+     k2_decoding_cfg:
+       search_beam: 20.0
+       output_beam: 10.0
+       min_active_states: 30
+       max_active_states: 10000
+   confidence_cfg:
+     preserve_frame_confidence: false
+     preserve_token_confidence: false
+     preserve_word_confidence: false
+     exclude_blank: true
+     aggregation: min
+     tdt_include_duration: false
+     method_cfg:
+       name: entropy
+       entropy_type: tsallis
+       alpha: 0.33
+       entropy_norm: exp
+       temperature: DEPRECATED
+   temperature: 1.0
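The `optim` section pairs `lr: 2.0` with a `NoamAnnealing` schedule, so the value 2.0 acts as a multiplier and is never used directly as a learning rate. Below is a minimal sketch of the standard Noam formula evaluated with the values from this config (`d_model: 176`, `warmup_steps: 10000`, `min_lr: 1.0e-06`). The function `noam_lr` is a hypothetical helper, and NeMo's own implementation may differ in details such as how the `min_lr` floor is applied; here it is assumed to apply only after warmup.

```python
def noam_lr(step, base_lr=2.0, d_model=176, warmup_steps=10000, min_lr=1.0e-6):
    """Standard Noam schedule: linear warmup, then inverse-square-root decay."""
    step = max(step, 1)  # avoid 0 ** -0.5
    scale = d_model ** -0.5 * min(step ** -0.5, step * warmup_steps ** -1.5)
    lr = base_lr * scale
    # Assumption: the floor is only enforced once warmup has finished.
    if step > warmup_steps:
        lr = max(lr, min_lr)
    return lr

for step in (1, 1000, 10000, 100000):
    print(f"step {step:>6}: lr = {noam_lr(step):.3e}")
```

Under these assumptions the rate peaks at the end of warmup at 2.0 / (sqrt(176) * sqrt(10000)), roughly 1.5e-3, which explains why the nominal `lr` can be as large as 2.0.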
model_weights.ckpt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:be934566674bd3b815a8a72f12b92087a42e7a4d7abbf1840bc46e870bed31a3
+ size 52969010
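The `model_weights.ckpt` entry is a Git LFS pointer, not the checkpoint itself: three `key value` lines giving the spec version, the SHA-256 of the real file, and its size in bytes. A minimal sketch of reading those fields follows; `parse_lfs_pointer` is a hypothetical helper, and the pointer text is copied from the diff above with the `+` prefix removed.

```python
POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:be934566674bd3b815a8a72f12b92087a42e7a4d7abbf1840bc46e870bed31a3
size 52969010
"""

def parse_lfs_pointer(text):
    """Split each 'key value' line, then break the oid into algorithm and digest."""
    fields = dict(
        line.split(" ", 1) for line in text.splitlines() if line.strip()
    )
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size_bytes": int(fields["size"]),
    }

info = parse_lfs_pointer(POINTER)
print(info["hash_algo"], info["digest"][:12], f"{info['size_bytes'] / 2**20:.1f} MiB")
```

After downloading the real checkpoint, its size and SHA-256 digest can be compared against `size_bytes` and `digest` to confirm that the transfer completed correctly.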