---
license: mit
language:
- en
pipeline_tag: automatic-speech-recognition
---

# About

This model was created to support experiments for evaluating phonetic transcription with the Buckeye corpus as part of https://github.com/ginic/multipa/tree/buckeye_experiments.
This is a version of facebook/wav2vec2-large-xlsr-53 fine-tuned on a specific subset of the Buckeye corpus.
For details about specific model parameters, see the config.json here or the training scripts in scripts/buckeye_experiments on the `buckeye_experiments` branch of the GitHub repository.

# Experiment Details

These experiments keep the total amount of data equal to half the training data with the gender split 50/50, but further exclude certain speakers completely using the `--speaker_restriction` argument. This argument lets us restrict which speakers are included in the training data in any way. For the purposes of these experiments, we focus on the age demographic of the speakers.

For reference, the speakers included in the training data and their demographics are listed below, where the speaker age range 'y' means under 30 and 'o' means over 40:

| speaker_id | speaker_gender | speaker_age_range |
| ---------- | -------------- | ----------------- |
| S01 | f | y |
| S04 | f | y |
| S08 | f | y |
| S09 | f | y |
| S12 | f | y |
| S21 | f | y |
| S02 | f | o |
| S05 | f | o |
| S07 | f | o |
| S14 | f | o |
| S16 | f | o |
| S17 | f | o |
| S06 | m | y |
| S11 | m | y |
| S13 | m | y |
| S15 | m | y |
| S28 | m | y |
| S30 | m | y |
| S03 | m | o |
| S10 | m | o |
| S19 | m | o |
| S22 | m | o |
| S24 | m | o |

Goals:
- Determine how the variety of speakers in the training data affects performance

Params to vary:
- training seed (`--train_seed`)
- demographic makeup of the training data by age, using `--speaker_restriction`
    - Experiments `young_only`: only individuals under 30, S01 S04 S08 S09 S12 S21 S06 S11 S13 S15 S28 S30
    - Experiments `old_only`: only individuals over 40, S02 S05 S07 S14 S16 S17 S03 S10 S19 S22 S24
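
The speaker lists for the `young_only` and `old_only` experiments follow directly from the demographics table above. The snippet below is a minimal sketch of that filtering step, not the repository's actual implementation: the `SPEAKERS` list and the `speakers_in_age_range` helper are hypothetical names introduced here for illustration, and the space-separated output only mirrors how the IDs are listed in this card, which may differ from the exact format that `--speaker_restriction` expects.

```python
# Speaker demographics copied from the table above:
# (speaker_id, speaker_gender, speaker_age_range)
SPEAKERS = [
    ("S01", "f", "y"), ("S04", "f", "y"), ("S08", "f", "y"),
    ("S09", "f", "y"), ("S12", "f", "y"), ("S21", "f", "y"),
    ("S02", "f", "o"), ("S05", "f", "o"), ("S07", "f", "o"),
    ("S14", "f", "o"), ("S16", "f", "o"), ("S17", "f", "o"),
    ("S06", "m", "y"), ("S11", "m", "y"), ("S13", "m", "y"),
    ("S15", "m", "y"), ("S28", "m", "y"), ("S30", "m", "y"),
    ("S03", "m", "o"), ("S10", "m", "o"), ("S19", "m", "o"),
    ("S22", "m", "o"), ("S24", "m", "o"),
]


def speakers_in_age_range(age_range):
    """Return speaker IDs whose age range matches, in table order.

    Hypothetical helper: 'y' means under 30, 'o' means over 40.
    """
    return [sid for sid, _gender, age in SPEAKERS if age == age_range]


young_only = speakers_in_age_range("y")
old_only = speakers_in_age_range("o")

# Print the IDs space-separated, as they are listed in this card.
print("young_only:", " ".join(young_only))
print("old_only:", " ".join(old_only))
```

Keeping the demographics in one place and deriving each restriction from it makes it easier to check that the experiment lists stay consistent with the table.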