Mirco committed on
Commit
7e9e1bd
·
1 Parent(s): 0612491

Update README.md

Files changed (1)
  1. README.md +5 -0
README.md CHANGED
@@ -30,6 +30,9 @@ The model uses the ECAPA-TDNN architecture that has previously been used for spe
30
  more fully connected hidden layers after the embedding layer, and cross-entropy loss was used for training.
31
  We observed that this improved the performance of extracted utterance embeddings for downstream tasks.
32
 
 
 
 
33
  The model can classify a speech utterance according to the language spoken.
34
  It covers 107 different languages (
35
  Abkhazian,
@@ -199,6 +202,8 @@ print(emb.shape)
199
  ```
200
  To perform inference on the GPU, add `run_opts={"device":"cuda"}` when calling the `from_hparams` method.
201
 
 
 
202
 
203
  #### Limitations and bias
204
 
 
30
  more fully connected hidden layers after the embedding layer, and cross-entropy loss was used for training.
31
  We observed that this improved the performance of extracted utterance embeddings for downstream tasks.
32
 
33
+ The system is trained with recordings sampled at 16kHz (single channel).
34
+ The code will automatically normalize your audio (i.e., resampling + mono channel selection) when calling *classify_file* if needed.
35
+
36
  The model can classify a speech utterance according to the language spoken.
37
  It covers 107 different languages (
38
  Abkhazian,
 
202
  ```
203
  To perform inference on the GPU, add `run_opts={"device":"cuda"}` when calling the `from_hparams` method.
204
 
205
+ The system is trained with recordings sampled at 16kHz (single channel).
206
+ The code will automatically normalize your audio (i.e., resampling + mono channel selection) when calling *classify_file* if needed. Make sure your input tensor is compliant with the expected sampling rate if you use *encode_batch* and *classify_batch*.
207
 
208
  #### Limitations and bias
209
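
The added README lines say that *classify_file* normalizes audio automatically, while *encode_batch* and *classify_batch* expect input that is already single-channel at 16 kHz. As a rough illustration of what "compliant" means, the sketch below downmixes to mono and resamples to 16 kHz in plain Python. The helper names (`to_mono`, `resample_linear`, `prepare_for_batch`) are hypothetical and not part of the library. Averaging the channels and interpolating linearly are simplifying assumptions chosen for readability, not the normalization the library itself performs. In practice a proper band-limited resampler would be preferable.

```python
TARGET_SR = 16_000  # sampling rate stated in the README lines added by this commit


def to_mono(channels):
    """Average equally long channels into one channel (assumed downmix strategy)."""
    if not channels:
        raise ValueError("need at least one channel")
    n = len(channels[0])
    if any(len(ch) != n for ch in channels):
        raise ValueError("all channels must have the same length")
    return [sum(frame) / len(channels) for frame in zip(*channels)]


def resample_linear(samples, orig_sr, target_sr=TARGET_SR):
    """Resample by linear interpolation (illustrative only, no anti-aliasing filter)."""
    if orig_sr <= 0 or target_sr <= 0:
        raise ValueError("sampling rates must be positive")
    if orig_sr == target_sr or len(samples) < 2:
        return list(samples)
    n_out = max(1, round(len(samples) * target_sr / orig_sr))
    step = orig_sr / target_sr  # distance between output samples, in input samples
    last = len(samples) - 1
    out = []
    for i in range(n_out):
        pos = min(i * step, last)  # clamp so we never read past the final sample
        lo = int(pos)
        hi = min(lo + 1, last)
        frac = pos - lo
        out.append(samples[lo] * (1.0 - frac) + samples[hi] * frac)
    return out


def prepare_for_batch(channels, orig_sr):
    """Return a mono, 16 kHz list of samples from multi-channel input."""
    return resample_linear(to_mono(channels), orig_sr)
```

The resulting list of samples would still need to be wrapped in a tensor with a batch dimension before being passed to *encode_batch* or *classify_batch*. That step is omitted here.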