blackstone committed on
Commit cfd85fd · 1 Parent(s): ab93251

Update README.md

  1. README.md +8 -8
README.md CHANGED
@@ -24,14 +24,13 @@ pipeline_tag: audio-classification
24
  <iframe src="https://ghbtns.com/github-btn.html?user=speechbrain&repo=speechbrain&type=star&count=true&size=large&v=2" frameborder="0" scrolling="0" width="170" height="30" title="GitHub"></iframe>
25
  <br/><br/>
26
 
27
- # Speaker Verification with ECAPA-TDNN embeddings on CNCeleb
28
 
29
- This repository provides all the necessary tools to perform speaker verification with a pretrained ECAPA-TDNN model using SpeechBrain.
30
  The system can be used to extract speaker embeddings as well.
31
  It is trained on CNCeleb1 + CNCeleb2 training data.
32
 
33
- For a better experience, we encourage you to learn more about
34
- [SpeechBrain](https://speechbrain.github.io). The model performance on CNCeleb1-test set(Cleaned) is:
35
 
36
  | Release | EER(%) | MinDCF(p=0.01) |
37
  |:-------------:|:--------------:|:--------------:|
@@ -41,15 +40,16 @@ For a better experience, we encourage you to learn more about
41
  ## Pipeline description
42
 
43
  This system is composed of an ECAPA-TDNN model. It is a combination of convolutional and residual blocks. The embeddings are extracted using attentive statistical pooling. The system is trained with Additive Margin Softmax Loss. Speaker Verification is performed using cosine distance between speaker embeddings.
44
- You can find our training results (models, logs, etc) [here](https://drive.google.com/drive/folders/1-ahC1xeyPinAHp2oAohL-02smNWO41Cc?usp=sharing).
45
 
46
  ### Compute your speaker embeddings
47
 
48
  ```python
49
  import torchaudio
50
  from speechbrain.pretrained import EncoderClassifier
51
- classifier = EncoderClassifier.from_hparams(source="speechbrain/spkrec-ecapa-voxceleb")
52
- signal, fs =torchaudio.load('tests/samples/ASR/spk1_snt1.wav')
 
53
  embeddings = classifier.encode_batch(signal)
54
  ```
55
  The system is trained with recordings sampled at 16kHz (single channel).
@@ -59,7 +59,7 @@ The code will automatically normalize your audio (i.e., resampling + mono channe
59
 
60
  ```python
61
  from speechbrain.pretrained import SpeakerRecognition
62
- verification = SpeakerRecognition.from_hparams(source="speechbrain/spkrec-ecapa-voxceleb", savedir="pretrained_models/spkrec-ecapa-voxceleb")
63
  score, prediction = verification.verify_files("tests/samples/ASR/spk1_snt1.wav", "tests/samples/ASR/spk2_snt1.wav") # Different Speakers
64
  score, prediction = verification.verify_files("tests/samples/ASR/spk1_snt1.wav", "tests/samples/ASR/spk1_snt2.wav") # Same Speaker
65
  ```
 
24
  <iframe src="https://ghbtns.com/github-btn.html?user=speechbrain&repo=speechbrain&type=star&count=true&size=large&v=2" frameborder="0" scrolling="0" width="170" height="30" title="GitHub"></iframe>
25
  <br/><br/>
26
 
27
+ # Speaker Verification with ECAPA-TDNN on CNCeleb
28
 
29
+ This repository provides a pretrained ECAPA-TDNN model using SpeechBrain.
30
  The system can be used to extract speaker embeddings as well.
31
  It is trained on CNCeleb1 + CNCeleb2 training data.
32
 
33
+ The model performance on the CNCeleb1-test set (Cleaned) is:
 
34
 
35
  | Release | EER(%) | MinDCF(p=0.01) |
36
  |:-------------:|:--------------:|:--------------:|
 
40
  ## Pipeline description
41
 
42
  This system is composed of an ECAPA-TDNN model. It is a combination of convolutional and residual blocks. The embeddings are extracted using attentive statistical pooling. The system is trained with Additive Margin Softmax Loss. Speaker Verification is performed using cosine distance between speaker embeddings.
43
+ You can find our training results (models, logs, etc) [here]().
44
 
45
  ### Compute your speaker embeddings
46
 
47
  ```python
48
  import torchaudio
49
  from speechbrain.pretrained import EncoderClassifier
50
+ classifier = EncoderClassifier.from_hparams(source="blackstone/spkrec-ecapa-cnceleb")
51
+
52
+ signal, fs = torchaudio.load('tests/samples/ASR/spk1_snt1.wav')
53
  embeddings = classifier.encode_batch(signal)
54
  ```
55
  The system is trained with recordings sampled at 16kHz (single channel).
 
59
 
60
  ```python
61
  from speechbrain.pretrained import SpeakerRecognition
62
+ verification = SpeakerRecognition.from_hparams(source="blackstone/spkrec-ecapa-cnceleb", savedir="pretrained_models/spkrec-ecapa-cnceleb")
63
  score, prediction = verification.verify_files("tests/samples/ASR/spk1_snt1.wav", "tests/samples/ASR/spk2_snt1.wav") # Different Speakers
64
  score, prediction = verification.verify_files("tests/samples/ASR/spk1_snt1.wav", "tests/samples/ASR/spk1_snt2.wav") # Same Speaker
65
  ```
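
The pipeline description in this README says that verification is performed using cosine distance between speaker embeddings. As a rough illustration of that scoring step, here is a minimal pure-Python sketch. It is independent of SpeechBrain's own implementation; the helper names `cosine_similarity` and `same_speaker`, the toy 3-dimensional vectors, and the `0.25` decision threshold are all assumptions made for the example, not values taken from this model.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine similarity between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_a * norm_b)


def same_speaker(emb1, emb2, threshold=0.25):
    """Score two embeddings and compare the score to a threshold.

    The threshold is a hypothetical value chosen for illustration;
    in practice it would be tuned on a development set.
    """
    score = cosine_similarity(emb1, emb2)
    return score, score > threshold


# Toy embeddings (real ECAPA-TDNN embeddings have far more dimensions).
enrolled = [0.9, 0.1, 0.2]
same = [0.8, 0.2, 0.1]
different = [-0.1, 0.9, -0.3]

print(same_speaker(enrolled, same))
print(same_speaker(enrolled, different))
```

A higher score means the two embeddings point in more similar directions, so the pair is more likely to come from the same speaker.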