End-to-End Neural Diarization (EEND) model trained on the AMI headset dataset. The recipe can be found at egs2/ami/diar1.

Configurations:

  • Use ESPnet's default frontend to extract features. The sampling rate is 8000 Hz, with a frame length of 25 ms and a frame shift of 10 ms. The frontend extracts 23-dimensional log-Mel filterbank features.
  • Follow the frame concatenation and subsampling strategy described in [2]. Each frame is concatenated with its 7 preceding and 7 following frames (15 frames in total), and the result is subsampled by a factor of 10. This yields one 345-dimensional acoustic feature (23 × 15) per 100 ms.
  • Training and testing are performed exclusively on data with 4 speakers.
  • Use a 4-layer stacked Transformer encoder; each layer outputs 256-dimensional frame-wise embeddings.
  • The training process spans 500 epochs.
  • Detailed configurations are defined in exp/diar/train_diar_diar_raw/config.yaml.
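
The frame concatenation and subsampling step described in the list above can be sketched in plain Python as follows. This is a minimal illustration, not the recipe's actual code: the function name `splice_and_subsample` is hypothetical, and zero-padding at the utterance edges and keeping every 10th frame starting from frame 0 are assumptions.

```python
from typing import List


def splice_and_subsample(
    frames: List[List[float]], context: int = 7, subsample: int = 10
) -> List[List[float]]:
    """Stack each kept frame with its +/- `context` neighbours.

    frames: per-frame feature vectors (e.g. 23 log-Mel values every 10 ms).
    Returns one stacked vector for every `subsample`-th input frame.
    Edge handling (zero-padding) is an assumption of this sketch.
    """
    num_frames = len(frames)
    if num_frames == 0:
        return []
    dim = len(frames[0])
    zero = [0.0] * dim  # padding frame used outside the utterance

    spliced = []
    # Keep every `subsample`-th frame as the centre of a context window.
    for t in range(0, num_frames, subsample):
        stacked: List[float] = []
        for offset in range(-context, context + 1):
            idx = t + offset
            stacked.extend(frames[idx] if 0 <= idx < num_frames else zero)
        spliced.append(stacked)
    return spliced


# 100 frames at a 10 ms shift cover 1 second of audio.
dummy = [[float(t)] * 23 for t in range(100)]
features = splice_and_subsample(dummy)
print(len(features), len(features[0]))  # 10 345
```

With a 10 ms frame shift and a subsampling factor of 10, each output vector covers a 100 ms step, and its length is 23 × (2 × 7 + 1) = 345.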

RESULTS

Environments

  • date: Thu Dec 19 22:03:53 EST 2024
  • python version: 3.11.10 (main, Oct 3 2024, 07:29:13) [GCC 11.2.0]
  • espnet version: espnet 202409
  • pytorch version: pytorch 2.4.0
  • Git hash: c12b3d59ca4fd8847edf274e56a1716474d2a30e
    • Commit date: Thu Dec 19 21:58:26 2024 -0500

diar_train_diar_raw

DER

diarized_test

threshold_median_collar          DER (%)
result_th0.3_med11_collar0.0     71.73
result_th0.3_med1_collar0.0      74.62
result_th0.4_med11_collar0.0     70.10
result_th0.4_med1_collar0.0      71.98
result_th0.5_med11_collar0.0     70.57
result_th0.5_med1_collar0.0      72.44
result_th0.6_med11_collar0.0     72.64
result_th0.6_med1_collar0.0      74.63
result_th0.7_med11_collar0.0     76.52
result_th0.7_med1_collar0.0      78.41

diar_train_diar_raw

DER

diarized_dev

threshold_median_collar          DER (%)
result_th0.3_med11_collar0.0     75.88
result_th0.3_med1_collar0.0      78.21
result_th0.4_med11_collar0.0     71.45
result_th0.4_med1_collar0.0      73.32
result_th0.5_med11_collar0.0     70.53
result_th0.5_med1_collar0.0      72.34
result_th0.6_med11_collar0.0     72.03
result_th0.6_med1_collar0.0      73.96
result_th0.7_med11_collar0.0     76.66
result_th0.7_med1_collar0.0      78.33
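
The row labels in the DER tables presumably encode the post-processing settings: `th` as the decision threshold on per-frame speaker activity posteriors, `med` as the median filter window in frames, and `collar` as the scoring collar in seconds. That reading is an assumption based on the naming, not something stated in this card. Under that assumption, the thresholding and median filtering for one speaker could be sketched as below; the function names and the zero-padding at the sequence edges are illustrative choices, not the recipe's actual scoring code.

```python
from typing import List


def median_filter(seq: List[int], window: int) -> List[int]:
    """Sliding median over `seq` with an odd `window`, zero-padded at the edges."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    padded = [0] * half + list(seq) + [0] * half
    # The median of an odd-length window is the middle element after sorting.
    return [sorted(padded[i:i + window])[half] for i in range(len(seq))]


def binarize(posteriors: List[float], threshold: float = 0.5, median: int = 11) -> List[int]:
    """Turn per-frame posteriors for one speaker into 0/1 activity decisions."""
    decisions = [1 if p > threshold else 0 for p in posteriors]
    return median_filter(decisions, median)


# A one-frame dropout inside a long speech region.
posteriors = [0.9] * 10 + [0.1] + [0.9] * 10
smoothed = binarize(posteriors, threshold=0.5, median=11)  # dropout is filled in
unsmoothed = binarize(posteriors, threshold=0.5, median=1)  # dropout is kept
```

A window of 1 leaves the thresholded decisions unchanged, while a window of 11 removes isolated short flips. That would be consistent with the `med11` rows scoring a lower DER than the `med1` rows at every threshold in both tables.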