ksingla025 committed on
Commit
8b3d0ed
1 Parent(s): 940f069

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +42 -0
README.md ADDED
@@ -0,0 +1,42 @@
+
+ # This speech tagger transcribes Hindi speech, annotates key entities, and predicts speaker age, dialect, and intent.
+
+ The model is suitable for voice AI applications, both real-time and offline.
+
+ ## Model Details
+
+ - **Model type**: NeMo ASR
+ - **Architecture**: Conformer CTC
+ - **Language**: Hindi
+ - **Training data**: CommonVoice, Gigaspeech
+ - **Performance metrics**: [Metrics]
+
+ ## Usage
+
+ To use this model, install the NeMo toolkit with its ASR dependencies:
+
+ ```bash
+ pip install "nemo_toolkit[asr]"
+ ```
+
+ ### How to run
+
+ ```python
+ import nemo.collections.asr as nemo_asr
+
+ # Step 1: Load the ASR model from Hugging Face
+ model_name = 'WhissleAI/stt_hi_conformer_ctc_entities_age_dialiect_intent'
+ asr_model = nemo_asr.models.EncDecCTCModel.from_pretrained(model_name)
+
+ # Step 2: Provide the path to your audio file
+ audio_file_path = '/path/to/your/audio_file.wav'
+
+ # Step 3: Transcribe the audio
+ # The file list is passed positionally because the keyword name differs
+ # between NeMo releases (`paths2audio_files` in older ones, `audio` in newer ones)
+ transcription = asr_model.transcribe([audio_file_path])
+ print(f'Transcription: {transcription[0]}')
+ ```
+
+ The data comes from the AI4Bharat IndicVoices Hindi V1 and V2 datasets:
+
+ https://indicvoices.ai4bharat.org/
+
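Because the README describes a tagger that emits entities, age, dialect, and intent alongside the transcript, downstream code will usually need to separate those tags from the spoken words. The README does not document the tag format, so the sketch below rests on an assumed convention: entity spans written as `ENTITY_<TYPE> ... END` and utterance-level tags written as `AGE_<...>`, `DIALECT_<...>`, and `INTENT_<...>`. The function name `parse_tagged_transcript` and the sample string are hypothetical; check the model's real output before relying on these patterns.

```python
import re

# Assumed tag conventions (hypothetical, not confirmed by the README):
#   ENTITY_<TYPE> <words...> END            marks an entity span
#   AGE_<...>, DIALECT_<...>, INTENT_<...>  are utterance-level tags
ENTITY_SPAN = re.compile(r"ENTITY_(\w+)\s+(.*?)\s+END\b")
UTTERANCE_TAG = re.compile(r"\b(AGE|DIALECT|INTENT)_(\w+)")


def parse_tagged_transcript(tagged: str) -> dict:
    """Split a tagged transcript into plain text, entities, and utterance-level tags."""
    # Collect (entity type, entity words) pairs from every entity span
    entities = [(m.group(1), m.group(2)) for m in ENTITY_SPAN.finditer(tagged)]
    # Collect utterance-level tags, keyed by lower-cased tag family
    tags = {m.group(1).lower(): m.group(2) for m in UTTERANCE_TAG.finditer(tagged)}
    # Keep the entity words but drop the markers to recover the plain transcript
    text = ENTITY_SPAN.sub(r"\2", tagged)
    text = UTTERANCE_TAG.sub("", text)
    text = " ".join(text.split())  # normalise leftover whitespace
    return {"text": text, "entities": entities, "tags": tags}


# Romanized placeholder text standing in for a real model output
example = "mera naam ENTITY_PERSON_NAME rahul END hai AGE_30_45 DIALECT_HINDI INTENT_INFORM"
result = parse_tagged_transcript(example)
print(result["text"])
print(result["entities"])
print(result["tags"])
```

In practice, the string passed to `parse_tagged_transcript` would be the text returned by `asr_model.transcribe(...)` in the usage snippet above. Keeping the patterns in two module-level constants makes them easy to adjust once the actual tag vocabulary is known.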