iiBLACKii committed · Commit 81c6beb · verified · 1 Parent(s): 50976d6

Update README.md

Files changed (1): README.md (+52 −0)

README.md CHANGED
```
[More Information Needed]

 
### Using the Base Model (OpenAI)

```python
import torch
import librosa
from transformers import WhisperProcessor, WhisperForConditionalGeneration, AutoConfig

repo_name = "iiBLACKii/Gujarati_VDB_Fine_Tune"

# Load the processor (feature extractor + tokenizer), config, and model weights
processor = WhisperProcessor.from_pretrained(repo_name)
config = AutoConfig.from_pretrained(repo_name)
model = WhisperForConditionalGeneration.from_pretrained(repo_name, config=config)

# Run on GPU when available
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)

def preprocess_audio(file_path, sampling_rate=16000):
    """Load an audio file and resample it to 16 kHz, as Whisper expects."""
    audio_array, sr = librosa.load(file_path, sr=None)
    if sr != sampling_rate:
        audio_array = librosa.resample(audio_array, orig_sr=sr, target_sr=sampling_rate)
    return audio_array

def transcribe_audio(audio_path):
    audio_array = preprocess_audio(audio_path)

    # Convert the waveform to log-Mel input features
    input_features = processor.feature_extractor(
        audio_array, sampling_rate=16000, return_tensors="pt"
    ).input_features
    input_features = input_features.to(device)

    # Decode with beam search
    with torch.no_grad():
        predicted_ids = model.generate(
            input_features,
            max_new_tokens=400,
            num_beams=5,
        )

    transcription = processor.tokenizer.batch_decode(predicted_ids, skip_special_tokens=True)
    return transcription[0]

if __name__ == "__main__":
    audio_file_path = ""  # .wav file path

    print("Transcribing audio...")
    transcription = transcribe_audio(audio_file_path)
    print(f"Transcription: {transcription}")
```
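Note that Whisper's feature extractor pads or truncates each input to a 30-second window, so longer recordings should be split into chunks and transcribed chunk by chunk. A minimal sketch of such splitting, assuming plain NumPy arrays (the `chunk_audio` helper is illustrative and not part of this repository):

```python
import numpy as np

def chunk_audio(audio_array, sampling_rate=16000, chunk_seconds=30):
    """Split a 1-D audio array into consecutive chunks of at most chunk_seconds each."""
    chunk_len = sampling_rate * chunk_seconds
    return [audio_array[i:i + chunk_len] for i in range(0, len(audio_array), chunk_len)]

# Example: 70 s of audio at 16 kHz splits into 30 s + 30 s + 10 s chunks.
audio = np.zeros(16000 * 70, dtype=np.float32)
chunks = chunk_audio(audio)
print([len(c) // 16000 for c in chunks])  # → [30, 30, 10]
```

Each chunk can then be passed through the same `feature_extractor` → `generate` → `batch_decode` steps as above and the partial transcriptions concatenated.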
### Downstream Use [optional]

<!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->