Initial commit of my fine-tuned model

Files changed (5)

.gitignore +22 -0
Fine_tuning_TTS_Bengali.ipynb +0 -0
LICENSE +21 -0
README.md +3 -78
SpeechT5_finetune_technicalTerm.ipynb +0 -0

.gitignore ADDED

@@ -0,0 +1,22 @@

+### AL ###
+#Template for AL projects for Dynamics 365 Business Central
+#launch.json folder
+.vscode/
+#Cache folder
+.alcache/
+#Symbols folder
+.alpackages/
+#Snapshots folder
+.snapshots/
+#Testing Output folder
+.output/
+#Extension App-file
+*.app
+#Rapid Application Development File
+rad.json
+#Translation Base-file
+*.g.xlf
+#License-file
+*.flf
+#Test results file
+TestResults.xml

Fine_tuning_TTS_Bengali.ipynb ADDED

The diff for this file is too large to render.

LICENSE ADDED

@@ -0,0 +1,21 @@

+MIT License
+Copyright (c) 2024 Jyotirmoyee Mandal
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.

README.md CHANGED

@@ -1,78 +1,3 @@
----
-library_name: transformers
-license: mit
-base_model: microsoft/speecht5_tts
-tags:
-- generated_from_trainer
-datasets:
-- lj_speech
-model-index:
-- name: speecht5_finetuned_English
-  results: []
----
-<!-- This model card has been generated automatically according to the information the Trainer had access to. You
-should probably proofread and complete it, then remove this comment. -->
-# speecht5_finetuned_English
-This model is a fine-tuned version of [microsoft/speecht5_tts](https://huggingface.co/microsoft/speecht5_tts) on the lj_speech dataset.
-It achieves the following results on the evaluation set:
-- Loss: 0.3717
-## Model description
-More information needed
-## Intended uses & limitations
-More information needed
-## Training and evaluation data
-More information needed
-## Training procedure
-### Training hyperparameters
-The following hyperparameters were used during training:
-- learning_rate: 0.0001
-- train_batch_size: 4
-- eval_batch_size: 2
-- seed: 42
-- gradient_accumulation_steps: 8
-- total_train_batch_size: 32
-- optimizer: Use adamw_torch with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
-- lr_scheduler_type: linear
-- lr_scheduler_warmup_steps: 100
-- training_steps: 1500
-- mixed_precision_training: Native AMP
-### Training results
-| Training Loss | Epoch  | Step | Validation Loss |
-|:-------------:|:------:|:----:|:---------------:|
-| 3.772         | 0.3053 | 100  | 0.4201          |
-| 3.5912        | 0.6107 | 200  | 0.4052          |
-| 3.4604        | 0.9160 | 300  | 0.3948          |
-| 3.3894        | 1.2214 | 400  | 0.3906          |
-| 3.3737        | 1.5267 | 500  | 0.3865          |
-| 3.3628        | 1.8321 | 600  | 0.3851          |
-| 3.3236        | 2.1374 | 700  | 0.3821          |
-| 3.306         | 2.4427 | 800  | 0.3811          |
-| 3.2859        | 2.7481 | 900  | 0.3797          |
-| 3.2663        | 3.0534 | 1000 | 0.3763          |
-| 3.2368        | 3.3588 | 1100 | 0.3757          |
-| 3.2107        | 3.6641 | 1200 | 0.3749          |
-| 3.2035        | 3.9695 | 1300 | 0.3730          |
-| 3.1969        | 4.2748 | 1400 | 0.3728          |
-| 3.2107        | 4.5802 | 1500 | 0.3717          |
-### Framework versions
-- Transformers 4.46.0.dev0
-- Pytorch 2.4.1+cu121
-- Datasets 3.0.2
-- Tokenizers 0.20.1
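
The hyperparameters and results table in the removed model card can be cross-checked with a little arithmetic. The sketch below is illustrative only: the variable names are not taken from the training notebook, a single training device is assumed, and the steps-per-epoch figure is an estimate inferred from the first table row rather than a value stated in the card.

```python
# Values copied from the removed model card.
train_batch_size = 4
gradient_accumulation_steps = 8
training_steps = 1500

# Effective batch size per optimizer step (assumes one device).
total_train_batch_size = train_batch_size * gradient_accumulation_steps

# First row of the results table: step 100 corresponds to epoch 0.3053.
# Dividing gives an estimate of optimizer steps per epoch.
steps_per_epoch = 100 / 0.3053

# Estimated number of epochs completed when training stops.
epochs_at_end = training_steps / steps_per_epoch

print(total_train_batch_size)  # 32, matching total_train_batch_size in the card
print(round(steps_per_epoch))
print(round(epochs_at_end, 2))
```

The estimated final epoch agrees with the last table row (4.5802 at step 1500) to within rounding of the epoch column.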

+# TTS_FineTuning_GenAI_Assignment
+Implementation of fine-tuning TTS models for technical vocabulary in English and in Bengali, as part of IIT Roorkee’s GenAI Internship. Includes dataset creation, model fine-tuning, and evaluation using MOS scores. Also explores optimization techniques like quantization for faster inference.
+# Fine-tuning TTS for English with a Focus on Technical Vocabulary
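
The new README mentions evaluation with MOS scores. As a rough sketch of what that computation usually involves, the following averages 1-5 listener ratings and reports a normal-approximation 95% confidence interval. The function name `mean_opinion_score` and the sample ratings are hypothetical; the notebooks in this commit may compute MOS differently.

```python
import math
import statistics


def mean_opinion_score(ratings):
    """Return (mean, 95% CI half-width) for listener ratings on a 1-5 scale."""
    if len(ratings) < 2:
        raise ValueError("need at least two ratings")
    if any(not 1 <= r <= 5 for r in ratings):
        raise ValueError("ratings must be on the 1-5 scale")
    mean = statistics.mean(ratings)
    # Normal approximation; for small listener panels a t-interval is wider.
    half_width = 1.96 * statistics.stdev(ratings) / math.sqrt(len(ratings))
    return mean, half_width


# Hypothetical ratings for one synthesized utterance, not data from this repo.
ratings = [4, 5, 3, 4, 4, 5, 4, 3]
mos, ci = mean_opinion_score(ratings)
print(f"MOS = {mos:.2f} +/- {ci:.2f}")
```

Reporting the interval alongside the mean makes it easier to judge whether a difference between the base and fine-tuned models is larger than listener disagreement.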

SpeechT5_finetune_technicalTerm.ipynb ADDED

The diff for this file is too large to render.