---
title: Id2223 Lab2
emoji: 💬
colorFrom: yellow
colorTo: purple
sdk: gradio
sdk_version: 5.0.1
app_file: app.py
pinned: false
---

An example chatbot using Gradio, huggingface_hub, and the Hugging Face Inference API.

## Hyperparameter Tuning

- **Learning Rate:** Too high a value can cause divergence, while too low a value slows convergence. A balanced rate such as 1e-5 or 3e-5 helps achieve stable learning.
- **Batch Size:** Smaller batch sizes introduce more stochasticity, helping the optimizer escape local minima, while larger ones stabilize training and reduce gradient noise.
- **Epochs:** Balancing underfitting (too few epochs) against overfitting (too many) is crucial to maintaining performance.
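The batch size and epoch count interact through the number of optimizer updates the model actually receives. A minimal sketch (the function and numbers are illustrative, not from this lab's training script):

```python
import math

def training_steps(num_examples, batch_size, epochs):
    """Total optimizer updates over a training run.

    Smaller batches mean more, noisier updates per epoch;
    more epochs mean more passes over the data (risking overfitting).
    """
    steps_per_epoch = math.ceil(num_examples / batch_size)
    return steps_per_epoch * epochs

# With 10,000 examples and 3 epochs, shrinking the batch from 64 to 8
# multiplies the number of gradient updates roughly eightfold.
small_batch = training_steps(10_000, 8, 3)   # 3750 updates
large_batch = training_steps(10_000, 64, 3)  # 471 updates
```

This is why halving the batch size is often paired with a lower learning rate: each individual update is noisier, and there are more of them.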

## Learning Rate Scheduling

Gradual reduction of the learning rate during training prevents overshooting the optimum, helping the model fine-tune its parameters with higher precision.

## Data Diversity

A diverse dataset helps the model generalize better across different domains and languages.

- **Domain-Specific Data:** For example, if the target use case is healthcare, training on medical conversations improves relevance.
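One simple way to realize such a mix is weighted sampling across domain corpora, upweighting the target domain while keeping some general data. A hypothetical sketch (the `mix_domains` helper and the example weights are assumptions, not part of this lab):

```python
import random

def mix_domains(corpora, weights, n_samples, seed=0):
    """Sample a training mix from several domain corpora.

    corpora: dict mapping domain name -> list of examples.
    weights: dict mapping domain name -> sampling proportion.
    Returns n_samples (domain, example) pairs, drawn with the given weights.
    """
    rng = random.Random(seed)
    domains = list(corpora)
    picks = rng.choices(domains, weights=[weights[d] for d in domains], k=n_samples)
    return [(d, rng.choice(corpora[d])) for d in picks]

corpora = {
    "general": ["How do I reset my password?", "Tell me a joke."],
    "medical": ["What are the symptoms of flu?", "Explain this dosage label."],
}
# Upweight the healthcare domain for a medical assistant use case.
mix = mix_domains(corpora, {"general": 0.3, "medical": 0.7}, n_samples=10)
```

Tuning the weights trades off target-domain relevance against general-purpose robustness.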