amc-madalin committed · Commit eeda13d · verified · 1 Parent(s): 2636083

Update README.md

Files changed (1): README.md (+118 −0)
README.md CHANGED
---
license: mit
---

# Transformer Model for Language Translation

## Overview
This project implements a Transformer model for English-to-Italian translation, built from scratch to give a deeper understanding of the Transformer architecture, which has become a cornerstone of natural language processing. It covers key elements of the architecture, such as the attention mechanism, and walks through data preprocessing, model training, and evaluation.

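As a quick illustration of the mechanism at the model's core, here is a minimal sketch of scaled dot-product attention in PyTorch. It is an illustrative snippet written for this README, not code taken from `model.py`.

```python
import math
import torch

def scaled_dot_product_attention(q, k, v, mask=None):
    # q, k, v: (batch, heads, seq_len, d_k)
    d_k = q.size(-1)
    # Compare every query with every key; scale to keep the softmax well behaved.
    scores = (q @ k.transpose(-2, -1)) / math.sqrt(d_k)
    if mask is not None:
        # Hide positions the model must not attend to (padding, future tokens).
        scores = scores.masked_fill(mask == 0, float("-inf"))
    weights = scores.softmax(dim=-1)
    # Each output position is a weighted average of the value vectors.
    return weights @ v, weights
```

The returned `weights` are the kind of attention maps that `attention_visual.ipynb` visualizes for the trained model.
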
## Learning Objectives
- Understand and implement the Transformer model architecture.
- Explore the attention mechanism and its application in language translation.
- Gain practical experience with data preprocessing, model training, and evaluation in NLP.

## Model Card on Hugging Face
You can find and use the pre-trained model on Hugging Face here:
[Model on Hugging Face](https://huggingface.co/amc-madalin/amc-en-it/tree/main)

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Load the model and tokenizer (repo id taken from the model card linked above)
tokenizer = AutoTokenizer.from_pretrained("amc-madalin/amc-en-it")
model = AutoModelForSeq2SeqLM.from_pretrained("amc-madalin/amc-en-it")

# Translation example
text = "Hello, how are you?"
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs)
translated_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(translated_text)
```

## Project Structure

- **Attention Visualization** (`attention_visual.ipynb`): A notebook for visualizing attention maps to understand how the model focuses on different sentence parts during translation.
- **Configuration Settings** (`config.py`): Includes hyperparameters and other modifiable settings.
- **Dataset Processing** (`dataset.py`): Handles loading and preprocessing of the English and Italian datasets.
- **Model Architecture** (`model.py`): Defines the Transformer model architecture.
- **Project Documentation** (`README.md`): This file, which provides a complete overview of the project.
- **Experiment Logs** (`runs/`): Logs and outputs from model training sessions.
- **Tokenizers** (`tokenizer_en.json`, `tokenizer_it.json`): Tokenizers for English and Italian text preprocessing (see the sketch after this list).
- **Training Script** (`train.py`): The script that encapsulates the training process.
- **Saved Model Weights** (`weights/`): Stores the trained model weights for future use.

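A minimal sketch of loading the tokenizer files, under the assumption that `tokenizer_en.json` and `tokenizer_it.json` were saved with the Hugging Face `tokenizers` library (check `dataset.py`/`train.py` for the actual setup):

```python
from tokenizers import Tokenizer  # pip install tokenizers

# Assumption: both JSON files were produced by the `tokenizers` library's Tokenizer.save().
tokenizer_src = Tokenizer.from_file("tokenizer_en.json")
tokenizer_tgt = Tokenizer.from_file("tokenizer_it.json")

# Encode an English sentence into the token ids the encoder would consume.
encoded = tokenizer_src.encode("Hello, how are you?")
print(encoded.tokens)
print(encoded.ids)
```
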
## Installation

To set up and run the project locally, follow these steps:

1. **Clone the Repository:**
   ```bash
   git clone https://github.com/amc-madalin/transformer-for-language-translation.git
   ```

2. **Create a Python Environment:**
   Create a Conda environment:
   ```bash
   conda create --name transformer python=3.x
   ```
   Replace `3.x` with your preferred Python version.

3. **Activate the Environment:**
   ```bash
   conda activate transformer
   ```

4. **Install Dependencies:**
   Install the required packages from `requirements.txt`:
   ```bash
   pip install -r requirements.txt
   ```

5. **Prepare Data:**
   The dataset is downloaded automatically. Modify the source (`lang_src`) and target (`lang_tgt`) languages in `config.py` if necessary; the defaults are English (`en`) and Italian (`it`):
   ```json
   "lang_src": "en",
   "lang_tgt": "it",
   ```

6. **Train the Model:**
   Start the training process with:
   ```bash
   python train.py
   ```

7. **Use the Model:**
   The trained model weights are saved in the `weights/` directory. Use them for inference, evaluation, or further applications; a loading sketch follows this list.

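Because the saved weights are a plain PyTorch checkpoint rather than a `transformers` export, restoring them locally is a `state_dict` round trip. The sketch below assumes the model object has already been built with the code in `model.py`, and that `train.py` saves a dict wrapping the weights under `"model_state_dict"`; the checkpoint filename and that key are assumptions to adjust to the actual scripts.

```python
import torch

# `model` is the Transformer instance built with the architecture code in model.py.
# The filename below is a placeholder for whichever checkpoint train.py produced.
checkpoint = torch.load("weights/your_checkpoint.pt", map_location="cpu")

# Accept either a raw state_dict or a dict that nests it under "model_state_dict".
state_dict = checkpoint.get("model_state_dict", checkpoint)
model.load_state_dict(state_dict)
model.eval()  # disable dropout for inference
```
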
## Using the Model with Hugging Face
Once trained, the model can be uploaded to Hugging Face for easy access and use.

### Uploading the Model to Hugging Face
Use the following commands to upload your trained model to Hugging Face:
```bash
huggingface-cli login
transformers-cli upload ./weights/ --organization your-organization
```

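Recent `transformers` releases no longer ship the `transformers-cli upload` subcommand; if it is unavailable in your installed version, the `huggingface_hub` Python API achieves the same result (the repo id below is a placeholder):

```python
from huggingface_hub import HfApi

api = HfApi()  # reuses the token stored by `huggingface-cli login`
api.create_repo("your-username/your-model-name", exist_ok=True)
api.upload_folder(repo_id="your-username/your-model-name", folder_path="./weights")
```
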
### Loading the Model from Hugging Face for Inference
You can load the model for translation directly from Hugging Face:
```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("amc-madalin/amc-en-it")
model = AutoModelForSeq2SeqLM.from_pretrained("amc-madalin/amc-en-it")

# Translate text
text = "How are you?"
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs)
translation = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(translation)
```

## Learning Resources
- [YouTube - Coding a Transformer from Scratch on PyTorch](https://youtube.com/your-video-link)
  A detailed walkthrough of coding a Transformer model from scratch using PyTorch, including training and inference.

## Acknowledgements
Special thanks to **Umar Jamil** for his guidance and contributions that supported the completion of this project.