# LSTM and Seq-to-Seq Language Translator

This project implements English-Hebrew translation using two approaches:

- **LSTM-based Translator**: a model that translates between English and Hebrew using a basic encoder-decoder architecture.
- **Seq-to-Seq Translator**: a sequence-to-sequence model without attention, used for bidirectional translation between English and Hebrew.

Both models are trained on a parallel dataset of 1,000 sentence pairs and evaluated with BLEU and CHRF scores.

## Model Architectures

### 1. LSTM-Based Translator

The LSTM model is built from the following components (a minimal Keras sketch follows the list):

- **Encoder**: embedding and LSTM layers that encode English input sequences into latent representations.
- **Decoder**: embedding and LSTM layers initialized with the encoder's final states, generating the Hebrew translation token by token.
- **Dense layer**: a fully connected output layer with softmax activation that predicts the next word in the sequence.
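
A minimal sketch of this architecture in Keras is shown below. The hyperparameters and vocabulary sizes are illustrative placeholders, not values taken from the project.

```python
from tensorflow.keras.layers import Dense, Embedding, Input, LSTM
from tensorflow.keras.models import Model

# Illustrative placeholders, not the project's actual values
eng_vocab_size, heb_vocab_size = 5000, 5000
embedding_dim, latent_dim = 256, 512

# Encoder: embed the English tokens and keep only the final LSTM states
encoder_inputs = Input(shape=(None,))
enc_emb = Embedding(eng_vocab_size, embedding_dim)(encoder_inputs)
_, state_h, state_c = LSTM(latent_dim, return_state=True)(enc_emb)

# Decoder: embed the Hebrew tokens, initialized with the encoder states
decoder_inputs = Input(shape=(None,))
dec_emb = Embedding(heb_vocab_size, embedding_dim)(decoder_inputs)
decoder_lstm = LSTM(latent_dim, return_sequences=True, return_state=True)
decoder_outputs, _, _ = decoder_lstm(dec_emb, initial_state=[state_h, state_c])

# Dense softmax layer predicts the next Hebrew token at every timestep
decoder_outputs = Dense(heb_vocab_size, activation="softmax")(decoder_outputs)

model = Model([encoder_inputs, decoder_inputs], decoder_outputs)
```
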
### 2. Seq-to-Seq Translator

The Seq-to-Seq model uses the following components; a greedy decoding sketch follows the list:

- **Encoder**: similar to the LSTM-based translator, it encodes the input sequence into context vectors.
- **Decoder**: predicts the target sequence without attention, relying entirely on the encoded context.
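
At inference time a model without attention has nothing but the encoder's final states to condition on, so translation reduces to a token-by-token loop. The sketch below shows greedy decoding; `encoder_model`, `decoder_model`, and `heb_tokenizer` are assumed to be the usual Keras inference-time wrappers and target-side tokenizer built from the trained model, not objects defined by this project.

```python
import numpy as np

def greedy_decode(input_seq, encoder_model, decoder_model, heb_tokenizer, max_len=20):
    """Greedily translate one padded English sequence into Hebrew words."""
    # Encode the source sentence; the final LSTM states seed the decoder
    states = encoder_model.predict(input_seq)

    # Start generation from the <start> token
    target_seq = np.array([[heb_tokenizer.word_index["<start>"]]])
    decoded = []
    for _ in range(max_len):
        output_tokens, h, c = decoder_model.predict([target_seq] + states)
        next_id = int(np.argmax(output_tokens[0, -1, :]))
        word = heb_tokenizer.index_word.get(next_id, "")
        if word in ("", "<end>"):
            break
        decoded.append(word)
        # Feed the prediction back in and carry the LSTM states forward
        target_seq = np.array([[next_id]])
        states = [h, c]
    return " ".join(decoded)
```
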
## Dataset

The models are trained on a custom parallel dataset of 1,000 English-Hebrew sentence pairs, stored as JSON with `english` and `hebrew` fields. The Hebrew text is wrapped with `<start>` and `<end>` tokens to mark sequence boundaries for the decoder.
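
For illustration only, an entry in such a file is assumed to look like the pairs below (the sentences are invented, not taken from the dataset):

```python
# Hypothetical examples of the dataset format: the Hebrew side carries
# explicit <start>/<end> markers for the decoder.
example_pairs = [
    {"english": "good morning", "hebrew": "<start> בוקר טוב <end>"},
    {"english": "thank you very much", "hebrew": "<start> תודה רבה <end>"},
]
```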

**Preprocessing** (a sketch follows the list):

- **Tokenization**: text is tokenized with Keras' `Tokenizer`.
- **Padding**: sequences are padded to a fixed length for training.
- **Vocabulary sizes**:
  - English: [English Vocabulary Size]
  - Hebrew: [Hebrew Vocabulary Size]
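
A minimal sketch of this preprocessing, assuming the JSON layout described above and a hypothetical file name `data.json`:

```python
import json

from tensorflow.keras.preprocessing.sequence import pad_sequences
from tensorflow.keras.preprocessing.text import Tokenizer

# Load the parallel corpus (the file name is a placeholder)
with open("data.json", encoding="utf-8") as f:
    pairs = json.load(f)
eng_texts = [p["english"] for p in pairs]
heb_texts = [p["hebrew"] for p in pairs]

# Tokenize; an empty filter string keeps the <start>/<end> markers intact
eng_tokenizer = Tokenizer(filters="")
eng_tokenizer.fit_on_texts(eng_texts)
heb_tokenizer = Tokenizer(filters="")
heb_tokenizer.fit_on_texts(heb_texts)

eng_seqs = eng_tokenizer.texts_to_sequences(eng_texts)
heb_seqs = heb_tokenizer.texts_to_sequences(heb_texts)

# Pad every sequence to a fixed length
eng_padded = pad_sequences(eng_seqs, padding="post")
heb_padded = pad_sequences(heb_seqs, padding="post")

# Vocabulary sizes (+1 for the reserved padding index 0)
eng_vocab_size = len(eng_tokenizer.word_index) + 1
heb_vocab_size = len(heb_tokenizer.word_index) + 1
```
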
## Training Details

**Training parameters:**

- Optimizer: Adam
- Loss function: sparse categorical crossentropy
- Batch size: 32
- Epochs: 20
- Validation split: 20%

**Checkpoints:**
The best-performing weights (by validation loss) are saved with Keras' `ModelCheckpoint`.
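
A sketch of how these settings map onto a Keras training run, assuming the `model` and padded sequences from the earlier sketches (the checkpoint file name is a placeholder):

```python
from tensorflow.keras.callbacks import ModelCheckpoint

model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")

# Keep only the weights with the lowest validation loss
checkpoint = ModelCheckpoint("best_model.h5", monitor="val_loss", save_best_only=True)

# Teacher forcing: the decoder sees the target up to step t and predicts step t+1
decoder_input = heb_padded[:, :-1]
decoder_target = heb_padded[:, 1:]

history = model.fit(
    [eng_padded, decoder_input],
    decoder_target,
    batch_size=32,
    epochs=20,
    validation_split=0.2,
    callbacks=[checkpoint],
)
# history.history["loss"] and history.history["val_loss"] hold the loss curves
```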

**Training metrics:**
Both models track:

- Training loss
- Validation loss

## Evaluation Metrics

### 1. BLEU Score

BLEU evaluates translation quality by comparing model output against reference translations; higher scores indicate better translations.

- LSTM Model BLEU: [BLEU Score for LSTM]
- Seq-to-Seq Model BLEU: [BLEU Score for Seq-to-Seq]

### 2. CHRF Score

CHRF evaluates translations using character-level F-scores; higher scores indicate better translations.

- LSTM Model CHRF: [CHRF Score for LSTM]
- Seq-to-Seq Model CHRF: [CHRF Score for Seq-to-Seq]
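
The project credits PyTorch's BLEU and SacreBLEU's CHRF; for brevity, the sketch below computes both scores with the `sacrebleu` package instead, assuming `hypotheses` and `references` hold detokenized Hebrew sentences (the strings here are placeholders):

```python
import sacrebleu

# Placeholders: one model translation per test sentence, plus one aligned
# stream of reference translations
hypotheses = ["model translation 1", "model translation 2"]
references = [["reference translation 1", "reference translation 2"]]

bleu = sacrebleu.corpus_bleu(hypotheses, references)
chrf = sacrebleu.corpus_chrf(hypotheses, references)
print(f"BLEU: {bleu.score:.2f}  CHRF: {chrf.score:.2f}")
```
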
## Results

- **Training loss comparison**: the Seq-to-Seq model converged slightly better than the LSTM model, which the project attributes to its more structured architecture.
- **Translation quality**: the BLEU and CHRF scores indicate that both models produce reasonable translations, with the Seq-to-Seq model performing better on longer sentences.

## Acknowledgments

- Dataset: [Custom Parallel Dataset]
- Evaluation tools: PyTorch BLEU, SacreBLEU CHRF