Model Card for motasem/T_ETA
Model Details
T_ETA: Split-and-Rephrase Model for NLP Preprocessing
T_ETA is a state-of-the-art split-and-rephrase model fine-tuned on the ETA dataset. It simplifies complex sentences into shorter, semantically accurate sentences, making it an ideal pre-processing step for various NLP tasks.
Key Features
- Sentence Simplification: Breaks down complex sentences while preserving meaning.
- High-Quality Outputs: Balances simplicity, meaning preservation, and grammaticality.
- Versatile Applications: Ideal for machine translation, summarization, information retrieval, and more.
How to Use
from transformers import T5ForConditionalGeneration, T5Tokenizer

model = T5ForConditionalGeneration.from_pretrained("motasem/T_ETA")
tokenizer = T5Tokenizer.from_pretrained("motasem/T_ETA")

# Test the model on a complex, multi-clause input
input_text = "Jordan, an Arab nation on the east bank of the Jordan River, is defined by ancient monuments, nature reserves and seaside resorts, It's home to the famed archaeological site of Petra, the Nabatean capital dating to around 300 BC, Set in a narrow valley with tombs, temples and monuments carved into the surrounding pink sandstone cliffs, Petra earns its nickname, the Rose City."

# Prefix the input with the "SR: " task tag the model was fine-tuned with
input_ids = tokenizer.encode("SR: " + input_text, return_tensors="pt", max_length=1024, truncation=True)

output_ids = model.generate(input_ids,
                            max_length=1024,
                            num_beams=3,
                            no_repeat_ngram_size=6,
                            num_return_sequences=1,
                            early_stopping=True)

# skip_special_tokens=True removes <pad>/</s> markers from the printed output
output_text = tokenizer.decode(output_ids[0],
                               skip_special_tokens=True,
                               clean_up_tokenization_spaces=True)
print(output_text)
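Since T_ETA is intended as a pre-processing step, downstream tasks usually need the generated text as a list of individual sentences rather than one string. A minimal sketch of that post-processing step, using a simple regex split on terminal punctuation (the `split_sentences` helper and the example string are illustrative assumptions, not actual model output):

import re

def split_sentences(text: str) -> list[str]:
    # Split on whitespace that follows sentence-final punctuation (., !, ?)
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s.strip()]

# Hypothetical simplified output, shown only to illustrate the helper:
simplified = "Jordan is an Arab nation. It lies on the east bank of the Jordan River."
print(split_sentences(simplified))

Each element of the resulting list can then be fed independently to translation, summarization, or retrieval pipelines.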
Model tree for motasem/T_ETA
Base model: google-t5/t5-base