Model Card for T_ETA

Model Details

T_ETA: Split-and-Rephrase Model for NLP Preprocessing

T_ETA is a state-of-the-art split-and-rephrase model fine-tuned on the ETA dataset. It simplifies complex sentences into shorter, semantically accurate sentences, making it an ideal pre-processing step for various NLP tasks.

Key Features

  • Sentence Simplification: Breaks down complex sentences while preserving meaning.
  • High-Quality Outputs: Balances simplicity, meaning preservation, and grammaticality.
  • Versatile Applications: Ideal for machine translation, summarization, information retrieval, and more (see the pipeline sketch below).
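
For illustration, here is a minimal sketch of that preprocessing pattern: T_ETA splits a complex sentence, and the shorter sentences are then handed to a downstream task. The use of the text2text-generation and translation pipelines, the t5-small translation checkpoint, and the example sentence are illustrative assumptions; only the "SR: " prompt prefix and the motasem/T_ETA checkpoint come from this card (see "How to Use" below).

from transformers import pipeline

# Split-and-rephrase model from this card, wrapped in a text2text pipeline
splitter = pipeline("text2text-generation", model="motasem/T_ETA")
# Illustrative downstream task: sentence-level English-to-French translation
translator = pipeline("translation_en_to_fr", model="t5-small")

complex_sentence = (
    "Petra, the Nabatean capital dating to around 300 BC, is set in a narrow "
    "valley with tombs, temples and monuments carved into pink sandstone cliffs."
)

# Split the complex sentence; treat each period-delimited piece as one simple sentence
simplified = splitter("SR: " + complex_sentence, max_length=256)[0]["generated_text"]
simple_sentences = [s.strip() + "." for s in simplified.split(".") if s.strip()]

# Feed the shorter, simpler sentences to the downstream component one at a time
for sentence in simple_sentences:
    print(translator(sentence)[0]["translation_text"])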

How to Use

from transformers import T5ForConditionalGeneration, T5Tokenizer

# Load the fine-tuned split-and-rephrase checkpoint and its tokenizer
model = T5ForConditionalGeneration.from_pretrained("motasem/T_ETA")
tokenizer = T5Tokenizer.from_pretrained("motasem/T_ETA")

# Test the model
input_text = "Jordan, an Arab nation on the east bank of the Jordan River, is defined by ancient monuments, nature reserves and seaside resorts, It's home to the famed archaeological site of Petra, the Nabatean capital dating to around 300 BC, Set in a narrow valley with tombs, temples and monuments carved into the surrounding pink sandstone cliffs, Petra earns its nickname, the Rose City."
# Prefix the input with the "SR: " task tag before encoding
input_ids = tokenizer.encode("SR: " + input_text, return_tensors="pt", max_length=1024, truncation=True)
# Generate the simplified sentences with beam search
output_ids = model.generate(input_ids,
                            max_length=1024,
                            num_beams=3,
                            no_repeat_ngram_size=6,
                            pad_token_id=tokenizer.eos_token_id,
                            num_return_sequences=1,
                            early_stopping=True)

# Decode the generated IDs back into text; drop special tokens such as <pad> and </s>
output_text = tokenizer.decode(output_ids[0],
                               skip_special_tokens=True,
                               clean_up_tokenization_spaces=True)
print(output_text)
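
The decoded string contains all of the rewritten sentences in sequence. As a follow-up, here is a hedged post-processing sketch (an assumption, not part of the original card) that separates them so each simple sentence can be passed to a downstream component on its own:

# Split the generated text on sentence-final periods
simple_sentences = [s.strip() for s in output_text.split(".") if s.strip()]
for i, sentence in enumerate(simple_sentences, start=1):
    print(f"{i}. {sentence}.")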

Model Size

223M parameters, stored as F32 safetensors.

Base Model

T_ETA is fine-tuned from google-t5/t5-base.