---
language: en
datasets:
- efra
license: apache-2.0
tags:
- summarization
- flan-t5
- legal
- food
model_type: t5
pipeline_tag: text2text-generation
---
|
|
|
# Flan-T5 Large Fine-Tuned on EFRA Dataset
|
|
|
This is a fine-tuned version of [Flan-T5 Large](https://huggingface.co/google/flan-t5-large) on the **EFRA dataset** for summarizing legal documents related to food regulations and policies.
|
|
|
## Model Description
|
|
|
Flan-T5 is a sequence-to-sequence model trained for text-to-text tasks. This fine-tuned version is specifically optimized for summarizing legal text in the domain of food legislation, regulatory requirements, and compliance documents.
|
|
|
### Fine-Tuning Details

- **Base Model**: [google/flan-t5-large](https://huggingface.co/google/flan-t5-large)
- **Dataset**: EFRA (a curated dataset of legal documents in the food domain)
- **Objective**: Summarization of legal documents
- **Framework**: Hugging Face Transformers
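
The exact training recipe is not documented in this card. For orientation only, a fine-tuning run with these components might look like the sketch below; the dataset identifier (`efra`), the column names `document`/`summary`, and every hyperparameter are assumptions, not the settings used to produce this checkpoint.

```python
from datasets import load_dataset
from transformers import (
    AutoModelForSeq2SeqLM,
    AutoTokenizer,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

base = "google/flan-t5-large"
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForSeq2SeqLM.from_pretrained(base)

# Hypothetical dataset id and column names; adjust to wherever EFRA is hosted.
raw = load_dataset("efra")

def preprocess(batch):
    model_inputs = tokenizer(
        ["summarize: " + doc for doc in batch["document"]],
        max_length=512,
        truncation=True,
    )
    labels = tokenizer(text_target=batch["summary"], max_length=150, truncation=True)
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs

tokenized = raw["train"].map(preprocess, batched=True, remove_columns=raw["train"].column_names)

args = Seq2SeqTrainingArguments(
    output_dir="flan_t5_large_efra",
    learning_rate=3e-5,              # assumed, not the released value
    per_device_train_batch_size=4,   # assumed
    num_train_epochs=3,              # assumed
    predict_with_generate=True,
)

trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=tokenized,
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()
```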
|
|
|
## Applications
|
|
|
This model is suitable for:

- Summarizing legal texts in the food domain
- Extracting key information from lengthy regulatory documents (a sketch for handling long inputs follows this list)
- Assisting legal professionals and food companies in understanding compliance requirements
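
Because the encoder sees at most 512 tokens in the usage example below, longer documents are truncated in a single pass. A minimal map-reduce-style sketch for long inputs follows: summarize each chunk, then summarize the concatenated partial summaries. The chunk size and the two-pass strategy are illustrative choices, not part of the released model.

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "giuid/flan_t5_large_summarization_v2"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

def summarize(text, max_new_tokens=150):
    inputs = tokenizer(text, return_tensors="pt", max_length=512, truncation=True)
    output = model.generate(**inputs, max_new_tokens=max_new_tokens, num_beams=5, early_stopping=True)
    return tokenizer.decode(output[0], skip_special_tokens=True)

def summarize_long(text, chunk_tokens=480):
    # Split on token boundaries so each chunk fits the 512-token encoder window.
    ids = tokenizer(text, truncation=False)["input_ids"]
    chunks = [ids[i:i + chunk_tokens] for i in range(0, len(ids), chunk_tokens)]
    partial = [summarize(tokenizer.decode(c, skip_special_tokens=True)) for c in chunks]
    # Second pass: condense the concatenated chunk summaries into one.
    return summarize(" ".join(partial))
```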
|
|
|
## Example Usage
|
|
|
```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Load the fine-tuned model and tokenizer
model = AutoModelForSeq2SeqLM.from_pretrained("giuid/flan_t5_large_summarization_v2")
tokenizer = AutoTokenizer.from_pretrained("giuid/flan_t5_large_summarization_v2")

# Input text
input_text = "Your lengthy legal document text here..."

# Tokenize (inputs beyond 512 tokens are truncated) and generate a summary
inputs = tokenizer(input_text, return_tensors="pt", max_length=512, truncation=True)
outputs = model.generate(
    **inputs,  # passes input_ids and attention_mask
    max_length=150,
    num_beams=5,
    early_stopping=True,
)

# Decode the summary
summary = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(summary)
```
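
Beam search (`num_beams=5`) usually yields more faithful summaries than greedy decoding at the cost of slower generation. Since inputs beyond 512 tokens are truncated, consider the chunked approach sketched in the Applications section for full-length regulatory documents.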