---
license: apache-2.0
datasets:
- thegoodfellas/mc4-pt-cleaned
language:
- pt
inference: false
metrics:
- bleu
library_name: transformers
pipeline_tag: text2text-generation
---

# Model Card for PT-BR Flan-T5-base

This is the PT-BR Flan-T5-base model, forked from https://huggingface.co/thegoodfellas/tgf-flan-t5-base-ptbr

# Model Details

## Model Description

This model was created to serve as a base for researchers who want to study how Flan-T5 works. This is the Portuguese version.

- **Developed by:** The Good Fellas team
- **Model type:** Flan-T5
- **Language(s) (NLP):** Portuguese (BR)
- **License:** apache-2.0
- **Finetuned from model:** Flan-T5-base

We would like to thank the TPU Research Cloud team for the amazing opportunity given to us. To learn more about TRC: https://sites.research.google/trc/about/

# Uses

This model can be used as a base for downstream tasks, as described in the Flan-T5 paper.
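
As a rough illustration, a downstream task can be framed as text-to-text and fine-tuned from this checkpoint. This is only a minimal sketch, not part of the original card: the task prefix, the example fields, the availability of a tokenizer in the repository, and the `from_flax` conversion are all assumptions.

```
from transformers import AutoTokenizer, T5ForConditionalGeneration

tokenizer = AutoTokenizer.from_pretrained("thegoodfellas/tgf-flan-t5-base-ptbr")
# from_flax=True converts the Flax weights if no PyTorch checkpoint is available (assumption)
model = T5ForConditionalGeneration.from_pretrained(
    "thegoodfellas/tgf-flan-t5-base-ptbr", from_flax=True
)

# Hypothetical summarization example framed as text-to-text
example = {"document": "O texto a ser resumido...", "summary": "Resumo curto."}
inputs = tokenizer("resuma: " + example["document"], return_tensors="pt", truncation=True)
labels = tokenizer(text_target=example["summary"], return_tensors="pt", truncation=True).input_ids

# Fine-tuning minimizes this sequence-to-sequence loss
loss = model(**inputs, labels=labels).loss
```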

# Bias, Risks, and Limitations

Due to the nature of the web-scraped corpus on which Flan-T5 models were trained, it is likely that their usage could reproduce and amplify pre-existing biases in the data, resulting in potentially harmful content such as racial or gender stereotypes and conspiracist views. For this reason, the study of such biases is explicitly encouraged, and model usage should ideally be restricted to research-oriented and non-user-facing endeavors.

## How to Get Started with the Model

Use the code below to get started with the model.

```
from transformers import FlaxT5ForConditionalGeneration

model_flax = FlaxT5ForConditionalGeneration.from_pretrained("thegoodfellas/tgf-flan-t5-base-ptbr")
```
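
A minimal generation sketch follows, assuming the repository also ships the corresponding T5 tokenizer; the Portuguese prompt is purely illustrative.

```
from transformers import AutoTokenizer, FlaxT5ForConditionalGeneration

tokenizer = AutoTokenizer.from_pretrained("thegoodfellas/tgf-flan-t5-base-ptbr")
model = FlaxT5ForConditionalGeneration.from_pretrained("thegoodfellas/tgf-flan-t5-base-ptbr")

# Illustrative prompt: "Translate to English: I like to learn."
inputs = tokenizer("Traduza para o inglês: Eu gosto de aprender.", return_tensors="np")

# Greedy decoding; outputs.sequences holds the generated token ids
outputs = model.generate(inputs["input_ids"], max_length=64)
print(tokenizer.batch_decode(outputs.sequences, skip_special_tokens=True))
```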

# Training Details

## Training Data

The training was performed on two datasets: BrWaC and OSCAR (Portuguese section).
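
For reference, a loading sketch with the `datasets` library; the Hub dataset identifiers and configurations below are assumptions and may not match the exact copies used for training.

```
from datasets import load_dataset

# Assumed Hub ids; BrWaC may additionally require a manual download step
brwac = load_dataset("brwac", split="train")
oscar_pt = load_dataset("oscar", "unshuffled_deduplicated_pt", split="train")
```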

## Training Procedure

We trained this model for 1 epoch on each dataset.

### Training Hyperparameters

Thanks to TPU Research Cloud, we were able to train this model on a single TPUv2-8.

- **Training regime** (a minimal optimizer sketch follows the list):
  - Precision: bf16
  - Batch size: 32
  - LR: 0.005
  - Warmup steps: 10_000
  - Epochs: 1 (each dataset)
  - Optimizer: Adafactor
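
A sketch of how this setup could be expressed with `optax` (the JAX/Flax training stack implied by the checkpoint). Only the values above come from this card; the warmup-then-constant schedule shape and everything else is an assumption.

```
import optax

learning_rate = 0.005
warmup_steps = 10_000

# Linear warmup to the target LR, then hold it constant (assumed shape)
schedule = optax.join_schedules(
    schedules=[
        optax.linear_schedule(init_value=0.0, end_value=learning_rate, transition_steps=warmup_steps),
        optax.constant_schedule(learning_rate),
    ],
    boundaries=[warmup_steps],
)

# Adafactor optimizer driven by the schedule
optimizer = optax.adafactor(learning_rate=schedule)
```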

# Environmental Impact

Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).

Experiments were conducted using Google Cloud Platform in region us-central1, which has a carbon efficiency of 0.57 kgCO$_2$eq/kWh. A cumulative total of 50 hours of computation was performed on hardware of type TPUv2 chip (TDP of 221 W).

Total emissions are estimated to be 6.3 kgCO$_2$eq, of which 100% was directly offset by the cloud provider.
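
For reference, this estimate follows directly from the figures above (hours × TDP × grid carbon efficiency):

```
hours = 50
tdp_kw = 221 / 1000        # TPUv2 chip TDP in kW
kg_co2_per_kwh = 0.57      # us-central1 carbon efficiency

emissions = hours * tdp_kw * kg_co2_per_kwh
print(round(emissions, 1))  # ~6.3 kgCO2eq
```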

- **Hardware Type:** TPUv2
- **Hours used:** 50
- **Cloud Provider:** GCP
- **Compute Region:** us-central1
- **Carbon Emitted:** 6.3 kgCO$_2$eq

# Technical Specifications

## Model Architecture and Objective

Flan-T5: an encoder-decoder (sequence-to-sequence) Transformer trained with a text-to-text objective.