jcblaise/roberta-tagalog-base

jcblaise committed on Nov 12, 2021

Commit

8e039de

1 Parent(s): 4e2d645

Create README.md

Files changed (1)

README.md +30 -0

README.md ADDED

	@@ -0,0 +1,30 @@

+---
+language: tl
+tags:
+- roberta
+- tagalog
+- filipino
+license: cc-by-sa-4.0
+inference: false
+---
+# RoBERTa Tagalog Base Cased
+This is a Tagalog RoBERTa model trained as an improvement over our previous Tagalog pretrained Transformers. It was trained on TLUnified, a newer, larger, and more topically varied pretraining corpus for Filipino. This model is part of a larger research project. We open-source the model to allow greater usage within the Filipino NLP community.
+## Citations
+All model details and training setups can be found in our papers. If you use our model or find it useful in your projects, please cite our work:
+```
+@article{cruz2021improving,
+  title={Improving Large-scale Language Models and Resources for Filipino},
+  author={Jan Christian Blaise Cruz and Charibeth Cheng},
+  journal={arXiv preprint arXiv:2111.06053},
+  year={2021}
+}
+```
+## Data and Other Resources
+Data used to train this model, as well as other benchmark datasets in Filipino, can be found on my website at https://blaisecruz.com
+## Contact
+If you have questions or concerns, or if you just want to chat about NLP and low-resource languages in general, you may reach me through my work email at [email protected]