Model Card: Turkish Scientific RoBERTa ONNX
Model Description
This is the ONNX version of roberta-base-turkish-scientific-cased, specialized for analyzing Turkish scientific text.
Intended Use
- Scientific text analysis in Turkish
- Text comprehension
- Fill-mask predictions (see the example after this list)
- Scientific text summarization
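A minimal fill-mask sketch using the packages listed under Requirements. The local file name `model.onnx` and the assumption that the exported graph declares the same input names the tokenizer produces (`input_ids`, `attention_mask`) are not confirmed by this card; adjust them to the actual export.

```python
import numpy as np
import onnxruntime as ort
from transformers import AutoTokenizer

# Tokenizer comes from the original checkpoint; the ONNX file path is a placeholder.
tokenizer = AutoTokenizer.from_pretrained("serdarcaglar/roberta-base-turkish-scientific-cased")
session = ort.InferenceSession("model.onnx", providers=["CPUExecutionProvider"])

# Turkish scientific sentence with a masked token
# ("In this study, <mask> methods were examined.")
text = "Bu çalışmada <mask> yöntemleri incelenmiştir."
encoded = tokenizer(text, return_tensors="np")

# Feed only the inputs the exported graph actually declares.
ort_inputs = {inp.name: encoded[inp.name] for inp in session.get_inputs()}
logits = session.run(None, ort_inputs)[0]

# Rank vocabulary tokens at the <mask> position and print the top 5 predictions.
mask_pos = int(np.where(encoded["input_ids"][0] == tokenizer.mask_token_id)[0][0])
top_ids = logits[0, mask_pos].argsort()[-5:][::-1]
print(tokenizer.convert_ids_to_tokens(top_ids.tolist()))
```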
Training Data
- Source: Turkish scientific article abstracts from TR Dizin, YÖK Tez, and t.k.
- Training Duration: 3+ days
- Steps: 2M
- Pretrained from scratch; not fine-tuned from an existing checkpoint
Technical Specifications
- Base Architecture: RoBERTa
- Tokenizer: BPE (Byte Pair Encoding)
- Format: ONNX
- Original Model: serdarcaglar/roberta-base-turkish-scientific-cased
Performance and Limitations
- Optimized for the Turkish scientific domain
- Not evaluated on general-domain text
- The ONNX format is intended for efficient inference
Requirements
- onnxruntime (see the quick check below)
- transformers
- torch
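With the packages above installed and a local copy of the exported graph (the file name `model.onnx` is a placeholder, not confirmed by this card), onnxruntime can report which inputs and outputs the model expects before it is wired into a pipeline:

```python
import onnxruntime as ort

# Placeholder path; point this at the ONNX file distributed with this repository.
session = ort.InferenceSession("model.onnx", providers=["CPUExecutionProvider"])

# List declared inputs/outputs so the tokenizer output can be matched to the graph.
for inp in session.get_inputs():
    print("input:", inp.name, inp.shape, inp.type)
for out in session.get_outputs():
    print("output:", out.name, out.shape, out.type)
```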
License and Usage
- The original model's license applies
- Users are responsible for license compliance
Citation
@misc{caglar2024roberta,
  author    = {Çağlar, Serdar},
  title     = {Roberta-base-turkish-scientific-cased},
  year      = {2024},
  publisher = {HuggingFace},
  url       = {https://huggingface.co/serdarcaglar/roberta-base-turkish-scientific-cased}
}
Contact
Serdar ÇAĞLAR ([email protected])