Model Card: Turkish Scientific RoBERTa ONNX

Model Description

This is the ONNX export of roberta-base-turkish-scientific-cased, a RoBERTa model specialized for analyzing Turkish scientific text.

Intended Use

  • Scientific text analysis in Turkish
  • Text comprehension
  • Fill-mask predictions
  • Scientific text summarization

Training Data

  • Source: Turkish scientific article abstracts from TRDizin, YÖK Tez, and t.k.
  • Training Duration: 3+ days
  • Training Steps: 2 million
  • Trained from scratch, not fine-tuned from an existing checkpoint

Technical Specifications

  • Base Architecture: RoBERTa
  • Tokenizer: BPE (Byte Pair Encoding)
  • Format: ONNX
  • Original Model: serdarcaglar/roberta-base-turkish-scientific-cased

Performance and Limitations

  • Optimized for Turkish scientific-domain text
  • Not evaluated on general-domain text
  • The ONNX format is optimized for inference, not further training

Requirements

  • onnxruntime
  • transformers
  • torch
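With these requirements installed, inference can be run directly against the ONNX file. The sketch below shows fill-mask prediction with onnxruntime and the original tokenizer; the local file name model.onnx and the input names input_ids/attention_mask are assumptions based on the standard transformers ONNX export, not details confirmed by this card.

```python
# Minimal fill-mask sketch for the ONNX export (file name and input
# names are assumptions; adjust them to the actual export).
import numpy as np

def top_k_token_ids(logits_row, k=5):
    # Indices of the k highest logits, best first.
    return np.argsort(logits_row)[::-1][:k]

def fill_mask(text, onnx_path="model.onnx", k=5):
    # Imports kept local so the ranking helper above stays dependency-light.
    import onnxruntime as ort
    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(
        "serdarcaglar/roberta-base-turkish-scientific-cased"
    )
    session = ort.InferenceSession(onnx_path)

    inputs = tokenizer(text, return_tensors="np")
    logits = session.run(None, {
        "input_ids": inputs["input_ids"].astype(np.int64),
        "attention_mask": inputs["attention_mask"].astype(np.int64),
    })[0]

    # Locate the <mask> token and rank the vocabulary at that position.
    mask_pos = int(np.where(inputs["input_ids"][0] == tokenizer.mask_token_id)[0][0])
    return tokenizer.convert_ids_to_tokens(top_k_token_ids(logits[0, mask_pos], k))
```

For example, fill_mask("Bu çalışmada <mask> yöntemi kullanılmıştır.") ("In this study, the <mask> method was used.") would return the k most likely tokens for the masked slot.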

License and Usage

  • Follow the original model's license
  • Users are responsible for license compliance

Citation

@misc{caglar2024roberta,
  author = {Çağlar, Serdar},
  title = {Roberta-base-turkish-scientific-cased},
  year = {2024},
  publisher = {HuggingFace},
  url = {https://huggingface.co/serdarcaglar/roberta-base-turkish-scientific-cased}
}

Contact

Serdar ÇAĞLAR ([email protected])
