MahaNews-All-BERT

MahaNews-All-BERT is a MahaBERT(l3cube-pune/marathi-bert-v2) model fine-tuned on full L3Cube-MahaNews-All Corpus, a Marathi document classification dataset.
It is a topic identification cum document classification model with 12 output categories. The model is trained on combined MahaNews-LPC (long doc), MahaNews-SHC (short text), and MahaNews-LPC (medium paragraphs)
[dataset link] (https://github.com/l3cube-pune/MarathiNLP)

More details on the dataset, models, and baseline results can be found in our [paper] (coming soon)
Citing:

@inproceedings{mittal2023l3cube,
  title={L3Cube-MahaNews: News-Based Short Text and Long Document Classification Datasets in Marathi},
  author={Mittal, Saloni and Magdum, Vidula and Hiwarkhedkar, Sharayu and Dhekane, Omkar and Joshi, Raviraj},
  booktitle={International Conference on Speech and Language Technologies for Low-resource Languages},
  pages={52--63},
  year={2023},
  organization={Springer}
}

Other Marathi Sentiment models from MahaNews family are shared here:

MahaNews-LDC-BERT (long documents)
MahaNews-SHC-BERT (short text)
MahaNews-LPC-BERT (medium paragraphs)
MahaNews-All-BERT (all document lengths)

Downloads last month
18
Safetensors
Model size
238M params
Tensor type
I64
·
F32
·
Inference Examples
This model does not have enough activity to be deployed to Inference API (serverless) yet. Increase its social visibility and check back later, or deploy to Inference Endpoints (dedicated) instead.