Model Details (readme.md English Version)
1. Overview
This model is trained to detect whether a Korean sentence contains harmful expressions and, if so, which types of harmful expression it contains.
It performs multi-label classification, deciding whether a sentence contains harmful expressions or is an ordinary sentence.
As an AI task, it corresponds to text-classification (multi-label). The dataset used is TTA-DQA/hate_sentence.
The classes are as follows.
- 0: 'insult'
- 1: 'abuse'
- 2: 'obscenity'
- 3: 'TVPC' #Threats of violence/promotion of crime
- 4: 'sexuality'
- 5: 'age'
- 6: 'race_region' #race and region
- 7: 'disabled'
- 8: 'religion'
- 9: 'politics'
- 10: 'job'
- 11: 'no_hate'
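The class list above can be written as a lookup table. The following is a minimal sketch, assuming targets are represented as 12-dimensional multi-hot vectors in the index order listed; the helper names `encode_labels` and `decode_labels` are hypothetical and not part of the model or dataset.

```python
# Class indices and names exactly as listed in the model card.
ID2LABEL = {
    0: "insult",
    1: "abuse",
    2: "obscenity",
    3: "TVPC",          # threats of violence / promotion of crime
    4: "sexuality",
    5: "age",
    6: "race_region",   # race and region
    7: "disabled",
    8: "religion",
    9: "politics",
    10: "job",
    11: "no_hate",
}
LABEL2ID = {name: idx for idx, name in ID2LABEL.items()}


def encode_labels(names):
    """Turn a list of class names into a 12-dimensional multi-hot vector."""
    vector = [0] * len(ID2LABEL)
    for name in names:
        vector[LABEL2ID[name]] = 1
    return vector


def decode_labels(vector):
    """Turn a multi-hot vector back into the list of active class names."""
    return [ID2LABEL[i] for i, flag in enumerate(vector) if flag]


print(encode_labels(["insult", "age"]))
print(decode_labels([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1]))
```

Because the task is multi-label, a single sentence may activate several classes at once, which is why a multi-hot vector is used here instead of a single class index.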
2. Training Information
- Base Model: KcElectra (a pre-trained Korean language model based on Electra)
- Source: beomi/KcELECTRA-base-v2022 (https://huggingface.co/beomi/KcELECTRA-base-v2022)
- Model Type: ELECTRA discriminator (encoder-only Transformer), fine-tuned for sequence classification
- Pre-training (Korean): approximately 17GB (over 180 million sentences)
- Fine-tuning (hate dataset): approximately 28.9MB (TTA-DQA/hate_sentence)
- Learning Rate: 5e-6
- Weight Decay: 0.01
- Epochs: 30
- Batch Size: 16
- Data Loader Workers: 2
- Tokenizer: BertWordPieceTokenizer
- Model Size: Approximately 511MB
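The hyperparameters listed above map fairly directly onto Hugging Face `TrainingArguments`. The sketch below is not the authors' training script: `build_training_args` and the `output_dir` path are hypothetical, and mapping "Batch Size" to `per_device_train_batch_size` assumes single-device training.

```python
# Hyperparameters copied from the training information above.
HYPERPARAMS = {
    "learning_rate": 5e-6,
    "weight_decay": 0.01,
    "num_train_epochs": 30,
    "per_device_train_batch_size": 16,  # assumption: single-device batch size
    "dataloader_num_workers": 2,
}


def build_training_args(output_dir="./hate-kcelectra"):  # hypothetical path
    """Create TrainingArguments from the listed hyperparameters.

    The import is local so the dictionary above can be inspected
    without transformers installed.
    """
    from transformers import TrainingArguments

    return TrainingArguments(output_dir=output_dir, **HYPERPARAMS)
```

Settings the card does not mention (scheduler, warmup, maximum sequence length, loss function) are left at library defaults here and may differ from the original run.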
3. Requirements
- pytorch ~= 1.8.0
- transformers ~= 4.11.3
- emoji ~= 0.6.0
- soynlp ~= 0.0.493
4. Quick Start
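- python
# Load the tokenizer and the fine-tuned classifier, including its classification head.
from transformers import AutoTokenizer, AutoModelForSequenceClassification
tokenizer = AutoTokenizer.from_pretrained("TTA-DQA/Hate-Detection-MultiLabel-KcElectra-FineTuning")
model = AutoModelForSequenceClassification.from_pretrained("TTA-DQA/Hate-Detection-MultiLabel-KcElectra-FineTuning")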
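Per-class scores come from the logits of a sequence-classification head. The following is a minimal inference sketch, assuming the checkpoint loads with `AutoModelForSequenceClassification`, that its 12 outputs follow the class order in section 1, and that a sigmoid with a 0.5 threshold is an acceptable decision rule; the card does not specify a threshold. The helpers `sigmoid`, `select_labels`, and `predict` are hypothetical names.

```python
import math

MODEL_ID = "TTA-DQA/Hate-Detection-MultiLabel-KcElectra-FineTuning"
# Assumed to match the output order of the classification head.
LABELS = ["insult", "abuse", "obscenity", "TVPC", "sexuality", "age",
          "race_region", "disabled", "religion", "politics", "job", "no_hate"]


def sigmoid(x):
    """Numerically stable logistic function for a single logit."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def select_labels(logits, threshold=0.5):
    """Return (label, probability) pairs whose sigmoid score meets the threshold."""
    scored = [(label, sigmoid(logit)) for label, logit in zip(LABELS, logits)]
    return [(label, p) for label, p in scored if p >= threshold]


def predict(text, threshold=0.5):
    """Run the fine-tuned model on one sentence (downloads weights on first use)."""
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSequenceClassification.from_pretrained(MODEL_ID)
    model.eval()  # disable dropout for inference
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits[0].tolist()
    return select_labels(logits, threshold)
```

Each class is scored independently with a sigmoid instead of a softmax across classes, so a sentence can receive several labels or none. Calling `predict("...")` with a Korean sentence returns the labels that pass the threshold together with their probabilities.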
5. Citation
- This model was built under the Hyperscale AI Training Data Quality Verification project (2024 Hyperscale AI Training Quality Verification).
6. Bias, Risks, and Limitations
- The amount of training data per class is somewhat imbalanced.
- Because of the nature of language and its interpretation, there may be disagreement about the class criteria and the labels assigned.
- Harmful expressions are partly subjective, depending on language, culture, application domain, and personal views, so the results may be biased or open to dispute.
- Therefore, the results cannot serve as an absolute standard for harmful expressions in Korean.
Experimental Results
- type: multi-label classification (text-classification)
- f1-score: 0.8279
- accuracy: 0.7013
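The card does not state how the two metrics are averaged. One common reading for multi-label tasks, shown here purely as an assumption, is a micro-averaged F1 together with subset (exact-match) accuracy, under which accuracy falling below F1 is expected because a sample counts as correct only when all of its labels match. The toy data below is made up for illustration and is not drawn from TTA-DQA/hate_sentence.

```python
def micro_f1(y_true, y_pred):
    """Micro-averaged F1 over all (sample, class) decisions."""
    tp = fp = fn = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            tp += int(t == 1 and p == 1)
            fp += int(t == 0 and p == 1)
            fn += int(t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def subset_accuracy(y_true, y_pred):
    """Fraction of samples whose whole label vector is predicted exactly."""
    exact = sum(list(t) == list(p) for t, p in zip(y_true, y_pred))
    return exact / len(y_true)


# Toy data with 3 classes (made up for illustration, not from the dataset).
y_true = [[1, 0, 0], [1, 1, 0], [0, 0, 1], [0, 1, 0]]
y_pred = [[1, 0, 0], [1, 0, 0], [0, 0, 1], [0, 1, 1]]

print(micro_f1(y_true, y_pred))         # 0.8
print(subset_accuracy(y_true, y_pred))  # 0.5
```

In the toy example, two of the four samples have one wrong label each, which costs little in F1 but halves the exact-match accuracy.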