This model is a fine-tuned version of BERT adapted for multi-label classification in the financial regulatory domain. It starts from the pre-trained ProsusAI/finbert checkpoint and is further fine-tuned on a dataset of financial regulatory texts, so that a single input text can be assigned several relevant categories at once.
Model Architecture
- Base Model: BERT
- Pre-trained Model: ProsusAI/finbert
- Task: Multi-label classification
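In multi-label classification each label is usually scored independently, so the model's raw outputs (logits) are passed through a sigmoid rather than a softmax, and every label above a threshold is returned. The sketch below illustrates that decoding step only. The label names, the logits, and the 0.5 threshold are hypothetical; this card does not list the label set or the threshold actually used.

```python
import math


def sigmoid(x: float) -> float:
    """Map a raw logit to an independent probability in (0, 1), stably."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def decode_multilabel(logits, label_names, threshold=0.5):
    """Return every label whose sigmoid probability reaches the threshold."""
    probs = [sigmoid(z) for z in logits]
    return [name for name, p in zip(label_names, probs) if p >= threshold]


# Hypothetical label names and logits, for illustration only.
labels = ["capital_requirements", "reporting", "consumer_protection"]
logits = [2.1, -0.3, 0.8]

predicted = decode_multilabel(logits, labels)
print(predicted)
```

Because each label is thresholded on its own, the result can contain zero, one, or several labels for the same text.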
Performance
Performance metrics on the validation set:
- F1 Score: 0.8637
- ROC AUC: 0.9044
- Accuracy: 0.6155
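The gap between F1 and accuracy is common in multi-label settings if accuracy is computed as exact-match (subset) accuracy, where a sample counts as correct only when every label is right. The card does not state which averaging or accuracy definition was used, so the sketch below assumes micro-averaged F1 and subset accuracy, and uses made-up toy labels purely to show how the two can diverge.

```python
def micro_f1(y_true, y_pred):
    """Micro-averaged F1: pool TP/FP/FN over all samples and labels."""
    tp = fp = fn = 0
    for t_row, p_row in zip(y_true, y_pred):
        for t, p in zip(t_row, p_row):
            tp += int(t == 1 and p == 1)
            fp += int(t == 0 and p == 1)
            fn += int(t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def subset_accuracy(y_true, y_pred):
    """Fraction of samples whose full label vector is predicted exactly."""
    exact = sum(1 for t, p in zip(y_true, y_pred) if list(t) == list(p))
    return exact / len(y_true)


# Toy data (not from the model): 4 samples, 3 labels.
y_true = [[1, 0, 1], [0, 1, 0], [1, 1, 0], [0, 0, 1]]
y_pred = [[1, 0, 1], [0, 1, 1], [1, 0, 0], [0, 0, 1]]

f1 = micro_f1(y_true, y_pred)
acc = subset_accuracy(y_true, y_pred)
print(f"micro F1 = {f1:.4f}, subset accuracy = {acc:.4f}")
```

In this toy case two of the four samples each have a single wrong label: most individual label decisions are still correct, which keeps F1 high, but both samples count as fully wrong for subset accuracy.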
Limitations and Ethical Considerations
- Performance may vary with the nature of the input text and with the label distribution.
- The dataset is class-imbalanced, so performance on rarely occurring labels may be weaker than the aggregate metrics suggest.
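One common way to reason about class imbalance in a multi-label dataset is to compute, per label, the ratio of negative to positive examples. The sketch below does this on a made-up label matrix. It is not a description of how this model was trained; the card does not say whether any reweighting was applied.

```python
def positive_weights(label_matrix):
    """Per-label ratio of negative to positive examples (neg / pos).

    Labels with no positive example fall back to a weight of 1.0.
    """
    n_samples = len(label_matrix)
    n_labels = len(label_matrix[0])
    weights = []
    for j in range(n_labels):
        pos = sum(row[j] for row in label_matrix)
        neg = n_samples - pos
        weights.append(neg / pos if pos else 1.0)
    return weights


# Toy label matrix (hypothetical): label 0 is frequent, label 1 is rare.
toy_labels = [[1, 0], [1, 0], [1, 1], [0, 0]]

weights = positive_weights(toy_labels)
print(weights)
```

Ratios like these are the kind of values that can be supplied as the `pos_weight` argument of PyTorch's `torch.nn.BCEWithLogitsLoss` to upweight rare labels, should retraining with reweighting be of interest.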
Dataset Information
- Training Dataset: 6562 samples
- Validation Dataset: 929 samples
- Test Dataset: 1884 samples
Training Details
- Training Strategy: Fine-tuning BERT with a randomly initialized classification head.
- Optimizer: Adam
- Learning Rate: 1e-4
- Batch Size: 16
- Number of Epochs: 2
- Evaluation Strategy: Epoch
- Weight Decay: 0.01
- Metric for Best Model: F1 Score
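The hyperparameters above imply a fairly short training run. As a rough worked example, the number of optimizer steps can be derived from the training set size, batch size, and epoch count. This assumes a single device, no gradient accumulation, and that the final partial batch is kept, none of which the card confirms.

```python
import math

# Values taken from this card.
train_samples = 6562
batch_size = 16
num_epochs = 2

# Assumptions: one device, no gradient accumulation, last partial batch kept.
steps_per_epoch = math.ceil(train_samples / batch_size)
total_steps = steps_per_epoch * num_epochs
last_batch_size = train_samples % batch_size or batch_size

print(f"steps per epoch: {steps_per_epoch}")
print(f"total optimizer steps: {total_steps}")
print(f"samples in final batch of each epoch: {last_batch_size}")
```

With evaluation run once per epoch, this setup would produce two evaluation checkpoints, and the one with the higher F1 score would be kept as the best model.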