metadata
language:
- en
tags:
- financial NLP
- named entity recognition
- sequence labeling
- structured extraction
- hierarchical taxonomy
- XBRL
- iXBRL
- SEC filings
- financial-information-extraction
datasets:
- AAU-NLP/HiFi-KPI
model_name: BERT-SL1000
library_name: transformers
pipeline_tag: token-classification
base_model: bert-base-uncased
task_categories:
- token-classification
task_ids:
- named-entity-recognition
- financial-information-extraction
pretty_name: 'BERT-SL1000: Sequence Labeling for Financial KPI Extraction'
size_categories: 1M<n<10M
languages:
- en
dataset_name: HiFi-KPI
model_description: >
BERT-SL1000 is a **BERT-based sequence labeling model** fine-tuned on the
**HiFi-KPI dataset** for extracting
**financial key performance indicators (KPIs)** from **SEC earnings filings
(10-K & 10-Q)**. It specializes in identifying
entities, such as revenue, earnings, and financial ratios, using **token
classification**.
This model is part of the **HiFi-KPI benchmark** and is optimized for
**hierarchical label consistency**.
dataset_link: https://huggingface.co/datasets/AAU-NLP/HiFi-KPI
repo_link: https://github.com/rasmus393/HiFi-KPI
BERT-SL1000
Model Description
BERT-SL1000 is a BERT-based sequence labeling model fine-tuned on the HiFi-KPI dataset to extract financial key performance indicators (KPIs) from SEC earnings filings (10-K and 10-Q). It uses token classification to identify entities such as revenue and earnings.
Use Cases
- Extracting financial KPIs from SEC 10-K and 10-Q reports
- Financial document parsing with iXBRL-based entity recognition
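The use cases above can be sketched with the `transformers` token-classification pipeline. This is a minimal sketch, not the authors' reference code: the Hub id in `MODEL_ID` is an assumption (this card does not state it), and `load_kpi_tagger`, `summarize_entities`, and the `min_score` threshold are hypothetical names introduced for illustration.

```python
from typing import Dict, List

# Assumption: the Hub repository id is not given in this card; replace it with the real one.
MODEL_ID = "AAU-NLP/BERT-SL1000"  # hypothetical id


def load_kpi_tagger(model_id: str = MODEL_ID):
    """Build a token-classification pipeline (downloads weights on first call)."""
    # Imported lazily so the post-processing helper below works without transformers.
    from transformers import pipeline

    # aggregation_strategy="simple" merges word pieces into grouped entities.
    return pipeline(
        "token-classification",
        model=model_id,
        aggregation_strategy="simple",
    )


def summarize_entities(entities: List[Dict], min_score: float = 0.5) -> List[Dict]:
    """Keep confident predictions; return label, surface text and character offsets."""
    return [
        {
            "label": e["entity_group"],
            "text": e["word"],
            "start": e["start"],
            "end": e["end"],
        }
        for e in entities
        if e["score"] >= min_score
    ]


# Example usage (requires network access and the correct model id):
# tagger = load_kpi_tagger()
# rows = summarize_entities(tagger("Total revenue was $4.2 billion for the quarter."))
```

The score threshold is an illustrative choice; for filings with dense tables it may be worth tuning on held-out data.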
Performance
- Trained on the 1,000 most frequent labels in the HiFi-KPI dataset
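To illustrate what restricting training to the most frequent labels can look like, here is a hedged sketch. It is not the HiFi-KPI preprocessing code: the function names are hypothetical, and mapping the remaining labels to an outside tag `"O"` is an assumption, since this card does not say how rarer labels are handled.

```python
from collections import Counter
from typing import Iterable, List, Set


def top_k_labels(
    label_sequences: Iterable[List[str]], k: int = 1000, outside: str = "O"
) -> Set[str]:
    """Return the k most frequent labels, ignoring the outside tag."""
    counts = Counter(
        label
        for seq in label_sequences
        for label in seq
        if label != outside
    )
    return {label for label, _ in counts.most_common(k)}


def restrict_labels(seq: List[str], keep: Set[str], outside: str = "O") -> List[str]:
    """Replace every label not in `keep` with the outside tag (assumed behavior)."""
    return [label if label in keep else outside for label in seq]
```

With `k=1000` this yields a fixed label vocabulary of at most 1,000 entries plus the outside tag, which is the kind of label set a token-classification head would be sized for.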
Dataset & Code
- Dataset: HiFi-KPI on Hugging Face (https://huggingface.co/datasets/AAU-NLP/HiFi-KPI)
- Code Example: HiFi-KPI GitHub Repository (https://github.com/rasmus393/HiFi-KPI)