FinText
AI & ML interests
Natural Language Processing in Finance, Accounting, Business, Management, Economics, and Marketing
FinText: A Specialised Financial LLM Repository
**Stage 1 Release**
We are thrilled to introduce a specialised suite of 68 large language models (LLMs), designed specifically for accounting and finance. The FinText models have been pre-trained on high-quality, domain-specific historical data, addressing challenges such as look-ahead bias and information leakage. These models are intended to improve the accuracy and depth of financial research and analysis.
Key Features:
- Domain-Specific Training: FinText utilises diverse financial datasets including news articles, regulatory filings, transcripts, IP records, key information, board information, speeches (ECB, FED), and major Wikipedia articles.
- Time-Period Specific Models: Separate models are pre-trained for each year from 2007 to 2023, ensuring the utmost precision and historical relevance.
- RoBERTa Architecture: The suite includes both a base model with 125 million parameters and a smaller variant with 51 million parameters.
- Two distinct pre-training durations: We also introduce a series of models to explore the impact of further pre-training.
- Accessibility: The models are pre-trained using BF16, but are released in FP32 format to ensure they are accessible to a broader community, including those without high-end GPUs.
- Sustainability: All electricity used was fully traceable and sourced exclusively from renewable energy.
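As a rough illustration of how the suite is organised, the sketch below enumerates one model per year (2007 to 2023), per size, and per pre-training duration. That grid gives 17 × 2 × 2 = 68 combinations, which is consistent with the 68 models mentioned above. The labels `base`, `small`, `standard`, and `further` and the helper `latest_safe_year` are hypothetical names chosen for this example, not official model identifiers. The helper also assumes that a model labelled with year Y may contain data up to the end of Y, so a text dated in year T is paired with the model for T - 1 to avoid look-ahead bias; check the paper for the exact cut-off convention.

```python
from itertools import product

YEARS = range(2007, 2024)             # 2007..2023 inclusive, per the feature list
SIZES = ("base", "small")             # 125M and 51M parameter variants (labels are hypothetical)
DURATIONS = ("standard", "further")   # two pre-training durations (labels are hypothetical)


def model_grid():
    """Enumerate every (year, size, duration) combination in the suite."""
    return [
        {"year": year, "size": size, "duration": duration}
        for year, size, duration in product(YEARS, SIZES, DURATIONS)
    ]


def latest_safe_year(text_year):
    """Return the latest model year strictly before text_year, or None.

    Assumption: a model labelled Y may have seen data up to the end of Y,
    so only models with Y < text_year are treated as free of look-ahead bias.
    """
    candidates = [year for year in YEARS if year < text_year]
    return max(candidates) if candidates else None


if __name__ == "__main__":
    print(len(model_grid()))        # number of combinations in the grid
    print(latest_safe_year(2015))   # model year to pair with a text from 2015
```

Under these assumptions, a document dated 2015 would be analysed with a 2014 model, and a document from 2007 or earlier has no safe model in the suite.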
For further details and citation information, please refer to the paper, which is accessible from here.
Stay tuned for upcoming updates and new features for FinText. We expect to launch stages 2 and 3 within the next few months.
Developed by:
Alliance Manchester Business School