Model Card for CryptoTrader-LM

The model predicts a trading decision (buy, sell, or hold) for either Bitcoin (BTC) or Ethereum (ETH) based on cryptocurrency news and historical price data. It is fine-tuned with LoRA on the Ministral-8B-Instruct-2410 base model for the FinNLP @ COLING-2025 Cryptocurrency Trading Challenge.

Model Details

Model Description

The model is a LoRA (Low-Rank Adaptation) fine-tune of Ministral-8B-Instruct-2410 that predicts daily cryptocurrency trading decisions (buy, sell, or hold) from real-time news articles and BTC/ETH price data. Its goal is to maximize profitability by making informed trading decisions under volatile market conditions.

Base Model: mistralai/Ministral-8B-Instruct-2410
Fine-tuning Framework: PEFT (Parameter-Efficient Fine-Tuning)
Task: Cryptocurrency Trading Decision-Making (BTC, ETH)
Languages: English (for news article analysis)

Uses

Direct Use

The model can be used to predict daily trading decisions for BTC or ETH based on real-time financial news and historical cryptocurrency price data. It is designed for participants of the FinNLP Cryptocurrency Trading Challenge, but it could also be applied to other cryptocurrency trading contexts.

Downstream Use

The model can be integrated into automated crypto trading systems, agent-based trading platforms (such as FinMem), or used for research in financial decision-making models.

Out-of-Scope Use

This model is not designed for:

Predicting trading decisions for assets other than Bitcoin (BTC) or Ethereum (ETH).
High-frequency trading (HFT); the model is optimized for daily decision-making, not minute-by-minute trading.
Use in non-financial domains. It is not suitable for generic text-generation tasks or sentiment analysis outside of financial contexts.

Bias, Risks, and Limitations

Bias

The model is fine-tuned on specific data (cryptocurrency news and price data) and may not generalize well to other financial markets or different news sources. There could be biases based on the news outlets and timeframes present in the training data.

Risks

Market Volatility: Cryptocurrency markets are inherently volatile. The model’s predictions are based on past data and news, which may not always predict future market conditions accurately.
Decision-making: The model offers trading advice, but users should employ appropriate risk management techniques and not rely solely on the model for financial decisions.

Limitations

The model’s evaluation focuses primarily on risk-adjusted return (Sharpe Ratio) and may not account for other factors such as market liquidity, transaction fees, or slippage.
The model may not perform well in scenarios with significant market regime changes, such as sudden regulatory shifts or unexpected global events.
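As a rough illustration of the first limitation, the sketch below deducts a proportional fee every time the position changes. The function name, the position encoding (1 = long, 0 = flat, -1 = short), and the 0.1% fee rate are assumptions made for this example; they are not part of the challenge's evaluation.

```python
def net_returns(positions, asset_returns, fee_rate=0.001):
    """Daily strategy returns after a proportional fee on each position change.

    positions[i] is the exposure held during day i (1 long, 0 flat, -1 short).
    fee_rate is a hypothetical cost per unit of position change.
    """
    net = []
    prev = 0  # assume the strategy starts flat
    for pos, ret in zip(positions, asset_returns):
        cost = abs(pos - prev) * fee_rate  # fee only when the position changes
        net.append(pos * ret - cost)
        prev = pos
    return net


# Made-up example: enter long, stay long, then exit.
example = net_returns([1, 1, 0], [0.02, -0.01, 0.03])
```

Comparing a metric computed on gross returns with the same metric on these net returns gives a first impression of how sensitive a strategy is to trading costs.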

Recommendations

Risk Management: Users should complement the model’s predictions with traditional risk management strategies and not use the model in isolation for trading.
Bias Awareness: Be aware of potential biases in the news sources and timeframe used in training. The model may underrepresent certain news sources or overemphasize specific types of news.

How to Get Started with the Model

To start using the model for predictions, you can follow the example code below:

from huggingface_hub import login
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer

# Authenticate with the Hugging Face Hub.
login("YOUR TOKEN HERE")

PROMPT = "[INST]YOUR PROMPT HERE[/INST]"
MAX_LENGTH = 32768  # Do not change
DEVICE = "cpu"

model_id = "agarkovv/CryptoTrader-LM"
base_model_id = "mistralai/Ministral-8B-Instruct-2410"

# Load the LoRA adapter together with its base model; the tokenizer comes from the base model.
model = AutoPeftModelForCausalLM.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(base_model_id)

model = model.to(DEVICE)
model.eval()

# Tokenize the prompt, truncating inputs longer than MAX_LENGTH tokens.
inputs = tokenizer(
    PROMPT, return_tensors="pt", padding=False, max_length=MAX_LENGTH, truncation=True
)
inputs = {key: value.to(model.device) for key, value in inputs.items()}

# Generation stops at the end-of-sequence token or after MAX_LENGTH new tokens.
res = model.generate(
    **inputs,
    use_cache=True,
    max_new_tokens=MAX_LENGTH,
)

# The decoded text contains the prompt followed by the model's answer.
output = tokenizer.decode(res[0], skip_special_tokens=True)
print(output)
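The card does not specify the format of the generated text, so turning it into a buy/sell/hold label requires some post-processing. The helper below is a hypothetical sketch: it assumes the decision appears as a standalone word and takes the last such word in the answer. `extract_decision` is an illustrative name, not part of the model's API.

```python
import re
from typing import Optional

# Assumption: the answer contains the decision as a standalone word.
DECISION_PATTERN = re.compile(r"\b(buy|sell|hold)\b", re.IGNORECASE)


def extract_decision(generated_text: str, prompt: str = "") -> Optional[str]:
    """Return 'buy', 'sell', or 'hold' from the generated text, or None."""
    # Decoded output repeats the prompt; drop it when it is an exact prefix.
    if prompt and generated_text.startswith(prompt):
        generated_text = generated_text[len(prompt):]
    matches = DECISION_PATTERN.findall(generated_text)
    # Take the last match, assuming the final mention is the conclusion.
    return matches[-1].lower() if matches else None
```

With the example above, this would be called as `extract_decision(output, PROMPT)`. Because special tokens are skipped during decoding, the prompt may not be an exact prefix of the output, in which case the whole text is searched.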

Training Details

Training Data

The model was fine-tuned on cryptocurrency market data, including:

Cryptocurrency to USD exchange rates for Bitcoin (BTC) and Ethereum (ETH).
News articles: Textual data related to cryptocurrency markets, including news URLs, titles, sources, and publication dates. The dataset was provided in JSON format, where each entry corresponds to a piece of news relevant to the crypto market.

Data Periods:

Training Data: 2022-01-01 to 2024-10-15.

The model was trained to correlate news sentiment, content, and cryptocurrency price trends, aiming to predict optimal trading decisions.

Training Procedure

Preprocessing

Text Preprocessing: The raw news data underwent preprocessing which included text normalization, tokenization, and removal of irrelevant tokens (like stop words and special characters).
Price Data Normalization: Historical price data was normalized to reflect percentage changes over time, making it easier for the model to capture price trends.
Data Alignment: News articles were aligned with the corresponding time periods of price data to enable the model to learn from both data sources simultaneously.
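The price normalization and alignment steps can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the field names `date` and `title`, the sample prices, and the choice to align news by calendar day are all assumptions.

```python
from collections import defaultdict
from datetime import date


def pct_changes(closes: list[float]) -> list[float]:
    """Day-over-day percentage change; entry i refers to day i + 1."""
    return [(curr - prev) / prev * 100.0 for prev, curr in zip(closes, closes[1:])]


def align_news(days: list[date], news: list[dict]) -> dict[date, list[str]]:
    """Group news titles under the calendar day on which they were published."""
    by_day: dict[date, list[str]] = defaultdict(list)
    for item in news:
        by_day[date.fromisoformat(item["date"])].append(item["title"])
    # Days without news map to an empty list.
    return {d: by_day.get(d, []) for d in days}


# Made-up data for illustration only.
days = [date(2024, 1, 1), date(2024, 1, 2), date(2024, 1, 3)]
closes = [100.0, 110.0, 99.0]
news = [{"date": "2024-01-02", "title": "Example headline"}]

changes = pct_changes(closes)     # roughly [10.0, -10.0]
aligned = align_news(days, news)  # headline attached to 2024-01-02
```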

Training Hyperparameters

Batch size: 1
Learning rate: 5e-5
Epochs: 3
Precision: Mixed precision (FP16), which helped speed up training while conserving memory.
Optimizer: AdamW
LoRA Parameters: LoRA rank 8, alpha 16, dropout 0.1
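To give a sense of what rank 8 and alpha 16 mean in practice, the sketch below counts the parameters a single LoRA adapter adds to one square weight matrix and computes the standard LoRA scaling factor alpha / rank. The layer width of 4096 is an assumption used for illustration; the card does not state which layers were adapted.

```python
def lora_trainable_params(d_out: int, d_in: int, rank: int) -> int:
    """Parameters added by one LoRA adapter: A is (rank x d_in), B is (d_out x rank)."""
    return rank * (d_in + d_out)


RANK, ALPHA = 8, 16  # from the hyperparameters above
D_MODEL = 4096       # assumed layer width, for illustration only

full = D_MODEL * D_MODEL                                  # 16,777,216 frozen weights
adapter = lora_trainable_params(D_MODEL, D_MODEL, RANK)   # 65,536 trainable weights
fraction = adapter / full                                 # about 0.39% of the dense layer
scaling = ALPHA / RANK                                    # 2.0, applied to the low-rank update
```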

Speeds, Sizes, Times

Training Time: Approximately 3 hours on a 4x A100 GPU setup.
Model Size: 8B parameters (base model: Ministral-8B-Instruct).
Checkpoint Size: ~16GB, consistent with 8B parameters stored in 16-bit precision (2 bytes each); the LoRA adapter weights by themselves are far smaller.

Evaluation

Testing Data, Factors & Metrics

Testing Data

The model was evaluated on a validation set of cryptocurrency market data (price data and news articles) covering time periods not seen in training.

Factors

The model’s evaluation primarily focuses on:

Profitability: The model’s ability to make profitable trading decisions.
Volatility Handling: How well the model adapts to market volatility.
Timeliness: The ability to react to time-sensitive news.

Metrics

Sharpe Ratio (SR): The main evaluation metric for the challenge. The Sharpe Ratio is used to measure the risk-adjusted return of the model’s trading decisions.
Profit and Loss (PnL): The net profit or loss generated by the model’s trading decisions over a given time period.
Accuracy: The percentage of correct trading decisions (buy/sell/hold) compared to the optimal strategy.
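The three metrics can be computed along the following lines. This is a sketch under stated assumptions: the challenge's exact Sharpe Ratio definition is not given in this card, so the version here annualizes daily returns over 365 days (crypto markets trade every day) with a zero risk-free rate, and PnL is taken as the compounded return.

```python
import math
import statistics


def sharpe_ratio(daily_returns, risk_free_daily=0.0, periods_per_year=365):
    """Annualized Sharpe ratio from daily returns (sample standard deviation)."""
    excess = [r - risk_free_daily for r in daily_returns]
    std = statistics.stdev(excess)
    if std == 0:
        return 0.0  # undefined for constant returns; 0.0 chosen by convention here
    return statistics.mean(excess) / std * math.sqrt(periods_per_year)


def cumulative_pnl(daily_returns):
    """Compounded return over the period, as a fraction of starting capital."""
    total = 1.0
    for r in daily_returns:
        total *= 1.0 + r
    return total - 1.0


def accuracy(predicted, optimal):
    """Share of days on which the predicted action matches the optimal one."""
    return sum(p == o for p, o in zip(predicted, optimal)) / len(optimal)
```

For example, `accuracy(["buy", "hold"], ["buy", "sell"])` is 0.5, and a 10% gain followed by a 10% loss gives a cumulative PnL of about -1%.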

Results

The model achieved a Sharpe Ratio of 0.94 on the validation set, indicating a strong risk-adjusted return. The model demonstrated consistent profitability over the testing period and effectively managed news-based volatility.

Summary

Sharpe Ratio: 0.94
Accuracy: 72%
Profitability: The model’s decisions resulted in an average 8% profit over the testing period.

Model Examination

Initial interpretability studies show that the model places significant weight on news headlines containing strong market sentiment indicators (e.g., "surge", "plummet"). Further analysis is recommended to explore how different types of news (e.g., regulatory updates vs. technical analysis) influence model decisions.

Environmental Impact

Carbon emissions and energy consumption estimates during model training:

Hardware Type: 4x NVIDIA A100 GPUs.
Hours used: ~3 hours of total training time.
Cloud Provider: AWS.
Compute Region: US-East.
Carbon Emitted: Approximately 1.1 kg CO2e, as estimated using the Machine Learning Impact calculator.
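An estimate of this kind is typically the product of GPU count, hours, power draw, and the grid's carbon intensity. The sketch below reproduces that arithmetic; the 250 W per-GPU power draw and the 0.37 kg CO2e/kWh carbon intensity are assumptions chosen for illustration, not values reported by the authors.

```python
NUM_GPUS = 4             # from the card
HOURS = 3.0              # from the card
GPU_POWER_KW = 0.25      # assumption: 250 W per A100
CARBON_INTENSITY = 0.37  # assumption: kg CO2e per kWh for the compute region

energy_kwh = NUM_GPUS * HOURS * GPU_POWER_KW   # 3.0 kWh
emissions_kg = energy_kwh * CARBON_INTENSITY   # about 1.11 kg CO2e
```

Under these assumptions the result lands close to the reported figure of approximately 1.1 kg CO2e.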

Technical Specifications

Model Architecture and Objective

Model Architecture: LoRA fine-tuned version of the Ministral-8B-Instruct-2410 model, a transformer-based architecture optimized for instruction-following tasks.
Objective: To predict daily trading decisions (buy/sell/hold) for BTC/ETH based on financial news and cryptocurrency price data.

Compute Infrastructure

Hardware

Training Hardware: 4x NVIDIA A100 GPUs with 40GB of VRAM.
Inference Hardware: Can be run on a single GPU with at least 24GB of VRAM.
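A back-of-the-envelope check of the inference requirement: the memory needed for the weights alone is the parameter count times the bytes per parameter. The sketch assumes a nominal 8 billion parameters and ignores activations and the KV cache, which need additional memory that grows with prompt length.

```python
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "bfloat16": 2}
NUM_PARAMS = 8e9  # nominal "8B" from the card; the exact count may differ


def weight_memory_gb(num_params: float, dtype: str) -> float:
    """Memory for the weights alone, in decimal gigabytes."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


fp16_gb = weight_memory_gb(NUM_PARAMS, "float16")  # 16.0
fp32_gb = weight_memory_gb(NUM_PARAMS, "float32")  # 32.0
```

In 16-bit precision the weights fit within 24GB with some headroom, whereas 32-bit weights would not.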

Software

Framework: PEFT (Parameter-Efficient Fine-Tuning) with Hugging Face Transformers.
Deep Learning Libraries: PyTorch, Hugging Face Transformers.
Python Version: 3.10

Citation

If you use this model in your work, please cite it as follows:

BibTeX:

@misc{CryptoTrader-LM,
  author = {300k/ns team},
  title = {CryptoTrader-LM: A LoRA-tuned Ministral-8B Model for Cryptocurrency Trading Decisions},
  year = {2024},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/agarkovv/CryptoTrader-LM}},
}

APA:

300k/ns team. (2024). CryptoTrader-LM: A LoRA-tuned Ministral-8B Model for Cryptocurrency Trading Decisions. Hugging Face. https://huggingface.co/agarkovv/CryptoTrader-LM

Glossary

LoRA (Low-Rank Adaptation): A parameter-efficient fine-tuning method that freezes the pretrained weights and trains small low-rank matrices added to selected weight matrices, which greatly reduces the number of trainable parameters and allows faster, more memory-efficient fine-tuning.
BTC: The ticker symbol for Bitcoin, a decentralized cryptocurrency.
ETH: The ticker symbol for Ethereum, a decentralized cryptocurrency and blockchain platform.
Sharpe Ratio (SR): A measure of risk-adjusted return, used to evaluate the performance of an investment or trading strategy.
PnL (Profit and Loss): The financial gain or loss realized from trading over a specific time period.

More Information

For more information on the training process, model performance, or any specific details, please contact the model authors.

Model Card Authors

300k/ns
Contact via Telegram: @allocfree

Model Card Contact

For any inquiries, please contact via Telegram: @allocfree

Framework Versions

PEFT: v0.13.2
Transformers: v4.33.3
PyTorch: v2.1.0