# Model Card for Critical Thinker

## Model Details

### Model Description
The Critical Thinker model is a fine-tuned version of meta-llama/Llama-3.1-8B-Instruct, optimized for developing and evaluating critical thinking and investigative reasoning skills. It is specifically trained on the Critical Thinking Synthetic Dataset, which focuses on logical reasoning, forensic investigation, and multi-layered decision-making scenarios.
- Developed by: Theeseus AI
- Funded by: Independent Research Grant
- Shared by: Theeseus AI
- Model type: Transformer-based Language Model
- Language(s): English
- License: Apache 2.0
- Finetuned from model: meta-llama/Llama-3.1-8B-Instruct
### Model Sources
- Repository: Critical Thinker on HuggingFace
- Dataset: Critical Thinking Dataset
## Uses

### Direct Use
- Critical Thinking Assessments: Evaluating logical reasoning and problem-solving capabilities.
- Digital Forensics Investigations: Testing AI capabilities in analyzing logs, metadata, and cybersecurity incidents.
- AI Research: Studying and benchmarking multi-step reasoning and decision-making models.
### Downstream Use
- Cybersecurity Training Programs: Training AI models to detect vulnerabilities, analyze logs, and identify attack patterns.
- Question-Answering Applications: Developing reasoning-focused QA systems for educational and research tools.
- AI Decision Support Systems: Building AI assistants for forensic investigations and cybersecurity monitoring.
### Out-of-Scope Use
- Tasks requiring real-time decision-making under strict latency or safety constraints.
- Applications involving medical diagnosis or legal interpretations without human oversight.
## Bias, Risks, and Limitations

### Known Limitations
- May misinterpret ambiguous evidence or scenarios that lack sufficient context.
- Performance may degrade on multilingual inputs, as the training data is primarily English.
- Model output can include false positives when assessing evidence in forensic cases.
### Recommendations
- Use outputs as supporting evidence, not definitive conclusions.
- Perform manual validation for high-stakes decision-making.
- Implement bias-checking algorithms when deploying in production environments.
## How to Get Started with the Model
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("theeseus-ai/CriticalThinker")
model = AutoModelForCausalLM.from_pretrained("theeseus-ai/CriticalThinker")

input_text = "Investigate unusual logins from multiple IP addresses in a network."
inputs = tokenizer(input_text, return_tensors="pt")

# Cap the generation length; without max_new_tokens, generate() stops at the model's default limit.
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
## Training Details

### Training Data
The model is fine-tuned on the Critical Thinking Synthetic Dataset available at HuggingFace. The dataset simulates digital forensics, cybersecurity incidents, and logical deduction scenarios.
### Training Procedure

#### Preprocessing
- Cleaned and validated JSONL format.
- Schema enforcement to ensure consistency.
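A minimal sketch of that cleaning and schema-enforcement step. The field names (`prompt`, `response`) are assumptions for illustration; the dataset's actual schema is not specified in this card.

```python
import json

REQUIRED_KEYS = {"prompt", "response"}  # assumed schema; adjust to the dataset's actual fields

def validate_jsonl(path):
    """Yield only records that parse as JSON and contain all required keys."""
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            try:
                record = json.loads(line)
            except json.JSONDecodeError:
                continue  # drop malformed lines entirely
            if REQUIRED_KEYS <= record.keys():
                yield record
```

Malformed or incomplete records are dropped rather than repaired, which keeps the surviving examples schema-consistent.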
#### Hyperparameters
- Optimizer: AdamW
- Batch Size: 16
- Learning Rate: 2e-5
- Epochs: 3
- Precision: bfloat16 (bf16) mixed precision
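The hyperparameters above can be expressed as a HuggingFace `TrainingArguments` configuration. This is a hedged sketch: the output directory is an assumption, and whether batch size 16 was per device or global is not stated in this card.

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="critical-thinker-ft",  # assumed name
    per_device_train_batch_size=16,    # "Batch Size: 16"; per-device vs. global is an assumption
    learning_rate=2e-5,
    num_train_epochs=3,
    bf16=True,                         # bfloat16 mixed precision
    optim="adamw_torch",               # AdamW optimizer
)
```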
#### Compute Resources
- Hardware: NVIDIA A100 (80 GB) GPU
- Training Time: ~24 hours
## Evaluation

### Testing Data, Factors & Metrics

#### Testing Data
The dataset was split into 80% training, 10% validation, and 10% testing sets.
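A minimal sketch of an 80/10/10 split, assuming a seeded shuffle for reproducibility (the card does not state how the split was produced):

```python
import random

def split_dataset(records, seed=42):
    """Shuffle and split records 80/10/10 into train/validation/test."""
    records = list(records)
    random.Random(seed).shuffle(records)  # fixed seed so the split is reproducible
    n = len(records)
    n_train = int(0.8 * n)
    n_val = int(0.1 * n)
    return (records[:n_train],
            records[n_train:n_train + n_val],
            records[n_train + n_val:])
```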
#### Metrics
- Accuracy: Measures correctness of predictions.
- F1 Score: Evaluates precision and recall balance.
- Log-likelihood Loss: Assesses model confidence and robustness.
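For reference, the first two metrics can be computed directly. This sketch assumes binary labels for the F1 computation; the card does not describe how multi-choice answers were mapped to labels.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the reference labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```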
### Results
- Accuracy: 89.4%
- F1 Score: 88.7%
- Log-likelihood Loss: 0.21
### Summary
The model demonstrates high performance in logical deduction tasks and multi-choice reasoning problems. It is particularly effective in identifying patterns in digital forensics scenarios.
## Environmental Impact
Carbon emissions estimated using the Machine Learning Impact calculator:
- Hardware Type: NVIDIA A100 GPU
- Hours Used: 24
- Cloud Provider: AWS
- Compute Region: US-East
- Carbon Emitted: ~30 kg CO2eq
## Technical Specifications

### Model Architecture and Objective
- Architecture: Transformer-based autoregressive model (decoder-only).
- Objective: Minimize cross-entropy loss for sequence prediction.
### Compute Infrastructure
- Hardware: NVIDIA A100 (80 GB) GPUs.
- Frameworks: PyTorch and HuggingFace Transformers.
## Citation
If you use this model, please cite it as follows:
```bibtex
@misc{critical_thinker,
  author    = {Theeseus AI},
  title     = {Critical Thinker Model},
  year      = {2024},
  version   = {1.0},
  publisher = {HuggingFace Models},
  url       = {https://huggingface.co/theeseus-ai/CriticalThinker}
}
```
## Contact
For questions or contributions, contact:
- Email: [email protected]
- LinkedIn: Theeseus