---
extra_gated_prompt: |-
  By accessing TabPFN, you agree to:
  1. Not use the model in ways that could harm individuals or communities
  2. Comply with all applicable laws and regulations
  3. Properly cite the model and its creators in any resulting publications
  4. Report any discovered vulnerabilities or safety concerns to Prior Labs
extra_gated_fields:
  Organization:
    type: text
    required: true
    description: Company or institution you represent
  Role:
    type: text
    required: true
    description: Your role in the organization
  Country:
    type: country
    required: true
    description: Country where you or your organization is based
  Intended Use:
    type: select
    required: true
    options:
      - Academic Research
      - Education/Teaching
      - Commercial Evaluation
      - Non-profit Use
      - Personal Learning
      - label: Other
        value: other
    description: Primary intended use of TabPFN
  Industry:
    type: select
    required: true
    options:
      - Healthcare/Life Sciences
      - Financial Services
      - Technology
      - Education
      - Manufacturing
      - Research Institution
      - label: Other
        value: other
    description: Your industry sector
  Dataset Size:
    type: select
    required: true
    options:
      - <1000 rows
      - 1000-10000 rows
      - 10000-100000 rows
      - '>100000 rows'
    description: Typical size of datasets you plan to use
  License Agreement:
    type: checkbox
    required: true
    label: >-
      I agree to the terms of the non-commercial license for research and
      evaluation
  Contact Permission:
    type: checkbox
    required: false
    label: Prior Labs may contact me about my use case and provide support (optional)
pipeline_tag: tabular-classification
---

# Model Card for TabPFN-v2

TabPFN is a transformer-based foundation model for tabular data. It builds on prior-data fitted networks (PFNs) to achieve strong performance on small tabular datasets without task-specific training.

## Model Details

### Model Description

TabPFN combines a transformer architecture with prior knowledge injection to provide a foundation model designed specifically for tabular data tasks.

- **Developed by:** Prior Labs
- **Model type:** Transformer-based foundation model for tabular data
- **Language(s):** Python
- **License:** Dual licensing: open source for research and non-commercial use
- **Finetuned from model:** None; custom architecture, trained from scratch

### Model Sources

- **Repository:** https://github.com/priorlabs/tabpfn
- **Paper:** [More Information Needed]
- **Demo:** Available via API access

## Uses

### Direct Use

TabPFN can be directly used for:

- Classification tasks on small to medium-sized tabular datasets
- Automated machine learning workflows
- Quick prototyping and baseline model creation
- Transfer learning applications for tabular data

### Downstream Use

The model can be used as:

- A feature extractor for downstream tasks
- A foundation for transfer learning on domain-specific tabular data
- A component in automated ML pipelines
- A baseline model for benchmarking
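
As a rough illustration of the benchmarking use, the sketch below compares classifiers with scikit-learn cross-validation. The `benchmark` helper, the choice of dataset, and the stand-in estimators are assumptions made for this example and are not part of TabPFN. The commented-out entry assumes `TabPFNClassifier` can be passed to `cross_val_score` through its `fit`/`predict` interface.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.dummy import DummyClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def benchmark(estimators, X, y, cv=5):
    """Return the mean cross-validated accuracy for each named estimator.

    Hypothetical helper for illustration; not part of the tabpfn package.
    """
    return {
        name: cross_val_score(est, X, y, cv=cv, scoring="accuracy").mean()
        for name, est in estimators.items()
    }


# Small example dataset bundled with scikit-learn (stands in for your own data)
X, y = load_breast_cancer(return_X_y=True)

estimators = {
    "dummy": DummyClassifier(strategy="most_frequent"),
    "logreg": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    # Assumption: TabPFNClassifier can be added here as another entry, e.g.
    # "tabpfn": TabPFNClassifier(),
}

results = benchmark(estimators, X, y)
for name, score in results.items():
    print(f"{name}: {score:.3f}")
```

Keeping the comparison inside one helper means every candidate is scored on the same folds and metric, which makes the baseline numbers directly comparable.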

### Out-of-Scope Use

The model is not designed for:

- Very large datasets (currently optimized for smaller datasets)
- Non-tabular data formats
- Time series forecasting
- Direct regression tasks

## Bias, Risks, and Limitations

- Performance may vary based on dataset size and characteristics
- Model behavior heavily depends on the quality and representativeness of training data
- May not perform optimally on highly imbalanced datasets
- Resource intensive for very large datasets

### Recommendations

- Use on datasets with clear structure and well-defined features
- Validate model outputs, especially in sensitive applications
- Consider dataset size limitations when applying the model
- Monitor performance across different subgroups in the data
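
Subgroup monitoring can be as simple as splitting an evaluation metric by a grouping column. The sketch below computes per-group accuracy from plain Python lists. The `accuracy_by_group` helper and the toy data are assumptions for illustration. In practice, `y_pred` would come from the classifier's `predict` output.

```python
from collections import defaultdict


def accuracy_by_group(y_true, y_pred, groups):
    """Return accuracy per subgroup (hypothetical helper for illustration)."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for truth, pred, group in zip(y_true, y_pred, groups):
        totals[group] += 1
        hits[group] += int(truth == pred)
    return {group: hits[group] / totals[group] for group in totals}


# Toy labels, predictions, and a subgroup attribute for each row
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 0, 1, 0, 0, 1, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

per_group = accuracy_by_group(y_true, y_pred, groups)
print(per_group)  # {'a': 0.75, 'b': 0.5}
```

A large gap between groups, as in this toy example, is a signal to inspect the affected subgroup before relying on the model's outputs.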
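
## How to Get Started with the Model

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from tabpfn import TabPFNClassifier

# Example data (assumption: a small scikit-learn dataset stands in for your own)
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# Initialize model
classifier = TabPFNClassifier()

# Fit and predict
classifier.fit(X_train, y_train)
predictions = classifier.predict(X_test)
```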

## Training Details

### Training Data

[More Information Needed]

### Training Procedure

#### Training Hyperparameters

- **Training regime:** Mixed precision training

## Evaluation

### Testing Data, Factors & Metrics

#### Metrics

- Classification accuracy
- F1 score
- ROC-AUC
- Precision-Recall curves
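
The metrics listed above can be computed with scikit-learn. The following sketch uses hand-written labels and scores so that it runs without the model. The data and the 0.5 decision threshold are assumptions for illustration. In practice, `y_score` would be the classifier's predicted probability for the positive class.

```python
import numpy as np
from sklearn.metrics import (
    accuracy_score,
    f1_score,
    precision_recall_curve,
    roc_auc_score,
)

# Toy ground truth and positive-class scores (illustrative values only)
y_true = np.array([0, 0, 1, 1, 0, 1, 1, 0])
y_score = np.array([0.1, 0.4, 0.35, 0.8, 0.2, 0.9, 0.6, 0.55])

# Hard predictions from an assumed 0.5 threshold
y_pred = (y_score >= 0.5).astype(int)

accuracy = accuracy_score(y_true, y_pred)
f1 = f1_score(y_true, y_pred)

# Ranking metrics use the continuous scores, not the thresholded labels
roc_auc = roc_auc_score(y_true, y_score)
precision, recall, thresholds = precision_recall_curve(y_true, y_score)

print(f"accuracy={accuracy:.3f} f1={f1:.3f} roc_auc={roc_auc:.3f}")
```

Accuracy and F1 depend on the chosen threshold, while ROC-AUC and the precision-recall curve summarize behavior across all thresholds, which is why they are computed from the scores.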

### Results

[More Information Needed]

## Environmental Impact

- **Hardware Type:** [More Information Needed]
- **Hours used:** [More Information Needed]
- **Cloud Provider:** [More Information Needed]
- **Compute Region:** [More Information Needed]
- **Carbon Emitted:** [More Information Needed]

## Technical Specifications

### Model Architecture and Objective

TabPFN uses a transformer-based architecture designed specifically for tabular data, with modifications to handle varying input sizes and feature types.

### Compute Infrastructure

#### Hardware

Recommended minimum specifications:

- CPU: Modern multi-core processor
- RAM: 16GB+
- GPU: Optional, CPU inference supported

#### Software

- Python 3.7+
- Key dependencies: PyTorch, NumPy, Pandas

## Model Card Contact

For more information, contact Prior Labs.