---
library_name: transformers
tags:
- MOE
- GPT-2
- tabular
- generative
- causalLM
pipeline_tag: tabular-regression
---

# Tabby Model Card

Tabby is a post-training architecture modification for Transformer-based large language models, enabling their use for **tabular dataset synthesis**. This demo checkpoint is based on [DistilGPT-2](https://huggingface.co/distilbert/distilgpt2) and fine-tuned on the [UCI Diabetes dataset](https://www.openml.org/search?type=data&sort=version&status=any&order=asc&exact_name=diabetes&id=37) using our novel Plain training method, as an example of Tabby’s tabular synthesis capabilities.

Tabby enhances transformer-based LLMs by incorporating **Mixture of Experts (MoE) layers**, allowing them to better model structured data.

🐱 **Check out our [blog](https://sprocketlab.github.io/posts/2025/02/tabby/) or [paper](https://arxiv.org/abs/2503.02152) for more details and our [GitHub repo](https://github.com/soCromp/tabby) for code to use the model!**

- **Developed by:** University of Wisconsin-Madison
- **Shared by:** Sonia Cromp et al.
- **Model type:** MoE-enhanced GPT-2-based causal language model for tabular data
- **License:** MIT
- **Finetuned from model:** [`distilgpt2`](https://huggingface.co/distilbert/distilgpt2)

## Uses

### How to Use

[This demo notebook](https://github.com/soCromp/tabby/blob/main/demo.ipynb) loads the model checkpoint provided here and uses it to perform synthesis. To get started, follow the [environment setup instructions](https://github.com/soCromp/tabby/tree/main) in the GitHub readme. Illustrative (non-authoritative) code sketches also appear under "Example Sketches" at the end of this card.

### Direct Use

This Tabby checkpoint can be used for:

- High-fidelity synthesis of diabetes patient records based on the [UCI Diabetes dataset](https://www.openml.org/search?type=data&sort=version&status=any&order=asc&exact_name=diabetes&id=37).
- Data augmentation for training machine learning models on the UCI Diabetes dataset.
- Comparison with other tabular synthesis approaches.

### Downstream Use

- Further fine-tuning on other structured datasets (e.g., financial records, medical records, or survey data); see the serialization sketch at the end of this card.
- Generating synthetic tabular data for privacy-preserving machine learning.

## Bias, Risks, and Limitations

This Tabby checkpoint inherits biases from the GPT-2 architecture and the UCI Diabetes dataset used for fine-tuning. Considerations include those common to all generative models, such as:

- Biases in synthetic feature distributions, which may reflect real-world disparities present in the dataset.
- Hallucinated records whose values do not perfectly match real-world distributions.

## Citation

If you use Tabby, please cite:

```bibtex
@article{cromp2025tabby,
  title={Tabby: Tabular Data Synthesis with Language Models},
  author={Sonia Cromp and Satya Sai Srinath Namburi GNVV and Mohammed Alkhudhayri and Catherine Cao and Samuel Guo and Nicholas Roberts and Frederic Sala},
  journal={arXiv preprint arXiv:2503.02152},
  year={2025},
  url={https://arxiv.org/abs/2503.02152}
}
```

## Model Card Contact

For questions or collaborations, please reach out to [Sonia Cromp](https://socromp.github.io) at [cromp@wisc.edu](mailto:cromp@wisc.edu).
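
## Example Sketches

The snippets below are illustrative only; the [demo notebook](https://github.com/soCromp/tabby/blob/main/demo.ipynb) and the GitHub repo are the authoritative references.

### Loading and sampling (sketch)

A minimal sketch of drawing synthetic rows from this checkpoint via the Hugging Face `transformers` API. The repository id, decoding settings, and token budget below are assumptions, not taken from this card; because Tabby replaces layers with MoE variants, loading may additionally require the custom code from the GitHub repo rather than a plain `from_pretrained` call.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "soCromp/tabby-distilgpt2-diabetes"  # hypothetical repo id, for illustration

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
# Tabby adds MoE layers post hoc, so the checkpoint may need custom loading
# code from the GitHub repo; trust_remote_code is one common mechanism (assumption).
model = AutoModelForCausalLM.from_pretrained(MODEL_ID, trust_remote_code=True)
model.eval()

# Start generation from the BOS token and let the model emit one serialized row.
inputs = tokenizer(tokenizer.bos_token, return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_new_tokens=64,   # rough budget for one serialized row (assumption)
    do_sample=True,      # sample rather than greedy-decode, for diverse rows
    top_p=0.95,
    pad_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

The decoded string is one serialized table row; the exact column order and separators are fixed by the training code in the repo, so parse it with the utilities provided there rather than ad-hoc string splitting.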
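
### Serializing rows for downstream fine-tuning (sketch)

If you fine-tune on another structured dataset (see Downstream Use), each table row must first be rendered as text. The toy columns and the "name is value" format below are hypothetical stand-ins; use whatever serialization the Tabby training code actually implements.

```python
import pandas as pd

# Hypothetical toy table standing in for your own structured dataset.
df = pd.DataFrame({
    "age": [50, 31],
    "bmi": [33.6, 26.6],
    "outcome": ["positive", "negative"],
})

def serialize_row(row: pd.Series) -> str:
    """Render one table row as a single training string (illustrative format)."""
    return ", ".join(f"{col} is {val}" for col, val in row.items())

training_texts = [serialize_row(row) for _, row in df.iterrows()]
print(training_texts[0])  # -> "age is 50, bmi is 33.6, outcome is positive"
```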