metadata
license: apache-2.0
language:
- en
datasets:
- abideen/Cosmopedia-100k-pretrain
metrics:
- accuracy
library_name: transformers
Model Card for Luxeai-anu-1-bit-70M
Model Name
Luxeai-anu-1-bit-70M
Model Description
The Luxeai-anu-1-bit-70M Large Language Model (LLM) is my first attempt at implementing a 1-bit LLM, based on the paper "The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits". I used the pre-trained Mistral-7B-v0.3 and the abideen/Cosmopedia-100k-pretrain dataset. This initial model was trained on a Microsoft Azure Standard_NC6s_v3 instance (6 cores, 112 GB RAM, 736 GB storage, 1 x NVIDIA Tesla V100). I plan to train on a much larger dataset once I get sponsorship for an 8x DGX system. I tested the model on a subset of the same dataset.
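For readers new to the idea, the "1.58 bits" in the paper's title comes from each weight taking one of three values, {-1, 0, +1}, and log2(3) is about 1.58. The snippet below is a minimal, dependency-free sketch of the absmean ternary quantization described in that paper. It is not the training code used for this model; the function names `absmean_ternary_quantize` and `dequantize`, the `eps` value, and the toy weight matrix are all assumptions made for illustration.

```python
def absmean_ternary_quantize(weights, eps=1e-5):
    """Quantize a 2-D weight matrix (a list of rows) to values in {-1, 0, +1}.

    Illustrative only: scale by the mean absolute weight, round to the
    nearest integer, then clip to [-1, 1].
    Returns (ternary, scale) so the weights can be approximately restored.
    """
    flat = [w for row in weights for w in row]
    scale = sum(abs(w) for w in flat) / len(flat)  # mean absolute value
    ternary = [
        [max(-1, min(1, round(w / (scale + eps)))) for w in row]
        for row in weights
    ]
    return ternary, scale


def dequantize(ternary, scale):
    """Approximate the original weights from the ternary values and scale."""
    return [[t * scale for t in row] for row in ternary]


# Toy example (hypothetical values, not taken from the model).
weights = [[0.9, -0.05, -1.2], [0.3, 0.0, -0.6]]
ternary, scale = absmean_ternary_quantize(weights)
approx = dequantize(ternary, scale)
print(ternary)  # [[1, 0, -1], [1, 0, -1]]
```

In a real implementation this would be applied per weight tensor inside the linear layers during training, usually with a straight-through estimator for the gradients; the sketch only shows the forward quantization step.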
Intended Use
- Task: text generation
How to Use
Use the code below to run and test the model in a Python Jupyter notebook:
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline
from transformers.models.llama.modeling_llama import *
# Load the tokenizer and model from the Hugging Face Hub
model_id = "arunb74/Luxeai-anu-1-bit-70M"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)
# Create a text generation pipeline
pipe = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    device_map="auto"
)
prompt = "The LISA Pathfinder scientific collaboration will meet in Trento"
sequences = pipe(
    f"<s>[INST] {prompt} [/INST]",
    do_sample=True,
    max_new_tokens=100,
    temperature=0.7,
    top_k=50,
    top_p=0.95,
    num_return_sequences=1,
)
print(sequences[0]['generated_text'])
"""
The output will be as follows - <s>[INST] The LISA Pathfinder scientific collaboration will meet in Trento [/INST]
The LISA Pathfinder Biology, a leading provider of biochemistry and molecular biology, provides a comprehensive understanding of the mechanisms and mechanisms of the LISA pathways. The LISA Pathfinder Biology, a researcher specializing in molecular biology, is a clinical trial of the disease, and its pathophysiology, and a combination of the most commonly used and widely used treatments. It is a relatively simple procedure that involves two steps.
# I would welcome help from community members: feedback, suitable datasets for further training, testing, and evaluation.
"""