nickmalhotra committed
Commit: aee0eac
Parent(s): 4144863
Update README.md
README.md
CHANGED
@@ -348,6 +348,11 @@ The pre-training and fine-tuning of Project Indus LLM were conducted on high-per
  - **Nodes and GPUs**: Utilization of six nodes, each equipped with eight NVIDIA A100 GPUs. These GPUs are state-of-the-art for machine learning tasks and provide the necessary computational power to handle the large volumes of data and complex model architectures.
  - **Memory and Storage**: Each node was equipped with ample memory and storage to handle the datasets and model parameters efficiently. Specific configurations included 40 GB of GPU memory per card, essential for training large models.
+
+ Inference performance was tested on GPU as well as CPU.
+ - **GPU**: On an NVIDIA GeForce RTX 3070, inference of 250-350 tokens took roughly 5-10 seconds.
+ - **CPU**: On an Intel Xeon(R) Platinum 8580, throughput exceeded 30 tokens/second, comparable to the GPU.
+
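The diff does not show how these throughput figures were gathered; a minimal timing harness along the following lines is one way to measure tokens/second (the `benchmark_generation` helper and its `generate` callable are illustrative sketches, not part of Project Indus):

```python
import time
from typing import Callable

def benchmark_generation(generate: Callable[[], int], runs: int = 3) -> float:
    """Average throughput in tokens/second over several runs.

    `generate` is any callable that performs one inference pass and
    returns the number of tokens it produced.
    """
    total_tokens = 0
    total_seconds = 0.0
    for _ in range(runs):
        start = time.perf_counter()
        total_tokens += generate()          # one full generation pass
        total_seconds += time.perf_counter() - start
    return total_tokens / total_seconds
```

Wrapping a call to the model's generation routine in such a harness, and counting the tokens it returns, would yield a tokens/second figure directly comparable to the numbers quoted above.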
  ##### Software

  The software environment was crucial for efficiently training and running the model. Key components included: