# Gemma Causal Language Model (GemmaCausalLM)

This repository contains the configuration and metadata for the GemmaCausalLM model, a causal language model designed for NLP tasks such as text generation, dialogue systems, and autoregressive language modeling.
## Model Overview

### Core Architecture

The GemmaCausalLM combines a robust backbone with a preprocessor, providing an efficient setup for NLP tasks. Below are its key components.

**Backbone (`GemmaBackbone`):**
- Vocabulary Size: 256,000 tokens
- Model Depth: 26 layers
- Attention Configuration:
  - Query Heads: 8
  - Key-Value Heads: 4
  - Head Dimension: 256
  - Sliding Window Attention: enabled (window size: 4,096)
- Dimensions:
  - Hidden Dimension: 2,304
  - Intermediate Dimension: 18,432
- Normalization and Regularization:
  - Layer normalization (epsilon: 1e-6)
  - Post-feedforward and post-attention normalization enabled
- Soft Caps:
  - Final Logit Soft Cap: 30.0
  - Attention Logit Soft Cap: 50.0
- Dropout: disabled
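The soft caps above implement Gemma 2-style tanh capping, which smoothly squashes logits into the open interval `(-cap, cap)` instead of letting them grow unbounded. A minimal sketch of the formula (the example values are illustrative, not taken from the model):

```python
import math

def soft_cap(logit: float, cap: float) -> float:
    """Squash a logit into (-cap, cap) via tanh capping: cap * tanh(logit / cap)."""
    return cap * math.tanh(logit / cap)

# Attention logits use cap 50.0; final output logits use cap 30.0.
print(soft_cap(10.0, 50.0))   # small logits pass through almost unchanged
print(soft_cap(500.0, 50.0))  # large logits saturate just below the cap
```

For small inputs the function is nearly the identity, so capping only affects outlier logits.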
**Preprocessor (`GemmaCausalLMPreprocessor`):**

- Tokenizer (`GemmaTokenizer`):
  - Configuration file: `tokenizer.json`
  - Adds BOS (beginning-of-sequence) and EOS (end-of-sequence) tokens
- Sequence Length: 512
- Data Types:
  - `float32` for preprocessor computations
  - `int32` for tokenized inputs
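Schematically, the preprocessor turns a list of token IDs into a fixed-length `int32` sequence by adding BOS/EOS and padding. The sketch below mimics that packing logic in plain Python; the token IDs and special-token values are illustrative placeholders, not the real tokenizer's output:

```python
def pack_sequence(token_ids, sequence_length=512, bos_id=2, eos_id=1, pad_id=0):
    """Illustrative sketch of preprocessor packing: prepend BOS, append EOS,
    then truncate or right-pad to a fixed sequence length."""
    packed = [bos_id] + list(token_ids) + [eos_id]
    packed = packed[:sequence_length]
    return packed + [pad_id] * (sequence_length - len(packed))

print(pack_sequence([101, 202, 303], sequence_length=8))
# [2, 101, 202, 303, 1, 0, 0, 0]
```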
## Metadata

- Keras Version: 3.5.0
- KerasHub Version: 0.17.0
- Parameter Count: 2,617,270,528 (~2.6 billion)
- Date Saved: 2024-11-18 @ 13:59:51

This metadata supports reproducibility and documents the scale of the model.
## Applications

This model is designed for tasks requiring causal language modeling, including but not limited to:

- Text generation
- Dialogue systems
- Autoregressive NLP tasks
## Model Files

- Backbone Configuration: core architecture details for `GemmaBackbone`
- Preprocessor Configuration: tokenization and sequence preprocessing setup
- Tokenizer File: `tokenizer.json`
- Preprocessor File: `preprocessor.json`
## Setup and Usage

**Dependencies.** Ensure the following libraries are installed:

```shell
pip install keras keras_hub
```

**Model Loading.** Load the model and its attached preprocessor together through the public `keras_hub.models` API (rather than the private `keras_hub.src` import path) from a local preset directory or hub handle:

```python
import keras_hub

model = keras_hub.models.GemmaCausalLM.from_preset("path/to/preset")
```

**Inference.** Because the preprocessor is attached to the model, raw strings can be passed directly to `generate`, which handles tokenization and decoding:

```python
output = model.generate("Your input text here.", max_length=64)
print(output)
```
## Contributions

Feel free to contribute to this repository by improving configurations, extending functionality, or reporting issues.

## License

This project is licensed under the MIT License. See the LICENSE file for details.
Model tree for p2kalita/PolicyLens:

- Base model: google/gemma-2-2b