Gemma Causal Language Model (GemmaCausalLM)

This repository contains the configuration and metadata for PolicyLens, a GemmaCausalLM fine-tuned from google/gemma-2-2b for causal language modeling tasks such as text generation, dialogue systems, and other autoregressive NLP workloads.


Model Overview

1. Core Architecture

The GemmaCausalLM pairs a GemmaBackbone with a dedicated preprocessor, providing an efficient setup for NLP tasks. Its key components are listed below; a short code sketch follows each list.

Backbone (GemmaBackbone):

  • Vocabulary Size: 256,000 tokens.
  • Model Depth: 26 layers.
  • Attention Configuration:
    • Query Heads: 8
    • Key-Value Heads: 4
    • Head Dimension: 256
    • Sliding Window Attention: Enabled (window size: 4096).
  • Dimensions:
    • Hidden Dimension: 2,304
    • Intermediate Dimension: 18,432
  • Normalization and Regularization:
    • Layer Normalization (Epsilon: 1e-6).
    • Post-feedforward and post-attention normalization enabled.
  • Soft Caps:
    • Final Logit Soft Cap: 30.0
    • Attention Logit Soft Cap: 50.0
  • Dropout: Disabled.
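
For reference, the hyperparameters above map onto the GemmaBackbone constructor roughly as follows. This is a minimal sketch; the keyword names are assumptions based on the Keras Hub Gemma implementation, not values read from this repository:

    import keras_hub

    # Instantiating allocates ~2.6 B randomly initialized parameters.
    backbone = keras_hub.models.GemmaBackbone(
        vocabulary_size=256_000,
        num_layers=26,
        num_query_heads=8,
        num_key_value_heads=4,
        head_dim=256,
        hidden_dim=2304,
        intermediate_dim=18_432,
        layer_norm_epsilon=1e-6,
        use_post_ffw_norm=True,
        use_post_attention_norm=True,
        final_logit_soft_cap=30.0,
        attention_logit_soft_cap=50.0,
        use_sliding_window_attention=True,
        sliding_window_size=4096,
        dropout=0.0,
    )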

Preprocessor (GemmaCausalLMPreprocessor):

  • Tokenizer (GemmaTokenizer):
    • Configuration File: tokenizer.json.
    • Adds BOS (Beginning of Sequence) and EOS (End of Sequence) tokens.
  • Sequence Length: 512.
  • Data Type:
    • Float32 for preprocessor computations.
    • Int32 for tokenized inputs.
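
In practice, the preprocessor turns raw strings into padded token-id tensors of the configured sequence length. A minimal sketch, assuming the preset directory layout described under Model Files below:

    import keras_hub

    preprocessor = keras_hub.models.GemmaCausalLMPreprocessor.from_preset(
        "path/to/model_dir", sequence_length=512
    )
    # Calling the preprocessor yields (features, labels, sample_weights) for
    # causal LM training; features contain "token_ids" and "padding_mask".
    x, y, sample_weight = preprocessor(["The quick brown fox."])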

Metadata

  • Keras Version: 3.5.0
  • Keras Hub Version: 0.17.0
  • Parameter Count: 2,617,270,528 (≈2.6 billion).
  • Date Saved: 2024-11-18@13:59:51

This metadata pins the library versions and save date, which supports reproducible loading of the model.


Applications

This model is designed for tasks requiring causal language modeling, including but not limited to:

  • Text Generation.
  • Dialogue Systems.
  • Autoregressive NLP tasks.

Model Files

  • Backbone Configuration: The core architecture details for GemmaBackbone.
  • Preprocessor Configuration: Tokenization and sequence preprocessing setup.
  • Tokenizer File: tokenizer.json.
  • Preprocessor File: preprocessor.json.

Setup and Usage

  1. Dependencies: Ensure the following libraries are installed (the saved model targets Keras 3.5.0 and Keras Hub 0.17.0, per the metadata above):

    pip install "keras>=3.5.0" "keras-hub>=0.17.0"
    
  2. Model Loading: A saved Keras Hub task is loaded with from_preset, pointed at the directory containing the configuration, weights, and preprocessor files (from_config expects a config dictionary rather than a file path, and keras_hub.src is a private module that should not be imported directly):

    import keras_hub

    model = keras_hub.models.GemmaCausalLM.from_preset("path/to/model_dir")
    
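Keras Hub can also resolve presets straight from the Hugging Face Hub via an hf:// handle. A minimal sketch, assuming network access and that this repository follows the standard Keras Hub preset layout:

    model = keras_hub.models.GemmaCausalLM.from_preset("hf://p2kalita/PolicyLens")
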
  3. Inference: The task bundles its preprocessor, so generate() accepts raw strings and handles tokenization and detokenization internally:

    output = model.generate("Your input text here.", max_length=128)
    print(output)
    
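Decoding behavior can be changed by recompiling the task with a different sampler. A minimal sketch using the built-in top-k sampler (the choice of sampler and the k value here are illustrative):

    import keras_hub

    # Replace the default sampler with top-k sampling; any
    # keras_hub.samplers instance can be passed here.
    model.compile(sampler=keras_hub.samplers.TopKSampler(k=5))
    print(model.generate("Your input text here.", max_length=128))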

Contributions

Feel free to contribute to this repository by improving configurations, extending functionality, or reporting issues.


License

This project is licensed under the MIT License. See the LICENSE file for details.
