Seq2Seq Transformer for Function Call Generation
This repository hosts a custom-trained Seq2Seq Transformer model designed to convert natural language queries into corresponding function call representations. The model leverages an encoder-decoder Transformer architecture built from scratch using PyTorch and supports versioning to facilitate continuous improvements and updates.
Model Description
Architecture:
A full Transformer-based encoder-decoder model with multi-head attention and feed-forward layers. The model incorporates sinusoidal positional encoding to capture sequential information.
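The exact layer sizes and class names are not part of this card, but a minimal sketch of this kind of architecture, built on PyTorch's nn.Transformer, might look like the following (hyperparameters and names are illustrative, not the values used to train this model):

```python
import math
import torch
import torch.nn as nn

class PositionalEncoding(nn.Module):
    """Adds sinusoidal positional encodings to token embeddings."""
    def __init__(self, d_model, max_len=512):
        super().__init__()
        position = torch.arange(max_len, dtype=torch.float).unsqueeze(1)
        div_term = torch.exp(torch.arange(0, d_model, 2).float() * (-math.log(10000.0) / d_model))
        pe = torch.zeros(max_len, d_model)
        pe[:, 0::2] = torch.sin(position * div_term)
        pe[:, 1::2] = torch.cos(position * div_term)
        self.register_buffer("pe", pe.unsqueeze(0))  # (1, max_len, d_model)

    def forward(self, x):
        # x: (batch, seq_len, d_model)
        return x + self.pe[:, : x.size(1)]

class Seq2SeqTransformer(nn.Module):
    """Encoder-decoder Transformer mapping source token IDs to target token logits."""
    def __init__(self, vocab_size, d_model=256, nhead=8, num_layers=3, dim_ff=512, pad_idx=0):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model, padding_idx=pad_idx)
        self.pos_enc = PositionalEncoding(d_model)
        self.transformer = nn.Transformer(
            d_model=d_model, nhead=nhead,
            num_encoder_layers=num_layers, num_decoder_layers=num_layers,
            dim_feedforward=dim_ff, batch_first=True,
        )
        self.out_proj = nn.Linear(d_model, vocab_size)

    def forward(self, src, tgt):
        # src, tgt: (batch, seq_len) tensors of token IDs.
        src_emb = self.pos_enc(self.embed(src))
        tgt_emb = self.pos_enc(self.embed(tgt))
        # Causal mask prevents the decoder from attending to future positions.
        tgt_mask = self.transformer.generate_square_subsequent_mask(tgt.size(1)).to(src.device)
        hidden = self.transformer(src_emb, tgt_emb, tgt_mask=tgt_mask)
        return self.out_proj(hidden)  # (batch, tgt_len, vocab_size)
```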
Tokenization & Vocabulary:
The model uses a custom-built vocabulary derived from the training data. Special tokens include:
- <pad> for padding,
- <bos> to denote the beginning of a sequence,
- <eos> to denote the end of a sequence, and
- <unk> for unknown tokens.
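As an illustration only, a vocabulary of this shape could be built roughly as follows, assuming simple whitespace tokenization (the actual tokenizer and ID assignments for this model may differ):

```python
SPECIAL_TOKENS = ["<pad>", "<bos>", "<eos>", "<unk>"]

def build_vocab(texts):
    """Assign IDs to special tokens first, then to tokens seen in the training data."""
    token_to_id = {tok: i for i, tok in enumerate(SPECIAL_TOKENS)}
    for text in texts:
        for tok in text.split():  # whitespace tokenization (an assumption)
            if tok not in token_to_id:
                token_to_id[tok] = len(token_to_id)
    return token_to_id

def encode(text, token_to_id):
    """Wrap token IDs with <bos>/<eos>, mapping unseen tokens to <unk>."""
    unk = token_to_id["<unk>"]
    ids = [token_to_id["<bos>"]]
    ids += [token_to_id.get(tok, unk) for tok in text.split()]
    ids.append(token_to_id["<eos>"])
    return ids
```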
Training:
Trained on paired examples of natural language inputs and function call outputs using a cross-entropy loss function. The training process supports versioning: each training run increments the model version, and each version is stored for reproducibility and comparison.
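A condensed sketch of one training epoch with teacher forcing and cross-entropy loss, using the same illustrative interfaces as the architecture sketch above (the dataloader, optimizer, and checkpoint naming are assumptions):

```python
import torch
import torch.nn as nn

def train_one_epoch(model, dataloader, optimizer, pad_idx, device="cpu"):
    """One pass over (src, tgt) batches of token IDs, with teacher forcing."""
    criterion = nn.CrossEntropyLoss(ignore_index=pad_idx)  # ignore padded positions
    model.train()
    total_loss = 0.0
    for src, tgt in dataloader:                  # tgt includes <bos> ... <eos>
        src, tgt = src.to(device), tgt.to(device)
        decoder_input = tgt[:, :-1]              # feed everything except the last token
        target_labels = tgt[:, 1:]               # predict everything except <bos>
        logits = model(src, decoder_input)       # (batch, tgt_len - 1, vocab_size)
        loss = criterion(logits.reshape(-1, logits.size(-1)), target_labels.reshape(-1))
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        total_loss += loss.item()
    return total_loss / max(len(dataloader), 1)

# Each run could then save a versioned checkpoint for reproducibility, e.g.:
# torch.save(model.state_dict(), f"model_v{version}.pt")   # file naming is an assumption
```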
Inference:
Greedy decoding is used to generate output sequences from an input sequence. Users can specify the model version to load the appropriate model for inference.
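A minimal greedy-decoding loop, under the same illustrative interfaces as the sketches above, might look like this:

```python
import torch

@torch.no_grad()
def greedy_decode(model, src_ids, bos_id, eos_id, max_len=64, device="cpu"):
    """Repeatedly pick the highest-probability next token until <eos> or max_len."""
    model.eval()
    src = torch.tensor([src_ids], device=device)        # (1, src_len)
    generated = [bos_id]
    for _ in range(max_len):
        tgt = torch.tensor([generated], device=device)  # (1, current_len)
        logits = model(src, tgt)                        # (1, current_len, vocab_size)
        next_id = logits[0, -1].argmax().item()         # greedy choice for the next token
        if next_id == eos_id:
            break
        generated.append(next_id)
    return generated[1:]                                # drop the leading <bos>
```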
Intended Use
This model is primarily intended for:
- Automated function call generation from natural language instructions.
- Enhancing natural language interfaces for code generation or task automation.
- Integrating into virtual assistants and chatbots to execute backend function calls.
Limitations
Data Dependency:
The model's performance relies on the quality and representativeness of the training data. Out-of-distribution inputs may yield suboptimal or erroneous outputs.
Decoding Strategy:
The current greedy decoding approach may not always produce the most diverse or optimal outputs. Alternative strategies (e.g., beam search) might be explored for improved results.
Generalization:
While the model works well on data similar to its training examples, its performance may degrade on substantially different domains or complex instructions.
Training Data
The model is trained on custom datasets comprising natural language inputs paired with function call outputs. Users are encouraged to fine-tune the model on domain-specific data to maximize its utility in real-world applications.
How to Use
Loading a Specific Version:
The system supports multiple versions. Specify the model version when performing inference to load the desired model.
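How checkpoints are stored is not specified in this card; one plausible sketch, assuming versioned state-dict files named model_v{N}.pt and the Seq2SeqTransformer class sketched above, is:

```python
import torch

def load_model(version, vocab_size, device="cpu"):
    """Load the weights for a specific model version (file naming is an assumption)."""
    model = Seq2SeqTransformer(vocab_size=vocab_size)   # same hyperparameters as training
    state = torch.load(f"model_v{version}.pt", map_location=device)
    model.load_state_dict(state)
    model.to(device).eval()
    return model
```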
Inference:
Provide an input text (e.g., "Book me a flight from London to NYC") and the model will generate the corresponding function call output.
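Putting the illustrative pieces together, an end-to-end call could look like the following (the corpus, version number, and decoded output are placeholders, not actual model behavior):

```python
# Hypothetical end-to-end usage, reusing the helpers sketched earlier in this card.
training_texts = ["Book me a flight from London to NYC"]   # placeholder corpus

vocab = build_vocab(training_texts)                        # or load the saved vocabulary
model = load_model(version=3, vocab_size=len(vocab))       # version number is illustrative

src_ids = encode("Book me a flight from London to NYC", vocab)
out_ids = greedy_decode(model, src_ids, bos_id=vocab["<bos>"], eos_id=vocab["<eos>"])

id_to_token = {i: t for t, i in vocab.items()}
print(" ".join(id_to_token[i] for i in out_ids))
# e.g. book_flight(origin="London", destination="NYC")     # illustrative output only
```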
Publishing:
The model can be published to the Hugging Face Hub with version-specific details for reproducibility and community sharing.
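One possible way to upload a versioned checkpoint with the huggingface_hub client library (the repository ID and file names are placeholders):

```python
from huggingface_hub import HfApi

api = HfApi()
repo_id = "your-username/seq2seq-function-calls"   # placeholder repository ID

# Create the repo if it does not exist yet, then upload the versioned checkpoint.
api.create_repo(repo_id=repo_id, repo_type="model", exist_ok=True)
api.upload_file(
    path_or_fileobj="model_v3.pt",                 # placeholder local checkpoint
    path_in_repo="model_v3.pt",
    repo_id=repo_id,
    commit_message="Upload model version 3",
)
```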