NexaAIDev
/

Octopus-v1-gemma-7B

@@ -8,6 +8,7 @@ tags:
 - on-device language model
 - android
 - conversational
 ---
 # Octopus V1: On-device language model for function calling of software APIs
@@ -20,16 +21,14 @@ tags:
 <a><img src="Octopus-logo.jpeg" alt="nexa-octopus" style="width: 40%; min-width: 300px; display: block; margin: auto;"></a>
 </p>
-## Introducing Octopus-V2-2B
-Octopus-V2-2B, an advanced open-source language model with 2 billion parameters, represents Nexa AI's research breakthrough in the application of large language models (LLMs) for function calling, specifically tailored for Android APIs. Unlike Retrieval-Augmented Generation (RAG) methods, which require detailed descriptions of potential function arguments—sometimes needing up to tens of thousands of input tokens—Octopus-V2-2B introduces a unique **functional token** strategy for both its training and inference stages. This approach not only allows it to achieve performance levels comparable to GPT-4 but also significantly enhances its inference speed beyond that of RAG-based methods, making it especially beneficial for edge computing devices.
-📱 **On-device Applications**:  Octopus-V2-2B is engineered to operate seamlessly on Android devices, extending its utility across a wide range of applications, from Android system management to the orchestration of multiple devices. Further demonstrations of its capabilities are available on the [Nexa AI Research Page](https://nexaai.github.io/octopus), showcasing its adaptability and potential for on-device integration.
-🚀 **Inference Speed**: When benchmarked, Octopus-V2-2B demonstrates a remarkable inference speed, outperforming the combination of "Llama7B + RAG solution" by a factor of 36X on a single A100 GPU. Furthermore, compared to GPT-4-turbo (gpt-4-0125-preview), which relies on clusters A100/H100 GPUs, Octopus-V2-2B is 168% faster. This efficiency is attributed to our **functional token** design.
-🐙 **Accuracy**: Octopus-V2-2B not only excels in speed but also in accuracy, surpassing the "Llama7B + RAG solution" in function call accuracy by 31%. It achieves a function call accuracy comparable to GPT-4 and RAG + GPT-3.5, with scores ranging between 98% and 100% across benchmark datasets.
-💪 **Function Calling Capabilities**: Octopus-V2-2B is capable of generating individual, nested, and parallel function calls across a variety of complex scenarios.
 ## Example Use Cases
 <p align="center" width="100%">

 - on-device language model
 - android
 - conversational
+inference: false
 ---
 # Octopus V1: On-device language model for function calling of software APIs
 <a><img src="Octopus-logo.jpeg" alt="nexa-octopus" style="width: 40%; min-width: 300px; display: block; margin: auto;"></a>
 </p>
+## Introducing Octopus-V1
+Octopus-V1, a series of advanced open-source language models with parameters ranging from 2B to 7B, represents Nexa AI's breakthrough in AI-driven software API interactions. Developed through meticulous fine-tuning using a specialized dataset from 30k+ RapidHub APIs, Octopus-V1 excels in understanding API structures and syntax. The models leverage conditional masking techniques to ensure precise, format-compliant API calls without compromising inference speed. A novel benchmark introduced alongside Octopus-V1 assesses its superior performance against GPT-4 in software API usage, signifying a leap forward in automating software development and API integration.
+📱 **Support 30k+ APIs from RapidAPI Hub**: Octopus leverages an extensive dataset derived from over 30,000 popular APIs on RapidAPI Hub. This rich dataset ensures broad coverage and understanding of diverse software API interactions, enhancing the model's utility across various applications.
+🐙 **Accuracy**: Fine-tuning on models with 2B, 3B, and 7B parameters yields Octopus, which surpasses GPT-4 in API call accuracy. The introduction of a conditional mask further refines its precision, making Octopus highly reliable for software API interactions.
+🎯 **Conditional Masking**: A novel conditional masking technique is employed to ensure outputs adhere to the desired formats and reduce errors. This approach not only maintains fast inference speeds but also substantially increases the model's accuracy in generating function calls and parameters.
 ## Example Use Cases
 <p align="center" width="100%">