---
license: apache-2.0
language:
- en
base_model:
- mistralai/Mistral-7B-Instruct-v0.3
---
# Mia LLM - Personal AI Ecosystem
Mia LLM is an advanced AI-powered personal assistant built on the Mistral-7B-Instruct-v0.3 model. It is designed to enhance daily life through translation, personal assistance, and interactive services.
## Features
### Text and Voice Translation
- Text-to-Text Translation: Translate messages into 96 languages.
- Voice-to-Voice Translation: Enable seamless communication through voice translation.
- Real-Time Translation: Provide live translations during audio and video calls.
### Personal Assistant Services
- Appointment reminders.
- Shopping list creation.
- Health and fitness recommendations.
- Weather updates and navigation.
- Bill payment reminders.
### Document and Multimedia Translation
- PDF and Text Translation: Translate documents into multiple languages.
- Video Dubbing: Add voiceovers to videos in different languages.
- Audio File Translation: Convert audio recordings into other languages.
### Advanced Analysis Capabilities
- Sentiment Analysis: Analyze emotional tones in messages.
- Body Language and Facial Expression Analysis: Evaluate video call interactions.
- Language Processing and Accuracy Optimization: Ensure clarity and correctness in communication.
### Interactive Services
- Food ordering.
- Taxi booking.
- Ticket and hotel reservations.
- Flight tracking.
### Adaptive Learning Capabilities
- Learn user habits for personalized services.
- Continuously update with new languages and content.
## Architecture
Mia LLM is based on Mistral-7B-Instruct-v0.3, a Transformer-based model designed for efficiency and accuracy. It is further instruction-tuned for responsive, prompt-following behavior.
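As a quick illustration of that structure, the base model's configuration can be inspected without downloading the weights; a minimal sketch using the `transformers` `AutoConfig` API:

```python
from transformers import AutoConfig

# Fetch only the configuration file of the base model (no weights)
config = AutoConfig.from_pretrained("mistralai/Mistral-7B-Instruct-v0.3")

print(config.hidden_size)          # width of the token embeddings
print(config.num_hidden_layers)    # number of Transformer decoder blocks
print(config.num_attention_heads)  # attention heads per layer
print(config.num_key_value_heads)  # fewer KV heads indicate grouped-query attention
```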
## 1. Introduction to Models
- Llama 2 70B, GPT-3.5, and Mixtral 8x7B: These are large language models of varying capacity. The number in a name (e.g., 70B) is the model's parameter count; a higher parameter count generally indicates greater learning and reasoning capacity.
## 2. Benchmarks and Test Datasets
Test datasets are used to evaluate the performance of models across different domains.
### MMLU (Massive Multitask Language Understanding)
- Description: This test evaluates models' knowledge and reasoning abilities across 57 diverse topics using multiple-choice questions.
- Results:
  - Llama 2: 69.9%
  - GPT-3.5: 70.0%
  - Mixtral: 70.6%
- Mixtral achieved the highest score in this benchmark, though the margin was small.
### HellaSwag
- Description: A test of commonsense reasoning and coherence. Models are evaluated 10-shot (given 10 in-context examples) and must predict the most plausible continuation of a text.
- Results:
  - Llama 2: 87.1% (Best performance)
  - GPT-3.5: 85.5%
  - Mixtral: 86.7%
### ARC Challenge
- Description: Consists of challenging multiple-choice questions, often requiring scientific and academic knowledge. Models are evaluated 25-shot.
- Results:
  - Llama 2: 85.1%
  - GPT-3.5: 85.2%
  - Mixtral: 85.8% (Best performance)
### WinoGrande
- Description: Evaluates natural language understanding by assessing a model's ability to resolve pronoun ambiguities and determine correct references in sentences.
- Results:
  - Llama 2: 83.2% (Best performance)
  - GPT-3.5: 81.6%
  - Mixtral: 81.2%
### MBPP (Mostly Basic Python Problems)
- Description: Measures programming capabilities by testing the accuracy of Python code generation against each task's test cases.
- Results:
  - Llama 2: 49.8%
  - GPT-3.5: 52.2%
  - Mixtral: 60.7% (Significantly superior)
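For illustration, code benchmarks like MBPP score a generated solution by running it against the task's assert-based test cases. A toy sketch of that scoring loop, with a hypothetical candidate and tests (real harnesses sandbox this execution):

```python
# Toy sketch of MBPP-style scoring: the candidate solution and tests below
# are hypothetical, not taken from the benchmark.
candidate = """
def add(a, b):
    return a + b
"""
tests = ["assert add(2, 3) == 5", "assert add(-1, 1) == 0"]

namespace = {}
exec(candidate, namespace)  # define the candidate function
try:
    for test in tests:
        exec(test, namespace)  # each test raises AssertionError on failure
    passed = True
except AssertionError:
    passed = False
print("pass" if passed else "fail")
```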
### GSM-8K
- Description: A math benchmark that tests models on grade-school word problems requiring multi-step reasoning.
- Results:
  - Llama 2: 53.6%
  - GPT-3.5: 57.1%
  - Mixtral: 58.4% (Best performance)
### MT Bench
- Description: A specialized benchmark for instruction-following models, testing their ability to understand and respond to prompts accurately. Scores are on a 1-10 scale rather than a percentage.
- Results:
  - Llama 2: 6.86
  - GPT-3.5: 8.32 (Best performance)
  - Mixtral: 8.30
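Most of the multiple-choice benchmarks above (MMLU, ARC Challenge, HellaSwag, WinoGrande) are commonly scored by comparing the likelihood a model assigns to each candidate answer and picking the highest. A minimal sketch of that technique, using an illustrative question rather than real benchmark data:

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("mektup-mia/Mia-LLM")
model = AutoModelForCausalLM.from_pretrained("mektup-mia/Mia-LLM")
model.eval()

def choice_logprob(prompt: str, choice: str) -> float:
    """Sum the log-probabilities the model assigns to the answer tokens."""
    prompt_ids = tokenizer(prompt, return_tensors="pt").input_ids
    full_ids = tokenizer(prompt + " " + choice, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(full_ids).logits
    # log-probabilities over the next token at every position
    log_probs = torch.log_softmax(logits[0, :-1], dim=-1)
    targets = full_ids[0, 1:]
    start = prompt_ids.shape[1] - 1  # first position that predicts an answer token
    return log_probs[start:].gather(1, targets[start:, None]).sum().item()

# Illustrative question, not from an actual benchmark
prompt = "Question: What gas do plants absorb during photosynthesis?\nAnswer:"
choices = ["carbon dioxide", "oxygen", "nitrogen", "helium"]
scores = [choice_logprob(prompt, c) for c in choices]
print(choices[scores.index(max(scores))])  # highest-likelihood choice wins
```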
## 3. General Analysis
- Mixtral demonstrated standout performance in several tests, particularly the MBPP, ARC Challenge, and GSM-8K benchmarks, where it outperformed the other models.
- GPT-3.5 showed consistent results and excelled in MT Bench and other instruction-based evaluations.
- Llama 2, while not the leader in most benchmarks, maintained competitive and stable performance across the board.
## 4. Conclusion
The benchmarks highlight the strengths and weaknesses of these models, offering insights into their suitability for specific applications:
- Programming: Mixtral is a strong candidate due to its high MBPP score.
- Instruction-based tasks: GPT-3.5 is well suited to these use cases, as demonstrated by its MT Bench results.
- General-purpose usage: Llama 2 provides a balanced and versatile option.
Resource: https://www.e2enetworks.com/blog/mistral-7b-vs-llama2-which-performs-better-and-why#:~:text=Mistral%207B%20significantly%20outperforms%20Llama2,7B%20comes%20out%20on%20top
## Usage
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the tokenizer and model from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained("mektup-mia/Mia-LLM")
model = AutoModelForCausalLM.from_pretrained("mektup-mia/Mia-LLM")

# Tokenize a prompt and generate a response
input_text = "Translate this text into Spanish."
inputs = tokenizer(input_text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
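Because the base model is instruction-tuned with Mistral's chat format, wrapping the prompt with the tokenizer's chat template generally produces better responses. Continuing from the snippet above, and assuming the Mia-LLM tokenizer inherits the base model's chat template:

```python
# Assumes the tokenizer ships a chat template inherited from the base model
messages = [{"role": "user", "content": "Translate this text into Spanish: Good morning!"}]
input_ids = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")
outputs = model.generate(input_ids, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```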