---
title: IriusRiskTestChallenge
emoji: 🚀
colorFrom: green
colorTo: indigo
sdk: docker
pinned: false
license: apache-2.0
short_description: LLM backend for IriusRisk Tech challenge
---
# IriusRisk test challenge
This project implements an API with FastAPI that uses LangChain and LangGraph to generate text with the `SmolLM2-1.7B-Instruct` model from Hugging Face. I chose this model because it is small enough to deploy on a free CPU-only backend from Hugging Face for this test. The API includes API Key authentication and rate limiting to protect against abuse.
## API URLs
- **Production**: `https://maximofn-iriusrisktestchallenge.hf.space`
- **Local Development**: `http://localhost:7860`
## Main Features
- 🤖 Text generation using SmolLM2-1.7B-Instruct
- 📝 Text summarization capabilities
- 🔑 API Key authentication
- ⚡ Rate limiting for abuse protection
- 🔄 Conversation thread support
- 📚 Interactive documentation with Swagger and ReDoc
## Configuration
### Environment Variables
For local deployment, create a `.env` file in the project root with the following variables:
```env
API_KEY="your_secret_api_key"
```
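As a rough illustration of how a `.env` file like the one above could be read, here is a minimal stdlib-only sketch. The helper names `parse_env` and `get_api_key` are hypothetical and are not taken from this project, which may well load the file differently (for example with a dedicated dotenv library).

```python
import os


def parse_env(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        value = value.strip()
        # Drop one pair of matching surrounding quotes, if present.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
            value = value[1:-1]
        values[key.strip()] = value
    return values


def get_api_key(env_text=""):
    """Prefer a real environment variable, then fall back to the file text."""
    return os.environ.get("API_KEY") or parse_env(env_text).get("API_KEY")


sample = 'API_KEY="your_secret_api_key"\n'
print(parse_env(sample)["API_KEY"])
```

Letting the environment variable win over the file keeps the same code usable in a Space, where the key comes from a secret rather than from a `.env` file.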
## Deployment
### In HuggingFace Spaces
This project is designed to run in HuggingFace Spaces. To configure it:
1. Create a new Space in HuggingFace using the blank Docker SDK template
2. Add all the project files to the Space
3. Set `API_KEY` as a secret in the Space's settings
### Local Deployment
For local deployment:
1. Clone this repository
2. Create the `.env` file with your API_KEY
3. Install the dependencies:
```bash
pip install -r requirements.txt
```
### Local Docker Deployment
For local Docker deployment:
1. Clone the repository
2. Create the `.env` file with your API_KEY
3. Build the Docker image:
```bash
docker build -t iriusrisk-test-challenge .
```
4. Run the Docker container:
```bash
docker run -p 7860:7860 --env-file .env iriusrisk-test-challenge
```
## Local Execution
```bash
uvicorn app:app --reload --port 7860
```
The API will be available at `http://localhost:7860`.
## Local Docker Execution
```bash
docker run -p 7860:7860 --env-file .env iriusrisk-test-challenge
```
The API will be available at `http://localhost:7860`.
## Endpoints
### GET `/`
Welcome endpoint that returns a greeting message.
- Rate limit: 10 requests per minute
### POST `/generate`
Endpoint to generate text using the language model.
- Rate limit: 5 requests per minute
- Requires API Key authentication
**Request parameters:**
```json
{
  "query": "Your question here",
  "thread_id": "optional_thread_identifier",
  "system_prompt": "optional_system_prompt"
}
```
### POST `/summarize`
Endpoint to summarize text using the language model.
- Rate limit: 5 requests per minute
- Requires API Key authentication
**Request parameters:**
```json
{
  "text": "Text to summarize",
  "thread_id": "optional_thread_identifier",
  "max_length": 200
}
```
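The request bodies for `/generate` and `/summarize` can be assembled on the client so that optional fields are only sent when they are set. The sketch below is an assumption about reasonable client-side behavior: the helper names and the validation rules (non-empty text, positive `max_length`) are hypothetical and are not documented constraints of this API.

```python
def build_generate_payload(query, thread_id=None, system_prompt=None):
    """Build the JSON body for POST /generate, omitting unset optional fields."""
    if not query or not query.strip():
        raise ValueError("query must be a non-empty string")
    payload = {"query": query}
    if thread_id is not None:
        payload["thread_id"] = thread_id
    if system_prompt is not None:
        payload["system_prompt"] = system_prompt
    return payload


def build_summarize_payload(text, thread_id=None, max_length=200):
    """Build the JSON body for POST /summarize."""
    if not text or not text.strip():
        raise ValueError("text must be a non-empty string")
    if max_length <= 0:
        raise ValueError("max_length must be positive")
    payload = {"text": text, "max_length": max_length}
    if thread_id is not None:
        payload["thread_id"] = thread_id
    return payload


print(build_generate_payload("What is FastAPI?"))
print(build_summarize_payload("Text to summarize", thread_id="notes"))
```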
## Authentication
The API uses API Key authentication. Include your API Key in the `X-API-Key` header of every request to a protected endpoint (`/generate` and `/summarize`).
Example:
```bash
# Production
curl -X POST "https://maximofn-iriusrisktestchallenge.hf.space/generate" \
  -H "X-API-Key: your_api_key" \
  -H "Content-Type: application/json" \
  -d '{"query": "What is FastAPI?"}'

# Local development
curl -X POST "http://localhost:7860/generate" \
  -H "X-API-Key: your_api_key" \
  -H "Content-Type: application/json" \
  -d '{"query": "What is FastAPI?"}'
```
## Rate Limiting
To protect the API against abuse, the following limits have been implemented:
- Endpoint `/`: 10 requests per minute
- Endpoint `/generate`: 5 requests per minute
- Endpoint `/summarize`: 5 requests per minute
When these limits are exceeded, the API will return a 429 (Too Many Requests) error.
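To make the per-minute limits above concrete, here is a small sliding-window limiter. It is only an illustration of the general technique. The README does not say how this project enforces its limits, and the class name, the `client_id` key and the explicit `now` argument are all assumptions made for the example.

```python
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `limit` requests per client within `window_seconds`."""

    def __init__(self, limit, window_seconds=60.0):
        self.limit = limit
        self.window = window_seconds
        self.hits = defaultdict(deque)  # client_id -> timestamps of accepted requests

    def allow(self, client_id, now):
        """Return True if the request is accepted, False if it should get a 429."""
        timestamps = self.hits[client_id]
        # Forget requests that have left the window.
        while timestamps and now - timestamps[0] >= self.window:
            timestamps.popleft()
        if len(timestamps) >= self.limit:
            return False
        timestamps.append(now)
        return True


# Five requests per minute, as documented for /generate and /summarize.
limiter = SlidingWindowLimiter(limit=5)
results = [limiter.allow("client-a", now=float(t)) for t in range(6)]
print(results)  # the sixth request within the same minute is rejected
```

Passing `now` explicitly keeps the sketch deterministic. A real server would typically read a monotonic clock and key the limiter on the client's IP address or API key.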
## API Documentation
The interactive API documentation is available at:
- Swagger UI:
  - Production: `https://maximofn-iriusrisktestchallenge.hf.space/docs`
  - Local: `http://localhost:7860/docs`
- ReDoc:
  - Production: `https://maximofn-iriusrisktestchallenge.hf.space/redoc`
  - Local: `http://localhost:7860/redoc`
## Error Handling
The API includes error handling for the following situations:
- Error 401: API Key not provided
- Error 403: Invalid API Key
- Error 429: Rate limit exceeded
- Error 500: Internal server error
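The difference between 401 and 403 in the list above can be sketched as a small decision function. This is a hypothetical illustration, not the project's actual handler. The function name is made up, and the constant-time comparison is a common precaution that the README does not confirm is used here.

```python
import hmac


def check_api_key(provided, expected):
    """Return the HTTP status code an API key check would produce."""
    if not provided:
        return 401  # API Key not provided
    # compare_digest avoids leaking how many leading characters matched.
    if not hmac.compare_digest(provided.encode(), expected.encode()):
        return 403  # Invalid API Key
    return 200


print(check_api_key(None, "your_secret_api_key"))
print(check_api_key("wrong_key", "your_secret_api_key"))
print(check_api_key("your_secret_api_key", "your_secret_api_key"))
```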
## Code Examples
### Python
Here are some examples of how to use the API with Python:
#### Text Generation
```python
import requests

# API configuration
API_URL = "https://maximofn-iriusrisktestchallenge.hf.space"  # Production URL
# API_URL = "http://localhost:7860"  # Local development URL
API_KEY = "your_api_key"  # Replace with your API key

# Headers for authentication
headers = {
    "X-API-Key": API_KEY,
    "Content-Type": "application/json"
}

# Generate text
def generate_text(query, thread_id="default", system_prompt=None):
    url = f"{API_URL}/generate"
    data = {
        "query": query,
        "thread_id": thread_id
    }
    # Add system prompt if provided
    if system_prompt:
        data["system_prompt"] = system_prompt
    try:
        response = requests.post(url, json=data, headers=headers)
        if response.status_code == 200:
            result = response.json()
            return result["generated_text"]
        else:
            print(f"Error: {response.status_code}")
            print(f"Details: {response.text}")
            return None
    except Exception as e:
        print(f"Request failed: {str(e)}")
        return None

# Example usage
query = "What are the main features of Python?"
result = generate_text(query)
if result:
    print("Response:", result)

# Example with custom thread and system prompt
result = generate_text(
    query="Explain object-oriented programming",
    thread_id="programming_tutorial",
    system_prompt="You are a programming teacher. Explain concepts in simple terms."
)
```
#### Text Summarization
```python
import requests

# Reuses API_URL and headers from the Text Generation example

# Summarize text
def summarize_text(text, max_length=200, thread_id="default"):
    url = f"{API_URL}/summarize"
    data = {
        "text": text,
        "max_length": max_length,
        "thread_id": thread_id
    }
    try:
        response = requests.post(url, json=data, headers=headers)
        if response.status_code == 200:
            result = response.json()
            return result["summary"]
        else:
            print(f"Error: {response.status_code}")
            print(f"Details: {response.text}")
            return None
    except Exception as e:
        print(f"Request failed: {str(e)}")
        return None

# Example usage
text_to_summarize = """
Python is a high-level, interpreted programming language created by Guido van Rossum
and released in 1991. Python's design philosophy emphasizes code readability with
the use of significant whitespace. Its language constructs and object-oriented
approach aim to help programmers write clear, logical code for small and large-scale projects.
"""
summary = summarize_text(text_to_summarize, max_length=50)
if summary:
    print("Summary:", summary)
```
#### Error Handling Example
```python
import requests

# Reuses API_URL and headers from the Text Generation example

def make_api_request(endpoint, data):
    url = f"{API_URL}/{endpoint}"
    try:
        response = requests.post(url, json=data, headers=headers)
        if response.status_code == 200:
            return response.json()
        elif response.status_code == 429:
            print("Rate limit exceeded. Please wait before making more requests.")
        elif response.status_code in (401, 403):
            print("Authentication error. Please check your API key.")
        else:
            print(f"Error {response.status_code}: {response.text}")
        return None
    except requests.exceptions.ConnectionError:
        print("Connection error. Please check if the API server is running.")
    except Exception as e:
        print(f"Unexpected error: {str(e)}")
    return None
```
These examples show how to:
- Make requests to different endpoints
- Handle authentication with API keys
- Process successful responses
- Handle various types of errors
- Use optional parameters like thread_id and system_prompt
Remember to:
- Replace `API_URL` with your actual API endpoint
- Set your API key in the headers
- Handle rate limiting by implementing appropriate delays between requests
- Implement proper error handling for your use case
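One way to add the delays mentioned above is to retry on a 429 response with exponential backoff. The sketch below is an assumption about a sensible client strategy, not something the API prescribes. `call_with_retry` and its parameters are hypothetical names, and `send` stands for any callable that performs the request and returns a `(status_code, body)` pair.

```python
import time


def call_with_retry(send, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call send() until it returns a status other than 429 or attempts run out."""
    for attempt in range(max_attempts):
        status, body = send()
        if status != 429:
            return status, body
        if attempt < max_attempts - 1:
            # Wait 1x, 2x, 4x, ... the base delay before the next attempt.
            sleep(base_delay * 2 ** attempt)
    return status, body


# Demo with canned responses instead of real HTTP calls.
responses = iter([(429, None), (429, None), (200, {"generated_text": "ok"})])
delays = []
status, body = call_with_retry(lambda: next(responses), sleep=delays.append)
print(status, body, delays)
```

With `requests`, `send` could wrap a `requests.post(...)` call and return `(response.status_code, response.text)`. Injecting `sleep` keeps the function easy to test without actually waiting.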