title: IriusRiskTestChallenge
emoji: π
colorFrom: green
colorTo: indigo
sdk: docker
pinned: false
license: apache-2.0
short_description: LLM backend for IriusRisk Tech challenge
IriusRisk test challenge
This project implements a FastAPI API that uses LangChain and LangGraph to generate text with the SmolLM2-1.7B-Instruct model from HuggingFace. I chose this model so that I could deploy it on a free CPU-only backend from Hugging Face for this test. The API includes API Key authentication and rate limiting to protect against abuse.
API URLs
- Production: https://maximofn-iriusrisktestchallenge.hf.space
- Local Development: http://localhost:7860
Main Features
- Text generation using SmolLM2-1.7B-Instruct
- Text summarization capabilities
- API Key authentication
- Rate limiting for abuse protection
- Conversation thread support
- Interactive documentation with Swagger and ReDoc
Configuration
Environment Variables
For local deployment, create a .env file in the project root with the following variables:
API_KEY="your_secret_api_key"
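As a minimal sketch of how a script could read this variable, the snippet below checks the process environment first and falls back to the .env file. The helper names `parse_env_line` and `load_api_key` are hypothetical and not part of this project, and the parser only handles simple `KEY="value"` lines.

```python
import os


def parse_env_line(line):
    """Parse one KEY="value" line; return (key, value), or None if not an assignment."""
    line = line.strip()
    if not line or line.startswith("#") or "=" not in line:
        return None
    key, _, value = line.partition("=")
    # Trim whitespace, then any surrounding double or single quotes.
    return key.strip(), value.strip().strip('"').strip("'")


def load_api_key(env_path=".env", environ=None):
    """Return API_KEY from the environment, falling back to the .env file."""
    environ = os.environ if environ is None else environ
    if environ.get("API_KEY"):
        return environ["API_KEY"]
    if os.path.exists(env_path):
        with open(env_path, encoding="utf-8") as handle:
            for raw_line in handle:
                parsed = parse_env_line(raw_line)
                if parsed and parsed[0] == "API_KEY":
                    return parsed[1]
    raise RuntimeError("API_KEY is not set; define it in the environment or in .env")
```

Checking the environment first mirrors the two setups described here: Space secrets are typically exposed as environment variables, while the .env file is used for local runs.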
Deployment
In HuggingFace Spaces
This project is designed to run in HuggingFace Spaces. To configure it:
- Create a new Space in HuggingFace using the blank Docker SDK template
- Add all the files to the Space
- Configure the API_KEY in the Space's environment secrets
Local Deployment
For local deployment:
- Clone this repository
- Create the .env file with your API_KEY
- Install the dependencies:
pip install -r requirements.txt
Local Docker Deployment
For local Docker deployment:
- Clone the repository
- Create the .env file with your API_KEY
- Build the Docker image:
docker build -t iriusrisk-test-challenge .
- Run the Docker container:
docker run -p 8000:8000 --env-file .env iriusrisk-test-challenge
Local Execution
uvicorn app:app --reload
The API will be available at http://localhost:8000.
Local Docker Execution
docker run -p 8000:8000 --env-file .env iriusrisk-test-challenge
The API will be available at http://localhost:8000.
Endpoints
GET /
Welcome endpoint that returns a greeting message.
- Rate limit: 10 requests per minute
POST /generate
Endpoint to generate text using the language model.
- Rate limit: 5 requests per minute
- Requires API Key authentication
Request parameters:
{
"query": "Your question here",
"thread_id": "optional_thread_identifier",
"system_prompt": "optional_system_prompt"
}
POST /summarize
Endpoint to summarize text using the language model.
- Rate limit: 5 requests per minute
- Requires API Key authentication
Request parameters:
{
"text": "Text to summarize",
"thread_id": "optional_thread_identifier",
"max_length": 200
}
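The two request bodies above can be assembled with small helpers that drop unset optional fields. This is only a sketch: the function names are hypothetical, and it assumes the server accepts requests where `thread_id`, `system_prompt`, and `max_length` are omitted.

```python
def build_generate_payload(query, thread_id=None, system_prompt=None):
    """Build the JSON body for POST /generate, omitting unset optional fields."""
    if not query or not query.strip():
        raise ValueError("query must be a non-empty string")
    payload = {"query": query}
    if thread_id is not None:
        payload["thread_id"] = thread_id
    if system_prompt is not None:
        payload["system_prompt"] = system_prompt
    return payload


def build_summarize_payload(text, thread_id=None, max_length=None):
    """Build the JSON body for POST /summarize, omitting unset optional fields."""
    if not text or not text.strip():
        raise ValueError("text must be a non-empty string")
    payload = {"text": text}
    if thread_id is not None:
        payload["thread_id"] = thread_id
    if max_length is not None:
        if max_length <= 0:
            raise ValueError("max_length must be a positive integer")
        payload["max_length"] = max_length
    return payload
```

Validating on the client side avoids spending one of the five requests per minute on a body the server would reject.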
Authentication
The API uses API Key authentication. You must include your API Key in the X-API-Key header for all protected endpoint requests.
Example:
# Production
curl -X POST "https://maximofn-iriusrisktestchallenge.hf.space/generate" \
-H "X-API-Key: your_api_key" \
-H "Content-Type: application/json" \
-d '{"query": "What is FastAPI?"}'
# Local development
curl -X POST "http://localhost:7860/generate" \
-H "X-API-Key: your_api_key" \
-H "Content-Type: application/json" \
-d '{"query": "What is FastAPI?"}'
Rate Limiting
To protect the API against abuse, the following limits have been implemented:
- Endpoint /: 10 requests per minute
- Endpoint /generate: 5 requests per minute
- Endpoint /summarize: 5 requests per minute
When these limits are exceeded, the API will return a 429 (Too Many Requests) error.
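To illustrate what a per-minute limit means in practice, here is a minimal sliding-window limiter written with the standard library only. It is not the code this API uses (the README does not say how the limits are enforced); the class name, the per-client key, and the injectable clock are assumptions made for the sketch.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `limit` calls per `window` seconds for each key."""

    def __init__(self, limit, window=60.0, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock  # injectable so the behaviour can be tested without waiting
        self._hits = defaultdict(deque)

    def allow(self, key):
        """Record a call for `key` and return True, or return False if over the limit."""
        now = self.clock()
        hits = self._hits[key]
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False  # a server would answer 429 (Too Many Requests) here
        hits.append(now)
        return True


# Example: the /generate limit of 5 requests per minute, keyed by a client identifier.
generate_limiter = SlidingWindowLimiter(limit=5, window=60.0)
```

A sixth call to `generate_limiter.allow("client-a")` within the same minute returns False, while a different key is counted separately.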
API Documentation
The interactive API documentation is available at:
- Swagger UI:
  - Production: https://maximofn-iriusrisktestchallenge.hf.space/docs
  - Local: http://localhost:7860/docs
- ReDoc:
  - Production: https://maximofn-iriusrisktestchallenge.hf.space/redoc
  - Local: http://localhost:7860/redoc
Error Handling
The API includes error handling for the following situations:
- Error 401: API Key not provided
- Error 403: Invalid API Key
- Error 429: Rate limit exceeded
- Error 500: Internal server error
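The difference between the 401 and 403 cases can be sketched as a small decision function. This is an illustration of the status codes listed above, not the project's actual authentication code; the function name `api_key_status` is hypothetical, and the constant-time comparison is a common precaution rather than something the README states.

```python
import hmac


def api_key_status(provided_key, expected_key):
    """Return the HTTP status an API-key check would produce.

    401 when no key is provided, 403 when the key does not match, 200 otherwise.
    """
    if not provided_key:
        return 401
    # compare_digest avoids leaking how many leading characters matched.
    if not hmac.compare_digest(provided_key.encode("utf-8"), expected_key.encode("utf-8")):
        return 403
    return 200
```

On the client side, a 401 therefore means the X-API-Key header is missing, while a 403 means the header was sent with the wrong value.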
Code Examples
Python
Here are some examples of how to use the API with Python:
Text Generation
import requests

# API configuration
API_URL = "https://maximofn-iriusrisktestchallenge.hf.space"  # Production URL
# API_URL = "http://localhost:7860"  # Local development URL
API_KEY = "your_api_key"  # Replace with your API key

# Headers for authentication
headers = {
    "X-API-Key": API_KEY,
    "Content-Type": "application/json"
}

# Generate text
def generate_text(query, thread_id="default", system_prompt=None):
    url = f"{API_URL}/generate"
    data = {
        "query": query,
        "thread_id": thread_id
    }
    # Add system prompt if provided
    if system_prompt:
        data["system_prompt"] = system_prompt
    try:
        response = requests.post(url, json=data, headers=headers)
        if response.status_code == 200:
            result = response.json()
            return result["generated_text"]
        else:
            print(f"Error: {response.status_code}")
            print(f"Details: {response.text}")
            return None
    except Exception as e:
        print(f"Request failed: {str(e)}")
        return None

# Example usage
query = "What are the main features of Python?"
result = generate_text(query)
if result:
    print("Response:", result)

# Example with custom thread and system prompt
result = generate_text(
    query="Explain object-oriented programming",
    thread_id="programming_tutorial",
    system_prompt="You are a programming teacher. Explain concepts in simple terms."
)
Text Summarization
import requests

# Reuses API_URL and headers from the Text Generation example

# Summarize text
def summarize_text(text, max_length=200, thread_id="default"):
    url = f"{API_URL}/summarize"
    data = {
        "text": text,
        "max_length": max_length,
        "thread_id": thread_id
    }
    try:
        response = requests.post(url, json=data, headers=headers)
        if response.status_code == 200:
            result = response.json()
            return result["summary"]
        else:
            print(f"Error: {response.status_code}")
            print(f"Details: {response.text}")
            return None
    except Exception as e:
        print(f"Request failed: {str(e)}")
        return None

# Example usage
text_to_summarize = """
Python is a high-level, interpreted programming language created by Guido van Rossum
and released in 1991. Python's design philosophy emphasizes code readability with
the use of significant whitespace. Its language constructs and object-oriented
approach aim to help programmers write clear, logical code for small and large-scale projects.
"""
summary = summarize_text(text_to_summarize, max_length=50)
if summary:
    print("Summary:", summary)
Error Handling Example
# Reuses requests, API_URL and headers from the Text Generation example
def make_api_request(endpoint, data):
    url = f"{API_URL}/{endpoint}"
    try:
        response = requests.post(url, json=data, headers=headers)
        if response.status_code == 200:
            return response.json()
        elif response.status_code == 429:
            print("Rate limit exceeded. Please wait before making more requests.")
        elif response.status_code in (401, 403):
            print("Authentication error. Please check your API key.")
        else:
            print(f"Error {response.status_code}: {response.text}")
        return None
    except requests.exceptions.ConnectionError:
        print("Connection error. Please check if the API server is running.")
    except Exception as e:
        print(f"Unexpected error: {str(e)}")
    return None
These examples show how to:
- Make requests to different endpoints
- Handle authentication with API keys
- Process successful responses
- Handle various types of errors
- Use optional parameters like thread_id and system_prompt
Remember to:
- Replace API_URL with your actual API endpoint
- Set your API key in the headers
- Handle rate limiting by implementing appropriate delays between requests
- Implement proper error handling for your use case
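One way to add delays between requests is to retry on 429 with exponential backoff. The helper below is a sketch under assumptions: `post_with_retry` is a hypothetical name, the delays are arbitrary defaults, and it is not known whether this API sends a `Retry-After` header, so the code only uses it when it is present.

```python
import time


def post_with_retry(send, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call `send()` until it returns a non-429 response or attempts run out.

    `send` is any zero-argument callable returning an object with
    `status_code` and `headers`, such as a requests.Response.
    """
    response = None
    for attempt in range(max_attempts):
        response = send()
        if response.status_code != 429:
            return response
        if attempt < max_attempts - 1:
            # Prefer the server's Retry-After hint (in seconds) when it is present.
            retry_after = response.headers.get("Retry-After")
            if retry_after and retry_after.isdigit():
                delay = float(retry_after)
            else:
                delay = base_delay * 2 ** attempt  # 1s, 2s, 4s, ...
            sleep(delay)
    return response  # still 429 after the last attempt
```

With the variables from the examples above, a call could look like `post_with_retry(lambda: requests.post(f"{API_URL}/generate", json={"query": "What is FastAPI?"}, headers=headers))`.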