---
title: tt-creators
app_file: creators.py
sdk: gradio
sdk_version: 5.20.0
---

# TikTok Creator Analyzer

A Gradio-based tool for analyzing TikTok creator profiles from CSV files.

## Features

- Efficiently loads and processes millions of TikTok creator profiles
- Caches data in Parquet format for faster subsequent loads
- Tracks processed files to avoid reprocessing the same data
- Incrementally updates the database when new files are added
- Advanced search with multiple filters:
  - Follower count range (min/max)
  - Video count range (min/max)
  - Keywords in signature
  - Region filter
  - "Has Email" filter to find profiles with contact information
- Download search results as CSV
- Network-accessible interface (binds to 0.0.0.0)
- Shareable via temporary public URL
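
The search filters listed above can be sketched as a single pandas boolean mask. This is a minimal illustration, not the code in `creators.py`: the function name `search_profiles`, its parameters, and the `EMAIL_PATTERN` regex are assumptions, and the sketch assumes that "Has Email" means an email address appears in the `signature` column.

```python
import pandas as pd

# Hypothetical pattern: the tool's actual "Has Email" check is not documented here.
EMAIL_PATTERN = r"[\w.+-]+@[\w-]+\.[\w.-]+"


def search_profiles(df, min_followers=None, max_followers=None,
                    min_videos=None, max_videos=None,
                    keyword=None, region=None, has_email=False):
    """Return the rows of df that satisfy every filter that was supplied."""
    mask = pd.Series(True, index=df.index)

    # Range filters: each bound is optional and inclusive.
    if min_followers is not None:
        mask &= df["follower_count"] >= min_followers
    if max_followers is not None:
        mask &= df["follower_count"] <= max_followers
    if min_videos is not None:
        mask &= df["video_count"] >= min_videos
    if max_videos is not None:
        mask &= df["video_count"] <= max_videos

    # Treat missing signatures as empty strings so text filters never see NaN.
    signature = df["signature"].fillna("")
    if keyword:
        # Case-insensitive literal substring match.
        mask &= signature.str.contains(keyword, case=False, regex=False)
    if region:
        mask &= df["region"] == region
    if has_email:
        mask &= signature.str.contains(EMAIL_PATTERN, regex=True)

    return df[mask]
```

The filtered frame can then be written out with `result.to_csv("results.csv", index=False)` to produce a downloadable CSV.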

## Installation

1. Install the required dependencies:
   ```bash
   pip install -r requirements.txt
   ```
2. Make sure your CSV files are in the correct location (`../data/tiktok_profiles/`).

## Usage

Run the script:

```bash
python creators.py
```

The first run will:

1. Load all CSV files from the data directory
2. Combine them into a single dataset
3. Save the combined data as a Parquet file for faster loading in the future
4. Track which files have been processed to avoid duplicates
5. Launch a Gradio web interface for searching and analyzing the data
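
Steps 1 to 3 of the first run might look roughly like the following. This is a sketch under stated assumptions rather than the script's actual code: the helper names `combine_csv_files` and `save_cache` are hypothetical, and writing Parquet with pandas requires an engine such as `pyarrow` to be installed.

```python
from pathlib import Path

import pandas as pd


def combine_csv_files(data_dir):
    """Read every CSV file in data_dir and stack them into one DataFrame."""
    csv_paths = sorted(Path(data_dir).glob("*.csv"))
    if not csv_paths:
        raise FileNotFoundError(f"No CSV files found in {data_dir}")
    frames = [pd.read_csv(path) for path in csv_paths]
    # ignore_index gives the combined frame one continuous 0..n-1 index.
    return pd.concat(frames, ignore_index=True)


def save_cache(df, cache_path):
    """Write the combined data to Parquet (needs a Parquet engine such as pyarrow)."""
    df.to_parquet(cache_path, index=False)
```

A first run would then call `save_cache(combine_csv_files("../data/tiktok_profiles/"), "profiles.parquet")`, where the cache file name is only an illustration.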

Subsequent runs will:

1. Load the existing data from the Parquet file
2. Check for new CSV files that haven't been processed yet
3. If new files exist, process only those files and update the database
4. Launch the Gradio interface with the updated data
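
One simple way to detect unprocessed files is to keep a manifest of file names that have already been loaded. The sketch below assumes a JSON manifest; the manifest format and the helper names are hypothetical, and the real script may track files differently.

```python
import json
from pathlib import Path


def load_manifest(manifest_path):
    """Return the set of file names already processed (empty on a first run)."""
    path = Path(manifest_path)
    if not path.exists():
        return set()
    return set(json.loads(path.read_text()))


def find_new_files(data_dir, processed):
    """List CSV files in data_dir whose names are not in the processed set."""
    return sorted(p for p in Path(data_dir).glob("*.csv") if p.name not in processed)


def save_manifest(manifest_path, processed):
    """Persist the processed file names as a sorted JSON list."""
    Path(manifest_path).write_text(json.dumps(sorted(processed), indent=2))
```

After the new files are appended to the cached data, their names are added to the set and written back with `save_manifest`, so the next run skips them.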

The interface will be accessible from:

- Other machines on your network at: `http://your-ip-address:7860`
- A temporary public URL that will be displayed in the console (thanks to `share=True`)
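
For reference, these two access paths correspond to standard arguments of Gradio's `launch()` method. The snippet below is a minimal sketch with a placeholder interface; `LAUNCH_OPTIONS` and `launch_app` are illustrative names, not identifiers from `creators.py`.

```python
# Launch settings matching the behaviour described above.
LAUNCH_OPTIONS = {
    "server_name": "0.0.0.0",  # listen on all network interfaces
    "server_port": 7860,       # port used in the network URL
    "share": True,             # request a temporary public URL
}


def launch_app():
    """Build a placeholder interface and start the Gradio server."""
    # Imported here so the options above can be inspected without starting Gradio.
    import gradio as gr

    with gr.Blocks(title="TikTok Creator Analyzer") as demo:
        gr.Markdown("# TikTok Creator Analyzer")
    demo.launch(**LAUNCH_OPTIONS)
```

Setting `share` to `False` keeps the app reachable only on the local network.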

## Maintenance

The application includes a Maintenance tab that shows:

- How many files have been processed
- When the database was last updated
- An option to force reload all files (useful if you suspect data corruption)

## Data Format

The CSV files should have the following columns:

- id
- unique_id
- follower_count
- nickname
- video_count
- following_count
- signature
- bio_link
- updated_at
- tt_seller
- region
- language
- url |
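
Before adding a new file to the data directory, it can help to confirm that its header contains every column listed above. The following is a small standard-library sketch; `missing_columns` is a hypothetical helper and not part of `creators.py`.

```python
import csv

# Column names taken from the list above.
EXPECTED_COLUMNS = [
    "id", "unique_id", "follower_count", "nickname", "video_count",
    "following_count", "signature", "bio_link", "updated_at",
    "tt_seller", "region", "language", "url",
]


def missing_columns(csv_path):
    """Return the expected column names that are absent from the CSV header."""
    with open(csv_path, newline="", encoding="utf-8") as handle:
        # An empty file yields an empty header, so every column is reported missing.
        header = next(csv.reader(handle), [])
    return [name for name in EXPECTED_COLUMNS if name not in header]
```

An empty list means the file has all the expected columns; extra columns are ignored by this check.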