Kuberwastaken committed on
Commit
7c2cd46
·
1 Parent(s): 5b5d66d

Updated to a basic README

Files changed (1)
README.md +10 -198
README.md CHANGED
@@ -1,198 +1,10 @@
- ![Treat_Banner](static/images/readme-images/New_Treat_Banner.jpg)
-
- <h1 align="center">
- Trigger Recognition for Enjoyable and Appropriate Television
- </h1>
-
- <p align="center">
- <img src="https://img.shields.io/static/v1?label=Kuberwastaken&message=TREAT&color=blue&logo=github" alt="Kuberwastaken - TREAT">
- <img src="https://img.shields.io/badge/version-2.0-blue" alt="Version 2.0">
- <img src="https://img.shields.io/badge/License-Apache_2.0-blue" alt="License Apache 2.0">
- </p>
-
- I was tired of getting grossed out by unexpected scenes in movies and TV and losing my appetite, so I created TREAT.
-
- The goal of this project is to empower viewers by forewarning them about potential triggers in the content they watch, making the viewing experience more enjoyable, inclusive, and appropriate for everyone.
-
- TREAT is a web application that uses natural language processing to analyze movie and TV show scripts, identifying potential triggers to help viewers make informed choices.
-
- ## Installation Instructions
- ### Prerequisites
- - Star the repository to show your support.
- - Clone the repository to your local machine:
-
-   ```bash
-   git clone https://github.com/Kuberwastaken/TREAT.git
-   ```
-
- ### Hugging Face Login Instructions for the Llama-3.2-1B Model
- To use the [Llama-3.2-1B model](https://huggingface.co/meta-llama/Llama-3.2-1B), which provides a roughly 35% increase in accuracy and efficiency over the [previous model](https://github.com/Kuberwastaken/TREAT-CS50), you must request access, as it is a gated model.
-
- ![Request_Access_Page](static/images/readme-images/Request-Access-Page.jpg)
-
- 1. **Log in to Hugging Face in your environment:**
-
-    Run the following command in your terminal:
-
-    ```bash
-    huggingface-cli login
-    ```
-
-    Enter your Hugging Face access token when prompted.
-
- 2. **Download the Llama-3.2-1B model:**
-
-    The model is downloaded automatically the first time you run a script analysis, provided you have been granted access.
-
- ### Environment Setup
- To set up the development environment, create a virtual environment and install the necessary dependencies.
-
- 1. Create a virtual environment:
-
-    ```bash
-    python3 -m venv treat-env
-    ```
-
- 2. Activate the virtual environment:
-
-    ```bash
-    source treat-env/bin/activate   # On Unix or macOS
-    treat-env\Scripts\activate      # On Windows
-    ```
-
- 3. Install dependencies:
-
-    Navigate to the project directory and run:
-
-    ```bash
-    pip install -r requirements.txt
-    ```
-
- ## Project Usage
- 1. **Start the Flask server:**
-
-    ```bash
-    python run.py
-    ```
-
- 2. **Open your browser:**
-
-    Navigate to `http://127.0.0.1:5000` to access the TREAT web interface.
-
- 3. **Analyze scripts:**
-
-    Enter a script in the provided text area and click "Analyze Script."
-
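For orientation, the server started by `run.py` is a small Flask app. The sketch below is illustrative only — the project's real routes live in `app/routes.py`, and the endpoint name and response shape here are assumptions, not the project's actual API:

```python
# Minimal Flask sketch (illustrative; not the project's actual app.py).
from flask import Flask, request, jsonify

app = Flask(__name__)

@app.route("/analyze", methods=["POST"])
def analyze():
    # Read the script text submitted from the web interface's text area.
    script = request.form.get("script", "")
    # The real app would run the Llama-3.2-1B trigger analysis here.
    return jsonify({"length": len(script)})

# To serve locally: app.run(host="127.0.0.1", port=5000)
```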
- ## File Descriptions
- - **app.py:** The main Flask application file that handles routing.
-
- - **app/routes.py:** Contains the Flask routes for handling script uploads.
-
- - **app/model.py:** Includes the script analysis functions using the Llama-3.2-1B model.
-
- - **templates/index.html:** The main HTML file for the web interface.
-
- - **static/css/style.css:** Custom CSS for styling the web interface.
-
- - **static/js/app.js:** JavaScript for handling client-side interactions.
-
- ## Types of Triggers Detected
- The TREAT application focuses on identifying a variety of potential triggers in scripts, including but not limited to:
-
- - **Violence:** Scenes of physical aggression or harm.
-
- - **Self-Harm:** Depictions of self-inflicted injury.
-
- - **Death:** Depictions of death or dying characters.
-
- - **Sexual Content:** Any depiction or mention of sexual activity, intimacy, or behavior.
-
- - **Sexual Abuse:** Instances of sexual violence or exploitation.
-
- - **Gun Use:** Depictions of firearms and their usage.
-
- - **Gore:** Graphic depictions of injury, blood, or dismemberment.
-
- - **Vomit:** Depictions of vomiting or nausea-inducing content.
-
- - **Mental Health Issues:** Depictions of mental health struggles, including anxiety, depression, or disorders.
-
- - **Animal Cruelty:** Depictions of harm or abuse towards animals.
-
- These categories address a real-world problem by forewarning viewers about potentially distressing content, enhancing their viewing experience.
-
- Adding a new category is as simple as specifying it in `model.py` and `utils.py`.
-
126
- ## Design Choices
-
- - **Inspiration:** I aimed for a simple, intuitive user experience that is easy to navigate for all users, regardless of background or age.
-
- - **Theme and Color Scheme:** The chocolate-and-sweets theme plays on the TREAT name, creating a visually appealing environment and keeping the experience enjoyable and pleasant.
-
- - **Script Analysis:** The Llama-3.2-1B model by Meta was chosen for its roughly 35% accuracy improvement over the prior FLAN-T5 version and for providing precise trigger recognition while remaining open source.
-
- ## How to Edit Sensitivity Settings or Prompts
-
- To adjust the sensitivity or modify the prompts used during script analysis, edit the `model.py` file under the `treat/app` directory. This file contains the parameters and settings that control how triggers are identified and how the model processes the script. Here is how to adjust the key settings:
-
- ### Key Parameters to Edit:
-
- - **max_new_tokens**: Controls the maximum number of tokens (word pieces) the model will generate in response to a prompt. A higher number can produce more detailed results but may also increase processing time.
-
- - **temperature**: Controls the randomness of the model's responses. A value of 1.0 is standard behavior; lower values (e.g., 0.2) make the output more deterministic and focused, while higher values (e.g., 1.5) allow more creative or varied responses.
-
- - **top_p** (nucleus sampling): Controls how many of the top predicted tokens are considered during text generation. A value of 0.9 means the model samples only from the smallest set of tokens whose cumulative probability reaches 90%, cutting off the least likely options. A lower value makes the output more focused but less varied.
-
- To adjust these, look for `max_new_tokens`, `temperature`, and `top_p` in `model.py` and set them to your desired values. For example:
-
- ```python
- max_new_tokens = 10   # Set the maximum number of tokens
- temperature = 0.7     # A moderate level for balanced responses
- top_p = 0.9           # Adjust the diversity of the output
- ```
-
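These three settings can be sketched in use as follows — an illustrative helper, not code from the project. With Hugging Face-style models they are passed as keyword arguments to `model.generate(**kwargs)`, and sampling must be enabled for `temperature` and `top_p` to have any effect:

```python
# Illustrative only: collect the sampling settings from model.py into the
# keyword arguments a Hugging Face-style model.generate(**kwargs) call accepts.
def generation_kwargs(max_new_tokens=10, temperature=0.7, top_p=0.9):
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    if not 0.0 < top_p <= 1.0:
        raise ValueError("top_p must be in (0, 1]")
    return {
        "max_new_tokens": max_new_tokens,
        "temperature": temperature,
        "top_p": top_p,
        "do_sample": True,  # temperature/top_p only apply when sampling is on
    }
```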
154
- ### Chunk Size and Overlap:
- To handle long scripts effectively, the text is divided into chunks before being analyzed. Adjusting the chunk size and overlap controls how much of the script is processed at once and ensures the model does not lose context between chunks.
-
- - **chunk_size**: The length of each chunk in tokens. A larger chunk size helps the model capture more context but increases memory usage and processing time; a smaller chunk size processes faster but may lose context.
-
- - **overlap**: The number of overlapping tokens between consecutive chunks. Overlap carries context from the end of one chunk into the next, preventing the model from losing important information at chunk boundaries.
-
- To adjust these, look for `chunk_size` and `overlap` in `model.py` and set them to your desired values:
-
- ```python
- chunk_size = 1000   # Length of each chunk (tokens)
- overlap = 200       # Overlapping tokens between chunks
- ```
-
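The chunking scheme described above can be sketched as a short function — a simplified stand-in operating on a plain token list, not the project's actual implementation in `model.py`:

```python
def chunk_text(tokens, chunk_size=1000, overlap=200):
    """Split a token list into chunks of chunk_size sharing `overlap` tokens."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap  # how far each new chunk advances
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # the last chunk already reaches the end of the script
    return chunks
```

For example, `chunk_text(list(range(10)), chunk_size=4, overlap=2)` yields four chunks, each sharing its first two tokens with the end of the previous chunk.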
167
- ### Adjusting Prompts:
- To modify the types of triggers detected by the model, edit the prompts under the `trigger_categories` section in `model.py`. This section controls how the model recognizes various types of content in the script; simply modify the prompts to suit your needs.
-
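A hypothetical shape for that section — the real structure in `model.py` may differ — is a mapping from category name to the question posed to the model for each script chunk:

```python
# Hypothetical sketch of the trigger_categories mapping; the names and
# wording are illustrative, not copied from model.py.
trigger_categories = {
    "Violence": "Does this scene depict physical aggression or harm?",
    "Gore": "Does this scene contain graphic injury, blood, or dismemberment?",
    "Vomit": "Does this scene depict vomiting or nausea-inducing content?",
}

def build_prompt(category_question, chunk):
    """Combine a category question with a script chunk into one model prompt."""
    return f"{category_question}\n\nScript excerpt:\n{chunk}\n\nAnswer yes or no."
```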
170
- ### Summary of Editable Parameters:
- - **max_new_tokens, temperature, top_p**: Control the length, randomness, and diversity of the model's output.
- - **chunk_size, overlap**: Control how the script is divided into chunks and how context is maintained between chunks.
- - **trigger_categories**: Adjust the prompts to change how triggers are identified in the script.
-
176
- ## To-Do List
- - Integrate with an API to search scripts directly by movie or show name
-
- - Introduce multiple themes so users can customize the application's appearance
-
- - Increase speed and efficiency
-
- - Potentially host the application online
-
- - Make the application mobile-friendly
-
187
- ## Open Source Contribution
- This repository is completely open source and free to contribute to. I intend to keep this project alive and evolve it into a tool that is genuinely useful for everyone. Contributions that add new features, improve the user interface, or enhance the script analysis are welcome and highly encouraged.
-
190
- ## Acknowledgements
- I would like to thank:
-
- - Meta AI: For developing the Llama-3.2-1B model and granting me access to it, a critical component of this project.
-
- - Parasite (2019): For the unexpected jumpscare that ruined my appetite and ultimately inspired this project.
-
197
- ## License
- This project is licensed under the [Apache 2.0 License](https://github.com/Kuberwastaken/TREAT/blob/main/LICENSE).
 
+ ---
+ title: TREAT
+ emoji: 🎬
+ colorFrom: blue
+ colorTo: purple
+ sdk: gradio
+ sdk_version: "2.0" # Replace with the correct version if different
+ app_file: gradio_app.py
+ pinned: true
+ ---