Adding main Files
- README.md +109 -12
- app.py +155 -0
- requirements.txt +7 -0
- summarizer.py +52 -0
README.md
CHANGED
@@ -1,12 +1,109 @@
# YouTube Summary AI

Transform YouTube videos into concise notes and summaries using fully local AI processing. Transcription and summarization run entirely on your machine with no external AI API calls, so transcripts and summaries never leave it.

## Key Features

- **100% Local Processing**: All AI operations run on your machine
  - No API keys required
  - No data sent to external servers
  - Complete privacy and security
  - Use CPU or GPU to run the AI
- 🎯 **Offline Capable**: Once the models are downloaded, transcription and summarization work without internet access (fetching a video still needs a connection)
- ⚡ **Fast Processing**: Direct local inference without API latency
- 🎥 Easy YouTube video URL input
- Advanced audio extraction using yt-dlp
- Local transcription using Whisper
- 🤖 Local AI summarization using LLaMA, Gemma, or other LLMs
- Shareable summary links
- 💻 Clean and intuitive user interface

## How It Works

1. **Download**: Downloads the YouTube video's audio locally using yt-dlp
2. **Transcribe**: Transcribes the audio with a local Whisper model
3. **Summarize**: Generates notes and a summary with a local LLaMA model
4. **All data stays on your machine!**

## Prerequisites

Before running the application, make sure you have the following installed:
- Python 3.8 or higher
- FFmpeg
- Ollama with a LLaMA model
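
The prerequisites above can be checked before the first run. The following is a minimal sketch, not part of the repository: `check_prerequisites` is a hypothetical helper, and it assumes FFmpeg and Ollama install executables named `ffmpeg` and `ollama` on your `PATH`. It does not verify that a LLaMA model has already been pulled.

```python
import shutil
import sys

MIN_PYTHON = (3, 8)  # "Python 3.8 or higher" from the prerequisites list


def check_prerequisites(which=shutil.which, version=None):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    version = sys.version_info[:2] if version is None else tuple(version[:2])
    problems = []
    if version < MIN_PYTHON:
        problems.append("Python %d.%d found, 3.8 or higher required" % version)
    # Assumption: both tools expose an executable with these names on PATH.
    for tool in ("ffmpeg", "ollama"):
        if which(tool) is None:
            problems.append(f"{tool} not found on PATH")
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print("Missing prerequisite:", problem)
```

The `which` and `version` parameters exist only so the check can be exercised without touching the real environment; calling `check_prerequisites()` with no arguments inspects the current interpreter and `PATH`.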

## Installation

1. Clone the repository
```bash
git clone https://github.com/Shivp1413/youtube-summary-ai.git
cd youtube-summary-ai
```

2. Create a virtual environment (recommended)
```bash
python -m venv venv
source venv/bin/activate  # On Windows, use: venv\Scripts\activate
```

3. Install the required packages
```bash
pip install -r requirements.txt
```

4. Install Ollama and pull the LLaMA model
```bash
# Install Ollama from https://ollama.ai
ollama pull llama3.1
```

## First-Time Setup

When you first run the application, it will:
1. Download the Whisper base model (~150MB) for local transcription
2. Use your local LLaMA model for summarization

All subsequent runs reuse these local models.

## Usage

1. Start the Streamlit application:
```bash
streamlit run app.py
```

2. Open your web browser and navigate to `http://localhost:8501`

3. Enter a YouTube URL and click "Generate Summary"

4. Share the summary using the generated link
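
The share link is derived in `app.py` by replacing `youtube.com` with `shiappoutube.com` in the video URL, and the reverse lookup checks whether a URL starts with `https://shiappoutube.com`. A plain string replacement keeps any `www.` prefix, which that reverse check would not match. The sketch below shows one possible alternative that rewrites only the hostname. `to_share_url`, `from_share_url`, and `_swap_host` are hypothetical names, not functions in this repository.

```python
from urllib.parse import urlsplit, urlunsplit

SHARE_HOST = "shiappoutube.com"  # share domain used by app.py
YOUTUBE_HOST = "youtube.com"


def _swap_host(url, old, new):
    """Replace host `old` (with or without 'www.') by `new`; leave other URLs unchanged."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    if host != old:
        return url
    # Any port in the original URL is dropped; only scheme, path, query, and fragment are kept.
    return urlunsplit((parts.scheme, new, parts.path, parts.query, parts.fragment))


def to_share_url(youtube_url):
    """Turn a youtube.com URL into its share form."""
    return _swap_host(youtube_url, YOUTUBE_HOST, SHARE_HOST)


def from_share_url(share_url):
    """Turn a share URL back into a youtube.com URL."""
    return _swap_host(share_url, SHARE_HOST, YOUTUBE_HOST)
```

Short `youtu.be` links pass through both functions unchanged, which matches the behavior of the string replacement in `app.py`.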

## Project Structure

```
youtube-summary-ai/
├── app.py              # Main Streamlit application
├── summarizer.py       # Video processing and local AI logic
├── requirements.txt    # Project dependencies
├── assets/             # Project assets
│   └── demo.gif        # Application demo
└── README.md           # Project documentation
```

## Security Features

- ✅ No API keys needed
- ✅ No cloud services required
- ✅ All processing happens locally
- ✅ No data leaves your machine
- ✅ Full control over your data
- ✅ Transcription and summarization work offline after initial setup

## Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

## License

This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
app.py
ADDED
@@ -0,0 +1,155 @@
import streamlit as st
from summarizer import process_video
from urllib.parse import urlparse, parse_qs

# Page configuration
st.set_page_config(
    page_title="YouTube Summary AI",
    layout="centered",
    initial_sidebar_state="expanded"
)

# Custom CSS
st.markdown("""
    <style>
    .main {
        padding: 2rem 3rem;
    }
    .stButton>button {
        width: 100%;
        border-radius: 5px;
        height: 3em;
        background-color: #FF0000;
        color: white;
    }
    .stTextInput>div>div>input {
        border-radius: 5px;
    }
    .title-text {
        font-size: 40px;
        font-weight: bold;
        text-align: center;
        padding-bottom: 20px;
    }
    .subtitle-text {
        font-size: 20px;
        text-align: center;
        color: #666666;
        padding-bottom: 30px;
    }
    .success-text {
        padding: 1rem;
        border-radius: 5px;
        background-color: #d4edda;
        color: #155724;
        margin-bottom: 1rem;
    }
    .stAlert > div {
        padding: 1rem;
        border-radius: 5px;
    }
    </style>
""", unsafe_allow_html=True)


def get_youtube_url_from_params():
    # Read the video URL from the "url" query parameter, mapping a share URL back to YouTube
    query_params = st.query_params
    url = query_params.get("url", "")
    if url.startswith("https://shiappoutube.com"):
        return url.replace("https://shiappoutube.com", "https://youtube.com")
    return url


def display_youtube_thumbnail(url):
    try:
        # Extract video ID from URL
        parsed_url = urlparse(url)
        if parsed_url.netloc == 'youtu.be':
            video_id = parsed_url.path[1:]
        else:
            video_id = parse_qs(parsed_url.query)['v'][0]

        # Display thumbnail
        thumbnail_url = f"https://img.youtube.com/vi/{video_id}/maxresdefault.jpg"
        st.image(thumbnail_url, use_container_width=True)
    except Exception:
        # The thumbnail is optional; skip it if the URL cannot be parsed
        pass


def main():
    # Header
    st.markdown('<p class="title-text">YouTube Summary AI</p>', unsafe_allow_html=True)
    st.markdown(
        '<p class="subtitle-text">Transform YouTube videos into concise notes and summaries using AI</p>',
        unsafe_allow_html=True
    )

    # Create two columns for the main content
    col1, col2 = st.columns([2, 1])

    with col1:
        youtube_url = get_youtube_url_from_params()
        if not youtube_url:
            youtube_url = st.text_input(
                "🎥 Enter YouTube Video URL",
                placeholder="https://youtube.com/watch?v=..."
            )

    with col2:
        # The button stays disabled until a URL is available
        process_button = st.button("Generate Summary", type="primary", disabled=not youtube_url)

    # Display info box
    with st.expander("ℹ️ How to use"):
        st.markdown("""
            1. Paste a YouTube video URL in the input field
            2. Click 'Generate Summary' to process the video
            3. Or simply replace 'youtube.com' with 'shiappoutube.com' in any YouTube URL

            Example: `https://shiappoutube.com/watch?v=VIDEO_ID`
        """)

    if youtube_url:
        display_youtube_thumbnail(youtube_url)

    if process_button:
        try:
            # Create placeholder for streaming output
            output_placeholder = st.empty()

            with st.spinner("🎯 Downloading video..."):
                # Process video and display streaming output
                notes_and_summary = process_video(youtube_url)

            # Display the results in a nice format
            st.success("✅ Processing complete!")

            # Create tabs for different sections
            tab1, tab2 = st.tabs(["Notes & Summary", "Share"])

            with tab1:
                st.markdown("### Generated Content")
                st.write(notes_and_summary)

            with tab2:
                st.markdown("### Share this summary")
                share_url = youtube_url.replace("youtube.com", "shiappoutube.com")
                st.code(share_url, language="markdown")
                st.markdown("Copy this URL to share the summary with others!")

        except Exception as e:
            st.error(f"An error occurred: {e}")
            st.info("Please make sure you've entered a valid YouTube URL and try again.")

    # Footer
    st.markdown("---")
    st.markdown(
        "<div style='text-align: center; color: #666666;'>"
        "Made with ❤️ using Streamlit and AI"
        "</div>",
        unsafe_allow_html=True
    )


if __name__ == "__main__":
    main()
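
`display_youtube_thumbnail` above reads the video ID either from a `youtu.be` path or from the `v` query parameter, and silently skips the thumbnail for anything else. A minimal sketch of a more tolerant lookup follows. `extract_video_id` is a hypothetical helper, not part of this commit, and the handling of `/shorts/<id>` and `/embed/<id>` paths is an assumption about common YouTube URL shapes rather than something the original code covers.

```python
from urllib.parse import urlparse, parse_qs


def extract_video_id(url):
    """Return the video ID from a YouTube URL, or None if none is found."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    if host == "youtu.be":
        # Short links carry the ID as the first path segment.
        return parsed.path.lstrip("/").split("/")[0] or None
    if host in ("youtube.com", "m.youtube.com"):
        if parsed.path == "/watch":
            return parse_qs(parsed.query).get("v", [None])[0]
        # Assumption: /shorts/<id> and /embed/<id> put the ID in the second segment.
        segments = parsed.path.strip("/").split("/")
        if len(segments) == 2 and segments[0] in ("shorts", "embed"):
            return segments[1]
    return None
```

Returning `None` instead of raising lets the caller decide whether a missing ID is an error or just a reason to skip the thumbnail.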
requirements.txt
ADDED
@@ -0,0 +1,7 @@
streamlit>=1.28.0
yt-dlp>=2023.11.16
faster-whisper>=0.10.0
ollama>=0.1.5
urllib3>=2.0.7
python-dotenv>=1.0.0
FFmpeg>=1.4
summarizer.py
ADDED
@@ -0,0 +1,52 @@
import yt_dlp
from faster_whisper import WhisperModel
from ollama import Client


def download_audio(url):
    # Download the best available audio stream and convert it to WAV with FFmpeg
    ydl_opts = {
        'format': 'bestaudio/best',
        'postprocessors': [{
            'key': 'FFmpegExtractAudio',
            'preferredcodec': 'wav',
            'preferredquality': '192',
        }],
        'outtmpl': 'audio.%(ext)s',
    }
    try:
        with yt_dlp.YoutubeDL(ydl_opts) as ydl:
            ydl.download([url])
        return 'audio.wav'
    except Exception as e:
        raise Exception(f"Error downloading audio: {e}") from e


def transcribe_audio(audio_file):
    # Explicitly specify CPU device and compute type
    model = WhisperModel("base", device="cpu", compute_type="int8")
    segments, _ = model.transcribe(audio_file)
    return " ".join([segment.text for segment in segments])


ollama_client = Client()


def generate_notes_and_summary_stream(transcript):
    prompt = f"""
    Based on the following transcript of a video, create:
    1. A set of concise, informative notes
    2. A brief summary of the main points

    Transcript:
    {transcript}

    Notes and Summary:
    """

    # Stream the response and yield each text chunk as it arrives
    stream = ollama_client.generate(model='llama3.1:latest', prompt=prompt, stream=True)
    for chunk in stream:
        yield chunk['response']


def process_video(url):
    try:
        audio_file = download_audio(url)
        transcript = transcribe_audio(audio_file)
        # Returns a generator: the Ollama request only starts when the caller
        # iterates it, so generation errors surface in the caller, not here
        return generate_notes_and_summary_stream(transcript)
    except Exception as e:
        raise Exception(f"An error occurred: {e}") from e
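
Because `generate_notes_and_summary_stream` is a generator function, `process_video` returns before any text has been produced, and the caller has to iterate the result to obtain the summary. The following is a minimal sketch of one way to consume such a stream. `collect_stream` and its `on_chunk` callback are hypothetical names introduced for illustration and are not part of this commit.

```python
def collect_stream(chunks, on_chunk=None):
    """Join streamed text chunks, calling `on_chunk` with the running text after each one."""
    parts = []
    try:
        for chunk in chunks:
            parts.append(chunk)
            if on_chunk is not None:
                on_chunk("".join(parts))
    except Exception as exc:
        # Errors raised lazily by the generator surface here, during iteration.
        raise RuntimeError(
            f"Summarization failed after {len(parts)} chunk(s): {exc}"
        ) from exc
    return "".join(parts)
```

In `app.py`, one option would be to pass the otherwise unused placeholder as the callback, for example `collect_stream(process_video(youtube_url), output_placeholder.markdown)`, so the text updates while it is generated and the joined string is available afterwards.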