Spaces: papayaga/homeros_demo

papayaga committed on Jul 21, 2023

Commit

ec2a0f3

0 Parent(s):

init and basic architecture and logic in the Readme

Files changed (9)

.gitignore +166 -0
Dockerfile +52 -0
README.md +81 -0
adaptors/__init__.py +2 -0
adaptors/llm.py +28 -0
adaptors/voice.py +45 -0
homeros.py +3 -0
main.py +86 -0
requirements.txt +141 -0

.gitignore ADDED

	@@ -0,0 +1,166 @@

+# app specific
+outputs/*
+# Byte-compiled / optimized / DLL files
+__pycache__/
+*.py[cod]
+*$py.class
+# C extensions
+*.so
+# Mac stuff
+.DS_Store
+# Distribution / packaging
+.Python
+build/
+develop-eggs/
+dist/
+downloads/
+eggs/
+.eggs/
+lib/
+lib64/
+parts/
+sdist/
+var/
+wheels/
+share/python-wheels/
+*.egg-info/
+.installed.cfg
+*.egg
+MANIFEST
+# PyInstaller
+#  Usually these files are written by a python script from a template
+#  before PyInstaller builds the exe, so as to inject date/other infos into it.
+*.manifest
+*.spec
+# Installer logs
+pip-log.txt
+pip-delete-this-directory.txt
+# Unit test / coverage reports
+htmlcov/
+.tox/
+.nox/
+.coverage
+.coverage.*
+.cache
+nosetests.xml
+coverage.xml
+*.cover
+*.py,cover
+.hypothesis/
+.pytest_cache/
+cover/
+# Translations
+*.mo
+*.pot
+# Django stuff:
+*.log
+local_settings.py
+db.sqlite3
+db.sqlite3-journal
+# Flask stuff:
+instance/
+.webassets-cache
+# Scrapy stuff:
+.scrapy
+# Sphinx documentation
+docs/_build/
+# PyBuilder
+.pybuilder/
+target/
+# Jupyter Notebook
+.ipynb_checkpoints
+# IPython
+profile_default/
+ipython_config.py
+# pyenv
+#   For a library or package, you might want to ignore these files since the code is
+#   intended to run in multiple environments; otherwise, check them in:
+# .python-version
+# pipenv
+#   According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
+#   However, in case of collaboration, if having platform-specific dependencies or dependencies
+#   having no cross-platform support, pipenv may install dependencies that don't work, or not
+#   install all needed dependencies.
+#Pipfile.lock
+# poetry
+#   Similar to Pipfile.lock, it is generally recommended to include poetry.lock in version control.
+#   This is especially recommended for binary packages to ensure reproducibility, and is more
+#   commonly ignored for libraries.
+#   https://python-poetry.org/docs/basic-usage/#commit-your-poetrylock-file-to-version-control
+#poetry.lock
+# pdm
+#   Similar to Pipfile.lock, it is generally recommended to include pdm.lock in version control.
+#pdm.lock
+#   pdm stores project-wide configurations in .pdm.toml, but it is recommended to not include it
+#   in version control.
+#   https://pdm.fming.dev/#use-with-ide
+.pdm.toml
+# PEP 582; used by e.g. github.com/David-OConnor/pyflow and github.com/pdm-project/pdm
+__pypackages__/
+# Celery stuff
+celerybeat-schedule
+celerybeat.pid
+# SageMath parsed files
+*.sage.py
+# Environments
+.env
+.venv
+env/
+venv/
+ENV/
+env.bak/
+venv.bak/
+# Spyder project settings
+.spyderproject
+.spyproject
+# Rope project settings
+.ropeproject
+# mkdocs documentation
+/site
+# mypy
+.mypy_cache/
+.dmypy.json
+dmypy.json
+# Pyre type checker
+.pyre/
+# pytype static type analyzer
+.pytype/
+# Cython debug symbols
+cython_debug/
+# PyCharm
+#  JetBrains specific template is maintained in a separate JetBrains.gitignore that can
+#  be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore
+#  and can be added to the global gitignore or merged into this file.  For a more nuclear
+#  option (not recommended) you can uncomment the following to ignore the entire idea folder.
+#.idea/

Dockerfile ADDED

	@@ -0,0 +1,52 @@

+# Start from Ubuntu base image
+FROM ubuntu:latest
+# Set environment variables
+ENV LANG=C.UTF-8 LC_ALL=C.UTF-8
+# Install essentials
+RUN apt-get update && apt-get install -y \
+    software-properties-common \
+    build-essential \
+    curl \
+    git \
+    vim
+# Install ffmpeg
+RUN apt-get update && apt-get install -y \
+    ffmpeg
+# Install Python3 and pip
+RUN apt-get update && apt-get install -y \
+    python3-pip \
+    python3-dev \
+    python3-setuptools \
+    && pip3 install --upgrade pip setuptools wheel
+# Clean up APT when done
+RUN apt-get clean && rm -rf /var/lib/apt/lists/*
+# Set up a new user named "user" with user ID 1000
+RUN useradd -m -u 1000 user
+# Switch to the "user" user
+USER user
+# Set home to the user's home directory
+ENV HOME=/home/user \
+	PATH=/home/user/.local/bin:$PATH
+# Set up the working directory
+WORKDIR $HOME/app
+# Copy the current directory contents into the container at $HOME/app setting the owner to the user
+COPY --chown=user . $HOME/app
+# Try and run pip command after setting the user with `USER user` to avoid permission issues with Python
+RUN pip install --no-cache-dir --upgrade pip
+# Install the Python requirements (requirements.txt was copied with the app above)
+RUN pip3 install -r requirements.txt
+# Command to run on container start
+CMD ["gradio", "main.py"]

README.md ADDED

	@@ -0,0 +1,81 @@

+---
+title: HOMER.OS
+emoji: 🐳
+colorFrom: purple
+colorTo: black
+sdk: docker
+app_port: 7860
+---
+# HOMER.OS
+### the operating system for the next age of co-creative storytelling
+This demo is exploring the future of interactive storytelling.
+It puts the user in charge of how the story is going to develop.
+## Here is how it works:
+1. The user interacts with the system by means of audio messages
+2. The experience starts with the user inputting the details of the hero, the style and the story they'd like to play.
+3. The system creates the beginning of the story and reads it out loud to the user. The system then asks what the hero should do next.
+4. The user answers via voice message
+5. The system takes the user input into account, generates the next chunk of the story and reads it out to the user
+6. The loop continues until, after X messages, the system decides to end the story (to avoid exceeding the GPT context window for now)
+## Tech. Stack for the demo:
+- GPT-4 for story generation
+- Whisper for speech to text
+- Play.ht for voice generation
+- Gradio for interface
+- Hugging Face Spaces for deployment
+## Story schema
+- STRING `uuid` = uuid of this story
+- STRING `status` = 'not_started' / 'ongoing' / 'finished'
+- TEXT `world` = text description of the world
+- TEXT `hero` = text description of the hero of the story
+- TEXT `plot` = high-level description of the plot, without chapters or anything like that. We can use this later to break the plot down into chapters and get smarter about story arc management with a second LLM
+- STRING `ending` = text string representing what kind of ending we want e.g. happy or tragic
+- STRING `style` = text description of the style of story-telling
+- STRING `voice` = id of the voice we are using for sounding the story
+- TEXT(JSON) `chunks` = JSON array of story-chunks. each chunk has {"text", "audio_url"}
+- TEXT(JSON) `messages` = JSON array of messages in the OpenAI-compatible format {"role": "system" / "user" / "assistant", "content": message}
+- STRING `full_story_audio_url` = url of the full rendered audio story (story chunks audio combined)
+- TEXT `full_story_text` = full story text
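
The schema above can be sketched as a plain Python dict, which is also how `main.py` holds it in `gr.State`. This is a minimal sketch, not the project's implementation: `new_story` and `add_chunk` are hypothetical helper names, and the choice to rebuild `full_story_text` on every chunk is an assumption.

```python
import json
import uuid


def new_story(voice="dylan"):
    """Return a fresh story record with the fields listed in the schema."""
    return {
        "uuid": str(uuid.uuid4()),
        "status": "not_started",  # 'not_started' / 'ongoing' / 'finished'
        "world": "",
        "hero": "",
        "plot": "",
        "ending": "",
        "style": "",
        "voice": voice,
        "chunks": [],             # each chunk: {"text", "audio_url"}
        "messages": [],           # OpenAI-style {"role", "content"} dicts
        "full_story_audio_url": "",
        "full_story_text": "",
    }


def add_chunk(story, text, audio_url=""):
    """Append a chunk and keep the messages and the full text in sync."""
    story["chunks"].append({"text": text, "audio_url": audio_url})
    story["messages"].append({"role": "assistant", "content": text})
    story["full_story_text"] = "\n\n".join(c["text"] for c in story["chunks"])
    story["status"] = "ongoing"
    return story


story = add_chunk(new_story(), "Once upon a time...")
# TEXT(JSON) fields would be stored as strings like this one.
serialized_chunks = json.dumps(story["chunks"])
```

Keeping `chunks` and `messages` as lists in memory and serializing them only at the storage boundary matches the TEXT(JSON) columns in the schema.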
+## Flow
+1. Welcome the user
+2. Ask for the magic word
+3. Check the magic word - if it is wrong, apologize and tell them how to get it
+4. Once we have the magic word - generate uuid and kickstart story configuration:
+    - say "Let me now ask you a few questions about the story you'd like to hear..."
+    - ask the user about the world their story should happen in
+    - ask the user about the hero and save it
+    - ask the user about the plot and save it
+    - ask the user if they want the story to end in a happy way or in a sad way (free user input) and save it
+    - ask the user about the style and save it
+5. Say "Our story is all set! Let it begin."
+6. Tell the first paragraph / part and then ask at the end "What do you think the hero should do next?"
+7. Process user input, generate the next chunk and repeat
+8. If number of chunks (or total tokens in the story) is approaching the limit - end the story by passing a constructed user message that references the type of ending
+9. Thank the user and say goodbye
+10. If the user records more messages - say a fixed message that this story has ended, but if the user wants another one, they can come again.
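
Step 4 of the flow (the configuration questions) could be driven by a small helper like the one below. This is a sketch under assumptions: `QUESTIONS`, `next_question` and `save_answer` are hypothetical names, the question wording is invented for illustration, and only the closing sentence is taken from the flow itself.

```python
# Configuration fields in the order given by step 4 of the flow.
# The question texts are placeholders, not the project's wording.
QUESTIONS = [
    ("world", "What world should the story happen in?"),
    ("hero", "Who is the hero of the story?"),
    ("plot", "What should the story be about?"),
    ("ending", "Should the story end in a happy way or in a sad way?"),
    ("style", "In what style should the story be told?"),
]

ALL_SET = "Our story is all set! Let it begin."


def next_question(story):
    """Return (field, question) for the first unanswered field, or None."""
    for field, question in QUESTIONS:
        if not story[field]:
            return field, question
    return None


def save_answer(story, answer):
    """Store the answer to the pending question and return what to say next."""
    pending = next_question(story)
    if pending is None:
        return ALL_SET
    field, _ = pending
    story[field] = answer.strip()
    following = next_question(story)
    if following is None:
        story["status"] = "ongoing"  # configuration done, storytelling starts
        return ALL_SET
    return following[1]
```

An empty answer leaves the field unset, so the same question is asked again on the next turn.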
+## Basic ToDo
+- Gradio inputs/outputs/state setup (with text only)
+- Story object setup, schema, logic
+- GPT-4 story generation in a gradio interface
+- Dockerfile and deploy (including magic word for access control)
+- Swap text input for Whisper
+- Swap text output for Play.ht voice generation
+## Enhancements
+- Add SQLite DB and save stories
+- Add option to download the full story as one .mp3
+- Add meta-moderator role to manage story state and scenarios better

adaptors/__init__.py ADDED

	@@ -0,0 +1,2 @@


+from . import llm
+from . import voice

adaptors/llm.py ADDED

	@@ -0,0 +1,28 @@

+'''
+an abstraction over GPT-4 for easy substitution later if needed
+'''
+import openai
+import os
+openai.api_key = os.getenv('OPENAI_API_KEY')
+MODEL = 'gpt-4'
+#MODEL = 'gpt-3.5-turbo'
+def answer(system_message, user_and_assistant_messages):
+    messages = [{
+        "role":"system",
+        "content": system_message  # was `system_prompt`, which is undefined in this scope
+    }]
+    messages.extend(user_and_assistant_messages)
+    chat_completion = openai.ChatCompletion.create(
+        model=MODEL,
+        messages=messages
+    )
+    output = chat_completion.choices[0].message.content
+    return output
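
The message list that `answer` sends can be assembled and tested without calling the API. The sketch below also illustrates the README idea of ending the story with a constructed user message that references the ending type. All names here (`build_messages`, `ending_request`, `next_turn`), the `max_chunks` limit and the prompt wording are assumptions made for illustration.

```python
def build_messages(system_message, history):
    """Prepend the system message to the user/assistant history."""
    return [{"role": "system", "content": system_message}] + list(history)


def ending_request(ending):
    """Hypothetical user message asking the model to wrap the story up."""
    return {
        "role": "user",
        "content": f"Please bring the story to a close now with a {ending} ending.",
    }


def next_turn(story, user_input, max_chunks=10):
    """Record the user's turn, or the ending request once the limit is near."""
    if len(story["chunks"]) + 1 >= max_chunks:
        story["messages"].append(ending_request(story["ending"]))
    else:
        story["messages"].append({"role": "user", "content": user_input})
    # The system prompt here is a placeholder, not the project's prompt.
    return build_messages("You are a storyteller.", story["messages"])
```

The returned list has the shape that `openai.ChatCompletion.create(model=MODEL, messages=...)` receives in the file above.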

adaptors/voice.py ADDED

	@@ -0,0 +1,45 @@

+'''
+generates voice using play.ht api
+returns a url of the generated mp3
+'''
+import requests
+import sseclient
+from loguru import logger
+import os
+from pprint import pprint
+import json
+url = "https://play.ht/api/v2/tts"
+user_id = os.environ['PLAYHT_USERID']
+api_key = os.environ['PLAYHT_SECRETKEY']
+headers = {
+  "accept": "text/event-stream",
+  "content-type": "application/json",
+  "AUTHORIZATION": f"Bearer {api_key}",
+  "X-USER-ID": user_id
+}
+def say_it(text, voice):
+  payload = {
+    "quality": "medium",
+    "output_format": "mp3",
+    "speed": 1,
+    "sample_rate": 24000,
+    "text": text,
+    "voice": voice
+  }
+  response = requests.post(url, stream=True, headers=headers, json=payload)
+  stream_url = response.headers['content-location']
+  logger.debug(f"stream_url = {stream_url}")
+  resp = requests.get(stream_url, stream=True, headers=headers)
+  client = sseclient.SSEClient(resp)
+  for event in client.events():
+    if event.data:
+      e = json.loads(event.data)
+      if e["stage"] == "complete":
+        return(e["url"])
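
The event loop at the end of `say_it` can be exercised offline by separating the parsing from the network calls. This is a minimal sketch: `first_complete_url` is a hypothetical helper, and the sample payloads only mirror the `stage` and `url` keys that the code above reads; the real event format is not confirmed here.

```python
import json


def first_complete_url(event_payloads):
    """Return the url of the first event whose stage is 'complete', else None."""
    for data in event_payloads:
        if not data:
            continue  # empty event, nothing to parse
        try:
            event = json.loads(data)
        except json.JSONDecodeError:
            continue  # skip payloads that are not JSON
        if isinstance(event, dict) and event.get("stage") == "complete":
            return event.get("url")
    return None  # the stream ended without a 'complete' event


# Made-up payloads for illustration only.
sample = [
    "",
    '{"stage": "queued"}',
    '{"stage": "complete", "url": "https://example.com/story.mp3"}',
]
print(first_complete_url(sample))
```

Returning `None` explicitly makes the "stream ended early" case visible to the caller, which `say_it` currently handles only implicitly.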

homeros.py ADDED

	@@ -0,0 +1,3 @@


+
+def start_story(world, hero, plot, style):
+    return "ohoho"

main.py ADDED

	@@ -0,0 +1,86 @@

+import gradio as gr
+from pprint import pprint
+import uuid
+import json
+from loguru import logger
+from dotenv import load_dotenv
+load_dotenv()
+from homeros import start_story  # continue_story is not defined in homeros.py yet
+def gen_unique_id():
+    return str(uuid.uuid4())
+def do_homeros(user_input, story):
+    pprint(story)
+    if story["status"] == "not_started":
+        story["uuid"] = gen_unique_id()
+    return json.dumps(story["messages"]), story
+demo = gr.Blocks()
+with demo:
+    story = gr.State(value = {
+        "uuid" : "",
+        "status" : "not_started",
+        "world": "",
+        "hero": "",
+        "plot": "",
+        "ending": "",
+        "style": "",
+        "voice": "dylan",
+        "chunks" : [],
+        "messages": [],
+        "full_story_audio_url": "",
+        "full_story_text": ""
+    })
+    pprint(story.value)
+    with gr.Row():
+        gr.Markdown('''
+# HOMEROS
+This demo is exploring the future of interactive storytelling.
+It puts the user in charge and blurs the boundary between the reader and the author.
+Hit "Tell me!" to get started.
+When Homeros asks you something - hit record, answer with your voice and then hit "Tell me!" again.
+        ''')
+    with gr.Row():
+        text_input = gr.Textbox()
+    with gr.Row():
+        go_btn = gr.Button(
+            "Tell me!",
+        )
+    with gr.Row():
+        story_chunk = gr.Textbox()
+    go_btn.click(
+        do_homeros,
+        inputs=[text_input, story],
+        outputs=[story_chunk, story]
+    )
+    demo.queue(
+        concurrency_count=5
+    )
+    demo.launch(
+        server_name="0.0.0.0",
+        ssl_verify=False,
+        show_api=False
+    )

requirements.txt ADDED

	@@ -0,0 +1,141 @@

+aiofiles==23.1.0
+aiohttp==3.8.5
+aiosignal==1.3.1
+altair==5.0.1
+annotated-types==0.5.0
+anyio==3.7.1
+async-timeout==4.0.2
+attrs==23.1.0
+certifi==2023.5.7
+charset-normalizer==3.2.0
+click==8.1.6
+contourpy==1.1.0
+cycler==0.11.0
+fastapi==0.100.0
+ffmpy==0.3.1
+filelock==3.12.2
+fonttools==4.41.0
+frozenlist==1.4.0
+fsspec==2023.6.0
+gradio==3.37.0
+gradio_client==0.2.10
+h11==0.14.0
+httpcore==0.17.3
+httpx==0.24.1
+huggingface-hub==0.16.4
+idna==3.4
+Jinja2==3.1.2
+jsonschema==4.18.4
+jsonschema-specifications==2023.7.1
+kiwisolver==1.4.4
+linkify-it-py==2.0.2
+loguru==0.7.0
+markdown-it-py==2.2.0
+MarkupSafe==2.1.3
+matplotlib==3.7.2
+mdit-py-plugins==0.3.3
+mdurl==0.1.2
+multidict==6.0.4
+numpy==1.25.1
+openai==0.27.8
+orjson==3.9.2
+packaging==23.1
+pandas==2.0.3
+Pillow==10.0.0
+pydantic==2.0.3
+pydantic_core==2.3.0
+pydub==0.25.1
+pyparsing==3.0.9
+python-dateutil==2.8.2
+python-dotenv==1.0.0
+python-multipart==0.0.6
+pytz==2023.3
+PyYAML==6.0.1
+referencing==0.30.0
+requests==2.31.0
+rpds-py==0.9.2
+semantic-version==2.10.0
+six==1.16.0
+sniffio==1.3.0
+sseclient==0.0.27
+starlette==0.27.0
+toolz==0.12.0
+tqdm==4.65.0
+typing_extensions==4.7.1
+tzdata==2023.3
+uc-micro-py==1.0.2
+urllib3==2.0.4
+uvicorn==0.23.1
+websockets==11.0.3
+yarl==1.9.2
+uuid==1.30