Gradio Templates


Recent Activity

gradio-templates's activity

akhaliq 
posted an update 6 days ago
Google drops Gemini 2.0 Flash Thinking

A new experimental model that unlocks stronger reasoning capabilities and shows its thoughts. The model plans with its thoughts visible, solves complex problems at Flash speeds, and more.

Now available in anychat; try it out: akhaliq/anychat
freddyaboulton 
posted an update 7 days ago
freddyaboulton 
posted an update 12 days ago
Version 0.0.21 of gradio-pdf now properly loads Chinese characters!
freddyaboulton 
posted an update 13 days ago
Hello Llama 3.2! 🗣️🦙

Build a Siri-like coding assistant that responds to "Hello Llama" in 100 lines of Python! All with Gradio, WebRTC 😎

freddyaboulton/hey-llama-code-editor
freddyaboulton 
posted an update 14 days ago
akhaliq 
posted an update 27 days ago
QwQ-32B-Preview is now available in anychat

A reasoning model that is competitive with OpenAI o1-mini and o1-preview

try it out: akhaliq/anychat
akhaliq 
posted an update 27 days ago
New model drop in anychat

allenai/Llama-3.1-Tulu-3-8B is now available

try it here: akhaliq/anychat
akhaliq 
posted an update about 1 month ago
anychat

Supports ChatGPT, Gemini, Perplexity, Claude, Meta Llama, and Grok, all in one app.

Try it out here: akhaliq/anychat
fffiloni 
posted an update about 1 month ago
fffiloni 
posted an update 3 months ago
Visionary Walter Murch (editor for Francis Ford Coppola), in 1999:

“So let's suppose a technical apotheosis some time in the middle of the 21st century, when it somehow becomes possible for one person to make an entire feature film, with virtual actors. Would this be a good thing?

If the history of oil painting is any guide, the broadest answer would be yes, with the obvious caution to keep a wary eye on the destabilizing effect of following too intently a hermetically personal vision. One need only look at the unraveling of painting or classical music in the 20th century to see the risks.

Let's go even further, and force the issue to its ultimate conclusion by supposing the diabolical invention of a black box that could directly convert a single person's thoughts into a viewable cinematic reality. You would attach a series of electrodes to various points on your skull and simply think the film into existence.

And since we are time-traveling, let us present this hypothetical invention as a Faustian bargain to the future filmmakers of the 21st century. If this box were offered by some mysterious cloaked figure in exchange for your eternal soul, would you take it?

The kind of filmmakers who would accept, even leap, at the offer are driven by the desire to see their own vision on screen in as pure a form as possible. They accept present levels of collaboration as the evil necessary to achieve this vision. Alfred Hitchcock, I imagine, would be one of them, judging from his description of the creative process: "The film is already made in my head before we start shooting."”

Read "A Digital Cinema of the Mind? Could Be" by Walter Murch: https://archive.nytimes.com/www.nytimes.com/library/film/050299future-film.html

abidlabs 
posted an update 3 months ago
👋 Hi Gradio community,

I'm excited to share that Gradio 5 will launch in October with improvements across security, performance, SEO, design (see the screenshot for Gradio 4 vs. Gradio 5), and user experience, making Gradio a mature framework for web-based ML applications.

Gradio 5 is currently in beta, so if you'd like to try it out early, please refer to the instructions below:

---------- Installation -------------

Gradio 5 requires Python 3.10 or higher. If you are running Gradio locally, please make sure you have a compatible Python version, or download one here: https://www.python.org/downloads/

* Locally: If you are running gradio locally, simply install the release candidate with pip install gradio --pre
* Spaces: If you would like to update an existing gradio Space to use Gradio 5, you can simply update the sdk_version to be 5.0.0b3 in the README.md file on Spaces.

In most cases, that’s all you have to do to run Gradio 5.0. If you start your Gradio application, you should see your Gradio app running, with a fresh new UI.

-----------------------------

For more information, please see: https://github.com/gradio-app/gradio/issues/9463
whitphx 
posted an update 6 months ago
Have you looked at Gemini Nano, the local LLM?
Gradio-Lite, the in-browser version of Gradio, gives it a rich interface using only Python code, even for such an in-browser AI app!

Try out a chat app that runs completely inside your browser 👇
https://www.gradio.app/playground?demo=Hello_World&code=IyBOT1RFOiBHZW1pbmkgTmFubyBtdXN0IGJlIGVuYWJsZWQgaW4geW91ciBicm93c2VyLiBTZWUgYXJ0aWNsZXMgbGlrZSBodHRwczovL3dyaXRpbmdtYXRlLmFpL2Jsb2cvYWNjZXNzLXRvLWdlbWluaS1uYW5vLWxvY2FsbHkKaW1wb3J0IGdyYWRpbyBhcyBncgpmcm9tIGpzIGltcG9ydCBzZWxmICAjIFB5b2RpZGUgcHJvdmlkZXMgYWNjZXNzIHRvIHRoZSBKUyBzY29wZSB2aWEgYGpzYCBtb2R1bGUuIFNlZSBodHRwczovL3B5b2RpZGUub3JnL2VuL3N0YWJsZS91c2FnZS9hcGkvcHl0aG9uLWFwaS5odG1sI3B5dGhvbi1hcGkKCiMgSW5pdGlhbGl6ZSBQcm9tcHQgQVBJCnNlc3Npb24gPSBOb25lCnRyeToKICAgIGNhbl9haV9jcmVhdGUgPSBhd2FpdCBzZWxmLmFpLmNhbkNyZWF0ZVRleHRTZXNzaW9uKCkKICAgIGlmIGNhbl9haV9jcmVhdGUgIT0gIm5vIjoKICAgICAgICBzZXNzaW9uID0gYXdhaXQgc2VsZi5haS5jcmVhdGVUZXh0U2Vzc2lvbigpCmV4Y2VwdDoKICAgIHBhc3MKCgpzZWxmLmFpX3RleHRfc2Vzc2lvbiA9IHNlc3Npb24KCgphc3luYyBkZWYgcHJvbXB0KG1lc3NhZ2UsIGhpc3RvcnkpOgogICAgc2Vzc2lvbiA9IHNlbGYuYWlfdGV4dF9zZXNzaW9uCiAgICBpZiBub3Qgc2Vzc2lvbjoKICAgICAgICByYWlzZSBFeGNlcHRpb24oIkdlbWluaSBOYW5vIGlzIG5vdCBhdmFpbGFibGUgaW4geW91ciBicm93c2VyLiIpCgogICAgc3RyZWFtID0gc2Vzc2lvbi5wcm9tcHRTdHJlYW1pbmcobWVzc2FnZSkKICAgIGFzeW5jIGZvciBjaHVuayBpbiBzdHJlYW06CiAgICAgICAgeWllbGQgY2h1bmsKCgpkZW1vID0gZ3IuQ2hhdEludGVyZmFjZShmbj1wcm9tcHQpCgpkZW1vLmxhdW5jaCgp

Note: Gemini Nano is currently only available in Chrome Canary, and you need to opt in.
Follow the "Installation" section in https://huggingface.co/blog/Xenova/run-gemini-nano-in-your-browser
freddyaboulton 
posted an update 6 months ago
akhaliq 
posted an update 7 months ago
Phased Consistency Model

Phased Consistency Model (2405.18407)

The consistency model (CM) has recently made significant progress in accelerating the generation of diffusion models. However, its application to high-resolution, text-conditioned image generation in the latent space (a.k.a. LCM) remains unsatisfactory. In this paper, we identify three key flaws in the current design of LCM. We investigate the reasons behind these limitations and propose the Phased Consistency Model (PCM), which generalizes the design space and addresses all identified limitations. Our evaluations demonstrate that PCM significantly outperforms LCM across 1–16 step generation settings. While PCM is specifically designed for multi-step refinement, it achieves 1-step generation results superior or comparable to previous state-of-the-art methods specifically designed for 1-step generation. Furthermore, we show that PCM's methodology is versatile and applicable to video generation, enabling us to train a state-of-the-art few-step text-to-video generator.
abidlabs 
posted an update 7 months ago
𝗣𝗿𝗼𝘁𝗼𝘁𝘆𝗽𝗶𝗻𝗴 holds an important place in machine learning. But it has traditionally been quite difficult to go from prototype code to production-ready APIs.

We're working on making that a lot easier with 𝗚𝗿𝗮𝗱𝗶𝗼 and will unveil something new on June 6th: https://www.youtube.com/watch?v=44vi31hehw4&ab_channel=HuggingFace
fffiloni 
posted an update 7 months ago
🇫🇷
What impact is AI having on the film, audiovisual, and video game industries?
A forward-looking study for professionals
— CNC & BearingPoint | 09/04/2024

While Artificial Intelligence (AI) has long been used in the film, audiovisual, and video game sectors, the new applications of generative AI are upending our view of what a machine is capable of and carry an unprecedented potential for transformation. The quality of their output is striking, and they consequently spark many debates, between expectations and apprehensions.

The CNC has therefore decided to launch a new AI Observatory in order to better understand the uses of AI and its real impact on the image industry. As part of this Observatory, the CNC set out to establish an initial overview by mapping the current and potential uses of AI at each stage of the creation and distribution of a work, identifying the associated opportunities and risks, particularly in terms of professions and employment. The main findings of this CNC / BearingPoint study were presented on March 6, during the CNC conference "Creating, producing, distributing in the age of artificial intelligence".

The CNC is publishing the expanded version of the map of AI uses in the film, audiovisual, and video game industries.

Link to the full map: https://www.cnc.fr/documents/36995/2097582/Cartographie+des+usages+IA_rapport+complet.pdf/96532829-747e-b85e-c74b-af313072cab7?t=1712309387891
akhaliq 
posted an update 7 months ago
Chameleon

Mixed-Modal Early-Fusion Foundation Models

Chameleon: Mixed-Modal Early-Fusion Foundation Models (2405.09818)

We present Chameleon, a family of early-fusion token-based mixed-modal models capable of understanding and generating images and text in any arbitrary sequence. We outline a stable training approach from inception, an alignment recipe, and an architectural parameterization tailored for the early-fusion, token-based, mixed-modal setting. The models are evaluated on a comprehensive range of tasks, including visual question answering, image captioning, text generation, image generation, and long-form mixed-modal generation. Chameleon demonstrates broad and general capabilities, including state-of-the-art performance in image captioning tasks, outperforms Llama-2 in text-only tasks while being competitive with models such as Mixtral 8x7B and Gemini-Pro, and performs non-trivial image generation, all in a single model. It also matches or exceeds the performance of much larger models, including Gemini Pro and GPT-4V, according to human judgments on a new long-form mixed-modal generation evaluation, where either the prompt or the outputs contain mixed sequences of both images and text. Chameleon marks a significant step forward in unified modeling of full multimodal documents.