Talk to Llama 4
Talk to Llama 4 using Groq + Cloudflare
As a preview of what you can build with FastRTC and Cloudflare, check out this voice chat app built with Meta's new Llama 4 model!
As conversational AI becomes a core interface for tools, products, and services, real-time communication infrastructure is increasingly essential to support natural, multimodal interactions. Hugging Face built FastRTC to let AI developers build low-latency AI-powered audio and video streams with minimal Python code by abstracting away the complexities of WebRTC – the gold standard technology for real-time communication.
WebRTC-powered applications often face deployment challenges due to the need for specialized TURN servers, which enable reliable connections across different network environments. To address this issue, Cloudflare has built a global network of these servers that spans over 335 locations worldwide.
This partnership combines FastRTC’s easy development approach with Cloudflare's global TURN network, ensuring developers can create fast and reliable WebRTC applications with global connectivity.
FastRTC developers with a valid Hugging Face Access Token can stream 10GB of data for FREE every month without a credit card. Once the monthly limit is reached, developers can switch to their Cloudflare account for higher capacity (instructions).
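To get a rough feel for how far the 10GB allowance goes for a voice app, the arithmetic can be sketched as below. The numbers are assumptions, not measurements: a combined audio bitrate of 64 kbps (about 32 kbps in each direction, a plausible figure for compressed speech), decimal gigabytes, and every byte of the call passing through the TURN relay. The function name `relay_hours` is hypothetical.

```python
def relay_hours(quota_gb: float, total_kbps: float) -> float:
    """Estimate hours of relayed streaming that fit in a data quota.

    quota_gb   -- quota in decimal gigabytes (1 GB = 1e9 bytes), an assumption
    total_kbps -- combined send + receive bitrate in kilobits per second
    """
    quota_bits = quota_gb * 1e9 * 8          # bytes -> bits
    seconds = quota_bits / (total_kbps * 1000)
    return seconds / 3600


# Assumed 64 kbps total for a two-way audio conversation
print(round(relay_hours(10, 64), 1))  # → 347.2
```

Video streams use far higher bitrates, so the same quota covers much less time; pass a larger `total_kbps` to estimate that case.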
This partnership is especially valuable for AI developers building:
With FastRTC, developers can focus on their core application logic instead of building and maintaining TURN infrastructure. Cloudflare's managed service handles global scalability and reliability, so AI developers can deliver exceptional real-time experiences without that operational overhead.
The integration is available in FastRTC version 0.0.20 and above. To get started:
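pip install --upgrade 'fastrtc[vad]'

from fastrtc import ReplyOnPause, Stream, get_cloudflare_turn_credentials
import os

# The Hugging Face access token is used to obtain TURN credentials
os.environ["HF_TOKEN"] = "<your-hf-token>"

def echo(audio):
    # Send the speaker's audio back once they pause
    yield audio

stream = Stream(ReplyOnPause(echo),
                modality="audio", mode="send-receive",  # audio in, audio out
                rtc_config=get_cloudflare_turn_credentials)

stream.ui.launch()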
Launch your script with Python: python <name of your script>.py
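Note that `rtc_config` above is given a function rather than a fixed value. A plausible reason is that TURN credentials are often short-lived, so a callable lets fresh ones be fetched when a connection is set up. The sketch below illustrates the idea using the standard WebRTC `RTCConfiguration` shape (`iceServers` with `urls`, `username`, `credential`). The helper names `make_static_rtc_config` and `resolve_rtc_config`, the hostname, and the credentials are all hypothetical placeholders, not part of FastRTC or Cloudflare's API.

```python
from typing import Any, Callable, Dict, List, Union

RTCConfig = Dict[str, Any]


def make_static_rtc_config(urls: List[str], username: str, credential: str) -> RTCConfig:
    """Build a dict shaped like a WebRTC RTCConfiguration (hypothetical helper)."""
    if not urls:
        raise ValueError("at least one ICE server URL is required")
    for url in urls:
        # ICE server URLs use the stun:, turn:, or turns: schemes
        if not url.startswith(("stun:", "turn:", "turns:")):
            raise ValueError(f"not an ICE server URL: {url}")
    return {
        "iceServers": [
            {"urls": list(urls), "username": username, "credential": credential}
        ]
    }


def resolve_rtc_config(rtc_config: Union[RTCConfig, Callable[[], RTCConfig]]) -> RTCConfig:
    """Accept either a ready-made dict or a zero-argument callable that returns one."""
    return rtc_config() if callable(rtc_config) else rtc_config


# Placeholder values for illustration only
config = make_static_rtc_config(
    ["turn:turn.example.com:3478?transport=udp"], "demo-user", "demo-secret"
)
print(resolve_rtc_config(lambda: config)["iceServers"][0]["urls"])
```

A fixed dict like this suits a self-hosted TURN server with long-lived credentials; with a managed service, passing the credential-fetching function as in the snippet above avoids handing out stale credentials.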
See this Collection on Hugging Face as well as the FastRTC Cookbook for more examples.
If you have any questions or feedback, please reach out to us on GitHub or Hugging Face. Follow us on Hugging Face for the latest updates and announcements.