Add Streaming Audio Support for TTS Entities #2277
**Describe the feature**

Add real-time streaming audio support for TTS (Text-to-Speech) engines in the voice assistant pipeline. Currently, all TTS providers must buffer the complete audio file before playback can begin, adding significant latency. Many modern TTS APIs (Cartesia, ElevenLabs, OpenAI) support streaming responses where audio chunks arrive progressively, but Home Assistant's TTS framework doesn't support this. This feature would allow TTS integrations to stream audio chunks as they arrive from the API, enabling the voice assistant to start speaking immediately without waiting for the entire response to be generated.

**Example commands**

"Tell me about the history of Prague" (long response = greatest latency improvement)

Current behavior: 3-5 second delay before speech starts (waiting for the complete audio).

**Use cases**

1. More Natural Conversations
2. Better User Experience for Long Responses
3. Enable Modern TTS Providers
4. Reduced Hardware Requirements
5. Competitive with Commercial Assistants

**Anything else?**

- Language considerations:
- Hardware setups:
- Integration with other voice services:
- Technical implementation notes:
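To make the latency argument above concrete, here is a minimal sketch contrasting buffered playback (wait for the whole clip) with streaming playback (hand each chunk to the player as it arrives). This is illustrative only, not Home Assistant code: `fetch_tts_stream`, the chunk sizes, and the timings are all hypothetical stand-ins for a real streaming TTS API.

```python
import asyncio
from typing import AsyncIterator

CHUNK = b"\x00" * 320  # hypothetical chunk of 16-bit 16 kHz mono audio


async def fetch_tts_stream(text: str) -> AsyncIterator[bytes]:
    """Stand-in for a streaming TTS API: yields chunks as they are synthesized."""
    for _ in range(5):
        await asyncio.sleep(0.01)  # simulated per-chunk synthesis time
        yield CHUNK


async def play_buffered(text: str) -> tuple[float, bytes]:
    """Current behavior: the full clip must arrive before playback starts."""
    loop = asyncio.get_running_loop()
    start = loop.time()
    audio = b"".join([chunk async for chunk in fetch_tts_stream(text)])
    first_audio_delay = loop.time() - start  # playback can only begin now
    return first_audio_delay, audio


async def play_streaming(text: str) -> tuple[float, bytes]:
    """Proposed behavior: forward each chunk to the player as it arrives."""
    loop = asyncio.get_running_loop()
    start = loop.time()
    first_audio_delay = 0.0
    played = b""
    async for chunk in fetch_tts_stream(text):
        if not played:
            first_audio_delay = loop.time() - start  # speech starts at chunk 1
        played += chunk  # in practice: write the chunk to the media player
    return first_audio_delay, played


async def main() -> tuple[float, float]:
    buffered_delay, a = await play_buffered("Tell me about the history of Prague")
    streaming_delay, b = await play_streaming("Tell me about the history of Prague")
    assert a == b  # same audio either way; only time-to-first-sound differs
    return buffered_delay, streaming_delay
```

By construction, the buffered path waits for all five synthesis steps before any sound can play, while the streaming path starts after the first, which is exactly the perceived-latency win the request describes.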
Replies: 2 comments 3 replies
Support for streaming the TTS response is already in core, but the TTS integrations need to adopt it; some already do (ElevenLabs, Nabu Cloud and Wyoming).
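For what "adopting" streaming might look like from an integration's side, here is a hedged sketch: a provider that exposes synthesis as an async iterator of audio chunks rather than one buffered clip. The class and method names (`StreamingTTSProvider`, `synthesize`, `collect`) are illustrative assumptions, not the actual Home Assistant TTS entity API; a real integration would yield chunks from a chunked HTTP or WebSocket response.

```python
import asyncio
from typing import AsyncIterator


class StreamingTTSProvider:
    """Hypothetical provider wrapping a streaming TTS HTTP API."""

    async def synthesize(self, text: str) -> AsyncIterator[bytes]:
        # A real integration would open a chunked/streaming response here
        # (e.g. a provider's streaming endpoint) and yield each audio chunk
        # as soon as it arrives, instead of accumulating the whole clip.
        for word in text.split():
            yield word.encode() + b"\x00"  # placeholder "audio" per word


async def collect(provider: StreamingTTSProvider, text: str) -> list[bytes]:
    """Consumer draining the stream, standing in for the media player."""
    return [chunk async for chunk in provider.synthesize(text)]
```

The key design point is the interface shape: once the framework accepts an async iterator, providers that can only return complete audio can still yield it as a single chunk, which keeps the change backward compatible.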
This is a strong and well-explained feature request, especially in terms of improving perceived latency for voice responses. Allowing audio playback to start immediately instead of waiting several seconds would significantly enhance the natural flow of conversations.

Real-time streaming is already a proven approach in other ecosystems, where progressive delivery improves responsiveness and user satisfaction. A similar concept can be seen in media streaming add-ons like torrentio, where content becomes usable as soon as data starts arriving rather than after full buffering. Applying that same idea to TTS pipelines feels both logical and overdue.

The suggested backward-compatible design is also practical. Existing TTS providers could continue returning complete audio, while newer integrations take advantage of streaming capabilities. This would especially benefit lower-end hardware by reducing memory usage and improving overall performance.

Overall, adding streaming TTS support would meaningfully improve the voice assistant experience and help close the gap with commercial platforms that already offer near-instant speech responses.