[v0.20.5] Excessive warning log spam: "Cannot find RTP history" and "IsSendable(): Invalid state: Disconnected" #2066

@varohso

Environment

OME Version: v0.20.5 (Docker image)
Role: Edge (1:N WebRTC streaming)
Scale: ~150 edge nodes
Comparison: v0.16.8 edges running the same workload do NOT produce these warnings

Problem

Two warning messages are flooding the logs at an extremely high rate on v0.20.5 edge servers.

  1. RTP history lookup failure (rtc_bandwidth_estimator.cpp:173)
[2026-04-10 13:26:41.645] W [SW-WebRTC:481] WebRTC Publisher | rtc_bandwidth_estimator.cpp:173 | Cannot find RTP history for wide_seq_no(13045)
[2026-04-10 13:26:41.645] W [SW-WebRTC:481] WebRTC Publisher | rtc_bandwidth_estimator.cpp:173 | Cannot find RTP history for wide_seq_no(13046)
[2026-04-10 13:26:41.645] W [SW-WebRTC:481] WebRTC Publisher | rtc_bandwidth_estimator.cpp:173 | Cannot find RTP history for wide_seq_no(13047)

...
Consecutive sequence numbers are missing from the history, suggesting the buffer may be undersized or packets are expiring before TransportCC feedback arrives.
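To illustrate the "undersized buffer" hypothesis, here is a minimal sketch of a bounded send history keyed by transport-wide sequence number. It is not OME's implementation: `RtpHistory`, `missing_on_feedback`, and the capacity values are hypothetical names and numbers chosen only to show how late feedback can miss a run of consecutive entries.

```python
from collections import OrderedDict


class RtpHistory:
    """Hypothetical bounded send history (not OME's actual data structure)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._sent = OrderedDict()  # wide_seq_no -> send time in ms

    def record(self, wide_seq_no, sent_at_ms):
        self._sent[wide_seq_no] = sent_at_ms
        # Evict the oldest entries once the history exceeds its capacity.
        while len(self._sent) > self.capacity:
            self._sent.popitem(last=False)

    def lookup(self, wide_seq_no):
        return self._sent.get(wide_seq_no)


def missing_on_feedback(capacity, packets_sent, feedback_range):
    """Return the sequence numbers in feedback_range that were already evicted."""
    history = RtpHistory(capacity)
    for seq in range(packets_sent):
        history.record(seq, sent_at_ms=seq)
    return [seq for seq in feedback_range if history.lookup(seq) is None]


# Feedback covering 13045..13050 arrives after 13148 packets have been sent.
print(missing_on_feedback(100, 13148, range(13045, 13051)))
print(missing_on_feedback(1000, 13148, range(13045, 13051)))
```

In this sketch the small history has already evicted the oldest part of the feedback range, so the misses are consecutive, while the larger history finds every entry. Whether OME's history is bounded by count, by time, or by something else is not established here.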

  2. Disconnected socket send attempts (socket.cpp:1142)
[2026-04-10 13:26:44.059] W [SW-WebRTC:475] Socket | socket.cpp:1142 | [#140] [0x7f074a4b7710] IsSendable(): Invalid state: Disconnected (expected: == SocketState::Connected) - <ClientSocket: 0x7f074a4b7710, #140, Disconnected, TCP, Nonblocking, 162.120.194.154:32575>

This repeats hundreds of times per second for a single disconnected client, indicating the session is not being cleaned up after socket disconnect.
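To quantify the rate per socket, a small script along these lines can count the `IsSendable()` warnings per second and per socket ID. It assumes the log line layout shown in the sample above; the regular expression and the function name `count_disconnected_sends` are my own and may need adjusting for other log formats.

```python
import re
from collections import Counter

# Matches the timestamp (truncated to the second) and the "[#N]" socket ID
# in lines shaped like the socket.cpp:1142 warning quoted in this issue.
LINE_RE = re.compile(
    r"\[(?P<second>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\.\d{3}\] W .*"
    r"\[#(?P<socket_id>\d+)\] .*IsSendable\(\): Invalid state: Disconnected"
)


def count_disconnected_sends(lines):
    """Count warnings keyed by (second, socket_id); other lines are ignored."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        if match:
            counts[(match["second"], match["socket_id"])] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Usage: pipe the server log into this script on stdin.
    for (second, socket_id), n in count_disconnected_sends(sys.stdin).most_common(10):
        print(f"{second} socket #{socket_id}: {n} warnings")
```

Sorting by count makes it easy to see whether the spam comes from a handful of stuck sessions or is spread across many clients.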

Impact

Log volume is extremely high, causing disk I/O pressure.
Repeated send attempts to disconnected sockets may waste CPU.
Streaming performance on busy edge nodes may be affected.

Expected behavior

The RTP history buffer should be large enough to handle normal TransportCC feedback latency.
A socket disconnect should trigger session cleanup promptly, stopping further send attempts.

Follow-up questions

We're running an A/B test with many edges on v0.16.8 and ~n edges on v0.20.5 under the same workload. These warnings only appear on v0.20.5 edges.

Regarding the IsSendable(): Disconnected spam
The same socket produces dozens of warnings within a single millisecond, which suggests the media sending thread keeps pushing packets to a session whose socket is already disconnected.
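The behavior I would expect looks roughly like the sketch below: once the disconnect is observed, the session rejects sends and the sending loop stops at the first rejection instead of retrying for every packet. This is purely illustrative and is not how OME is structured; `Session`, `on_socket_disconnected`, and `pump` are hypothetical names.

```python
import threading


class Session:
    """Hypothetical session that refuses sends after its socket disconnects."""

    def __init__(self):
        self._closed = threading.Event()  # safe to set from another thread
        self.sent = 0
        self.rejected = 0

    def on_socket_disconnected(self):
        # Called by the (hypothetical) socket layer when the peer goes away.
        self._closed.set()

    def send(self, packet):
        if self._closed.is_set():
            self.rejected += 1
            return False
        self.sent += 1
        return True


def pump(session, packets):
    """Send packets until the session reports it is no longer sendable."""
    for packet in packets:
        if not session.send(packet):
            break  # one failed attempt (and one log line), then stop


session = Session()
pump(session, range(3))
session.on_socket_disconnected()
pump(session, range(100))
print(session.sent, session.rejected)
```

With this shape, a disconnected client costs one rejected send per pump pass rather than one per queued packet, which is the difference between a single warning and the flood seen in the logs.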

Could this be a race condition where the disconnect event doesn't propagate to the sending thread in time, causing the session to remain active indefinitely after socket disconnect?
Was there a change in session lifecycle management between v0.16.8 and v0.20.5 that might explain this?

Regarding Cannot find RTP history for wide_seq_no
The missing sequence numbers are consecutive (13045, 13046, 13047...), suggesting a batch of packets is missing from the history rather than sporadic losses.

Is it possible that this is a cascading effect of the disconnected socket issue above — i.e., TransportCC feedback arriving for a session that's in a broken state?
Or is the RTP history buffer size insufficient for certain network latency conditions?

Any guidance on configuration options to mitigate these in the meantime would also be appreciated.

Labels: in progress, patched

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions