Streaming token usage not captured for OpenAI Responses API #1651
Description
What happened?
When using the OpenAI Responses API with streaming (client.responses.stream()), logfire does not capture the token usage attributes (gen_ai.usage.input_tokens, gen_ai.usage.output_tokens) on the span. Non-streaming calls work correctly.
The issue is in OpenaiResponsesStreamState.get_attributes(), which does not extract the usage field from the response object before returning the span attributes.
Steps to Reproduce
# Requirements:
# pip install logfire openai
import os

import logfire
from openai import OpenAI

# Configure logfire first (before setting up our own tracer)
logfire.configure(send_to_logfire=False)

from opentelemetry import trace
from opentelemetry.sdk.trace.export import SimpleSpanProcessor
from opentelemetry.sdk.trace.export.in_memory_span_exporter import InMemorySpanExporter

# Add our exporter to capture spans
exporter = InMemorySpanExporter()
provider = trace.get_tracer_provider()
if hasattr(provider, 'add_span_processor'):
    provider.add_span_processor(SimpleSpanProcessor(exporter))

client = OpenAI(api_key=os.environ.get("OPENAI_API_KEY"))

# Instrument OpenAI with logfire
logfire.instrument_openai(client)

def test_streaming():
    """Test OpenAI Responses API streaming and check for token usage."""
    exporter.clear()
    # Make a streaming request using the Responses API
    with client.responses.stream(
        model="gpt-4o-mini",
        input="Write a haiku about programming.",
    ) as stream:
        stream.until_done()
    # Check the captured spans for token usage
    spans = exporter.get_finished_spans()
    for span in spans:
        attrs = dict(span.attributes) if span.attributes else {}
        input_tokens = attrs.get("gen_ai.usage.input_tokens")
        output_tokens = attrs.get("gen_ai.usage.output_tokens")
        if "streaming" in span.name.lower():
            print(f"Span: {span.name}")
            print(f"  gen_ai.usage.input_tokens: {input_tokens}")
            print(f"  gen_ai.usage.output_tokens: {output_tokens}")
            return input_tokens, output_tokens
    return None, None

print("=" * 60)
print("Testing WITHOUT patch")
print("=" * 60)
input_tokens, output_tokens = test_streaming()
if input_tokens is None and output_tokens is None:
    print("\nBUG: Token usage not captured for streaming!")
else:
    print(f"\nToken usage captured: input={input_tokens}, output={output_tokens}")

# Now apply the patch
print("\n" + "=" * 60)
print("Applying patch: adding token usage extraction")
print("=" * 60)

from logfire._internal.integrations.llm_providers.openai import (
    OpenaiResponsesStreamState,
    responses_output_events,
)

def patched_get_attributes(self, span_data: dict) -> dict:
    response = self.get_response_data()
    span_data["events"] = span_data["events"] + responses_output_events(response)
    # FIX: Extract token usage from the response
    usage = getattr(response, "usage", None)
    input_tokens = getattr(usage, "input_tokens", None)
    output_tokens = getattr(usage, "output_tokens", None)
    if isinstance(input_tokens, int):
        span_data["gen_ai.usage.input_tokens"] = input_tokens
    if isinstance(output_tokens, int):
        span_data["gen_ai.usage.output_tokens"] = output_tokens
    return span_data

OpenaiResponsesStreamState.get_attributes = patched_get_attributes

print("\n" + "=" * 60)
print("Testing WITH patch")
print("=" * 60)
input_tokens, output_tokens = test_streaming()
if input_tokens is None and output_tokens is None:
    print("\nToken usage still not captured!")
else:
    print(f"\nFIXED: Token usage captured: input={input_tokens}, output={output_tokens}")
Output:
============================================================
Testing WITHOUT patch
============================================================
15:31:46.405 Responses API with 'gpt-4o-mini' [LLM]
15:31:48.322 streaming response from 'gpt-4o-mini' took 1.08s [LLM]
Span: streaming response from {request_data[model]!r} took {duration:.2f}s
gen_ai.usage.input_tokens: None
gen_ai.usage.output_tokens: None
BUG: Token usage not captured for streaming!
============================================================
Applying patch: adding token usage extraction
============================================================
============================================================
Testing WITH patch
============================================================
15:31:48.323 Responses API with 'gpt-4o-mini' [LLM]
15:31:50.137 streaming response from 'gpt-4o-mini' took 1.07s [LLM]
Span: streaming response from {request_data[model]!r} took {duration:.2f}s
gen_ai.usage.input_tokens: 14
gen_ai.usage.output_tokens: 21
FIXED: Token usage captured: input=14, output=21
Expected Result
Token usage should be captured in span attributes:
gen_ai.usage.input_tokens: ~14 (for the test prompt)
gen_ai.usage.output_tokens: ~21-25 (varies by run)
Actual Result (Buggy Behavior)
Token usage attributes are None for streaming responses using the Responses API.
After applying the patch (extracting usage from response object), token counts are correctly captured.
Additional context
The bug is in logfire/_internal/integrations/llm_providers/openai.py:
class OpenaiResponsesStreamState:
    def get_attributes(self, span_data: dict[str, Any]) -> dict[str, Any]:
        response = self.get_response_data()
        span_data['events'] = span_data['events'] + responses_output_events(response)
        return span_data  # Missing: token usage extraction
The fix is to extract usage from the response:
    def get_attributes(self, span_data: dict[str, Any]) -> dict[str, Any]:
        response = self.get_response_data()
        span_data['events'] = span_data['events'] + responses_output_events(response)
        # Extract token usage from the response
        usage = getattr(response, "usage", None)
        input_tokens = getattr(usage, "input_tokens", None)
        output_tokens = getattr(usage, "output_tokens", None)
        if isinstance(input_tokens, int):
            span_data["gen_ai.usage.input_tokens"] = input_tokens
        if isinstance(output_tokens, int):
            span_data["gen_ai.usage.output_tokens"] = output_tokens
        return span_data
Note: The non-streaming OpenaiResponsesState class likely has similar code that works correctly; this fix just adds the same token extraction to the streaming variant.
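One nice property of the getattr-based extraction above is that it degrades gracefully when a response carries no usage data. A minimal standalone sketch (the extract_usage helper and stub objects here are illustrative, not logfire code) shows the behavior in both cases:

```python
from types import SimpleNamespace

def extract_usage(span_data: dict, response) -> dict:
    # Mirrors the proposed fix: pull token counts off response.usage,
    # tolerating responses with no usage attached at all.
    usage = getattr(response, "usage", None)
    input_tokens = getattr(usage, "input_tokens", None)
    output_tokens = getattr(usage, "output_tokens", None)
    if isinstance(input_tokens, int):
        span_data["gen_ai.usage.input_tokens"] = input_tokens
    if isinstance(output_tokens, int):
        span_data["gen_ai.usage.output_tokens"] = output_tokens
    return span_data

# Response with usage: both attributes are set
with_usage = SimpleNamespace(usage=SimpleNamespace(input_tokens=14, output_tokens=21))
print(extract_usage({}, with_usage))
# {'gen_ai.usage.input_tokens': 14, 'gen_ai.usage.output_tokens': 21}

# Response without usage: span_data is left untouched, no AttributeError
print(extract_usage({}, SimpleNamespace()))
# {}
```

Because the isinstance checks reject None, partially populated usage objects also cannot write None into the span attributes.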
Possibly related issue
While testing this, we noticed the span name shows the literal template string {request_data[model]!r} instead of the interpolated value (e.g., 'gpt-4o-mini'). This appears to be a separate issue in llm_provider.py where the template string 'streaming response from {request_data[model]!r} took {duration:.2f}s' is not being properly formatted. We've seen similar formatting issues in our production logs.
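For what it's worth, the template itself looks well-formed: whether logfire uses plain str.format or its own interpolation internally is an assumption here, but formatting the string directly with str.format resolves the placeholders exactly as the span name should read:

```python
# Span name template as it appears verbatim in the captured output above.
template = "streaming response from {request_data[model]!r} took {duration:.2f}s"

# Plain str.format resolves the dict index and !r conversion as expected.
print(template.format(request_data={"model": "gpt-4o-mini"}, duration=1.08))
# streaming response from 'gpt-4o-mini' took 1.08s
```

This suggests the data (request_data, duration) is simply not being passed through whatever formatting step produces the final span name for this code path.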
Python, Logfire & OS Versions, related packages (not required)
logfire="4.19.0"
platform="Linux-6.8.0-90-generic-x86_64-with-glibc2.39"
python="3.11.13 (main, Jun 4 2025, 17:37:17) [Clang 20.1.4 ]"
[related_packages]
requests="2.32.5"
pydantic="2.11.1"
fastapi="0.128.0"
openai="2.15.0"
protobuf="5.29.5"
rich="14.2.0"
executing="2.2.1"
opentelemetry-api="1.39.1"
opentelemetry-exporter-otlp-proto-common="1.39.1"
opentelemetry-exporter-otlp-proto-http="1.39.1"
opentelemetry-instrumentation="0.60b1"
opentelemetry-instrumentation-asgi="0.60b1"
opentelemetry-instrumentation-dbapi="0.60b1"
opentelemetry-instrumentation-fastapi="0.60b1"
opentelemetry-instrumentation-google-genai="0.5b0"
opentelemetry-instrumentation-httpx="0.60b1"
opentelemetry-instrumentation-psycopg="0.60b1"
opentelemetry-instrumentation-psycopg2="0.60b1"
opentelemetry-instrumentation-requests="0.60b1"
opentelemetry-instrumentation-sqlalchemy="0.60b1"
opentelemetry-instrumentation-system-metrics="0.60b1"
opentelemetry-proto="1.39.1"
opentelemetry-sdk="1.39.1"
opentelemetry-semantic-conventions="0.60b1"
opentelemetry-util-genai="0.2b0"
opentelemetry-util-http="0.60b1"