Description:
When an MCP client (such as Cursor, Claude Desktop, or any RFC 9728-compliant client) connects to an OAuth-protected MCP endpoint, it needs to discover the authorization server's endpoints (authorization, token, JWKS, registration) to perform the browser-based OAuth login flow. Today, the ai-gateway controller attempts to auto-discover these endpoints by fetching the issuer's /.well-known/oauth-authorization-server or /.well-known/openid-configuration document. If auto-discovery fails, the controller falls back to Keycloak-style hardcoded paths (e.g., /protocol/openid-connect/auth, /protocol/openid-connect/token).
This approach breaks for authorization servers that:
- Do not serve well-known metadata at the issuer URL: for example, the IDP we use has a legacy issuer path (/api/idp/authn/.well-known/openid-configuration) that returns {} (empty JSON), while the real metadata is only available at a versioned path (/api/idp/v4/authn/.well-known/openid-configuration).
- Use non-standard endpoint paths: our IDP uses /authn/v1/oidc/auth and /v4/authn/oidc/token, not the Keycloak-style /protocol/openid-connect/auth and /protocol/openid-connect/token.
- Require authentication to access well-known endpoints: some deployments place the well-known endpoints behind an authentication gateway, returning 401 to unauthenticated requests from the controller.
- Have a mismatch between the issuer identity and the discovery path: the issuer field in the MCPRoute must exactly match the iss claim in JWT tokens (Envoy does exact string comparison for token validation), but the well-known metadata might only be served at a different (e.g., versioned) path. These two concerns (token validation and metadata discovery) are coupled through the issuer field today, and they shouldn't be.
Why the existing auto-discovery doesn't work
The controller derives well-known URLs from the issuer field. For an issuer like https://example.com/api/idp/authn, it tries:
| URL tried | Result |
| --- | --- |
| https://example.com/.well-known/oauth-authorization-server/api/idp/authn | 401 (gateway blocks) |
| https://example.com/.well-known/openid-configuration/api/idp/authn | 401 (gateway blocks) |
| https://example.com/api/idp/authn/.well-known/oauth-authorization-server | 401 or 404 |
| https://example.com/api/idp/authn/.well-known/openid-configuration | 200 OK, but returns {} |
The last variant returns valid JSON (HTTP 200), so the controller treats it as success. But all fields are empty strings. The controller then falls back to Keycloak-style hardcoded paths, which don't exist on the actual authorization server. MCP clients read these wrong URLs, try to hit them, and get 404 errors.
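The derivation described above can be sketched roughly as follows. This is an illustration only, not the controller's actual code: the function name `candidate_metadata_urls` is hypothetical, and the ordering is simply taken from the table.

```python
from urllib.parse import urlsplit

# The two well-known document names mentioned in this issue.
WELL_KNOWN_SUFFIXES = ("oauth-authorization-server", "openid-configuration")


def candidate_metadata_urls(issuer: str) -> list[str]:
    """Return the discovery URLs that can be derived from an issuer (hypothetical helper)."""
    parts = urlsplit(issuer)
    origin = f"{parts.scheme}://{parts.netloc}"
    path = parts.path.rstrip("/")

    urls = []
    # Variant 1: well-known segment inserted between the host and the issuer path.
    for suffix in WELL_KNOWN_SUFFIXES:
        urls.append(f"{origin}/.well-known/{suffix}{path}")
    # Variant 2: well-known segment appended after the issuer path.
    for suffix in WELL_KNOWN_SUFFIXES:
        urls.append(f"{origin}{path}/.well-known/{suffix}")
    return urls


if __name__ == "__main__":
    for url in candidate_metadata_urls("https://example.com/api/idp/authn"):
        print(url)
```

Every candidate is built from the issuer's own host and path, so a document served under a different path (such as a versioned one) can never appear in this list.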
The correct metadata is served at a versioned path (/api/idp/v4/authn/.well-known/openid-configuration), but the controller never tries this URL because it's not derivable from the issuer. And we can't change the issuer to the versioned path because tokens are stamped with the legacy iss claim — Envoy would reject every token.
The versioning problem
Authorization server endpoint paths may include API version strings (e.g., v4.1.b1) that change across upgrades. MCPRoutes are created dynamically at runtime (by nai-api), not by Helm. This means:
- Existing MCPRoutes are not redeployed when the authorization server upgrades.
- Any versioned URLs baked into the MCPRoute spec will break when the authorization server version changes.
- The issuer field (used for JWT validation) is version-agnostic and stable, but the actual endpoint paths are not.
This is an inherent tension: the stable identifier for token validation (issuer) differs from the versioned paths where endpoints are actually served.
Impact without this fix
Without authorizationServerMetadata, the MCP OAuth flow fails for any authorization server that doesn't match the Keycloak endpoint convention. The specific failure mode depends on the client:
- Cursor: fails with "HTTP 404: Invalid OAuth error response: Unexpected end of JSON input", and no browser login page ever opens.
- Other MCP clients: Similar failures at the OAuth discovery phase, before any authentication attempt.
The data plane (Bearer token validation) works fine — it's purely the control plane discovery metadata that's wrong.
Desired Behavior
Add an optional authorizationServerMetadata field to the MCPRoute's oauth spec that allows explicit specification of the authorization server's endpoint URLs:
securityPolicy:
  oauth:
    issuer: "https://example.com/api/idp/authn"
    authorizationServerMetadata:
      authorizationEndpoint: "https://example.com/api/iam/authn/v1/oidc/auth"
      tokenEndpoint: "https://example.com/api/idp/v4/authn/oidc/token"
      jwksUri: "https://example.com/api/idp/v4/authn/oidc/keys"
      registrationEndpoint: "https://example.com/api/idp/v4/authn/oidc/clients"
When authorizationServerMetadata is provided:
- The controller uses these values directly for the /.well-known/oauth-authorization-server response, bypassing auto-discovery and the Keycloak fallback entirely.
- The authorization_servers field in the protected resource metadata points to the resource URL (not the issuer), so MCP clients fetch auth server metadata from the gateway's own well-known endpoint (where the correct values are served).
- The issuer field remains solely for JWT token validation (Envoy iss claim matching), decoupling it from endpoint discovery.
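To make the two documents in the bullets above concrete, here is a minimal sketch of what the gateway might serve. The helper names and the resource URL `https://gateway.example.com/mcp` are assumptions for illustration. The snake_case keys are the standard RFC 8414 and RFC 9728 field names. Which `issuer` value the gateway should publish in the authorization server metadata is not settled by this proposal, so the sketch leaves it out.

```python
# Mapping from the proposed MCPRoute spec fields to RFC 8414 metadata field names.
SPEC_TO_RFC8414 = {
    "authorizationEndpoint": "authorization_endpoint",
    "tokenEndpoint": "token_endpoint",
    "jwksUri": "jwks_uri",
    "registrationEndpoint": "registration_endpoint",
}


def protected_resource_metadata(resource_url: str) -> dict:
    """RFC 9728 document: authorization_servers points at the resource URL, not the issuer."""
    return {
        "resource": resource_url,
        "authorization_servers": [resource_url],
    }


def authorization_server_metadata(explicit: dict) -> dict:
    """Translate explicit spec values into the well-known response (hypothetical helper)."""
    return {
        SPEC_TO_RFC8414[key]: value
        for key, value in explicit.items()
        if key in SPEC_TO_RFC8414 and value  # skip unknown keys and empty values
    }


if __name__ == "__main__":
    spec = {
        "authorizationEndpoint": "https://example.com/api/iam/authn/v1/oidc/auth",
        "tokenEndpoint": "https://example.com/api/idp/v4/authn/oidc/token",
    }
    print(protected_resource_metadata("https://gateway.example.com/mcp"))
    print(authorization_server_metadata(spec))
```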
Additionally, the controller should treat empty well-known responses ({}) as discovery failures rather than successes, falling through to the next discovery attempt or the fallback.
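One way to express the empty-response check is sketched below. The choice of required fields is an assumption made for illustration; the actual controller may want a stricter or looser definition of "usable".

```python
import json

# Assumption: a discovery document is only useful if it names at least these endpoints.
REQUIRED_FIELDS = ("authorization_endpoint", "token_endpoint")


def is_usable_metadata(doc: object) -> bool:
    """Return True only if the parsed document carries non-empty required endpoints."""
    if not isinstance(doc, dict):
        return False
    return all(
        isinstance(doc.get(field), str) and doc[field].strip() != ""
        for field in REQUIRED_FIELDS
    )


if __name__ == "__main__":
    # A 200 response whose body is {} parses fine but should count as a failure.
    print(is_usable_metadata(json.loads("{}")))
```

With a check like this, an HTTP 200 alone no longer counts as success; the controller would move on to the next candidate URL or to the fallback.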
Precedence for auth server metadata
- Explicit authorizationServerMetadata from the MCPRoute spec (if provided)
- Auto-discovered metadata from the issuer's well-known endpoint (existing behavior, with improved empty-response handling)
- Keycloak-style hardcoded defaults as a last resort (existing behavior)
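The precedence above could be implemented along these lines. This is a sketch under stated assumptions: `resolve_metadata` and `keycloak_defaults` are hypothetical names, the inputs are assumed to already use RFC 8414 field names, and only the two Keycloak paths mentioned in this issue are included in the defaults.

```python
from typing import Optional


def keycloak_defaults(issuer: str) -> dict:
    """Last-resort defaults using the Keycloak-style paths named in this issue."""
    base = issuer.rstrip("/")
    return {
        "authorization_endpoint": f"{base}/protocol/openid-connect/auth",
        "token_endpoint": f"{base}/protocol/openid-connect/token",
    }


def resolve_metadata(
    issuer: str,
    explicit: Optional[dict] = None,
    discovered: Optional[dict] = None,
) -> tuple[str, dict]:
    """Pick the metadata source by precedence and report which one was used."""
    # 1. Explicit values from the MCPRoute spec always win.
    if explicit:
        return "explicit", explicit
    # 2. Auto-discovered metadata, but only if it is not an empty document.
    if discovered and discovered.get("authorization_endpoint") and discovered.get("token_endpoint"):
        return "discovered", discovered
    # 3. Keycloak-style defaults as the last resort.
    return "keycloak-defaults", keycloak_defaults(issuer)


if __name__ == "__main__":
    source, _ = resolve_metadata("https://example.com/api/idp/authn", discovered={})
    print(source)
```

Returning the source label alongside the metadata is a design choice for the sketch; it makes it easy to log or surface in status which path was taken.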
Relevant Links