feat(prompt): KG-685 add and introduce, Google Search grounding support for Gemini #1898

Open
Jai Goyal (goyaljai) wants to merge 13 commits into JetBrains:develop from goyaljai:KG-grounding-google-search-gemini

@goyaljai Jai Goyal (goyaljai) commented Apr 24, 2026

Summary

Implements grounding with Google Search for Gemini models, and fixes a missing input validation bug discovered during implementation.

  • Add LLMCapability.Grounding so callers can check model support with model.supports(LLMCapability.Grounding)
  • Add GoogleGroundingConfig sealed class — GoogleSearch (Gemini 2.0+, native google_search tool) and GoogleSearchRetrieval (Gemini 1.5, googleSearchRetrieval with optional dynamic threshold)
  • Add groundingConfig: GoogleGroundingConfig? field to GoogleParams
  • All Gemini 2.0+ models gain LLMCapability.Grounding via fullCapabilities
  • createGoogleRequest() injects the appropriate grounding tool and guards unsupported models with a clear require() error
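The last bullet describes a fail-fast guard around tool injection. A minimal sketch of that idea follows, assuming simplified stand-in types: `Capability`, `Model`, `Tool`, `FunctionTool`, `GoogleSearchTool`, and `buildTools` are hypothetical names for illustration and are not the actual Koog classes or the PR's code.

```kotlin
// Hypothetical, simplified stand-ins; not the real Koog types.
enum class Capability { Tools, Grounding }

data class Model(val id: String, val capabilities: Set<Capability>) {
    fun supports(capability: Capability): Boolean = capability in capabilities
}

sealed interface Tool
data class FunctionTool(val name: String) : Tool
object GoogleSearchTool : Tool

// Builds the tool list for a request. If grounding is requested on a model
// that does not advertise it, require() throws IllegalArgumentException.
fun buildTools(
    model: Model,
    functionTools: List<FunctionTool>,
    groundingEnabled: Boolean,
): List<Tool> {
    if (!groundingEnabled) return functionTools
    require(model.supports(Capability.Grounding)) {
        "Model ${model.id} does not support Google Search grounding"
    }
    // The grounding tool is merged with any function tools already present.
    return functionTools + GoogleSearchTool
}

fun main() {
    val grounded = Model("gemini-2.5-flash", setOf(Capability.Tools, Capability.Grounding))
    val legacy = Model("legacy-model", setOf(Capability.Tools))

    println(buildTools(grounded, listOf(FunctionTool("weather")), groundingEnabled = true).size) // 2
    println(runCatching { buildTools(legacy, emptyList(), groundingEnabled = true) }.isFailure)  // true
}
```

Using `require()` keeps the failure close to the misconfiguration instead of surfacing it later as an opaque API error.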

Usage

val prompt = prompt("search") { user("Who won Euro 2024?") }
val response = executor.execute(
    prompt,
    GoogleModels.Gemini2_5Flash,
    params = GoogleParams(groundingConfig = GoogleGroundingConfig.GoogleSearch)
)

Test plan

  • createGoogleRequest injects google_search tool when GoogleSearch grounding is set
  • createGoogleRequest injects googleSearchRetrieval tool with threshold when set — asserts MODE_DYNAMIC is present
  • createGoogleRequest merges grounding tool with function tools
  • createGoogleRequest throws when grounding set on model that does not support it
  • GoogleSearchRetrieval rejects dynamicThreshold above 1.0
  • GoogleSearchRetrieval rejects negative dynamicThreshold
  • GoogleSearchRetrieval accepts null and boundary dynamicThreshold values
  • All 27 GoogleLLMClientTest tests pass, no regressions
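The three dynamicThreshold cases in the test plan amount to a range check at construction time. A minimal sketch, assuming a simplified `GoogleSearchRetrieval` class that carries only the threshold (the real class in the PR may have a different shape):

```kotlin
// Hypothetical simplified version; only the threshold validation is modelled.
class GoogleSearchRetrieval(val dynamicThreshold: Double? = null) {
    init {
        // null means "no threshold"; otherwise the value must lie in [0.0, 1.0].
        require(dynamicThreshold == null || dynamicThreshold in 0.0..1.0) {
            "dynamicThreshold must be in [0.0, 1.0], got $dynamicThreshold"
        }
    }
}

fun main() {
    // Accepted: null and both boundary values.
    GoogleSearchRetrieval(null)
    GoogleSearchRetrieval(0.0)
    GoogleSearchRetrieval(1.0)

    // Rejected: above 1.0 and negative.
    println(runCatching { GoogleSearchRetrieval(1.5) }.isFailure)  // true
    println(runCatching { GoogleSearchRetrieval(-0.1) }.isFailure) // true
}
```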

Closes https://youtrack.jetbrains.com/issue/KG-685/

@Amaneusz Jakub Amanowicz (Amaneusz) left a comment

Collaborator

Thank you again for this PR! I went through it and I think it could be improved - see my comments below.

The main things to address are:

  1. We don't want a separate LLMCapability for GoogleSearch based grounding
  2. The API integration would not work in the proposed state (we need typed googleSearch, currently we have loose type for google_search)
  3. This PR tries to integrate with legacy Gemini models we don't support

I'm open to consulting further if you disagree on some points (I might have got something wrong) or need some guidance wrt our codebase

Comment thread prompt/prompt-llm/src/commonMain/kotlin/ai/koog/prompt/llm/LLMCapability.kt Outdated
@Serializable
internal class GoogleTool(
val functionDeclarations: List<GoogleFunctionDeclaration>? = null,
@SerialName("google_search")

Collaborator

Are you sure that it's google_search and not googleSearch?
https://ai.google.dev/api/caching#Tool

@Amaneusz Jakub Amanowicz (Amaneusz) Apr 28, 2026

Collaborator

Also, this should not just be some JsonObject - there's a specific contract for it: https://ai.google.dev/api/caching#GoogleSearch

So we want something like this in the end:

@Serializable
internal class GoogleSearch(
    val timeRangeFilter: Interval? = null,
    val searchTypes: SearchTypes? = null,
)

And in GoogleTool:

val googleSearch: GoogleSearch? = null
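To make the requested wire shape concrete, here is a rough sketch of how a typed `googleSearch` tool could render to JSON. It is an assumption-laden illustration: the JSON is built by hand with hypothetical `toJson()` helpers so the snippet runs without kotlinx.serialization (real code would rely on `@Serializable`), values are not escaped, and `searchTypes` is left out because its fields are not shown in this thread.

```kotlin
// Illustrative stand-ins for the typed classes suggested above.
data class Interval(val startTime: String, val endTime: String)
data class GoogleSearch(val timeRangeFilter: Interval? = null)
data class GoogleTool(val googleSearch: GoogleSearch? = null)

// Hypothetical hand-written rendering; no string escaping is performed.
fun Interval.toJson(): String =
    """{"startTime":"$startTime","endTime":"$endTime"}"""

fun GoogleSearch.toJson(): String =
    timeRangeFilter?.let { """{"timeRangeFilter":${it.toJson()}}""" } ?: "{}"

fun GoogleTool.toJson(): String =
    googleSearch?.let { """{"googleSearch":${it.toJson()}}""" } ?: "{}"

fun main() {
    // Basic grounding: an empty googleSearch object under a camelCase key.
    println(GoogleTool(GoogleSearch()).toJson())
    // prints {"googleSearch":{}}

    // Grounding with a time range filter.
    val filtered = GoogleTool(
        GoogleSearch(Interval("2025-01-01T00:00:00Z", "2025-07-31T23:59:59Z"))
    )
    println(filtered.toJson())
}
```

The point of the typed class over a loose JsonObject is that the key name and nested fields are fixed at compile time rather than assembled ad hoc.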

Author

Good catch. I believe I verified this, but I will check again against the API spec and fix the serialization name if needed, within about an hour.

Author

Yes, it does not work like that.

I tested both features using generativelanguage.googleapis.com (this is the Google AI Studio API used by koog, not Vertex AI).

searchTypes.webSearch: The API accepted it, but the results were the same as using just {"googleSearch": {}}. There was no difference in behavior. So the filter did not work.
timeRangeFilter: I tested it with the question “Iran vs US war 2026, what is happening?” and set the date range from Jan 2025 to July 2025. Even with this filter, the model still returned full details about the Feb 2026 conflict, same as without the filter. So the filter did not work.

According to the documentation (https://ai.google.dev/api/caching#GoogleSearch), these features seem to work only with the Vertex AI API (aiplatform.googleapis.com), not the public API we are using.

Since koog uses generativelanguage.googleapis.com, I dropped both features to avoid relying on something that doesn't actually work.

Now, GoogleSearch is just an empty object, and only basic grounding is supported.

Collaborator

"According to the documentation these features seem to work only with the Vertex AI API"

I don't think this is a correct statement. Can you show which part of the docs says that?
Afaik these docs cover the exact API we're using.

Moreover, the fact that the API did not give you the result you expected does not mean that we should not comply with the documented request format.

I tested the request shape with a simple curl call that was meant to be invalid (note the missing startTime in timeRangeFilter):

curl -s "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent" \
  -H "X-goog-api-key: ..." \
  -H "Content-Type: application/json" \
  -d '{
    "contents": [{"parts": [{"text": "Who won latest parliament elections in Hungary"}]}],
    "tools": [{
      "googleSearch": {
        "timeRangeFilter": {
          "endTime": "2024-12-31T23:59:59Z"
        }
      }
    }]
  }'

and got 400:

{
  "error": {
    "code": 400,
    "message": "* GenerateContentRequest.tools[0].google_search.time_range_filter: [FIELD_INVALID] Both start time and end time must be given\n",
    "status": "INVALID_ARGUMENT"
  }
}

which proves that the shape IS expected. We must comply with the schema that is defined here: https://ai.google.dev/api/caching#GoogleSearch

The fact that you got the same responses with your calls only proves that inner fields of googleSearch are not required, but we should still allow users to pass them if they want - otherwise we constrain the Gemini API.
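The 400 response above implies a client-side rule: a time range filter needs both bounds or neither. A minimal sketch of that check, assuming the `groundingStartTime` / `groundingEndTime` parameter names from the later commit message and a hypothetical `buildInterval` helper (not the PR's actual code):

```kotlin
// Illustrative stand-in for the request-side interval type.
data class Interval(val startTime: String, val endTime: String)

// Hypothetical helper: returns null when no filter is requested, an Interval
// when both bounds are present, and throws when only one bound is given,
// mirroring the API's "Both start time and end time must be given" error.
fun buildInterval(groundingStartTime: String?, groundingEndTime: String?): Interval? {
    require((groundingStartTime == null) == (groundingEndTime == null)) {
        "groundingStartTime and groundingEndTime must be set together"
    }
    return if (groundingStartTime != null && groundingEndTime != null) {
        Interval(groundingStartTime, groundingEndTime)
    } else {
        null
    }
}

fun main() {
    println(buildInterval(null, null)) // null
    println(buildInterval("2024-01-01T00:00:00Z", "2024-12-31T23:59:59Z"))
    // Only endTime set: rejected locally instead of waiting for a 400.
    println(runCatching { buildInterval(null, "2024-12-31T23:59:59Z") }.isFailure) // true
}
```

Validating locally gives callers a clear error message without a network round trip, while still letting the documented fields through to the API.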

Collaborator

Also, check these docs: https://ai.google.dev/gemini-api/docs/google-search

When a response is successfully grounded, the response includes a groundingMetadata field.

In my case, when I ran your query, I did receive the mentioned field. It's true that the filter did not seem to work as expected, but the API itself did. We don't test models; we test the integration.

Author

It's a fair point — the API accepts it, we just shouldn't claim it guarantees filtered results.

@goyaljai Jai Goyal (goyaljai) force-pushed the KG-grounding-google-search-gemini branch 2 times, most recently from d7151b5 to 8c0be29 Compare April 28, 2026 07:11
@goyaljai

Author

Thank you again for this PR! I went through it and I think it could be improved - see my comments below.

The main things to address are:

  1. We don't want a separate LLMCapability for GoogleSearch based grounding
  2. The API integration would not work in the proposed state (we need typed googleSearch, currently we have loose type for google_search)
  3. This PR tries to integrate with legacy Gemini models we don't support

I'm open to consulting further if you disagree on some points (I might have got something wrong) or need some guidance wrt our codebase

I've made all the changes. Please review again.

Jai Goyal (goyaljai) commented Apr 29, 2026

Author

Jakub Amanowicz (@Amaneusz), please review now.
The model may not behave differently today, but that's the model's limitation, not ours (Koog's). We should expose it.

I changed the code to support these.

@goyaljai Jai Goyal (goyaljai) changed the title from "feat(prompt): KG-685 add Google Search grounding support for Gemini 2.0+ models" to "feat(prompt): KG-685 add and introduce, Google Search grounding support for Gemini" on Apr 29, 2026
….0+ models

- Add LLMCapability.Grounding capability
- Add GoogleGroundingConfig sealed class (GoogleSearch / GoogleSearchRetrieval)
- Add groundingConfig field to GoogleParams
- Add Grounding to all Gemini 2.0+ models via fullCapabilities
- Inject google_search / googleSearchRetrieval tool in createGoogleRequest()
- Fix: dynamicRetrievalConfig requires mode=MODE_DYNAMIC for threshold to take effect
- Fix (KG-685): add init{} validation for dynamicThreshold in [0.0, 1.0]
- 7 unit tests, live API verified

Closes https://youtrack.jetbrains.com/issue/KG-685/

…nsupported API features

- Replace GoogleGroundingConfig sealed class with simple groundingEnabled: Boolean in GoogleParams
- Remove LLMCapability.Grounding (grounding is provider-specific, not a model trait)
- Remove timeRangeFilter/Interval and searchTypes -- tested on generativelanguage.googleapis.com,
  neither changed model behavior (timeRangeFilter proved no-op via Iran/US war 2026 date test)
- Replace JsonObject grounding fields with typed GoogleSearch/GoogleTool internal classes
- Add GoogleGroundingLiveTest integration test
- Use shouldNotBeNull() instead of !! in tests

Use @BeforeAll + assumeTrue to skip gracefully in CI when GEMINI_API_TEST_KEY
is not set. Tests are ABORTED (not FAILED) when key is absent, so CI passes.

The API validates timeRangeFilter fields (400 on partial input proves it).
We should expose what the documented API supports regardless of model behavior.
- Interval(startTime, endTime) added back to GoogleGenerateContent
- groundingStartTime/groundingEndTime added to GoogleParams
- Validation: both must be set together (matches API requirement)
- GoogleLLMClient builds Interval when both times are provided
@goyaljai Jai Goyal (goyaljai) force-pushed the KG-grounding-google-search-gemini branch from c6a61ef to e1c143b Compare May 28, 2026 17:15