Conversation
Note: Reviews paused. It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior in the settings. Use the following commands to manage reviews, or use the checkboxes below for quick actions:
📝 Walkthrough

Adds optional MLflow integration across API, CRD schema, deepcopy, controller logic, and templates.

Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~28 minutes

Critical findings
Code quality findings
Architectural notes
🚥 Pre-merge checks | ✅ 1 | ❌ 1

❌ Failed checks (1 inconclusive)
✅ Passed checks (1 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings.
A new image has been built to help with testing out this PR. To use this image, run the following:

```sh
cd $(mktemp -d)
git clone git@github.com:opendatahub-io/data-science-pipelines-operator.git
cd data-science-pipelines-operator/
git fetch origin pull/1000/head
git checkout -b pullrequest 0eb20289080ad354fd1e01550faa70d52b3bb6e5
oc new-project opendatahub
make deploy IMG="quay.io/opendatahub/data-science-pipelines-operator:pr-1000"
```

More instructions here on how to deploy and test a Data Science Pipelines Application.
Actionable comments posted: 12
🧹 Nitpick comments (3)
controllers/config/defaults.go (1)
168-170: `DefaultPluginsEnabled = true` enables the feature by default.

With plugins enabled by default but configuration broken due to path mismatches, users may see unexpected behavior. Consider defaulting to `false` until the feature is stable, or ensure the configuration plumbing is fixed first.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@controllers/config/defaults.go` around lines 168 - 170, DefaultPluginsEnabled is set to true which turns plugins on by default despite DefaultPlugins being empty and potential path/config mismatches; change the default to false by updating the DefaultPluginsEnabled constant to false (or alternatively fix the configuration/path plumbing before enabling by ensuring DefaultPlugins is populated and plugin path resolution in the plugin initialization code is validated), and add a short comment near DefaultPlugins and DefaultPluginsEnabled explaining the safe default choice until plugin config is stable.

config/samples/dspa-simple/dspa_simple.yaml (1)
7-7: Sample enables plugins without demonstrating configuration.

Enabling `pluginsEnabled: true` without showing the corresponding `plugins` field may confuse users. Consider adding a commented example of the `plugins` field or setting `pluginsEnabled: false` until the feature is documented.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@config/samples/dspa-simple/dspa_simple.yaml` at line 7, The sample sets pluginsEnabled: true but provides no plugins configuration, which can confuse users; update the sample by either setting pluginsEnabled: false or add a commented example showing the plugins field (e.g., a commented plugins: section with one or two plugin entries and their minimal config) so users see how to enable and configure plugins; refer to the pluginsEnabled and plugins keys when making the change.

controllers/util/util.go (1)
105-111: Simplify return statement; consider conflict-retry guidance.

The function can be reduced to a single return statement. Additionally, per the call site in `controllers/dspipeline_params.go:694`, there's no retry on `409 Conflict`. With `MAX_CONCURRENT_RECONCILES=10`, concurrent updates to the same ConfigMap will race. Consider documenting that callers must handle conflicts or implementing retry-with-backoff within this helper.

Simplify:

```diff
 func UpdateConfigMap(ctx context.Context, cfgMap *v1.ConfigMap, client client.Client) error {
-	err := client.Update(ctx, cfgMap)
-	if err != nil {
-		return err
-	}
-	return nil
+	return client.Update(ctx, cfgMap)
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@controllers/util/util.go` around lines 105 - 111, The UpdateConfigMap helper currently returns an explicit nil after updating; simplify it to a single return by returning the result of client.Update(ctx, cfgMap) directly (function: UpdateConfigMap). Also address concurrent 409 Conflict races (given MAX_CONCURRENT_RECONCILES) by either documenting that callers must handle conflicts or implementing retry-with-backoff inside UpdateConfigMap (use client-go's retry.RetryOnConflict or a small loop that checks errors.IsConflict and retries with exponential backoff) so concurrent updates to the same ConfigMap are retried safely.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@api/v1/dspipeline_types.go`:
- Around line 58-64: The Plugins field lacks kubebuilder validation and should
be constrained (add a +kubebuilder:validation:MaxLength=<reasonable limit> and
optionally a pattern/format marker) to prevent unbounded strings; add this
validation marker above the Plugins declaration. The PluginsEnabled bool must
distinguish "unset" from explicit false — change the type to *bool
(PluginsEnabled *bool `json:"pluginsEnabled,omitempty"`) or remove `omitempty`,
and update any controller logic that sets p.PluginsEnabled (refer to the
PluginsEnabled field and the controller code that assigns
config.DefaultPluginsEnabled) to first check for a user-provided value (nil vs
non-nil) before applying the default; ensure kubebuilder validation markers
(e.g., +kubebuilder:validation:Optional) remain consistent.
In `@config/base/kustomization.yaml`:
- Around line 184-190: The PLUGINS vars entry has fieldref wrongly nested inside
objref; update the PLUGINS vars block (the entry named PLUGINS) so that fieldref
is a sibling of objref (matching the structure used by WEBHOOK_ANNOTATIONS) —
move the fieldref.node (fieldpath: data.PLUGINS) out from under
objref.apiVersion into the same level as objref, ensuring the var uses objref:
{kind: ConfigMap, name: dspo-parameters, apiVersion: v1} and a separate
fieldref: {fieldpath: data.PLUGINS} so $(PLUGINS) substitution works.
In `@config/configmaps/files/config.yaml`:
- Line 29: The YAML places the Plugins key under DSPO (DSPO.Plugins) while the
code expects DSPO.ApiServer.Plugins (PluginsConfigName in
controllers/config/defaults.go), so update one side to match: either move the
Plugins entry in config.yaml under an ApiServer mapping (so the key becomes DSPO
-> ApiServer -> Plugins) or change PluginsConfigName to "DSPO.Plugins" in
defaults.go; adjust whichever file you modify to preserve the exact key string
used by Viper and ensure the names match (reference: PluginsConfigName constant
and the Plugins key in config.yaml).
In
`@config/crd/bases/datasciencepipelinesapplications.opendatahub.io_datasciencepipelinesapplications.yaml`:
- Around line 919-927: Remove the internal "todo: in this approach..." sentence
from the plugins field description, add a maxLength constraint to the plugins
schema (e.g., add maxLength: 4096 on the string items or on the string property
for plugins) to prevent arbitrarily large values that can stress etcd, and set
an explicit default for pluginsEnabled by adding default: true to the
pluginsEnabled property so the schema and description are consistent; update the
descriptions accordingly for plugins and pluginsEnabled while keeping the
existing field names (plugins, pluginsEnabled).
In `@config/manager/manager.yaml`:
- Around line 88-89: The deployed manager injects an env var named PLUGINS but
the code in main.go reads os.Getenv("DSPO_APISERVER_PLUGINS"), so the plugin
configuration is never seen; fix this by renaming the environment variable key
in the manager manifest from PLUGINS to DSPO_APISERVER_PLUGINS (update the
manifest entry that currently reads - name: PLUGINS value: $(PLUGINS)) so it
matches the code path that checks DSPO_APISERVER_PLUGINS in main.go and enables
proper MLflow plugin detection.
In `@controllers/config/defaults.go`:
- Line 112: The constant PluginsConfigName currently has the wrong config path
string; update the value of PluginsConfigName to "DSPO.Plugins" to match the
existing YAML structure (or alternatively align the YAML to include an ApiServer
nesting), ensuring any code that references PluginsConfigName uses the corrected
path.
In `@controllers/dspipeline_params.go`:
- Around line 494-498: The conditional is reading dsp.Spec.PluginsEnabled
directly while later logic uses p.PluginsEnabled, causing divergence; update
SetupObjectParams so p.PluginsEnabled is assigned from dsp.Spec.PluginsEnabled
early (fix the assignment around the existing initialization code) and change
the block that clears plugins to check p.PluginsEnabled (not
dsp.Spec.PluginsEnabled); also ensure when p.PluginsEnabled is false you set
p.Plugins = "" to prevent injection downstream (refer to SetupObjectParams,
p.PluginsEnabled, p.Plugins, and dsp.Spec.PluginsEnabled).
- Around line 640-641: p.PluginsEnabled is always set to
config.DefaultPluginsEnabled, ignoring dsp.Spec.PluginsEnabled; change the
assignment so p.PluginsEnabled reflects the spec when present and falls back to
the config default otherwise. Specifically, replace the unconditional
p.PluginsEnabled = config.DefaultPluginsEnabled with logic that checks
dsp.Spec.PluginsEnabled (handle it as *bool if the API uses a pointer): if
dsp.Spec.PluginsEnabled != nil set p.PluginsEnabled = *dsp.Spec.PluginsEnabled
else set it = config.DefaultPluginsEnabled; ensure subsequent code uses
p.PluginsEnabled rather than reading dsp.Spec.PluginsEnabled directly.
- Around line 677-701: The ConfigMap update sequence around util.GetConfigMap ->
updateJSON -> util.UpdateConfigMap risks TOCTOU conflicts and silently proceeds
on updateJSON errors; change the block that runs when p.PluginsEnabled &&
p.Plugins != "" to first validate that p.Plugins is well-formed JSON (e.g.,
attempt to unmarshal into interface{}) before calling updateJSON, then perform
the read-modify-write inside a retry loop that re-reads the ConfigMap (using
util.GetConfigMap), checks/uses cm.ResourceVersion, calls updateJSON, and
retries the UpdateConfigMap on conflict (409) a few times; also immediately
return the updateJSON error instead of only logging it (so update failures don’t
continue with an empty/invalid updatedServerCfg). Ensure you reference
updateJSON, util.GetConfigMap, util.UpdateConfigMap, cm.Data["config.json"], and
p.Plugins in the fix.
In `@main.go`:
- Around line 289-311: ParsePlugins may return nil and plugins["mlflow"] will
panic; after calling ParsePlugins check for nil (e.g., if plugins == nil { log
the parse error and exit or treat as empty map }) before indexing. Also make the
MLflow CR absence decisive: in the block that attempts to Get the MLflow CR
(mlflowCR via mgr.GetAPIReader().Get) convert the current non-fatal
setupLog.Error into a fatal startup error (or return the error from main) if the
mlflow plugin is required, so the operator does not continue without the
required CR; alternatively, if non-fatal, document and handle it downstream in
the reconciler. Finally, confirm the CR name/namespace/scope assumptions used
when setting mlflowCR (Name "mlflow" and empty Namespace) match the MLflow CRD
scope and adjust the ObjectKey/namespace accordingly.
- Around line 331-346: ParsePlugins currently unmarshals pluginStr without size
checks and returns nil on error; add defensive validation and richer logging:
define a constant maxPluginsSize (e.g., 64KB), check len(pluginStr) <=
maxPluginsSize at the top of ParsePlugins and log/return an empty map if
exceeded, then call json.Unmarshal into envelope wrapped in a short-decorated
error log that includes the input length and a truncated preview (not the full
content) on failure, and on any unmarshal or type-assertion failure return an
empty map[string]interface{} (not nil) so callers won’t panic; update uses of
variables envelope, pluginStr and setupLog in ParsePlugins accordingly.
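A hardened `ParsePlugins` along the lines described might look like this. It is a sketch: the envelope shape, the logger, and the lowercase function name are assumptions, not the actual `main.go` code.

```go
package main

import (
	"encoding/json"
	"fmt"
	"log"
)

const maxPluginsSize = 64 * 1024 // reject absurdly large inputs up front

// parsePlugins never returns nil, so callers can index the result safely.
func parsePlugins(pluginStr string) map[string]interface{} {
	empty := map[string]interface{}{}
	if len(pluginStr) > maxPluginsSize {
		log.Printf("plugins config too large: %d bytes", len(pluginStr))
		return empty
	}
	var envelope map[string]interface{}
	if err := json.Unmarshal([]byte(pluginStr), &envelope); err != nil {
		preview := pluginStr
		if len(preview) > 32 {
			preview = preview[:32] + "..." // truncated preview, never the full content
		}
		log.Printf("invalid plugins JSON (len=%d, preview=%q): %v", len(pluginStr), preview, err)
		return empty
	}
	return envelope
}

func main() {
	fmt.Println(parsePlugins(`{"}"`) == nil)             // false: bad JSON yields an empty map, not nil
	fmt.Println(parsePlugins(`{"mlflow":{}}`)["mlflow"]) // map[]
}
```

Because the function always returns a non-nil map, the caller's `plugins["mlflow"]` lookup can no longer panic on a parse failure.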
In `@scripts/release/params.py`:
- Around line 49-51: The DSPO_APISERVER_PLUGINS default is malformed JSON
("{\"}\"") which causes ParsePlugins (main.go:335) json.Unmarshal to fail and
disables plugins; update the DSPO_APISERVER_PLUGINS entry to a valid empty JSON
object ("{}") so ParsePlugins receives valid JSON and plugin initialization
(including MLflow detection) works correctly.
---
Nitpick comments:
In `@config/samples/dspa-simple/dspa_simple.yaml`:
- Line 7: The sample sets pluginsEnabled: true but provides no plugins
configuration, which can confuse users; update the sample by either setting
pluginsEnabled: false or add a commented example showing the plugins field
(e.g., a commented plugins: section with one or two plugin entries and their
minimal config) so users see how to enable and configure plugins; refer to the
pluginsEnabled and plugins keys when making the change.
In `@controllers/config/defaults.go`:
- Around line 168-170: DefaultPluginsEnabled is set to true which turns plugins
on by default despite DefaultPlugins being empty and potential path/config
mismatches; change the default to false by updating the DefaultPluginsEnabled
constant to false (or alternatively fix the configuration/path plumbing before
enabling by ensuring DefaultPlugins is populated and plugin path resolution in
the plugin initialization code is validated), and add a short comment near
DefaultPlugins and DefaultPluginsEnabled explaining the safe default choice
until plugin config is stable.
In `@controllers/util/util.go`:
- Around line 105-111: The UpdateConfigMap helper currently returns an explicit
nil after updating; simplify it to a single return by returning the result of
client.Update(ctx, cfgMap) directly (function: UpdateConfigMap). Also address
concurrent 409 Conflict races (given MAX_CONCURRENT_RECONCILES) by either
documenting that callers must handle conflicts or implementing
retry-with-backoff inside UpdateConfigMap (use client-go's retry.RetryOnConflict
or a small loop that checks errors.IsConflict and retries with exponential
backoff) so concurrent updates to the same ConfigMap are retried safely.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Central YAML (base), Organization UI (inherited)
Review profile: CHILL
Plan: Pro
Run ID: 8b23755a-45ff-481b-9028-8479d5f73b1b
📒 Files selected for processing (12)
- api/v1/dspipeline_types.go
- config/base/kustomization.yaml
- config/base/params.env
- config/configmaps/files/config.yaml
- config/crd/bases/datasciencepipelinesapplications.opendatahub.io_datasciencepipelinesapplications.yaml
- config/manager/manager.yaml
- config/samples/dspa-simple/dspa_simple.yaml
- controllers/config/defaults.go
- controllers/dspipeline_params.go
- controllers/util/util.go
- main.go
- scripts/release/params.py
hbelmiro left a comment
- Can we move the MLflow check to the reconciler? That way CRs can be added/modified on the fly and users can have status feedback.
- I'd remove `PluginsEnabled`. If `Plugins` is empty or absent, plugins are disabled.
- Probably out of scope: it would be nice to have admin (operator-level) granularity on what plugins are allowed by DSPAs.
Change to PR detected. A new PR build was completed. |
Hey @hbelmiro thanks for the review.
api/v1/dspipeline_types.go (outdated)

```go
Plugins string `json:"plugins,omitempty"`

// Enable plugins. Defaults to True.
// +kubebuilder:validation:Optional
PluginsEnabled bool `json:"pluginsEnabled,omitempty"`
```
After discussing with Matt, we've agreed to just surface explicit configuration options catered to MLflow specifically. The fields we are going to need are:
```yaml
spec:
  # If the mlflow field is omitted, the default behavior is mlflow.integrationMode = AutoDetect
  mlflow:
    integrationMode: # *enum type; accepts one of "Disabled" or "AutoDetect"; when omitted, assume "AutoDetect"
    injectUserEnvVars: # *bool type; defaults to false; when omitted, the value is false
```

The behavior works as follows:
integrationMode: Disabled
- The DSP API Server does not enable the MLflow plugin configuration.

integrationMode: AutoDetect
- For each DSPA reconcile loop, DSPO needs to enable the MLflow plugin only if MLflow is deployed on the RHOAI cluster.
- On startup, DSPO should look up the MLflow Custom Resource's Status spec and cache the endpoint. That link points to the cluster-internal address that can be leveraged by KFP in RHOAI.
- Add a cache TTL, and use this cache to determine in each reconcile whether MLflow is deployed (this check is only needed when integrationMode is AutoDetect).
injectUserEnvVars: false
- DSPA sets the equivalent MLflow plugin setting injectUserEnvVars to false. This setting doesn't exist yet; you will need to add it in your KFP PR for launcher/driver as a top-level plugin setting for MLflow.
- When this is false, KFP will not update the user container's (the executor container's) environment variables. We want this behavior by default when the field is unspecified.

injectUserEnvVars: true
- Conversely, when the user opts in by setting true, propagate this to the KFP config so that the KFP driver can set these variables on the executor container.
Note that if integrationMode: AutoDetect is set but we are unable to find the endpoint, then we should surface a DSPA status condition with a helpful user-facing message. In this case plugin integration is not configured for KFP.
controllers/dspipeline_controller.go (outdated)

```go
	dspaStatus.SetObjStoreReady()
}

if params.PluginsEnabled && params.Plugins != "" {
```
Right now I don't think mlflowCR is doing anything.
As mentioned earlier, use the explicit mlflow fields suggested in the earlier comment. You do not need to fetch the mlflow CR on every reconcile; use the semantics described in that comment: do it once at DSPO startup, and again each time the cache expires.
controllers/dspipeline_controller.go (outdated)

```go
	dspaStatus.SetObjStoreReady()
}

if params.PluginsEnabled && params.Plugins != "" {
```
I think your approach needs to change here. Take a look at how plugin settings are configured at the API server level. This is the config that is present here. This is what you need to update when mlflow integrationMode is set to AutoDetect and the MLflow instance exists on the cluster (i.e. you have successfully retrieved the mlflow endpoint).
Now the only thing you need to dynamically fetch is the endpoint (as mentioned earlier) from the MLflow CR. The rest you can infer/hardcode (as constants):
- Timeout: 30s (or have the LLM suggest a reasonable timeout)
- TLS: the DSPA custom bundle path is sufficient
- WorkspacesEnabled: true
- AuthType: kubernetes
- ExperimentDescription: "Created by AI Pipelines."

This can go straight into the pipeline server config, and the integration will be enabled.
api/v1/dspipeline_types.go (outdated)

```go
Proxy *ProxyConfig `json:"proxy,omitempty"`

// Plugins is a string list of JSON configs for each API server plugin
Plugins string `json:"plugins,omitempty"`
```
Ignore this comment, I added this to negate the accidental approve earlier.
e830308 to eb0cd45
Actionable comments posted: 1
♻️ Duplicate comments (6)
api/v1/dspipeline_types.go (2)
59-59: ⚠️ Potential issue | 🟠 Major

Fix the MLflow JSON tag before this becomes part of the public API. `json:"mlflowConfig,omitEmpty"` uses a non-existent `encoding/json` option, so `omitempty` is not applied, and `config/crd/bases/datasciencepipelinesapplications.opendatahub.io_datasciencepipelinesapplications.yaml:919-930` already describes a different MLflow wire shape. Align the tag with the intended API/CRD name and add the usual kubebuilder marker on the field.

As per coding guidelines, "Kubebuilder validation markers on all user-facing fields."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@api/v1/dspipeline_types.go` at line 59, The MLflow field's JSON tag is wrong and missing a kubebuilder marker: update the MLflow field (MLflow *MLflowConfig) to use the correct wire name and omitempty (e.g. json:"mlflow,omitempty" to match the CRD) and add the kubebuilder validation marker (e.g. +kubebuilder:validation:Optional) immediately above the field so it is treated as optional in generated CRDs; then regenerate CRDs to verify the wire shape matches the existing manifests.
459-469: ⚠️ Potential issue | 🔴 Critical

`MLflowConfig` cannot be generated into a CRD as written. The anonymous `IntegrationMode` field has no JSON tag, which is already tripping `controller-gen` in CI, and `type IntegrationMode int` will marshal as `0`/`1` instead of the string enum the CRD/API exposes. Make `IntegrationMode` a string type and give both fields explicit JSON tags, defaults, and enum markers.

As per coding guidelines, "Kubebuilder validation markers on all user-facing fields" and "Enum validation where applicable."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@api/v1/dspipeline_types.go` around lines 459 - 469, The MLflowConfig struct and IntegrationMode type must be updated so the CRD can be generated: change type IntegrationMode from int to string, keep the enum constants (AutoDetect, Disabled) as string values, add explicit json tags on both IntegrationMode and InjectUserEnvVars in MLflowConfig (e.g., `integrationMode` and `injectUserEnvVars`), and add kubebuilder validation markers on the fields, including an enum marker for IntegrationMode listing "AutoDetect" and "Disabled" and a default marker (e.g., default=AutoDetect), plus a default/optional marker for InjectUserEnvVars as appropriate so controller-gen can produce correct CRD validation.

controllers/dspipeline_params.go (3)
675-675: ⚠️ Potential issue | 🔴 Critical

Handle omitted MLflow config before dereferencing it. `p.MLflow = dsp.Spec.MLflow` followed by `p.MLflow.IntegrationMode` panics whenever the field is omitted, because `p.MLflow` stays nil. Initialize a default config or guard the pointer before the mode check.

Minimal fix:

```diff
-	p.MLflow = dsp.Spec.MLflow
+	if dsp.Spec.MLflow != nil {
+		p.MLflow = dsp.Spec.MLflow.DeepCopy()
+	} else {
+		p.MLflow = &dspa.MLflowConfig{
+			IntegrationMode: dspa.AutoDetect,
+		}
+	}
@@
-	if p.MLflow.IntegrationMode == dspa.AutoDetect {
+	if p.MLflow != nil && p.MLflow.IntegrationMode == dspa.AutoDetect {
```

Also applies to: 704-704
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@controllers/dspipeline_params.go` at line 675, p.MLflow is assigned from dsp.Spec.MLflow but may be nil, causing a panic when later accessing p.MLflow.IntegrationMode; modify the assignment sites (where p.MLflow = dsp.Spec.MLflow is set) to either (a) check for nil and set p.MLflow to a default MLflow config struct when dsp.Spec.MLflow == nil, or (b) guard subsequent accesses by checking if p.MLflow != nil before reading IntegrationMode (e.g., use conditional logic in the code path that reads p.MLflow.IntegrationMode). Apply the same fix at both locations referenced (the assignment at p.MLflow = dsp.Spec.MLflow and the second occurrence) so no dereference happens on a nil pointer.
712-721: ⚠️ Potential issue | 🔴 Critical

This MLflow config is built and patched through the wrong path. `controllers/util/util.go:93-133` shows `UpdateConfigMapByName` uses `field` both as the ConfigMap data key and the JSON key passed to `updateJSON`, so passing `"config.json"` here cannot write the intended MLflow entry. This block also runs before `CustomServerConfig` is defaulted later in the function, and the empty `if err != nil {}` discards update failures. Build the final `config.json` after defaults are applied and write it through a helper that separates `dataKey` from `jsonKey`.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@controllers/dspipeline_params.go` around lines 712 - 721, The MLflow plugin config is being built and patched too early and via the wrong JSON key: move the BuildMLflowPluginConfigJson(...) + config patching logic to after CustomServerConfig is defaulted, and replace the util.UpdateConfigMapByName(...) call with a call to a helper that accepts separate dataKey and jsonKey (rather than passing "config.json" for both) so the MLflow entry is written into the correct JSON field; also handle and surface any error from the update (remove the empty if err != nil {} and log/return the error). Ensure you reference BuildMLflowPluginConfigJson, p.APIServer.CustomServerConfig.Name, p.Namespace and util.UpdateConfigMapByName (or the new helper) when making the change.
704-715: ⚠️ Potential issue | 🟠 Major

Do not continue after MLflow endpoint/config generation fails. This block only logs `RetrieveMLflowEndpoint`/`BuildMLflowPluginConfigJson` failures and keeps reconciling. That leaves the DSPA looking healthy while MLflow integration is skipped, and `pluginCfg` can still be built from an empty endpoint. Return an error or set a failing status condition here instead of continuing.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@controllers/dspipeline_params.go` around lines 704 - 715, The code continues reconciliation when RetrieveMLflowEndpoint or BuildMLflowPluginConfigJson fail (leaving pluginCfg empty); change the error handling inside the p.MLflow.IntegrationMode == dspa.AutoDetect branch so that on error you do not proceed: after calling RetrieveMLflowEndpoint and after BuildMLflowPluginConfigJson check the returned err and either return that error from the enclosing reconcile function or set a failing DSPA status condition and return, ensuring you do not continue with an empty mlflowEndpoint or pluginCfg; target the calls to RetrieveMLflowEndpoint, BuildMLflowPluginConfigJson, and the pluginCfg/mlflowEndpoint variables for the fix.

scripts/release/params.py (1)
50-50: ⚠️ Potential issue | 🟠 Major

`MLFLOW` defaults to malformed JSON. `"{\"}\""` expands to `{"}"`, so this is neither an empty object (`{}`) nor a valid quoted JSON payload (`"{}"`). If the release params consumer parses `MLFLOW` as JSON like the neighboring config values, plugin initialization will fail.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/release/params.py` at line 50, The MLFLOW default value is malformed ("{\"}\"") and should be a valid JSON payload; update the MLFLOW entry in params.py (the "MLFLOW" mapping) to a proper JSON string representing an empty object — e.g., replace "{\"}\"" with "{}" so consumers that JSON-parse the value will receive an empty object.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@controllers/dspipeline_params.go`:
- Line 45: The import github.com/opendatahub-io/mlflow-operator/api/v1 used in
controllers/dspipeline_params.go is not present in go.mod; update your module
dependencies by adding github.com/opendatahub-io/mlflow-operator at an
appropriate version (pin a specific release or commit) to go.mod and run go mod
tidy to populate go.sum; ensure the module version you choose is compatible with
the mlflowv1 import used in that file and commit the updated go.mod and go.sum
alongside this change.
---
Duplicate comments:
In `@api/v1/dspipeline_types.go`:
- Line 59: The MLflow field's JSON tag is wrong and missing a kubebuilder
marker: update the MLflow field (MLflow *MLflowConfig) to use the correct wire
name and omitempty (e.g. json:"mlflow,omitempty" to match the CRD) and add the
kubebuilder validation marker (e.g. +kubebuilder:validation:Optional)
immediately above the field so it is treated as optional in generated CRDs; then
regenerate CRDs to verify the wire shape matches the existing manifests.
- Around line 459-469: The MLflowConfig struct and IntegrationMode type must be
updated so the CRD can be generated: change type IntegrationMode from int to
string, keep the enum constants (AutoDetect, Disabled) as string values, add
explicit json tags on both IntegrationMode and InjectUserEnvVars in MLflowConfig
(e.g., `integrationMode` and `injectUserEnvVars`), and add kubebuilder
validation markers on the fields — include an enum marker for IntegrationMode
listing "AutoDetect" and "Disabled" and a default marker (e.g.,
default=AutoDetect), plus a default/optional marker for InjectUserEnvVars as
appropriate so controller-gen can produce correct CRD validation.
In `@controllers/dspipeline_params.go`:
- Line 675: p.MLflow is assigned from dsp.Spec.MLflow but may be nil, causing a
panic when later accessing p.MLflow.IntegrationMode; modify the assignment sites
(where p.MLflow = dsp.Spec.MLflow is set) to either (a) check for nil and set
p.MLflow to a default MLflow config struct when dsp.Spec.MLflow == nil, or (b)
guard subsequent accesses by checking if p.MLflow != nil before reading
IntegrationMode (e.g., use conditional logic in the code path that reads
p.MLflow.IntegrationMode). Apply the same fix at both locations referenced (the
assignment at p.MLflow = dsp.Spec.MLflow and the second occurrence) so no
dereference happens on a nil pointer.
- Around line 712-721: The MLflow plugin config is being built and patched too
early and via the wrong JSON key: move the BuildMLflowPluginConfigJson(...) +
config patching logic to after CustomServerConfig is defaulted, and replace the
util.UpdateConfigMapByName(...) call with a call to a helper that accepts
separate dataKey and jsonKey (rather than passing "config.json" for both) so the
MLflow entry is written into the correct JSON field; also handle and surface any
error from the update (remove the empty if err != nil {} and log/return the
error). Ensure you reference BuildMLflowPluginConfigJson,
p.APIServer.CustomServerConfig.Name, p.Namespace and util.UpdateConfigMapByName
(or the new helper) when making the change.
- Around line 704-715: The code continues reconciliation when
RetrieveMLflowEndpoint or BuildMLflowPluginConfigJson fail (leaving pluginCfg
empty); change the error handling inside the p.MLflow.IntegrationMode ==
dspa.AutoDetect branch so that on error you do not proceed: after calling
RetrieveMLflowEndpoint and after BuildMLflowPluginConfigJson check the returned
err and either return that error from the enclosing reconcile function or set a
failing DSPA status condition and return, ensuring you do not continue with an
empty mlflowEndpoint or pluginCfg; target the calls to RetrieveMLflowEndpoint,
BuildMLflowPluginConfigJson, and the pluginCfg/mlflowEndpoint variables for the
fix.
In `@scripts/release/params.py`:
- Line 50: The MLFLOW default value is malformed ("{\"}\"") and should be a
valid JSON payload; update the MLFLOW entry in params.py (the "MLFLOW" mapping)
to a proper JSON string representing an empty object — e.g., replace "{\"}\""
with "{}" so consumers that JSON-parse the value will receive an empty object.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Central YAML (base), Organization UI (inherited)
Review profile: CHILL
Plan: Pro
Run ID: c0b1ce06-3eb8-40f1-b296-3fcdb30abe51
📒 Files selected for processing (6)
- api/v1/dspipeline_types.go
- config/crd/bases/datasciencepipelinesapplications.opendatahub.io_datasciencepipelinesapplications.yaml
- controllers/dspipeline_params.go
- controllers/util/util.go
- main.go
- scripts/release/params.py
✅ Files skipped from review due to trivial changes (1)
- main.go
🚧 Files skipped from review as they are similar to previous changes (2)
- config/crd/bases/datasciencepipelinesapplications.opendatahub.io_datasciencepipelinesapplications.yaml
- controllers/util/util.go
35558d9 to 51b88cb
Change to PR detected. A new PR build was completed.
Actionable comments posted: 10
♻️ Duplicate comments (1)
scripts/release/params.py (1)
50-50: ⚠️ Potential issue | 🔴 Critical

Fix malformed JSON default for MLFLOW.

Line 50 uses "{\"}\"", which is invalid JSON and will break any JSON parsing path for this option.

Proposed fix:

- "MLFLOW": "{\"}\"",
+ "MLFLOW": "{}",

As per coding guidelines, prioritize bug-prone patterns and error handling gaps.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/release/params.py` at line 50, The MLFLOW default value is a malformed JSON string ("{\"}\"") and will break JSON parsing; update the "MLFLOW" entry to contain valid JSON (for example an empty object "{}" or another appropriate JSON payload) and ensure the string is properly escaped to match the surrounding file format so JSON parsing succeeds; locate the "MLFLOW" key in scripts/release/params.py and replace the invalid value accordingly.
🧹 Nitpick comments (1)
controllers/dspipeline_params.go (1)
147-168: Unused mapstructure tags.

These struct fields have both json and mapstructure tags, but only json.Marshal is used in BuildMLflowPluginConfigJson. Remove unused tags to reduce noise:

type PluginConfig struct {
-	Endpoint string `json:"endpoint,omitempty" mapstructure:"endpoint"`
-	Timeout string `json:"timeout,omitempty" mapstructure:"timeout"`
-	TLS TLSConfig `json:"tls,omitempty" mapstructure:"tls"`
-	Settings MLflowPluginSettings `json:"settings,omitempty" mapstructure:"settings"`
+	Endpoint string `json:"endpoint,omitempty"`
+	Timeout string `json:"timeout,omitempty"`
+	TLS TLSConfig `json:"tls,omitempty"`
+	Settings MLflowPluginSettings `json:"settings,omitempty"`
}

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@controllers/dspipeline_params.go` around lines 147 - 168, The struct tags include unused `mapstructure` tags (in PluginConfig, TLSConfig, MLflowPluginSettings) while only json.Marshal is used in BuildMLflowPluginConfigJson; remove the `mapstructure:"..."` tags from the fields (Endpoint, Timeout, TLS, Settings, InsecureSkipVerify, CABundlePath, WorkspacesEnabled, ExperimentDescription, DefaultExperimentName, KFPBaseURL, InjectUserEnvVars) so only `json` tags remain to reduce noise and keep serialization consistent with BuildMLflowPluginConfigJson.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@api/v1/dspipeline_types.go`:
- Around line 58-59: The struct field MLflow *MLflowConfig currently has an
incorrect JSON tag `json:"mlflowConfig,omitEmpty"`; change it to
`json:"mlflow,omitempty"` (use lowercase "omitempty" and the CRD-aligned field
name "mlflow") so the tag matches the CRD and uses the correct omitempty option;
update the tag on the MLflow field (MLflow *MLflowConfig) accordingly.
- Around line 471-481: The MLflowConfig struct and its IntegrationMode type lack
JSON tags and use an int iota enum which breaks controller-gen; change
IntegrationMode from int to a string-based type (e.g., type IntegrationMode
string) with explicit constant values like "AUTODETECT" and "DISABLED" and add
JSON tags to MLflowConfig fields (e.g., `json:"integrationMode,omitempty"` and
`json:"injectUserEnvVars,omitempty"` referencing the IntegrationMode and
InjectUserEnvVars fields) and, if desired for CRD validation, add kubebuilder
validation markers on IntegrationMode (e.g., `//
+kubebuilder:validation:Enum=AUTODETECT;DISABLED`) so the CRD serializes as
strings and controller-gen no longer errors.
In
`@config/crd/bases/datasciencepipelinesapplications.opendatahub.io_datasciencepipelinesapplications.yaml`:
- Around line 917-931: The CRD's mlflow schema is missing type declarations and
uses PascalCase field names; update the mlflow block to include "type: object",
rename the properties "IntegrationMode" -> "integrationMode" and
"InjectUserEnvVars" -> "injectUserEnvVars" to follow Kubernetes/Go JSON tag
camelCase conventions, and add "type: string" to the integrationMode property
(keeping its enum values) while ensuring injectUserEnvVars retains "type:
boolean"; adjust any downstream schema references to these property names as
needed.
In `@config/internal/apiserver/default/rolebinding_pipeline-runner.yaml.tmpl`:
- Around line 17-33: The RoleBinding for mlflow-integration is created
unconditionally; wrap the RoleBinding block (the resource named
mlflow-integration that references ClusterRole mlflow-integration and
ServiceAccount pipeline-runner-{{.Name}}) in a template conditional that checks
MLflow is configured and not disabled (use the template variables .MLflow and
.MLflow.IntegrationMode and test that IntegrationMode != "DISABLED", e.g. an if
that requires .MLflow to be present and its IntegrationMode not equal to
"DISABLED"), so the RoleBinding is only rendered when MLflow integration is
enabled.
In `@controllers/dspipeline_params.go`:
- Around line 794-813: The MLflow plugin setup block can panic because
p.APIServer.CustomServerConfig may be nil when accessed; update the logic around
RetrieveMLflowEndpoint / BuildMLflowPluginConfigJson /
util.UpdateConfigMapByName so it only runs after p.APIServer.CustomServerConfig
is initialized (or add an explicit nil check on p.APIServer.CustomServerConfig
before using .Name). Concretely, either move the entire MLflow config block to
execute after the CustomServerConfig defaulting/initialization code, or guard
the existing block with if p.APIServer != nil && p.APIServer.CustomServerConfig
!= nil before calling util.UpdateConfigMapByName and building the plugin config.
- Around line 751-766: The MLflow handling will never match the Disabled
constant because the spec field's Go type/tags in api/v1/dspipeline_types.go are
incorrect (embedded field missing JSON tag and type mismatch between
IntegrationMode definitions); fix the API type by giving the embedded MLflow
spec a proper json tag and ensure IntegrationMode in the CRD type uses the same
underlying type as dspa.IntegrationMode (e.g., string if dspa.IntegrationMode is
a string) so deserialization works, then in controllers/dspipeline_params.go
read dsp.Spec.MLflow.IntegrationMode directly and compare to dspa.Disabled; also
simplify the boolean check by assigning mlflowCfg.InjectUserEnvVars =
dsp.Spec.MLflow.InjectUserEnvVars (no == true check).
- Around line 246-263: RetrieveMLflowEndpoint uses client.Get to read MLflow CRs
(mlflow.kubeflow.org) via the Kubernetes API, but the cluster Role in
config/rbac/role.yaml lacks permissions; update the Role to grant the
mlflow.kubeflow.org apiGroup access to the mlflows resource with verbs get,
list, and watch so client.Get in RetrieveMLflowEndpoint (and any calls to mlflow
resources) won't be denied by RBAC.
In `@controllers/util/util.go`:
- Around line 119-122: The check using cmField := cm.Data[field] conflates a
missing key with an empty value; change to use the presence form (cmField, ok :=
cm.Data[field]) and if !ok return fmt.Errorf("configmap field %s does not
exist", field), else if cmField == "" return fmt.Errorf("configmap field %s is
empty", field) so the error messages accurately reflect whether the key is
missing or present-but-empty (update the code around cmField, cm.Data and the
existing error returns).
- Around line 114-133: UpdateConfigMapByName currently passes a JSON string
returned by BuildMLflowPluginConfigJson into updateJSON causing double-encoding
(updateJSON does obj[field] = value then json.Marshal). Fix by unmarshaling the
JSON string into a Go value before calling updateJSON (or change updateJSON to
accept an interface{} and unmarshal there): parse value (the JSON text) into an
interface{} (e.g., var v interface{}; json.Unmarshal([]byte(value), &v)) and
pass that unmarshaled v to updateJSON so updateJSON assigns a structured object
to obj[field] instead of a string, avoiding the escaped JSON literal issue with
obj[field] = value.
In `@scripts/release/params.py`:
- Line 50: The release params file adds a "MLFLOW" entry that isn't being used
by kustomize; either wire it into kustomize vars or remove it: update
scripts/release/params.py to remove the "MLFLOW" key if you don't need it, or
add a matching var in config/base/kustomization.yaml (declare a varRef/data
entry for data.MLFLOW and include it in vars/name) so that data.MLFLOW is
available for manifest substitution; target the "MLFLOW" symbol in params.py and
the vars block in config/base/kustomization.yaml (referencing data.MLFLOW) when
making the change.
---
Duplicate comments:
In `@scripts/release/params.py`:
- Line 50: The MLFLOW default value is a malformed JSON string ("{\"}\"") and
will break JSON parsing; update the "MLFLOW" entry to contain valid JSON (for
example an empty object "{}" or another appropriate JSON payload) and ensure the
string is properly escaped to match the surrounding file format so JSON parsing
succeeds; locate the "MLFLOW" key in scripts/release/params.py and replace the
invalid value accordingly.
---
Nitpick comments:
In `@controllers/dspipeline_params.go`:
- Around line 147-168: The struct tags include unused `mapstructure` tags (in
PluginConfig, TLSConfig, MLflowPluginSettings) while only json.Marshal is used
in BuildMLflowPluginConfigJson; remove the `mapstructure:"..."` tags from the
fields (Endpoint, Timeout, TLS, Settings, InsecureSkipVerify, CABundlePath,
WorkspacesEnabled, ExperimentDescription, DefaultExperimentName, KFPBaseURL,
InjectUserEnvVars) so only `json` tags remain to reduce noise and keep
serialization consistent with BuildMLflowPluginConfigJson.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Central YAML (base), Organization UI (inherited)
Review profile: CHILL
Plan: Pro
Run ID: 7962928e-9fac-471d-ad34-bd737826f12f
⛔ Files ignored due to path filters (1)
- go.sum is excluded by !**/*.sum
📒 Files selected for processing (9)
- api/v1/dspipeline_types.go
- config/crd/bases/datasciencepipelinesapplications.opendatahub.io_datasciencepipelinesapplications.yaml
- config/internal/apiserver/default/rolebinding_ds-pipeline.yaml.tmpl
- config/internal/apiserver/default/rolebinding_pipeline-runner.yaml.tmpl
- controllers/dspipeline_params.go
- controllers/util/util.go
- go.mod
- main.go
- scripts/release/params.py
✅ Files skipped from review due to trivial changes (3)
- main.go
- go.mod
- config/internal/apiserver/default/rolebinding_ds-pipeline.yaml.tmpl
.../bases/datasciencepipelinesapplications.opendatahub.io_datasciencepipelinesapplications.yaml
Outdated
config/internal/apiserver/default/rolebinding_pipeline-runner.yaml.tmpl
Outdated
| func (p *DSPAParams) RetrieveMLflowEndpoint(ctx context.Context, client client.Client, log logr.Logger) (string, error) { | ||
| mlflowName := "mlflow" | ||
| mlflow := &mlflowv1.MLflow{} | ||
| namespacedName := types.NamespacedName{ | ||
| Name: mlflowName, | ||
| Namespace: p.Namespace, | ||
| } | ||
| err := client.Get(ctx, namespacedName, mlflow) | ||
| if err != nil { | ||
| log.V(1).Info(fmt.Sprintf("Unable to retrieve mlflow resource [%s].", mlflowName)) | ||
| return "", err | ||
| } | ||
| status := mlflow.Status | ||
| if status.Address != nil && status.Address.URL != "" { | ||
| return mlflow.Status.Address.URL, nil | ||
| } | ||
| return "", errors.New("MLflow resource missing Status.Address.URL field. Unable to resolve endpoint") | ||
| } |
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Verify RBAC rules for MLflow resources
echo "Checking for MLflow RBAC rules in role.yaml..."
rg -i "mlflow" config/rbac/role.yaml || echo "No MLflow rules found in role.yaml"
echo ""
echo "Checking MLflow operator API group..."
rg -n "GroupVersion" --type go -A2 | grep -i mlflow || echo "Could not determine MLflow API group"
Repository: opendatahub-io/data-science-pipelines-operator
Length of output: 247
🏁 Script executed:
#!/bin/bash
# Find MLflow type imports in the controller file
echo "=== Searching for MLflow imports in dspipeline_params.go ==="
rg "mlflow" controllers/dspipeline_params.go -i -B2 -A2
echo ""
echo "=== Looking for MLflow type definitions ==="
fd -e go -type f | xargs rg "type MLflow struct" -l
echo ""
echo "=== Inspecting role.yaml structure ==="
head -50 config/rbac/role.yaml
echo ""
echo "=== Checking full MLflow API references ==="
rg "mlflowv1\.|mlflow\..*io" --type go -l | head -5
Repository: opendatahub-io/data-science-pipelines-operator
Length of output: 4654
🏁 Script executed:
#!/bin/bash
# Search for MLflow CRD or operator information
echo "=== Looking for MLflow GroupVersion or API group ==="
rg "GroupVersion|group.*mlflow" --type go -i | head -20
echo ""
echo "=== Check if MLflow is optional or required ==="
grep -n "RetrieveMLflowEndpoint" controllers/dspipeline_params.go -A5 -B5 | head -30
Repository: opendatahub-io/data-science-pipelines-operator
Length of output: 1827
🏁 Script executed:
#!/bin/bash
# Search for any existing RBAC configurations for other CRs
echo "=== Current RBAC rules structure in role.yaml ==="
rg -A3 "^- apiGroups:" config/rbac/role.yaml | head -50
Repository: opendatahub-io/data-science-pipelines-operator
Length of output: 778
🏁 Script executed:
#!/bin/bash
# Check how argoproj RBAC is configured (as reference for external CRs)
echo "=== Argoproj RBAC rules in role.yaml ==="
rg -A5 "argoproj" config/rbac/role.yaml
echo ""
echo "=== Full list of apiGroups in role.yaml ==="
rg "^ - apiGroups:" -A1 config/rbac/role.yaml | grep -E "^ - apiGroups:|^ - " | head -30
Repository: opendatahub-io/data-science-pipelines-operator
Length of output: 380
🏁 Script executed:
#!/bin/bash
# Search for any references to mlflow operator API group in the codebase
echo "=== Looking for mlflow-operator imports and schema ==="
rg "github.com/opendatahub-io/mlflow-operator" --type go
echo ""
echo "=== Checking if there's documentation about MLflow API group ==="
fd -e md -type f | xargs rg "mlflow.*group|api.*group" -i 2>/dev/null | head -10
Repository: opendatahub-io/data-science-pipelines-operator
Length of output: 961
🏁 Script executed:
#!/bin/bash
# Check if there's a go.mod or go.sum that references mlflow-operator version
echo "=== MLflow operator dependency in go.mod ==="
rg "mlflow-operator" go.mod -A1 -B1
echo ""
echo "=== Check main.go for MLflow operator setup ==="
grep -n "mlflowv1" main.go -A3 -B3
echo ""
echo "=== Verify the exact import path and look for API group ==="
head -100 main.go | tail -50
Repository: opendatahub-io/data-science-pipelines-operator
Length of output: 2874
🌐 Web query:
mlflow-operator opendatahub api group schema
💡 Result:
The mlflow-operator from OpenDataHub uses the API group mlflow.kubeflow.org with version v1. The primary Custom Resource is MLflowConfig in the package github.com/opendatahub-io/mlflow-operator/api/mlflowconfig/v1. MLflowConfig is a namespaced singleton resource (metadata.name must be "mlflow") used to override artifact storage settings per namespace. It references spec.artifactRootSecret named "mlflow-artifact-connection". Example usage from documentation: apiVersion: mlflow.kubeflow.org/v1 kind: MLflowConfig metadata: name: mlflow spec: artifactRootSecret: mlflow-artifact-connection The full Go API schema documentation is available at pkg.go.dev/github.com/opendatahub-io/mlflow-operator/api/mlflowconfig/v1 (published 2026-02-18). Sample CRs are in the repository at config/samples/, including mlflow_v1_mlflow.yaml for OpenShift deployment with local storage. The operator deploys MLflow using internal Helm charts, supporting OpenDataHub (default) and RHOAI modes, with built-in Kubernetes auth and storage configuration for artifacts (local, PVC, S3-compatible).
Citations:
- 1: https://pkg.go.dev/github.com/opendatahub-io/mlflow-operator/api/mlflowconfig/v1
- 2: https://github.com/opendatahub-io/mlflow-operator
Add RBAC rules for MLflow CR access (CWE-862).
client.Get at line 253 retrieves the MLflow CR from the mlflow.kubeflow.org API group, but config/rbac/role.yaml contains no rules granting access. This will fail with 403 Forbidden when MLflow integration is enabled.
Add to config/rbac/role.yaml:
- apiGroups:
- mlflow.kubeflow.org
resources:
- mlflows
verbs:
- get
- list
- watch🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@controllers/dspipeline_params.go` around lines 246 - 263,
RetrieveMLflowEndpoint uses client.Get to read MLflow CRs (mlflow.kubeflow.org)
via the Kubernetes API, but the cluster Role in config/rbac/role.yaml lacks
permissions; update the Role to grant the mlflow.kubeflow.org apiGroup access to
the mlflows resource with verbs get, list, and watch so client.Get in
RetrieveMLflowEndpoint (and any calls to mlflow resources) won't be denied by
RBAC.
controllers/util/util.go
Outdated
| func UpdateConfigMapByName(ctx context.Context, cfgName, ns string, client client.Client, field, value string) error { | ||
| cm, err := GetConfigMap(ctx, cfgName, ns, client) | ||
| if err != nil { | ||
| return fmt.Errorf("failed to get configmap %s/%s: %w", ns, cfgName, err) | ||
| } | ||
| cmField := cm.Data[field] | ||
| if cmField == "" { | ||
| return fmt.Errorf("configmap field %s does not exist", field) | ||
| } | ||
| updatedField, err := updateJSON(cmField, field, value) | ||
| if err != nil { | ||
| return fmt.Errorf("failed to update configmap field %s/%s: %w", ns, field, err) | ||
| } | ||
| cm.Data[field] = updatedField | ||
| err = UpdateConfigMap(ctx, cm, client) | ||
| if err != nil { | ||
| return fmt.Errorf("failed to update configmap %s/%s: %w", ns, cfgName, err) | ||
| } | ||
| return nil | ||
| } |
Double-encoding bug: JSON string value will be escaped.
At line 123, updateJSON(cmField, field, value) is called where value is already a JSON string (from BuildMLflowPluginConfigJson). Inside updateJSON, line 321 does obj[field] = value, then line 322 marshals the entire object. This double-encodes value - the JSON string becomes an escaped string literal.
Expected result: {"config.json": {"endpoint": "..."}}
Actual result: {"config.json": "{\"endpoint\": \"...\"}"}
Fix by unmarshaling value before assignment:
-func updateJSON(jsonStr string, field string, value any) (string, error) {
+func updateJSON(jsonStr string, field string, value string) (string, error) {
var obj map[string]any
if err := json.Unmarshal([]byte(jsonStr), &obj); err != nil {
return "", fmt.Errorf("invalid JSON input: %w", err)
}
- obj[field] = value
+ var parsed any
+ if err := json.Unmarshal([]byte(value), &parsed); err != nil {
+ return "", fmt.Errorf("invalid JSON value: %w", err)
+ }
+ obj[field] = parsed
out, err := json.Marshal(obj)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@controllers/util/util.go` around lines 114 - 133, UpdateConfigMapByName
currently passes a JSON string returned by BuildMLflowPluginConfigJson into
updateJSON causing double-encoding (updateJSON does obj[field] = value then
json.Marshal). Fix by unmarshaling the JSON string into a Go value before
calling updateJSON (or change updateJSON to accept an interface{} and unmarshal
there): parse value (the JSON text) into an interface{} (e.g., var v
interface{}; json.Unmarshal([]byte(value), &v)) and pass that unmarshaled v to
updateJSON so updateJSON assigns a structured object to obj[field] instead of a
string, avoiding the escaped JSON literal issue with obj[field] = value.
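The double-encoding is easy to reproduce outside the controller. In this sketch, setFieldRaw mirrors the current obj[field] = value assignment and setFieldParsed unmarshals the value first; both helper names are illustrative, not the PR's actual functions:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// setFieldRaw assigns the JSON text as a plain string, so Marshal
// re-encodes it into an escaped string literal (the reported bug).
func setFieldRaw(doc, field, value string) string {
	var obj map[string]any
	_ = json.Unmarshal([]byte(doc), &obj) // error handling elided in the buggy sketch
	obj[field] = value
	out, _ := json.Marshal(obj)
	return string(out)
}

// setFieldParsed unmarshals the value first, so it is embedded as
// a nested JSON object rather than an escaped string.
func setFieldParsed(doc, field, value string) (string, error) {
	var obj map[string]any
	if err := json.Unmarshal([]byte(doc), &obj); err != nil {
		return "", fmt.Errorf("invalid JSON input: %w", err)
	}
	var parsed any
	if err := json.Unmarshal([]byte(value), &parsed); err != nil {
		return "", fmt.Errorf("invalid JSON value: %w", err)
	}
	obj[field] = parsed
	out, err := json.Marshal(obj)
	return string(out), err
}

func main() {
	cfg := `{"config.json":{}}`
	plugin := `{"endpoint":"http://mlflow:5000"}`
	fmt.Println(setFieldRaw(cfg, "config.json", plugin)) // escaped string literal
	nested, _ := setFieldParsed(cfg, "config.json", plugin)
	fmt.Println(nested) // nested JSON object
}
```

A side benefit of parsing first is validation: a malformed plugin payload fails loudly at update time instead of being stored as an opaque string.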
controllers/util/util.go
Outdated
| cmField := cm.Data[field] | ||
| if cmField == "" { | ||
| return fmt.Errorf("configmap field %s does not exist", field) | ||
| } |
Empty string check doesn't distinguish missing key from empty value.
cm.Data[field] returns empty string for both missing keys and keys with empty values. The error message claims the field "does not exist" but that's not necessarily true.
- cmField := cm.Data[field]
- if cmField == "" {
- return fmt.Errorf("configmap field %s does not exist", field)
+ cmField, exists := cm.Data[field]
+ if !exists {
+ return fmt.Errorf("configmap field %s does not exist", field)
+ }
+ if cmField == "" {
+ return fmt.Errorf("configmap field %s is empty", field)
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| cmField := cm.Data[field] | |
| if cmField == "" { | |
| return fmt.Errorf("configmap field %s does not exist", field) | |
| } | |
| cmField, exists := cm.Data[field] | |
| if !exists { | |
| return fmt.Errorf("configmap field %s does not exist", field) | |
| } | |
| if cmField == "" { | |
| return fmt.Errorf("configmap field %s is empty", field) | |
| } |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@controllers/util/util.go` around lines 119 - 122, The check using cmField :=
cm.Data[field] conflates a missing key with an empty value; change to use the
presence form (cmField, ok := cm.Data[field]) and if !ok return
fmt.Errorf("configmap field %s does not exist", field), else if cmField == ""
return fmt.Errorf("configmap field %s is empty", field) so the error messages
accurately reflect whether the key is missing or present-but-empty (update the
code around cmField, cm.Data and the existing error returns).
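The distinction matters because indexing a ConfigMap's Data map returns the zero value ("") for absent keys. A small sketch of the comma-ok idiom; checkField is an illustrative helper, not the PR's function:

```go
package main

import "fmt"

// checkField distinguishes a key that is missing from one that is
// present but holds an empty string, so errors stay accurate.
func checkField(data map[string]string, field string) (string, error) {
	v, ok := data[field]
	if !ok {
		return "", fmt.Errorf("configmap field %s does not exist", field)
	}
	if v == "" {
		return "", fmt.Errorf("configmap field %s is empty", field)
	}
	return v, nil
}

func main() {
	data := map[string]string{"config.json": "", "log.level": "info"}
	for _, f := range []string{"log.level", "config.json", "missing"} {
		if v, err := checkField(data, f); err != nil {
			fmt.Println(err)
		} else {
			fmt.Printf("%s = %s\n", f, v)
		}
	}
}
```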
| "FIPSENABLED": "false", | ||
| "WEBHOOK_ANNOTATIONS": "" | ||
| "WEBHOOK_ANNOTATIONS": "", | ||
| "MLFLOW": "{\"}\"", |
Wire MLFLOW through kustomize vars or remove it from release params.
Line 50 adds MLFLOW, but config/base/kustomization.yaml var definitions do not include data.MLFLOW, so this setting is currently inert for manifest substitution.
Proposed wiring in config/base/kustomization.yaml
vars:
+ - name: MLFLOW
+ objref:
+ kind: ConfigMap
+ name: dspo-parameters
+ apiVersion: v1
+ fieldref:
+ fieldpath: data.MLFLOWAs per coding guidelines, prioritize architectural issues and anti-patterns.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@scripts/release/params.py` at line 50, The release params file adds a
"MLFLOW" entry that isn't being used by kustomize; either wire it into kustomize
vars or remove it: update scripts/release/params.py to remove the "MLFLOW" key
if you don't need it, or add a matching var in config/base/kustomization.yaml
(declare a varRef/data entry for data.MLFLOW and include it in vars/name) so
that data.MLFLOW is available for manifest substitution; target the "MLFLOW"
symbol in params.py and the vars block in config/base/kustomization.yaml
(referencing data.MLFLOW) when making the change.
Change to PR detected. A new PR build was completed.
🧹 Nitpick comments (1)
config/crd/bases/datasciencepipelinesapplications.opendatahub.io_datasciencepipelinesapplications.yaml (1)
613-627: Previous schema issues resolved; add a description for the mlflow object.

The past review findings (missing type: object, missing type: string on integrationMode, PascalCase field names) have all been addressed. The schema is now structurally correct.

One remaining gap: the mlflow property itself lacks a description field. Other peer properties like mlmd (line 628) also omit this, but for a new user-facing feature, a brief description improves discoverability.

Suggested description addition:

  mlflow:
+   description: |-
+     MLflow can be optionally enabled as an API server plugin for tracking
+     pipeline runs in MLflow experiments.
    properties:
      injectUserEnvVars:

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@config/crd/bases/datasciencepipelinesapplications.opendatahub.io_datasciencepipelinesapplications.yaml` around lines 613 - 627, The mlflow property in the CRD schema is missing a user-facing description; update the mlflow object (the block that defines properties injectUserEnvVars and integrationMode) to include a concise description explaining what the mlflow section configures (e.g., MLflow integration settings for KFP, toggling plugin integration and whether to inject user env vars), so consumers see purpose and usage when viewing the CRD.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In
`@config/crd/bases/datasciencepipelinesapplications.opendatahub.io_datasciencepipelinesapplications.yaml`:
- Around line 613-627: The mlflow property in the CRD schema is missing a
user-facing description; update the mlflow object (the block that defines
properties injectUserEnvVars and integrationMode) to include a concise
description explaining what the mlflow section configures (e.g., MLflow
integration settings for KFP, toggling plugin integration and whether to inject
user env vars), so consumers see purpose and usage when viewing the CRD.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Central YAML (base), Organization UI (inherited)
Review profile: CHILL
Plan: Pro
Run ID: 86a704a2-06d1-4d23-a666-b484e471d9ba
📒 Files selected for processing (4)
- api/v1/dspipeline_types.go
- api/v1/zz_generated.deepcopy.go
- config/crd/bases/datasciencepipelinesapplications.opendatahub.io_datasciencepipelinesapplications.yaml
- config/internal/apiserver/default/rolebinding_pipeline-runner.yaml.tmpl
🚧 Files skipped from review as they are similar to previous changes (2)
- config/internal/apiserver/default/rolebinding_pipeline-runner.yaml.tmpl
- api/v1/dspipeline_types.go
Change to PR detected. A new PR build was completed.
4 similar comments
Change to PR detected. A new PR build was completed.
api/v1/dspipeline_types.go
Outdated
| // Indicates whether to enable MLflow plugin on API server if plugin installed on cluster, | ||
| // or disable entirely. | ||
| // +kubebuilder:validation:Optional | ||
| IntegrationMode IntegrationMode `json:"integrationMode,omitempty"` |
can you set default to autodetect
// +kubebuilder:default=AUTODETECT
Updated, added // +kubebuilder:default=false for InjectUserEnvVars as well.
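For reference, a string-backed enum with those defaults might look like the sketch below. The type and constant names follow the review discussion; the Validate helper is an illustrative addition, and the +kubebuilder markers are plain comments to the Go compiler — controller-gen reads them at manifest-generation time:

```go
package main

import "fmt"

// IntegrationMode is string-backed so the CRD serializes readable
// values instead of iota integers.
// +kubebuilder:validation:Enum=AUTODETECT;DISABLED
// +kubebuilder:default=AUTODETECT
type IntegrationMode string

const (
	AutoDetect IntegrationMode = "AUTODETECT"
	Disabled   IntegrationMode = "DISABLED"
)

// Validate rejects values outside the enum, mirroring what the CRD
// schema would enforce at admission time.
func (m IntegrationMode) Validate() error {
	switch m {
	case AutoDetect, Disabled:
		return nil
	}
	return fmt.Errorf("unsupported integrationMode %q", m)
}

func main() {
	fmt.Println(AutoDetect.Validate())               // <nil>
	fmt.Println(IntegrationMode("maybe").Validate()) // unsupported integrationMode "maybe"
}
```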
config/internal/apiserver/default/rolebinding_ds-pipeline.yaml.tmpl
Outdated
[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
The full list of commands accepted by this bot can be found here.

Details
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing
Change to PR detected. A new PR build was completed.
5f36953 to 448ce67
Change to PR detected. A new PR build was completed.
Signed-off-by: Alyssa Goins <agoins@redhat.com>
448ce67 to f5c9ad7
Change to PR detected. A new PR build was completed.
Signed-off-by: Alyssa Goins <agoins@redhat.com>
Change to PR detected. A new PR build was completed.
Signed-off-by: Alyssa Goins <agoins@redhat.com>
Change to PR detected. A new PR build was completed.
| "Plugins": { | ||
| "mlflow": {{ if .APIServerPluginsJson }} |
is this resulting in a valid json? it looks like it's missing a close }

I think you fixed this already. The following still applies though:
Can we add a unit test for the rendering of server-config.yaml.tmpl with the new MLflow config? The declarative test framework already supports this, it should just be a matter of adding an expected ConfigMap to one of the existing test cases under controllers/testdata/declarative/, or adding a focused assertion in apiserver_test.go. Minimal effort given the existing infrastructure.
| if isCRDAvailable(restCfg, mlflowv1.GroupVersion.String()) { | ||
| mlflowSyncPeriod := 5 * time.Minute | ||
| mgrOpts.Cache.ByObject[&mlflowv1.MLflow{}] = cache.ByObject{SyncPeriod: &mlflowSyncPeriod} | ||
| setupLog.Info("MLflow CRD detected, caching MLflow resources", "syncPeriod", mlflowSyncPeriod) |
(AI Curated response)
The isCRDAvailable check in main.go only runs at operator startup. If the MLflow CRD is installed after DSPO is already running, it won't be detected and the cache won't be configured — requiring an operator restart.
Instead of relying on the manager cache for the MLflow CR, consider fetching the MLflow endpoint directly via an uncached API call in RetrieveMLflowEndpoint and caching the result in-memory with a TTL (e.g. a simple struct with sync.Mutex, the cached
value, and an expiry timestamp). This way:
- No startup-time CRD detection needed — the lookup happens lazily during reconciliation.
- The MLflow CRD/CR can be installed at any time without restarting DSPO.
- The TTL controls how often you hit the API server, giving you the same performance benefit as the cached approach.
The mlflowv1 scheme registration can stay since it's needed to deserialize the MLflow CR, but the manager cache configuration and isCRDAvailable check in main.go could be removed.
PR needs rebase.

Details
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
The issue resolved by this Pull Request:
Resolves #<issue_number>
Description of your changes:
Add support for plugins, following kubeflow/pipelines#13005 and kubeflow/pipelines#13052
Testing instructions
Checklist
Summary by CodeRabbit
New Features
Chores