Skip to content

Release candidate v0.5.6 - dev branch#2507

Merged
Vasilije1990 merged 34 commits intodevfrom
release-candidate-v0.5.6
Mar 30, 2026
Merged

Release candidate v0.5.6 - dev branch#2507
Vasilije1990 merged 34 commits intodevfrom
release-candidate-v0.5.6

Conversation

@dexters1
Copy link
Copy Markdown
Collaborator

@dexters1 dexters1 commented Mar 27, 2026

Description

Release branch for Cognee v0.5.6

Acceptance Criteria

Type of Change

  • Bug fix (non-breaking change that fixes an issue)
  • New feature (non-breaking change that adds functionality)
  • Code refactoring
  • Other (please specify):

Screenshots

Pre-submission Checklist

  • I have tested my changes thoroughly before submitting this PR (See CONTRIBUTING.md)
  • This PR contains minimal changes necessary to address the issue/feature
  • My code follows the project's coding standards and style guidelines
  • I have added tests that prove my fix is effective or that my feature works
  • I have added necessary documentation (if applicable)
  • All new and existing tests pass
  • I have searched existing PRs to ensure this change hasn't been submitted already
  • I have linked any relevant issues in the description
  • My commits have clear and descriptive messages

DCO Affirmation

I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin.

Summary by CodeRabbit

  • New Features

    • Added input sanitization and centralized handling for embeddings so invalid inputs produce safe zero vectors and original ordering is preserved.
  • Chores

    • Bumped package version to 0.5.6.
    • Updated a pinned dependency to a newer patch release.
  • Tests

    • Added unit test validating embedding sanitization and zero-vector behavior.
    • Added API tests for conditional authentication behavior and a CI job to run them.

@dexters1 dexters1 self-assigned this Mar 27, 2026
@pull-checklist
Copy link
Copy Markdown

Please make sure all the checkboxes are checked:

  • I have tested these changes locally.
  • I have reviewed the code changes.
  • I have added end-to-end and unit tests (if applicable).
  • I have updated the documentation and README.md file (if necessary).
  • I have removed unnecessary code and debug statements.
  • PR title is clear and follows the convention.
  • I have tagged reviewers or team members for feedback.

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai bot commented Mar 27, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review

Walkthrough

Bumps project and dependency versions, adds embedding input sanitation and centralized response handling utilities, updates multiple embedding engines to use them, adjusts a slicing minor fix, and adds/updates tests plus a CI job for conditional-authentication testing.

Changes

Cohort / File(s) Summary
Version / packaging
pyproject.toml, cognee-mcp/pyproject.toml
Project version bumped 0.5.50.5.6; pinned dependency cognee[postgres,docs,neo4j] bumped 0.5.40.5.5.
Embedding utilities (new)
cognee/infrastructure/databases/vector/embeddings/utils.py
Added is_embeddable, sanitize_embedding_text_inputs, and handle_embedding_response to validate inputs and map invalid entries to zero vectors.
Embedding engines — integration
cognee/infrastructure/databases/vector/embeddings/FastembedEmbeddingEngine.py, cognee/infrastructure/databases/vector/embeddings/LiteLLMEmbeddingEngine.py, cognee/infrastructure/databases/vector/embeddings/OllamaEmbeddingEngine.py
Use sanitized inputs for both mock and real embedding flows; post-process returned embeddings via handle_embedding_response; adjust single-string handling to use sanitized input; added TODOs in Ollama truncation.
Embedding engine — minor fix
cognee/infrastructure/databases/vector/embeddings/OpenAICompatibleEmbeddingEngine.py
Minor string-slicing syntax change when normalizing base URL (no behavioral change).
Tests
cognee/tests/api/test_conditional_authentication_endpoints.py, cognee/tests/unit/api/test_conditional_authentication_endpoints.py, cognee/tests/unit/test_litellm_embedding_sanitize.py
Added API tests for conditional-auth behavior and patched dotenv loading in test modules; added unit test verifying embedding sanitization produces zero vectors for invalid inputs.
CI
.github/workflows/e2e_tests.yml
Added run_conditional_auth_test job running conditional-authentication tests with env var ENABLE_BACKEND_ACCESS_CONTROL=false.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Suggested labels

ci

Suggested reviewers

  • pazone
  • Vasilije1990
🚥 Pre-merge checks | ✅ 1 | ❌ 2

❌ Failed checks (2 warnings)

Check name Status Explanation Resolution
Description check ⚠️ Warning The description provides only minimal information ('Release branch for Cognee v0.5.6') with most template sections left empty or unchecked, lacking details on actual changes, acceptance criteria, and testing completion. Provide a detailed human-generated description explaining the specific changes, improvements, and bug fixes included in v0.5.6, complete the Type of Change section, verify and check the Pre-submission Checklist items, and describe acceptance criteria with testing proof.
Docstring Coverage ⚠️ Warning Docstring coverage is 72.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (1 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately reflects the main objective of the pull request as a release candidate for version 0.5.6.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch release-candidate-v0.5.6

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions
Copy link
Copy Markdown
Contributor

Hello @dexters1, thank you for submitting a PR! We will respond as soon as possible.

@dexters1 dexters1 changed the title chore: update versions Release candidate v0.5.6 - dev update Mar 27, 2026
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@cognee-mcp/pyproject.toml`:
- Line 11: Update the pinned dependency string
"cognee[postgres,docs,neo4j]==0.5.5" to "cognee[postgres,docs,neo4j]==0.5.6" in
the pyproject.toml for cognee-mcp to match the main package bump, then
regenerate the lock file (e.g., run `uv lock` in the cognee-mcp directory) so
the lockfile reflects the updated version.

In `@pyproject.toml`:
- Line 4: The cognee-mcp package is pinned to cognee[postgres,docs,neo4j]==0.5.5
which mismatches the main package version = "0.5.6"; update the dependency
string in cognee-mcp/pyproject.toml from cognee[postgres,docs,neo4j]==0.5.5 to
cognee[postgres,docs,neo4j]==0.5.6 so the local release uses the matching 0.5.6
version.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 35d69647-e5f3-4839-91b1-84aa281defd3

📥 Commits

Reviewing files that changed from the base of the PR and between 5e8d1b6 and cc9452a.

⛔ Files ignored due to path filters (3)
  • cognee-mcp/uv.lock is excluded by !**/*.lock
  • poetry.lock is excluded by !**/*.lock
  • uv.lock is excluded by !**/*.lock
📒 Files selected for processing (2)
  • cognee-mcp/pyproject.toml
  • pyproject.toml

Comment thread cognee-mcp/pyproject.toml
# For local cognee repo usage remove comment bellow and add absolute path to cognee. Then run `uv sync --reinstall` in the mcp folder on local cognee changes.
#"cognee[postgres,docs,neo4j] @ file:/Users/igorilic/Desktop/cognee",
"cognee[postgres,docs,neo4j]==0.5.4",
"cognee[postgres,docs,neo4j]==0.5.5",
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Critical: Pin to cognee==0.5.6 to match the main package version.

This line pins cognee to version 0.5.5, but the main package in pyproject.toml line 4 is being bumped to 0.5.6 in this same PR. This creates a version skew where:

  • The cognee-mcp package will fetch cognee 0.5.5 from PyPI
  • The main repository is releasing version 0.5.6
  • Users installing cognee-mcp after the v0.5.6 release will get an outdated cognee dependency

For a consistent release, this should be updated to cognee[postgres,docs,neo4j]==0.5.6.

📦 Proposed fix to align versions
-    "cognee[postgres,docs,neo4j]==0.5.5",
+    "cognee[postgres,docs,neo4j]==0.5.6",

After making this change, regenerate the lock file:

cd cognee-mcp && uv lock
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
"cognee[postgres,docs,neo4j]==0.5.5",
"cognee[postgres,docs,neo4j]==0.5.6",
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cognee-mcp/pyproject.toml` at line 11, Update the pinned dependency string
"cognee[postgres,docs,neo4j]==0.5.5" to "cognee[postgres,docs,neo4j]==0.5.6" in
the pyproject.toml for cognee-mcp to match the main package bump, then
regenerate the lock file (e.g., run `uv lock` in the cognee-mcp directory) so
the lockfile reflects the updated version.

Comment thread pyproject.toml
name = "cognee"

version = "0.5.5"
version = "0.5.6"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Description: Check for version references and ensure consistency

echo "=== Main package version ==="
rg -n '^version\s*=' pyproject.toml

echo -e "\n=== cognee-mcp dependency constraint ==="
rg -n 'cognee\[.*\]==' cognee-mcp/pyproject.toml

echo -e "\n=== Lock file versions ==="
echo "Main uv.lock:"
rg -A2 '^\[\[package\]\]\nname = "cognee"' uv.lock | head -5

echo -e "\ncognee-mcp uv.lock:"
rg -A2 '^\[\[package\]\]\nname = "cognee"' cognee-mcp/uv.lock | head -5

Repository: topoteretes/cognee

Length of output: 661


Update cognee-mcp dependency to match main package version.

The main package version is 0.5.6 (line 4), but cognee-mcp/pyproject.toml line 11 pins cognee[postgres,docs,neo4j]==0.5.5. This version mismatch will cause cognee-mcp to fetch 0.5.5 from PyPI instead of using the local 0.5.6 being released, breaking the monorepo consistency.

Update cognee-mcp/pyproject.toml to pin cognee[postgres,docs,neo4j]==0.5.6.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pyproject.toml` at line 4, The cognee-mcp package is pinned to
cognee[postgres,docs,neo4j]==0.5.5 which mismatches the main package version =
"0.5.6"; update the dependency string in cognee-mcp/pyproject.toml from
cognee[postgres,docs,neo4j]==0.5.5 to cognee[postgres,docs,neo4j]==0.5.6 so the
local release uses the matching 0.5.6 version.

dexters1 and others added 12 commits March 27, 2026 14:51
<!-- .github/pull_request_template.md -->

## Description
Expanded PR by contributor Aroool to sanitize embedding calls

## Acceptance Criteria
<!--
* Key requirements to the new feature or modification;
* Proof that the changes work and meet the requirements;
-->

## Type of Change
<!-- Please check the relevant option -->
- [ ] Bug fix (non-breaking change that fixes an issue)
- [ ] New feature (non-breaking change that adds functionality)
- [ ] Code refactoring
- [ ] Other (please specify):

## Screenshots
<!-- ADD SCREENSHOT OF LOCAL TESTS PASSING-->

## Pre-submission Checklist
<!-- Please check all boxes that apply before submitting your PR -->
- [ ] **I have tested my changes thoroughly before submitting this PR**
(See `CONTRIBUTING.md`)
- [ ] **This PR contains minimal changes necessary to address the
issue/feature**
- [ ] My code follows the project's coding standards and style
guidelines
- [ ] I have added tests that prove my fix is effective or that my
feature works
- [ ] I have added necessary documentation (if applicable)
- [ ] All new and existing tests pass
- [ ] I have searched existing PRs to ensure this change hasn't been
submitted already
- [ ] I have linked any relevant issues in the description
- [ ] My commits have clear and descriptive messages

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin.
@gitguardian
Copy link
Copy Markdown

gitguardian bot commented Mar 28, 2026

⚠️ GitGuardian has uncovered 2 secrets following the scan of your pull request.

Please consider investigating the findings and remediating the incidents. Failure to do so may lead to compromising the associated services or software components.

🔎 Detected hardcoded secrets in your pull request
GitGuardian id GitGuardian status Secret Commit Filename
9573981 Triggered Generic Password c963a8e .github/workflows/e2e_tests.yml View secret
17116131 Triggered Generic Password fbe94d4 cognee/tests/integration/infrastructure/graph/test_kuzu_neo4j_neighbors.py View secret
🛠 Guidelines to remediate hardcoded secrets
  1. Understand the implications of revoking this secret by investigating where it is used in your code.
  2. Replace and store your secrets safely. Learn here the best practices.
  3. Revoke and rotate these secrets.
  4. If possible, rewrite git history. Rewriting git history is not a trivial act. You might completely break other contributing developers' workflow and you risk accidentally deleting legitimate data.

To avoid such incidents in the future consider


🦉 GitGuardian detects secrets in your source code to help developers and security teams secure the modern development process. You are seeing this because you or someone else with access to this repository has authorized GitGuardian to scan your pull request.

@dexters1 dexters1 changed the title Release candidate v0.5.6 - dev update Release candidate v0.5.6 - dev branch Mar 28, 2026
@mergify mergify bot mentioned this pull request Mar 28, 2026
13 tasks
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@cognee/infrastructure/databases/vector/embeddings/utils.py`:
- Around line 8-18: The docstring incorrectly states an alphanumeric requirement
but the implementation simply treats any non-whitespace string as embeddable;
update the docstring to reflect the actual rule (i.e., “any non-empty string
after s.strip() is embeddable”) or change the implementation to enforce the
alphanumeric rule — in this case, modify the docstring above the function that
uses parameter s (and mentions s.strip()) so it accurately describes that the
function returns True for any string with length >= 1 after stripping
whitespace.
- Around line 37-51: handle_embedding_response currently assumes dimensions is
provided and blindly does [0.0] * dimensions; change it to derive the
zero-vector length from the returned embeddings when dimensions is None and
validate input/output counts: if dimensions is None and embeddings is non-empty,
set zero_len = len(embeddings[0]); if embeddings is empty and dimensions is None
raise a ValueError; also fail fast by raising a ValueError if len(embeddings) !=
len(original_texts); then build zero_vector = [0.0] * zero_len and continue
using is_embeddable(original_texts[i]) to choose embeddings[i] or zero_vector.

In `@cognee/tests/unit/test_litellm_embedding_sanitize.py`:
- Around line 15-37: The test's fake_aembedding doesn't verify the sanitized
payload and uses identical vectors, so update the stub and constants: make
UNIQUE_B, UNIQUE_C, UNIQUE_D, UNIQUE_E distinct from each other and from
UNIQUE_A, and inside fake_aembedding assert that kwargs["input"] equals the
expected sanitized list (i.e., filtered values that embed_text should send, not
raw ""/whitespace), failing the test if embed_text passes unfiltered inputs;
reference the fake_aembedding stub and UNIQUE_A..UNIQUE_E symbols when making
these changes.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 81ba175e-b6c5-46b4-a4bd-6c332625cb74

📥 Commits

Reviewing files that changed from the base of the PR and between 4b38042 and 7f2976f.

📒 Files selected for processing (6)
  • cognee/infrastructure/databases/vector/embeddings/FastembedEmbeddingEngine.py
  • cognee/infrastructure/databases/vector/embeddings/LiteLLMEmbeddingEngine.py
  • cognee/infrastructure/databases/vector/embeddings/OllamaEmbeddingEngine.py
  • cognee/infrastructure/databases/vector/embeddings/utils.py
  • cognee/tests/unit/api/test_conditional_authentication_endpoints.py
  • cognee/tests/unit/test_litellm_embedding_sanitize.py
✅ Files skipped from review due to trivial changes (1)
  • cognee/tests/unit/api/test_conditional_authentication_endpoints.py

Comment on lines +8 to +18
"""
Check if input string is embeddable, if not it will be replaced with a dummy value to prevent API errors.
Empty strings and a string with only a space character are not embeddable.
If input string contains at least one alphanumeric character, it is considered embeddable.
"""
if not isinstance(s, str):
return False
# Strip whitespace to check if the string is empty or only contains spaces
s = s.strip()
if len(s) >= 1:
return True
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Align the docstring with the actual embeddable rule.

The implementation accepts any non-whitespace string, but Lines 10-11 say the input must contain an alphanumeric character. That already changes how "(" is classified in the new test, so callers currently have two different contracts.

✏️ Suggested docstring fix
-    Empty strings and a string with only a space character are not embeddable.
-    If input string contains at least one alphanumeric character, it is considered embeddable.
+    Empty or all-whitespace strings are not embeddable.
+    Any non-whitespace string is considered embeddable.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cognee/infrastructure/databases/vector/embeddings/utils.py` around lines 8 -
18, The docstring incorrectly states an alphanumeric requirement but the
implementation simply treats any non-whitespace string as embeddable; update the
docstring to reflect the actual rule (i.e., “any non-empty string after
s.strip() is embeddable”) or change the implementation to enforce the
alphanumeric rule — in this case, modify the docstring above the function that
uses parameter s (and mentions s.strip()) so it accurately describes that the
function returns True for any string with length >= 1 after stripping
whitespace.

Comment on lines +37 to +51
def handle_embedding_response(
original_texts: Union[List[str], str], embeddings: List[List[float]], dimensions: int
) -> List[List[float]]:
"""
Compare the original input strings against the results.
If the original string was 'junk' that was not embeddable, overwrite its vector with zeros.
"""
if isinstance(original_texts, str):
original_texts = [original_texts]

zero_vector = [0.0] * dimensions
return [
embeddings[i] if is_embeddable(original_texts[i]) else zero_vector
for i in range(len(original_texts))
]
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Don't assume dimensions is always set.

dimensions is optional in the engine constructors, but this helper now unconditionally does [0.0] * dimensions. That turns a previously valid “use provider default size” configuration into a TypeError on every response. Please resolve the zero-vector length from embeddings when dimensions is None, and fail fast if the provider returns a different number of vectors than inputs.

🛠️ Proposed fix
-from typing import List, Union
+from typing import List, Optional, Union
@@
 def handle_embedding_response(
-    original_texts: Union[List[str], str], embeddings: List[List[float]], dimensions: int
+    original_texts: Union[List[str], str],
+    embeddings: List[List[float]],
+    dimensions: Optional[int],
 ) -> List[List[float]]:
@@
     if isinstance(original_texts, str):
         original_texts = [original_texts]
+    if len(embeddings) != len(original_texts):
+        raise ValueError(
+            f"Expected {len(original_texts)} embeddings, received {len(embeddings)}."
+        )
 
-    zero_vector = [0.0] * dimensions
+    vector_size = dimensions if dimensions is not None else len(embeddings[0]) if embeddings else 0
     return [
-        embeddings[i] if is_embeddable(original_texts[i]) else zero_vector
+        embeddings[i] if is_embeddable(original_texts[i]) else [0.0] * vector_size
         for i in range(len(original_texts))
     ]

As per coding guidelines, "Prefer explicit, structured error handling in Python code."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cognee/infrastructure/databases/vector/embeddings/utils.py` around lines 37 -
51, handle_embedding_response currently assumes dimensions is provided and
blindly does [0.0] * dimensions; change it to derive the zero-vector length from
the returned embeddings when dimensions is None and validate input/output
counts: if dimensions is None and embeddings is non-empty, set zero_len =
len(embeddings[0]); if embeddings is empty and dimensions is None raise a
ValueError; also fail fast by raising a ValueError if len(embeddings) !=
len(original_texts); then build zero_vector = [0.0] * zero_len and continue
using is_embeddable(original_texts[i]) to choose embeddings[i] or zero_vector.

Comment on lines +15 to +37
# Stub litellm call to return distinct vectors for each valid input
UNIQUE_A = [1.0, 1.0, 1.0, 1.0]
UNIQUE_B = [2.0, 2.0, 2.0, 2.0]
UNIQUE_C = [2.0, 2.0, 2.0, 2.0]
UNIQUE_D = [2.0, 2.0, 2.0, 2.0]
UNIQUE_E = [2.0, 2.0, 2.0, 2.0]

class _Resp:
def __init__(self, data):
self.data = data

async def fake_aembedding(**kwargs):
# kwargs["input"] should already be filtered by engine
# In our inputs below, valid entries are "valid" and "ok!"
return _Resp(
[
{"embedding": UNIQUE_A},
{"embedding": UNIQUE_B},
{"embedding": UNIQUE_C},
{"embedding": UNIQUE_D},
{"embedding": UNIQUE_E},
]
)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

This test can pass even if sanitization is removed.

fake_aembedding() never checks kwargs["input"], so the test still passes if embed_text() sends raw "" / " " values to LiteLLM. Also, UNIQUE_B through UNIQUE_E are identical, so a reordering bug would slip through too. Assert the sanitized payload and make each mocked embedding distinct.

🧪 Suggested hardening
-    UNIQUE_B = [2.0, 2.0, 2.0, 2.0]
-    UNIQUE_C = [2.0, 2.0, 2.0, 2.0]
-    UNIQUE_D = [2.0, 2.0, 2.0, 2.0]
-    UNIQUE_E = [2.0, 2.0, 2.0, 2.0]
+    UNIQUE_B = [2.0, 2.0, 2.0, 2.0]
+    UNIQUE_C = [3.0, 3.0, 3.0, 3.0]
+    UNIQUE_D = [4.0, 4.0, 4.0, 4.0]
+    UNIQUE_E = [5.0, 5.0, 5.0, 5.0]
@@
     async def fake_aembedding(**kwargs):
-        # kwargs["input"] should already be filtered by engine
-        # In our inputs below, valid entries are "valid" and "ok!"
+        assert kwargs["input"] == [".", "(", "valid", ".", "ok!"]
         return _Resp(
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cognee/tests/unit/test_litellm_embedding_sanitize.py` around lines 15 - 37,
The test's fake_aembedding doesn't verify the sanitized payload and uses
identical vectors, so update the stub and constants: make UNIQUE_B, UNIQUE_C,
UNIQUE_D, UNIQUE_E distinct from each other and from UNIQUE_A, and inside
fake_aembedding assert that kwargs["input"] equals the expected sanitized list
(i.e., filtered values that embed_text should send, not raw ""/whitespace),
failing the test if embed_text passes unfiltered inputs; reference the
fake_aembedding stub and UNIQUE_A..UNIQUE_E symbols when making these changes.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (4)
cognee/tests/unit/api/test_conditional_authentication_endpoints.py (4)

47-52: Docstring placement is incorrect.

The docstring """Create a test client.""" at line 51 is placed after the import statement, so it won't be recognized as the function's docstring. Move it immediately after the function definition.

♻️ Suggested fix
     `@pytest.fixture`
     def client(self):
+        """Create a test client."""
         from cognee.api.client import app

-        """Create a test client."""
         return TestClient(app)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cognee/tests/unit/api/test_conditional_authentication_endpoints.py` around
lines 47 - 52, The fixture function client has its docstring placed after the
import so it's not recognized; move the string literal """Create a test
client.""" immediately after the def client(self): line so it becomes the
function docstring for the pytest fixture (update the fixture named client in
the test file accordingly).

161-164: Dead code: POST branch is unreachable.

The elif method == "POST" branch (lines 163-164) is unreachable because the parametrize decorator only includes "GET" methods. Either remove the dead code or add POST endpoints to the parametrize list if testing POST was intended.

♻️ Option 1: Remove dead code
         mock_get_default.return_value = mock_default_user

-        if method == "GET":
-            response = client.get(endpoint)
-        elif method == "POST":
-            response = client.post(endpoint, json={})
+        response = client.get(endpoint)

         assert mock_get_default.call_count == 1
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cognee/tests/unit/api/test_conditional_authentication_endpoints.py` around
lines 161 - 164, The POST branch under the test that switches on `method`
(checking `if method == "GET": response = client.get(endpoint)` / `elif method
== "POST": response = client.post(endpoint, json={})`) is dead because the
test's `parametrize` only supplies "GET"; either remove the unreachable `elif
method == "POST"` block or update the test's `@pytest.mark.parametrize` to
include "POST" and provide appropriate POST endpoints/payloads; locate the test
by the `method` loop and the `client.get`/`client.post` calls (in
test_conditional_authentication_endpoints) and apply one of those two fixes so
the code and parametrization remain consistent.

8-15: Module-level with patch is unconventional but functional.

Using a module-level with patch("dotenv.load_dotenv"): block to wrap all test definitions is an unusual pattern. While it works for preventing .env loading before the delayed importlib.import_module calls, future maintainers may find this surprising.

Consider documenting this pattern more explicitly or using a conftest.py with an autouse session-scoped fixture to achieve the same isolation:

# In conftest.py
`@pytest.fixture`(scope="session", autouse=True)
def prevent_dotenv_loading():
    with patch("dotenv.load_dotenv"):
        yield

This is an optional refactor since the current approach is functional and the comment at line 9 partially explains the intent.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cognee/tests/unit/api/test_conditional_authentication_endpoints.py` around
lines 8 - 15, The tests currently use a module-level with
patch("dotenv.load_dotenv") block around imports to prevent .env loading; move
this behavior into a dedicated autouse session fixture (e.g.,
prevent_dotenv_loading) in conftest.py that wraps patch("dotenv.load_dotenv")
and yields so all tests inherit it, then remove the module-level with block and
keep the os.environ settings (REQUIRE_AUTHENTICATION,
ENABLE_BACKEND_ACCESS_CONTROL) in the test module or set them in a setup
fixture; reference the existing patch("dotenv.load_dotenv") usage and create a
pytest fixture with scope="session" and autouse=True to provide the same
isolation more conventionally.

40-41: Duplicate environment variable assignments.

Lines 40-41 set REQUIRE_AUTHENTICATION and ENABLE_BACKEND_ACCESS_CONTROL to "false", but these are already set at lines 13-14. The redundant assignments can be removed.

♻️ Suggested fix
     # To turn off authentication we need to set the environment variable before importing the module
     # Also both require_authentication and backend access control must be false
-    os.environ["REQUIRE_AUTHENTICATION"] = "false"
-    os.environ["ENABLE_BACKEND_ACCESS_CONTROL"] = "false"
     gau_mod = importlib.import_module("cognee.modules.users.methods.get_authenticated_user")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cognee/tests/unit/api/test_conditional_authentication_endpoints.py` around
lines 40 - 41, Remove the duplicate environment assignments by deleting the
second occurrences that set REQUIRE_AUTHENTICATION="false" and
ENABLE_BACKEND_ACCESS_CONTROL="false" (keep the initial setup and remove the
redundant later assignments), ensuring each env var is only initialized once in
the test file so test configuration isn't confusing or duplicated.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@cognee/tests/unit/api/test_conditional_authentication_endpoints.py`:
- Around line 47-52: The fixture function client has its docstring placed after
the import so it's not recognized; move the string literal """Create a test
client.""" immediately after the def client(self): line so it becomes the
function docstring for the pytest fixture (update the fixture named client in
the test file accordingly).
- Around line 161-164: The POST branch under the test that switches on `method`
(checking `if method == "GET": response = client.get(endpoint)` / `elif method
== "POST": response = client.post(endpoint, json={})`) is dead because the
test's `parametrize` only supplies "GET"; either remove the unreachable `elif
method == "POST"` block or update the test's `@pytest.mark.parametrize` to
include "POST" and provide appropriate POST endpoints/payloads; locate the test
by the `method` loop and the `client.get`/`client.post` calls (in
test_conditional_authentication_endpoints) and apply one of those two fixes so
the code and parametrization remain consistent.
- Around line 8-15: The tests currently use a module-level with
patch("dotenv.load_dotenv") block around imports to prevent .env loading; move
this behavior into a dedicated autouse session fixture (e.g.,
prevent_dotenv_loading) in conftest.py that wraps patch("dotenv.load_dotenv")
and yields so all tests inherit it, then remove the module-level with block and
keep the os.environ settings (REQUIRE_AUTHENTICATION,
ENABLE_BACKEND_ACCESS_CONTROL) in the test module or set them in a setup
fixture; reference the existing patch("dotenv.load_dotenv") usage and create a
pytest fixture with scope="session" and autouse=True to provide the same
isolation more conventionally.
- Around line 40-41: Remove the duplicate environment assignments by deleting
the second occurrences that set REQUIRE_AUTHENTICATION="false" and
ENABLE_BACKEND_ACCESS_CONTROL="false" (keep the initial setup and remove the
redundant later assignments), ensuring each env var is only initialized once in
the test file so test configuration isn't confusing or duplicated.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 09d54a6b-d9b4-41e5-b97d-9369a6df5bdb

📥 Commits

Reviewing files that changed from the base of the PR and between 7f2976f and 9c6d617.

📒 Files selected for processing (1)
  • cognee/tests/unit/api/test_conditional_authentication_endpoints.py

@Vasilije1990 Vasilije1990 self-requested a review March 28, 2026 17:18
Vasilije1990
Vasilije1990 previously approved these changes Mar 28, 2026
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (4)
cognee/tests/api/test_conditional_authentication_endpoints.py (4)

47-52: Minor: Unconventional docstring placement.

The docstring on line 51 appears after the fixture parameter. While Python allows this, the conventional placement is immediately after the function definition line.

♻️ Suggested improvement
         `@pytest.fixture`
         def client(self):
+            """Create a test client."""
             from cognee.api.client import app
 
-            """Create a test client."""
             return TestClient(app)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cognee/tests/api/test_conditional_authentication_endpoints.py` around lines
47 - 52, The fixture function client has its docstring placed after import
statements instead of immediately following the def line; move the triple-quoted
string so it is the first statement inside the def client(...) function (i.e.,
place the docstring directly after the def line above the import and return),
keeping the function name client and preserving the TestClient(app) return
behavior.

38-42: Redundant environment variable assignments.

Lines 40-41 duplicate the environment variable settings from lines 13-14. Consider removing this duplication to improve clarity.

♻️ Proposed fix
     # To turn off authentication we need to set the environment variable before importing the module
     # Also both require_authentication and backend access control must be false
-    os.environ["REQUIRE_AUTHENTICATION"] = "false"
-    os.environ["ENABLE_BACKEND_ACCESS_CONTROL"] = "false"
     gau_mod = importlib.import_module("cognee.modules.users.methods.get_authenticated_user")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cognee/tests/api/test_conditional_authentication_endpoints.py` around lines
38 - 42, Remove the duplicated environment variable assignments for
REQUIRE_AUTHENTICATION and ENABLE_BACKEND_ACCESS_CONTROL that appear immediately
before importing the get_authenticated_user module; ensure only the initial
os.environ["REQUIRE_AUTHENTICATION"] = "false" and
os.environ["ENABLE_BACKEND_ACCESS_CONTROL"] = "false" (the ones already set
earlier in the file) remain so that the
importlib.import_module("cognee.modules.users.methods.get_authenticated_user")
call (gau_mod) still happens after the single, non-duplicated environment setup.

161-164: Dead code: POST branch will never execute.

The parameterized test (lines 147-153) only includes GET endpoints, so the elif method == "POST" branch on lines 163-164 will never be reached. Either remove the dead branch or add POST endpoints to the parameterization if intended.

♻️ Simplified version
-            if method == "GET":
-                response = client.get(endpoint)
-            elif method == "POST":
-                response = client.post(endpoint, json={})
+            response = client.get(endpoint)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cognee/tests/api/test_conditional_authentication_endpoints.py` around lines
161 - 164, The test contains a dead branch: the parameterized inputs only supply
GET endpoints so the `elif method == "POST"` branch (using
`client.post(endpoint, json={})`) will never run; either remove that POST branch
to simplify the test or update the test parameterization to include POST cases
and corresponding endpoint values; locate the method dispatch around `if method
== "GET": response = client.get(endpoint)` / `elif method == "POST": response =
client.post(endpoint, json={})` and apply the chosen fix (remove the POST branch
or add POST endpoints to the parameter list and adjust expected responses).

24-36: Remove unused fixture mock_authenticated_user.

The fixture defined at lines 25-36 is not used by any test in this file. All tests that require a user object use mock_default_user instead. Remove this fixture to eliminate dead code.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cognee/tests/api/test_conditional_authentication_endpoints.py` around lines
24 - 36, Remove the unused pytest fixture mock_authenticated_user: delete the
function named mock_authenticated_user from the test module (it returns a User
and duplicates mock_default_user), and ensure no tests reference
mock_authenticated_user; keep existing mock_default_user intact for tests that
need a user object.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@cognee/tests/api/test_conditional_authentication_endpoints.py`:
- Around line 47-52: The fixture function client has its docstring placed after
import statements instead of immediately following the def line; move the
triple-quoted string so it is the first statement inside the def client(...)
function (i.e., place the docstring directly after the def line above the import
and return), keeping the function name client and preserving the TestClient(app)
return behavior.
- Around line 38-42: Remove the duplicated environment variable assignments for
REQUIRE_AUTHENTICATION and ENABLE_BACKEND_ACCESS_CONTROL that appear immediately
before importing the get_authenticated_user module; ensure only the initial
os.environ["REQUIRE_AUTHENTICATION"] = "false" and
os.environ["ENABLE_BACKEND_ACCESS_CONTROL"] = "false" (the ones already set
earlier in the file) remain so that the
importlib.import_module("cognee.modules.users.methods.get_authenticated_user")
call (gau_mod) still happens after the single, non-duplicated environment setup.
- Around line 161-164: The test contains a dead branch: the parameterized inputs
only supply GET endpoints so the `elif method == "POST"` branch (using
`client.post(endpoint, json={})`) will never run; either remove that POST branch
to simplify the test or update the test parameterization to include POST cases
and corresponding endpoint values; locate the method dispatch around `if method
== "GET": response = client.get(endpoint)` / `elif method == "POST": response =
client.post(endpoint, json={})` and apply the chosen fix (remove the POST branch
or add POST endpoints to the parameter list and adjust expected responses).
- Around line 24-36: Remove the unused pytest fixture mock_authenticated_user:
delete the function named mock_authenticated_user from the test module (it
returns a User and duplicates mock_default_user), and ensure no tests reference
mock_authenticated_user; keep existing mock_default_user intact for tests that
need a user object.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: ab4a5273-599c-44eb-8651-99ddbba25c1d

📥 Commits

Reviewing files that changed from the base of the PR and between 9c6d617 and 8b9630d.

📒 Files selected for processing (2)
  • .github/workflows/e2e_tests.yml
  • cognee/tests/api/test_conditional_authentication_endpoints.py

hajdul88 and others added 14 commits March 30, 2026 15:08
<!-- .github/pull_request_template.md -->

## Description
Add auto migration to LanceDB DataPoint schema

## Acceptance Criteria
<!--
* Key requirements to the new feature or modification;
* Proof that the changes work and meet the requirements;
-->

## Type of Change
<!-- Please check the relevant option -->
- [ ] Bug fix (non-breaking change that fixes an issue)
- [ ] New feature (non-breaking change that adds functionality)
- [ ] Code refactoring
- [ ] Other (please specify):

## Screenshots
<!-- ADD SCREENSHOT OF LOCAL TESTS PASSING-->

## Pre-submission Checklist
<!-- Please check all boxes that apply before submitting your PR -->
- [ ] **I have tested my changes thoroughly before submitting this PR**
(See `CONTRIBUTING.md`)
- [ ] **This PR contains minimal changes necessary to address the
issue/feature**
- [ ] My code follows the project's coding standards and style
guidelines
- [ ] I have added tests that prove my fix is effective or that my
feature works
- [ ] I have added necessary documentation (if applicable)
- [ ] All new and existing tests pass
- [ ] I have searched existing PRs to ensure this change hasn't been
submitted already
- [ ] I have linked any relevant issues in the description
- [ ] My commits have clear and descriptive messages

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin.
@dexters1 dexters1 requested a review from Vasilije1990 March 30, 2026 22:00
@Vasilije1990 Vasilije1990 merged commit 0e096b2 into dev Mar 30, 2026
427 of 431 checks passed
@Vasilije1990 Vasilije1990 deleted the release-candidate-v0.5.6 branch March 30, 2026 22:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants