Release candidate v0.5.6 - dev branch by dexters1 · Pull Request #2507 · topoteretes/cognee

dexters1 · 2026-03-27T13:35:46Z

Description

Release branch for Cognee v0.5.6

Acceptance Criteria

Type of Change

Bug fix (non-breaking change that fixes an issue)
New feature (non-breaking change that adds functionality)
Code refactoring
Other (please specify):

Screenshots

Pre-submission Checklist

I have tested my changes thoroughly before submitting this PR (See CONTRIBUTING.md)
This PR contains minimal changes necessary to address the issue/feature
My code follows the project's coding standards and style guidelines
I have added tests that prove my fix is effective or that my feature works
I have added necessary documentation (if applicable)
All new and existing tests pass
I have searched existing PRs to ensure this change hasn't been submitted already
I have linked any relevant issues in the description
My commits have clear and descriptive messages

DCO Affirmation

I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin.

Summary by CodeRabbit

New Features
- Added input sanitization and centralized handling for embeddings so invalid inputs produce safe zero vectors and original ordering is preserved.
Chores
- Bumped package version to 0.5.6.
- Updated a pinned dependency to a newer patch release.
Tests
- Added unit test validating embedding sanitization and zero-vector behavior.
- Added API tests for conditional authentication behavior and a CI job to run them.

pull-checklist · 2026-03-27T13:35:51Z

Please make sure all the checkboxes are checked:

I have tested these changes locally.
I have reviewed the code changes.
I have added end-to-end and unit tests (if applicable).
I have updated the documentation and README.md file (if necessary).
I have removed unnecessary code and debug statements.
PR title is clear and follows the convention.
I have tagged reviewers or team members for feedback.

coderabbitai · 2026-03-27T13:35:59Z

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

@coderabbitai resume to resume automatic reviews.
@coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

▶️ Resume reviews
🔍 Trigger review

Walkthrough

Bumps project and dependency versions, adds embedding input sanitation and centralized response handling utilities, updates multiple embedding engines to use them, adjusts a slicing minor fix, and adds/updates tests plus a CI job for conditional-authentication testing.

Changes

Cohort / File(s)	Summary
Version / packaging `pyproject.toml`, `cognee-mcp/pyproject.toml`	Project version bumped `0.5.5` → `0.5.6`; pinned dependency `cognee[postgres,docs,neo4j]` bumped `0.5.4` → `0.5.5`.
Embedding utilities (new) `cognee/infrastructure/databases/vector/embeddings/utils.py`	Added `is_embeddable`, `sanitize_embedding_text_inputs`, and `handle_embedding_response` to validate inputs and map invalid entries to zero vectors.
Embedding engines — integration `cognee/infrastructure/databases/vector/embeddings/FastembedEmbeddingEngine.py`, `cognee/infrastructure/databases/vector/embeddings/LiteLLMEmbeddingEngine.py`, `cognee/infrastructure/databases/vector/embeddings/OllamaEmbeddingEngine.py`	Use sanitized inputs for both mock and real embedding flows; post-process returned embeddings via `handle_embedding_response`; adjust single-string handling to use sanitized input; added TODOs in Ollama truncation.
Embedding engine — minor fix `cognee/infrastructure/databases/vector/embeddings/OpenAICompatibleEmbeddingEngine.py`	Minor string-slicing syntax change when normalizing `base` URL (no behavioral change).
Tests `cognee/tests/api/test_conditional_authentication_endpoints.py`, `cognee/tests/unit/api/test_conditional_authentication_endpoints.py`, `cognee/tests/unit/test_litellm_embedding_sanitize.py`	Added API tests for conditional-auth behavior and patched dotenv loading in test modules; added unit test verifying embedding sanitization produces zero vectors for invalid inputs.
CI `.github/workflows/e2e_tests.yml`	Added `run_conditional_auth_test` job running conditional-authentication tests with env var `ENABLE_BACKEND_ACCESS_CONTROL=false`.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

chore: Fix issues with embedding and CI/CD pytest event loops [COG-3803] #2017 — Modifies embedding engines (including LiteLLMEmbeddingEngine.embed_text) and CI embedding configuration; similar code areas changed.
fix: Resolve issues with MCP #2135 — Updates pinned cognee[...] dependency in cognee-mcp/pyproject.toml; related dependency-bump changes.
Merge dev into main #1422 — Changes to OllamaEmbeddingEngine (payload/formatting/truncation); closely related to OllamaEngine edits here.

Suggested labels

ci

Suggested reviewers

pazone
Vasilije1990

🚥 Pre-merge checks | ✅ 1 | ❌ 2

❌ Failed checks (2 warnings)

Check name	Status	Explanation	Resolution
Description check	⚠️ Warning	The description provides only minimal information ('Release branch for Cognee v0.5.6') with most template sections left empty or unchecked, lacking details on actual changes, acceptance criteria, and testing completion.	Provide a detailed human-generated description explaining the specific changes, improvements, and bug fixes included in v0.5.6, complete the Type of Change section, verify and check the Pre-submission Checklist items, and describe acceptance criteria with testing proof.
Docstring Coverage	⚠️ Warning	Docstring coverage is 72.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (1 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title accurately reflects the main objective of the pull request as a release candidate for version 0.5.6.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch release-candidate-v0.5.6

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

github-actions · 2026-03-27T13:36:03Z

Hello @dexters1, thank you for submitting a PR! We will respond as soon as possible.

coderabbitai

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@cognee-mcp/pyproject.toml`:
- Line 11: Update the pinned dependency string
"cognee[postgres,docs,neo4j]==0.5.5" to "cognee[postgres,docs,neo4j]==0.5.6" in
the pyproject.toml for cognee-mcp to match the main package bump, then
regenerate the lock file (e.g., run `uv lock` in the cognee-mcp directory) so
the lockfile reflects the updated version.

In `@pyproject.toml`:
- Line 4: The cognee-mcp package is pinned to cognee[postgres,docs,neo4j]==0.5.5
which mismatches the main package version = "0.5.6"; update the dependency
string in cognee-mcp/pyproject.toml from cognee[postgres,docs,neo4j]==0.5.5 to
cognee[postgres,docs,neo4j]==0.5.6 so the local release uses the matching 0.5.6
version.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 35d69647-e5f3-4839-91b1-84aa281defd3

📥 Commits

Reviewing files that changed from the base of the PR and between 5e8d1b6 and cc9452a.

⛔ Files ignored due to path filters (3)

cognee-mcp/uv.lock is excluded by !**/*.lock
poetry.lock is excluded by !**/*.lock
uv.lock is excluded by !**/*.lock

📒 Files selected for processing (2)

cognee-mcp/pyproject.toml
pyproject.toml

coderabbitai · 2026-03-27T13:38:40Z

    # For local cognee repo usage remove comment bellow and add absolute path to cognee. Then run `uv sync --reinstall` in the mcp folder on local cognee changes.
    #"cognee[postgres,docs,neo4j] @ file:/Users/igorilic/Desktop/cognee",
-    "cognee[postgres,docs,neo4j]==0.5.4",
+    "cognee[postgres,docs,neo4j]==0.5.5",


⚠️ Potential issue | 🔴 Critical

Critical: Pin to cognee==0.5.6 to match the main package version.

This line pins cognee to version 0.5.5, but the main package in pyproject.toml line 4 is being bumped to 0.5.6 in this same PR. This creates a version skew where:

The cognee-mcp package will fetch cognee 0.5.5 from PyPI

The main repository is releasing version 0.5.6

Users installing cognee-mcp after the v0.5.6 release will get an outdated cognee dependency

For a consistent release, this should be updated to cognee[postgres,docs,neo4j]==0.5.6.

📦 Proposed fix to align versions

- "cognee[postgres,docs,neo4j]==0.5.5", + "cognee[postgres,docs,neo4j]==0.5.6",

After making this change, regenerate the lock file:

cd cognee-mcp && uv lock

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

"cognee[postgres,docs,neo4j]==0.5.5",

"cognee[postgres,docs,neo4j]==0.5.6",

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@cognee-mcp/pyproject.toml` at line 11, Update the pinned dependency string "cognee[postgres,docs,neo4j]==0.5.5" to "cognee[postgres,docs,neo4j]==0.5.6" in the pyproject.toml for cognee-mcp to match the main package bump, then regenerate the lock file (e.g., run `uv lock` in the cognee-mcp directory) so the lockfile reflects the updated version.

coderabbitai · 2026-03-27T13:38:40Z

 name = "cognee"

-version = "0.5.5"
+version = "0.5.6"


⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash # Description: Check for version references and ensure consistency echo "=== Main package version ===" rg -n '^version\s*=' pyproject.toml echo -e "\n=== cognee-mcp dependency constraint ===" rg -n 'cognee\[.*\]==' cognee-mcp/pyproject.toml echo -e "\n=== Lock file versions ===" echo "Main uv.lock:" rg -A2 '^\[\[package\]\]\nname = "cognee"' uv.lock | head -5 echo -e "\ncognee-mcp uv.lock:" rg -A2 '^\[\[package\]\]\nname = "cognee"' cognee-mcp/uv.lock | head -5

Repository: topoteretes/cognee

Length of output: 661

Update cognee-mcp dependency to match main package version.

The main package version is 0.5.6 (line 4), but cognee-mcp/pyproject.toml line 11 pins cognee[postgres,docs,neo4j]==0.5.5. This version mismatch will cause cognee-mcp to fetch 0.5.5 from PyPI instead of using the local 0.5.6 being released, breaking the monorepo consistency.

Update cognee-mcp/pyproject.toml to pin cognee[postgres,docs,neo4j]==0.5.6.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@pyproject.toml` at line 4, The cognee-mcp package is pinned to cognee[postgres,docs,neo4j]==0.5.5 which mismatches the main package version = "0.5.6"; update the dependency string in cognee-mcp/pyproject.toml from cognee[postgres,docs,neo4j]==0.5.5 to cognee[postgres,docs,neo4j]==0.5.6 so the local release uses the matching 0.5.6 version.

…gnee into aroool-sanitize-embedding

## Description Expanded PR by contributor Aroool to sanitize embedding calls ## Acceptance Criteria  ## Type of Change  - [ ] Bug fix (non-breaking change that fixes an issue) - [ ] New feature (non-breaking change that adds functionality) - [ ] Code refactoring - [ ] Other (please specify): ## Screenshots  ## Pre-submission Checklist  - [ ] **I have tested my changes thoroughly before submitting this PR** (See `CONTRIBUTING.md`) - [ ] **This PR contains minimal changes necessary to address the issue/feature** - [ ] My code follows the project's coding standards and style guidelines - [ ] I have added tests that prove my fix is effective or that my feature works - [ ] I have added necessary documentation (if applicable) - [ ] All new and existing tests pass - [ ] I have searched existing PRs to ensure this change hasn't been submitted already - [ ] I have linked any relevant issues in the description - [ ] My commits have clear and descriptive messages ## DCO Affirmation I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin.

gitguardian · 2026-03-28T16:04:58Z

⚠️ GitGuardian has uncovered 2 secrets following the scan of your pull request.

Please consider investigating the findings and remediating the incidents. Failure to do so may lead to compromising the associated services or software components.

🔎 Detected hardcoded secrets in your pull request

GitGuardian id	GitGuardian status	Secret	Commit	Filename
9573981	Triggered	Generic Password	`c963a8e`	.github/workflows/e2e_tests.yml	View secret
17116131	Triggered	Generic Password	`fbe94d4`	cognee/tests/integration/infrastructure/graph/test_kuzu_neo4j_neighbors.py	View secret

🛠 Guidelines to remediate hardcoded secrets

Understand the implications of revoking this secret by investigating where it is used in your code.
Replace and store your secrets safely. Learn here the best practices.
Revoke and rotate these secrets.
If possible, rewrite git history. Rewriting git history is not a trivial act. You might completely break other contributing developers' workflow and you risk accidentally deleting legitimate data.

To avoid such incidents in the future consider

following these best practices for managing and storing secrets including API keys and other credentials
install secret detection on pre-commit to catch secret before it leaves your machine and ease remediation.

^{🦉 GitGuardian detects secrets in your source code to help developers and security teams secure the modern development process. You are seeing this because you or someone else with access to this repository has authorized GitGuardian to scan your pull request.}

coderabbitai

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@cognee/infrastructure/databases/vector/embeddings/utils.py`:
- Around line 8-18: The docstring incorrectly states an alphanumeric requirement
but the implementation simply treats any non-whitespace string as embeddable;
update the docstring to reflect the actual rule (i.e., “any non-empty string
after s.strip() is embeddable”) or change the implementation to enforce the
alphanumeric rule — in this case, modify the docstring above the function that
uses parameter s (and mentions s.strip()) so it accurately describes that the
function returns True for any string with length >= 1 after stripping
whitespace.
- Around line 37-51: handle_embedding_response currently assumes dimensions is
provided and blindly does [0.0] * dimensions; change it to derive the
zero-vector length from the returned embeddings when dimensions is None and
validate input/output counts: if dimensions is None and embeddings is non-empty,
set zero_len = len(embeddings[0]); if embeddings is empty and dimensions is None
raise a ValueError; also fail fast by raising a ValueError if len(embeddings) !=
len(original_texts); then build zero_vector = [0.0] * zero_len and continue
using is_embeddable(original_texts[i]) to choose embeddings[i] or zero_vector.

In `@cognee/tests/unit/test_litellm_embedding_sanitize.py`:
- Around line 15-37: The test's fake_aembedding doesn't verify the sanitized
payload and uses identical vectors, so update the stub and constants: make
UNIQUE_B, UNIQUE_C, UNIQUE_D, UNIQUE_E distinct from each other and from
UNIQUE_A, and inside fake_aembedding assert that kwargs["input"] equals the
expected sanitized list (i.e., filtered values that embed_text should send, not
raw ""/whitespace), failing the test if embed_text passes unfiltered inputs;
reference the fake_aembedding stub and UNIQUE_A..UNIQUE_E symbols when making
these changes.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 81ba175e-b6c5-46b4-a4bd-6c332625cb74

📥 Commits

Reviewing files that changed from the base of the PR and between 4b38042 and 7f2976f.

📒 Files selected for processing (6)

cognee/infrastructure/databases/vector/embeddings/FastembedEmbeddingEngine.py
cognee/infrastructure/databases/vector/embeddings/LiteLLMEmbeddingEngine.py
cognee/infrastructure/databases/vector/embeddings/OllamaEmbeddingEngine.py
cognee/infrastructure/databases/vector/embeddings/utils.py
cognee/tests/unit/api/test_conditional_authentication_endpoints.py
cognee/tests/unit/test_litellm_embedding_sanitize.py

✅ Files skipped from review due to trivial changes (1)

cognee/tests/unit/api/test_conditional_authentication_endpoints.py

coderabbitai · 2026-03-28T16:13:52Z

+    """
+    Check if input string is embeddable, if not it will be replaced with a dummy value to prevent API errors.
+    Empty strings and a string with only a space character are not embeddable.
+    If input string contains at least one alphanumeric character, it is considered embeddable.
+    """
+    if not isinstance(s, str):
+        return False
+    # Strip whitespace to check if the string is empty or only contains spaces
+    s = s.strip()
+    if len(s) >= 1:
+        return True


⚠️ Potential issue | 🟡 Minor

Align the docstring with the actual embeddable rule.

The implementation accepts any non-whitespace string, but Lines 10-11 say the input must contain an alphanumeric character. That already changes how "(" is classified in the new test, so callers currently have two different contracts.

✏️ Suggested docstring fix

- Empty strings and a string with only a space character are not embeddable. - If input string contains at least one alphanumeric character, it is considered embeddable. + Empty or all-whitespace strings are not embeddable. + Any non-whitespace string is considered embeddable.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@cognee/infrastructure/databases/vector/embeddings/utils.py` around lines 8 - 18, The docstring incorrectly states an alphanumeric requirement but the implementation simply treats any non-whitespace string as embeddable; update the docstring to reflect the actual rule (i.e., “any non-empty string after s.strip() is embeddable”) or change the implementation to enforce the alphanumeric rule — in this case, modify the docstring above the function that uses parameter s (and mentions s.strip()) so it accurately describes that the function returns True for any string with length >= 1 after stripping whitespace.

coderabbitai · 2026-03-28T16:13:52Z

+def handle_embedding_response(
+    original_texts: Union[List[str], str], embeddings: List[List[float]], dimensions: int
+) -> List[List[float]]:
+    """
+    Compare the original input strings against the results.
+    If the original string was 'junk' that was not embeddable, overwrite its vector with zeros.
+    """
+    if isinstance(original_texts, str):
+        original_texts = [original_texts]
+
+    zero_vector = [0.0] * dimensions
+    return [
+        embeddings[i] if is_embeddable(original_texts[i]) else zero_vector
+        for i in range(len(original_texts))
+    ]


⚠️ Potential issue | 🟠 Major

Don't assume dimensions is always set.

dimensions is optional in the engine constructors, but this helper now unconditionally does [0.0] * dimensions. That turns a previously valid “use provider default size” configuration into a TypeError on every response. Please resolve the zero-vector length from embeddings when dimensions is None, and fail fast if the provider returns a different number of vectors than inputs.

🛠️ Proposed fix

-from typing import List, Union +from typing import List, Optional, Union @@ def handle_embedding_response( - original_texts: Union[List[str], str], embeddings: List[List[float]], dimensions: int + original_texts: Union[List[str], str], + embeddings: List[List[float]], + dimensions: Optional[int], ) -> List[List[float]]: @@ if isinstance(original_texts, str): original_texts = [original_texts] + if len(embeddings) != len(original_texts): + raise ValueError( + f"Expected {len(original_texts)} embeddings, received {len(embeddings)}." + ) - zero_vector = [0.0] * dimensions + vector_size = dimensions if dimensions is not None else len(embeddings[0]) if embeddings else 0 return [ - embeddings[i] if is_embeddable(original_texts[i]) else zero_vector + embeddings[i] if is_embeddable(original_texts[i]) else [0.0] * vector_size for i in range(len(original_texts)) ]

As per coding guidelines, "Prefer explicit, structured error handling in Python code."

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@cognee/infrastructure/databases/vector/embeddings/utils.py` around lines 37 - 51, handle_embedding_response currently assumes dimensions is provided and blindly does [0.0] * dimensions; change it to derive the zero-vector length from the returned embeddings when dimensions is None and validate input/output counts: if dimensions is None and embeddings is non-empty, set zero_len = len(embeddings[0]); if embeddings is empty and dimensions is None raise a ValueError; also fail fast by raising a ValueError if len(embeddings) != len(original_texts); then build zero_vector = [0.0] * zero_len and continue using is_embeddable(original_texts[i]) to choose embeddings[i] or zero_vector.

coderabbitai · 2026-03-28T16:13:52Z

+    # Stub litellm call to return distinct vectors for each valid input
+    UNIQUE_A = [1.0, 1.0, 1.0, 1.0]
+    UNIQUE_B = [2.0, 2.0, 2.0, 2.0]
+    UNIQUE_C = [2.0, 2.0, 2.0, 2.0]
+    UNIQUE_D = [2.0, 2.0, 2.0, 2.0]
+    UNIQUE_E = [2.0, 2.0, 2.0, 2.0]
+
+    class _Resp:
+        def __init__(self, data):
+            self.data = data
+
+    async def fake_aembedding(**kwargs):
+        # kwargs["input"] should already be filtered by engine
+        # In our inputs below, valid entries are "valid" and "ok!"
+        return _Resp(
+            [
+                {"embedding": UNIQUE_A},
+                {"embedding": UNIQUE_B},
+                {"embedding": UNIQUE_C},
+                {"embedding": UNIQUE_D},
+                {"embedding": UNIQUE_E},
+            ]
+        )


⚠️ Potential issue | 🟡 Minor

This test can pass even if sanitization is removed.

fake_aembedding() never checks kwargs["input"], so the test still passes if embed_text() sends raw "" / " " values to LiteLLM. Also, UNIQUE_B through UNIQUE_E are identical, so a reordering bug would slip through too. Assert the sanitized payload and make each mocked embedding distinct.

🧪 Suggested hardening

- UNIQUE_B = [2.0, 2.0, 2.0, 2.0] - UNIQUE_C = [2.0, 2.0, 2.0, 2.0] - UNIQUE_D = [2.0, 2.0, 2.0, 2.0] - UNIQUE_E = [2.0, 2.0, 2.0, 2.0] + UNIQUE_B = [2.0, 2.0, 2.0, 2.0] + UNIQUE_C = [3.0, 3.0, 3.0, 3.0] + UNIQUE_D = [4.0, 4.0, 4.0, 4.0] + UNIQUE_E = [5.0, 5.0, 5.0, 5.0] @@ async def fake_aembedding(**kwargs): - # kwargs["input"] should already be filtered by engine - # In our inputs below, valid entries are "valid" and "ok!" + assert kwargs["input"] == [".", "(", "valid", ".", "ok!"] return _Resp(

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@cognee/tests/unit/test_litellm_embedding_sanitize.py` around lines 15 - 37, The test's fake_aembedding doesn't verify the sanitized payload and uses identical vectors, so update the stub and constants: make UNIQUE_B, UNIQUE_C, UNIQUE_D, UNIQUE_E distinct from each other and from UNIQUE_A, and inside fake_aembedding assert that kwargs["input"] equals the expected sanitized list (i.e., filtered values that embed_text should send, not raw ""/whitespace), failing the test if embed_text passes unfiltered inputs; reference the fake_aembedding stub and UNIQUE_A..UNIQUE_E symbols when making these changes.

coderabbitai

🧹 Nitpick comments (4)

cognee/tests/unit/api/test_conditional_authentication_endpoints.py (4)

47-52: Docstring placement is incorrect.

The docstring """Create a test client.""" at line 51 is placed after the import statement, so it won't be recognized as the function's docstring. Move it immediately after the function definition.

♻️ Suggested fix

     `@pytest.fixture`
     def client(self):
+        """Create a test client."""
         from cognee.api.client import app

-        """Create a test client."""
         return TestClient(app)

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@cognee/tests/unit/api/test_conditional_authentication_endpoints.py` around
lines 47 - 52, The fixture function client has its docstring placed after the
import so it's not recognized; move the string literal """Create a test
client.""" immediately after the def client(self): line so it becomes the
function docstring for the pytest fixture (update the fixture named client in
the test file accordingly).

161-164: Dead code: POST branch is unreachable.

The elif method == "POST" branch (lines 163-164) is unreachable because the parametrize decorator only includes "GET" methods. Either remove the dead code or add POST endpoints to the parametrize list if testing POST was intended.

♻️ Option 1: Remove dead code

         mock_get_default.return_value = mock_default_user

-        if method == "GET":
-            response = client.get(endpoint)
-        elif method == "POST":
-            response = client.post(endpoint, json={})
+        response = client.get(endpoint)

         assert mock_get_default.call_count == 1

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@cognee/tests/unit/api/test_conditional_authentication_endpoints.py` around
lines 161 - 164, The POST branch under the test that switches on `method`
(checking `if method == "GET": response = client.get(endpoint)` / `elif method
== "POST": response = client.post(endpoint, json={})`) is dead because the
test's `parametrize` only supplies "GET"; either remove the unreachable `elif
method == "POST"` block or update the test's `@pytest.mark.parametrize` to
include "POST" and provide appropriate POST endpoints/payloads; locate the test
by the `method` loop and the `client.get`/`client.post` calls (in
test_conditional_authentication_endpoints) and apply one of those two fixes so
the code and parametrization remain consistent.

8-15: Module-level with patch is unconventional but functional.

Using a module-level with patch("dotenv.load_dotenv"): block to wrap all test definitions is an unusual pattern. While it works for preventing .env loading before the delayed importlib.import_module calls, future maintainers may find this surprising.

Consider documenting this pattern more explicitly or using a conftest.py with an autouse session-scoped fixture to achieve the same isolation:

# In conftest.py
`@pytest.fixture`(scope="session", autouse=True)
def prevent_dotenv_loading():
    with patch("dotenv.load_dotenv"):
        yield

This is an optional refactor since the current approach is functional and the comment at line 9 partially explains the intent.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@cognee/tests/unit/api/test_conditional_authentication_endpoints.py` around
lines 8 - 15, The tests currently use a module-level with
patch("dotenv.load_dotenv") block around imports to prevent .env loading; move
this behavior into a dedicated autouse session fixture (e.g.,
prevent_dotenv_loading) in conftest.py that wraps patch("dotenv.load_dotenv")
and yields so all tests inherit it, then remove the module-level with block and
keep the os.environ settings (REQUIRE_AUTHENTICATION,
ENABLE_BACKEND_ACCESS_CONTROL) in the test module or set them in a setup
fixture; reference the existing patch("dotenv.load_dotenv") usage and create a
pytest fixture with scope="session" and autouse=True to provide the same
isolation more conventionally.

40-41: Duplicate environment variable assignments.

Lines 40-41 set REQUIRE_AUTHENTICATION and ENABLE_BACKEND_ACCESS_CONTROL to "false", but these are already set at lines 13-14. The redundant assignments can be removed.

♻️ Suggested fix

     # To turn off authentication we need to set the environment variable before importing the module
     # Also both require_authentication and backend access control must be false
-    os.environ["REQUIRE_AUTHENTICATION"] = "false"
-    os.environ["ENABLE_BACKEND_ACCESS_CONTROL"] = "false"
     gau_mod = importlib.import_module("cognee.modules.users.methods.get_authenticated_user")

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@cognee/tests/unit/api/test_conditional_authentication_endpoints.py` around
lines 40 - 41, Remove the duplicate environment assignments by deleting the
second occurrences that set REQUIRE_AUTHENTICATION="false" and
ENABLE_BACKEND_ACCESS_CONTROL="false" (keep the initial setup and remove the
redundant later assignments), ensuring each env var is only initialized once in
the test file so test configuration isn't confusing or duplicated.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@cognee/tests/unit/api/test_conditional_authentication_endpoints.py`:
- Around line 47-52: The fixture function client has its docstring placed after
the import so it's not recognized; move the string literal """Create a test
client.""" immediately after the def client(self): line so it becomes the
function docstring for the pytest fixture (update the fixture named client in
the test file accordingly).
- Around line 161-164: The POST branch under the test that switches on `method`
(checking `if method == "GET": response = client.get(endpoint)` / `elif method
== "POST": response = client.post(endpoint, json={})`) is dead because the
test's `parametrize` only supplies "GET"; either remove the unreachable `elif
method == "POST"` block or update the test's `@pytest.mark.parametrize` to
include "POST" and provide appropriate POST endpoints/payloads; locate the test
by the `method` loop and the `client.get`/`client.post` calls (in
test_conditional_authentication_endpoints) and apply one of those two fixes so
the code and parametrization remain consistent.
- Around line 8-15: The tests currently use a module-level with
patch("dotenv.load_dotenv") block around imports to prevent .env loading; move
this behavior into a dedicated autouse session fixture (e.g.,
prevent_dotenv_loading) in conftest.py that wraps patch("dotenv.load_dotenv")
and yields so all tests inherit it, then remove the module-level with block and
keep the os.environ settings (REQUIRE_AUTHENTICATION,
ENABLE_BACKEND_ACCESS_CONTROL) in the test module or set them in a setup
fixture; reference the existing patch("dotenv.load_dotenv") usage and create a
pytest fixture with scope="session" and autouse=True to provide the same
isolation more conventionally.
- Around line 40-41: Remove the duplicate environment assignments by deleting
the second occurrences that set REQUIRE_AUTHENTICATION="false" and
ENABLE_BACKEND_ACCESS_CONTROL="false" (keep the initial setup and remove the
redundant later assignments), ensuring each env var is only initialized once in
the test file so test configuration isn't confusing or duplicated.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 09d54a6b-d9b4-41e5-b97d-9369a6df5bdb

📥 Commits

Reviewing files that changed from the base of the PR and between 7f2976f and 9c6d617.

📒 Files selected for processing (1)

cognee/tests/unit/api/test_conditional_authentication_endpoints.py

coderabbitai

🧹 Nitpick comments (4)

cognee/tests/api/test_conditional_authentication_endpoints.py (4)

47-52: Minor: Unconventional docstring placement.

The docstring on line 51 appears after the fixture parameter. While Python allows this, the conventional placement is immediately after the function definition line.

♻️ Suggested improvement

         `@pytest.fixture`
         def client(self):
+            """Create a test client."""
             from cognee.api.client import app
 
-            """Create a test client."""
             return TestClient(app)

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@cognee/tests/api/test_conditional_authentication_endpoints.py` around lines
47 - 52, The fixture function client has its docstring placed after import
statements instead of immediately following the def line; move the triple-quoted
string so it is the first statement inside the def client(...) function (i.e.,
place the docstring directly after the def line above the import and return),
keeping the function name client and preserving the TestClient(app) return
behavior.

38-42: Redundant environment variable assignments.

Lines 40-41 duplicate the environment variable settings from lines 13-14. Consider removing this duplication to improve clarity.

♻️ Proposed fix

     # To turn off authentication we need to set the environment variable before importing the module
     # Also both require_authentication and backend access control must be false
-    os.environ["REQUIRE_AUTHENTICATION"] = "false"
-    os.environ["ENABLE_BACKEND_ACCESS_CONTROL"] = "false"
     gau_mod = importlib.import_module("cognee.modules.users.methods.get_authenticated_user")

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@cognee/tests/api/test_conditional_authentication_endpoints.py` around lines
38 - 42, Remove the duplicated environment variable assignments for
REQUIRE_AUTHENTICATION and ENABLE_BACKEND_ACCESS_CONTROL that appear immediately
before importing the get_authenticated_user module; ensure only the initial
os.environ["REQUIRE_AUTHENTICATION"] = "false" and
os.environ["ENABLE_BACKEND_ACCESS_CONTROL"] = "false" (the ones already set
earlier in the file) remain so that the
importlib.import_module("cognee.modules.users.methods.get_authenticated_user")
call (gau_mod) still happens after the single, non-duplicated environment setup.

161-164: Dead code: POST branch will never execute.

The parameterized test (lines 147-153) only includes GET endpoints, so the elif method == "POST" branch on lines 163-164 will never be reached. Either remove the dead branch or add POST endpoints to the parameterization if intended.

♻️ Simplified version

-            if method == "GET":
-                response = client.get(endpoint)
-            elif method == "POST":
-                response = client.post(endpoint, json={})
+            response = client.get(endpoint)

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@cognee/tests/api/test_conditional_authentication_endpoints.py` around lines
161 - 164, The test contains a dead branch: the parameterized inputs only supply
GET endpoints so the `elif method == "POST"` branch (using
`client.post(endpoint, json={})`) will never run; either remove that POST branch
to simplify the test or update the test parameterization to include POST cases
and corresponding endpoint values; locate the method dispatch around `if method
== "GET": response = client.get(endpoint)` / `elif method == "POST": response =
client.post(endpoint, json={})` and apply the chosen fix (remove the POST branch
or add POST endpoints to the parameter list and adjust expected responses).

24-36: Remove unused fixture mock_authenticated_user.

The fixture defined at lines 25-36 is not used by any test in this file. All tests that require a user object use mock_default_user instead. Remove this fixture to eliminate dead code.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@cognee/tests/api/test_conditional_authentication_endpoints.py` around lines
24 - 36, Remove the unused pytest fixture mock_authenticated_user: delete the
function named mock_authenticated_user from the test module (it returns a User
and duplicates mock_default_user), and ensure no tests reference
mock_authenticated_user; keep existing mock_default_user intact for tests that
need a user object.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@cognee/tests/api/test_conditional_authentication_endpoints.py`:
- Around line 47-52: The fixture function client has its docstring placed after
import statements instead of immediately following the def line; move the
triple-quoted string so it is the first statement inside the def client(...)
function (i.e., place the docstring directly after the def line above the import
and return), keeping the function name client and preserving the TestClient(app)
return behavior.
- Around line 38-42: Remove the duplicated environment variable assignments for
REQUIRE_AUTHENTICATION and ENABLE_BACKEND_ACCESS_CONTROL that appear immediately
before importing the get_authenticated_user module; ensure only the initial
os.environ["REQUIRE_AUTHENTICATION"] = "false" and
os.environ["ENABLE_BACKEND_ACCESS_CONTROL"] = "false" (the ones already set
earlier in the file) remain so that the
importlib.import_module("cognee.modules.users.methods.get_authenticated_user")
call (gau_mod) still happens after the single, non-duplicated environment setup.
- Around line 161-164: The test contains a dead branch: the parameterized inputs
only supply GET endpoints so the `elif method == "POST"` branch (using
`client.post(endpoint, json={})`) will never run; either remove that POST branch
to simplify the test or update the test parameterization to include POST cases
and corresponding endpoint values; locate the method dispatch around `if method
== "GET": response = client.get(endpoint)` / `elif method == "POST": response =
client.post(endpoint, json={})` and apply the chosen fix (remove the POST branch
or add POST endpoints to the parameter list and adjust expected responses).
- Around line 24-36: Remove the unused pytest fixture mock_authenticated_user:
delete the function named mock_authenticated_user from the test module (it
returns a User and duplicates mock_default_user), and ensure no tests reference
mock_authenticated_user; keep existing mock_default_user intact for tests that
need a user object.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: ab4a5273-599c-44eb-8651-99ddbba25c1d

📥 Commits

Reviewing files that changed from the base of the PR and between 9c6d617 and 8b9630d.

📒 Files selected for processing (2)

.github/workflows/e2e_tests.yml
cognee/tests/api/test_conditional_authentication_endpoints.py

This reverts commit c630bd8.

## Description Add auto migration to LanceDB DataPoint schema ## Acceptance Criteria  ## Type of Change  - [ ] Bug fix (non-breaking change that fixes an issue) - [ ] New feature (non-breaking change that adds functionality) - [ ] Code refactoring - [ ] Other (please specify): ## Screenshots  ## Pre-submission Checklist  - [ ] **I have tested my changes thoroughly before submitting this PR** (See `CONTRIBUTING.md`) - [ ] **This PR contains minimal changes necessary to address the issue/feature** - [ ] My code follows the project's coding standards and style guidelines - [ ] I have added tests that prove my fix is effective or that my feature works - [ ] I have added necessary documentation (if applicable) - [ ] All new and existing tests pass - [ ] I have searched existing PRs to ensure this change hasn't been submitted already - [ ] I have linked any relevant issues in the description - [ ] My commits have clear and descriptive messages ## DCO Affirmation I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin.

…nee into release-candidate-v0.5.6

Aroool and others added 4 commits February 28, 2026 00:57

Fix #2191: sanitize embedding inputs before API call

3e121a4

Address CodeRabbit feedback for #2191

351f4e0

Merge branch 'dev' into fix-2191-embedding-sanitize

fbe94d4

chore: update versions

cc9452a

dexters1 self-assigned this Mar 27, 2026

dexters1 added the core-team label Mar 27, 2026

dexters1 changed the title ~~chore: update versions~~ Release candidate v0.5.6 - dev update Mar 27, 2026

coderabbitai bot reviewed Mar 27, 2026

View reviewed changes

dexters1 and others added 12 commits March 27, 2026 14:51

chore: format code

4b38042

feat: add embedding sanitization

f15d41f

test: update unit test

1e42ea0

Merge branch 'release-candidate-v0.5.6' into aroool-sanitize-embedding

182ba1b

refactor: sanitize text input for all embeding models

6cee1e7

Merge branch 'aroool-sanitize-embedding' of github.com:topoteretes/co…

f38c8fd

…gnee into aroool-sanitize-embedding

chore: add todo comment

67675b8

chore: add comments to Ollama embedding

b2269f6

chore: update tod

bf302c1

fix: resolve issue with conditional auth test

32d0903

refactor: comment out unit tests

0839f4d

dexters1 changed the title ~~Release candidate v0.5.6 - dev update~~ Release candidate v0.5.6 - dev branch Mar 28, 2026

mergify bot mentioned this pull request Mar 28, 2026

Release candidate v0.5.6 - main branch #2512

Merged

13 tasks

coderabbitai bot reviewed Mar 28, 2026

View reviewed changes

refactor: return removed test

9c6d617

coderabbitai bot reviewed Mar 28, 2026

View reviewed changes

Vasilije1990 self-requested a review March 28, 2026 17:18

Vasilije1990 previously approved these changes Mar 28, 2026

View reviewed changes

dexters1 added 3 commits March 30, 2026 12:17

feat: add automigrate for LanceDB

95cfc36

test: add schema migration test

69d23bc

fix: resolve conditional auth test issue

8b9630d

dexters1 dismissed Vasilije1990’s stale review via 8b9630d March 30, 2026 11:00

coderabbitai bot reviewed Mar 30, 2026

View reviewed changes

hajdul88 and others added 14 commits March 30, 2026 15:08

feat: adds back properties that are missing

8bd3590

feat: only expected properties assert

c630bd8

Revert "feat: only expected properties assert"

694814d

This reverts commit c630bd8.

Merge branch 'release-candidate-v0.5.6' into automigrate-lancedb-rows

4da584c

refactor: remove unnnecessary code

a612c79

Merge branch 'dev' into release-candidate-v0.5.6

4fa0257

Merge branch 'release-candidate-v0.5.6' into automigrate-lancedb-rows

f3d0c00

chore: update lock files

e517bf5

Merge branch 'release-candidate-v0.5.6' of github.com:topoteretes/cog…

37252c4

…nee into release-candidate-v0.5.6

Merge branch 'dev' into release-candidate-v0.5.6

f481842

chore: ruff format

9bb69cf

refactor: return inherited field

ac87032

chore: reduce lanceDB version for uv lock

c506d3a

dexters1 requested a review from Vasilije1990 March 30, 2026 22:00

Vasilije1990 merged commit 0e096b2 into dev Mar 30, 2026
427 of 431 checks passed

Vasilije1990 deleted the release-candidate-v0.5.6 branch March 30, 2026 22:03

	"cognee[postgres,docs,neo4j]==0.5.5",
	"cognee[postgres,docs,neo4j]==0.5.6",

Conversation

dexters1 commented Mar 27, 2026 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

Acceptance Criteria

Type of Change

Screenshots

Pre-submission Checklist

DCO Affirmation

Summary by CodeRabbit

Uh oh!

pull-checklist bot commented Mar 27, 2026

Please make sure all the checkboxes are checked:

Uh oh!

coderabbitai bot commented Mar 27, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Reviews paused

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Suggested labels

Suggested reviewers

❌ Failed checks (2 warnings)

Uh oh!

github-actions bot commented Mar 27, 2026

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Mar 27, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Mar 27, 2026

Choose a reason for hiding this comment

Uh oh!

gitguardian bot commented Mar 28, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

⚠️ GitGuardian has uncovered 2 secrets following the scan of your pull request.

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Mar 28, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Mar 28, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Mar 28, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

dexters1 commented Mar 27, 2026 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Mar 27, 2026 •

edited

Loading

gitguardian bot commented Mar 28, 2026 •

edited

Loading