Question: Role of Specific Document Classes (AudioDocument, PdfDocument) in Cognee #1605

LStromann · 2025-10-22T17:48:40Z

LStromann
Oct 22, 2025
Collaborator

I am looking at the data flow, specifically: Ingestion -> ... -> Classify documents -> ... -> Extract chunks -> ...

My understanding is that during the Ingestion step, various file types (such as PDF, audio, and image files) are saved into Cognee’s storage and then converted to text.

My confusion arises at the Classify documents step. It appears that this step classifies the converted text as a TextDocument. This has led me to wonder if the specific classes—such as PdfDocument, AudioDocument, and ImageDocument—are being utilized in this part of the flow.

I would like to know the intended role of these specific document classes. Are they used in a different part of the process, or am I perhaps misunderstanding a key aspect of their functionality that I haven't seen?

This discussion was automatically pulled from Discord.

LStromann · 2025-10-22T17:50:26Z

LStromann
Oct 22, 2025
Collaborator Author

We've changed our architecture so that all documents are initially converted to text by different Loaders (which handle PDF, CSV, and other conversions) in the add pipeline, so only TextDocuments are now used in Cognify.

Previously, this conversion happened during Cognify, using these different Document classes.
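
To illustrate the idea, here is a minimal sketch of loaders converting files to text up front, so that the later classification step only ever sees text. It is not Cognee's actual code: `LOADERS`, `load_plain_text`, `load_csv`, `add`, and `classify` are hypothetical names, and the `TextDocument` dataclass here is a stand-in for the real class.

```python
import csv
import io
from dataclasses import dataclass
from pathlib import Path
from typing import Callable, Dict


@dataclass
class TextDocument:
    """Stand-in for the single document type used after conversion (hypothetical)."""
    name: str
    text: str


def load_plain_text(raw: bytes) -> str:
    """Hypothetical loader: decode a text file as UTF-8."""
    return raw.decode("utf-8")


def load_csv(raw: bytes) -> str:
    """Hypothetical loader: flatten CSV rows into lines of text."""
    rows = csv.reader(io.StringIO(raw.decode("utf-8")))
    return "\n".join(", ".join(row) for row in rows)


# Hypothetical registry mapping a file extension to the loader that handles it.
LOADERS: Dict[str, Callable[[bytes], str]] = {
    ".txt": load_plain_text,
    ".csv": load_csv,
}


def add(name: str, raw: bytes) -> str:
    """Sketch of the add step: pick a loader by extension and convert to text."""
    loader = LOADERS.get(Path(name).suffix.lower())
    if loader is None:
        raise ValueError(f"no loader registered for {name!r}")
    return loader(raw)


def classify(name: str, text: str) -> TextDocument:
    """Sketch of the classify step: the input is already text, so the
    result is always a TextDocument regardless of the original file type."""
    return TextDocument(name=name, text=text)


if __name__ == "__main__":
    text = add("table.csv", b"a,b\n1,2\n")
    print(classify("table.csv", text))
```

In this sketch the original file type only matters when choosing a loader; by the time classification runs, a PDF, a CSV, and a plain text file would all look the same.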


LStromann · 2025-10-22T17:50:31Z

LStromann
Oct 22, 2025
Collaborator Author

Thank you for the clarification. That makes perfect sense.

