feat: add __len__ support to LightningDataModule #21546

Open
dhruvildarji wants to merge 1 commit into Lightning-AI:master from dhruvildarji:feat/datamodule-len
dhruvildarji commented Feb 23, 2026

Closes #5965.

Adds __len__ support to LightningDataModule so users can call len(datamodule).

Default implementation: returns the length of train_dataloader().dataset. If the training dataloader is not configured, or its dataset does not implement __len__, it raises a TypeError whose message directs users to override __len__ in their subclass.
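To illustrate the fallback behavior described above, here is a minimal sketch in plain Python. It is not the code from this PR: `SketchDataModule`, `FakeLoader`, `ListDataModule`, and `IterableOnly` are hypothetical stand-ins so the example runs without Lightning installed, and the set of exceptions treated as "not configured" is an assumption.

```python
class FakeLoader:
    """Hypothetical stand-in for a DataLoader; it only exposes `.dataset`."""

    def __init__(self, dataset):
        self.dataset = dataset


class SketchDataModule:
    """Hypothetical stand-in for LightningDataModule (not the PR's code)."""

    def train_dataloader(self):
        # Assumption: an unconfigured dataloader signals this by raising.
        raise NotImplementedError("train_dataloader is not configured")

    def __len__(self) -> int:
        try:
            # Delegate to the training dataset's own __len__.
            return len(self.train_dataloader().dataset)
        except (NotImplementedError, AttributeError, TypeError) as err:
            # Surface every failure mode as a TypeError with a hint.
            raise TypeError(
                f"{type(self).__name__} has no usable length; "
                "override __len__ in your subclass."
            ) from err


class ListDataModule(SketchDataModule):
    """Data module whose training dataset is any object passed in."""

    def __init__(self, items):
        self.items = items

    def train_dataloader(self):
        return FakeLoader(self.items)


class IterableOnly:
    """Dataset that can be iterated but defines no __len__."""

    def __iter__(self):
        return iter(())


print(len(ListDataModule(range(5))))  # prints 5

try:
    len(ListDataModule(IterableOnly()))
except TypeError as err:
    print(f"TypeError: {err}")
```

Converting every failure into a TypeError matches what Python's built-in len() raises for objects without a length, so callers can handle a single exception type.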

Usage:

# Use the default implementation (returns train dataset length)
datamodule = MyDataModule()
datamodule.setup("fit")
print(len(datamodule))  # e.g., 50000

# Or override for custom behavior
class MyDataModule(LightningDataModule):
    def __len__(self) -> int:
        return len(self.train_dataset)

📚 Documentation preview 📚: https://pytorch-lightning--21546.org.readthedocs.build/en/21546/

@github-actions github-actions bot added the pl Generic label for PyTorch Lightning package label Feb 23, 2026
Development

Successfully merging this pull request may close these issues.

support len(datamodule)

1 participant