[Feature] Load dataset distributedly in the single-controller mode #1111

@garrett4wade

Checklist

  • This feature will maintain backward compatibility with the current APIs in
    areal/api/. If not, please raise a refactor issue first.

Background

In single-controller mode, the dataset is loaded in the centralized controller process, which is extremely expensive.

Potential Solution

Create a DataController and a set of distributed workers that load and process data in multiple processes. The worker processes send RTensors back to the controller, while the actual tensor data is transferred directly between the underlying workers.
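To make the proposal concrete, here is a minimal sketch of the handle-passing pattern described above. It is not AReaL's actual API: `TensorHandle` is a hypothetical stand-in for an RTensor, `DataWorker` and its methods are invented for illustration, and the workers are simulated as in-process objects rather than real processes. The point it illustrates is that the controller only ever holds lightweight metadata, while the loaded payload stays on the worker that produced it.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class TensorHandle:
    """Hypothetical stand-in for an RTensor: metadata only, no payload."""
    worker_id: int
    key: int
    shape: Tuple[int, ...]


class DataWorker:
    """Hypothetical worker. In a real deployment this would be a separate process."""

    def __init__(self, worker_id: int, num_workers: int) -> None:
        self.worker_id = worker_id
        self.num_workers = num_workers
        self._store: Dict[int, List[int]] = {}  # payloads stay local to the worker

    def load_shard(self, sample_ids: List[str]) -> List[TensorHandle]:
        handles = []
        # Strided sharding: worker i takes samples i, i + N, i + 2N, ...
        for idx in range(self.worker_id, len(sample_ids), self.num_workers):
            # Toy "load and process" step standing in for real I/O and tokenization.
            tokens = [ord(c) for c in sample_ids[idx]]
            self._store[idx] = tokens
            handles.append(TensorHandle(self.worker_id, idx, (len(tokens),)))
        return handles

    def fetch(self, handle: TensorHandle) -> List[int]:
        # Called by whichever peer needs the data, not by the controller's hot path.
        return self._store[handle.key]


class DataController:
    """Hypothetical controller that coordinates workers and keeps only handles."""

    def __init__(self, num_workers: int) -> None:
        self.workers = [DataWorker(i, num_workers) for i in range(num_workers)]

    def load(self, sample_ids: List[str]) -> List[TensorHandle]:
        handles: List[TensorHandle] = []
        for worker in self.workers:
            handles.extend(worker.load_shard(sample_ids))
        # Restore the original sample order across shards.
        return sorted(handles, key=lambda h: h.key)

    def resolve(self, handle: TensorHandle) -> List[int]:
        # Route a handle to the worker that owns its payload.
        return self.workers[handle.worker_id].fetch(handle)


controller = DataController(num_workers=2)
handles = controller.load(["ab", "cde", "f"])
```

In this sketch the controller passes only sample identifiers and receives only handles, so its memory and CPU cost stays proportional to the metadata rather than to the dataset size. A real implementation would replace the in-process `DataWorker` objects with remote workers and move payloads between them directly.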
