Parallelization of Auto-Labeling Process #115

Buckler89 · 2024-01-13T10:29:56Z

Search before asking

I have searched the Autodistill issues and found no similar feature requests.

Description

I am writing to propose a feature enhancement related to the parallelization of the auto-labeling process within your project. Given the computationally intensive nature of auto-labeling, leveraging multiple GPUs could significantly improve efficiency and performance.

Feature Description:
The idea is to enable parallel processing for auto-labeling tasks by utilizing multiple GPUs. This would allow for the instantiation of multiple models (GroundingDINO or GroundedSAM) and enable each to operate on a separate device.

Potential Benefits:

Increased Efficiency: Parallel processing can reduce the time required for auto-labeling, especially for large datasets.
Scalability: This feature would make the tool more scalable, accommodating projects with varying resource availabilities.
Resource Optimization: By distributing the workload across multiple GPUs, each unit's computational capabilities are better utilized.

Suggested Implementation:

An option to specify the device to use when instantiating the Model.
Mechanisms for assigning different portions of the data or models to different GPUs.
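
As a rough illustration of the second point, here is a minimal sketch of how a dataset could be split across GPUs. It does not use any Autodistill API. The function names (`shard_round_robin`, `worker_env`, `plan_jobs`) are hypothetical, and pinning each worker to a GPU through the `CUDA_VISIBLE_DEVICES` environment variable is an assumed workaround, not something the library currently offers as an option.

```python
import os
from typing import Dict, List, Sequence, Tuple


def shard_round_robin(items: Sequence[str], num_shards: int) -> List[List[str]]:
    """Split items into num_shards lists by dealing them out round-robin."""
    if num_shards < 1:
        raise ValueError("num_shards must be at least 1")
    return [list(items[i::num_shards]) for i in range(num_shards)]


def worker_env(gpu_index: int) -> Dict[str, str]:
    """Copy the current environment and expose a single GPU to a worker process.

    Assumption: the worker runs a CUDA-based model, which then sees only
    this GPU (as device 0) because of CUDA_VISIBLE_DEVICES.
    """
    env = dict(os.environ)
    env["CUDA_VISIBLE_DEVICES"] = str(gpu_index)
    return env


def plan_jobs(
    image_paths: Sequence[str], gpu_indices: Sequence[int]
) -> List[Tuple[int, List[str]]]:
    """Pair each GPU index with its shard of images, skipping empty shards."""
    shards = shard_round_robin(image_paths, len(gpu_indices))
    return [(gpu, shard) for gpu, shard in zip(gpu_indices, shards) if shard]
```

Each `(gpu, shard)` pair could then be handed to a separate process started with `env=worker_env(gpu)`, for example through `subprocess.Popen`, where the worker script (hypothetical) instantiates one model and labels only the images in its shard. Using separate processes rather than threads keeps each model's GPU memory isolated, at the cost of loading the model weights once per GPU.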

I believe this enhancement could significantly contribute to the performance and scalability of the auto-labeling process in your project. I would be happy to discuss this further and contribute to its implementation.

Thank you for considering this proposal.

Use case

No response

Additional

No response

Are you willing to submit a PR?

Yes I'd like to help by submitting a PR!

capjamesg · 2024-01-16T16:00:34Z

Hello there! Thank you for creating this Issue. We would love to support parallelized auto-labeling. This is not currently on our roadmap, but if an external contributor submits a PR we will take a look and review! This would have to be done on a per-model basis since each model will need different logic to support parallelization (and some models may not support it).

capjamesg · 2024-05-20T14:54:07Z

Since nobody is actively working on this, I am going to close this issue for now. If anyone wants to work on parallelization for Autodistill, we would be excited to review any PRs and help bring the idea to fruition!

Buckler89 added the enhancement (New feature or request) label on Jan 13, 2024

capjamesg closed this as completed May 20, 2024
