[Feature Proposal] (New data partition strategy) Extended Dirichlet strategy #337

Open
liyipeng00 opened this issue Nov 3, 2023 · 4 comments
Labels
enhancement New feature or request

Comments

@liyipeng00

liyipeng00 commented Nov 3, 2023

Recently, I came across a new data partition strategy called the Extended Dirichlet strategy (it's ours :)), which could be added to this repo.

It combines two common partition strategies (quantity-based class imbalance and distribution-based class imbalance in Li et al. (2022)) to generate arbitrarily heterogeneous data. The difference is an added step that allocates classes (labels) to clients, fixing the number of classes per client (denoted by $C$), before samples are allocated via a Dirichlet distribution (with concentration parameter $\alpha$).
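
The two steps can be sketched roughly as follows. This is only a minimal illustration, not the code from the repository: the function name `exdir_partition_sketch`, its parameters, and the cyclic class assignment used in step 1 are assumptions made for the example, and the actual implementation may allocate classes differently.

```python
import numpy as np


def exdir_partition_sketch(labels, num_clients, classes_per_client, alpha, seed=0):
    """Hypothetical two-step partition: allocate classes, then allocate samples."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    num_classes = int(labels.max()) + 1
    if not 1 <= classes_per_client <= num_classes:
        raise ValueError("classes_per_client must be in [1, num_classes]")

    # Step 1 (assumed rule): give client i the C consecutive classes starting
    # at i, wrapping around, so every client holds exactly C distinct classes.
    client_classes = [
        [(i + j) % num_classes for j in range(classes_per_client)]
        for i in range(num_clients)
    ]

    # Step 2: for each class, split its samples among the clients that hold it,
    # with proportions drawn from a symmetric Dirichlet(alpha).
    client_indices = [[] for _ in range(num_clients)]
    for k in range(num_classes):
        holders = [i for i, cs in enumerate(client_classes) if k in cs]
        if not holders:
            continue  # no client holds class k; its samples are dropped here
        idx = rng.permutation(np.flatnonzero(labels == k))
        proportions = rng.dirichlet(alpha * np.ones(len(holders)))
        cuts = (np.cumsum(proportions)[:-1] * len(idx)).astype(int)
        for client, part in zip(holders, np.split(idx, cuts)):
            client_indices[client].extend(part.tolist())
    return client_classes, client_indices
```

For example, `exdir_partition_sketch(labels, num_clients=10, classes_per_client=2, alpha=1.0)` returns the classes assigned to each client and the sample indices each client receives. In this sketch, a smaller `alpha` makes the split of each class among its holders more uneven, and a smaller `classes_per_client` restricts each client to fewer labels.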

The implementation is in the convergence repo. You can find more details in the paper "Convergence Analysis of Sequential Federated Learning on Heterogeneous Data".
[Figure:
Row 1: $C=2$ with $\alpha=0.1$, $\alpha=1.0$, $\alpha=10.0$;
Row 2: $C=5$ with $\alpha=0.1$, $\alpha=1.0$, $\alpha=10.0$;
Row 3: $C=10$ with $\alpha=0.1$, $\alpha=1.0$, $\alpha=10.0$; ]

Li, Q., Diao, Y., Chen, Q., & He, B. (2022, May). Federated learning on non-iid data silos: An experimental study. In 2022 IEEE 38th International Conference on Data Engineering (ICDE) (pp. 965-978). IEEE.

@AgentDS AgentDS added the enhancement New feature or request label Nov 3, 2023
@AgentDS
Member

AgentDS commented Nov 3, 2023

We will check your code. Thank you very much!

@AgentDS AgentDS changed the title (New data partition strategy) Extended Dirichlet strategy [Feature Proposal] (New data partition strategy) Extended Dirichlet strategy Nov 3, 2023
@liyipeng00
Author

liyipeng00 commented Nov 3, 2023

Thanks. We are glad to hear from you. The code is in ExDirPartition, and you can generate the partition map with the following command (you will need to change the dataset location).

python partition.py -d mnist -n 10 --partition exdir -C 1 --alpha 1.0 

@AgentDS
Member

AgentDS commented Nov 4, 2023

Interesting work!

@liyipeng00
Author

Thanks, =^_^=.
