Label and feature skew Partitioner #3146

WilliamLindskog · 2024-03-14T10:12:10Z

Describe the type of feature and its functionality.

Hi there,

I've checked the documentation for datasets and open PRs and I think these partitioners would be helpful.

As in the NIID-Bench baseline, there is a partition strategy where each client receives data from a fixed number of unique labels, i.e. a label_quantity_partitioner (only applicable to classification tasks). For such a partitioner, one should be able to specify how many labels each client is allotted; this number must be less than or equal to the number of unique labels in the dataset.
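To make the idea concrete, here is a minimal NumPy sketch of such a partitioner. It is not an existing Flower Datasets API: the function name `label_quantity_partition` and its parameters are hypothetical. The sketch assumes each client draws its label subset uniformly at random and that the samples of a label are split evenly among the clients that drew it.

```python
import numpy as np


def label_quantity_partition(labels, num_clients, labels_per_client, seed=0):
    """Return one index array per client (hypothetical helper, not a library API).

    Each client only receives samples from `labels_per_client` distinct labels.
    """
    labels = np.asarray(labels)
    classes = np.unique(labels)
    if not 1 <= labels_per_client <= len(classes):
        raise ValueError(
            "labels_per_client must be between 1 and the number of unique labels"
        )
    rng = np.random.default_rng(seed)

    # Assumption: every client picks its labels uniformly at random,
    # without replacement.
    client_classes = [
        rng.choice(classes, size=labels_per_client, replace=False)
        for _ in range(num_clients)
    ]

    client_indices = [[] for _ in range(num_clients)]
    for c in classes:
        owners = [cid for cid, cls in enumerate(client_classes) if c in cls]
        if not owners:
            continue  # no client drew this label; its samples stay unassigned
        # Shuffle the samples of this label and split them evenly among owners.
        idx = rng.permutation(np.flatnonzero(labels == c))
        for owner, chunk in zip(owners, np.array_split(idx, len(owners))):
            client_indices[owner].extend(chunk.tolist())

    return [np.array(ix, dtype=np.int64) for ix in client_indices]
```

With this design a label that no client happens to draw is left out entirely, so a real implementation would probably need a policy for that case.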

Another partition strategy is found in the original paper: a feature distribution partition based on Gaussian noise. Specifically, given a user-defined noise level σ, we would add noise x̂ ∼ Gau(σ · i/N) for party P_i (with N the number of parties), where Gau(σ · i/N) is a Gaussian distribution with mean 0 and variance σ · i/N.
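A rough sketch of the noise step, to show what the partitioner would apply to each party's features. The function name `add_feature_noise` and its parameters are hypothetical, and the per-party seeding is an assumption made only so the example is reproducible. Note that NumPy's `normal` takes a standard deviation, so the variance σ · i/N is converted with a square root.

```python
import numpy as np


def add_feature_noise(features, partition_id, num_partitions, sigma, seed=0):
    """Add zero-mean Gaussian noise with variance sigma * i / N to party i's features.

    Hypothetical helper for illustration; not part of any existing library.
    """
    if sigma < 0:
        raise ValueError("sigma must be non-negative")
    if not 0 <= partition_id <= num_partitions:
        raise ValueError("partition_id must lie in [0, num_partitions]")
    features = np.asarray(features, dtype=np.float64)

    # Variance grows linearly with the party index, as in the formula above.
    variance = sigma * partition_id / num_partitions

    # Assumption: a separate random stream per party, derived from a base seed.
    rng = np.random.default_rng(seed + partition_id)
    noise = rng.normal(loc=0.0, scale=np.sqrt(variance), size=features.shape)
    return features + noise
```

Parties with a higher index get noisier features, and a party with index 0 (if indexing starts at zero) would keep its features unchanged.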

What do you think?

Describe step by step what files and adjustments are you planning to include.

There would be a need to create two new partitioners:

Label quantity partitioner
Gaussian noise partitioner

And also test scripts for these.

Is there something else you want to add?

N/A

adam-narozniak · 2024-03-20T09:08:38Z

Hi @WilliamLindskog
Thanks for writing the issue. We want to support both of them.
Regarding the first Partitioner, I informally call it the ClassConstrain Partitioner (I think some people call it pathological, but I've seen that name used in a different context, too). It has also been used in other work. It will be supported shortly and is a current priority among the partitioning schemes. (There has even been an attempt to add it based on the implementation in the FedProx paper, though that does not generalize well; a heuristic was also used there for the class choice, but we'll move to a purely probabilistic approach.)

Regarding the second Partitioner: I'll move to it either directly after ClassConstrain is done, or after one more quantity-skew partitioner that works similarly to ClassConstrain but additionally assigns a small number of samples from other classes (I'm not sure yet how it'll be parameterized, whether as a percentage or as raw numbers). In ClassConstrain, by contrast, those other classes have zero samples.

I'll keep you updated. Also, please let me know if you have other partitioning schemes you think we should add and would like to use.

WilliamLindskog added the "feature request" label · Mar 14, 2024
