feat(baselines) Add flwr baseline fedht #4398

Open · wants to merge 95 commits into main

Conversation


@chancejohnstone commented Oct 30, 2024

Issue

Adding the Fed-HT (and FedIter-HT) baseline originally introduced in:

Tong, Q., Liang, G., Zhu, T. and Bi, J., 2020. Federated nonconvex sparse learning. arXiv preprint arXiv:2101.00052.

Description

Fed-HT (and FedIter-HT) are aggregation strategies for generating highly predictive models under a hard sparsity constraint (controlled by a sparsity threshold parameter), rather than encouraging sparsity only through regularization. The goal of this baseline is to implement Fed-HT and FedIter-HT as custom Flower aggregation strategies and apply them to the simulated and benchmark datasets from the paper. A rough sketch of the server-side idea follows.
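
For reviewers skimming the thread, here is a minimal sketch of that idea as a custom Flower strategy. This is not the PR's implementation: the class name, the `num_keep` sparsity level, and the global (rather than per-layer) thresholding are all illustrative assumptions.

```python
from typing import Dict, List, Optional, Tuple

import numpy as np
from flwr.common import (
    FitRes,
    Parameters,
    Scalar,
    ndarrays_to_parameters,
    parameters_to_ndarrays,
)
from flwr.server.client_proxy import ClientProxy
from flwr.server.strategy import FedAvg


def hard_threshold(arrays: List[np.ndarray], num_keep: int) -> List[np.ndarray]:
    """Zero all but the num_keep largest-magnitude entries across all arrays."""
    flat = np.concatenate([a.ravel() for a in arrays])
    if num_keep < flat.size:
        # Magnitude of the num_keep-th largest entry (ties may keep a few extra).
        cutoff = np.partition(np.abs(flat), -num_keep)[-num_keep]
        flat[np.abs(flat) < cutoff] = 0.0
    out, start = [], 0
    for a in arrays:
        out.append(flat[start : start + a.size].reshape(a.shape))
        start += a.size
    return out


class FedHTSketch(FedAvg):
    """FedAvg aggregation followed by server-side hard thresholding (illustrative)."""

    def __init__(self, num_keep: int, **kwargs):
        super().__init__(**kwargs)
        self.num_keep = num_keep

    def aggregate_fit(
        self,
        server_round: int,
        results: List[Tuple[ClientProxy, FitRes]],
        failures,
    ) -> Tuple[Optional[Parameters], Dict[str, Scalar]]:
        # Standard weighted averaging, then project onto the sparsity constraint.
        params, metrics = super().aggregate_fit(server_round, results, failures)
        if params is not None:
            arrays = parameters_to_ndarrays(params)
            params = ndarrays_to_parameters(hard_threshold(arrays, self.num_keep))
        return params, metrics
```

Roughly, FedIter-HT additionally applies the thresholding step during the clients' local updates rather than only after server aggregation.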

Related issues/PRs

This PR is related to: Add Flower Baseline: FedHT #3987

Proposal

Proposing to merge this local fork into the Flower repository so the fedht baseline is included in future Flower versions.

Checklist

  • Implement proposed change
  • Write tests
  • Update documentation
  • Make CI checks pass
  • Ping maintainers on Slack (channel #contributions)

Comments

Results from the paper are not reproduced exactly, since the hyperparameters used to generate the figures are not identified in the paper. Waiting on a response from the original authors.

@chancejohnstone (Author) commented, quoting review feedback from @jafermarq:

> Hi @chancejohnstone,
>
> I left a few suggestions below.
>
> I tried running the four Simulation II experiments as described in the README.md, but I got different results:
>
> 1. For FedAvg, the final centralised loss was around 0.65-0.70 (higher than the ~0.4 in the plot).
> 2. For the second command, using Fed-HT, training did not go well and the final centralised loss was ~1.6 (much higher than the ~0.5 shown in the plot).
> 3. For the third command, with FedIter-HT, the final centralised loss was ~0.75 (instead of ~0.3 as suggested by the plot).
> 4. The final command did the best, with a final centralised loss of ~0.5. However, is this performing the Dist-IHT shown in the figure? There is no apparent difference w.r.t. the second command (other than changing num_local_epochs from 5 to 1).

@jafermarq thanks for the feedback! I'll make sure the figures and results match for each of the experiments; I think the figures may have been generated with earlier experiment iterations. Also, Dist-IHT is identified in the paper as performing Fed-HT with one local epoch. I'm working to address the other comments as well.
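
For what it's worth, under that reading Dist-IHT falls out of the sketch above simply by fixing a single local epoch per round, e.g. via FedAvg's `on_fit_config_fn` hook. The `num_keep` value and the `num_local_epochs` config key below are illustrative, not the baseline's actual option names.

```python
# Dist-IHT as a special case of Fed-HT: a single local epoch per round.
# The config key mirrors the option discussed above; the baseline's actual
# client-side key may differ.
strategy = FedHTSketch(
    num_keep=50,  # illustrative sparsity level
    on_fit_config_fn=lambda server_round: {"num_local_epochs": 1},
)
```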

@chancejohnstone (Author) commented Dec 4, 2024

@jafermarq I think I've resolved the reproducibility issues with Simulation II; the visuals in the README have been updated to reflect this. I've also removed Dist-IHT, since it's a special case of Fed-HT.
