The workshop will take place in Mission Ballroom MR2, Santa Clara Convention Center on the 1st of September 2022.
|8.55 - 9.00||Opening Remarks||Tianyi Chen (RPI)|
|9.00 - 9.40|| Keynote #1
ProxSkip: Yes! Local Gradient Steps Provably Lead to Communication Acceleration! Finally! [abstract] [slides]
We introduce ProxSkip, a surprisingly simple and provably efficient method for minimizing the sum of a smooth ($f$) and an expensive nonsmooth proximable ($\psi$) function. The canonical approach to solving such problems is via the proximal gradient descent (ProxGD) algorithm, which is based on the evaluation of the gradient of $f$ and the prox operator of $\psi$ in each iteration. In this work we are specifically interested in the regime in which the evaluation of prox is costly relative to the evaluation of the gradient, which is the case in many applications. ProxSkip allows for the expensive prox operator to be skipped in most iterations: while its iteration complexity is $\mathcal{O}(\kappa \log(1/\varepsilon))$, where $\kappa$ is the condition number of $f$, the number of prox evaluations is only $\mathcal{O}(\sqrt{\kappa} \log(1/\varepsilon))$. Our main motivation comes from federated learning, where evaluation of the gradient operator corresponds to taking a local GD step independently on all devices, and evaluation of prox corresponds to (expensive) communication in the form of gradient averaging. In this context, ProxSkip offers an effective acceleration of communication complexity. Unlike other local gradient-type methods, such as FedAvg, SCAFFOLD, S-Local-GD and FedLin, whose theoretical communication complexity is worse than, or at best matching, that of vanilla GD in the heterogeneous data regime, we obtain a provable and large improvement without any heterogeneity-bounding assumptions.
|Peter Richtarik (KAUST)|
|9.40 - 10.20|| Keynote #2
Three Daunting Challenges of Federated Learning: Privacy Leakage, Label Deficiency, and Resource Constraints [abstract]
Federated learning (FL) has emerged as a promising approach to enable decentralized machine learning directly at the edge, in order to enhance users’ privacy, comply with regulations, and reduce development costs. In this talk, I will provide an overview of FL and highlight three fundamental challenges for putting FL into practice: (1) privacy and security guarantees for FL; (2) label scarcity at the edge; and (3) FL over resource-constrained edge nodes. I will also provide a brief overview of FedML ( https://fedml.ai ), a platform that enables zero-code, lightweight, cross-platform, and provably secure federated learning and analytics.
|Salman Avestimehr (USC)|
|10.20 - 11.00|| Keynote #3
Federated Learning for EdgeAI: New Ideas and Opportunities for Progress [abstract]
EdgeAI aims at the widespread deployment of AI on edge devices. To this end, a critical requirement of future ML systems is to enable on-device automated training and inference in distributed settings, wherever and whenever data, devices, or users are present, without sending the training (possibly sensitive) data to the cloud or incurring long response times. Starting from these overarching considerations, we consider on-device distributed learning, the hardware it runs on, and their co-design to allow for efficient federated learning and resource-aware deployment on edge devices. We hope to convey the excitement of working in this problem space that brings together topics in ML, optimization, communications, and application-hardware (co-)design.
|Radu Marculescu (UT Austin)|
|11.00 - 11.40|| Keynote #4
Model Based Deep Learning with Applications to Federated Learning [abstract]
Deep neural networks provide unprecedented performance gains in many real-world problems in signal and image processing. Despite these gains, the future development and practical deployment of deep networks are hindered by their black-box nature, i.e., a lack of interpretability and the need for very large training sets. On the other hand, signal processing and communications have traditionally relied on classical statistical modeling techniques that utilize mathematical formulations representing the underlying physics, prior information, and additional domain knowledge. Simple classical models are useful but sensitive to inaccuracies and may lead to poor performance when real systems display complex or dynamic behavior. Here we introduce various approaches to model-based learning, which merge parametric models with optimization tools and classical algorithms, leading to efficient, interpretable networks trained on reasonably sized training sets. We then show how model-based signal processing can impact federated learning, both in terms of communication efficiency and in terms of convergence properties. We will consider examples in image deblurring, super-resolution in ultrasound and microscopy, efficient communication systems, and efficient diagnosis of COVID-19 using X-ray and ultrasound.
|Yonina Eldar (Weizmann)|
|11.40 - 13.40|| Live Demo Session on FedML
A tutorial followed by a live demo (an interactive session in which participants run FL on the FedML platform)
|Chaoyang He (FedML)|
|13.40 - 14.20|| Keynote #5
Scalable, Heterogeneity-Aware and Trustworthy Federated Learning [abstract]
Federated learning has become a popular distributed machine learning paradigm for developing edge AI applications. However, the data residing across edge devices is intrinsically statistically heterogeneous (i.e., non-IID), and edge devices usually have limited communication bandwidth for transferring local updates. Statistical heterogeneity and limited communication are two major bottlenecks that hinder applying federated learning in practice. In addition, recent works have demonstrated that sharing model updates makes federated learning vulnerable to inference attacks and model poisoning attacks. In this talk, we will present our recent work on novel federated learning frameworks that address the scalability and heterogeneity issues simultaneously. We will also reveal the underlying causes of privacy leakage and model poisoning attacks in federated learning, and present corresponding defense mechanisms for trustworthy federated learning.
|Yiran Chen (Duke)|
|14.20 - 15.00|| Keynote #6
On Lower Bounds of Distributed Learning with Communication Compression [abstract] [slides]
There have been many recent works proposing new compressors for various distributed optimization settings. However, all state-of-the-art performance analyses rely on one of only two compressor properties: unbiasedness or contraction. This leads to a natural question: if we want to improve the convergence rate of distributed optimization with communication compression, should we continue using those properties and focus on how to apply them more cleverly in distributed algorithms, or should we look for new compressor properties? To answer this question, we present theoretical performance lower bounds imposed by those two properties and then show that the lower bounds are nearly matched by a method that works with any compressor satisfying one of those two properties. Hence, future work should look for a fundamentally new compressor property. This is joint work with Xinmeng Huang (UPenn), Yiming Chen (Alibaba), and Kun Yuan (Alibaba).
|Wotao Yin (Alibaba Damo)|
|15.00 - 17.00|| Poster Session and Best Student Poster Competition
Best Paper Award:
|17.00 - 17.05||Closing Remarks||TPC|