December 18th (full day): Invited talks at The Chinese University of Hong Kong, Shenzhen. December 19th (full day) and December 20th (morning): Invited talks at the conference hall at Songshan Lake, Dongguan. On the night of December 18th, there will be a banquet for all fully registered members, arranged by the Tsinghua-Berkeley Shenzhen Institute.
Administration Building, 2/F W201, CUHK(Shenzhen)
|9:30-10:15||Byzantine Consensus Through the Lens of Information Theory||David Tse|
|10:15-11:00||Reinforcement Learning, Bit by Bit||Benjamin Van Roy|
|11:00-11:45||Beyond Trans-dimensional Sampling: Generalised Bayesian Model Selection||Ercan Engin Kuruoğlu|
|13:30-14:15||Biomedical NLP: Machine Learning and Beyond||Fei Xia|
|14:15-15:00||Natural Language Representation and Decoding||Chengqing Zong|
|15:30-16:15||A General Framework for Representing and Annotating Multifaceted Cell Heterogeneity in Human Cell Atlas||Xuegong Zhang|
|16:15-17:00||Team Playing under Uncertainties||David Gesbert|
Songshan Lake, Dongguan
|9:00||Take Shuttle Bus to Songshan Lake, Dongguan|
|13:30-14:15||Gradient Statistics Aware Power Control for Over-the-Air Federated Learning||Meixia Tao|
|14:15-15:00||Anticipated Learning Machine for Time-series Prediction||Luonan Chen|
|15:00-15:45||Precoding in Wireless Communications and Satellite Systems||Björn Ottersten|
|16:15-17:00||On Copula-Based Multiuser Performance Bounds – the Road to Ultra-High Reliability||Eduard A. Jorswieck|
|17:00-17:45||An Overview on Enabling Technologies for 6G||Erik G. Larsson|
Songshan Lake, Dongguan
|9:15-10:00||Exploring Stochastic Methods for Deep Learning and Reinforcement Learning||Zaiwen Wen|
|10:30-11:15||Nonparametric Perspective on Deep Learning||Guang Cheng|
|11:15-12:00||Federated Matrix Factorization: Algorithm Design and Applications||Tsung-Hui Chang|
|15:00||Take Shuttle Bus Back to Shenzhen|
Abstract: The study of the fundamental limits of communication in the presence of noise was initiated by Shannon in 1948. The study of the fundamental limits of distributed consensus in the presence of faults was initiated by Lamport, Pease and Shostak in the early 1980s. Both fields study the problem of designing optimal reliable systems, yet there has been surprisingly little cross-fertilization of ideas in the past 40 years. In this talk, we give an example of the utility of such cross-fertilization by using an analogy from network information theory to formulate and solve a central problem in blockchains: the availability-finality dilemma.
Abstract: Leveraging creative algorithmic concepts and advances in computation, achievements of reinforcement learning in simulated systems have captivated the world's imagination. The path to AI agents that learn from real experience counts on major advances in data efficiency. Should we try to carefully quantify information generated through and extracted from experience so that we can design agents that uncover and digest information as efficiently as possible? I will discuss research that aims to take us in this direction.
Tsinghua-Berkeley Shenzhen Institute
Abstract: Most model estimation methods in signal processing and machine learning assume that the model type and dimension are known beforehand, so that the estimation problem reduces to estimating the parameter values of a model of known class and dimension. However, in various applications we do not have this prior information: we may not know in advance how many clusters there are in our data, how many sources there are in a speech-signal mixture in the cocktail-party problem, or how many targets there are in a radar signal. The Bayesian Monte Carlo method, namely Reversible Jump Markov Chain Monte Carlo (RJMCMC), provides a solution to the estimation of model dimension. However, the more general problem of estimating the type of model remains. For example, in a channel estimation problem we generally do not know the distribution of the noise, and in a system identification problem we do not know beforehand whether the system is linear or nonlinear. In this talk, we will present a broader picture of the model selection problem and extend the RJMCMC algorithm to a trans-class sampling algorithm that is capable of choosing among different generic models automatically. We will demonstrate the success of the method on problems such as Volterra system identification, PLC noise modelling, speckle classification in Synthetic Aperture Radar and Ultrasound images, and in the wavelet transform coefficients of natural images.
University of Washington
Abstract: In the past decade, the Natural Language Processing (NLP) field has undergone tremendous changes due to the rapid advance of neural networks (NNs). While NNs have produced impressive results on many NLP tasks, their limitations are well known and NNs alone cannot solve all NLP problems; thus, it is crucial for the NLP field not to neglect other important research questions. This talk will use several biomedical NLP projects as examples to demonstrate that building high-quality NLP systems for the biomedical domain often requires a better understanding of the task itself, deep analysis of the data, and the input of domain experts at various stages of medical NLP research and development.
University of Chinese Academy of Sciences
Abstract: Distributed representations are the basis for natural language processing methods based on deep learning. Obtaining high-quality semantic representations is of great significance for downstream natural language processing tasks, while exploring the human brain's semantic encoding and decoding processes is very useful for studying semantic representation and computational methods. This talk will introduce the team's recent research on language representation and on the neural decoding performance of different representation models, and will explore the process of language encoding and decoding in the human brain and its modeling methods.
Abstract: The goal of the Human Cell Atlas (HCA) is to build a collection of maps that comprehensively define and describe all cell types and their molecular features in healthy and diseased people. Just as maps have different coordinates, providing coordinate systems is one of the key tasks in building the HCA. Current efforts on this task have been devoted to building the Common Coordinate Framework (CCF), which defines the positions of individual cells in a reference human body at different levels. A CCF enables multiscale, 3D exploration of the human body and is the key to spatially aligning tissue samples from different people. To better understand the highly orchestrated function and organization of different cells, it is essential to combine auxiliary information such as spatial location, sex, and race with the endogenous attributes of cells, such as cell types/states, developmental trajectories, and biological functions, that are encoded in transcriptomics and other omics data. Here we propose to go beyond geometric coordinate systems and build a multidimensional coordinate system for the different physical and biological attributes of cells. As a first step toward this goal, we introduce UniCoord, a general deep-learning-based framework for representing multifaceted cell heterogeneity in the human cell atlas. UniCoord adopts a supervised VAE structure to learn the mapping between gene expression and a low-dimensional coordinate space that encodes different cell attributes. We conducted several experiments on real datasets. The results show that UniCoord can represent various kinds of cell heterogeneity, including discrete, continuous, and hierarchical structures. The trained UniCoord model can be used to automatically label various attributes of cells and to generate the corresponding expression data. These results validate UniCoord as a feasible coordinate framework for representing and illustrating multifaceted cell heterogeneity in the HCA.
Abstract: Cooperation is an essential function in a wide array of network scenarios, including wireless, robotics, transport, and beyond. In decentralized networks, cooperation (or team play) must be achieved by agents despite uncertainties in the global state of the network. Cooperation in the presence of information uncertainties is a highly challenging problem for which no systematic optimization solution exists. In this talk, we describe different mechanisms for solving it, from information-theoretic to machine-learning based. In the machine learning approach to this problem, we introduce so-called Team Deep Learning Networks (Team-DNN), in which agents learn to coordinate with each other under uncertainties. In the communication domain, we show how devices can learn to message each other relevant information and take appropriate transmission decisions, possibly under the control of a meta-expert, so as to optimize network performance.
Shanghai Jiao Tong University
Abstract: Federated learning (FL) is a promising technique that enables many edge devices to train a machine learning model collaboratively in wireless networks. By exploiting the superposition nature of wireless waveforms, over-the-air computation (AirComp) can accelerate model aggregation and hence facilitate communication-efficient FL. Due to channel fading, power control is crucial in AirComp. In this talk, I will consider the power control problem for over-the-air FL by taking gradient statistics, which have been overlooked before, into account. In particular, we find that the relative dispersion of the gradients at each round of the SGD-based training algorithm, i.e., the squared multivariate coefficient of variation (SMCV), plays a key role in power control, and that it varies over iterations. Experimental results demonstrate that the proposed gradient-statistics-aware power control can achieve higher test accuracy than existing schemes across a wide range of scenarios.
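For concreteness, a squared multivariate coefficient of variation can be sketched as the ratio of the total variance of the per-device gradients to the squared norm of the mean gradient. This is a minimal sketch assuming that common definition; the exact normalization used in the talk, and the `smcv` helper name, are not from the abstract.

```python
import numpy as np

def smcv(grads):
    """Squared multivariate coefficient of variation of a set of
    gradient vectors (one row per device): total per-coordinate
    variance divided by the squared norm of the mean gradient.
    A large value means the device gradients are highly dispersed
    relative to their average direction."""
    g = np.asarray(grads, dtype=float)
    mean = g.mean(axis=0)            # average gradient across devices
    total_var = g.var(axis=0).sum()  # total dispersion around the mean
    return total_var / np.dot(mean, mean)
```

When all devices report the same gradient the SMCV is zero; nearly orthogonal gradients push it up, which is the regime where, per the abstract, power control should behave differently.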
University of Chinese Academy of Sciences
Abstract: We develop a novel machine learning framework, the anticipated learning machine or auto-reservoir computing framework, to efficiently and accurately make multi-step-ahead predictions from a short-term, high-dimensional time series. Different from traditional reservoir computing, whose reservoir is an external dynamical system unrelated to the target system, auto-reservoir computing directly uses the observed high-dimensional dynamics as its reservoir, mapping the high-dimensional/spatial data to the future temporal values of a target variable via our spatiotemporal information (STI) transformation. Thus, multi-step prediction of the target variable is achieved in an accurate and computationally efficient manner. Auto-reservoir computing has been successfully applied to both representative models and real-world datasets, all of which show satisfactory performance in multi-step-ahead prediction, even when the data are perturbed by noise and when the system is time-varying. In effect, such a transformation equivalently expands the sample size and thus has great potential for practical applications in artificial intelligence and machine learning. This framework has also been successfully applied to the analysis and prediction of various biological systems.
University of Luxembourg
Abstract: The advent of spatial processing in multiantenna wireless communications has transformed the design of mobile networks, allowing us to meet the tremendous demand for data and services in mobile applications. Signal processing techniques implemented in baseband transceivers enable efficient exploitation of the radio spectrum, increased gain to improve coverage, more reliable and secure transmissions, and improved energy efficiency. We will focus on the challenges of spatial precoding techniques implemented on the transmit side of wireless systems. Early developments in transmit beamforming and spatial division multiple access will be reviewed, as well as more recent developments in symbol-level precoding. The path from early results in the academic literature to experimental validation, standardization, and commercialization will be emphasized.
Technische Universität Braunschweig
Abstract: Reliability in wireless communications can be improved by diversity in space, frequency, and time. Use cases in beyond-5G networks require high reliability and will employ massive numbers of antennas, huge spectrum resources, and multi-connectivity. It is well known that correlation can reduce the outage performance of diversity schemes. However, the impact of general dependency structures (so-called copulas) on the performance of diversity schemes is less studied. In this talk, we provide an approach to model and analyze dependent fading multi-user systems. On the bright side, there is the optimistic result that negatively dependent (counter-monotonic) fading channels lead to positive zero-outage capacities. On the downside, the framework provides worst-case bounds on the joint distributions of fading channels for various multi-user settings. First, the scenario with two fading channels is studied; it is then generalized to an arbitrary number of channels. The focus is on the outage performance of slowly fading channels. As an outlook, we also provide insights into the average performance of fast-fading multiuser channels. All results are illustrated with numerical examples, comparisons to standard correlation models, and applications in beyond-5G networks.
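The "bright side" result can be illustrated numerically. The sketch below assumes Exp(1) fading gains coupled by the standard counter-monotonic (inverse-CDF) construction and simple gain combining; the channel models and combining schemes in the talk may differ.

```python
import numpy as np

# Counter-monotonic Exp(1) fading gains via the inverse-CDF
# construction: g1 is driven by U, g2 by 1-U, so one gain is
# large exactly when the other is small.
rng = np.random.default_rng(0)
u = rng.uniform(size=100_000)
g1 = -np.log(1.0 - u)   # Exp(1) gain
g2 = -np.log(u)         # Exp(1) gain, counter-monotonic with g1

# The combined gain g1 + g2 = -ln(u(1-u)) never drops below
# ln 4, since u(1-u) <= 1/4. A scheme that adds the two gains
# therefore has a strictly positive zero-outage rate.
worst_combined_gain = (g1 + g2).min()
```

By contrast, with independent Exp(1) gains both channels can fade simultaneously, the combined gain comes arbitrarily close to zero, and the zero-outage capacity vanishes.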
Abstract: In this talk, an overview on enabling technologies for 6G will be given, which includes cell-free massive MIMO and massive/grant-free access.
Abstract: Stochastic methods are widely used in machine learning. In this talk, we present a structured stochastic quasi-Newton method and a sketchy empirical natural gradient method for deep learning. We also introduce a stochastic quadratic penalty algorithm for reinforcement learning.
The Chinese University of Hong Kong, Shenzhen
Abstract: Models built with deep neural networks (DNNs) can handle complicated real-world data extremely well, without suffering from the curse of dimensionality or from non-convex optimization. To contribute to the theoretical understanding of deep learning, we investigate the nonparametric aspects of DNNs by addressing the following questions: (i) what kind of data can be best learned by deep neural networks? (ii) can deep neural networks achieve statistical optimality? (iii) is there any algorithmic guarantee for obtaining such optimal neural networks? Our theoretical analysis applies to the two most fundamental setups in practice: regression and classification.
The Chinese University of Hong Kong, Shenzhen
Abstract: Recent demands for data privacy have called for federated learning (FL) as a new distributed learning paradigm in massive and heterogeneous networks. Although many FL algorithms have been proposed, few of them consider the matrix factorization (MF) model, which is known to have a vast number of signal processing and machine learning applications. Different from existing FL algorithms, which are designed for smooth problems with a single block of variables, in federated MF (FedMF) we have to deal with challenging non-convex and non-smooth problems (due to constraints or regularization) with two blocks of variables, in the presence of non-i.i.d. data. In this talk, we address this challenge by first proposing a new FedMF algorithm, FedMAvg, based on the model averaging principle, for a general MF model. We then restrict attention to the friendly Frobenius cost function and propose a gradient-sharing-based FedMF algorithm, FedMGS, which enjoys great robustness against heterogeneous data distributions. Both FedMAvg and FedMGS adopt multiple local update steps per communication round to speed up convergence, and allow only a randomly sampled subset of clients to communicate with the server to reduce communication costs. Convergence of the two algorithms is analyzed, delineating the impact of the data distribution, the number of local updates, and partial client communication on algorithm performance. Focusing on a data clustering task, extensive experimental results are presented to examine the practical performance of both algorithms and to demonstrate their efficacy over existing distributed clustering algorithms.