December 18th (full day): Invited talks at The Chinese University of Hong Kong, Shenzhen. December 19th (full day) and December 20th (morning): Invited talks at the conference hall at Songshan Lake, Dongguan. On the night of December 18th, there will be a banquet for all fully registered members, arranged by the Tsinghua-Berkeley Shenzhen Institute.
Administration Building, 2/F W201, CUHK(Shenzhen)
|9:30-10:15||Byzantine Consensus Through the Lens of Information Theory||David Tse|
|10:15-11:00||Reinforcement Learning, Bit by Bit||Benjamin Van Roy|
|11:00-11:45||Beyond Trans-dimensional Sampling: Generalised Bayesian Model Selection||Ercan Engin Kuruoğlu|
|13:30-14:15||Biomedical NLP: Machine Learning and Beyond||Fei Xia|
|14:15-15:00||Natural Language Representation and Decoding||Chengqing Zong|
|15:30-16:15||A General Framework for Representing and Annotating Multifaceted Cell Heterogeneity in Human Cell Atlas||Xuegong Zhang|
|16:15-17:00||Team Playing under Uncertainties||David Gesbert|
Songshan Lake, Dongguan
|9:00||Take Shuttle Bus to Songshan Lake, Dongguan|
|13:30-14:15||Gradient Statistics Aware Power Control for Over-the-Air Federated Learning||Meixia Tao|
|14:15-15:00||Anticipated Learning Machine for Time-series Prediction||Luonan Chen|
|15:00-15:45||Precoding in Wireless Communications and Satellite Systems||Björn Ottersten|
|16:15-17:00||On Copula-Based Multiuser Performance Bounds – the Road to Ultra-High Reliability||Eduard A. Jorswieck|
|17:00-17:45||An Overview on Enabling Technologies for 6G||Erik G. Larsson|
Songshan Lake, Dongguan
|9:15-10:00||Exploring Stochastic Methods for Deep Learning and Reinforcement Learning||Zaiwen Wen|
|10:30-11:15||Nonparametric Perspective on Deep Learning||Guang Cheng|
|11:15-12:00||Federated Matrix Factorization: Algorithm Design and Applications||Tsung-Hui Chang|
|15:00||Take Shuttle Bus Back to Shenzhen|
Abstract: The study of the fundamental limits of communication in the presence of noise was initiated by Shannon in 1948. The study of the fundamental limits of distributed consensus in the presence of faults was initiated by Lamport, Pease and Shostak in the early 1980s. Both fields study the problem of designing optimal reliable systems, yet there has been surprisingly little cross-fertilization of ideas in the past 40 years. In this talk, we give an example of the utility of such cross-fertilization by using an analogy from network information theory to formulate and solve a central problem in blockchains: the availability-finality dilemma.
Abstract: Leveraging creative algorithmic concepts and advances in computation, achievements of reinforcement learning in simulated systems have captivated the world's imagination. The path to AI agents that learn from real experience counts on major advances in data efficiency. Should we try to carefully quantify information generated through and extracted from experience so that we can design agents that uncover and digest information as efficiently as possible? I will discuss research that aims to take us in this direction.
Tsinghua-Berkeley Shenzhen Institute
Abstract: Most model estimation methods in signal processing and machine learning assume that the model type and dimension are known beforehand, so that the estimation problem reduces to estimating the parameter values of a model of known class and dimension. However, in various applications we do not have this prior information: we may not know in advance how many clusters there are in our data, how many sources there are in a speech-signal mixture in the cocktail-party problem, or how many targets there are in a radar signal. The Bayesian Monte Carlo method, namely Reversible Jump Markov Chain Monte Carlo (RJMCMC), provides a solution to the estimation of model dimension. However, the more general problem of estimating the type of model remains. For example, in a channel estimation problem we generally do not know the distribution of the noise, and in a system identification problem we do not know beforehand whether the system is linear or nonlinear. In this talk, we will present a broader picture of the model selection problem and extend the RJMCMC algorithm to a trans-class sampling algorithm that is capable of choosing among different generic models automatically. We will demonstrate the success of the method on problems such as Volterra system identification, PLC noise modelling, speckle classification in Synthetic Aperture Radar and Ultrasound images, and in the wavelet transform coefficients of natural images.
University of Washington
Abstract: In the past decade, the Natural Language Processing (NLP) field has undergone tremendous changes due to the rapid advance of neural networks (NNs). While NNs have produced impressive results on many NLP tasks, their limitations are well known and NNs alone cannot solve all NLP problems; thus, it is crucial for the NLP field not to neglect other important research questions. This talk will use several biomedical NLP projects as examples to demonstrate that building high-quality NLP systems for the biomedical domain often requires a better understanding of the task itself, deep analysis of the data, and the input of domain experts at various stages of medical NLP research and development.
University of Chinese Academy of Sciences
Abstract: Distributed representations are the basis for natural language processing methods based on deep learning. Obtaining high-quality semantic representations is of great significance for downstream natural language processing tasks, while exploring the human brain's semantic encoding and decoding processes is very useful for studying semantic representation and computational methods. This talk will introduce the team's recent research on language representation and on the neural decoding performance of different representation models, and will explore the process of language encoding and decoding in the human brain and its modeling methods.
Abstract: The goal of the Human Cell Atlas (HCA) is to build a collection of maps that comprehensively define and describe all cell types and their molecular features in healthy and diseased people. Just as maps have different coordinates, providing coordinate systems is one of the key tasks in building the HCA. Current efforts on this task have been devoted to building the Common Coordinate Framework (CCF), which defines the positions of individual cells in a reference human body at different levels. A CCF enables multiscale, 3D exploration of the human body and is the key to spatially aligning tissue samples from different people. To better understand the highly orchestrated function and organization of different cells, it is essential to combine auxiliary information such as spatial location, sex, and race with the endogenous attributes of cells, such as cell types/states, developmental trajectories, and biological functions, that are encoded in transcriptomics and other omics data. Here we propose to go beyond geometric coordinate systems and build a multidimensional coordinate system for the different physical and biological attributes of cells. As a first step toward this goal, we introduce UniCoord, a general deep-learning-based framework for representing multifaceted cell heterogeneity in the human cell atlas. UniCoord adopts a supervised VAE structure to learn the mapping between gene expression and a low-dimensional coordinate space that encodes different cell attributes. We conducted several experiments on real datasets. The results show that UniCoord can represent various kinds of cell heterogeneity, including discrete, continuous, and hierarchical structures. The trained UniCoord model can be used to automatically label various attributes of cells and to generate the corresponding expression data. These results validate UniCoord as a feasible coordinate framework for representing and illustrating multifaceted cell heterogeneity in the HCA.
Abstract: Cooperation is an essential function in a wide array of network scenarios, including wireless, robotics, transport, and beyond. In decentralized networks, cooperation (or team play) must be achieved by agents despite uncertainties in the global state of the network. Cooperation in the presence of information uncertainties is a highly challenging problem for which no systematic optimization solution exists. In this talk, we describe different mechanisms for solving it, from information-theoretic to machine-learning based. In the machine learning approach to this problem, we introduce so-called Team Deep Learning Networks (Team-DNN), in which agents learn to coordinate with each other under uncertainties. In the communication domain, we show how devices can learn to message each other relevant information and take appropriate transmission decisions, possibly under the control of a meta-expert, so as to optimize network performance.
Shanghai Jiao Tong University
Abstract: Federated learning (FL) is a promising technique that enables many edge devices to train a machine learning model collaboratively in wireless networks. By exploiting the superposition nature of wireless waveforms, over-the-air computation (AirComp) can accelerate model aggregation and hence facilitate communication-efficient FL. Due to channel fading, power control is crucial in AirComp. In this talk, I will consider the power control problem for over-the-air FL by taking gradient statistics, which have been overlooked before, into account. In particular, we find that the relative dispersion of the gradients at each round of the SGD-based training algorithm, i.e., the squared multivariate coefficient of variation (SMCV), plays a key role in power control, and that it varies over iterations. Experimental results demonstrate that the proposed gradient-statistics-aware power control can achieve higher test accuracy than existing schemes across a wide range of scenarios.
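For concreteness, a squared multivariate coefficient of variation can be sketched as the ratio of the total variance of the per-device gradients to the squared norm of the mean gradient. This is a minimal sketch assuming that common definition; the exact normalization used in the talk, and the `smcv` helper name, are not from the abstract.

```python
import numpy as np

def smcv(grads):
    """Squared multivariate coefficient of variation of a set of
    gradient vectors (one row per device): total per-coordinate
    variance divided by the squared norm of the mean gradient.
    A large value means the device gradients are highly dispersed
    relative to their average direction."""
    g = np.asarray(grads, dtype=float)
    mean = g.mean(axis=0)            # average gradient across devices
    total_var = g.var(axis=0).sum()  # total dispersion around the mean
    return total_var / np.dot(mean, mean)
```

When all devices report the same gradient the SMCV is zero; nearly orthogonal gradients push it up, which is the regime where, per the abstract, power control should behave differently.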
University of Chinese Academy of Sciences
Abstract: We develop a novel machine learning framework, the anticipated learning machine or auto-reservoir computing framework, to efficiently and accurately make multi-step-ahead predictions from a short-term, high-dimensional time series. Different from traditional reservoir computing, whose reservoir is an external dynamical system unrelated to the target system, auto-reservoir computing directly uses the observed high-dimensional dynamics as its reservoir, mapping the high-dimensional/spatial data to the future temporal values of a target variable via our spatiotemporal information (STI) transformation. Thus, multi-step prediction of the target variable is achieved in an accurate and computationally efficient manner. Auto-reservoir computing has been successfully applied to both representative models and real-world datasets, all of which show satisfactory performance in multi-step-ahead prediction, even when the data are perturbed by noise and when the system is time-varying. In effect, such a transformation equivalently expands the sample size and thus has great potential for practical applications in artificial intelligence and machine learning. This framework has also been successfully applied to the analysis and prediction of various biological systems.
University of Luxembourg
Abstract: The advent of spatial processing in multiantenna wireless communications has transformed the design of mobile networks, allowing us to meet the tremendous demand for data and services in mobile applications. Signal processing techniques implemented in baseband transceivers enable efficient exploitation of the radio spectrum, increased gain to improve coverage, more reliable and secure transmissions, and improved energy efficiency. We will focus on the challenges of spatial precoding techniques implemented on the transmit side of wireless systems. Early developments in transmit beamforming and spatial division multiple access will be reviewed, as well as more recent developments in symbol-level precoding. The path from early results in the academic literature to experimental validation, standardization, and commercialization will be emphasized.
Technische Universität Braunschweig
Abstract: Reliability in wireless communications can be improved by diversity in space, frequency, and time. Use cases in beyond-5G networks require high reliability and will employ massive numbers of antennas, huge spectrum resources, and multi-connectivity. It is well known that correlation can reduce the outage performance of diversity schemes. However, the impact of general dependency structures (so-called copulas) on the performance of diversity schemes is less studied. In this talk, we provide an approach to model and analyze dependent fading multi-user systems. On the bright side, there is the optimistic result that negatively dependent (counter-monotonic) fading channels lead to positive zero-outage capacities. On the downside, the framework provides worst-case bounds on the joint distributions of fading channels for various multi-user settings. First, the scenario with two fading channels is studied; it is then generalized to an arbitrary number of channels. The focus is on the outage performance of slowly fading channels. As an outlook, we also provide insights into the average performance of fast-fading multiuser channels. All results are illustrated with numerical examples, comparisons to standard correlation models, and applications in beyond-5G networks.
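The "bright side" result can be illustrated numerically. The sketch below assumes Exp(1) fading gains coupled by the standard counter-monotonic (inverse-CDF) construction and simple gain combining; the channel models and combining schemes in the talk may differ.

```python
import numpy as np

# Counter-monotonic Exp(1) fading gains via the inverse-CDF
# construction: g1 is driven by U, g2 by 1-U, so one gain is
# large exactly when the other is small.
rng = np.random.default_rng(0)
u = rng.uniform(size=100_000)
g1 = -np.log(1.0 - u)   # Exp(1) gain
g2 = -np.log(u)         # Exp(1) gain, counter-monotonic with g1

# The combined gain g1 + g2 = -ln(u(1-u)) never drops below
# ln 4, since u(1-u) <= 1/4. A scheme that adds the two gains
# therefore has a strictly positive zero-outage rate.
worst_combined_gain = (g1 + g2).min()
```

By contrast, with independent Exp(1) gains both channels can fade simultaneously, the combined gain comes arbitrarily close to zero, and the zero-outage capacity vanishes.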
Abstract: In this talk, an overview on enabling technologies for 6G will be given, which includes cell-free massive MIMO and massive/grant-free access.
Abstract: Stochastic methods are widely used in machine learning. In this talk, we present a structured stochastic quasi-Newton method and a sketchy empirical natural gradient method for deep learning. We also introduce a stochastic quadratic penalty algorithm for reinforcement learning.
The Chinese University of Hong Kong, Shenzhen
Abstract: Models built with deep neural networks (DNNs) can handle complicated real-world data extremely well, without suffering from the curse of dimensionality or from non-convex optimization. To contribute to the theoretical understanding of deep learning, we investigate the nonparametric aspects of DNNs by addressing the following questions: (i) what kind of data can be best learned by deep neural networks? (ii) can deep neural networks achieve statistical optimality? (iii) is there any algorithmic guarantee for obtaining such optimal neural networks? Our theoretical analysis applies to the two most fundamental setups in practice: regression and classification.
The Chinese University of Hong Kong, Shenzhen
Abstract: Recent demands for data privacy have called for federated learning (FL) as a new distributed learning paradigm in massive and heterogeneous networks. Although many FL algorithms have been proposed, few of them consider the matrix factorization (MF) model, which is known to have a vast number of signal processing and machine learning applications. Different from existing FL algorithms, which are designed for smooth problems with a single block of variables, in federated MF (FedMF) we have to deal with challenging non-convex and non-smooth problems (due to constraints or regularization) with two blocks of variables, in the presence of non-i.i.d. data. In this talk, we address this challenge by first proposing a new FedMF algorithm, FedMAvg, based on the model averaging principle, for a general MF model. We then restrict attention to the friendly Frobenius cost function and propose a gradient-sharing-based FedMF algorithm, FedMGS, which enjoys great robustness against heterogeneous data distributions. Both FedMAvg and FedMGS adopt multiple local update steps per communication round to speed up convergence, and allow only a randomly sampled subset of clients to communicate with the server to reduce communication costs. Convergence of the two algorithms is analyzed, delineating the impact of the data distribution, the number of local updates, and partial client communication on algorithm performance. Focusing on a data clustering task, extensive experimental results are presented to examine the practical performance of both algorithms and to demonstrate their efficacy over existing distributed clustering algorithms.