CausalCNN

What are causal convolutions?

https://arxiv.org/pdf/1609.03499.pdf

The word causal comes from signal processing, in particular from the characterization of filters. Signals are functions of time and/or space. Filters are functions that remove certain aspects of a signal, leaving only the features you are interested in (e.g. certain frequencies, or the positions of certain patterns). Linear filters are filters where, at each point in time and/or space, the output is a weighted sum/integral of the input, i.e. a convolution. A filter is called causal if its output does not depend on future inputs.
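As a concrete illustration, here is a minimal NumPy sketch of a causal linear filter (the two-tap kernel is arbitrary): each output is a weighted sum of the current and earlier inputs only.

```python
import numpy as np

def causal_filter(x, h):
    """y[t] = sum_k h[k] * x[t - k], treating x[t - k] as 0 for t - k < 0."""
    y = np.zeros(len(x))
    for t in range(len(x)):
        for k in range(len(h)):
            if t - k >= 0:
                y[t] += h[k] * x[t - k]
    return y

x = np.array([1.0, 2.0, 3.0, 4.0])
h = np.array([0.5, 0.25])        # taps for x[t] and x[t-1]
print(causal_filter(x, h))       # [0.5  1.25 2.   2.75]
# Equivalent one-liner: np.convolve(x, h)[:len(x)]
```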

In WaveNet, the acoustic sample the network produces at time step t must depend only on data before t. When the network generates new audio it obviously cannot look at future samples, since they have not been produced yet. During training the future samples are available, but if the filters were allowed to use them, the trained network could not be used to generate new data.

There are two ways of implementing a causal filter in deep learning frameworks (both sketched below). The simplest is to mask the parts of the filter kernel that are concerned with future input, by setting them to zero at each SGD update, but that is quite expensive: about half of the multiplications and additions go to waste. A more efficient way is to shift and pad the signal by the kernel size and then undo the shifting (which relies on the translation-equivariance property of convolution).
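A minimal PyTorch sketch of the masking approach (the class name is illustrative). Instead of re-zeroing the weights after every SGD update, this variant multiplies the kernel by a fixed 0/1 mask in the forward pass, which has the same effect:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MaskedCausalConv1d(nn.Conv1d):
    """'Same'-padded Conv1d whose kernel taps on future samples are masked to zero."""

    def __init__(self, in_ch, out_ch, kernel_size):
        assert kernel_size % 2 == 1, "sketch assumes an odd kernel size"
        super().__init__(in_ch, out_ch, kernel_size, padding=kernel_size // 2)
        mask = torch.ones(kernel_size)
        mask[kernel_size // 2 + 1:] = 0.0   # zero the taps that would see the future
        self.register_buffer("mask", mask.view(1, 1, -1))

    def forward(self, x):
        # The masked taps are still multiplied and summed, which is the waste
        # mentioned above; the mask merely guarantees they contribute zero.
        return F.conv1d(x, self.weight * self.mask, self.bias,
                        padding=self.padding[0])
```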
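And a sketch of the more efficient shift-and-pad approach, again in PyTorch with illustrative names: the input is padded on the left (past) side only, so an ordinary unpadded convolution becomes causal and no kernel weights are wasted.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PaddedCausalConv1d(nn.Conv1d):
    """Causal Conv1d via asymmetric (left-only) padding."""

    def __init__(self, in_ch, out_ch, kernel_size):
        super().__init__(in_ch, out_ch, kernel_size, padding=0)

    def forward(self, x):
        # Left-pad by kernel_size - 1 so the output at t sees x[t-K+1 .. t].
        x = F.pad(x, (self.kernel_size[0] - 1, 0))
        return super().forward(x)

x = torch.randn(1, 1, 16)                       # (batch, channels, time)
y = PaddedCausalConv1d(1, 1, kernel_size=3)(x)
print(y.shape)                                  # torch.Size([1, 1, 16])
```

WaveNet stacks dilated versions of this idea; with dilation d the left padding becomes d * (kernel_size - 1).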