Deep learning with convolutional neural networks (deep ConvNets) has revolutionized computer vision through end-to-end learning, i.e. learning from the raw data. Now, there is increasing interest in using deep ConvNets for end-to-end EEG analysis. However, little is known about many important aspects of how to design and train ConvNets for end-to-end EEG decoding, and there is still a lack of techniques to visualize the informative EEG features the ConvNets learn. Here, we studied deep ConvNets with a range of different architectures, designed for decoding imagined or executed movements from raw EEG. Our results show that recent advances from the machine learning field, including batch normalization and exponential linear units, together with a cropped training strategy, boosted the deep ConvNets decoding performance, reaching or surpassing that of the widely-used filter bank common spatial patterns (FBCSP) decoding algorithm. While FBCSP is designed to use spectral power modulations, the features used by ConvNets are not fixed a priori. Our novel methods for visualizing the learned features demonstrated that ConvNets indeed learned to use spectral power modulations in the alpha, beta and high gamma frequencies. These methods also proved useful as a technique for spatially mapping the learned features, revealing the topography of the causal contributions of features in different frequency bands to decoding the movement classes. Our study thus shows how to design and train ConvNets to decode movement-related information from the raw EEG without handcrafted features and highlights the potential of deep ConvNets combined with advanced visualization techniques for EEG-based brain mapping.