Functional magnetic resonance imaging (fMRI) produces a small number of samples in a high-dimensional vector space, which is hardly adequate for brain decoding tasks. In this study, we propose an architecture that combines autoencoders with a temporal convolutional neural network, aiming to reduce feature dimensionality while improving classification performance. The proposed network learns temporal representations of voxel intensities at each layer by leveraging unlabeled fMRI data with regularized autoencoders. The learned temporal representations capture the temporal regularities of the fMRI data and are observed to form an expressive bank of activation patterns. A temporal convolutional neural network with spatial pooling layers then reduces the dimensionality of the learned representations. With the proposed method, raw input fMRI data are mapped to a low-dimensional feature space in which the final classification is performed. In addition, a simple decorrelated representation approach is proposed for tuning the model hyperparameters. The proposed method is tested on a ten-class recognition memory experiment with nine subjects. The results support the efficiency and potential of the proposed model compared to baseline multi-voxel pattern analysis techniques.
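
To make the data flow concrete, the following is a minimal sketch of one temporal convolution stage followed by spatial pooling, written in plain Python. It is an illustration under stated assumptions, not the implementation used in this study: the function names, the toy voxel data, the pool size, and the hand-picked filter values are all hypothetical. In the proposed method the temporal filters would be learned from unlabeled fMRI data by regularized autoencoders; here they are fixed by hand only to show the shapes involved.

```python
# Illustrative sketch only. Filter values, pool size, and all function names
# are assumptions made for this example, not the study's implementation.

def temporal_conv(series, kernel):
    """Slide one temporal filter over one voxel's time series (valid mode), then apply ReLU."""
    k = len(kernel)
    out = []
    for t in range(len(series) - k + 1):
        s = sum(series[t + i] * kernel[i] for i in range(k))
        out.append(max(0.0, s))
    return out

def spatial_max_pool(maps, pool):
    """Take the max over non-overlapping groups of `pool` adjacent voxels at each time step."""
    pooled = []
    for start in range(0, len(maps) - pool + 1, pool):
        group = maps[start:start + pool]
        pooled.append([max(vals) for vals in zip(*group)])
    return pooled

def extract_features(voxels, filters, pool):
    """Apply every temporal filter to every voxel, pool across voxels, and flatten."""
    features = []
    for kernel in filters:
        maps = [temporal_conv(v, kernel) for v in voxels]
        for row in spatial_max_pool(maps, pool):
            features.extend(row)
    return features

# Toy input: 4 voxels x 5 time points (hypothetical values).
voxels = [
    [0.0, 1.0, 2.0, 1.0, 0.0],
    [0.0, 0.0, 1.0, 0.0, 0.0],
    [1.0, 1.0, 1.0, 1.0, 1.0],
    [2.0, 1.0, 0.0, 1.0, 2.0],
]
# Hand-picked stand-ins for learned filters: a temporal difference and a temporal average.
filters = [[1.0, -1.0], [0.5, 0.5]]

features = extract_features(voxels, filters, pool=2)
print(len(features), features)
```

In this toy setting the 20 raw values are mapped to 16 features. The reduction comes from the spatial pooling step, so larger pool sizes or stacked stages would shrink the representation further, at the cost of spatial resolution.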