
Understanding Neural Networks Through Deep Visualization

Recent years have produced great advances in training large, deep neural networks (DNNs), including notable successes in training convolutional neural networks (convnets) to recognize natural images. However, our understanding of how these models work, especially what computations they perform at intermediate layers, has lagged behind. Progress in the field will be further accelerated by the development of better tools for visualizing and interpreting neural nets. We introduce two such tools here. The first is a tool that visualizes the activations produced on each layer of a trained convnet as it processes an image or video (e.g. a live webcam stream). We have found that looking at live activations that change in response to user input helps build valuable intuitions about how convnets work. The second tool enables visualizing features at each layer of a DNN via regularized optimization in image space. Because previous versions of this idea produced less recognizable images, here we introduce several new regularization methods that combine to produce qualitatively clearer, more interpretable visualizations. Both tools are open source and work on a pre-trained convnet with minimal setup.
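The second tool's "regularized optimization in image space" is gradient ascent on the input pixels to maximize a unit's activation, with regularizers keeping the image interpretable. A toy numpy sketch of that idea, under the assumption that the unit is a fixed linear filter `w` standing in for a convnet neuron; of the paper's regularizers (L2 decay, Gaussian blur, clipping of small pixels and gradients), only L2 decay is shown:

```python
import numpy as np

rng = np.random.default_rng(0)
w = rng.normal(size=16)   # the unit's weights (assumed fixed)
x = np.zeros(16)          # "image"-space variable we optimize

lr, l2 = 0.1, 0.05        # step size and L2 decay strength
for _ in range(2000):
    grad = w - 2 * l2 * x  # d/dx [ w.x - l2 * ||x||^2 ]
    x += lr * grad         # gradient ascent in image space

# The regularized objective has a closed-form optimum x = w / (2 * l2),
# so the ascent converges toward a scaled copy of the unit's weights.
```

For a real convnet the gradient comes from backpropagation through the network to the pixels, and the extra regularizers (blur, clipping) are what make the resulting images qualitatively clearer.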

Article: Fast R-CNN – Region-based Convolutional Networks

This paper proposes Fast R-CNN, a clean and fast framework for object detection. Compared to traditional R-CNN and its accelerated version, SPPnet, Fast R-CNN trains networks using a multi-task loss in a single training stage. The multi-task loss simplifies learning and improves detection accuracy. Unlike SPPnet, all network layers can be updated during fine-tuning. We show that this difference has practical ramifications for very deep networks, such as VGG16, where mAP suffers when only the fully-connected layers are updated. Compared to "slow" R-CNN, Fast R-CNN is 9x faster at training VGG16 for detection, 213x faster at test-time, and achieves a significantly higher mAP on PASCAL VOC 2012. Compared to SPPnet, Fast R-CNN trains VGG16 3x faster, tests 10x faster, and is more accurate. Fast R-CNN is implemented in Python and C++ and is available under the open-source MIT License at this https URL.
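The multi-task loss combines a log loss over object classes with a smooth L1 loss over bounding-box regression targets, where the box term is applied only to non-background proposals. A minimal numpy sketch of that loss as described in the paper (function and variable names here are illustrative, not from the released code):

```python
import numpy as np

def smooth_l1(x):
    """Smooth L1 loss: 0.5*x^2 for |x| < 1, |x| - 0.5 otherwise.
    Less sensitive to outliers than the L2 loss used for regression in R-CNN."""
    a = np.abs(x)
    return np.where(a < 1.0, 0.5 * a ** 2, a - 0.5)

def multitask_loss(cls_probs, u, t_pred, t_true, lam=1.0):
    """L(p, u, t^u, v) = L_cls(p, u) + lam * [u >= 1] * L_loc(t^u, v).
    u = 0 denotes the background class, which has no box target."""
    l_cls = -np.log(cls_probs[u])  # log loss on the true class u
    l_loc = smooth_l1(np.asarray(t_pred) - np.asarray(t_true)).sum()
    return l_cls + lam * l_loc * float(u >= 1)
```

Because both terms are computed in one forward pass, the detector can be trained end-to-end in a single stage instead of the multi-stage pipeline of R-CNN/SPPnet.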

Thesis: Deep Learning Approaches to Problems in Speech Recognition, Computational Chemistry, and Natural Language Text Processing


Info
George E. Dahl
Ph.D. Thesis
2015
University of Toronto

The deep learning approach to machine learning emphasizes high-capacity, scalable models that learn distributed representations of their input. This dissertation demonstrates the efficacy and generality of this approach in a series of diverse case studies in speech recognition, computational chemistry, and natural language processing. Throughout these studies, I extend and modify the neural network models as needed to be more effective for each task.
In the area of speech recognition, I develop a more accurate acoustic model using a deep neural network. This model, which uses rectified linear units and dropout, improves word error rates on a 50 hour broadcast news task. A similar neural network results in a model for molecular activity prediction substantially more effective than production systems used in the pharmaceutical industry. Even though training assays in drug discovery are not typically very large, it is still possible to train very large models by leveraging data from multiple assays in the same model and by using effective regularization schemes.
In the area of natural language processing, I first describe a new restricted Boltzmann machine training algorithm suitable for text data. Then, I introduce a new neural network generative model of parsed sentences capable of generating reasonable samples and demonstrate a performance advantage for deeper variants of the model.
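The rectified-linear-plus-dropout combination mentioned for the acoustic model is easy to state concretely. A generic numpy sketch (not the thesis's code) of "inverted" dropout, the common formulation that scales surviving units at train time so no rescaling is needed at test time:

```python
import numpy as np

def relu(x):
    """Rectified linear unit: max(0, x) elementwise."""
    return np.maximum(x, 0.0)

def dropout(x, p=0.5, train=True, rng=None):
    """Inverted dropout: zero each unit with probability p during training
    and scale survivors by 1/(1-p), so the expected activation matches
    the (identity) behavior at test time."""
    if not train or p == 0.0:
        return x
    rng = rng or np.random.default_rng()
    mask = rng.random(x.shape) >= p
    return x * mask / (1.0 - p)
```

Dropout acts as a strong regularizer, which is one way large models can be trained even when, as the thesis notes for drug-discovery assays, individual training sets are not very large.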
