Deep Learning Seminar Notes

Introduction

Seymour Papert 1966 Summer Vision Project – goal was object identification
Rules for everyday problems are difficult to write down.
Tasks are extremely difficult.
Machine learning applications – machine translation – face recognition – road hazard detection – syntax parsing – object detection – image recognition
Vision – ImageNet – ImageNet: A Large-Scale Hierarchical Image Database – J. Deng et al. (2009), from Stanford – academic competition focused on predicting 1000 object classes (1.2M images)
ImageNet 2010 – SVM
ImageNet 2011 – SVM
ImageNet 2012 – University of Toronto deep convolutional neural network (Hinton?) – won by a large margin – every subsequent entry is a neural network.
fine-grained classification, generalization, sensible errors – Inception v3 architecture

Deep learning for vision

AlexNet, SuperVision
ImageNet Classification with Deep Convolutional Neural Networks
Krizhevsky, Sutskever, Hinton 2012
multilayer perceptron trained with backpropagation, known since the 1980s
backpropagation applied to handwritten zip code recognition
winning network contained 60M parameters
achieving scale in compute and data is critical – large academic data sets – SIMD hardware (GPUs, SSE instruction sets)
untangling invariant object recognition DiCarlo and Cox (2007)
Hierarchical composition of simple mathematical functions
loosely inspired by visual pathway of the brain
neuron – perceptron is a toy model
Rosenblatt 1958 perceptron
weighted sum of inputs run through a nonlinearity
one of the best nonlinearities is just max(0, z)
NN -> y = f(f(…))
output of the network is a real-valued vector
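A minimal sketch of the perceptron and the composition y = f(f(…)) noted above, in plain Python. The bias term, the hand-picked weights, and the function names are assumptions made for illustration; they are not from the talk. Real networks often leave the final layer without a nonlinearity, which this toy version does not do.

```python
def relu(z):
    """Nonlinearity from the notes: max(0, z)."""
    return max(0.0, z)

def perceptron(weights, bias, inputs):
    """Weighted sum of inputs (plus an assumed bias) run through the nonlinearity."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return relu(z)

def layer(weight_rows, biases, inputs):
    """One layer: several perceptrons reading the same inputs."""
    return [perceptron(w, b, inputs) for w, b in zip(weight_rows, biases)]

def network(layers, inputs):
    """Compose layers so that y = f(f(...f(x)))."""
    activations = inputs
    for weight_rows, biases in layers:
        activations = layer(weight_rows, biases, activations)
    return activations

# Hypothetical two-layer network with hand-picked weights.
toy_layers = [
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, -1.0]),  # 2 inputs -> 2 hidden units
    ([[1.0, 2.0]], [0.5]),                     # 2 hidden units -> 1 output
]
output = network(toy_layers, [2.0, 1.0])  # a real-valued vector of length 1
```

The output is a list of real numbers, one per unit in the last layer, matching the note that the network emits a real-valued vector.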
Step 1: make the output probabilistic – softmax function – normalize into a distribution
Step 2: one-hot encoding of the label. The correct class has probability 1
KL divergence – cross entropy loss
then compute derivatives of the loss with respect to the weights – gradient descent
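The steps above can be sketched in plain Python as follows. For a one-hot target the KL divergence and the cross entropy coincide, because the target distribution has zero entropy. To keep the example short, gradient descent here updates the logits of a single example directly, as a stand-in for the weights; the learning rate and label are hypothetical values.

```python
import math

def softmax(logits):
    """Step 1: normalize real-valued scores into a probability distribution."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - shift) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def one_hot(label, num_classes):
    """Step 2: target distribution with probability 1 on the correct class."""
    return [1.0 if k == label else 0.0 for k in range(num_classes)]

def cross_entropy(target, predicted):
    """H(target, predicted) = -sum_k target_k * log(predicted_k)."""
    return -sum(t * math.log(p) for t, p in zip(target, predicted) if t > 0.0)

def loss_and_grad(logits, label):
    """Loss and its derivative with respect to the logits (softmax minus one-hot)."""
    probs = softmax(logits)
    target = one_hot(label, len(logits))
    grad = [p - t for p, t in zip(probs, target)]
    return cross_entropy(target, probs), grad

# Toy gradient descent on the logits themselves (assumption: no weights involved).
logits = [0.0, 0.0, 0.0]
learning_rate = 0.5  # hypothetical value
losses = []
for _ in range(20):
    loss, grad = loss_and_grad(logits, label=2)
    losses.append(loss)
    logits = [v - learning_rate * g for v, g in zip(logits, grad)]
```

In a real network the same gradient is pushed further back through the layers to the weights, which is the job of backpropagation.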
Back propagation
Rumelhart, Hinton, Williams, McClelland et al. 1986
Werbos (1974)
Deep networks operate on ~ 1M dimensions
optimization is highly non-convex
playground.tensorflow.org

image recognition

LeCun, Bottou, Bengio, and Haffner (1998)
http://yann.lecun.com/exdb/mnist/
Gradient-Based Learning Applied to Document Recognition
number of parameters for a fully connected system – grows as (pixels on a side) squared
iPhone camera image would need ~4 million parameters for a single layer
exploit symmetries
translation, cropping, dilation, contrast, rotation, scale, brightness
Ruderman and Bialek (1994) Statistics of natural images: scaling in the woods
Simoncelli and Olshausen (2001) Natural image statistics and neural representation
translation invariance -> convolution
https://docs.gimp.org/en/plug-in-convmatrix.html
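A minimal sketch of why convolution suits translation symmetry: the same small kernel is slid over every position, so shifting the input shifts the response by the same amount. The image, kernel, and helper names below are hypothetical. As in most deep learning libraries, the sliding-window operation is implemented without flipping the kernel (strictly a cross-correlation).

```python
def convolve2d(image, kernel):
    """'Valid' sliding-window filter: one output per position where the kernel fits."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(kernel[i][j] * image[r + i][c + j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]

def shift_right(image, pixels):
    """Translate an image to the right, padding with zeros on the left."""
    return [[0] * pixels + row[:len(row) - pixels] for row in image]

# Hypothetical 5x6 image with a single bright pixel and a 3x3 box kernel.
image = [[0] * 6 for _ in range(5)]
image[2][2] = 9
kernel = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]

response = convolve2d(image, kernel)
shifted_response = convolve2d(shift_right(image, 1), kernel)
```

Away from the image border, `shifted_response` is `response` moved one column to the right, which is the property the notes point at with "translation invariance -> convolution".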
Substitute a convolutional layer – model parameters roughly independent of the size of the image
input and output depth are arbitrary parameters and need not be equal
neural networks operate with depths up to 1024
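A rough parameter count makes the contrast concrete. The sketch below assumes a dense layer with a fixed number of hidden units and a convolutional layer with a square kernel; all sizes (3 input channels, 100 hidden units, 3x3 kernel, output depth 64) are hypothetical values chosen only for illustration.

```python
def fully_connected_params(side, in_depth, hidden_units):
    """Weights plus biases for a dense layer reading a side x side x in_depth image."""
    inputs = side * side * in_depth
    return inputs * hidden_units + hidden_units

def conv_params(kernel_side, in_depth, out_depth):
    """Weights plus biases for a conv layer: no dependence on image size."""
    return kernel_side * kernel_side * in_depth * out_depth + out_depth

# Dense layer: count grows with the square of the image side.
dense_counts = {
    side: fully_connected_params(side, in_depth=3, hidden_units=100)
    for side in (32, 64, 128)
}

# Conv layer: same count for every image size above.
conv_count = conv_params(kernel_side=3, in_depth=3, out_depth=64)
```

Doubling the image side roughly quadruples the dense count, while the convolutional count depends only on the kernel size and the input and output depths.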
LeCun et al. 1989 – two convolutional layers
Krizhevsky et al. 2012 – convolutions and fully connected layers

progress

2012 SuperVision 16.4% error
Karpathy 2014 5.1% (human)
2015 Inception v3 – 3.6%
2015 ResNet 1st 3.6%
2016 Inception-ResNet 3.1%
best models are of order a thousand layers
Szegedy et al. 2016
He, Zhang, Ren, Sun 2015
Szegedy 2015
Ioffe and Szegedy 2015
Karpathy 2014
Simonyan and Zisserman
Inception – parallel pathways with multiscale filters

advances in neural networks

nonlinearities: example of batch normalization
- covariate shifts are problematic in machine learning
- blog.bigml.com
- covariate shifts must be mitigated through domain adaptation
- inputs in training and test data have different distributions
- how do you maintain – Ioffe and Szegedy 2015 – Batch Normalization: reducing internal covariate shift
- Goodfellow et al. 2013
- normalize the activations within a mini-batch
- learn the moments as a parameter of the network
- learn the mean and variance of each layer as parameters
- perceptron y = f(BatchNorm(\sum_i w_i x_i))
- stabilize hidden layer activations
- CNNs train faster with fewer data samples
- employ higher learning rates and less network regularization
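A minimal sketch of the batch normalization step listed above, in plain Python. Each feature is normalized over the mini-batch and then rescaled by a learned scale and shift, named `gamma` and `beta` here following Ioffe and Szegedy (2015). The mini-batch values and `eps` are hypothetical, and the sketch covers only the training-time forward pass; inference normally uses running statistics instead.

```python
import math

def batch_norm(batch, gamma, beta, eps=1e-5):
    """Normalize each feature over the mini-batch, then scale by gamma and shift by beta."""
    n = len(batch)
    num_features = len(batch[0])
    means = [sum(row[j] for row in batch) / n for j in range(num_features)]
    variances = [
        sum((row[j] - means[j]) ** 2 for row in batch) / n
        for j in range(num_features)
    ]
    return [
        [
            gamma[j] * (row[j] - means[j]) / math.sqrt(variances[j] + eps) + beta[j]
            for j in range(num_features)
        ]
        for row in batch
    ]

# Hypothetical mini-batch: 4 examples, 2 pre-activations on very different scales.
pre_activations = [[1.0, 100.0], [2.0, 300.0], [3.0, 500.0], [4.0, 700.0]]
normalized = batch_norm(pre_activations, gamma=[1.0, 1.0], beta=[0.0, 0.0])
```

With `gamma` of 1 and `beta` of 0, both columns come out with roughly zero mean and unit variance despite their different input scales; the nonlinearity f would then be applied to `normalized`.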
understanding: example of gradient propagation
- for training a network – focused on calculating gradients with respect to parameters
- but how does the objective change with respect to the image
- weight vs. image space
- which pixels elicit large activation values within a layer?
- Zeiler and Fergus (2013) – Visualizing and Understanding Convolutional Networks
- http://mscoco.org
- what happens if we distort the image
- what if we used the wrong image
- Mordvintsev Olah and Tyka (2015) – Inceptionism: Going Deeper into Neural Networks
- What pixel changes distort an image into the “dog”
- http://googleresearch.blogspot.com/2015/06/inceptionism-going-deeper-into-neural.html
- A Neural Algorithm of Artistic Style – Gatys, Ecker, Bethge (2015)
- https://github.com/kaishengtai/neuralart
- Goodfellow, Shlens, and Szegedy (2015) – Explaining and Harnessing Adversarial Examples
- Szegedy et al. (2014) – Intriguing Properties of Neural Networks
- add images to see what can fool the network
- robust across trained networks, architectures, and other machine learning methods
- network operates in different perceptual space
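The idea of taking gradients in image space rather than weight space can be sketched on a toy model. The example below uses a logistic classifier as a stand-in for a deep network and applies the fast gradient sign perturbation described by Goodfellow, Shlens, and Szegedy (2015). The weights, the 4-pixel "image", and `epsilon` are hypothetical values picked so the effect is easy to see.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(weights, bias, pixels):
    """Probability of the positive class under a toy logistic model."""
    return sigmoid(sum(w * x for w, x in zip(weights, pixels)) + bias)

def loss_gradient_wrt_pixels(weights, bias, pixels, label):
    """Derivative of the cross-entropy loss with respect to each pixel (weights fixed)."""
    error = predict(weights, bias, pixels) - label
    return [error * w for w in weights]

def sign(value):
    return (value > 0) - (value < 0)

def fast_gradient_sign(weights, bias, pixels, label, epsilon):
    """Move every pixel by +/- epsilon in the direction that increases the loss."""
    grad = loss_gradient_wrt_pixels(weights, bias, pixels, label)
    return [x + epsilon * sign(g) for x, g in zip(pixels, grad)]

# Hypothetical fixed weights and a 4-pixel image with true label 1.
weights = [1.0, -2.0, 0.5, 1.5]
bias = 0.0
pixels = [0.6, 0.1, 0.4, 0.3]
label = 1.0

clean_prob = predict(weights, bias, pixels)
adversarial = fast_gradient_sign(weights, bias, pixels, label, epsilon=0.25)
adversarial_prob = predict(weights, bias, adversarial)
```

No pixel moves by more than `epsilon`, yet the predicted probability for the true class drops from above 0.5 to below it, a small-scale version of an image that fools the network.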

Conclusions

cs231n.github.io/convolutional-networks/
www.tensorflow.org
g.co/brainresidency
Applicants in all areas

Brant Robertson's github.io Site

UCSC Deep Learning SCIPP Seminar Notes
