- GPU-Accelerated Tensorflow
- A list of Tensorflow tutorials
- A Guide to TF Layers: Building a Convolutional Neural Network
- Deep Convolutional Neural Networks
- How to Retrain an Image Classifier for New Categories
- Image Recognition
- Other Information
A list of Tensorflow tutorials
A Guide to TF Layers: Building a Convolutional Neural Network
This tutorial covers MNIST and shows how to build a CNN-based classification model. It introduces ReLU activation functions, pooling layers, and the softmax activation function, and it references the Stanford CS231n course on convolutional neural networks. It also introduces a loss function (cross entropy), one-hot encoding, and stochastic gradient descent.
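One-hot encoding, mentioned above, can be illustrated in a few lines of NumPy (a minimal sketch, not the TF API; the helper name is my own):

```python
import numpy as np

def one_hot(labels, num_classes):
    """Turn integer class labels into one-hot rows (a single 1.0 per row)."""
    out = np.zeros((len(labels), num_classes))
    out[np.arange(len(labels)), labels] = 1.0
    return out

# Labels 2, 0, 1 over three classes become three one-hot rows.
encoded = one_hot(np.array([2, 0, 1]), num_classes=3)
```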
** First we define the model function, which returns an `EstimatorSpec`. It takes as arguments the data, labels, and a mode (e.g., train, eval, predict).
** The `layers` module expects tensors of shape `[batch_size, image_width, image_height, channels]`, where `batch_size` is the number of images per training batch and `channels` is, e.g., 3 for RGB or 1 for grayscale. We can use `tf.reshape()` to make this tensor.
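The reshape step has the same semantics in NumPy; a small sketch assuming MNIST's 28x28 grayscale images arriving as flat vectors:

```python
import numpy as np

batch_size = 4
flat = np.zeros((batch_size, 784))  # MNIST images arrive flattened: 28 * 28 = 784

# -1 lets the batch dimension be inferred, as in tf.reshape(x, [-1, 28, 28, 1]).
input_layer = flat.reshape((-1, 28, 28, 1))  # [batch_size, height, width, channels]
```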
** `conv2d()` receives the input layer; the spatial output size depends on the padding (e.g., `padding='same'` zero-pads to maintain the image size). The output of `conv2d()` has one channel per filter, so its shape is `[batch_size, image_width, image_height, filters]`. An activation function has to be indicated (e.g., ReLU).
** `max_pooling2d()` receives the convolution output and uses `pool_size=[n, n]` with `strides=n` to reduce each spatial dimension by a factor of n. For instance, 2x2 max pooling with stride 2 reduces a 28x28 image to 14x14.
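The 2x2/stride-2 max pooling above can be sketched in plain NumPy (illustrative only, not `max_pooling2d()` itself; the filter count of 32 is just an example conv output):

```python
import numpy as np

def max_pool_2x2(images):
    """2x2 max pooling with stride 2 on [batch, height, width, channels] tensors."""
    b, h, w, c = images.shape
    # Split each spatial axis into (blocks, 2) so every 2x2 window gets its
    # own pair of axes, then take the max over those window axes.
    windows = images.reshape(b, h // 2, 2, w // 2, 2, c)
    return windows.max(axis=(2, 4))

x = np.random.rand(4, 28, 28, 32)  # e.g., a conv output with 32 filters, 'same' padding
pooled = max_pool_2x2(x)           # spatial dims halve: 28x28 -> 14x14
```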
** `tf.reshape()` can be used to take the output from `max_pooling2d()` and flatten it to `[batch_size, features]` (a 1-D array per image). That can then be input into a dense layer.
** `tf.layers.dense()` takes a flattened input tensor, and you specify the number of neurons with `units`. Note that `units` does not need to equal the number of elements in the flattened input tensor. An activation function must be specified (e.g., ReLU).
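A dense layer is just a matrix multiply plus bias; this NumPy sketch (my own helper, not the TF API) shows why `units` is independent of the flattened input size:

```python
import numpy as np

def dense(x, units, rng):
    """Fully connected layer: x is [batch, features], output is [batch, units]."""
    w = rng.standard_normal((x.shape[1], units))  # weights: features x units
    b = np.zeros(units)
    return x @ w + b

rng = np.random.default_rng(0)
flat = rng.standard_normal((4, 7 * 7 * 64))  # e.g., a flattened pooled feature map
hidden = dense(flat, 1024, rng)              # 1024 neurons, unrelated to the 3136 inputs
```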
** `tf.layers.dropout()` applies dropout regularization, with `rate` indicating the fraction of neuron outputs that are randomly dropped. The `training` argument specifies if we are training, which can be derived from the `mode` passed to the model function.
** The output of `dropout` is `batch_size` x `units`. The `logits` layer is another dense layer applied after dropout, but with an output of `batch_size` x 10 (one logit per class).
** The predicted class can be found using `tf.argmax()` on the logits.
** The probabilities can be determined using `tf.nn.softmax()`.
** These predictions are then zipped into a dict and returned if in prediction mode.
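The two prediction outputs can be illustrated in NumPy (same idea as argmax over logits plus a softmax; the dict keys are illustrative):

```python
import numpy as np

logits = np.array([[2.0, 0.5, 0.1],
                   [0.2, 0.1, 3.0]])

classes = np.argmax(logits, axis=1)                     # predicted class per image
e = np.exp(logits - logits.max(axis=1, keepdims=True))  # shift for numerical stability
probabilities = e / e.sum(axis=1, keepdims=True)        # rows sum to 1

predictions = {"classes": classes, "probabilities": probabilities}
```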
** Otherwise, a loss function is computed – instead of `one_hot` encoding the labels, the tutorial now uses `sparse_softmax_cross_entropy` directly on the input labels and output logits.
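A NumPy sketch of the sparse form (my own function, not the TF op): it consumes integer labels directly, with no one-hot step, and yields the same value as the one-hot cross-entropy formulation.

```python
import numpy as np

def sparse_softmax_cross_entropy(labels, logits):
    """Mean cross-entropy from integer labels and raw logits (no one-hot needed)."""
    z = logits - logits.max(axis=1, keepdims=True)           # stability shift
    log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    # Pick out the log-probability of each example's true class.
    return -log_probs[np.arange(len(labels)), labels].mean()

labels = np.array([2, 0])
logits = np.array([[0.0, 1.0, 4.0],
                   [3.0, 0.0, 1.0]])
loss = sparse_softmax_cross_entropy(labels, logits)
```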
** If training, then define a `tf.train.GradientDescentOptimizer` with an input learning rate (e.g., 0.001), pass the loss function output to the optimizer's `minimize()`, and return the `EstimatorSpec`.
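The update rule behind gradient descent is simple; a NumPy sketch of repeated steps with the tutorial's learning rate of 0.001, on a toy quadratic rather than the real loss:

```python
import numpy as np

def sgd_step(weights, grad, learning_rate=0.001):
    """One gradient-descent update: move the weights against the gradient."""
    return weights - learning_rate * grad

# Toy objective f(w) = (w - 3)^2, whose gradient is 2 * (w - 3);
# repeated steps pull w toward the minimum at 3.
w = np.array([0.0])
for _ in range(5000):
    w = sgd_step(w, 2.0 * (w - 3.0))
```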
** If evaluating, just compute the accuracy with `tf.metrics.accuracy` and return the `EstimatorSpec`.
** At this point, the model is defined. We then have to define a `main()` function to run the model on the data.
** In `main()`, we need to define the dataset. We select `mnist.train.images` to get the training dataset and load the labels as an array. We then define a test or evaluation dataset, which is `mnist.test.images` and its corresponding labels as an array.
** The `tf.estimator.Estimator()` function is given the `cnn_model_fn` and a model output directory. The classifier is then trained via `mnist_classifier.train()` and evaluated using `mnist_classifier.evaluate()`.
Deep Convolutional Neural Networks
** The CIFAR-10 data is stored as fixed-length binary records, and there is a dedicated reader (`tf.FixedLengthRecordReader`) to parse them.
** Image distortion and augmentation are applied to the training images.
** The model adds local response normalization as a step. This normalizes each neuron's output by a weighted, squared sum of the outputs of nearby channels (feature maps), not of nearby images.
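Local response normalization divides each activation by a power of a windowed sum of squared activations from neighboring channels; a NumPy sketch following the AlexNet-style formula (the hyperparameter values here are illustrative defaults, not the tutorial's exact ones):

```python
import numpy as np

def lrn(x, depth_radius=2, bias=1.0, alpha=1e-3, beta=0.75):
    """AlexNet-style local response normalization on a
    [batch, height, width, channels] tensor."""
    b, h, w, c = x.shape
    sq = x ** 2
    out = np.empty_like(x)
    for i in range(c):
        # Window of channels around channel i, clipped at the edges.
        lo, hi = max(0, i - depth_radius), min(c, i + depth_radius + 1)
        denom = (bias + alpha * sq[..., lo:hi].sum(axis=-1)) ** beta
        out[..., i] = x[..., i] / denom
    return out

x = np.random.rand(2, 4, 4, 8).astype(np.float32)
y = lrn(x)
```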
** The model splits training and evaluation into separate scripts (`cifar10_train.py` and `cifar10_eval.py`).
** As an exercise, they suggest downloading the Street View House Numbers dataset and re-running the AlexNet model. This requires doing some data reading with MATLAB (`.mat` files), so it is on the back burner for the time being.
How to Retrain an Image Classifier for New Categories
This retrains ImageNet to classify flowers. First the flower images and the retraining example are downloaded. The retraining is started with `python retrain.py --image_dir ~/flower_photos`; this creates the bottlenecks that help apply ImageNet to a new classification set. The code then proceeds to train and estimate accuracy. The tutorial also shows how to use TensorBoard (e.g., `tensorboard --logdir /tmp/retrain_logs`). The `label_image.py` script provides a starting point for using a retrained ImageNet model for classification. One can also specify the dimensions of the images:

```shell
python label_image.py \
  --graph=/tmp/output_graph.pb \
  --labels=/tmp/output_labels.txt \
  --input_layer=Placeholder \
  --output_layer=final_result \
  --input_height=224 --input_width=224 \
  --image=$HOME/flower_photos/daisy/21652746_cc379e0eea_m.jpg
```