Tutorial for detecting facial key points using convolutional neural networks (1)

This is a tutorial that teaches you the basics of deep learning. Step by step, we will work through the facial keypoint detection problem from the Kaggle challenge.

This tutorial introduces Lasagne, a relatively new neural network library based on Python and Theano. We will use Lasagne to implement a number of neural network architectures, and discuss data augmentation, dropout, momentum, and pre-training. All of these methods will help us improve our results.

I assume that you already know something about neural networks, so we will not cover their background here. There are some good books and videos on the topic, such as the online book Neural Networks and Deep Learning. Alec Radford's presentation Deep Learning with Python's Theano library is also a good quick introduction, as are the ConvNetJS browser demos.

Prerequisites

If you just want to follow along, you don't need to write and run the code yourself. For those who have a CUDA-capable GPU and want to run the experiments, here are some installation instructions.

I assume that you have already installed the CUDA toolkit, Python 2.7.x, numpy, pandas, matplotlib, and scikit-learn. To install the remaining dependencies, such as Lasagne and Theano, run the following command:
pip install -r https://raw.githubusercontent.com/dnouri/kfkd-tutorial/master/requiremen...

Note that, for brevity, the command above does not create a virtual environment, but you really should use one.

Translator: I set up this environment on Windows 10 with Anaconda (using its environment to install the dependencies), VS2013 (2015 is not recommended), and the CUDA toolkit.

If all goes well, you will find mnist.py in the src/lasagne/examples/ directory of your virtual environment; run it to try the MNIST example. This is the "Hello world" program of neural networks. The data has ten categories, the digits 0 to 9, and the inputs are 28x28 pictures of handwritten digits.

cd src/lasagne/examples/
python mnist.py

This command will start printing output after about 30 seconds. The reason it takes a while is that Lasagne uses Theano to do the heavy lifting; Theano in turn is an "optimizing GPU-meta-programming code generating array oriented optimizing math compiler for Python", and it generates C code that has to be compiled before training can happen. Fortunately, we only have to pay the price of this overhead on the first run.

Translator: If you have not configured a GPU, the CPU is used instead; the compilation time should not be that long, but the execution time will be. If you use a GPU, the first time you run some programs you will see messages about the compilation.

When the training starts, you will see
Epoch 1 of 500
Training loss: 1.352731
Validation loss: 0.466565
Validation accuracy: 87.70 %
Epoch 2 of 500
Training loss: 0.591704
Validation loss: 0.326680
Validation accuracy: 90.64 %
Epoch 3 of 500
Training loss: 0.464022
Validation loss: 0.275699
Validation accuracy: 91.98 %
...

If you let the training run long enough, you will notice that after about 75 epochs it reaches a test accuracy of about 98%.

If you have a GPU and you want Theano to use it, create a .theanorc file in your home directory. The configuration depends on your installation environment and your operating system; for example:
[global]
floatX = float32
device = gpu0

[lib]
cnmem = 1

Translator: This is my configuration file:

[cuda]
root = C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v8.0

[global]
openmp = False
device = gpu
floatX = float32
allow_input_downcast = True

[nvcc]
fastmath = True
flags = -IC:\Anaconda2\libs
compiler_bindir = C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\bin
base_compiledir = path_to_a_directory_without_such_characters

[blas]
ldflags =

[gcc]
cxxflags = -IC:\Anaconda2\MinGW
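
A quick way to verify that Theano actually picks up these settings is to print the active device and float type; with a configuration like the one above it should report gpu (or gpu0) and float32:

python -c "import theano; print(theano.config.device); print(theano.config.floatX)"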

Data

The training data set for facial keypoint detection consists of 7,049 grayscale images of 96x96 pixels. For each image, we are supposed to learn the correct position (the x and y coordinates) of 15 keypoints, such as
left_eye_center
right_eye_outer_corner
mouth_center_bottom_lip

An example of a face marked with three key points.

An interesting twist in the data set is that for some of the keypoints we only have about 2,000 labels, while for other keypoints more than 7,000 labels are available for training.
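
If you want to check this yourself, a few lines of pandas will show how many labels each keypoint column actually has (a quick sanity check; the path below assumes you saved training.csv to the location used in the code that follows):

# Count the non-null labels per column; some keypoint columns have far
# fewer entries than others.
import os
from pandas.io.parsers import read_csv

df = read_csv(os.path.expanduser('~/data/kaggle-facial-keypoint-detection/training.csv'))
print(df.count())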

Let's write some Python code that loads the data from the provided CSV files. We will write a function that can load both the training and the test data. The two data sets differ in that the test data does not contain the target values; predicting those is the goal of the challenge. Here is our load() function:
# file kfkd.py
import os

import numpy as np
from pandas.io.parsers import read_csv
from sklearn.utils import shuffle

FTRAIN = '~/data/kaggle-facial-keypoint-detection/training.csv'
FTEST = '~/data/kaggle-facial-keypoint-detection/test.csv'
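
# What follows is a sketch of load(); it assumes the CSV layout described on
# the Kaggle page (one column per target coordinate plus an Image column
# holding 96x96 = 9216 space-separated pixel values), so the scaling
# constants below are tied to that format.

def load(test=False, cols=None):
    """Load data from FTEST if *test* is True, otherwise from FTRAIN.
    Pass a list of *cols* if you are only interested in a subset of the
    target columns.
    """
    fname = FTEST if test else FTRAIN
    df = read_csv(os.path.expanduser(fname))  # load as a pandas DataFrame

    # The Image column has pixel values separated by spaces; convert
    # them to numpy arrays:
    df['Image'] = df['Image'].apply(lambda im: np.fromstring(im, sep=' '))

    if cols:  # get a subset of the target columns
        df = df[list(cols) + ['Image']]

    print(df.count())  # print the number of values for each column
    df = df.dropna()  # drop all rows that have missing values

    X = np.vstack(df['Image'].values) / 255.  # scale pixel values to [0, 1]
    X = X.astype(np.float32)

    if not test:  # only the training file has target columns
        y = df[df.columns[:-1]].values
        y = (y - 48) / 48  # scale target coordinates to [-1, 1]
        X, y = shuffle(X, y, random_state=42)  # shuffle the training data
        y = y.astype(np.float32)
    else:
        y = None

    return X, y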
