Overall, Keras really is an "easy" deep learning framework. Its reference documentation is fairly detailed; it is not like Caffe, where after installation you have to rely on technical blogs. Keras has its own official documentation (in English, though), which gives beginners a great place to learn.
That documentation deserves a strong recommendation! If your English is good you can read it directly; this post walks through the same material in my own words.
Keras official documentation
Let me be upfront about one thing: I never formally learned Python, and I look everything up on Baidu as I go, so the code is sometimes redundant, with things that could be written in one line spread over several~
Quoted from the paper: 3.2 Test platform
The project code runs on Windows 7, mainly using MATLAB R2013a and Python. MATLAB is used for patch extraction and preprocessing, and the convolutional neural network is built with Keras, a deep learning framework based on Python and Theano. Keras was designed with reference to Torch, is written in Python, and is a highly modular neural network library that supports both GPU and CPU. It is especially easy to use and well suited to rapid development.
Reference material
Keras, a Theano-based deep learning framework: learning notes 12 - core layers
Keras, a Theano-based deep learning framework: learning notes 13 - convolutional layers
1. The main function that builds the convolutional neural network

def create_model(data):
    model = Sequential()
    model.add(Convolution2D(64, 5, 5, border_mode='valid', input_shape=data.shape[-3:]))
    model.add(Activation('relu'))
    model.add(Dropout(0.5))
    model.add(Convolution2D(64, 5, 5, border_mode='valid'))
    model.add(Activation('relu'))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Dropout(0.5))
    model.add(Convolution2D(32, 3, 3, border_mode='valid'))
    model.add(Activation('relu'))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Convolution2D(32, 3, 3, border_mode='valid'))
    model.add(Activation('relu'))
    model.add(Dropout(0.5))
    model.add(Flatten())
    model.add(Dense(512, init='normal'))
    model.add(Activation('relu'))
    model.add(Dropout(0.5))
    model.add(Dense(LABELTYPE, init='normal'))
    model.add(Activation('softmax'))
    sgd = SGD(l2=0.0, lr=0.01, decay=1e-6, momentum=0.9, nesterov=True)
    model.compile(loss='categorical_crossentropy', optimizer=sgd, class_mode="categorical")
    return model
This function is quite concise and clear: it takes the training data as input and returns an empty (untrained) neural network, which is the initialization of the convolutional network. model = Sequential() starts the network, and every following model.add() stacks on another layer, like building blocks: whatever you want, you add. A convolutional neural network has two characteristic layer types: 1) convolution and 2) downsampling, which correspond to this code:
model.add(Convolution2D(64, 5, 5, border_mode='valid'))
# add a convolutional layer: 64 kernels, each of size 5*5
model.add(MaxPooling2D(pool_size=(2, 2)))
# add a downsampling (max pooling) layer with a 2*2 window
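As a quick sanity check on how these two kinds of layers shrink the image, here is a rough sketch; the 24*24 input size is the one my patches use (see section 1.3), and with a 'valid' border mode the output side length is input - kernel + 1, while 2*2 max pooling halves it:

size = 24            # assumed input side length (see section 1.3)
size = size - 5 + 1  # Convolution2D(64, 5, 5): 24 -> 20
size = size - 5 + 1  # Convolution2D(64, 5, 5): 20 -> 16
size = size // 2     # MaxPooling2D((2, 2)):    16 -> 8
size = size - 3 + 1  # Convolution2D(32, 3, 3): 8  -> 6
size = size // 2     # MaxPooling2D((2, 2)):    6  -> 3
size = size - 3 + 1  # Convolution2D(32, 3, 3): 3  -> 1
print(size)          # 1, so Flatten() sees 32*1*1 = 32 values per sample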
1.1 Activation function
Note: an activation function is added after each convolutional layer; this is the textbook part.
It keeps the result of the convolution within a certain range of values, such as 0~1 or -1~1, so that the values coming out of each convolution do not differ too wildly.
The corresponding code is this line:
model.add(Activation('relu'))
Keras provides many choices for the activation function (Activation). I use ReLU here; other options include:
tanh
sigmoid
hard_sigmoid
linear
and so on. The Keras library keeps being updated, and the better activation functions used in newer papers get added as well, for example:
LeakyReLU
PReLU
ELU
Any of these can be dropped in as a replacement, nominally to "optimize the network"; in practice you really just change the name, haha, the internals have already been written for you. Note: the activation function of the last layer of a convolutional neural network is generally softmax. Let me say a little more about how to choose among these activation functions.
Refer to this article for the details. In brief, the drawbacks are:
sigmoid: causes vanishing gradients and is not zero-centered
tanh: causes vanishing gradients
ReLU: the gradient dies when x < 0
I know Leaky ReLU is available off the shelf, but I have not used it yet; for now I still use ReLU. Don't ask me why :)
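If I did want to try one of the fancier activations, the swap would look roughly like this. This is only a sketch against the old Keras API used in this post, and alpha=0.3 is just an illustrative value, not something I tuned:

from keras.layers.advanced_activations import LeakyReLU
model.add(Convolution2D(64, 5, 5, border_mode='valid', input_shape=data.shape[-3:]))
model.add(LeakyReLU(alpha=0.3))   # replaces model.add(Activation('relu'))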
1.2 Dropout layer
Dropout: for the "over-fitting" problem.
Don't let every neuron fire.
When the human brain processes a signal, not all neurons fire, because 1) the brain's energy supply cannot keep up, 2) neurons are specialized, with specific neurons handling specific signals, and 3) activating all neurons increases the reaction time. So a simulated neural network also has to make trade-offs, for example suppressing part of the neurons, which improves the speed and robustness of the network and reduces the chance of "over-fitting". Enough rambling; anyway, it is a good thing! In code it is this line:
model.add(Dropout(0.5))
The 0.5 can be changed; it means 50% of the neurons in that layer are dropped, i.e. their outputs are thrown away at random during each training pass.
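To make the mechanism concrete, here is a tiny NumPy sketch of what dropout does to one layer's output during a training pass; this is my own illustration, not Keras's internal implementation:

import numpy as np
activations = np.random.rand(8)    # pretend these are the outputs of 8 neurons
keep = np.random.rand(8) > 0.5     # each neuron survives with probability 0.5
print(activations * keep)          # the dropped neurons contribute 0 on this pass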
1.3 A few remaining details
That covers the network-initialization function except for a few small things:
model.add(Convolution2D(64, 5, 5, border_mode='valid', input_shape=data.shape[-3:]))
You will notice that the first convolution line is longer than the others, because it also has to declare the shape of the training data, namely input_shape=data.shape[-3:], which tells the network how many channels each sample has and the size of each input image. In my case,
data.shape[-3:] means six channels, each 24*24 pixels.
The concept of a channel: a black-and-white image has one channel (the gray value); a color image has three channels (RGB); of course the channels do not have to be colors, such as the six channels I use. I am not entirely clear on how the channels are handled internally, though; maybe the mechanism was designed with RGB in mind. Leaving a question mark here.
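As a sketch of the shapes in my setup (the sample count 1000 is made up; only the (channels, rows, cols) part matters to Keras):

import numpy as np
data = np.empty((1000, 6, 24, 24), dtype='float32')  # (samples, channels, rows, cols)
print(data.shape[-3:])                                # (6, 24, 24), which is what input_shape receives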
model.add(Flatten())
model.add(Dense(512, init='normal'))
These two lines add a fully connected layer, which is the classic dense part of a convolutional neural network.
512 means this layer has 512 neurons.
There is not much else to say: it is just another part of the model, there can be several such layers, and they generally sit near the back of the network.
sgd = SGD(l2=0.0, lr=0.01, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss='categorical_crossentropy', optimizer=sgd, class_mode="categorical")
This part is the legendary "gradient descent" method. It is used in the feedback (backpropagation) phase of the network to keep adjusting the parameters of every convolutional layer, the so-called "learning" process. I use the most common choice, SGD; its parameters include the learning rate (lr). The other parameters could in principle be changed too, but I did not touch them, heh.
Tip: the learning rate is generally small; I use 0.01, and the right value depends on the training data. If it is too small, training is very slow; if it is too large, training easily blows up.
Picture a circle as the current position and a five-pointed star as the target position: if the learning rate is too large, it is easy to jump straight past the target, and training fails.
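A toy version of that picture, minimizing f(x) = x^2 by plain gradient descent (my own illustration, nothing Keras-specific): with a small step the position creeps toward the minimum at 0, while with a step that is too large every update jumps past the target and the position blows up.

def descend(lr, x=5.0, steps=5):
    for _ in range(steps):
        x = x - lr * 2 * x   # the gradient of x^2 is 2x
    return x

print(descend(0.01))  # stays near 5, slowly heading toward 0
print(descend(1.5))   # overshoots every step: 5 -> -10 -> 20 -> -40 -> ...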
I have not tried the other optimizers Keras provides (Optimizer), so I do not know their respective strengths and weaknesses. Here are a few of the alternatives:
RMSprop
Adagrad
Adadelta
Adam
Adamax
and so on. I would guess each method corresponds to a deep learning paper; Keras already provides the code, and if you want the details you can trace back to the papers. While I am at it, a word on the cost function: for the "slow learning" and "over-fitting" problems there are ways of modifying the cost function. I understand the idea, but I am still working out where inside Keras to make the change. The idea, first: a better cost function helps with both of these problems.
I am looking for a chance to change that part of the Keras code.
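For reference, the standard textbook recipe for both problems is the cross-entropy cost (against slow learning) plus an L2 weight penalty (against over-fitting); this is general material, not the specific change I have made. With n the number of samples, y_j the true labels, a_j the network outputs, w the weights and \lambda the regularization strength:

C = -\frac{1}{n} \sum_x \sum_j y_j \ln a_j + \frac{\lambda}{2n} \sum_w w^2

The compile call above already uses the cross-entropy part via loss='categorical_crossentropy', and the l2 argument in the SGD line looks like where such a weight penalty would be set, although I have not dug into that part of the Keras source.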
Main code section, The End
2. Code before training
There are a few steps to take care of before training starts:
Import the required Python packages
Import the data
Split the data into training and test sets
2.1 Importing the required Python packages
#coding:utf-8
'''
GPU run command:
THEANO_FLAGS=mode=FAST_RUN,device=gpu,floatX=float32 python cnn.py
CPU run command:
python cnn.py
'''
###########################################################
# Import the module components used
###########################################################
# ConvNets modules
from __future__ import absolute_import
from __future__ import print_function
from keras.models import Sequential
from keras.layers.core import Dense, Dropout, Activation, Flatten
from keras.layers.advanced_activations import PReLU, LeakyReLU
import keras.layers.advanced_activations as adact
from keras.layers.convolutional import Convolution2D, MaxPooling2D
from keras.optimizers import SGD, Adadelta, Adagrad, Adam, Adamax
from keras.utils import np_utils, generic_utils
from six.moves import range
from keras.callbacks import EarlyStopping
# Statistics module
from collections import Counter
import random, cPickle
from cutslice3d import load_data
from cutslice3d import ROW, COL, LABELTYPE, CHANNEL
# Memory tuning module
import sys
There is not much to explain here; it is the equivalent of #include in C: whatever you use later, you import first. To find out which package a given function lives in, go to the directory where Keras is installed. My install path is
C:\Users\Administrator\Anaconda2\Lib\site-packages\keras
where you will see the .py files one by one; the keras directory looks like this.
For example, if you need the Sequential() function, you first have to know that it is defined in models.py under keras, and then this line writes itself:
from keras.models import Sequential
# import Sequential from keras's models.py
You see, the code really is just a literal translation of that sentence. The hard part is not knowing where Sequential() is defined in the first place; that takes a careful read of the Keras documentation, and there are far too many functions to cover here.
For this part I do feel some Python knowledge is needed, because quite apart from Keras, many of the standard packages are very handy and save a lot of work. An example:
from collections import Counter
Its job is to count how many times each distinct element appears in an array; the code below does exactly that.
cnt = Counter(A)
for k, v in cnt.iteritems():
    print('\t', k, '-->', v)
# prints the number of occurrences of each element of A
2.2 Simple processing module for data
###########################################################
# Description of this test
###########################################################
print("\nHey you, this is a trial on malignance and benign tumors detection via ConvNets. I'm Zongwei Zhou. :)")
print("Each input patch is 51*51, cut from 1383 3d CT & PT images. The MINIMUM is above 30 segment pixels.")
###########################################################
# Load the data
###########################################################
print(">> Loading Data ...")
TrData, TrLabel, VaData, VaLabel = load_data()
###########################################################
# Shuffle the data
###########################################################
index = [i for i in range(len(TrLabel))]
random.shuffle(index)
TrData = TrData[index]
TrLabel = TrLabel[index]
print('\nTherefore, read in', TrData.shape[0], 'samples from the dataset totally.')
# the labels are 0~1, two classes in total; Keras wants the binary class matrices format, so convert them with the function Keras provides
TrLabel = np_utils.to_categorical(TrLabel, LABELTYPE)
Here I call load_data(), a function I wrote myself that does the data import: it reads the training set and the test set from .mat files. The patch translation, the rotation augmentation, and the training/test split are all done in MATLAB, and the amount of data that produces is huge; as of April 7 my training set had grown to 31.4 GB. The Python-side function is fairly direct, and it looks like this:
# load_data() lives in cutslice3d.py; it needs h5py and numpy there,
# and DATAPATH_*, Training_*, Validation_* below are constants (paths and .mat keys) defined elsewhere in that file
import h5py
import numpy as np

def load_data():
    ###########################################################
    # Read data from the .mat files
    ###########################################################
    mat_training = h5py.File(DATAPATH_Training)
    mat_training.keys()
    Training_CT_x = mat_training[Training_CT_1]
    Training_CT_y = mat_training[Training_CT_2]
    Training_CT_z = mat_training[Training_CT_3]
    Training_PT_x = mat_training[Training_PT_1]
    Training_PT_y = mat_training[Training_PT_2]
    Training_PT_z = mat_training[Training_PT_3]
    TrLabel = mat_training[Training_label]
    TrLabel = np.transpose(TrLabel)
    Training_Dataset = len(TrLabel)

    mat_validation = h5py.File(DATAPATH_Validation)
    mat_validation.keys()
    Validation_CT_x = mat_validation[Validation_CT_1]
    Validation_CT_y = mat_validation[Validation_CT_2]
    Validation_CT_z = mat_validation[Validation_CT_3]
    Validation_PT_x = mat_validation[Validation_PT_1]
    Validation_PT_y = mat_validation[Validation_PT_2]
    Validation_PT_z = mat_validation[Validation_PT_3]
    VaLabel = mat_validation[Validation_label]
    VaLabel = np.transpose(VaLabel)
    Validation_Dataset = len(VaLabel)

    ###########################################################
    # Initialization
    ###########################################################
    TrData = np.empty((Training_Dataset, CHANNEL, ROW, COL), dtype="float32")
    VaData = np.empty((Validation_Dataset, CHANNEL, ROW, COL), dtype="float32")

    ###########################################################
    # Crop images, fill in the channels
    ###########################################################
    for i in range(Training_Dataset):
        TrData[i,0,:,:] = Training_CT_x[:,:,i]
        TrData[i,1,:,:] = Training_CT_y[:,:,i]
        TrData[i,2,:,:] = Training_CT_z[:,:,i]
        TrData[i,3,:,:] = Training_PT_x[:,:,i]
        TrData[i,4,:,:] = Training_PT_y[:,:,i]
        TrData[i,5,:,:] = Training_PT_z[:,:,i]
    for i in range(Validation_Dataset):
        VaData[i,0,:,:] = Validation_CT_x[:,:,i]
        VaData[i,1,:,:] = Validation_CT_y[:,:,i]
        VaData[i,2,:,:] = Validation_CT_z[:,:,i]
        VaData[i,3,:,:] = Validation_PT_x[:,:,i]
        VaData[i,4,:,:] = Validation_PT_y[:,:,i]
        VaData[i,5,:,:] = Validation_PT_z[:,:,i]

    print '\nThe dimension of each data and label, listed as following:'
    print '\tTrData  : ', TrData.shape
    print '\tTrLabel : ', TrLabel.shape
    print '\tRange   : ', np.amin(TrData[:,0,:,:]), '~', np.amax(TrData[:,0,:,:])
    print '\t          ', np.amin(TrData[:,1,:,:]), '~', np.amax(TrData[:,1,:,:])
    print '\t          ', np.amin(TrData[:,2,:,:]), '~', np.amax(TrData[:,2,:,:])
    print '\t          ', np.amin(TrData[:,3,:,:]), '~', np.amax(TrData[:,3,:,:])
    print '\t          ', np.amin(TrData[:,4,:,:]), '~', np.amax(TrData[:,4,:,:])
    print '\t          ', np.amin(TrData[:,5,:,:]), '~', np.amax(TrData[:,5,:,:])
    print '\tVaData  : ', VaData.shape
    print '\tVaLabel : ', VaLabel.shape
    print '\tRange   : ', np.amin(VaData[:,0,:,:]), '~', np.amax(VaData[:,0,:,:])
    print '\t          ', np.amin(VaData[:,1,:,:]), '~', np.amax(VaData[:,1,:,:])
    print '\t          ', np.amin(VaData[:,2,:,:]), '~', np.amax(VaData[:,2,:,:])
    print '\t          ', np.amin(VaData[:,3,:,:]), '~', np.amax(VaData[:,3,:,:])
    print '\t          ', np.amin(VaData[:,4,:,:]), '~', np.amax(VaData[:,4,:,:])
    print '\t          ', np.amin(VaData[:,5,:,:]), '~', np.amax(VaData[:,5,:,:])

    return TrData, TrLabel, VaData, VaLabel
It reads the data stored in the .mat files and returns it already split into the training set (TrData, TrLabel) and the test set (VaData, VaLabel); it is fairly simple, so I will not expand on it here. I will introduce the MATLAB-side data augmentation later; for now, note that data augmentation is also aimed at the "over-fitting" problem.
Note: my labels are 0~1, two classes in total. Keras requires the binary class matrices format, so they have to be converted, which is done by calling the Keras-provided np_utils.to_categorical().
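A quick sketch of what that conversion does, assuming LABELTYPE is 2 as in my case:

from keras.utils import np_utils
labels = [0, 1, 1, 0]
print(np_utils.to_categorical(labels, 2))
# one row per sample: [[1, 0], [0, 1], [0, 1], [1, 0]]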
3. Training and post-training code
The hard bones are all chewed through; from here on it is easy and only takes a few lines.
Print("》》Build Model ..")
Model = create_model(TrData)
###########################################################
#è®ç»ƒConvNets模型
###########################################################
Print("》" Training ConvNets Model ..")
Print("\Here, batch_size =", BATCH_SIZE, ", epoch =", EPOCH, ", lr =", LR, ", momentum =", MOMENTUM)
Early_stopping = EarlyStopping(monitor='val_loss', patience=2)
Hist = model.fit(TrData, TrLabel, \\
Batch_size=BATCH_SIZE, \\
Nb_epoch=EPOCH, \\
Shuffle=True, \\
Verbose=1, \\
Show_accuracy=True, \\
Validation_split=VALIDATION_SPLIT, \\
Callbacks=[early_stopping])
###########################################################
# Testing ConvNets Model
###########################################################
Print("》" Test the model...")
Pre_temp=model.predict_classes(VaData)
3.1 Training model
First call create_model() from section 1 to build the initialized model. After that, the heart of the training is a single statement:
hist = model.fit(TrData, TrLabel,
                 batch_size=100,
                 nb_epoch=10,
                 shuffle=True,
                 verbose=1,
                 show_accuracy=True,
                 validation_split=0.2,
                 callbacks=[early_stopping])
:) Yes, just one statement, but it packs quite a lot in. I will simply list the items I care about:
TrData: the training data
TrLabel: the training data labels
batch_size: the number of training samples used for each gradient descent parameter update
nb_epoch: the number of training iterations (epochs)
shuffle: when shuffle=True the training data are reshuffled every epoch (shuffling is the default); the validation data are not shuffled by default
validation_split: the fraction held out for validation; I chose 0.2 here. Note that this is not the test set from section 2.2: this validation set is carved out of the training data, so a sample in it may well end up in the training portion on the next run. The set from section 2.2 is the global test set, used for the final test of the trained network.
early_stopping: whether to stop training early. The network decides by itself: when this epoch's result is nearly the same as the last one, the training iterations stop automatically, so it does not necessarily run all nb_epoch (10) of them.
The early_stopping object used above is created here:
early_stopping = EarlyStopping(monitor='val_loss', patience=2)
The remaining arguments relate to what is displayed during training; I kept my own values or the defaults :)
One more aside: if you want to see the result of each epoch, you can! The hist in hist = model.fit() stores the result of each epoch of training and the validation accuracy.
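If the fit() call returns a History object with a history dict (that is how later Keras versions do it; the old version used here may name things differently, so check keras/callbacks.py if it complains), something like this prints the per-epoch curves:

print(hist.history.keys())        # e.g. 'loss', 'acc', 'val_loss', 'val_acc'
print(hist.history['val_loss'])   # validation loss after each epoch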
If you want to see the output of each layer, that is also possible!
This can be used to combine the convolutional network with other traditional classifiers, an experiment that improves on the plain softmax approach and involves more advanced methods, which I will talk about later. Here I only give the code that reads out one layer's output:
import theano
# origin_model is the trained model; the index 12 picks whichever layer's output you want
get_feature = theano.function([origin_model.layers[0].input],
                              origin_model.layers[12].get_output(train=False),
                              allow_input_downcast=False)
feature = get_feature(data)
OK, and here are the Python functions for SVM and Random Forests; if you want to run that experiment, you can use them:
###########################################################
# SVM
###########################################################
# needs: from sklearn.svm import SVC
def svc(traindata, trainlabel, testdata, testlabel):
    print("Start training SVM .....")
    svcClf = SVC(C=1.0, kernel="rbf", cache_size=3000)
    svcClf.fit(traindata, trainlabel)
    pred_testlabel = svcClf.predict(testdata)
    num = len(pred_testlabel)
    accuracy = len([1 for i in range(num) if testlabel[i] == pred_testlabel[i]]) / float(num)
    print(">> cnn-svm Accuracy")
    Prt(testlabel, pred_testlabel)   # Prt() is a separate helper (not shown) that prints the accuracy report

###########################################################
# Random Forests
###########################################################
# needs: from sklearn.ensemble import RandomForestClassifier
def rf(traindata, trainlabel, testdata, testlabel):
    print("Start training Random Forest ....")
    rfClf = RandomForestClassifier(n_estimators=100, criterion='gini')
    rfClf.fit(traindata, trainlabel)
    pred_testlabel = rfClf.predict(testdata)
    print(">> cnn-rf Accuracy")
    Prt(testlabel, pred_testlabel)
OK, stopping here.
3.2 Test model
Heh, this is the easiest part of all, again just one line:
pre_temp = model.predict_classes(VaData)
Apply the ready-made predict_classes() to the test set VaData and it returns pre_temp, the trained network's predictions. Finally, compare pre_temp against the correct test labels VaLabel and you know how well the network has trained. Victory for the experimental stage! Here is a screenshot:
Everybody Happy
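In code, the comparison can be as short as the sketch below; it assumes VaLabel still holds plain class indices (0 or 1), which is how load_data() returns it, since I never ran it through to_categorical:

import numpy as np
pred = np.asarray(pre_temp).flatten()
truth = np.asarray(VaLabel).flatten()
print('Validation accuracy:', np.mean(pred == truth))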
3.3 Saving the model
Training a model is not easy: you have to tune parameters and even adjust the network structure, and training takes a long time, so you should learn to save a trained network. The code looks like this:
###########################################################
# Save the ConvNets model
###########################################################
model.save_weights('MyConvNets.h5')
cPickle.dump(model, open('./MyConvNets.pkl', "wb"))
json_string = model.to_json()
open(W_MODEL, 'w').write(json_string)
Just save, and these are the three files you get.
Saved model files
When you come back later and want to use this network again, read it back in with this code:
model = cPickle.load(open('MyConvNets.pkl', "rb"))
This reads the model stored in the .pkl file back into model.
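If you prefer to rebuild from the JSON architecture plus the .h5 weights instead of the pickle, the reload looks roughly like this (a sketch: it assumes model_from_json is available in this Keras version and that W_MODEL is the JSON path used when saving above):

from keras.models import model_from_json
model = model_from_json(open(W_MODEL).read())
model.load_weights('MyConvNets.h5')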