Overall, Keras really is an "easy" deep learning framework. Its reference documentation is fairly detailed; it is not like Caffe, where after installation you have to rely on technical blogs. Keras has its own official documentation (in English, though), which gives beginners a great place to learn.
That documentation deserves a strong recommendation! If your English is good you can read it directly; this post walks through the same material in my own words.
Keras official documentation
Let me be upfront about one thing: I never formally learned Python, and I look everything up on Baidu as I go, so the code is sometimes redundant, with things that could be written in one line spread over several~
Quoted from the paper: 3.2 Test platform
The project code runs on Windows 7, mainly using MATLAB R2013a and Python. MATLAB is used for patch extraction and preprocessing, and the convolutional neural network is built with Keras, a deep learning framework based on Python and Theano. Keras was designed with reference to Torch, is written in Python, and is a highly modular neural network library that supports both GPU and CPU. It is especially easy to use and well suited to rapid development.
Reference material
Keras, a Theano-based deep learning framework: learning notes 12 - core layers
Keras, a Theano-based deep learning framework: learning notes 13 - convolutional layers
1. The main function that builds the convolutional neural network

def create_model(data):
    model = Sequential()
    model.add(Convolution2D(64, 5, 5, border_mode='valid', input_shape=data.shape[-3:]))
    model.add(Activation('relu'))
    model.add(Dropout(0.5))
    model.add(Convolution2D(64, 5, 5, border_mode='valid'))
    model.add(Activation('relu'))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Dropout(0.5))
    model.add(Convolution2D(32, 3, 3, border_mode='valid'))
    model.add(Activation('relu'))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Convolution2D(32, 3, 3, border_mode='valid'))
    model.add(Activation('relu'))
    model.add(Dropout(0.5))
    model.add(Flatten())
    model.add(Dense(512, init='normal'))
    model.add(Activation('relu'))
    model.add(Dropout(0.5))
    model.add(Dense(LABELTYPE, init='normal'))
    model.add(Activation('softmax'))
    sgd = SGD(l2=0.0, lr=0.01, decay=1e-6, momentum=0.9, nesterov=True)
    model.compile(loss='categorical_crossentropy', optimizer=sgd, class_mode="categorical")
    return model
This function is quite concise and clear: it takes the training data as input and returns an empty (untrained) neural network, which is the initialization of the convolutional network. model = Sequential() starts the network, and every following model.add() stacks on another layer, like building blocks: whatever you want, you add. A convolutional neural network has two characteristic layer types: 1) convolution and 2) downsampling, which correspond to this code:
model.add(Convolution2D(64, 5, 5, border_mode='valid'))
# add a convolutional layer: 64 kernels, each of size 5*5
model.add(MaxPooling2D(pool_size=(2, 2)))
# add a downsampling (max pooling) layer with a 2*2 window
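As a quick sanity check on how these two kinds of layers shrink the image, here is a rough sketch; the 24*24 input size is the one my patches use (see section 1.3), and with a 'valid' border mode the output side length is input - kernel + 1, while 2*2 max pooling halves it:

size = 24            # assumed input side length (see section 1.3)
size = size - 5 + 1  # Convolution2D(64, 5, 5): 24 -> 20
size = size - 5 + 1  # Convolution2D(64, 5, 5): 20 -> 16
size = size // 2     # MaxPooling2D((2, 2)):    16 -> 8
size = size - 3 + 1  # Convolution2D(32, 3, 3): 8  -> 6
size = size // 2     # MaxPooling2D((2, 2)):    6  -> 3
size = size - 3 + 1  # Convolution2D(32, 3, 3): 3  -> 1
print(size)          # 1, so Flatten() sees 32*1*1 = 32 values per sample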
1.1 Activation function
Note: an activation function is added after each convolutional layer; this is the textbook part.
It keeps the result of the convolution within a certain range of values, such as 0~1 or -1~1, so that the values coming out of each convolution do not differ too wildly.
The corresponding code is this line:
model.add(Activation('relu'))
Keras provides many choices for the activation function (Activation). I use ReLU here; other options include:
tanh
sigmoid
hard_sigmoid
linear
and so on. The Keras library keeps being updated, and the better activation functions used in newer papers get added as well, for example:
LeakyReLU
PReLU
ELU
Any of these can be dropped in as a replacement, nominally to "optimize the network"; in practice you really just change the name, haha, the internals have already been written for you. Note: the activation function of the last layer of a convolutional neural network is generally softmax. Let me say a little more about how to choose among these activation functions.
Refer to this article for the details. In brief, the drawbacks are:
sigmoid: causes vanishing gradients and is not zero-centered
tanh: causes vanishing gradients
ReLU: the gradient dies when x < 0
I know Leaky ReLU is available off the shelf, but I have not used it yet; for now I still use ReLU. Don't ask me why :)
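If I did want to try one of the fancier activations, the swap would look roughly like this. This is only a sketch against the old Keras API used in this post, and alpha=0.3 is just an illustrative value, not something I tuned:

from keras.layers.advanced_activations import LeakyReLU
model.add(Convolution2D(64, 5, 5, border_mode='valid', input_shape=data.shape[-3:]))
model.add(LeakyReLU(alpha=0.3))   # replaces model.add(Activation('relu'))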
1.2 Dropout layer
Dropout: for the "over-fitting" problem.
Don't let every neuron fire.
When the human brain processes a signal, not all neurons fire, because 1) the brain's energy supply cannot keep up, 2) neurons are specialized, with specific neurons handling specific signals, and 3) activating all neurons increases the reaction time. So a simulated neural network also has to make trade-offs, for example suppressing part of the neurons, which improves the speed and robustness of the network and reduces the chance of "over-fitting". Enough rambling; anyway, it is a good thing! In code it is this line:
model.add(Dropout(0.5))
The 0.5 can be changed; it means 50% of the neurons in that layer are dropped, i.e. their outputs are thrown away at random during each training pass.
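To make the mechanism concrete, here is a tiny NumPy sketch of what dropout does to one layer's output during a training pass; this is my own illustration, not Keras's internal implementation:

import numpy as np
activations = np.random.rand(8)    # pretend these are the outputs of 8 neurons
keep = np.random.rand(8) > 0.5     # each neuron survives with probability 0.5
print(activations * keep)          # the dropped neurons contribute 0 on this pass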
1.3 A few remaining details
That covers the network-initialization function except for a few small things:
model.add(Convolution2D(64, 5, 5, border_mode='valid', input_shape=data.shape[-3:]))
You will notice that the first convolution line is longer than the others, because it also has to declare the shape of the training data, namely input_shape=data.shape[-3:], which tells the network how many channels each sample has and the size of each input image. In my case,
data.shape[-3:] means six channels, each 24*24 pixels.
The concept of a channel: a black-and-white image has one channel (the gray value); a color image has three channels (RGB); of course the channels do not have to be colors, such as the six channels I use. I am not entirely clear on how the channels are handled internally, though; maybe the mechanism was designed with RGB in mind. Leaving a question mark here.
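As a sketch of the shapes in my setup (the sample count 1000 is made up; only the (channels, rows, cols) part matters to Keras):

import numpy as np
data = np.empty((1000, 6, 24, 24), dtype='float32')  # (samples, channels, rows, cols)
print(data.shape[-3:])                                # (6, 24, 24), which is what input_shape receives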
model.add(Flatten())
model.add(Dense(512, init='normal'))
These two lines add a fully connected layer, which is the classic dense part of a convolutional neural network.
512 means this layer has 512 neurons.
There is not much else to say: it is just another part of the model, there can be several such layers, and they generally sit near the back of the network.
sgd = SGD(l2=0.0, lr=0.01, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss='categorical_crossentropy', optimizer=sgd, class_mode="categorical")
This part is the legendary "gradient descent" method. It is used in the feedback (backpropagation) phase of the network to keep adjusting the parameters of every convolutional layer, the so-called "learning" process. I use the most common choice, SGD; its parameters include the learning rate (lr). The other parameters could in principle be changed too, but I did not touch them, heh.
Tip: the learning rate is generally small; I use 0.01, and the right value depends on the training data. If it is too small, training is very slow; if it is too large, training easily blows up.
Picture a circle as the current position and a five-pointed star as the target position: if the learning rate is too large, it is easy to jump straight past the target, and training fails.
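A toy version of that picture, minimizing f(x) = x^2 by plain gradient descent (my own illustration, nothing Keras-specific): with a small step the position creeps toward the minimum at 0, while with a step that is too large every update jumps past the target and the position blows up.

def descend(lr, x=5.0, steps=5):
    for _ in range(steps):
        x = x - lr * 2 * x   # the gradient of x^2 is 2x
    return x

print(descend(0.01))  # stays near 5, slowly heading toward 0
print(descend(1.5))   # overshoots every step: 5 -> -10 -> 20 -> -40 -> ...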
I have not tried the other optimizers Keras provides (Optimizer), so I do not know their respective strengths and weaknesses. Here are a few of the alternatives:
RMSprop
Adagrad
Adadelta
Adam
Adamax
and so on. I would guess each method corresponds to a deep learning paper; Keras already provides the code, and if you want the details you can trace back to the papers. While I am at it, a word on the cost function: for the "slow learning" and "over-fitting" problems there are ways of modifying the cost function. I understand the idea, but I am still working out where inside Keras to make the change. The idea, first: a better cost function helps with both of these problems.
I am looking for a chance to change that part of the Keras code.
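For reference, the standard textbook recipe for both problems is the cross-entropy cost (against slow learning) plus an L2 weight penalty (against over-fitting); this is general material, not the specific change I have made. With n the number of samples, y_j the true labels, a_j the network outputs, w the weights and \lambda the regularization strength:

C = -\frac{1}{n} \sum_x \sum_j y_j \ln a_j + \frac{\lambda}{2n} \sum_w w^2

The compile call above already uses the cross-entropy part via loss='categorical_crossentropy', and the l2 argument in the SGD line looks like where such a weight penalty would be set, although I have not dug into that part of the Keras source.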
Main code section, The End
2. Code before training
There are a few steps to take care of before training starts:
Import the required Python packages
Import the data
Split the data into training and test sets
2.1 Importing the required Python packages
#coding:utf-8
'''
GPU run command:
THEANO_FLAGS=mode=FAST_RUN,device=gpu,floatX=float32 python cnn.py
CPU run command:
python cnn.py
'''
###########################################################
# Import the module components used
###########################################################
# ConvNets modules
from __future__ import absolute_import
from __future__ import print_function
from keras.models import Sequential
from keras.layers.core import Dense, Dropout, Activation, Flatten
from keras.layers.advanced_activations import PReLU, LeakyReLU
import keras.layers.advanced_activations as adact
from keras.layers.convolutional import Convolution2D, MaxPooling2D
from keras.optimizers import SGD, Adadelta, Adagrad, Adam, Adamax
from keras.utils import np_utils, generic_utils
from six.moves import range
from keras.callbacks import EarlyStopping
# Statistics module
from collections import Counter
import random, cPickle
from cutslice3d import load_data
from cutslice3d import ROW, COL, LABELTYPE, CHANNEL
# Memory tuning module
import sys
There is not much to explain here; it is the equivalent of #include in C: whatever you use later, you import first. To find out which package a given function lives in, go to the directory where Keras is installed. My install path is
C:\Users\Administrator\Anaconda2\Lib\site-packages\keras
where you will see the .py files one by one; the keras directory looks like this.
For example, if you need the Sequential() function, you first have to know that it is defined in models.py under keras, and then this line writes itself:
from keras.models import Sequential
# import Sequential from keras's models.py
You see, the code really is just a literal translation of that sentence. The hard part is not knowing where Sequential() is defined in the first place; that takes a careful read of the Keras documentation, and there are far too many functions to cover here.
For this part I do feel some Python knowledge is needed, because quite apart from Keras, many of the standard packages are very handy and save a lot of work. An example:
from collections import Counter
Its job is to count how many times each distinct element appears in an array; the code below does exactly that.
cnt = Counter(A)
for k, v in cnt.iteritems():
    print('\t', k, '-->', v)
# prints the number of occurrences of each element of A
2.2 Simple processing module for data
###########################################################
# Description of this test
###########################################################
print("\nHey you, this is a trial on malignance and benign tumors detection via ConvNets. I'm Zongwei Zhou. :)")
print("Each input patch is 51*51, cut from 1383 3d CT & PT images. The MINIMUM is above 30 segment pixels.")
###########################################################
# Load the data
###########################################################
print(">> Loading Data ...")
TrData, TrLabel, VaData, VaLabel = load_data()
###########################################################
# Shuffle the data
###########################################################
index = [i for i in range(len(TrLabel))]
random.shuffle(index)
TrData = TrData[index]
TrLabel = TrLabel[index]
print('\nTherefore, read in', TrData.shape[0], 'samples from the dataset totally.')
# the labels are 0~1, two classes in total; Keras wants the binary class matrices format, so convert them with the function Keras provides
TrLabel = np_utils.to_categorical(TrLabel, LABELTYPE)
Here I call load_data(), a function I wrote myself that does the data import: it reads the training set and the test set from .mat files. The patch translation, the rotation augmentation, and the training/test split are all done in MATLAB, and the amount of data that produces is huge; as of April 7 my training set had grown to 31.4 GB. The Python-side function is fairly direct, and it looks like this:
# load_data() lives in cutslice3d.py; it needs h5py and numpy there,
# and DATAPATH_*, Training_*, Validation_* below are constants (paths and .mat keys) defined elsewhere in that file
import h5py
import numpy as np

def load_data():
    ###########################################################
    # Read data from the .mat files
    ###########################################################
    mat_training = h5py.File(DATAPATH_Training)
    mat_training.keys()
    Training_CT_x = mat_training[Training_CT_1]
    Training_CT_y = mat_training[Training_CT_2]
    Training_CT_z = mat_training[Training_CT_3]
    Training_PT_x = mat_training[Training_PT_1]
    Training_PT_y = mat_training[Training_PT_2]
    Training_PT_z = mat_training[Training_PT_3]
    TrLabel = mat_training[Training_label]
    TrLabel = np.transpose(TrLabel)
    Training_Dataset = len(TrLabel)

    mat_validation = h5py.File(DATAPATH_Validation)
    mat_validation.keys()
    Validation_CT_x = mat_validation[Validation_CT_1]
    Validation_CT_y = mat_validation[Validation_CT_2]
    Validation_CT_z = mat_validation[Validation_CT_3]
    Validation_PT_x = mat_validation[Validation_PT_1]
    Validation_PT_y = mat_validation[Validation_PT_2]
    Validation_PT_z = mat_validation[Validation_PT_3]
    VaLabel = mat_validation[Validation_label]
    VaLabel = np.transpose(VaLabel)
    Validation_Dataset = len(VaLabel)

    ###########################################################
    # Initialization
    ###########################################################
    TrData = np.empty((Training_Dataset, CHANNEL, ROW, COL), dtype="float32")
    VaData = np.empty((Validation_Dataset, CHANNEL, ROW, COL), dtype="float32")

    ###########################################################
    # Crop images, fill in the channels
    ###########################################################
    for i in range(Training_Dataset):
        TrData[i,0,:,:] = Training_CT_x[:,:,i]
        TrData[i,1,:,:] = Training_CT_y[:,:,i]
        TrData[i,2,:,:] = Training_CT_z[:,:,i]
        TrData[i,3,:,:] = Training_PT_x[:,:,i]
        TrData[i,4,:,:] = Training_PT_y[:,:,i]
        TrData[i,5,:,:] = Training_PT_z[:,:,i]
    for i in range(Validation_Dataset):
        VaData[i,0,:,:] = Validation_CT_x[:,:,i]
        VaData[i,1,:,:] = Validation_CT_y[:,:,i]
        VaData[i,2,:,:] = Validation_CT_z[:,:,i]
        VaData[i,3,:,:] = Validation_PT_x[:,:,i]
        VaData[i,4,:,:] = Validation_PT_y[:,:,i]
        VaData[i,5,:,:] = Validation_PT_z[:,:,i]

    print '\nThe dimension of each data and label, listed as following:'
    print '\tTrData  : ', TrData.shape
    print '\tTrLabel : ', TrLabel.shape
    print '\tRange   : ', np.amin(TrData[:,0,:,:]), '~', np.amax(TrData[:,0,:,:])
    print '\t          ', np.amin(TrData[:,1,:,:]), '~', np.amax(TrData[:,1,:,:])
    print '\t          ', np.amin(TrData[:,2,:,:]), '~', np.amax(TrData[:,2,:,:])
    print '\t          ', np.amin(TrData[:,3,:,:]), '~', np.amax(TrData[:,3,:,:])
    print '\t          ', np.amin(TrData[:,4,:,:]), '~', np.amax(TrData[:,4,:,:])
    print '\t          ', np.amin(TrData[:,5,:,:]), '~', np.amax(TrData[:,5,:,:])
    print '\tVaData  : ', VaData.shape
    print '\tVaLabel : ', VaLabel.shape
    print '\tRange   : ', np.amin(VaData[:,0,:,:]), '~', np.amax(VaData[:,0,:,:])
    print '\t          ', np.amin(VaData[:,1,:,:]), '~', np.amax(VaData[:,1,:,:])
    print '\t          ', np.amin(VaData[:,2,:,:]), '~', np.amax(VaData[:,2,:,:])
    print '\t          ', np.amin(VaData[:,3,:,:]), '~', np.amax(VaData[:,3,:,:])
    print '\t          ', np.amin(VaData[:,4,:,:]), '~', np.amax(VaData[:,4,:,:])
    print '\t          ', np.amin(VaData[:,5,:,:]), '~', np.amax(VaData[:,5,:,:])

    return TrData, TrLabel, VaData, VaLabel
It reads the data stored in the .mat files and returns it already split into the training set (TrData, TrLabel) and the test set (VaData, VaLabel); it is fairly simple, so I will not expand on it here. I will introduce the MATLAB-side data augmentation later; for now, note that data augmentation is also aimed at the "over-fitting" problem.
Note: my labels are 0~1, two classes in total. Keras requires the binary class matrices format, so they have to be converted, which is done by calling the Keras-provided np_utils.to_categorical().
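A quick sketch of what that conversion does, assuming LABELTYPE is 2 as in my case:

from keras.utils import np_utils
labels = [0, 1, 1, 0]
print(np_utils.to_categorical(labels, 2))
# one row per sample: [[1, 0], [0, 1], [0, 1], [1, 0]]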
3. Training and post-training code
The hard bones are all chewed through; from here on it is easy and only takes a few lines.
Print("》》Build Model ..")
Model = create_model(TrData)
###########################################################
#è®ç»ƒConvNets模型
###########################################################
Print("》" Training ConvNets Model ..")
Print("\Here, batch_size =", BATCH_SIZE, ", epoch =", EPOCH, ", lr =", LR, ", momentum =", MOMENTUM)
Early_stopping = EarlyStopping(monitor='val_loss', patience=2)
Hist = model.fit(TrData, TrLabel, \\
Batch_size=BATCH_SIZE, \\
Nb_epoch=EPOCH, \\
Shuffle=True, \\
Verbose=1, \\
Show_accuracy=True, \\
Validation_split=VALIDATION_SPLIT, \\
Callbacks=[early_stopping])
###########################################################
# Testing ConvNets Model
###########################################################
Print("》" Test the model...")
Pre_temp=model.predict_classes(VaData)
3.1 Training model
First call create_model() from section 1 to build the initialized model. After that, the heart of the training is a single statement:
hist = model.fit(TrData, TrLabel,
                 batch_size=100,
                 nb_epoch=10,
                 shuffle=True,
                 verbose=1,
                 show_accuracy=True,
                 validation_split=0.2,
                 callbacks=[early_stopping])
:) Yes, just one statement, but it packs quite a lot in. I will simply list the items I care about:
TrData: the training data
TrLabel: the training data labels
batch_size: the number of training samples used for each gradient descent parameter update
nb_epoch: the number of training iterations (epochs)
shuffle: when shuffle=True the training data are reshuffled every epoch (shuffling is the default); the validation data are not shuffled by default
validation_split: the fraction held out for validation; I chose 0.2 here. Note that this is not the test set from section 2.2: this validation set is carved out of the training data, so a sample in it may well end up in the training portion on the next run. The set from section 2.2 is the global test set, used for the final test of the trained network.
early_stopping: whether to stop training early. The network decides by itself: when this epoch's result is nearly the same as the last one, the training iterations stop automatically, so it does not necessarily run all nb_epoch (10) of them.
The early_stopping object used above is created here:
early_stopping = EarlyStopping(monitor='val_loss', patience=2)
The remaining arguments relate to what is displayed during training; I kept my own values or the defaults :)
One more aside: if you want to see the result of each epoch, you can! The hist in hist = model.fit() stores the result of each epoch of training and the validation accuracy.
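If the fit() call returns a History object with a history dict (that is how later Keras versions do it; the old version used here may name things differently, so check keras/callbacks.py if it complains), something like this prints the per-epoch curves:

print(hist.history.keys())        # e.g. 'loss', 'acc', 'val_loss', 'val_acc'
print(hist.history['val_loss'])   # validation loss after each epoch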
If you want to see the output of each layer, that is also possible!
This can be used to combine the convolutional network with other traditional classifiers, an experiment that improves on the plain softmax approach and involves more advanced methods, which I will talk about later. Here I only give the code that reads out one layer's output:
import theano
# origin_model is the trained model; the index 12 picks whichever layer's output you want
get_feature = theano.function([origin_model.layers[0].input],
                              origin_model.layers[12].get_output(train=False),
                              allow_input_downcast=False)
feature = get_feature(data)
OK, and here are the Python functions for SVM and Random Forests; if you want to run that experiment, you can use them:
###########################################################
# SVM
###########################################################
# needs: from sklearn.svm import SVC
def svc(traindata, trainlabel, testdata, testlabel):
    print("Start training SVM .....")
    svcClf = SVC(C=1.0, kernel="rbf", cache_size=3000)
    svcClf.fit(traindata, trainlabel)
    pred_testlabel = svcClf.predict(testdata)
    num = len(pred_testlabel)
    accuracy = len([1 for i in range(num) if testlabel[i] == pred_testlabel[i]]) / float(num)
    print(">> cnn-svm Accuracy")
    Prt(testlabel, pred_testlabel)   # Prt() is a separate helper (not shown) that prints the accuracy report

###########################################################
# Random Forests
###########################################################
# needs: from sklearn.ensemble import RandomForestClassifier
def rf(traindata, trainlabel, testdata, testlabel):
    print("Start training Random Forest ....")
    rfClf = RandomForestClassifier(n_estimators=100, criterion='gini')
    rfClf.fit(traindata, trainlabel)
    pred_testlabel = rfClf.predict(testdata)
    print(">> cnn-rf Accuracy")
    Prt(testlabel, pred_testlabel)
OK, stopping here.
3.2 Test model
Heh, this is the easiest part of all, again just one line:
pre_temp = model.predict_classes(VaData)
Apply the ready-made predict_classes() to the test set VaData and it returns pre_temp, the trained network's predictions. Finally, compare pre_temp against the correct test labels VaLabel and you know how well the network has trained. Victory for the experimental stage! Here is a screenshot:
Everybody Happy
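In code, the comparison can be as short as the sketch below; it assumes VaLabel still holds plain class indices (0 or 1), which is how load_data() returns it, since I never ran it through to_categorical:

import numpy as np
pred = np.asarray(pre_temp).flatten()
truth = np.asarray(VaLabel).flatten()
print('Validation accuracy:', np.mean(pred == truth))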
3.3 Saving the model
Training a model is not easy: you have to tune parameters and even adjust the network structure, and training takes a long time, so you should learn to save a trained network. The code looks like this:
###########################################################
# Save the ConvNets model
###########################################################
model.save_weights('MyConvNets.h5')
cPickle.dump(model, open('./MyConvNets.pkl', "wb"))
json_string = model.to_json()
open(W_MODEL, 'w').write(json_string)
Just save, and these are the three files you get.
Saved model files
When you come back later and want to use this network again, read it back in with this code:
model = cPickle.load(open('MyConvNets.pkl', "rb"))
This reads the model stored in the .pkl file back into model.
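If you prefer to rebuild from the JSON architecture plus the .h5 weights instead of the pickle, the reload looks roughly like this (a sketch: it assumes model_from_json is available in this Keras version and that W_MODEL is the JSON path used when saving above):

from keras.models import model_from_json
model = model_from_json(open(W_MODEL).read())
model.load_weights('MyConvNets.h5')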