TensorFlow UST, Day 2: CNN
Sung Kim

Gifts from Google and Line

TensorFlow Mechanics
1. Build the graph using TensorFlow operations
2. Feed data and run the graph (operation): sess.run(op, feed_dict={x: x_data})
3. Update variables in the graph (and return values)

Machine Learning Basics
- Linear Regression
- Logistic Regression (binary classification)
- Softmax Classification
- Neural Networks

Linear Regression

W = tf.Variable(tf.random_normal([1]), name='weight')
b = tf.Variable(tf.random_normal([1]), name='bias')

# Our hypothesis: XW + b
hypothesis = X * W + b

# cost/loss function
cost = tf.reduce_mean(tf.square(hypothesis - y_train))

# Minimize
train = tf.train.GradientDescentOptimizer(learning_rate=0.01).minimize(cost)
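Assembled into a complete program, this is a minimal end-to-end sketch (TF 1.x; the toy data x_train = y_train = [1, 2, 3] is made up, and the data is baked in rather than fed through placeholders):

import tensorflow as tf

# made-up toy data: the model should learn W ~ 1, b ~ 0
x_train = [1, 2, 3]
y_train = [1, 2, 3]

W = tf.Variable(tf.random_normal([1]), name='weight')
b = tf.Variable(tf.random_normal([1]), name='bias')
hypothesis = x_train * W + b
cost = tf.reduce_mean(tf.square(hypothesis - y_train))
train = tf.train.GradientDescentOptimizer(learning_rate=0.01).minimize(cost)

sess = tf.Session()
sess.run(tf.global_variables_initializer())
for step in range(2001):
    sess.run(train)
    if step % 400 == 0:
        print(step, sess.run(cost), sess.run(W), sess.run(b))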
Logistic Regression

Model? (Hypothesis?) Cost? Gradient descent?

hypothesis = tf.sigmoid(tf.matmul(X, W) + b)

# cost/loss function
cost = -tf.reduce_mean(Y * tf.log(hypothesis) + (1 - Y) * tf.log(1 - hypothesis))

train = tf.train.GradientDescentOptimizer(learning_rate=0.01).minimize(cost)
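The manual cross-entropy above can produce NaN once hypothesis saturates at exactly 0 or 1. A numerically safer sketch of the same cost using the built-in op (logits is just the pre-sigmoid value):

logits = tf.matmul(X, W) + b
hypothesis = tf.sigmoid(logits)
# equivalent to the manual formula, computed in a stable way
cost = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(logits=logits, labels=Y))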
Softmax Classifier

Model? (Hypothesis?) Cost? Gradient descent?

logits = tf.matmul(X, W) + b
hypothesis = tf.nn.softmax(logits)

# Cross entropy cost/loss
cost = tf.reduce_mean(-tf.reduce_sum(Y * tf.log(hypothesis), axis=1))

# or, equivalently, using the built-in op
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=Y))

train = tf.train.GradientDescentOptimizer(learning_rate=0.01).minimize(cost)
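The two cost lines are interchangeable. A quick sketch checking that on made-up values (one sample, three classes, one-hot label):

import numpy as np
import tensorflow as tf

logits = tf.constant(np.array([[2.0, 1.0, 0.1]], dtype=np.float32))
Y = tf.constant(np.array([[1.0, 0.0, 0.0]], dtype=np.float32))

hypothesis = tf.nn.softmax(logits)
cost_manual = tf.reduce_mean(-tf.reduce_sum(Y * tf.log(hypothesis), axis=1))
cost_builtin = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=Y))

with tf.Session() as sess:
    print(sess.run([cost_manual, cost_builtin]))  # both ~0.417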
(Deep) Neural Nets

# input placeholders
X = tf.placeholder(tf.float32, [None, 784])
Y = tf.placeholder(tf.float32, [None, 10])

# weights & biases for nn layers
W1 = tf.Variable(tf.random_normal([784, 256]))
b1 = tf.Variable(tf.random_normal([256]))
L1 = tf.nn.relu(tf.matmul(X, W1) + b1)

W2 = tf.Variable(tf.random_normal([256, 256]))
b2 = tf.Variable(tf.random_normal([256]))
L2 = tf.nn.relu(tf.matmul(L1, W2) + b2)

W3 = tf.Variable(tf.random_normal([256, 10]))
b3 = tf.Variable(tf.random_normal([10]))
hypothesis = tf.matmul(L2, W3) + b3

# define cost/loss & optimizer
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=hypothesis, labels=Y))
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost)
NN tips
- Initializing weights
- Activation functions
- Regularization
- Optimizers

Initializing weights: random?

W = tf.Variable(tf.random_normal([1]), name='weight')
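Plain tf.random_normal works, but the deck's later layers switch to Xavier initialization through tf.get_variable, for example:

# Xavier initialization keeps the scale of activations roughly constant across layers
W1 = tf.get_variable("W1", shape=[784, 256],
                     initializer=tf.contrib.layers.xavier_initializer())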
Activation functions
- tf.nn.relu
- tf.tanh
- tf.sigmoid
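A quick sketch evaluating the three activation functions on the same made-up inputs:

import tensorflow as tf

x = tf.constant([-2.0, 0.0, 2.0])
with tf.Session() as sess:
    print(sess.run(tf.nn.relu(x)))   # [0.    0.    2.   ]
    print(sess.run(tf.tanh(x)))      # [-0.964  0.     0.964]
    print(sess.run(tf.sigmoid(x)))   # [0.119  0.5    0.881]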
Overfitting

Am I overfitting?
- Very high accuracy on the training dataset (e.g., 0.99)
- Poor accuracy on the test dataset (e.g., 0.85)

Solutions for overfitting
- More training data!
- Reduce the number of features
- Regularization

Regularization: let's not have too big numbers in the weights. Add a penalty term to the cost, e.g. cost + λ·ΣW².
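A minimal sketch of adding such an L2 penalty, assuming the squared-error cost and weight W from the linear-regression slide; the λ = 0.001 here is hand-picked:

# tf.nn.l2_loss(W) computes sum(W ** 2) / 2
l2reg = 0.001 * tf.nn.l2_loss(W)
cost = tf.reduce_mean(tf.square(hypothesis - Y)) + l2reg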
Dropout: A Simple Way to Prevent Neural Networks from Overfitting (Srivastava et al., 2014)

TensorFlow implementation:

keep_prob = tf.placeholder(tf.float32)
L1 = ...
L1_d = tf.nn.dropout(L1, keep_prob=keep_prob)

# TRAIN: keep each unit with probability 0.7
sess.run(optimizer, feed_dict={X: batch_xs, Y: batch_ys, keep_prob: 0.7})

# EVALUATION: keep everything
print("Accuracy:", accuracy.eval(feed_dict={X: mnist.test.images, Y: mnist.test.labels, keep_prob: 1}))

Dropout for MNIST
# dropout (keep_prob) rate 0.7 on training, but should be 1 for testing
# (assumes X, Y, hypothesis, cost, optimizer, sess defined as in the earlier
#  slides, and mnist loaded via tensorflow.examples.tutorials.mnist.input_data)
keep_prob = tf.placeholder(tf.float32)

W1 = tf.get_variable("W1", shape=[784, 512])
b1 = tf.Variable(tf.random_normal([512]))
L1 = tf.nn.relu(tf.matmul(X, W1) + b1)
L1 = tf.nn.dropout(L1, keep_prob=keep_prob)

W2 = tf.get_variable("W2", shape=[512, 512])
b2 = tf.Variable(tf.random_normal([512]))
L2 = tf.nn.relu(tf.matmul(L1, W2) + b2)
L2 = tf.nn.dropout(L2, keep_prob=keep_prob)

# train my model
for epoch in range(training_epochs):
    ...
    for i in range(total_batch):
        batch_xs, batch_ys = mnist.train.next_batch(batch_size)
        feed_dict = {X: batch_xs, Y: batch_ys, keep_prob: 0.7}
        c, _ = sess.run([cost, optimizer], feed_dict=feed_dict)
        avg_cost += c / total_batch

# Test model and check accuracy
correct_prediction = tf.equal(tf.argmax(hypothesis, 1), tf.argmax(Y, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
print('Accuracy:', sess.run(accuracy, feed_dict={X: mnist.test.images, Y: mnist.test.labels, keep_prob: 1}))
Optimizers

train = tf.train.GradientDescentOptimizer(learning_rate=0.1).minimize(cost)

https://www.tensorflow.org/api_guides/python/train

- tf.train.AdadeltaOptimizer
- tf.train.AdagradOptimizer
- tf.train.AdagradDAOptimizer
- tf.train.MomentumOptimizer
- tf.train.AdamOptimizer
- tf.train.FtrlOptimizer
- tf.train.ProximalGradientDescentOptimizer
- tf.train.ProximalAdagradOptimizer
- tf.train.RMSPropOptimizer

ADAM: A Method for Stochastic Optimization (Kingma et al., 2015)

Use Adam Optimizer:

# define cost/loss & optimizer
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=hypothesis, labels=Y))
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost)
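Every optimizer above is a drop-in replacement for GradientDescentOptimizer. A toy sketch minimizing x² (made-up start value and learning rate):

import tensorflow as tf

x = tf.Variable(5.0)
cost = tf.square(x)

# swap one line to change the optimizer:
# train = tf.train.GradientDescentOptimizer(learning_rate=0.1).minimize(cost)
train = tf.train.AdamOptimizer(learning_rate=0.1).minimize(cost)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for step in range(100):
        sess.run(train)
    print(sess.run(x))  # close to 0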
TensorFlow UST, Day 2: CNN
Sung Kim

http://cs231n.stanford.edu/
[CS231n figure slides: convolution, pooling, ConvNet architectures]

Convolutional Neural Networks for Sentence Classification (Yoon Kim, 2014)

CNN Basics
Sung Kim, Code: https:/ (TF 1.0!)

CNN
http://parse.ele.tue.nl/cluster/2/CNNArchitecture.jpg

CNN for CT images
Asan Medical Center & Microsoft Medical Bigdata Contest winner, by GeunYoung Lee and Alex Kim

Convolution layer and max pooling

Simple convolution layer
3x3x1 image, 2x2x1 filter, stride 1x1

Toy image:

1 2 3
4 5 6
7 8 9

Simple convolution layer
Image: (1, 3, 3, 1), Filter: (2, 2, 1, 1), Stride: 1x1, Padding: VALID

Filter of all ones, shape (2, 2, 1, 1):

1. 1.
1. 1.

Sliding the 2x2 filter over the 3x3 image with stride 1 and VALID padding gives a 2x2 output:

12 16
24 28

Simple convolution layer
Image: (1, 3, 3, 1), Filter: (2, 2, 1, 1), Stride: 1x1, Padding: SAME

With SAME padding the image is zero-padded (here on the right and bottom) so the output keeps the 3x3 input size:

12 16  9
24 28 15
15 17  9
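The same toy convolution in runnable form (TF 1.x plus NumPy):

import numpy as np
import tensorflow as tf

# 3x3x1 image with values 1..9, 2x2x1 filter of ones, stride 1x1
image = tf.constant(np.arange(1, 10, dtype=np.float32).reshape(1, 3, 3, 1))
weight = tf.constant(np.ones((2, 2, 1, 1), dtype=np.float32))

conv_valid = tf.nn.conv2d(image, weight, strides=[1, 1, 1, 1], padding='VALID')
conv_same = tf.nn.conv2d(image, weight, strides=[1, 1, 1, 1], padding='SAME')

with tf.Session() as sess:
    print(sess.run(conv_valid).reshape(2, 2))  # [[12. 16.] [24. 28.]]
    print(sess.run(conv_same).reshape(3, 3))   # [[12. 16.  9.] [24. 28. 15.] [15. 17.  9.]]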
Max Pooling

MNIST image loading
MNIST Convolution layer
MNIST Max pooling
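A max-pooling sketch in the same toy style; the 2x2 feature map values are made up:

import numpy as np
import tensorflow as tf

# toy 2x2x1 feature map
image = tf.constant(np.array([[[[4.], [3.]],
                               [[2.], [1.]]]], dtype=np.float32))
# 2x2 max pooling, stride 1, SAME padding keeps the 2x2 size
pool = tf.nn.max_pool(image, ksize=[1, 2, 2, 1], strides=[1, 1, 1, 1], padding='SAME')

with tf.Session() as sess:
    print(sess.run(pool).reshape(2, 2))  # [[4. 3.] [2. 1.]]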
CNN MNIST 99%
Sung Kim, Code: https:/ (TF 1.0!)

CNN
http://parse.ele.tue.nl/cluster/2/CNNArchitecture.jpg

Simple CNN

Conv layer 1:

# input placeholders
X = tf.placeholder(tf.float32, [None, 784])
X_img = tf.reshape(X, [-1, 28, 28, 1])  # img 28x28x1 (black/white)
Y = tf.placeholder(tf.float32, [None, 10])

# L1 ImgIn shape=(?, 28, 28, 1)
W1 = tf.Variable(tf.random_normal([3, 3, 1, 32], stddev=0.01))
#    Conv -> (?, 28, 28, 32)
#    Pool -> (?, 14, 14, 32)
L1 = tf.nn.conv2d(X_img, W1, strides=[1, 1, 1, 1], padding='SAME')
L1 = tf.nn.relu(L1)
L1 = tf.nn.max_pool(L1, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')

Tensor("Conv2D:0", shape=(?, 28, 28, 32), dtype=float32)
Tensor("Relu:0", shape=(?, 28, 28, 32), dtype=float32)
Tensor("MaxPool:0", shape=(?, 14, 14, 32), dtype=float32)
Conv layer 2:

# L2 ImgIn shape=(?, 14, 14, 32)
W2 = tf.Variable(tf.random_normal([3, 3, 32, 64], stddev=0.01))
#    Conv    -> (?, 14, 14, 64)
#    Pool    -> (?, 7, 7, 64)
L2 = tf.nn.conv2d(L1, W2, strides=[1, 1, 1, 1], padding='SAME')
L2 = tf.nn.relu(L2)
L2 = tf.nn.max_pool(L2, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')
L2 = tf.reshape(L2, [-1, 7 * 7 * 64])

Tensor("Conv2D_1:0", shape=(?, 14, 14, 64), dtype=float32)
Tensor("Relu_1:0", shape=(?, 14, 14, 64), dtype=float32)
Tensor("MaxPool_1:0", shape=(?, 7, 7, 64), dtype=float32)
Tensor("Reshape_1:0", shape=(?, 3136), dtype=float32)
Fully Connected (FC, Dense) layer:

# Final FC 7x7x64 inputs -> 10 outputs
W3 = tf.get_variable("W3", shape=[7 * 7 * 64, 10],
                     initializer=tf.contrib.layers.xavier_initializer())
b = tf.Variable(tf.random_normal([10]))
hypothesis = tf.matmul(L2, W3) + b

# define cost/loss & optimizer
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=hypothesis, labels=Y))
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost)
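For comparison, the same stack can be written with the higher-level tf.layers API. This is a sketch assuming a recent TF 1.x, not the code above; the "Class, Layers, Ensemble" section later builds on this style:

conv1 = tf.layers.conv2d(inputs=X_img, filters=32, kernel_size=[3, 3],
                         padding='same', activation=tf.nn.relu)
pool1 = tf.layers.max_pooling2d(inputs=conv1, pool_size=[2, 2], strides=2)
conv2 = tf.layers.conv2d(inputs=pool1, filters=64, kernel_size=[3, 3],
                         padding='same', activation=tf.nn.relu)
pool2 = tf.layers.max_pooling2d(inputs=conv2, pool_size=[2, 2], strides=2)
flat = tf.reshape(pool2, [-1, 7 * 7 * 64])
logits = tf.layers.dense(inputs=flat, units=10)  # weights are created for you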
Training and Evaluation:

# initialize
# (assumes mnist loaded via tensorflow.examples.tutorials.mnist.input_data)
sess = tf.Session()
sess.run(tf.global_variables_initializer())

# train my model
print('Learning started. It takes some time.')
for epoch in range(training_epochs):
    avg_cost = 0
    total_batch = int(mnist.train.num_examples / batch_size)
    for i in range(total_batch):
        batch_xs, batch_ys = mnist.train.next_batch(batch_size)
        feed_dict = {X: batch_xs, Y: batch_ys}
        c, _ = sess.run([cost, optimizer], feed_dict=feed_dict)
        avg_cost += c / total_batch
    print('Epoch:', '%04d' % (epoch + 1), 'cost =', '{:.9f}'.format(avg_cost))
print('Learning Finished!')

# Test model and check accuracy
correct_prediction = tf.equal(tf.argmax(hypothesis, 1), tf.argmax(Y, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
print('Accuracy:', sess.run(accuracy, feed_dict={X: mnist.test.images, Y: mnist.test.labels}))

Output:

Epoch: 0001 cost = 0.340291267
Epoch: 0002 cost = 0.090731326
Epoch: 0003 cost = 0.064477619
Epoch: 0004 cost = 0.050683064
...
Epoch: 0011 cost = 0.017758641
Epoch: 0012 cost = 0.014156652
Epoch: 0013 cost = 0.012397016
Epoch: 0014 cost = 0.010693789
Epoch: 0015 cost = 0.009469977
Learning Finished!
Accuracy: 0.9885
Deep CNN
Image credit: http://personal.ie.cuhk.edu.hk/ccloy/project_target_code/index.html

The Simple CNN above, deepened to three conv layers and two FC layers, with dropout (keep_prob as before) after every layer:

# L1 ImgIn shape=(?, 28, 28, 1)
W1 = tf.Variable(tf.random_normal([3, 3, 1, 32], stddev=0.01))
#    Conv -> (?, 28, 28, 32)
#    Pool -> (?, 14, 14, 32)
L1 = tf.nn.conv2d(X_img, W1, strides=[1, 1, 1, 1], padding='SAME')
L1 = tf.nn.relu(L1)
L1 = tf.nn.max_pool(L1, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')
L1 = tf.nn.dropout(L1, keep_prob=keep_prob)

Tensor("Conv2D:0", shape=(?, 28, 28, 32), dtype=float32)
Tensor("Relu:0", shape=(?, 28, 28, 32), dtype=float32)
Tensor("MaxPool:0", shape=(?, 14, 14, 32), dtype=float32)
Tensor("dropout/mul:0", shape=(?, 14, 14, 32), dtype=float32)

# L2 ImgIn shape=(?, 14, 14, 32)
W2 = tf.Variable(tf.random_normal([3, 3, 32, 64], stddev=0.01))
#    Conv    -> (?, 14, 14, 64)
#    Pool    -> (?, 7, 7, 64)
L2 = tf.nn.conv2d(L1, W2, strides=[1, 1, 1, 1], padding='SAME')
L2 = tf.nn.relu(L2)
L2 = tf.nn.max_pool(L2, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')
L2 = tf.nn.dropout(L2, keep_prob=keep_prob)

Tensor("Conv2D_1:0", shape=(?, 14, 14, 64), dtype=float32)
Tensor("Relu_1:0", shape=(?, 14, 14, 64), dtype=float32)
Tensor("MaxPool_1:0", shape=(?, 7, 7, 64), dtype=float32)
Tensor("dropout_1/mul:0", shape=(?, 7, 7, 64), dtype=float32)

# L3 ImgIn shape=(?, 7, 7, 64)
W3 = tf.Variable(tf.random_normal([3, 3, 64, 128], stddev=0.01))
#    Conv    -> (?, 7, 7, 128)
#    Pool    -> (?, 4, 4, 128)
#    Reshape -> (?, 4 * 4 * 128)  # Flatten them for FC
L3 = tf.nn.conv2d(L2, W3, strides=[1, 1, 1, 1], padding='SAME')
L3 = tf.nn.relu(L3)
L3 = tf.nn.max_pool(L3, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')
L3 = tf.nn.dropout(L3, keep_prob=keep_prob)
L3 = tf.reshape(L3, [-1, 128 * 4 * 4])

Tensor("Conv2D_2:0", shape=(?, 7, 7, 128), dtype=float32)
Tensor("Relu_2:0", shape=(?, 7, 7, 128), dtype=float32)
Tensor("MaxPool_2:0", shape=(?, 4, 4, 128), dtype=float32)
Tensor("dropout_2/mul:0", shape=(?, 4, 4, 128), dtype=float32)
Tensor("Reshape_1:0", shape=(?, 2048), dtype=float32)

# L4 FC 4x4x128 inputs -> 625 outputs
W4 = tf.get_variable("W4", shape=[128 * 4 * 4, 625],
                     initializer=tf.contrib.layers.xavier_initializer())
b4 = tf.Variable(tf.random_normal([625]))
L4 = tf.nn.relu(tf.matmul(L3, W4) + b4)
L4 = tf.nn.dropout(L4, keep_prob=keep_prob)

Tensor("Relu_3:0", shape=(?, 625), dtype=float32)
Tensor("dropout_3/mul:0", shape=(?, 625), dtype=float32)

# L5 Final FC 625 inputs -> 10 outputs
W5 = tf.get_variable("W5", shape=[625, 10],
                     initializer=tf.contrib.layers.xavier_initializer())
b5 = tf.Variable(tf.random_normal([10]))
hypothesis = tf.matmul(L4, W5) + b5

Tensor("add_1:0", shape=(?, 10), dtype=float32)
# Test model and check accuracy
correct_prediction = tf.equal(tf.argmax(hypothesis, 1), tf.argmax(Y, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
print('Accuracy:', sess.run(accuracy, feed_dict={X: mnist.test.images, Y: mnist.test.labels, keep_prob: 1}))

Output:

Epoch: 0013 cost = 0.027188021
Epoch: 0014 cost = 0.023604777
Epoch: 0015 cost = 0.024607201
Learning Finished!
Accuracy: 0.9938
Class, Layers, Ensemble
Sung Kim, Code: https:/ (TF 1.0!)

CNN

# L1 ImgIn shape=(?, 28, 28, 1)
W1 = tf.Variable(tf.random_normal([3, 3, 1, 32], stddev=0.01))
#    Conv -> (?, 28, 28, 32)
#    Pool -> (?, 14, 14, 32)
L1 = tf.nn.conv2d(X_img, W1, strides=[1, 1, 1, 1], padding='SAME')
L1 = tf.nn.relu(L1)
L1 = tf.nn.max_pool(L1, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')
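A minimal sketch of the class-based pattern the section title refers to: wrap the graph in a Python class so several models (an ensemble) can be built side by side. The Model class and its method names here are illustrative, not from the slides; it assumes TF 1.x with tf.layers:

class Model:
    def __init__(self, sess, name):
        self.sess = sess
        self.name = name
        self._build_net()

    def _build_net(self):
        # one variable scope per instance keeps ensemble members' weights separate
        with tf.variable_scope(self.name):
            self.keep_prob = tf.placeholder(tf.float32)
            self.X = tf.placeholder(tf.float32, [None, 784])
            self.Y = tf.placeholder(tf.float32, [None, 10])
            X_img = tf.reshape(self.X, [-1, 28, 28, 1])
            conv1 = tf.layers.conv2d(X_img, 32, [3, 3], padding='same',
                                     activation=tf.nn.relu)
            pool1 = tf.layers.max_pooling2d(conv1, [2, 2], strides=2)
            flat = tf.reshape(pool1, [-1, 14 * 14 * 32])
            flat = tf.nn.dropout(flat, keep_prob=self.keep_prob)
            self.logits = tf.layers.dense(flat, 10)
        self.cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(
            logits=self.logits, labels=self.Y))
        self.optimizer = tf.train.AdamOptimizer(learning_rate=0.001).minimize(self.cost)

    def train(self, x_data, y_data, keep_prob=0.7):
        return self.sess.run([self.cost, self.optimizer], feed_dict={
            self.X: x_data, self.Y: y_data, self.keep_prob: keep_prob})

Usage would look like models = [Model(sess, 'm%d' % i) for i in range(3)], followed by the usual initializer and training loop over each model.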