Distilling InceptionV3 and Running It on Movidius

The code is still too rough to publish properly.
I plan to upload it to GitHub once it is cleaned up.

Distilling into a throwaway CNN

The throwaway CNN

Input: 299 x 299 x 3
Output: Caltech101 (101 classes)

conv2d(299,299,3)
conv2d(,,32)
conv2d(, , 64)
conv2d(, , 128)
dense(625)
dropout
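dense(101)

Note: the source code further down uses a slightly larger variant, with a fourth conv layer (256 channels) and a 1000-unit dense layer instead of 625.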

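Results

The first log is the teacher (fine-tuned InceptionV3); the second is the student trained with distillation.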

----------------------------------------------
EPOCH 3/3 

100%|██████████| 72/72 [00:29<00:00,  2.40it/s]
100%|██████████| 54/54 [00:06<00:00,  7.94it/s]

    loss: 0.8792 val accuracy: 0.8924

----------------------------------------------
EPOCH 30/30 

100%|██████████| 72/72 [00:17<00:00,  4.09it/s]
100%|██████████| 54/54 [00:03<00:00, 16.21it/s]

    loss: 7.9115 val accuracy: 0.4034

Student trained without distillation:

----------------------------------------------
EPOCH 30/30 

100%|██████████| 72/72 [00:10<00:00,  6.81it/s]
100%|██████████| 54/54 [00:03<00:00, 15.68it/s]

    loss: 16.6656 val accuracy: 0.4259

The distilled student lost: 0.4034 validation accuracy versus 0.4259 without distillation.
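
For reference, here is a minimal per-example sketch of the kind of loss the student is trained with in the source code further down: a cross-entropy against the teacher's temperature-softened outputs plus 0.2 times the ordinary cross-entropy against the label. It is a plain-Python illustration, not the TensorFlow graph itself; the function names are made up for this example, and the hard term is simplified to a bare cross-entropy.

```python
import math

def softmax(logits, temperature=1.0):
    """Softmax of logits / temperature (numerically stabilised)."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(target_probs, predicted_probs, eps=1e-10):
    """-sum(target * log(pred)), with pred clipped to [eps, 1] as in the script."""
    return -sum(t * math.log(min(max(p, eps), 1.0))
                for t, p in zip(target_probs, predicted_probs))

def distillation_loss(teacher_logits, student_logits, onehot_label,
                      temperature=4.0, hard_weight=0.2):
    """hard_weight * CE(label, student) + CE(soft teacher, soft student)."""
    soft_teacher = softmax(teacher_logits, temperature)
    soft_student = softmax(student_logits, temperature)
    hard_student = softmax(student_logits)
    soft_term = cross_entropy(soft_teacher, soft_student)
    hard_term = cross_entropy(onehot_label, hard_student)
    return hard_weight * hard_term + soft_term

if __name__ == "__main__":
    teacher = [2.0, 0.5, -1.0]
    label = [1, 0, 0]
    print(distillation_loss(teacher, [2.0, 0.5, -1.0], label))  # student agrees
    print(distillation_loss(teacher, [-1.0, 0.5, 2.0], label))  # student disagrees
```

A higher temperature flattens both distributions, so the student is also pushed to match the teacher's relative scores on the wrong classes.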

Running on Movidius

mvNCProfile can measure the inference time on its own.
For a 299 x 299 x 3 image it reports 130.27 ms per image.

Measured value:

Time : 58.298219442367554 (435 images)

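That is about 134 ms per image.
So everything other than the inference itself, such as loading the image onto the device, costs only about 4 ms.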
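
The arithmetic behind those two numbers can be checked with a few lines. The helper name below is made up for this example; the inputs are the measured total time and the mvNCProfile figure quoted above.

```python
def per_image_ms(total_seconds, n_images):
    """Average wall-clock time per image, in milliseconds."""
    return total_seconds / n_images * 1000.0

measured_ms = per_image_ms(58.298219442367554, 435)  # timing loop of the script
profiled_ms = 130.27                                  # reported by mvNCProfile
overhead_ms = measured_ms - profiled_ms

print("measured: %.2f ms, overhead: %.2f ms" % (measured_ms, overhead_ms))
```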

Incidentally, the network has about 83M parameters.
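
The 83M figure can be reproduced by hand from the student network in the source code further down (four 3x3 convs with 32/64/128/256 channels, each followed by a 2x2 max pool, then dense layers of 1000 and 101 units). The sketch below assumes the TF-Slim defaults: SAME padding for conv2d, so a stride-1 conv keeps the spatial size, and VALID padding for max_pool2d. The function names are made up for this example.

```python
def pooled_size(size, kernel=2, stride=2):
    """Output size of a VALID max pool along one spatial dimension."""
    return (size - kernel) // stride + 1

def conv_params(kernel, c_in, c_out):
    """Weights plus biases of a kernel x kernel convolution."""
    return kernel * kernel * c_in * c_out + c_out

def dense_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out

def student_param_count(image_size=299, in_channels=3,
                        conv_channels=(32, 64, 128, 256),
                        hidden_units=1000, n_classes=101):
    total, size, c_in = 0, image_size, in_channels
    for c_out in conv_channels:
        total += conv_params(3, c_in, c_out)  # SAME conv keeps `size`
        size = pooled_size(size)              # 299 -> 149 -> 74 -> 37 -> 18
        c_in = c_out
    total += dense_params(size * size * c_in, hidden_units)
    total += dense_params(hidden_units, n_classes)
    return total

print(student_param_count())  # 83434517
```

Almost all of the parameters (about 82.9M) sit in the first dense layer, which takes the flattened 18 x 18 x 256 feature map as input.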

Summary

I fine-tuned InceptionV3, distilled it into another network, and got the result running on Movidius.

Distilling InceptionV3 into MobileNetV1 (224x224, width multiplier 1.0)

Install the nets package from tensorflow/models (it lives under research/slim):

python setup.py install

After that it can be imported:

import nets.mobilenet_v1

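Then, when you build the network with

stu2_logits, stu2_end_points = nets.mobilenet_v1.mobilenet_v1(scaled_inputs, num_classes=N_CLASSES)

the returned stu2_end_points dictionary holds the intermediate layers of MobileNet.
Inspecting its keys tells you a lot: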

['Conv2d_0', 'Conv2d_1_depthwise', 'Conv2d_1_pointwise', 'Conv2d_2_depthwise', 'Conv2d_2_pointwise', 'Conv2d_3_depthwise', 'Conv2d_3_pointwise', 'Conv2d_4_depthwise', 'Conv2d_4_pointwise', 'Conv2d_5_depthwise', 'Conv2d_5_pointwise', 'Conv2d_6_depthwise', 'Conv2d_6_pointwise', 'Conv2d_7_depthwise', 'Conv2d_7_pointwise', 'Conv2d_8_depthwise', 'Conv2d_8_pointwise', 'Conv2d_9_depthwise', 'Conv2d_9_pointwise', 'Conv2d_10_depthwise', 'Conv2d_10_pointwise', 'Conv2d_11_depthwise', 'Conv2d_11_pointwise', 'Conv2d_12_depthwise', 'Conv2d_12_pointwise', 'Conv2d_13_depthwise', 'Conv2d_13_pointwise', 'AvgPool_1a', 'Logits', 'Predictions']
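
The key list follows a regular pattern: one ordinary convolution, 13 depthwise/pointwise pairs, then the pooling and classifier endpoints. The short sketch below regenerates the same names in plain Python to make that structure explicit; the function is made up for this example and is not part of the nets package.

```python
def mobilenet_v1_endpoint_names(num_blocks=13):
    """Rebuild the endpoint names printed above, in order."""
    names = ["Conv2d_0"]
    for i in range(1, num_blocks + 1):
        names.append("Conv2d_%d_depthwise" % i)
        names.append("Conv2d_%d_pointwise" % i)
    names += ["AvgPool_1a", "Logits", "Predictions"]
    return names

names = mobilenet_v1_endpoint_names()
print(len(names), names[-3:])
```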

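You can specify the number of output classes when building the model, but the resulting layers then no longer match the pretrained checkpoint, so the weights of the layers that changed must not be restored.

For example, in the list above, Logits and Predictions look suspicious.

print("shape of logits: ", stu2_end_points['Logits'].shape)
print("shape of prediction: ", stu2_end_points['Predictions'].shape)

Check the shapes of the suspicious layers; if they equal the number of output classes you specified, exclude those layers from the restore.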

mbnet_pretrained_include = ["MobilenetV1"]
mbnet_pretrained_exclude = ["MobilenetV1/Predictions", "MobilenetV1/Logits"]
mbnet_pretrained_vars = tf.contrib.framework.get_variables_to_restore(
        include=mbnet_pretrained_include, exclude=mbnet_pretrained_exclude)
mbnet_pretrained_saver = tf.train.Saver(
    mbnet_pretrained_vars, name="mobilenet_pretrained_saver")

Trying it out:

----------------------------------------------
EPOCH 30/30 

100%|██████████| 72/72 [00:16<00:00,  4.35it/s]
100%|██████████| 54/54 [00:03<00:00, 13.75it/s]

    loss: 0.0003 val accuracy: 0.8582

----------------------------------------------
EPOCH 72/300 

100%|██████████| 72/72 [00:17<00:00,  4.01it/s]
100%|██████████| 54/54 [00:02<00:00, 23.67it/s]

    loss: 6.1732 val accuracy: 0.4126

Converting for Movidius

Running mvNCCompile fails with

FailedPreconditionError (see above for traceback): Attempting to use uninitialized value MobilenetV1/Conv2d_0/BatchNorm/moving_mean

The Saver used during conversion was only saving trainable variables, so I switched it to tf.global_variables(), which produced

NotFoundError: Key MobilenetV1/Conv2d_0/BatchNorm/moving_mean not found in checkpoint

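So the variable was already missing when the checkpoint was first saved: BatchNorm's moving_mean and moving_variance are updated by moving averages rather than by gradients, so they are not trainable variables.
I therefore changed the original training script to save tf.global_variables() as well.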
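
The effect is easy to see in a toy model. The records below are hypothetical stand-ins for TensorFlow variables (just a name and a trainable flag), not real tf.Variable objects; the point is only that a "trainable only" filter silently drops the BatchNorm statistics.

```python
from collections import namedtuple

# Hypothetical stand-in for a graph variable.
Variable = namedtuple("Variable", ["name", "trainable"])

graph_variables = [
    Variable("MobilenetV1/Conv2d_0/weights", True),
    Variable("MobilenetV1/Conv2d_0/BatchNorm/beta", True),
    Variable("MobilenetV1/Conv2d_0/BatchNorm/moving_mean", False),
    Variable("MobilenetV1/Conv2d_0/BatchNorm/moving_variance", False),
]

def names_to_save(variables, trainable_only):
    """Names a saver would write, with or without the trainable-only filter."""
    return [v.name for v in variables if v.trainable or not trainable_only]

saved = set(names_to_save(graph_variables, trainable_only=True))
needed = set(names_to_save(graph_variables, trainable_only=False))
print(sorted(needed - saved))  # variables missing from the checkpoint
```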

[Error 5] Toolkit Error: Stage Details Not Supported: Top Not Found preprocess/rescaled_inputs
[Error 5] Toolkit Error: Stage Details Not Supported: Top Not Found mbnet_struct/truediv

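A different error appeared.
Perhaps div is not supported?

For the former, I decided to divide by 255 beforehand and feed in the already-scaled input.
The latter (the temperature-scaled softmax) was not needed for inference anyway.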
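
A minimal sketch of doing the scaling on the host instead of inside the graph, assuming the images arrive as uint8 NumPy arrays and that the device expects float16 input (the prediction script further down casts to float16). The function name is made up for this example.

```python
import numpy as np

def prepare_for_device(images_uint8):
    """Scale 0..255 pixel values to 0..1 on the host, then cast to float16."""
    scaled = images_uint8.astype(np.float32) / 255.0
    return scaled.astype(np.float16)

# Random stand-in batch: 2 images of 224 x 224 x 3.
batch = np.random.randint(0, 256, size=(2, 224, 224, 3), dtype=np.uint8)
prepared = prepare_for_device(batch)
print(prepared.dtype, prepared.shape)
```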

I removed both and re-ran:

tokunn@nanase 9:00:09 [~/Documents/distil_incep2mbnet0921] $ python3 movidius.py /home/tokunn/caltech101/butterfly 2>/dev/null
Image path : /home/tokunn/caltech101/butterfly/*.jpg or *.png
['image_0032.jpg', 'image_0073.jpg', 'image_0017.jpg', 'image_0089.jpg', 'image_0085.jpg', 'image_0078.jpg', 'image_0056.jpg', 'image_0069.jpg', 'image_0074.jpg', 'image_0025.jpg']
imgshape  (91, 224, 224)
Start predicting ...
butterfly Faces butterfly chandelier butterfly Faces chandelier butterfly butterfly butterfly butterfly revolver butterfly butterfly revolver butterfly butterfly butterfly butterfly butterfly butterfly butterfly butterfly butterfly butterfly butterfly butterfly butterfly butterfly butterfly butterfly butterfly butterfly chandelier butterfly butterfly butterfly butterfly butterfly butterfly butterfly sunflower revolver butterfly chandelier butterfly butterfly butterfly butterfly butterfly butterfly butterfly butterfly butterfly butterfly chandelier butterfly butterfly butterfly Faces butterfly Faces butterfly butterfly butterfly Faces butterfly butterfly butterfly butterfly butterfly butterfly butterfly butterfly butterfly butterfly butterfly chandelier cellphone butterfly butterfly butterfly butterfly butterfly butterfly butterfly butterfly butterfly butterfly butterfly butterfly 
Time : 3.817107915878296 (91 images)

Excellent!
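
As a rough comparison with the throwaway CNN, the two timings can be turned into throughput figures. Keep in mind that the inputs differ (299 x 299 there, 224 x 224 here), so this is not a like-for-like benchmark; the helper name is made up for this example.

```python
def images_per_second(total_seconds, n_images):
    """Average throughput over one timing run."""
    return n_images / total_seconds

cnn_rate = images_per_second(58.298219442367554, 435)  # throwaway CNN run
mbnet_rate = images_per_second(3.817107915878296, 91)  # MobileNetV1 run
speedup = mbnet_rate / cnn_rate

print("CNN: %.2f img/s, MobileNetV1: %.2f img/s, ratio: %.2fx"
      % (cnn_rate, mbnet_rate, speedup))
```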

Source Code

Throwaway CNN version

Teacher and student

#!/usr/bin/env python
# coding: utf-8

# In[1]:


import os,time,glob
os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID" 
os.environ["CUDA_VISIBLE_DEVICES"] = "0"
import tensorflow as tf
import numpy as np
import tensorflow.contrib.slim as slim
import tensorflow.contrib.slim.nets
#from __future__ import print_function, division
import loadimg_caltech as loadimg
from tqdm import tqdm
import matplotlib.pyplot as plt

start = time.time()


# In[2]:


np_aryname = './models/data{0}.npy'
    
try: # LOAD
    X_train = np.load(np_aryname.format('X_train'))
    Y_train = np.load(np_aryname.format('Y_train'))
    X_test = np.load(np_aryname.format('X_test'))
    Y_test = np.load(np_aryname.format('Y_test'))
    number_of_classes = np.asscalar(np.load(np_aryname.format('number_of_classes')))
    
except FileNotFoundError:
    print("### Load from Images ###")
    X_train, Y_train, X_test, Y_test, number_of_classes = loadimg.loadimg(
        '/home/tokunn/caltech101')

    np.save(np_aryname.format('X_train'), X_train)
    np.save(np_aryname.format('Y_train'), Y_train)
    np.save(np_aryname.format('X_test'), X_test)
    np.save(np_aryname.format('Y_test'), Y_test)
    np.save(np_aryname.format('number_of_classes'), number_of_classes)


print("X_train", X_train.shape)
print("Y_train", Y_train.shape)
print("X_test", X_test.shape)
print("Y_test", Y_test.shape)
print("Number of Classes", number_of_classes)


# In[3]:


SNAPSHOT_FILE = "./models/snapshot.ckpt"
STU_SNAPSHOT_FILE = "./models/student_snapshot.ckpt"
PRETRAINED_SNAPSHOT_FILE = "./models/inception_v3.ckpt"

# somewhere to store the tensorboard files - to visualise the graph
TENSORBOARD_DIR = "logs"
#[os.remove(i) for i in glob.glob(os.path.join(TENSORBOARD_DIR, '*.nanase'))]

# IMAGE SETTINGS
IMG_WIDTH, IMG_HEIGHT = [299,299] # Dimensions required by inception V3
N_CHANNELS = 3                    # Number of channels required by inception V3
N_CLASSES = number_of_classes                    # Change N_CLASSES to suit your needs

temperature = 4


# In[4]:


def NetworkStudent(input,keep_prob_conv,keep_prob_hidden,scope='Student', reuse = False):
    with tf.variable_scope(scope, reuse = reuse) as sc:
        with slim.arg_scope([slim.conv2d],
                            kernel_size = [3,3],
                            stride = [1,1],
                            biases_initializer=tf.constant_initializer(0.0),
                            activation_fn=tf.nn.relu):
                                                      
            net = slim.conv2d(input, 32, scope='conv1')
            net = slim.max_pool2d(net,[2, 2], 2, scope='pool1')
            net = tf.nn.dropout(net, keep_prob_conv)

            net = slim.conv2d(net, 64,scope='conv2')
            net = slim.max_pool2d(net,[2, 2], 2, scope='pool2')
            net = tf.nn.dropout(net, keep_prob_conv)

            net = slim.conv2d(net, 128,scope='conv3')
            net = slim.max_pool2d(net,[2, 2], 2, scope='pool3')
            net = tf.nn.dropout(net, keep_prob_conv)
            
            net = slim.conv2d(net, 256,scope='conv4')
            net = slim.max_pool2d(net,[2, 2], 2, scope='pool4')
            net = tf.nn.dropout(net, keep_prob_conv)
    
            net = slim.flatten(net)
        with slim.arg_scope([slim.fully_connected],
                            biases_initializer=tf.constant_initializer(0.0),
                            activation_fn=tf.nn.relu) :
            
            net = slim.fully_connected(net,1000,scope='fc1') # 625
            net = tf.nn.dropout(net, keep_prob_hidden)
            net = slim.fully_connected(net,N_CLASSES,activation_fn=None,scope='fc2')
            
            #net = tf.nn.softmax(net/temperature)
            return net


# In[5]:


def loss(prediction,output):#,temperature = 1):
    cross_entropy = tf.reduce_mean(-tf.reduce_sum(
        tf.cast(output, tf.float32) * tf.log(tf.clip_by_value(prediction,1e-10,1.0)),
                                                  reduction_indices=[1]))      
    #correct_prediction = tf.equal(tf.argmax(prediction,1), tf.argmax(output,1))
    #accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
    return cross_entropy#,accuracy


# In[6]:


graph = tf.Graph()
with graph.as_default():
    # INPUTS
    with tf.name_scope("inputs") as scope:
        input_dims = (None, IMG_HEIGHT, IMG_WIDTH, N_CHANNELS)
        tf_X = tf.placeholder(tf.float32, shape=input_dims, name="X")
        tf_Y = tf.placeholder(tf.int32, shape=[None, N_CLASSES], name="Y")
        tf_alpha = tf.placeholder_with_default(0.001, shape=None, name="alpha")
        tf_is_training = tf.placeholder_with_default(False, shape=None,
                                                     name="is_training")
        stu_keep_prob_conv = tf.placeholder(tf.float32)
        stu_keep_prob_hidden = tf.placeholder(tf.float32)
        
    # PREPROCESSING STEPS
    with tf.name_scope("preprocess") as scope:
        scaled_inputs = tf.div(tf_X, 255., name="rescaled_inputs")

    # BODY
    arg_scope = tf.contrib.slim.nets.inception.inception_v3_arg_scope()
    with tf.contrib.framework.arg_scope(arg_scope):
        tf_logits, end_points = tf.contrib.slim.nets.inception.inception_v3(
            scaled_inputs,
            num_classes=N_CLASSES,
            is_training=tf_is_training,
            dropout_keep_prob=0.8)
        
    with tf.name_scope("softmax") as scope:
        tch_y = tf.nn.softmax(tf_logits/temperature, name="teacher_softmax")
        tch_y_actual = tf.nn.softmax(tf_logits, name="teacher_softmax_actual")
        
        
    # Student     
    stu_logits = NetworkStudent(tf_X, stu_keep_prob_conv,
                               stu_keep_prob_hidden, scope='student')
    with tf.name_scope("stu_struct"):
        # softmax
        stu_y = tf.nn.softmax(stu_logits/temperature, name="softmax")
        stu_y_actual = tf.nn.softmax(stu_logits, name="actual_softmax")
        
        
    # Separate vars
    model_vars = tf.trainable_variables()
    var_teacher = [var for var in model_vars if 'InceptionV3' in var.name]
    var_student = [var for var in model_vars if 'student' in var.name]
   

    # PREDICTIONS
    tf_preds = tf.to_int32(tf.argmax(tf_logits, axis=-1), name="preds")

    # LOSS - Sums all losses (even Regularization losses)
    with tf.variable_scope('loss') as scope:
        #unrolled_labels = tf.reshape(tf_Y, (-1,))
        #tf.losses.softmax_cross_entropy(onehot_labels=unrolled_labels,
        
        tf.losses.softmax_cross_entropy(onehot_labels=tf_Y, logits=tf_logits)
        tf_loss = tf.losses.get_total_loss()
        
        #tf_loss = loss(tch_y_actual, tf_Y)

    # OPTIMIZATION - note: apply_gradients below does not run the batchnorm UPDATE_OPS
    # (the control_dependencies variant that would run them is commented out)
    with tf.variable_scope('opt') as scope:
        #tf_optimizer = tf.train.AdamOptimizer(tf_alpha, name="optimizer")
        #update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS) # for batchnorm
        #with tf.control_dependencies(update_ops):
        #    tf_train_op = tf_optimizer.minimize(tf_loss, name="train_op")
        
        grad_teacher = tf.gradients(tf_loss, var_teacher)
        tf_trainer = tf.train.AdamOptimizer(tf_alpha, name="optimizer")
        tf_train_step = tf_trainer.apply_gradients(zip(grad_teacher, var_teacher))
            
    # Evaluation
    with tf.variable_scope('eval') as scope:
        y = tf.nn.softmax(tf_logits, name='softmax')
        accuracy = tf.reduce_mean(
            tf.cast(tf.equal(tf.argmax(y, 1), tf.argmax(tf_Y, 1)), tf.float32)
        )
        
    # PRETRAINED SAVER SETTINGS
    # Lists of scopes of weights to include/exclude from pretrained snapshot
    pretrained_include = ["InceptionV3"]
    pretrained_exclude = ["InceptionV3/AuxLogits", "InceptionV3/Logits"]

    # PRETRAINED SAVER - For loading pretrained weights on the first run
    pretrained_vars = tf.contrib.framework.get_variables_to_restore(
        include=pretrained_include,
        exclude=pretrained_exclude)
    tf_pretrained_saver = tf.train.Saver(pretrained_vars, name="pretrained_saver")

    
         
    # Student
    with tf.name_scope("stu_train"):
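        # loss
        # NOTE: get_total_loss() sums every registered loss, so stu_loss1 also
        # contains the teacher's cross-entropy and the regularization terms.
        # They add no gradient w.r.t. var_student but inflate the printed value.
        tf.losses.softmax_cross_entropy(onehot_labels=tf_Y, logits=stu_logits)
        stu_loss1 = tf.losses.get_total_loss()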
        #stu_loss1 = loss(stu_y_actual, tf_Y)
        stu_loss2 = tf.reduce_mean(- tf.reduce_sum(tch_y * tf.log(
            tf.clip_by_value(stu_y, 1e-10,1.0)), reduction_indices=1))
        stu_loss = 0.2 * stu_loss1 + stu_loss2
        #stu_loss = stu_loss1
       
        # optimization
        grad_student = tf.gradients(stu_loss,var_student)
        stu_trainer = tf.train.RMSPropOptimizer(learning_rate = 0.0002)
        #stu_trainer = tf.train.AdamOptimizer(tf_alpha, name="optimizer")
        #stu_trainer = tf.train.AdadeltaOptimizer()
        train_step_student = stu_trainer.apply_gradients(zip(grad_student, var_student))
        #stu_optimizer = tf.train.AdamOptimizer(tf_alpha, name="stu_optimizer")
        #stu_train_op = tf_optimizer.minimize(stu_loss, name="stu_train_op")
        # evaluation
        stu_accuracy = tf.reduce_mean(
            tf.cast(tf.equal(tf.argmax(stu_y_actual, 1), tf.argmax(tf_Y, 1)), tf.float32)
        )

        
    
    # MAIN SAVER - For saving/restoring your complete model
    tf_saver = tf.train.Saver(var_teacher, name="saver")
    
    # STUDENT SAVER
    
    stu_saver = tf.train.Saver(var_student, name="stu_saver")

    # TENSORBOARD - To visualize the architecture
    with tf.variable_scope('tensorboard') as scope:
        tf_summary_writer = tf.summary.FileWriter(TENSORBOARD_DIR, graph=graph)
        tf_dummy_summary = tf.summary.scalar(name="dummy", tensor=1)


# In[7]:


def initialize_vars(session):
    # INITIALIZE VARS
    LOAD_FROM_CHECKPOINT = False
    if LOAD_FROM_CHECKPOINT: #tf.train.checkpoint_exists(SNAPSHOT_FILE):
        print(" Loading from Main Checkpoint")
        session.run(tf.global_variables_initializer())
        tf_saver.restore(session, SNAPSHOT_FILE)
    else:
        print("Initializing from Pretrained Weights")
        session.run(tf.global_variables_initializer())
        tf_pretrained_saver.restore(session, PRETRAINED_SNAPSHOT_FILE)


# In[ ]:


with tf.Session(graph=graph) as sess:
    n_epochs = 5
    batch_size = 32 # small batch size so inception v3 can be run on laptops
    steps_per_epoch = len(X_train)//batch_size // 3 # FOR DEBUG
    steps_per_epoch_val = len(X_test)//batch_size

    initialize_vars(session=sess)

    print("##### Teacher Training Section #####")
    for epoch in range(n_epochs):
        print("----------------------------------------------", flush=True)
        print("EPOCH {}/{}".format(epoch+1, n_epochs), flush=True, end=' ')
           
        ## TRAINING
        for step in tqdm(range(steps_per_epoch)):
            # EXTRACT A BATCH OF TRAINING DATA
            X_batch = X_train[batch_size*step: batch_size*(step+1)]
            Y_batch = Y_train[batch_size*step: batch_size*(step+1)]

            # RUN ONE TRAINING STEP - feeding batch of data
            feed_dict = {tf_X: X_batch,
                         tf_Y: Y_batch,
                         tf_alpha:0.0001,
                         tf_is_training: True}
            #loss, _ = sess.run([tf_loss, tf_train_op], feed_dict=feed_dict)
            tf_train_step.run(feed_dict=feed_dict)
            
        ## EVALUATE
        val_accuracy = []
        for step in tqdm(range(steps_per_epoch_val)):
            # EXTRACT A BATCH OF TEST DATA
            X_batch = X_test[batch_size*step: batch_size*(step+1)]
            Y_batch = Y_test[batch_size*step: batch_size*(step+1)]
            
            # Evaluation
            feed_dict = {tf_X: X_batch,
                         tf_Y: Y_batch,
                         tf_alpha:0.0001,
                         tf_is_training: False}            
            val_accuracy.append(accuracy.eval(feed_dict=feed_dict))
            
        # PRINT FEEDBACK - once per epoch
        total_val_accuracy = np.average(np.asarray(val_accuracy))
        pre_logits, pre_loss = sess.run([tch_y, tf_loss], feed_dict = {
            tf_X: [X_test[5]],
            tf_Y: [Y_test[5]],
            tf_is_training: False
        })
        print("\tloss: {:0.4f} val accuracy: {:0.4f}".format(pre_loss, total_val_accuracy))
        plt.plot(pre_logits[0])
        plt.show()
       
        # SAVE SNAPSHOT - after each epoch
        tf_saver.save(sess, SNAPSHOT_FILE)
        

    print("### Student Training Section ###")
    n_epochs = 300
    steps_per_epoch = len(X_train)//batch_size // 3 # FOR DEBUG
    steps_per_epoch_val = len(X_test)//batch_size
    for epoch in range(n_epochs):
        print("----------------------------------------------", flush=True)
        print("EPOCH {}/{}".format(epoch+1, n_epochs), flush=True, end=' ')
           
        ## TRAINING
        for step in tqdm(range(steps_per_epoch)):
            # EXTRACT A BATCH OF TRAINING DATA
            X_batch = X_train[batch_size*step: batch_size*(step+1)]
            Y_batch = Y_train[batch_size*step: batch_size*(step+1)]

            # RUN ONE TRAINING STEP - feeding batch of data
            feed_dict = {tf_X: X_batch,
                         tf_Y: Y_batch,
                         #tf_alpha:0.001,
                         stu_keep_prob_conv: 0.8,
                         stu_keep_prob_hidden: 0.5}
            #loss, _ = sess.run(stu_loss, feed_dict=feed_dict)
            train_step_student.run(feed_dict=feed_dict)
            
        ## EVALUATE
        val_accuracy = []
        for step in tqdm(range(steps_per_epoch_val)):
            # EXTRACT A BATCH OF TEST DATA
            X_batch = X_test[batch_size*step: batch_size*(step+1)]
            Y_batch = Y_test[batch_size*step: batch_size*(step+1)]
            
            # Evaluation
            feed_dict = {tf_X: X_batch,
                         tf_Y: Y_batch,
                         #tf_alpha:0.001,
                         stu_keep_prob_conv: 1.0,
                         stu_keep_prob_hidden: 1.0}      
            val_accuracy.append(stu_accuracy.eval(feed_dict=feed_dict))
            
        # PRINT FEEDBACK - once per epoch
        total_val_accuracy = np.average(np.asarray(val_accuracy))
        pre_logits, pre_loss = sess.run([stu_logits, stu_loss], feed_dict = {
            tf_X: X_batch,
            tf_Y: Y_batch,
            #tf_alpha:0.001,
            stu_keep_prob_conv: 1.0,
            stu_keep_prob_hidden: 1.0
        })
        print("\tloss: {:0.4f} val accuracy: {:0.4f}".format(pre_loss, total_val_accuracy))
        plt.plot(pre_logits[0])
        plt.show()
        
        stu_saver.save(sess, STU_SNAPSHOT_FILE)
            
            
            
        
     


# In[ ]:


end = time.time()
print("Time : {0}".format(end-start))



For conversion

#!/usr/bin/env python
# coding: utf-8

# In[1]:


import os,time,glob
os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID" 
os.environ["CUDA_VISIBLE_DEVICES"] = "0"
import tensorflow as tf
import numpy as np
import tensorflow.contrib.slim as slim


# In[2]:


STU_SNAPSHOT_FILE = "./models/student_snapshot.ckpt"
STU_FLOZEN_FILE = "./models/student_flozen.ckpt"

# somewhere to store the tensorboard files - to visualise the graph
TENSORBOARD_DIR = "logs"
for stale_log in glob.glob(os.path.join(TENSORBOARD_DIR, '*.nanase')):
    os.remove(stale_log)

# IMAGE SETTINGS
IMG_WIDTH, IMG_HEIGHT = [299,299] # Dimensions required by inception V3
N_CHANNELS = 3                    # Number of channels required by inception V3
N_CLASSES = 101                    # Change N_CLASSES to suit your needs

temperature = 4


# In[3]:


def NetworkStudent(input,scope='Student', reuse = False):
    with tf.variable_scope(scope, reuse = reuse) as sc:
        with slim.arg_scope([slim.conv2d],
                            kernel_size = [3,3],
                            stride = [1,1],
                            biases_initializer=tf.constant_initializer(0.0),
                            activation_fn=tf.nn.relu):
                                                      
            net = slim.conv2d(input, 32, scope='conv1')
            net = slim.max_pool2d(net,[2, 2], 2, scope='pool1')

            net = slim.conv2d(net, 64,scope='conv2')
            net = slim.max_pool2d(net,[2, 2], 2, scope='pool2')

            net = slim.conv2d(net, 128,scope='conv3')
            net = slim.max_pool2d(net,[2, 2], 2, scope='pool3')
            
            net = slim.conv2d(net, 256,scope='conv4')
            net = slim.max_pool2d(net,[2, 2], 2, scope='pool4')
    
            net = slim.flatten(net)
        with slim.arg_scope([slim.fully_connected],
                            biases_initializer=tf.constant_initializer(0.0),
                            activation_fn=tf.nn.relu) :
            
            net = slim.fully_connected(net,1000,scope='fc1') # 625
            net = slim.fully_connected(net,N_CLASSES,activation_fn=None,scope='fc2')
            
            #net = tf.nn.softmax(net/temperature)
            return net


# In[4]:


graph = tf.Graph()
with graph.as_default():
    # INPUTS
    with tf.name_scope("inputs") as scope:
        input_dims = (None, IMG_HEIGHT, IMG_WIDTH, N_CHANNELS)
        tf_X = tf.placeholder(tf.float32, shape=input_dims, name="X")
        
    # PREPROCESSING STEPS
    with tf.name_scope("preprocess") as scope:
        scaled_inputs = tf.div(tf_X, 255., name="rescaled_inputs")

    # Student     
    stu_logits = NetworkStudent(tf_X, scope='student')
    with tf.name_scope("stu_struct"):
        # softmax
        stu_y = tf.nn.softmax(stu_logits/temperature, name="softmax")
        stu_y_actual = tf.nn.softmax(stu_logits, name="actual_softmax")
        
        
    # Separate vars
    model_vars = tf.trainable_variables()
    var_student = [var for var in model_vars if 'student' in var.name]
    
    # parameter 
    total_parameters = 0
    for variable in tf.trainable_variables():
        # shape is an array of tf.Dimension
        shape = variable.get_shape()
        #print(shape)
        #print(len(shape))
        variable_parameters = 1
        for dim in shape:
            #print(dim)
            variable_parameters *= dim.value
        #print(variable_parameters)
        total_parameters += variable_parameters
    print("total params: ",total_parameters)
    
    # STUDENT SAVER
    
    stu_saver = tf.train.Saver(var_student, name="stu_saver")

    # TENSORBOARD - To visualize the architecture
    with tf.variable_scope('tensorboard') as scope:
        tf_summary_writer = tf.summary.FileWriter(TENSORBOARD_DIR, graph=graph)
        tf_dummy_summary = tf.summary.scalar(name="dummy", tensor=1)


# In[5]:


with tf.Session(graph=graph) as sess:
    sess.run(tf.global_variables_initializer())
    sess.run(tf.local_variables_initializer())

        
    stu_saver.restore(sess, STU_SNAPSHOT_FILE)
    stu_saver.save(sess, STU_FLOZEN_FILE)



For prediction on Movidius

import mvnc.mvncapi as mvnc
import numpy as np
from PIL import Image
import cv2
import time, sys, os

import glob

IMAGE_DIR_NAME = '/home/tokunn/caltech101'
if (len(sys.argv) > 1):
    IMAGE_DIR_NAME = sys.argv[1]
#IMAGE_DIR_NAME = 'github_deep_mnist/ncappzoo/data/digit_images'

def predict(input):
    print("Start predicting ...")
    devices = mvnc.EnumerateDevices()
    device = mvnc.Device(devices[0])
    device.OpenDevice()

    # Load graph file data
    with open('./models/graph', 'rb') as f:
        graph_file_buffer = f.read()

    # Initialize a Graph object
    graph = device.AllocateGraph(graph_file_buffer)

    start = time.time()
    for i in range(len(input)):
        # Write the tensor to the input_fifo and queue an inference
        graph.LoadTensor(input[i], None)
        output, userobj = graph.GetResult()
        print(np.argmax(output), end=' ')
    stop = time.time()
    print('')
    print("Time : {0} ({1} images)".format(stop-start, len(input)))

    graph.DeallocateGraph()
    device.CloseDevice()

    return output

if __name__ == '__main__':
    print("Image path : {0}".format(os.path.join(IMAGE_DIR_NAME, '*.jpg or *.png')))
    jpg_list = glob.glob(os.path.join(IMAGE_DIR_NAME, '*.jpg'))
    jpg_list += glob.glob(os.path.join(IMAGE_DIR_NAME, '*.png'))
    if not len(jpg_list):
        print("No image file")
        sys.exit()
    jpg_list.reverse()
    print([i.split('/')[-1] for i in jpg_list][:10])
    img_list = []
    for n in jpg_list:
        image = cv2.imread(n)
        image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
        image = cv2.resize(image, (299, 299))
        img_list.append(image)
    img_list = np.asarray(img_list)# * (1.0/255.0)
    #img_list = np.reshape(img_list, [-1, 784])
    print("imgshape ", img_list.shape)
    predict(img_list.astype(np.float16))

InceptionV3 to MobileNetV1 version

Teacher and student

#!/usr/bin/env python
# coding: utf-8

# In[1]:


import os,time,glob,sys
#sys.path.append('/home/tokunn/sources/models/research/slim')
os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID" 
os.environ["CUDA_VISIBLE_DEVICES"] = "0"
import tensorflow as tf
import numpy as np
import tensorflow.contrib.slim as slim
import tensorflow.contrib.slim.nets
import nets.mobilenet_v1
#from __future__ import print_function, division
import loadimg_caltech as loadimg
from tqdm import tqdm
import matplotlib.pyplot as plt

start = time.time()


# In[2]:


np_aryname = './models/data{0}.npy'
    
try: # LOAD
    X_train = np.load(np_aryname.format('X_train'))
    Y_train = np.load(np_aryname.format('Y_train'))
    X_test = np.load(np_aryname.format('X_test'))
    Y_test = np.load(np_aryname.format('Y_test'))
    number_of_classes = np.asscalar(np.load(np_aryname.format('number_of_classes')))
    
except FileNotFoundError:
    print("### Load from Images ###")
    X_train, Y_train, X_test, Y_test, number_of_classes = loadimg.loadimg(
        '/home/tokunn/caltech101')

    np.save(np_aryname.format('X_train'), X_train)
    np.save(np_aryname.format('Y_train'), Y_train)
    np.save(np_aryname.format('X_test'), X_test)
    np.save(np_aryname.format('Y_test'), Y_test)
    np.save(np_aryname.format('number_of_classes'), number_of_classes)


print("X_train", X_train.shape)
print("Y_train", Y_train.shape)
print("X_test", X_test.shape)
print("Y_test", Y_test.shape)
print("Number of Classes", number_of_classes)


# In[3]:


SNAPSHOT_FILE = "./models/snapshot.ckpt"
#STU_SNAPSHOT_FILE = "./models/student_snapshot.ckpt"
MBNET_SNAPSHOT_FILE = "./models/mbnet_student_snapshot.ckpt"
PRETRAINED_SNAPSHOT_FILE = "./models/inception_v3.ckpt"
PRETRAINED_MOBILENET_FILE = "./models/mobilenet/mobilenet_v1_1.0_224.ckpt"

# somewhere to store the tensorboard files - to visualise the graph
TENSORBOARD_DIR = "logs"
for stale_log in glob.glob(os.path.join(TENSORBOARD_DIR, '*.nanase')):
    os.remove(stale_log)

# IMAGE SETTINGS
IMG_WIDTH, IMG_HEIGHT = [224,224] # Input size of MobileNetV1 224 (the teacher is fed the same size)
N_CHANNELS = 3                    # Number of input channels
N_CLASSES = number_of_classes                    # Change N_CLASSES to suit your needs

temperature = 20


# In[4]:


def NetworkStudent2(input,scope='Student', tf_is_training=False, reuse = False):
    # MobileNetV1 student. `scope` and `reuse` are currently unused:
    # the variables are created under the default "MobilenetV1" scope.
    arg_scope = nets.mobilenet_v1.mobilenet_v1_arg_scope()
    with tf.contrib.framework.arg_scope(arg_scope):
        stu2_logits, stu2_end_points = nets.mobilenet_v1.mobilenet_v1(
            input,
            num_classes=N_CLASSES,
            is_training=tf_is_training)#,
            #depth_multiplier=1.0)
        return stu2_logits, stu2_end_points


# In[5]:


# def NetworkStudent(input,keep_prob_conv,keep_prob_hidden,scope='Student', reuse = False):
#     with tf.variable_scope(scope, reuse = reuse) as sc:
#         with slim.arg_scope([slim.conv2d],
#                             kernel_size = [3,3],
#                             stride = [1,1],
#                             biases_initializer=tf.constant_initializer(0.0),
#                             activation_fn=tf.nn.relu):
                                                      
#             net = slim.conv2d(input, 32, scope='conv1')
#             net = slim.max_pool2d(net,[2, 2], 2, scope='pool1')
#             net = tf.nn.dropout(net, keep_prob_conv)

#             net = slim.conv2d(net, 64,scope='conv2')
#             net = slim.max_pool2d(net,[2, 2], 2, scope='pool2')
#             net = tf.nn.dropout(net, keep_prob_conv)

#             net = slim.conv2d(net, 128,scope='conv3')
#             net = slim.max_pool2d(net,[2, 2], 2, scope='pool3')
#             net = tf.nn.dropout(net, keep_prob_conv)
            
#             net = slim.conv2d(net, 256,scope='conv4')
#             net = slim.max_pool2d(net,[2, 2], 2, scope='pool4')
#             net = tf.nn.dropout(net, keep_prob_conv)
    
#             net = slim.flatten(net)
#         with slim.arg_scope([slim.fully_connected],
#                             biases_initializer=tf.constant_initializer(0.0),
#                             activation_fn=tf.nn.relu) :
            
#             net = slim.fully_connected(net,1000,scope='fc1') # 625
#             net = tf.nn.dropout(net, keep_prob_hidden)
#             net = slim.fully_connected(net,N_CLASSES,activation_fn=None,scope='fc2')
            
#             #net = tf.nn.softmax(net/temperature)
#             return net


# In[6]:


def loss(prediction,output):#,temperature = 1):
    cross_entropy = tf.reduce_mean(-tf.reduce_sum(
        tf.cast(output, tf.float32) * tf.log(tf.clip_by_value(prediction,1e-10,1.0)),
                                                  reduction_indices=[1]))      
    #correct_prediction = tf.equal(tf.argmax(prediction,1), tf.argmax(output,1))
    #accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
    return cross_entropy#,accuracy


# In[7]:


graph = tf.Graph()
with graph.as_default():
    # INPUTS
    with tf.name_scope("inputs") as scope:
        input_dims = (None, IMG_HEIGHT, IMG_WIDTH, N_CHANNELS)
        tf_X = tf.placeholder(tf.float32, shape=input_dims, name="X")
        tf_Y = tf.placeholder(tf.int32, shape=[None, N_CLASSES], name="Y")
        tf_alpha = tf.placeholder_with_default(0.001, shape=None, name="alpha")
        tf_is_training = tf.placeholder_with_default(False, shape=None,
                                                     name="is_training")
        stu_keep_prob_conv = tf.placeholder(tf.float32)
        stu_keep_prob_hidden = tf.placeholder(tf.float32)
        
    # PREPROCESSING STEPS
    with tf.name_scope("preprocess") as scope:
        #scaled_inputs = tf.div(tf_X, 255., name="rescaled_inputs")
        scaled_inputs = tf_X

    # BODY
    arg_scope = tf.contrib.slim.nets.inception.inception_v3_arg_scope()
    with tf.contrib.framework.arg_scope(arg_scope):
        tf_logits, end_points = tf.contrib.slim.nets.inception.inception_v3(
            scaled_inputs,
            num_classes=N_CLASSES,
            is_training=tf_is_training,
            dropout_keep_prob=0.8)
        
    with tf.name_scope("softmax") as scope:
        tch_y = tf.nn.softmax(tf_logits/temperature, name="teacher_softmax")
        tch_y_actual = tf.nn.softmax(tf_logits, name="teacher_softmax_actual")
        
        
    # Student     
#     stu_logits = NetworkStudent(scaled_inputs, stu_keep_prob_conv,
#                                stu_keep_prob_hidden, scope='student')
#     with tf.name_scope("stu_struct"):
#         # softmax
#         stu_y = tf.nn.softmax(stu_logits/temperature, name="softmax")
#         stu_y_actual = tf.nn.softmax(stu_logits, name="actual_softmax")
        
    mbnet_logits, mbnet_end_point = NetworkStudent2(
        scaled_inputs, tf_is_training=tf_is_training, scope='mbnet')
    with tf.name_scope("mbnet_struct"):
        # softmax
        mbnet_y = tf.nn.softmax(mbnet_logits/temperature, name="softmax")
        mbnet_y_actual = tf.nn.softmax(mbnet_logits, name="actual_softmax")
        
        
    # Separate vars
    model_vars = tf.trainable_variables()
    var_teacher = [var for var in model_vars if 'InceptionV3' in var.name]
    #var_student = [var for var in model_vars if 'student' in var.name]
    save_vars = tf.global_variables()
    var_mbnet = [var for var in save_vars if 'MobilenetV1' in var.name]
   

    # PREDICTIONS
    tf_preds = tf.to_int32(tf.argmax(tf_logits, axis=-1), name="preds")

    # LOSS - Cross-entropy on the teacher's softmax output only
    # (regularization losses are not included)
    with tf.variable_scope('loss') as scope:
        #unrolled_labels = tf.reshape(tf_Y, (-1,))
        #tf.losses.softmax_cross_entropy(onehot_labels=unrolled_labels,
        
        #tf.losses.softmax_cross_entropy(onehot_labels=tf_Y, logits=tf_logits)
        #tf_loss = tf.losses.get_total_loss()
        
        tf_loss = loss(tch_y_actual, tf_Y)

    # OPTIMIZATION - NOTE: the UPDATE_OPS control dependency is commented out below,
    # so the batchnorm moving averages are not updated by this train step
    with tf.variable_scope('opt') as scope:
        #tf_optimizer = tf.train.AdamOptimizer(tf_alpha, name="optimizer")
        #update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS) # for batchnorm
        #with tf.control_dependencies(update_ops):
        #    tf_train_op = tf_optimizer.minimize(tf_loss, name="train_op")
        
        grad_teacher = tf.gradients(tf_loss, var_teacher)
        tf_trainer = tf.train.AdamOptimizer(tf_alpha, name="optimizer")
        tf_train_step = tf_trainer.apply_gradients(zip(grad_teacher, var_teacher))
            
    # Evaluation
    with tf.variable_scope('eval') as scope:
        y = tf.nn.softmax(tf_logits, name='softmax')
        accuracy = tf.reduce_mean(
            tf.cast(tf.equal(tf.argmax(y, 1), tf.argmax(tf_Y, 1)), tf.float32)
        )
        
    # PRETRAINED SAVER SETTINGS
    # Lists of scopes of weights to include/exclude from pretrained snapshot
    pretrained_include = ["InceptionV3"]
    pretrained_exclude = ["InceptionV3/AuxLogits", "InceptionV3/Logits"]

    # PRETRAINED SAVER - For loading pretrained weights on the first run
    pretrained_vars = tf.contrib.framework.get_variables_to_restore(
        include=pretrained_include,
        exclude=pretrained_exclude)
    tf_pretrained_saver = tf.train.Saver(pretrained_vars, name="pretrained_saver")

    mbnet_pretrained_include = ["MobilenetV1"]
    mbnet_pretrained_exclude = ["MobilenetV1/Predictions", "MobilenetV1/Logits"]
        
    mbnet_pretrained_vars = tf.contrib.framework.get_variables_to_restore(
            include=mbnet_pretrained_include, exclude=mbnet_pretrained_exclude)
    mbnet_pretrained_saver = tf.train.Saver(
        mbnet_pretrained_vars, name="mobilenet_pretrained_saver")
         
    # Student
#     with tf.name_scope("stu_train"):
#         # loss
#         #tf.losses.softmax_cross_entropy(onehot_labels=tf_Y, logits=stu_logits)
#         #stu_loss1 = tf.losses.get_total_loss()
#         stu_loss1 = loss(stu_y_actual, tf_Y)
#         stu_loss2 = tf.reduce_mean(- tf.reduce_sum(tch_y * tf.log(
#             tf.clip_by_value(stu_y, 1e-10,1.0)), reduction_indices=1))
#         stu_loss = 0.4 * stu_loss1 + stu_loss2
#         #stu_loss = stu_loss1
       
#         # optimization
#         grad_student = tf.gradients(stu_loss,var_student)
#         stu_trainer = tf.train.RMSPropOptimizer(learning_rate = 0.0002)
#         #stu_trainer = tf.train.AdamOptimizer(tf_alpha, name="optimizer")
#         #stu_trainer = tf.train.AdadeltaOptimizer()
#         train_step_student = stu_trainer.apply_gradients(zip(grad_student, var_student))
#         #stu_optimizer = tf.train.AdamOptimizer(tf_alpha, name="stu_optimizer")
#         #stu_train_op = tf_optimizer.minimize(stu_loss, name="stu_train_op")
#         # evaluation
#         stu_accuracy = tf.reduce_mean(
#             tf.cast(tf.equal(tf.argmax(stu_y_actual, 1), tf.argmax(tf_Y, 1)), tf.float32)
#         )
        
        
    # Mobilenet V1
    with tf.name_scope("mbnet_train"):
        # loss
        #tf.losses.softmax_cross_entropy(onehot_labels=tf_Y, logits=mbnet_logits)
        #mbnet_loss1 = tf.losses.get_total_loss()
        mbnet_loss1 = loss(mbnet_y_actual, tf_Y)
        mbnet_loss2 = tf.reduce_mean(- tf.reduce_sum(tch_y * tf.log(
            tf.clip_by_value(mbnet_y, 1e-10,1.0)), reduction_indices=1))
        mbnet_loss = 0.4 * mbnet_loss1 + mbnet_loss2
        #mbnet_loss = mbnet_loss1
        
        # optimization
        grad_mbnet = tf.gradients(mbnet_loss,var_mbnet)
        mbnet_trainer = tf.train.RMSPropOptimizer(learning_rate = 0.0002)
        train_step_mbnet = mbnet_trainer.apply_gradients(zip(grad_mbnet, var_mbnet))

        # evaluation
        mbnet_accuracy = tf.reduce_mean(
            tf.cast(tf.equal(
                tf.argmax(mbnet_y_actual, 1), tf.argmax(tf_Y, 1)), tf.float32)
        )

        
    
    # MAIN SAVER - For saving/restoring your complete model
    tf_saver = tf.train.Saver(var_teacher, name="saver")
    
    # STUDENT SAVER
    
    #stu_saver = tf.train.Saver(var_student, name="stu_saver")
    
    mbnet_saver = tf.train.Saver(var_mbnet, name="mbnet_saver")

    # TENSORBOARD - To visualize the architecture
    with tf.variable_scope('tensorboard') as scope:
        tf_summary_writer = tf.summary.FileWriter(TENSORBOARD_DIR, graph=graph)
        tf_dummy_summary = tf.summary.scalar(name="dummy", tensor=1)


# In[8]:


def initialize_vars(session):
    # INITIALIZE VARS
    LOAD_FROM_CHECKPOINT = False
    if LOAD_FROM_CHECKPOINT: #tf.train.checkpoint_exists(SNAPSHOT_FILE):
        print(" Loading from Main Checkpoint")
        session.run(tf.global_variables_initializer())
        tf_saver.restore(session, SNAPSHOT_FILE)
    else:
        print("Initializing from Pretrained Weights")
        session.run(tf.global_variables_initializer())
        tf_pretrained_saver.restore(session, PRETRAINED_SNAPSHOT_FILE)
        mbnet_pretrained_saver.restore(session, PRETRAINED_MOBILENET_FILE)


# In[9]:


with tf.Session(graph=graph) as sess:
    n_epochs = 2
    batch_size = 32 # small batch size so inception v3 can be run on laptops
    steps_per_epoch = len(X_train)//batch_size
    steps_per_epoch_val = len(X_test)//batch_size

    initialize_vars(session=sess)
    
    """
    try:
        print("#### Debugging Section ####")
        ep2 = sess.run(stu2_end_point, feed_dict = {tf_X: [X_train[0]],
                                              tf_Y: [Y_train[0]],
                                              tf_is_training: True})
        print("EP2 : ", ep2.keys())
        print("shape of logits: ", ep2['Logits'].shape)
        print("shape of prediction: ", ep2['Predictions'].shape)
        #print("pretrained_vars: ", mbnet_pretrained_vars)"""


    print("##### Teacher Training Section #####")
    for epoch in range(n_epochs):
        print("----------------------------------------------", flush=True)
        print("EPOCH {}/{}".format(epoch+1, n_epochs), flush=True, end=' ')
           
        ## TRAINING
        for step in tqdm(range(steps_per_epoch)):
            # EXTRACT A BATCH OF TRAINING DATA
            X_batch = X_train[batch_size*step: batch_size*(step+1)]
            Y_batch = Y_train[batch_size*step: batch_size*(step+1)]

            # RUN ONE TRAINING STEP - feeding batch of data
            feed_dict = {tf_X: X_batch,
                         tf_Y: Y_batch,
                         tf_alpha:0.0001,
                         tf_is_training: True}
            #loss, _ = sess.run([tf_loss, tf_train_op], feed_dict=feed_dict)
            tf_train_step.run(feed_dict=feed_dict)
            
        ## EVALUATE
        val_accuracy = []
        for step in tqdm(range(steps_per_epoch_val)):
            # EXTRACT A BATCH OF TEST DATA
            X_batch = X_test[batch_size*step: batch_size*(step+1)]
            Y_batch = Y_test[batch_size*step: batch_size*(step+1)]
            
            # Evaluation
            feed_dict = {tf_X: X_batch,
                         tf_Y: Y_batch,
                         tf_alpha:0.0001,
                         tf_is_training: False}            
            val_accuracy.append(accuracy.eval(feed_dict=feed_dict))
            
        # PRINT FEEDBACK - once per epoch
        total_val_accuracy = np.average(np.asarray(val_accuracy))
        pre_logits, pre_loss = sess.run([tch_y, tf_loss], feed_dict = {
            tf_X: [X_test[5]],
            tf_Y: [Y_test[5]],
            tf_is_training: False
        })
        print("\tloss: {:0.4f} val accuracy: {:0.4f}".format(pre_loss, total_val_accuracy))
        plt.plot(pre_logits[0])
        plt.show()
       
        # SAVE SNAPSHOT - after each epoch
        tf_saver.save(sess, SNAPSHOT_FILE)
        

    print("### Student Training Section ###")
    n_epochs = 30
    steps_per_epoch = len(X_train)//batch_size // 3 # FOR DEBUG
    steps_per_epoch_val = len(X_test)//batch_size
    print("/////////////////////////////////////////////////////////")
    for epoch in range(n_epochs):
        print("----------------------------------------------", flush=True)
        print("EPOCH {}/{}".format(epoch+1, n_epochs), flush=True, end=' ')
           
        ## TRAINING
        for step in tqdm(range(steps_per_epoch)):
            # EXTRACT A BATCH OF TRAINING DATA
            X_batch = X_train[batch_size*step: batch_size*(step+1)]
            Y_batch = Y_train[batch_size*step: batch_size*(step+1)]

            # RUN ONE TRAINING STEP - feeding batch of data
            feed_dict = {tf_X: X_batch,
                         tf_Y: Y_batch,
                         tf_is_training: True}
            train_step_mbnet.run(feed_dict=feed_dict)
            
        ## EVALUATE
        val_accuracy = []
        for step in tqdm(range(steps_per_epoch_val)):
            # EXTRACT A BATCH OF TEST DATA
            X_batch = X_test[batch_size*step: batch_size*(step+1)]
            Y_batch = Y_test[batch_size*step: batch_size*(step+1)]
            
            # Evaluation
            feed_dict = {tf_X: X_batch,
                         tf_Y: Y_batch,
                         tf_is_training: False}      
            val_accuracy.append(mbnet_accuracy.eval(feed_dict=feed_dict))
            
        # PRINT FEEDBACK - once per epoch
        total_val_accuracy = np.average(np.asarray(val_accuracy))
        pre_logits, pre_loss = sess.run([mbnet_logits, mbnet_loss], feed_dict = {
            tf_X: X_batch,
            tf_Y: Y_batch,
            tf_is_training: False
        })
        print("\tloss: {:0.4f} val accuracy: {:0.4f}".format(pre_loss, total_val_accuracy))
        plt.plot(pre_logits[0])
        plt.show()
        
        mbnet_saver.save(sess, MBNET_SNAPSHOT_FILE) 
    


# In[10]:


end = time.time()
print("Time : {0}".format(end-start))
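
The student loss above mixes a hard-label term (weight 0.4) with a soft term against the teacher's temperature-scaled softmax. Below is a minimal pure-Python sketch of that combination, not the TensorFlow graph itself. The logits are made-up illustration values, the helper names are hypothetical, and the temperature of 20 is simply the value that appears in these scripts.

```python
import math

def softmax(logits, temperature=1.0):
    # Divide logits by the temperature before normalizing; higher values flatten the output
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(target, prediction, eps=1e-10):
    # Same clipping idea as tf.clip_by_value(prediction, 1e-10, 1.0)
    return -sum(t * math.log(min(max(p, eps), 1.0))
                for t, p in zip(target, prediction))

def distillation_loss(student_logits, teacher_logits, one_hot,
                      temperature=20.0, hard_weight=0.4):
    # Hard term: student at temperature 1 against the ground-truth label
    hard = cross_entropy(one_hot, softmax(student_logits))
    # Soft term: student against the teacher, both softened by the temperature
    soft = cross_entropy(softmax(teacher_logits, temperature),
                         softmax(student_logits, temperature))
    return hard_weight * hard + soft

# Hypothetical 3-class logits, for illustration only
teacher_logits = [8.0, 2.0, 0.5]
student_logits = [3.0, 1.0, 0.2]
one_hot = [1.0, 0.0, 0.0]

sharp = softmax(teacher_logits)
soft = softmax(teacher_logits, temperature=20.0)
loss_value = distillation_loss(student_logits, teacher_logits, one_hot)
print("teacher T=1 :", [round(p, 3) for p in sharp])
print("teacher T=20:", [round(p, 3) for p in soft])
print("loss:", round(loss_value, 4))
```

With a high temperature the teacher distribution becomes much flatter, so the student also sees how the teacher ranks the wrong classes rather than only the top label.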



For conversion

#!/usr/bin/env python
# coding: utf-8

# In[1]:


import os,time,glob,sys
os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID" 
os.environ["CUDA_VISIBLE_DEVICES"] = "0"
import tensorflow as tf
import numpy as np
import tensorflow.contrib.slim as slim
import tensorflow.contrib.slim.nets
import nets.mobilenet_v1


# In[2]:


MBNET_SNAPSHOT_FILE = "./models/mbnet_student_snapshot.ckpt"
MBNET_FLOZEN_FILE = "./models/mbnet_flozen.ckpt"

# somewhere to store the tensorboard files - to visualise the graph
TENSORBOARD_DIR = "logs"
[os.remove(i) for i in glob.glob(os.path.join(TENSORBOARD_DIR, '*.nanase'))]

# IMAGE SETTINGS
IMG_WIDTH, IMG_HEIGHT = [224,224] # Input size of the MobileNet V1 224x224 student
N_CHANNELS = 3                    # RGB input
N_CLASSES = 101                    # Change N_CLASSES to suit your needs

temperature = 20


# In[3]:


def NetworkStudent2(input,scope='Student', tf_is_training=False, reuse = False):
    # `scope`, `tf_is_training` and `reuse` are unused here: for conversion the
    # graph is always built in inference mode.
    #with tf.variable_scope(scope, reuse = reuse) as sc:
    arg_scope = nets.mobilenet_v1.mobilenet_v1_arg_scope()
    with tf.contrib.framework.arg_scope(arg_scope):
        # Build on the tensor passed in, not on the global `scaled_inputs`
        stu2_logits, stu2_end_points = nets.mobilenet_v1.mobilenet_v1(
            input,
            num_classes=N_CLASSES,
            is_training=False)#,
            #depth_multiplier=1.0)
        return stu2_logits, stu2_end_points


# In[4]:


graph = tf.Graph()
with graph.as_default():
    # INPUTS
    with tf.name_scope("inputs") as scope:
        input_dims = (None, IMG_HEIGHT, IMG_WIDTH, N_CHANNELS)
        tf_X = tf.placeholder(tf.float32, shape=input_dims, name="X")
      
    # PREPROCESSING STEPS
    with tf.name_scope("preprocess") as scope:
        #scaled_inputs = tf.div(tf_X, 255., name="rescaled_inputs")
        scaled_inputs = tf_X
        
    # Student             
    mbnet_logits, mbnet_end_point = NetworkStudent2(scaled_inputs, scope='mbnet')
    with tf.name_scope("mbnet_struct"):
        # softmax
        mbnet_y_actual = tf.nn.softmax(mbnet_logits, name="actual_softmax")
        
        
    # Separate vars
    model_vars = tf.trainable_variables()
    var_mbnet = [var for var in model_vars if 'MobilenetV1' in var.name]

    # parameter 
    total_parameters = 0
    for variable in tf.trainable_variables():
        # shape is an array of tf.Dimension
        shape = variable.get_shape()
        #print(shape)
        #print(len(shape))
        variable_parameters = 1
        for dim in shape:
            #print(dim)
            variable_parameters *= dim.value
        #print(variable_parameters)
        total_parameters += variable_parameters
    print("total params: ",total_parameters)
    
    # STUDENT SAVER
    #mbnet_saver = tf.train.Saver(var_mbnet, name="mbnet_saver")
    mbnet_saver = tf.train.Saver(tf.global_variables(), name="mbnet_saver")

    # TENSORBOARD - To visualize the architecture
    with tf.variable_scope('tensorboard') as scope:
        tf_summary_writer = tf.summary.FileWriter(TENSORBOARD_DIR, graph=graph)
        tf_dummy_summary = tf.summary.scalar(name="dummy", tensor=1)


# In[5]:


with tf.Session(graph=graph) as sess:
    sess.run(tf.global_variables_initializer())
    sess.run(tf.local_variables_initializer())
    
    mbnet_saver.restore(sess, MBNET_SNAPSHOT_FILE) 
    #sess.run(tf.initialize_all_variables())
    mbnet_saver.save(sess, MBNET_FLOZEN_FILE)
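
The parameter count in the script above is just the product of each variable's shape, summed over all trainable variables. Here is a minimal sketch of the same arithmetic without TensorFlow. The three shapes are hypothetical examples in the style of a standard conv followed by a depthwise/pointwise pair, not values read from the checkpoint.

```python
from functools import reduce
from operator import mul

def count_parameters(shapes):
    # Multiply the dimensions of each variable, then sum over all variables
    return sum(reduce(mul, shape, 1) for shape in shapes)

# Hypothetical variable shapes, for illustration only
example_shapes = [
    (3, 3, 3, 32),   # 3x3 standard convolution, 3 -> 32 channels
    (3, 3, 32, 1),   # 3x3 depthwise convolution over 32 channels
    (1, 1, 32, 64),  # 1x1 pointwise convolution, 32 -> 64 channels
]

total = count_parameters(example_shapes)
print("total params:", total)  # 864 + 288 + 2048 = 3200
```

The depthwise/pointwise pair here costs 2336 weights, whereas a single 3x3 convolution from 32 to 64 channels would need 18432, which is where most of the size reduction comes from.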

loadimg

#!/usr/bin/env python3

import os
import numpy as np
import tensorflow as tf
from keras.preprocessing.image import load_img, img_to_array
from keras.utils import np_utils
import matplotlib.pyplot as plt
import glob
from sklearn.model_selection import train_test_split


IMGSIZE = 224  # images are resized to IMGSIZE x IMGSIZE

def loadimg_one(DIRPATH, NUM):
    x = []
    y = []

    img_list = os.listdir(DIRPATH)
    img_list = sorted(img_list)
    if (NUM) and (len(img_list) > NUM):
        img_list = img_list[:NUM]
    #print("[loadimg] : img_list : ", end=' ')
    #print(img_list)
    
    with open('categories.txt', 'w') as f:
        f.write('\n'.join(img_list))
        f.write('\n')

    img_count = 0

    for number in img_list:
        dirpath = os.path.join(DIRPATH, number)
        dirpic_list = glob.glob(os.path.join(dirpath, '*.jpg'))
        dirpic_list += glob.glob(os.path.join(dirpath, '*.png'))
        for picture in dirpic_list:
            #img = img_to_array(load_img(picture, color_mode = "grayscale", target_size=(IMGSIZE, IMGSIZE)))
            img = img_to_array(load_img(picture, target_size=(IMGSIZE, IMGSIZE)))
            x.append(img)
            y.append(img_count)
            #print("Load {0} : {1}".format(picture, img_count))
        img_count += 1

    output_count = img_count
    x = np.asarray(x)
    x = x.astype('float32')
    x = x/255.0
    y = np.asarray(y, dtype=np.int32)
    y = np_utils.to_categorical(y, output_count)

    return x, y, output_count


def loadimg(COMMONDIR='./', NUM=None):
    print("########## loadimg ########")

    #COMMONDIR = './make_image'
    #TRAINDIR = os.path.join(COMMONDIR, 'train')
    #TESTDIR = os.path.join(COMMONDIR, 'test')
    x, y, class_count = loadimg_one(COMMONDIR, NUM)
    #x_test,  y_test,  _  = loadimg_one(TESTDIR, NUM)
    #for i in range(0, x_test.shape[0]):
    #    plt.imshow(x_test[i])
    #    plt.show()
    #x = np.concatenate((x_train, x_test))
    #x = np.reshape(x, [-1, 784])
    #y = np.concatenate((y_train, y_test)) 

    print("x_train, y_train, x_test, y_test, class_count")
    print("x_train shape : ", x.shape)

    print("########## END of loadimg ########")
    x_train, x_test, y_train, y_test = train_test_split(x, y,train_size=0.8, test_size=0.2)
    return x_train,  y_train, x_test, y_test, class_count

if __name__ == '__main__':
    loadimg()
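
loadimg_one assigns labels by position in the sorted directory listing and then one-hot encodes them, so the order written to categories.txt is what maps a predicted index back to a name. A minimal sketch of that mapping in plain Python, with hypothetical class names and helper names; the real script uses np_utils.to_categorical for the encoding.

```python
def build_label_index(class_names):
    # Sorted order decides the integer label, as with sorted(os.listdir(...))
    return {name: idx for idx, name in enumerate(sorted(class_names))}

def one_hot(index, num_classes):
    # Plain-Python stand-in for a one-hot row
    row = [0.0] * num_classes
    row[index] = 1.0
    return row

# Hypothetical category directory names, for illustration only
class_names = ["panda", "airplanes", "chair"]

label_index = build_label_index(class_names)
encoded = one_hot(label_index["chair"], len(label_index))
print(label_index)
print(encoded)
```

Because the label depends only on the sorted order, the same categories.txt has to be used at inference time, otherwise the predicted indices map to the wrong names.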

Movidius

import mvnc.mvncapi as mvnc
import numpy as np
from PIL import Image
import cv2
import time, sys, os

import glob

IMAGE_DIR_NAME = '/home/tokunn/caltech101'
if (len(sys.argv) > 1):
    IMAGE_DIR_NAME = sys.argv[1]
#IMAGE_DIR_NAME = 'github_deep_mnist/ncappzoo/data/digit_images'

CATEGORIES_FILE = './categories.txt'
with open(CATEGORIES_FILE, 'r') as f:
    categories = f.read().splitlines()

def predict(input):
    print("Start predicting ...")
    devices = mvnc.EnumerateDevices()
    device = mvnc.Device(devices[0])
    device.OpenDevice()

    # Load graph file data
    with open('./models/graph', 'rb') as f:
        graph_file_buffer = f.read()

    # Initialize a Graph object
    graph = device.AllocateGraph(graph_file_buffer)

    predictions = []  # renamed so it does not shadow predict()
    start = time.time()
    for i in range(len(input)):
        # Load the tensor to queue an inference, then block until the result is ready
        graph.LoadTensor(input[i], None)
        output, userobj = graph.GetResult()
        predictions.append(np.argmax(output))
    stop = time.time()
    
    for i in predictions:
        print(categories[i], end=' ', flush=True)
    print('')
    print("Time : {0} ({1} images)".format(stop-start, len(input)))

    graph.DeallocateGraph()
    device.CloseDevice()

    # Return the predicted class index of every image, not just the last raw output
    return predictions

if __name__ == '__main__':
    print("Image path : {0}".format(os.path.join(IMAGE_DIR_NAME, '*.jpg or *.png')))
    jpg_list = glob.glob(os.path.join(IMAGE_DIR_NAME, '*.jpg'))
    jpg_list += glob.glob(os.path.join(IMAGE_DIR_NAME, '*.png'))
    if not len(jpg_list):
        print("No image file")
        sys.exit()
    jpg_list.reverse()
    print([i.split('/')[-1] for i in jpg_list][:10])
    img_list = []
    for n in jpg_list:
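        image = cv2.imread(n)
        # The network takes 3-channel input and was trained on RGB images,
        # so convert OpenCV's BGR to RGB instead of grayscale
        image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
        image = cv2.resize(image, (224, 224))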
        img_list.append(image)
    img_list = np.asarray(img_list) * (1.0/255.0)
    #img_list = np.reshape(img_list, [-1, 784])
    print("imgshape ", img_list.shape)
    predict(img_list.astype(np.float16))
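
The "Time : ..." line printed by this script is the total over the LoadTensor/GetResult loop, so dividing it by the image count gives the per-image latency. A minimal sketch of that arithmetic, using the numbers quoted earlier in this article (58.298... s for 435 images, and 130.27 ms per inference from mvNCProfile). The function name is hypothetical, and treating the difference as transfer overhead is an assumption.

```python
def per_image_ms(total_seconds, n_images):
    # Average wall-clock time per image, in milliseconds
    return total_seconds / n_images * 1000.0

# Values quoted in the results section of this article
total_seconds = 58.298219442367554
n_images = 435
profiled_inference_ms = 130.27  # per-inference time reported by mvNCProfile

measured_ms = per_image_ms(total_seconds, n_images)
overhead_ms = measured_ms - profiled_inference_ms
print("measured : {:.2f} ms/image".format(measured_ms))
print("overhead : {:.2f} ms/image".format(overhead_ms))
```

The timer only wraps the inference loop, so the few milliseconds left over are presumably spent moving tensors to and from the stick rather than reading images from disk.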