tensorflow python script examples

In TensorFlow, every output of every node is called a Tensor (it "represents one of the outputs of an Operation"). The tf.Tensor object in Python is only a symbolic handle: it does not hold the value itself, just a way to compute that value in a Session.
Below is a simple TensorFlow example:
import tensorflow as tf

# Build a dataflow graph.
c = tf.constant([[1.0, 2.0], [3.0, 4.0]])
d = tf.constant([[1.0, 1.0], [0.0, 1.0]])
e = tf.matmul(c, d)

# Construct a `Session` to execute the graph.
sess = tf.Session()

# Execute the graph and store the value that `e` represents in `result`.
result = sess.run(e)
print(result)

# Clean up.
sess.close()
Output:
[[ 1. 3.]
[ 3. 7.]]
In the code above, c, d, and e are all of type tf.Tensor, while result is a numpy.ndarray. The matrix multiplication actually happens inside sess.run(). A graph is like a prepared statement in a database, and sess.run() executes that statement. tf.matmul and tf.constant are operations.
Just as we can bind parameters when executing SQL, we can do the same when executing a TensorFlow graph. TensorFlow calls this mechanism Feeding.

Feeding

Here is an example with a parameter:
import tensorflow as tf

# Build a dataflow graph.
c = tf.constant([[1.0, 2.0], [3.0, 4.0]])
d = tf.constant([[1.0, 1.0], [0.0, 1.0]])
e = tf.matmul(c, d)

# Construct a `Session` to execute the graph.
sess = tf.Session()

# Execute the graph and store the value that `e` represents in `result`.
result = sess.run(e,feed_dict={c:[[0.0, 0.0], [3.0, 4.0]]})
print(result)
sess.close()
Output:
[[ 0. 0.]
[ 3. 7.]]
The only difference from the previous code is the added feed_dict. A fed value replaces that node's original output. feed_dict is a dict, so several tensors can be fed at once, and almost any node can be fed. For example, if we feed e instead of c:
result = sess.run(e,feed_dict={e:[[0.0, 0.0], [3.0, 4.0]]})
then the output is simply the fed value; the matmul is never executed:
[[ 0. 0.]
[ 3. 4.]]

PlaceHolder

PlaceHolder is a special kind of Operation. It cannot be executed on its own; running it directly fails. It is just like Const, except that it has no default value, so a value must always be fed in.
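For example, a minimal sketch in the style of the code above (the error comment is abbreviated):
import tensorflow as tf

# A placeholder must be fed; it has no value of its own.
c = tf.placeholder(tf.float32, shape=[2, 2])
d = tf.constant([[1.0, 1.0], [0.0, 1.0]])
e = tf.matmul(c, d)

sess = tf.Session()
# Feeding c works exactly as feeding the constant did above.
print(sess.run(e, feed_dict={c: [[1.0, 2.0], [3.0, 4.0]]}))
# sess.run(e) alone would fail with:
# InvalidArgumentError: You must feed a value for placeholder tensor ...
sess.close()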

Stateful node and Variable

Operations (nodes) in TensorFlow are either stateful or stateless. Most operations (e.g. matmul, constant) are stateless. A Variable is a stateful node: it keeps its value across sess.run() calls, as the example below shows.
import tensorflow as tf

# Build a dataflow graph.
count = tf.Variable([0],trainable=False)
init_op = tf.global_variables_initializer()
update_count = count.assign_add(tf.constant([2]))

# Construct a `Session` to execute the graph.
sess = tf.Session()
sess.run(init_op)

for step in range(10):
    result = sess.run(update_count)
    print("step %d: count = %g" % (step,result))

sess.close()
Output:
step 0: count = 2
step 1: count = 4
step 2: count = 6
step 3: count = 8
step 4: count = 10
step 5: count = 12
step 6: count = 14
step 7: count = 16
step 8: count = 18
step 9: count = 20
Another way to create a Variable is the tf.get_variable function. Its advantage is that a variable created this way can be looked up by name and shared (via variable scopes) instead of a new copy being created each time. For example, the line above

count = tf.Variable([0],trainable=False)

can be rewritten as:
count = tf.get_variable("count", [1], initializer=tf.zeros_initializer(), trainable=False, dtype=tf.int32)
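
The sharing works through variable scopes. A minimal sketch (the scope name "counter" is only for illustration):
import tensorflow as tf

with tf.variable_scope("counter"):
    count = tf.get_variable("count", [1], initializer=tf.zeros_initializer(), trainable=False, dtype=tf.int32)

# Elsewhere, request the same name with reuse=True instead of creating a copy.
with tf.variable_scope("counter", reuse=True):
    same_count = tf.get_variable("count", [1], dtype=tf.int32)

print(count is same_count)  # True: both handles refer to one variable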

On the lifetime of a variable:
In single-machine TF, a variable's contents live inside the session. Once the session is destroyed, the variable and its contents are destroyed with it. Distributed TF is different: there a variable lives on the server, so it can outlive any single session.

Take the following code as an example:
import tensorflow as tf

# Build a dataflow graph.
count = tf.get_variable("count", [1], initializer=tf.zeros_initializer(), trainable=False, dtype=tf.int32)
init_op = tf.global_variables_initializer()
update_count = count.assign_add(tf.constant([2]))

target = ""
is_distributed = True
if is_distributed:
    server = tf.train.Server.create_local_server()
    target = server.target

# Construct a `Session` to execute the graph.
with tf.Session(target=target) as sess:
    sess.run(init_op)
    result = sess.run(update_count)
    print("count = %g" % (result))


with tf.Session(target=target) as sess:
    print(sess.run(count))


When is_distributed is set to True, count equals 2 at both print statements. When is_distributed is set to False, however, the second print raises an exception, because the second session holds its own, never-initialized copy of the variable:
tensorflow.python.framework.errors_impl.FailedPreconditionError: Attempting to use uninitialized value count

Reader and Queue

Every Reader must be used together with a Queue, because the ReaderReadOp looks up a resource named "queue_handle" of type QueueInterface in the OpKernelContext.
import tensorflow as tf

# Build a dataflow graph.
filename_queue = tf.train.string_input_producer(['1.txt'],num_epochs=1)
reader = tf.TextLineReader()
key,value = reader.read(filename_queue)
num = tf.decode_csv(value,record_defaults=[[0]])
count = tf.Variable([0],trainable=False)
init_op = tf.group(tf.global_variables_initializer(),
                   tf.local_variables_initializer())
update_count = count.assign_add(num)

with tf.Session() as sess:
    sess.run(init_op)
    # Start input enqueue threads.
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(sess=sess, coord=coord)
    step = 0
    try:
        while not coord.should_stop():
            result = sess.run(update_count)
            step += 1
            print("step %d: count = %g" % (step,result))
    except tf.errors.OutOfRangeError:
        print("Finished")
    finally:
        # When done, ask the threads to stop.
        coord.request_stop()

    # Wait for threads to finish.
    coord.join(threads)
Suppose 1.txt contains:
3
2
5
8
then the program above should print:
step 1: count = 3
step 2: count = 5
step 3: count = 10
step 4: count = 18
To simplify the main loop and the thread management, the code above can also be rewritten like this:
import tensorflow as tf

# Build a dataflow graph.
filename_queue = tf.train.string_input_producer(['1.txt'],num_epochs=1)
reader = tf.TextLineReader()
key,value = reader.read(filename_queue)
num = tf.decode_csv(value,record_defaults=[[0]])
count = tf.Variable([0],trainable=False)
update_count = count.assign_add(num)

def train_fn(sess):
  train_fn.counter += 1
  result = sess.run(update_count)
  print("step %d: count = %g" % (train_fn.counter,result))

train_fn.counter = 0

sv = tf.train.Supervisor()
tf.train.basic_train_loop(sv,train_fn)
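
Here tf.train.Supervisor's managed session takes over what the Coordinator code did by hand: it runs the initializers, starts the queue runners, and treats the queue's OutOfRangeError as a normal stop, which is why the explicit try/except loop disappears.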

Gradient

tf.gradients([loss], x) adds nodes to the graph that compute the symbolic derivative of loss with respect to x. In the example below loss = x * num, so the gradient is simply num, and each step prints the value just read from the file:
import tensorflow as tf

# Build a dataflow graph.
filename_queue = tf.train.string_input_producer(['1.txt'],num_epochs=1)
reader = tf.TextLineReader()
key,value = reader.read(filename_queue)
num = tf.decode_csv(value,record_defaults=[[0]])
x = tf.Variable([0])
loss = x * num
grads = tf.gradients([loss],x)
grad_x = grads[0]

def train_fn(sess):
  train_fn.counter += 1
  result = sess.run(grad_x)
  print("step %d: grad = %g" % (train_fn.counter,result))

train_fn.counter = 0

sv = tf.train.Supervisor()
tf.train.basic_train_loop(sv,train_fn)
Suppose 1.txt contains:
3
2
5
8
then the program above should print:
step 1: grad = 3
step 2: grad = 2
step 3: grad = 5
step 4: grad = 8
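
In a real training loop you usually do not call tf.gradients directly; an optimizer wraps the gradient computation and the update step. A minimal sketch with a hypothetical float loss (the learning rate 0.1 is arbitrary):
import tensorflow as tf

x = tf.Variable([0.0])
loss = tf.reduce_sum(x * 3.0)  # a scalar loss; d(loss)/dx = 3
opt = tf.train.GradientDescentOptimizer(learning_rate=0.1)
train_op = opt.minimize(loss)  # builds the gradient and update nodes in one call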

View graph in browser

Here is how to write out the graph so you can inspect it:
import tensorflow as tf

# Build a dataflow graph.
c = tf.constant([[1.0, 2.0], [3.0, 4.0]], name = 'c')
d = tf.constant([[1.0, 1.0], [0.0, 1.0]], name = 'd')
e = tf.matmul(c, d, name = 'e')

# Construct a `Session` to execute the graph.
sess = tf.Session()
train_writer = tf.summary.FileWriter("t3", sess.graph)
# Execute the graph and store the value that `e` represents in `result`.
result = sess.run(e)
print(result)
train_writer.close()
sess.close()
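
The FileWriter writes the graph definition into the "t3" directory. To view it, start TensorBoard with "tensorboard --logdir t3" and open the address it prints (by default http://localhost:6006) in a browser.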
