I am working on a machine-learning script with tflearn and gym.
I can get one network working in my Python script, but whenever I call my functions to build a second or third network and train it with model.fit, I get the following error:
tensorflow.python.framework.errors_impl.InvalidArgumentError
Edit: The goal is to build several different networks in order to compare them. At first the comparison should only cover the input data and the number of training epochs, but eventually I'd like to compare different network sizes as well. I'd also like to run this in a loop that builds more than two networks.
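To make the goal concrete, this is roughly the loop I have in mind. It is only a sketch: run_experiments, make_data and train are placeholder names I made up for illustration, and I am assuming the training function accepts an n_training_epochs keyword like my train_model does.

```python
from itertools import product


def run_experiments(pop_sizes, epoch_counts, make_data, train):
    """Train one model per (pop_size, n_epochs) combination.

    make_data and train are passed in as callables, so the loop itself
    does not depend on gym or tflearn.
    """
    results = {}
    for pop_size, n_epochs in product(pop_sizes, epoch_counts):
        # Fresh training data for every run.
        training_data = make_data(pop_size)
        # One newly built and trained model per configuration.
        results[(pop_size, n_epochs)] = train(
            training_data, n_training_epochs=n_epochs
        )
    return results
```

With the functions from my script, the call would be something like run_experiments([5, 50], [1, 5], initial_population, train_model). As it stands, I would expect this to hit the same error on the second iteration.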
The following script reproduces my error. It consists of three functions:
- initial_population(pop_size): plays pop_size games with random actions and returns the recorded observations and actions as an array of training data
- neural_network_model(input_size): creates a neural network
- train_model(training_data): creates a new model if none is passed, and trains the model on the provided training data
import gym
import random
import numpy as np
import tflearn
from tflearn.layers.core import input_data, dropout, fully_connected
from tflearn.layers.estimator import regression
LR = 1e-3
env = gym.make('CartPole-v0')
env.reset()
goal_steps = 500
score_requirement = 1
def initial_population(pop_size):
    training_data = []
    scores = []
    accepted_scores = []
    for _ in range(pop_size):
        score = 0
        game_memory = []
        prev_observation = []
        for _ in range(goal_steps):
            action = random.randrange(0,2)
            observation, reward, done, info = env.step(action)
            if len(prev_observation) > 0:
                game_memory.append([prev_observation, action])
            prev_observation = observation
            score += reward
            if done:
                break
        if score >= score_requirement:
            accepted_scores.append(score)
            for data in game_memory:
                if data[1] == 1:
                    output = [0,1]
                elif data[1] == 0:
                    output = [1,0]
                training_data.append([data[0], output])
        env.reset()
        scores.append(score)
    return np.array(training_data)
def neural_network_model(input_size):
    network = input_data(shape=[None, input_size, 1], name='input')
    network = fully_connected(network, 128, activation='relu')
    network = dropout(network, 0.8)
    network = fully_connected(network, 2, activation='softmax')
    network = regression(network, optimizer='adam', learning_rate=LR,
                         loss='categorical_crossentropy', name='targets')
    model = tflearn.DNN(network, tensorboard_dir='log')
    return model
def train_model(training_data, model=False, n_training_epochs=5):
    X = np.array([i[0] for i in training_data]).reshape(-1, len(training_data[0][0]), 1)
    Y = [i[1] for i in training_data]
    if not model:
        model = neural_network_model(input_size = len(X[0]))
    model.fit({'input':X}, {'targets':Y}, n_epoch=n_training_epochs, snapshot_step=500, show_metric=True)
    return model
if __name__ == "__main__":
    training_data = initial_population(5)
    print("still alive 1")
    model = train_model(training_data, n_training_epochs=1)
    print("still alive 2")
    training_data = initial_population(1)
    print("still alive 3")
    model = train_model(training_data, n_training_epochs=1)
    print("still alive 4")
With the output:
C:\Users\username\AppData\Local\Programs\Python\Python36\python.exe C:/Users/username/.PyCharm2017.1/config/scratches/scratch.py
curses is not supported on this machine (please install/reinstall curses for an optimal experience)
still alive 1
2017-11-21 01:03:45.096492: I C:\tf_jenkins\home\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\platform\cpu_feature_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX AVX2
2017-11-21 01:03:45.355914: I C:\tf_jenkins\home\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\gpu\gpu_device.cc:1030] Found device 0 with properties:
name: GeForce GTX 980 Ti major: 5 minor: 2 memoryClockRate(GHz): 1.228
pciBusID: 0000:01:00.0
totalMemory: 6.00GiB freeMemory: 4.97GiB
2017-11-21 01:03:45.356242: I C:\tf_jenkins\home\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\gpu\gpu_device.cc:1120] Creating TensorFlow device (/device:GPU:0) -> (device: 0, name: GeForce GTX 980 Ti, pci bus id: 0000:01:00.0, compute capability: 5.2)
2017-11-21 01:03:46.394283: I C:\tf_jenkins\home\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\gpu\gpu_device.cc:1120] Creating TensorFlow device (/device:GPU:0) -> (device: 0, name: GeForce GTX 980 Ti, pci bus id: 0000:01:00.0, compute capability: 5.2)
---------------------------------
Run id: BCIV9S
Log directory: log/
---------------------------------
Training samples: 137
Validation samples: 0
--
Training Step: 1 | time: 0.224s
| Adam | epoch: 001 | loss: 0.00000 - acc: 0.0000 -- iter: 064/137
Training Step: 2 | total loss: 0.62389 | time: 0.234s
| Adam | epoch: 001 | loss: 0.62389 - acc: 0.4500 -- iter: 128/137
Training Step: 3 | total loss: 0.68097 | time: 0.239s
| Adam | epoch: 001 | loss: 0.68097 - acc: 0.3631 -- iter: 137/137
--
still alive 2
still alive 3
2017-11-21 01:03:47.234643: I C:\tf_jenkins\home\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\gpu\gpu_device.cc:1120] Creating TensorFlow device (/device:GPU:0) -> (device: 0, name: GeForce GTX 980 Ti, pci bus id: 0000:01:00.0, compute capability: 5.2)
2017-11-21 01:03:48.302791: I C:\tf_jenkins\home\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\gpu\gpu_device.cc:1120] Creating TensorFlow device (/device:GPU:0) -> (device: 0, name: GeForce GTX 980 Ti, pci bus id: 0000:01:00.0, compute capability: 5.2)
---------------------------------
Run id: HHBWWQ
Log directory: log/
---------------------------------
Training samples: 20
Validation samples: 0
--
2017-11-21 01:03:49.928408: W C:\tf_jenkins\home\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\framework\op_kernel.cc:1192] Invalid argument: You must feed a value for placeholder tensor 'input_1/X' with dtype float and shape [?,4,1]
[[Node: input_1/X = Placeholder[dtype=DT_FLOAT, shape=[?,4,1], _device="/job:localhost/replica:0/task:0/device:GPU:0"]()]]
2017-11-21 01:03:49.928684: W C:\tf_jenkins\home\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\framework\op_kernel.cc:1192] Invalid argument: You must feed a value for placeholder tensor 'input_1/X' with dtype float and shape [?,4,1]
[[Node: input_1/X = Placeholder[dtype=DT_FLOAT, shape=[?,4,1], _device="/job:localhost/replica:0/task:0/device:GPU:0"]()]]
Traceback (most recent call last):
File "C:\Users\username\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\client\session.py", line 1323, in _do_call
return fn(*args)
File "C:\Users\username\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\client\session.py", line 1302, in _run_fn
status, run_metadata)
File "C:\Users\username\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\framework\errors_impl.py", line 473, in __exit__
c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.InvalidArgumentError: You must feed a value for placeholder tensor 'input_1/X' with dtype float and shape [?,4,1]
[[Node: input_1/X = Placeholder[dtype=DT_FLOAT, shape=[?,4,1], _device="/job:localhost/replica:0/task:0/device:GPU:0"]()]]
[[Node: Dropout_1/cond/Merge/_119 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_274_Dropout_1/cond/Merge", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:/Users/username/.PyCharm2017.1/config/scratches/scratch.py", line 69, in <module>
model = train_model(training_data, n_training_epochs=1)
File "C:/Users/username/.PyCharm2017.1/config/scratches/scratch.py", line 58, in train_model
model.fit({'input':X}, {'targets':Y}, n_epoch=n_training_epochs, snapshot_step=500, show_metric=True)
File "C:\Users\username\AppData\Local\Programs\Python\Python36\lib\site-packages\tflearn\models\dnn.py", line 216, in fit
callbacks=callbacks)
File "C:\Users\username\AppData\Local\Programs\Python\Python36\lib\site-packages\tflearn\helpers\trainer.py", line 339, in fit
show_metric)
File "C:\Users\username\AppData\Local\Programs\Python\Python36\lib\site-packages\tflearn\helpers\trainer.py", line 818, in _train
feed_batch)
File "C:\Users\username\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\client\session.py", line 889, in run
run_metadata_ptr)
File "C:\Users\username\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\client\session.py", line 1120, in _run
feed_dict_tensor, options, run_metadata)
File "C:\Users\username\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\client\session.py", line 1317, in _do_run
options, run_metadata)
File "C:\Users\username\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\client\session.py", line 1336, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: You must feed a value for placeholder tensor 'input_1/X' with dtype float and shape [?,4,1]
[[Node: input_1/X = Placeholder[dtype=DT_FLOAT, shape=[?,4,1], _device="/job:localhost/replica:0/task:0/device:GPU:0"]()]]
[[Node: Dropout_1/cond/Merge/_119 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_274_Dropout_1/cond/Merge", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]
Caused by op 'input_1/X', defined at:
File "C:/Users/username/.PyCharm2017.1/config/scratches/scratch.py", line 69, in <module>
model = train_model(training_data, n_training_epochs=1)
File "C:/Users/username/.PyCharm2017.1/config/scratches/scratch.py", line 57, in train_model
model = neural_network_model(input_size = len(X[0]))
File "C:/Users/username/.PyCharm2017.1/config/scratches/scratch.py", line 44, in neural_network_model
network = input_data(shape=[None, input_size, 1], name='input')
File "C:\Users\username\AppData\Local\Programs\Python\Python36\lib\site-packages\tflearn\layers\core.py", line 81, in input_data
placeholder = tf.placeholder(shape=shape, dtype=dtype, name="X")
File "C:\Users\username\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\ops\array_ops.py", line 1599, in placeholder
return gen_array_ops._placeholder(dtype=dtype, shape=shape, name=name)
File "C:\Users\username\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\ops\gen_array_ops.py", line 3090, in _placeholder
"Placeholder", dtype=dtype, shape=shape, name=name)
File "C:\Users\username\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 787, in _apply_op_helper
op_def=op_def)
File "C:\Users\username\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\framework\ops.py", line 2956, in create_op
op_def=op_def)
File "C:\Users\username\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\framework\ops.py", line 1470, in __init__
self._traceback = self._graph._extract_stack() # pylint: disable=protected-access
InvalidArgumentError (see above for traceback): You must feed a value for placeholder tensor 'input_1/X' with dtype float and shape [?,4,1]
[[Node: input_1/X = Placeholder[dtype=DT_FLOAT, shape=[?,4,1], _device="/job:localhost/replica:0/task:0/device:GPU:0"]()]]
[[Node: Dropout_1/cond/Merge/_119 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_274_Dropout_1/cond/Merge", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]
Process finished with exit code 1
The critical part seems to be that model.fit does not receive the right input the second time it is called: according to the error, no value was fed to the placeholder 'input_1/X'. It looks as if both model instances share some variables, data, or other state, and that this breaks the second training run.
For regular TensorFlow, I've seen that you may need a separate session for every new model, but I don't know whether that also applies to the tflearn package.
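The kind of per-model isolation I have seen suggested for plain TensorFlow would look roughly like the sketch below. The helper name train_in_fresh_graph and its parameters are my own invention, and I have not confirmed that this works with tflearn. The graph class is passed in as graph_factory; the only assumption is that it returns an object whose as_default() method is a context manager, as tf.Graph does.

```python
def train_in_fresh_graph(train, training_data, graph_factory, **train_kwargs):
    """Build and train a model inside its own, newly created graph.

    graph_factory is expected to be something like tf.Graph (assumption):
    calling it returns an object whose as_default() is a context manager.
    """
    graph = graph_factory()
    # Everything created inside this block is registered in the new graph
    # rather than in the graph used by a previously built model.
    with graph.as_default():
        return train(training_data, **train_kwargs)
```

In my script the second call would then become train_in_fresh_graph(train_model, training_data, tf.Graph, n_training_epochs=1), after adding import tensorflow as tf.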
I am working on Windows 10 and Python 3.6.