Error in saving and using model of TensorForestEstimator for Android

Question

I use the random forest estimator implemented in TensorFlow to predict whether a text is English or not. I trained and saved my model (on a dataset with 2k samples and 2 class labels, 0/1 for Not English/English) using the following code (the train_input_fn function returns the features and class labels):

model_path = 'test/'
# Keep a reference to the estimator so that fit() can be called on it
estimator = TensorForestEstimator(params, model_dir=model_path)
estimator.fit(input_fn=train_input_fn, max_steps=1)

After running the above code, the graph.pbtxt and checkpoints are saved in the model folder. Now I want to use it on Android. I have 2 problems:

As the first step, I need to freeze the graph and checkpoints into a .pb file to use it on Android. I tried freeze_graph (I used the code here: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/tools/freeze_graph.py). When I call freeze_graph on my model, I get the following error, and the final .pb graph is not created:

  File "/Users/XXXXXXX/freeze_graph.py", line 105, in freeze_graph
    _ = tf.import_graph_def(input_graph_def, name="")
  File "/anaconda/envs/tensorflow/lib/python2.7/site-packages/tensorflow/python/framework/importer.py", line 258, in import_graph_def
    op_def = op_dict[node.op]
KeyError: u'CountExtremelyRandomStats'

This is how I call freeze_graph:

import os
import freeze_graph  # local copy of tensorflow/python/tools/freeze_graph.py

def save_model_android():
    # model_path is the estimator's model directory defined earlier
    checkpoint_state_name = "model.ckpt-1"
    input_graph_name = "graph.pbtxt"
    output_graph_name = "output_graph.pb"
    checkpoint_path = os.path.join(model_path, checkpoint_state_name)

    input_graph_path = os.path.join(model_path, input_graph_name)
    input_saver_def_path = None
    input_binary = False  # graph.pbtxt is a text-format GraphDef
    output_node_names = "output"
    restore_op_name = "save/restore_all"
    filename_tensor_name = "save/Const:0"
    output_graph_path = os.path.join(model_path, output_graph_name)
    clear_devices = True

    freeze_graph.freeze_graph(input_graph_path, input_saver_def_path,
                              input_binary, checkpoint_path,
                              output_node_names, restore_op_name,
                              filename_tensor_name, output_graph_path,
                              clear_devices, "")

I also tried freezing a model trained on the iris dataset from "tf.contrib.learn.datasets.load_iris" and got the same error, so I believe the problem is not related to the dataset.

As a second step, I need to use the .pb file on the phone to classify a text. I found Google's camera demo example, but it contains a lot of code. I wonder if there is a step-by-step tutorial on how to use a TensorFlow model on Android by passing in a feature vector and getting back the class label.

Thanks, in advance!

UPDATE

With a recent version of TensorFlow (0.12), that problem is solved. The remaining question is what I should pass to output_node_names. How can I find out which nodes in the graph are the output nodes?
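One way to look for candidate output nodes, as a rough sketch rather than a definitive method: scan the text-format graph.pbtxt and list the nodes that no other node takes as an input. The helper name candidate_output_nodes and the node names in the sample ("Placeholder", "probabilities", "init") are made up for illustration; only the TreePredictions op name comes from this thread. The scan is a line-based heuristic and does not use the TensorFlow or protobuf parsers, so it assumes the usual one-field-per-line layout of graph.pbtxt.

```python
import re


def candidate_output_nodes(pbtxt_text):
    """Return (nodes, candidates) for a text-format GraphDef.

    nodes is a list of (name, op) pairs in file order; candidates lists the
    names of nodes that no other node takes as an input.
    """
    nodes = []
    consumed = set()   # node names that appear as an input of some node
    depth = 0          # current brace nesting depth
    in_node = False    # True while inside a top-level "node { ... }" block
    name = op = None

    for raw in pbtxt_text.splitlines():
        line = raw.strip()
        if depth == 0 and line == "node {":
            in_node, name, op, depth = True, None, None, 1
            continue
        if in_node and depth == 1:
            # Only fields directly inside the node block matter here;
            # nested attr blocks are skipped via the depth counter.
            match = re.match(r'(name|op|input):\s*"(.*)"$', line)
            if match:
                key, value = match.groups()
                if key == "name":
                    name = value
                elif key == "op":
                    op = value
                else:
                    # Drop the "^" control-dependency marker and ":N" index.
                    consumed.add(value.lstrip("^").split(":")[0])
                continue
        if line.endswith("{"):
            depth += 1
        elif line == "}":
            depth -= 1
            if depth == 0 and in_node:
                nodes.append((name, op))
                in_node = False

    candidates = [n for n, _ in nodes if n not in consumed]
    return nodes, candidates


# Hypothetical miniature graph, only to exercise the function.
SAMPLE_PBTXT = '''
node {
  name: "Placeholder"
  op: "Placeholder"
  attr {
    key: "dtype"
    value {
      type: DT_FLOAT
    }
  }
}
node {
  name: "probabilities"
  op: "TreePredictions"
  input: "Placeholder"
  input: "^init"
}
node {
  name: "init"
  op: "NoOp"
}
versions {
  producer: 17
}
'''

nodes, candidates = candidate_output_nodes(SAMPLE_PBTXT)
print(candidates)
```

On a real model, the text would come from reading graph.pbtxt in the model directory. The result is only a list of candidates: saver, init and summary nodes are usually unconsumed too, so the prediction node still has to be picked by its op type or name. freeze_graph takes output_node_names as a comma-separated string of node names.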

score 1 · Answer 1 · answered Nov 28 '16 at 17:30

Re (1) it looks like you are running freeze_graph on a build of tensorflow which does not have access to contrib ops. Maybe try explicitly importing tensorforest before calling freeze_graph?

Re (2) I don't know of a simpler example.

Alexandre Passos

Thanks for the reply. I didn't get what you mean. Would you please describe more by an example? Thanks a lot! – TryToBeNice Nov 28 '16 at 17:46
An example of what exactly? – Alexandre Passos Nov 28 '16 at 17:55
"try explicitly importing tensorforest before calling freeze_graph" -> How should I do it in the save_model_android function? – TryToBeNice Nov 28 '16 at 18:03
import tensorflow as tf; tf.contrib.tensor_forest.python.tensor_forest, or something like this. – Alexandre Passos Nov 28 '16 at 18:23
Sorry you're hitting problems! If you can email me at at google.com, I'd like to see if you can share your GraphDef and checkpoint so I can debug the problem. On the tutorial front, you might find https://petewarden.com/2016/09/27/tensorflow-for-mobile-poets/ useful (though it's aimed at iOS). – Pete Warden Dec 07 '16 at 16:37
Thanks a lot, Pete! I've already seen your tutorial, but since it is for iOS, I didn't dive into it. I just sent you an email containing the model. I would really appreciate it if you could take a look. Thanks!! – TryToBeNice Dec 12 '16 at 18:34

score 0 · Answer 2 · answered Nov 28 '16 at 19:22

CountExtremelyRandomStats is one of TensorForest's custom ops, and exists in tensorflow/contrib. As was pointed out, TF switched to including contrib ops by default at some point. I don't think there's an easy way to include the contrib custom ops in the global registry in the previous releases, because TensorForest uses the method of building a .so file that is included as a data file which is loaded at runtime (a method that was the standard when TensorForest was created, but may not be any longer). So there are no easily-included python build rules that will properly link in the C++ custom ops. You can try including tensorflow/contrib/tensor_forest:ops_lib as a dep in your build rule, but I don't think it will work.

In any case, you can try installing the nightly build of tensorflow. The alternative includes modifying how tensorforest custom ops are built, which is pretty nasty.

Gilbert Hendry

Thanks, Gilbert! I was using tensorflow 0.11. I upgraded tensorflow to version 0.12 today. Now I receive the same error, but on another op, I guess: KeyError: u'TreePredictions'. So you mean it is a bug in tensorflow? I ask because I am just passing the output of one part of tensorflow to another. – TryToBeNice Nov 29 '16 at 13:44
It seems the problem is the kind of saving in TensorForestEstimator: WARNING:tensorflow:TensorFlow's V1 checkpoint format has been deprecated. WARNING:tensorflow:Consider switching to the more efficient V2 format: WARNING:tensorflow: `tf.train.Saver(write_version=tf.train.SaverDef.V2)` WARNING:tensorflow:now on by default. I used the following to save the model: classifier = tf.contrib.learn.TensorForestEstimator(hparams, model_dir='test/'). What should I do now? – TryToBeNice Nov 29 '16 at 14:05
Don't worry about the warning regarding Saver version, the switch to the new format will happen under the hood. That shouldn't be a problem unless you're trying to load a checkpoint from before you upgraded TF (is that what you're saying?). – Gilbert Hendry Nov 30 '16 at 15:09
Let me try to reproduce the TreePredictions error. Things in contrib (TensorForest) are not strictly supported, so we don't yet have any testing around using freeze_graph with it. – Gilbert Hendry Nov 30 '16 at 15:23
Thanks! I put a small example here: https://github.com/tensorflow/tensorflow/issues/5938 – TryToBeNice Nov 30 '16 at 15:59
Per my comment on the github issue, there's a mismatch in checkpoint version between what tf.learn.Estimators use and what tf.import_graph_def uses. – Gilbert Hendry Dec 06 '16 at 15:36
Great, return commits the comment. For now, you could bypass the estimator and train a random forest with raw tensorflow. You construct a tensor_forest graph builder: – Gilbert Hendry Dec 06 '16 at 15:38
builder = tf.contrib.tensor_forest.python.tensor_forest.RandomForestGraphs(params); train_op = builder.training_graph(features, labels) – Gilbert Hendry Dec 06 '16 at 15:39
Thanks, Gilbert! Let me please check what you suggested. – TryToBeNice Dec 06 '16 at 15:41
How should I save the graph now? – TryToBeNice Dec 06 '16 at 16:12
It still has a problem when I save and want to freeze the graph. Can you kindly provide a small example that works? – TryToBeNice Dec 07 '16 at 09:35
I think you actually get farther than I do (you're still getting the KeyError with the custom op?). Are you building with Bazel? Have you tried including tensorflow/contrib/tensor_forest:all_ops as a dependency? – Gilbert Hendry Dec 07 '16 at 16:05
I do get the error of the custom op. I am a beginner with tensorflow so I will try your new suggestions (I mean Bazel and dependency). Thanks again! – TryToBeNice Dec 07 '16 at 16:24
Another thing I thought of that might be simpler: many contrib libraries seem to load their custom ops at module scope, which is maybe why Alexandre suggested explicitly importing tensorforest. However, Tensorforest doesn't load them at module scope, they're hidden behind a Load() function. So maybe try calling training_ops.Load() and inference_ops.Load() (found in tensorflow/contrib/tensor_forest/python/ops) before calling freeze_graph. – Gilbert Hendry Dec 08 '16 at 21:02
Thanks, Gilbert! It solved the issue, but now I am getting the following error: Could not open forest_model/checkpoint: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator? I am going to email the model to Pete to see the problem. Please let me know if you know something about the new error. Thanks! – TryToBeNice Dec 12 '16 at 18:25
That sounds like trying to load a "V1" checkpoint as a "V2" checkpoint (V2 checkpoints are sstables I believe, V1 checkpoints are not). This is the error I mentioned in the github issue, where tf.import_graph_def probably uses the default checkpoint version (which switched to V2 some time ago), and tf.learn (which the TensorForestEstimator is based on) had been hanging on to the V1 checkpoints. However, tf.learn just switched to V2 over the weekend, so if you can pick up the nightly build, that might fix your problem. – Gilbert Hendry Dec 13 '16 at 19:00
I downloaded nightly build today and installed it. I tried the simplest example of estimator (Linear Classifier) from https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/learn/python/learn I still get the warning of "TensorFlow's V1 checkpoint format has been deprecated". I use "Mac CPU-only: Python 2" version. – TryToBeNice Dec 14 '16 at 09:34
With the new version of tensorflow that I downloaded today, I don't see the graph version warning anymore. Just see some warnings related to deprecated functions. I corrected one in the tensor_forest.py by using function, but I cannot replace concat with concat_v2 as I get an error. Anyway, the error still exists: "Unable to open table file test/checkpoint: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?" – TryToBeNice Dec 15 '16 at 17:08
Are you still passing the directory as input_saver? Have you tried passing None for it? – Gilbert Hendry Dec 16 '16 at 18:04
If I pass None as input_saver, I get the same error namely, "Unable to open table file test/checkpoint". If I pass the path of graph.pbtxt as input_saver (it means both input_graph and input_saver have the same value), I get this error: "Message type "tensorflow.SaverDef" has no field named "node"". I really don't know what's going on :-(. Shall I completely ignore the estimator and use the original tensorflow? – TryToBeNice Dec 19 '16 at 17:50
I solved the issue by passing the value of model_checkpoint_path listed in the checkpoint file ("model.ckpt-1"). I had thought I was supposed to pass the path of the checkpoint file itself rather than the file name listed inside it. Now there is another issue, with output_node_names: how can I find the output node names in the model? – TryToBeNice Dec 20 '16 at 11:18
Should I pass names in the graph to output_node_names like this one: output_node_names = "Placeholder_1,Placeholder,zeros,Variable,..." ?? – TryToBeNice Dec 20 '16 at 17:19
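The checkpoint mix-up discussed in these comments can be illustrated with a small sketch. The file named checkpoint in the model directory is a small text file whose model_checkpoint_path entry names the actual checkpoint (here "model.ckpt-1"), and that name, not the checkpoint file itself, is what gets passed as the input checkpoint. The helper read_checkpoint_prefix below is hypothetical and plain Python; TensorFlow's own tf.train.latest_checkpoint(model_dir) is the usual way to do this lookup when TensorFlow is importable.

```python
import os
import re
import tempfile


def read_checkpoint_prefix(model_dir, state_filename="checkpoint"):
    """Return the path named by model_checkpoint_path in the state file."""
    state_path = os.path.join(model_dir, state_filename)
    with open(state_path) as f:
        for line in f:
            # Anchored at the start of the line, so the
            # all_model_checkpoint_paths entries are not matched.
            match = re.match(r'\s*model_checkpoint_path:\s*"(.*)"\s*$', line)
            if match:
                prefix = match.group(1)
                if os.path.isabs(prefix):
                    return prefix
                # Relative entries are resolved against the model directory.
                return os.path.join(model_dir, prefix)
    raise ValueError("no model_checkpoint_path entry in %s" % state_path)


# Demo on a throwaway directory with a hand-written state file.
demo_dir = tempfile.mkdtemp()
with open(os.path.join(demo_dir, "checkpoint"), "w") as f:
    f.write('model_checkpoint_path: "model.ckpt-1"\n')
    f.write('all_model_checkpoint_paths: "model.ckpt-1"\n')

prefix = read_checkpoint_prefix(demo_dir)
print(prefix)
```

The returned path is what would go into the checkpoint argument of freeze_graph in place of a hard-coded "model.ckpt-1".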
