
When we run models/syntaxnet/$ echo "sentence to parse" | ./syntaxnet/demo.sh, which specific tensor receives "sentence to parse"?

I built a SyntaxNet server (AWS, Django) to support my other conversational tasks. Every time I send a sentence query to the server, it takes around 3.5 seconds to get the parsed sentence back.

This is not fast enough for my task, so I tried to find the bottleneck. I found that import tensorflow as tf takes about 0.8 seconds, and that cost is actually paid twice (1.6 seconds total) since SyntaxNet has two steps (POS tagging and parsing), even before it loads parameters and builds graphs.
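To confirm where the time goes, here is a quick way to measure import cost (a generic timing helper I wrote for illustration; it times a stdlib module as a stand-in, but substituting "tensorflow" on a machine where it is installed should reproduce the ~0.8 s figure above):

```python
import importlib
import time

def time_import(module_name):
    """Return the wall-clock seconds taken to import the named module."""
    start = time.time()
    importlib.import_module(module_name)
    return time.time() - start

# Stand-in: timing a stdlib module. Replace "json" with "tensorflow"
# to measure the import cost discussed above.
print("import took %.4f s" % time_import("json"))
```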

I want my server to stay 'awake', ready to parse sentences with the graph and parameters pre-loaded. So I tried to tweak SyntaxNet to work as below: the session continuously receives user input via input(), prints the computed tensor, and never shuts down.

import tensorflow as tf

def multiplication(sess):
    x1 = int(input())
    matrix1 = tf.constant([[x1, x1]])
    matrix2 = tf.constant([[2], [2]])
    product = sess.run([tf.matmul(matrix1, matrix2)])
    return product

with tf.Session() as sess:
    while True:
        print(multiplication(sess))
------------------------------------------
1
[array([[4]], dtype=int32)]
2
[array([[8]], dtype=int32)]
3
[array([[12]], dtype=int32)]

However, I cannot figure out where to implement the input() part. When we run models/syntaxnet/$ echo "sentence to parse" | ./syntaxnet/demo.sh, how does demo.sh receive stdin? In other words, where does "sentence to parse" go? I couldn't find any read inside the bash script.
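One relevant shell fact: a script does not need read to pass its stdin along, because child processes inherit the script's stdin unless it is redirected. So the piped sentence flows straight through demo.sh to the underlying parser process. A hypothetical minimal reproduction:

```shell
# A script with no `read` whose child process still sees the piped
# sentence, because children inherit the script's stdin.
cat > /tmp/demo_stdin.sh <<'EOF'
#!/bin/sh
# No `read` here -- the python child consumes stdin directly.
python3 -c 'import sys; print("child got: " + sys.stdin.readline().strip())'
EOF
chmod +x /tmp/demo_stdin.sh
echo "sentence to parse" | /tmp/demo_stdin.sh
```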

So I just ran parser_eval directly with

python bazel-bin/syntaxnet/parser_eval \
  --input=stdin \
  --output=stdout-conll \
  --hidden_layer_sizes=64 \
  --arg_prefix=brain_tagger \
  --graph_builder=structured \
  --task_context=syntaxnet/models/parsey_mcparseface/context.pbtxt \
  --model_path=syntaxnet/models/parsey_mcparseface/tagger-params \
  --slim_model \
  --batch_size=1024 \
  --alsologtostderr

and tried to find where the Python file syntaxnet/parser_eval.py receives input.

It seems parser.evaluation['documents'] below receives stdin somehow.

def Eval(sess):
...
...
  while True:
    tf_eval_epochs, tf_eval_metrics, tf_documents = sess.run([
        parser.evaluation['epochs'],
        parser.evaluation['eval_metrics'],
        parser.evaluation['documents'],
    ])

    if len(tf_documents):
      logging.info('Processed %d documents', len(tf_documents))
      num_documents += len(tf_documents)
      sess.run(sink, feed_dict={sink_documents: tf_documents})

    num_tokens += tf_eval_metrics[0]
    num_correct += tf_eval_metrics[1]
    if num_epochs is None:
      num_epochs = tf_eval_epochs
    elif num_epochs < tf_eval_epochs:
      break
...
...
def main(unused_argv):
  print >> sys.stderr, "parser_eval.py main start", time.time()
  logging.set_verbosity(logging.INFO)
  temp_counter = 0

  while True:
    with tf.Session() as sess:
      Eval(sess)
      temp_counter += 1

if __name__ == '__main__':
  tf.app.run()

I also tracked down graph_builder.py and gen_parser_ops.py, but I couldn't find which specific tensor or variable receives the stdin sentence.

Could you please explain where SyntaxNet receives the stdin sentence?

It would also be helpful if you could answer some related questions:

  • How can I put a while True: loop inside parser_eval.py? (I tried this in a few places in parser_eval.py, but it receives stdin only once.)
  • Can TensorFlow Serving help with this issue?
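On the first bullet, a likely reason the loop "receives stdin only once" is that a pipe delivers EOF as soon as the upstream echo exits, so every read after the first batch returns nothing no matter how many times you loop. A small demonstration, with an in-memory stream standing in for the pipe:

```python
import io

def read_all_sentences(stream):
    """Read sentences until EOF, the way a piped stdin behaves."""
    sentences = []
    for line in stream:
        sentences.append(line.strip())
    return sentences

piped = io.StringIO("sentence to parse\n")   # the "pipe" closes after one line
print(read_all_sentences(piped))             # first call gets the batch
print(read_all_sentences(piped))             # EOF: every later call gets []
```

So to keep the process 'awake', the input source itself has to stay open (an interactive terminal, a socket, or a FIFO) rather than a one-shot echo pipe.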

Thanks in advance.

J-min

1 Answer


The syntaxnet protoio package has utilities to do this reading.
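I haven't verified protoio's exact API, but the pattern such reading utilities follow — wrap a stream and hand back the next batch of documents on each call, returning an empty batch at EOF (which is what ends the Eval loop) — can be sketched in plain Python with a hypothetical reader class:

```python
import io

class LineDocumentReader(object):
    """Hypothetical stand-in for a protoio-style reader: wraps a text
    stream and returns the next batch of documents per call, or an
    empty list at EOF."""

    def __init__(self, stream, batch_size=1):
        self._stream = stream
        self._batch_size = batch_size

    def read(self):
        batch = []
        for _ in range(self._batch_size):
            line = self._stream.readline()
            if not line:            # EOF: the upstream pipe closed
                break
            batch.append(line.strip())
        return batch

reader = LineDocumentReader(io.StringIO("sentence to parse\n"))
print(reader.read())   # ['sentence to parse']
print(reader.read())   # []  -> evaluation terminates
```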

Alexandre Passos