
I am using the demo.sh provided in the syntaxnet repository. If I give it input separated by '\n', it takes 27.05 seconds to run 3000 lines of text, but when I run each line individually, it takes more than one hour.

It means loading the model takes over 2.5 seconds per invocation. If this step were separated out and the loaded model kept cached, it would make the whole pipeline faster.
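The rough arithmetic behind that estimate can be checked directly (taking "more than one hour" as roughly 3600 s, which is an assumption, so the overhead below is a lower bound):

```python
# Back-of-the-envelope numbers from the timings quoted above.
batch_total = 27.05      # seconds to parse 3000 lines in one run
single_total = 3600.0    # "more than one hour" for the same lines, one run each
lines = 3000

per_line_batched = batch_total / lines   # actual parsing cost per line
per_line_single = single_total / lines   # cost per line with a fresh process each time
overhead = per_line_single - per_line_batched  # fixed startup cost per invocation

print(round(per_line_batched, 4))  # → 0.009
print(round(per_line_single, 2))   # → 1.2
print(round(overhead, 2))          # → 1.19
```

So at least ~1.19 s of every per-line invocation is startup rather than parsing; since "more than one hour" is only a lower bound and the pipeline loads two models per run, the true per-invocation load cost can easily reach the 2.5 s estimated above.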

Here is the modified version of demo.sh:

PARSER_EVAL=bazel-bin/syntaxnet/parser_eval
MODEL_DIR=syntaxnet/models/parsey_mcparseface
[[ "$1" == "--conll" ]] && INPUT_FORMAT=stdin-conll || INPUT_FORMAT=stdin

$PARSER_EVAL \
  --input=$INPUT_FORMAT \
  --output=stdout-conll \
  --hidden_layer_sizes=64 \
  --arg_prefix=brain_tagger \
  --graph_builder=structured \
  --task_context=$MODEL_DIR/context.pbtxt \
  --model_path=$MODEL_DIR/tagger-params \
  --slim_model \
  --batch_size=1024 \
  --alsologtostderr \
   | \
  $PARSER_EVAL \
  --input=stdin-conll \
  --output=stdout-conll \
  --hidden_layer_sizes=512,512 \
  --arg_prefix=brain_parser \
  --graph_builder=structured \
  --task_context=$MODEL_DIR/context.pbtxt \
  --model_path=$MODEL_DIR/parser-params \
  --slim_model \
  --batch_size=1024 \
  --alsologtostderr

I want to build a function call that takes an input sentence and stores the dependency parser's output in a local variable, like below (the code below is just to make the question clear):

dependency_parsing_model = ...

def give_dependency_parser(sentence, model=dependency_parsing_model):
    ...
    #logic here
    ...
    return dependency_parsing_output

In the above, the model is stored in a variable, so each line takes less time to run on every function call.
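One workaround I can think of is to keep the pipeline alive as a subprocess, so the models load only once, and have the function write a sentence to its stdin and read back one output block. A sketch (that demo.sh can be driven this way, and that each response is a blank-line-terminated block, are assumptions; a small echo process stands in for the real pipeline here):

```python
import subprocess
import sys


class PersistentParser:
    """Keeps a parsing pipeline running so its models are loaded once.

    `cmd` is any command that reads one sentence per line on stdin and
    writes a blank-line-terminated block per sentence on stdout. Using
    the modified demo.sh as `cmd` is an assumption -- its output would
    need to be line-buffered for this to work.
    """

    def __init__(self, cmd):
        self.proc = subprocess.Popen(
            cmd,
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            universal_newlines=True,
            bufsize=1,  # line-buffered text pipes
        )

    def parse(self, sentence):
        self.proc.stdin.write(sentence.strip() + "\n")
        self.proc.stdin.flush()
        block = []
        while True:
            line = self.proc.stdout.readline()
            if line.strip() == "":  # blank line (or EOF) ends one block
                break
            block.append(line.rstrip("\n"))
        return "\n".join(block)

    def close(self):
        self.proc.stdin.close()
        self.proc.wait()


# Stand-in for the real pipeline: uppercases each sentence and terminates
# every response with a blank line, mimicking CoNLL-style output blocks.
ECHO_CHILD = (
    "import sys\n"
    "while True:\n"
    "    line = sys.stdin.readline()\n"
    "    if not line:\n"
    "        break\n"
    "    print(line.strip().upper())\n"
    "    print()\n"
    "    sys.stdout.flush()\n"
)

parser = PersistentParser([sys.executable, "-c", ECHO_CHILD])
print(parser.parse("this is a test"))  # → THIS IS A TEST
parser.close()
```

With the real pipeline, `cmd` would be something like `["bash", "demo.sh"]`; the startup cost is then paid once in `__init__` instead of once per sentence.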

How can I do this?

Manjush

1 Answer


The current version of syntaxnet's Parsey McParseface has two limitations which you've run across:

  1. Sentences are read from stdin or a file, not from a variable
  2. The model is in two parts and not a single executable

I have a branch of tensorflow/models:

https://github.com/dmansfield/models/tree/documents-from-tensor

which I'm working with the maintainers to get merged. With this branch of the code you can build the entire model in one graph (using a new python script called parsey_mcparseface.py) and feed sentences in with a tensor (i.e. a python variable).

Not the best answer in the world I'm afraid because it's very much in flux. There's no simple recipe for getting this working at the moment.

dmansfield