I am running into trouble spawning the Stanford Parser as a child process in SBCL lisp:

(defvar *p* (sb-ext:run-program "/usr/bin/java"
   (list     "-cp"
    "\"/home/todd/CoreNLP/*\""
    "-Xmx2g"
    "edu.stanford.nlp.pipeline.StanfordCoreNLP"
    "-annotators"
    "tokenize,ssplit,pos,lemma,ner,parse,dcoref"
    "-outputFormat"
    "text")
    :wait nil :input :stream :output :stream :error :output))

It looks like it kicks off the program, but then the parser dies. I can't really express everything that's going on because this text window keeps formatting my text into something else. At any rate, this does not happen with other programs I try to run:

(defvar *g* (sb-ext:run-program "/usr/bin/gnuplot" nil
                                :wait nil
                                :input :stream
                                :output :stream
                                :error :output))

In this case, the program (gnuplot) keeps running.

I'm wondering if this is because it just takes so long for the Stanford Parser to start that Lisp gives up on it.

If anybody has any insights into that, I'd be thrilled. It would be the ideal way to talk to the Stanford Parser from within Lisp. Otherwise, I might have a perfectly valid workaround, which is to kick off the parser with its input coming from, and output going to, named pipes in the filesystem. This must happen with the command line options above, since the program must be in interactive mode (the parser creates a different type of output if it is not in interactive mode).

This, though, moves a bit off-topic into a Unix question, so this is just if anybody is an expert:

Supposing I had an inpipe and outpipe in the CoreNLP directory, what would be my command line to kick off the parser so its input and output would be connected to the program's stdin and stdout respectively? Are there any steps I can take (at that point) to make sure that I don't run into buffering problems later when I access the pipes from within a Lisp program?

Does anybody have any ideas on how to talk to the Stanford Parser from within Lisp?

Any insights are appreciated, as always.

-Todd

Rainer Joswig
Todd Pierce
  • "this text window keeps formatting my text into something else" - It uses [markdown](https://stackoverflow.com/editing-help). You can put 4 spaces to the beginning of the line to format it as code. – jkiiski Aug 26 '16 at 05:11
  • The `run-program` returns a process object. In order to get the error from that object, you can try this in the REPL: `(uiop:copy-stream-to-stream (sb-ext:process-error *) *standard-output*)`, where `*` represents the process object. I don't see a reason why what you are doing shouldn't work, but alternatively, you could try to interface with the Java world with ABCL (Armed Bear Common Lisp). – coredump Aug 26 '16 at 05:19
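
A way to act on the suggestion above is to read whatever the child printed before it exited. Since the call in the question uses `:error :output`, the child's stderr is merged into its stdout, so any message should arrive on `sb-ext:process-output` rather than `sb-ext:process-error`. The following is a minimal SBCL-only sketch: the helper name `run-and-collect` is hypothetical, and `/bin/echo` stands in for the real java command. It is meant for a child that exits on its own; a child that stays alive waiting for input would make the loop block. One thing that may be worth checking with it: `run-program` does not go through a shell, so the escaped quotes around the classpath argument reach java as literal characters. Whether that is what kills the parser is an assumption, not something the question confirms.

```lisp
;; SBCL-only sketch. RUN-AND-COLLECT is a hypothetical helper name.
(defun run-and-collect (program args)
  "Run PROGRAM with ARGS, merging stderr into stdout.
Return two values: everything the child printed, and its exit code."
  (let* ((process (sb-ext:run-program program args
                                      :wait nil
                                      :input :stream
                                      :output :stream
                                      :error :output))
         (text (with-output-to-string (collected)
                 ;; READ-LINE returns NIL at end of file, that is, once
                 ;; the child has closed its output (usually by exiting).
                 (loop for line = (read-line (sb-ext:process-output process)
                                             nil nil)
                       while line
                       do (write-line line collected)))))
    ;; Make sure the exit code is available before asking for it.
    (sb-ext:process-wait process)
    (let ((code (sb-ext:process-exit-code process)))
      ;; Release the streams attached to the process object.
      (sb-ext:process-close process)
      (values text code))))

;; Stand-in usage; replace with "/usr/bin/java" and its argument list.
(run-and-collect "/bin/echo" '("hello"))
```

If the java process exits immediately, the first return value should contain its complaint and the second its non-zero exit code.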

1 Answer

I recommend using inferior-shell to execute commands in Common Lisp.

I had never used stanford-parser, so I installed it on my Mac with Homebrew. After that I can use it from the command line:

$ lexparser.sh text.txt
[main] INFO edu.stanford.nlp.parser.lexparser.LexicalizedParser - Loading parser from serialized file edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz ...
 done [0.6 sec].
Parsing file: text.txt
Parsing [sent. 1 len. 42]: The strongest rain ever recorded in India shut down the financial hub of Mumbai , snapped communication lines , closed airports and forced thousands of people to sleep in their offices or walk home during the night , officials said today .
(ROOT
  (S
    (S
      (NP
        (NP (DT The) (JJS strongest) (NN rain))
        (VP
          (ADVP (RB ever))
          (VBN recorded)
          (PP (IN in)
            (NP (NNP India)))))
      (VP
        (VP (VBD shut)
          (PRT (RP down))
          (NP
            (NP (DT the) (JJ financial) (NN hub))
            (PP (IN of)
              (NP (NNP Mumbai)))))
        (, ,)
        (VP (VBD snapped)
          (NP (NN communication) (NNS lines)))
        (, ,)
        (VP (VBD closed)
          (NP (NNS airports)))
        (CC and)
        (VP (VBD forced)
          (NP
            (NP (NNS thousands))
            (PP (IN of)
              (NP (NNS people))))
          (S
            (VP (TO to)
              (VP
                (VP (VB sleep)
                  (PP (IN in)
                    (NP (PRP$ their) (NNS offices))))
                (CC or)
                (VP (VB walk)
                  (NP (NN home))
                  (PP (IN during)
                    (NP (DT the) (NN night))))))))))
    (, ,)
    (NP (NNS officials))
    (VP (VBD said)
      (NP (NN today)))
    (. .)))

det(rain-3, The-1)
amod(rain-3, strongest-2)
nsubj(shut-8, rain-3)
nsubj(snapped-16, rain-3)
nsubj(closed-20, rain-3)
nsubj(forced-23, rain-3)
advmod(recorded-5, ever-4)
acl(rain-3, recorded-5)
case(India-7, in-6)
nmod:in(recorded-5, India-7)
ccomp(said-40, shut-8)
compound:prt(shut-8, down-9)
det(hub-12, the-10)
amod(hub-12, financial-11)
dobj(shut-8, hub-12)
case(Mumbai-14, of-13)
nmod:of(hub-12, Mumbai-14)
conj:and(shut-8, snapped-16)
ccomp(said-40, snapped-16)
compound(lines-18, communication-17)
dobj(snapped-16, lines-18)
conj:and(shut-8, closed-20)
ccomp(said-40, closed-20)
dobj(closed-20, airports-21)
cc(shut-8, and-22)
conj:and(shut-8, forced-23)
ccomp(said-40, forced-23)
dobj(forced-23, thousands-24)
nsubj(sleep-28, thousands-24)
nsubj(walk-33, thousands-24)
case(people-26, of-25)
nmod:of(thousands-24, people-26)
mark(sleep-28, to-27)
xcomp(forced-23, sleep-28)
case(offices-31, in-29)
nmod:poss(offices-31, their-30)
nmod:in(sleep-28, offices-31)
cc(sleep-28, or-32)
xcomp(forced-23, walk-33)
conj:or(sleep-28, walk-33)
dobj(walk-33, home-34)
case(night-37, during-35)
det(night-37, the-36)
nmod:during(walk-33, night-37)
nsubj(said-40, officials-39)
root(ROOT-0, said-40)
nmod:tmod(said-40, today-41)

Parsed file: text.txt [1 sentences].
Parsed 42 words in 1 sentences (18.00 wds/sec; 0.43 sents/sec).

This actually runs a shell script, which is essentially a java command:

$ cat /usr/local/Cellar/stanford-parser/3.6.0/libexec/lexparser.sh
#!/usr/bin/env bash
#
# Runs the English PCFG parser on one or more files, printing trees only

if [ ! $# -ge 1 ]; then
  echo Usage: `basename $0` 'file(s)'
  echo
  exit
fi

scriptdir=`dirname $0`

java -mx150m -cp "$scriptdir/*:" edu.stanford.nlp.parser.lexparser.LexicalizedParser \
 -outputFormat "penn,typedDependencies" edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz $*

Now I have everything needed to execute it from Common Lisp.

First, install inferior-shell with Quicklisp:

CL-USER> (ql:quickload 'inferior-shell)
To load "inferior-shell":
  Load 1 ASDF system:
    inferior-shell
; Loading "inferior-shell"

(INFERIOR-SHELL)

Then check that it works:

CL-USER> (inferior-shell:run/ss '(lexparser.sh))
"Usage: lexparser.sh file(s)
"
NIL
0

Perfect: it runs lexparser.sh and returns a string with the standard output, NIL for standard error, and 0 as the program's exit code.

Finally, prepare a text. I chose the sample from their website:
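
For comparison, values of the same shape (output, error output, exit code) can be obtained from `uiop:run-program`, which needs no extra download where ASDF 3 is bundled. This is a minimal sketch, assuming an SBCL in which `(require :asdf)` makes UIOP available; the helper name `run-for-values` is hypothetical, and `/bin/echo` is a stand-in for `lexparser.sh`.

```lisp
(require :asdf) ; ASDF 3 bundles UIOP

;; RUN-FOR-VALUES is a hypothetical helper name.
(defun run-for-values (command)
  "Run COMMAND, a list of strings, without going through a shell.
Return a list of: standard output, standard error, exit code."
  (multiple-value-list
   (uiop:run-program command
                     :output :string
                     :error-output :string
                     ;; Return a non-zero exit code as a value
                     ;; instead of signalling an error.
                     :ignore-error-status t)))

;; Stand-in usage; the real call would name lexparser.sh and text.txt.
(run-for-values '("/bin/echo" "hello"))
```

Like `run/ss`, this call is synchronous: it returns only after the child has finished, so it suits one-shot runs rather than an interactive session.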

text.txt:

The strongest rain ever recorded in India shut down the financial hub of Mumbai, snapped communication lines, closed airports and forced thousands of people to sleep in their offices or walk home during the night, officials said today.

Then I execute it:

CL-USER> (inferior-shell:run/ss '(lexparser.sh text.txt))
"(ROOT
  (S
    (S
      (NP
        (NP (DT The) (JJS strongest) (NN rain))
        (VP
          (ADVP (RB ever))
          (VBN recorded)
          (PP (IN in)
            (NP (NNP India)))))
      (VP
        (VP (VBD shut)
          (PRT (RP down))
          (NP
            (NP (DT the) (JJ financial) (NN hub))
            (PP (IN of)
              (NP (NNP Mumbai)))))
        (, ,)
        (VP (VBD snapped)
          (NP (NN communication) (NNS lines)))
        (, ,)
        (VP (VBD closed)
          (NP (NNS airports)))
        (CC and)
        (VP (VBD forced)
          (NP
            (NP (NNS thousands))
            (PP (IN of)
              (NP (NNS people))))
          (S
            (VP (TO to)
              (VP
                (VP (VB sleep)
                  (PP (IN in)
                    (NP (PRP$ their) (NNS offices))))
                (CC or)
                (VP (VB walk)
                  (NP (NN home))
                  (PP (IN during)
                    (NP (DT the) (NN night))))))))))
    (, ,)
    (NP (NNS officials))
    (VP (VBD said)
      (NP (NN today)))
    (. .)))

det(rain-3, The-1)
amod(rain-3, strongest-2)
nsubj(shut-8, rain-3)
nsubj(snapped-16, rain-3)
nsubj(closed-20, rain-3)
nsubj(forced-23, rain-3)
advmod(recorded-5, ever-4)
acl(rain-3, recorded-5)
case(India-7, in-6)
nmod:in(recorded-5, India-7)
ccomp(said-40, shut-8)
compound:prt(shut-8, down-9)
det(hub-12, the-10)
amod(hub-12, financial-11)
dobj(shut-8, hub-12)
case(Mumbai-14, of-13)
nmod:of(hub-12, Mumbai-14)
conj:and(shut-8, snapped-16)
ccomp(said-40, snapped-16)
compound(lines-18, communication-17)
dobj(snapped-16, lines-18)
conj:and(shut-8, closed-20)
ccomp(said-40, closed-20)
dobj(closed-20, airports-21)
cc(shut-8, and-22)
conj:and(shut-8, forced-23)
ccomp(said-40, forced-23)
dobj(forced-23, thousands-24)
nsubj(sleep-28, thousands-24)
nsubj(walk-33, thousands-24)
case(people-26, of-25)
nmod:of(thousands-24, people-26)
mark(sleep-28, to-27)
xcomp(forced-23, sleep-28)
case(offices-31, in-29)
nmod:poss(offices-31, their-30)
nmod:in(sleep-28, offices-31)
cc(sleep-28, or-32)
xcomp(forced-23, walk-33)
conj:or(sleep-28, walk-33)
dobj(walk-33, home-34)
case(night-37, during-35)
det(night-37, the-36)
nmod:during(walk-33, night-37)
nsubj(said-40, officials-39)
root(ROOT-0, said-40)
nmod:tmod(said-40, today-41)
"
NIL
0

Or I can collect the results in a list:

CL-USER> (multiple-value-list (inferior-shell:run/ss '(lexparser.sh text.txt)))
("(ROOT
  (S
    (S
      (NP
        (NP (DT The) (JJS strongest) (NN rain))
        (VP
          (ADVP (RB ever))
          (VBN recorded)
          (PP (IN in)
            (NP (NNP India)))))
      (VP
        (VP (VBD shut)
          (PRT (RP down))
          (NP
            (NP (DT the) (JJ financial) (NN hub))
            (PP (IN of)
              (NP (NNP Mumbai)))))
        (, ,)
        (VP (VBD snapped)
          (NP (NN communication) (NNS lines)))
        (, ,)
        (VP (VBD closed)
          (NP (NNS airports)))
        (CC and)
        (VP (VBD forced)
          (NP
            (NP (NNS thousands))
            (PP (IN of)
              (NP (NNS people))))
          (S
            (VP (TO to)
              (VP
                (VP (VB sleep)
                  (PP (IN in)
                    (NP (PRP$ their) (NNS offices))))
                (CC or)
                (VP (VB walk)
                  (NP (NN home))
                  (PP (IN during)
                    (NP (DT the) (NN night))))))))))
    (, ,)
    (NP (NNS officials))
    (VP (VBD said)
      (NP (NN today)))
    (. .)))

det(rain-3, The-1)
amod(rain-3, strongest-2)
nsubj(shut-8, rain-3)
nsubj(snapped-16, rain-3)
nsubj(closed-20, rain-3)
nsubj(forced-23, rain-3)
advmod(recorded-5, ever-4)
acl(rain-3, recorded-5)
case(India-7, in-6)
nmod:in(recorded-5, India-7)
ccomp(said-40, shut-8)
compound:prt(shut-8, down-9)
det(hub-12, the-10)
amod(hub-12, financial-11)
dobj(shut-8, hub-12)
case(Mumbai-14, of-13)
nmod:of(hub-12, Mumbai-14)
conj:and(shut-8, snapped-16)
ccomp(said-40, snapped-16)
compound(lines-18, communication-17)
dobj(snapped-16, lines-18)
conj:and(shut-8, closed-20)
ccomp(said-40, closed-20)
dobj(closed-20, airports-21)
cc(shut-8, and-22)
conj:and(shut-8, forced-23)
ccomp(said-40, forced-23)
dobj(forced-23, thousands-24)
nsubj(sleep-28, thousands-24)
nsubj(walk-33, thousands-24)
case(people-26, of-25)
nmod:of(thousands-24, people-26)
mark(sleep-28, to-27)
xcomp(forced-23, sleep-28)
case(offices-31, in-29)
nmod:poss(offices-31, their-30)
nmod:in(sleep-28, offices-31)
cc(sleep-28, or-32)
xcomp(forced-23, walk-33)
conj:or(sleep-28, walk-33)
dobj(walk-33, home-34)
case(night-37, during-35)
det(night-37, the-36)
nmod:during(walk-33, night-37)
nsubj(said-40, officials-39)
root(ROOT-0, said-40)
nmod:tmod(said-40, today-41)
" NIL 0)

Remember that this program uses Java 8, and I'm using stanford-parser 3.6.0.

anquegi
  • Awesome idea! But I'm trying to download inferior-shell (or any of the files) from https://gitlab.common-lisp.net/qitab/inferior-shell/tree/master but the site doesn't work. Any ideas where to get it? – Todd Pierce Aug 28 '16 at 13:41
  • Okay, never mind. I got it installed. However, I'm a bit scared about this in the documentation: "First, inferior-shell at this point only supports synchronous execution of sub-processes." Does that mean it can't be used interactively? I'm looking for back-and-forth pipes so I can submit text to the program and retrieve the output, not just run it once. Will be fooling around with it here... – Todd Pierce Aug 28 '16 at 14:08
  • @ToddPierce, the best way to download inferior-shell is with Quicklisp (https://www.quicklisp.org/beta/), and you can use it with pipes; take a look here: http://stackoverflow.com/questions/32703224/how-to-use-sbcl-to-use-shell-command – anquegi Aug 29 '16 at 06:17
  • So, I installed inferior-shell. It can run the program, but now I don't know how to interact with the program. The way I'd like it to be is that the program is running in the background with stdin and stdout bound to streams in Lisp. I don't see anything like that in the documentation. Anybody have any examples of that? – Todd Pierce Aug 31 '16 at 15:07
  • Still trying to figure this one out. I have looked into just using UIOP, but there are no examples of anybody maintaining the stdin and stdout of the child process so the Lisp program can send/receive data to/from it. – Todd Pierce Sep 07 '16 at 19:04
  • Why not try working with files, and then read those files as streams in Lisp? For the output, redirect the parser to a file. Another option is to split the input into pieces and send the data each time. I do not understand how you want to work with the parser. – anquegi Sep 08 '16 at 15:33
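
The back-and-forth use asked about in the comments above can be sketched with plain `sb-ext:run-program`, keeping the child alive and talking to it over its stdin and stdout streams. This is a minimal SBCL-only sketch under stated assumptions: `/bin/cat` stands in for the parser because it echoes each line straight back, and the names `*child*` and `ask` are hypothetical. A real parser prints several lines per input, so the reading side would need to loop until some end-of-answer marker, and the child may buffer its own output when it is not attached to a terminal, which Lisp cannot control.

```lisp
;; SBCL-only sketch. /bin/cat stands in for the parser;
;; *CHILD* and ASK are hypothetical names.
(defvar *child*
  (sb-ext:run-program "/bin/cat" nil
                      :wait nil
                      :input :stream
                      :output :stream
                      :error :output))

(defun ask (process line)
  "Send LINE to PROCESS and return the next line it prints,
or NIL if the child has closed its output."
  (let ((to-child (sb-ext:process-input process))
        (from-child (sb-ext:process-output process)))
    (write-line line to-child)
    ;; Without this, the text can sit in Lisp's output buffer
    ;; and the child never sees it.
    (finish-output to-child)
    (read-line from-child nil nil)))

(defun hang-up (process)
  "Close the child's stdin so it sees end of file, then wait for it."
  (close (sb-ext:process-input process))
  (sb-ext:process-wait process)
  (sb-ext:process-exit-code process))

;; Usage with the stand-in child:
;;   (ask *child* "first sentence")
;;   (ask *child* "second sentence")
;;   (hang-up *child*)
```

The `finish-output` call is the Lisp-side answer to the buffering concern raised in the question; anything the child holds back in its own buffers has to be addressed on the child's side.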