I have recently started using JRI to run R code and scripts from Java. Most statements appear to work fine (simple assignments such as Test <- 123, and functions such as source(...), read.csv(...), rpart(...), data.frame(...)), but one function always returns null: predict(...).

Specifically, I have been trying to run rengine.eval("prediction <- predict(fit, predict_entry, type = \"class\")"); where both "fit" and "predict_entry" are not null, and appear to contain valid values. Then, when I try to run rengine.eval("prediction"), the result is always null.
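For anyone debugging a similar null result: as far as I know, Rengine.eval returns null when the R evaluation fails, so the R error message is lost. A minimal sketch of one way to surface it is to wrap the expression in R's tryCatch before handing it to the engine. The class and method names here (RDiagnostics, withErrorCapture) are hypothetical, and the sketch only builds the string; passing it to rengine.eval(...) is left out so the code runs without JRI.

```java
public final class RDiagnostics {
    private RDiagnostics() {}

    /**
     * Wraps an R expression in tryCatch so that an R-level error is returned
     * as a character string instead of making the evaluation fail.
     * (Hypothetical helper, not part of JRI.)
     */
    public static String withErrorCapture(String rExpression) {
        return "tryCatch({ " + rExpression + " }, "
             + "error = function(e) paste('R error:', conditionMessage(e)))";
    }

    public static void main(String[] args) {
        String wrapped = withErrorCapture(
            "prediction <- predict(fit, predict_entry, type = \"class\")");
        // With a live engine this string would be passed to rengine.eval(...).
        System.out.println(wrapped);
    }
}
```

If the wrapped call comes back as a string starting with "R error:", the problem is in the R code or its inputs rather than in the JRI setup.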

I am not sure if I missed some library path that causes the problem - please note that the same command runs fine directly on the RStudio console. The output of my java.library.path and R_HOME appears correct too:

System.getProperty("java.library.path"): C:\Users\...\Documents\R\win-library\3.1\rJava\jri\x64;C:\Program Files\R\R-3.1.1\bin\x64

System.getenv("R_HOME"): C:/Program Files/R/R-3.1.1

Does anybody have any suggestion as to what the issue might be? Please let me know.

Thanks!

EDIT: Here is some additional information I missed (thanks for pointing it out, BondedDust!)

  • my rpart() function comes from the rpart package that ships with the standard R installation, and was loaded via library(rpart)
  • call that created "fit": fit <- rpart(Verdict ~ TestEvent1A + TestEvent1B + TestEvent2C, data=training_set, method="class") and training_set was read from a CSV file via read.csv(); Verdict, TestEvent1A, TestEvent1B, and TestEvent2C are column headings of that CSV file
  • very good call - both terms(fit) and str(predict_entry) return [NULL ] from rengine.eval(); however, fit and predict_entry alone return [VECTOR ([VECTOR ([FACTOR {levels=("<leaf>","TestEvent1A","TestEvent1B","TestEvent2C"),ids=(2,3,0,2,0,0,1,3,0,0,0)}], [INT* (500, 409, 329, 80, 26, 54, 91, 68, 33, 35, 23)], [REAL* (500.0, 409.0, 329.0, 80.0, 26.0, 54.0, ... and [VECTOR ([FACTOR {levels=("1"),ids=(0)}], [FACTOR {levels=("5"),ids=(0)}], [FACTOR {levels=("3"),ids=(0)}])] respectively - both containing data that I put in to test. Could this be the source of the problem?

EDIT #2: I tried running terms(fit) and str(predict_entry) on the RStudio console, and got the following outputs (not NULL!)

> terms(fit)
Verdict ~ TestEvent1A + TestEvent1B + TestEvent2C
attr(,"variables")
list(Verdict, TestEvent1A, TestEvent1B, TestEvent2C)
attr(,"factors")
            TestEvent1A TestEvent1B TestEvent2C
Verdict               0           0           0
TestEvent1A           1           0           0
TestEvent1B           0           1           0
TestEvent2C           0           0           1
attr(,"term.labels")
[1] "TestEvent1A" "TestEvent1B" "TestEvent2C"
attr(,"order")
[1] 1 1 1
attr(,"intercept")
[1] 1
attr(,"response")
[1] 1
attr(,".Environment")
<environment: R_GlobalEnv>
attr(,"predvars")
list(Verdict, TestEvent1A, TestEvent1B, TestEvent2C)
attr(,"dataClasses")
    Verdict TestEvent1A TestEvent1B TestEvent2C 
  "numeric"   "numeric"   "numeric"   "numeric"

> str(predict_entry)
'data.frame':   1 obs. of  3 variables:
 $ TestEvent1A: num 1
 $ TestEvent1B: num 5
 $ TestEvent2C: num 3
user1589408
  • You have not indicated which package the `rpart` function might have come from. You should post that information, and the call that created `fit`, as well as the results of `terms(fit)` and `str(predict_entry)` – IRTFM Jan 29 '15 at 20:21
  • Thank you for the comment, I will update my original post. – user1589408 Jan 29 '15 at 20:38
  • An `rpart::rpart` fit object should be a list object whose first element is 'frame', second is 'where' and you can look up the other names at `?'rpart.object'`. The item named 'levels' is supposed to be in the attributes attached to the object. – IRTFM Jan 29 '15 at 21:51
  • Thanks for the further assistance, BondedDust. Much appreciated. I am actually a little bit confused - I was under the assumption that the issue is with JRI, as the exact same statements could run in RStudio/R console successfully. I followed quite closely the syntax introduced in this tutorial: http://trevorstephens.com/post/72923766261/titanic-getting-started-with-r-part-3-decision. Do the statements need to be written differently when calling from Java via JRI? – user1589408 Jan 29 '15 at 22:48
  • I can't really tell what your code looks like and how your Java-client R-server is configured (and I'm by no means even an advanced intermediate with this). I was trying to describe what should be seen in the returned values. There is a mailing list where rJava and JRI topics are discussed. https://mailman.rz.uni-augsburg.de/mailman/listinfo/stats-rosuda-devel – IRTFM Jan 29 '15 at 23:34
  • Can you please post your full .java file (with main class)? It's impossible to answer your problem this way – Yehoshaphat Schellekens Feb 04 '15 at 11:49

1 Answer

Oh my, I have figured out the problem. A stupid one.

While I was preparing a compact version of my code in response to Yehoshaphat's comment, I decided to hard-code the values for TestEvent1A/1B/2C (i.e. rengine.eval("TestEvent1A <- 3")). And then all of a sudden, predict() worked. That is when I realized that I was doing this:

Matcher matcher = PAYLOAD.matcher(testEvent1A);
if (matcher.find())
    rengine.eval(String.format("TestEvent1A <- '%s'", matcher.group(1)));

when I should've been doing this:

Matcher matcher = PAYLOAD.matcher(testEvent1A);
if (matcher.find())
    rengine.eval(String.format("TestEvent1A <- %s", matcher.group(1)));

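Spot the difference? I accidentally wrapped the values in single quotes, so TestEvent1A/1B/2C were assigned as strings when I meant to pass integer/real values. That would explain why predict_entry showed up with FACTOR columns in the JRI output in my question, while the model was trained on numeric columns. Arghhghghghg.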
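To keep this from happening again, one option is to parse the captured text as a number in Java before building the R statement, so a quoted or non-numeric value fails loudly on the Java side. This is only a sketch: the RAssignments class, the numericAssignment helper, and the "value=(\S+)" pattern are made up for illustration (my real PAYLOAD pattern is different), and the resulting string would still need to be passed to rengine.eval(...).

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public final class RAssignments {
    // Hypothetical stand-in for the real PAYLOAD pattern.
    private static final Pattern PAYLOAD = Pattern.compile("value=(\\S+)");

    private RAssignments() {}

    /**
     * Builds "name <- number" for R, rejecting anything that is not a
     * finite number. Double.parseDouble throws NumberFormatException
     * for input such as "'3'" or "abc".
     */
    public static String numericAssignment(String rName, String rawValue) {
        double value = Double.parseDouble(rawValue.trim());
        if (Double.isNaN(value) || Double.isInfinite(value)) {
            throw new IllegalArgumentException("Not a finite number: " + rawValue);
        }
        // No quotes around the value, so R sees a numeric literal.
        return rName + " <- " + Double.toString(value);
    }

    public static void main(String[] args) {
        Matcher matcher = PAYLOAD.matcher("value=3");
        if (matcher.find()) {
            // Prints: TestEvent1A <- 3.0
            System.out.println(numericAssignment("TestEvent1A", matcher.group(1)));
        }
    }
}
```

The value ends up as a double (3.0) rather than an integer, which is fine here because the training columns are numeric.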

Thanks for all of the help that you guys have provided, BondedDust and Yehoshaphat. Very much appreciated :)

user1589408