
This works fine in RStudio.

library(stringdist)
result <- afind(input1, input2, method="cosine")
distance <- result[2]
real_distance <- distance[[1]][1]
output <- real_distance

When I add it as a column expression in Spotfire, where input1 is a column and input2 is a document property, I get a cryptic error message.

I am not sure whether I can use external libraries in custom column R expressions in Spotfire, and I cannot tell what the issue is.

Error is:

TIBCO Enterprise Runtime for R returned an error (4)

The expression function 'SearchDistance' could not be executed.

Error in .Call() : dims [product 1290496000000] do not match the length of object [2005811200]
eval(script, envir = .GlobalEnv)
eval(script, envir = .GlobalEnv)
withCallingHandlers({
afind(input1, input2, method = "cosine")
.Call()

UPDATE***

If I enable TERR debugging I get this:

Data function 'CalculateSearchScore' debug output (3)

Unmarshalling 2 input parameters.
Input 'input1', sent by inline XML
 chr [1:1136000] "MYALGIA FEVER" "FATIGUE" "ASTHENIA" "CYSTITIS" ...
 - attr(*, "SpotfireColumnMetaData")=List of 1
  ..$ Description: chr ""
Input 'input2', sent by inline XML
 chr [1:1136000] "NAUSEA" "NAUSEA" "NAUSEA" "NAUSEA" ...
 - attr(*, "SpotfireColumnMetaData")=List of 1
  ..$ Description: chr ""
Done unmarshalling input parameters.
Loading required package: stringdist

My column formula is as below:

SearchDistance([medical_term],"${MedicalTerm}")

I had assumed that using this in a calculated column would run the function once per row, passing in each column value independently. Perhaps what it is actually doing is passing in a vector containing the entire set of column values?
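The numbers in the error message are consistent with that reading. A quick check in base R, assuming the 1,136,000 rows shown in the debug output, and assuming (this is an inference, not something the TERR output states) that the second figure is the cell count wrapped at 32 bits:

```r
n_rows <- 1136000   # length of input1 and input2 in the debug output

# A row-by-pattern distance matrix would need n_rows * n_rows cells.
cells <- n_rows^2
print(format(cells, scientific = FALSE))    # "1290496000000"

# Assumption: the reported object length is that count wrapped at 2^32.
wrapped <- cells %% 2^32
print(format(wrapped, scientific = FALSE))  # "2005811200"
```

Both values match the `dims [product ...]` and `length of object [...]` figures in the error, which suggests afind is being asked for a full 1136000 x 1136000 matrix.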

SOLUTION*****

This works:

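library(stringdist)

# Expression functions receive whole columns, so return one score per row.
calculate_match_score <- function(target, pattern) {
    print(target[1])  # debug output only
    # afind() returns a list; its second element is the distance matrix
    # (one row per target value, one column per pattern).
    result <- afind(target, pattern, method="cosine")
    distance <- result[2]
    return(distance[[1]][,1])
}

# input2 arrives as the document property repeated once per row,
# so pass only its first element as the pattern.
output <- calculate_match_score(target = input1, pattern = input2[1])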
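To see why passing `input2[1]` matters, here is a small sketch with made-up terms (the vectors are illustrative stand-ins, not the real data). It compares the shape of the distance matrix that `afind` returns for a repeated pattern vector with the shape for a single pattern, assuming a stringdist version that provides `afind`:

```r
library(stringdist)

terms   <- c("MYALGIA FEVER", "FATIGUE", "ASTHENIA")  # stand-in for input1
pattern <- rep("NAUSEA", length(terms))               # input2 as Spotfire sends it

# One column per pattern element: length(terms) x length(pattern).
full <- afind(terms, pattern, method = "cosine")$distance
print(dim(full))

# A single pattern gives a one-column matrix, i.e. one value per row.
single <- afind(terms, pattern[1], method = "cosine")$distance
print(dim(single))

# Taking that column yields a plain numeric vector the same length as terms.
scores <- single[, 1]
stopifnot(length(scores) == length(terms))
```

With the full column as the pattern, the matrix grows with the square of the row count, which is what appears to break at 1,136,000 rows.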
smackenzie
  • It works for me, both in RStudio (pointing to TERR) and in Spotfire. The only thing is that real_distance should return a column for an expression function, so I would remove the last [1]. I called it with input1 as the name of a column and input2 as the document property value (surrounded by quotes). You could try it as a data function (not an expression function) and see if it works for you. If it does not, then maybe check your TERR version's compatibility with stringdist. – Gaia Paolini Aug 01 '22 at 06:57
  • What do you mean by removing the last [1]? I need to return a single Real value, and this is how I do it; returning a list won't work as a column expression? You can't use data functions as column expressions though, so do you mean just to test TERR? Also, does stringdist need to be on the TERR server if I have installed it locally and am using the client locally for now? I did test a data function and it did return a table called output with a value in it. – smackenzie Aug 01 '22 at 08:08
  • @GaiaPaolini I updated the post with the TERR debug output, and how I have defined my custom column formula – smackenzie Aug 01 '22 at 08:28
  • Yes, in expression functions you pass an entire column and return an entire column. That's why I asked if you could remove the [1]. In your example, you are sending a column of 1136000 values, and input2 is also vectorized to 1136000 values (all equal to NAUSEA), which seems fine. My suggestion to use a data function was to rule out problems with TERR itself, rather than with the way the expression function is called. My example works, so I don't understand why yours does not, unfortunately. – Gaia Paolini Aug 01 '22 at 08:58
  • Mine is defined as function type=Column Function, return type=Real and category=Statistical Functions. – Gaia Paolini Aug 01 '22 at 09:03
  • @GaiaPaolini So what afind returns if I send in 1000 items (1000 items in the data) is actually a 1000 x 1000 matrix, so I'm not sure how yours is working. It matches each value in the column against a vector of 1000 "NAUSEA" values, and so returns a 1000 x 1000 matrix of results. – smackenzie Aug 01 '22 at 09:10
  • Can you try input2=input2[1] before you call afind? – Gaia Paolini Aug 01 '22 at 09:19
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/246939/discussion-between-smackenzie-and-gaia-paolini). – smackenzie Aug 01 '22 at 09:27
