0

We keep all our laboratory data in a Sybase database. When I want to do data manipulation and analysis I read the data into R with RODBC.

library(RODBC)
channellab <- odbcConnect("Labdata")
indivs <-sqlQuery(channellab,'SELECT * from CGS.Specimen')

So far so good, except that CGS.Specimen is a table for our entire lab holdings. There are almost 40000 specimens (rows) with 66 variables. It takes an unnecessary amount of time to read especially when I am only interested in the holdings of one study which has about 1000 specimens. The obvious way to fix it would be

indivs <-sqlQuery(channellab,'SELECT * from CGS.Specimen WHERE StudyCode="RP"')  

But when I run that I get

[1] "42S22 -143 [Sybase][ODBC Driver][Adaptive Server Anywhere]Column not found: Column 'RP' not found" "[RODBC] ERROR: Could not SQLExecDirect 'SELECT * from CGS.Specimen WHERE StudyCode=\"RP\"'"

I have played with quotation marks and I have played with escape slashes but alas I have gotten nowhere.

for instance

indivs <-sqlQuery(channellab,'SELECT * from CGS.Specimen WHERE StudyCode= 'RP'  ')

Error: unexpected symbol in "indivs <-sqlQuery(channellab,'SELECT * from CGS.Specimen WHERE StudyCode= 'RP"

indivs <-sqlQuery(channellab,"SELECT * from CGS.Specimen WHERE StudyCode= 'RP'  ")

Error in .Call(C_RODBCFetchRows, attr(channel, "handle_ptr"), max, buffsize, : negative length vectors are not allowed

What do you suggest?

Danny Beckett
  • 20,529
  • 24
  • 107
  • 134
Farrel
  • 10,244
  • 19
  • 61
  • 99
  • Does the problem revolve around the single quotation marks around the sql statement getting messed up with the quotation marks around the variable value? – Farrel Sep 01 '10 at 21:12
  • This error is due to the double quotes around RP. I'm confused have you tried the single quotes in @Preet's answer? – Mark Sep 01 '10 at 22:15
  • Yes. I have tried Preet's suggestions. They do not work. – Farrel Sep 01 '10 at 22:53

2 Answers2

4

The last error you report is from the .Call function to the c function RODBCFetchRows. Thus your problem with the quotation marks appears to be solved. However, with the said error you still have a problem. Perhaps the rows are not correctly reported (the c function tries to allocate vectors based on this value, but does not check for negative values). Try:

indivs <-sqlQuery(channellab,"SELECT * from CGS.Specimen WHERE StudyCode= 'RP'", believeNRows = FALSE)
eyjo
  • 1,180
  • 6
  • 8
  • I tried running your line. No joy. > indivs [1] "42000 -131 [Sybase][ODBC Driver][Adaptive Server Anywhere]Syntax error or access violation: near ',' in ...StudyCode= 'RP'[,] believeNRows =..." [2] "[RODBC] ERROR: Could not SQLExecDirect 'SELECT * from CGS.Specimen WHERE StudyCode= 'RP', believeNRows = FALSE'" > str(indivs) chr [1:2] "42000 -131 [Sybase][ODBC Driver][Adaptive Server Anywhere]Syntax error or access violation: near ',' in ...StudyCode= 'RP'[,] b"| __truncated__ ... – Farrel Sep 01 '10 at 23:58
  • Ooops, a little typo there ... the believeNRows = FALSE should not have been in the query string. Fixed it. – eyjo Sep 02 '10 at 00:05
  • Mazal Tov! It works. Why do you think the number of rows returned by the ODBC connection is messed up. I guess it does not matter as long as we get it to work. – Farrel Sep 02 '10 at 13:21
  • The manual talks of some Sybase drivers not reporting believable number of rows (same for some Oracle drivers and SQLight for Mac OS X). I don't know if the driver could be fixed (i.e. updated), but you might be able to increase performance by adjusting buffsize. – eyjo Sep 02 '10 at 23:10
2

Edited:

indivs <-sqlQuery(channellab,'SELECT * from CGS.Specimen WHERE StudyCode=\'RP\'')  

Ok can you run another query where some data is returned?

WHERE StudyCode=StudyCode for example? Does this give the same error about -ve lengths?

this means that the above is correct for escaping the StudyCode. The -Ve lenght is another error. Can you return not all columns with * but specific columns where there are no nulls?

Preet Sangha
  • 64,563
  • 18
  • 145
  • 216
  • It did not work. I got Error: unexpected string constant in "indivs <-sqlQuery(channellab,'SELECT * from CGS.Specimen WHERE StudyCode=''RP'" – Farrel Sep 01 '10 at 20:58
  • It still does not work. Error in .Call(C_RODBCFetchRows, attr(channel, "handle_ptr"), max, buffsize, : negative length vectors are not allowed I do not have a clue what all that means but it does not work. What else can we try. – Farrel Sep 01 '10 at 21:48
  • I tried indivs <-sqlQuery(channellab,'SELECT * from CGS.Specimen WHERE StudyCode=StudyCode') and got Error in .Call(C_RODBCFetchRows, attr(channel, "handle_ptr"), max, buffsize, : negative length vectors are not allowed – Farrel Sep 01 '10 at 22:24
  • I do not know if I understood your additional question, "Can you return not * specific columns where there are no nulls?" You may have meant that I run a line such as indivs <-sqlQuery(channellab,'SELECT SubjLastName, SubjSex, StudyCode from CGS.Specimen WHERE StudyCode=\'RP\''). I still got an error. Error in .Call(C_RODBCFetchRows, attr(channel, "handle_ptr"), max, buffsize, : negative length vectors are not allowed – Farrel Sep 01 '10 at 23:33
  • Some real time voip or im may be in order. Do you use gmail or skype? If so I am fjbuch on both. – Farrel Sep 01 '10 at 23:35