I was reading an RCurl document and came across a new piece of code:
stockReader =
function()
{
    values <- numeric()  # vector to which the data is appended as it is received

    # Function that appends the parsed values to the centrally stored vector
    read = function(chunk) {
        con = textConnection(chunk)
        on.exit(close(con))
        tmp = scan(con)
        values <<- c(values, tmp)
    }

    list(read = read,
         values = function() values)  # accessor to get the result on completion
}
followed by
reader = stockReader()
getURL("http://www.omegahat.org/RCurl/stockExample.dat",
       write = reader$read)
reader$values()
It says 'numeric' in the sample, but surely this code sample can be adapted to other kinds of data? Read the attached document; I'm sure you will find what you're looking for.
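In case it helps, here is a minimal sketch of how the same closure pattern might be adapted to line-oriented text rather than numbers. The lineReader name and the use of readLines() in place of scan() are my own assumptions, not from the document, and a line split across chunk boundaries would need extra handling:

library(RCurl)

lineReader =
function()
{
    lines <- character()  # accumulates lines of text as chunks arrive

    # Parse each chunk into lines and append them to the stored vector
    read = function(chunk) {
        con = textConnection(chunk)
        on.exit(close(con))
        lines <<- c(lines, readLines(con))
    }

    list(read = read,
         lines = function() lines)  # accessor to get the result on completion
}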
It also says
The basic use of getURL(), getForm() and postForm() returns the contents of the requested document as a single block of text. It is accumulated by the libcurl facilities and
combined into a single string. We then typically traverse the contents of the document to
extract the information into regular data, e.g. vectors and data frames. For example, suppose
the document we requested is a simple stream of numbers such as prices of a particular stock
at different time points. We would download the contents of the file, and then read it into
a vector in R so that we could analyze the values. Unfortunately, this results in essentially
two copies of the data residing in memory simultaneously. This can be prohibitive or at least
undesirable for large datasets.
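Concretely, the whole-document approach described there might look something like the following sketch (using the stock example URL from above; note that the raw text and the parsed vector are both in memory at once):

library(RCurl)

# Fetch the entire document as a single string, then parse it into a numeric vector
txt = getURL("http://www.omegahat.org/RCurl/stockExample.dat")
con = textConnection(txt)
prices = scan(con)
close(con)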
An alternative approach is to process the data in chunks as it is received by libcurl. If we can
be notified each time libcurl receives data from the reply and do something meaningful with
the data, then we need not accumulate the chunks. The largest extra piece of information we
will need to hold in memory at any one time is a single chunk. In our example, we could take each chunk and pass it
to the scan() function to turn the values into a vector. Then we can concatenate this with
the vector from the previously processed chunks.
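As a quick check of that logic (no network needed), one can feed the reader a couple of simulated chunks by hand and confirm that the values accumulate; the chunk strings below are made up for illustration:

# Simulate libcurl delivering the reply in two chunks
reader = stockReader()
reader$read("1.5 2.25 3\n")
reader$read("4 5.75\n")
reader$values()   # c(1.5, 2.25, 3.0, 4.0, 5.75)

# Note: a number split across a chunk boundary would be parsed incorrectly
# by this simple reader; handling that case would require buffering the tail
# of each chunk.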