
I recently asked how to apply a function to the data frames inside a list. The answer I received in the post linked below worked perfectly.

Apply a function to a List of dataframes in R

In that example, the list contained only two data frames of size 6x6.

When I tried to replicate the output for my own list, I got the following error:

Error in matrix(r, nrow = len.r, ncol = count) : 
 invalid 'ncol' value (too large or NA)
 In addition: Warning message:
 In combn(unique(x$id), 2) : NAs introduced by coercion to integer range
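
For context, the warning and the error are consistent with `combn` storing the number of combinations as an integer: once `choose(n, 2)` exceeds `.Machine$integer.max`, the coerced count becomes `NA` and the result matrix cannot be allocated. A minimal sketch of that arithmetic, where `n_pairs` is a hypothetical helper name and not part of the original code:

```r
# Number of pairs that combn(ids, 2) would need one column for.
# `n_pairs` is a hypothetical helper introduced for illustration.
n_pairs <- function(n_ids) choose(n_ids, 2)

# Largest value an R integer can hold (2147483647).
int_max <- .Machine$integer.max

n_pairs(65536) <= int_max   # TRUE: the count still fits in an integer
n_pairs(65537) <= int_max   # FALSE: coercing the count to integer gives NA

# For the full data set described in the question:
n_pairs(2328439)            # on the order of 2.7e12 pairs
```

Under this reading, any single chunk with more than 65536 unique ids would trigger the error, regardless of how many chunks the list has.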

My list is basically one big data frame (2328439 signatures of 11 variables) divided into chunks, giving a large list of 6236 elements (3.5 GB).

I basically want to pair up all possible combinations of signatures and compare them side by side. Since the data is huge, I decided to group it first: the data is divided into chunks, and each chunk is a separate data frame whose signatures are to be paired.
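
The per-chunk pairing described above could be sketched roughly as follows. This is an illustration under assumptions, not the code from the linked answer: `pair_ids`, `max_pairs`, and the toy `chunks` list are hypothetical, and the column name `id` is taken from the error message.

```r
# Hypothetical helper: build id pairs for one chunk, guarding the two
# cases in which combn() fails (fewer than 2 ids, or too many pairs).
pair_ids <- function(chunk, max_pairs = 1e6) {
  ids <- unique(chunk$id)
  n <- length(ids)
  if (n < 2 || choose(n, 2) > max_pairs) {
    return(NULL)  # skipped: nothing to pair, or too large to materialise
  }
  pairs <- t(combn(ids, 2))  # one row per pair
  data.frame(id1 = pairs[, 1], id2 = pairs[, 2])
}

# Toy stand-in for the real list of data frames.
chunks <- list(
  a = data.frame(id = c(1, 2, 3)),
  b = data.frame(id = 7)  # single id: no pairs possible
)

paired <- lapply(chunks, pair_ids)

# Chunks that were skipped come back as NULL and can be inspected.
skipped <- names(paired)[vapply(paired, is.null, logical(1))]
```

Returning `NULL` for oversized chunks keeps the `lapply` call from aborting, so the chunks that need further splitting can be identified afterwards.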

If we consider the signatures data frame before dividing it into chunks, it looks like this:

> ids <- combn(unique(signatures$uniqueid),2)
Error in combn(unique(signatures$uniqueid), 2) : n < m
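
One possible reading of the `n < m` message, offered as an assumption rather than a diagnosis: `combn` raises it when the input has fewer than `m` elements, which happens when the column referenced with `$` does not exist and `NULL` is passed in. The toy data frame below uses the column name `unique_id` from the comment thread; whether the real data uses that name is not confirmed.

```r
# Toy data frame; the column name `unique_id` is an assumption taken from
# the comment thread, not confirmed for the real data.
signatures <- data.frame(unique_id = c("a", "b", "c"))

# `$` returns NULL for a column name that does not exist.
is.null(signatures$uniqueid)

# combn() on an empty input stops with the message "n < m".
msg <- tryCatch(
  combn(unique(signatures$uniqueid), 2),
  error = function(e) conditionMessage(e)
)
msg

# With the existing column name the call succeeds on this toy data.
ids <- combn(unique(signatures$unique_id), 2)
dim(ids)
```

Checking `names(signatures)` and `length(unique(signatures$uniqueid))` before calling `combn` would confirm or rule out this case.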

This code works for a small dataset (reference: R Generate non repeating pairs in dataframe), but when I tried it on my big data frame I got the error above.

Any suggestions?

Saul Garcia
  • How many unique values are there for `uniqueid`? – Roman Luštrik Mar 18 '16 at 11:36
  • There are 2328439 values of `unique_id`, which is the number of signatures. When splitting the data into blocks, I get 6236 different data frames, each with a different number of `unique_id` values. – Saul Garcia Mar 18 '16 at 12:03
  • That's a lot of pairwise comparisons. Makes me wonder why this is even needed? – Roman Luštrik Mar 18 '16 at 12:05
  • As a project for my Master's program, we are attempting author disambiguation, so this is the method I came up with to compare distances between authors. I might be wrong; I am just trying to see whether this method could work. – Saul Garcia Mar 18 '16 at 14:46

0 Answers