
We are creating the following dataframe inside an Rcpp function:

  Rcpp::DataFrame res =
    Rcpp::DataFrame::create(
     Rcpp::Named("A")=a
    ,Rcpp::Named("B")=b
    ,Rcpp::Named("C")=c
    ,Rcpp::Named("D")=d
    ,Rcpp::Named("E")=e
    ,Rcpp::Named("F")=f
    ,Rcpp::Named("G")=g
    ,Rcpp::Named("H")=h
    ,Rcpp::Named("I")=i
    ,Rcpp::Named("J")=j
    ,Rcpp::Named("K")=k
    ,Rcpp::Named("L")=l
    ,Rcpp::Named("M")=m
    ,Rcpp::Named("N")=n
    ,Rcpp::Named("O")=o
    ,Rcpp::Named("P")=p
    ,Rcpp::Named("Q")=q
    ,Rcpp::Named("R")=r
    ,Rcpp::Named("S")=s
    ,Rcpp::Named("T")=t
    ,Rcpp::Named("U")=u
  );

This data frame is intended to be the function's return value. However, the code fails to compile with the following error:

error: no matching function for call to Rcpp::DataFrame_Impl<Rcpp::PreserveStorage>::create
In file included from local/lib64/R/library/Rcpp/include/Rcpp/DataFrame.h:97:0,
                 from local/lib64/R/library/Rcpp/include/Rcpp.h:57,
                 from file54f121e6a937.cpp:1:
local/lib64/R/library/Rcpp/include/Rcpp/generated/DataFrame_generated.h:142:23: note: template<class T1, class T2, class T3, class T4, class T5, class T6, class T7, class T8, class T9, class T10, class T11, class T12, class T13, class T14, class T15, class T16, class T17, class T18, class T19, class T20> static Rcpp::DataFrame_Impl<StoragePolicy> Rcpp::DataFrame_Impl<StoragePolicy>::create(const T1&, const T2&, const T3&, const T4&, const T5&, const T6&, const T7&, const T8&, const T9&, const T10&, const T11&, const T12&, const T13&, const T14&, const T15&, const T16&, const T17&, const T18&, const T19&, const T20&) [with T1 = T1; T2 = T2; T3 = T3; T4 = T4; T5 = T5; T6 = T6; T7 = T7; T8 = T8; T9 = T9; T10 = T10; T11 = T11; T12 = T12; T13 = T13; T14 = T14; T15 = T15; T16 = T16; T17 = T17; T18 = T18; T19 = T19; T20 = T20; StoragePolicy = Rcpp::PreserveStorage]
 static DataFrame_Impl create( const T1& t1, const T2& t2, const T3& t3, const T4& t4, const T5& t5, const T6& t6, const T7& t7, const T8& t8, const T9& t9, const T10& t10, const T11& t11, const T12& t12, const T13& t13, const T14& t14, const T15& t15, const T16& t16, const T17& t17, const T18& t18, const T19& t19, const T20& t20 ) {
                       ^
local/lib64/R/library/Rcpp/include/Rcpp/generated/DataFrame_generated.h:142:23: note:   template argument deduction/substitution failed:
file54f121e6a937.cpp:771:7: note:   candidate expects 20 arguments, 21 provided

It works fine with 20 arguments. How do we overcome this problem? Thanks

Dimon
  • It's essentially a FAQ. If you need more than twenty, nest them in a list object, with each element using up to twenty columns. – Dirk Eddelbuettel Sep 18 '17 at 18:18
  • @Dirk Eddelbuettel. Thank you. Does Rcpp::List::create also have a limit of 20? If so, is the cap then 20 x 20 = 400? – Dimon Sep 18 '17 at 18:38
  • It is a recursive data structure, so 20 x 20 x 20 x 20 x 20 ... – Dirk Eddelbuettel Sep 18 '17 at 19:36
  • @Dirk Eddelbuettel. Thank you. Wouldn't it be more user-friendly to push this into the Rcpp stack and make the macro limit configurable by the user at compile time (the user has to compile the code on each run anyway)? You have a C++11 option for the compiler choice, so this should not cause any problem. – Dimon Sep 18 '17 at 21:04
  • We need to support key features on older compilers. But the architecture is extensible, so _you_ can easily contribute a package with a function to create lists with as many elements as you desire. – Dirk Eddelbuettel Sep 18 '17 at 21:06
  • @Dirk Eddelbuettel. Ok :-). Another option I tried is to use `.push_back(data, "name")` after reaching the limit of 20 but that option changes the created data frame from List of 'data.frame's to List of Lists and breaks my R code. Why is this and how can I fix it? – Dimon Sep 18 '17 at 21:15

1 Answer


Yes. This is covered in many different places...

Off the top of my head:

https://cran.r-project.org/web/packages/Rcpp/vignettes/Rcpp-FAQ.pdf#page=17

In essence, so that it compiles with the widest range of compilers, Rcpp is constrained by the older (pre-C++11) C++ standards, which do not support variadic templates. So we use macros and code generator scripts to enumerate the arguments explicitly, and that enumeration has to stop at some limit. We chose 20.
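To make the paragraph above concrete, here is a minimal sketch in plain C++ (no Rcpp needed) that contrasts hand-enumerated overloads with a C++11 variadic template. The names `NamedColumn`, `create_fixed`, `create_variadic` and `demo_column_count` are hypothetical stand-ins invented for this illustration; they are not part of Rcpp, and Rcpp's real `create()` is generated for up to 20 arguments rather than the two shown here.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for something like Rcpp::Named("A") = a:
// a column name paired with its values.
struct NamedColumn {
    std::string name;
    std::vector<double> values;
};

// Pre-C++11 style: one overload per argument count, written out by hand
// (or by a generator script). Only one and two arguments are covered here,
// so a call with three arguments has no matching overload.
std::vector<NamedColumn> create_fixed(const NamedColumn& t1) {
    std::vector<NamedColumn> out;
    out.push_back(t1);
    return out;
}

std::vector<NamedColumn> create_fixed(const NamedColumn& t1,
                                      const NamedColumn& t2) {
    std::vector<NamedColumn> out;
    out.push_back(t1);
    out.push_back(t2);
    return out;
}

// C++11 style: a variadic template accepts any number of columns,
// so no fixed upper limit is baked into the source.
template <typename... Cols>
std::vector<NamedColumn> create_variadic(const Cols&... cols) {
    return std::vector<NamedColumn>{cols...};
}

// Usage sketch: returns how many columns the variadic version collected.
std::size_t demo_column_count() {
    NamedColumn a{"A", {1.0, 2.0}};
    NamedColumn b{"B", {3.0, 4.0}};
    NamedColumn c{"C", {5.0, 6.0}};

    assert(create_fixed(a, b).size() == 2);
    // create_fixed(a, b, c);  // would not compile: no three-argument overload

    return create_variadic(a, b, c).size();
}
```

The commented-out call fails for the same kind of reason as the error in the question: the compiler only finds overloads up to a fixed argument count. The variadic version needs a C++11 compiler, which is the trade-off described above.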


The approach to use to create a data.frame with more than 20 columns is to build a list, then coerce to data.frame.

Sample code:

#include <Rcpp.h>

// [[Rcpp::export]]
Rcpp::List dynamic_df(Rcpp::DataFrame df) {

  // Number of variables in data.frame
  int num_vars = df.ncol();

  // Instantiate a list with one entry per variable (num_vars entries)
  Rcpp::List long_list(num_vars);

  // Make a variable to name columns
  Rcpp::CharacterVector namevec(num_vars);

  // Copy from data.frame into list.
  for (int i=0;i < num_vars; ++i) {
    long_list[i] = df(i); // Move vector from data frame to list
    namevec[i] = i;
  }

  // Add colnames
  long_list.attr("names") = namevec;

  // Coerce list to data.frame
  long_list.attr("row.names") = Rcpp::IntegerVector::create(NA_INTEGER, df.nrow());
  long_list.attr("class") = "data.frame";

  // Return result.. Will appear as data.frame
  return long_list;
}

/*** R

head(dynamic_df(mtcars))

*/

Output:

head(dynamic_df(mtcars))
#      0 1   2   3    4     5     6 7 8 9 10
# 1 21.0 6 160 110 3.90 2.620 16.46 0 1 4  4
# 2 21.0 6 160 110 3.90 2.875 17.02 0 1 4  4
# 3 22.8 4 108  93 3.85 2.320 18.61 1 1 4  1
# 4 21.4 6 258 110 3.08 3.215 19.44 1 0 3  1
# 5 18.7 8 360 175 3.15 3.440 17.02 0 0 3  2
# 6 18.1 6 225 105 2.76 3.460 20.22 1 0 3  1

That said, you should really consider using the List_Builder class from Kevin's answer to the duplicate question:

how many vectors can be added in DataFrame::create( vec1, vec2 ... )?

coatless