3

Basically the title. I need to parse a lot of CSVs into a dataframe, then shove them into a sql database. I've already written the relevant parsing code for a single data frame, and there's plenty of documentation on the site regarding how to handle going from a data frame to a sqllite file, but I need to merge the dataframes into one DB, and I figure the easiest way to do that will be via an Rbind equivalent.

Overall, I've tried concatenate, stack-rows, and a few other forms both on the sequence/simple-vector side of things, and the lisp-stat side of things with little to no luck, so any ideas would be seriously helpful.

johnabs
  • 41
  • 4

1 Answers1

3

The way to do that is with the array-operations stacking functions. Here's an example using one of the built-in data sets:

(data 'mtcars) ; load the data frame
(tail mtcars) ; view the last few rows

;;   MODEL           MPG CYL DISP  HP DRAT    WT QSEC VS AM GEAR CARB
;; 0 Lotus Europa   30.4   4 95.1 113 3.77 1.513 16.9  1  1    5    2
;; 1 Ford Pantera L 15.8   8 351.0 264 4.22 3.170 14.5  0  1    5    4
;; 2 Ferrari Dino   19.7   6 145.0 175 3.62 2.770 15.5  0  1    5    6
;; 3 Maserati Bora  15.0   8 301.0 335 3.54 3.570 14.6  0  1    5    8
;; 4 Volvo 142E     21.4   4 121.0 109 4.11 2.780 18.6  1  1    4    2

Create a new row:

(defparameter boss-mustang #("Boss Mustang" 12.7d0 8 302 405 4.11d0 2.77d0 12.5d0 0 1 4 4)) ; a new row for the Ford Mustang

and add it using the stack-rows function:

(matrix-df
 (keys mtcars)
 (aops:stack-rows mtcars boss-mustang))
LS-USER> (tail *)

;;   MODEL           MPG CYL DISP  HP DRAT    WT QSEC VS AM GEAR CARB
;; 0 Lotus Europa   30.4   4 95.1 113 3.77 1.513 16.9  1  1    5    2
;; 1 Ford Pantera L 15.8   8 351.0 264 4.22 3.170 14.5  0  1    5    4
;; 2 Ferrari Dino   19.7   6 145.0 175 3.62 2.770 15.5  0  1    5    6
;; 3 Maserati Bora  15.0   8 301.0 335 3.54 3.570 14.6  0  1    5    8
;; 4 Volvo 142E     21.4   4 121.0 109 4.11 2.780 18.6  1  1    4    2
;; 5 Boss Mustang   12.7   8 302.0 405 4.11 2.770 12.5  0  1    4    4
NIL
CL-USER
  • 715
  • 3
  • 9
  • Oh okay I tried this but I didn't use the (as-array) function for the original dataframe. So let's say I want to take multiple complete dataframe and stack them, would the rough idea be something like this? – johnabs Oct 09 '22 at 20:29
  • Okay, I coudn't edit my last comment by accident, but I tried this and can't get it to work well. Assume df_list = (cons df1 df2) and so forth (or something similar) What I wanted to do was: (matrix-df (keys (first df_list)) (apply #'aops:stack-rows df_list)) The first part (getting the keys) works fine, but the apply function fails with an error: Attempt to use VALUES-LIST on dotted list. Any ideas on how I can get this part working? – johnabs Oct 09 '22 at 21:25
  • You can add multiple matrices with stack-rows. You could use the given example to create a function that adds a single data frame and then map over all of them. In your example, you're creating a dotted list. To create the list of data frames, use df-list = (list df1 df2). Cons creates a cons cell; also, to use apply, you'll need to ensure the data frames are converted to matrices. I suggest using the first option. – CL-USER Oct 10 '22 at 12:38