I'm asking this as a general, beginner-level question about R, not one specific to the package I was using.
I have a data frame with 3 million rows and 15 columns. I don't consider this a huge data frame, but maybe I'm wrong.
I ran the following script and it has been running for over 2 hours. I imagine there must be something I can do to speed this up.
Code:
library(plyr)  # ddply() and summarise() come from plyr
ddply(orders, .(ClientID), summarise, NumOrders = length(OrderID))  # count orders per client
This is not an overly intensive script, or at least I don't think it is.
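To make clear what result I'm after, here is a minimal sketch on a small made-up data frame. The name `orders_small` and its values are invented for illustration and are not my real data. It uses base R's `table()` instead of plyr, only to show the output I expect (one row per client with its order count); I haven't checked how it behaves on the full 3 million rows.

```r
# Toy stand-in for the real `orders` data frame (hypothetical values).
orders_small <- data.frame(
  ClientID = c(101, 102, 101, 103, 102, 101),
  OrderID  = 1:6
)

# Tally rows per ClientID, then turn the table into a data frame
# with the count column named NumOrders.
counts <- as.data.frame(
  table(ClientID = orders_small$ClientID),
  responseName = "NumOrders",
  stringsAsFactors = FALSE
)

print(counts)
```

Note that `ClientID` comes back as character in this sketch, because `table()` stores the group labels as names.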
In a database, you could add an index to a table to speed up joins. Is there a similar step in R that I should be taking on import to make functions and packages run faster?