1

I have a data frame, Returns which looks something like this:

Date         Company            LstPrice    r
1987-02-27   NOVO NORDISK 'B'   2.29        0.031531532
1987-03-31   NOVO NORDISK 'B'   2.33        0.017467249
1987-04-30   NOVO NORDISK 'B'   2.25        -0.034334764
1987-05-29   NOVO NORDISK 'B'   2.22        -0.013333333
1987-06-30   NOVO NORDISK 'B'   2.47        0.1126126137
1987-07-31   NOVO NORDISK 'B'   2.46        -0.004048583
1987-08-31   NOVO NORDISK 'B'   1.98        -0.195121951
1987-09-30   NOVO NORDISK 'B'   1.90        -0.040404040
1987-02-27   DANSKE BANK        24.29       -0.130637079
1987-03-31   DANSKE BANK        24.97       0.027995060
1987-04-30   DANSKE BANK        25.43       0.018422107
1987-05-29   DANSKE BANK        26.19       0.029885961
1987-06-30   DANSKE BANK        26.50       0.011836579
1987-07-31   DANSKE BANK        26.57       0.002641509
1987-08-31   DANSKE BANK        28.55       0.074520135
1987-09-30   DANSKE BANK        26.25       -0.080560420

I would want to create new data frames for different months. For example, I would want a new data frame with the observations for the first three months, a new data frame for the next three months, and so on. They would look something like this:

Data Frame, FirstThreeMonths:

Date         Company            LstPrice    r
1987-02-27   NOVO NORDISK 'B'   2.29        0.031531532
1987-03-31   NOVO NORDISK 'B'   2.33        0.017467249
1987-04-30   NOVO NORDISK 'B'   2.25        -0.034334764
1987-02-27   DANSKE BANK        24.29       -0.130637079
1987-03-31   DANSKE BANK        24.97       0.027995060
1987-04-30   DANSKE BANK        25.43       0.018422107

Data Frame, NextThreeMonths:

Date         Company            LstPrice    r
1987-05-29   NOVO NORDISK 'B'   2.22        -0.013333333
1987-06-30   NOVO NORDISK 'B'   2.47        0.1126126137
1987-07-31   NOVO NORDISK 'B'   2.46        -0.004048583
1987-05-29   DANSKE BANK        26.19       0.029885961
1987-06-30   DANSKE BANK        26.50       0.011836579
1987-07-31   DANSKE BANK        26.57       0.002641509

....and so on (I have data for approx. 2200 companies for the last 30 years, so I will have to create a lot of data frames).

I have tried several different ways, both using if and for loops, and the subset command, but so far I can't get any of them to work. I also tried searching for similar questions, but couldn't find a solution that works for my exact problem. Is there an easy way to do something like this.

Every effort to help is much appreciated!

  • Look at `split`, and `cut` – Sotos Sep 22 '17 at 08:52
  • Why do you want to make several `data.frame`s? For sending elsewhere, or to apply a function to each `data.frame`? You could probably go a long way creating a simple group ID for each (set of) month(s) you want grouped together, then using `dplyr::group_by()` and/or `split()` as already suggested to get to where you need to be. – Jim Leach Sep 22 '17 at 09:08
  • I'm not completely sure that I have to make several data frames, but with my limited knowledge of R, this seemed like the easiest solution for me. I said I only needed to do this for the first three months, next three months, etc. but that was really just an example. I also need to do the same for six months, nine months and twelve months. I'll start by giving simple group ID a try, and see where that get's me. Thanks – Vegard Dyran Sep 22 '17 at 09:16
  • if you still want to split in multiple data frames try looking at [this](https://stackoverflow.com/questions/18527051/split-a-large-dataframe-into-a-list-of-data-frames-based-on-common-value-in-colu). duplicate? – simone Sep 22 '17 at 09:23
  • @Sotos and @Jim Leach I tried `df <- as.data.frame(split.Date(Returns, 1987-02-27, drop = TRUE))` but the new df is exactely equal to the old one (Returns) with the exception that the headers have weird names. As you may have guessed, I am quite new to R, so I may be way off here – Vegard Dyran Sep 22 '17 at 10:04
  • i posted a solution; what do you think of it? – h3rm4n Sep 22 '17 at 11:12

1 Answers1

1

You need to make a split vector first. For example:

splitter <- cut(as.integer(format(df$Date,'%m')),
                breaks = c(0,3,6,9,12),
                labels = c('First three','Second three','Third three','Fourth three'))

dflist <- split(df, splitter)

The result:

> dflist
$`First three`
         Date        Company LstPrice           r
1  1987-02-27 NOVO NORDISK B     2.29  0.03153153
2  1987-03-31 NOVO NORDISK B     2.33  0.01746725
9  1987-02-27    DANSKE BANK    24.29 -0.13063708
10 1987-03-31    DANSKE BANK    24.97  0.02799506

$`Second three`
         Date        Company LstPrice           r
3  1987-04-30 NOVO NORDISK B     2.25 -0.03433476
4  1987-05-29 NOVO NORDISK B     2.22 -0.01333333
5  1987-06-30 NOVO NORDISK B     2.47  0.11261261
11 1987-04-30    DANSKE BANK    25.43  0.01842211
12 1987-05-29    DANSKE BANK    26.19  0.02988596
13 1987-06-30    DANSKE BANK    26.50  0.01183658

$`Third three`
         Date        Company LstPrice            r
6  1987-07-31 NOVO NORDISK B     2.46 -0.004048583
7  1987-08-31 NOVO NORDISK B     1.98 -0.195121951
8  1987-09-30 NOVO NORDISK B     1.90 -0.040404040
14 1987-07-31    DANSKE BANK    26.57  0.002641509
15 1987-08-31    DANSKE BANK    28.55  0.074520135
16 1987-09-30    DANSKE BANK    26.25 -0.080560420

$`Fourth three`
[1] Date     Company  LstPrice r       
<0 rows> (or 0-length row.names)

Removing empty dataframes from that list can be done like this:

dflist <- split(df, splitter)
dflist <- dflist[sapply(dflist, nrow) > 0]
h3rm4n
  • 4,126
  • 15
  • 21