0

Probably a very simple solution but I'm quite new to R and recently finished some elementry Datacamp courses. I'm trying to import hourly data from a csv file and convert it to an xts object.

First I tried to directly read is as a zoo object but this did not work. Then I found a way to first extract the date + time containing column with as.POSIXct and use that to order the dataframe by. But this gives the following error.

Code I tried:

> btch <- read.csv(file = "btch.csv", header = T, sep = ";")
> head(btch)
              date  symbol  open  high   low close             Volume.BTC      Volume.USD
1 22-11-2021 00:00 BTC/USD 58671 58793 58025 58255  4.106.503.846.330.780 239.224.381.568
2 21-11-2021 23:00 BTC/USD 59163 59322 58575 58671    494.530.646.266.469 290.146.075.471
3 21-11-2021 22:00 BTC/USD 59540 59571 58971 59163 22.018.707.614.894.400 130.269.279.862
4 21-11-2021 21:00 BTC/USD 59647 59703 59432 59540 16.367.382.035.438.300  97.451.392.639
5 21-11-2021 20:00 BTC/USD 59373 59734 59319 59647 25.563.673.617.449.300 152.479.644.026
6 21-11-2021 19:00 BTC/USD 59488 59611 59265 59373 17.191.861.324.170.900  10.207.323.824

> pt <- as.POSIXct(btch$date, format = "%d-%m-%Y %H:%M")
> btch_ts <- xts(btch[,-1], order.by = pt)

Error in xts(btch[, -1], order.by = pt) : 'order.by' cannot contain 'NA', 'NaN', or 'Inf'

The first 10 rows of my CSV file:

                            date  symbol  open  high   low close             Volume.BTC      Volume.USD
1  22-11-2021 00:00 BTC/USD 58671 58793 58025 58255  4.106.503.846.330.780 239.224.381.568
2  21-11-2021 23:00 BTC/USD 59163 59322 58575 58671    494.530.646.266.469 290.146.075.471
3  21-11-2021 22:00 BTC/USD 59540 59571 58971 59163 22.018.707.614.894.400 130.269.279.862
4  21-11-2021 21:00 BTC/USD 59647 59703 59432 59540 16.367.382.035.438.300  97.451.392.639
5  21-11-2021 20:00 BTC/USD 59373 59734 59319 59647 25.563.673.617.449.300 152.479.644.026
6  21-11-2021 19:00 BTC/USD 59488 59611 59265 59373 17.191.861.324.170.900  10.207.323.824
7  21-11-2021 18:00 BTC/USD 59516 59591 59324 59488 20.877.146.473.238.200  12.419.396.894
8  21-11-2021 17:00 BTC/USD 59908 60067 59469 59516  4.256.875.185.395.520 253.352.183.534
9  21-11-2021 16:00 BTC/USD 59258 60000 59183 59912  4.848.661.867.906.260  29.049.302.983
10 21-11-2021 15:00 BTC/USD 58948 59315 58852 59258 24.230.832.403.726.000 143.587.066.658

Ok so an update on the much smaller csv file which only contains 5 rows to try an reproduce it from there.

> btchs <- read.csv(file = "btch_small.csv", header = T, sep = ";")
> head(btchs, n = 8)
              date  symbol  open  high   low close             Volume.BTC      Volume.USD
1 22-11-2021 00:00 BTC/USD 58671 58793 58025 58255  4.106.503.846.330.780 239.224.381.568
2 21-11-2021 23:00 BTC/USD 59163 59322 58575 58671    494.530.646.266.469 290.146.075.471
3 21-11-2021 22:00 BTC/USD 59540 59571 58971 59163 22.018.707.614.894.400 130.269.279.862
4 21-11-2021 21:00 BTC/USD 59647 59703 59432 59540 16.367.382.035.438.300  97.451.392.639
> index <- as.POSIXct(btchs$date, format = "%d-%m-%Y %H:%M")
> btchs_ts <- xts(btchs[,-1], order.by = index)
> head(btchs_ts)
                    symbol    open    high    low     close   Volume.BTC               Volume.USD       
2021-11-21 21:00:00 "BTC/USD" "59647" "59703" "59432" "59540" "16.367.382.035.438.300" "97.451.392.639" 
2021-11-21 22:00:00 "BTC/USD" "59540" "59571" "58971" "59163" "22.018.707.614.894.400" "130.269.279.862"
2021-11-21 23:00:00 "BTC/USD" "59163" "59322" "58575" "58671" "494.530.646.266.469"    "290.146.075.471"
2021-11-22 00:00:00 "BTC/USD" "58671" "58793" "58025" "58255" "4.106.503.846.330.780"  "239.224.381.568"

Now my other values all changed to characters. Don't understand why this happened.

  • The CSV is a download from FTX. – Kurtosis78 Nov 26 '21 at 10:59
  • The error message is pretty clear. You have NAs, NANs, or Infs in `pt`. If you are not interested in knowing why these NAs are there, you could try something like `btch_ts <- xts(btch[!is.na(pt),-1], order.by = pt[!is.na(pt)])`. – djas Nov 26 '21 at 11:23
  • @djas, thanks but then the number of rows doesn't match. `btch_ts <- xts(btch[!is.na(pt),-1], order.by = pt[!is.na(pt)])` Error in xts(btch[!is.na(pt), -1], order.by = pt[!is.na(pt)]) : NROW(x) must match length(order.by) ` – Kurtosis78 Nov 26 '21 at 11:34
  • It can be hard to nail this without a reproducible example. But the idea is to remove the NAs, NANs, and Infs from `pt`, and the corresponding rows in `btch`, before passing them to `xts()`. If you didn't expect this column to have NAs & co, it would be a good idea to first understand what is happening, instead of blindly removing these problematic values. – djas Nov 26 '21 at 11:41
  • The input shown at the end of the question does not correspond to the read.csv command. It has no semicolons. – G. Grothendieck Nov 26 '21 at 13:31
  • @djas Tried locating the NA value(s), so far without success. ```> which(is.na(pt)) integer(0)```. Think I will delete 99% of my data from the csv file to see if I can make it work for a much smaller dataset. – Kurtosis78 Nov 26 '21 at 13:37
  • According to the error message, the offending data points may also be NANs or Infs. Just keep that in mind. But yes, it is a good idea to start working through your problem with controlled data. – djas Nov 26 '21 at 13:40
  • @G. Grothendieck: I seem to have pasted data incorrectly. The later copied in smaller dataframe is correct. – Kurtosis78 Nov 26 '21 at 13:46
  • @djas Yes I also searched for NANs / Infs but there do not seem to be any. With the smaller dataset it does work to create an xts. Only all my other data has been transformed to text. I of course don't want this and don't understand why it happened . – Kurtosis78 Nov 26 '21 at 13:49
  • It still has no semicolons but your read.csv says it does. Something is wrong. Please provide the correct input before being read by R. – G. Grothendieck Nov 26 '21 at 13:56
  • @G. Grothendieck; can it be excel messes up the CSV file? Interpretting date format and changing the sepparators for instance? Will try downloading the datafile again without opening it in excel. Problem is the source website puts lines above it which I need to remove before importing. – Kurtosis78 Nov 27 '21 at 14:25

0 Answers0