0

TLDR; go to the answers below, I have provided what I have done to solve my problem.

I want to change unixtime in a read CSV file so that I can export this CSV file to matlab for a school project. I have actual unixtimes in that column that I want as dates (I have found questions on how to do it on stackoverflow but they are not working - what do I mean? The codes I have used below I found on stackoverflow, but I have no fundamental knowledge of R).

After this question I want to find OHLC of the data using another answer on stackoverflow. The problem is I cant get past the first stage.

These are imported data for price, time and volume.

     X33287 X1331992243 X4.985 X5.72E.00
 [1,]  33291  1331992243  4.985    1.0000
 [2,]  33291  1331992243  4.988    0.3300
 [3,]  33291  1331992243  4.990    1.0000
 [4,]  33291  1331992243  4.993    2.7800
 [5,]  33292  1331992243  4.998   13.5000
 [6,]  33293  1331992243  4.999    0.2660
......

Where the first column is number, second is unixdate time, 3rd price, 4th volume.

To get this far I used this: import using test<-read.csv(##FILELOCATION) test=as.matrix(test) ##I made a matrix from experimentation, I guess it was ##anyway

Then using

as.POSIXct(time,origin="1970-01-01") 

I can find the values of the actual date (yay!), but then I can't do anything with it.

Look

A=seq(1,10)*0
for (i in 1:10)
{
A[i]=as.vector(as.POSIXct(test[i,2],origin="1970-01-01"))
print(A[i])

}
[1] 1331992243
[1] 1331992243
[1] 1331992243
[1] 1331992243
[1] 1331992243
[1] 1331992243
[1] 1331992243
[1] 1331992281

It's supposed to be a date, not unixtime. Then when I as.POSIXct(A) it gives me the dates like this

[1] "2012-03-17 11:20:43 NDT" "2012-03-17 11:20:43 NDT" "2012-03-17 11:20:43     NDT" "2012-03-17 11:20:43 NDT"

[5] "2012-03-17 11:20:43 NDT" "2012-03-17 11:20:43 NDT" "2012-03-17

I've tried

x=seq(1,length(time))*0 

and also x=seq(1,40)*0 to set up a vector. for (i in 1:10) { x[i]<- as.POSIXct(time[i,2],origin="1970-01-01") print(x[i]) }

tells me i have incorrect dimensions.

also tried

A=seq(1,40)*0
for (i in 1:40)
{
A[i]=as.vector(as.POSIXct(test[i,2],origin="1970-01-01"))
print(A[i])

}

....again it just outputs Unix time, then when I type

A

....it just brings the list of numbers (num)

 [1] 1331992243 1331992243 1331992243 1331992243 1331992243 1331992243           1331992243 1331992281 1331993630
[10] 1331993693 1331993752 1331993754 1331994303 1331994884 1331998567 1331999674 1331999973 1331999984
[19] 1332002200 1332002326 1332002740 133200275

also tried this I saw in another question...also didnt work.

p=test[1:40,2]
A=seq(1,10)*0
for (i in 1:10)
{
A[i]=head(as.POSIXct(as.numeric(as.character(p[i])),origin="1970-01-01"))
print(A[i])

}

EDIT: using R: Assigning POSIXct class to a data frame

I did this:

text=read.csv("d:/test.csv")

   X33287 X1331992243 X4.985 X5.72E.00
1   33288  1331992243  4.985    1.0000
2   33289  1331992243  4.988    0.3300
3   33290  1331992243  4.990    1.0000
..............

then taking out extra column, after making it a matrix with ,
then remove the first useless column

textm=as.matix(text)
textx=textm[,1:3] #redundent oops

which gave me

     X1331992243 X4.985 X5.72E.00
[1,]  1331992243  4.985    1.0000
[2,]  1331992243  4.988    0.3300
[3,]  1331992243  4.990    1.0000
..............

then this works

myxts <- xts(testx[,2:4], order.by=as.POSIXct(testx[,1], format='%m/%d/%y %H:%M'))

##then you have your answer 
to.minutes(myxts)
                myxts.Open myxts.High myxts.Low myxts.Close
2012-03-17 11:20:43      4.985      4.999     4.985       4.999
2012-03-17 11:21:21      4.999      4.999     4.999       4.999
2012-03-17 11:43:50      4.907      4.907     4.907       4.907
2012-03-17 11:44:53      4.999      4.999     4.999       4.999

Now I just have to do this with 1000000 rows, then export it to CSV which matlab can read and start some data analysis. I'd like to continue with R, but only have a few days to finish some lagging indicator checks and a simple ANN.

Community
  • 1
  • 1
Alan
  • 13
  • 1
  • 5
  • 1
    Clean up your question. First you say you "cant" get the date, then its sounds like you _do_ get something which you do not describe. You also made an error with your data entry since the header names is clearly supposed to be the first row of data. There is no column named 'time'. And ... Why would you multiply a sequence by 0??? – IRTFM Dec 01 '15 at 22:25
  • Edited, cant should say CAN. I multiplied by 0 as I told by a classmate to make an empty row vector that will fill up after my For loop completes. It doesnt matter though as it stays in UNIX no matter what I do. When I am saying for "time" to be used, this is what I called the matrix after import of CSV. So time[1:40,2] for me means rows 1 to 40, in the second column. The root I want to do is take the CSV colomn, change it to dates with hours/min/sec, then make OHLC bars with a volume column. Then export and use in matlab. – Alan Dec 01 '15 at 22:50
  • 1
    https://cran.r-project.org/doc/manuals/R-intro.pdf Pay particular attention to the `rep`, `as.Date` and `read.table` illustrations and do read the help pages. – IRTFM Dec 01 '15 at 23:22
  • Ill do that shortly thanks – Alan Dec 01 '15 at 23:51
  • First I didnt read the whole document, did searches and read though the areas they were in (as.Date wasnt there but I searched elsewhere). read.table brough my csv in as an object but didnt help much. Likely i missed somthing due to limited understading... rep being 'replicate' didnt help me much, didnt know what or why I would use it. Will edit above to show what I have done in last hour or so. – Alan Dec 02 '15 at 03:14

2 Answers2

1

If you have a "unixtime" column of integer seconds since jan 1, 1970 in a dataframe named 'dfrm' with a col named time then the way to get dates is

as.Date(as.POSIXct( dfrm$time , origin="170-01-01"))

... But that's not what you are showing. You apparently have a headerless file and you are using read.csv which defaults to header=TRUE, unless you specifically tell it to not use headers with ... wait for it ... header=FALSE. If you don't have headers and do not give it colnames in the read.csv call then you will need to look at the file with head(dfrm) so that you can use the correct column name.

If you are using xts then I doubt that converting to Dates will make sense since it assumes unique index values.

IRTFM
  • 258,963
  • 21
  • 364
  • 487
  • thanked your comment for trying to help, i like R, but no time to really learn and use it now. also like that its free. Not sure the header or not changed anything other than looking better in my excel file. :D – Alan Dec 02 '15 at 23:32
0

here is the csv to test https://www.dropbox.com/s/ej18jchoxf34b58/test.csv?dl=0 what I did

test<-read.csv("d:/test.csv" headers=FALSE) ##note it doesnt really matter in my case
test[,2]<- as.POSIXct(test[,2], origin='1970-01-01')
##dont know why but I needed to do this after some guess work, if not then later it will
##work properly... im a noob to R. Perhaps someone can explain or ill find it in the 
##manual later.
myxts<- xts(test[,3:4], order.by=as.POSIXct(test[,2], format='%m/%d/%y     %H:%M'))
to.minutes(myxts)

I dont see the difference in using headers or not. to export I did this write.zoo(xmat, file="/tmp/demo.csv", sep=",") ##or c:/ from can I write an xts object using write.csv in R

this then outputs as a CSV you can open in excel or import to matlab. For me as I have some exp in matlab and feel comfortable with many commands I prefer to use it. Though I do see how R is much "easier" to write in with the to.minutes or as.matrix, etc commands show.

Community
  • 1
  • 1
Alan
  • 13
  • 1
  • 5