I am using R and sparklyr to process some data from Spark. I am reading two parquet files, in sequence, with

v1 <- spark_read_parquet(sc, "events","s3n://project/sessions.parquet", memory="true")
head(v1)
v2 <- spark_read_parquet(sc, "events","s3n://project/users.parquet", memory="true")
head(v1)
head(v2)

After the first read, head(v1) correctly shows the contents of the table just read. After the second read, head(v1) shows the contents of the second table, not the first. The data in v2 is always correct. Any hints?

user2345448
  • I'm not sure how I can reproduce this. I have some doubts about your memory parameter, though. I believe it should be `memory=TRUE` or `memory=T` and not `"true"`. There should be no quotes there. – eliasah Mar 01 '18 at 16:30
  • You are right, but it works this way too. – user2345448 Mar 01 '18 at 16:43
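
One possible explanation, offered as an assumption rather than a confirmed diagnosis: both calls pass the same table name, "events", as the second argument. In sparklyr that argument is the name under which the data is registered in Spark, and the default `overwrite = TRUE` lets a second read replace a table of the same name, so v1 may end up referring to the replaced table. A minimal sketch of the same two reads with distinct names follows; the names "sessions" and "users" and the wrapper function `read_sessions_and_users` are illustrative and not from the original post.

```r
# Sketch only: assumes an existing sparklyr connection `sc` and the two
# parquet paths from the question. Table names and the function name are
# hypothetical choices made for this example.
read_sessions_and_users <- function(sc) {
  # Each read registers a Spark table under `name`; the returned object
  # refers to that registered table.
  v1 <- sparklyr::spark_read_parquet(
    sc,
    name = "sessions",
    path = "s3n://project/sessions.parquet",
    memory = TRUE
  )
  # A different name here, so this read does not replace "sessions".
  v2 <- sparklyr::spark_read_parquet(
    sc,
    name = "users",
    path = "s3n://project/users.parquet",
    memory = TRUE
  )
  list(sessions = v1, users = v2)
}

# Usage, given a live connection `sc`:
# tables <- read_sessions_and_users(sc)
# head(tables$sessions)
# head(tables$users)
```

If this assumption holds, head(tables$sessions) should keep showing the first file's data after the second read, because the two references no longer share a table name.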

0 Answers