
I'm trying to process some log files with R through the 'rhdfs' and 'rmr2' packages. The source is a local Linux directory in the cloud, and the destination folder where I'm trying to store the files is on an HDFS cluster.

The code was working fine until last night, when it stopped running and started throwing multiple errors that are different each time. The system configuration is the same as before. I have tried running all the lines one by one, and now the error I'm getting is:

# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x00007fddc769ab15, pid=8466, tid=0x00007fddc9fa4940
#
# JRE version: OpenJDK Runtime Environment (8.0_181-b13) (build 1.8.0_181-b13)
# Java VM: OpenJDK 64-Bit Server VM (25.181-b13 mixed mode linux-amd64 compressed oops)
# Problematic frame:
# C  [libc.so.6+0x82b15]

I'm running the following code, and it runs fine up to this point:

filenames <- hdfs.ls("new")$file
f <- lapply(regmatches(filenames, regexec("/user/akashb/new/(.*)", filenames)), `[`, 2L)

x <- from.dfs(filenames[1], format = "text")$val
tf <- as.character(f[1])
dts <- paste("/user/akashb/new2/", tf, ".csv", sep = "")
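For reference, the name-extraction step behaves as expected when I test it on a sample path (a hypothetical filename here, just mirroring the pattern above):

```r
# Hypothetical sample path in the same layout as my HDFS listing
fn <- "/user/akashb/new/access_log_01"

# regexec() returns the full match plus the capture group;
# taking element 2 keeps only the bare file name
m  <- regmatches(fn, regexec("/user/akashb/new/(.*)", fn))
tf <- sapply(m, `[`, 2L)
tf
# -> "access_log_01"
```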

The value in x also looks fine, but when I try to run:

x <- x[grep("ads.xxx.com",x,ignore.case = T,invert=T)]

the above error ("A fatal error...") appears. Also, when I ran the script through a Unix shell, the error generated was:

*** caught segfault ***
address 0x1d146408, cause 'memory not mapped'
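For what it's worth, the same call pattern runs fine on a small in-memory vector (made-up sample lines, not my real logs), which makes me suspect the crash is related to the environment or the data coming out of HDFS rather than the grep call itself:

```r
# Hypothetical sample data standing in for my real log lines
x <- c("GET http://ads.xxx.com/banner", "GET http://example.com/page")

# Same filter as in my script: drop every line matching the ad host
kept <- x[grep("ads.xxx.com", x, ignore.case = TRUE, invert = TRUE)]
kept
# -> "GET http://example.com/page"
```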

I am not at all familiar with Java, so I cannot diagnose the error myself. I tried some fixes recommended on Stack Overflow, like uninstalling and reinstalling all non-base packages, but they were of no use. Any help would be appreciated.
