I am trying to run a Spark application on YARN. The application uses the pipe()
transformation to run a local PHP program. The weird thing is that every time the PHP process starts, it receives a SIGPIPE
signal about one minute later (after a few records have been processed successfully) and gets terminated.
PS: the same program runs smoothly in a standalone environment, but fails in the production environment, which is a cluster.
Can anyone help identify the possible cause? Thanks.
strace output from the PHP process:
write(1, "success\te10489280713f60626d15d20"..., 34564) = 26372
--- SIGPIPE (Broken pipe) @ 0 (0) ---
write(1, "ang__meta\":{\"value\":\"empty\",\"gro"..., 8192) = -1 EPIPE (Broken pipe)
--- SIGPIPE (Broken pipe) @ 0 (0) ---
--- SIGTERM (Terminated) @ 0 (0) ---
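The trace above is the classic signature of a writer whose reader has gone away: once the Spark-side reader closes its end of the pipe, the PHP process's next write(1, ...) fails with EPIPE and the process is killed by SIGPIPE. A minimal sketch of the same mechanism, assuming bash on Linux (the `yes`/`head` pair stands in for the PHP worker and the executor):

```shell
# `yes` keeps writing lines to stdout; `head` reads one line and exits,
# closing the read end of the pipe.
yes success | head -n 1 > /dev/null

# The next write by `yes` hits the broken pipe, so the kernel kills it
# with SIGPIPE: exit status 141 = 128 + 13 (SIGPIPE) on Linux.
echo "writer exit status: ${PIPESTATUS[0]}"
```

If that is what is happening here, the question becomes why the executor stops reading (task killed, executor lost, or an exception on the Scala side) rather than anything in the PHP code itself.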
The code is as below:
Scala side:
def main(args: Array[String]): Unit = {
  if (args.length < 2) {
    sys.exit(1)
  }
  val dt_src = args(0)
  val limit = args(1)
  var sql = s"SELECT * FROM db_dw.table_input_json WHERE dt='${dt_src}'"
  val limit_str = s" LIMIT ${limit}"
  if (limit != "0") {
    sql = sql + limit_str
  }
  val df = sqlContext.sql(sql)
  val rdd = df.map(r => r.getAs[String]("json_str"))
    .pipe("/home/work/software/php56/bin/php ./yiic php_file index")
}
PHP side:
public function actionIndex()
{
    while ($line = fgets(STDIN)) {
        list($success, $json_str, $message) = $this->_handle($line);
        if (!$success) {
            echo "fail" . "\t" . $message . "\n";
            continue;
        }
        echo "success" . "\t" . $json_str . "\n";
    }
}