
I am trying to run a Spark application on YARN. The application uses the RDD pipe() operation to run a local PHP program. The weird thing is that every time the PHP process is started, it receives a SIGPIPE signal about 1 minute later (after a few records have been processed successfully) and is then terminated.

PS: the same program runs smoothly in a standalone, single-machine environment, but fails in the production environment, which is a cluster.

Can anyone help me figure out the possible cause? Thanks. The system-call trace of the PHP process just before it dies looks like this:

write(1, "success\te10489280713f60626d15d20"..., 34564) = 26372
--- SIGPIPE (Broken pipe) @ 0 (0) ---
write(1, "ang__meta\":{\"value\":\"empty\",\"gro"..., 8192) = -1 EPIPE (Broken pipe)
--- SIGPIPE (Broken pipe) @ 0 (0) ---
--- SIGTERM (Terminated) @ 0 (0) ---

The code is below.
Scala side:

def main(args: Array[String]): Unit = {
  // Expect two arguments: the source date partition and a row limit ("0" means no limit).
  if (args.length < 2) {
    sys.exit(1)
  }
  val dt_src = args(0)
  val limit = args(1)
  var sql = s"SELECT * FROM db_dw.table_input_json WHERE dt='${dt_src}'"
  val limit_str = s" LIMIT ${limit}"
  if (limit != "0") {
    sql = sql + limit_str
  }
  val df = sqlContext.sql(sql)
  // Feed each JSON string to the PHP script on stdin, one record per line.
  val rdd = df.map(r => r.getAs[String]("json_str")).pipe("/home/work/software/php56/bin/php ./yiic php_file index")
}

PHP side:

public function actionIndex()
{
    // Read one record per line from stdin until EOF.
    while (($line = fgets(STDIN)) !== false) {
        list($success, $json_str, $message) = $this->_handle($line);
        if (!$success) {
            echo "fail" . "\t" . $message . "\n";
            continue;
        }
        echo "success" . "\t" . $json_str . "\n";
    }
}
Bill