
I have a Java application running on Google App Engine with version 1.9.36 of the SDK. The application runs Java 7 with access to Datastore, BigQuery and Cloud Storage. The container is a B8-class backend server, with tasks submitted by a receiving servlet that does some basic checking and then enqueues a TaskQueue entry.

The problem I am facing is that the application simply stops responding. The data is read from the BigQuery table via the submitted query job. While polling for the job status to go from RUNNING to DONE, the application simply stops logging and processing. The number of wait iterations varies, as does the point where the application stops.

It is not consistent, so I can't pin it down to a specific piece of code.

Some days the application runs for days without a problem; then one day I cannot even get a single iteration to complete without it stopping.

Data in the table varies between 600 and 6000 rows. I read in chunks of 2000 rows at a time.

Sometimes it reaches the "Job done - let process results" part and then simply stops. Sometimes it logs a couple of RUNNING messages and then stops.

This is the part where I set up the job parameters and then start the job. I have left out the code in the do{}while loop; that is just processing bits and works fine. The trouble seems to be with the retrieval of the data somewhere.

public List<String> retrieveMergedTableTxIds(Date runDate) throws ProcessorCustomException {

    List<String> existingIds = new ArrayList<>();

    BigQueryQryStringBuilder queryStringBuilder = new BigQueryQryStringBuilderImpl();

    String tempTableName = queryStringBuilder.buildTempTableName();

    String qryString;

    try {
        qryString = queryStringBuilder.buildMergeTableQuery(ApplicationConstants.projectName, ApplicationConstants.DATASET_ID,
                ApplicationConstants.TRXID_VIEW, runDate);

        logger.info("Query string is |" + qryString);

        JobQueryParameters jobQueryParameters = new JobQueryParameters();
        jobQueryParameters.setBigquery(bigquery);
        jobQueryParameters.setProjectId(ApplicationConstants.projectName);
        jobQueryParameters.setDatasetId(ApplicationConstants.DATASET_ID);
        jobQueryParameters.setQuerySql(qryString);
        jobQueryParameters.setTempTableName(tempTableName);


        JobReference jobId = startQuery(jobQueryParameters);

        logger.fine("JobID for submitted job is " + jobId.getJobId());
        logger.fine("Polling job for DONE status");

        TableReference completedJob = pollJobStatus(jobQueryParameters.getBigquery(), ApplicationConstants.projectName, jobId);

        logger.fine("Job done - let process results!");

        Job job = jobQueryParameters.getBigquery().jobs().get(ApplicationConstants.projectName, jobId.getJobId()).execute();

        logger.fine("JobID is " + job.getId());

        GetQueryResultsResponse response = jobQueryParameters.getBigquery().jobs()
                .getQueryResults(ApplicationConstants.projectName, job.getJobReference().getJobId()).execute();

        logger.fine("Response total rows is " + response.getTotalRows());

        // Default to not looping
        boolean moreResults = false;
        String pageToken = null;

        do {
            logger.fine("Inside the per-token do-while loop");

            TableDataList queryResult = jobQueryParameters.getBigquery().tabledata()
                    .list(completedJob.getProjectId(), completedJob.getDatasetId(), completedJob.getTableId())
                    .setMaxResults(ApplicationConstants.MAX_RESULTS).setPageToken(pageToken).execute();

            logger.info("Value for isEmpty is " + queryResult.isEmpty());

            // Null check must come before the isEmpty() call, otherwise a
            // null queryResult throws before the check is ever reached.
            if (queryResult != null && !queryResult.isEmpty()) {
                logger.fine("Row size for token is " + queryResult.size());
            }

            // Row processing elided here; it also needs to advance pageToken
            // and set moreResults from the returned page token, or the loop
            // body runs only once.

        } while (moreResults);

        // ... catch block and return of existingIds elided ...
}

public TableReference pollJobStatus(Bigquery bigquery, String projectId, JobReference jobId) throws IOException,
        InterruptedException {

    while (true) {
        Job pollJob = bigquery.jobs().get(projectId, jobId.getJobId()).execute();

        logger.info("Job status for JobId " + pollJob.getId() + " is " + pollJob.getStatus().getState());

        if (pollJob.getStatus().getState().equals("DONE")) {
            logger.info("Returning the TableReference in pollJobStatus");
            return pollJob.getConfiguration().getQuery().getDestinationTable();
        }
        // Pause execution for two seconds before polling the job status
        // again, to reduce unnecessary calls to the BigQuery API and lower
        // overall application bandwidth.
        Thread.sleep(2000);
    }
}
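One thing worth noting about pollJobStatus above: the while (true) loop has no upper bound, so a job that never reaches DONE would spin until App Engine kills the request, which from the outside looks like the app "simply stopping". Below is a minimal, self-contained sketch of deadline-bounded polling with capped exponential backoff. The Supplier is a generic stand-in for the BigQuery status call, not the real client API, and it uses Java 8 lambdas for brevity (on Java 7 an anonymous class works the same way):

```java
import java.util.function.Supplier;

public class BoundedPoller {
    /**
     * Polls statusSource until it returns "DONE" or the deadline passes.
     * Backs off exponentially (capped) between polls instead of a fixed sleep.
     * Returns true if DONE was observed, false if the deadline expired.
     */
    public static boolean pollUntilDone(Supplier<String> statusSource,
                                        long deadlineMillis) throws InterruptedException {
        long waitMs = 1000L;                       // initial pause between polls
        long deadline = System.currentTimeMillis() + deadlineMillis;
        while (System.currentTimeMillis() < deadline) {
            if ("DONE".equals(statusSource.get())) {
                return true;
            }
            Thread.sleep(waitMs);
            waitMs = Math.min(waitMs * 2, 16000L); // cap the backoff at 16s
        }
        return false;                              // caller decides how to fail
    }

    public static void main(String[] args) throws InterruptedException {
        // Simulated job: reports RUNNING twice, then DONE.
        int[] calls = {0};
        Supplier<String> fakeJob = () -> (++calls[0] < 3) ? "RUNNING" : "DONE";
        System.out.println(pollUntilDone(fakeJob, 60000L)); // prints true
    }
}
```

A false return gives the caller a clean place to log, clean up the temp table and fail the task, instead of leaving a silently stuck request.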

public JobReference startQuery(JobQueryParameters jobQueryParameters) throws IOException {

    Job job = new Job();
    JobConfiguration config = new JobConfiguration();
    JobConfigurationQuery queryConfig = new JobConfigurationQuery();

    queryConfig.setAllowLargeResults(true);

    TableReference reference = new TableReference();
    reference.setProjectId(jobQueryParameters.getProjectId());
    reference.setDatasetId(jobQueryParameters.getDatasetId());
    reference.setTableId(jobQueryParameters.getTempTableName());

    Table table = new Table();
    table.setId(jobQueryParameters.getTempTableName());
    table.setExpirationTime(Calendar.getInstance().getTimeInMillis() + 360000L); // temp table expires after 6 minutes
    table.setTableReference(reference);

    jobQueryParameters.getBigquery().tables()
            .insert(jobQueryParameters.getProjectId(), jobQueryParameters.getDatasetId(), table).execute();

    queryConfig.setDestinationTable(reference);
    queryConfig.setQuery(jobQueryParameters.getQuerySql());

    config.setQuery(queryConfig);

    job.setConfiguration(config);

    Insert insert = jobQueryParameters.getBigquery().jobs().insert(jobQueryParameters.getProjectId(), job);
    insert.setProjectId(jobQueryParameters.getProjectId());

    JobReference jobId = insert.execute().getJobReference();

    return jobId;
}
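As posted, the do-while in retrieveMergedTableTxIds never updates moreResults or pageToken, so it executes only once regardless of row count. The token-driven pagination pattern the loop is presumably meant to follow can be sketched generically like this (Page and fetchAll are hypothetical stand-ins for the BigQuery TableDataList and tabledata().list() call, written with Java 8 lambdas for brevity):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

public class PageFetcher {
    /** Minimal stand-in for one page of results (hypothetical, not the BigQuery type). */
    public static class Page {
        final List<String> rows;
        final String nextPageToken;   // null means no further pages
        public Page(List<String> rows, String nextPageToken) {
            this.rows = rows;
            this.nextPageToken = nextPageToken;
        }
    }

    /**
     * Drives a token-based pagination loop: fetch a page, accumulate its rows,
     * and keep going while the service hands back a non-null page token.
     */
    public static List<String> fetchAll(Function<String, Page> fetchPage) {
        List<String> all = new ArrayList<>();
        String pageToken = null;
        boolean moreResults;
        do {
            Page page = fetchPage.apply(pageToken);
            if (page.rows != null) {              // services may return null for empty pages
                all.addAll(page.rows);
            }
            pageToken = page.nextPageToken;       // advance the cursor
            moreResults = (pageToken != null);    // stop when the token runs out
        } while (moreResults);
        return all;
    }

    public static void main(String[] args) {
        Function<String, Page> fetch = token -> token == null
                ? new Page(Arrays.asList("row1", "row2"), "page-2")
                : new Page(Arrays.asList("row3"), null);
        System.out.println(fetchAll(fetch)); // prints [row1, row2, row3]
    }
}
```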

When looking at the App Engine console, it still shows the instance as running, but the graph of processed requests also flatlines.

Has anybody had similar experiences where the behaviour is this erratic without code changes or re-deploys?

  • Ok - found the problem. It seems that when App Engine instances log debug/info/error messages from the Java logger, the messages are accumulated from the different instances and then written to a common storage point. So when the application logs very few messages before crashing, it looks as if it didn't log/run at all. That is why it seems to crash at different positions in the code! Added logging to Cloud Storage and found the returned tx list was null. The write to storage does not happen as and when each log record is received, but rather as a bulk write once a threshold amount is reached. – Andre Kapp Apr 13 '16 at 06:13
  • The null list then throws an exception and stops the application, as per design. The problem was that the logs had not been written yet, due to the above. – Andre Kapp Apr 13 '16 at 06:16
  • Something I find a bit annoying: with JPA, EclipseLink/Hibernate and other ORM tools you get a properly initialized list of database records back. BigQuery lists are returned as null when no records are found, so the list is not initialized and throws a NullPointerException when you get the list size. Just something to keep in mind when retrieving data from BigQuery. I have seen the same with Datastore requests. – Andre Kapp Apr 13 '16 at 06:18
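The gotcha in that last comment can be guarded with a one-line normalization before any size() or iteration call. A generic sketch (orEmpty is a hypothetical helper, not part of the BigQuery client):

```java
import java.util.Collections;
import java.util.List;

public class NullSafeRows {
    /**
     * BigQuery-style APIs can return null instead of an empty list when a
     * query matches no rows. Normalizing to an empty list keeps downstream
     * size()/iteration code from throwing NullPointerException.
     */
    public static <T> List<T> orEmpty(List<T> rows) {
        return (rows == null) ? Collections.<T>emptyList() : rows;
    }

    public static void main(String[] args) {
        List<String> rows = null;                 // what a rows getter can hand back
        System.out.println(orEmpty(rows).size()); // prints 0, no NullPointerException
    }
}
```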

0 Answers