I have a Java application running on Google App Engine with version 1.9.36 of the SDK. The application runs Java 7 with access to Datastore, BigQuery and Cloud Storage. The container is a B8 class backend instance, with tasks submitted by a receiving servlet that does some basic checking and then submits a TaskQueue entry.
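For context, the receiving servlet is nothing special; it is essentially the standard push-queue hand-off. The queue name, worker URL and parameter name below are placeholders, not the real values:

import java.io.IOException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import com.google.appengine.api.taskqueue.Queue;
import com.google.appengine.api.taskqueue.QueueFactory;
import com.google.appengine.api.taskqueue.TaskOptions;

public class ReceiverServlet extends HttpServlet {
    @Override
    protected void doPost(HttpServletRequest req, HttpServletResponse resp) throws IOException {
        // Basic checking only; the real validation is more involved.
        String runDate = req.getParameter("runDate");
        if (runDate == null || runDate.isEmpty()) {
            resp.sendError(HttpServletResponse.SC_BAD_REQUEST, "runDate is required");
            return;
        }
        // Hand the work off to the B8 backend via a push queue.
        Queue queue = QueueFactory.getQueue("processor-queue");
        queue.add(TaskOptions.Builder.withUrl("/tasks/process").param("runDate", runDate));
        resp.setStatus(HttpServletResponse.SC_OK);
    }
}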
The problem I am facing is that the application simply stops responding. The data is read from the BigQuery table using the JobQueue. While waiting for the job status to go from RUNNING to DONE, the application simply stops logging and processing. The number of wait iterations varies, as does the point at which the application stops.
It is not consistent, so I can't pin it down to a specific piece of code.
Some days the application runs for days without a problem; then one day I cannot even get a single iteration to complete without it stopping.
Data in the table varies between 600 and 6000 rows. I read in chunks of 2000 rows at a time.
Sometimes it reaches the "Job done - let process results" part and then simply stops. Sometimes it logs a couple of RUNNING messages and then stops.
This is the part where I set up the job parameters and then start the job. I have left out the code in the do{}while loop; that is just processing bits and works fine. The trouble seems to be with the retrieval of the data somewhere.
public List<String> retrieveMergedTableTxIds(Date runDate) throws ProcessorCustomException {
    List<String> existingIds = new ArrayList<>();
    BigQueryQryStringBuilder queryStringBuilder = new BigQueryQryStringBuilderImpl();
    String tempTableName = queryStringBuilder.buildTempTableName();
    String qryString;
    try {
        qryString = queryStringBuilder.buildMergeTableQuery(ApplicationConstants.projectName, ApplicationConstants.DATASET_ID,
                ApplicationConstants.TRXID_VIEW, runDate);
        logger.info("Query string is |" + qryString);
        JobQueryParameters jobQueryParameters = new JobQueryParameters();
        jobQueryParameters.setBigquery(bigquery);
        jobQueryParameters.setProjectId(ApplicationConstants.projectName);
        jobQueryParameters.setDatasetId(ApplicationConstants.DATASET_ID);
        jobQueryParameters.setQuerySql(qryString);
        jobQueryParameters.setTempTableName(tempTableName);
        JobReference jobId = startQuery(jobQueryParameters);
        logger.fine("JobID for submitted job is " + jobId.getJobId());
        logger.fine("Polling job for DONE status");
        TableReference completedJob = pollJobStatus(jobQueryParameters.getBigquery(), ApplicationConstants.projectName, jobId);
        logger.fine("Job done - let process results!");
        Job job = jobQueryParameters.getBigquery().jobs().get(ApplicationConstants.projectName, jobId.getJobId()).execute();
        logger.fine("JobID is " + job.getId());
        GetQueryResultsResponse response = jobQueryParameters.getBigquery().jobs()
                .getQueryResults(ApplicationConstants.projectName, job.getJobReference().getJobId()).execute();
        logger.fine("Response total rows is " + response.getTotalRows());
        // Default to not looping
        boolean moreResults = false;
        String pageToken = null;
        do {
            logger.fine("Inside the per-token do-while loop");
            TableDataList queryResult = jobQueryParameters.getBigquery().tabledata()
                    .list(completedJob.getProjectId(), completedJob.getDatasetId(), completedJob.getTableId())
                    .setMaxResults(ApplicationConstants.MAX_RESULTS).setPageToken(pageToken).execute();
            logger.info("Value for isEmpty is " + queryResult.isEmpty());
            if (queryResult != null && !queryResult.isEmpty()) {
                logger.fine("Row size for token is " + queryResult.size());
            }
        } while (moreResults);
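As mentioned, the body of the do/while is left out; it only processes the returned rows. For illustration, this is roughly the shape of the omitted part that walks through the 2000-row pages. Reading rows via getRows() and the per-row handling here are my shorthand for what the real loop does, so treat it as a sketch rather than the exact code:

// Sketch only: consume one page of results and move to the next one.
// TableRow comes from com.google.api.services.bigquery.model.
List<TableRow> rows = queryResult.getRows();
if (rows != null) {
    for (TableRow row : rows) {
        // per-row processing omitted in the question, e.g. existingIds.add(...)
    }
}
pageToken = queryResult.getPageToken();
moreResults = (pageToken != null);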
public TableReference pollJobStatus(Bigquery bigquery, String projectId, JobReference jobId) throws IOException,
        InterruptedException {
    while (true) {
        Job pollJob = bigquery.jobs().get(projectId, jobId.getJobId()).execute();
        logger.info("Job status for JobId " + pollJob.getId() + " is " + pollJob.getStatus().getState());
        if (pollJob.getStatus().getState().equals("DONE")) {
            logger.info("Returning the TableReference in pollJobStatus");
            return pollJob.getConfiguration().getQuery().getDestinationTable();
        }
        // Pause execution for two seconds before polling the job status again,
        // to reduce unnecessary calls to the BigQuery API and lower overall
        // application bandwidth.
        Thread.sleep(2000);
    }
}
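For reference, the loop above has no upper bound; a capped variant would look roughly like this. The maxAttempts parameter and the IOException on timeout are illustrative, not part of the current code:

// Sketch of a bounded poll: same status check as above, but it gives up
// after a fixed number of attempts instead of looping forever.
public TableReference pollJobStatusBounded(Bigquery bigquery, String projectId, JobReference jobId,
        int maxAttempts) throws IOException, InterruptedException {
    for (int attempt = 0; attempt < maxAttempts; attempt++) {
        Job pollJob = bigquery.jobs().get(projectId, jobId.getJobId()).execute();
        logger.info("Job status for JobId " + pollJob.getId() + " is " + pollJob.getStatus().getState());
        if ("DONE".equals(pollJob.getStatus().getState())) {
            return pollJob.getConfiguration().getQuery().getDestinationTable();
        }
        Thread.sleep(2000);
    }
    throw new IOException("Job " + jobId.getJobId() + " was still not DONE after " + maxAttempts + " polls");
}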
public JobReference startQuery(JobQueryParameters jobQueryParameters) throws IOException {
    Job job = new Job();
    JobConfiguration config = new JobConfiguration();
    JobConfigurationQuery queryConfig = new JobConfigurationQuery();
    queryConfig.setAllowLargeResults(true);
    TableReference reference = new TableReference();
    reference.setProjectId(jobQueryParameters.getProjectId());
    reference.setDatasetId(jobQueryParameters.getDatasetId());
    reference.setTableId(jobQueryParameters.getTempTableName());
    Table table = new Table();
    table.setId(jobQueryParameters.getTempTableName());
    table.setExpirationTime(Calendar.getInstance().getTimeInMillis() + 360000L);
    table.setTableReference(reference);
    jobQueryParameters.getBigquery().tables()
            .insert(jobQueryParameters.getProjectId(), jobQueryParameters.getDatasetId(), table).execute();
    queryConfig.setDestinationTable(reference);
    config.setQuery(queryConfig);
    job.setConfiguration(config);
    queryConfig.setQuery(jobQueryParameters.getQuerySql());
    Insert insert = jobQueryParameters.getBigquery().jobs().insert(jobQueryParameters.getProjectId(), job);
    insert.setProjectId(jobQueryParameters.getProjectId());
    JobReference jobId = insert.execute().getJobReference();
    return jobId;
}
When I look at the App Engine console it still shows the instance as running, but the graph of requests processed also stops.
Has anybody had similar experiences where the behaviour is this erratic without any code changes or re-deploys?