
I've been stuck on this issue for over a day and thought someone here might know the answer. As a simple test, I'm trying to read data from a table and output the results to a log file. The table is rather large (~167 million rows). I keep getting the following error:

java.lang.RuntimeException: org.apache.beam.sdk.util.UserCodeException: com.google.cloud.spanner.SpannerException: DEADLINE_EXCEEDED: io.grpc.StatusRuntimeException: DEADLINE_EXCEEDED: deadline exceeded after 119997536405ns

Followed by this:

Workflow failed. Causes: S15:SpannerIO.ReadAll/Read from Cloud Spanner/Shuffle partitions/Reshuffle/GroupByKey/Read+SpannerIO.ReadAll/Read from Cloud Spanner/Shuffle partitions/Reshuffle/GroupByKey/GroupByWindow+SpannerIO.ReadAll/Read from Cloud Spanner/Shuffle partitions/Reshuffle/ExpandIterable+SpannerIO.ReadAll/Read from Cloud Spanner/Shuffle partitions/Values/Values/Map+SpannerIO.ReadAll/Read from Cloud Spanner/Read from Partitions+ParDo(FinishProcess) failed., The job failed because a work item has failed 4 times. Look in previous log entries for the cause of each one of the 4 failures. For more information, see https://cloud.google.com/dataflow/docs/guides/common-errors. The work item was attempted on these workers: ricardocascadejob-m0771463-12021322-cpj6-harness-4nvn Root cause: Work item failed., ricardocascadejob-m0771463-12021322-cpj6-harness-c485 Root cause: Work item failed., ricardocascadejob-m0771463-12021322-cpj6-harness-kcgb Root cause: Work item failed., ricardocascadejob-m0771463-12021322-cpj6-harness-kcgb Root cause: Work item failed.

Here is the main code running on Dataflow:

    PipelineOptionsFactory.register(RicardoPriceLoadOptions.class);
    RicardoPriceLoadOptions opts = PipelineOptionsFactory.fromArgs(args)
        .withValidation().as(RicardoPriceLoadOptions.class);
    Pipeline pipeline = Pipeline.create(opts);

    SpannerConfig spannerConfig =
        SpannerConfig.create()
            .withProjectId(opts.getGcpProjectId())
            .withInstanceId(opts.getSpannerInstanceId())
            .withDatabaseId(opts.getSpannerDatabaseId());

    PCollectionView<Transaction> tx =
        pipeline.apply(SpannerIO.createTransaction().withSpannerConfig(spannerConfig));

    // Fetch all price events
    PCollection<Struct> pepList = pipeline.apply(Create.of(ReadOperation.create()
        .withColumns("DisabledFlag", "PriceEventPriceableId", "PriceableItemId",
            "OutgoingType", "PriceOriginal", "PriceIntermediate", "PriceRetail",
            "SaleValue", "SaleValueIntermediate", "SchedulableFlag",
            "SendToSiteFlag", "StartTime", "EndTime", "DisplayCode")
        .withTable("ABC")))
        .apply(SpannerIO.readAll().withTransaction(tx).withSpannerConfig(spannerConfig));

    pepList.apply(ParDo.of(new FinishProcessFn()));

    pipeline.run();

The last DoFn simply logs the Spanner row:

public class FinishProcessFn extends DoFn<Struct, Void> {

    // The original snippet uses `log` without declaring it; an SLF4J logger
    // (or Lombok's @Slf4j) is assumed here.
    private static final Logger log = LoggerFactory.getLogger(FinishProcessFn.class);

    @ProcessElement
    public void process(@Element Struct elem) {
        // Log each Spanner row read from the table.
        log.debug(elem.toString());
    }
}

I have tried Google's suggestions from the Dataflow Common Errors guide (https://cloud.google.com/dataflow/docs/guides/common-errors).

The code seems simple enough, but I'm not sure why I keep getting the error above. Any input or help is appreciated.

Thanks!

Here is the table schema:

    CREATE TABLE ABC (
    PriceEventPriceableId INT64 NOT NULL,
    Created TIMESTAMP NOT NULL,
    CreatedBy STRING(MAX) NOT NULL,
    DisabledFlag STRING(MAX) NOT NULL,
    DisplayCode STRING(MAX),
    EndTime TIMESTAMP,
    ErrorCode INT64,
    EstablishmentOverrideFlag STRING(MAX),
    LastUpdated TIMESTAMP NOT NULL,
    LastUpdatedBy STRING(MAX) NOT NULL,
    NotApplicableFlag STRING(MAX),
    OnSaleRatioOverrideFlag STRING(MAX),
    OutgoingType INT64,
    OwnedValue STRING(MAX),
    ParentPriceableItemId INT64,
    PriceableItemId INT64 NOT NULL,
    PriceEventId INT64 NOT NULL,
    PriceIntermediate STRING(MAX),
    PriceOriginal STRING(MAX),
    PriceRetail STRING(MAX),
    ReasonUnschedulable STRING(MAX),
    SaleValue STRING(MAX),
    SaleValueIntermediate STRING(MAX),
    SavingsMaxOverrideFlag STRING(MAX),
    SchedulableFlag STRING(MAX),
    SendToSiteFlag STRING(MAX),
    SentToSiteDate DATE,
    StartTime TIMESTAMP,
    StoredValue STRING(MAX),
    TenPercentOverrideFlag STRING(MAX),
    Timestamp TIMESTAMP NOT NULL OPTIONS (allow_commit_timestamp=true),
) PRIMARY KEY (PriceEventPriceableId)
  • Do you happen to know if you are writing particularly large data to a row? Or if you might be writing a lot of data to spanner in a way which triggers hotspotting to occur? https://cloud.google.com/spanner/docs/schema-design I.e. if you write a lot of data to the same spanner key, or spanner keys which are in the same range. This can cause a single machine in the spanner backend to become overloaded. Would you mind posting details of your spanner schema? Such as what you are using as the primary key? – Alex Amato Dec 02 '19 at 22:28
  • Hi Alex, I'm not writing any data to spanner. Simply doing a readAll and logging what I read. I cut my code down to the bare minimum to isolate the issue. What you see above is all I have at the moment. I'll add the schema in my original post – Ricardo Riveros Dec 02 '19 at 22:45
  • Which version of beam are you using? Have you tried this pipeline and is it working with a smaller database? – Nithin Sujir Dec 02 '19 at 23:03
  • Is it necessary for your use case to use .withTransaction()? If you don't need the data to read from a strictly consistent snapshot then you can omit it. I am wondering if the behaviour of SpannerIO.readAll().withTransaction() is that it tries to read the whole database at once, rather than iterating through it row by row. Would you mind removing withTransaction and seeing if the issue goes away? (A sketch of that variant follows these comments.) – Alex Amato Dec 02 '19 at 23:51
  • Thanks for responding, guys. I'm using Beam 2.16.0, and the pipeline does work properly with another table we have (~8 million rows). I originally didn't have the .withTransaction() but added it after looking at the example in this Google Dataflow template: https://github.com/GoogleCloudPlatform/DataflowTemplates/blob/master/src/main/java/com/google/cloud/teleport/spanner/ExportTransform.java. Regardless, it still throws the same error. – Ricardo Riveros Dec 03 '19 at 14:51
  • Seems like this error is coming from Cloud Spanner (https://cloud.google.com/spanner/docs/reference/rest/v1/Code). Probably try contacting Google Cloud support with information regarding your Cloud Spanner instance. – chamikara Dec 03 '19 at 22:15
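
For reference, here is a minimal sketch of the variant suggested above: the same ReadOperation, but with the createTransaction() view removed. It assumes spannerConfig and FinishProcessFn are exactly as shown in the question; it is the suggested experiment, not a confirmed fix.

    // Sketch only: same read, but without the shared createTransaction() view.
    // spannerConfig is assumed to be the SpannerConfig built in the question.
    PCollection<Struct> pepList = pipeline.apply(Create.of(ReadOperation.create()
        .withColumns("DisabledFlag", "PriceEventPriceableId", "PriceableItemId",
            "OutgoingType", "PriceOriginal", "PriceIntermediate", "PriceRetail",
            "SaleValue", "SaleValueIntermediate", "SchedulableFlag",
            "SendToSiteFlag", "StartTime", "EndTime", "DisplayCode")
        .withTable("ABC")))
        .apply(SpannerIO.readAll().withSpannerConfig(spannerConfig));

    pepList.apply(ParDo.of(new FinishProcessFn()));

The shared transaction view should only be needed when several reads have to observe a single consistent snapshot, which a single-table dump like this does not require.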

2 Answers


The DEADLINE_EXCEEDED error means that the operation didn’t complete in the given time.

For operations that change the state of the system, this error may be returned even if the operation has completed successfully (for example, a successful response from a server could have been delayed long enough for the deadline to expire).

I could see no issues with Cloud Spanner in the Google Cloud Status Dashboard; however, it would be better to contact GCP support so that they can inspect your project and take a deeper look into the issue.

Deniss T.

Just as an update: what I did to fix the issue was to use exactly the same versions of Beam and the Google Cloud Spanner client as the Spanner Export to Avro template available in GCP Dataflow, and my code started to work. I made no code changes.

iker lasaga