
I'm getting the following error:

--incremental lastmodified cannot be used in conjunction with --as-avrodatafile.

when running this command:

gcloud dataproc jobs submit hadoop \
    --project='aca-ingest-dev' \
    --cluster='sqoop-gcp-ingest-d3' \
    --region='us-east1' \
    --class=org.apache.sqoop.Sqoop \
    --jars='gs://aca_utilities_dev/ingestion_jars/sqoop-1.4.7-hadoop260.jar,gs://aca_utilities_dev/ingestion_jars/avro-tools-1.8.2.jar,gs://aca_utilities_dev/ingestion_jars/ojdbc7.jar' \
    -- \
    import \
    -Dmapreduce.job.user.classpath.first=true \
    --connect='jdbc:oracle:thin:@10.25.42.52:1521/uataca.aaamidatlantic.com' \
    --username='XX' --password-file='XX' \
    --query='select comm_ctr_i from tab1 where $CONDITIONS OFFSET 0 ROWS FETCH NEXT 1000 ROWS ONLY' \
    --target-dir='gs://aca-ingest-d3-dev/hist_arch_call/source/2019-08-16_6' \
    --num-mappers=1 \
    --incremental=lastmodified \
    --check-column='arch_date' \
    --last-value='2019-08-16T06:07:37.036611' \
    --as-avrodatafile

1 Answer

While some discussion threads may seem to imply that support for using lastmodified with as-avrodatafile was added by https://issues.apache.org/jira/browse/SQOOP-1094, which went into Sqoop 1.4.7+, the particular use case you're exercising is actually still explicitly blocked in both 1.4.6 and 1.4.7:

https://github.com/apache/sqoop/blob/branch-1.4.6/src/java/org/apache/sqoop/tool/ImportTool.java#L1105
https://github.com/apache/sqoop/blob/branch-1.4.7/src/java/org/apache/sqoop/tool/ImportTool.java#L1153

So branch-1.4.7 still contains the following check:

if (options.getIncrementalMode() == SqoopOptions.IncrementalMode.DateLastModified
    && options.getFileLayout() == SqoopOptions.FileLayout.AvroDataFile) {
  throw new InvalidOptionsException("--"
      + INCREMENT_TYPE_ARG + " lastmodified cannot be used in conjunction with --"
      + FMT_AVRODATAFILE_ARG + "." + HELP_STR);
}

Probably your best bet here would be to import into a different file format first, and then chain your Sqoop job with another job that converts the intermediate file format into Avro, if that's what you need in the end.
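
For instance, here is a minimal sketch of that two-step approach. It reuses the cluster, bucket, JDBC, and query details from your command; the _text and _avro target directories and the convert_to_avro.py PySpark script are hypothetical names used purely for illustration:

# Step 1: run the same incremental import, but have Sqoop write plain text
# (--as-textfile) instead of Avro, so the lastmodified check passes.
gcloud dataproc jobs submit hadoop \
    --project='aca-ingest-dev' \
    --cluster='sqoop-gcp-ingest-d3' \
    --region='us-east1' \
    --class=org.apache.sqoop.Sqoop \
    --jars='gs://aca_utilities_dev/ingestion_jars/sqoop-1.4.7-hadoop260.jar,gs://aca_utilities_dev/ingestion_jars/avro-tools-1.8.2.jar,gs://aca_utilities_dev/ingestion_jars/ojdbc7.jar' \
    -- \
    import \
    -Dmapreduce.job.user.classpath.first=true \
    --connect='jdbc:oracle:thin:@10.25.42.52:1521/uataca.aaamidatlantic.com' \
    --username='XX' --password-file='XX' \
    --query='select comm_ctr_i from tab1 where $CONDITIONS OFFSET 0 ROWS FETCH NEXT 1000 ROWS ONLY' \
    --target-dir='gs://aca-ingest-d3-dev/hist_arch_call/source/2019-08-16_6_text' \
    --num-mappers=1 \
    --incremental=lastmodified \
    --check-column='arch_date' \
    --last-value='2019-08-16T06:07:37.036611' \
    --as-textfile

# Step 2: convert the text output to Avro with a separate job on the same
# cluster, e.g. a small PySpark script (convert_to_avro.py is a hypothetical
# name for a script that reads the text files and writes them back out as
# Avro, assuming an Avro-capable Spark setup). The two positional arguments
# after "--" are the input and output paths passed to the script.
gcloud dataproc jobs submit pyspark convert_to_avro.py \
    --project='aca-ingest-dev' \
    --cluster='sqoop-gcp-ingest-d3' \
    --region='us-east1' \
    -- \
    gs://aca-ingest-d3-dev/hist_arch_call/source/2019-08-16_6_text \
    gs://aca-ingest-d3-dev/hist_arch_call/source/2019-08-16_6_avro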

As for native Sqoop support, it appears this JIRA was filed with the same question in mind, noting that it's unclear whether the fast-fail check is still intended to be valid: https://issues.apache.org/jira/projects/SQOOP/issues/SQOOP-3369

You can subscribe to that JIRA to follow progress on adding support for your use case.
