10

I have a hive query:

insert override directory /x
select ...

Then I'm try to export the data with sqoop

sqoop export --connect jdbc:mysql://mysqlm/site --username site --password site --table x_data --export-dir /x  --input-fields-terminated-by 0x01 --lines-terminated-by '\n'

But this seems to fail to parse the fields according to delimiter What am I missing? I think the --input-fields-terminated-by 0x01 part doesn't work as expected?

I do not want to create additional tables in hive that contains the query results.

stack trace:

 2013-09-24 05:39:21,705 ERROR org.apache.sqoop.mapreduce.TextExportMapper: Exception: 
 java.lang.NumberFormatException: For input string: "9-2"
    at java.lang.NumberFormatException.forInputString(NumberFormatException.java:48)
    at java.lang.Integer.parseInt(Integer.java:458)
 ...

The vi view of output

16-09-2013 23^A1182^A-1^APub_X^A21782^AIT^A1^A0^A0^A0^A0^A0.0^A0.0^A0.0
16-09-2013 23^A1182^A6975^ASoMo Audience  Corp^A2336143^AUS^A1^A1^A0^A0^A0^A0.2^A0.0^A0.0
16-09-2013 23^A1183^A-1^APub_UK, Inc.^A1564001^AGB^A1^A0^A0^A0^A0^A0.0^A0.0^A0.0
17-09-2013 00^A1120^A-1^APub_US^A911^A--^A181^A0^A0^A0^A0^A0.0^A0.0^A0.0
Kit Menke
  • 7,046
  • 1
  • 32
  • 54
Julias
  • 5,752
  • 17
  • 59
  • 84
  • 2
    Yes **--input-fields-terminated-by 0x01** part is not working correctly. You can try this: **sqoop export --connect jdbc:mysql://mysqlm/site --username site --password site --table x_data --export-dir /x --input-fields-terminated-by '\001' --input-null-string '\\N' --input-null-non-string '\\N'** – Mukesh S Sep 29 '13 at 18:56
  • @MukeshS the delimiter '\001', didn't work for me, instead '\0001' worked. The extra '0' did the trick – Saurabh Mishra Apr 03 '18 at 06:33

4 Answers4

11

I've found the correct solution for that special character in bash

#!/bin/bash

# ... your script
hive_char=$( printf "\x01" )

sqoop export --connect jdbc:mysql://mysqlm/site --username site --password site --table x_data --export-dir /x  --input-fields-terminated-by ${hive_char} --lines-terminated-by '\n'

The problem was in correct separator recognition (nothing to do with types and schema) and that was achieved by hive_char.

Another possibility to encode this special character in linux to command-line is to type Cntr+V+A

Julias
  • 5,752
  • 17
  • 59
  • 84
  • 4
    you could also simply the octal representation of ^A '\001'. sqoop export --connect jdbc:mysql://mysqlm/site --username site --password site --table x_data --export-dir /x --input-fields-terminated-by '\001' --lines-terminated-by '\n' – Dharmendar Kumar 'DK' Sep 14 '14 at 20:34
4

Using

--input-fields-terminated-by '\001' --lines-terminated-by '\n'

as flags in the sqoop export command seems to do the trick for me.

So, in your example, the full command would be:

sqoop export --connect jdbc:mysql://mysqlm/site --username site --password site --table x_data --export-dir /x  --input-fields-terminated-by '\001' --lines-terminated-by '\n'
Mark Grover
  • 4,070
  • 22
  • 21
  • Question: how do you know it's '\001' instead of '\01' or '\1'? I checked the manual: https://archive.cloudera.com/cdh/3/sqoop/SqoopUserGuide.html#_inserts_vs_updates but didn't find any introduction on how to input by ASCII code... – Jing He Aug 16 '17 at 11:57
0

I think its the DataType mismatch with your RDBMS schema.

Try to find the column name of "9-2" value and check the datatype in RDBMS schema.

If its int or numeric then Sqoop will parse the value and insert. And as it seems "9-2" is not numeric value.

Let me know if this doesn't work.

0

It seems like sqoop is taking '0' as a delimiter . You are getting an error because:- First column in your mysql table could be varchar and second column is a number. As per below string:-

16- 0 9-2 0 13 23^A1182^A-1^APub_X^A21782^AIT^A1^A0^A0^A0^A0^A0.0^A0.0^A0.0

Your first column parsed by sqoop is :-16- and second column is:-9-2

So its better to specify a delimiter in quotes('0x01') or

(Its always easy and has better control)use hive create table command as:- create table tablename row format delimited fields terminated by '\t' as select ... and specify '\t' as delimiter in your sqoop command.

Ankit Singhal
  • 834
  • 7
  • 8