0

I am exporting a simple hive table to Sql server. Both tables have the exact schema. There is an identity column in Sql Server and I have done a "set identity_insert table_name on" on it.

But when I export from sqoop to sql server, sqoop gives me an error saying that "IDENTITY_INSERT is set to off".

If I export to a Sql Server table having no identity column then all works fine.

Any idea about this? Anyone faced this issue while exporting from sqoop to sql server?

Thanks

Kumar Vaibhav
  • 2,632
  • 8
  • 32
  • 54

3 Answers3

2

In Short:

Postfix -- --identity-insert to your Sqoop export command


Detailed:

Here is an example for anyone searching (and possibly for my own later reference).

SQLSERVER_JDBC_URI="jdbc:sqlserver://<address>:<port>;username=<username>;password=<password>"
HIVE_PATH="/user/hive/warehouse/"
$TABLENAME=<tablename>

sqoop-export \
    -D mapreduce.job.queuename=<queuename> \
    --connect $SQLSERVER_JDBC_URI \
    --export-dir "$HIVE_PATH""$TABLENAME" \
    --input-fields-terminated-by , \
    --table "$TABLENAME" \
    -- --schema <schema> \
    --identity-insert

Note the particular bits on the last line -- -- --schema <schema> --identity-insert . You can omit the schema part, but leave in the extra --.

That allows you to set the identity insert ability for that table within your sqoop session. (source)

Community
  • 1
  • 1
binaryaaron
  • 651
  • 7
  • 14
1

Tell SQL Server to let you insert into the table with the IDENTITY column. That's an autoincrement column that you normally can't write to. But you can change that. See here or here. It'll still fail if one of your values conflicts with one that already exists in that column.

Community
  • 1
  • 1
criticalfix
  • 2,870
  • 1
  • 19
  • 32
  • Seems like if you just don't use that identity column while making your schema in hive and while exporting mention all columns to be exported then it all works fine. But thanks your answer was very useful and informative. – Kumar Vaibhav Apr 03 '13 at 10:08
  • Yes, thanks, I should have added that the INSERT statement usually just doesn't mention the identity column, and then SQL Server adds its own autoincrement value in that field. So it makes sense that excluding that column from the schema would also work. – criticalfix Apr 03 '13 at 12:00
  • How to sqoop import specific columns to SQL Server from Hive tables? – Amrutha K Nov 22 '19 at 08:25
0

The SET IDENTITY_INSERT statement is session-specific. So if you set it by opening a query window, executing the statement, and then ran the export anywhere else, IDENTITY_INSERT was only set in that session, not in the export session. You need to modify the export itself if possible. If not, a direct export from sqoop to MSSQL will not be possible; instead you will need to dump the data from sqoop to a file that MSSQL can read (such as tab delimited) and then write a statement that first does SET IDENTITY_INSERT ON, then BULK INSERTs the file, then does SET IDENTITY_INSERT OFF.

Adam Anderson
  • 508
  • 3
  • 12
  • Just to be clear, I set if off on my sql server and ran the export from the linux box(of course) so I am not sure what's meant by "IDENTITY_INSERT was only set in that session, not in the export session". And how do I modify the export itself? I changed the columns to be exported and excluded the identity column then also same error. Just to be more clear, when we give the --table option in sqoop export command it is the sql server table and --export-dir option specifies my hdfs directory for hive table. Right? – Kumar Vaibhav Apr 02 '13 at 05:12
  • @AdamAnderson is right, this may not be possible directly out of Linux. Seriously, try exporting the table to a tab delimited file, and then using the SQL Server Import wizard or an SSIS package. Then there are explicit ways to handle the identity column from within SQL Server. – criticalfix Apr 02 '13 at 15:56
  • @KumarVaibhav, I am not familiar with sqoop, but you would either need built-in support for identity columns in its export or the ability to customize the export by being able to run arbitrary SQL before/after the export itself. Otherwise, try exporting the data to a file and importing it from MSSQL, where support for identity columns will definitely exist. – Adam Anderson Apr 02 '13 at 17:56