3

I'm developing an SSIS package to extract data from a table in an Oracle 11g database and load it to a table in a SQL Server 2014 database.

The table in the Oracle database has string columns defined as VARCHAR2 (single byte / codepage 8859-1); the table in the SQL Server database has string columns defined as VARCHAR (single byte).

However, when I add an OLE DB source / ADO .NET source to the SSIS package and configure it to select columns from the table in the Oracle source, the External Columns (Advanced Editor) shows the string columns as DT_WSTR (Unicode).

This Oracle article (see section 'How Oracle Unicode Support Works') advises that "When the client character set is not a superset of the server character set or the database character set is a multibyte character set, OraOLEDB automatically enables the Unicode mode."

How do I get the OraOLEDB provider / SSIS OLE DB data source to expose the string columns in the source as single byte?

What I've tried

  • Setting the AlwaysUseDefaultCodePage property (of the OLE DB Source) to true and setting the DefaultCodePage property to 1252 (8859-1 is a sub set of this).
  • Setting the AlwaysUseDefaultCodePage property (of the OLE DB Source) to true and setting the DefaultCodePage property to 28591 (the Windows equivalent of 8859-1).
  • Reinstalled the Oracle client.

Constraints

  • I'm transferring 2M+ rows and the duration the package takes to execute is an issue. Ideally, I don't want to have to add a Unicode > Non-Unicode Data Conversion transformation to the data flow that could negatively impact performance.
  • I have no control of the Oracle server / database configuration.

Environment Configurations

Windows Client

  • Windows 7 Pro
  • BIDS / Visual Studio 2013
  • Oracle client 11g Revision 5 (32 bit only - 64 bit client not installed)
  • Oracle NLS_LANG = ENGLISH_UNITED KINGDOM.WE8MSWIN1252 (Set in the Wow6432Node registry - no overriding environment variable)

SSIS Package / Data Flow Task / OLE DB Data Source

LocaleID = English (United Kingdom)

Client Session Connection Info

SELECT * FROM V$SESSION_CONNECT_INFO
  • CLIENT_CHARSET = UTF8 (When query called from SSIS using Full Instant Client)
  • CLIENT_CHARSET = Unknown (When query called from SQL Developer using jdbcthin)

Oracle Server

SELECT * FROM SYS.NLS_DATABASE_PARAMETERS
  • NLS_LANGUAGE = ENGLISH
  • NLS_TERRITORY = UNITED KINGDOM
  • NLS_CHARACTERSET = WE8ISO8859P1
John
  • 1,043
  • 15
  • 20

2 Answers2

1

As you already noted, "unicode mode" is used by the Oracle client when your client's character set is not a superset of the database's. The database's character set is definitely WE8ISO8859P1. Your client's character set is yet undetermined. The client session connection info is not a reliable source.

The environment variable NLS_LANG overrides a registry setting, if any. To make sure that the OraOLEDB client uses the desired NLS_LANG value, ensure that the user that executes the SSIS package has that environment variable set. You can realise this e.g. by editing HKEY_USERS\<that user's id>\Environment\NLS_LANG. I suggest starting with making it equal to the database, WE8ISO8859P1, to rule out that WE8MSWIN1252 is not regarded as a superset by the Oracle client.

Excellent article on this.

DᴀʀᴛʜVᴀᴅᴇʀ
  • 7,681
  • 17
  • 73
  • 127
boerl
  • 11
  • 1
0

Using the advanced editor (right click on the component to display this as an option), you can set the Output type for the column manually. See the article below:

http://www.sqlservergeeks.com/sql-server-data-type-conversion-options-in-ssis/

Nick Heidke
  • 2,787
  • 2
  • 34
  • 58
  • that assumes implicit conversion which may not always work, there's a big matrix in the msdn documenation that shows that can/not be converted in this way – adolf garlic Nov 12 '21 at 12:29