0

I am trying to do a simple BULK INSERT from a large CSV file to a table. The table and the file have matching columns. This is my code:

BULK INSERT myTable
FROM 'G:\Tests\mySource.csv'
WITH (
    FIRSTROW = 2,
    FIELDTERMINATOR = ',',
    ROWTERMINATOR = '\n',
    -- ROWTERMINATOR = '0x0a',
    BATCHSIZE = 1000,
    MAXERRORS = 2
)
GO

As you can see I have tried with row terminators \n and 0x0a (and a bunch more)

I keep getting a type mismatch error:

Msg 4864, Level 16, State 1, Line 1
Bulk load data conversion error (type mismatch or invalid character for the specified codepage) for row 2, column 18 (createdAt).
Msg 4864, Level 16, State 1, Line 1
Bulk load data conversion error (type mismatch or invalid character for the specified codepage) for row 3, column 18 (createdAt).
Msg 4864, Level 16, State 1, Line 1
Bulk load data conversion error (type mismatch or invalid character for the specified codepage) for row 4, column 18 (createdAt).
Msg 4865, Level 16, State 1, Line 1
Cannot bulk load because the maximum number of errors (2) was exceeded.
Msg 7399, Level 16, State 1, Line 1
The OLE DB provider "BULK" for linked server "(null)" reported an error. The provider did not give any information about the error.
Msg 7330, Level 16, State 2, Line 1
Cannot fetch a row from OLE DB provider "BULK" for linked server "(null)".

Column createdAt is of type datetime:

CREATE TABLE [dbo].[myTable]
(
    ...
    [createdAt] [datetime] NULL,
    ...
)

These are the values of the createdAt column as taken from the first three rows:

2020-08-22 13:51:57
2020-08-22 14:13:13
2020-08-22 14:16:23

I also tried with a different number format as suggested. I also tried changing the column type to DATETIME2(n):

2020-08-22T13:51:57
2020-08-22T14:13:13
2020-08-22T14:16:23

I have no idea what else to review.

I would appreciate any help.

Thanks!

Wilmar
  • 558
  • 1
  • 5
  • 16
  • Your DDL implies you have multiple columns in your table, `MyTable`, but your CSV only have 1 column, so yes, it will fail. Also, `yyyy-MM-dd hh:mm:ss` is not am unambiguous format for `datetime`; if youlre not American then SQL Server will (stupidly) read it as `yyyy-dd-MM hh:mm:ss`. – Thom A Oct 15 '20 at 16:32
  • Hi @Larnu. My CSV has 126 columns. What makes you believe it only has one column? On the format comment, what format should I set it up to? Thank you! – Wilmar Oct 15 '20 at 16:36
  • *"These are the values as taken from the first three rows:... {3 rows with a single column}"* What about the above would suggest it had more? – Thom A Oct 15 '20 at 16:41
  • What language is your `LOGIN` set to? – Thom A Oct 15 '20 at 16:41
  • @Larnu, I understand the confusion. Those are the values of the createAt column only. I edited the question to make it clear. Regarding the Login language, it is set to English-us_english. Thanks! – Wilmar Oct 15 '20 at 16:50

1 Answers1

2

There are many formats of string literals to be converted to dates & times supported by SQL Server - see the MSDN Books Online on CAST and CONVERT. Most of those formats are dependent on what settings you have - therefore, these settings might work some times - and sometimes not. And the DATETIME datatype in particular is notoriously picky about what formats of string literals work - and which others (most) don't.... DATETIME2(n) is much more forgiving and less picky to deal with!

The way to solve this is to use the (slightly adapted) ISO-8601 date format that is supported by SQL Server - this format works always - regardless of your SQL Server language and dateformat settings.

The ISO-8601 format is supported by SQL Server comes in two flavors:

  • YYYYMMDD for just dates (no time portion); note here: no dashes!, that's very important! YYYY-MM-DD is NOT independent of the dateformat settings in your SQL Server and will NOT work in all situations!

or:

  • YYYY-MM-DDTHH:MM:SS for dates and times - note here: this format has dashes (but they can be omitted), and a fixed T as delimiter between the date and time portion of your DATETIME.

This is valid for SQL Server 2000 and newer.

If you use SQL Server 2008 or newer and the DATE datatype (only DATE - not DATETIME!), then you can indeed also use the YYYY-MM-DD format and that will work, too, with any settings in your SQL Server.

Don't ask me why this whole topic is so tricky and somewhat confusing - that's just the way it is. But with the YYYYMMDD format, you should be fine for any version of SQL Server and for any language and dateformat setting in your SQL Server.

The recommendation for SQL Server 2008 and newer is to use DATE if you only need the date portion, and DATETIME2(n) when you need both date and time. You should try to start phasing out the DATETIME datatype if ever possible

In your case, I'd try one of two things:

  • if you can - use DATETIME2(n) instead of DATETIME as your column's datatype - that alone might solve all your problems

  • if you can't use DATETIME2(n) - try to use 2020-08-22T13:51:57 instead of 2020-08-22 13:51:57 for specifying your date&time in the CSV import file.

marc_s
  • 732,580
  • 175
  • 1,330
  • 1,459
  • Hi @marc_s. I am using `SQL Server 2017`. This is a new DB, so I can easily change the column type. I just did it. It shows as `[createdAt] [datetime2](7) NULL`. However I keep getting the same error. I am going to try with your other suggestion. Thank you – Wilmar Oct 15 '20 at 17:03
  • Well, I tried with format `2020-08-22T13:51:57` on column datatypes `DATETIME2(n)` and `DATETIME` and got the same result. I wonder if my problem is somewhere else. I am no really sure what to do at this point. – Wilmar Oct 15 '20 at 18:33
  • I ended up doing this in an easy but less than optimal way: using Python and Pandas: `df = pd.read_csv('mySource.csv')` and then just using `df.to_sql('myTable', engine, schema='dbo', if_exists='append',index=False,chunksize=100)`. For what I understand this is very suboptimal and, in fact, it took quite a while to complete the process but it took care of all the data types nightmares. I am not adding this as an answer because it is not, this is just an ugly workaround. – Wilmar Oct 16 '20 at 12:01