0

Initially I had an issue with the data type "guesses" when dealing with the jet driver (through oledb). If a sheet had mixed types, it would bring in null/empty values.

-Edit-

There is an IMEX setting in the connection string as well as in the registry that will tell jet/ace to use text for columns with multiple data types. This way if the first 6 rows have an integer value and the 7th cell has a text value, there won't be a type cast failure. There is also a setting in the registry (and connection string) that will allow you to say how many rows jet should use for sampling.

-end edit-

I changed the connection string, and the registry settings on the server. So now the program is reading fine. It will read values as text, and not use {n} rows for sampling. I thought it was working fine.

Now I have a data source that lists files in order to be read. If I have multiple files in there, it will have the same type casting issues... or at least the same symptoms. If I upload the files one at a time without using the queue then it works fine. It's when I have multiple files in a row that it seems to have the type casting issue.

I'm not really sure what is causing this to happen when reading multiple files in a row, but not when reading one at a time. The connection opens, reads all the data, and then closes... so I don't think it has to do with that.

I am just looking for any ideas ? It was hard enough to find the original problem. Working with Jet seems to be asking for a butt ache.


Added relevant code as per request

public static readonly String CONNECTION_STRING = "Provider=Microsoft.ACE.OLEDB.12.0;Data   Source={0};Extended Properties=\"Excel 12.0 Xml;HDR=YES; ReadOnly=True;IMEX=1;\"";

private System.Data.DataTable Query(String worksheetName, String selectList = "*")
{
    DataTable table = new DataTable();

    _connection.Open();
    var query = String.Format(Constants.DATA_QUERY, selectList, worksheetName);
    new OleDbDataAdapter(query, _connection).Fill(table);
    _connection.Close();
    return table;
}
T.S.
  • 18,195
  • 11
  • 58
  • 78
loctrice
  • 2,454
  • 1
  • 23
  • 34
  • And where is your relevant code? – Matthijs Jun 16 '14 at 19:06
  • Multiple files definitely NOT causing your issue. What is your logic? I don't remember needing to edit registry to read `excel workbooks`. Use `ACE OleDB` and you should have no problems reading it. Really. you wrote so much but nothing that can help to understand your issue – T.S. Jun 16 '14 at 19:09
  • @T.S. There is an IMEX setting for when excel guesses the wrong data type. So what happens is if your first 6 rows appear to be integer types, and the 10th row is text it will have a type cast failure and bring in a value of "" for the 10th row. – loctrice Jun 16 '14 at 19:34
  • You need to write your logic to parse the type. What I've done in one application, is I parsed all rows as text and then each row went through data validation and those that failed were added to a log. Later, user could go to that log and download those failed rows as Excel. User would quickly fix it [because only failed rows present] and re-upload the file. You need to work in direction of building the logic that will take care of your incompatibility issues, or use only strings. – T.S. Jun 16 '14 at 19:55
  • I have it set to read all mixed types as text. That works when it's reading a file, it's when I read multiple files in a row this seems to be ignored. I'm not allowed, in this situation, to make the user fix the file. – loctrice Jun 16 '14 at 20:00
  • You seem doing good here. But I have suspicion that somewhere your connection not being new/reset. Try to force new connection in this method, always call `= new Connection();` in your method `Query`. BTW, don't call your method like this, use `DoQuery` or `GetTableFromExcelWorksheet` – T.S. Jun 16 '14 at 20:15
  • also, can you post exact error? – T.S. Jun 16 '14 at 20:19
  • It's not causing an error. It's just reading "" from rows/columns that do have data in them. It's acting as if it's ignoring the text type and reverting back to the old behavior. – loctrice Jun 16 '14 at 23:47
  • I have a feeling that something wrong with your logic - `connection/worksheetName`. You probably reading off wrong worksheet – T.S. Jun 17 '14 at 13:32
  • I don't think so. The rest of the data is correct, and this only happens where I load multiples. I only read one per process, but something happens on the server when I load files too frequently. Everything works appropriately so long as I don't load them too close together. – loctrice Jun 17 '14 at 15:01

2 Answers2

0

I'd recommend using a native library if possible, something like Excel Data Reader or EPPlus instead of OLEDB

Wiebe Tijsma
  • 10,173
  • 5
  • 52
  • 68
  • Thanks for the recommendation. I will look at those items. – loctrice Jun 16 '14 at 19:47
  • While this link may answer the question, it is better to include the essential parts of the answer here and provide the link for reference. Link-only answers can become invalid if the linked page changes. – Ian R. O'Brien Jun 16 '14 at 20:17
  • @IanO'Brien Sure, I'd normally do that however these are not articles but 3d party libraries. Or should I have posted this just as a comment as it doesn't directly answer the question? – Wiebe Tijsma Jun 16 '14 at 20:28
  • I happen to be in a situation where I can switch away from Jet/Ace. However, that is not always the case. I'd like to point out that this does not actually answer the question at hand. (Meaning it doesn't attempt to solve the actual problem) – loctrice Jun 18 '14 at 15:09
-1

I found the solution here

https://www.codeproject.com/Tips/702769/How-to-Get-Data-from-Multiple-Workbooks-using-One

Provider setup:

"Provider=Microsoft.Jet.OLEDB.4.0;Data Source=C:\path\fileName1.xls;Extended Properties=""Excel 8.0;HDR=Yes;IMEX=1"";"

The SQL Statement must be set like this:

Select * From[Hoja1$]
UNION ALL
Select * From [Hoja1$] IN 'C:\path\fileName2.xls' 'Excel 8.0;HDR=Yes;IMEX=1'

If you want to make an inner join

Select * from [Hoja1$] as a
INNER JOIN (select * from [Hoja1$] IN 'C:\path\fileName2.xls' 'Excel 8.0;HDR=Yes;IMEX=1') as b
ON a.FOLIO=b.FOLIO
Miguel Carrillo
  • 327
  • 5
  • 7
  • You're driver is Microsoft.Jet.OLEDB.4.0 and mine was Microsoft.ACE.OLEDB.12.0 so you can connect to standard excel files but not xlsx. The Ace is backwards compatible but the jet doesn't go forward.This question is also 6 years old. – loctrice Aug 13 '20 at 13:00
  • It's also worth pointing out that getting data from multiple sources was not the issue. The issue was the sampling the imex setting used where there were mixed types in a column. That's the part that was not working. – loctrice Aug 13 '20 at 13:03