
I'm having an issue with Visual Studio 2010 picking an incorrect/inconsistent DataType for the columns when using the OleDbDataAdapter.
Is it possible to force the Data Type of each column to be string?
Currently I'm trying to convert my CSV file into a DataTable. Some of the columns end up as double, but when I use the same method on a second file with the same column names they come out as string (because the 2nd CSV file starts with a '-' sign, so the provider just assumes the column is a string).

DataTable dt = new DataTable();

using (OleDbConnection connection = new OleDbConnection(@"Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + Path.GetDirectoryName(filePath) + ";Extended Properties=\"Text;HDR=Yes;TypeGuessRows=0;ImportMixedTypes=Text\""))
using (OleDbCommand command = new OleDbCommand(@"SELECT * FROM [" + Path.GetFileName(filePath) + "]", connection))
using (OleDbDataAdapter adapter = new OleDbDataAdapter(command))
    adapter.Fill(dt);

Then I'm trying to merge it with another CSV file:

using (OleDbConnection connection = new OleDbConnection(@"Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + Path.GetDirectoryName(part2FilePath) + ";Extended Properties=\"Text;HDR=Yes;TypeGuessRows=0;ImportMixedTypes=Text\""))
using (OleDbCommand command = new OleDbCommand(@"SELECT * FROM [" + Path.GetFileName(part2FilePath) + "]", connection))
using (OleDbDataAdapter adapter = new OleDbDataAdapter(command))
{
     DataTable tmpDt = new DataTable();
     adapter.Fill(tmpDt);
     dt.Merge(tmpDt, true, MissingSchemaAction.Add);
}

I run into a conflict because the data types do not match: the first CSV has double for one of the columns, but the same column in the 2nd CSV comes up as string.

If possible I'd love to force all of them to be string and convert them on the fly.
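
A minimal sketch of one other way the Jet text driver can be told to treat every column as text is a schema.ini file in the same folder as the CSV; the column names below are placeholders for the real headers, and writing the file from code is just one way to create it:

// Sketch only: write a schema.ini next to the CSV so the Jet text driver
// reads every column as Text instead of guessing types. Column names below
// are placeholders for the real headers in the file.
string schemaPath = Path.Combine(Path.GetDirectoryName(filePath), "schema.ini");
File.WriteAllLines(schemaPath, new[]
{
    "[" + Path.GetFileName(filePath) + "]",
    "Format=CSVDelimited",
    "ColNameHeader=True",
    "Col1=Name Text",  // placeholder column
    "Col2=Value Text"  // placeholder column
});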

Thanks.

– civic.sir
  • Try looking at this link http://stackoverflow.com/questions/1050112/how-to-read-a-csv-file-into-a-net-datatable. If not, it sounds like you will need to create an empty instance of a DataTable and add the columns/types, e.g. `table11.Columns.Add("ID", typeof(double));`, then populate the DataTable that way (see the sketch after these comments). Try this one as well, it works btw: http://stackoverflow.com/questions/3306330/creating-a-datatable-from-csv-file – MethodMan Feb 12 '16 at 19:12
  • Try using a List and read your CSV into that list using a loop and an OleDbDataReader: `reader = command.ExecuteReader(); while (reader.Read()) { items.Add(new YourCsvSchema(reader.GetDouble(0))); ... }` The reader gives you more control over how to read and convert the values. – Jon Feb 12 '16 at 19:14
  • @Mangist That sounds pretty smart. So I create a class, i.e. `class CsvSchema { public string Name {get; set;} public double Value {get; set;}}`, and then loop through the reader to add the items... Just a little confused about how you map the header name to the correct CsvSchema property, i.e. how to get ["Name"] into the Name property and ["Value"] into the Value property. Could you please write a simple solution for this? It might solve the issue. Thanks. – civic.sir Feb 12 '16 at 19:22
  • Note that this CSV file has a lot of unnecessary columns too. This way I think it'll avoid those columns, correct? Thanks – civic.sir Feb 12 '16 at 19:23
  • I will post an answer in a minute – Jon Feb 12 '16 at 19:24
  • Great, thanks @Mangist – civic.sir Feb 12 '16 at 19:26
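
A minimal sketch of the "pre-define the columns" idea from the first comment, adapted to the goal of forcing every column to string before Fill. The column names are placeholders for the real CSV headers, and the assumption is that Fill maps incoming columns to the existing string columns by name and converts the values:

// Sketch only: pre-define every column as string, then let the adapter fill it.
// Column names below are placeholders for the real CSV headers.
DataTable dt = new DataTable();
dt.Columns.Add("Name", typeof(string));
dt.Columns.Add("Value", typeof(string));

using (OleDbConnection connection = new OleDbConnection(
    @"Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + Path.GetDirectoryName(filePath) +
    ";Extended Properties=\"Text;HDR=Yes;TypeGuessRows=0;ImportMixedTypes=Text\""))
using (OleDbCommand command = new OleDbCommand(@"SELECT * FROM [" + Path.GetFileName(filePath) + "]", connection))
using (OleDbDataAdapter adapter = new OleDbDataAdapter(command))
{
    // Columns are matched by name, so values the driver guessed as double
    // should be converted into the existing string columns on load.
    adapter.Fill(dt);
}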

2 Answers


I've posted a class that will read your CSV file into a list of CsvLineItem objects. It shows a couple of different ways to read the values (by column index or by column name) and how to handle a NULL value; a usage example follows the class.

using System;
using System.Collections.Generic;
using System.Data.OleDb;
using System.IO;

public class CsvLineItem
{
    public string Id { get; set; }
    public string Name { get; set; }
    public double Value1 { get; set; }
    public double Value2 { get; set; }
}
public static class CsvReader
{
    public static IList<CsvLineItem> Read(string csvFilename)
    {
        var items = new List<CsvLineItem>();

        using (var connection = new OleDbConnection(
            @"Provider=Microsoft.Jet.OLEDB.4.0;Data Source="
                + Path.GetDirectoryName(csvFilename)
                + ";Extended Properties=\"Text;HDR=Yes;TypeGuessRows=0;ImportMixedTypes=Text\""))
        {
            connection.Open();

            using (var command = new OleDbCommand(@"SELECT * FROM [" + Path.GetFileName(csvFilename) + "]", connection))
            {
                using (var reader = command.ExecuteReader())
                {
                    while (reader.Read())
                    {
                        items.Add(new CsvLineItem
                        {
                            Id = Convert.ToString(reader.GetValue(0)), // By column index (Id is declared as string)
                            Name = reader.GetString(reader.GetOrdinal("Name")), // By column name
                            Value1 = reader.GetDouble(2),
                            Value2 = reader.IsDBNull(3) ? 0 : reader.GetDouble(3) // Handling nulls
                        });
                    }
                }
            }
        }

        return items;
    }
}
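
Usage would then look something like this (`csvFilePath` is just a placeholder for wherever you keep the file's path):

// Read the CSV into typed objects and work with them directly.
IList<CsvLineItem> items = CsvReader.Read(csvFilePath);

foreach (CsvLineItem item in items)
{
    Console.WriteLine(item.Name + ": " + item.Value1);
}
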
– Jon

My suggestion is to use an extra step to deal with CSV files that have a harder-to-read structure. This solution can be used if the file is not huge:

1) Load the CSV into a friendlier structure

Use Microsoft.VisualBasic.FileIO.TextFieldParser to parse the whole file and get the data as a list of lists of strings (or something similar); a sketch follows these steps. More details can be found here.

2) While loading the data, convert or skip the values according to your needs.

This solution might be slower, but it gives full control over the parsing.
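
A minimal sketch of step 1 with TextFieldParser, assuming a comma-delimited file with a header row; the hypothetical ReadCsvAsStrings helper loads everything into string columns of a DataTable, but a List<string[]> would work just as well:

using System.Data;
using Microsoft.VisualBasic.FileIO; // add a reference to Microsoft.VisualBasic

public static class TextCsvLoader
{
    public static DataTable ReadCsvAsStrings(string csvFilename)
    {
        var dt = new DataTable();

        using (var parser = new TextFieldParser(csvFilename))
        {
            parser.TextFieldType = FieldType.Delimited;
            parser.SetDelimiters(",");
            parser.HasFieldsEnclosedInQuotes = true;

            // First row is the header (the HDR=Yes equivalent); every column is string.
            string[] headers = parser.ReadFields();
            foreach (string header in headers)
                dt.Columns.Add(header, typeof(string));

            // Step 2 would go here: convert or skip values before adding the row.
            while (!parser.EndOfData)
            {
                dt.Rows.Add(parser.ReadFields());
            }
        }

        return dt;
    }
}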

– Alexei - check Codidact