0

I had the assignment of writing several queries to read in some .dbf files. The files are Visual FoxPro files and the big problem here is when I start using joins, because this makes the temporary files extremely big, eventually resulting in errors like:

File c:\users\me\appdata\local\temp\00001kjd000a.tmp is too large.

Furthermore the querying takes a really long time and this is not what I want. I tried accessing this data using sqlserver and also using c# code, but it is extremely slow.

The sizes of the databases are about 350mb, 100mb and 10mb. I have these files locally to "speed up" the process, however a triple join from these takes more than 15min...

and i know i will have to use another one which is 2gb

code i'm using:

string connStr = @"Provider=VFPOLEDB.1;Data Source=D:\data\B. Mayer Real\;";
            string qryStr = @"
                select top 100 *
                from db1 a, db2 b, db3 c
                where a.id = b.id
                and b.id = c.id
                order by a.id
                ";

            OleDbConnection conn = new OleDbConnection(connStr);

            conn.Open();

            OleDbCommand cmd = new OleDbCommand(qryStr, conn);
            OleDbDataAdapter adapter = new OleDbDataAdapter(cmd);

            DataSet data = new DataSet();

            adapter.Fill(data);

            DataTable table = data.Tables[0];

            foreach (DataRow myDataRow in table.Rows)
            {
                Console.WriteLine("Found data {0}", myDataRow[1]);
            }

            Console.ReadLine();

            conn.Close();

EDIT:

The biggest problem is simply going through the files... If I do something like this:

SELECT *
FROM [CARATLOCAL]...[lzarb]
where la_nummer = 364999

This already takes 30 seconds

This query took 38 minutes! (it's just one line)

select max(la_datum + convert(datetime, la_time, 108)) as book_datetime, la_nummer, la_index from [CARATLOCAL]...[lzarb]
where la_datum is not null and la_time is not null and la_nummer = 364999
group by la_nummer, la_index
WtFudgE
  • 5,080
  • 7
  • 47
  • 59

2 Answers2

0

Since the "Order by" is the basis of the query, and you are only looking for the most recent 100 (hence my order by is DESCENDING), I would suggest trying to prequery just the ID as a PREQUERY, then join to the rest such as... Make sure you have an index on each table on the "ID" column.

select 
      PreQuery.*, 
      b.*, 
      c.*
   from
      ( select top 100 a.*
           from db1 a
           order by a.ID DESC ) PreQuery
      JOIN db2 b on PreQuery.ID = b.ID
      JOIN db3 c on PreQuery.ID = c.ID

if that doesn't cut it, and the "a.ID" is a type of auto-increment column, you may want to get the max() ID and further strip from that.

ADDITIONALLY... if table "B" and "C" have multiple records per ID, then you are probably running into a Cartesian result... So, if B has 10 entries for each "A" ID, and C has 20 entries for same "A" ID, then you now have 200 entries for that one ID... Do this for 100 entries and you have 20,000 entries. Not a problem with that small a set, but something to consider.

DRapp
  • 47,638
  • 12
  • 72
  • 142
  • I was hoping if there was a quicker way of reading the VFP files, bcs this is my main concern.. Because I need multiple queries so I will have more than only this problem :s – WtFudgE Oct 31 '12 at 02:58
  • @WtFudgE, With proper indexes, VFP is very fast at querying, lack of indexes obviously won't optimize as much. But having a pre-query like I've sampled is virtually the same as the other, just prevents joining everything, then spit out what is not wanted. Get what is wanted, THEN join. – DRapp Oct 31 '12 at 12:56
  • alright, thanks for replies, so it's basically the fault of my client's data -_- – WtFudgE Nov 01 '12 at 02:07
0

Seems to me the problem is that the tables don't have the right indexes to optimize your queries. VFP's query engine is really, really fast, given the right indexes.

Ideally, you have a tag for each field that's used in a WHERE clause. (This is an oversimplification, but a good place to start.)

Tamar E. Granor
  • 3,817
  • 1
  • 21
  • 29