1

I wasnt quite sure how to word the title so sorry about that. I am trying to load some start and stop schedule times into a database but sometimes a start and stop time for the same system will overlap such as one time may be 5:30 to 12:30 and then another time for the same system may be 8:30 to 10:30 I want to avoid inserting that data into the table.

I am using a jdbc and odbc bridge to do this all in a java program, when the user clicks generate schedules it reads all the schedule info from a text file, and then inserts it into the database. When the programs reads the times that are in between already existing times I want to skip doing the insert.

My Ideas so have to been to some how compared the end time i get from the text file with the MAX value of the times in the database and if its less than that value then skip this insert but I dont know how to tie the MAX value into an if statement. Another idea was after im done all the inserts then just delete the rows where the SCHEDULE_TIME is greater than the min value in the SCHEDULE_TIME column and the SCHEDULE_TIME is less than the max value in the SCHEDULE_TIME column.

here is an example of what the data in my table looks like:

SITE_ID ------- DEV_ID ------- SCHEDULE_TIME ------- VALUE_ENUM
---------------------------------------------------------------
1               3000           09:30:00              1
1               3000           15:30:00              0
1               3000           12:30:00              1
1               3000           13:30:00              0
1               3000           16:30:00              1
1               3000           18:30:00              0

the rows alternate from start top to stop time all the rows where VALUE_ENUM are 1 are start times and all the rows where VALUE_ENUM are 0 are stop times. Im trying to delete the time that falls between a other start and stop times in this case delete rows 3 and 4. Keep in mind this tables actually creates hundreds of rows from the text file so I can not just delete it manually, it would be best if I could find a way to just avoid inserting it.

Here is a copy of my current insert method, ignore all the extra columns I use, they are unrelated to the problem, they are just using so I add and delete for the correct systems.

private void Insert() throws SQLException
{
    stmt = conn.createStatement();  

    String sqlStm = "update ARRAY_BAC_SCH_Schedule set SCHEDULE_TIME = {t '" + finalEnd + "'} WHERE SCHEDULE_TIME >=  {t '" + finalStart + "'} AND" +
        " SCHEDULE_TIME <=  {t '" + finalEnd + "'} AND VALUE_ENUM = 0 AND DEV_ID = " + devID + " and INSTANCE = " + instanceNum;
    int updateSuccess = stmt.executeUpdate(sqlStm);

    if (updateSuccess < 1)
    {

        sqlStm = "insert into ARRAY_BAC_SCH_Schedule (SITE_ID, DEV_ID, INSTANCE, DAY, SCHEDULE_TIME, VALUE_ENUM, Value_Type) " +
                " values (1, " + devID + ", " + instanceNum + ", " + day + ", {t '" + finalStart + "'}, 1, 'Unsupported')";
        stmt.executeUpdate(sqlStm);
        sqlStm = "insert into ARRAY_BAC_SCH_Schedule (SITE_ID, DEV_ID, INSTANCE, DAY, SCHEDULE_TIME, VALUE_ENUM, Value_Type) " +
                " values (1," + devID + ", " + instanceNum + ", " + day + ", {t '" + finalEnd + "'}, 0, 'Unsupported')";
        stmt.executeUpdate(sqlStm);
    }


    if(stmt!=null)
        stmt.close();
}

I hope I explained this enough, sorry if the question is unclear.

I have also posted this question under sql tags to see if someone knows a way to do it with just sql.

Update: I had it working before by taking the last start and end time entering and then when I go to enter 2 new times I checked to see that the new start time was >= the previous one and that the new end time <= the previous end time and if they were I would skip the insert.

However I have had to make some changes to the program which I can no longer get the previous start/end time, my idea now is if I can select the last 2 times in the table and saved them to a variable on the java side and then do the same comparison I did before

Beef
  • 1,413
  • 6
  • 21
  • 36
  • How do you know determine what range to use? For example 9:30 to 15:30 (line 1,line2) was your example. But could it be 9:30 to 13:30 (line1,line4)? – Alfredo Osorio Sep 19 '11 at 16:59
  • @AlfredoO yes it could be `9:30 to 13:30` but then it wouldnt `insert` the `12:30 to 13:30` because the `update` in my function would be successful so `updateSuccess` would equal 1 so it would not enter the `if` statement that has the `insert`, hope that answered your question – Beef Sep 19 '11 at 17:05
  • So is in the order in which you are processing right? – Alfredo Osorio Sep 19 '11 at 17:09
  • yes, like in my text file I could have `9:30 to 11:30` and then `10:30 to 15:30` my program would insert `9:30` and `11:30` and then in my update it would see that `10:30` falls before the last stop time and that `15:30` falls after it, so instead of inserting I update the `11:30` to equal `15:30` and the next line in the text file could be the `12:30 to 13:30` and that doesnt meet the requirements of my update so that fails which allows it to enter the insert – Beef Sep 19 '11 at 17:14
  • the text file is ordered by start times so its processes in order of start times – Beef Sep 19 '11 at 17:17

3 Answers3

1

I think the problem is with your design. If your text file is relying in the order you shouldn't be saving as is in the database.

I would have store that information in just one record. For example instead of having VALUE_ENUM and two records in the database you will just need one record with the following fields START_SCHEDULE_TIME and END_SCHEDULE_TIME. Now with that design you could just do a select like this:

select count(*) from ARRAY_BAC_SCH_Schedule where finalStart >= START_SCHEDULE_TIME and finalEnd <= END_SCHEDULE_TIME

if the select returns 1 then you just skip and don't insert.

Alfredo Osorio
  • 11,297
  • 12
  • 56
  • 84
  • Thanks but I dont have any control over the design of the database. – Beef Sep 19 '11 at 18:01
  • I also came to realize I cant just delete where >= min of SCHEDULE_TIME and <= max of SCHEDULE_TIME, because I have schedules with several starts and stops it will delete them such as data like: `8:30 to 10:30`, `11:00 to 14:00` and then `12:00 to 13:00` because it will delete all the times between `8:30` and `14:00` even though i need to keep the `10:30` and `11:00` – Beef Sep 19 '11 at 18:06
  • I need something along those lines but instead of getting `START_SCHEDULE_TIME` and `END_SCHEDULE_TIME` since those rows dont exist, I need to get the last value in table `where dev_id == 3000` and the 2nd last value in the table `where dev_id == 3000` and to be able to use those values on the java side of my application – Beef Sep 22 '11 at 16:51
1

Is it fair to say that the problem area is characterised/identified by 2 or more consecutive start times? I suspect it is.

If that is the case then perhaps just load everything you have into the table, then parse the table in time order. Each time you encounter 2 consecutive start times you know you must delete 'this' start/end pair.

Of course if you encounter a THIRD start time in a row you know you have to delete 2 end times.

so in pseudocode - The dataset is ordered by Dev_Id, schedule_time, value_enum desc

int NestCount = 0;
int CurrDevId = 0;

for each record
{

   // Optional DevId Logic Starts Here
   if ThisRecord.DevId <> CurrDevId then
   {
      if NestCount <> 0
      {
         CurrDevId = 0;
         RollBack;
         Shriek("Cannot Proceed Because The Data Are Fubar");
      }
      else
      {
         CurrDevId = ThisRecord.DevId;
      }
   } // And Ends Here

   if 
   CurrDevId <> 0 && // Optional DevId Logic 
   IsAStartRecord() 
   {
      If NestCount > 0 
      {
        DeleteThisRecord();
      } 
      NestCount += 1;     
   }
   else // Is an end record
   {
      NestCount -= 1; 
      If NestCount > 0 
      {
        DeleteThisRecord();
      } 
   }
}

If it is possible to implement this as a stored procedure I probably would.

Does that help ?

Hugh Jones
  • 2,706
  • 19
  • 30
  • Have just thought there might be a circumstance where an end-time and the next start-time might be identical. This would be addressed by ensuring the correct sort order - order by schedule_time, value_enum desc – Hugh Jones Sep 20 '11 at 10:08
  • the problem with that is DEV_ID is for each system, cause each system has a separate schedule, so in my example you only see one DEV_ID but really there will be hundreds of DEV_IDs in there, so if I order by schedule time then all the DEV_IDs with a specific time with be grouped together so I could end up up deleting the wrong row – Beef Sep 20 '11 at 12:41
  • Then order by dev_id, schedule_time, value_enum desc. I have pretty much assumed your data is already 'clean', so the algorithm will work 'as-is'. However - you can use the fact that the NestCounter should always return to zero for each dev_id as an additional check. I will modify the pseudocode to take account of that. – Hugh Jones Sep 20 '11 at 13:46
  • I seemed to have figured it out another way on my own, but I will keep this in mind because I am not sure how error proof my method is – Beef Sep 20 '11 at 14:14
  • All I have tried to do is describe an algorithm, and I think it works. – Hugh Jones Sep 20 '11 at 14:21
0

thanks for the help, but I ended up just doing a string replace on the : in the time and made it into a . then converted it to a double and saved that time into a temp variable and then the next time I had times for that system I checked if the start time was greater than the last start time and if the end time was less than the last one so if they fell between I could avoid adding, not 100% tested but it worked for the first system I tried

Beef
  • 1,413
  • 6
  • 21
  • 36