0

Prelude: The design of this database is truly horrible - this isn't the first "crooked" question I've asked, and it won't be the last. The question is what it is, and I'm only asking because A) I only have a couple of years of experience with SQL Server, and B) I've already been pounding my face on my keyboard for a couple of days trying to find a viable solution.

Having said all of that...

We have a database with two tables relevant to this issue. The schema is ridiculous, so I'm going to paraphrase so that it can be understood:

T_Customer & T_Task

T_Task holds data about various work that has been performed on behalf of a customer in T_Customer.

In T_Customer, there is a field called "Sort_Type" (again, paraphrasing...). In this field, there is a concatenated string of various fields from T_Task which, in the order specified, determine how the customer's report is produced in the client program. There are a total of 73 possible fields in T_Task that can be chosen as Sort_Type in T_Customer, and the user can choose up to 5 of them in any given order. For example:

T_Customer

Customer_ID | Sort_Type
------------|-------------------------------
  1         | 'Task_Date,Task_Type,Task_ID'
  2         | 'Task_Type,Destination'
  3         | 'Route,Task_Type,Task_ID'

T_Task

Task_ID | Customer_ID | Task_Type | Task_Date | Route   | Destination
--------|-------------|-----------|-----------|---------|-------------
 12345  | 1           | 1         | 01/01/2017| '1 to 2'| '2'
 12346  | 1           | 1         | 01/02/2017| '3 to 4'| '4'
 12347  | 2           | 2         | 12/31/2016| '6 to 2'| '2'
 12348  | 3           | 3         | 01/01/2017| '4 to 1'| '1'

In this example, Customer #1's report would be sorted/totaled by the Task_Date, then by Task_Type, then by Task_ID; but not simply by doing an ORDER BY. This function requires one single value which can be ordered as a whole, single unit. As such...

Up until today, a field existed in T_Task called (paraphrasing....) 'MySort'. This field contained a concatenated string of fixed-width values filled in with zeroes and created according to the order and content of the values in T_Customer.Sort_Type. In this case:

Task_ID | Customer_ID | Task_Type | Task_Date | Route | Destination | MySort
--------|-------------|-----------|-----------|-------|-------------|-------
 12345  | 1           | 1         | 01/01/2017| 1 to 2| 2           |'002017010100000000010000012345'
 12346  | 1           | 1         | 01/02/2017| 3 to 4| 4           |'002017010200000000010000012346'
 12347  | 2           | 2         | 12/31/2016| 6 to 2| 2           |'000000000000000000020000000002'
 12348  | 3           | 3         | 01/01/2017| 4 to 1| 1           |'000040to0100000000030000012348'

During the printing phase of every single report, the program would search for the customer, find the values in T_Customer.Sort_Type, split them by commas, and then run an update on all of the tasks of that customer to update the value of MySort accordingly...

Can you guess what the problem is with this? Performance (not to mention chronic insanity)

I have been tasked with finding a more efficient way of performing this same task server-side, within SQL Server 2005 if possible, using whatever means will eventually allow me to return a result set including all of the details of the tasks requested, together with a concatenated string similar to the one used in the past (which the client program relies upon in order to sort and subtotal the report).

I've tried Views, UDFs in computed columns, and parameterized queries, but I know my limitations. I'm too inexperienced to know all of my options.

Question: Aside from quitting (not an option) or going berserk (considering it...), what methods might you use to solve this problem?

EDIT: Having received two questions about the MySort column already, I'll explain a bit better.

T_Task.MySort =

REPLICATE('0',10 - LEN(T_Customer.Sort_Type[Value1]) 
    + CAST(T_Customer.Sort_Type[Value1] AS VARCHAR(10))
        +
REPLICATE('0',10 - LEN(T_Customer.Sort_Type[Value2]) 
    + CAST(T_Customer.Sort_Type[Value2] AS VARCHAR(10))
        +
REPLICATE('0',10 - LEN(T_Customer.Sort_Type[Value3]) 
    + CAST(T_Customer.Sort_Type[Value3] AS VARCHAR(10))
WHERE T_Customer.Customer_ID = T_Task.Customer_ID

...Up to T_Customer.Sort_Type[Value5].

Reminder: Those values are not constants at all, so the value of the field MySort had to constantly be updated before printing a report. The idea is to somehow remove the need to constantly update the field, and instead return the string as part of the result set.

The resulting string should always be 50 characters in length. I didn't do that here simply to save a bit of space and time - I chose only 3 for the example. The real string would simply have another twenty zeroes leading the value: '00000000000000000000002017010100000000010000012345'

3BK
  • 1,338
  • 1
  • 8
  • 11
  • Can you explain how you get the values stored in `MySort`? I didn't quite get that part. – Zohar Peled Jan 26 '17 at 12:38
  • I can't figure out the bits in mysort column which relate to the (up to) 5 columns selected - can you expand on this a bit. – P.Salmon Jan 26 '17 at 12:39
  • Absolutely. Thanks for your time. In the example given, Customer #1 uses the T_Task fields Task_Date, Task_Type, and Task_ID for his custom sort. Each value must be 10 characters in length, with any unexploited whitespace replaced by zeroes. So the value being sorted comes out '00' + '2017' [Year] + '01' [Month] + '01' [Day] + '000000000' + '1' [Task_Type] + '00000' + '12345' [Task_ID]. The full, concatenated string to be sorted by the program is therefore '002017010100000000010000000001'. – 3BK Jan 26 '17 at 12:42
  • So should I expect 3 blocks of 10 if there are 3 sort_types and 2 blocks of 10 if there are 2 sort types (etc) - your example does not look like that? – P.Salmon Jan 26 '17 at 12:46
  • What if the user picks 5 columns to sort? how will the string look like? – Zohar Peled Jan 26 '17 at 12:51
  • The string is always 50 characters. I used only 30 for my example. If the user picks 5 fields for the value, then the value would be 5 individual values, each exactly 10 characters in length, with any whitespace removed and replaced by zeroes. If the user picks only two values, there will be thirty leading zeroes, followed by two 10-character values, i.e. '00000000000000000000000000000000000123450000012345' If the user picks one, there will be forty leading zeroes, followed by a single 10-character value, i.e. '00000000000000000000000000000000000000000000012345' – 3BK Jan 26 '17 at 13:00
  • That's quite a pickle you've got there... – Zohar Peled Jan 26 '17 at 13:19
  • @ZoharPeled : My pickle is bigger than your pickle? – 3BK Jan 26 '17 at 13:22
  • I tried an `OUTER APPLY` with a UDF I created which gave the desired results, but the performance is terrible, probably because it's effectively calculating the string per-row. Would there maybe be a clever way to do this by result sets? After all, the customer's Sort_Type doesn't change in the time it takes to produce a single report... Maybe there's some way to calculate all of that given customer's MySort values for all of his/her tasks in one batch without iterating over every row programmatically? – 3BK Jan 27 '17 at 14:03

0 Answers0