1

I'm looking for some ideas on methods for persisting customer configurable data in a relational database (SQL Server 2000 in my case).

For example let's say you have a standard order entry application where your customer enters a product they want to buy. In addition to the fields that are important to both you and the customer (itemID, cost etc), you want to allow the client to enter information only relevant to them and persist it for them (for later retrieval on reports or invoices or whatever). You also want the labeling of these "customer fields" to be configured by the customer. So one customer might have a field called "Invoice Number" another customer might have 2 fields called "Invoice#" and "Invoice Date" etc...

I can think of a few ways to do this. You could have a customerfields table with some reasonable number of varchar fields related to each transaction and then another table clientcustomerfields which contains the meta data about how many fields a customer uses, what the field names are etc. Alternatively you could use XML to persist the customer data so you don't have to worry about filly up X # of fields, you'd still need some table to describe the customers meta data (maybe through an XSD).

Are there any standard ways of doing this type of thing?

OMG Ponies
  • 325,700
  • 82
  • 523
  • 502
Stimy
  • 1,491
  • 3
  • 15
  • 36

5 Answers5

3

You should read Best Practices for Semantic Data Modeling for Performance and Scalability. This is exactly the question addressed by the white paper in the link.

Remus Rusanu
  • 288,378
  • 40
  • 442
  • 569
2

One strategy I'd use:

customer_fields
- field_id
- cusomer_id
- field_name

customer_transaction_field_values
- transaction_id
- field_id
- field_value

Matt Wrock
  • 6,590
  • 29
  • 23
  • Having a table that has the a FK to the customer and then these fields would be a good, database only solution and fairly easy to sort and implement if/as needed. – JamesEggers Sep 24 '09 at 20:49
  • the fk to the customers table would probably best be in the transaction table – Matt Wrock Sep 24 '09 at 20:52
  • 1
    I'm aware of at least one product that started out using this type of approach and then abandoned it because it doesn't work as well at you might think. The field_values table gets quite large and queries often involve joining against the table multiple times. Putting a clustered index on the table mitigates the worst of the performance drain but the table is still relatively fiddly to query. – ConcernedOfTunbridgeWells Sep 24 '09 at 20:56
1

As a generalisation I would recommend against using opaque XML blobs to store field-oriented data in a relational database.

  1. The best solution is to have some 'user' fields and configuration within the application to set up how these fields are used and presented. If the fields are varchars the overhead for empty fields is fairly minimal, IIRC about 1 byte per field. Although this looks inelegant and has a finite number of fields, it is the simplest to query and populate which makes it the fastest. One option would be to make the configuration agnostic to the number of fields and simply run a script to add a few more fields if you need them.

  2. Another option is to have a 'coding' table hanging off entities which user-configurable fields. It has 'entity ID', 'field type' and 'field code' columns where the 'field type' column denotes the actual content. The particular disadvantage is that it makes queries slower as they have to potentially join against this table multiple times.

I've actually seen both (1) and (2) in use on the same system. The vendor originally started with (2) but then found it to be a pain in the arse and subsequent subsystems on the application went to using (1). This change in approach was borne out of bitter experience.

The principal strike against XML blobs is that they are not first class citizens in the database schema. The DBMS cannot enforce referential integrity on the blob by itself, it cannot index individual columns within the blob and querying the data from the blob is more complex and may not be supported by reporting tools. In addition, the content of the blob is completely opaque to the system data dictionary. Anyone trying to extract the data back out of the system is dependent on the application's documentation to get any insight into the contents.

ConcernedOfTunbridgeWells
  • 64,444
  • 15
  • 143
  • 197
  • 1
    Once the Op upgrades to a new version of SQL Server the XML can then be indexed with XML Indexes. There's nothing wrong with storing data that you don't ever need to search against as an XML data if that's the easiest way for the app to deal with it. I've got several places when I store XML data as a value in a field, because the website needs to be able to store a variety of customer settings data, but we don't want to redesign the tables each time the site needs to add another item (or have a named value pairs table). Since the database doesn't care about the value, storing as XML works. – mrdenny Sep 24 '09 at 22:01
  • People tend to forget that application developers aren't the only stakeholders in a database. Just because 'I' don't need to search an opaque field doesn't mean that no-one needs to search it. – ConcernedOfTunbridgeWells Sep 25 '09 at 22:35
  • @ConcernedOfTunbridgeW - you make a good point but the I expect the overall cost of development following you're suggested route would be significantly more than using XML blobs. Ultimately, it's not relational data and might be better served in something like Azure's Table Storage but storing XML is better than forcing the data into a relational model like you suggest. You can't rely on the benefits of a DBMS for that data any more but you can always write query and caching logic for the XML in your DAL. – EightyOne Unite Jul 14 '11 at 23:43
  • Persisting the data in XML might or might not make the application easier to develop, but it makes it much, much harder to work with the database. XML is just not a first-class citizen in the ETL, reporting or database tooling world. My experience is that this type of storage makes ETL processing really slow at runtime and time consuming to develop. Reporting directly off the data is even worse. The cure is worse than the disease - any local savings in application development time will be lost many times over in additional support and integration costs. – ConcernedOfTunbridgeWells Jul 15 '11 at 09:54
0

In addition to your own suggestions, another way is to look at the Profile Provider system in ASP.net (Assuming a MS tech stack on this). You can extend the ProfileBase to include 2 arrays representing your user defined keys and another for the corresponding values. At that point, the SqlProfileProvider will handle the storage and retrieval of such and create your implementation of the ProfileBase object. Ultimately, this would be similar to if you were trying to use the ProfileProvider system in a Web Application project and not a Web Site project (which use different build managers).

JamesEggers
  • 12,885
  • 14
  • 59
  • 86
0

I have done this in the past and used the concept of user defined fields. I would create four tables for the basic types:

  1. UDFCharacter - id - int, order_id - int, value - varchar(4000)
  2. UDFNumber - id - int, order_id - int, value - float
  3. UDFDateTime - id - int, order_id - int, value - datetime
  4. UDFText - id - int, order_id - int, value - text

I would then have a table that described the fields along with their type:

CustomField - id - int, customer_id - int (linked to customer table), fieldType - 'UDFCharacter, UDFNumber, etc', name - varchar, and other meta info

The responses to the fields go in the UDF tables. The fields get displayed on the page based on the CustomField table. Our system was may more complex and required more tbales but this seems like it would work.

efleming
  • 247
  • 2
  • 4
  • 9
  • to get he benefit of strong(er) typing you can also implement this approach using the SQL_Variant data type in Sql Server http://msdn.microsoft.com/en-us/library/ms173829.aspx – Jason Horner Sep 26 '09 at 03:34