
Disclaimer: Below is a very simplified description of my problem. While reading it, please imagine some complicated modular application (ERP, Visual Studio, Adobe Photoshop) that evolves over the years, with features being added and removed, in the hope that it will not end up as spaghetti code.

Let's say I have the following entity and a corresponding table in the database:

class Customer
{
    public int Id { get; set; }
}

Then I use some ORM, build a DataContext, and create my application:

static void Main(string[] args)
{
    //Data Layer
    var context = new DataContext();
    IQueryable<Customer> customers = context.Customers;

    //GUI
    foreach (var customer in customers)
    foreach (var property in customer.GetType().GetProperties())
        Console.WriteLine($"{property.Name}:{property.GetValue(customer)}");
}

The application is done, my customer is happy, case closed.

Half a year later, my customer asks me to add a Name to the Customer. I want to do it without touching the previous code.

So first I create a new entity and a corresponding table in the database, and add it to the ORM (please ignore the fact that I'm modifying the same DataContext; this is easy to fix):

class CustomerName
{
    public int Id { get; set; }
    public string Name { get; set; }
}

CustomerName adds a new property to Customer, but to get the complete Customer information we need to join the two together. So let's try to modify the application without touching the previous code:

static void Main(string[] args)
{
    //Data Layer
    var context = new DataContext();
    IQueryable<Customer> customers = context.Customers;

    //new code that doesn't even compile
    customers = from c in customers
                join cn in context.CustomerNames on c.Id equals cn.Id
                select new {c, cn}; //<-- what should be here??

    //GUI
    foreach (var customer in customers)
    foreach (var property in customer.GetType().GetProperties())
        Console.WriteLine($"{property.Name}:{property.GetValue(customer)}");
}

As you can see, I have no idea what to map the result of the join to so that it is still a valid Customer object.

And no, I cannot use inheritance.

Why?

Because at the same time another developer may be asked to add a feature for blocking customers, and will create the following entity:

class BlockedCustomer
{
    public int Id { get; set; }
    public bool Blocked { get; set; }
}

He will not know anything about CustomerName, so he can only depend on Customer, and at runtime our two features will result in something like this:

static void Main(string[] args)
{
    //Data Layer
    var context = new DataContext();
    IQueryable<Customer> customers = context.Customers;

    //new code that doesn't even compile
    customers = from c in customers
                join cn in context.CustomerNames on c.Id equals cn.Id
                select new {c, cn}; //<-- what should be here??

    customers = from c in customers
                join b in context.BlockedCustomers on c.Id equals b.Id
                select new { c, b }; //<-- what should be here??

    //GUI
    foreach (var customer in customers)
    foreach (var property in customer.GetType().GetProperties())
        Console.WriteLine($"{property.Name}:{property.GetValue(customer)}");
}

I have two ideas for how to solve it:

  1. Create some container class that will inherit from Customer and play with casting/converting it to CustomerName or BlockedCustomer when needed.

Something like this:

class CustomerWith<T> : Customer
{
    private T Value;

    // Assumes Customer exposes a copy constructor, Customer(Customer other).
    public CustomerWith(Customer c, T value) : base(c)
    {
        Value = value;
    }
}

and then

customers = from c in customers
            join cn in context.CustomerNames on c.Id equals cn.Id
            select new CustomerWith<CustomerName>(c, cn);

  2. Use ConditionalWeakTable to store (at the data layer level) the CustomerName and BlockedCustomer associated with a Customer, and modify the UI (once) to be aware of such things.
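
To make the second idea concrete, here is a minimal sketch of attaching extension objects to an already materialized Customer with ConditionalWeakTable. The `Attached<T>` helper and `AttachedDemo` are hypothetical names used only for illustration, and the sketch deliberately leaves out the hard part (getting the ORM to fill the table from a single query).

```csharp
using System;
using System.Runtime.CompilerServices;

class Customer
{
    public int Id { get; set; }
}

class CustomerName
{
    public int Id { get; set; }
    public string Name { get; set; }
}

// Hypothetical helper: one weak side table per extension type T.
// Entries disappear automatically once the Customer key is garbage collected.
static class Attached<T> where T : class
{
    private static readonly ConditionalWeakTable<Customer, T> Table =
        new ConditionalWeakTable<Customer, T>();

    // Throws ArgumentException if the customer already has a T attached.
    public static void Attach(Customer customer, T value) => Table.Add(customer, value);

    // Returns null when nothing of type T is attached to this customer.
    public static T Find(Customer customer) =>
        Table.TryGetValue(customer, out var value) ? value : null;
}

static class AttachedDemo
{
    public static string Run()
    {
        var customer = new Customer { Id = 1 };

        // In the real data layer this would happen while materializing query results.
        Attached<CustomerName>.Attach(customer, new CustomerName { Id = 1, Name = "Alice" });

        var name = Attached<CustomerName>.Find(customer);
        return $"Id:{customer.Id} Name:{name?.Name}";
    }
}
```

Customer itself stays unchanged, and a generic UI would have to ask each registered extension type for its attached object in addition to reflecting over Customer.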

To my knowledge, both solutions unfortunately require me to write my own LINQ mapper (including change tracking), and I want to avoid that.

  1. Do you know of any ORM that can handle such requirements?
  2. Or is there perhaps a much better/simpler way to write applications without violating the Open/Closed principle?

Edit - Some clarifications after comments:

  1. I'm talking about properties that have a one-to-one relationship with Customer. Usually such properties are added as additional columns in the table.
  2. I want to send only one SQL query to the database. Adding 20 such new features/properties/columns shouldn't end up in 20 queries.
  3. What I showed is a simplified version of the application I have in mind. I'm thinking about an app whose dependencies are structured as follows: [UI]-->[Business Logic]<--[Data]. All 3 should be open for extension and closed for modification, but in this question I'm focusing on the [Data] layer. [Business Logic] will ask the [Data] layer (using Linq) for Customers. So even after extending it, the Customer BL will just ask for Customer (in Linq), but the extension to [Business Logic] will need CustomerName. The question is how to extend the [Data] layer so that it still returns Customer to the Customer BL, but provides CustomerName to the Customer name BL, and still sends one query to the database.
  4. When I showed joins as a proposed solution, it may not have been clear that they will not be hard-coded in the [Data] layer. Rather, the [Data] layer should know that it must call some methods that may want to extend the query, and these methods will be registered by the main() module. The main() module is the only one that knows about all dependencies, and for the purpose of this question it's not important how.
SeeR
  • One solution would be to have a single child table that contains extension properties for the parent Customer Table: the CustomerProperties Table. It would have a foreign key (the Customer Id), a Feature Id (key), a Property Name (an additional key), and a Value (a string that would need to be parsed). Another solution would be to use multiple child tables, each specific to the kind of functionality you are adding. CustomerPurchases, CustomerContactDetails, etc. – Ryan Pierce Williams Oct 30 '18 at 01:11
  • I've worked with both. The first is easy to implement, but it will mingle the data from all such features in one table (though easy to filter with the Feature Id). The second allows you to use strongly typed classes without the overhead of custom parsing logic. – Ryan Pierce Williams Oct 30 '18 at 01:14
  • @RyanPierceWilliams Thank you. Your second solution is not exactly the one I've described in my question and what I want to achieve. Yes, I want strongly typed classes, but I'm talking about a one-to-one relationship. Your examples seem to be really separate things that can be shown separately in the UI (one customer to many X), while I'm talking about properties that should be shown on the same Customer List/Card. With your first one, on the other hand, I'm scared of how the Linq/SQL will look when searching on each column - it smells like Linq will not handle it and I would need to build pure SQL to do it. – SeeR Oct 30 '18 at 08:07
  • The *real* problem is mixing up the data classes and business entities. You don't have to use the same classes in both the business and data models. In fact you *shouldn't*, except in very simple problems. Once the models get too complex you need to *separate* them so each model can evolve independently – Panagiotis Kanavos Oct 30 '18 at 08:23
  • BTW you shouldn't be using joins in LINQ either, it's the ORM's job to generate joins based on relations. If you use joins it means the EF context configuration is missing relations. The `Customer` should have a `CustomerNames` collection. `CustomerName` itself should be configured, so that `context.Customers.Where(c => c.Id == whatever)` will load both the customer *and* the related names – Panagiotis Kanavos Oct 30 '18 at 08:24
  • @PanagiotisKanavos Please remember that I've simplified everything so my question would be easy to understand. Customer and CustomerName are data classes that data source should provide. I don't care how they are provided by my datasource. Linq is used to decouple datasource from business logic. BL sends query to datasource and doesn't care how it will execute it. Your solution will result in dependency Customer-->CustomerName or even in both directions. Customer shouldn't know anything about features added/removed later. But on the other side CustomerName-->Customer is perfectly fine. – SeeR Oct 30 '18 at 10:05
  • @SeeR the *data model* should definitely know about *data*. Those aren't features, and hard-coding a JOIN doesn't decouple anything. It moves data concerns outside the data model and into the domain/business code. Nobody said you should have *one* DbContext either. In fact, you should have different contexts/models per scenario so you don't end up with entities that either have unwanted properties or have to be tied up in code afterwards – Panagiotis Kanavos Oct 30 '18 at 10:08
  • @PanagiotisKanavos You're focusing on the wrong thing. For simplicity's sake I was not clear enough in my question that the joins will be added in the data layer, and that the data layer should also be extensible and not violate the open/closed principle. So BL (and UI) will ask for Customers and Customers only, but an extension to BL (& UI) needs additional information that should be retrieved together with Customers. Therefore I also need to extend the data layer to get this information. – SeeR Oct 30 '18 at 10:19
  • @PanagiotisKanavos I've extended my question with some clarification – SeeR Oct 30 '18 at 10:42
  • @SeeR I've built model-driven applications in the past. I've hit the same walls you are hitting because you are working on the wrong abstraction level. Besides, data binding and MVVM make it trivial to display *any* object in a XAML or MVC view. An ORM makes MDD *harder* because it imposes its own model, using its own configuration. The `join` in your code tries to *define a relation* at the wrong level for both ORM and MDD. If you convert your objects into "Entity" and "Attribute" classes though, you end up with just a DataTable, a lot of complexity and no benefits from using LINQ, ORMs etc – Panagiotis Kanavos Oct 30 '18 at 10:46
  • @PanagiotisKanavos I hear what you say and understand your point about DataTable, but can't understand what you mean by "The join in your code tries to define a relation at the wrong level". I cannot join it in DB, because that way I would need to create and modify Customer View every time I add/remove something. [Data] layer seems to be the best place for this. Linq also gives me easy ability to pass user intention to filter and order the data from UI down to DB (where it should be done) without creating hundreds of methods in [Data] layer for each such case (too much coupling between BL&DB). – SeeR Oct 30 '18 at 11:05
  • @PanagiotisKanavos using "Entity" and "Attribute" classes, Linq allows me to 1) Prepare (Extend) Customer query at data layer 2) add additional where and orderby conditions in BL either for Customer or its attributes. – SeeR Oct 30 '18 at 11:11
  • @PanagiotisKanavos also I completely don't know what you mean by "model-driven applications" and "An ORM makes MDD harder because it imposes its own model". If you can, please write an answer to my question showing how you would do it as a "model-driven application" without an ORM. Thank you. – SeeR Oct 30 '18 at 11:15
  • Build the entity model the way it *should be*, then model one or more models on top of that which project that model to a form that's ideal for your UI (so project the entity model to e.g. denormalized models). What you're proposing is equal to adding new features to an app using only new classes and not refactoring anything. You'll end up with a mess that's illogical. So in short: don't do this: the ppl who have to maintain it will be grateful. – Frans Bouma Oct 30 '18 at 14:04
  • @FransBouma In extreme case (which my question may suggest) you are right, but also in extreme case someone may say "layers are bad because they make refactoring harder". Yet we are creating them because (if not taken to extreme) benefits are greater than costs. It all should be balanced. How? For example, based on Single Responsibility Principle which says to group code so that it will change at the same time for the same reason. – SeeR Oct 30 '18 at 14:43

1 Answer


"I have following entity and corresponding table in database"

I think that this basic premise is the root cause for most problems.

To be clear, there's nothing inherently bad about basing an application architecture on an underlying relational database schema. Sometimes, a simple CRUD application is all the stakeholders need.

It's important to realise, though, that whenever you make that choice, application architecture is going to be inherently relational (i.e. not object-oriented).

A fully normalised relational database is characterised by relations. Tables relate to other tables via foreign keys. These relationships are enforced by the database engine. Usually, in a fully normalised database, (almost) all tables are (transitively) connected to all other tables. In other words, everything's connected.

When things are connected, in programming we often call it coupling, and it's something we'd like to avoid.

You want to divide your application architecture into modules, yet still base it on a database where everything's coupled. I know of no practical way of doing that; I don't think it's possible.

ORMs just make this worse. As Ted Neward taught us more than a decade ago, ORMs are the Vietnam war of computer science. I gave up on them years ago, because they just make everything worse, including software architecture.

A more promising way to design a complex modular system is CQRS (see e.g. my own attempt at explaining the concept back in 2011).

This would enable you to treat your domain events, like naming a customer or blocking a customer, as Commands that encapsulate not only how, but also why something should happen.
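
As a rough illustration of that idea, the sketch below models the two features from the question as independent command types. All names here (`NameCustomer`, `BlockCustomer`, `ICommandHandler<TCommand>`, `BlockCustomerHandler`, `CommandDemo`) are assumptions made up for this example, and the handler only records a log line instead of persisting anything.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical command for the "customer name" feature.
sealed class NameCustomer
{
    public NameCustomer(int customerId, string name)
    {
        CustomerId = customerId;
        Name = name;
    }

    public int CustomerId { get; }
    public string Name { get; }
}

// Hypothetical command for the "block customer" feature.
// Reason carries the "why", not just the new state.
sealed class BlockCustomer
{
    public BlockCustomer(int customerId, string reason)
    {
        CustomerId = customerId;
        Reason = reason;
    }

    public int CustomerId { get; }
    public string Reason { get; }
}

// Each feature module ships its own command and handler.
interface ICommandHandler<TCommand>
{
    void Handle(TCommand command);
}

sealed class BlockCustomerHandler : ICommandHandler<BlockCustomer>
{
    // Stand-in for whatever store the write side actually uses.
    public List<string> Log { get; } = new List<string>();

    public void Handle(BlockCustomer command) =>
        Log.Add($"Customer {command.CustomerId} blocked: {command.Reason}");
}

static class CommandDemo
{
    public static string Run()
    {
        var handler = new BlockCustomerHandler();
        handler.Handle(new BlockCustomer(42, "unpaid invoices"));
        return handler.Log[0];
    }
}
```

Neither command type references the other, so the developer adding blocking does not need to know that naming exists.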

On the read side, then, you could provide data via (persistent) projections of transactional data. In a CQRS architecture, however, transactional data often fits better with event sourcing, or a document database.
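
One way to picture such a projection is a read model that each feature extends by registering how its own events update a row. This is only a sketch under assumptions: the event types, `CustomerListProjection`, and `ProjectionDemo` are hypothetical, and the rows live in memory here, whereas a real projection would be persisted.

```csharp
using System;
using System.Collections.Generic;

interface ICustomerEvent
{
    int CustomerId { get; }
}

// Hypothetical events emitted by the two independent features.
sealed class CustomerNamed : ICustomerEvent
{
    public int CustomerId { get; set; }
    public string Name { get; set; }
}

sealed class CustomerBlocked : ICustomerEvent
{
    public int CustomerId { get; set; }
}

// Denormalized read model: one row of named values per customer.
sealed class CustomerListProjection
{
    private readonly Dictionary<Type, Action<Dictionary<string, object>, ICustomerEvent>> appliers =
        new Dictionary<Type, Action<Dictionary<string, object>, ICustomerEvent>>();

    private readonly Dictionary<int, Dictionary<string, object>> rows =
        new Dictionary<int, Dictionary<string, object>>();

    // A feature module registers how its event changes a row.
    public void Register<TEvent>(Action<Dictionary<string, object>, TEvent> apply)
        where TEvent : ICustomerEvent =>
        appliers[typeof(TEvent)] = (row, e) => apply(row, (TEvent)e);

    // Events of unregistered types are ignored.
    public void Apply(ICustomerEvent e)
    {
        if (!appliers.TryGetValue(e.GetType(), out var apply)) return;

        if (!rows.TryGetValue(e.CustomerId, out var row))
        {
            row = new Dictionary<string, object> { ["Id"] = e.CustomerId };
            rows[e.CustomerId] = row;
        }

        apply(row, e);
    }

    public IReadOnlyDictionary<string, object> Row(int customerId) => rows[customerId];
}

static class ProjectionDemo
{
    public static string Run()
    {
        var projection = new CustomerListProjection();
        projection.Register<CustomerNamed>((row, e) => row["Name"] = e.Name);
        projection.Register<CustomerBlocked>((row, e) => row["Blocked"] = true);

        projection.Apply(new CustomerNamed { CustomerId = 1, Name = "Alice" });
        projection.Apply(new CustomerBlocked { CustomerId = 1 });

        var result = projection.Row(1);
        return $"{result["Id"]}|{result["Name"]}|{result["Blocked"]}";
    }
}
```

A generic UI like the one in the question could then list the key/value pairs of each row, and reading the list would be a single lookup against the projection rather than a join per feature.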

This sort of architecture fits well with microservices, but you can also write larger, modular applications this way.

Relational databases, like OOD, sounded like a good idea for a while, but it has ultimately been my experience that both ideas are dead ends. They don't help us reduce complexity. They don't make things easier to maintain.

Mark Seemann