Mapping a collection joined by either of two columns with Fluent NHibernate?

Question

I'm modeling accounting where I have accounts with transactions that debit one account and credit another.

Here are the details of the situation (simplified). My tables (in SQL Server 2008) are:

CREATE TABLE Account
(
    AccountID int IDENTITY(1,1) NOT NULL,
    AccountNumber nvarchar(10) NOT NULL
)

CREATE TABLE [Transaction]
(
    TransactionID [bigint] IDENTITY(1,1) NOT NULL,
    DebitAccountID [int] NOT NULL,
    CreditAccountID [int] NOT NULL,
    Amount money NOT NULL
)

My classes are:

public class Account
{
    public int Id { get; set; }
    public string AccountNumber { get; set; }
    public IList<Transaction> Transactions { get; set; }
}

public class Transaction
{
    public int Id { get; set; }
    public Account DebitAccount { get; set; }
    public Account CreditAccount { get; set; }
}

So the question is "How do I map the Transactions collection in the Account class using fluent NHibernate?"

What I want (for performance reasons) is for the accessing of the transactions collection to execute the query:

SELECT ...
FROM [Transaction]
WHERE DebitAccountID=@accountID OR CreditAccountID=@accountID

The important part there is the OR in the where clause.

So the code I need is:

public class AccountMap : SubclassMap<Account>
{
    public AccountMap()
    {
        Id(x => x.Id).Column("AccountID");
        Map(x => x.AccountNumber);
        HasMany(x => x.Transactions)
            // What goes here to explain this mapping to NHibernate?
            .Inverse().Cascade.AllDeleteOrphan()
            .Access.CamelCaseField();
    }
}

Note: I am aware that I could map the transactions as two separate collections, one for "debits" and the other for "credits". That is not an acceptable answer because of performance issues. In particular, there is actually a second type like this related to the account (resulting in even more queries) and mapping as two collections prevents the use of eager loading with Fetch(). The second type is PaymentScheduleLine which contains a plan of all the correct payment transactions over the life of the account. It is associated to the account in the same way as transaction i.e. PaymentScheduleLine has a DebitAccount and CreditAccount and Account has a PaymentSchedule collection. Typically, complex calculations involve the relationship between the transactions and the payment schedule.

I am aware of http://stackoverflow.com/questions/1615604/fluent-nhibernate-mapping-a-column-against-one-of-two-columns however his answer is to map to two different collections, which is not acceptable. I am also aware of http://stackoverflow.com/questions/8701020/nhibernate-map-a-collection-where-key-can-be-two-different-columns however that answer is to modify the domain which is not acceptable here. — Jeff Walker Code Ranger, Mar 19 '12 at 20:51
As a sidenote, this seems a strange structure for accounting data given that a typical accounting transaction can affect two OR MORE accounts. This of course means that further normalisation of your data structure would be required but the problem you are confronting here would go away. — Phil Degenhardt, Mar 19 '12 at 22:46
I understand that in many situations accounting involves what Martin Fowler calls "Multilegged" Transactions in his book "Analysis Patterns". We previously had that model and discovered it was too complex. We're following a DDD approach and in our particular domain all transactions only involve two accounts. When we had the database normalized into Transaction and Entry we actually had severe performance issues and the code was needlessly complex. It was very hard to answer a question like how much money was moved from account A to account B over a given timeframe. — Jeff Walker Code Ranger, Mar 20 '12 at 13:57
Can you add the "second type like this related to the asset" to your example? Thanks. — Daniel Schilling, Mar 22 '12 at 15:27
I've added some more info about the "second type like this related to the account" under the note. I didn't want to clutter up the main example. — Jeff Walker Code Ranger, Mar 28 '12 at 14:24

score 0 · Answer 1 · edited Jun 20 '20 at 09:12

I might revise my answer once I hear more about this "second type like this related to the asset," but for now...

Decide what's important and be willing to make sacrifices.

It sounds like performance is your primary concern here, right? You might have to relax your demands that the domain model not be changed or that the query look a certain way.

Batch queries using `.Future()` to avoid Cartesian products.

You're right that having collections named Debits and Credits could lead to performance problems. Maybe this is the query you're thinking of:

// BAD QUERY - DO NOT USE - cartesian product of rows - Debits X Credits.
var account = session.QueryOver<Account>()
    .Fetch(x => x.Debits).Eager
    .Fetch(x => x.Credits).Eager
    .Where(x => x.Id == accountId)
    .SingleOrDefault();

If the account had 1,000 transactions, 500 of them being debits, and 500 credits, then this query would result in 250,000 rows (500 * 500), which is clearly unacceptable!

You don't have to write the query that way, though. This one is better:

var futureAccount = session.QueryOver<Account>()
    .Fetch(x => x.Debits).Eager
    .Where(x => x.Id == accountId)
    .FutureValue();

session.QueryOver<Account>()
    .Fetch(x => x.Credits).Eager
    .Where(x => x.Id == accountId)
    .Future();

var account = futureAccount.Value;

Even though this is really two queries, it will be executed in one round-trip to the database, with only 1,000 rows returned (500 + 500). NHibernate will reuse the same Account instance for the two queries, and just populate whatever data was eagerly fetched. The result will be an Account with fully populated Debits and Credits.

Only fetch the data you need.

Now, 1,000 rows is still a lot. Are you absolutely sure you need to load all of the transactions for a given account? What are you using them for? For the scenario you mentioned, calculating how much money moved from account A to account B over a given timeframe, you would achieve much better performance if you wrote a query to calculate exactly that, or at the very least, only loaded the transactions you were actually interested in using a more specific query. A query like this...

var transactions = session.QueryOver<Transaction>()
    .Where(x => x.DebitAccount.Id == fromId
        && x.CreditAccount.Id == toId
        && x.Date >= startDate
        && x.Date < endDate.AddDays(1))
    .List();

... could easily cut the number of transactions you're working with from 1,000 to 20 or less.

Daniel, thanks for the valuable info. Unfortunately, I am in a challenging position. We are close to product release and we already have this domain model (we actually have the Debits and Credits collections and then union those collections to a Transactions collection). This is not a single place where one query is the problem (I was aware of, and we have avoided the cartesian product you mention). Rather we are looking at situations where the transactions are lazy-loaded. — Jeff Walker Code Ranger, Mar 28 '12 at 14:31
For example, during posting of new transactions we have to traverse accounts and transactions which are determined by the data in the file. There is no simple query to pre-load only the accounts of interest and there are too many accounts to pre-load them all. Your suggestion of using `Future` may help in specific cases (I was not aware of that), but if we could get nhibernate to understand the mapping it would be a great performance boon to our almost completed application and help us meet our deadline. — Jeff Walker Code Ranger, Mar 28 '12 at 14:33
UPDATE: I tried using Future() as you suggested to batch the 6 queries of objects needed for each account. That is, after I determined that a particular account is needed I load all objects related to it. That provided essentially no performance improvement on my development box. I thought it would. — Jeff Walker Code Ranger, Mar 28 '12 at 18:34

Mapping a collection joined by either of two columns with Fluent NHibernate?

1 Answers1

Decide what's important and be willing to make sacrifices.

Batch queries using .Future() to avoid Cartesian products.

Only fetch the data you need.

Batch queries using `.Future()` to avoid Cartesian products.