Sunday, November 18, 2007

Repository Implementation (DDD/NHibernate)

Christian Bauer (Hibernate bod) has written a post about the repository pattern. I think the post is a little misinformed and concentrates too much on one possible way of implementing the pattern. However, it does show that people are still having trouble working out how to implement repositories, so I thought I would explain how we do it.

I actually think the pattern is relatively simple, especially when used with NHibernate, so here goes...

1. Accessing From The Domain
I don't access the repositories from the domain; with NHibernate I rarely see the need. I don't actually want to couple them because:

  1. Complexity - Having the domain classes call the repositories makes it harder to understand them.
  2. Testing - Most DDD practitioners seem to focus on state testing and that's certainly my preference, but if you're calling repositories from the domain those tests become (as I see it) layer-crossing tests and testing becomes more difficult.

I don't want to have to mock out repositories when testing the domain; for me that's reason enough to avoid the coupling. There isn't too much about this on the Web other than one forum post and bits and pieces on people's blogs, but making your domain testable in isolation is (in my view) well worth the effort.
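
As a rough illustration of the kind of state test I mean, here is a minimal sketch. The Order and OrderLine classes and the OrderStateTests check are hypothetical, made up for this example; the only point is that nothing in it needs a repository or a mock.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical domain classes, for illustration only.
public class OrderLine
{
    public OrderLine(string product, decimal unitPrice, int quantity)
    {
        Product = product;
        UnitPrice = unitPrice;
        Quantity = quantity;
    }

    public string Product { get; private set; }
    public decimal UnitPrice { get; private set; }
    public int Quantity { get; private set; }
    public decimal Total { get { return UnitPrice * Quantity; } }
}

public class Order
{
    private readonly List<OrderLine> lines = new List<OrderLine>();

    public IList<OrderLine> Lines { get { return lines.AsReadOnly(); } }
    public decimal Total { get { return lines.Sum(l => l.Total); } }

    // Behaviour lives on the aggregate root and touches no repository.
    public void AddLine(string product, decimal unitPrice, int quantity)
    {
        if (quantity <= 0)
            throw new ArgumentOutOfRangeException("quantity");
        lines.Add(new OrderLine(product, unitPrice, quantity));
    }
}

public static class OrderStateTests
{
    // A state test: act on the object, then inspect its state.
    public static void AddingLinesUpdatesTotal()
    {
        var order = new Order();
        order.AddLine("Widget", 2.50m, 4);
        order.AddLine("Gadget", 10m, 1);

        if (order.Lines.Count != 2) throw new Exception("expected two lines");
        if (order.Total != 20m) throw new Exception("expected a total of 20");
    }
}
```

If Order had to call a repository inside AddLine, this test would need a mock or a database before it could even construct its subject.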

2. Associations
I tend to focus on modeling the most important associations in the domain.

Within an aggregate this is simple, you can always navigate from the root to the parts.

Where I want to associate one aggregate with another I'll put the most important association in the domain model and (optionally) handle the inverse using a repository e.g.:

IList orders = customer.Orders;
Customer customer = customerRepository.GetForOrder(order);

Of course, sometimes you might want a bidirectional association here, if it wasn't costing you too much in complexity or coupling.

Note that in many cases people focus on modeling the association from the one to the many, e.g. Order to Customer. This sometimes works, sometimes it doesn't...do whatever makes sense. I try to do it without simply putting in the associations that make persistence simplest.

3. Implementation With NHibernate
NHibernate makes cascading and lazy loading simple. In the mapping files I don't lazy load within an aggregate, though I do lazy load between aggregates; cascading only goes as far as the aggregate boundary.

The implementation becomes ridiculously simple; for our key repository it's basically this:

public class CustomerRepository : Repository
{

//..any custom queries
}


The base class is doing all the heavy lifting; for simple cases all I need to say is that the key to the Customer table is an int (using a generic parameter, which is missing from the code because Blogger is cutting it out :)).

The base class is also very simple: it has methods for SaveOrUpdate/GetById/GetAll, and we have an extra IDeletionRepository that we can add on, which just has a Delete method. We also have a RetrievalRepository base class for completely read-only cases.

I am coupling the implementation of my repositories to NHibernate but that has so far not proved to be an issue. What we do strive to avoid is putting anything NHibernate-specific on the interface of the repository, not just to follow the pattern but because it's sensible, as we may not always use NHibernate for all of the queries. So I avoid turning the repository into a leaky abstraction, which is what would happen if, for example, callers passed an ICriteria into one of the queries.
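
To make that shape concrete, here is a minimal sketch. The method names come from the description above, but the signatures, the IRepository interface name and the generic parameters are my assumptions for this example, and the in-memory base class is only a stand-in so that the sketch runs without a database. The real base class delegates to an NHibernate session.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Assumed shape: the method names are from the post, the signatures are guesses.
public interface IRepository<TAggregate, TKey>
{
    void SaveOrUpdate(TAggregate aggregate);
    TAggregate GetById(TKey id);
    IList<TAggregate> GetAll();
}

// Opt-in extra for aggregates that may be deleted.
public interface IDeletionRepository<TAggregate>
{
    void Delete(TAggregate aggregate);
}

// In-memory stand-in for the real NHibernate-backed base class.
public abstract class InMemoryRepository<TAggregate, TKey> : IRepository<TAggregate, TKey>
    where TAggregate : class
{
    protected readonly Dictionary<TKey, TAggregate> Store = new Dictionary<TKey, TAggregate>();

    protected abstract TKey KeyOf(TAggregate aggregate);

    public void SaveOrUpdate(TAggregate aggregate)
    {
        Store[KeyOf(aggregate)] = aggregate;
    }

    public TAggregate GetById(TKey id)
    {
        TAggregate found;
        return Store.TryGetValue(id, out found) ? found : null;
    }

    public IList<TAggregate> GetAll()
    {
        return Store.Values.ToList();
    }
}

// Hypothetical aggregate root.
public class Customer
{
    public int Id { get; set; }
    public string Surname { get; set; }
}

// The concrete repository only states its types and adds custom queries.
public class CustomerRepository : InMemoryRepository<Customer, int>, IDeletionRepository<Customer>
{
    protected override int KeyOf(Customer customer) { return customer.Id; }

    public void Delete(Customer customer) { Store.Remove(customer.Id); }

    public IList<Customer> GetBySurname(string surname)
    {
        return Store.Values.Where(c => c.Surname == surname).ToList();
    }
}
```

Note that nothing NHibernate-specific appears in the public signatures, which is the property the real interfaces are meant to keep.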

3.1 Testing
Testing the basic Save/Update/Concurrency also becomes mickey mouse as we have a base class called AggregateRootPersistenceTestBase that does the heavy lifting.

As an example this base class has a TestSaving method that delegates to a SaveTestHelper, this class does the following:

1) Create Repository - Create an instance of the repository under test.
2) Create Aggregate - Create an instance of the aggregate we are testing the persistence of.
3) Save - Save the aggregate to the database (save then flush).
4) Reload - Evict the saved aggregate from the session (or Clear the session) and reload it from the database. We could instead use a separate ISession; either way we are ensuring that the reloaded object is fresh from the DB (not from the first-level cache).
5) Compare - Compare the two objects. This works because I've written an ObjectHierarchyComparer that can be given two objects and will use reflection over their properties to ensure they match. It navigates right down the hierarchy until it gets to built-in primitives to compare, so it can handle very complex object structures.
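
The five steps above can be sketched roughly as follows. This is not our actual helper: the ISaveTestRepository interface, the fake repository and the Customer class are hypothetical, and the comparison is passed in as a third delegate purely to keep the sketch self-contained. The fake "reloads" by building a new instance, standing in for evicting and re-reading through NHibernate.

```csharp
using System;
using System.Collections.Generic;

// Assumed contract for the sketch; the real helper works with NHibernate-backed repositories.
public interface ISaveTestRepository<TAggregate>
{
    object SaveAndFlush(TAggregate aggregate);  // returns the assigned identifier
    TAggregate ReloadFresh(object id);          // must bypass the first-level cache
}

public static class SaveTestHelper
{
    public static void TestSaving<TAggregate>(
        Func<ISaveTestRepository<TAggregate>> createRepository,
        Func<TAggregate> createAggregate,
        Func<TAggregate, TAggregate, string[], IList<string>> findDifferences,
        params string[] ignoredProperties)
        where TAggregate : class
    {
        ISaveTestRepository<TAggregate> repository = createRepository(); // 1) create repository
        TAggregate original = createAggregate();                         // 2) create aggregate
        object id = repository.SaveAndFlush(original);                   // 3) save
        TAggregate reloaded = repository.ReloadFresh(id);                // 4) reload

        if (ReferenceEquals(original, reloaded))
            throw new InvalidOperationException("Reload returned the cached instance.");

        // 5) compare
        IList<string> differences = findDifferences(original, reloaded, ignoredProperties);
        if (differences.Count > 0)
            throw new Exception("Reloaded aggregate differs at: " + string.Join(", ", differences));
    }
}

// Fakes used only to exercise the helper without a database.
public class Customer
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public class FakeCustomerRepository : ISaveTestRepository<Customer>
{
    private readonly Dictionary<int, string> rows = new Dictionary<int, string>();

    public object SaveAndFlush(Customer customer)
    {
        customer.Id = rows.Count + 1;
        rows[customer.Id] = customer.Name;
        return customer.Id;
    }

    public Customer ReloadFresh(object id)
    {
        int key = (int)id;
        return new Customer { Id = key, Name = rows[key] }; // always a new instance
    }
}
```

The reference check before the comparison guards against the easiest mistake to make here, which is comparing the saved object with itself because it came back out of the session cache.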

This is very simple, all you need to do to use SaveTestHelper is pass in two delegates, one that creates the repository and one that creates the aggregate. You can optionally pass in a string[] of property names that the ObjectHierarchyComparer should ignore (such as properties that get default values from the DB, because the reloaded object will have different values for them).
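
For the comparison itself, a reflection-based walk with a list of ignored property names might look roughly like this. It is a hypothetical reconstruction rather than our ObjectHierarchyComparer: the class name, the Differences method and the choice of leaf types are assumptions, it does no cycle detection (so a bidirectional association would recurse forever), and it treats a proxy type and its real type as different.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;

// Hypothetical sketch; not the real ObjectHierarchyComparer.
public static class HierarchyComparer
{
    // Returns the paths of the values that differ (an empty list means a match).
    public static List<string> Differences(object expected, object actual, params string[] ignoredProperties)
    {
        var differences = new List<string>();
        Compare(expected, actual, "root", new HashSet<string>(ignoredProperties), differences);
        return differences;
    }

    private static void Compare(object expected, object actual, string path,
                                HashSet<string> ignored, List<string> differences)
    {
        if (expected == null || actual == null)
        {
            if (!ReferenceEquals(expected, actual)) differences.Add(path);
            return;
        }

        Type type = expected.GetType();
        if (type != actual.GetType()) { differences.Add(path); return; }

        // Leaf values: stop descending and compare directly.
        if (type.IsPrimitive || type.IsEnum || type == typeof(string)
            || type == typeof(decimal) || type == typeof(DateTime) || type == typeof(Guid))
        {
            if (!expected.Equals(actual)) differences.Add(path);
            return;
        }

        // Collections: compare element by element, in order.
        if (expected is IEnumerable)
        {
            List<object> left = ((IEnumerable)expected).Cast<object>().ToList();
            List<object> right = ((IEnumerable)actual).Cast<object>().ToList();
            if (left.Count != right.Count) { differences.Add(path + ".Count"); return; }
            for (int i = 0; i < left.Count; i++)
                Compare(left[i], right[i], path + "[" + i + "]", ignored, differences);
            return;
        }

        // Anything else: recurse into its public, readable, non-indexed properties.
        foreach (PropertyInfo property in type.GetProperties(BindingFlags.Public | BindingFlags.Instance))
        {
            if (!property.CanRead || property.GetIndexParameters().Length > 0) continue;
            if (ignored.Contains(property.Name)) continue;
            Compare(property.GetValue(expected, null), property.GetValue(actual, null),
                    path + "." + property.Name, ignored, differences);
        }
    }
}
```

Returning the differing paths rather than a bool makes a failing persistence test say which property did not round-trip.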

This is all very reusable. We then write tests for the aggregate roots in a few different scenarios:

  1. Unpopulated - Create an instance and save it.
  2. Populated - Move it into a non-default state (if it has lifecycle), populate the entire aggregate.
  3. FullyPopulated - Same but with associations to other aggregates populated.
We also test other parts of the aggregate; I can provide more details if it's useful...

3.2 Custom Queries
Any custom queries are written in the repositories, preferably in HQL/ICriteria so we can refactor the DB or code (made harder if you encode SQL). You could actually put the HQL into a named query in the mapping file if you wanted to.

3.3 Eager Fetching
We haven't really dealt with this issue fully yet but Udi Dahan has posted about it. I don't think I'd use his implementation but I do like the idea of fetching strategies and I'd probably choose the appropriate one in a coordination/application layer.

4. Specifications
We probably underuse specifications. They can be useful if you're getting lots of custom queries on the repositories that are all just for specific cases:

public IList GetByFullName(...);
public IList GetByFirstAndSurname(...);
public IList GetBySurname(...);

You could encapsulate these name queries in one or more specifications and pass them in:

public IList GetByName(CustomerNameSpecification specification);

The problem is converting your (domain) CustomerNameSpecification into a query. I don't want the class itself to be talking in terms of SQL/HQL/ICriteria, so the choices that we've thought of are:
  1. Conditional - A single CustomerNameSpecification that the CustomerRepository picks apart. For example, the CustomerNameSpecification would have a FirstName property and if it's not null the repository adds an appropriate ICriteria to the query.
  2. Switch - Subclasses of CustomerNameSpecification (such as CustomerFirstNameSpecification) and a switch in the repository that calls an appropriate method to create the query for each type of specification.
  3. Visitor - This is one of the GoF patterns. Subclasses of CustomerNameSpecification (such as CustomerFirstNameSpecification) each "accept" a QueryVisitor, and the power of double dispatch is then used to ensure the appropriate method is called on the visitor.
None of these options are great, though the third is definitely the least awful.
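
A minimal sketch of the Visitor option, under some assumptions: the visitor interface name, the CustomerSurnameSpecification subclass, and the idea of building an HQL string plus a parameter dictionary are all made up for illustration. A real QueryVisitor would more likely build an NHibernate query object.

```csharp
using System;
using System.Collections.Generic;

// All member names here are hypothetical; the post only describes the approach.
public interface ICustomerNameSpecificationVisitor
{
    void Visit(CustomerFirstNameSpecification specification);
    void Visit(CustomerSurnameSpecification specification);
}

// Domain-side classes: they know nothing about SQL, HQL or ICriteria.
public abstract class CustomerNameSpecification
{
    public abstract void Accept(ICustomerNameSpecificationVisitor visitor);
}

public class CustomerFirstNameSpecification : CustomerNameSpecification
{
    public CustomerFirstNameSpecification(string firstName) { FirstName = firstName; }
    public string FirstName { get; private set; }

    public override void Accept(ICustomerNameSpecificationVisitor visitor)
    {
        visitor.Visit(this); // overload picked from the static type of 'this'
    }
}

public class CustomerSurnameSpecification : CustomerNameSpecification
{
    public CustomerSurnameSpecification(string surname) { Surname = surname; }
    public string Surname { get; private set; }

    public override void Accept(ICustomerNameSpecificationVisitor visitor)
    {
        visitor.Visit(this);
    }
}

// Lives alongside the repository, so the query language stays out of the domain.
public class QueryVisitor : ICustomerNameSpecificationVisitor
{
    public QueryVisitor() { Parameters = new Dictionary<string, object>(); }

    public string Query { get; private set; }
    public IDictionary<string, object> Parameters { get; private set; }

    public void Visit(CustomerFirstNameSpecification specification)
    {
        Query = "from Customer c where c.FirstName = :firstName";
        Parameters["firstName"] = specification.FirstName;
    }

    public void Visit(CustomerSurnameSpecification specification)
    {
        Query = "from Customer c where c.Surname = :surname";
        Parameters["surname"] = specification.Surname;
    }
}
```

Inside GetByName the repository would create a QueryVisitor, call specification.Accept(visitor), and then run whatever the visitor built. Adding a new specification subclass forces a new Visit overload, so a missing case shows up at compile time rather than as a fall-through in a switch.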

My hope is that in the future extension methods and LINQ should make this easier.

5. Performance
I've discussed eager queries but sometimes domain classes just aren't appropriate. For example, we have cases where we display lists of objects in the GUI, such as a grid of Orders that also displays the Customer name.

Loading each Order domain object and the associated Customer is going to be deeply inefficient, so for those cases we map separate presentation (or as we call them, info) objects.

We'd thus map an OrderInfo class to a database view that would bring in the information that is needed from whatever tables are involved. These classes are loaded using Loaders (not repositories) to emphasize that they are not domain classes. We also only create these classes if we are sure that they are needed, so you only find them where we have proved that using the domain classes was going to cause performance issues.

It is worth emphasizing that this is not a presentation model; these are read-only classes that are not in any way associated with the aggregates in the domain.
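
As a sketch of what such a class and its loader could look like: the OrderInfo properties, the IOrderInfoLoader interface and its LoadForGrid method are assumptions for this example, and the in-memory loader stands in for one that reads rows from the mapped view.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Read-only info class; in the real system it would be mapped to a database view.
public class OrderInfo
{
    public OrderInfo(int orderId, DateTime placedOn, string customerName)
    {
        OrderId = orderId;
        PlacedOn = placedOn;
        CustomerName = customerName;
    }

    public int OrderId { get; private set; }
    public DateTime PlacedOn { get; private set; }
    public string CustomerName { get; private set; } // flattened in from the Customer table
}

// Called a loader, not a repository, to signal that it returns no domain objects.
public interface IOrderInfoLoader
{
    IList<OrderInfo> LoadForGrid();
}

// In-memory stand-in so the sketch runs without a database.
public class InMemoryOrderInfoLoader : IOrderInfoLoader
{
    private readonly List<OrderInfo> rows;

    public InMemoryOrderInfoLoader(IEnumerable<OrderInfo> rows)
    {
        this.rows = rows.ToList();
    }

    public IList<OrderInfo> LoadForGrid()
    {
        return rows.OrderBy(r => r.PlacedOn).ToList();
    }
}
```

There is no save or update method anywhere on the loader, which keeps these classes from drifting into being a second domain model.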


6 comments:

  1. Great post! We are heading down the same path as you but you're miles ahead. We are also using a repository base (very basic), so it would be very interesting to see your implementation. Any chance you could share it on your blog? (Your ObjectHierarchyComparer would of course also be interesting.) Anyway, great blog, I will definitely continue to read it!

  2. Hi,

    Thanks for your comments and glad to hear you're doing DDD!

    I can't post up our code as it's from my current project but I had started to do a DDD example (http://www.codeplex.com/domaindrivendesign) and was going to try to use the same approach on it.

    I'll just have to make the time to work on it!

    Ta,

    Colin

  3. wizzsm@hotmail.com, 6:37 pm

    I added ORACLE support to the NHibernateEg/Tutorial1A. I sent diff modules to 'kpixel@users.sourceforge.net', but don't know if they will actually go anywhere.

    Steve.

  4. =lionel, 5:35 am

    Post is a bit old but still relevant to me.

    With your implementation of CustomerRepository, how do you update an Entity in the same Aggregate? Say Customer and Address belong in one Aggregate with Customer as the root. How do you update a particular Address for a Customer?

    Do you have something like customerRepository.UpdateAddress(addressData); ?

    thanks

  5. Lionel!

    If customer is the root you should update the address using a method on your customer domain object:

    var customer = custRepo.GetById(1);
    customer.UpdateAdress(adressData);

    custRepo.Update(customer);

    /Andreas

  6. Thanks Andreas.

    That is what I'm doing but it bothers me that I have to get the customer first to update an address.

    Really all I want to do is session.Update(addressData). Since the NHibernate Session is hidden in repositories I would need a method in the repository to do that.

    It just doesn't make sense to me that I have to effectively do a SELECT to then do an UPDATE.

    I know I shouldn't think in database terms but at the end of the day that is what happens.
