Saturday 29 November 2008

In Defense of Rolling-Your-Own Data Access Layer

I would always say that if a library already exists to implement a particular piece of functionality then you should use it. There are so many great open source components, implementing pretty much everything you can imagine, that you would be foolish not to reuse them. The only place where I would go against my own advice is at the data access layer of an application.

Libraries that implement data access components are available in abundance. You can get both commercial and open source offerings for any language that you care to think of. So why, with all this available code, could it be a good idea to go your own way when it comes to data access?

The first reason is performance. A generic data access layer has to interrogate metadata, create objects and reflect on data types at run time. Because an application-specific data access layer can skip that work, you will always be able to implement one that is faster than an off-the-shelf solution. A quick search on Google will turn up a number of studies comparing raw performance versus performance through a data access library. (Here is a typical one comparing performance using various persistence technologies available for the Java language.)
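To show where that extra work comes from, here is a minimal sketch in Python using the standard sqlite3 module. The Customer class and both function names are made up for illustration and do not come from any real library. The generic mapper has to read the cursor metadata and set attributes by name for every row, while the application-specific mapper already knows the columns it asked for. The sketch makes no timing claim; it only shows the difference in the amount of work done per row.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class Customer:
    """Hypothetical application object, used only for this illustration."""
    id: int
    name: str


def map_generic(cursor, cls):
    """Generic mapping: inspect cursor metadata and set attributes by name at run time."""
    column_names = [col[0] for col in cursor.description]
    objects = []
    for row in cursor.fetchall():
        obj = cls.__new__(cls)  # create the object without knowing its constructor
        for column, value in zip(column_names, row):
            setattr(obj, column, value)
        objects.append(obj)
    return objects


def map_customers(cursor):
    """Application-specific mapping: the column order is known, so no metadata lookup."""
    return [Customer(id=row[0], name=row[1]) for row in cursor.fetchall()]
```

Both functions return the same objects for a query such as SELECT id, name FROM customers; the second simply gets there with less machinery.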

The second reason is developer laziness. A data access layer lets developers forget that they are interacting with a database, and this can lead to very bad code. For example, developers may perform multi-way joins between tables in application code rather than letting the database take the strain. I don't think this would happen if the developers explicitly understood they were interacting with the database.
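The join problem is easy to picture with a small sketch, again in Python with the standard sqlite3 module. The customers and orders tables and the function names are hypothetical, invented purely for this example. The first query function pulls both tables into memory and joins them in application code; the second sends one query and lets the database do the join.

```python
import sqlite3


def make_demo_db():
    """Build a small in-memory database (hypothetical schema, for illustration only)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE orders (
            id INTEGER PRIMARY KEY,
            customer_id INTEGER NOT NULL REFERENCES customers(id),
            total REAL NOT NULL
        );
        INSERT INTO customers (id, name) VALUES (1, 'Ada'), (2, 'Grace');
        INSERT INTO orders (id, customer_id, total) VALUES
            (1, 1, 10.0), (2, 1, 5.5), (3, 2, 20.0);
        """
    )
    return conn


def order_totals_joined_in_app(conn):
    """Anti-pattern: fetch both tables whole and join them in application code."""
    customers = conn.execute("SELECT id, name FROM customers").fetchall()
    orders = conn.execute("SELECT customer_id, total FROM orders").fetchall()
    result = []
    for customer_id, name in customers:
        for order_customer_id, total in orders:
            if order_customer_id == customer_id:
                result.append((name, total))
    return sorted(result)


def order_totals_joined_in_db(conn):
    """Preferred: one query, and the database performs the join."""
    return conn.execute(
        """
        SELECT c.name, o.total
        FROM customers AS c
        JOIN orders AS o ON o.customer_id = c.id
        ORDER BY c.name, o.total
        """
    ).fetchall()
```

Both functions return the same rows for this tiny data set, but the first transfers every row of both tables to the application and compares every customer with every order, which is exactly the work a database is built to do for you.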

One of the arguments against the roll-your-own route is: why reinvent the wheel when there are plenty of implementations available? Well, there is NOT an implementation available that specifically meets the access requirements of your application and your database. Other libraries provide specific functionality that does not depend on aspects of your application. The data access layer, by contrast, depends on your application code and the structure of your database. For this reason, I think the data access layer is a special case.

One of the reasons that there are so many implementations around is that it is not that difficult to write a data access layer, especially one for a specific application. There are good books available to get you started (Data Access Patterns by Clifton Nock or Patterns of Enterprise Application Architecture by Martin Fowler). So go ahead and be brave: roll your own. It is not as hard as you think and your application will benefit enormously.
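To give a feel for how little code is involved, here is a minimal sketch of an application-specific data access class, written in Python with the standard sqlite3 module. The Customer class, the CustomerGateway class and the customers table are all hypothetical names chosen for this example. It is a starting point under those assumptions, not a complete design.

```python
import sqlite3
from dataclasses import dataclass
from typing import Optional


@dataclass
class Customer:
    """Hypothetical application object, used only for this illustration."""
    id: int
    name: str


class CustomerGateway:
    """Hypothetical application-specific data access class for one table."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self._conn = conn
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS customers ("
            "id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
        )

    def add(self, name: str) -> Customer:
        """Insert a customer and return it with its database-assigned id."""
        cursor = self._conn.execute(
            "INSERT INTO customers (name) VALUES (?)", (name,)
        )
        self._conn.commit()
        return Customer(id=cursor.lastrowid, name=name)

    def find_by_id(self, customer_id: int) -> Optional[Customer]:
        """Return the matching customer, or None if there is no such row."""
        row = self._conn.execute(
            "SELECT id, name FROM customers WHERE id = ?", (customer_id,)
        ).fetchone()
        return Customer(id=row[0], name=row[1]) if row else None
```

The SQL is in plain sight, so nobody using this class can forget that a database sits behind it, and each method does only what this application needs.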