12.11.2008

Validating Domain Objects: Who's Responsible?

Overview

Whenever you persist data for an application, that data must adhere to certain constraints. Some of those constraints can be defined on the database schema itself (assuming a database is the data store): a category name must be unique, or an item's price must be defined (not null). Other constraints, business constraints, don't map as well to a database schema. Making sure a person's age is between 0 and 125, or that the date for a scheduled bill payment occurs in the future, matters to the application's flow but is not readily definable on your database schema. You don't want invalid data being persisted, so someone's got to step up to the plate and make sure that doesn't happen.
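To make the two business constraints above concrete, here is a minimal Java sketch. The class and method names (BusinessRules, isValidAge, isValidPaymentDate) are made up for illustration, and "today" is passed in as a parameter purely so the rule is easy to test.

```java
import java.time.LocalDate;

// Hypothetical helper holding the two business rules from the overview.
final class BusinessRules {
    private BusinessRules() {}

    // A person's age must be between 0 and 125, inclusive.
    static boolean isValidAge(int age) {
        return age >= 0 && age <= 125;
    }

    // A bill payment must be scheduled strictly after "today".
    // The caller supplies "today" so the rule does not depend on the system clock.
    static boolean isValidPaymentDate(LocalDate paymentDate, LocalDate today) {
        return paymentDate != null && paymentDate.isAfter(today);
    }
}
```

Neither rule is exotic, but both live more naturally in application code than in a column definition.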

Who's in Charge?

So where is the data validation going to occur? There are a few options. Validation can be done explicitly, either in the user interface itself or somewhere on the server, or it can be done implicitly by tossing the data to the database and seeing if the database has any complaints. You may also choose a combination of these approaches. Let's look at some of the pros and cons of each of these methods.

UI Validation

There are some definite benefits to putting data validation into the UI layer. You can do a large chunk of validation without ever having to make a call to the server. Because you never have to hit the server, validation can be done on every key event if needed (think determining password strength). Even if you don't go all out and validate on every keystroke, you can still see everything that is invalid about an object all at once; you aren't locked into any kind of 'fail on first' validation error that masks other validation problems.

There are also some definite drawbacks to this approach. The UI can't validate data uniqueness without going back to the server. Even if you've opted to load the entire domain model into the UI, you can't guarantee uniqueness in a concurrent application (unless you've made the mistake of using an ORM solution in a way that locks rows, never detaches objects and leaves sessions open indefinitely in transactions with serializable isolation). The point is, you NEED to go to the database to guarantee uniqueness. Another drawback to validating in the UI is that you are putting intelligence into the user interface that many (myself included) believe doesn't belong there. In a good design, the user interface should be as ignorant of the business logic as possible, and it should definitely be ignorant of the schema of the data store. By doing certain UI validations, you're breaking this rule. The UI needs to know that there is a NOT NULL column so that it can validate the existence of that column's value. Changes to business rules or the data store schema then require corresponding changes in UI components, creating more development work and increasing the likelihood that the UI and the rules fall out of sync.

'Server' Tier Validation

I can't hide my true feelings: this is my preferred location for data validation. It shares many of the same benefits as UI validation. A well-designed validation framework on the server can tell you everything that is wrong with the data, preventing the 'fail on first' validation issue. Client-to-server communication won't be as quick as all-client validation, but an intelligent communication protocol and well-designed inputs and outputs could make even the validate-on-keystroke scenario viable with server validation. Server validation has the added benefit of supporting multiple user interfaces. If you need to provide server access as an API, or if you are developing an application that will have customizable interfaces, you don't want to reinvent the validation wheel every time. With server validation you don't have to.
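As a rough illustration of avoiding 'fail on first', here is a minimal hand-rolled sketch in Java. It is not any particular framework's API; UserForm, UserValidator and the error messages are all hypothetical names chosen for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical domain object; the fields are illustrative only.
final class UserForm {
    final String firstName;
    final String lastName;
    final int age;

    UserForm(String firstName, String lastName, int age) {
        this.firstName = firstName;
        this.lastName = lastName;
        this.age = age;
    }
}

final class UserValidator {
    private UserValidator() {}

    // Runs every rule and returns all failures instead of stopping at the first one.
    static List<String> validate(UserForm user) {
        List<String> errors = new ArrayList<>();
        if (isBlank(user.firstName)) {
            errors.add("First name is required.");
        }
        if (isBlank(user.lastName)) {
            errors.add("Last name is required.");
        }
        if (user.age < 0 || user.age > 125) {
            errors.add("Age must be between 0 and 125.");
        }
        return errors;
    }

    // Treats null, empty and whitespace-only strings as missing.
    private static boolean isBlank(String s) {
        return s == null || s.trim().isEmpty();
    }
}
```

Because the validator returns a list, any number of user interfaces (or an API client) can call it and decide for themselves how to present the full set of problems.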

Like the UI, the server by itself can't guarantee uniqueness. This seems to be the consistent wrinkle in otherwise clean validation framework solutions. You could query the data store prior to inserting or updating your data, but do you want to incur the extra query for every data manipulation? Maybe you do, maybe you don't; it's your call. Server validation will not provide as rich a user experience as validating in the UI itself, at least not as easily. If you are doing something that visually rich, exciting and UI-centric, then my recommendation would be to enhance server validation with the necessary UI validation, instead of picking UI validation OVER server validation.
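The sketch below shows why a pre-insert query alone is not a guarantee. It uses an in-memory stand-in for a table with a unique constraint (UsernameStore and UniquenessDemo are invented for this example, not a real data access API) and interleaves two requests by hand so the outcome is deterministic.

```java
import java.util.HashSet;
import java.util.Set;

// In-memory stand-in for a table with a unique constraint on username.
final class UsernameStore {
    private final Set<String> usernames = new HashSet<>();

    // The "extra query": is this username already taken?
    synchronized boolean exists(String username) {
        return usernames.contains(username);
    }

    // Mimics the unique constraint: returns false when the name is already present.
    synchronized boolean insert(String username) {
        return usernames.add(username);
    }
}

final class UniquenessDemo {
    private UniquenessDemo() {}

    // Simulates two requests that both run their precheck before either inserts.
    // Returns { precheck passed for request B, insert succeeded for request B }.
    static boolean[] raceOutcome() {
        UsernameStore store = new UsernameStore();
        boolean freeForA = !store.exists("jsmith");
        boolean freeForB = !store.exists("jsmith"); // still free: A has not inserted yet
        boolean insertedA = freeForA && store.insert("jsmith");
        boolean insertedB = freeForB && insertedA && store.insert("jsmith");
        return new boolean[] { freeForB, insertedB };
    }
}
```

Request B passes its precheck and is still rejected at insert time, so the precheck improves the error message in the common case but the store's own constraint remains the final word.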

Database Validation

Obviously I've picked a favorite approach already, but this solution is still worth mentioning. Why? There may be situations where it is an appropriate solution. Please don't ask me what they are. The beauty of a database solution is that you don't really need to do any extra work. If you try to save data that isn't valid according to the schema, it simply won't work. You could pass the resulting error back up the stack to the UI as-is, or parse it, make it a little nicer and THEN send it up the stack to the UI. How does it deal with uniqueness? The same way it deals with any other invalid data: the save fails. No precheck is necessary to see if the value already exists.
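Making the error "a little nicer" might look something like the following Java sketch. DbErrorTranslator and its messages are hypothetical. The sketch assumes the JDBC driver either throws java.sql.SQLIntegrityConstraintViolationException or reports an SQLSTATE in class 23 (the standard class for integrity constraint violations); drivers vary, so treat that check as an assumption to verify against your own database.

```java
import java.sql.SQLException;
import java.sql.SQLIntegrityConstraintViolationException;

// Hypothetical translator from low-level JDBC errors to user-facing messages.
final class DbErrorTranslator {
    private DbErrorTranslator() {}

    static String toUserMessage(SQLException e) {
        String state = e.getSQLState();
        // Assumption: constraint failures surface as this exception subclass
        // or with an SQLSTATE beginning with "23".
        boolean constraintViolation =
                e instanceof SQLIntegrityConstraintViolationException
                || (state != null && state.startsWith("23"));
        if (constraintViolation) {
            return "The data could not be saved because it violates a database constraint.";
        }
        return "The data could not be saved because of an unexpected database error.";
    }
}
```

Saying which column or rule was violated would mean parsing vendor-specific messages or constraint names, which is where this approach starts to cost real effort.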

A big drawback to this approach is that it doesn't readily validate business logic. You CAN do some business logic validation in the database; it just requires tighter database integration, more complex SQL statements and probably some good, old-fashioned vendor-specific functionality. If you're going to go through all that trouble, you may as well put a validation framework in place. Another big drawback is that the database will only report one error at a time: the first one it comes across. If you try to save a user without a first name or a last name (both required), you'll get a message to provide a first name. Send it again with a first name and you'll get another message asking for a last name. By the time you've sent your third try, someone's taken the username you wanted. Not an awesome user experience.

Wrapup

Data validation is an important part of a quality data-driven application. There are different ways to do validation, and an argument can be made for each of them depending on the situation. Overall, I think a server-based validation approach is the best way to go, and it can be enhanced with some well-placed UI validation in certain cases. Database validation isn't really validation so much as it is hopeful laziness. A future article will probably cover integrating and extending Hibernate Validator, Hibernate's validation framework.
