12.11.2008

Entity Relationships and DTOs: A Better Way (Part 2)

Recap

In the previous article we looked at a scalability issue that can arise from using a findById method to establish entity relationships for a collection of transfer objects. Our solution was to look up the collection of related entities (Role) with a single find, store those entities in a map keyed off the entity's primary key, and then retrieve the entity from the map when building the collection of enclosing entities (User) from the transfer objects (UserDto). It's a good solution.

Good But Not Great

The previous solution isn't appropriate in all scenarios, as you're about to see. Consider the following entity:
public class User {
    private Long id;
    private String firstName;
    private String lastName;
    private String email;
    private Role role; // entity relationship
    private ContactInfo contactInfo; // entity relationship #2
    private Collection<User> dependents; // entity relationship #3

    // getters/setters
    ...
}

We've added a new reference to the ContactInfo entity, which we've specifically chosen to be an entity rather than a component for management and lookup reasons. We've also added a collection of User entities to represent dependent Users (perhaps our fictitious system is of interest to the IRS). The DTO for this entity might now look something like this:
public class UserDto {
    public Long id;
    public String firstName;
    public String lastName;
    public String email;
    public Long roleId;
    public Long contactInfoId;
    public Collection<Long> dependentIds;
}

To use our previous solution, we would do the following:
  • Get set of Role entities into a map
  • Get set of ContactInfo entities into a map
  • Get set of dependent User entities into a map
  • Create Users from UserDtos, pulling from the maps as necessary

The 9 extra lines of code are now 27 extra lines of code, unless you built the generic utility method, of course. You DID build the generic utility method, didn't you? For our collection of 100 transfer objects, we will have run 3 SELECTs, which is not as good as 1 but still better than the 300 we would have run otherwise. For a collection of 100 elements, or even 1000 elements, 3 additional statements against the database don't seem so bad. The problem is that managing a single User at a time (a likely scenario) still requires those same 3 SELECTs.
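In case you didn't build it, here is a minimal sketch of what such a generic utility method might look like. The class name EntityIndex and the method name indexById are made up for illustration; the version from the previous article may well differ.

```java
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical helper: the names here are illustrative, not from the original article.
public class EntityIndex {

    /**
     * Builds a map from identifier to entity so that each related entity
     * can be looked up in memory instead of with its own SELECT.
     *
     * @param entities the entities returned by a single bulk find
     * @param idOf     extracts the primary key from an entity
     */
    public static <K, E> Map<K, E> indexById(Collection<E> entities, Function<E, K> idOf) {
        Map<K, E> byId = new HashMap<>();
        for (E entity : entities) {
            byId.put(idOf.apply(entity), entity);
        }
        return byId;
    }
}
```

With something like this in place, each of the three lookups collapses to one line, for example indexById(roles, Role::getId), where roles is whatever collection your bulk find returned.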

Taking Advantage of Proxies

It would be nice if we could eliminate the extra SELECT statements completely, but how? The entities don't have references to primary keys; they have references to full-fledged entity instances (and this is a good thing). But the database only needs certain keys to represent the entity relationships, not the whole entity structure, so any UPDATE or INSERT generated by Hibernate will only need the foreign key information from the related entities.
We need instances of the related entities that have their primary keys set AND that are managed by Hibernate's persistence context. We need the Session.load() method.

Session.load() Solution

Here's the code to use the Session.load() method. An explanation follows the code:
for (UserDto dto : userDtos) {
    User u = new User();
    // UserDto exposes public fields, so read them directly
    u.setId(dto.id);
    u.setFirstName(dto.firstName);
    u.setLastName(dto.lastName);
    u.setEmail(dto.email);
    u.setRole(roleDao.loadReference(dto.roleId));
    u.setContactInfo(contactInfoDao.loadReference(dto.contactInfoId));
    u.setDependents(userDao.loadReferences(dto.dependentIds));

    // add u to list and save in bulk after loop
}

The loadReference method in the DAO would look something like this:
public Role loadReference(final Long id) {
    return (Role) getSession().load(Role.class, id);
}
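The loop above also calls a plural loadReferences method on the User DAO, which isn't shown. One possible shape for it is sketched below. To keep the sketch runnable without Hibernate on the classpath, the single-id loader is passed in as a function; the class name ReferenceLoading and that parameter are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.function.Function;

// Hypothetical helper: a sketch of a plural loadReferences, not the article's actual DAO code.
public class ReferenceLoading {

    /**
     * Turns a collection of identifiers into a list of references by applying
     * the single-id loader to each one. In a real DAO the loader could be
     * something like: id -> (User) getSession().load(User.class, id)
     */
    public static <T> List<T> loadReferences(Collection<Long> ids, Function<Long, T> loadReference) {
        List<T> references = new ArrayList<>();
        if (ids == null) {
            return references; // a DTO without dependents yields an empty list
        }
        for (Long id : ids) {
            references.add(loadReference.apply(id));
        }
        return references;
    }
}
```

Because each call to the loader only creates a proxy, this loop should not issue any SELECTs, no matter how many ids are passed in.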

The loadReference() methods in the other DAOs would look eerily similar. This load method will create a proxy instance of the Role and put it into the persistence context, all without actually hitting the database. This instance can now be used wherever that Role instance is needed. The getter for the primary key will return the id value that the proxy was initialized with, so Hibernate can use these proxy instances to generate the database statements. We now have a solution with simplified coding that eliminates the need to hit the database before saving an entity relationship.
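To make the proxy behavior concrete, here is a small simulation that uses a plain JDK dynamic proxy. This is NOT how Hibernate builds its proxies (it generates runtime subclasses of the entity class), and the Role interface, RoleRow class, and fetcher parameter are all invented for the sketch. It only illustrates the idea: the identifier is answered by the proxy itself, and the "database" is touched only when something else is asked for.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.function.LongFunction;

// Illustrative simulation only; all names are hypothetical.
public class LazyReferenceSketch {

    /** JDK proxies need an interface, so Role is an interface in this sketch. */
    public interface Role {
        Long getId();
        String getName();
    }

    /** Stands in for a fully loaded row. */
    public static final class RoleRow implements Role {
        private final Long id;
        private final String name;

        public RoleRow(Long id, String name) {
            this.id = id;
            this.name = name;
        }

        public Long getId() { return id; }
        public String getName() { return name; }
    }

    /** Returns a proxy that knows its id and fetches the real row only on demand. */
    public static Role loadReference(final Long id, final LongFunction<Role> fetcher) {
        InvocationHandler handler = new InvocationHandler() {
            private Role target; // stays null until a non-id method is called

            @Override
            public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
                if (method.getName().equals("getId")) {
                    return id; // answered without fetching anything
                }
                if (target == null) {
                    target = fetcher.apply(id); // simulated SELECT, at most once
                }
                return method.invoke(target, args);
            }
        };
        return (Role) Proxy.newProxyInstance(
                Role.class.getClassLoader(), new Class<?>[] { Role.class }, handler);
    }
}
```

Calling getId() on the returned object never invokes the fetcher, while the first call to getName() invokes it exactly once.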

NOT a Silver Bullet

There are times when a load isn't appropriate. If you don't know for sure that a row with the specified identifier exists in the database, then you shouldn't use load. A proxy instance will still be created, but a runtime exception will be thrown when the transaction tries to actually write the rows to the database and runs into the constraint violation.
Additionally, if you need any information from the proxied instance other than its identifier, the load() method has no advantage, since the first call to get any of that additional information will retrieve that row from the database.
