Okay, granted, I’m biased, I work for Terracotta. Be that as it may, I’d like to share some experiences my teammates and I have had using Hibernate recently while developing a web app.
First, some brief background. We are developing a “reference” web app at Terracotta to use to promote and explore the Sessions Clustering Use Case which we are working to nail. The app is an online exam-taking application, with the goal of supporting 40,000 concurrent users. I’ve blogged about this before, and you can read about the technology stack we settled on. Development has been done primarily by myself and my teammates Geert Bevin, Abhishek Sanoujam, and our supervisor Alex Miller.
Hibernate is wonderful, and it is an integral part of our web app. It feels to me like we got moving pretty quickly using Hibernate for persistence of our domain objects. For ORM, it’s unbeatable.
But the thing I noticed is, there’s just no avoiding the fact that whatever your domain POJO’s are that need to be persisted, chances are good that the use of ORM will impose some constraints on how you must write those POJO’s. I have two examples of this to share.
Example One – Generics
First, we have an exam Section class which, conceptually, is a container for either multiple sub sections, or Questions, but not both. The ideal solution would be to define Section as this (JPA annotations omitted):
where TestContent is an interface implemented by both Section and Question. Thus, an instance of Section could be declared as having type of either Section or Section, which satisfies our constraint.
However, at runtime (when starting Tomcat), Hibernate (the JPA provider) threw an exception pointing out that Section had an unbounded type (or something like that). After a little digging around on the internet, I found a forum where someone explained that an Entity cannot have a generic type, because it’s not known until instantiation time what the linked Entity will be.
Therefore I had to compromise. I modified Section, removing it’s generic type and adding two explicit collections, one for Questions, one for Sections.
This is less than ideal because the Section API itself doesn’t naturally prevent a single Section instance from having both sub-sections and questions, even though we don’t want to allow this.
Example Two – complex object tree
Similarly, for my other example, one of the constraints is that a Question must have exactly one correct choice (from among it’s two or more choices). So our first inclination was to structure the Question class thusly:
But this caused problems when saving an edited Exam which had had a Question added to it. I no longer have the stack trace handy, but the gist of the Hibernate exception was that a transient (unsaved) object was detected in the object graph being merged (updated).
Alex and I dug in and finally examined the generated database schema. What we saw was that the QUESTION table had a CORRECT_CHOICE column which was a foreign key into another table, QUESTION_CHOICE I think it was. Alex and I theorized that there was a possible ordering problem in updating an Exam with a new Question and Choices – what if Hibernate attempted to set the CORRECT_CHOICE foreign key before inserting the new choices for the question?
I’m not 100% positive that’s the correct explanation, but in any case Alex made the executive decision to simplify our domain model and not spend any more time debugging. We added a boolean “isCorrect” property to Choice, and removed the “correctChoice” reference from Question:
Problem solved – we no longer got the Hibernate exception. But, as Abhishek pointed out, our domain objects no longer enforced the constraint that a question could have only one correct choice. With the updated classes, nothing would prevent instantiating a question with multiple choices marked as correct. This put the burden on additional validation code to enforce this constraint, and overall is just less than ideal.
How Terracotta Can Help
The point I am agonizingly slowly building to is, I think it’s acceptable to have these constraints on our persistent domain objects, but only on the ones that should be persisted. An anti-pattern that we at Terracotta have seen again and again is the misuse of the database and ORM to persist state that really does not belong in the System of Record, but rather is transient state that must be persisted only to scale applications by keeping the applications stateless. One of the Terracotta co-founders, Orion, coined the term “State Monster” to describe this abuse of the db, and recently Wille Faler wrote a very good blog describing this.
Terracotta can help by providing an alternative to making apps stateless for scalability purposes. With Terracotta, go ahead and write your application in the most natural way, including shared state that is only transient. Consider this helpful graph about data lifetimes when deciding what state belongs in the SOR and what state is merely transient or pending. Then, use Terracotta to both cluster and persist the POJOs that don’t belong in the SOR. The advantage is that Terracotta does not impose any constraints on the API of the sorts I have written about here – generics are fine, arbitrarily complex object graphs are no problem. Terracotta clustered objects don’t even have to implement Serializable.