Thursday, November 11, 2010
Hibernate Caching Mechanism and Types
High-volume database traffic is a frequent cause of performance problems in Web applications. Hibernate is a high-performance object/relational persistence and query service, but it won't solve all your performance issues without a little help. In many cases, second-level caching is what Hibernate needs to reach its full performance potential. This article examines Hibernate's caching features and shows how you can use them to significantly boost application performance.
An Introduction to Caching
Caching is widely used for optimizing database applications. A cache reduces traffic between your application and the database by keeping data that has already been loaded from the database. Database access is then necessary only when retrieving data that is not currently in the cache. The application may need to empty (invalidate) the cache from time to time if the database is updated or modified in some other way, because the cache itself has no way of knowing whether its contents are still up to date.
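The idea can be pictured with a small, self-contained sketch. This is not Hibernate code: the class name ReadThroughCache, the loader function standing in for a database query, and the load counter are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Toy read-through cache: the loader is consulted only on a cache miss. */
final class ReadThroughCache<K, V> {
    private final Map<K, V> entries = new HashMap<>();
    private final Function<K, V> loader; // stands in for a database query
    private int loads = 0;               // number of simulated database hits

    ReadThroughCache(Function<K, V> loader) {
        this.loader = loader;
    }

    /** Returns the cached value, loading and storing it on a miss. */
    V get(K key) {
        V value = entries.get(key);
        if (value == null) {
            value = loader.apply(key);
            loads++;
            entries.put(key, value);
        }
        return value;
    }

    /** Drops every entry, for example after the database was changed elsewhere. */
    void invalidate() {
        entries.clear();
    }

    int loadCount() {
        return loads;
    }

    public static void main(String[] args) {
        ReadThroughCache<Long, String> cache =
                new ReadThroughCache<>(id -> "country-" + id);
        cache.get(1L);      // miss: goes to the loader
        cache.get(1L);      // hit: served from memory
        cache.invalidate(); // cache may be stale, so empty it
        cache.get(1L);      // miss again after invalidation
        System.out.println("loads: " + cache.loadCount());
    }
}
```

The second lookup never reaches the loader, and only the explicit invalidate() call forces a reload, which is exactly the trade-off described above.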
Hibernate uses two different caches for objects: a first-level cache and a second-level cache.
The first-level cache is associated with the Session object, while the second-level cache is associated with the SessionFactory object. The first-level cache is always enabled and lives as long as its Session, which typically spans a single transaction. Hibernate uses this cache mainly to reduce the number of SQL statements it needs to generate within a given transaction.
For example, if an object is modified several times within the same transaction, Hibernate will generate only one SQL UPDATE statement at the end of the transaction, containing all the modifications. This article focuses on the second-level cache. To reduce database traffic, the second-level cache keeps loaded objects at the SessionFactory level between transactions.
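The single-UPDATE behaviour can be modelled in a few lines of plain Java. This is a toy sketch, not Hibernate's implementation: ToySession and its methods are hypothetical, and only the COUNTRY table and its cn_ column names are borrowed from the mapping used later in this article.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

/** Toy session: buffers changes in memory and emits one UPDATE per entity at commit. */
final class ToySession {
    // Pending changes per entity id: column -> latest value, in first-touched order.
    private final Map<Long, Map<String, Object>> dirty = new LinkedHashMap<>();
    private final List<String> executedSql = new ArrayList<>();

    /** Records a change in memory; no SQL is issued yet. */
    void set(long id, String column, Object value) {
        dirty.computeIfAbsent(id, k -> new LinkedHashMap<>()).put(column, value);
    }

    /** Flushes the buffered changes, one UPDATE statement per modified entity. */
    void commit() {
        for (Map<String, Object> changes : dirty.values()) {
            String assignments = changes.keySet().stream()
                    .map(column -> column + " = ?")
                    .collect(Collectors.joining(", "));
            executedSql.add("UPDATE COUNTRY SET " + assignments + " WHERE cn_id = ?");
        }
        dirty.clear();
    }

    List<String> executedSql() {
        return executedSql;
    }

    public static void main(String[] args) {
        ToySession session = new ToySession();
        session.set(1L, "cn_name", "France");
        session.set(1L, "cn_code", "FR");
        session.set(1L, "cn_name", "French Republic"); // overwrites the earlier change
        session.commit();
        session.executedSql().forEach(System.out::println);
    }
}
```

Three modifications to the same row are collapsed into a single statement because nothing is sent to the database until commit() is called.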
Objects held in the second-level cache are available to the whole application, not just to the user running the query. This way, each time a query returns an object that is already loaded in the cache, one or more database round trips may be avoided.
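The relationship between the two levels can be sketched as a lookup chain: session cache first, then the shared cache, then the database. The classes below are hypothetical stand-ins (SharedCache for the SessionFactory-level cache, CachingSession for a Session), and the database is reduced to a function that must return a non-null value.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Shared cache, one per application (plays the role of the second-level cache). */
final class SharedCache {
    final Map<Long, String> entries = new ConcurrentHashMap<>();
}

/** Short-lived unit of work with its own private cache (plays the role of a Session). */
final class CachingSession {
    private final Map<Long, String> firstLevel = new HashMap<>();
    private final SharedCache secondLevel;
    private final Function<Long, String> database; // stands in for a SELECT by id
    int databaseHits = 0;

    CachingSession(SharedCache secondLevel, Function<Long, String> database) {
        this.secondLevel = secondLevel;
        this.database = database;
    }

    /** Looks in the session cache, then the shared cache, then the database. */
    String load(Long id) {
        String value = firstLevel.get(id);
        if (value == null) {
            value = secondLevel.entries.get(id);
            if (value == null) {
                value = database.apply(id);
                databaseHits++;
                secondLevel.entries.put(id, value);
            }
            firstLevel.put(id, value);
        }
        return value;
    }

    public static void main(String[] args) {
        SharedCache shared = new SharedCache();
        CachingSession first = new CachingSession(shared, id -> "country-" + id);
        first.load(1L);
        CachingSession second = new CachingSession(shared, id -> "country-" + id);
        second.load(1L); // found in the shared cache, no database access
        System.out.println(first.databaseHits + " / " + second.databaseHits);
    }
}
```

The second session never touches the database for an object that an earlier session already loaded, which is the saving the second-level cache is meant to provide.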
In addition, you can use a query-level cache if you need to cache actual query results, rather than just persistent objects.
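A query-level cache differs from an object cache in what it stores. Hibernate's query cache keeps the identifiers of the entities a query returned rather than the entities themselves, so it works best alongside a second-level cache that can resolve those identifiers. The sketch below models that split; ToyQueryCache and its database function are assumptions for illustration, not Hibernate classes.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

/** Toy query cache: remembers which ids a query returned, not the rows themselves. */
final class ToyQueryCache {
    private final Map<String, List<Long>> idsByQuery = new HashMap<>();
    private final Map<Long, String> entityCache = new HashMap<>();
    // Stands in for running a query against the database: returns id -> row.
    private final Function<String, Map<Long, String>> database;
    int queriesRun = 0;

    ToyQueryCache(Function<String, Map<Long, String>> database) {
        this.database = database;
    }

    /** Runs the query on a miss; on a hit, resolves the cached ids via the entity cache. */
    List<String> list(String query) {
        List<Long> ids = idsByQuery.get(query);
        if (ids == null) {
            Map<Long, String> rows = database.apply(query);
            queriesRun++;
            entityCache.putAll(rows);
            ids = new ArrayList<>(rows.keySet());
            idsByQuery.put(query, ids);
        }
        List<String> result = new ArrayList<>();
        for (Long id : ids) {
            result.add(entityCache.get(id));
        }
        return result;
    }

    public static void main(String[] args) {
        Map<Long, String> rows = new HashMap<>();
        rows.put(1L, "France");
        ToyQueryCache cache = new ToyQueryCache(query -> rows);
        cache.list("from Country");
        cache.list("from Country"); // answered from the cached id list
        System.out.println("queries run: " + cache.queriesRun);
    }
}
```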
Caches are complicated pieces of software, and the market offers quite a number of choices, both open source and commercial. Hibernate supports the following open-source cache implementations out-of-the-box:
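EHCache (org.hibernate.cache.EhCacheProvider)
OSCache (org.hibernate.cache.OSCacheProvider)
SwarmCache (org.hibernate.cache.SwarmCacheProvider)
JBoss TreeCache (org.hibernate.cache.TreeCacheProvider)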
Each cache offers different capabilities in terms of performance, memory use, and configuration options:
EHCache is a fast, lightweight, and easy-to-use in-process cache. It supports read-only and read/write caching, and memory- and disk-based caching. However, it does not support clustering.
OSCache is another open-source caching solution. It is part of a larger package, which also provides caching functionalities for JSP pages or arbitrary objects. It is a powerful and flexible package, which, like EHCache, supports read-only and read/write caching, and memory- and disk-based caching. It also provides basic support for clustering via either JavaGroups or JMS.
SwarmCache is a simple cluster-based caching solution based on JavaGroups. It supports read-only or nonstrict read/write caching (the next section explains this term). This type of cache is appropriate for applications that typically have many more read operations than write operations.
JBoss TreeCache is a powerful replicated (synchronous or asynchronous) and transactional cache. Use this solution if you really need a true transaction-capable caching architecture.
Another cache implementation worth mentioning is the commercial Tangosol Coherence cache.
Once you have chosen your cache implementation, you need to specify your access strategies. The following four caching strategies are available:
Read-only: This strategy is useful for data that is read frequently but never updated. This is by far the simplest and best-performing cache strategy.
Read/write: Read/write caches may be appropriate if your data needs to be updated. They carry more overhead than read-only caches. In non-JTA environments, make sure each transaction has completed by the time Session.close() or Session.disconnect() is called.
Nonstrict read/write: This strategy does not guarantee that two transactions won't simultaneously modify the same data. Therefore, it may be most appropriate for data that is read often but only occasionally modified.
Transactional: This is a fully transactional cache that may be used only in a JTA environment.
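To see why the read-only strategy is the simplest of the four, consider a minimal sketch of a region that accepts entries only when they are loaded and refuses any later update. ReadOnlyRegion and its method names are hypothetical, and Hibernate's own implementation differs; the point is only that a cache with no updates needs no locking or invalidation logic.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Toy read-only cache region: entries can be added on load but never changed. */
final class ReadOnlyRegion<K, V> {
    private final Map<K, V> entries = new ConcurrentHashMap<>();

    V get(K key) {
        return entries.get(key);
    }

    /** Called after a database load; keeps the first value seen for a key. */
    void putFromLoad(K key, V value) {
        entries.putIfAbsent(key, value);
    }

    /** Updates are rejected outright, so readers never see a stale or half-written entry. */
    void update(K key, V value) {
        throw new UnsupportedOperationException("read-only region: cannot update " + key);
    }

    public static void main(String[] args) {
        ReadOnlyRegion<Long, String> countries = new ReadOnlyRegion<>();
        countries.putFromLoad(1L, "France");
        System.out.println(countries.get(1L));
        try {
            countries.update(1L, "Gaul");
        } catch (UnsupportedOperationException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The other strategies exist precisely because they must handle the update path that this sketch refuses.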
Support for these strategies is not identical for every cache implementation. Table 1 shows the options available for the different cache implementations.
Table 1. Supported Caching Strategies for Hibernate Out-of-the-Box Cache Implementations
Setting Up a Read-Only Cache
To begin with something simple, here's the Hibernate mapping for the Country class:
<class name="Country" table="COUNTRY" dynamic-update="true">
<id name="id" type="long" unsaved-value="null">
<column name="cn_id" not-null="true"/>
</id>
<property column="cn_code" name="code" type="string"/>
<property column="cn_name" name="name" type="string"/>