JPA Under The Hood – Understanding the Dynamics of Your JPA Framework

I recently gave talks on the behavior of different JPA frameworks at W-JAX (Germany) and TheServerSide Java Symposium (Prague). As some people have asked for them, I am publishing the samples as well. I’d also give away the Eclipse project; however, with all the third-party libraries, I’m sure I would end up not doing it in a legally correct way. Additionally, I can add some comments on the samples and why they are as they are :-).

The goal of my experiment was to compare different JPA frameworks regarding their runtime characteristics. I addressed the following points:

  • Object Loading
  • Object Creation
  • Update behavior
  • Caching
  • Connection Handling

Preparation – SQL Scripts, Entity Classes and Persistence Unit Definitions

First, we start with the SQL scripts for creating the necessary tables. I use two tables: users and accounts. A user can have multiple accounts.


CREATE TABLE users (
`id`        INT(10)      NOT NULL,
`username`  VARCHAR(15)  NOT NULL,
`password`  VARCHAR(15)  NOT NULL,
`firstname` VARCHAR(30)  NOT NULL,
`lastname`  VARCHAR(30)  NOT NULL,
`street`    VARCHAR(30)  NOT NULL,
`town`      VARCHAR(15)  NOT NULL,
`zip`       VARCHAR(10)  NOT NULL,
PRIMARY KEY (`id`)
);

CREATE TABLE accounts (
`id`         INT(10)       NOT NULL AUTO_INCREMENT,
`IBAN`       VARCHAR(34)   NOT NULL,
`BIC`        VARCHAR(11)   NOT NULL,
`userID`     INT(10)       NOT NULL,
`amount`     DECIMAL(16,2) NOT NULL,
PRIMARY KEY (`id`),
FOREIGN KEY (`userID`) REFERENCES `users` (`id`)
);

Next, we need to define the persistence classes. We define a User class and an Account class. Getter and setter methods are omitted for brevity here.


@Entity
@Table(name="users")
// @Cache(usage=CacheConcurrencyStrategy.READ_WRITE)
public class User {
  private long id;
  private String firstName;
  private String lastName;
  private String userName;
  private String password;
  private String street;
  private String town;
  private String zip;
  private List<Account> accounts;

  @Id
  public long getId() {
    return id;
  }

  @OneToMany(mappedBy="user")
  public List<Account> getAccounts (){
   return accounts;
  }

}

@Entity
@Table(name="accounts")
public class Account {

	private long id;
	private User user;
	private String BIC;

	@Id
	public long getId() {
	  return id;
	}

	@ManyToOne
	@JoinColumn(name="userID")
	public User getUser(){
	  return user;
	}
}

So far no rocket science. In the next step, we define the persistence units. I defined a single unit per persistence provider. According to the JPA spec this should work fine. However, some strange things might happen 😉


<persistence  xmlns="http://java.sun.com/xml/ns/persistence" version="1.0">
  <persistence-unit name="netPayEclipse" transaction-type="RESOURCE_LOCAL"
    xmlns="http://java.sun.com/xml/ns/persistence"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://java.sun.com/xml/ns/persistence
    http://java.sun.com/xml/ns/persistence/persistence_1_0.xsd" >

    <provider>org.eclipse.persistence.jpa.PersistenceProvider</provider>

    <!-- Entities -->
    <class>com.dynatrace.talks.jpahood.entity.User</class>
    <class>com.dynatrace.talks.jpahood.entity.Transaction</class>
    <class>com.dynatrace.talks.jpahood.entity.Account</class>

    <properties>
         <property name="eclipselink.jdbc.user" value="root"/>
         <property name="eclipselink.jdbc.password" value="admin" />
         <property name="eclipselink.jdbc.driver" value="com.mysql.jdbc.Driver"/>
         <property name="eclipselink.jdbc.url" value="jdbc:mysql://localhost/netpay"/>
         <property name="eclipselink.target-database" value="MySQL4" />
          <!-- <property name="eclipselink.cache.shared.default" value="false"/> -->
         <property name="eclipselink.jdbc.read-connections.min" value="1" />
         <property name="eclipselink.jdbc.read-connections.max" value="1" />
         <property name="eclipselink.jdbc.write-connections.min" value="1" />
         <property name="eclipselink.jdbc.write-connections.max" value="1" />
     </properties>
   </persistence-unit>

   <persistence-unit name="netPayOpenJPA" transaction-type="RESOURCE_LOCAL"
    xmlns="http://java.sun.com/xml/ns/persistence"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://java.sun.com/xml/ns/persistence
    http://java.sun.com/xml/ns/persistence/persistence_1_0.xsd" >

      <provider>org.apache.openjpa.persistence.PersistenceProviderImpl</provider>

      <!-- Entities  -->
      <class>com.dynatrace.talks.jpahood.entity.User</class>
      <class>com.dynatrace.talks.jpahood.entity.Transaction</class>
      <class>com.dynatrace.talks.jpahood.entity.Account</class>

      <properties>
            <property name="openjpa.ConnectionProperties"
                value="DriverClassName=com.mysql.jdbc.Driver,
                  Url=jdbc:mysql://localhost/netpay,
                  MaxActive=1000,
                  MaxWait=10000,
                  TestOnBorrow=false,
                  Username=root,
                  Password=admin"/>
            <property name="openjpa.ConnectionDriverName"
                value="org.apache.commons.dbcp.BasicDataSource"/>
         <!--
         <property name="openjpa.DataCache" value="true"/>
         <property name="openjpa.RemoteCommitProvider" value="sjvm"/>
          -->
       <property name="openjpa.QueryCache" 
        value="CacheSize=1000, SoftReferenceSize=100"/>
     </properties>
   </persistence-unit>

   <persistence-unit name="netPayHib" transaction-type="RESOURCE_LOCAL"
    xmlns="http://java.sun.com/xml/ns/persistence"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://java.sun.com/xml/ns/persistence
    http://java.sun.com/xml/ns/persistence/persistence_1_0.xsd" >

     <provider>org.hibernate.ejb.HibernatePersistence</provider>

     <!-- Entities  -->
      <class>com.dynatrace.talks.jpahood.entity.User</class>
      <class>com.dynatrace.talks.jpahood.entity.Transaction</class>
      <class>com.dynatrace.talks.jpahood.entity.Account</class>

      <properties>
         <property name="hibernate.dialect" value="org.hibernate.dialect.MySQLDialect"/>
         <property name="hibernate.connection.driver_class" value="com.mysql.jdbc.Driver"/>
         <property name="hibernate.connection.username" value="root"/>
         <property name="hibernate.connection.password" value="admin"/>
         <property name="hibernate.connection.url" value="jdbc:mysql://localhost/netpay"/>
         <property name="hibernate.max_fetch_depth" value="3"/>
         <property name="hibernate.connection.pool_size" value="500"/>
         <property name="hibernate.ejb.cfgfile"
          value="/com/dynatrace/talks/jpahood/hibernate.cfg.xml"/>
      </properties>
   </persistence-unit>
</persistence>

That’s it for preparation. Now we are ready to look at the samples, which will help us to understand the inner workings of JPA frameworks.

Dynamic Behavior of JPA Frameworks

Now let us go through the various samples. The samples are deliberately kept very simple. However, they show typical usage scenarios.

Sample 1 – It depends on what you make out of it

The goal of this sample is to test whether a framework detects parameters in query strings and automatically creates proper prepared statements. Here is the sample for querying the user with id 1.


public static void simpleLoadSample() {
  EntityManager em = EntityManagerUtil.getEMFactory(provider).createEntityManager();
  Query query = em.createQuery("select u from User u where u.id=1");
  iterateOverItems(query.getResultList());
  em.close();
}

Actually, a JPA framework should produce the same SQL statement as for the code below.


public static void simpleLoadwithParameter() {
  EntityManager em = EntityManagerUtil.getEMFactory(provider).createEntityManager();
  Query query = em.createQuery("select u from User u where u.id=?1");
  query.setParameter(1, 1L);
  iterateOverItems(query.getResultList());
  em.close();
}

In my tests, both OpenJPA and EclipseLink create proper prepared statements in both cases. Hibernate, however, creates a statement in the first case that looks like this: “select … from user where id=1” and also prepares this statement. Prepared statements like this can render PreparedStatement caching as well as database query caching ineffective, because every distinct literal value yields a different statement text.
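To make the caching effect more tangible, here is a toy model in plain Java. It is not the cache of any real driver or provider; the class StatementCacheDemo and its methods are made up for this illustration. The only assumption is that the cache is keyed by the SQL text, which is how statement caches typically work.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Toy model of a prepared-statement cache keyed by SQL text (hypothetical, illustration only). */
final class StatementCacheDemo {
    private final Map<String, Integer> useCountBySql = new LinkedHashMap<>();

    /** "Prepares" a statement: reuses the entry if the SQL text is known, otherwise adds one. */
    void prepare(String sql) {
        useCountBySql.merge(sql, 1, Integer::sum);
    }

    int cachedStatements() {
        return useCountBySql.size();
    }

    /** Simulates loading users 1..n with the id inlined as a literal. */
    static int entriesWithLiterals(int n) {
        StatementCacheDemo cache = new StatementCacheDemo();
        for (int id = 1; id <= n; id++) {
            cache.prepare("select * from users where id=" + id);
        }
        return cache.cachedStatements();
    }

    /** Simulates the same loads with a bind parameter. */
    static int entriesWithParameter(int n) {
        StatementCacheDemo cache = new StatementCacheDemo();
        for (int id = 1; id <= n; id++) {
            cache.prepare("select * from users where id=?");
        }
        return cache.cachedStatements();
    }

    public static void main(String[] args) {
        System.out.println("entries with literals:  " + entriesWithLiterals(100));
        System.out.println("entries with parameter: " + entriesWithParameter(100));
    }
}
```

With literals the cache grows by one entry per id and nothing is ever reused; with the bind parameter a single entry serves all lookups.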

Sample 2 – The Magic Value

This sample deals with object construction. From what I have seen in my presentations, a lot of people aren’t sure what is happening here. We load an object with our query. While waiting for input, we modify the value in the database, and then we query the value again.


public static void loadTwiceWithQuery() {
  EntityManager em = EntityManagerUtil.getEMFactory(provider).createEntityManager();
  Query query = em.createQuery("select u from User u where u.id=1");
  iterateOverItems(query.getResultList());
  em.close();
  try {
    System.in.read();
    // change value in database 
  } catch (IOException e) {
    e.printStackTrace();
  }
  em = EntityManagerUtil.getEMFactory(provider).createEntityManager();
  query = em.createQuery("select u from User u where u.id=1");
  iterateOverItems(query.getResultList());
  em.close();
}

When trying this example with different JPA frameworks, you will see that two database queries are executed unless query caching is enabled. However, the second query returns the object with the “old” values. Why is that? The query is only used to retrieve the id of the user. As the framework realizes that the object has already been loaded, it does not construct that object again. If you always want the latest state, you have to use refresh().
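The effect is easier to follow with a small model. The sketch below is plain Java and not a real JPA implementation; StaleReadDemo, CachedUser and the two maps are invented names that stand in for the database table and the provider's cache of already constructed objects.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of a provider-side object cache (hypothetical; not a real JPA implementation). */
final class StaleReadDemo {
    /** Stands in for the users table: id -> firstname. */
    final Map<Long, String> database = new HashMap<>();
    /** Stands in for the cache of already constructed objects: id -> object. */
    final Map<Long, CachedUser> cache = new HashMap<>();
    int queriesExecuted = 0;

    static final class CachedUser {
        final long id;
        String firstName;

        CachedUser(long id, String firstName) {
            this.id = id;
            this.firstName = firstName;
        }
    }

    /** Always hits the "database", but reuses a cached object for a known id. */
    CachedUser queryById(long id) {
        queriesExecuted++;
        String row = database.get(id);
        return cache.computeIfAbsent(id, key -> new CachedUser(key, row));
    }

    /** Overwrites the object's state with the current row, as a refresh would. */
    void refresh(CachedUser user) {
        queriesExecuted++;
        user.firstName = database.get(user.id);
    }

    public static void main(String[] args) {
        StaleReadDemo demo = new StaleReadDemo();
        demo.database.put(1L, "Alice");
        demo.queryById(1L);
        demo.database.put(1L, "Bob");              // the row changes behind our back
        CachedUser second = demo.queryById(1L);
        System.out.println(second.firstName);      // still the value from the first load
        demo.refresh(second);
        System.out.println(second.firstName);      // now the current database value
    }
}
```

Both calls to queryById count as executed queries, yet the second one hands back the object built by the first. Only the explicit refresh copies the new row into it.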

Sample 3 – Staying up to date

In this sample we look at updating. We again load a user and then update the first name in a very creative way ;-).


public static void simpleUpdate (){
  EntityManager em1= EntityManagerUtil.getEMFactory(provider).createEntityManager();
  em1.getTransaction().begin();
  User user = em1.find(User.class, 1L);
  user.setFirstName("otherFirstName" + System.currentTimeMillis());
  em1.getTransaction().commit();
  em1.close ();
}

Guess what happens … the object gets updated ;-). Well, that’s what you’d expect. The interesting part here is, again, what the statement looks like. Actually, we only want the firstname column to be updated. EclipseLink and OpenJPA do so by default. Hibernate, however, will update all fields. If you have defined triggers in the database, this can cause serious performance problems, as triggers or stored procedures might be invoked although they shouldn’t be. As Garvin mentioned in his comment below, using a specific query for each different update results in a much higher number of distinct queries, which can lead to problems with the JDBC PreparedStatement cache.
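The two styles of UPDATE can be sketched in a few lines of plain Java. This is only a toy generator under my own assumptions (entity state as a map of column names to values, a snapshot taken at load time); UpdateSqlDemo and its methods are hypothetical names, and real providers do this in a far more involved way.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

/** Toy UPDATE generator (hypothetical; illustration only). */
final class UpdateSqlDemo {

    /** Lists every column, whether it changed or not. */
    static String updateAllColumns(String table, Map<String, Object> current) {
        return build(table, new ArrayList<>(current.keySet()));
    }

    /** Lists only columns whose value differs from the snapshot taken at load time. */
    static String updateChangedColumns(String table, Map<String, Object> loaded,
                                       Map<String, Object> current) {
        List<String> changed = new ArrayList<>();
        for (Map.Entry<String, Object> entry : current.entrySet()) {
            if (!Objects.equals(loaded.get(entry.getKey()), entry.getValue())) {
                changed.add(entry.getKey());
            }
        }
        // Note: a real implementation would skip the UPDATE entirely if nothing changed.
        return build(table, changed);
    }

    private static String build(String table, List<String> columns) {
        List<String> assignments = new ArrayList<>();
        for (String column : columns) {
            assignments.add(column + "=?");
        }
        return "update " + table + " set " + String.join(", ", assignments) + " where id=?";
    }

    public static void main(String[] args) {
        Map<String, Object> loaded = new LinkedHashMap<>();
        loaded.put("firstname", "John");
        loaded.put("lastname", "Doe");
        loaded.put("town", "Linz");
        Map<String, Object> current = new LinkedHashMap<>(loaded);
        current.put("firstname", "otherFirstName");

        System.out.println(updateAllColumns("users", current));
        System.out.println(updateChangedColumns("users", loaded, current));
    }
}
```

The all-columns variant always produces the same statement text per table, which is friendly to statement caches. The changed-columns variant touches less data but can produce a different text for every combination of modified columns. Hibernate has a dynamic-update mapping option if you prefer the second style.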

Sample 4 – Having good references

This sample deals with the getReference method of the EntityManager. The JavaDoc says:

Get an instance, whose state may be lazily fetched. … The application should not expect that the instance state will be available upon detachment, unless it was accessed by the application while the entity manager was open.

Hmmm, I do not know how you feel about this, but the word “may” confused me a bit here. It actually means that I do not know whether the object will be fetched or not. I used the following code sample to see what happens.


public static void getReferenceSample (){
  EntityManager em= EntityManagerUtil.getEMFactory(provider).createEntityManager();
  em.getReference(User.class, 1L);
  em.close ();
}

Here my experiments show that EclipseLink loads the data, while Hibernate and OpenJPA do not.
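What “lazily fetched” means in practice can be shown with a tiny stand-in. The sketch below is plain Java and only models the lazy variant I observed with Hibernate and OpenJPA. LazyReferenceDemo, UserReference and loadFirstName are hypothetical names; real providers use generated proxies or bytecode weaving instead.

```java
import java.util.function.LongFunction;

/** Toy lazy reference (hypothetical; illustration only). */
final class LazyReferenceDemo {
    /** Counts how often the simulated SELECT was issued. */
    static int loads = 0;

    /** Holds only the id until state is requested for the first time. */
    static final class UserReference {
        private final long id;
        private final LongFunction<String> loader;
        private String firstName;
        private boolean loaded;

        UserReference(long id, LongFunction<String> loader) {
            this.id = id;
            this.loader = loader;
        }

        /** The id is known up front, so this never triggers a load. */
        long getId() {
            return id;
        }

        /** The first call triggers the load; later calls reuse the loaded state. */
        String getFirstName() {
            if (!loaded) {
                firstName = loader.apply(id);
                loaded = true;
            }
            return firstName;
        }
    }

    /** Stands in for the SELECT a provider would issue. */
    static String loadFirstName(long id) {
        loads++;
        return "user" + id;
    }

    public static void main(String[] args) {
        UserReference ref = new UserReference(1L, LazyReferenceDemo::loadFirstName);
        System.out.println("loads after creating the reference: " + loads);
        ref.getFirstName();
        System.out.println("loads after first state access: " + loads);
    }
}
```

An eager provider would simply call the loader in the constructor. Both behaviors fit the “may be lazily fetched” wording of the JavaDoc.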

Sample 5 – Staying in good relations

In the next sample we look at the behavior for loading detail-master relationships. Hey, that should be master-detail, not the other way round! Yes, I know, but here we first load the detail and then the master.


public static void loadRelationSample () {
  EntityManager em= EntityManagerUtil.getEMFactory(provider).createEntityManager();
  Query query = em.createQuery("select acc from Account acc where acc.id = 1");
  Account account = (Account) query.getSingleResult();
  User user = account.getUser();
  em.close ();
}

Very interestingly, all frameworks I used load the master record as well by default. How they actually do this depends on the framework as well as on the database used. OpenJPA, for example, uses a join by default, EclipseLink does not, and with Hibernate it depends on the dialect (and database) used.
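The difference between the two loading strategies boils down to the number of database round trips. Here is a toy comparison in plain Java; FetchStrategyDemo and its in-memory maps are made up for this sketch, and the SQL in the comments only indicates what kind of statement each step stands for.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy comparison of join loading and follow-up select loading (hypothetical). */
final class FetchStrategyDemo {
    private final Map<Long, Long> userIdByAccountId = new HashMap<>();
    private final Map<Long, String> firstNameByUserId = new HashMap<>();
    int roundTrips = 0;

    FetchStrategyDemo() {
        firstNameByUserId.put(10L, "John");
        userIdByAccountId.put(1L, 10L);
    }

    /** Strategy A: one joined statement returns the owner together with the account. */
    String ownerViaJoin(long accountId) {
        roundTrips++;   // select ... from accounts a join users u on a.userID = u.id where a.id = ?
        return firstNameByUserId.get(userIdByAccountId.get(accountId));
    }

    /** Strategy B: load the account first, then the owner with a second statement. */
    String ownerViaSecondSelect(long accountId) {
        roundTrips++;   // select ... from accounts where id = ?
        Long userId = userIdByAccountId.get(accountId);
        roundTrips++;   // select ... from users where id = ?
        return firstNameByUserId.get(userId);
    }

    public static void main(String[] args) {
        FetchStrategyDemo join = new FetchStrategyDemo();
        join.ownerViaJoin(1L);
        System.out.println("round trips with join: " + join.roundTrips);

        FetchStrategyDemo select = new FetchStrategyDemo();
        select.ownerViaSecondSelect(1L);
        System.out.println("round trips with second select: " + select.roundTrips);
    }
}
```

Both strategies return the same owner. Which one is cheaper depends on how many details share a master and on how expensive the join is for your database.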

Sample 6 – Yam Session

In this sample we look at connection handling and sessions. The first example creates more and more EntityManagers and queries for an object with each of them. The second sample does the same; however, it also uses transactions. … and what is the ArrayList for? Well, we want to avoid garbage collection and automatic closing of the EntityManagers.


public static void checkMaxSessions() {
  ArrayList<EntityManager> myEMs = new ArrayList<EntityManager>();
  for (int i = 1; i < 51; ++i) {
    try {
      EntityManager em = EntityManagerUtil.getEMFactory(provider)
      .createEntityManager();
      myEMs.add(em);
      User u = em.find(User.class, Long.valueOf(i));
      u.getFirstName();
      System.out.println("Concurrent sessions: " + i);
     } catch (Exception ex) {
      System.err.println(ex);
      break;
    }
    try {
      Thread.sleep(700);
    } catch (InterruptedException e) {
    }
  }
}

public static void checkMaxSessionsWithTransaction() {
  ArrayList<EntityManager> myEMs = new ArrayList<EntityManager>();
  for (int i = 1; i < 51; ++i) {
    try {
      EntityManager em = EntityManagerUtil.getEMFactory(provider)
      .createEntityManager();
      myEMs.add(em);
      em.getTransaction().begin();
      User u = em.find(User.class, Long.valueOf(i));
      u.getFirstName();
      System.out.println("Concurrent sessions: " + (i));
    } catch (Exception ex) {
      System.err.println(ex);
      break;
    }
    try {
      Thread.sleep(300);
    } catch (InterruptedException e) {
    }
  }
}

What we can see here is that, when using no transactions, we can do all the work with one connection. When we use transactions, however, Hibernate will open a new connection per EntityManager. So, if you do not need transactions (when you just load a single list on a website, for example), you are better off not using them. However, you should be aware of the implications of not using transactions across multiple queries (which I assume you are).
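The connection usage I observed can be modeled with a toy pool. The sketch below is plain Java with invented names (ConnectionUsageDemo, Pool, ToyEntityManager). It assumes that a connection is borrowed and returned per query outside of a transaction, and held from begin() onwards inside one, which matches what I saw with Hibernate but is not taken from any provider's source code.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Toy connection pool and "entity manager" (hypothetical; illustration only). */
final class ConnectionUsageDemo {

    static final class Pool {
        private final Deque<Integer> idle = new ArrayDeque<>();
        private int inUse = 0;
        int peakInUse = 0;

        Pool(int size) {
            for (int i = 0; i < size; i++) {
                idle.push(i);
            }
        }

        int acquire() {
            if (idle.isEmpty()) {
                throw new IllegalStateException("pool exhausted");
            }
            inUse++;
            peakInUse = Math.max(peakInUse, inUse);
            return idle.pop();
        }

        void release(int connection) {
            inUse--;
            idle.push(connection);
        }
    }

    static final class ToyEntityManager {
        private final Pool pool;
        private Integer pinned;          // connection held by an open transaction, if any

        ToyEntityManager(Pool pool) {
            this.pool = pool;
        }

        void begin() {
            pinned = pool.acquire();     // held until the transaction ends (never, in this demo)
        }

        void find() {
            if (pinned != null) {
                return;                  // inside a transaction: reuse the pinned connection
            }
            int connection = pool.acquire();
            pool.release(connection);    // outside a transaction: give it back right away
        }
    }

    /** Opens the given number of managers, runs one find() on each, returns the peak connections in use. */
    static int peakConnections(int managers, boolean withTransaction, int poolSize) {
        Pool pool = new Pool(poolSize);
        List<ToyEntityManager> open = new ArrayList<>();
        for (int i = 0; i < managers; i++) {
            ToyEntityManager em = new ToyEntityManager(pool);
            open.add(em);                // keep it open, as in the sample above
            if (withTransaction) {
                em.begin();
            }
            em.find();
        }
        return pool.peakInUse;
    }

    public static void main(String[] args) {
        System.out.println("peak without transactions: " + peakConnections(50, false, 50));
        System.out.println("peak with transactions:    " + peakConnections(50, true, 50));
    }
}
```

In this model, opening one more transactional manager than the pool has connections ends with a “pool exhausted” error, which is the kind of limit the loop in the sample is probing for.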

Conclusion

Although JPA standardizes the interface for persistence frameworks, there is still a lot of freedom regarding runtime behavior. This can easily impact the performance of your application. It also shows that you should not rely on the default settings of a framework. If you need consistent behavior across JPA providers, you have to test the runtime behavior and tweak it to your needs.

Further Readings

Below you will find a lot of links to other persistence-related posts, specifically on caching in Hibernate. Additionally, I recommend checking out the database diagnosis section of Dynatrace.

Thank you everybody for the feedback!