Blending two worlds: Apache AGE with MyBatis (part 1)

In my previous post I introduced Apache AGE and the possibility to use Graph Databases in an existing project while at the same time keeping the relational structures in place.

As previously discussed, if you have an existing project with a lot of data structures already in place it would be quite expensive and dangerous to completely rewrite everything in a big bang approach using a completely different technology.

One of the advantages of using Apache AGE, is based on the fact that it's a PostgreSQL extension, so you can keep using your previous database and libraries while gradually adopting the new technology.

On the other side, a graph database is a very different world compared to the relational model, so you have to expect some changes to be required during the transition.

MyBatis

To build my projects I almost always use Java as the go-to technology together with the Spring Boot framework and a bunch of other libraries depending on the specific needs of the application. To access databases my preferred library is MyBatis, because it is rock solid and slim, it is perfectly integrated with the Spring Boot ecosystem and it allows me to decouple SQL from Java code through the use of mapper files (XML files with the SQL code).

I know you could think that there are other solutions like Hibernate that can hide the SQL details and bring some magic to data access, but I don't like some technical details of Hibernate like the use of proxies on the data objects and the use of session to "transparently" access relations.

MyBatis surely needs a bit more boilerplate configurations and code (there are IDE plugins and tools to generate them) when compared to Hibernate, but in return it gives more flexibility while keeping the door open to the possibility of tweaking the queries and use native features of databases (like AGE) that cannot be used otherwise.

In addition to this, I work on many enterprise projects when it is often required to interface with existing legacy databases, not designed for the requirements of tools like Hibernate, so MyBatis is a better solution to me because it allows me to map any existing database very easily.

The mapper files

In MyBatis you can use the @Query annotation to add your query on top of a mapper interface, but I strongly prefer to separate the SQL code from the Java sources for readability. The IDE plugins are more than good enough to help you move from Java code to the linked XML query location without messing up your source files.

Here is an example XML mapper file that contains a couple of queries, as you can see it is very easy to create these files and they help you separate application code from database code. It is also possible to use different XML files to add compatibility to other databases if you need.

<?xml version="1.0" encoding="utf-8" ?>
<!DOCTYPE mapper PUBLIC "-//mybatis.org//DTD Mapper 3.0//EN" "http://mybatis.org/dtd/mybatis-3-mapper.dtd">
<mapper namespace="net.fm.server.mapper.app.UserMapper">

    <select id="findById" resultType="net.fm.server.model.app.User">
        SELECT * FROM sy_user WHERE id = #{id}
    </select>
    
    <select id="findByUsername" resultType="net.fm.server.model.app.User">
        SELECT * FROM sy_user WHERE username = #{username}
    </select>
    
    <select id="findByEmail" resultType="net.fm.server.model.app.User">
        SELECT * FROM sy_user WHERE email = #{email}
    </select>
    
    <select id="findAuthoritiesByUserId" resultType="String">
        SELECT authority
        FROM sy_user_authority
        WHERE user_id = #{userId}
        UNION DISTINCT
        SELECT ga.authority
        FROM sy_group_authority ga
            INNER JOIN sy_group_member gm on ga.group_id = gm.group_id and gm.user_id = #{userId}
    </select>
</mapper>

The mapper tag has an attribute that refers to a Java Interface that is the one you will use in your code to access database, and the select tags (or insert, update and delete ones) refers to single queries. The id attribute of each query refers to a method of the mapper interface, while the parameters passed to the SQL are enclosed in #{} or ${} tags, like #{username} or ${schema}. These are placeholders that will be replaced at runtime by the real value of the parameters.

There are a lot of other features related to XML mapper files, like collection mapping, conditional blocks and so on. You can find the documentation of all the features and the XML syntax in the MyBatis Mapper documentation page.

When used in combination with the Spring Boot library, MyBatis becomes a little bit more "magic", meaning that you just need to provide the XML files with the queries and declare a Java Interface with the correct names and ids, while the glue code needed to make things work will be generated by MyBatis at runtime during the boot of the application.

The example interface for the previous mapper file is the following:

package net.fm.server.mapper.app;

import net.fm.server.model.app.User;
import org.apache.ibatis.annotations.Mapper;
import org.apache.ibatis.annotations.Param;

import java.util.List;

@Mapper
public interface UserMapper {
    User findById(@Param("id") String id);
    User findByEmail(@Param("email") String email);
    User findByUsername(@Param("username") String username);

    List<String> findAuthoritiesByUserId(@Param("userId") String userId);
}

As you can see, you have to annotate the interface with @Mapper, match the interface name and package with the namespace attribute of the mapper tag inside the XML and the methods name with the id attribute of each query block. Then you can map the method parameters to the query parameters usin the @Param annotation and match the method return type with the resultType of the query blocks of the mapper file.

That's it, you can then autowire (aka Inject) the mapper interface in your Services and Components and you are ready to invoke your queries in pure Java code while forgetting the details of SQL Sessions, connections and so on.

Here is an example service using the UserMapper interface, as you can see the use of the database becomes transparent to the application business logic.

@Service
public class MyServiceImpl implements MyService {
    private final UserMapper userMapper;
    
    public MyServiceImpl(UserMapper userMapper) {
    	this.userMapper = userMapper;
    }
    
    @Override
    public User findUserByEmail(String userEmail) {
    	return userMapper.findByEmail(userEmail);
    }
}

Apache AGE with MyBatis mappers

The AGE extension adds the power of Graph Databases to PostgreSQL and introduces a new datatype called agtype that is a superset of JSONB. Internally the graph structures are stored inside a schema with the name of the graph and a bunch of tables with the ids and labels of edges and vertices, with a lot of agtype columns containing the additional information like the custom properties of edges and vertices. Even the return type of every query is in fact of type agtype.

This means that to use an AGE Cypher query inside of MyBatis we first need to find a way to map parameters and results to and from agtype.

Let's start with the output parameters, that are the results of the graph query. As discussed, every output in AGE will have type agtype but since it is a superset of JSON, it means that we are getting a JSON string as our output type.

Take the following Cypher query as an example, where Item is the label of our nodes and each node has a set of properties, mapped as a dynamic agtype/JSON object. The output will be a list of records with the ids of the nodes labelled with Item and the properties of each

SELECT * FROM cypher('${schema}', $$
	MATCH (i:Item)
	RETURN id(i), properties(i)
$$) AS (id agtype, properties agtype)

The example result is the following:

id properties
844424930131985 {"code": "1001", "name": "Item 1", "code2": "P1001"}
844424930131986 {"code": "1002", "name": "Item 2", "code2": "P1002"}
844424930131987 {"code": "1003", "name": "Item 3", "code2": "P1003"}

Where the id field is a long value and the properties field is a JSON string.

To help us map from the internal representation to the Java object that should contain it, MyBatis uses TypeHandlers that are objects that map to and from parameters inside Prepared Statements.

To play with that query I created the following MyBatis Mapper interface:

package net.fm.server.mapper.graph;

import net.fm.server.datatype.graph.GraphItem;
import net.fm.server.datatype.graph.ItemProperties;
import org.apache.ibatis.annotations.Mapper;
import org.apache.ibatis.annotations.Param;

import java.util.List;

@Mapper
public interface ItemMapper {

    List<GraphItem<ItemProperties>> findItems();

    List<GraphItem<ItemProperties>> findItem(@Param("code") String code);

}

The expected result of the findItems query is List<GraphItem<ItemProperties>>. The List is handled transparently by MyBatis when there is more than one result in the output. The inner type, that is the GraphItem, is a simple generic bean that holds the id of the node and a generic properties object that map to the custom properties of the node or relationship

package net.fm.server.datatype.graph;

public class GraphItem<T> {

    private long id;
    private T properties;

    public long getId() {
        return id;
    }

    public void setId(long id) {
        this.id = id;
    }

    public T getProperties() {
        return properties;
    }

    public void setProperties(T properties) {
        this.properties = properties;
    }
}

The properties object, in this example, is represented by the ItemProperties type that is another simple JavaBean that maps the output of the properties JSON.

package net.fm.server.datatype.graph;

public class ItemProperties {

    private String code;
    private String code2;	
    private String name;

    public String getCode() {
        return code;
    }

    public void setCode(String code) {
        this.code = code;
    }

    public String getCode2() {
        return code2;
    }

    public void setCode2(String code2) {
        this.code2 = code2;
    }

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }
}

To correctly map the query and the results we need the XML mapper file that is the following:

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE mapper
 PUBLIC "-//ibatis.apache.org//DTD Mapper 3.0//EN"
 "http://ibatis.apache.org/dtd/ibatis-3-mapper.dtd">

<mapper namespace="net.fm.server.mapper.graph.ItemMapper">

	<resultMap id="item" type="net.fm.server.datatype.graph.GraphItem">
		<id column="id" property="id" />
		<result column="properties" property="properties" javaType="net.fm.server.datatype.graph.ItemProperties" typeHandler="net.fm.server.mapper.typehandler.JsonTypeHandler" />
	</resultMap>

	<select id="findItems" resultMap="item">
		SELECT * FROM cypher('${schema}', $$
			MATCH (i:Item)
			RETURN id(i), properties(i)
		$$) AS (id agtype, properties agtype)
	</select>

	<select id="findItem" resultMap="item">
		SELECT * FROM cypher('${schema}', $$
			MATCH (i:Item {code: '${code}'})
			RETURN id(i), properties(i)
		$$) AS (id agtype, properties agtype)
	</select>

</mapper>

In this example you can see that at the top of the file there is a new tag, the resultMap tag. This is another powerful feature of MyBatis that allows us to specify the details of how we want the output to be mapped to the Java counterpart. In our case we have a GraphItem that has two properties:

  • the id property mapped with the <id /> tag (this is an optimization for caching reasons)
  • the properties property that is mapped via the <result /> tag to an ItemProperties object through a custom JsonTypeHandler class.

The latter is where the "magic" happens, meaning theat when the MyBatis gets the results of the query, for each record it invokes the type handler to map the JSON structure to our ItemProperties class and then sets it and the id inside the corresponding properties of the GraphItem object, then it collects all the results inside a list and returns them, reflecting the exact signature of our mapper method.

The custom JsonTypeHandler relies on the Jackson library to convert to and from JSON, and is defined as follows:

package net.fm.server.mapper.typehandler;

import com.fasterxml.jackson.core.JsonProcessingException;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.SerializationFeature;
import org.apache.ibatis.type.BaseTypeHandler;
import org.apache.ibatis.type.JdbcType;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import java.sql.CallableStatement;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.Map;

public class JsonTypeHandler<T> extends BaseTypeHandler<T> {
    private static final Logger logger = LoggerFactory.getLogger(JsonTypeHandler.class);
    private static final ThreadLocal<ObjectMapper> objectMapper = ThreadLocal.withInitial(() -> {
        ObjectMapper objectMapper = new ObjectMapper();
        objectMapper.configure(SerializationFeature.WRITE_DATES_AS_TIMESTAMPS, false);
        return objectMapper;
    });
    private final Class<T> type;

    public JsonTypeHandler(Class<T> type) {this.type = type;}
    public JsonTypeHandler() {this.type = (Class<T>) new Object().getClass();}

    @Override
    public void setNonNullParameter(PreparedStatement preparedStatement, int i, T t, JdbcType jdbcType) throws SQLException {
        try {
            preparedStatement.setString(i, objectMapper.get().writeValueAsString(t));
        } catch (JsonProcessingException e) {
            throw new SQLException(e);
        }
    }

    @Override
    public T getNullableResult(ResultSet resultSet, String columnName) throws SQLException {
        String jsonString = resultSet.getString(columnName);
        return readFromString(jsonString);
    }

    @Override
    public T getNullableResult(ResultSet resultSet, int columnIdx) throws SQLException {
        String jsonString = resultSet.getString(columnIdx);
        return readFromString(jsonString);
    }

    @Override
    public T getNullableResult(CallableStatement callableStatement, int columnIdx) throws SQLException {
        String jsonString = callableStatement.getString(columnIdx);
        return readFromString(jsonString);
    }

    private T readFromString(String jsonString) throws SQLException {
        if (jsonString == null) {
            return null;
        }
        try {
            if (type == null) {
                return (T) objectMapper.get().readValue(jsonString, Map.class);
            } else if (type.equals(String.class)) {
                return (T) jsonString.replaceAll("^\"|\"$", "");
            } else {
                return objectMapper.get().readValue(jsonString, type);
            }
        } catch (Exception e) {
            logger.error("Error converting JSON value", e);
            return null;
        }
    }
}

This allows the mapping of the JSON object to our Java counterpart, skipping properties that we don't need or know. In case we need to map all the properties we can use a Java Map<String, Object> object that is the most generic way to represent a JSON object in Java.

MyBatis parameters in Cypher queries

If you noticed, inside the mapper there is also another method, findItem that takes a code parameter to look for a specific item inside the graph.

Unfortunately, at the time of writing, AGE doesn't fully support Prepared Statement's parameter mapping, so we need a workaround to let MyBatis pass the parameter value to the query.

This can be done using the text replacement feature of MyBatis, that is the ${param} representation (instead of the #{param} one), but it must be used wisely since it is a plain replacement of the value, in fact I had to wrap the parameter with the single quotation mark ('${code}'). This can cause SQL Injection problems if the value is not correctly handled and escaped so we need to address it.

In the next article we will look at how to solve this issue with parameters mapping and how to pass complex objects to the CREATE and other statements to build more useful Cypher queries using MyBatis. The Cypher syntax is in fact slightly different from JSON when you need to represent the internal properties of a node (it doesn't wrap the property names in double quotes), so we need a different component to translate a complex object to the corresponding Cypher representation.