Blending two worlds: Apache AGE with MyBatis (part 2)

Let's make queries more useful by passing data and parameters via the MyBatis mappers

Blending two worlds: Apache AGE with MyBatis (part 2)
Photo by Bilal O. / Unsplash

In my previous post I talked about Apache AGE and how to use it in a Spring Boot application together with MyBatis. In this new article I will make things a little bit more interesting by explaining how to use dynamic queries and passing parameters and data to the database.

In the previous post, I showed how to get data from the graph database using a combination of custom Java objects and a JsonTypeHandler for MyBatis. In the example I used only static queries, meaning that no parameters nor data was passed to the query to get results.

Since no application can be really useful just by using static queries, we will now have a look at how to interact with the database in a more dynamic way, by passing data and parameters to the Cypher code.

Querying with Cypher

As previously discussed, to interact with Apache AGE we need to use the cypher function inside the FROM clause of the SQL query. The function takes three parameters:

  • the name of the graph
  • the Cypher query
  • an optional parameter object - more on this later

The example query used in my previous post was the following:

SELECT * FROM cypher('${schema}', $$
    MATCH (i:Item)
    RETURN id(i), properties(i)
$$) AS (id agtype, properties agtype)
Retrieve all the Item nodes

This just means: "match all the nodes tagged as Item and return their id and custom properties".

Cypher is the query language created by Neo4J to interact with graph databases in a clear and easy way. Apache AGE uses a dialect of Cypher named OpenCypher, but at the time of writing this article it only supports a subset of the complete language specification.

That said, in Cypher, like in SQL, there is often more than one way to do the same query, and you are free to choose the one that is best suited to the specific needs of your application.

Let's say that instead of all of the nodes tagged with Item, we just want one of them, the one with the custom property code with value 1001. This can be done in two ways:

  • by adding the custom property filter inside the MATCH instruction
  • by using a WHERE clause

The first way, using the MATCH filter is like the following:

SELECT * FROM cypher('${schema}', $$
    MATCH (i:Item { code: "1001" })
    RETURN id(i), properties(i)
$$) AS (id agtype, properties agtype)
MATCH with custom property

As you can see the parameter is used as a template inside the MATCH instruction. This way we can retrieve all nodes tagged with the Item label AND having a custom property named code with value 1001, just as desired, but we can only match by equality of the exact value of the property.

The second one using WHERE is the following:

SELECT * FROM cypher('${schema}', $$
    MATCH (i:Item)
    WHERE i.code = "1001"
    RETURN id(i), properties(i)
$$) AS (id agtype, properties agtype)
Using the WHERE clause

With the WHERE clause we can use more expressive conditions like in a traditional SQL query, using both operators (like =, <, > or =~ for like) and functions.

Passing parameters to the query

Aside writing static queries like the previous ones, Apache AGE supports the use of dynamic parameters, but in an impractical way in my opinion. The documentation explains how to use prepared statements to pass parameters to Cypher queries, but the only way is by explicitly defining the prepared statement like in the following example:

PREPARE find_item_query(agtype) AS
SELECT *
FROM cypher('graph', $$
    MATCH (i:Item) 
    WHERE i.code = $code //Cypher parameter
    RETURN i
$$, $1) //An SQL Parameter must be placed in the cypher function call
AS (i agtype);

To run the query, you then need to execute it with the EXECUTE statement, like the following example:

EXECUTE find_item_query('{"code": "1001"}');

This way you can pass a parameter object of type agtype (the custom JSON superset introduced by AGE), and it surely works if you call the query by hand, but it still has a couple of drawbacks:

  • first, you have to define all the queries before you can use them, and you are not allowed to call the PREPARE statement with the same name twice of course. If you need to change the query you first need to DEALLOCATE the prepared statement and them redefine it.
  • second, it doesn't work with the way MyBatis handles the query and pass parameters to it with the standard #{param} notation.

A viable workaround

In addition to the #{param} notation to pass parameters, MyBatis offers an additional way to pass dynamic content to a query, that is the ${param} notation. This is not the way prepared statement uses to pass parameters to the query, instead it is a more basic text replacement. You read it correctly, it's text substitution, meaning that the raw value of the parameter is copied inside the query text replacing the ${param} placeholder.

This of course is not a good way to pass parameters coming from the outside world because it can lead to any sort of SQL Injection attack. But the fact that a raw text replacement is unsafe doesn't mean that the entire process must be unsafe.

Let me explain. When you insert a ${param} placeholder in your query, MyBatis takes the param variable from the inputs to the query, gets it string content and replaces placeholder with the text. If you noticed, I used this exact same notation to pass the name of the graph to the cyhper function in my example query. I can do it safely because the schema variable comes from the application settings, so it is handled by me and I can be quite sure about its value.

So how can you make a totally unsafe text replacement into a safe alternative to pass parameters to a query?

As I said, to replace the placeholder with the final value, MyBatis takes the parameter and gets its string representation. This means that if you pass an object that is not a String, MyBatis invokes the toString() method to get the final String representation.

This means that we can create a wrapper object that takes the value of our parameter and translates it into its safe string representation. And since AGE internally uses the agtype for all the inputs and outputs we can build our wrapper on top of it.

The AgtypeWrapper

The idea is pretty simple, take the following code as an example

public class AgtypeWrapper<T> {
    private final T object;

    public AgtypeWrapper(T object) {this.object = object;}

    public static <T> AgtypeWrapper<T> from(T object) {
        return new AgtypeWrapper<>(object);
    }

    // body of the class skipped for brevity
    
    @Override
    public String toString() {
        if (object instanceof String
                || object instanceof Integer || object instanceof Long
                || object instanceof Double || object instanceof Float
                || object instanceof BigInteger
                || object instanceof BigDecimal
                || object instanceof Boolean) {
            return serializeAgtype(object);
        } else {
            return serializeAgtype(makeAgtypeFromObject(object));
        }
    }
}

This is the main Idea behind this wrapper, take the content of an object and convert it into its agtype representation, then pass it to the MyBatis query as text.

I know it can be a raw approach but it is very similar to the way TypeHandlers work when converting output data, and it works.

The following is an example of the usage of the wrapper:

FindItemParams params = new FindItemParams();
params.setCode(code);
GraphItem<ItemProperties> item = itemsMapper.findItemByParams(AgtypeWrapper.from(params))
example usage of the AgtypeWrapper inside a MyBatis Mapper

You define your parameter object, in this example by declaring a variable of type FindItemParams. To pass it to the mapper you then need to call the from(object) builder method of the AgtypeWrapper. Since the wrapper is not a string, MyBatis will then call the toString() method to get the string representation of the value, and this triggers the internal converison of our param into its agtype equivalent.

In my case the FinditemParams class only contains the code parameter, so the agtype representation is the following:

{
  code: "value"
}
agtype representation of the params variable

This looks like JSON but it doesn't have the double quotes around the property name, similar to a JavaScript object definition.

The query invoked by the mapper is the following:

SELECT * FROM cypher('${schema}', $$
    MATCH (i:Item ${params})
    RETURN id(i), properties(i)
$$) AS (id agtype, properties agtype);

where our params are passed as ${params} without the need of additional escaping.

The same can be done for basic types like String, Integer, Double, Boolean and so on since the wrapper will automatically escape the value while converting it to agtype.

To pass a string for example you can use the wrapper like this:

AgtypeWrapper.from("myvalue")

since the from method is generic and automatically creates the AgtypeWrapper wit the correct representation

public static <T> AgtypeWrapper<T> from(T object) { /*...*/}
the AgtypeWrapper from method

How the wrapper works

The principle around which the wrapper works is pretty simple: when someone calls the toString method, it takes the internal value, converts it to agtype and then serializes the value to string.

The serializeAgtype method of the wrapper is a slightly modified version of the method with the same name inside the AgtypeUtil class you can find in the Java driver inside the AGE GitHub repository.

incubator-age/AgtypeUtil.java at master · apache/incubator-age
Graph database optimized for fast analysis and real-time data processing. It is provided as an extension to PostgreSQL. - incubator-age/AgtypeUtil.java at master · apache/incubator-age

The makeAgtypeMapFromObject method uses Java reflection to get all the properties of the wrapped object and build an agtype representation. This can be a simple value if the object is of basic type, like a string, integer, boolean and so on, or an AgtypeMap or AgtypeList if it is a complex object or an array or collection.

The wrapper implementation currently handles the following property types:

  • String
  • Integer / Long
  • Float / Double
  • BigInteger
  • BigDecimal
  • Boolean
  • Array
  • Collection (like List or Set)
  • Map
  • Custom object following the JavaBean standard

You can find the AgtypeWrapper inside my age-utils repository:

GitHub - fabiomarini/age-utils: Apache AGE utilities for usage with Spring and MyBatis
Apache AGE utilities for usage with Spring and MyBatis - GitHub - fabiomarini/age-utils: Apache AGE utilities for usage with Spring and MyBatis