How to start working with Hibernate Search

Published on November 06, 2018

    Today, many teams build enterprise Java applications with Spring Boot. Over the course of a project, the need for search features of varying complexity often comes up. For example, if you are developing a system that stores data about users and books, sooner or later it will need to search users by first or last name and books by title or annotation.



    In this post I will briefly cover tools that can help solve such problems. Then I will present a demo project of a search service that implements a more interesting and complex feature: keeping entities, the database, and the search index in sync. Using this demo project as an example, you can get acquainted with Hibernate Search, a convenient way to work with full-text indexes such as Lucene and Elasticsearch.

    Among the tools for deploying search engines, I would highlight three.

    Lucene is a Java library that provides a low-level interface to a denormalized document store with full-text search. With it, you can create indexes and fill them with records (documents). Read more about Lucene here.

    Solr is an end product built on Lucene: a full-text database that runs as a standalone web server. It has an HTTP interface for indexing and full-text queries, so you can index documents and search them. Solr has a simple API and a built-in UI, which spares the user from manipulating indexes manually. There was a good comparative analysis of Solr and Lucene on Habr.

    Elasticsearch is a more modern analogue of Solr, also based on Apache Lucene. Compared to Solr, Elasticsearch can withstand higher document indexing loads and is therefore often used to index log files. A detailed table comparing Solr and Elasticsearch can be found online.

    This, of course, is not a complete list; I have picked only the systems that deserve the most attention. There are many other search systems: PostgreSQL has full-text search capabilities, and Sphinx should not be forgotten either.
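
    To make the idea of a full-text index more concrete, here is a minimal sketch of an inverted index in plain Kotlin. It is only an illustration of the principle and not how Lucene is implemented or called; the TinyIndex class and its methods are made up for this example, and the code assumes Kotlin 1.5 or newer (for lowercase()).

```kotlin
// A toy inverted index: maps each lowercase token to the ids of the documents containing it.
// Illustration only; TinyIndex is a hypothetical class, not part of Lucene.
class TinyIndex {
    private val postings = mutableMapOf<String, MutableSet<Long>>()

    // Split text into lowercase tokens on anything that is not a letter or digit
    // (a crude stand-in for a real analyzer).
    private fun tokenize(text: String): List<String> =
        text.lowercase().split(Regex("[^\\p{L}\\p{Nd}]+")).filter { it.isNotEmpty() }

    // Add a document: every token of the text points back to the document id.
    fun add(id: Long, text: String) {
        for (token in tokenize(text)) {
            postings.getOrPut(token) { mutableSetOf() }.add(id)
        }
    }

    // Return the ids of documents that contain every token of the query.
    fun search(query: String): Set<Long> {
        val tokens = tokenize(query)
        if (tokens.isEmpty()) return emptySet()
        return tokens
            .map { postings[it] ?: emptySet<Long>() }
            .reduce { acc, ids -> acc intersect ids }
            .toSet()
    }
}

fun main() {
    val index = TinyIndex()
    index.add(1L, "Ivan Petrov")
    index.add(2L, "Ivan Sidorov")
    println(index.search("ivan"))
    println(index.search("petrov ivan"))
}
```

    A real engine adds analyzers, relevance scoring, and persistent storage on top of this idea, which is why a ready-made tool is usually preferable to writing your own.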

    Main problem


    Now to the main point. A relational database (RDB) is commonly used for reliable, consistent data storage. It provides transactionality in accordance with the ACID principles. A search engine, on the other hand, works with an index, to which you need to add the entities and the table fields that will be searched. That is, when a new object enters the system, it must be saved both in the relational database and in the full-text index.

    If your application does not make these two changes transactionally, various kinds of desynchronization may occur. For example, you read an object from the database, but there is no entry for it in the index. Or vice versa: the index contains a record for an object that has already been deleted from the RDB.
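
    The desynchronization described above can be reproduced with a small sketch. FakeDatabase, FakeIndex, saveNaively, and saveCoordinated are hypothetical in-memory stand-ins invented for this illustration. The "coordinated" variant uses a simple compensating rollback, which is only an approximation of a real transaction.

```kotlin
// In-memory stand-ins for the relational database and the full-text index (hypothetical).
class FakeDatabase {
    val rows = mutableMapOf<Long, String>()
}

class FakeIndex(var failing: Boolean = false) {
    val documents = mutableMapOf<Long, String>()

    fun add(id: Long, text: String) {
        if (failing) throw IllegalStateException("index unavailable")
        documents[id] = text
    }
}

// Naive dual write: the database row stays in place even if indexing fails.
fun saveNaively(db: FakeDatabase, index: FakeIndex, id: Long, name: String) {
    db.rows[id] = name
    index.add(id, name)
}

// Coordinated write: undo the database change when indexing fails,
// so either both stores contain the record or neither does.
fun saveCoordinated(db: FakeDatabase, index: FakeIndex, id: Long, name: String) {
    val previous = db.rows.put(id, name)
    try {
        index.add(id, name)
    } catch (e: Exception) {
        if (previous == null) db.rows.remove(id) else db.rows[id] = previous
        throw e
    }
}

fun main() {
    val db = FakeDatabase()
    val index = FakeIndex(failing = true)

    runCatching { saveNaively(db, index, 1L, "Ivan") }
    println("naive: in db=${1L in db.rows}, in index=${1L in index.documents}")

    runCatching { saveCoordinated(db, index, 2L, "Petr") }
    println("coordinated: in db=${2L in db.rows}, in index=${2L in index.documents}")
}
```

    With the naive write, the user exists in the database but cannot be found through search; the coordinated write keeps the two stores consistent at the cost of extra bookkeeping.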

    This problem can be solved in different ways. You can organize transactional changes manually using JTA and Spring transaction management. Or you can take a more interesting route and use Hibernate Search, which does all of this for you. By default it uses Lucene, which stores the index data in the file system; in general, the connection to the index is configurable. When the system starts, you call the startAndWait() method to build the index, and from then on records are stored in both the RDB and the index during operation.
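
    As an illustration of how the index location can be configured, here is a possible application.properties fragment. It assumes Hibernate Search 5.x property names and Spring Boot's spring.jpa.properties.* pass-through to Hibernate; the directory path is a made-up example.

```properties
# Passed through Spring Boot to Hibernate as JPA properties.
# Store the Lucene index on disk (Hibernate Search 5.x property names).
spring.jpa.properties.hibernate.search.default.directory_provider=filesystem
# Hypothetical path; point it at any writable directory.
spring.jpa.properties.hibernate.search.default.indexBase=/var/lucene/indexes
```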

    To illustrate this solution, I prepared a demo project with Hibernate Search. We will create a service containing methods for reading, updating, and searching for users. It can form the basis of an internal database with full-text search by first name, last name, or other metadata. To interact with the relational database, we use the Spring Data JPA framework.

    Let's start with the entity class to represent the user:

    import org.hibernate.search.annotations.Field
    import org.hibernate.search.annotations.Indexed
    import javax.persistence.Entity
    import javax.persistence.Id
    import javax.persistence.Table
    @Entity
    @Table(name = "users")
    @Indexed
    internal data class User(
            @Id
            val id: Long,
            @Field
            val name: String,
            @Field
            val surname: String,
            @Field
            val phoneNumber: String)
    

    Everything is standard: we mark the entity with all the annotations Spring Data needs. @Entity declares the entity and @Table specifies the table in the database. The @Indexed annotation indicates that the entity is indexed and will end up in the full-text index, and @Field marks the properties that are added to it.

    The JPA repository needed for CRUD operations on users in the database:

    import org.springframework.data.jpa.repository.JpaRepository
    internal interface UserRepository: JpaRepository<User, Long>
    

    Service for working with users, UserService.kt:

    import org.springframework.stereotype.Service
    import javax.transaction.Transactional
    @Service
    @Transactional
    internal class UserService(private val userRepository: UserRepository, private val userSearch: UserSearch) {
        fun findAll(): List<User> {
            return userRepository.findAll()
        }
        fun search(text: String): List<User> {
            return userSearch.searchUsers(text)
        }
        fun saveUser(user: User): User {
            return userRepository.save(user)
        }
    }

    findAll gets all users directly from the database. search uses the userSearch component to retrieve users from the index. The component for working with the user search index:

    import org.hibernate.search.jpa.FullTextQuery
    import org.springframework.stereotype.Repository
    import javax.persistence.EntityManager
    import javax.persistence.PersistenceContext
    import javax.transaction.Transactional
    @Repository
    @Transactional
    internal class UserSearch(@PersistenceContext val entityManager: EntityManager) {
        fun searchUsers(text: String): List<User> {
            // obtain the fullTextEntityManager from the entityManager
            val fullTextEntityManager = org.hibernate.search.jpa.Search.getFullTextEntityManager(entityManager)
            // build a query with the Hibernate Search query DSL
            val queryBuilder = fullTextEntityManager.searchFactory
                    .buildQueryBuilder().forEntity(User::class.java).get()
            // specify the fields to search on
            val query = queryBuilder
                    .keyword()
                    .onFields("name")
                    .matching(text)
                    .createQuery()
            // wrap the Lucene query in a Hibernate query object
            val jpaQuery: FullTextQuery = fullTextEntityManager.createFullTextQuery(query, User::class.java)
            // return the list of entities
            return jpaQuery.resultList.map { result -> result as User }.toList()
        }
    }
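
    The component above searches only the name field. As a possible variation, the sketch below shows how the same query DSL could search several indexed fields at once. The searchByFields helper is hypothetical and not part of the demo project; it assumes the Hibernate Search 5.x API already used in this post and must be compiled with the project's dependencies.

```kotlin
import org.hibernate.search.jpa.Search
import javax.persistence.EntityManager

// Hypothetical helper: keyword search over several indexed fields of one entity type.
fun <T> searchByFields(
        entityManager: EntityManager,
        entityType: Class<T>,
        text: String,
        vararg fields: String): List<T> {
    val fullTextEntityManager = Search.getFullTextEntityManager(entityManager)
    val queryBuilder = fullTextEntityManager.searchFactory
            .buildQueryBuilder().forEntity(entityType).get()
    // A document matches if the text matches in any of the listed fields.
    val query = queryBuilder
            .keyword()
            .onFields(*fields)
            .matching(text)
            .createQuery()
    // The JPA result list is untyped, so keep only instances of the requested type.
    return fullTextEntityManager
            .createFullTextQuery(query, entityType)
            .resultList
            .filterIsInstance(entityType)
}
```

    Inside UserSearch it could be called as searchByFields(entityManager, User::class.java, text, "name", "surname"), since both properties are marked with @Field in the entity.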
    

    REST controller, UserController.kt:

    import org.springframework.web.bind.annotation.GetMapping
    import org.springframework.web.bind.annotation.PostMapping
    import org.springframework.web.bind.annotation.RequestBody
    import org.springframework.web.bind.annotation.RestController
    import java.util.*
    @RestController
    internal class UserController(private val userService: UserService) {
        @GetMapping("/users")
        fun getAll(): List<User> {
            return userService.findAll()
        }
        @GetMapping("/users/search")
        fun search(text: String): List<User> {
            return userService.search(text)
        }
        @PostMapping("/users")
        fun insertUser(@RequestBody user: User): User {
            return userService.saveUser(user)
        }
    }

    The controller exposes two read methods, one that fetches all users from the database and one that searches by string, plus a method for saving a user.

    Before the application can serve search requests, the index must be initialized. We do this with an ApplicationListener:

    
    package ru.rti
    import org.hibernate.search.jpa.Search
    import org.springframework.boot.context.event.ApplicationReadyEvent
    import org.springframework.context.ApplicationListener
    import org.springframework.stereotype.Component
    import javax.persistence.EntityManager
    import javax.persistence.PersistenceContext
    import javax.transaction.Transactional
    @Component
    @Transactional
    class BuildSearchService(
            @PersistenceContext val entityManager: EntityManager)
        : ApplicationListener<ApplicationReadyEvent> {
        override fun onApplicationEvent(event: ApplicationReadyEvent?) {
            try {
                val fullTextEntityManager = Search.getFullTextEntityManager(entityManager)
                fullTextEntityManager.createIndexer().startAndWait()
            } catch (e: InterruptedException) {
                println("An error occurred trying to build the search index: $e")
                // restore the interrupt flag so callers can still observe the interruption
                Thread.currentThread().interrupt()
            }
        }
    }

    For testing, I used PostgreSQL:

    spring.datasource.url=jdbc:postgresql:users
    spring.datasource.username=postgres
    spring.datasource.password=postgres
    spring.datasource.driver-class-name=org.postgresql.Driver
    spring.datasource.name=users
    

    And finally build.gradle:

    buildscript {
        ext.kotlin_version = '1.2.61' 
        ext.spring_boot_version = '1.5.15.RELEASE'
        repositories {
            jcenter()
        }
        dependencies {
            classpath "org.jetbrains.kotlin:kotlin-gradle-plugin:$kotlin_version" 
            classpath "org.jetbrains.kotlin:kotlin-allopen:$kotlin_version" 
            classpath "org.springframework.boot:spring-boot-gradle-plugin:$spring_boot_version"
            classpath "org.jetbrains.kotlin:kotlin-noarg:$kotlin_version"
        }
    }
    apply plugin: 'kotlin' 
    apply plugin: "kotlin-spring" 
    apply plugin: "kotlin-jpa" 
    apply plugin: 'org.springframework.boot'
    noArg {
        invokeInitializers = true
    }
    jar {
        baseName = 'gs-rest-service'
        version = '0.1.0'
    }
    repositories {
        jcenter()
    }
    dependencies {
        compile "org.jetbrains.kotlin:kotlin-stdlib:$kotlin_version" 
        compile 'org.springframework.boot:spring-boot-starter-web'
        compile 'org.springframework.boot:spring-boot-starter-data-jpa'
        compile group: 'postgresql', name: 'postgresql', version: '9.1-901.jdbc4'
        compile group: 'org.hibernate', name: 'hibernate-core', version: '5.3.6.Final'
        compile group: 'org.hibernate', name: 'hibernate-search-orm', version: '5.10.3.Final'
        compile group: 'com.h2database', name: 'h2', version: '1.3.148'
        testCompile('org.springframework.boot:spring-boot-starter-test')
    }
    

    The demo above is a simple example of using Hibernate Search that shows how to get Apache Lucene and Spring Data JPA working together. If necessary, projects based on this demo can be switched to Elasticsearch. A potential direction for developing the project further is searching over large indexes (> 10 GB) and measuring their performance. You can also create configurations for Elasticsearch or more complex index configurations, exploring the capabilities of Hibernate Search at a deeper level.

    Useful links: