ElasticSearch Fuzzy Query Example in Java

The ElasticSearch fuzzy query is useful when users search with mistyped or misspelled keywords. It can also be used to find similar words based on Levenshtein edit distance, which is the minimum number of single-character edits (insertions, deletions, or substitutions) required to change one word into the other.
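
To make the definition of edit distance concrete, here is a minimal, self-contained sketch in plain Java. It is not how ElasticSearch computes matches internally (the class name EditDistance and the sample words are illustrative assumptions); it only shows the textbook dynamic-programming calculation of the Levenshtein distance between two words.

```java
class EditDistance {

    // Textbook dynamic-programming Levenshtein distance, keeping only two rows.
    public static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // building the first j characters of b from "" takes j insertions
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // reducing the first i characters of a to "" takes i deletions
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                int insertion = curr[j - 1] + 1;
                int deletion = prev[j] + 1;
                int substitution = prev[j - 1] + cost;
                curr[j] = Math.min(Math.min(insertion, deletion), substitution);
            }
            // Swap the rows so that prev always holds the last completed row.
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    public static void main(String[] args) {
        System.out.println(levenshtein("vitalflux", "vitalflex")); // one substitution
        System.out.println(levenshtein("vitalflux", "vitaflux"));  // one deletion
        System.out.println(levenshtein("kitten", "sitting"));      // three edits
    }
}
```

With a fuzziness of one, a term such as "vitalflex" or "vitaflux" would still be within reach of the search text "vitalflux", because each is a single edit away.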

In this post, fuzzy search using the ElasticSearch Java API is demonstrated. The following points are covered:

  • Getting Setup with ElasticSearch and Kibana
  • ElasticSearch Library POM Entries
  • Using Fuzzy Query API for Fuzzy Search
  • Using Match Query API for Fuzzy Search
  • Using Bool Query API for Fuzzy Search

Getting Setup with ElasticSearch and Kibana

First and foremost, get set up with ElasticSearch and Kibana. For a Windows environment, refer to my post on Getting Started with ElasticSearch and Kibana on Windows.

ElasticSearch Library POM Entries

Create a Java Maven project. Add the following dependencies to the pom.xml file to work with the ElasticSearch Java APIs:

<dependency>
    <groupId>org.elasticsearch.client</groupId>
    <artifactId>transport</artifactId>
    <version>6.2.2</version>
</dependency>
<dependency>
    <groupId>org.json</groupId>
    <artifactId>json</artifactId>
    <version>20180130</version>
</dependency>

Using Fuzzy Query API for Fuzzy Search

The following steps are required to search the index with a fuzzy query:

  • Create an instance of TransportClient
  • Create an instance of QueryBuilder using the fuzzyQuery API
  • Create an instance of SearchRequestBuilder to build the request object
  • Invoke the get() method on the SearchRequestBuilder instance to execute the search
  • Iterate through the search results
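
import java.io.IOException;
import java.net.InetAddress;

import org.elasticsearch.action.search.SearchRequestBuilder;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.transport.TransportClient;
import org.elasticsearch.common.settings.Settings;
import org.elasticsearch.common.transport.TransportAddress;
import org.elasticsearch.common.unit.Fuzziness;
import org.elasticsearch.index.query.QueryBuilder;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.SearchHit;
import org.elasticsearch.transport.client.PreBuiltTransportClient;

public class App {

    private static final String INDEX_NAME = "recruitment";
    private static final String INDEX_TYPE = "interviews";
    // Name of the document field to search and to read back from each hit
    private static final String KEY_NAME = "name";

    public static void main(String[] args) throws IOException {
        //
        // Create an instance of TransportClient
        //
        TransportClient client = new PreBuiltTransportClient(Settings.EMPTY)
                .addTransportAddress(new TransportAddress(InetAddress.getByName("127.0.0.1"), 9300));
        //
        // Create a query builder using fuzzyQuery Method
        // Name of the key to search: name
        // Value to search: "vitalflux"
        //
        QueryBuilder queryBuilder = QueryBuilders.fuzzyQuery(KEY_NAME, "vitalflux").boost(1.0f).prefixLength(0)
                .fuzziness(Fuzziness.ONE).transpositions(true);
        //
        // Create an instance of SearchRequestBuilder
        //
        SearchRequestBuilder requestBuilder = client.prepareSearch(INDEX_NAME).setTypes(INDEX_TYPE)
                .setQuery(queryBuilder).setSize(100);
        //
        // Get the search result
        //
        SearchResponse response = requestBuilder.get();
        //
        // Iterate through search results
        //
        SearchHit[] srchHits = response.getHits().getHits();
        String[] result = new String[srchHits.length];
        int i = 0;
        for (SearchHit srchHit : srchHits) {
            result[i++] = (String) srchHit.getSourceAsMap().get(KEY_NAME);
        }
        //
        // Release the client's network resources
        //
        client.close();
    }
}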
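
The fuzzy query above sets three knobs: fuzziness, prefixLength, and transpositions. The sketch below is a rough, standalone approximation of what they mean, written in plain Java without the ElasticSearch library. The class FuzzyMatchSketch and its methods are hypothetical names introduced only for illustration, and the matching logic is a simplified model, not the engine's actual implementation. It assumes that the first prefixLength characters must match exactly and that, with transpositions enabled, swapping two adjacent characters counts as a single edit.

```java
class FuzzyMatchSketch {

    // Edit distance between a and b. When allowTranspositions is true, swapping two
    // adjacent characters costs one edit (optimal string alignment); otherwise this
    // is the plain Levenshtein distance.
    public static int editDistance(String a, String b, boolean allowTranspositions) {
        int[][] d = new int[a.length() + 1][b.length() + 1];
        for (int i = 0; i <= a.length(); i++) {
            d[i][0] = i;
        }
        for (int j = 0; j <= b.length(); j++) {
            d[0][j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                d[i][j] = Math.min(Math.min(d[i - 1][j] + 1, d[i][j - 1] + 1),
                        d[i - 1][j - 1] + cost);
                if (allowTranspositions && i > 1 && j > 1
                        && a.charAt(i - 1) == b.charAt(j - 2)
                        && a.charAt(i - 2) == b.charAt(j - 1)) {
                    d[i][j] = Math.min(d[i][j], d[i - 2][j - 2] + 1);
                }
            }
        }
        return d[a.length()][b.length()];
    }

    // Simplified model of a fuzzy match: the leading prefixLength characters of the
    // search text must appear unchanged, and the whole term must be within maxEdits.
    public static boolean matches(String indexedTerm, String searchText, int maxEdits,
                                  int prefixLength, boolean transpositions) {
        int p = Math.min(prefixLength, searchText.length());
        if (!indexedTerm.startsWith(searchText.substring(0, p))) {
            return false;
        }
        return editDistance(indexedTerm, searchText, transpositions) <= maxEdits;
    }

    public static void main(String[] args) {
        // "vitalfulx" swaps the adjacent letters "l" and "u" of "vitalflux".
        System.out.println(matches("vitalfulx", "vitalflux", 1, 0, true));  // true
        System.out.println(matches("vitalfulx", "vitalflux", 1, 0, false)); // false
        // A typo in the first character is rejected once prefixLength is 1.
        System.out.println(matches("bitalflux", "vitalflux", 1, 0, true));  // true
        System.out.println(matches("bitalflux", "vitalflux", 1, 1, true));  // false
    }
}
```

In this model, a larger prefixLength narrows the set of candidate terms, which is why it is often raised when typos in the first characters are considered unlikely.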

Using Match Query API for Fuzzy Search

The following code builds the QueryBuilder instance with the Match Query API; the instance is then used to build the SearchRequestBuilder. The rest of the code remains the same as in the fuzzy query example above.

QueryBuilder queryBuilder = QueryBuilders.matchQuery("name", "vitalflux").fuzziness(Fuzziness.ONE).boost(1.0f).prefixLength(0).fuzzyTranspositions(true);
//
// Create an instance of SearchRequestBuilder
//
SearchRequestBuilder requestBuilder = client.prepareSearch(INDEX_NAME).setTypes(INDEX_TYPE).setQuery(queryBuilder).setSize(100);
//
// Get the search result
//
SearchResponse response = requestBuilder.get();

Using Bool Query API for Fuzzy Search

The following code builds a fuzzy match query and wraps it in the must clause of a Bool Query; the resulting QueryBuilder instance is then used to build the SearchRequestBuilder. The rest of the code remains the same as in the fuzzy query example above.

//
// Create the fuzzy match query (field "name", search text "vitalflux", one edit allowed)
//
QueryBuilder fuzzyQueryBuilder = QueryBuilders.matchQuery("name", "vitalflux").fuzziness(Fuzziness.ONE).boost(1.0f).prefixLength(0).fuzzyTranspositions(true);
//
// Create Bool Query Builder
//
final QueryBuilder boolQueryBuilder = QueryBuilders.boolQuery().must(fuzzyQueryBuilder);
//
// Create an instance of SearchRequestBuilder
//
SearchRequestBuilder requestBuilder = client.prepareSearch(INDEX_NAME).setTypes(INDEX_TYPE).setQuery(boolQueryBuilder).setSize(100);
//
// Get the search result
//
SearchResponse response = requestBuilder.get();

Summary

In this post, you learned how to run fuzzy searches against ElasticSearch with the Java APIs, using the fuzzy, match, and bool query builders.

Did you find this article useful? Do you have any questions or suggestions about this article in relation to doing fuzzy search using ElasticSearch Java APIs? Leave a comment and ask your questions and I shall do my best to address your queries.

Ajitesh Kumar
