Demystifying ElasticSearch refresh !

By default ElasticSearch is configured to refresh the index every 1 second. This means it will take atleast 1 second to propagate the changes that are made to a document to be made visible during search.But what if we have a requirement to trigger a process only when the search results are made available.

We want index/update or insert request to wait , until the changes made to the documents are available for search before it returns.

refresh parameter is available for these API’s to control when we want our index to refresh and changes made available to user.

  • Setting it to true , refresh = true will cause relevant Primary and Secondary shard , not complete index to be refreshed immediately.
  • Setting it to wait_for, refresh=wait_for will cause the request to wait until the index is refreshed by ElasticSearch based on index.refresh-interval i.e 1 sec by default. Once the index is refreshed the request returns.
  • Setting it to false, refresh=false has no impact on refresh and request returns immediately. It simply means the data will be available in near future.

Note: ElasticSearch will refresh only those shards that have changed, not the entire index

But there is catch in these simple parameters, there are cases that will cause refresh to happen irrespective of the value of refresh parameter you have set

  • if index.max_refresh_listeners which defaults to 1000 is reached. refresh=wait_for will cause the relevant shard to be refreshed immediately.
  • By default GET is realtime i.e each time a GET request is made, it issues index refresh for the appropriate segment. Causing all changes to be made available.

Leave a Reply

Your email address will not be published.