A few weeks back, our Elasticsearch cluster stopped executing any watchers. Doing initial analysis it looked like there is some problem with AWS SMTP service. As we use AWS SMTP for sending mail alerts to our LDAP accounts. After going through more logs and spending some time in understanding the sent mail statistics on AWS, thanks to AWS for providing intuitive UI to get insights of emails that are getting rejected. We were sure there is no problem with sending of email but something is wrong on the current master. Analyzing below log line it was clear that there is some issue with .watcher index.
2018-05-10T07:05:16,969][WARN ][o.e.g.DanglingIndicesState] [es-master-1] [[.watches/23nm9NSrSkeZaK4Dtyughg]] cannot be imported as a dangling index, as index with same name already exists in cluster metadata
- Delete the local directory: The log line tells the node name that is holding a stale copy of index along with the directory name. In our case it was es-master-1 node name with the directory 23nm9NSrSkeZaK4Dtyughg under data folder for the master.
- Restart Watcher Service: Once the stale index directory is deleted, restart the watcher service
Horizontal Scaling means you are adding more nodes(m/c ‘s )to your cluster in order to support more load and increase performance while Vertical Scaling is increasing power of a single m/c in order to increase performance or to maintain it with increasing load by increasing (RAM,CPU)
In DB world Horizontal scaling refers to Data Sharding where the data is split into small partitions with each partition(Shard) lying in different nodes in a cluster whereas in Vertical scaling data resides on a single node and scaling is achieved by multiple cores
Sharding is the partitioning mechanism of dividing a very large DB into small,faster, manageable parts called Shards such that all these shards are independent of each other and shares nothing and thus can be distributed across different servers while enjoying all the benefits of horizontal scaling .
Sharding is just another name for “horizontal partitioning” of a database
Horizontal partitioning is a design principle whereby rows of a database table are held separately, rather than splitting by columns.Where each partition consist of some number of rows and forms a part of shard.
With time as the DB grows the time taken to query it increases exponentially sharding helps in scaling the DB horizontally to achieve the performance benefits.