Tag Archives: Distributed Computing

Vector Clock

What is a Vector Clock?
To provide reliability and availability, data is replicated across several data centers. A vector clock helps detect which version of the data is newer, and whether two versions conflict, so that the data read at any point in time can be kept consistent.

Say there are n nodes. Each node has its own counter (a clock), so if the data is replicated across 3 nodes, the vector clock is an array of 3 counters, one per node.

Each node maintains its own tick and increments it every time it modifies the data. It then passes the update, together with the updated vector, to the other replicas.

The initial tick for all nodes N1, N2, N3 is T=0. Now say a modification is made at node N1, so it ticks its clock by 1.
N1, now at T1, passes the update to N2 and N3, after which each node holds the same vector:

At N1: (N1 T1)(N2 T0)(N3 T0)
At N2: (N1 T1)(N2 T0)(N3 T0)
At N3: (N1 T1)(N2 T0)(N3 T0)
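
The tick-and-pass step can be sketched in a few lines of Python. This is only an illustration, not a reference implementation: the helper names `new_clock`, `tick` and `merge` are made up for this post, and the clock is modelled as a plain dict from node name to counter.

```python
def new_clock(nodes):
    """Start every node's counter at 0."""
    return {node: 0 for node in nodes}

def tick(clock, node):
    """Return a copy of `clock` with `node`'s own counter advanced by one."""
    updated = dict(clock)
    updated[node] += 1
    return updated

def merge(local, received):
    """Entry-wise maximum of two clocks, applied when a replica receives an update."""
    return {node: max(local[node], received[node]) for node in local}

nodes = ["N1", "N2", "N3"]
clocks = {node: new_clock(nodes) for node in nodes}

# N1 modifies the data, ticks its own entry, and sends the clock to N2 and N3.
clocks["N1"] = tick(clocks["N1"], "N1")
for peer in ("N2", "N3"):
    clocks[peer] = merge(clocks[peer], clocks["N1"])

for node in nodes:
    print(node, clocks[node])
```

After the loop all three nodes hold the same vector, with N1 at 1 and the other two entries at 0.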

Now say the data is changed at node N2, so it ticks its clock by 1 from T0 to T1 and passes the update to N1 and N3. Every node now holds (N1 T1)(N2 T1)(N3 T0).

Now say N3 makes an update and sends it to N1 and N2, N2 makes an update and sends it to N1 and N3, and N1 does the same.
But due to a network breakdown none of these messages are delivered, so each node only sees its own tick. The status of the vector at each node is then:

At N1: (N1 T2)(N2 T1)(N3 T0)
At N2: (N1 T1)(N2 T2)(N3 T0)
At N3: (N1 T1)(N2 T1)(N3 T1)

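None of these vectors is newer than the others, so the updates conflict. A vector clock can only detect such a conflict; resolving it needs a separate algorithm, which varies with the requirements of the application.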
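
Before a conflict can be resolved it has to be detected. Here is a rough sketch of that comparison, assuming every node held (N1 T1)(N2 T1)(N3 T0) before the breakdown and then ticked only its own entry. The function name `compare` and the variable names are hypothetical, and both clocks are assumed to list the same nodes.

```python
def compare(a, b):
    """Classify two vector clocks: 'equal', 'before', 'after', or 'concurrent'."""
    a_le_b = all(a[n] <= b[n] for n in a)
    b_le_a = all(b[n] <= a[n] for n in a)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"      # b has seen everything a has, and more
    if b_le_a:
        return "after"       # a has seen everything b has, and more
    return "concurrent"      # neither is ahead of the other: a conflict

# Assumed state: the vector all nodes shared before the network went down ...
shared = {"N1": 1, "N2": 1, "N3": 0}
# ... and the vector at each node after it ticked its own entry in isolation.
at_n1 = {"N1": 2, "N2": 1, "N3": 0}
at_n2 = {"N1": 1, "N2": 2, "N3": 0}
at_n3 = {"N1": 1, "N2": 1, "N3": 1}

print(compare(shared, at_n1))  # before
print(compare(at_n1, at_n2))   # concurrent
print(compare(at_n2, at_n3))   # concurrent
```

A "concurrent" result is where a resolution strategy has to step in, since the clocks alone cannot say which update should win.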



Intro to Dynamo DB

For services such as session management and sales, a relational DB limits scale and availability, while Dynamo addresses these problems much more efficiently.

Data is partitioned and replicated using consistent hashing, while consistency between replicas during updates is maintained by a quorum-like technique (quorum protocol) and a decentralized replica synchronization protocol.
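
To make the two ideas concrete, here is a small sketch of a hash ring that picks the replicas for a key, plus the usual quorum overlap check. It is an illustration only, not how Dynamo DB is actually implemented: the names `HashRing`, `preference_list` and `quorums_overlap`, the node names and the key are all invented, MD5 is used simply as a stable hash, and virtual nodes are left out.

```python
import bisect
import hashlib

def ring_position(key):
    """Map a string to a point on the ring (MD5 is used here only as a stable hash)."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)

class HashRing:
    def __init__(self, nodes):
        # Sorted (position, node) pairs form the ring.
        self._ring = sorted((ring_position(node), node) for node in nodes)
        self._positions = [pos for pos, _ in self._ring]

    def preference_list(self, key, n_replicas):
        """Walk clockwise from the key's position and collect n_replicas nodes."""
        start = bisect.bisect_right(self._positions, ring_position(key))
        count = min(n_replicas, len(self._ring))
        return [self._ring[(start + i) % len(self._ring)][1] for i in range(count)]

def quorums_overlap(n, r, w):
    """True when any r replicas read must include one of the w replicas written."""
    return r + w > n

ring = HashRing(["node-a", "node-b", "node-c", "node-d"])
replicas = ring.preference_list("session:42", n_replicas=3)
print(replicas)
print(quorums_overlap(n=3, r=2, w=2))
print(quorums_overlap(n=3, r=1, w=1))
```

Adding or removing a node only moves the keys next to it on the ring, which is the main reason for using consistent hashing over a plain `hash(key) % n`.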

Dynamo is a completely decentralized system with minimal need for manual administration.

When to use Dynamo DB? (not restricted to!!)

For applications that only need read and write operations on a data item uniquely identified by a primary key, and whose operations do not span multiple data items. Dynamo DB targets applications that need to store objects that are relatively small in size (less than 1MB).

We need to admit the fact that availability and consistency cannot both be achieved 100% at the same time; you have to choose one of the two.

Since the data is replicated across many regions in various data centers, an update takes time to reach all the replicas. There are two choices:

**) Either make the data unavailable until the update has been applied across all the replicas (known as a Strongly Consistent read)


**) Or serve the read immediately from a copy that may not yet be up to date (known as an Eventually Consistent read)

Dynamo is designed to be an eventually consistent data store, with all updates reaching all replicas eventually.

Dynamo DB pushes the complexity of conflict resolution to reads in order to ensure that writes are never rejected, i.e. even if the most recent update has not yet reached all the replicas, you can still write and keep on updating. A read may then find conflicting versions of the data, and these are reconciled at read time.
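
Here is a minimal sketch of what pushing conflicts to reads can look like: writes are always accepted, and a read returns every version still in conflict so the caller can reconcile them. The `Replica` class, its `put`/`get` methods and the `descends` helper are hypothetical names for this post, not a real Dynamo DB API, and the union-style merge at the end is just one possible reconciliation.

```python
def descends(a, b):
    """True if clock a has seen everything in clock b (a >= b entry-wise)."""
    return all(a.get(node, 0) >= count for node, count in b.items())

class Replica:
    def __init__(self):
        self._versions = {}  # key -> list of (clock, value) siblings

    def put(self, key, value, clock):
        """Never reject a write: drop versions the new clock supersedes, keep the rest."""
        kept = [(c, v) for c, v in self._versions.get(key, []) if not descends(clock, c)]
        if not any(descends(c, clock) for c, _ in kept):
            kept.append((clock, value))
        self._versions[key] = kept

    def get(self, key):
        """Return every surviving sibling; more than one means the reader must reconcile."""
        return list(self._versions.get(key, []))

replica = Replica()
replica.put("cart", ["book"], {"N1": 1})
# Two updates made on different nodes, both based on the version above.
replica.put("cart", ["book", "pen"], {"N1": 2})
replica.put("cart", ["book", "ink"], {"N1": 1, "N2": 1})

siblings = replica.get("cart")
print(len(siblings))  # 2

# One possible application-level reconciliation: union the items, merge the clocks.
merged_value = sorted({item for _, value in siblings for item in value})
merged_clock = {}
for clock, _ in siblings:
    for node, count in clock.items():
        merged_clock[node] = max(merged_clock.get(node, 0), count)
replica.put("cart", merged_value, merged_clock)
print(replica.get("cart"))
```

After the reconciled write, the merged clock supersedes both siblings, so the next read sees a single version again.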