<h1>"Time, Clocks and the Ordering of Events in a Distributed System"</h1>
<p>By Alessandro Balzano, from the blog <a href="https://wintermade.it/blog/posts/logical-clocks-lamport-timestamps.html">Diplomats can code too!</a></p>
<p>In a distributed system, understanding whether an event happened before another event is a difficult task, but it's necessary to better understand how the system is behaving: what event caused another event?</p>
<p>One solution was proposed by Lamport in his paper <a href="https://www.microsoft.com/en-us/research/publication/time-clocks-ordering-events-distributed-system/">Time, Clocks and the Ordering of Events in a Distributed System</a>: a clock that is updated only via messages sent inside the system, without using external sources (e.g. physical time).</p>
<h2>What is a distributed system?</h2>
<p>A distributed system consists of a collection of distinct processes which are spatially separated, and which communicate with one another by exchanging messages. The paper, while acknowledging that its remarks apply more generally, considers a system <em>distributed</em> if the message transmission delay is <em>not negligible</em> compared to the time between events in a single process.</p>
<p>Events can be anything that can be considered important in the system, for example running a certain subroutine, or sending/receiving messages from other processes.</p>
<h2>Why do we need to know the order of events?</h2>
<p>Knowing the causal precedence relation among the events of processes helps solve a variety of problems in distributed systems, such as designing distributed algorithms, tracking dependent events, learning about the progress of a computation, and measuring concurrency. For example, CockroachDB uses a variant of Lamport's timestamps (the logical clocks explained in this article) to order database transactions. In the paper itself, Lamport shows how his timestamps let him solve a distributed variant of the mutual exclusion problem; we will not discuss it in this blog post, though.</p>
<h2>About time</h2>
<dl>
<dt>From the first paragraph of the article:</dt>
<dd><p>The concept of the temporal ordering of events pervades our thinking about systems. [...] However, we will see that this concept must be carefully reexamined when considering events in a distributed system.</p>
</dd>
</dl>
<p>The first observation is: in distributed systems, we cannot use our intuition about time to decide if something happened before something else, or if some action can be accepted by the system. <a href="https://queue.acm.org/detail.cfm?id=2745385">There is no "now"</a> in distributed systems, especially in the geographically distributed ones.</p>
<p>In a distributed system, it is sometimes impossible to say that one of two events occurred first. We cannot say that something happened at a specific moment, because your "specific moment" and my "specific moment" may not be the same. We can only say that <strong>"happened before"</strong> is a partial ordering of the events in the system.</p>
<p>How can we fix, or at least work around, this uncertainty? Let's use something else in place of physical clocks: logical clocks. The rest of this post explains what they are, and the rules behind them.</p>
<h2>Defining "Happened before"</h2>
<p>It's now time to define the <strong>"happened before"</strong> relation.</p>
<p>The relation "happened before" <span class="math inline">→</span> on the set of events of a system is defined as the smallest relation satisfying the following conditions:</p>
<ol type="1">
<li>If <em>a</em> and <em>b</em> are events in the same process, and <em>a</em> comes before <em>b</em>, then <span class="math inline"><em>a</em> → <em>b</em></span></li>
<li>If <em>a</em> is the sending of a message by one process and <em>b</em> is the receipt of the same message by another process, then <span class="math inline"><em>a</em> → <em>b</em></span></li>
<li>If <span class="math inline"><em>a</em> → <em>b</em></span> and <span class="math inline"><em>b</em> → <em>c</em></span>, then <span class="math inline"><em>a</em> → <em>c</em></span>.</li>
</ol>
<p>Two distinct events <em>a</em> and <em>b</em> are said to be <em>concurrent</em> if both <span class="math inline"><em>a</em> → <em>b</em></span> and <span class="math inline"><em>b</em> → <em>a</em></span> are false.</p>
<p>Assuming that events cannot happen before themselves, the relation is an irreflexive<a class="footnote-ref" href="https://wintermade.it/blog/posts/logical-clocks-lamport-timestamps.html#fn1" id="fnref1" role="doc-noteref"><sup>1</sup></a> partial ordering<a class="footnote-ref" href="https://wintermade.it/blog/posts/logical-clocks-lamport-timestamps.html#fn2" id="fnref2" role="doc-noteref"><sup>2</sup></a> on the set of all events in the system.</p>
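<p>The three conditions can be tried out on a toy trace. The following is only a minimal sketch: the event names, the <code>process_events</code> and <code>messages</code> structures and both helper functions are made up for illustration and do not come from the paper. It builds the smallest relation satisfying the conditions by repeatedly adding transitive pairs, then checks whether two events are concurrent.</p>

```python
from itertools import product

# Hypothetical trace: each process lists its events in local order.
process_events = {
    "P1": ["a1", "a2", "a3"],
    "P2": ["b1", "b2"],
}
# Each message is a (send_event, receive_event) pair.
messages = [("a2", "b2")]


def happened_before(process_events, messages):
    """Return the set of pairs (x, y) such that x -> y."""
    # Condition 2: the sending of a message precedes its receipt.
    relation = set(messages)
    # Condition 1: earlier events in a process precede later ones.
    for events in process_events.values():
        for i, earlier in enumerate(events):
            for later in events[i + 1:]:
                relation.add((earlier, later))
    # Condition 3: add transitive pairs until nothing changes.
    changed = True
    while changed:
        changed = False
        for (x, y), (y2, z) in product(list(relation), repeat=2):
            if y == y2 and (x, z) not in relation:
                relation.add((x, z))
                changed = True
    return relation


def concurrent(x, y, relation):
    """Two distinct events are concurrent if neither precedes the other."""
    return x != y and (x, y) not in relation and (y, x) not in relation


hb = happened_before(process_events, messages)
print(("a1", "b2") in hb)          # a1 -> a2 -> b2 by transitivity
print(concurrent("a3", "b2", hb))  # no chain of events links a3 and b2
```

<p>In this trace, <code>a1</code> precedes <code>b2</code> only through the message sent at <code>a2</code>, while <code>a3</code> and <code>b2</code> are not ordered at all: this is what makes the relation a partial ordering.</p>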
<h2>Giving a number to an event</h2>
<p>We now have a relation between events that defines if one of two events happened before the other one. In order to use it, we need a function that assigns a number to each event - let's call this function <strong>Clock</strong>. Each process <span class="math inline"><em>P</em><sub><em>i</em></sub></span> has its own Clock <span class="math inline"><em>C</em><sub><em>i</em></sub></span>. The entire system of clocks is represented by the function C, which assigns to any event b the number <span class="math inline"><em>C</em>(<em>b</em>)</span>, where <span class="math inline"><em>C</em>(<em>b</em>) = <em>C</em><sub><em>j</em></sub>(<em>b</em>)</span> if <span class="math inline"><em>b</em></span> is an event in the process <span class="math inline"><em>P</em><sub><em>j</em></sub></span>.</p>
<p>The C function (clock of the system) has to respect the <em>Clock Condition</em>.</p>
<dl>
<dt><em>Clock Condition</em>:</dt>
<dd><p>for any event <em>a, b</em>, if <span class="math inline"><em>a</em> → <em>b</em></span> then <span class="math inline"><em>C</em>(<em>a</em>) < <em>C</em>(<em>b</em>)</span></p>
</dd>
</dl>
<p>We cannot expect the converse condition to hold as well, since that would imply that any two concurrent events must occur at the same time.</p>
<p>From our definition of <em>happened before</em>, the Clock Condition is satisfied if the following two conditions hold:</p>
<ul>
<li><strong>C1</strong>: If <em>a</em> and <em>b</em> are events in process <span class="math inline"><em>P</em><sub><em>i</em></sub></span>, and <em>a</em> comes before <em>b</em>, then <span class="math inline"><em>C</em><sub><em>i</em></sub>(<em>a</em>) < <em>C</em><sub><em>i</em></sub>(<em>b</em>)</span></li>
<li><strong>C2</strong>: If <em>a</em> is the sending of a message by process <span class="math inline"><em>P</em><sub><em>i</em></sub></span> and <em>b</em> is the receipt of that message by process <span class="math inline"><em>P</em><sub><em>j</em></sub></span>, then <span class="math inline"><em>C</em><sub><em>i</em></sub>(<em>a</em>) < <em>C</em><sub><em>j</em></sub>(<em>b</em>)</span></li>
</ul>
<p>The first condition, <strong>C1</strong>, helps us order two events in the same process: if one happened before the other, then the earlier event must be assigned a lower number. The second condition, <strong>C2</strong>, tells us how we should handle the communication between processes. Sending or receiving a message is an event too: the sending of a message must have a lower number than the receipt of the same message - <em>can you receive a message before it is sent?</em></p>
<p>To guarantee that the system of clocks satisfies the Clock Condition, we need two implementation rules:</p>
<ul>
<li><strong>IR1</strong>: Each process <span class="math inline"><em>P</em><sub><em>i</em></sub></span> increments <span class="math inline"><em>C</em><sub><em>i</em></sub></span> between any two successive events (the increment itself is <strong>not</strong> an event!)</li>
<li><strong>IR2 (a)</strong>: If event <em>a</em> is the sending of a message <em>m</em> by process <span class="math inline"><em>P</em><sub><em>i</em></sub></span>, then the message <em>m</em> contains a timestamp <span class="math inline"><em>T</em><sub><em>m</em></sub> = <em>C</em><sub><em>i</em></sub>(<em>a</em>)</span></li>
<li><strong>IR2 (b)</strong>: Upon receiving a message <em>m</em>, process <span class="math inline"><em>P</em><sub><em>j</em></sub></span> sets <span class="math inline"><em>C</em><sub><em>j</em></sub></span> greater than or equal to its present value and greater than <span class="math inline"><em>T</em><sub><em>m</em></sub></span></li>
</ul>
<p>IR1 ensures that every process advances its clock between events (so C1 is satisfied). IR2 ensures that C2 is satisfied by describing the value that should be associated with each message and how receiving processes should handle the timestamp in received messages.</p>
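<p>The implementation rules translate almost directly into code. Below is a minimal sketch in Python; the <code>LamportClock</code> class and its method names are hypothetical, and taking <code>max(...) + 1</code> on receipt is just one common way to satisfy IR2 (b), since any value greater than both the local clock and <span class="math inline"><em>T</em><sub><em>m</em></sub></span> would do.</p>

```python
class LamportClock:
    """Logical clock of a single process (hypothetical helper class)."""

    def __init__(self):
        self.time = 0

    def tick(self):
        """IR1: advance the clock, then return the timestamp of the new event."""
        self.time += 1
        return self.time

    def send(self):
        """IR2 (a): sending is an event; its timestamp travels with the message."""
        return self.tick()

    def receive(self, message_timestamp):
        """IR2 (b): move past both the local value and the message timestamp."""
        self.time = max(self.time, message_timestamp) + 1
        return self.time


p_i, p_j = LamportClock(), LamportClock()
local = p_i.tick()            # a local event on P_i
t_m = p_i.send()              # P_i sends m, stamped with T_m = C_i(a)
received = p_j.receive(t_m)   # P_j receives m
print(local, t_m, received)   # 1 2 3
```

<p>Even though <code>p_j</code> had not seen any event yet, the receipt is stamped 3, which is greater than the send timestamp 2, so C2 holds for this message.</p>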
<h2>From a partial to a total ordering</h2>
<p>The <strong>Clock</strong> function described in the previous section only gives us a partial order of the events. Why is that a problem? Because events in different processes may have the same number: which one happened before?</p>
<p>To break ties, we can use any arbitrary total ordering <span class="math inline">≺</span> of the processes. It lets us define a new <em>happened before</em> relation, described by the symbol <span class="math inline">⇒</span>.</p>
<dl>
<dt><em>happened before</em> <span class="math inline">⇒</span>:</dt>
<dd><p>If <em>a</em> is an event in process <span class="math inline"><em>P</em><sub><em>i</em></sub></span> and <em>b</em> is an event in process <span class="math inline"><em>P</em><sub><em>j</em></sub></span>, then <span class="math inline"><em>a</em> ⇒ <em>b</em></span> if and only if either</p>
<ol type="1">
<li><span class="math inline"><em>C</em><sub><em>i</em></sub>(<em>a</em>) < <em>C</em><sub><em>j</em></sub>(<em>b</em>)</span>, or</li>
<li><span class="math inline"><em>C</em><sub><em>i</em></sub>(<em>a</em>) = <em>C</em><sub><em>j</em></sub>(<em>b</em>)</span> and <span class="math inline"><em>P</em><sub><em>i</em></sub> ≺ <em>P</em><sub><em>j</em></sub></span></li>
</ol>
</dd>
</dl>
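<p>In code, this total ordering amounts to comparing (timestamp, process) pairs. The sketch below is only an illustration: the event tuples are invented, and it assumes that the arbitrary ordering <span class="math inline">≺</span> of the processes is the numeric order of hypothetical process ids.</p>

```python
# Hypothetical events: (name, process id, Lamport timestamp).
events = [
    ("b", 2, 1),
    ("a", 1, 1),
    ("c", 1, 2),
    ("d", 2, 3),
]


def total_order_key(event):
    """Compare by timestamp first, then by process id to break ties."""
    _name, process_id, timestamp = event
    return (timestamp, process_id)


def precedes(x, y):
    """True if x => y in the total ordering."""
    return total_order_key(x) < total_order_key(y)


ordered = [name for name, _pid, _ts in sorted(events, key=total_order_key)]
print(ordered)  # ['a', 'b', 'c', 'd']
```

<p>Events <code>a</code> and <code>b</code> share timestamp 1, so the process id decides: <code>a</code> belongs to process 1 and is placed first. A different choice of <span class="math inline">≺</span> would give a different, equally valid, total ordering.</p>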
<h2>Drawbacks of Lamport's timestamps</h2>
<p>So... we just solved all of our problems, right? Unfortunately, no. This kind of logical clock has some problems. Let's look at the first one: <span class="math inline"><em>a</em> → <em>b</em> ⟹ <em>C</em>(<em>a</em>) < <em>C</em>(<em>b</em>)</span>, but the converse is not true! Seeing that <span class="math inline"><em>C</em>(<em>a</em>) < <em>C</em>(<em>b</em>)</span> does not tell us whether <em>a</em> happened before <em>b</em> or the two events are concurrent. We will discuss a solution to this, named <a href="https://en.wikipedia.org/wiki/Vector_clock">Vector Clock</a>, in a different post.</p>
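<p>A tiny counterexample makes this drawback concrete. The sketch below assumes two hypothetical processes that never exchange messages, so each clock just counts local events; all names are invented for illustration.</p>

```python
# Two hypothetical processes that never exchange messages.
# Each one applies IR1 only, so timestamps simply count local events.
p1_timestamps = {"a1": 1, "a2": 2, "a3": 3}
p2_timestamps = {"b1": 1}

# With no messages, "happened before" only relates events of the same process.
happened_before = {("a1", "a2"), ("a2", "a3"), ("a1", "a3")}

clock = {**p1_timestamps, **p2_timestamps}

# C(b1) < C(a3) holds...
print(clock["b1"] < clock["a3"])         # True
# ...but b1 -> a3 does not: the two events are concurrent.
print(("b1", "a3") in happened_before)   # False
```

<p>The Clock Condition is still satisfied here, because it only constrains pairs that are related by "happened before"; it says nothing about what a smaller timestamp means for unrelated events.</p>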
<h2>Anomalous Behavior, aka "out-of-band messages mess up the ordering!"</h2>
<p>There is a different problem, too. The new <em>happened before</em> relation <span class="math inline">⇒</span> does not protect us from anomalous behavior, which occurs when the ordering obtained by this algorithm differs from the one perceived by the user.</p>
<p>Lamport asks us to imagine two friends, A and B, using the same distributed computer system. Let's suppose they perform the following steps:</p>
<ol type="1">
<li>Friend A issues a request <span class="math inline"><em>a</em></span> on their computer</li>
<li>Friend A telephones Friend B, telling them to issue a new request</li>
<li>Friend B issues a request <span class="math inline"><em>b</em></span> on their computer (different from Friend A's)</li>
</ol>
<p>Our system may order the request named <span class="math inline"><em>b</em></span> before request <span class="math inline"><em>a</em></span>, and it's not even wrong! The problem, in this thought experiment, is that the phone call is not an event we recorded in our system, so we did not assign it a number, so we cannot use it to order the two requests.</p>
<p>Formally, we have two sets of events. <span class="math inline"><em>L</em></span> is the set of all system events. Unfortunately, the phone call is not in <span class="math inline"><em>L</em></span>, but in <span class="math inline"><em>L</em>′</span>, the set of events which contains the events in <span class="math inline"><em>L</em></span> together with all other relevant external events. Let ⮕ denote the "happened before" relation on <span class="math inline"><em>L</em>′</span>. We can see that <span class="math inline"><em>a</em> → <em>b</em></span> is false, but <span class="math inline"><em>a</em>⮕<em>b</em></span> is true!</p>
<p>It is impossible to avoid anomalous behavior when the event ordering system we set up only knows about the events in <span class="math inline"><em>L</em></span>, and not about the other events in <span class="math inline"><em>L</em>′</span>. What can we do, then? We have two possibilities: either we <em>explicitly introduce into the system the necessary information about the ordering ⮕</em>, or we construct a system of clocks that satisfies a stronger condition.</p>
<p>In the first case, we give the user the responsibility for avoiding anomalous behavior - in this case, at step 2 Friend B should have asked for the timestamp <span class="math inline"><em>T</em><sub><em>a</em></sub></span> of request <span class="math inline"><em>a</em></span>, and then specified that the request <span class="math inline"><em>b</em></span> be given a timestamp later than <span class="math inline"><em>T</em><sub><em>a</em></sub></span>.</p>
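<p>This first option can be sketched as follows. The <code>RequestClock</code> class and its <code>later_than</code> parameter are hypothetical names, chosen only to show how a timestamp learned over the phone could be fed back into the system; the paper does not prescribe this interface.</p>

```python
class RequestClock:
    """Hypothetical per-computer Lamport clock that stamps user requests."""

    def __init__(self):
        self.time = 0

    def issue(self, later_than=None):
        """Stamp a request, optionally forcing it past an external timestamp."""
        if later_than is not None:
            self.time = max(self.time, later_than)
        self.time += 1
        return self.time


computer_a, computer_b = RequestClock(), RequestClock()
for _ in range(5):
    computer_a.issue()          # earlier activity on Friend A's computer

t_a = computer_a.issue()        # step 1: Friend A issues request a
# Step 2: the phone call carries t_a outside the system.

# Without the hint, an idle computer like B's would stamp the request low.
t_b_naive = RequestClock().issue()
# Step 3, with the hint: request b is stamped later than t_a.
t_b = computer_b.issue(later_than=t_a)

print(t_b_naive < t_a)   # True: b would be ordered before a
print(t_a < t_b)         # True: the ordering now matches what the friends saw
```

<p>The hint plays the same role as the timestamp of a message in IR2 (b): the phone call becomes, in effect, a message that the system knows about.</p>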
<p>We will explore the second possibility in a later post, looking at how we can define a stronger clock condition and how we can use physical clocks to build a clock function that satisfies it.</p>
<section class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn1" role="doc-endnote"><p>A relation is "irreflexive" if it does not relate any element to itself - an example is "greater than" on the real numbers: a real number cannot be greater than itself<a class="footnote-back" href="https://wintermade.it/blog/posts/logical-clocks-lamport-timestamps.html#fnref1" role="doc-backlink">↩︎</a></p></li>
<li id="fn2" role="doc-endnote"><p>A "partial ordering" is a relation that defines an order between some, but not necessarily all, pairs of elements in the set the relation is defined over.<a class="footnote-back" href="https://wintermade.it/blog/posts/logical-clocks-lamport-timestamps.html#fnref2" role="doc-backlink">↩︎</a></p></li>
</ol>
</section>
<p>Tags: distributed systems, logical clocks. Published Mon, 23 Sep 2019 21:35:00 GMT.</p>