I joined a program run by the Ethereum Foundation. It was during the pandemic — a time when some of my plans fell through, but new opportunities arose. The Ethereum Foundation, in many ways, oversees the Ethereum protocol and supports its community. The program's aim was to gather a group of individuals, mainly students, and guide them through the basics of distributed systems, blockchains, the Ethereum protocol, and its ecosystem.
It was a captivating experience. I already had some knowledge about distributed systems, but it was only after this program that I truly grasped their potential in real-world applications. We even had the chance to interact with Vitalik Buterin during the presentations of our final projects.
Subsequently, I became deeply interested in distributed computing. I wrote my thesis on asset transfer in distributed systems, collaborating with Pyotr Kuznetsov, a professor at Telecom Paris, and a researcher in the domain of distributed systems.
We demonstrated that cryptocurrency could operate without consensus. Our study illustrated that public cryptocurrency can be implemented based on Poof-of-Stake without resorting to consensus. From then on, Pyotr Kuznetsov and I began partnering in this field.
Currently, I have two primary pursuits. First, I'm a developer at a major company. Secondly, I'm engaged in research in distributed computing, collaborating with a research group from Telecom Paris.
I stumbled into distributed computing almost by accident. I never imagined I would spend considerable time on science and research. While I had a fondness for mathematics and computer science, it was largely within the confines of university.
However, I'm primarily a developer, working for one of the FAANG companies. Our team's mission there is to maintain platform integrity. From a technical standpoint, it involves a mix of machine learning, distributed systems, and data analysis. Additionally, I delve into research in distributed computing and cryptography.
About distributed computing
Distributed computing is a field of computer science that studies distributed systems. A distributed system is when you have many computers (machines) working together to solve a particular problem. They accomplish this through local computations and by communicating with each other. That is, they can perform local tasks, like any typical program, but they also communicate amongst themselves.
Distributed systems are all about scalability and correctness. For instance, it's vital for the system to handle a large number of clients and to do so correctly (or consistently).
Today, all services are built on distributed systems, almost everything we interact with. A standard example of a task tackled by distributed systems is data replication.
Take YouTube for instance. Imagine if it ran on a single computer with just one server handling requests from all users. Clearly, this approach wouldn’t scale. The idea here is, instead of one computing machine, to use multiple. Several servers that would handle requests/store data. But issues arise: for example, different data might be stored on different servers, so a query to one server might theoretically yield a different response than an identical query to another server. Distributed computing theory offers ways to fundamentally address these challenges and determine if they can be solved at all.
A comparison can be drawn with Bitcoin. There’s no central financial system (bank) processing transactions. Instead, all active network participants (miners) check whether a transaction can be confirmed or not and reach a consensus on it.
Executing a transaction in a distributed system isn’t straightforward. One needs to determine, for instance, whether Alice has already spent the money she is currently transferring to Bob. Perhaps she has spent all her money, but the active network participants aren't aware of it.
Decentralized and distributed systems are synonyms. Almost. More precisely, the interpretation can vary. For example, "distributed system" might sometimes refer to a system consisting of multiple servers but all coordinated by a single central machine. On the other hand, when someone says "decentralized," it specifies that all devices in the system contribute to problem-solving/decision-making.
"Research" can mean anything: from marketing to UX. But in this context, we're discussing academic or scientific research. The main objective of scientific research is usually to "understand something new." In this way, we try to figure out how to solve a new problem, how to do it faster/more efficiently, or in a different manner, or, on the contrary, demonstrate that it's impossible.
Our focus is on scientific research in the realm of distributed computing. Currently, I am delving into topics associated with distributed asset transfer systems (cryptocurrencies). With the advent of Bitcoin, this field has garnered immense interest. Satoshi Nakamoto, whoever he/she/they might be, proposed a system where payments can be made and money can be exchanged without a central bank.
Satoshi tackled the issue of online payments within a distributed system. There’s no regulatory entity in this setting (no "bank" or centralized transaction processing hub) and it operates in an open environment (anyone can join the network to send and verify transactions). Before Bitcoin, nobody had attempted to solve such a problem within this framework. Although, all elements of Satoshi's proposed system had existed in some form: distributed systems theory in models where Byzantine (malicious) participants might exist had been around for decades (and ironically seemed somewhat useless to many); Merkle Trees emerged in 1979; proof-of-work was introduced in the 90s as an anti-spam measure.
Yet, nobody combined these pieces. No attempts were made to construct a decentralized payment system. Upon observing Bitcoin, researchers began suggesting improvements, devising new protocols that would enhance the original design.
Then altcoins emerged, leveraging advanced technologies. Yet, a common thread among them was the utilization of blockchain to realize a decentralized payment system.
However, this isn’t just about forks. A plethora of other cryptocurrencies, like Ethereum, surfaced. Each offering its unique enhancements and implementations. But all, in some manner, employed blockchain.
Blockchain is a widely recognized distributed data structure. If we link this with distributed computing, you can append information at the end of this system and retrieve data based on the block's index. The fundamental challenge is to make this distributed data structure consistent. Clients interact with such data structures. They should perceive it as if accessing a data structure implemented on a singular server/computer.
From the blockchain's definition, we can deduce that it's a data structure that establishes a linear order (in this case, over a set of written blocks). From distributed computing theory, we understand that to establish a linear order, one must be adept at solving the consensus problem.
About consensus algorithms
Consensus is a simple problem within the realm of distributed systems. It's formulated as follows: there's a set of participants in the system, each proposing certain values. Every correct participant in the system must agree on one of these values. This problem is, in a sense, "provably" complex. In a way, it divides the world of distributed computing tasks into those that require consensus and those that don't. They can be loosely termed as "complex" and "simple" tasks, respectively.
When I first delved into crypto, one thing caught me off guard. I already had some knowledge about consensus and distributed systems, yet I was bewildered when people claimed that "Proof-of-Work is consensus" or "Proof-of-Stake is a certain type of consensus algorithm," and so forth. That's not accurate.
Proof-of-Work isn't consensus. As we've clarified, consensus is simply a task in distributed computing. It can be tackled using various algorithms, but neither Proof-of-Work nor Proof-of-Stake are consensus algorithms. They're just mechanisms that help us ensure the correctness of an algorithm within a certain model.
There are open and closed distributed systems, or, as they're sometimes called, permissionless and permissioned. An open system is one where we assume anyone can join: in the case of cryptocurrencies, anyone can begin sending and validating transactions. In a closed system, we assume that we can control who gets to validate transactions.
There's a primary distinction between the two. In closed systems, we can assume that roughly no more than ⅓ of all network participants are Byzantine replicas. Such an assumption is impossible in open systems. In essence, malicious actors could create an unlimited number of virtual replicas and dominate the system numerically, known as a Sybil attack.
How does this affect practical applications? In closed systems, we can achieve consensus without resorting to any auxiliary mechanisms. Since we can make a reasoned assumption based on the system's properties about the relatively small number of Byzantine replicas (less than a third), the task is solved without any additional interventions. Examples of algorithms include Paxos and Raft. In open systems, however, we need to ensure that a vast number of malicious actors don't compromise the integrity of the consensus algorithm. For this, we might use mechanisms designed to weed out a majority of the malicious participants.
One such mechanism is Proof-of-Work. The premise is: if a participant wants to perform a task, like validating transactions, they must possess some computational power. By creating virtual participants, malicious actors don't increase their computational prowess — it remains unchanged.
One of these mechanisms is Proof-of-Work. The concept is straightforward: if a participant wants to perform a task, like validating transactions, they must possess some computational power. When creating virtual participants, malicious actors don't boost their computational capabilities; they remain the same.
Proof-of-Work can be distinguished from the consensus mechanism itself, just like other Sybil attack prevention mechanisms (such as Proof-of-Stake, Proof-of-Space, and Proof-of-Space-time). They typically evolve somewhat separately. One can select a mechanism to ensure the correctness of consensus in an open network and choose a consensus algorithm independently.
The Proof-of-Work mechanism is energy-intensive. Suppose you're a conscientious participant who just wants to serve the system (validate transactions) and earn a reward as prescribed by the protocol.
To "prove" to the system that you're not attempting to attack it, you expend significant computational resources, solving extensive cryptographic puzzles through trial and error. This mechanism serves to eliminate participants who lack computational power (the type typically controlled by Byzantine actors). Indeed, this mechanism safeguards the system, but miners don't produce anything of intrinsic value. They simply expend energy to demonstrate that they are genuine entities and, in a sense, can be trusted.
About the differences between consensus algorithms
Proof-of-Work impacts the throughput of the protocol. The number of operations completed in a given timeframe diminishes. Regardless of what is implemented using the Proof-of-Work mechanism in a distributed system — be it a typical cryptocurrency or a more general task (like state machine replication) — the issues remain the same: disproportionate energy consumption to the task and reduced protocol speed/throughput.
Proof-of-Stake is rooted in basic game theory principles. The core idea is straightforward: tie the number of votes or the weight of a participant in the system to the amount of their assets held within the system, referred to as the "stake". The rationale is that it's challenging for a Byzantine participant to acquire a significant amount of these assets.
At the same time, the more assets a participant has (the bigger their stake), the less likely they are to deviate from the protocol (becoming a Byzantine participant). Why does motivation to adhere to the protocol increase with more assets? In Proof-of-Stake systems, the integrity of the system can only be compromised if Byzantine participants accumulate a large portion of the assets. In such a scenario, they would be the first ones to suffer since the value of the "coins" in the compromised system would plummet to zero. Essentially, the more assets a network participant has, the more vested they are in ensuring everything operates as intended.
Proof-of-Stake is not energy-intensive. Additionally, it can offer protection against Sybil attacks (much like Proof-of-Work). It also allows for an increased number of operations to be executed within the system.
To me, it's a more intriguing mechanism. Not just due to its efficiency, but also its foundational concept. Furthermore, it introduces a robust economic foundation, making it easier to incorporate various real-world financial mechanisms.
About protocols without consensus
Personally, I favor solving problems without relying on consensus algorithms. Specifically, I'm intrigued by cryptocurrency protocols that either don't use consensus at all or offer hybrid algorithms where the use of consensus is minimal. They aren't frequently encountered in practice yet, but an increasing number of academic papers on these topics are emerging. Theoretically speaking, we're essentially trying to answer the question, "What else can we achieve without utilizing a powerful tool like consensus that hasn't been considered before?"
There are several theoretical "adversary" models. The first is called a static adversary. In this model, we assume that our adversary has taken control of specific replicas from the outset, but we're uncertain about which ones specifically.
For instance, out of 11 network participants, an adversary controls 3 (Byzantine) replicas right from the start. It's worth noting that for a while, Byzantine replicas might be indistinguishable from the correct ones. Thus, an external observer monitoring the system might perceive that the replicas are strictly adhering to the protocol. However, if a participant suddenly deviates from the given algorithm, it's assumed they've always been Byzantine. Typically, such a model is used in permissioned systems.
Another model is the dynamic adversary. In this framework, we believe that the adversary can decide which participants to capture/corrupt based on unfolding events. In a sense, the set of participants identified as Byzantine is continuously evolving. For simplicity, it's usually assumed that this set can only grow, meaning if a participant becomes Byzantine once, they remain so indefinitely. This model is generally assumed in permissionless systems.
For example, in a Proof-of-Stake system, if a participant holds a significant amount of assets within the system, they are considered correct. After a while, they might transfer all their assets to someone else (e.g., in payment for goods/services). Now, since they're no longer at risk, they might either attempt to harm the system or simply reduce the security of their device (again, because they have nothing to lose), making it vulnerable to breaches.
Distributed protocols in both systems are identical. They're designed to prevent or mitigate errors within the system. Every replica that receives a message can somehow verify it and determine its validity. One way to achieve this is through cryptographic primitives.
Cryptographic primitives help confirm that a message was genuinely sent by a replica. This is especially vital if the message doesn't come directly but is relayed through some broadcast algorithm via intermediaries. Each replica also knows the actions another replica must take to respond to the message since the algorithm is public. Thus, we can almost always validate that the message we receive makes sense.
There's another dimension in distributed systems, focused on ways to detect errors within the system. We acknowledge that the protocol might break, but we can determine if the protocol's outcome doesn't match its guarantees, identify the participant trying to compromise the system, and then remove them. Essentially, for every protocol breach, the correct replicas can find evidence that a violation occurred and then execute system self-repair. I haven't delved deeply into this, but I'm familiar with an insightful article on the topic: "Accountability and Reconfiguration: Self-Healing Lattice Agreement."
The consensus task in itself is "provably difficult." A foundational result in distributed computing theory, the Fisher-Lynch-Paterson (FLP) Theorem, states: "It is impossible to implement a deterministic consensus algorithm that is guaranteed to terminate if even a single replica might crash." Therefore, any practical consensus algorithm seeks to circumvent the FLP Theorem result in one of the following ways:
The algorithm assumes the system is synchronous: it relies on timings for sending/receiving messages; The algorithm employs randomization: thus, it becomes probabilistic, and guarantees are offered with a certain likelihood; The algorithm is implemented in such a way that, in the worst-case scenario, it might never conclude – participants of the system will never reach a result.
In essence, these are the primary foundational factors that can influence the correctness of algorithms. As for practicalities, with a proper algorithm implementation and a meticulous justification of its properties, it's typically possible to avoid these pitfalls (or, more accurately, minimize the likelihood of undesirable outcomes), and everyone seems to manage. Regarding a specific algorithm, it's crucial to carefully review the dedicated papers: examine the system assumptions made by the authors, the properties/guarantees the protocol offers, and the proofs of why everything functions as described.
I don't delve into the realm of AI in distributed computing, but, broadly speaking, I know that a particularly compelling scientific and engineering challenge is to adapt deep learning algorithms to distributed systems to achieve a performance boost.
If we were to speculate, it would be intriguing to see how deep learning algorithms might aid in ensuring system security and detecting potential Byzantine replicas.
Bootcamp: Solidity Developer
From the basics of Solidity to deployment on the mainnet.Learn more