TON: Telegram Open Network. Part 2: Blockchains, Sharding



    This text continues a series of articles in which I examine the structure of the (allegedly) upcoming distributed Telegram Open Network (TON). In the previous part, I described its most basic level - the way nodes interact with each other.


    Just in case, let me remind you that I have no connection to the development of this network and draw all the material from an open (albeit unverified) source - a document that appeared at the end of last year (there is also an accompanying brochure that briefly outlines the main points). The amount of detail in this document, in my opinion, speaks for its authenticity, although there is no official confirmation of this.


    Today, let's look at the main component of TON - the blockchain.


    Basic concepts


    Account. A set of data identified by a 256-bit number account_id (most often this is the public key of the account owner). In the basic case (see the zero workchain below), this data is the user's balance. Anyone can “take” a specific account_id, but its value can be changed only according to certain rules.


    Smart contract. Essentially a special case of an account, supplemented by the smart contract's code and storage for its variables. While with a “wallet” you can credit and debit money according to relatively simple, predetermined rules, in a smart contract these rules are written as its code (in some Turing-complete programming language).


    Blockchain state. The set of states of all accounts / smart contracts (in the abstract sense, a hash table in which the keys are account identifiers and the values are the data stored in the accounts).


    Message. Above I used the expression “credit and debit money” - this is a particular example of a message (“transfer N grams from account_1 to account_2”). Obviously, only a node that holds the private key of account_1 can send such a message, and it can prove this with a signature. Delivering such a message to a regular account increases its balance; delivering it to a smart contract executes its code (which processes the incoming message). Of course, other messages are possible too (carrying arbitrary data between smart contracts rather than monetary amounts).


    Transaction. The fact that a message has been delivered is called a transaction. Transactions change the state of the blockchain, and it is transactions (records of message delivery) that make up the blocks of the blockchain. In this sense, the blockchain state can be viewed as an incremental database: every block is a “diff”, and the diffs must be applied in sequence to obtain the current state of the database. The specifics of packing these “diffs” (and of restoring the full state from them) will be discussed in the next article.
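
    The “blocks as diffs” view can be illustrated with a minimal Python sketch. It is only a toy model under my own assumptions: the state is a plain dictionary from account_id to balance, a block is a list of simple transfers, and the names apply_block and replay are invented for this example (the actual TON block format is not described here).

```python
from typing import Dict, List, Tuple

# Hypothetical simplification: state maps account_id -> balance,
# and a block is a list of (sender, recipient, amount) transfers.
State = Dict[str, int]
Transfer = Tuple[str, str, int]
Block = List[Transfer]


def apply_block(state: State, block: Block) -> State:
    """Return a new state with every transfer in the block applied."""
    new_state = dict(state)
    for sender, recipient, amount in block:
        if new_state.get(sender, 0) < amount:
            raise ValueError(f"insufficient balance on {sender}")
        new_state[sender] = new_state.get(sender, 0) - amount
        new_state[recipient] = new_state.get(recipient, 0) + amount
    return new_state


def replay(genesis: State, blocks: List[Block]) -> State:
    """Apply the blocks one after another, like diffs, to get the current state."""
    state = genesis
    for block in blocks:
        state = apply_block(state, block)
    return state


genesis = {"account_1": 100, "account_2": 0}
blocks = [
    [("account_1", "account_2", 30)],
    [("account_2", "account_1", 10)],
]
current = replay(genesis, blocks)
print(current)  # {'account_1': 80, 'account_2': 20}
```

    Note that the order matters: the second block can only be applied after the first one, because account_2 has nothing to send before it.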


    TON blockchain: what is it and why?


    As mentioned in the previous article, a blockchain is a data structure whose elements (blocks) are arranged in a “chain”, with each subsequent block containing a hash of the previous one. A question came up in the comments: why do we need such a data structure when we already have a DHT, a distributed hash table? Obviously, some data can be stored in a DHT, but it is suitable only for information that is not too “sensitive”. Cryptocurrency balances cannot be stored in a DHT, primarily because of the lack of integrity checks. In fact, all the complexity of the blockchain structure grows out of the need to prevent tampering with the data stored in it.
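
    To make the integrity argument more tangible, here is a small Python sketch of a hash chain. It is a generic illustration, not TON's block format: the Block class, the string payload and the choice of SHA-256 are all assumptions made for the example.

```python
import hashlib
from dataclasses import dataclass
from typing import List


@dataclass
class Block:
    prev_hash: str  # hash of the previous block ("" for the first block)
    payload: str    # hypothetical stand-in for the block's transactions

    def block_hash(self) -> str:
        data = (self.prev_hash + "|" + self.payload).encode("utf-8")
        return hashlib.sha256(data).hexdigest()


def build_chain(payloads: List[str]) -> List[Block]:
    """Link blocks so that each one stores the hash of its predecessor."""
    chain: List[Block] = []
    prev = ""
    for payload in payloads:
        block = Block(prev, payload)
        chain.append(block)
        prev = block.block_hash()
    return chain


def is_consistent(chain: List[Block]) -> bool:
    """Check that every block stores the hash of the block before it."""
    prev = ""
    for block in chain:
        if block.prev_hash != prev:
            return False
        prev = block.block_hash()
    return True


chain = build_chain(["A pays B 5", "B pays C 2", "C pays A 1"])
ok_before = is_consistent(chain)

chain[0].payload = "A pays B 500"  # tamper with an old block
ok_after = is_consistent(chain)
```

    After the edit, the hash of the first block no longer matches the value stored in the second one, so the check fails. A plain key-value store such as a DHT gives no such link between records.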


    However, the blockchain in TON looks even more complicated than in most other distributed systems, and there are two reasons for this. The first is the desire to minimize the need for forks. In traditional cryptocurrencies, all parameters are fixed at the initial stage, and any attempt to change them effectively gives rise to an “alternative universe” of the cryptocurrency. The second reason is support for sharding (fragmentation) of the blockchain. A blockchain is a structure that cannot shrink over time, and usually every node responsible for the health of the network has to store it in full. In traditional (centralized) systems, sharding is used to solve such problems: some of the database records live on one server, some on another, and so on. In cryptocurrencies, such functionality is still quite rare, in particular because it is difficult to add sharding to a system that was not originally designed for it.


    How does TON plan to solve both of the above problems?


    Blockchain contents. Workchains.


    [Illustration: blockchain]


    First of all, let's talk about what is planned to be stored in the blockchain. It will hold the states of accounts (“wallets” in the basic case) and smart contracts (for simplicity, we will treat them the same as accounts). In essence, this will be an ordinary hash table: its keys will be the account_id identifiers, and its values will be data structures containing such things as:


    • balance;
    • smart contract code (smart contracts only);
    • smart contract data storage (smart contracts only);
    • statistics;
    • (optional) a public key for transfers from the account, account_id by default;
    • a queue of outgoing messages (they are placed here to be forwarded to the recipient);
    • a list of the most recent messages delivered to this account.
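
    The list above maps naturally onto a record type. The following Python sketch is only an illustration: the class name AccountState, the field names and the field types are my own assumptions, since the actual layout of these structures is not given here.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class AccountState:
    """Hypothetical value type for the account_id -> state hash table."""
    balance: int = 0
    code: Optional[bytes] = None          # smart contracts only
    data: Optional[bytes] = None          # smart contracts only
    stats: Dict[str, int] = field(default_factory=dict)
    public_key: Optional[bytes] = None    # optional; account_id is used if absent
    outgoing_queue: List[bytes] = field(default_factory=list)
    recent_incoming: List[bytes] = field(default_factory=list)

    def is_smart_contract(self) -> bool:
        return self.code is not None

    def signing_key(self, account_id: bytes) -> bytes:
        """Key used to check transfers: the explicit key, else account_id."""
        return self.public_key if self.public_key is not None else account_id


# The blockchain state as a hash table keyed by 256-bit (32-byte) identifiers.
state: Dict[bytes, AccountState] = {}
wallet_id = bytes(32)  # placeholder identifier made of zero bytes
state[wallet_id] = AccountState(balance=10)
```

    A simple wallet leaves code and data empty, while a smart contract fills them in; everything else is shared by both kinds of accounts.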

    As mentioned above, the blocks themselves consist of transactions - messages delivered to various accounts (account_id). However, in addition to account_id, messages also contain a 32-bit field workchain_id, the identifier of a so-called workchain (working blockchain). This makes it possible to have several independent blockchains with different configurations. Here workchain_id = 0 is a special case, the zero workchain: it is the balances in it that will correspond to the TON cryptocurrency (Grams). Most likely, no other workchains will exist at first.


    Shardchains. The Infinite Sharding Paradigm.


    But the growth in the number of blockchains does not stop there. Let's turn to sharding. Imagine that each account (account_id) is given its own blockchain containing all the messages it receives, and that the states of all such blockchains are stored on separate nodes.


    Of course, this is very wasteful: most likely, each of these shardchains (shard blockchains) would receive transactions very rarely, while a great many powerful nodes would be needed (looking ahead, I'll note that we are talking not about clients on mobile phones but about serious servers).


    Therefore, shardchains group accounts by the binary prefixes of their identifiers: if a shardchain has the prefix 0110, it will include the transactions of all account_id values that start with these bits. This shard_prefix can be from 0 to 60 bits long and, most importantly, it can change dynamically.
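
    Prefix matching of this kind is easy to sketch in Python. The helper names account_bits and belongs_to_shard are invented for the example, and representing a prefix as a string of "0" and "1" characters is just a convenient assumption, not the real encoding.

```python
MAX_PREFIX_BITS = 60  # upper bound on shard_prefix length mentioned in the text


def account_bits(account_id: int) -> str:
    """Render a 256-bit account_id as a string of 256 binary digits."""
    if not 0 <= account_id < 2 ** 256:
        raise ValueError("account_id must fit in 256 bits")
    return format(account_id, "0256b")


def belongs_to_shard(account_id: int, shard_prefix: str) -> bool:
    """True if the account's identifier starts with the shard's binary prefix."""
    if len(shard_prefix) > MAX_PREFIX_BITS:
        raise ValueError("shard_prefix is too long")
    return account_bits(account_id).startswith(shard_prefix)


# An identifier whose first four bits are 0110 (the remaining bits are arbitrary).
account_id = (0b0110 << 252) | 12345

in_0110 = belongs_to_shard(account_id, "0110")
in_0111 = belongs_to_shard(account_id, "0111")
in_root = belongs_to_shard(account_id, "")  # an empty prefix covers every account
```

    With an empty prefix there is a single shard holding all accounts; every extra bit halves the set of accounts a shard is responsible for.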


    [Illustration: shardchains]


    As soon as one of the shardchains starts receiving too many transactions, the nodes working on it “split” it, according to predefined rules, into two children whose prefixes are one bit longer (for one of them this bit is 0, for the other 1). For example, shard_prefix = 0110b splits into 01100b and 01101b. Conversely, if two “neighboring” shardchains stay lightly loaded for some time, they merge back together.
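
    The prefix arithmetic behind splitting and merging can be sketched as follows. This Python snippet only manipulates prefixes written as bit strings; the functions split and merge are hypothetical names, and the rules that decide when a split or merge happens are left out entirely.

```python
from typing import Tuple

MAX_PREFIX_BITS = 60  # upper bound on shard_prefix length mentioned in the text


def split(shard_prefix: str) -> Tuple[str, str]:
    """Split a shard into two children with prefixes one bit longer."""
    if len(shard_prefix) >= MAX_PREFIX_BITS:
        raise ValueError("prefix is already at maximum length")
    return shard_prefix + "0", shard_prefix + "1"


def merge(left: str, right: str) -> str:
    """Merge two sibling shards back into their common parent."""
    siblings = (
        len(left) == len(right)
        and len(left) > 0
        and left[:-1] == right[:-1]
        and left[-1] != right[-1]
    )
    if not siblings:
        raise ValueError("only sibling shards can be merged")
    return left[:-1]


children = split("0110")
parent = merge(*children)
```

    Here split("0110") yields ("01100", "01101"), and merging those two children gives back "0110". Shards such as "0110" and "0111" are also siblings, while "0110" and "1110" are not and cannot be merged directly.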


    Thus, sharding is done “from the bottom up”: we assume that each account has its own shard, but for the time being the shards are “glued together” by prefixes. This is what the Infinite Sharding Paradigm refers to.


    I would like to emphasize separately that workchains exist only virtually: in fact, workchain_id is part of the identifier of a particular shardchain. Formally, each shardchain is identified by the pair (workchain_id, shard_prefix).
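
    As a rough illustration of this pair acting as a shard identifier, the Python sketch below looks up the shardchain responsible for a given account. The ShardId type and the find_shard function are assumptions made for the example; how real nodes perform this lookup is not described here.

```python
from typing import List, NamedTuple


class ShardId(NamedTuple):
    workchain_id: int   # 32-bit workchain identifier
    shard_prefix: str   # binary prefix written as a string of "0"/"1"


def find_shard(shards: List[ShardId], workchain_id: int, account_id: int) -> ShardId:
    """Find the single active shard that covers the given account."""
    bits = format(account_id, "0256b")
    matches = [
        s for s in shards
        if s.workchain_id == workchain_id and bits.startswith(s.shard_prefix)
    ]
    if len(matches) != 1:
        raise LookupError("active shards must cover each account exactly once")
    return matches[0]


# Workchain 0 after two splits: "" -> "0", "1" and then "1" -> "10", "11".
shards = [ShardId(0, "0"), ShardId(0, "10"), ShardId(0, "11")]
account_id = 0b11 << 254  # identifier whose first two bits are 11
found = find_shard(shards, 0, account_id)
```

    Because splits always produce two complementary children, the active prefixes of a workchain partition the identifier space, so exactly one shard matches any account.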


    Error correction. Vertical blockchains.


    It is traditionally believed that any transaction in a blockchain is “carved in stone”. In TON, however, it is possible to “rewrite history” if someone (a so-called “fisherman” node) proves that one of the blocks was signed incorrectly. In that case, a special correction block is added to the corresponding shardchain, containing the hash of the block being corrected (rather than of the last block in the shardchain). If we picture the shardchain as a chain of blocks laid out horizontally, the correcting block is attached to the erroneous block not on the right but on top, so it is considered to become part of a small “vertical blockchain”. Thus, shardchains can be called two-dimensional blockchains.


    [Illustration: vertical blockchain]


    If blocks following an erroneous block relied on the changes it made (that is, new transactions were made on the basis of invalid ones), corrective blocks are added on top of those blocks as well. If later blocks did not touch the “affected” information, these “corrective waves” do not reach them. For example, in the illustration above, the transaction in the first block that increased the balance of account C was found to be incorrect; therefore the transaction in the third block that decreases the balance of this account must also be canceled, and a corrective block is attached on top of that block too.


    Note that although the corrective blocks are drawn “above” the original ones, they will actually be appended to the end of the corresponding blockchain (where they belong chronologically). The two-dimensional arrangement only shows which point in the blockchain they are “hooked” to (by means of the hash of the original block that they contain).
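
    The “corrective wave” can be approximated with a very coarse Python model. Everything in it is an assumption for illustration: each block is reduced to the set of accounts it changes, a whole block is treated as affected if it touches a tainted account, and blocks_needing_correction is an invented name. The real rules work at the level of individual transactions and are not spelled out here.

```python
from typing import List, Set


def blocks_needing_correction(blocks: List[Set[str]], bad_index: int) -> List[int]:
    """Return indices of blocks that need a corrective block.

    blocks[i] is the set of accounts whose state block i changes.
    """
    tainted = set(blocks[bad_index])   # accounts affected by the bad block
    to_correct = [bad_index]
    for i in range(bad_index + 1, len(blocks)):
        if blocks[i] & tainted:        # the block relies on affected data
            to_correct.append(i)
            tainted |= blocks[i]       # the wave spreads to what it touched
    return to_correct


# Block 0 wrongly credits C; block 2 spends from C; blocks 1 and 3 are unrelated.
blocks = [{"C"}, {"A", "B"}, {"C", "D"}, {"A"}]
result = blocks_needing_correction(blocks, 0)
```

    In this toy run the first and third blocks (indices 0 and 2) need corrective blocks, which mirrors the example with account C, while the unrelated blocks are left alone.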


    One can philosophize separately about how good the decision to “change the past” is. It would seem that if we admit that an incorrect block can appear in a shardchain, then we cannot rule out the appearance of an erroneous correcting block either. Here, as far as I can tell, the difference lies in the number of nodes that must reach consensus on new blocks: a relatively small “working group” of nodes (whose composition changes frequently) will work on each shardchain, whereas introducing corrective blocks will require the consent of all validator nodes. I will say more about validators, working groups and other node roles in the next article.


    One blockchain to rule them all


    The sections above mention a lot of information about the various types of blockchains, and this information must itself be stored somewhere. In particular, this includes:


    • the number and configurations of workchains;
    • the number of shardchains and their prefixes;
    • which nodes are currently responsible for which shardchains;
    • the hashes of the last blocks added to all shardchains.

    As you might have guessed, all of this is recorded in yet another blockchain, the masterchain (master blockchain). Because its blocks contain the hashes of blocks from all shardchains, it makes the system tightly coupled. This also means that a new masterchain block will be generated right after the blocks in the shardchains: blocks in the shardchains are expected to appear almost simultaneously, roughly every 5 seconds, and the next masterchain block a second after that.
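
    Why storing shard block hashes ties the system together can be shown with a short Python sketch. The function master_block_hash, the text serialization and the use of SHA-256 are all assumptions made for this example; the real masterchain block contains much more than a list of hashes.

```python
import hashlib
from typing import Dict, Tuple

ShardId = Tuple[int, str]  # (workchain_id, shard_prefix)


def master_block_hash(prev_master_hash: str, shard_heads: Dict[ShardId, str]) -> str:
    """Hash a toy masterchain block that commits to the latest shard blocks."""
    lines = [prev_master_hash]
    # Sort the entries so the result does not depend on insertion order.
    for (workchain_id, shard_prefix), head_hash in sorted(shard_heads.items()):
        lines.append(f"{workchain_id}:{shard_prefix}:{head_hash}")
    return hashlib.sha256("\n".join(lines).encode("utf-8")).hexdigest()


heads = {(0, "0"): "aa11", (0, "1"): "bb22"}  # placeholder shard block hashes
before = master_block_hash("genesis", heads)

heads[(0, "1")] = "cc33"  # one shardchain produces (or alters) a block
after = master_block_hash("genesis", heads)
changed = before != after
```

    Any change in the last block of any shardchain changes the hash of the masterchain block, so agreeing on the masterchain means agreeing on the heads of all shardchains at once.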


    But who will be responsible for all this titanic work - sending messages, executing smart contracts, forming blocks in the shardchains and the masterchain, and even checking blocks for errors? Will the phones of millions of users with the Telegram client installed secretly do all this? Or perhaps the Durovs' team will abandon the idea of decentralization, and their servers will do it the old-fashioned way?


    In fact, neither answer is correct. But the margins of this article are rapidly running out, so the various roles of the nodes (you may have already noticed mentions of some of them), as well as the mechanics of their work, will be discussed in the next part.

