Details on the update of Segregated Witness and the consequences of its adoption in Bitcoin

    In this article, we take a detailed look at the changes to the Bitcoin protocol introduced by the Segregated Witness soft fork. We cover transaction malleability, backward compatibility, increased throughput, the new transaction serialization formats, new script types, the Bech32 address format and its advantages, and the concepts of weight, size and virtual size. The article closes with the most important adoption statistics and answers to frequently asked questions about the update.

    Before describing the changes in detail, it helps to understand the main idea of Segregated Witness. Literally, the name means "separated witness". In the Bitcoin context, it means that the proof of coin ownership is stored separately from the main transaction data, as shown in the diagram.
    image
    Taken as a whole, the protocol update includes many other improvements. SegWit increases network throughput, separates the proof of coin ownership from the rest of the transaction data, fixes the weakness of the transaction format that allowed data in signed transactions to be modified (transaction malleability), and does all this while keeping backward compatibility with previous versions of the protocol. Its greatest value is that it makes it possible to build many very important off-chain solutions on top of the Bitcoin protocol.

    The problem of transaction malleability and its solution


    The essence of the problem is that a Bitcoin transaction can be altered in such a way that it still passes verification. The alterations are minor and do not touch the addresses of the sender and the recipient, but they are enough to change the result of hashing. In other words, the modified transaction transfers the coins to the same addresses as the original, but its hash value no longer matches that of the original transaction.

    Obviously, this can only happen to transactions that have not yet been confirmed. Unless the problem is solved, off-chain protocols, which rely on chains of unconfirmed transactions, cannot operate reliably. Since not all transaction data is covered by the signature (for example, the scriptSig cannot sign itself), several types of attack are possible:

    1. Changing the signature format. In the Bitcoin protocol, the signature format was not strictly defined and depended on the implementation in the OpenSSL library, which did not enforce a strict format either. A third party can re-encode the signature of an intercepted transaction, which changes the hash value.
    2. Modifying scriptSig directly. The scriptSig contains a set of operations that prove that a certain user owns the coins, but other operations can be added to it. A few innocuous operations that do not affect the result will change the hash value of the transaction.

    Thus, you can change the hash value of a transaction without having access to private keys. Why is changing the hash value so undesirable?

    First of all, it should be noted that it is possible to create a copy of the original transaction, add changes to it that will not affect its correctness during verification, and send it to the network. A modified copy with a different hash value spends the same coins as the original, so it can compete with the original transaction for confirmation.

    Off-chain protocols were mentioned above. They cannot be implemented without a solution to the transaction malleability problem, because they work by building chains of unconfirmed transactions. Changing the hash value of the first transaction invalidates the entire chain of unconfirmed transactions that depend on it.

    In the SegWit update, strict rules for filling in the fields were defined, and the witness data (signatures and scripts) is no longer part of the data that is hashed to obtain the txid. The problems associated with transaction malleability can therefore be considered solved for transactions of the new format: the data is specified and serialized unambiguously.
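    The effect can be illustrated with a minimal Python sketch. The byte layout below is a toy assumption, not the real serialization, and `toy_legacy_txid` and `toy_segwit_txid` are hypothetical helper names. It only shows that an identifier which covers the unlocking data changes when that data is altered, while an identifier which excludes it does not.

```python
import hashlib

def sha256d(data: bytes) -> bytes:
    """Double SHA-256, the hash Bitcoin applies to serialized transactions."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def toy_legacy_txid(unlocking_data: bytes, signed_part: bytes) -> str:
    # Toy layout (assumption): the identifier covers the unlocking data too.
    return sha256d(unlocking_data + signed_part).hex()

def toy_segwit_txid(witness: bytes, signed_part: bytes) -> str:
    # Toy layout (assumption): the witness is deliberately left out of the hash.
    return sha256d(signed_part).hex()

signed_part = b"inputs|outputs|locktime"   # placeholder for the signed fields
proof = b"signature+pubkey"                # placeholder for the proof of ownership
tampered = b"\x00" + proof                 # hypothetical change that keeps the proof valid

legacy_before = toy_legacy_txid(proof, signed_part)
legacy_after = toy_legacy_txid(tampered, signed_part)
segwit_before = toy_segwit_txid(proof, signed_part)
segwit_after = toy_segwit_txid(tampered, signed_part)

print("legacy id changed:", legacy_before != legacy_after)
print("segwit id changed:", segwit_before != segwit_after)
```

    A child transaction that refers to the parent by the second kind of identifier stays valid no matter how a third party re-encodes the proof.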

    Backward compatibility when distributing a block over the network


    According to the old rules, the maximum block size is 1 MB, and the block contains transactions with embedded evidence. Under the new rules, the maximum base block size is still 1 MB, but in addition there is a data structure with evidence. Accordingly, the total size of a new block can exceed 1 MB.

    For backward compatibility, the protocol rules allow old nodes to work with new blocks, but they receive a block only in its base form, with a maximum size of 1 MB. The witness structure is not available to them. New nodes get the full block with transactions and evidence. The following figure illustrates this.

    image

    On the left is a diagram of the Bitcoin protocol before activating Segregated Witness. The block had a maximum size of 1 MB, and it was distributed between different network nodes in one form.

    Since the block size is limited, the number of transactions that fit into a block is limited too, and the capacity of the system depends on it. Naturally, when the question of increasing throughput arose, the first answer people looked for was a larger maximum block size.

    Ways to increase throughput


    Let us consider the two main ways to increase system capacity. Any proposal is carefully reviewed and tested by the Bitcoin protocol team. If the community reaches agreement and decides to implement the proposal in the protocol, an update is released.

    Hard fork update. The most straightforward change is to increase the block size. One block then holds more transactions, which increases throughput. However, such a block will not be accepted by nodes running the old protocol, whose rules say that a block cannot exceed 1 MB. Such a change requires a hard fork, which is organizationally more complex than a soft fork.

    Soft fork update. Segregated Witness solves the problem with a soft fork. How exactly? It divides the block into two parts: the first stores the transactions, the second stores the evidence. New nodes receive both parts, and old nodes receive only the transaction block of up to 1 MB. Old nodes do not receive the evidence and therefore cannot fully validate the new-format transactions they receive, but they can still take part in reaching consensus. No hard fork is needed, and nodes can move to the new software gradually.

    Segregated Witness innovations

    Innovations Segregated Witness


    Let us consider what the Segregated Witness update includes. The first and most important innovation is the new transaction serialization format. In addition to the already known fields, a new-format transaction has three new fields: the marker and the flag, which are used for versioning (their values are currently fixed but may change in future protocol versions), and the witness field. The witness (witness data) is the set of proofs of coin ownership, moved out of the main part of the transaction. Structurally, it mirrors the set of inputs: each element of witness data corresponds to the input with the same index, which makes it possible to match the evidence with the specific coins being spent.

    Witness transaction id


    To get a transaction identifier (transaction id, txid), you serialize the transaction into a single byte sequence according to a special serialization format and then hash that sequence. With the introduction of Segregated Witness, a new identifier appeared, the witness transaction id (wtxid), together with a new serialization format. For old transactions that spend coins without using Segregated Witness, wtxid is the same as txid, because they have none of the new fields added by Segregated Witness.
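    The two identifiers can be compared in a short Python sketch. The dictionary layout, the helper names and the placeholder signature and key bytes are assumptions made for illustration. The field order follows the segwit serialization, with marker 0x00 and flag 0x01 after the version and the witness placed before the locktime.

```python
import hashlib
import struct

def sha256d(data: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def varint(n: int) -> bytes:
    # Compact-size integer; this sketch only handles the one-byte form.
    assert n < 0xFD
    return bytes([n])

def serialize(tx: dict, with_witness: bool) -> bytes:
    out = struct.pack("<i", tx["version"])
    if with_witness:
        out += b"\x00\x01"                          # marker and flag
    out += varint(len(tx["inputs"]))
    for txin in tx["inputs"]:
        out += txin["prev_txid"] + struct.pack("<I", txin["vout"])
        out += varint(len(txin["script_sig"])) + txin["script_sig"]
        out += struct.pack("<I", txin["sequence"])
    out += varint(len(tx["outputs"]))
    for txout in tx["outputs"]:
        out += struct.pack("<q", txout["value"])
        out += varint(len(txout["script_pubkey"])) + txout["script_pubkey"]
    if with_witness:
        for txin in tx["inputs"]:                   # one witness stack per input
            out += varint(len(txin["witness"]))
            for item in txin["witness"]:
                out += varint(len(item)) + item
    return out + struct.pack("<I", tx["locktime"])

def txid(tx: dict) -> str:
    return sha256d(serialize(tx, False))[::-1].hex()

def wtxid(tx: dict) -> str:
    has_witness = any(txin["witness"] for txin in tx["inputs"])
    return sha256d(serialize(tx, has_witness))[::-1].hex()

tx = {
    "version": 2,
    "inputs": [{
        "prev_txid": bytes(32), "vout": 0, "script_sig": b"",
        "sequence": 0xFFFFFFFF,
        "witness": [b"\x30" * 71, b"\x02" * 33],    # placeholder signature and key
    }],
    "outputs": [{"value": 50_000, "script_pubkey": b"\x00\x14" + bytes(20)}],
    "locktime": 0,
}
print(txid(tx) != wtxid(tx))   # the witness only influences wtxid
```

    For a transaction whose inputs all have empty witness stacks, `wtxid` falls back to the old serialization, so both identifiers coincide.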

    image

    Wtxid is needed to build a separate Merkle tree for the witness data. It is built in the same way as for ordinary transactions, but wtxid is used instead of the transaction hash. The wtxid values are hashed in pairs, level by level, until a single Merkle root remains.

    It is important to note that this Merkle root is inserted into the coinbase transaction, not into the block header. If the root were in the block header, the block structure would change, and nodes that support the old protocol could not work with such blocks. All efforts to maintain backward compatibility would founder on this inconsistency. Therefore, the root is inserted into one of the outputs of the coinbase transaction. When all nodes have switched to Segregated Witness, this may change and new approaches may be considered.
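    A minimal Python sketch of this construction follows. The function names are hypothetical. The constants (an all-zero wtxid for the coinbase, a 32-byte reserved value, and the `6a24aa21a9ed` prefix of the commitment output) are taken from BIP141 as the author of this sketch understands it, so check them against the specification before relying on them.

```python
import hashlib
from typing import List

def sha256d(data: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def merkle_root(hashes: List[bytes]) -> bytes:
    """Pairwise double-SHA-256 tree; an odd level duplicates its last element."""
    level = list(hashes)
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        level = [sha256d(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

def witness_commitment_script(other_wtxids: List[bytes],
                              reserved_value: bytes = bytes(32)) -> bytes:
    # The coinbase cannot commit to its own hash, so its wtxid is taken as zeros.
    root = merkle_root([bytes(32)] + list(other_wtxids))
    commitment = sha256d(root + reserved_value)
    # OP_RETURN, push of 36 bytes, 4-byte header, 32-byte commitment
    return bytes.fromhex("6a24aa21a9ed") + commitment

# Placeholder wtxids in internal byte order (not real transactions)
wtxids = [sha256d(bytes([i])) for i in range(1, 4)]
script = witness_commitment_script(wtxids)
print(len(script), script[:6].hex())
```

    Because the commitment lives in an ordinary output script, an old node sees nothing more than an unspendable coinbase output.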

    Witness programs to set the conditions for spending coins


    Let's take a look at how a Segregated Witness transaction script is built and how it lets old nodes treat Segregated Witness transactions as correct even though they do not receive the proof of coin ownership.

    The script describing the rules for spending coins from a new-format transaction consists of two parts. The first part is the witness version byte, which identifies the witness program version. It can take values from 0 to 16 (OP_0 to OP_16); currently OP_0 is used. In the future, new versions of the protocol with support for other witness program versions may appear. The second part is the witness program, which can be from 2 to 40 bytes in size.

    The witness program is the result of hashing the witness script. The witness script itself contains the complete description of the conditions for spending the coins. Witness data contains the proof of coin ownership, which must satisfy the conditions of the witness script. Accordingly, witness data consists of two parts: the witness script and the proof of ownership of the coins.

    It is worth noting that the witness program does not contain any operations (such as hash comparison or signature verification), and the script itself begins with the OP_0 code, so it is valid for all old nodes. Moreover, nodes that have not been updated to SegWit do not check the proof of coin ownership for outputs of the new format; they consider such spending correct in any case. Strictly speaking, old nodes will accept a new-format transaction even if its sender does not actually own the corresponding coins. That is why SegWit requires the majority of Bitcoin's mining power to adopt the update. Another feature is that the scriptSig of a transaction that spends coins from a new-format output is empty.
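    How a node might recognize such a script can be sketched in Python. `parse_witness_program` and `describe` are hypothetical names, and the opcode values (0x00 for OP_0, 0x51 to 0x60 for OP_1 to OP_16) are the standard Script encodings.

```python
from typing import Optional, Tuple

def parse_witness_program(script_pubkey: bytes) -> Optional[Tuple[int, bytes]]:
    """Return (version, program) if the script is a version byte plus one push."""
    if not 4 <= len(script_pubkey) <= 42:
        return None
    version_op, push_len = script_pubkey[0], script_pubkey[1]
    if version_op == 0x00:                 # OP_0
        version = 0
    elif 0x51 <= version_op <= 0x60:       # OP_1 .. OP_16
        version = version_op - 0x50
    else:
        return None
    # The second byte must be a direct push of the whole remainder (2..40 bytes).
    if push_len != len(script_pubkey) - 2:
        return None
    return version, script_pubkey[2:]

def describe(script_pubkey: bytes) -> str:
    parsed = parse_witness_program(script_pubkey)
    if parsed is None:
        return "not a witness program"
    version, program = parsed
    if version != 0:
        return "unknown witness version"
    return {20: "P2WPKH", 32: "P2WSH"}.get(len(program), "invalid v0 program")

print(describe(b"\x00\x14" + bytes(20)))
print(describe(b"\x00\x20" + bytes(32)))
```

    An old node never runs this check. It simply evaluates the two pushes, finds a non-empty value on top of the stack, and accepts the spend.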

    New options for setting the conditions of spending coins


    With the introduction of SegWit, two standard formats for scriptPubKey were proposed, which became an alternative to the two most common formats for setting coin spending rules before this update appeared. Consider them in order.

    Pay to witness public key hash (P2WPKH) is the analogue of the standard pay to public key hash. What is the difference? As noted earlier, the scriptSig is not filled in and remains empty. All proof of coin ownership is moved to the witness data structure.

    At the same time, the script described earlier, consisting of a version and a public key hash (which serves as the witness program), is placed in the scriptPubKey. A node distinguishes this spending script from others by the fact that its version is zero and the data size is 20 bytes. A different version or a different size implies different spending rules.

    image

    In this case, the scriptPubKey contains two parts: a witness version of zero and the hash value of the recipient's public key. The scriptSig is empty, and the witness data contains an electronic signature and the public key needed to verify it.

    Pay to witness script hash (P2WSH) is the analogue of the standard pay to script hash. Here a custom script can be used to set the rules for spending coins. How does a node distinguish such a script from the previous one? The version is still zero, but the witness program occupies 32 bytes and is the hash value of the witness script. If a node receives a transaction containing a version 0 script whose witness program is neither 20 nor 32 bytes long, the node rejects the transaction because it does not know how to handle it.

    Witness data here is divided into two parts. The first contains the proof of coin ownership, that is, a set of signatures. The second contains the witness script, which sets the rules for spending the coins. The script itself is revealed only at the moment of spending; when the coins were sent, only its hash value was given.
    image
    In this case, the scriptPubKey contains two parts: a witness version of zero and the hash value of the witness script, here a 1-of-2 multisignature. The scriptSig is empty, and the witness data contains an electronic signature and the original witness script in plain form.
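    The 1-of-2 case can be sketched in Python as follows. The public keys and the signature are placeholder bytes, not real keys, and the function names are hypothetical. The empty first witness item is the extra element that OP_CHECKMULTISIG consumes.

```python
import hashlib

OP_1, OP_2, OP_CHECKMULTISIG = 0x51, 0x52, 0xAE

def multisig_1_of_2(pubkey_a: bytes, pubkey_b: bytes) -> bytes:
    """Witness script: OP_1 <pubkey_a> <pubkey_b> OP_2 OP_CHECKMULTISIG."""
    assert len(pubkey_a) == len(pubkey_b) == 33   # compressed public keys
    return (bytes([OP_1, 33]) + pubkey_a + bytes([33]) + pubkey_b
            + bytes([OP_2, OP_CHECKMULTISIG]))

def p2wsh_script_pubkey(witness_script: bytes) -> bytes:
    # Version 0 followed by a 32-byte push of the single SHA-256 of the script.
    return b"\x00\x20" + hashlib.sha256(witness_script).digest()

pubkey_a = b"\x02" + b"\x11" * 32     # placeholder, not a real key
pubkey_b = b"\x03" + b"\x22" * 32     # placeholder, not a real key
witness_script = multisig_1_of_2(pubkey_a, pubkey_b)
script_pubkey = p2wsh_script_pubkey(witness_script)

# When spending: scriptSig is empty, and the witness carries the proof and the script.
placeholder_signature = b"\x30" * 71
witness = [b"", placeholder_signature, witness_script]

# A validating node first checks that the revealed script matches the committed hash.
print(hashlib.sha256(witness[-1]).digest() == script_pubkey[2:])
```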

    P2SH wrapper


    The new script format differs from the old one. Accordingly, old services and wallets do not know how to work with such a script or how to compose it. For backward compatibility, Segregated Witness transactions can use a special P2SH “wrapper”, which produces a transaction that has the properties of a Segregated Witness transaction but looks like ordinary P2SH to the outside world.

    P2SH is used to simplify the sender's work and not burden the sender with the details of the recipient's redeem script. The recipient gives the sender only the hash value of the redeem script, and later places the script itself in scriptSig along with the evidence.

    image

    In this case, the scriptPubKey contains the hash operation, the hash value of the redeem script and the comparison operation (as in ordinary P2SH). The scriptSig contains the redeem script, which here is the witness program: the version byte and the hash value of the public key. The witness data contains an electronic signature and a public key.

    This approach allows wallets that have not been updated to send coins to Segregated Witness addresses without knowing anything about the new ways of setting spending conditions.
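    The byte layout of the wrapper for the P2WPKH case can be sketched in Python. The function names are hypothetical. The 20-byte hash of the redeem script (RIPEMD-160 of SHA-256) is passed in as a parameter instead of being computed, because RIPEMD-160 is not available in every `hashlib` build.

```python
def redeem_script_p2wpkh(pubkey_hash: bytes) -> bytes:
    """The redeem script is simply the witness program: OP_0 <20-byte key hash>."""
    assert len(pubkey_hash) == 20
    return b"\x00\x14" + pubkey_hash

def script_sig_p2sh_p2wpkh(pubkey_hash: bytes) -> bytes:
    # scriptSig holds a single push of the redeem script and nothing else.
    redeem = redeem_script_p2wpkh(pubkey_hash)
    return bytes([len(redeem)]) + redeem

def script_pubkey_p2sh(redeem_script_hash: bytes) -> bytes:
    # OP_HASH160 <20-byte hash of the redeem script> OP_EQUAL, as in any P2SH output.
    assert len(redeem_script_hash) == 20
    return b"\xa9\x14" + redeem_script_hash + b"\x87"

pubkey_hash = bytes(range(20))        # placeholder key hash
script_sig = script_sig_p2sh_p2wpkh(pubkey_hash)
script_pubkey = script_pubkey_p2sh(bytes(20))   # placeholder script hash
print(script_sig.hex())
```

    The signature and public key still travel in the witness, so the spend keeps the SegWit properties while the output looks like ordinary P2SH to the sender.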

    New Bech32 Address Format


    Bech32 addresses, which are considered the native SegWit addresses, deserve a separate mention. For most of its history, Bitcoin has used Base58 encoding for addresses, with a checksum that is the truncated result of double hashing with SHA-256. These addresses were part of the original software, and their scope was extended by BIP13 for P2SH. But both the character set and the checksum algorithm have limitations:
    • a Base58 address takes up more space in QR codes, since it cannot use the alphanumeric mode;
    • Base58 is very inconvenient to write down reliably on paper, type on a mobile keyboard or read out loud;
    • the double-hash checksum is slow;
    • Base58 decoding is complex and relatively slow.

    image
    The SegWit update introduces a new class of outputs (witness programs) with two instances: P2WPKH and P2WSH. Their functionality is indirectly available to old nodes through P2SH, but using it directly is more efficient and safer, and that requires a new address format.

    Bech32 address specification


    A Bech32 address is at most 90 characters long and consists of:

    1. The human-readable part. It carries data that needs to be conveyed to the reader or is otherwise relevant to the address, and is at least 1 character long. For example, “bc” is used for mainnet addresses and “tb” for testnet.
    2. The separator, which is always “1”. If “1” is allowed inside the human-readable part, the separator is the last “1” in the string.
    3. The data part, which is at least 6 characters long and consists only of alphanumeric characters excluding “1”, “b”, “i” and “o”. For addresses, the payload is the witness program version followed by the witness program itself (from 2 to 40 bytes).


    image

    Why include a separator in the addresses? It unambiguously separates the human-readable part from the data part and avoids potential collisions with other human-readable parts that share a prefix. It also removes the need to restrict the character set of the human-readable part. The separator is “1” because a non-alphanumeric character would make addresses harder to copy (double-click selection does not work across such characters in several applications). Therefore, an alphanumeric character outside the main character set was chosen. Using base 32 makes the address about 15% longer than base 58 would, but this does not matter when addresses are copied.

    Bech32 address checksum
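    The three-part structure can be illustrated with a small Python sketch. `split_bech32` is a hypothetical helper that assumes an all-lowercase input and does not verify the checksum. The character set and the sample address are the ones given in BIP173.

```python
from typing import List, Tuple

CHARSET = "qpzry9x8gf2tvdw0s3jn54khce6mua7l"   # 32 symbols, without 1, b, i, o

def split_bech32(addr: str) -> Tuple[str, List[int]]:
    """Split into the human-readable part and the 5-bit values of the data part."""
    if len(addr) > 90:
        raise ValueError("address is longer than 90 characters")
    pos = addr.rfind("1")                       # the last "1" is the separator
    if pos < 1 or pos + 7 > len(addr):
        raise ValueError("missing human-readable part or data part too short")
    hrp, data = addr[:pos], addr[pos + 1:]
    if any(ch not in CHARSET for ch in data):
        raise ValueError("character outside the Bech32 set in the data part")
    return hrp, [CHARSET.find(ch) for ch in data]

hrp, values = split_bech32("bc1qw508d6qejxtdg4y5r3zarvary0c5xw7kv8f3t4")
print(hrp)          # human-readable part
print(values[0])    # first data symbol: the witness version
print(len(values))  # version + program symbols + 6 checksum symbols
```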


    The last 6 characters of the address are a checksum. The checksum is based on a BCH code, which guarantees the detection of any error affecting no more than 4 characters; the chance that the checksum still matches when more than 4 errors are made is less than 1 in 10^9.

    One of the advantages of BCH codes is that they can also be used to correct errors. However, correction weakens detection: an address with several errors may be "corrected" into a different valid address, and coins sent there would be lost. For this reason BIP173 recommends that software at most point out where in the string an error is likely to be, without changing the address automatically.
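    The checksum computation can be sketched in Python. The functions below follow the reference implementation published with BIP173 (generator constants included); the sample address is the P2WPKH example from the same document.

```python
from typing import List

CHARSET = "qpzry9x8gf2tvdw0s3jn54khce6mua7l"
GENERATOR = [0x3B6A57B2, 0x26508E6D, 0x1EA119FA, 0x3D4233DD, 0x2A1462B3]

def bech32_polymod(values: List[int]) -> int:
    """Run the BCH code over a list of 5-bit values."""
    chk = 1
    for value in values:
        top = chk >> 25
        chk = (chk & 0x1FFFFFF) << 5 ^ value
        for i in range(5):
            if (top >> i) & 1:
                chk ^= GENERATOR[i]
    return chk

def bech32_hrp_expand(hrp: str) -> List[int]:
    # The human-readable part is mixed into the checksum as high and low bits.
    return [ord(c) >> 5 for c in hrp] + [0] + [ord(c) & 31 for c in hrp]

def bech32_verify_checksum(hrp: str, data: List[int]) -> bool:
    return bech32_polymod(bech32_hrp_expand(hrp) + data) == 1

def bech32_create_checksum(hrp: str, data: List[int]) -> List[int]:
    values = bech32_hrp_expand(hrp) + data
    polymod = bech32_polymod(values + [0] * 6) ^ 1
    return [(polymod >> 5 * (5 - i)) & 31 for i in range(6)]

address = "bc1qw508d6qejxtdg4y5r3zarvary0c5xw7kv8f3t4"
data = [CHARSET.find(c) for c in address[3:]]    # everything after "bc1"
print(bech32_verify_checksum("bc", data))

typo = list(data)
typo[5] ^= 1                                     # simulate a one-character error
print(bech32_verify_checksum("bc", typo))
```

    Note that the sketch only detects the error. In line with the recommendation above, it makes no attempt to repair the address.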

    Upper and lower case address


    The lowercase form of each character is used when determining its value for the checksum.

    Software always outputs the whole Bech32 address in lower case. If an uppercase version is required (for example, for presentation purposes or for use in a QR code), it can be produced as a separate step. Software must not accept addresses in which some characters are in upper case and some in lower case. Lower case is usually preferable for display, but upper case should be used in QR codes, because it allows the alphanumeric mode, which is 45% more compact than the byte mode.
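    A minimal Python sketch of these case rules is shown below; `normalize_case` and `for_qr_code` are hypothetical helper names.

```python
def normalize_case(addr: str) -> str:
    """Reject mixed-case strings and return the lowercase form used for decoding."""
    if addr != addr.lower() and addr != addr.upper():
        raise ValueError("mixed-case Bech32 strings are not accepted")
    return addr.lower()

def for_qr_code(addr: str) -> str:
    # Upper case lets a QR encoder use its more compact alphanumeric mode.
    return normalize_case(addr).upper()

print(normalize_case("BC1QW508D6QEJXTDG4Y5R3ZARVARY0C5XW7KV8F3T4"))
print(for_qr_code("bc1qw508d6qejxtdg4y5r3zarvary0c5xw7kv8f3t4"))
```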

    The concepts of weight and block size


    Another important change made by Segregated Witness is the introduction of transaction weight and block weight. Before Segregated Witness, people usually spoke only of transaction size and block size, and the block size was limited to 1 MB. After the update was activated, there are two transaction formats, and both have to be supported.

    To solve this problem, the concept of transaction weight, measured in weight units, was introduced. Each byte of the main part of the transaction now counts as 4 weight units, and each byte of witness data counts as 1. As you might guess, data placed in the witness costs 4 times less in fees than the basic transaction data. This approach lets validators decide which transaction is more profitable in terms of the space it takes in a block and the fee it pays. Weight is calculated using the following formula.

    block weight = base size * 3 + total size

    block weight - the weight of the block (measured in weight units)
    base size - the size of the block without witness data (measured in bytes)
    total size - the size of the block including witness data (measured in bytes)

    The same formula applies to a single transaction: the size of the transaction serialized by the old rules (base size) is multiplied by three, and the result is added to the size of the transaction serialized by the new rules (total size). The result is the weight of the transaction.

    An old-format transaction has the same size whether it is serialized by the old or by the new rules, so its weight is exactly 4 times its size. For a Segregated Witness transaction, the weight is less than 4 times its total size, because the proof of coin ownership is not included in the base size.

    Together with weight, the concept of virtual size was introduced, which is calculated by dividing the weight by four. Virtual size is used to calculate transaction fees, so that validators can judge how profitable it is to include a specific transaction in a block using the familiar fee rate measured in spb (satoshi per byte).

    virtual size = weight units / 4

    Since the weight of a non-SegWit transaction is 4 times its size, its virtual size equals its normal size. Accordingly, the fee calculation does not change for old transactions. For new transactions, the virtual size is smaller than the total size, because the signatures are placed in a separate structure. Thus, new transactions can pay smaller fees and still have the same priority with miners for inclusion in a block. At the same time, the maximum block size without witness data (base size) remains 1 MB, and the maximum block weight is 4,000,000 weight units (often loosely called 4 MB).
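    The two formulas translate directly into Python. The byte sizes in the example are hypothetical, and rounding the virtual size up to a whole number follows BIP141.

```python
import math

def weight(base_size: int, total_size: int) -> int:
    """Weight in weight units: base size * 3 + total size."""
    return base_size * 3 + total_size

def virtual_size(weight_units: int) -> int:
    # Weight divided by four, rounded up to the next integer.
    return math.ceil(weight_units / 4)

# Old-format transaction: both serializations are identical (hypothetical 226 bytes).
legacy_weight = weight(base_size=226, total_size=226)
legacy_vsize = virtual_size(legacy_weight)

# SegWit transaction: 110 bytes without witness, 218 bytes with it (hypothetical).
segwit_weight = weight(base_size=110, total_size=218)
segwit_vsize = virtual_size(segwit_weight)

print(legacy_weight, legacy_vsize)
print(segwit_weight, segwit_vsize)
```

    At the same fee rate per virtual byte, the second transaction pays for 137 virtual bytes even though 218 bytes are transmitted and stored.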

    Here a logical question may arise: “What will the actual block size be with witness data?” An exact answer is impossible. Obviously, the value lies in the range from 1 MB to 4 MB, but a more precise theoretical estimate can be made: about 1.8 MB. Where does this value come from? In a typical block, about 60% of the data is proof of coin ownership. If we calculate the weight of 1 MB of such transactions, with 40% base data and 60% witness data, we obtain the following.

    400,000 bytes * 4 = 1,600,000 weight units
    600,000 bytes * 1 = 600,000 weight units
    1,600,000 + 600,000 = 2,200,000 weight units
    4,000,000 / 2,200,000 ≈ 1.82, that is, about 1.8 MB

    That is, the effective block size can be expected to be about 1.8 MB. In practice, of course, this value depends entirely on the composition of transactions in a particular block.

    Update Adoption Statistics
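    The estimate can be generalized with a short Python sketch. `max_block_bytes` is a hypothetical helper, and the witness share is a free parameter, not a measured value.

```python
MAX_BLOCK_WEIGHT = 4_000_000   # weight units

def max_block_bytes(witness_share: float) -> float:
    """Largest total block size in bytes for a given fraction of witness data."""
    # Each base byte weighs 4 units and each witness byte weighs 1 unit.
    weight_per_byte = 4 * (1 - witness_share) + 1 * witness_share
    return MAX_BLOCK_WEIGHT / weight_per_byte

for share in (0.0, 0.6, 1.0):
    print(share, round(max_block_bytes(share)))
```

    With no witness data the limit is the familiar 1 MB; a share of 1.0 is the unreachable 4 MB bound, since every transaction needs some base data.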


    As of July 2018, SegWit transactions exceeded 35% of all transactions on the Bitcoin network. At the same time, some major Bitcoin services and digital wallets implemented support for Segregated Witness only recently (for example, Electrum and Bitxfy).

    image
    The diagram is taken from a BitMex Research study.

    The resulting block sizes also changed noticeably after the update was activated. When the flow of new transactions grows, almost all blocks turn out significantly larger than 1 MB, and sometimes larger than 2 MB. In addition, after the activation of SegWit, the need for an urgent solution to the throughput problem no longer seems so acute.

    image
    According to BitMex Research

    If you compare the average transaction fee with the number of transactions in the new format, you can also see a very strong correlation between the changes in these values.

    And let's not forget that Segregated Witness enabled the development of off-chain solutions on top of the Bitcoin protocol. Of course, the adoption of the Lightning Network is a much harder task than that of SegWit, but the work is well under way and there are already significant achievements.

    Frequently asked questions

    Frequently asked Questions


    - Is it correct to say that RBF (replace-by-fee) will not work for Segregated Witness transactions?

    No, replace-by-fee works for Segregated Witness transactions, because it does not depend on the spending rules. It relies on spending the same coins and on the sequence number specified in the transaction input. If you create a new transaction that spends the same coins with a higher fee and provide valid proof that you own these coins, you can replace the previous transaction as before.

    - How can I change the hash of an unconfirmed transaction?

    The transaction hash is the result of applying the hash function to all the data stored in the transaction. The scriptSig, which is part of the transaction, is included in the hash calculation but is not covered by the signature. Minor changes in this field that do not affect the outcome of signature verification change the transaction hash value. This means that the signature remains valid and the transaction remains valid, but its hash value changes.

    - How is witness data stored in a transaction?

    As noted, Segregated Witness introduced a new serialization format for transactions. In addition to the set of inputs and outputs, further bytes are added in which the evidence is stored. Put as simply as possible: the serialized data states that there are two transaction inputs (the bytes of the first input and the bytes of the second input) and two outputs, and after them come two more sets of witness data, also written as bytes. In effect, the proof of coin ownership has simply been moved to another place in the serialization.

    - Why not use an existing set of characters, such as RFC3548 or z-base-32 for Bech32 addresses?

    The character set is chosen to minimize ambiguity caused by visually similar characters. The ordering is chosen to minimize the number of pairs of similar characters that differ in more than one bit. Since the checksum is designed to maximize the probability of detecting a small number of bit errors, this ordering improves its effectiveness for typical mistakes.
