Shamir's secret sharing scheme

Original author: Eric Rafaloff
  • Transfer
Consider the scenario when it is necessary to ensure the security of the bank vault. It is considered completely inaccessible without a key, which is given to you on the first day of work. Your goal is to securely save the key.

Suppose you decide to keep the key with you all the time, giving access to the repository as needed. But you will quickly understand that such a solution does not normally scale in practice, because every time you open your store requires your physical presence. What about the vacation you were promised? In addition, the question is even more frightening: what if you lost the only key?

With the thought of vacation, you decided to make a copy of the key and entrust it to another employee. However, you understand that this is also not ideal. By doubling the number of keys, you also double the possibility of a key theft.

In desperation, you destroy the duplicate and decide to split the source key in half. Now you think two trusted people with key fragments must be physically present in order to collect the key and open the vault. This means that a thief needs to steal two fragments, which is twice as difficult to steal one key. However, you soon realize that this scheme is not much better than just one key, because if someone loses half the key, the full key cannot be restored.

The problem can be solved with a series of additional keys and locks, but with this approach, many keys and locks are quickly required . You decide that in an ideal scheme you need to divide the key so that security does not rely entirely on one person. You also conclude that there must be a certain threshold for the number of fragments, so that if one fragment is lost (or if a person goes on vacation), the entire key remains functional.

How to share a secret


This type of key management scheme was thought by Adi Shamir in 1979 when he published his work “How to share a secret” . The article briefly explains the so-called$ (k, n) $ threshold scheme for effectively sharing a secret value (for example, a cryptographic key) into $ n $parts. Then, when and only when at least$ k $ of $ n $ parts are assembled, you can easily recover the secret $ S $.

From a security point of view, an important feature of this scheme is that an attacker should not learn absolutely nothing if he does not even have$ k $parts. Even having$ k - 1 $parts should not give any information. We call this property semantic security .

Polynomial interpolation


Shamir threshold scheme $ (k, n) $built around the concept of polynomial interpolation . If you are not familiar with this concept, it is actually quite simple. In general, if you have ever drawn points on a graph, and then connected them with lines or curves, you have already used it!


An unlimited number of polynomials of degree 2 can be drawn through two points. To choose one of them, a third point is needed. Illustration: Wikipedia

Consider a polynomial with degree one,$ f (x) = x + 2 $. If you want to plot this function on a graph, how many points do you need? Well, we know that this is a linear function that forms a line and therefore needs at least two points. Next, we consider a polynomial function with degree two,$ f (x) = x ^ 2 + 2x + 1 $. This is a quadratic function, so at least three points are required for plotting. What about a polynomial with degree three? At least four points. And so on and so forth.

The really cool thing about this property is that, given the degree of the polynomial function and, at least,$ degree + 1 $points, we can infer additional points for this polynomial function. Extrapolation of these additional points is called polynomial interpolation .

Making a secret


Perhaps you have already understood that Shamir’s smart scheme comes into play here. Suppose our secret$ S $ - this $ 42 $. We can turn$ S $ to point on the chart $ (0.42) $ and come up with a polynomial function with degree $ k-1 $which satisfies this point. Recall that$ k $will be our threshold for the required fragments, so if we set the threshold to three fragments, then we must choose a polynomial function with degree two.

Our polynomial will have the form$ f (x) = a_0 + a_1x + a_2x ^ 2 + a_3x ^ 3 + ... + a_ {k-1} x ^ {k-1} $where $ a_0 = S $ and $ a_1, ..., a_ {k-1} $- randomly selected positive integers. We just build a polynomial with degree$ k-1 $where free ratio $ a_0 $ - This is our secret $ S $and each of the following $ k-1 $members have a randomly chosen positive coefficient. If you go back to the original example and assume that$ S = 42, k = 3, a_1, ..., a_ {k-1} = [3,5] $, then we get the function $ f (x) = 42 + 3x + 5x ^ 2 $.

At this stage, we can generate fragments by connecting$ n $ unique integers in $ f (x) = 42 + 3x + 5x ^ 2 $where $ x \ neq 0 $(because it is our secret). In this example, we want to distribute four fragments with a threshold of three, therefore we randomly generate points$ (18, 1716), (27, 3768), (31, 4940), (35, 6272) $and send one point to each of the four trusted people, the key keepers. We also inform people that$ k = 3 $as it is considered public information and necessary for recovery $ S $.

Recovery secret


We have already discussed the concept of polynomial interpolation and the fact that it underlies the Shamir threshold scheme. $ (k, n) $. When any three of the four proxies want to restore$ S $they just need to interpolate $ f (0) $with its own unique points. For this they can define their points.$ (x_1, y_1), ..., (x_k, y_k) = (18, 1716), (27, 3768), (31, 4940) $and calculate the Lagrange interpolation polynomial using the following formula. If programming is clearer to you than mathematics, then pi is essentially an operator forthat multiplies all the results, and sigma is that forwhich adds up.

$ P (x) = \ sum_ {j = 1} ^ {k} p_j (x) $


$ P_j (x) = y_j \ prod _ {\ scriptstyle m = 1 \ atop \ scriptstyle m \ neq j} ^ {k} \ frac {x-x_m} {x_j-x_m} $


With $ k = 3 $ we can solve this as follows and return our original polynomial function:

$ \ begin {aligned} P (x) & = {y_1} \ left ({x-x_2 \ over x_1-x_2} \ cdot {x-x_3 \ over x_1-x_3} \ right) + {y_2} \ left ( {x-x_1 \ over x_2-x_1} \ cdot {x - \ _ 3 \ over x_2-x_3} \ right) + {y_3} \ left ({x-x_1 \ over x_3-x_1} \ cdot {x-x_2 \ over x_3-x_2} \ right) \\ P (x) & = {1716} \ left ({x-27 \ over 18-27} \ cdot {x-31 \ over 18-31} \ right) + {3768 } \ left ({x-18 \ over 27-18} \ cdot {x-31 \ over 27-31} \ right) + {4940} \ left ({x-18 \ over 31-18} \ cdot {x -27 \ over 31-27} \ right) \\ P (x) & = 42 + 3x + 5x ^ 2 \ end {aligned} $


As we know that $ S = p (0) $recovery $ S $ carried out simply:

$ \ begin {aligned} P (0) & = 42 + 3 (0) + 5 (0) ^ 2 \\ P (0) & = 42 \ end {aligned} $


Using unsafe integer arithmetic


Although we successfully applied the main idea of ​​Shamir $ (k, n) $, we still have a problem that we have ignored until now. Our polynomial function uses insecure integer arithmetic. Keep in mind that for each additional point that an attacker receives on the graph of our function, there are fewer opportunities for other points. You can see it with your own eyes when building a graph with an increase in the number of points for a polynomial function using integer arithmetic. This is counterproductive to our stated security goal, because the attacker should absolutely not know anything until they have at least$ k $fragments.

To demonstrate how weak the circuit with integer arithmetic is, consider a scenario in which an attacker received two points$ (18, 1716), (27.3768) $ and knows public information that $ k = 3 $. From this information he can deduce$ f (x) $equal to two, and connect to the formula the known values $ x $ and $ f (x) $.

$ \ begin {aligned} f (x) & = a_0 + a_1x + a_2x ^ 2 + a_3x ^ 3 + ... + a_ {k-1} x ^ {k-1} \\ f (x) & = S + a_1x + a_2x ^ 2 \\ f (18) \ equiv 1716 & = S + a_118 + a_218 ^ 2 \\ f (27) \ equiv 3768 & = S + a_127 + a_227 ^ 2 \ end {aligned} $


Then the attacker can find $ a_1 $by counting $ f (27) - f (18) $:

$ \ begin {aligned} 3768 - 1716 & = (S - S) + (27a_1 - 18a_1) + (729a_2 - 324a_2) \\ 2052 & = 9a_1 + 405a_2 \\ 9a_1 & = 2052 - 405a_2 \\ a_1 & = \ frac {2052 - 405a_2} {9} \\ a_1 & = 228 - 45a_2 \ end {aligned} $


Since we have defined $ a_1, ..., a_ {k-1} $ like randomly selected positive integers, there are a limited number of possible $ a_2 $. With this information, the attacker can display$ a_2 \ in [1,2,3,4,5] $because anything more than 5 will do $ a_1 $negative. This turns out to be true, since we have identified$ a_2 = 5 $

The attacker can then calculate the possible values. $ S $by replacing $ a_1 $ at $ f (18) $:

$ \ begin {aligned} 1716 & = S + a_118 + a_218 ^ 2 \\ 1716 & = S + 18 \ left (228 - 45a_2 \ right) + a_218 ^ 2 \\ 1716 - S & = 18 \ left (228 - 45a_2 \ right) + a_218 ^ 2 \\ -S & = 18 \ left (228 - 45a_2 \ right) + a_218 ^ 2 - 1716 \\ -S & = 4104 - 810a_2 + a_218 ^ 2 - 1716 \\ S & = -4104 + 810a_2 - a_218 ^ 2 + 1716 \\ S & = 810a_2 - 324a_2 -2388 \\ S & = 486a_2 - 2388 \ end {aligned} $


With a limited set of options for $ a_2 $ it becomes clear how easy it is to find and check the values $ S $. There are only five options.

Solving the problem of unsafe integer arithmetic


To eliminate this vulnerability, Shamir proposes using modular arithmetic, replacing $ f (x) $ on $ f (x) \ mod p $where $ p \ in \ mathbb {P}: p> S, p> n $ and $ \ mathbb {P} $- the set of all primes.

Quickly recall how modular arithmetic works. An arrow watch is a familiar concept. She uses watches that are$ \ mod 12 $. As soon as the hour hand passes twelve, it returns to one. An interesting feature of this system is that just by looking at the clock, we cannot deduce how many turns the hour hand made. However, if we know that the hour hand has passed 12 four times, we can completely determine the number of hours elapsed using a simple formula$ a = mq + r $where $ m $ - this is our divider (here $ m = 12 $), $ q $ - is the coefficient (how many times the divider without remainder goes into the original number, here $ q = 4 $), but $ r $ - this is the remainder, which usually returns the modular operator call (here $ r = 1.5 $). Knowing all of these values ​​allows us to solve the equation for$ a = 49.5 $, but if we miss the coefficient, we can never restore the original value.

You can demonstrate how this improves the security of our circuit by applying the circuit to our previous example and using$ p = 73 $. Our new polynomial function$ f '(x) = 42 + 3x + 5x ^ 2 \ mod 73 $, and new points $ (18, 37), (27, 45), (31, 49), (35, 67) $. Now the key keepers can once again use polynomial interpolation to restore our function, only this time the operations of addition and multiplication must be accompanied by a reduction modulo$ p $ (eg $ 48 + 93 \ mod 73 = 68 $).

Using this new example, assume that the attacker has learned two of these new points,$ (18, 37), (27, 45) $and public information $ k = 3, p = 73 $. This time, the attacker, based on all the information he has, displays the following functions, where$ \ mathbb {N} $ - a set of all positive integers, and $ q_x $ represents the modulus coefficient $ f '(x) $.

$ \ begin {aligned} f '(x) & = S + a_1x + a_2x ^ 2 \ mod 73 \\ f' (x) & = S + a_1x + a_2x ^ 2 - 73q_x: q_x \ in \ mathbb {N} \\ f '(18) \ equiv 37 & = S + a_118 + a_218 ^ 2 - 73q_ {18} \\ f' (27) \ equiv 45 & = S + a_127 + a_227 ^ 2 - 73q_ {27} \ end {aligned} $


Now our attacker finds again $ a_1 $by calculating $ f '(27) - f' (18) $:

$ \ begin {aligned} 45 - 37 & = (S - S) + (27a_1 - 18a_1) + (729a_2 - 324a_2) + (-73q_ {27} - (-73q_ {18})) \\ 8 & = 9a_1 + 405a_2 + 73 (q_ {18} - q_ {27}) \\ -9a_1 & = -8 + 405a_2 + 73 (q_ {18} - q_ {27}) \\ 9a_1 & = 8 - 405a_2 - 73 (q_ {18} - q_ {27}) \\ a_1 & = \ frac {8 - 405a_2 - 73 (q_ {18} - q_ {27})} {9} \\ a_1 & = \ frac {8} {9} - 45a_2 - \ frac {73} {9} (q_ {18} - q_ {27}) \ end {aligned} $


Then he tries again to withdraw $ S $by replacing $ a_1 $ at $ f '(18) $:

$ \ begin {aligned} 37 & = S + 18 \ left (\ frac {8} {9} - 45a_2 - \ frac {73} {9} (q_ {18} - q_ {27}) \ right) + a_218 ^ 2 - 73q_ {18} \\ -S & = 18 \ left (\ frac {8} {9} - 45a_2 - \ frac {73} {9} (q_ {18} - q_ {27}) \ right) + a_218 ^ 2 - 73q_ {18} - 37 \\ S & = -18 \ left (\ frac {8} {9} - 45a_2 - \ frac {73} {9} (q_ {18} - q_ {27} ) \ right) - 324a_2 + 73q_ {18} + 37 \\ S & = 486a_2 + 21 + 219q_ {18} - 146q_ {27} \ end {aligned} $


This time he has a serious problem. There are no values ​​in the formula.$ a_2 $, $ q_ {18} $ and $ q_ {27} $. Since there are an infinite number of combinations of these variables, he cannot get any additional information.

Safety considerations


Shamir’s secret sharing scheme offers security in terms of information theory . This means that mathematics is resistant even against an attacker with unlimited computing power. However, the scheme still contains several known problems.

For example, the Shamir scheme does not create verifiable fragments , that is, people can freely present fake fragments and interfere with the restoration of the correct secret. A hostile fragment keeper with enough information may even produce another fragment by changing$ S $at its discretion. This problem is solved with the help of verifiable secret sharing schemes , such as the Feldman scheme.

Another problem is that the length of any fragment is equal to the length of the corresponding secret, so the length of the secret is easy to determine. This problem is solved by the trivial packing of the secret with arbitrary numbers up to a fixed length.

Finally, it is important to note that our safety concerns may go beyond the scheme itself. For real-world cryptographic applications, there is often a threat of attacks via third-party channels, when an attacker tries to extract useful information from the application execution time, caching, failures, etc. If this is a concern, you should carefully consider the use of protective measures, such as functions and search with constant execution time, prevent the storage of disk space and consider a number of other things that are beyond the scope of this article.

Demo


On this page there is an interactive demonstration of the Shamir secret sharing scheme. The demonstration is based on the ssss-js library , which itself is a JavaScript port of the popular ssss program . Note that calculating large values$ k $, $ n $ and $ S $ may take some time.

Also popular now: