Codex32: A Shamir Secret Sharing Scheme
Blockstream Research

Andrew Poelstra

Since 2020, alongside Blockstream Research, I have been playing with the idea of cryptography without electronic computers. This is not a new idea: the entire history of cryptography until the 20th century was like this. But modern cryptography, with hundred-digit numbers and complex algorithms, has always used computers. And with good reason: computers can perform operations in a billionth of a second that might take a human minutes to perform.

A billion-fold slowdown is the least of the problems for a hand-computable cryptosystem. Humans are not just slow; there are only so many precise instructions they can keep in mind at once, they have limited patience for reading such instructions, and they struggle to do tedious computations for long. Humans also make frequent mistakes, even with things they have successfully done many times before.

As it turns out, despite the complexity of managing secret data, we can still perform many operations by hand with the help of worksheets and simple tools that can be printed and cut out.

Today, we are launching Codex32: A Shamir Secret Sharing Scheme, a new booklet available on the Blockstream store. Codex32 contains tear-out worksheets for checksumming and secret sharing, paper computers that can be cut out and assembled, a worksheet for removing bias from dice rolls, and beautiful artwork by Micaela Paez and M. Lufti' As'ad.

We will explore what the codex does, but first, let's address the question that has been nagging you since the first sentence of this post: Why?

The premise behind codex32 is that you will be generating, checksumming and splitting a BIP 32 master seed, from which you will derive Bitcoin addresses.

Why Hand Computation?

Electronic computers are pretty amazing. They can perform calculations far faster than humans, without mistakes, for years on end without rest or boredom. But superhuman speed means that their actions cannot really be checked by humans.

This is not just philosophy. This problem is why we have a multi-billion-dollar cybersecurity industry and the open-source software movement. This is why security researchers insist that voting machines produce paper records that can be counted by hand. Computers can be infected by malware, their code could be malicious, they may leak secret data through side channels or by failing to delete things, and they may be buggy. These problems are nearly intractable when it comes to software, but they also apply to hardware, where verification requires complex tools and expertise. And as detailed in The Age of Surveillance Capitalism, even computers that are "working correctly" are likely to be working against their users' interest.

Every time you replace hardware or update software, all these concerns are renewed. (If you do not upgrade your software, they simply grow without limit.) Updates also risk compatibility breaks leaving your data inaccessible.

In our day-to-day lives, we mostly accept these things as the cost of being part of a digital society, but when it comes to the secret key data that controls a bearer asset such as Bitcoin, this cost may be too great.

By computing with pen and paper, we can be assured that no secret data appears anywhere we did not write it. We can create our own random data in a transparent way. We can choose how long we want to take to do various operations, confounding timing attacks. We can be assured that, as long as our instructions are written somewhere, perhaps printed and stored in a safe, our processes will remain compatible. And these assurances are not only real, but they feel real, giving us a peace of mind that electronic computers never can.

Brain Wallets and Paper Wallets

These ideas might remind readers of some old ideas bouncing around the BitcoinTalk forums. One example is the "paper wallet," in which a user writes a seed (or private key) on a piece of paper and stores it offline. Metal devices such as Cryptosteels or ColdTi are a natural extension of this idea, and they are much more likely to survive natural disasters or flooded basements.

Offline storage of seed data is a good idea for increasing the security of your bitcoin holdings. It’s recommended to use some sort of metal device, though acid-free paper could work in a dry and fireproof location. With such storage there is a tradeoff between creating more copies (increasing the risk of theft) and creating fewer (increasing the risk of loss). The codex gives users more freedom to make this tradeoff, by allowing them to split their data into multiple "shares" such that the original data can only be recovered when enough of them are brought together.

Another idea from the same era is that of a "brain wallet" (or brainwallet) in which the user simply memorizes their secret data in lieu of physical backups. We strongly discourage this. One problem with this is that it encourages users to choose weak seeds that are too short, too highly structured, or which exist in printed literature. Even if such seeds are tweaked and prodded in various ways, they would not have enough information to withstand an attacker with a lot of computational power. One of the earliest Bitcoin scams was a website designed to "help" users produce such weak seeds for use with brain wallets.

The problem is ultimately that it is hard to memorize good randomness. The structure that makes common phrases, bits of poetry, and short stories so memorable is also what makes it easier for attackers to guess your words.

The second problem with brain wallets is that human memory is fallible. It is easy to convince ourselves that we are the kind of highly intelligent people whose memories would not fail. But intelligence will not help if you hit your head, get a fever, experience trauma, or simply lose the motivation to do your memory-refreshing ritual after a few years.

The correct way to generate seed data is to produce at least 128 bits of uniformly random data, and the correct way to store it is outside of a brain. The codex provides a way to produce such data by rolling dice and applying a von Neumann extractor to eliminate bias. There are other ways to get good seeds from dice, often referred to as diceware.
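To make the idea of a von Neumann extractor concrete, here is a minimal sketch in Python. It is not the codex's worksheet procedure: the mapping from a die roll to a bit (odd or even) and the function names `rolls_to_bits` and `von_neumann_extract` are assumptions made for illustration. The extractor reads bits in pairs, discards pairs whose two bits are equal, and keeps the first bit of each remaining pair. If the rolls are independent, the output bits are unbiased even when the die is not fair.

```python
def rolls_to_bits(rolls):
    """Hypothetical mapping for illustration: odd roll -> 1, even roll -> 0."""
    return [roll % 2 for roll in rolls]


def von_neumann_extract(bits):
    """Remove bias from independent (but possibly biased) bits.

    Bits are consumed in non-overlapping pairs. The pairs (0, 1) and (1, 0)
    are equally likely for independent bits, so the first bit of such a pair
    is unbiased. The pairs (0, 0) and (1, 1) are discarded.
    """
    output = []
    for first, second in zip(bits[0::2], bits[1::2]):
        if first != second:
            output.append(first)
    return output


# Usage: eight rolls yield at most four output bits, usually fewer.
rolls = [1, 2, 4, 3, 6, 6, 5, 5]
unbiased = von_neumann_extract(rolls_to_bits(rolls))
```

The price of removing bias this way is throughput: many rolls are thrown away, so collecting 128 bits takes well over 256 rolls.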

How Much Can You Actually Do by Hand?

With this background in place, let's discuss the various things that users might want to do with their seed backups.

  • Verify the integrity of the backup.
  • Verify that the coins controlled by the backup have not moved.
  • Re-split the backup, if it is a secret shared backup.
  • Recover or initialize a new wallet from the backup.

Some users might want to skip the "new wallet" and simply do everything on paper. Unfortunately, there is currently no way to derive addresses or sign transactions without using electronics. For the above tasks, though, we can.

Verify backup integrity. Long-term storage media might become corrupted or worn down so that they become unreadable. There is a mechanism to detect and fix small numbers of independent errors, called an error correction code or checksum. The codex provides worksheets enabling users to create and verify checksums on their data. If the checksum passes, everything is good. If it does not, you know you have made a mistake somewhere. The process of creating a checksum takes 30-60 minutes, and needs to be done twice to catch mistakes made the first time. Verifying a checksum takes the same amount of time, and only needs to be done once.

Verify the coins are still there. There is not much an offline backup can do to check the state of the blockchain. Instead, use an online watch-only wallet to monitor the blockchain for any movement of your coins, without ever needing access to secret data.

Split and re-split the backup. This is the most interesting and powerful feature of codex32. The codex provides the tools to create a secret split across several "shares" such that the secret can be recovered by a threshold number of them. Typical threshold values are 2 or 3. If an attacker has fewer shares than this, they learn nothing about the secret.

With codex32, the process is that a user generates threshold-many random initial shares, then using these shares, derives additional shares as needed, up to 31 in total. Later, given enough shares, they can reconstruct the secret.
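The flow described above (random initial shares, derived shares, later recovery) can be sketched in Python as polynomial interpolation over a 32-element field, applied to each symbol position independently. This is an illustration under stated assumptions, not the codex's worksheets: the reduction polynomial x^5 + x^3 + 1, the use of integers 0..31 as share indices, the reserved `SECRET_INDEX = 0`, and the function names are all choices made for this sketch. A 32-element field has 32 evaluation points, so reserving one for the secret leaves 31 for shares.

```python
import secrets

# GF(32): elements are the integers 0..31 and addition is XOR.
# Assumption for this sketch: reduce by x^5 + x^3 + 1.
REDUCTION_POLY = 0b101001
SECRET_INDEX = 0  # hypothetical: the evaluation point reserved for the secret


def gf_mul(a, b):
    """Multiply two GF(32) elements with shift-and-add reduction."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a & 0b100000:
            a ^= REDUCTION_POLY
    return result


def gf_inv(a):
    """Invert a nonzero element: the group has order 31, so a^30 = a^-1."""
    if a == 0:
        raise ZeroDivisionError("zero has no inverse in GF(32)")
    result = 1
    for _ in range(30):
        result = gf_mul(result, a)
    return result


def interpolate_at(shares, x):
    """Evaluate at index x the polynomial passing through the given shares.

    shares maps a share index to a list of GF(32) symbols of equal length.
    """
    length = len(next(iter(shares.values())))
    output = [0] * length
    for xi, symbols in shares.items():
        # Lagrange coefficient: product over j != i of (x - xj) / (xi - xj).
        # Subtraction in this field is XOR.
        numerator, denominator = 1, 1
        for xj in shares:
            if xj != xi:
                numerator = gf_mul(numerator, x ^ xj)
                denominator = gf_mul(denominator, xi ^ xj)
        coefficient = gf_mul(numerator, gf_inv(denominator))
        for position in range(length):
            output[position] ^= gf_mul(coefficient, symbols[position])
    return output


# Usage: threshold 2, so start from two random initial shares.
initial = {
    1: [secrets.randbelow(32) for _ in range(8)],
    2: [secrets.randbelow(32) for _ in range(8)],
}
derived_share = interpolate_at(initial, 3)           # an additional share
secret = interpolate_at(initial, SECRET_INDEX)       # the shared secret
recovered = interpolate_at({2: initial[2], 3: derived_share}, SECRET_INDEX)
```

In this sketch any two of the three shares recover the same secret, while a single share is consistent with every possible secret, which is the threshold property described above.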

The derivation process ensures that if your initial shares have valid checksums, then the derived shares and final secret automatically will as well. This means that you can create your initial shares, compute their checksums, derive additional shares, and verify their checksums. If you make any mistakes during this tedious process, the checksum worksheet will catch them.
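A short sketch of why this works, under the assumption that a string is valid exactly when a fixed linear map H over the field sends it to a fixed target value c. The symbols H, c, S_i, D and λ_i are notation introduced for this sketch. A derived share D is a combination of the k initial shares S_i, weighted by Lagrange coefficients λ_i, and those coefficients always sum to 1 because interpolating a constant polynomial returns the same constant:

```latex
\begin{align*}
D &= \sum_{i=1}^{k} \lambda_i S_i, \qquad \sum_{i=1}^{k} \lambda_i = 1, \\
H(D) &= \sum_{i=1}^{k} \lambda_i \, H(S_i)
      = \Big( \sum_{i=1}^{k} \lambda_i \Big) c
      = c .
\end{align*}
```

So if every H(S_i) equals c, the derived string satisfies the same condition. A failed check on a derived share therefore points to an arithmetic slip somewhere in the derivation.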

Deriving shares takes about 5-10 minutes per input share, and the final checksum check takes 30-60 minutes.

Recover or initialize a new wallet. Wallets can take codex32 shares to initialize themselves. The process is much the same as existing workflows using seed words.

Right now, there is an open pull request to Bitcoin Core to support codex32, which is included in the version of Core used in Bails. Several other wallets have indicated their intention to eventually support codex32, including Blockstream Green, Anchorwatch, and Liana. The technical difficulty of such support is comparable to the difficulty of supporting Segwit addresses back in 2016, except easier because codex32 reuses much of the same logic.

What's Next?

The immediate next step for codex32 is to improve wallet support. This means improving and polishing the rust-codex32 library and introducing support into BDK.

We would also like to implement error correction logic and expand the codex32 website to support this as well as more interactive functionality. Error correction currently requires computers. But at least for single errors, we plan to implement by-hand correction using lookup tables.

The codex has instructions for splitting and checksumming secrets. But it does not provide a lot of guidance for what to do next: distributing shares to people you trust and making a schedule to verify their integrity.

We would like to recommend that people verify their shares every year, but the current process is a pain to set up: you need to re-read the instructions, maybe assemble new paper computers, and then spend 30-60 minutes on the verification. And if this fails, in the absence of error correcting tables, you need to do everything again. So this is not a realistic recommendation.

However, we have a solution in mind: quickchecks are partial checksum checks that still detect 99.9% of errors, and can be contained (instructions, lookup tables, and all) on a single sheet of paper. There are seven quickchecks which together amount to a full checksum verification. Users can store many copies of the full set with their shares; then every year they simply grab the next page, follow the instructions to fill it out (which will take 10-15 minutes rather than 30-60), and destroy it.

Right now, quickchecks exist only as mathematics. We need to do the work of creating worksheets, tweaking parameters, and laying out instructions.

The above plans require a bit of elbow grease, but we have some big ideas for future research. From a given seed, is it possible to compute hashes to derive additional seeds, as is done in BIP 32? Is it possible to perform an elliptic curve multiplication, which would allow the derivation of addresses and signing of transactions? Is it possible to do encryption or handshaking protocols? As of this writing, we do not really know.
