Bitcoin transaction malleability: looking at the bytes

"Malleability" of Bitcoin transactions has recently become a major issue. This article looks at how transactions are modified, at the byte level.

I have a new article The malleability attack graphed hour-by-hour. Check it out too.

An attacker has been modifying Bitcoin transactions, causing them to have a different hash. Recently an attacker has been taking transactions on the Bitcoin peer-to-peer network, modifying them slightly, and rapidly sending them to a miner. The modified transaction often gets mined first, pre-empting the original transaction. The attacker can only make "trivial" changes to a transaction, so exactly the same Bitcoin transfer happens as was intended - the same amount is moved between the same addresses, so this attack seems entirely pointless. However, each transaction is identified by a cryptographic hash, and even a trivial change to the transaction causes the transaction hash to change. Changing the hash of a transaction can have unexpected effects on the Bitcoin system.

A very quick explanation of transactions

A Bitcoin transaction moves bitcoins from one address to another. A transaction must be signed with the private key corresponding to the address, so only the owner of the bitcoins can move them. (This signing process is surprisingly complex.) The signature is then put in the middle of the transaction. Finally, the entire transaction (including the signature) is cryptographically hashed, and this hash is used to identify the transaction in the Bitcoin system. The important data is protected by the signature and can't be modified by an attacker. But there are few ways the signature itself can be changed, but still remain valid.

(This is oversimplified. For more details, see Bitcoins the hard way.)

Looking at a modified transaction

To find a transaction suffering from malleability, I looked at the unconfirmed transactions page. If a transaction gets modified, only one version will get mined successfully (and actually transfer bitcoins), and the other will remain unconfirmed (and have no effect). Among the many conditions enforced in mined blocks, the same bitcoins can't be spent twice, so both transactions will never be mined. This is why having two versions of a transaction doesn't result in two payments.

I picked a random unconfirmed transaction from Feb 11 to examine. (Unfortunately this transaction has been discarded since I wrote this article, breaking my links. But you can look up a different one if you want.) helpfully includes a banner warning that something is wrong:

Warning! this transaction is a double spend of 112593804. You should be extremely careful when trusting any transactions to/from this sender.

Looking at the transactions, everything seems fine:

The confirmed transaction takes 0.01 BTC from 1JRQExbG6WAhPCWC5W5H7Rn1LannTx1Dix and transfers 0.0099 BTC to 1Hbum99G9Lp7PyQ2nYqDcN3jh5aw878bFt (the remainder is a mining fee of 0.001 BTC). This transaction has hash bba8c3d044828f099ae3bc5f3beaff2643e0202d6c121753b53536a49511c63f.

The unconfirmed transaction takes 0.01 BTC from 1JRQExbG6WAhPCWC5W5H7Rn1LannTx1Dix and transfers 0.0099 BTC to 1Hbum99G9Lp7PyQ2nYqDcN3jh5aw878bFt (the remainder is a mining fee of 0.001 BTC). This transaction has hash d36a0fcdf4b3ccfe114e882ef4159094d2012bc8b72dc6389862a7dc43dfa61c.

The scripts of both transactions appear identical:

Input Scripts
30450220539901ea7d6840eea8826c1f3d0d1fca7827e491deabcf17889e7a2e5a39f5a1022100fe745667e444978c51fdba6981505f0a68619f0289e5ff2352acbd31b3d23d8701 046c4ea0005563c20336d170e35ae2f168e890da34e63da7fff1cc8f2a54f60dc402b47574d6ce5c6c5d66db0845c7dabcb5d90d0d6ca9b703dc4d02f4501b6e44 OK
Output Scripts
OP_DUP OP_HASH160 b61c32ac39c63f919c4ce3a5df77590c5903d975 OP_EQUALVERIFY OP_CHECKSIG 
Both transactions look identical: the bitcoins are moving between the same accounts in both cases, the amounts are equal, and the scripts look identical. So why do they have different hashes? A clue is the unconfirmed transaction is 224 bytes and the confirmed transaction is 228 bytes.

Looking at the raw transactions also fails to show what is happening:

      "scriptSig":"30450220539901ea7d6840eea8826c1f3d0d1fca7827e491deabcf17889e7a2e5a39f5a1022100fe745667e444978c51fdba6981505f0a68619f0289e5ff2352acbd31b3d23d8701 046c4ea0005563c20336d170e35ae2f168e890da34e63da7fff1cc8f2a54f60dc402b47574d6ce5c6c5d66db0845c7dabcb5d90d0d6ca9b703dc4d02f4501b6e44"
      "scriptPubKey":"OP_DUP OP_HASH160 b61c32ac39c63f919c4ce3a5df77590c5903d975 OP_EQUALVERIFY OP_CHECKSIG"

Even though the scripts are mostly in hex in this raw display, they have been parsed slightly, which hides what is going on. We need to get the full scripts here and here.

The unconfirmed transaction has script:

The confirmed transaction has script:
There are a couple differences (highlighted in red). But what do they mean?

This script is the scriptSig, the signature of the transaction using the sender's private key. This signature proves the sender owns the bitcoins. However, the scriptSig isn't just a simple signature, but is actually a program written in Bitcoin's Script language. This program pushes the signature data onto the execution stack. The program from the unconfirmed script is interpreted as follows:

Y 00fe745667e444978c51fdba6981505f0a68619f0289e5ff2352acbd31b3d23d87
public key type04
Y 02b47574d6ce5c6c5d66db0845c7dabcb5d90d0d6ca9b703dc4d02f4501b6e44

The program from the confirmed script is interpreted as follows:

OP_PUSHDATA2 00484d 48 00
Y 00fe745667e444978c51fdba6981505f0a68619f0289e5ff2352acbd31b3d23d87
OP_PUSHDATA2 00414d 41 00
public key type04
Y 02b47574d6ce5c6c5d66db0845c7dabcb5d90d0d6ca9b703dc4d02f4501b6e44

Note the highlighted differences. The original transaction has a byte 0x48, which says to push (hex) 48 bytes of data. The modified transaction has a OP_PUSHDATA2 (0x4d), which says the next two bytes (48 00) are the number of bytes to push. In other words, both transactions do exactly the same thing (push the signature), but the original indicates this with 48, while the modified transaction indicates this with 4d 48 00. (Pushing the public key has a similar modification.) Since both scripts do exactly the same thing, both transactions are equally valid. However, since the data has changed, the transactions have two different hashes.

Why does malleability matter?

Transaction Malleability has been discussed for years and treated as a minor inconvenience. Both transactions have exactly the same effect, moving bitcoins between the same addresses. Only one transaction will be confirmed by miners, and the other will be discarded, so nobody gets paid twice even though there are two transactions.

There are, however, three problems that have turned up recently due to malleability.

First, the major Mt.Gox exchange stated they would stop processing bitcoin withdrawals until the Bitcoin network approves and standardizes on a new non-malleable hash. Apparently they were using the hash to track transactions, and would re-send bitcoins if the transaction didn't appear to go through. This is obviously a problem if the transaction did go through, but with a different hash.

Second, some wallet software would use both transactions to compute the balance, which caused it to show the wrong value.

Finally, due to the way Bitcoin handles change, malleability could cause a second transaction to fail. This requires a bit more explanation.

Failures due to change and malleability

The Bitcoin protocol doesn't really move bitcoins from address to address. Instead, it takes bitcoins from a set of inputs, and sends them to a set of outputs. Each output is an address (actually a script, but let's ignore that for now). Each input is an output from a previous transaction, and each input must be entirely spent.

As a result, if you have 3 bitcoins, and you want to spend one of them, the other two bitcoins get returned to you as change, sent to an address you control. If you then want to spend some of the change, your second transaction references the previous transaction that generates the change, referencing it by the hash of the first transaction. This is where malleability becomes a problem - if the first transaction's hash changed, the second transaction is not valid and the transaction will fail. Note that the change will still go to your proper address, so you can spend it as long as you use the correct (modified) transaction hash, so you don't lose any bitcoins. You just have the inconvenience of having a transaction rejected, and you'll need to redo it with the right hash.

The change problem only happens because some wallet software takes a shortcut, letting you (attempt to) spend the change before the transaction has been confirmed. The reasoning is that since it's your change from your transaction, you should be able to trust yourself. But that breaks down with malleability.

Malleability has been known for a long time

Transaction malleability has been known since 2011. The exact OP_PUSHDATA2 malleability used above was described four months ago here. There are many other types of malleability, which are explained here. The script code can be modified in several ways while leaving its operation unchanged. The signature itself can be encoded slightly differently. And interestingly, due to the mathematics of elliptic curves the numeric value of the signature can be negated, yielding a second valid signature.


Hopefully this has helped to make malleability more understandable. If you want to know more details of the Bitcoin protocol, including signing and hashing, see my previous article Bitcoins the hard way.


James Poole said...

Nice write-up. I only vaguely understood what was going on before reading this, but it makes sense now. Thanks!

dooglus said...

> OP_PUSHDATA 0041 4d 41 00

I think you missed a '2' in the operator's name there.

dooglus said...

In case anyone's wondering what my comment was about, the issue I was pointing out in the post has now been fixed.

Anonymous said...

so what happened here ??

# Theft Withdrawal Transactions and historical withdrawals by Attacker 1




Anonymous said...

Note that the malleability is also with PUSHDATA and PUSHDATA4, not specifically PUSHDATA2. Also, a major application that malleability breaks is anything that relies on a precomputed nlocktime'd refund transaction, spending a transaction back to the sender before the original transaction is announced.

Unknown said...

great article, i have really enjoyed reading this. learning about mining is very interesting and mineco are a really good mining company!

Unknown said...

Really nice post!
Now I'm trying to conduct such a malleable TX on my own example, just to try how it works.
Can you tell what exactly and how I can change in my tx 350946f9c61598ff4d8c77cb99625f6ac106765dcbf2d2d855a122363b3f3c24?
Did you use createrawtransaction/signrawtransaction?