Understanding bitcoin transactions
If you’ve gone onto Coinbase and bought some bitcoin, then you’ve initiated a bitcoin transaction. If you’ve moved funds from an exchange to your LedgerX, then you’ve initiated a bitcoin transaction. If you’ve sent a few satoshis to a friend or used a crypto debit card to pay for your morning coffee, you’ve also initiated a transaction. As such, many of us have used bitcoin for exactly what it was intended to be: a method of moving funds from one person or entity to another without an intermediary such as a bank or clearing house. But what’s going on under the hood?
In this post I’ll explore the components of a transaction and the importance of signatures, in order to explain just what a transaction is made up of and why it’s a little more complex than it appears at first glance.
What makes up a transaction?
Simply put, a transaction is made up of inputs and outputs. The inputs are outputs from previous transactions, each identified by the hash of the transaction that created it (a hash is data of fixed length created via a hashing algorithm). You can think of this in fiat money terms. If you have some coins in your pocket and buy a coffee, these coins will have been given to you after a previous transaction, e.g. buying your morning paper with a £10 note. As such, the inputs for your coffee purchase are the output from your newspaper purchase.
The value going into a transaction ends up in 3 places: the destination address, the change address and the miner fee. Strictly speaking, the fee is not a separate output; it is whatever input value is left over once the other outputs have been paid.
The destination address
This is the address where you are looking to send the bitcoins and thus the recipient of the transaction. It is important to note that a single transaction may be to many recipients rather than just one.
The change address
This is where any surplus from the transaction will be sent, and it should be an address which the sender controls. Many wallets autogenerate change addresses when the transaction is initiated; however, if you are moving funds from a paper wallet or hardware wallet, it may be necessary to specify the change address directly. If a change address is not provided, the whole surplus from the transaction will be paid to the miner as the transaction fee.
Transaction Fee
This is the fee paid to the miner for including the transaction in a block. A minimum transaction fee was introduced in Bitcoin Core 0.9 (currently 1,000 satoshis), but the going transaction fee on any given day will depend on network traffic and demand. At present (https://bitinfocharts.com/comparison/bitcoin-transactionfees.html) this is around $1, but it notably rose to over $55 back in December 2017, when demand on the network surged and the bitcoin price rose to almost $20,000.
With the fiat example above, the destination address is the person you are buying the coffee from and how much they are charging, the change address is back to yourself with the change for the transaction, and the miner fee would be a fee paid to the merchant for facilitating the transaction. Now let’s look at a bitcoin example…
In the diagram below, the address [1BafdH…] is looking to make a 2.13 BTC transaction to address [17Gr3Fan….]. The required inputs must therefore total at least 2.13 BTC and must be collected from previous transaction outputs. In the example below there are 6 possible inputs, similar to having coins of various denominations in your pocket to buy the coffee with. These are referred to as Unspent Transaction Outputs, or UTXOs, and an algorithm called CoinSelect will choose between them to find the best combination of inputs that covers at least the required output amount.
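To make the idea of choosing between UTXOs concrete, here is a minimal sketch of one very simple strategy: take the largest UTXOs first until the target is covered. This is not the actual CoinSelect algorithm, which weighs fees and change more carefully. The six UTXO values are invented for illustration (the diagram's real values are not reproduced here) and are expressed in satoshis to avoid floating point rounding.

```python
SATS_PER_BTC = 100_000_000

def select_coins(utxo_values, target):
    """Greedy largest-first selection (an illustrative strategy only).

    utxo_values: UTXO amounts in satoshis.
    target: amount that must be covered (payment plus fee), in satoshis.
    Returns the chosen UTXO values and their total.
    """
    selected = []
    total = 0
    for value in sorted(utxo_values, reverse=True):
        if total >= target:
            break
        selected.append(value)
        total += value
    if total < target:
        raise ValueError("insufficient funds")
    return selected, total

# Hypothetical UTXOs held by the sending address (values are assumptions).
utxos = [150_000_000, 69_300_000, 40_000_000, 20_000_000, 5_000_000, 1_000_000]

# 2.13 BTC payment plus a 0.02 BTC fee.
target = 213_000_000 + 2_000_000

selected, total = select_coins(utxos, target)
change = total - target
print(selected, total / SATS_PER_BTC, change / SATS_PER_BTC)
```

With these made-up values, two UTXOs are enough to cover the payment and fee, and the remainder becomes change.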
In the example below there is one address with 6 associated UTXOs; however, if funds are stored in a wallet, there may be many addresses, each with a number of small balances. In some cases the UTXOs for an address are so small that the cost of spending them is greater than the value of the UTXO. These are referred to as ‘dust’ and can be remedied by sweeping balances across addresses. Dust is created because a bitcoin has up to 8 decimal places of value and can therefore be partitioned into much smaller amounts than fiat currency, whose smallest unit is the cent or penny. As transactions are created, the change amounts can become smaller and smaller, eventually creating dust across a number of different addresses.
In connection to this, BIP (Bitcoin Improvement Proposal) 176 by developer Jimmy Song proposed the use of the term ‘bits’ to refer to 100 satoshis of value (one millionth of a bitcoin), so that amounts can be written with the 2 decimal places we are used to, which helps readability when dealing with very small values.
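The dust condition described above can be sketched as a simple comparison between a UTXO's value and the fee needed to spend it. The 148-byte figure is an assumed approximate size for a typical P2PKH input, and the fee rates are hypothetical; real wallets and nodes apply their own thresholds.

```python
# Assumed approximate size, in bytes, of one typical P2PKH input.
APPROX_INPUT_BYTES = 148

def is_dust(utxo_value_sats, fee_rate_sats_per_byte):
    """Return True if spending the UTXO would cost more than it is worth."""
    cost_to_spend = APPROX_INPUT_BYTES * fee_rate_sats_per_byte
    return cost_to_spend > utxo_value_sats

# At a hypothetical 10 sat/byte, spending one input costs about 1,480 satoshis.
print(is_dust(1_000, 10))     # a 1,000 satoshi UTXO costs more to spend than it holds
print(is_dust(100_000, 10))   # a 100,000 satoshi UTXO is comfortably spendable
```

Note that whether a UTXO counts as dust under this definition depends on the fee rate at the time, so the same UTXO can become spendable again when the network is quiet.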
The outputs for the example transaction are therefore the destination address [17Gr3Fan…] with output amount of 2.13 BTC, a change address with amount 0.043 BTC and the remaining amount of 0.02 BTC which is the transaction fee for the miner.
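The arithmetic here is worth spelling out, because the fee is never written into the transaction: it is simply the input total minus the output total. The sketch below works in satoshis. The two input values are assumptions chosen so that they sum to the 2.193 BTC the example requires.

```python
SATS_PER_BTC = 100_000_000

def implied_fee(input_values, output_values):
    """Fee in satoshis: whatever the inputs hold that the outputs do not claim."""
    fee = sum(input_values) - sum(output_values)
    if fee < 0:
        raise ValueError("outputs exceed inputs")
    return fee

# Hypothetical inputs totalling 2.193 BTC (individual values are assumptions).
inputs = [150_000_000, 69_300_000]

payment = 213_000_000   # 2.13 BTC to the destination address
change = 4_300_000      # 0.043 BTC back to the change address

print(implied_fee(inputs, [payment, change]) / SATS_PER_BTC)  # 0.02

# Forgetting the change output hands the surplus to the miner as well.
print(implied_fee(inputs, [payment]) / SATS_PER_BTC)          # 0.063
```

The second call shows why a missing change address is costly: the 0.043 BTC of change is added to the fee.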
Notably the change address could be back to [1BafdH…], but it is good blockchain hygiene to use a different change address rather than sending back to the input address. This is because when a transaction is initiated, the public key of the input address is revealed. Until this point the public key remains hashed, and since hashes are one-way functions, no viewer can determine the associated public key. However, when the UTXOs are spent, the public key is revealed. Using a different change address each time therefore guards against potential future attacks in which private keys can be reconstructed from public keys: even if the private key could be reconstructed, the balance left to steal would be 0 BTC.
Furthermore, by using one address as an ‘account’ and therefore sending and receiving many transactions through it, the person’s transaction history can be easily tracked through the blockchain and their current balance known. By using a different change address each time, it becomes much harder for a viewer to know whether these transactions are to the same person or another, which provides additional transaction privacy.
In addition, and as we will explore further when we discuss signatures, using a distinct change address removes the possibility for an attacker to compare and potentially mimic signatures.
Building a Transaction
As outlined above, a transaction is a combination of inputs and outputs. However, building a transaction which will be accepted by miners involves a number of additional pieces of information.
The first is a 4 byte version number, which tells the miners which validation rules apply to the transaction. The validation rules for the network can change, so by specifying a version number, future transactions can be validated with new rules without invalidating previous transactions.
The next two pieces of information identify the specific UTXOs to be used as inputs. These are the transaction identifier (txid) and an output index number (often referred to as a ‘vout’). It is worth noting that this information simply identifies the UTXOs; it is the coin selection algorithm which chooses them.
The final element is the signature script which provides conditions upon which the bitcoins can be spent by the destination address and a signature which is produced by the Elliptic Curve Digital Signature Algorithm (ECDSA) using the secp256k1 curve. Let’s unpack this a little….
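As a rough sketch of how these fields look as bytes, the snippet below serializes a version number and a single UTXO reference (txid plus vout). Integers are written as 4-byte little-endian values, and the txid is stored in the reverse byte order from the hex string that block explorers display. The txid used here is a made-up placeholder, and the function names are my own.

```python
import struct

def serialize_version(version):
    """4-byte little-endian transaction version."""
    return struct.pack("<i", version)

def serialize_outpoint(txid_hex, vout):
    """Reference to a UTXO: 32-byte txid (byte-reversed) + 4-byte output index."""
    txid_bytes = bytes.fromhex(txid_hex)
    if len(txid_bytes) != 32:
        raise ValueError("txid must be 32 bytes")
    return txid_bytes[::-1] + struct.pack("<I", vout)

# Placeholder txid for illustration only; not a real transaction.
example_txid = "00" * 31 + "ff"

print(serialize_version(1).hex())                 # 01000000
print(serialize_outpoint(example_txid, 0).hex())
```

Together the txid and vout take 36 bytes per input, and they point at exactly one output of one earlier transaction.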
Signatures
In order to move bitcoins associated with an address you must have control over the corresponding private key and, crucially, you must prove this to the network. However, you do not want to show the private key to the network, as this would allow a nefarious actor to move any remaining bitcoins you had left at the address and compromise the security of the address. As such, signatures are used to prove that you have the associated private key without revealing it. This is most clear if we consider an example:
Alice wishes to spend some bitcoins at an address which she claims to own.
She generates a signature by hashing the transaction data and signing that hash with her private key. To verify this signature, Bob takes the transaction data (which is publicly available), the signature (which Alice has made publicly available) and Alice’s public key (which she reveals when spending, and which can be checked against the hash contained in the address). The verification calculation essentially provides a yes or no as to whether the signature is valid, and thus whether Alice has used the private key that corresponds to the public key.
It is important to note that should different transaction data be used then a different signature will be produced. This prevents attacks in which the message is changed post signing or signature replication is attempted. For a real world comparison, imagine the signature you put on the bottom of a contract automatically changes if someone attempts to alter a clause in the contract — thereby a) alerting you to this change and b) avoiding signature comparisons and thus possible replications.
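The Alice and Bob exchange can be sketched with a small, self-contained ECDSA implementation over secp256k1. This is for illustration only and must not be used with real funds: it is not constant-time, the nonce is derived in a simplified way (real wallets typically use RFC 6979), the private key is a toy value, and a short byte string stands in for the serialized transaction data.

```python
import hashlib

# secp256k1 parameters: field prime, group order, generator point.
P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)

def add(a, b):
    """Add two curve points; None represents the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % P == 0:
        return None
    if a == b:
        m = 3 * x1 * x1 * pow(2 * y1 % P, -1, P) % P
    else:
        m = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (m * m - x1 - x2) % P
    return (x3, (m * (x1 - x3) - y1) % P)

def mul(k, point):
    """Scalar multiplication by double-and-add."""
    result = None
    while k:
        if k & 1:
            result = add(result, point)
        point = add(point, point)
        k >>= 1
    return result

def digest(tx_data):
    """Double SHA-256 of the data, as an integer."""
    return int.from_bytes(hashlib.sha256(hashlib.sha256(tx_data).digest()).digest(), "big")

def sign(private_key, tx_data):
    z = digest(tx_data)
    # Simplified deterministic nonce (an assumption, not RFC 6979).
    k = int.from_bytes(hashlib.sha256(private_key.to_bytes(32, "big") + tx_data).digest(), "big") % N or 1
    r = mul(k, G)[0] % N
    s = pow(k, -1, N) * (z + r * private_key) % N
    return r, s

def verify(public_key, tx_data, signature):
    r, s = signature
    if not (0 < r < N and 0 < s < N):
        return False
    w = pow(s, -1, N)
    point = add(mul(digest(tx_data) * w % N, G), mul(r * w % N, public_key))
    return point is not None and point[0] % N == r

alice_private = 0x1E99423A4ED27608A15A2616A2B0E9E52CED330AC530EDCC32C8FFC6A526AEDD  # toy key
alice_public = mul(alice_private, G)

tx = b"send 2.13 BTC to 17Gr3Fan"
sig = sign(alice_private, tx)
print(verify(alice_public, tx, sig))                            # Bob accepts
print(verify(alice_public, b"send 9.99 BTC to 17Gr3Fan", sig))  # altered data is rejected
```

Bob only ever handles the public key, the data and the signature, and any change to the signed data makes verification fail.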
Returning to the additional input components, the signature script can now be understood to be the signature as described above, plus a redeem script. There are a number of possible redeem scripts and we will consider 4 below:
- Pay-to-PubKey (P2PK)
Using this script will allow a transaction to be paid directly to the public key rather than the hashed version — the address. This is not a preferred transaction type as it broadcasts the public key as an unhashed version and could therefore be vulnerable to reverse computation of private keys from public keys, if this is achieved in the future.
- Pay-To-Script-Hash (P2SH)
This allows encumbrances (conditions) to be placed on the destination address which must be met before the bitcoins can be spent. A common example of this is MultiSig, in which n of m signatures are required to spend the bitcoins. P2SH addresses, including MultiSig addresses, are denoted by a prefix of `3` rather than the usual bitcoin address prefix of `1`, and a P2SH MultiSig can involve up to 15 keys.
- Pay-to-Witness-Script-Hash (P2WSH)
This is similar to P2SH; however, the data in the signature script is moved to the witness. This was introduced within the Segregated Witness soft fork, which activated in August 2017 after BIP91 locked in that July.
- Pay-to-Pubkey-Hash (P2PKH)
This is the most common redeem script with the transaction sent to 1 or more bitcoin addresses — which are hashes of the associated public key.
In addition to the encumbrance provided to the destination address by a redeem script, it is also possible to place an encumbrance on the miner. This is referred to as the TimeLock and specifies the earliest time or block height at which the transaction can be added to the blockchain. Values below 500 million are parsed as a block height, and the transaction can be added to a block which has a height greater than or equal to the value. Values greater than or equal to 500 million are parsed in unix epoch time format (the number of seconds since 1st January 1970).
By providing this delay to block addition, an unconfirmed transaction can be updated any time before the locktime expires. However, this feature is often disabled by including a sequence number for the input which is set to the maximum value of [0xffffffff].
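The locktime rules above reduce to a threshold check, sketched below. The function name and return format are my own choices for illustration. One detail added here is that the locktime is only ignored when every input carries the maximum sequence number.

```python
from datetime import datetime, timezone

LOCKTIME_THRESHOLD = 500_000_000   # below: block height; at or above: unix time
SEQUENCE_FINAL = 0xFFFFFFFF        # maximum sequence number

def interpret_locktime(locktime, input_sequences):
    """Describe how a locktime value would be read for a transaction.

    Returns a (kind, value) pair, where kind is "disabled",
    "block_height" or "unix_time".
    """
    # The locktime is ignored when all inputs use the maximum sequence number.
    if all(seq == SEQUENCE_FINAL for seq in input_sequences):
        return ("disabled", None)
    if locktime < LOCKTIME_THRESHOLD:
        return ("block_height", locktime)
    return ("unix_time", datetime.fromtimestamp(locktime, tz=timezone.utc))

print(interpret_locktime(600_000, [0xFFFFFFFE]))        # read as a block height
print(interpret_locktime(1_500_000_000, [0xFFFFFFFE]))  # read as a unix timestamp
print(interpret_locktime(600_000, [SEQUENCE_FINAL]))    # locktime disabled
```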
And that’s how bitcoin transactions work under the hood!
For more #BitcoinExplained, follow me on LinkedIn (https://www.linkedin.com/in/tarannison)