crypto

Bitcoin scaling problems

When bitcoin was first proposed, I argued that the proposed algorithm failed to scale.

Well, when getting started, scaling does not matter.  Now, however, a bitcoin wallet is starting to cost substantial bandwidth and processing power.  There are plans to address this, but I am underwhelmed by those plans. The proposed plans will make bitcoin more centralized, and will still have scaling issues.

It seems to me that we need an algorithm where no single computer needs to keep a copy of all transactions, or even a complete listing of who owns what coins, so that the system can scale all the way to handling all of the world’s transactions while remaining fully decentralized.

Gold also has problems, in that transporting gold from one place to another is slow and risky. It is easier to leave the gold in one place and transfer ownership. This, however, tempts the proprietor of that place to misconduct (fractional reserve and term transformation) and exposes the proprietor to attack. So the most trusted proprietor winds up being the most powerful state, which is to say, the USG. Banks around the world leave their gold with the New York Fed, and simply move ownership of it. And that state then steals the gold. Banks that are owed gold by the USG have been asking for their physical gold back, and not getting it.

I recommend a system in which the holder of the private key that rightfully owns a bitcoin possesses two things: data showing that he rightly acquired the bitcoin from someone who was acknowledged as its rightful owner by a previous leading hash of the state of bitcoin ownership, and a hash chain showing that his own ownership is acknowledged in the current leading hash of the state of bitcoin ownership. Few other people possess that data, so not everyone in the system has to download it and check it for internal consistency. When he spends the bitcoins, he proves ownership by showing that his ownership is in the most recent hash, but the recipient’s wallet would not have that information until supplied. This sounds easy, but the tricky bit is doing it without risk of the network splitting into inconsistent states.

Instead of this, the proposed scaling fix for bitcoin is a very large blocksize, which would mean that only a small number of big wealthy institutions are full participants in the system – which brings us back to the problems we have been having with gold and fiat money.

29 comments on “Bitcoin scaling problems”

Matthew says:

Reminds me of Git. You don’t have to use the entire revision history to verify that the current commit SHA is valid.

Matthew says:

Which raises the possibility of forkable and mergeable finances?

VXXC says:

“that only a small number of big wealthy institutions are full participants in the system –” You don’t say. Perhaps the point?

The world’s gold [putatively] isn’t at Ft Knox, it’s at 250 Maiden Lane, NYFED.

jim says:

Thanks for the correction.

Red says:

Guess there is nothing really new under the sun. New system, same problems.

Couldn’t this scaling issue be used to crash the network?

I’ve been banging on the scaling issues a while, but it may be overstretching things to say “only wealthy institutions…”. If the cost does become significant, that’s not the same as becoming unaffordable.

Let’s say we’re shifting 10MB per block – 100X the current rate. That’s 10% of a present-day broadband connection, and filling a $100 disk drive per year. The processing would require a dedicated current low-end PC, which could be commoditised into something costing $100-$200. Total cost to run it, maybe $250/yr. That’s affordable for most people if it actually mattered.
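The figures above are easy to check. The following is only a back-of-envelope sketch: it assumes one block every 10 minutes, counts only the download of each block once (relaying to peers would multiply the bandwidth), and the function names are made up for illustration.

```python
# Back-of-envelope check of the 10MB-per-block scenario.
# Assumptions: one block every 10 minutes, each block downloaded once,
# decimal units (1 GB = 1000 MB). Function names are illustrative only.

BLOCK_INTERVAL_SECONDS = 600
BLOCKS_PER_YEAR = 6 * 24 * 365  # six blocks an hour


def yearly_storage_gb(block_mb: float) -> float:
    """Disk filled per year if every block is block_mb megabytes."""
    return block_mb * BLOCKS_PER_YEAR / 1000


def sustained_mbit_per_s(block_mb: float) -> float:
    """Average download rate needed just to keep up with new blocks."""
    return block_mb * 8 / BLOCK_INTERVAL_SECONDS


if __name__ == "__main__":
    print(f"storage per year: {yearly_storage_gb(10):.1f} GB")
    print(f"sustained rate:   {sustained_mbit_per_s(10):.3f} Mbit/s")
```

At 10MB per block this comes to a little over half a terabyte per year and a sustained rate of roughly 0.13 Mbit/s, which is about 10% of a connection in the 1 to 1.5 Mbit/s range.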

The issue is today it’s still mostly a hobby for most participants, and it’s starting to become a bit costly for something that people went into when it was free.

jim says:

There can only be one.

If bitcoins are worth anything, they are worth the possibility that they will replace dollars.

To replace dollars means a lot of transactions. To scale while remaining a widely distributed peer to peer system, we have to generate the hash that represents all transactions without the node that generates the hash having to know all transactions.

For if generating the hash requires knowing all transactions, then replacing dollars means every such node has to know a great many transactions.

Mike says:

The main utility of bitcoins is in it substituting for dollars (or other national currency). True, but replace dollars for what? The main use cases I see for Bitcoin are international funds transfer (bypassing Western Union, SWIFT, banks etc, and their associated fees and delays) and near-anonymous payments (e.g. contraband).

Bitcoin doesn’t need to replace national currencies altogether to be useful. No-one needs to buy their groceries in Bitcoins. Just being a substitute for telegraphic transfers and buying contraband online is enough to carve out a useful niche.

Anyway, you only need to keep the hashes of unspent outputs (from what I understand, at least). The rest is just archival data. That can’t be too much of a hassle. That leaves the problem of bandwidth (ie. keeping up with the blockchain). I’m not sure how much bandwidth per transaction Bitcoin uses (is there any protocol-level compression between nodes, like with HTTP? There should be.)
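To illustrate the point about unspent outputs, here is a toy sketch of a node that keeps only a table of unspent outputs and forgets each one once it is spent. It is not Bitcoin’s actual data format: the `UnspentSet` class, the `outpoint` helper and the string transaction ids are all assumptions made for the example.

```python
import hashlib


def outpoint(txid: str, index: int) -> bytes:
    """Hash identifying one output of one transaction (illustrative format)."""
    return hashlib.sha256(f"{txid}:{index}".encode()).digest()


class UnspentSet:
    """Keeps only unspent outputs; spent ones are dropped, not archived."""

    def __init__(self) -> None:
        self._unspent: dict[bytes, int] = {}

    def add_output(self, txid: str, index: int, amount: int) -> None:
        self._unspent[outpoint(txid, index)] = amount

    def spend(self, txid: str, index: int) -> int:
        """Remove an output and return its amount; reject double spends."""
        key = outpoint(txid, index)
        if key not in self._unspent:
            raise ValueError("output missing or already spent")
        return self._unspent.pop(key)

    def __len__(self) -> int:
        return len(self._unspent)


if __name__ == "__main__":
    utxo = UnspentSet()
    utxo.add_output("tx1", 0, 50)
    utxo.spend("tx1", 0)           # first spend succeeds
    try:
        utxo.spend("tx1", 0)       # second spend of the same output fails
    except ValueError as err:
        print("rejected:", err)
```

The table grows with the number of unspent outputs rather than with the whole history, but every node still has to see every transaction to keep it current, which is the bandwidth problem mentioned above.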

A bitcoin “bank” (ie. supernode that does all the blockchain-handling for leaf nodes) might not be intrinsically trustworthy (e.g. the possibility of fractionating), but Bitcoin grows in trustworthy soil – the open-source and hacker community.

jim says:

Your ambitions are too modest to be achievable. There can only be one. If bitcoin is going to succeed, everyone who buys a hamburger will do so by waving their smartphone at the cash register, or by paying with bank issued physical money backed by bitcoins. If it occupies a niche, it will expand from that niche, or be destroyed. There is no stable point.

That’s the idea, but fiat money backed by bitcoins is workable – you don’t need everybody in the world to get a transaction note when you buy a hamburger (or a lottery ticket).

Mike – since the traffic mostly consists of hashes, it’s not significantly compressible.

The essence of bitcoin is that there’s no central point that anybody has to trust. That’s not the same as nobody ever choosing to trust anybody. The unavoidable processing load of bitcoin comes from the need to check that the transaction output you’re receiving hasn’t already been spent – that’s a service that can practically and safely be outsourced provided there’s a wide choice of providers to outsource it to. I would anticipate it becoming an ISP function.

Anomaly, I agree about fiat money. A worse solution, though a solution nonetheless, to the scaling problem is to keep the coins in banks. Intra-bank transactions don’t touch the network – and inter-bank transactions can be settled at the end of the day in a single transaction. I won’t say the banks would be kept honest by the knowledge that they couldn’t be bailed out, but at least they couldn’t be bailed out.

At the least, bitcoin will take us back to an unregulated banking market: anyone will be able to start a bank.

jim says:

If the bitcoin system winds up dominated by a hundred banks, and anyone of sufficient wealth and prestige can start their own bank, we have escaped from state dominance, but we still have the problem of not being able to hang misbehaving bankers.

Jim, I’m not convinced by Moldbug Monetary Theory’s “there can only be one”. The demonetization of silver is an example for it, but gold and our fiat money system coexisting for a while is an example against. (Perhaps Moldbug might argue that people save in gold but not fiat money.) One can always choose to save in something other than the dominant money, for diversification purposes. Or if there is no dominant money, you can choose to save in both to hedge your bets.

jim says:

I don’t think gold and fiat coexist. A short while ago fiat was absolutely dominant. Now, due to ever increasing abuse of fiat money, not so much, and we are back to an unstable situation, where fiat may collapse completely and gold dominates, or fiat recovers and gold collapses.

Handle says:

The blockchain-length problem has been thought about for a while, and now as a potential exploit (though I proposed an easy proof-of-work hashcash fix here).

Still, at viral levels of adoption, even without micropenny spoofing, the bandwidth constraints of distributing the complete ledger of all transactions will, eventually, become prohibitive or exploitable or otherwise break down in forking or something else as the coordination algorithm fails to update and keep pace with monetary velocity with multiple transactions per bitcoin per minute all over the globe.

One solution could have been to allow a maximum length of blockchain and do first-in-first-out deletion of the oldest records, but this creates other problems. Another solution could be to have nodes serve as temporary registries for a particular range of bitcoin numbers – advertising themselves as swappers and editors (instead of appenders) of the data concerning satoshis in that range.

It’s a hard problem to avoid if you don’t want there to be any long-term central registry. Something has to exist in a distributed, ephemeral form, and it has to be updatable without adding to its size.

jim says:

Does not seem that hard to solve:

Let everyone who owns or recently owned some bitcoin keep a record showing that a recent top level hash contained the records showing him to be the owner, and the records showing the previous owner to be the owner, and the signed transfer from the previous owner.

Rather than everyone keep a record of who owns and owned all bitcoins, and all transfers.

Map all bitcoins into a space from zero to less than one. Partition and group all the data according to its position in this range. Then everyone keeps full data only for the subranges that correspond to coins they care about, and for the subranges they do not care about keeps only partial data, accepting on faith the root hashes that other people report.

Each person constructs the full root hash of the full range from zero to less than one by hashing together, tree fashion, all the root hashes of subranges from zero to less than one. But they don’t generate all the subrange hashes.

For each power of two subrange containing some data, there is a hash.

There is a root hash for the range zero to less than one, which is the hash of the hash of the range 0 to less than half and the hash of the range half to less than one.

And, similarly, the hash of the range half to less than one is the hash of the hash of the range half to less than three quarters, and the hash of the range three quarters to less than one.

If one of the subranges of a subrange is empty, then the hash of the subrange is the hash of the non empty subrange of the subrange. Thus if a subrange contains only one item, its hash is the hash of that item.

To prove that your item was previously accepted into the consensus root hash, you need to provide that item, and all its sister hashes leading to the consensus root hash. If there are ten to the fifteenth items, far overflowing any one person’s or group’s disk drive, then all you need is fifty hashes to validate your item.
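The scheme described above can be sketched in a few lines. This is a minimal illustration, not a worked out protocol: it assumes SHA-256, assumes each item is mapped to a 64-bit position by hashing it, and assumes no two items land on the same position. The names `position`, `subrange_hash`, `prove` and `verify` are invented for the example. A real design would also want to distinguish item hashes from subrange hashes, which this sketch does not.

```python
import hashlib

KEY_BITS = 64  # assumed resolution of the zero-to-less-than-one range


def H(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def position(item: bytes) -> int:
    """Map an item to a point in [0, 1), stored as a KEY_BITS-bit integer."""
    return int.from_bytes(H(b"position:" + item)[: KEY_BITS // 8], "big")


def bit(pos: int, depth: int) -> int:
    """0 if pos lies in the lower half of its subrange at this depth, else 1."""
    return (pos >> (KEY_BITS - 1 - depth)) & 1


def split(entries, depth):
    lower = [e for e in entries if bit(e[0], depth) == 0]
    upper = [e for e in entries if bit(e[0], depth) == 1]
    return lower, upper


def subrange_hash(entries, depth=0):
    """Hash of a subrange. entries is a list of (position, item) pairs."""
    if not entries:
        return None                      # empty subrange has no hash
    if len(entries) == 1:
        return H(entries[0][1])          # single item: hash of that item
    if depth == KEY_BITS:
        raise ValueError("two items share a position")
    lower, upper = split(entries, depth)
    lo = subrange_hash(lower, depth + 1)
    hi = subrange_hash(upper, depth + 1)
    if lo is None:
        return hi                        # one half empty: pass the other up
    if hi is None:
        return lo
    return H(lo + hi)


def prove(entries, target_pos, depth=0):
    """Sister hashes for the item at target_pos, root first, as (depth, hash)."""
    if len(entries) == 1:
        return []
    lower, upper = split(entries, depth)
    mine, other = (lower, upper) if bit(target_pos, depth) == 0 else (upper, lower)
    rest = prove(mine, target_pos, depth + 1)
    sister = subrange_hash(other, depth + 1)
    return rest if sister is None else [(depth, sister)] + rest


def verify(item: bytes, proof, root: bytes) -> bool:
    """Rebuild the root from the item and its sister hashes alone."""
    pos, acc = position(item), H(item)
    for depth, sister in reversed(proof):
        acc = H(acc + sister) if bit(pos, depth) == 0 else H(sister + acc)
    return acc == root


if __name__ == "__main__":
    items = [b"coin-%d" % i for i in range(1000)]
    entries = [(position(it), it) for it in items]
    root = subrange_hash(entries)
    proof = prove(entries, position(items[42]))
    print(len(proof), "sister hashes, valid:", verify(items[42], proof, root))
```

The verifier never sees the other 999 items, only the sister hashes. Since the base two logarithm of ten to the fifteenth is a little under fifty, a proof over that many evenly spread items needs about fifty hashes.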

sviga lae says:

I may have the wrong impression, but it seems you misunderstand the fundamental unit of account in Bitcoin.

There is no such thing as ‘a Bitcoin’, there are only addresses (public keys), with associated balances as represented by the sum history of all valid transactions into and out of the address.

It’s a distributed ledger. There is no finite set of bitcoins or associated addresses to partition.

I’m siding with Mr Gordon Moore on this one. The miners will scale, meanwhile lightweight clients such as Electrum let us avoid dealing with the whole blockchain by placing our trust in a selection of servers.

Eventually, clearinghouses will provide escrow for timely transactions, but there will be no need for even the average user to eschew the blockchain or to trust more than walking-around money to deposit-taking institutions.

Purging old records has the benefit of eliminating lost (or abandoned, e.g. death) wallets to make the money supply more quantifiable. Owners can send new transactions to themselves to update their timestamps. Purging will not hold the block chain size constant, because the rate of transactions is growing (probably exponentially).

Although the block chain size is currently only ~8GB (up from ~2GB a year ago) and thus can still easily fit on the 4TB hard disks that the consumer market can afford, it will not only eventually outpace Moore’s Law applied to hard disk space, but it is already too large for many consumer internet connections to download in any quick start scenario. If non-hosted ISP connections provide 0.1 – 1GB per 10 minutes, then (assuming a resumable download manager for dropped connections) 8GB is a 1 hour to 1 day download. At 4TB, by the same arithmetic, it is roughly a one month to nine month download. Note a mining peer could begin processing before downloading the entire blockchain, if it is downloaded from newest to oldest, and all the transactions in a current block are from blocks already downloaded.

At Visa scale of 16 million transactions per 10 minute block, the blockchain would be growing at roughly 23 GB per day or 8 TB per year. However, some percent of this can be reduced by pruning the blockchain of addresses (public keys) whose balances have been entirely spent (and possibly also those beyond a certain age).

I propose that although we need to broadcast the transactions, the blockchain should only need to store the balances of the addresses (public keys), perhaps after the current 100 block maturity cycle to allow competing forks to resolve. There would be two Proofs-of-Work, i.e. two parallel blockchains, one containing the transaction data and the other only the addresses with updated balances, with the former provided first and all peers then competing to provide the latter. So the reward would be split in half and the difficulty for both blockchains would be set so they both average completion every 10 minutes. Or the latter blockchain could be a digest of say every 10 to 100 blocks, and so the difficulty could be adjusted to be every 100 to 1000 minutes.

If the number of addresses (public keys) in existence could be limited (by an automated free market protocol that raised the price of new addresses while giving a simultaneous credit for spending all of, and thus deleting, an address), then the size of the blockchain could be limited. Four billion addresses with a 4-byte balance would require roughly 100 GB, thus a 12 hour to 12 day download. With perhaps 100 million Bitcoin users at most over the next several years, that is 40 addresses each. By the time the entire human population needs to use Bitcoin, the bandwidth of the ISPs will probably have increased an order-of-magnitude, so the limit can be increased by up to an order-of-magnitude.
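The sizes in this paragraph can be checked with a few lines. This sketch assumes each key is stored as a 20-byte hash next to its 4-byte balance (the 20-byte figure is an assumption borrowed from the hash size used further down), and it reuses the 0.1 to 1 GB per 10 minutes connection speed from above. The function names are illustrative.

```python
# Rough size and download time of a table of capped keys and balances.
# Assumptions: 20-byte key hash, 4-byte balance, decimal gigabytes.


def balance_table_gb(n_keys: int, key_hash_bytes: int = 20,
                     balance_bytes: int = 4) -> float:
    """Size in GB of a table holding one (key hash, balance) row per key."""
    return n_keys * (key_hash_bytes + balance_bytes) / 1e9


def download_hours(size_gb: float, gb_per_10_min: float) -> float:
    """Hours needed to download size_gb at the given rate."""
    return size_gb / gb_per_10_min * 10 / 60


if __name__ == "__main__":
    size = balance_table_gb(4_000_000_000)
    print(f"table size: {size:.0f} GB")
    print(f"fast link:  {download_hours(100, 1.0):.1f} hours")
    print(f"slow link:  {download_hours(100, 0.1) / 24:.1f} days")
    print(f"keys each:  {4_000_000_000 // 100_000_000}")
```

Under these assumptions the table is just under 100 GB, and a 100 GB download takes somewhere between most of a day and about a week, the same order of magnitude as the range quoted above.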

For many reasons, including that mining is the only way to obtain Bitcoins truly anonymously, we don’t want mining to be limited to only those with certain resources (especially we don’t want to eliminate normal ISP accounts!).

Every mining peer has to have the evidence that supports a transaction, else peers could disagree on consensus (see my conclusion that alternatives to Proof-of-Work must centralize to obtain consensus) about new blocks and forks could appear.

Assume the blockchain is partitioned in N sections, where each mining peer only has to hold a section determined from its private key by partitioning the private key space into N sections.

If the blockchain evidence for each transaction is not sent to every mining peer, then transactions require a factor of N more time to be added into the blockchain (must wait for a mining peer to win the Proof-of-Work which holds the section of record on the sender’s balance) and forks can appear because (N-1)/N mining peers won’t be able to verify (N-1)/N transactions in the current block before starting Proof-of-Work on the next block.

So if the blockchain is partitioned N ways, the only viable design is that the evidence must be sent to all mining peers for each transaction. This increases the bandwidth required by Proof-of-Work while reducing the bandwidth required for new peers to download the blockchain. The number of peers that will request the evidence is N-1, and the size of the blockchain that a new peer has to download is total/N.

I believe Jim is correct that the only evidence that needs to be sent is the branches of the Merkle tree within the block up to the block hash. All mining peers would keep a complete history of block headers, since these are only 80 bytes * 6/hr * 24hr * 365 = 4.2MB per year.

The Merkle tree is a perfectly balanced binary tree, thus the depth of the tree is log2(T) where T is the number of transactions in a block. Thus the number of (2 hashes evidence per) nodes from block hash tree root is log2(T)-1. Thus the Merkle branch evidence bandwidth required at the limit of N -> infinity is T_current x ((log2(T_old)-1) x 2 x hashsize + transactionsize/2). Note this is in addition to the data for the current block, which is T_current x (hashsize + transactionsize) – hashsize.

Visa scale is ~16 million transactions per 10 minutes. If hashsize is 20 bytes (instead of current 32 bytes) and transactionsize is 50 bytes, then for ~16 million transactions per block, the 1.1GB data size increases to 15.8 GB per 10 minute block.
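The two formulas above can be evaluated directly. The sketch below transcribes them as written, and assumes T_old equals T_current and takes "~16 million" to mean 2^24 transactions so that the tree is perfectly balanced. The function names are made up for the example.

```python
import math


def block_bytes(t_current: int, hash_size: int, tx_size: int) -> int:
    """Data for the current block: T_current x (hashsize + txsize) - hashsize."""
    return t_current * (hash_size + tx_size) - hash_size


def evidence_bytes(t_current: int, t_old: int, hash_size: int,
                   tx_size: int) -> float:
    """Merkle branch evidence in the limit of N -> infinity partitions."""
    per_transaction = (math.log2(t_old) - 1) * 2 * hash_size + tx_size / 2
    return t_current * per_transaction


if __name__ == "__main__":
    T = 2 ** 24  # assumed reading of "~16 million"
    base = block_bytes(T, hash_size=20, tx_size=50)
    extra = evidence_bytes(T, T, hash_size=20, tx_size=50)
    print(f"block data:      {base / 1e9:.2f} GB")
    print(f"branch evidence: {extra / 1e9:.2f} GB")
```

With these inputs the block data comes to a little under 1.2 GB and the branch evidence to a little under 15.9 GB per block, in line with the figures quoted. Since the evidence is stated to be in addition to the block data, the two together are around 17 GB.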

Non-hosted ISP connections are limited to order-of-magnitude of 100 MB to 1GB bandwidth per 10 minutes equating to 1.4 – 14.3 million transactions per 10 min block with Bitcoin’s non-partitioned blockchain or 118 to 1046 thousand transactions per 10 min block with the herein proposed, partitioned blockchain.

Thus I conclude that the only way to scale to Visa-scale and retain freedom of mining for all (and thus anonymity for all) is to limit the number of keys as I proposed above. This also has the advantages of keeping required bandwidth, and thus the impact of unreliable connection hiccups, lower, and of discarding the history of transaction graphs, thus increasing anonymity w.r.t. private sector attacks (although the NSA has the zettabyte storage resources to retain the transaction graphs even at Visa scale).

Does anyone see a problem with that proposal?

Andrew E. says:

Jim,

You have things backwards on a key point. There is something USG values more, much, much more, than possession of gold and that is a currency it can print at will. But for this to work the dollar must have credibility and credibility is given (or earned) not taken. So for instance, the leverage with respect to Germany’s gold at the NY Fed lies with the ECB not USG. And that’s because the ECB can easily break the USG printing press by bidding for bullion, in size, on the cash market with printed euros exploding the price of gold much closer to its true value, sinking the dollar. The ECB chooses not to exercise this power almost certainly because they see the dollar system ending on its own and it’s potentially bad for business being the guy who overturned the global apple cart.

[…] scaling problems started to bite in 2013. They are now biting really […]

jim says:

Not publishing this paper, because it appears to be bullshit.

The core problem with shards is that it is in the interests of the members of a shard to cooperate together to collectively perform a Byzantine fault against the other shards. This paper fails to address that problem.

The total state of the network must always be represented by the root of a Merkle tree that represents the total state of the network at any one block time. The proposed mechanism lacks that Merkle root.

Consensus in a widely distributed system always means consensus about the past. The present is always in flux. In Bitcoin, this consensus is represented by a block being built on by all future blocks. The block contains a Merkle root that represents consensus about who owned what at that block time. Where is this Merkle root in this proposal?

In Bitcoin, a miner makes an assertion about who owned what. When other miners build on his block, this assertion becomes the universally accepted consensus about who owned what at that time.
