BitClout Code Walkthrough

Introduction: The BitClout Repos

Today, bitclout.com is powered by the following repos. Together, these repos make up the entirety of what runs on bitclout.com while also supporting the ability for anyone to run their own BitClout node with all of the same data that bitclout.com has access to:

  • github.com/bitclout/core

    • This is a Golang repo that contains all of the "consensus" code behind BitClout. It's meant to be a kernel that's embedded as a library into projects that want to build on the BitClout firehose.

  • github.com/bitclout/backend

    • The backend repo embeds core as a library and exposes a rich API on top of it to support transaction construction, submitting transactions to the blockchain, storing user data, and more. In some sense, it's the first "reference" app built on the core BitClout blockchain.

  • github.com/bitclout/frontend

    • This is an Angular app that is the frontend for bitclout.com. It uses the API exposed by the backend repo to support all of its queries.

  • github.com/bitclout/identity

    • This is a lightweight embeddable app that gets loaded as an iFrame in the frontend Angular app to handle all signing functions.

Below is a simple diagram that shows visually how these repositories fit together:

Overview of the architecture

We think the easiest way to understand the architecture is to describe how a node syncs with other nodes, and then to walk through key codepaths with pointers to functions and line numbers. We do this below. We use the following commit hashes to refer to the code:

The node’s main loop

  • The entrypoint to everything the node does is main.go. It's better to start tracing from the backend repo's main rather than the core repo's main, since the core repo is mainly intended to be used as a library. Moreover, since backend uses the core repo as a library, we will hit all of the core functionality by starting here anyway.

    • There is a lot of indirection in main introduced by the fact that we are using Viper to manage our command-line flags. When the backend binary is run, a command is passed, such as "run," which triggers a Run() function defined in the cmd package.

    • All available commandline flags can be viewed in the init() function. Some of these flags are initialized in LoadConfig() at the beginning of Run().

      • Note the core repo's flags are effectively imported into backend. This allows for maximum composability, whereby someone can include the core repo and get all of its functionality embedded into their binary for free.

    • Once you get into the Run() function, everything the node does can be traced explicitly. We will be walking through some of the key codepaths below.

  • When a node starts up, it looks for peers that it can download blocks and transactions from. There are two main ways a node finds peers:

    • DNS bootstrapping. All peers scan domains of the form bitclout-seed-*.io to see if any valid peers are available. The function that does that is addSeedAddrsFromPrefixes() and the list of "prefixes" that are scanned is defined in constants.go.

      • We think this is a safe way to find initial peers for four reasons: it would cost O($1M) to buy all of the seeds, a node only needs one valid sync peer in order to thwart an "eclipse" attack, a node can iterate over tens of thousands of DNS records per second, and node operators can change the DNS seeds if a particular prefix is monopolized.

    • Commandline flags. --connect-ips means a node will connect to the specified peers and nothing else. --add-ips means the given peers will be added to the list of peers that the node is going to try to connect to. When we spin up new nodes, we often use --connect-ips with a trusted node because it's easier than bootstrapping from the sea of nodes that are running in the wider internet.
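
The DNS bootstrapping step above can be sketched roughly as follows. This is a minimal illustration, not the code in addSeedAddrsFromPrefixes(): the helper names (seedCandidates, resolveSeeds, lookupFunc) and the exact "<prefix><number>.io" hostname pattern are assumptions made for the example, and the resolver is injected so the sketch runs without touching the network.

```go
package main

import (
	"errors"
	"fmt"
)

// lookupFunc resolves a hostname to IP strings. net.LookupHost from the
// standard library has this shape and could be passed in by a real node.
type lookupFunc func(host string) ([]string, error)

// errNoSuchHost is returned by the fake resolver used in this sketch.
var errNoSuchHost = errors.New("no such host")

// seedCandidates builds hostnames of the (assumed) form <prefix><i>.io
// for every prefix and every i in [0, n).
func seedCandidates(prefixes []string, n int) []string {
	hosts := make([]string, 0, len(prefixes)*n)
	for _, p := range prefixes {
		for i := 0; i < n; i++ {
			hosts = append(hosts, fmt.Sprintf("%s%d.io", p, i))
		}
	}
	return hosts
}

// resolveSeeds returns every address that any candidate host resolves to.
// Hosts that fail to resolve are simply skipped.
func resolveSeeds(hosts []string, lookup lookupFunc) []string {
	var addrs []string
	for _, h := range hosts {
		ips, err := lookup(h)
		if err != nil {
			continue
		}
		addrs = append(addrs, ips...)
	}
	return addrs
}

func main() {
	hosts := seedCandidates([]string{"bitclout-seed-"}, 3)
	// Fake resolver so the sketch runs offline: only one seed "exists".
	fake := func(host string) ([]string, error) {
		if host == "bitclout-seed-1.io" {
			return []string{"203.0.113.7"}, nil
		}
		return nil, errNoSuchHost
	}
	fmt.Println(hosts)
	fmt.Println(resolveSeeds(hosts, fake))
}
```

Only one candidate needs to resolve to an honest peer for the node to get a valid sync peer, which is why scanning a large space of cheap-to-check hostnames is attractive.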

  • The ConnectionManager is responsible for managing all connections with peers. It's initialized using a Start() function that is kicked off in main.go. Tracing the code starting from this function is a great way to understand how connections with peers are established and maintained.

  • When the ConnectionManager connects to a peer, it does a "version negotiation" similar to Bitcoin. This happens in ConnectPeer(). If the peer passes this version negotiation, then the peer is passed off to server.go via a "newPeerChan." server.go is then responsible for doing higher-level interactions with the peer.

    • server.go is started using a Start(), which is a good place to start tracing through it. server.go can be thought of as the "main loop" for the node. It is basically a single for{} loop that all peers and services are adding messages to. See messageHandler() to see this "main loop" in action.

    • Conceptually, server.go processes two types of messages, control messages and peer messages, both via messageHandler().

      • Peer messages are messages that came from one of the peers that the node is connected to. You can see there aren't very many of them, and they're fairly straightforward.

      • Control messages are basically notifications about things that happened internally to the node. For example, a peer connected or a peer disconnected.
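
The "single for{} loop" idea can be illustrated with a small sketch. The types and names here (serverMessage, msgKind, the payload strings) are hypothetical and much simpler than what server.go actually uses; the point is only that one goroutine drains one channel and dispatches on the message class.

```go
package main

import "fmt"

// msgKind distinguishes the two conceptual message classes.
type msgKind int

const (
	controlMsg msgKind = iota // internal notifications, e.g. a peer connected
	peerMsg                   // messages received from a remote peer
)

// serverMessage is a hypothetical envelope for anything sent to the loop.
type serverMessage struct {
	kind    msgKind
	peerID  int
	payload string
}

// messageHandler drains incoming until it is closed. Because every message
// is handled from this single loop, handlers never run concurrently with
// each other.
func messageHandler(incoming <-chan serverMessage) []string {
	var log []string
	for msg := range incoming {
		switch msg.kind {
		case controlMsg:
			log = append(log, fmt.Sprintf("control: %s (peer %d)", msg.payload, msg.peerID))
		case peerMsg:
			log = append(log, fmt.Sprintf("peer %d: %s", msg.peerID, msg.payload))
		}
	}
	return log
}

func main() {
	ch := make(chan serverMessage, 4)
	ch <- serverMessage{controlMsg, 1, "NewPeer"}
	ch <- serverMessage{peerMsg, 1, "HeaderBundle"}
	ch <- serverMessage{controlMsg, 1, "DonePeer"}
	close(ch)
	for _, line := range messageHandler(ch) {
		fmt.Println(line)
	}
}
```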

  • When a peer is connected, server.go gets a NewPeer or MsgBitCloutNewPeer control message from the ConnectionManager and handles it in _handleNewPeer(). This is typically the "starting point" for server.go.

    • If the peer is a valid one, then server.go will accept this peer as a "sync peer" in _startSync(), and it will send it a GetHeaders or MsgBitCloutGetHeaders message to start syncing headers and blocks from it.

    • The initial sync for a node is currently completely single-threaded. A sync peer is found and other peer messages are largely ignored until the node has downloaded up to the last 24 hours' worth of blocks.

  • Below are the steps to syncing with a peer, which can be traced by following the functions in server.go:

    • ConnectionManager passes a MsgBitCloutNewPeer message to server.go, which is processed in messageHandler().

    • Choose a remote peer as a syncPeer in _startSync(). Call this the "remote peer."

    • Remote peer replies to the MsgBitCloutGetHeaders with a MsgBitCloutHeaderBundle in _handleGetHeaders().

      • Note that a "header locator" similar to Bitcoin is used to determine which headers are needed.

    • Node processes the MsgBitCloutHeaderBundle at _handleHeaderBundle() and responds with different messages depending on how synced the peer is.

      • If more headers are required, it sends another MsgBitCloutGetHeaders. Note that headers are requested until the number of headers in the latest HeaderBundle is < MaxHeadersPerMsg. This is how the node knows that it's downloaded all the headers the remote peer has for it.

      • If the node has exhausted the peer's headers then it downloads blocks until it has a block for every corresponding header that the peer sent it. This is exactly the same as the "headers-first" synchronization that Bitcoin does. The MsgBitCloutGetBlocks message is sent in GetBlocks().

      • Processing a block happens in ProcessBlock(), which is a great function to trace through. It calls ConnectBlock(), which calls ConnectTransaction on each transaction, which we'll discuss later.

      • Once the node has all the headers it needs from the peer, and if the node has downloaded and validated all the blocks from this peer, then the node is fully synced.

        • Once we get to this state, the node listens to INV messages from all of its peers. If it sees an INV message for a new block that it doesn't have yet, then it will send the peer a GetHeaders request, which will kick off this headers-first process for the single missing header/block.
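
The header-download loop described in these steps can be sketched as below. This is a toy model under stated assumptions: headers are represented by their heights, remotePeer and syncHeaders are hypothetical names, and maxHeadersPerMsg is a small stand-in for MaxHeadersPerMsg. It shows only the stopping rule (a bundle shorter than the maximum means the peer has nothing more to send), not header locators or validation.

```go
package main

import "fmt"

// maxHeadersPerMsg is a small stand-in for MaxHeadersPerMsg.
const maxHeadersPerMsg = 3

// remotePeer is a toy peer whose chain is just the heights 1..tip.
type remotePeer struct{ tip int }

// getHeaders returns up to maxHeadersPerMsg headers after height from.
func (p remotePeer) getHeaders(from int) []int {
	var bundle []int
	for h := from + 1; h <= p.tip && len(bundle) < maxHeadersPerMsg; h++ {
		bundle = append(bundle, h)
	}
	return bundle
}

// syncHeaders keeps requesting bundles until one comes back with fewer than
// maxHeadersPerMsg headers, which signals that the peer is exhausted.
func syncHeaders(p remotePeer, localTip int) (headers []int, requests int) {
	for {
		bundle := p.getHeaders(localTip)
		requests++
		headers = append(headers, bundle...)
		if len(bundle) > 0 {
			localTip = bundle[len(bundle)-1]
		}
		if len(bundle) < maxHeadersPerMsg {
			return headers, requests
		}
	}
}

func main() {
	headers, requests := syncHeaders(remotePeer{tip: 7}, 0)
	fmt.Println("headers:", headers, "requests:", requests)
	// Once headers are exhausted, a block would be requested for each header.
}
```

Note that when the peer's chain length is an exact multiple of the maximum, the final request returns an empty bundle, which still satisfies the "fewer than the maximum" stopping rule.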

  • Once the node has gotten through this loop, it is fully synced and in a "steady-state." At this point, the node listens for INV messages from its peers to update its state. INV messages or MsgBitCloutInv are processed via messageHandler() just like everything else. INV messages can be for a block, as mentioned previously, or for a transaction. Below is the case for a transaction INV:

    • Note that some "handle" functions are defined in peer.go rather than server.go. When this is the case, the server.go _handlePeerMessages() function will just enqueue the message for the peer's thread to process it. This is done in order to move processing into another thread for efficiency reasons (not doing this would cause server.go to be *too* single-threaded). Here you can see the _handleInv() in server.go delegate the call to peer.go, and here you see peer.go dequeuing it to process it. Note that there are several messages that are delegated in this way, all defined in the StartBitCloutMessageProcessor() function.

    • If the node is missing a transaction that it received an INV for, it sends a GetTransactions or MsgBitCloutGetTransactions message to the peer.

    • This triggers the node's _handleGetTransactions() function in server.go, which results in a TransactionBundle or MsgBitCloutTransactionBundle being sent back.

    • The node receives the transaction bundle here and processes each transaction in _processTransactions() in server.go.

      • When a transaction is processed in server.go, it is basically just calling processTransaction() in mempool.go. If the transaction is valid then it will be added to the mempool, and if not then it will be rejected. In order to validate a transaction, mempool uses the previously mentioned ConnectTransaction() function defined in block_view.go.
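
As a rough sketch of the transaction INV flow above: the node compares the advertised hashes against what it already has, requests only the missing ones, and then validates each returned transaction before adding it to the mempool. The names below (txHash, mempool, missingFromInv, processBundle) are hypothetical, and the validate callback merely stands in for the ConnectTransaction()-based check.

```go
package main

import "fmt"

// txHash identifies a transaction in this toy model.
type txHash string

// mempool is a toy set of transaction hashes the node already holds.
type mempool map[txHash]bool

// missingFromInv returns the advertised hashes that the node does not have
// yet, preserving INV order and skipping duplicates. These are the hashes a
// GetTransactions request would ask for.
func missingFromInv(pool mempool, inv []txHash) []txHash {
	seen := make(map[txHash]bool)
	var want []txHash
	for _, h := range inv {
		if pool[h] || seen[h] {
			continue
		}
		seen[h] = true
		want = append(want, h)
	}
	return want
}

// processBundle adds every transaction that passes validate to the pool and
// returns how many were accepted. Invalid transactions are rejected.
func processBundle(pool mempool, bundle []txHash, validate func(txHash) bool) int {
	accepted := 0
	for _, h := range bundle {
		if pool[h] || !validate(h) {
			continue
		}
		pool[h] = true
		accepted++
	}
	return accepted
}

func main() {
	pool := mempool{"a": true}
	want := missingFromInv(pool, []txHash{"a", "b", "c", "b"})
	fmt.Println("request:", want)
	n := processBundle(pool, want, func(h txHash) bool { return h != "c" })
	fmt.Println("accepted:", n, "pool size:", len(pool))
}
```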

  • Now we understand how a node syncs initial blocks, and how it accepts new blocks and transactions in the steady-state. The next step is to understand how blocks are created and mined:

    • block_producer.go runs in a continuous loop kicked off via a Start() function called in main.go. Start() calls UpdateLatestBlockTemplate() at regular intervals to create new blocks for miners to mine. This is a great function to trace.

    • Function _getBlockTemplate() contains the logic for constructing a new block. It basically does the following:

      • Add txns from the mempool to the block until the block is full.

      • Compute the fee, merkle root, etc.

    • Newly-created "block template" is added to recentBlockTemplatesProduced in AddBlockTemplate().

    • block_producer.go just produces block templates, but it's up to miners to compute winning hashes. That happens via a remote process as follows:

      • Every node exposes two functions via JSON API: GetBlockTemplate() and SubmitBlock(). The URL paths for these and all other API functions can be seen here and here (the latter powers the block explorer).

      • Miners run remote_miner_main.go and connect to any node they want via a flag. This can be their own local node or a remote node like api.bitclout.com. remote_miner_main.go will continuously call GetBlockTemplate() on the chosen node and hash it until it's found a block. Once it has found a winning hash, it calls SubmitBlock(), which then causes the node to process it and broadcast it to the rest of the network.

        • Because all nodes expose get-block-template, all nodes can be used to mine blocks in this way. Miners generally don't need to do anything other than point to a valid BitClout node somewhere on the network.

      • Note that we are currently working on increasing the nonce size to 64 bits up from 32 bits. This will result in ExtraNonce being basically deprecated, and will make GetBlockTemplate() much faster because it won't have to copy a block.
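
The remote miner's inner loop (take a template, try nonces until the hash beats the target, then submit) can be sketched as follows. Everything here is an assumption made for illustration: the header layout, the use of double SHA-256, and the function names are not taken from remote_miner_main.go, and the real proof-of-work hash and difficulty encoding may differ. The 64-bit nonce mirrors the planned change mentioned above.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"encoding/binary"
	"fmt"
)

// hashWithNonce appends a 64-bit big-endian nonce to the template bytes and
// returns the double SHA-256 digest. The hash function is an assumption.
func hashWithNonce(template []byte, nonce uint64) [32]byte {
	buf := make([]byte, len(template)+8)
	copy(buf, template)
	binary.BigEndian.PutUint64(buf[len(template):], nonce)
	first := sha256.Sum256(buf)
	return sha256.Sum256(first[:])
}

// mine tries nonces in [0, maxTries) and returns the first one whose hash,
// compared as a big-endian number, is strictly below target.
func mine(template []byte, target [32]byte, maxTries uint64) (uint64, bool) {
	for nonce := uint64(0); nonce < maxTries; nonce++ {
		h := hashWithNonce(template, nonce)
		if bytes.Compare(h[:], target[:]) < 0 {
			return nonce, true
		}
	}
	return 0, false
}

func main() {
	// An easy target: roughly one hash in sixteen is below it.
	var target [32]byte
	target[0] = 0x10

	template := []byte("bytes returned by a GetBlockTemplate call")
	if nonce, ok := mine(template, target, 1<<20); ok {
		fmt.Printf("winning nonce %d, hash %x\n", nonce, hashWithNonce(template, nonce))
		// A real miner would now call SubmitBlock with this nonce.
	}
}
```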

    • Once a block has been submitted via SubmitBlock, it is then relayed to other peers via the INV mechanism described previously. This happens as follows:

  • There is one more important thread that a node runs at startup, which is the BitcoinManager thread defined in bitcoin_manager.go. Like everything else, it has a Start() function that is kicked off in main.go via server.go (called here). It works as follows:

    • It looks for a Bitcoin peer and connects to it through _getBitcoinPeer().

    • It sends the Bitcoin peer a GetHeaders and kicks off a single-threaded main loop with its peer here.

    • It downloads headers until it is fully synced with the Bitcoin peer.

      • All we really need from a Bitcoin node is its header chain.

      • The headers are used to validate BitcoinExchange transactions when calling ConnectTransaction() in either ProcessBlock() or processTransaction(). A BitcoinExchange transaction is only valid if it has a merkle proof attached to it that has a valid Bitcoin header hash as its root. More on this later.

    • In addition to the header chain, new blocks are downloaded from the Bitcoin node in order to extract valid BitcoinExchange transactions from them. Basically, any transaction on the Bitcoin chain that sends Bitcoin to the sink address, defined here, is recognized as being able to print BitClout.

      • Blocks are downloaded from the Bitcoin peer whenever a new header is received from the peer here and here.

      • You can see how the extraction of a BitcoinExchange transaction works here.

    • Like other services, whenever the BitcoinManager gets some new transactions or headers, it notifies server.go by adding a message that will be processed by messageHandler. This happens here.

    • The BitcoinManager does some other things, like for example it is used to broadcast BitcoinExchange transactions to many peers at once here. But its main purpose is to download the Bitcoin header chain and, to a lesser extent, to download new blocks and extract valid BitcoinExchange transactions from them.

    • Note also that using a single Bitcoin peer may seem insecure, but because the node checks the minimum work is above a certain threshold, it's generally not an issue. Additionally, nodes that run bitclout.com are pointed at specific trustworthy Bitcoin peers using --bitcoin_connect_peer.
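
To make the merkle-proof check on BitcoinExchange transactions more concrete, here is a minimal sketch of Bitcoin-style merkle proof verification: the transaction hash is folded together with its sibling hashes, and the result must equal the merkle root committed to by a Bitcoin header the node already trusts. The function names and the proof representation (leaf index plus sibling list) are assumptions for this example and are not the types used in the core repo.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// hash32 is a 32-byte digest.
type hash32 = [32]byte

// doubleSHA256 is the hash Bitcoin uses for transactions and merkle nodes.
func doubleSHA256(b []byte) hash32 {
	first := sha256.Sum256(b)
	return sha256.Sum256(first[:])
}

// hashPair computes the parent of two adjacent merkle nodes.
func hashPair(left, right hash32) hash32 {
	return doubleSHA256(append(append([]byte{}, left[:]...), right[:]...))
}

// merkleRootFromProof folds the sibling hashes into the leaf, bottom-up.
// Bit i of index says whether the node at level i is a right child.
func merkleRootFromProof(leaf hash32, index uint32, siblings []hash32) hash32 {
	cur := leaf
	for _, sib := range siblings {
		if index&1 == 1 {
			cur = hashPair(sib, cur)
		} else {
			cur = hashPair(cur, sib)
		}
		index >>= 1
	}
	return cur
}

func main() {
	// Build a four-leaf tree by hand so the proof can be checked against it.
	var leaves [4]hash32
	for i := range leaves {
		leaves[i] = doubleSHA256([]byte{byte(i)})
	}
	n01 := hashPair(leaves[0], leaves[1])
	n23 := hashPair(leaves[2], leaves[3])
	root := hashPair(n01, n23)

	// Proof for leaf 2: its sibling leaf 3, then the subtree n01.
	got := merkleRootFromProof(leaves[2], 2, []hash32{leaves[3], n01})
	fmt.Println("proof matches header merkle root:", got == root)
}
```

Because only the header chain is needed to check the root, a node can validate these transactions without storing full Bitcoin blocks.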

Seed creation and transaction construction

Below we trace how seeds and transactions are created while giving detail on their format and how validation works.

  • First, a user lands on bitclout.com, which is the Angular frontend.

    • All the API endpoints for the frontend are defined in a single file called backend_api_service.ts

      • All the routes are defined here.

    • They all hit corresponding API endpoints defined on the node's JSON API, which is fully defined in frontend_server.go.

      • All the routes are the same as the ones defined in backend_api_service.ts and are defined here and configured here.

      • When a node starts up it opens up three ports: A "web" port that serves the Angular app, a "protocol" port that is used to connect with peers and process all blockchain-related messages, and an "API" port that is used to handle requests from the Angular app.

        • By default these ports are: 4002=Angular app, 17001=JSON API, 17000=protocol port

        • Note that the “web” port is deprecated in favor of running the frontend Angular app as a stand-alone service. So very soon a node will only have a JSON API port and a protocol port.

      • Anytime the Angular app needs to do something like construct a transaction or download the data for a user, it uses the API port. The JSON API is like "glue" between the blockchain and the frontend.

  • Creating and storing the seed

    • When a user hits “Sign Up,” they are taken to identity.bitclout.com.

      • On identity.bitclout.com, the user generates a seed phrase and then hits next.

        • The seed is stored in the localStorage of identity.bitclout.com using a call to addUser.

      • All of the seed phrases stored in localStorage are encrypted using a call to getEncryptedUsers().

        • The access level of the host is determined. For example, bitclout.com has “FULL” access. Other nodes will have different access depending on what users have explicitly allowed.

        • If a host has “FULL” access, then an encryption key is computed for that host, to be used in a subsequent step. This encryption key is stored in localStorage where possible, but for some browsers like Safari it must be stored in a cookie, which is less ideal but works.

        • Once an encryption key is generated for the host, it is used to compute an encryptedSeedHex. Again, this only happens if the node has the FULL access level.

      • Then, if the host has the FULL access level, the encrypted users are sent back to the host (in our case it’s bitclout.com) by a call to login(), which then does a window.postMessage back to the host.

        • Note: This is tab-to-tab communication. bitclout.com opens identity.bitclout.com, identity generates the encryptedSeedHex, and then sends it back to bitclout.com. This same process works if you replace bitclout.com with the host of your own third-party node. The difference is that your third-party node will need to ask the user for permission in order to get encryptedSeedHex sent back to it.

    • Once bitclout.com has the encryptedSeedHex, it uses it to sign things. It does this by calling various operations on an iframe of identity.bitclout.com embedded within it.

    • Why is this so complicated? Why send encryptedSeedHex back to the host? Wouldn’t it be better to just keep everything in identity.bitclout.com?

      • The reason for this setup is that iOS devices do not allow identity.bitclout.com to access persistent localStorage when it’s embedded as an iframe in bitclout.com. This is due to Apple’s crusade against third-party cookies. However, Apple does allow identity.bitclout.com to access its cookies when it’s embedded as an iframe on bitclout.com if those cookies are set as first-party cookies.

      • So, what do we do? We push the user to create their seed on identity.bitclout.com, where we can set an encryption key as a first-party cookie. Then, back on bitclout.com we store the encryptedSeedHex. When signing is needed, the encryptedSeedHex is passed to the identity.bitclout.com iframe, which has access to the encryption key in the cookie, which it then uses to decrypt the encryptedSeedHex and sign the transaction.

      • One drawback of this approach is that cookies are sent to identity.bitclout.com automatically when the page or iframe loads. This is not ideal, but that information is useless without the actual seed. Moreover, and critically, cookies are only used on iOS devices. On non-iOS devices, the encryption key is stored in localStorage. This means that only iOS devices are subject to this drawback.

      • One other drawback is that an XSS attack on bitclout.com or a third-party node could technically give the attacker access to the encryptedSeedHex. However, this information is useless without the encryption key stored exclusively in identity.bitclout.com.
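
The split between "the host holds encryptedSeedHex" and "identity holds the encryption key" can be illustrated with a short sketch. It uses Go for illustration; the choice of AES-256-GCM, the 32-byte key, the hex layout (nonce followed by ciphertext), and the function names are all assumptions and are not taken from the identity repo. What it shows is the property the design relies on: the encrypted seed is useless without the per-host key.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"encoding/hex"
	"errors"
	"fmt"
)

// newGCM builds an AES-GCM AEAD from a 16, 24, or 32-byte host key.
func newGCM(hostKey []byte) (cipher.AEAD, error) {
	block, err := aes.NewCipher(hostKey)
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// encryptSeed seals the seed under the host key and returns
// hex(nonce || ciphertext), playing the role of encryptedSeedHex.
func encryptSeed(hostKey, seed []byte) (string, error) {
	gcm, err := newGCM(hostKey)
	if err != nil {
		return "", err
	}
	nonce := make([]byte, gcm.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return "", err
	}
	return hex.EncodeToString(gcm.Seal(nonce, nonce, seed, nil)), nil
}

// decryptSeed reverses encryptSeed. It fails if the key is wrong or the
// data was tampered with.
func decryptSeed(hostKey []byte, encryptedSeedHex string) ([]byte, error) {
	raw, err := hex.DecodeString(encryptedSeedHex)
	if err != nil {
		return nil, err
	}
	gcm, err := newGCM(hostKey)
	if err != nil {
		return nil, err
	}
	if len(raw) < gcm.NonceSize() {
		return nil, errors.New("encrypted seed too short")
	}
	nonce, ciphertext := raw[:gcm.NonceSize()], raw[gcm.NonceSize():]
	return gcm.Open(nil, nonce, ciphertext, nil)
}

func main() {
	hostKey := make([]byte, 32) // kept by identity (localStorage or cookie)
	if _, err := rand.Read(hostKey); err != nil {
		panic(err)
	}
	enc, err := encryptSeed(hostKey, []byte("example seed phrase"))
	if err != nil {
		panic(err)
	}
	fmt.Println("host stores:", enc)

	seed, err := decryptSeed(hostKey, enc)
	fmt.Println("identity recovers:", string(seed), err)
}
```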

  • When a user does any kind of "write" operation in the app, such as submitting a post, liking, or updating their profile, a corresponding endpoint in frontend_server.go is called to construct a transaction. That transaction is then returned unsigned, signed by the identity iframe, and then submitted back to core via SubmitTransaction().

  • As an example, consider /send-bitclout, which is relatively straightforward:

    • First, a universal view is fetched. More on this later, but it basically gives the endpoint a "union" of the "state" between what's in the mempool and what's in the blocks. For example, if someone sent you BitClout in a txn that's in the mempool, you can use the view to find that UTXO. And if they sent it to you in a txn that's been mined into a block, you can also find it in that view.

    • In order to create the spend transaction, the endpoint needs to find UTXO's for the user. This almost always happens in AddInputsAndChangeToTransaction(), which is a good function to trace through. I'm not aware of any transaction assembly that does not utilize this function for UTXO fetching.

      • The key function is GetSpendableUtxosForPublicKey(), which generates a universal view that includes txns from the mempool and then returns all UTXO's that are associated with the particular public key. These UTXO's can then be assembled into a transaction.

      • Again, basically all transaction assembly runs through this codepath.

    • Then the transaction is sent back to the frontend and signed.

    • The transaction is then validated and broadcast in VerifyAndBroadcastTransaction().

      • It does some pre-validation of the transaction by calling ConnectTransaction() on it.

      • If the validation passes then it calls BroadcastTransaction(), which calls _addNewTxn() in server.go, which adds the transaction to the mempool by calling ProcessTransaction().

      • Once the transaction is in the mempool, the node will eventually relay the transaction to its peers via a separate thread running in server.go that's kicked off in Start() through _startTransactionRelayer().

        • This thread is basically looking at the mempool at regular intervals and sending transactions to peers that they don't already have. This is how a transaction that's generated in the UI makes it to the rest of the network.

    • Once a transaction has gone into the mempool then we're done. It will eventually be mined into a block.
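
The UTXO-fetching and change computation in the /send-bitclout walkthrough can be sketched as follows. This is a simplified model under stated assumptions: the types and the functions spendableUtxos and selectInputs are hypothetical, the "universal view" is reduced to merging confirmed and mempool outputs minus anything already spent in the mempool, and inputs are picked in order rather than with whatever selection policy AddInputsAndChangeToTransaction() actually uses.

```go
package main

import (
	"errors"
	"fmt"
)

// utxoKey mirrors the <PreviousTxID, index> pair that identifies an output.
type utxoKey struct {
	txID  string
	index uint32
}

// utxo is an unspent output with its amount in base units.
type utxo struct {
	key    utxoKey
	amount uint64
}

var errInsufficientFunds = errors.New("insufficient funds")

// spendableUtxos merges confirmed and mempool outputs and drops any output
// that a mempool transaction has already spent.
func spendableUtxos(confirmed, inMempool []utxo, spentInMempool map[utxoKey]bool) []utxo {
	var out []utxo
	for _, u := range append(append([]utxo{}, confirmed...), inMempool...) {
		if !spentInMempool[u.key] {
			out = append(out, u)
		}
	}
	return out
}

// selectInputs picks UTXOs in order until they cover spend+fee and returns
// the chosen inputs together with the change owed back to the sender.
// Amounts are assumed small enough that spend+fee does not overflow.
func selectInputs(spendable []utxo, spend, fee uint64) ([]utxo, uint64, error) {
	need := spend + fee
	var inputs []utxo
	var total uint64
	for _, u := range spendable {
		if total >= need {
			break
		}
		inputs = append(inputs, u)
		total += u.amount
	}
	if total < need {
		return nil, 0, errInsufficientFunds
	}
	return inputs, total - need, nil
}

func main() {
	confirmed := []utxo{{utxoKey{"A", 0}, 5}, {utxoKey{"B", 0}, 7}}
	inMempool := []utxo{{utxoKey{"C", 1}, 10}}
	spent := map[utxoKey]bool{{"A", 0}: true}

	spendable := spendableUtxos(confirmed, inMempool, spent)
	inputs, change, err := selectInputs(spendable, 10, 1)
	fmt.Println(inputs, change, err)
}
```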

  • A note on the /burn-bitcoin endpoint:

    • This endpoint is called when a user buys BitClout using Bitcoin in the "Buy BitClout" tab. It does the following:

      • Constructs a Bitcoin transaction sending the user's Bitcoin to the "sink" address

      • Broadcasts it to the Bitcoin blockchain

      • Waits some amount of time for the transaction to propagate

      • Checks to see if a double-spend occurred during this interval.

      • If no double-spend was detected, the transaction is added to the BitClout mempool with the expectation that it will eventually mine into a Bitcoin block (and subsequently a BitClout block).

        • The fee is generally set to 2x the "fastest" fee to ensure very high probability that the txn is processed. This is currently set in the frontend, but there is no reason why it can't be enforced in either the frontend_server.go code or in the mempool itself prior to accepting the Bitcoin txn.

      • Once this transaction has been accepted into the mempool, the user can immediately spend it.

        • This means there will be some risk of reversion of the user's transactions if the transaction isn't ultimately confirmed by the Bitcoin blockchain. But we have yet to have someone successfully double-spend against the latest iteration of the double-spend checking logic.

      • BitcoinExchange transactions can also be added to the mempool via relay from other peers. In this case, the node can be set to ignore unmined Bitcoin transactions from peers so there is minimal risk of a double-spend or reversion.

    • Importantly, no matter what the mempool does, the BitClout blockchain will not allow a BitcoinExchange transaction into it without at least one block of work on it. In practice, three blocks of work are required because miners wait for three blocks in order to be safe. This happens via a param called MinerBitcoinMinBurnWorkBlocks that is utilized by the block producer.

Transaction format

  • Generally, all important "messages" that need to get sent between peers, most notably MsgBitCloutTxn and MsgBitCloutBlock, are defined in network.go. They all implement the very simple BitCloutMessage interface.

  • All of these messages have serialization functions called ToBytes() that are defined by us in order to guarantee that all nodes serialize to the exact same bytes. If we were to rely on protobufs or JSON, nodes could get different serialized byte strings for the same messages because these formats do not guarantee consistent serialization across machines.

  • Transactions are based on UTXO's. They contain the following:

    • TxInputs, which is effectively a list of <PreviousTxID, index> pairs called UtxoKey where the index refers to the output being spent.

    • TxOutputs, which just specify what amounts are going to which public keys.

    • TxnMeta. More on this later.

    • PublicKey. In BitClout, transactions are very simple and only have one public key that can be deemed to be the "executor" of the transaction. The transaction is almost always signed by this public key.

    • ExtraData. This is a flexible map that arbitrary data can be added to. It is currently used to support Reclouts via RecloutedPostHash and IsQuoteReclouted params. It can be used to augment a transaction without causing a hard fork, which significantly increases the extensibility of BitClout by the community. For example, one can trivially add a "pinned posts" feature using ExtraData without consulting the core BitClout devs about it.

  • Note that the map keys of ExtraData are always sorted when serialized so that consistent serialization across machines is preserved even though we're using a map.

  • Transaction metadata is used to determine what type of transaction we're dealing with. For each type of transaction in the system, a metadata type is defined that implements the BitCloutTxnMetadata interface. The full list of transaction types can be viewed here. To see descriptions of each one, simply find where that transaction type implements the interface.

    • For example, here is the BitcoinExchangeMetadata. You can see it contains a full Bitcoin transaction plus a merkle proof into the Bitcoin blockchain. This is how a node verifies that a particular Bitcoin transaction has a sufficient amount of work on it.

    • TODO: The comments on these transaction types could use some work.
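
The point about sorting ExtraData keys before serialization can be shown with a small sketch. Go randomizes map iteration order, so writing entries in iteration order would give different bytes on different runs; sorting the keys first makes the output deterministic. The encoding below (uvarint length prefixes) and the function names are assumptions for illustration and are not the exact wire format produced by ToBytes().

```go
package main

import (
	"encoding/binary"
	"fmt"
	"sort"
)

// appendUvarint appends v to dst using unsigned varint encoding.
func appendUvarint(dst []byte, v uint64) []byte {
	var tmp [binary.MaxVarintLen64]byte
	n := binary.PutUvarint(tmp[:], v)
	return append(dst, tmp[:n]...)
}

// appendBytes appends a length-prefixed byte string to dst.
func appendBytes(dst, b []byte) []byte {
	dst = appendUvarint(dst, uint64(len(b)))
	return append(dst, b...)
}

// serializeExtraData writes the entry count followed by each key/value pair
// in sorted key order, so equal maps always produce identical bytes.
func serializeExtraData(extra map[string][]byte) []byte {
	keys := make([]string, 0, len(extra))
	for k := range extra {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	out := appendUvarint(nil, uint64(len(keys)))
	for _, k := range keys {
		out = appendBytes(out, []byte(k))
		out = appendBytes(out, extra[k])
	}
	return out
}

func main() {
	extra := map[string][]byte{
		"RecloutedPostHash": {0xab, 0xcd},
		"IsQuoteReclouted":  {0x01},
	}
	fmt.Printf("%x\n", serializeExtraData(extra))
}
```

The same property is what lets a community feature add new ExtraData keys without changing how existing transactions hash.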

Transaction validation

  • Virtually all transaction validation happens in _connectTransaction in block_view.go.

  • Validation works by applying the transaction to a "view," which is basically a "simulation" of what would happen if the transaction were written to the database, but that doesn’t actually modify the database. This is useful because a view can allow you to "simulate" what would happen if you applied a bunch of transactions to the database in sequence in order to validate whole blocks before ever actually writing anything to the database. And this is exactly what ConnectBlock does.

    • A view is basically a "copy on write" system. When a transaction requires something to be written to the database, an in-memory entry is created representing that entry. This generally happens in calls to _set.*mappings and _get.*, such as _setProfileEntryMappings and _getProfileEntryForUsername.

  • If all of the transactions that have been applied to a view appear to be valid, the view can be "flushed" to the database, which writes all of the updates those transactions produced to the database. The flush code for the view is here, and it delegates to individual flush functions here.

  • We can walk through connecting an UpdateProfile transaction to see how it works.

    • _connectTransaction delegates to _connectUpdateProfile.

    • UTXO's are generally always checked by a call to _connectBasicTransfer, which returns the total input and output of the transaction.

    • An existing profile entry is looked up if one exists. If it exists, it is updated. Otherwise, a new one is created from scratch.

      • Updating an existing profile happens here while creating a new one happens here.

    • In both cases, mappings for the profile are first deleted from the view and then set on the view.

      • Note that deleting something from the view never actually deletes a mapping; it only marks it as isDeleted=true. This is because the flush needs to propagate this change to the db, and it can only do that if the entry is left in the view and marked as scheduled for deletion.

    • Finally, some information is saved that allows us to roll back or "disconnect" the transaction in the future if needed.

  • Every transaction has both a _connect and a _disconnect. The _disconnect restores the view to the state it was in before the transaction was connected. _disconnect code is rarely used, but it supports reorgs of blocks, which happen from time to time.

    • During a reorg, we need to disconnect some transactions from some blocks and connect transactions from some other blocks in order to validate the fork, *before* writing anything to the db. This happens in ProcessBlock here.
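
The view mechanics described in this section (copy-on-write entries, isDeleted tombstones, and a final flush) can be sketched in a few lines. This is a toy model under stated assumptions: the database is a plain map, and the names (view, profileEntry, get, set, remove, flush) are hypothetical rather than the functions in block_view.go. It shows why a delete must leave a tombstone in the view: without it, flush would have no way to know the entry should be removed from the database.

```go
package main

import "fmt"

// profileEntry is an in-memory copy of a database row.
type profileEntry struct {
	username  string
	isDeleted bool // tombstone: flush must delete this row from the db
}

// view buffers writes in memory on top of a database it never modifies
// until flush is called.
type view struct {
	db      map[string]string        // public key -> username (stands in for the db)
	entries map[string]*profileEntry // copy-on-write overlay
}

func newView(db map[string]string) *view {
	return &view{db: db, entries: make(map[string]*profileEntry)}
}

// get checks the overlay first and falls back to the db, caching the result.
func (v *view) get(pk string) *profileEntry {
	if e, ok := v.entries[pk]; ok {
		if e.isDeleted {
			return nil
		}
		return e
	}
	if name, ok := v.db[pk]; ok {
		e := &profileEntry{username: name}
		v.entries[pk] = e
		return e
	}
	return nil
}

// set records a new or updated entry in the overlay only.
func (v *view) set(pk, username string) {
	v.entries[pk] = &profileEntry{username: username}
}

// remove marks the entry as deleted instead of dropping it from the overlay.
func (v *view) remove(pk string) {
	v.entries[pk] = &profileEntry{isDeleted: true}
}

// flush writes every buffered change to the db and clears the overlay.
func (v *view) flush() {
	for pk, e := range v.entries {
		if e.isDeleted {
			delete(v.db, pk)
		} else {
			v.db[pk] = e.username
		}
	}
	v.entries = make(map[string]*profileEntry)
}

func main() {
	db := map[string]string{"pk1": "alice"}
	v := newView(db)
	v.remove("pk1")
	v.set("pk2", "bob")
	fmt.Println("before flush:", db)
	v.flush()
	fmt.Println("after flush: ", db)
}
```

A block's transactions can all be applied to one such view, and the view is flushed only if every transaction connected successfully; discarding the view leaves the database untouched.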
