Categories
All Posts Posts

The Blockchain Oracle Problem

Definition: The blockchain oracle problem describes the contradiction between a highly decentralized blockchain and the more or less centralized oracles.

It leads to a paradox in where the blockchain becomes more useful with off-chain data but at the same time loses its benefit of decentralization through it.

Categories
All Posts Posts Uncategorised

Data Aggregation Methods of Blockchain Oracles

If a blockchain oracle retrieves its data from multiple data sources, it needs to aggregate them. In this post, you will learn how blockchain oracles aggregate data.

The aggregation consists of two steps:

  1. Data cleansing: Outlier detection and type checks
  2. Data aggregation

Data Type Checks

Incoming data need to be checked if their data type is correct. If you expect a number, you will reject any letter. Besides a simple data type check, it is possible to verify whether the pattern is correct or not. A typical example is the zip code or bank account number that has a fixed number of characters. If the submitted value is too short or too long, it will be rejected.

Pattern and data type checks help to protect against typos or bugs in the code where erroneous messages are sent.

Outlier Elimination/Data Cleansing

The first step is to detect and eliminate outliers. Outlier detection is an important part to keep the data reliable. But since oracles don’t know what data is right or wrong, it can only apply very rough measures.

The most straightforward approach is to cut off the n lowest and n highest values from the data set. You find this often in jury decisions in sports where the lowest and the highest scores are removed to eliminate biased jury members.

Let us consider the following example. There are ten different data sources. Each of them reports one value. We want to set n = 1.

data source value
1 6
2 4
3 8
4 4
5 5
6 6
7 7
8 5
9 1
10 3

First, we sort the values.

data source value
9 1
4 3
2 4
10 4
5 5
7 5
1 6
6 6
8 7
3 8

Then, we delete the highest and lowest value. These are 1 (reported by data source 9) and 8 (reported by data source 3).

The resulting table looks like that:

data source value
4 3
2 4
10 4
5 5
7 5
1 6
6 6
8 7

The threshold n can be given absolutely or depending on the number of sources.

Another similar approach is to calculate quantiles and eliminate them. A common threshold is the first quartile and third quartile. Everything below or above is ignored in the following aggregation step.

Some oracles use absolute values as threshold. They provide an upper and lower bound in which the reported values must lie. This is useful to eliminate nonsense data like negative sizes or prices.  Chainlink is an oracle that uses this approach.

Besides absolute values, relative values are possible too. Data that stray more than x percent from the average, are not considered.

Here, we explain in short how to calculate the quartile. When calculating the quartiles, we need to sort the values first and calculate the cumulative percentage.

data source Value cumulative percentage
9 1 0.1
4 3 0.2
2 4 0.3
10 4 0.4
5 5 0.5
7 5 0.6
1 6 0.7
6 6 0.8
8 7 0.9
3 8 1.0

There are different methods (See https://en.wikipedia.org/wiki/Quartile) how to determine the exact value of a quartile that yield different results. We use the cumulative percentage to determine the quartiles.

The first quartile consists of all values that receive a cumulated percentage of at least 25 % (0.25). To achieve this, we need to include the values 1, 3, and 3.

The third quartile consists of all values that receive a cumulated percentage of at least 75 % (0.75). This comprises the values 1, 3, 3, 4, 5, 5, 5, and 5.

Since we want to delete all values below the first quartile and above the third quartile, we get the following table:

data source value
4 4
1 5
5 5
8 5

Besides those rather simple methods of outlier detection, there are more sophisticated mechanisms like support vector machines, k-nearest neighbor, hidden Markov models, etc. However, they are difficult to perform on a smart contract due to its limited computation capacity.

Value Aggregation

After eliminating outliers, the oracle has to condense all data into a single value. First, we look at methods that can be used to aggregate values. Later, we learn how to aggregate values on-chain and off-chain.

Aggregation Methods

Again, we have different methods here:

  • Mean: average
  • Median: the number in the middle of the sorted values
  • Mode: the most often repeated number in our data set

The following example shows the calculation of mean, median, and mode with the data from a sorted and cleaned table.

data source value
4 3
2 4
10 4
5 4
7 5
1 6
6 6
8 7

 

Average = (3+4+4+4+5+6+6+7)/8 = 4.875

Median = (4+5)/2 = 4,5

Mode = 4

The mode can be considered as a majority voting. [1] However, it can be difficult to determine, if there is no single most value.

On-Chain Aggregation

In an on-chain aggregation scheme, all data sources send their data to a smart contract. To mitigate the last actor problem, oracles could introduce a commit-reveal scheme with a deposit that is burned if the data source doesn’t report in time.

Such an on-chain aggregation scheme is independent of a centralized authority. But it is expensive since every reporter has to create a transaction and pay for it. This becomes more severe if a commit-reveal scheme is used, and each reporter has to make at least two transactions.

On-chain aggregation

Off-Chain Aggregation

Oracles can decide to collect and aggregate data off-chain and only report the result to their smart contracts. The first way to do this is that the data sources report to the oracle. The data oracle cleans and aggregates the data. After it is finished, it sends the data to the smart contract.

Off-chain aggregation with central party.

The problem with this is that neither the reporter nor the data consumer can be sure whether its data got aggregated correctly unless the raw data is provided to the smart contract.

To avoid this uncertainty, reporters can coordinate themselves avoiding the oracle. To prove that the values are submitted correctly, they could use threshold signatures like Schnorr signatures. To create a threshold signature, you need at least k out of n participants. This makes the signature resilient against not responding data sources.

In this setup, reporters exchange their values and aggregate them to a value A. Then, every data source signs the value A. If a majority (k out of n) of data sources signed the same value A, it is considered as correct. In the next step, this value A is sent to the smart contract.

Off-chain aggregation without central party.

However, the tricky task is to provide an infrastructure where all oracles can exchange their values and signatures. And finally, one participant has to create a transaction and send the signature to the blockchain. This requires a centralized authority again.

Besides the issue regarding the communication infrastructure, freeloading is a problem too. Here, reporters can simply copy the data from other reporters and thus save data retrieval costs.

Categories
Posts

Central Bank Digital Currency (CBDC)

Definition and meaning: Central Bank Digital Currency (CBDC), also called digital fiat currency or digital base money, is a digital version of a fiat currency like Euro, Dollar or Yen issued by a central bank.

CBDC/digital base money is being issued by a central authority like a central bank or government. It is by law, a means of payment. A holder of a CBDC coin would have a claim on the central bank.

Categories
Posts Uncategorised

Blockchain Forks – Hardfork & Softfork explained

Definition: A fork is a split of the blockchain into two (or more) branches. This happens, if there are two blocks with the same height (or block number).

In this post we explain what forks in blockchain context are, how they can happen and what the result is.

Categories
All Posts Posts

Long Range Attack in Proof of Stake (PoS) Blockchains

Definition: A long range attack is an attack on the consensus model of a proof of stake blockchain. A block producer (or validator) tries to create an alternative chain starting from a long ago created common block. The long range into the past is necessary to circumvent possible penalties for forking. Long range attacks are explained mostly in conjunction with proof of stake, since this problem is prevalent here and closely related to the nothing at stake problem.

Categories
All Posts Posts

Nothing at stake in Proof of Stake (PoS)

Definition: Nothing at stake means that in a proof of stake (PoS) consensus algorithm rational validators (block producer) risk nothing (have nothing at stake) if they create blocks on different chains. As a consequence, the whole blockchain network won’t reach consensus on the longest chain.

Categories
All Posts Posts

List of blockchains by consensus, smart contracts and privacy

List of blockchains with information about consensus algorithm, privacy and smart contracts.

Blockchain Consensus Turing complete Smart Contracts privacy
Bitcoin PoW no (scripts) no
Bitcoin Gold PoW no (scripts) no
Bitcoin SV PoW no (scripts) no
BitShares DPoS yes no
Cardano PoS yes no
Dash PoW no yes
Dogecoin PoW no no
EOS DPoS yes no
Ethereum PoW yes no
Ethereum 2.0 PoS/PoW yes no
Ethereum Classic PoW yes no
Hyperledger Fabric yes no
IOTA PoW/TaPoW no no
Lisk DPoS no no
Litecoin PoW no (scripts) yes
Monero PoW no yes
NEO DPoS yes no
Peercoin PoS no (scripts) no
Qtum PoS yes no
Tezos PoS yes no
TRON DPoS yes no
Zcash PoW no yes

Links

Delegated Proof of Stake (DPoS)

Proof of Work (PoW)

Proof of Stake (PoS)

Categories
All Posts Posts

What is Delegated Proof of Stake (DPoS) – Explanation

Definition: Delegated Proof of Stake (DPoS) is a form of consensus algorithm, where voters vote for block producers (sometimes called witnesses) who then perform the block creation and enforce the consensus. In order to vote a voter needs funds. And the voting power is proportionally to the amount of tokens or coins a voter has (or stakes).

Categories
All Posts Posts

What is Prof of Stake (PoS)?

Explanation: Proof of Stake is a consensus model in cryptocurrencies. Block producers (miners) get selected according to the amount of coins they hold or stake. The more coins a miner holds the higher is his probability to get selected in order to validate a block.

Categories
All Posts Posts

How to Calculate a Bitcoin Block Hash Manually

In this post we describe how to calculate an verify a Bitcoin block hash manually. We need to follow five easy steps to calculate the block hash.

If you want to try it out, you can use our online bitcoin block hash calculator.

Input Values of a Bitcoin Block Hash

But before we start with the calculation we take a look at the input values.

The SHA-256 hash function takes six inputs in order to calculate the block hash.

Input Explanation
Version Tells us the block version number. Is also used to participate in polls about protocol changes
Previous Block Hash (hashPrevBlock) 256-bit hash of the previous block header
Merkle Root hash (hashMerkleRoot) 256-bit hash of the Merkle root. The Merkle root is based on all transactions in a block.
Time Block timestamp as seconds since 1970-01-01T00:00 UTC
Bits (difficulty) Target to be met by the block hash
Nonce 32-bit number used to find the correct block hash in PoW

Where to get the data from?

Blockchain data are public. If you install a node you will be able to request the data from your node.

But there are also public services called block explorers which provide the data. Here is a list of block explorers you can use:

  • com
  • https://www.blockchain.com/explorer

However, there is on caveat. The format in which the data are provided can be pretty messy. Some block explorers display the data in hex values others in decimal values. This requires a little cleanup before the actual calculation can begin. Some block explorers even give imprecise data like a truncated date without the seconds.

We take example data from the block number 600,000.

Input Value (original) Value in hexadecimal
Version 0x20000000 0x20000000
Previous Block Hash (hashPrevBlock) 00000000000000000003ecd827f336c6971f6f77a0b9fba362398dd867975645 00000000000000000003ecd827f336c6971f6f77a0b9fba362398dd867975645
Merkle Root hash (hashMerkleRoot) 66b7c4a1926b41ceb2e617ddae0067e7bfea42db502017fde5b695a50384ed26 66b7c4a1926b41ceb2e617ddae0067e7bfea42db502017fde5b695a50384ed26
Time 2019-10-19 02:04:21 as timestamp: 1571443461 0x5DAA5305
Bits (difficulty) 0x1715a35c 0x1715a35c
Nonce 0x3f93ada7 0x3f93ada7

Step 1: Converting into little-endian

All values need to be converted into little-endian hex format. Usually block explorers or hex calculators provide the data in big-endian. Little-endian means that the least significant byte is on the left side. (In big-endian it is on the right side)

big-endian to little-endian

Input Value in little-endian hexadecimal
Version 00000020
Previous Block Hash (hashPrevBlock) 45569767D88D3962A3FBB9A0776F1F97C636F327D8EC03000000000000000000
Merkle Root hash (hashMerkleRoot) 26ED8403A595B6E5FD172050DB42EABFE76700AEDD17E6B2CE416B92A1C4B766
Time 0x0553AA5D
Bits (difficulty) 0x5CA31517
Nonce 0xA7AD933F

Input data from Bitcoin block 600000.

Step 2: Concatenation

The next step is to concatenate all little-endian formatted hash values in the right order. The symbol || means concatenation. As result we receive our rawHex.

rawHex = Version || Previous Block Hash || Merkle Root hash || Time || Bits || Nonce

rawHex = 0000002045569767d88d3962a3fbb9a0776f1f97c636f327d8ec0300000000000000000026ed8403a595b6e5fd172050db42eabfe76700aedd17e6b2ce416b92a1c4b7660553aa5d5ca31517a7ad933f

Step 3: Convert into bits

The concatenated hash value now needs to be converted into binary format. Most hash calculators and libraries distinguish between string inputs and hexadecimal inputs.

Let us consider an example.

The string “2c” would have a binary representation of 0011001001100011. This is derived from the ASCII table.

Character 2 c
Binary ASCII representation 00110010 01100011

If we interpret “2c” as hexadecimal value its binary representation would be: 101100. And this is exactly what we need.

This step is usually done internal with many hash libraries. The resulting binary number of our rawHex would be too large to display here.

Step 4: Apply SHA-256 twice

On our bit representation of the rawHex we apply the SHA-256 now twice. This means, we first calculate the SHA-256 of the binary representation of our rawHex and then take the result (again as binary) and input it into our SHA-256 function again.

2 x SHA-256: 915fcd96d1c84298a8fbfb9c13a9f7b4760e9056683107000000000000000000

Step 5: Convert to little-endian

The last step is to convert the output of the last step into little endian hexadecimal format. Once this is done, we have your block hash.

00000000000000000007316856900e76b4f7a9139cfbfba89842c8d196cd5f91

Pitfalls when calculating a Bitcoin block hash

If you want to calculate or verify a Bitcoin block hash manually, you need to take care of the format of your input values.

Version: The version can be given as 1, 2, 0x00000002, 0x20000000, or in different values. If the version is given as decimal values like 1 or 2 it needs to be converted into a padded hexadecimal value first (0x00000002). If it is given as hex-value like this 0x20000000, it can be used as input value right away (it still needs to be converted to little-endian thought).

Date: make sure you use the correct date. Some browsers might give you wrong date values, because they apply the wrong time zone. And some block explorers cut off the seconds from their dates.

Further links

Another explanation can be found here and here.