Blockchain Oracles Explained

Definition: A blockchain oracle (also called a crypto oracle) is a service that provides off-chain data to smart contracts running on a blockchain.

Why are Blockchain Oracles Necessary?

A smart contract running on a blockchain cannot access real-world data on its own. This limits the potential of blockchain-based decentralized services. For example, insurance smart contracts need information about damage events, finance smart contracts need price information from exchanges, and supply chain smart contracts need to be informed about the condition and location of a container.

The problem is that block producers cannot access arbitrary data from outside the blockchain. They could not reliably provide the temperature in Berlin on Friday at 2 pm or tell when a shipping container crosses the border. To reach consensus, every block producer would have to verify the correctness of such data independently, which is not feasible.

This is where oracles come into play. Their task is to provide off-chain data to on-chain smart contracts.

In this article, we explain what blockchain oracles are and how they work.

Types of Oracles – Oracle Architecture

Blockchain oracles can be categorized in different ways. The following sections list possible design patterns for oracles. These design patterns need to be taken into account when planning the oracle architecture.

Data source

Human oracles: Here, humans enter the data directly into the oracle. They are good at answering specific questions whose answers are not publicly available.

Software oracles: Here, the data comes from software. This could be a calculation result, a website, or a random number.

Sensor/hardware oracles: Here, the data comes from a hardware device. These could be temperature sensors, tachometers, rain sensors, RFID chips, and so on.

Availability

Another way to distinguish data sources is by whether they are public or private. In practice, there are many gradations between the two.

Data from a website are usually available to everyone who has internet access. Stock market data, however, are more private. If you want to access them, you need to register with the stock exchange and probably pay a fee, particularly if you want to receive them in real-time.

If you own a fitness tracker linked to a smart contract, your fitness data is only accessible to you and the fitness tracker producer. Third parties have no access to these data (unless someone sells them).

Automation

Oracles can collect data manually or automatically. Human oracles can collect unstructured data or data that occur only rarely. These could be the outcome of a government election or the result of a football match. But when it comes to well-structured data that are created repeatedly, automated oracles are more efficient. They collect data continuously and send them to the blockchain either on demand or constantly.

Data access

Another architecture decision to be made is when to request the data from the data source. There are basically three ways to do it:

    1. Immediate Read: The data is stored in the oracle contract, and users can read these data at any time by requesting them from the contract. The oracle might update the data occasionally.
    2. Publish-Subscribe: An oracle fetches data regularly from the data source. The user actively polls the oracle, or the oracle notifies the user as soon as new data is available.
    3. Request-Response: If a user needs some data, it sends a request to the oracle. The oracle then tries to find a data source, retrieves the data from there, and forwards it to the user.
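
The three access patterns can be sketched in a few lines of Python. This is only an illustration of the control flow, not an on-chain implementation: the class names, the `fetch` callback, and the key names are assumptions made for this example, and the publish-subscribe variant shown here uses notification rather than polling.

```python
from typing import Callable, Dict, List


class ImmediateReadOracle:
    """Immediate read: the latest value is stored and can be read at any time."""

    def __init__(self) -> None:
        self._store: Dict[str, float] = {}

    def update(self, key: str, value: float) -> None:
        # The oracle operator refreshes the stored value occasionally.
        self._store[key] = value

    def read(self, key: str) -> float:
        return self._store[key]


class PublishSubscribeOracle:
    """Publish-subscribe: subscribers are notified when new data arrives."""

    def __init__(self) -> None:
        self._subscribers: List[Callable[[str, float], None]] = []

    def subscribe(self, callback: Callable[[str, float], None]) -> None:
        self._subscribers.append(callback)

    def publish(self, key: str, value: float) -> None:
        for callback in self._subscribers:
            callback(key, value)


class RequestResponseOracle:
    """Request-response: data is fetched from the source only when requested."""

    def __init__(self, fetch: Callable[[str], float]) -> None:
        # `fetch` stands in for the lookup at the external data source.
        self._fetch = fetch

    def request(self, key: str) -> float:
        return self._fetch(key)
```

The main trade-off is who pays for freshness: with immediate read the consumer may see stale data, while request-response always queries the source but makes the consumer wait for the answer.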

Data verification

Oracles can provide sensitive data, so the data they deliver must be reliable. Since an oracle cannot blindly trust a single reporter or data source, it needs some way to verify the data. The choice of verification method strongly influences the architecture of the whole oracle service.

The basic design patterns are:

  • No data verification
  • Trusted sources and Authentication proofs
  • Reputation/Dispute

Degree of centralization

Another category is the degree of centralization. Centralized oracles consist of one node that collects and aggregates all data. With decentralized oracles, data collection and aggregation are done on-chain in a smart contract or in another peer-to-peer network.

Off-chain data aggregation
On-chain data aggregation

How Oracles Work

The basic working principle of a blockchain oracle is as follows:

  1. Event occurs
  2. Oracle collects data about this event
  3. Oracle processes data (cleansing, aggregating, etc.)
  4. Oracle submits data to smart contract
  5. Smart contract acts upon the data

Oracle workflow
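
The workflow above can be sketched as a small Python pipeline. Everything here is a simplified assumption for illustration: data sources are modeled as plain functions, cleansing is a plausibility range check, aggregation is the median, and `ThermostatContract` is a hypothetical stand-in for the consuming smart contract.

```python
from statistics import median
from typing import Callable, List


def collect(sources: List[Callable[[], float]]) -> List[float]:
    """Step 2: query every data source once."""
    return [source() for source in sources]


def process(raw: List[float], low: float, high: float) -> float:
    """Step 3: drop implausible readings, then aggregate with the median."""
    cleaned = [value for value in raw if low <= value <= high]
    if not cleaned:
        raise ValueError("no plausible readings left after cleansing")
    return median(cleaned)


class ThermostatContract:
    """Hypothetical consumer that acts on the submitted value (step 5)."""

    def __init__(self, threshold: float) -> None:
        self.threshold = threshold
        self.heating_on = False

    def submit(self, value: float) -> None:
        self.heating_on = value < self.threshold


def run_oracle_round(sources, contract, low: float, high: float) -> float:
    """Steps 2 to 4: collect, process, and submit one value."""
    value = process(collect(sources), low, high)
    contract.submit(value)
    return value
```

With three sources returning 20.0, 21.0, and an obviously broken 999.0, and a plausibility range of -50 to 60, the broken reading is discarded and the median of the remaining two values is submitted.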

Typically, several actors are involved in an oracle. Here, we have a look at the actors and their roles. In some real-world cases, the distinction between the individual roles is not always clear.

Data source: This is where the data comes from. It could be a website, a sensor, an API, a crypto exchange, a weather station, etc.

Reporter: Reporters are nodes that retrieve raw data from data sources and extract the necessary information from it. After this processing, they provide the necessary information to the oracle service.

Oracle service: Oracle services are nodes that receive data from one or more reporters. They can optionally conduct sanity checks before aggregating the data. In the last step, they submit the data to the smart contract. The oracle service can itself be a smart contract.

Customer: Typically, this is a smart contract that needs the data to make a decision.

The data source and the reporter may be the same entity. The same holds for oracle services and reporters, and in some cases even data sources.

Schematic representation of the oracle process.

Data Sources

In theory, there is no limit to where data can come from. Possible data sources are:

  • websites
  • APIs
  • humans
  • IPFS
  • sensors
  • algorithms
  • hardware (secured hardware like Intel’s SGX)

Possible data are:

  • weather data
  • stock exchange data (token prices)
  • access data of smart locks
  • fitness data
  • mathematical calculations (like WolframAlpha)
  • traffic data
  • public government data
  • random numbers

As these lists show, the range of possible data sources and data types is vast.

Sources of Errors

In this section, we explain sources of errors that can occur in blockchain oracles. Errors can occur at every stage of the reporting process, and each source of errors can also be exploited for attacks. In general, these possible errors lead to the so-called oracle problem.

Faulty Data Sources

If data sources provide wrong data, reporters pick them up and submit them to the oracle. This is no problem if there are many independent data sources since the wrong data could be filtered out in the cleaning process. But if there are not enough independent data sources, the error can make its way into the smart contract. The following illustration shows how one data source can influence the result.

Correlations between erroneous data are possible.

In this setup, data sources 2, 3, and 4 retrieve their data from data source 1. For the reporters, however, they appear independent. Reporters A, B, and C rely on those data and submit them to the oracle. The data coming from reporter D would then be considered wrong, even if reporter D is the only one reporting the correct value.
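
A minimal sketch of this effect, assuming the oracle aggregates with the median and flags reports that deviate from it by more than a tolerance. The function name, the tolerance, and the numbers are made up for illustration; the source does not specify an aggregation rule.

```python
from statistics import median
from typing import Dict, List, Tuple


def aggregate(reports: Dict[str, float], tolerance: float) -> Tuple[float, List[str]]:
    """Return the median of all reports and the reporters flagged as outliers."""
    agreed = median(reports.values())
    outliers = [
        name for name, value in reports.items()
        if abs(value - agreed) > tolerance
    ]
    return agreed, outliers


# Assumed scenario: the true value is 100, but data source 1 is faulty and
# reports 150. Reporters A, B, and C all inherit that error through sources
# 2, 3, and 4; reporter D uses a source that is actually independent.
reports = {"A": 150.0, "B": 150.0, "C": 150.0, "D": 100.0}
agreed, outliers = aggregate(reports, tolerance=5.0)
```

In this scenario the aggregated value is the faulty 150, and the only correct reporter, D, is the one flagged as an outlier. Majority-style aggregation only helps when the reports are truly independent.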

Faulty Reporters

Reporters can also contribute to wrong data. This can happen intentionally or unintentionally. Unintentional errors are, for example, rounding errors or parsing errors.

But reporters can also manipulate data intentionally. To increase the impact, they could launch a Sybil attack. Here, the malicious reporter sets up many fake reporter nodes that appear to be independent but are in fact controlled by the same entity. With enough fake nodes, the attacker can influence the aggregated result. This works as long as the reporters are anonymous. Sybil attacks can only be prevented if reporters have to reveal their real identity or have to lodge a deposit. Deposits, however, have the disadvantage that richer players can afford to register more reporters than poorer players.

Freeloading

Freeloading means that a reporter can observe the data provided by other reporters and copy them. This would create two problems:

  1. Correlation of data: This weakens the security because the diversity of data sources is reduced.
  2. Saving costs on the back of honest reporters: The freeloading reporter would save expenses for querying the data source.

Freeloading can be solved with a commit-reveal scheme. In the first step, each reporter creates a hash from its value concatenated with a random string (salt) and submits that hash to the oracle contract.

After every reporter has committed to a value, the values and salts are revealed.

The oracle contract then hashes each revealed value-salt combination and compares the result with the committed hash. If a reporter reveals a different value-salt combination than the one it committed to in step one, it loses its entire deposit. The same holds if the reporter doesn’t reveal at all. This ensures that nobody can copy and use the data for free in the actual reporting round.
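
The two phases can be sketched with Python's standard library. The choice of SHA-256, the string encoding of the value, and the function names are assumptions made for this example; a real oracle contract would use its platform's hash function and handle deposits on-chain.

```python
import hashlib
import hmac
import secrets


def new_salt() -> str:
    # 16 random bytes as 32 hex characters. The fixed length keeps the
    # value-salt concatenation unambiguous.
    return secrets.token_hex(16)


def commit(value: str, salt: str) -> str:
    """Commit phase: hash of the value concatenated with the salt."""
    return hashlib.sha256((value + salt).encode("utf-8")).hexdigest()


def verify_reveal(commitment: str, value: str, salt: str) -> bool:
    """Reveal phase: recompute the hash and compare it with the commitment."""
    return hmac.compare_digest(commitment, commit(value, salt))


# A reporter commits to a value without disclosing it ...
salt = new_salt()
commitment = commit("101.5", salt)

# ... and later reveals value and salt. A changed value fails verification.
honest_reveal_ok = verify_reveal(commitment, "101.5", salt)
changed_reveal_ok = verify_reveal(commitment, "102.0", salt)
```

The salt matters because reported values often come from a small range. Without it, a freeloader could hash every plausible value and match the result against the published commitments.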

But once the revealed data is written to the blockchain, it is public and can be seen by everyone at no cost. This makes it difficult for oracles to get paid for their work.

Misaligned Incentives

Oracles, data sources, and reporters can take bribes from users and submit wrong data. Oracles could also become users themselves. In this case, they could, for example, influence the outcome of a bet in their favor.

If we assume rational oracles, they try to maximize their reward. As long as an oracle is honest, it will be chosen for future tasks and get paid. As soon as the oracle lies, it loses its entire reputation and is out of business. It cannot expect any further payments from the oracle business in the future.

The reward for being honest must be higher than the reward for cheating. This can be calculated with the net present value.

Let us assume that the oracle receives a payment of 1 for each report, that the interest rate is 10 %, and that the time frame is infinite. We can then calculate the net present value as follows:

NPV = R/i

NPV: Net present value

R: periodic reward

i: interest rate

NPV = 1/0.1 = 10

In our calculation example, the oracle would have no incentive to cheat as long as the reward for cheating is lower than 10.
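
The formula NPV = R/i is the value of a perpetuity, that is, the limit of the discounted sum R/(1+i) + R/(1+i)^2 + ... with the first payment arriving one period from now. The following sketch checks the example numerically. The function names are chosen for illustration, and the cheating reward is treated as a one-off gain that ends all future payments, as described above.

```python
def perpetuity_npv(reward: float, rate: float) -> float:
    """Closed form NPV = R / i for an infinite stream of equal payments."""
    if rate <= 0:
        raise ValueError("the interest rate must be positive")
    return reward / rate


def discounted_sum(reward: float, rate: float, periods: int) -> float:
    """Finite discounted sum; approaches R / i as the number of periods grows."""
    return sum(reward / (1 + rate) ** t for t in range(1, periods + 1))


def honesty_pays(reward: float, rate: float, cheating_reward: float) -> bool:
    """True if the value of staying honest exceeds the one-off cheating reward."""
    return cheating_reward < perpetuity_npv(reward, rate)


npv = perpetuity_npv(1.0, 0.1)            # value of staying honest forever
approx = discounted_sum(1.0, 0.1, 1000)   # numerical check of the closed form
```

With R = 1 and i = 0.1, both calculations give a value of 10 (the finite sum only up to rounding). A cheating reward of 9 is therefore not worth it, while a reward of 11 would be.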

Use Cases and Applications for Blockchain Oracles

There are many use cases for off-chain data in smart contracts. Here, we give a brief overview of some oracle applications:

  • Data markets: A data market simply provides (sells) data. It doesn’t care what it is used for or who buys it. Data can be customer data, weather data, economic data, machine operating parameters, etc. An oracle that acts as a data market could help to increase the trustworthiness of the provided data by verifying it.
  • Supply chain: Oracles could improve the data quality in a supply chain and thus help blockchain-based supply chains achieve mass-market adoption. Applications could be customs declarations, location services that automatically release the payment as soon as the shipment reaches its destination, transport insurance, or proof of the origin of raw materials and goods.
  • Insurance: A reliable data source is crucial for insurance. If the data is reliable and unambiguous, a smart contract could calculate insurance premiums and release the insured amount automatically when a loss occurs.
  • Decentralized finance: Here, smart contracts mostly need buy and sell rates from other markets in order to make sound pricing decisions. But data is also crucial for assessing a customer's credit risk.
  • Prediction markets: When resolving a prediction market, off-chain data about the outcome of the predicted event is necessary. Oracles can provide the event outcome.
  • Decentralized energy: With solar panels, biogas plants, wind power stations, and other modern decentralized electricity producers, new settlement schemes become vital. This gets even more important when we recognize that former consumers turn into so-called prosumers (producer + consumer). When there is plenty of wind or sunshine, they feed electricity into the network, and during calm periods or at night, they consume electricity. If the settlement is done on a blockchain, the amount of consumed and produced electricity needs to be recorded and made available to the underlying smart contract. This is a use case for an oracle.
  • Gaming: Buying virtual goods inside a game and trading them on a blockchain, or using location data to unlock a new level independently of the game publisher, requires a connection to off-chain events.
  • Cross-chain communication: Another application of oracles can be cross-chain communication. Since blockchains cannot easily verify transactions on other blockchains, they need a service that provides relevant information. Oracles can collect data from one blockchain and submit it to other blockchains. The correctness of the data can be verified cryptographically.
  • Others: There are plenty more use-cases and applications of oracles like car rental, housing, accommodation booking, etc.
