The Oracle Problem

Why decentralizing everything is more difficult than it sounds

More and more companies, financial institutions, start-ups and news outlets have been making big claims about blockchain technology and its future applications. Large institutions want to be “with it” and tick their innovation checkboxes by setting up internal blockchain teams, or by partnering with crypto companies like Ripple, etc. Business savvy entrepreneurs are vying for the attention of investors and ICO money by claiming to solve every problem imaginable with blockchains.

Many of these efforts fail to grasp two fundamental principles:

1) Blockchains are very inefficient databases, that only lend themselves to specific industries like financial products, insurance markets, gambling, collectables, etc.

2) Blockchains exist in virtual space, and are only tenuously connected to the physical world

In this paper, we will focus on the second principle, since the first has been covered many times and hardly needs repeating.

The Chasm

A fact that few people appreciate is the vast divide between the virtual and physical world. That is, for information about the physical world to enter the digital space, someone or something has to supply it first. No algorithm can come up with the temperature in Paris today, or the weight of a blue whale, without that information being entered into the digital space by a person or machine. Conceptually, one can think of the digital and physical as two distinct entities that are only loosely connected by a system of links, bridging the gap between worlds.

Oracles are essentially algorithms that attempt to learn some fact, or access some piece of information from elsewhere. In the context of this paper, oracles refer to programs or smart contracts that obtain information from outside the blockchain, and feed it into the blockchain. In the case of decentralized flight insurance, for instance, the smart contract needs to know whether a given flight was actually on time or not. However, since the blockchain platform is decentralized, with no ultimate authority, who is to say which information is true, or where to get information from.

Especially in times of universal uncertainty and mistrust in the media, we need to be extremely careful how to obtain information, and how to verify that information, without resorting ultimately to trust in any one person or source. This problem grows with the stakes, which, in the insurance industry, or prediction markets can be huge. The question then becomes how to create an oracle that is reliable, and resistant to tampering or attempts to game it in order to extract some advantage or spread misinformation.

Simple Oracles

The easiest way to find out the current temperature in Paris, the outcome of an election, the price of an asset, or the status of a flight, is to consult a trusted source. For instance, a smart contract could make an API call to a meteorological website, or scour the New York Times for election result headlines, etc.

The problem with this approach is that a central source is required. While this may be relatively uncontentious for some topics, such as election results, it can be much more difficult for others, where the truth of the matter is not easily verifiable, or where a lot of money is at stake. Prediction markets are notoriously hampered by corruption and would therefore not benefit from centralized oracles. Concerns are not limited to the trustworthiness of the source, but also to the security of the transfer of information. Since a single source presents a clear target, it can be attacked more easily. A fake source might pose as the real deal, or a malicious party may bribe or corrupt the information, or launch a denial of service attack on the source.

There are several examples of bad news sources causing havoc in the stock and crypto markets, and increasingly the resulting chaos is due to algorithmic trading bots that cannot distinguish real from fake news based on their use of simple oracles. In 2013 a hacker infiltrated the Associated Press Twitter account and tweeted that President Obama was injured in an explosion at the White House. Even though it was quickly deleted, this single tweet caused a $130 billion loss in the stock market. Similarly, in 2017, Ethereum lost $4 billion because of a fake news story about Vitalik Buterin’s death in a car accident, showing that faulty sources can be very expensive.

A way of avoiding these problems is to consult several sources and trust that the median result is correct. If, for example, 50 news sites are monitored for election headlines, it is unlikely that the majority of them will run with a false story, be misinformed, or hacked. Another solution is to enlist services such as Oraclize and RealityKeys to provide DApps with API data, and other real-world facts about flights times, sports results and exchange rates, using cryptographic authenticity proofs.

Schelling Points

In the 1960s Thomas Schelling proposed a game-theoretic concept he called a focal point. Focal points are relevant in coordination games where the participants are unable to communicate. A famous example is a story of four university students who skipped out on a Monday class to spend an additional day skiing. Unfortunately, they missed an exam during this time, and subsequently concocted a story about how they attempted to attend the exam, when one of the tires blew, rendering them unable to make it in time. Since they were good students the professor gave them a chance, and let them retake the exam. On the day of the test, they were seated in separate rooms and the exam consisted of only one question: Which tire blew?

Since all tires are roughly equally likely to fail, all four answers are equilibrium solutions, which puts the students in a very difficult spot, because in truth, the tire failure was a made-up story. Criminal investigators have been using these and other techniques for decades, separating the suspects and asking detailed questions about their accounts. The idea is that there is simply no way to create a lie consistent enough to survive scrutiny if no communication between subjects is allowed.

What Schelling noticed is that there seem to be certain focal points dependant on context which people will gravitate towards. In the case of tire failure, the front right tire will be chosen more often than the others and is thus the “correct” choice. Another example is that of two strangers trying to meet up in a given city, without knowing exactly when and where, and without the ability to coordinate. Most people familiar with New York tend to suggest 12 noon, in front of Grand Central Terminal, while most Tokyo denizens will likely choose the Hachiko statue in front of Shibuya Station. While technically, all times and locations are equilibrium solutions regarding meeting places, these special places stand out as focal points.

Beauty Contests

Keynes proposed an experiment in the 1930s that let newspaper readers choose the six most attractive faces in a lineup of 100 pictures. The winners of the contest would be those that chose the faces that most other readers also believed to be the most attractive. While a simple strategy is for a reader to pick the faces she finds most appealing, a better strategy might be to disregard her own feelings, and focus on what she thinks everyone else will think. Thus, perhaps the faces with the most stand-out features could be focal points, or maybe readers will argue that blonde hair is typically considered most attractive. The observation is that there can be significant differences in outcome when participants chose based on what they think other participants will chose.

Several experiments have been conducted to show such differences. For instance, when asked to choose between a puppy, a kitten and several other baby animals based on cuteness, participants mostly decided on the kitten and puppy. When asked to chose the animal they thought most other participants would choose, the kitten overwhelmingly won, because most people are convinced that other people prefer kittens over other baby animals.

The German newspaper Spectrum der Wissenschaft gave out a $1000 prize in 1997 to the person who could guess a number between 0 and 100 that was closest to 2/3 of the average of all numbers submitted. From a game theory point of view, the equilibrium solution to this game is 0 for the following reasons.

If every participant chooses more or less random numbers, we should expect the average of all numbers to be 50. Thus, two thirds of 50 is 33, and we should choose 33 if we want to win. This is a choice of the first degree.

However, once we consider that everyone else is thinking along these lines, the actual average will be around 33, which means that the winning number will be 22. This is a choice of the second degree.

If everyone takes a second-degree choice into account, the third-degree choice will be 14. This repeats until everyone should rationally choose 0. In reality, the average person is not game theory savvy enough to fully appreciate the recursive structure of this scenario, and therefore will likely choose numbers at one of these focal points. If we assume the fallibility of people’s reasoning, we can predict the resulting distribution to be an amalgamation of nth degree choices. There are sophisticated ways of doing this, and behavioural game theory has come far in determining how many recursive steps people tend to take. One can model the likelihood of the nth step by a poisson distribution dependant on experimental data, for instance. However, we can achieve a rough picture by simply overlaying the frequencies of nth degree choices, as follows.

In fact, when the German publication disclosed both its own findings, as well as those of an identical Financial Times experiment, we see that our merged frequency prediction is not too bad. Of course, several participants chose more or less random numbers, or considered 99 and 100 to be of particular interest, but a large number of choices center around the aforementioned focal points of various degrees of choice.

The Truth as a Focal Point

As discussed above, we can see that people tend towards various focal points, depending on the context of the situation. Instead of posing a mathematical question, we can turn our attention to practical matters, and incentivize those who are closest to the median of overall choices.

Let us assume that someone has placed a large bet on the USD/Bitcoin exchange rate exceeding 1 million by the end of 2019, and placed their funds in a smart contract, which will pay him out 100 to 1, if the price of Bitcoin actually reaches or exceeds that amount. An oracle is necessary to determine if and when this happens, such that the smart contract can act. Since the price of Bitcoin is a large prediction market, let us assume that the distribution of votes is roughly normal (this is not a necessary assumption, but makes illustration easier). Every prediction within the interquartile range (IQR), that is, between the 25th and 75th percentile of sorted predictions is rewarded, creating an incentive to be as close to the median as possible. The most reliable way to achieve this is to vote in accordance to the true exchange rate, because that is the most obvious focal point around which everyone will likely gather.

Weaknesses of Focal Point Oracles

Simple Attack: Median Manipulation

As long as the majority of participants are being honest, we can see how the oracle will produce an accurate account of reality. Conversely, if the majority of participants are colluding and attempt to game the process, the oracle is thrown off. A way in which dishonest participants could game the system is by manipulating the median of results. If a smart contract needs to know the price of a certain good, or cryptocurrency, it can use a decentralized oracle which relies on votes in the above way. If we imagine a price of $100 being voted on for several votes in a row, and we reasonably expect the price to remain the same in the next election, then the variance of choices will likely be small. This is ideal, since the median will only have to be moved a little.

If a group of voters were colluding to push the median down it can help achieve their nefarious purpose. This could be either a buying opportunity on a decentralized exchange, or related to the terms of the relevant smart contract they are trying to trick.

While the median is quite resilient to change, something interesting happens to the IQR in this case. Now, all values between 0 and 100.19, are within 50% of the median, which means that this kind of manipulation can create downward vote incentives on a subsequent vote, especially if it is combined with an announcement attack described below.

Public Announcement Attack

Even in the absence of collusion, there are ways in which the oracle can be gamed. For instance, if a group of participants holding 5% of votes, publicly announces their vote ahead of time, others face a difficult choice, especially if it is not known how much voting power the announcers have. In fact, it could be in the interest of the announcers to spread misinformation about the size of their vote, or even lie about their announcement, making everything very messy. These considerations are salient when the subject matter is the price of a token, or contract.

Let us assume that there is an announcement claiming that 10% of votes will chose a price X in the next election. Since focal points are not about what the participant believes, but rather about what the participant believes other participants to believe, a conservative bet might be to vote for X as well, or a middle ground between X and previous prices, just to be safe. Even if the announcers were lying about their voting power, and in fact only control 1% of votes, price X could be within the new range of accepted results based on everyone else hedging their bets. In this particular example, this kind of strategy could exert upward or downward pressure on price, regardless of the forces of supply and demand.

Trying to censor these kinds of announcements could be a solution, but might prove to be very difficult. Censorship on the internet is not generally a proven policy, because it is almost impossible to enforce. Of course, any kind of censorship enforcement will bring back worries of centralizing the process, and the concern is that the censors themselves are corrupt. On the other hand, if voting announcements are heard all the time, and are all over the place, we can reasonably expect other voters to build up a tolerance, and ignore these dubious claims.

Incentivized Announcement Attack

A subset of announcement attacks are bribed announcements. If a villainous party decides to incentivize others by rewarding them for voting according to the announcement, we may end up with two focal points, in which case the median of results is affected. As Vitalik Buterin points out in his SchellingCoin article the incentives could be construed such that the announced focal point is in fact superior and will be chosen over the unaffected one. This kind of bribery would undoubtedly be very expensive, but could, in some cases pay off. It is conceivable that a malicious party need only affect the outcome of a single vote in order to trigger a smart contract to act in a desired way, such as prematurely, or incorrectly releasing billions of dollars to the cheaters. This is a very serious problem.

Strategy

If the previous price was voted at $100, and an announcement attack claims that 25% of the vote will choose a price of zero on the next round, then a relatively safe bet is to choose a middle value, between 50 and 100. 25% of the vote is enough to move the first quartile by half the distance between median and announced value. If, however, we do not know how much of the vote is controlled by the announcers, it becomes more complicated.

Since no one knows how anyone else will vote, and everyone votes the way they think everyone else will vote, we may end up in a beauty contest described above, with n degrees of choice, each pushing the IQR towards the announced vote. This depends on the incentive structure of the system and risk taking behaviour of voters. In order to make these choices we must assign probabilities to whether the announcement attackers have at least 25% of the vote, whether the rest of voters will choose to vote for a middle ground, or whether the rest of voters will choose to join the announcement attackers.

Below, we can see IQR ranges that consider every scenario in 5% increments, from no attackers, to 95% attackers vote. Assuming that attackers and other voters only have two choices, namely to vote with the attackers, or vote for the previous value, we can see that in 25% of cases, voting for the previous value is safe, in 20% of cases, voting for the attackers value is safe, and in 55% of cases, voting for a middle ground is safe. In this simplified model we do not know the probability associated with each choice, and thus must assume a uniform distribution of probabilities. Given these assumptions, voting for the middle value is likely the safest option. This is a choice of the first degree.

The problem does not stop here. Since everyone else is thinking along the same lines, a considerable number of voters will choose the middle value, creating choices of the second degree, in which a new middle value between the middle value and the attackers’ value is chosen. This, of course, leads to a recursion, with the equilibrium being the attackers value.

Whether or not this equilibrium is realistic depends mainly on two factors:

  1. How many steps voters can think ahead
  2. How likely voters are to believe announcers
  3. The incentive structure of the system

When considering incentives, punishing bad votes and rewarding good ones could drastically change the optimal strategy of this voting game. More importantly, we have only considered a scenario in which everyone within the IQR is rewarded equally. If we change the rules such that reward decreases as a function of distance from the mean, behaviour would likely be more conservative. In this case, the middle value is much less attractive.

Honesty

Worries regarding announcement attacks fail to factor in an important aspect of human behaviour: Most people are relatively honest, and cannot tolerate cheating and injustice. Some of the above assumptions are probably not accurate if we factor in human tendencies. For instance, it might not occur to many people to report on anything other than the truth, even if they would gain a slight advantage by not doing so. In that case, the probabilities of the above distribution might not be uniform.

Augur, the Ethereum based prediction market is a large game theory experiment and seems to confirm exactly this. On the Augur platform, participants stake their Reputation tokens (REP) on the outcome of certain events, betting on anything from sports results to elections to weather patterns. All participants in a given market are incentivized to report on the relevant outcomes, for doing so accurately will entitle them to a share in network fees. Falsely reporting on outcomes risks losing up to 20% of staked tokens. Even if there is a false consensus, there are some conflict resolution backstops that the injured parties can utilize in order to get justice.

Instead of re-centralizing the problem, Augur attempts to resolve its disputes in a decentralized way by allowing disputing users to stake REP against the markets tentative outcome. There are two possibilities: If enough REP is staked against a tentative outcome to meet the required dispute bond size (calculated based on stake in that particular market), the new result becomes the new tentative outcome. If the market is large enough such that 2.5% of all REP is staked on a dispute bond, the entire network goes into a fork stage in which it might split into two distinct universes; one in which the outcome is considered true, and another in which it is considered false.

The fact that Augur needs to resort to different universes shows how difficult the oracle problem can be. We can imagine one universe, according to the platform, in which Hillary won the election, and one in which Trump did, even though there is a clear fact of the matter.

The Double Spend Problem

While the price of gold or the outcomes of prediction markets can be considered real-world facts that pass into the blockchain space, there is set of items which will prove incredibly difficult to resolve in a decentralized way. DigixDAO, for instance, is putting gold on the blockchain. What this means is that they secure a certain amount of gold, go through audits and verifications and a host of other regulatory processes, and then create gold tokens on Ethereum for every ounce they actually hold in their vaults, similar to what Tether did with the USD.

The double spend problem arises because there is no permanent connection between an ERC-20 gold token and the actual gold in the DigixDAO vaults. That gold could be stolen, or sold secretly, in which case DigixDAO would have sold twice as much gold; a 100% profit. Clearly, there is a large incentive to do so, and this is precisely why they are motivated to centralize the oracle process, in order for regulators and auditors to anchor ERC-20 gold to the authority of such institutions.

There have been many efforts to bring real estate markets into the blockchain space, but they face similar problems. If a house is tokenized, and sold on the blockchain, the actual building is still in the real world and can be sold again, burned down for the insurance money, rented out, lived in, or be abandoned without this information passing onto the token, unless there is some authority that keeps tabs on real estate and updates the state of affairs regularly. How this problem can be solved in a decentralized way is largely unexplored territory.

Decentralized Autonomous Organizations (DAOs) could be a solution. We have already seen how a DAO, like Dash, can have real effects in the world by hiring contractors, sponsoring MMA fighters and investing in advertisements. It is foreseeable that an Oracle DAO could hire contractors to check on buildings, or even run its own real estate office, in exchange for network fees or mining rewards of the relevant blockchain real estate market.

This is a somewhat centralizing solution, because those contractors are points of weakness and can be bribed or corrupted to report falsely. If millions of dollars of real estate are at stake, this is not unlikely. Neither robotic contractors, nor smart houses are a good solution either, because both can be hacked or tricked, leaving us with an uneasy sense that this problem is insurmountable.

Conclusion

This paper has gone over some aspects of the oracle problem, and hopefully has illustrated why “decentralizing everything” may be more difficult and more dependant on compromises than many people expect. This is especially true for tokenized real-world assets, which suffer from the double-spend vulnerability. For decentralized oracles that are used in prediction markets we have discussed how announcement attacks may cause game-theoretically undesirable outcomes, in which the dominant strategy is not the optimal one people are hoping for. It will depend on the specifics of the incentive systems, and the ingenuity of the oracle’s design.

Tech Entrepreneur, here to share thoughts on technology, politics and other philosophical musings.