Crypto and the Latency Arms Race: Crypto Exchanges and the HFT Crowd

News by CoinDesk: Max Boonen
Carrying on from an earlier post about the evolution of high frequency trading (HFT), how it can harm markets and how crypto exchanges are responding, here we focus on the potential longer-term impact on the crypto ecosystem.
First, though, we need to focus on the state of HFT in a broader context.

Conventional markets are adopting anti-latency arbitrage mechanisms

In conventional markets, latency arbitrage has increased toxicity on lit venues and pushed trading volumes over-the-counter or into dark pools. In Europe, dark liquidity has increased in spite of efforts by regulators to clamp down on it. In some markets, regulation has actually contributed to this. Per the SEC:
“Using the Nasdaq market as a proxy, [Regulation] NMS did not seem to succeed in its mission to increase the display of limit orders in the marketplace. We have seen an increase in dark liquidity, smaller trade sizes, similar trading volumes, and a larger number of “small” venues.”
Why is non-lit execution remaining or becoming more successful in spite of its lower transparency? In its 2014 paper, BlackRock came out in favour of dark pools in the context of best execution requirements. It also lamented message congestion and cautioned against increasing tick sizes, both features that advantage latency arbitrageurs. (This echoes a comment made to CoinDesk by David Weisberger, CEO of CoinRoutes, who explained that the tick sizes typical of the crypto market are small and therefore do not put slower traders at much of a disadvantage.)
Major venues now recognize that the speed race threatens their business model in some markets, as it pushes those “slow” market makers with risk-absorbing capacity to provide liquidity to the likes of BlackRock off-exchange. Eurex has responded by implementing anti-latency arbitrage (ALA) mechanisms in options:
“Right now, a lot of liquidity providers need to invest more into technology in order to protect themselves against other, very fast liquidity providers, than they can invest in their pricing for the end client. The end result of this is a certain imbalance, where we have a few very sophisticated liquidity providers that are very active in the order book and then a lot of liquidity providers that have the ability to provide prices to end clients, but are tending to do so more away from the order book”, commented Jonas Ullmann, Eurex’s head of market functionality. Such views are increasingly supported by academic research.
XTX identifies two categories of ALA mechanisms: policy-based and technology-based. Policy-based ALA refers to a venue simply deciding that latency arbitrageurs are not allowed to trade on it. Alternative venues to exchanges (going under various acronyms such as ECN, ATS or MTF) can allow traders to either take or make, but not engage in both activities. Others can purposefully select — and advertise — their mix of market participants, or allow users to trade in separate “rooms” where undesired firms are excluded. The rise of “alternative microstructures” is mostly evidenced in crypto by the surge in electronic OTC trading, where traders can receive better prices than on exchange.
Technology-based ALA encompasses delays, random or deterministic, added to an exchange’s matching engine to reduce the viability of latency arbitrage strategies. The classic example is a speed bump where new orders are delayed by a few milliseconds, but the cancellation of existing orders is not. This lets market makers place fresh quotes at the new prevailing market price without being run over by latency arbitrageurs.
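To make the asymmetry concrete, here is a minimal sketch of a deterministic speed bump, assuming a toy engine that simply processes messages in order of their effective timestamp. The function name `processing_order`, the 3 ms bump and the message timings are all hypothetical, chosen for illustration rather than taken from any real venue.

```python
def processing_order(messages, bump_ms):
    """Return (kind, sender) pairs in the order a toy engine would process them.

    messages: list of (arrival_ms, kind, sender), where kind is "new" or "cancel".
    New orders are held back by bump_ms; cancellations are not delayed.
    """
    stamped = []
    for seq, (arrival_ms, kind, sender) in enumerate(messages):
        delay = bump_ms if kind == "new" else 0.0
        # seq breaks ties so that equal timestamps keep arrival order
        stamped.append((arrival_ms + delay, seq, kind, sender))
    return [(kind, sender) for _, _, kind, sender in sorted(stamped)]


# Hypothetical race: the arbitrageur's order reaches the venue 1 ms
# before the market maker's cancellation of a stale quote.
messages = [
    (10.0, "new", "latency_arbitrageur"),
    (11.0, "cancel", "market_maker"),
]

without_bump = processing_order(messages, bump_ms=0.0)
with_bump = processing_order(messages, bump_ms=3.0)

print(without_bump)  # the aggressive order is processed first
print(with_bump)     # the cancellation is processed first
```

In this toy race the bump only needs to exceed the arbitrageur's head start for the stale quote to be pulled in time, which is why the size of the delay matters.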
As a practical example, the London Metal Exchange recently announced an eight-millisecond speed bump on some contracts that are prime candidates for latency arbitrageurs due to their similarity to products trading on the much bigger CME in Chicago.
Why 8 milliseconds? First, microwave transmission between Chicago and the US East Coast is 3 milliseconds faster than fibre optic lines. From there, the $250,000 a month Hibernia Express transatlantic cable helps you get to London another 4 milliseconds faster than cheaper alternatives. Add a millisecond for internal latencies, such as those incurred by not using FPGAs, and the total is 8 milliseconds. That is the gap a liquidity provider would otherwise have to close by investing tens of millions in speed technology, or else be priced out of the market by latency arbitrage.
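The arithmetic can be laid out as a simple tally. The figures are the ones quoted above; the dictionary keys and variable names are only illustrative labels.

```python
# Latency advantage, in milliseconds, of the fastest route from Chicago to London
# compared with a liquidity provider using cheaper infrastructure.
latency_edge_ms = {
    "microwave vs fibre, Chicago to US East Coast": 3,
    "Hibernia Express vs cheaper transatlantic cable": 4,
    "internal latencies (e.g. not using FPGAs)": 1,
}

total_edge_ms = sum(latency_edge_ms.values())

for leg, edge in latency_edge_ms.items():
    print(f"{leg}: {edge} ms")
print(f"total: {total_edge_ms} ms")
```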
With this in mind, let’s consider what the future holds for crypto.

Crypto exchanges must not forget their retail roots

We learn from conventional markets that liquidity benefits from a diverse base of market makers with risk-absorption capacity.
Some have claimed that the spread compression witnessed in the bitcoin market since 2017 is due to electronification. Instead, I posit that greater risk-absorbing capacity and capital allocation, not an increase in speed, have improved the liquidity of the bitcoin market. Indeed, being a fast exchange with colocation has not brought higher volumes to a venue such as Gemini. Old-timers will remember Coinsetter, a company that, per the Bitcoin Wiki, “was created in 2012, and operates a bitcoin exchange and ECN. Coinsetter’s CSX trading technology enables millisecond trade execution times and offers one of the fastest API data streams in the industry.” The Wiki page should use the past tense, as Coinsetter failed to gain traction, was acquired in 2016 and subsequently closed.
Exchanges that invest in scalability and user experience will thrive (BitMEX comes to mind). Crypto exchanges that favour the fastest traders (by reducing jitter, etc.) will find that winner-takes-all latency strategies do not improve liquidity. Furthermore, they risk antagonising the majority of their users, who are naturally suspicious of platforms that sell preferential treatment.
It is baffling that Huobi’s head of Russia boasted to CoinDesk that “The option [of co-location] allows [selected clients] to make trades 70 to 100 times faster than other users”. The article notes that Huobi doesn’t charge for the service but, of course, not everyone can sign up.
Contrast this with one of the most successful exchanges today: Binance. It actively discourages some HFT strategies by tracking metrics such as order-to-trade ratios and temporarily blocking users that breach certain limits. Market experts know that Binance remains extremely relevant to price discovery, irrespective of its focus on a less professional user base.
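As a rough illustration of how an order-to-trade check might work, here is a minimal sketch. It is not Binance's actual rule: the class `UserActivity`, the function `should_block` and both thresholds are hypothetical, and a real exchange would likely measure activity over rolling time windows.

```python
from dataclasses import dataclass

# Hypothetical limits, chosen only for illustration.
MAX_ORDER_TO_TRADE_RATIO = 100.0
MIN_ORDERS_BEFORE_CHECK = 50


@dataclass
class UserActivity:
    orders_sent: int = 0
    trades_done: int = 0

    def order_to_trade_ratio(self) -> float:
        """Orders submitted per executed trade (infinite if nothing traded)."""
        if self.trades_done == 0:
            return float("inf") if self.orders_sent else 0.0
        return self.orders_sent / self.trades_done


def should_block(activity: UserActivity,
                 max_ratio: float = MAX_ORDER_TO_TRADE_RATIO,
                 min_orders: int = MIN_ORDERS_BEFORE_CHECK) -> bool:
    """Flag a user whose ratio breaches the limit, ignoring very low activity."""
    if activity.orders_sent < min_orders:
        return False
    return activity.order_to_trade_ratio() > max_ratio


retail_user = UserActivity(orders_sent=200, trades_done=40)       # ratio 5
heavy_quoter = UserActivity(orders_sent=50_000, trades_done=10)   # ratio 5,000
new_user = UserActivity(orders_sent=10, trades_done=0)            # too few orders to judge

print(should_block(retail_user), should_block(heavy_quoter), should_block(new_user))
```

The minimum-order floor is a design choice in this sketch so that a newcomer who places a handful of unfilled orders is not penalised.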
Other exchanges, take heed.
Coinbase closed its entire Chicago office, where 30 engineers had worked on a faster matching engine, an exercise that is rumoured to have cost $50 million. I bet that, after much internal debate, the company finally realised that it wouldn’t recoup its investment and that its value derived from having onboarded 20 million users, not from upgrading systems that are already fast and reliable by the standards of crypto.
It is also unsurprising that Kraken’s Steve Hunt, a veteran of low-latency torchbearer Jump Trading, commented to CoinDesk that: “We want all customers regardless of size or scale to have equal access to our marketplace”. Experience speaks.
In a recent article on CoinDesk, Matt Trudeau of ErisX points to the lower reliability of cloud-based services compared to dedicated, co-located and cross-connected gateways. That much is true. Web-based technology puts the emphasis on serving the greatest number of users concurrently, not on serving a subset of users deterministically and at the lowest latency possible. That is the point. Crypto might be the only asset class that is accessible directly to end users with a low number of intermediaries, precisely because of the crypto ethos and how the industry evolved. It is cheaper to buy $500 of bitcoin than it is to buy $500 of Microsoft shares.
Trudeau further remarks that official, paid-for co-location is better than what he pejoratively calls “unsanctioned colocation,” namely the fact that crypto traders can place their servers with the same cloud providers as the exchanges. The fairness argument is dubious: anyone with $50 can set up an AWS account and run next to the major crypto exchanges, whereas cheap co-location starts at $1,000 a month in the real world. No wonder “speed technology revenues” are estimated at $1 billion for the major U.S. equity exchanges.
For a crypto exchange, to reside in a financial, non-cloud data centre with state-of-the-art network latencies might ironically impair the likelihood of success. The risk is that such an exchange becomes dominated on the taker side by the handful of players that already own or pay for the fastest communication routes between major financial data centres such as Equinix and the CME in Chicago, where bitcoin futures are traded. This might reduce liquidity on the exchange because a significant proportion of the crypto market’s risk-absorption capacity is coming from crypto-centric funds that do not have the scale to operate low-latency strategies, but might make up the bulk of the liquidity on, say, Binance. Such mom-and-pop liquidity providers might therefore shun an exchange that caters to larger players as a priority.

Exchanges risk losing market share to OTC liquidity providers

While voice trading in crypto has run its course, a major contribution to the market’s increase in liquidity circa 2017–2018 was the risk appetite of the original OTC voice desks such as Cumberland Mining and Circle.
Automation really shines in bringing together risk-absorbing capacity tailored to each client (which is impossible on anonymous exchanges) with seamless electronic execution. In contrast, latency-sensitive venues can see liquidity evaporate in periods of stress, as happened to a well-known and otherwise successful exchange on 26 June which saw its bitcoin order book become $1,000 wide for an extended period of time as liquidity providers turned their systems off. The problem is compounded by the general unavailability of credit on cash exchanges, an issue that the OTC market’s settlement model avoids.
As the crypto market matures, the business model of today’s major cash exchanges will come under pressure. In the past decade, the FX market has shown that retail traders benefit from better liquidity when they trade through different channels than institutional speculators. Systematic internalizers demonstrate the same in equities. This fact of life will apply to crypto. Exchanges have to pick a side: either cater to retail (or retail-driven intermediaries) or court HFTs.
Now that an aggregator like Tagomi runs transaction cost analysis for its clients, it will become plainly obvious to investors with medium-term and long-term horizons (i.e. anyone not looking at the next 2 seconds) that their price impact on exchange is worse than against electronic OTC liquidity providers.
Today, exchange fee structures are awkward because they must charge small users a lot to make up for crypto’s exceptionally high compliance and onboarding costs. Onboarding a single, small-value user simply does not make sense unless fees are quite elevated. Exchanges end up over-charging large-volume traders such as B2C2’s clients, which is another incentive to switch to OTC execution.
Alternatively, what if crypto exchanges focus on HFT firms? In my opinion, the CME is a much better venue for institutional takers, as fees are much lower and conventional trading firms will already be connected to it. My hypothesis is that most exchanges will not be able to compete with the CME for fast traders (after all, the CBOE itself gave up), and must cater to their retail user base instead.
In a future post, we will explore other microstructures beyond all-to-all exchanges and bilateral OTC trading.
