The only drawback is the limitation to 1-minute resolution. Still good enough if you are not doing HFT.
If you are doing HFT, they had packages like "all you can eat" feeds. It was around $500/month for the BTCUSD pair on any 5 of the 75 supported exchanges, with different pricing for different latency offers (going all the way to colo!), using a custom client and software integration (for an SLA on latency targets).
I'd have to ask if they now provide historical data.
If you have precise symbols, date ranges and exchanges you're interested in, get in touch and I'll work something out so it's not too taxing on your budget.
Hey this looks really well put together, nicely done! Maintaining a connection to all these exchanges, and managing that data, is no easy task.
A few random questions:
1. Can you speak more to the "synchronized clock" part? Do you augment the data feeds' timestamps with your own?
2. What kind of database did you choose for this? How much data are you managing?
Also (promise you'll give me a discount if I tell you this?), this seems underpriced. I haven't seen anyone sell truly high-resolution depth data going months back yet. For BitMEX, no less. But maybe I'm out of the loop. Anyway, congrats on the launch!
1. Yes, every message also gets a local timestamp (100 ns precision) in addition to the exchange's own - see the sketch after this list.
2. Currently it's around 4-5 TB of compressed data (so 25-35 TB uncompressed - I'd need to check to be sure).
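For illustration, capturing looks roughly like this (a simplified Python sketch - the actual wrapping format and field names in the feed may differ):

    import json
    import time

    def augment(raw_msg: bytes) -> dict:
        # Take the local clock reading in nanoseconds, truncated to 100 ns,
        # before any parsing so it reflects arrival time, not processing time.
        local_ns = time.time_ns() // 100 * 100
        return {
            "localTimestamp": local_ns,      # local receive time, 100 ns precision
            "message": json.loads(raw_msg),  # original exchange payload, untouched
        }

The exchange's own timestamps stay intact inside the payload, so you can always compare them against the local receive time.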
Indeed, pricing is supposed to be very affordable as it's targeted at independent algo traders, so without spending huge amounts of $$$ they can have good data to backtest on a professional level (if you can call crypto trading professional, as some argue). Happy to provide you with the discount - please get in touch with me via email if interested.
Thanks for the details. Flat files in the cloud and heavy caching on the client sounds like the most effective way to go about it to me too. I work at a company that collects data like this for the traditional derivatives markets, and fwiw about half of our data is in S3 too. The other half is in Dynamo.
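The pattern I have in mind, as a rough Python sketch (bucket layout and cache directory are invented for illustration):

    import os
    import boto3

    CACHE_DIR = os.path.expanduser("~/.marketdata-cache")  # hypothetical location

    def fetch(bucket: str, key: str) -> str:
        # Return a local path for an S3 flat file, downloading only on a cache
        # miss, so repeated backtests over the same range never re-download.
        local_path = os.path.join(CACHE_DIR, bucket, key)
        if not os.path.exists(local_path):
            os.makedirs(os.path.dirname(local_path), exist_ok=True)
            boto3.client("s3").download_file(bucket, key, local_path)
        return local_path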
I appreciate you making this accessible to indie algo traders, it's def what we need. Will keep an eye on this.
This is a cool service. I worked as a researcher for a trading firm and we had similar internal tools.
Some hopefully not too harsh feedback:
You're capturing data in London. I know nothing about crypto markets, but they probably aren't all colocated in London, and your users won't be either. You should try to collect data at each source, synchronize it well, and let users adjust timings to suit their needs.
Data integrity is critical. You have incidentReports in your API, but I didn't see what goes in there. Ideally, make this machine-readable (begin/end timestamps for each incident interval) or flag the data as good/bad to the user as they stream it.
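For example, something like this would let clients skip bad intervals automatically (a hypothetical shape, not the actual incidentReports format):

    # Hypothetical incident entry - not the actual incidentReports schema.
    incident = {
        "exchange": "bitmex",
        "begin": "2019-06-03T14:02:11.000Z",  # start of affected interval
        "end": "2019-06-03T14:09:48.000Z",    # end of affected interval
        "status": "disconnect",               # e.g. disconnect, gap, degraded
    }

    def is_clean(ts: str, incidents: list) -> bool:
        # ISO-8601 timestamps in a fixed format compare correctly as strings.
        return all(not (i["begin"] <= ts <= i["end"]) for i in incidents)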
To make this more useful as a product, consider building a normalization layer on top of what you have here. It's great that you provide the actual exchange messages for those who need them, but researchers often want to answer questions like "which market has the tightest average bid-ask spread over the past month?" without learning details of a dozen APIs and writing boilerplate code for each.
I'd suggest providing the user with a standardized object representing the limit order book for a market and ticker. Clients would subscribe to it and receive generic events like snapshot, order/price level added/deleted, trade, etc. As the data is being streamed, they could also access the current state of the book at each point in time through this object to get information like the best prices, size and number of orders at each price, spread, etc.
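As a sketch of the kind of interface I mean (all names invented, Python for concreteness):

    class OrderBook:
        # Normalized limit order book fed by generic events, independent
        # of any single exchange's message format.

        def __init__(self):
            self.bids = {}  # price -> size
            self.asks = {}  # price -> size

        def apply(self, event: dict) -> None:
            if event["type"] == "snapshot":
                self.bids = dict(event["bids"])
                self.asks = dict(event["asks"])
            elif event["type"] == "level":
                side = self.bids if event["side"] == "bid" else self.asks
                if event["size"] == 0:
                    side.pop(event["price"], None)        # level deleted
                else:
                    side[event["price"]] = event["size"]  # level added/updated

        def best_bid(self):
            return max(self.bids) if self.bids else None

        def best_ask(self):
            return min(self.asks) if self.asks else None

        def spread(self):
            bb, ba = self.best_bid(), self.best_ask()
            return None if bb is None or ba is None else ba - bb

With one such object per market and ticker, the "tightest average spread" question becomes a few lines of looping over events instead of per-exchange boilerplate.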
Thanks, really appreciate the constructive feedback! Some of the points you've mentioned were already on the roadmap and I'll definitely consider the rest, although crypto markets are quite specific and can't be compared 1:1 to traditional ones, hence my initial choices.
The main thing you offer is order book replay, am I understanding that correctly? I have been interested in something similar, but I am not sure how to justify the extra data actually.
Could you give a scenario where order book data at this granularity might come in handy, as opposed to say a single measure of liquidity (however that would be defined)? Thanks
Yes, you are correct, but it's not only the order book but also trades, liquidations etc. - full market data replay. If you trade on a higher time frame it's not that useful, you can use daily OHLC data, but for intraday and more HFT-like algo strategies it may be handy. The common wisdom is that the order book is mostly noise and fake data, but I disagree - check out https://www.reddit.com/r/highfreqtrading/comments/av5c4m/mar... for some ideas on why such data is useful.
Indeed, this data is available as a real-time stream via public exchange APIs, but you can't "go back in time", subscribe to the data from, say, two months ago, replay it, and recreate the exact market state at that time. Using this API you can - does that make sense?
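Conceptually it looks like this (hypothetical endpoint and parameters for illustration - check the docs for the real ones):

    import requests

    # Hypothetical replay endpoint and parameters.
    URL = "https://api.example.com/v1/replay"
    params = {"exchange": "bitmex", "from": "2019-05-01", "to": "2019-05-02"}

    with requests.get(URL, params=params, stream=True) as resp:
        for line in resp.iter_lines():  # messages in original arrival order
            if line:
                print(line)  # or feed into the same handler your live code uses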
Yes I think I understand. Historical data is freely and publicly available but this lets me replay the market rather than just do static analysis. Not an active trader, just trying to clarify my thoughts about why I would need this/pay for it. Thanks
Looks like a fine service, but as a matter of policy I (and many of my peers) do not partner with crypto services that do not list basic company details on their website. For new services, I'm looking for specific names of founders, location, mission statement/values etc.
Not in general. client_oid is meant to be different for different orders. It is a "cookie" that clients can use to later identify orders placed through the REST API. "Later" here can be via the websocket stream, or after a crash/restart.
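The pattern, sketched in Python (the REST call is left as a comment since the exact endpoint depends on the exchange):

    import uuid

    pending = {}  # client_oid -> our local order intent

    def place_order(side: str, price: str, size: str) -> dict:
        client_oid = str(uuid.uuid4())  # unique per order
        order = {"client_oid": client_oid, "side": side,
                 "price": price, "size": size}
        # Record before sending; persist this map to disk if you need
        # to recognize your orders again after a crash/restart.
        pending[client_oid] = order
        # rest_client.post("/orders", json=order)  # hypothetical REST call
        return order

    def on_ws_message(msg: dict) -> None:
        # Recognize our own orders when they show up on the websocket stream.
        oid = msg.get("client_oid")
        if oid in pending:
            print("matched our order:", pending.pop(oid))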
https://www.kaiko.com/ seems to offer the same data but with far longer historical coverage (Tardis starts from April this year). The drawback of Kaiko is the higher price tag.
Yes, Kaiko provides a similar service, and there are others in this space as well, but it's normalized data only, and only a snapshot of 10% of the top of the order book taken every minute - not streaming order book data (initial snapshot + incremental updates). That works for some use cases, but not all - hence my API, which I hope fills that niche.
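The difference matters because a snapshot plus incremental updates lets you reconstruct the full book at any instant, which per-minute snapshots can't. A sketch (event shapes invented for illustration):

    def book_at(target_ts, snapshot, updates):
        # Rebuild full book state at an arbitrary instant from an initial
        # snapshot plus time-ordered incremental updates.
        bids, asks = dict(snapshot["bids"]), dict(snapshot["asks"])
        for u in updates:
            if u["ts"] > target_ts:
                break
            side = bids if u["side"] == "bid" else asks
            if u["size"] == 0:
                side.pop(u["price"], None)   # level removed
            else:
                side[u["price"]] = u["size"] # level set/updated
        return bids, asks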