Hey folks, I'm the primary maintainer of Gryphon. The backstory: I was one of the founders of Tinker, the trading company that built Gryphon as our in-house trading infrastructure. We operated from 2014 to 2018, starting with just a simple arbitrage bot and slowly growing the operation until our trades peaked above 20% of daily trading volume on the big exchanges.
The company has since wound down (founders moved on to other projects), but I always thought the code deserved to be shared, so we've open sourced it. Someone trying to do what we did will probably save 1.5 years of engineering effort if they build on Gryphon vs. make their own. As far as I know there isn't anything out there like this, in any market (not just cryptocurrencies).
"Ask HN: Why would anyone share trading algorithms and compare by performance?"
https://news.ycombinator.com/item?id=15802834 (pyfolio, popular [Zipline] algos shared through Quantopian)
Well documented. Looks like a nice way to learn about market making in real-life situations with small fractions of bitcoin. Curious whether anyone will take it for a real drive in a production environment.
I had considered adding a 'reading list' to the docs for people who are new to trading, so this could be used as a teaching tool (or just to make using it easier). I'll bump that up the priority list.
Shouldn't be, you just need to write a wrapper for the exchange API with the interface defined in 'gryphon.lib.exchange.exchange_api_wrapper'. I'll add an article to the docs with more about that soon.
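For anyone curious what that looks like, a wrapper might be sketched roughly like this. The class and method names below are illustrative guesses, not Gryphon's actual interface; the real one is defined in `gryphon.lib.exchange.exchange_api_wrapper`:

```python
class MyExchangeWrapper:
    """Hypothetical skeleton for a new exchange integration.

    Each method normalizes the exchange's native API into a common
    shape so strategies can run against any exchange unchanged.
    """

    def __init__(self, api_key, secret):
        self.api_key = api_key
        self.secret = secret

    def get_orderbook(self):
        # Fetch the order book and normalize it into
        # {'bids': [(price, volume), ...], 'asks': [(price, volume), ...]}
        raise NotImplementedError

    def place_order(self, side, price, volume):
        # Submit a limit order; return the exchange's order id.
        raise NotImplementedError

    def cancel_order(self, order_id):
        raise NotImplementedError

    def get_balance(self):
        # Return fiat and crypto balances as a dict.
        raise NotImplementedError
```

The point of the common interface is that a strategy written against it never touches exchange-specific request formats.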
It's possible CCXT could be used to easily wrap other exchanges into gryphon. I'm not familiar with the library so hard to guess if it would be a net win or not.
I find the directory structure a bit confusing: why does each exchange-market get its own file? Wouldn't it be better for each exchange to just define its markets as hashes?
It's perfectly general: arb, market making, signal trading, ml, etc. Whatever strategy class you're thinking of, you can probably implement it on Gryphon.
Can you please explain to me how a tool written in python can be used for HFT or market making?
I'm asking because we generally used ASICs and C++ in the past, or more recently Rust. Even GPUs are often difficult because they introduce milliseconds of latency.
If you want to restrict the definition of HFT to only sub-millisecond strategies you're correct. But then, all HFT is impossible in crypto, since with web request latency and rate limits, it would be very difficult to get tick speeds even in the 10s of milliseconds. It's fine if you want to call this "algo trading" instead of HFT, but I think a common understanding of the term would include Gryphon's capabilities.
In any case Gryphon uses Cython to compile itself down to C, which isn't quite as good as writing in native C but is a good chunk of the way there.
I don't believe true crypto HFT strategies exist (i.e. sub-millisecond tick to trade). It's just not possible with websockets and http requests being the standard for data feeds and order placement on crypto exchanges.
Is there something like that, but in more predictable languages like Rust[1] or OCaml? I wouldn't trust my money to dynamic typing. I recently asked the same question[2] of the Futu broker[3]. Sadly, no answer. I wish more quant companies would pay attention to the predictability of their own code and external APIs, for example by using Imandra[4] to reason about their code. Using their own protocol specification language[5], they created a verified FIX library[6].
P.S. Also you might want to add that framework in Awesome Quant list[7].
A couple of years ago I made a bit of cash by arbitraging within an exchange. It had both BTC and fiat pairs for all other cryptos, and in some cases buying one, converting to the other and then selling back could net a profit even with fees.
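For the curious, the loop described here (fiat → altcoin → BTC → back to fiat) can be sketched as a simple profitability check. The function, prices, and fee level below are all made up for illustration:

```python
def triangular_return(fiat_alt_ask, alt_btc_bid, btc_fiat_bid, fee=0.001):
    """Net multiplier on fiat for the loop: fiat -> ALT -> BTC -> fiat.

    Each leg pays a proportional taker fee. A result above 1.0 means
    the loop is profitable before slippage and order failures.
    """
    alt = (1.0 / fiat_alt_ask) * (1 - fee)   # buy ALT with 1 unit of fiat
    btc = alt * alt_btc_bid * (1 - fee)      # sell ALT for BTC
    fiat = btc * btc_fiat_bid * (1 - fee)    # sell BTC back into fiat
    return fiat

# Hypothetical quotes: ALT asked at 2.00 fiat, ALT/BTC bid 0.00052,
# BTC bid at 4000 fiat.
mult = triangular_return(2.00, 0.00052, 4000.0)
```

With those invented numbers the loop returns roughly a 3.7% gross edge, which is exactly the kind of gap that competition (or the exchange itself) eventually closes.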
They'd occasionally ban my IP (but not my API key, for some reason), I assume due to the volume of order book info requests, even though my bot traded enough volume that my fees should certainly have more than made up for them.
Eventually it stopped earning anything. I assume the exchange started doing it internally themselves and the profit was had before the orders hit the books.
Could also just be competition. The thing about arbitrage opportunities is that they're usually arbitraged out of existence once enough people become aware of them. All it takes is one faster arbitrage bot front-running you and you won't see any more arbitrage opportunities.
I had to stop running my arbitrage bot because it was getting beaten by other bots. All it takes is somebody closer to the exchange to get an edge in the arbitrage game. It was fun, but there wasn't enough revenue to justify colocated hosting.
I've been playing around with using lstm recurrent nets to find patterns in forex trades, with no real expectation of anything other than learning about recurrent nets (and td convolutional nets). I was able to access 15 years of historical tick data. I would imagine lack of historical pricing data would be an issue for any machine learning approach to crypto trading. Even with 15 years of daily prices I only have ~5500 samples per major currency pair. I've toyed with learning off hourly prices rather than daily, and I've also thought about creating more samples by shifting prices up or down, since perhaps the general patterns would be the same.
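The shifting idea mentioned at the end could be sketched like this (a toy helper of my own devising; whether this kind of augmentation is valid for price data is exactly the open question):

```python
def shift_augment(window, shifts=(-0.01, 0.0, 0.01)):
    """Create extra training samples from one price window by shifting
    the whole series by a small fraction of its mean level.

    This only helps if the pattern of interest really is
    level-independent, analogous to translating MNIST digits.
    """
    level = sum(window) / len(window)
    return [[p + s * level for p in window] for s in shifts]
```

Each input window yields one sample per shift, so ~5500 daily samples would become ~16500 here, at the cost of the assumption baked into the shift.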
You could make any decently clever algorithm work in backtests as long as you don't account for spread, order failures and data delays. Years ago I used to test algorithms in a "harsh environment" where all your trades are essentially 10% worse.
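A crude version of that "harsh environment" idea is just a pessimistic haircut on every simulated fill. The helper and the 10% default below are illustrative:

```python
def harsh_fill(side, price, haircut=0.10):
    """Apply a pessimistic haircut to a simulated fill price.

    Buys fill 10% higher and sells 10% lower than quoted, crudely
    bundling spread, slippage and order failures into one knob. A
    strategy that survives this is far more likely to survive reality.
    """
    if side == 'buy':
        return price * (1 + haircut)
    elif side == 'sell':
        return price * (1 - haircut)
    raise ValueError("side must be 'buy' or 'sell'")
```

10% is deliberately brutal for most markets; the point is that an algorithm's backtest edge should survive a penalty much larger than the real frictions you expect.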
That's true, of course. If the technique worked here, it would be because there really are location-independent features that can be learned. I view it as similar to translating and transforming images in the MNIST digit set to account for the various ways the feature you're searching for can be spatially oriented. Of course, I have no idea whether this holds true and will work for pricing data.
Generating data this way is extremely difficult. The true population characteristics of prices are unobservable; we can only work with samples of the population. This means that any attempt at generating data is highly likely to add errors that are difficult to quantify. This is a fundamental difference from the disciplines where ML is most successful: in finance you can’t meaningfully generate new data, you can only work with what has been historically recorded, and that is often not relevant to how the market will behave in the future. You can always generate more cat images for a ConvNet to learn from, or feed it photos from multiple angles or even 3D imagery. None of this is available for market data, unfortunately.
Right, the thing about applications like the digit set and similar OCR problems etc., is that we can independently generate a model of "acceptable" translations/rotations and validate it reasonably easily because we understand the domain well (not that you can't cause trouble this way). This certainly isn't true across data sets.
Ouch. Out of all the possible subjects to learn NNs on, you have picked by far the most difficult possible. Seriously. If you think of an analogue to rocketry, with the easiest being launching fireworks from a bottle and the other being a mission to Mars, you have picked a Moon landing.
I don’t even know where to begin. Financial data has an extremely low signal to noise ratio and is fraught with pitfalls. It is highly non-normal, heteroscedastic, non-stationary and frequently changes behavioural regimes. It is irregular, the information content is itself irregular, and the prices sold by vendors often have difficult-to-detect issues that will taint your results until you actually start trading and realise that a fundamental assumption was wrong. You may train a model on one period, and find that the market behaviour has changed and your model is rubbish. Cross validation and backtesting of black box algorithms with heavy parameter tuning is a field of study in its own right, with so many issues that endless papers have been written on each specific nuance.
Successfully building ML models for trading is an extremely difficult discipline that requires a deep understanding of the markets, the idiosyncrasies of market data, statistics and programming. Most quant shops who run successful ML algos (they are quite rare) have dedicated data teams whose entire remit is to source and clean data. The saying of rubbish in, rubbish out is very true. Even data providers like Reuters or Bloomberg frequently have crap data. We pay nearly 500k a year to Reuters, and find errors in their tick data every week. Data like spot forex is a special beast because the market is decentralized. There is no exchange which could provide an authoritative price feed. Trades have been rolled back in the past, and if your data feed does not reflect this, you are effectively analysing junk data.
I don’t even want to get started about the fact that trying to train an RNN on 5500 observations is folly. Did you treat the data in any way? The common way to regularise market data for ML is to resample it to information bars. This is not going to work on a daily basis, so you should start off with actual tick data.
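For reference, "information bars" resample by trading activity instead of wall-clock time. A minimal dollar-bar sketch (my own toy code, not from any library; real implementations handle threshold overshoot, timestamps and so on):

```python
def dollar_bars(ticks, bar_size):
    """Group (price, volume) ticks into bars of ~bar_size traded dollar value.

    Returns one (open, high, low, close) tuple per completed bar.
    During busy periods bars close faster, so each bar carries a
    roughly comparable amount of market activity.
    """
    bars, prices, accumulated = [], [], 0.0
    for price, volume in ticks:
        prices.append(price)
        accumulated += price * volume
        if accumulated >= bar_size:
            bars.append((prices[0], max(prices), min(prices), prices[-1]))
            prices, accumulated = [], 0.0
    return bars

# Toy tick stream: (price, volume) pairs, with a $500 bar threshold.
ticks = [(100, 3), (101, 4), (99, 5), (102, 2), (103, 4)]
bars = dollar_bars(ticks, 500)
```

Time bars oversample quiet periods and undersample busy ones; activity-based bars like these tend to have statistically better-behaved returns, which is why they're the usual preprocessing step before ML.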
Nearly every starry eyed junior quant goes in with the notion that you can just run some fancy ML models on some market data and you’ll get a star trading algo. That a small handful of statistical tests will tell you whether your results are meaningful, whether your data has autocorrelation or mean reverting properties. In reality, ML models are very difficult to train on financial data. Most statistical forecasting tools fail to find relationships and blindly training models on past data very rarely results in more than spurious performance in a back test.
I don’t want to discourage you by any means, but I’d start off with something easier than what you are proposing. Finance firms have entire teams dedicated to what you are trying to do and even they often fail to find anything.
Seconding this. I'm in touch with a bunch of smart coder/traders trying this and nobody (to my knowledge) is making backtest match forward test. To me, ML isn't optimised for this kind of problem. It might be possible to kludge it, but you won't know what it's doing.
My bot that tracks and trades momentum isn't as sexy, but it works.
Thanks for the great feedback. I have no expectations for this other than the learning, and it's already been successful on that front. Just seemed like a fun thing to poke at when most other hobbyists seem to be doing image analysis and language modeling. I've crawled a couple of forums and I get that there are a lot of people out there who think they can readily use these techniques to make money. I doubt very much that this will be the outcome in my case :).
Where I am now I am just trying to figure out how to treat the data, whether to normalize or stationarize and how to encode inputs, etc. The reason that I am working with daily prices is that the fantasy output of this would be a model that can inform a one day grid trading strategy. It may very well be that daily prices won't work for this.
Whether there's anything like an equilibrium in cryptoasset markets, where there are no underlying fundamentals, is debatable. While there's no book price, PoW coin prices might be rationally describable in terms of (average estimated cost of energy + cost per GH/s + 'speculative value').
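As a toy illustration of that decomposition (the function and every number below are invented, and the speculative term is ignored entirely):

```python
def pow_floor_price(energy_cost_per_kwh, joules_per_hash, hashes_per_coin):
    """Rough marginal electricity cost to mine one PoW coin.

    Ignores hardware amortization (the cost-per-GH/s term) and the
    speculative term, so it is only the energy component of the
    hypothesized price floor.
    """
    kwh_per_coin = joules_per_hash * hashes_per_coin / 3.6e6  # J -> kWh
    return energy_cost_per_kwh * kwh_per_coin
```

With made-up inputs like $0.05/kWh, 1e-10 J/hash and 3.6e21 expected hashes per coin, this gives a $5000 energy floor; real figures for efficiency and network difficulty would have to come from miner specs and chain data.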
A proxy for energy costs, chip costs, and speculative information
Are there standard symbols for this?
Can cryptoasset market returns be predicted with quantum harmonic oscillators as well?
What NN topology can learn a quantum harmonic model?
https://news.ycombinator.com/item?id=19214650
"The Carbon Footprint of Bitcoin" (2019) defines a number of symbols that could become standard in [crypto]economics texts. Figure 2 shows the "profitable efficiency" (which says nothing of investor confidence, speculative information, or how we may overvalue the security, as in 2007-2009). Figure 5 lists upper and lower estimates for the BTC network's electricity use.
https://www.cell.com/joule/fulltext/S2542-4351(19)30255-7
Here's a cautionary dialogue about correlative and causal models that may also be relevant to a cryptoasset price NN learning experiment:
https://news.ycombinator.com/item?id=20163734
Cool stuff, and I didn't mean to discourage you at all. Some of the most interesting challenges in data science arise in finance.
Forex perhaps is just a pathologically tricky beast to trade well, even though it is the easiest to access. I think perhaps cryptos would be an easier start in terms of there being more inefficiencies and autocorrelation in the market.
In terms of data treatment, I recommend starting with Marcos López de Prado's Advances in Financial ML. I don't agree with some of his methods, but it is a practical book that highlights a lot of the issues you'll face. You can then draw your own conclusions about how to treat them.
All of this is true, but I'll point out that for crypto, your competition is "I heard it was hot on Reddit/Telegram", and if it's not that it's often "I buy $X worth of Bitcoin on the first of every month". I suspect that a random algorithm (like, literally buys & sells random amounts of cryptos at random times) would do better than the average crypto trader, simply because the "I heard it was hot" strategy inherently leads towards buy-high-sell-low behavior.
Totally fair to wonder that. All I can say is we make it clear the built-in algos are just for demo/inspiration purposes and shouldn't be run live with any expectation of profit. The point of Gryphon is to use the framework to build/run your own strats much faster than if you had to build everything yourself.
Hope you guys like it!