I used CFR to solve another card game called Setback (aka Auction Pitch), which is a trick-taking game that’s similar to, but simpler than, Bridge.
CFR is very effective, but slow and requires a lot of RAM. I had to create a smaller, abstract version of the game, solve that, and then map the result back to the actual game, so I didn’t end up with a perfect Nash equilibrium, but the solution does still play at a super-human level.
One of the interesting things about my approach is that it actually uses CFR at two separate levels: First it solves a single-deal version of the game, then it uses that solution to run CFR again on a repeated version of the game where each player accumulates points across multiple deals. (Bidding in Setback is highly score-dependent.)
I think a similar approach might be possible for Hearts, but I haven’t tried it yet. Solving Bridge with CFR may be beyond our current capability, but could also be possible in the future.
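For readers who haven't seen CFR: its core building block is regret matching, which CFR applies at every decision point of the game tree. The sketch below is not taken from the Setback solver or from Cfrm. It is a minimal illustration on rock-paper-scissors, and the names `PAYOFF`, `regret_matching` and `solve` are made up for this example. Two players repeatedly update their regrets against each other, and the average strategy drifts toward the equilibrium mix of 1/3 each.

```python
import numpy as np

# Payoff to the row player in rock-paper-scissors (rows/cols: R, P, S).
PAYOFF = np.array([[0, -1, 1],
                   [1, 0, -1],
                   [-1, 1, 0]], dtype=float)

def regret_matching(regrets):
    """Play each action in proportion to its positive cumulative regret."""
    positive = np.maximum(regrets, 0.0)
    total = positive.sum()
    if total > 0:
        return positive / total
    return np.full(len(regrets), 1.0 / len(regrets))

def solve(iterations=50000):
    # Lopsided starting regrets (an arbitrary choice) so that play does not
    # simply sit at the uniform strategy from the first iteration.
    regrets = [np.array([1.0, 0.0, 0.0]), np.array([0.0, 1.0, 0.0])]
    strategy_sums = [np.zeros(3), np.zeros(3)]
    for _ in range(iterations):
        strategies = [regret_matching(r) for r in regrets]
        # Expected value of each pure action against the opponent's current mix.
        action_values = [PAYOFF @ strategies[1], -(PAYOFF.T @ strategies[0])]
        for p in range(2):
            expected = strategies[p] @ action_values[p]
            regrets[p] += action_values[p] - expected
            strategy_sums[p] += strategies[p]
    # The average strategy, not the final one, is what approaches equilibrium.
    return [s / s.sum() for s in strategy_sums]

if __name__ == "__main__":
    for player, strategy in enumerate(solve()):
        print(player, np.round(strategy, 3))
```

A real solver has to do this at every information set of a much larger tree, which is where the time and RAM costs come from.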
The commercially available solvers may be using CFR, but they are not anywhere near as strong as Pluribus because Pluribus pre-computed solutions for a reduced state space, then mapped the hand and actions into that reduced space and solved from there. That meant Pluribus could come up with a much better solution in much less time than the commercial solvers. This is also why most of the solvers only solve heads up.
I'm curious if one can make any money playing poker online while following some computer-optimized strategy. I assume many (most?) players are already doing this. Insights are appreciated :)
Author here. Yes, you can still make money in online poker. It's not as juicy as it was, and rake is very high these days, making micro stakes unbeatable. Moreover, the highest stakes usually run live and access is gated. The old 2000s dream of grinding up from micros to become a high-stakes player is no longer possible.
Regarding RTA (real-time assistance) and bots: they have always existed and keep getting better. It's a cat-and-mouse game between websites and cheaters, and it will continue.
That just looks to me like a list of rules with vague promises of “sophisticated proprietary software, trust us bro” enforcing the rules. Who knows for sure how much of that is actually implemented? We have to take the company’s word for it.
I don’t doubt some of that cheat detection exists, but some of it also seems pretty fantastic.
An approximately optimal bot would probably need tens of thousands of hands to be identified as a bot. The chance weights at each state node are not the same from one bot to another, and the state is probably even compressed in different ways. The variance is way too high. It is not like chess.
The “sophisticated proprietary software, trust us bro” is probably mainly used as a way to not do payouts. Online casinos are really shady.
Winning in poker is about identifying "fish". Being 0.001% better than the other bots at the table won't beat the house.
GG is not even doing basic modeling of theoretically possible vs. actual win rates to identify outliers (cheaters).
I'd say that the assertion that these "named pro-players" are doing anything to ensure a game seems like utter PR fluff nonsense, esp given they have 0 credentials / skills in data science / analysis etc.
Yes, I have seen the thread. Catching a player using just the win rate is hard. 9k hands is peanuts, and some of the assumptions in that thread (like a 53% VPIP player must have a 0bb/100 win rate) are slightly far-fetched.
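To put a rough number on why 9k hands is a small sample, here is a back-of-the-envelope sketch. It assumes a standard deviation of 100 bb per 100 hands, which is only a commonly quoted ballpark for no-limit hold'em and not a figure from this thread, and it treats each 100-hand block as independent. The function names are made up for illustration.

```python
import math

def winrate_standard_error(hands, stdev_bb_per_100=100.0):
    """Standard error (in bb/100) of an observed win rate after `hands` hands."""
    blocks = hands / 100.0  # number of 100-hand blocks in the sample
    return stdev_bb_per_100 / math.sqrt(blocks)

def hands_needed(margin_bb_per_100, stdev_bb_per_100=100.0, z=1.96):
    """Hands needed for a roughly 95% interval of +/- margin bb/100."""
    blocks = (z * stdev_bb_per_100 / margin_bb_per_100) ** 2
    return math.ceil(blocks * 100)

if __name__ == "__main__":
    print(round(winrate_standard_error(9000), 1))  # uncertainty after 9k hands
    print(hands_needed(5))  # hands needed to pin a win rate down to +/- 5 bb/100
```

Under those assumptions, the standard error after 9,000 hands is still above 10 bb/100, and narrowing the interval to about 5 bb/100 either way takes on the order of 150,000 hands.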
I know for a fact that GG has a GTO detection algorithm. If you play too close to the GTO/optimal strategy, they investigate. A lot of RTA folks got caught this way.
Jason Koon and Fedor Holz are part of the team which manually reviews statistical anomalies. It's stupid to think they don't know about the data side of poker. Moreover, a ton of poker players end up becoming traders and data scientists. There is a lot of skill overlap.
I agree that some of the stuff is PR nonsense, but to dismiss their entire anti-fraud operation is just stupid. They are literally the best in the world at this.
That's evidential decision theory, right? You minimize the expected regret. If that's not risk-averse enough for you, you can weight together multiple perturbation groups for your world model and utility function.
Something that is surprising to me is that there are seemingly no strong open-source poker AIs available. Maybe it's because implementing CFR for poker is genuinely difficult?
Author here. There are CFR implementations and even open-source implementations of Pluribus. A Google search shows me many such implementations.
The tough part is not implementing the bot or replicating the research (Noam Brown mentions in the linked AMA that training it cost just a couple hundred dollars). The hard part is the infra for cheating: setting up multiple accounts on the website (bypassing KYC checks) and getting reliable cashouts (poker sites do a ton of AML checks). It's easier to do at lower stakes, but the cost of running the bot will eat up the win rate. On the flip side, the chances of getting caught are very high when you play high stakes.
I'm not looking to cheat, but to simply play against strong AIs to train.
However, I haven't found any complete ready-to-play implementations. There are a few simple implementations of CFR on GitHub, but that's a long way from a complete poker bot.
Ah, then just go buy any of the training tools like GTO+, GTOBase, Vision (by Run It Once), GTOWiz, the PLOMastermind tool by Jnandez, etc. They are essentially coaching software for learning the game (powered by a solver), and a ton of them have a practice section where you can play against an AI. One of the earliest poker solver devs, Oleg Ostroumov, made this back in 2013 or so: https://holdem.olegsolvers.com/ . You can play with that.
On a side note, he also wrote about his story of building the first productized solver here:
Oleg's solver is fantastic, thank you! It's only for heads up, though.
I'm still surprised by the difference with the chess world where all the strong engines are free and open-source. Maybe poker is less attractive to developers, or maybe poker devs are simply more inclined to make money (which I completely understand!)
> I'm still surprised by the difference with the chess world where all the strong engines are free and open-source.
FWIW, this is a fairly recent thing in chess (within the last decade). Before 2015 or so, closed-source commercial engines were better than the open-source options.
You may be interested in https://actionflop.com/. It lets you play heads up no limit holdem against a game theory optimal bot (solved by CFR). It keeps track of your profit as well as your loss from making mistakes. It's a work in progress, and a coming feature will show you precisely what mistakes you made in each hand.
How does it know that an action, given a hand and state, is a mistake? It's quite rare for a play to have 0 probability in a strategy, so almost any action might be considered optimal. Doesn't the tool need to know your "probability distribution" for the state and action?
Right now, it's only keeping track of the mistakes where you pick a play which you should pick 0% of the time (for example, you should fold or raise but never call). These kinds of mistakes may not be quite as rare as you would think. For example, today, over about 1400 hands played against the bot, players have lost about 1350 in expected value by making these kinds of pure mistakes. This amounts to about 48 big blinds per 100 hands. (Granted, people might be playing haphazardly and not trying to play perfectly.)
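As a rough illustration of the bookkeeping described above (not the site's actual code), the sketch below charges a player only when they pick an action that the solved strategy plays 0% of the time, then converts the total into big blinds per 100 hands. The function names, the example strategy and the action values are all hypothetical. The quoted figures are consistent with each other if the 1350 is measured in chips and the big blind is 2 chips, which is an assumption on my part.

```python
def pure_mistake_loss(strategy, action_values, chosen):
    """EV lost by choosing an action the solution never plays; 0.0 otherwise.

    strategy: action -> probability in the solved strategy
    action_values: action -> expected value of taking that action
    """
    if strategy[chosen] > 0:
        return 0.0  # mixed-strategy actions are not judged here
    best_supported = max(action_values[a] for a, p in strategy.items() if p > 0)
    return best_supported - action_values[chosen]

def bb_per_100(total_loss, hands, big_blind):
    """Convert a total loss in chips into big blinds per 100 hands."""
    return total_loss / big_blind / hands * 100

if __name__ == "__main__":
    # Hypothetical spot: the solution folds or raises but never calls.
    strategy = {"fold": 0.6, "call": 0.0, "raise": 0.4}
    values = {"fold": 0.0, "call": -1.5, "raise": 0.0}
    print(pure_mistake_loss(strategy, values, "call"))  # chips lost by calling
    print(round(bb_per_100(1350, 1400, big_blind=2), 1))  # about 48 bb/100
```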
I am currently thinking about how to also measure mistakes which involve the player's distribution of choices when the best play is a mixed strategy. One possibility is to keep track of the player's distribution over time. This would probably require too large of a sample size, so one possibility would be to merge similar situations in the game tree when assessing this kind of mistake. Another possibility is to have the player somehow actually choose a mixed strategy when making a decision.
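One way the first two ideas could look in code, purely as a sketch: pool a player's decisions from similar situations into buckets, then compare the observed action frequencies in each bucket with the solved mix using total variation distance. The bucketing function, the data layout and all names here are hypothetical, and nothing below reflects how the site actually works.

```python
from collections import Counter, defaultdict

def pool_by_bucket(decisions, bucket_of):
    """Group (situation, action) pairs by a caller-supplied bucketing function."""
    pooled = defaultdict(list)
    for situation, action in decisions:
        pooled[bucket_of(situation)].append(action)
    return dict(pooled)

def total_variation(observed_actions, target_mix):
    """Distance in [0, 1] between observed action frequencies and the target mix."""
    counts = Counter(observed_actions)
    n = len(observed_actions)
    actions = set(counts) | set(target_mix)
    return 0.5 * sum(abs(counts[a] / n - target_mix.get(a, 0.0)) for a in actions)

if __name__ == "__main__":
    # Hypothetical data: situations are (street, hand) pairs, bucketed by street.
    decisions = [(("river", i), "raise") for i in range(7)]
    decisions += [(("river", i), "fold") for i in range(7, 10)]
    pooled = pool_by_bucket(decisions, bucket_of=lambda s: s[0])
    target = {"raise": 0.5, "fold": 0.5}
    print(round(total_variation(pooled["river"], target), 3))
```

A distance near 0 means the player's frequencies match the mix. With only a handful of decisions per bucket the distance is mostly noise, which is the sample-size problem mentioned above.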
The papers released back when CFR poker bots were a thing in a niche of academia (mainly around a yearly bot tournament) miss some key implementation points and are hard to reproduce.
I made a CFR bot for limit Texas hold'em using the uni supercomputer around 2012. It was quite good and beat the reference bots, but it could not win online. I think there were house bots, or mostly good bots already, in the poker rooms at that time. Internet poker has been broken for a long, long time.
The issue in 2012 was more that the bad players all dropped off after the DOJ crackdown on online poker in 2011. It was still possible to play, but it became a pain in the ass to move money on and off the sites. That meant all the very casual players just didn't bother, and those were the ones you were making money from.
I played very successfully from 2005-2009, and I tried a couple of times post-2011 - it was just devoid of the total idiots that would feed you money.
[0]: https://www.bernsrite.com/Setback
[1]: https://github.com/brianberns/Setback
[2]: https://github.com/brianberns/Cfrm