Pac-Man recreated with a GAN trained on 50k game episodes (nvidia.com)
292 points by doener on May 22, 2020 | 149 comments



I've observed an increasing trend with modern AI research that

A) It's no longer practical to incorporate into real products given the hardware costs for training and inference. In NLP, large transformer-based models can easily occupy a $5,000 GPU while processing a paltry 50-60 words per second.

B) The research often demonstrates things that existing non-ML toolchains were already reasonably good at, with better results and faster iteration times.

This demo looks like it's replicated the core mechanics of Pac-Man with worse graphics, 3-4 orders of magnitude higher resource consumption, and undoubtedly many odd bugs hidden behind the pretty demo. It even needed the game to exist in the first place! While this is fascinating from the technical and academic perspective, it's hard to be convinced that any gains we see in the field won't be offset by increased hardware/latency costs from a product standpoint.


> It's no longer practical to incorporate into real products given the hardware costs for training and inference

I don't see that trend. Bleeding-edge research has always been beyond what's practical, by its very nature. The focus is on pushing the boundaries, not on building something efficient/commercial.

It's the follow-up work that makes it practical. See, for example, the evolution of WaveNet. It was breakthrough research, but yeah, it wasn't practical. Now, a couple of years later, we're generating speech of similar quality on mobile devices (see the recent Facebook research), and teaching these models with orders of magnitude less data (see 15.ai).

Not to mention the plethora of research that is explicitly about making neural networks more efficient and easier to train (e.g. EfficientNet; 1cycle; etc).

Bleeding edge stuff like AlphaStar, BigGAN, GPT-2, etc are just that: bleeding edge. They'll get more efficient and practical every year, just like everything else in machine learning.


Exactly. It is sort of like saying the first transistor was too expensive to be practical.

Development always starts with some novel technology that has major flaws, but some good nuggets. Then it is further developed to be practical.


One minor nit: The recent FB blogpost wasn't really "research" - it was mostly a reimplementation of this paper: https://arxiv.org/abs/1802.08435


It reminds me of older electric cars. They worked. They could get you around. But they did so at a higher price, at slower speeds, or with other drawbacks that a consumer wasn't willing to pay for. Thus they were regulated to novelty items that some work campuses would have and concept cars that would be shown from time to time.

But eventually we reached the point where they are not only competitive, they may be positioned to fully overtake combustion cars.

I wonder how many discoveries in electric cars ended up being complete dead ends along the path that took us to where we are now?


Electric cars of the era you describe could at least go. The approach described in this paper doesn't "go". Its goal is to "learn a simulator [to train a robotic agent] simply by watching an agent interacting with the environment".

The learning part is done by training on the output of an existing simulator. Obviously, if you already have the output of a simulator, you don't need to learn a simulator so the only real use of this approach is to train a simulator by watching a human agent interacting with the environment. So the idea is to eventually apply this thing to training robots to interact in the real world by watching how humans do it (this is not clearly stated in the paper; it's my charitable interpretation of the very poor motivation of learning a simulator from the output of an existing simulator).

The only problem is that creating an accurate simulation of a human's interaction with the real world (let alone an accurate simulation of the real world) is prohibitively expensive. Thus, for the time being, this kind of idea works only in very, very simple environments like PacMan.

So we don't yet have an expensive prototype electric car that can nevertheless do everything another kind of car can do. We have... not sure what. The description of an idea, with an example of how such a thing could be achieved in the far future, when we can train robots to act like humans by watching us.


It seems like there is not a lot of novel technology in an EV; what made them viable were steady improvements in battery technology, so that the intersection of range, price, and weight landed somewhere financially feasible. Combined with continued improvements in batteries, the already huge gain in fuel efficiency (for a typical car at least) will be what causes them to overtake ICEVs.


The word you are looking for is "relegate", not regulate.


Thanks. It seems obvious in hindsight. I wonder if I had the correct word in my head but my fingers typed the wrong word. That's happened a few times when dealing with words I rarely type. Sadly too late to edit.


The myth of progress has got to be the marketer’s best tool for getting at rich folk.


Your comment reminds me of snide dismissals of personal computing in the 1980s - looks worse than [existing thing], very expensive, requires an army of technicians, probably explodes unpredictably, overall just a terrible investment.


Some people seem to have construed this as an endorsement of the grandparent post, and I'd like to be clear that it definitely isn't.


Followed by a decades-long monopoly over expensive new technology, likely crushing waves of innovation and usage, in exchange for slow, buggy, not-even-automated spreadsheets; plus the hemming of governments and organizations into endless software rental contracts and platforms that maybe don't add value as fast as Moore's Law.


that's a fair criticism; however, it's not a given that any product category will end up with the same growth curve as PCs, the web, cloud, and mobile.

thinking back on the PC, Web, cloud, and mobile growth periods, each came with a transformative change in the capabilities, cost, and availability of computing power. We're only seeing two of these aspects in AI today - with the cost side heading in the wrong direction.


AI supercharged the importance of computing power to arms-race levels. We're still just brute-forcing it with deep learning, but the capabilities are too awesome to ignore. Therefore, compute power will be bought up and hoarded expensively, and used by the highest bidder. Will the market save us and simply overclock production? Not happening yet. Corporate appetite should be quite large. I do A.I. R&D and am just watching the mountain of power ascend before me and my aged gaming laptop. My next level is yet the foothills. Yeah, I'd say growth or not, A.I. is going where PC, web, cloud, everything goes: to the industrial monopolists and the money spenders.


I work in industrial AI research as well, but I think it's worth considering whether AI is providing an incremental benefit to existing products through the use of immense computing power, or unlocking new demand through new applications. If the latter category is limited by available compute, then we'll reach the limits of economically viable use cases rather quickly without the aid of Moore's law.


I don’t think that’s the point of bleeding edge research.

Pushing boundaries != commercialization potential


Is there research in this field that is not commercially funded? That seems like a silly distinction in America.


This is obviously not meant to be a real game engine. It shows that GANs can learn not only images but transitions between images given some action, which means they can be used as simulators. See World Models, which used VAEs: https://worldmodels.github.io/


Lots of new stuff sucks.

The English longbow was vastly superior to the first guns that came out, which were slow to load, unreliable to shoot, didn't work in the rain, and were extremely inaccurate.

3D-printed stuff sucks. It's not very strong, it's not very pretty, and it takes a long time to create.

Meanwhile, take a look at this ML photo processing technique:

https://www.youtube.com/watch?v=bcZFQ3f26pA


Do you really think nVidia is seriously advocating for using neural networks as the best way to write a PacMan clone? It's obviously just a proof of concept to demonstrate that GANs are sophisticated enough to learn certain categories of models.


Doing these cute demos is a double-edged sword. A fun and charming demo attracts attention by non-experts who can't see the wood for the trees.

In my field we make robots do a thing, say folding laundry, as an example of dextrous manipulation[1], and the popular conversation is about whether your laundry robot makes commercial sense and not about your novel grasping strategy and nifty motion planner that makes this sorta-possible for the first time. But everyone sees your video. Pros and cons.

[1] laundry folding at Berkeley (not me) https://youtu.be/gy5g33S0Gzo


A) is just a sign that further engineering, of both the models and the frameworks, is required. Also, 50-60 words per second does not sound too bad: it amounts to over 4M words per day. I imagine most people only type a thousand words per day, so with a single GPU you can handle around 4,000 users (assuming batch-style processing).
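As a rough sanity check on that arithmetic, here is the back-of-the-envelope calculation in Python (the 50 words/sec and 1,000 words per user per day figures are the assumptions from the comment above, not measurements):

    # Throughput estimate; both input figures are assumptions, not benchmarks.
    words_per_second = 50                  # low end of the quoted 50-60 wps
    seconds_per_day = 24 * 60 * 60

    words_per_day = words_per_second * seconds_per_day    # 4,320,000 words/day
    words_per_user_per_day = 1_000                        # assumed typical usage

    users_per_gpu = words_per_day // words_per_user_per_day
    print(f"{words_per_day:,} words/day -> ~{users_per_gpu:,} users per GPU")
    # -> ~4,320 users per GPU, assuming perfectly smooth load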

B) This specific example actually shows ML toolchains are orders of magnitude better than existing ones. Imagine you wanted to make a football game from scratch, and this technique let you do that just by feeding a lot of football game recordings to a program, as opposed to hiring software devs and graphics designers and iterating for a year.


On A: Using a p3.2xl for that task leaves you with a minimum EC2 cost of goods sold of 50 cents per user per month for the target task. To make a reasonable SaaS business out of this you would need to be charging a minimum of $5.00 per user per month for the service that this GPU is providing, assuming all other tasks have a negligible impact on cost of goods sold, and that you are able to spread the load optimally throughout the day, and thus don't have to purchase higher capacity during peak.
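For what it's worth, here is roughly how that 50-cent figure falls out (the hourly rate below is an approximate 2020 on-demand price for a p3.2xlarge, and the 4,000-users-per-GPU figure comes from the sibling comment's estimate; spot or reserved pricing would change the numbers):

    # Rough unit economics; the hourly price is approximate, not a quote.
    p3_2xl_hourly = 3.06                  # USD/hour, approx. us-east-1 on-demand
    hours_per_month = 24 * 30

    gpu_cost_per_month = p3_2xl_hourly * hours_per_month      # ~$2,200/month
    users_per_gpu = 4_000                                      # from the throughput estimate

    cogs_per_user = gpu_cost_per_month / users_per_gpu         # ~$0.55/user/month
    target_price = cogs_per_user * 10                          # rule-of-thumb 10x COGS
    print(f"COGS ~${cogs_per_user:.2f}/user/mo -> price ~${target_price:.2f}/user/mo")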

This isn't too bad of a situation if this GPU provides 100% of your value proposition. But if you needed your product to perform 5 such tasks, or competition/customer demand forces you to move to a bigger model with a 10 words per second inference rate, then you may find it hard to balance customer value vs. price vs. margin.


> On A: Using a p3.2xl for that task leaves you with a minimum EC2 cost of goods sold of 50 cents per user per month for the target task

https://twitter.com/br_/status/979442438254166016

> "selling AWS at a loss" is crisp shorthand for a lot of startups' business models!


Researchers are not entrepreneurs. It's for the entrepreneurs and business to find a way to make this random game into something profitable.


> 3-4 orders of magnitude higher resource consumption

Just 10000x? More like 6 orders of magnitude


"This demo looks like it's replicated the core mechanics of pac-man with worse graphics, 3-4 orders of magnitude higher resource consumption, and undoubtably many odd bugs hidden away from the pretty demo. It even needed the game to exist in the first place!"

True, though my first thought was that this would be interesting from a reverse engineering point of view. Then again, Pac-Man is a pretty simple concept. Something worth reverse engineering is probably a lot more complex.


Or, to paraphrase: great paper, no application.


Like all so-called basic science. The first batteries, radio, lasers, etc. also looked like that.

I see a strong parallel. It's a poor performance but important proof of concept.

Also, it's a general trend in computing since assembly. Why compile when you can write machine code? Waste of resources. Why have an abstract compiler when you can write assembly? Why have a higher-level language when it can't generate optimal code for a target it doesn't know? Then the web came; it's not faster than 20 years ago because bloat came along with the increased computing power. Somehow it always turns out that the seemingly wasteful approach's benefits outweigh its cons. Empirically, lowered entry cost and increased applicability have very, very high value. At least higher than I used to assess.


There is a reason why "Research" and "Development" are listed as separate concepts when people talk about R&D. The point of most research papers is not to directly invent a working product, it's to produce fundamental insights that may be useful in future applications.


See my previous comment: not in machine learning research.

Also, the paper itself tries (hard) to motivate its approach with a real-world application, in particular, learning a simulator to train robotic agents to interact with the real world.


It really is a great paper; the problem is when your entire field has few real applications but everyone is making too much money to admit that. Cool tech demos like this are awesome as blog posts, but I'm not sure why it's academic...


Well, the paper has an experiment where they train an agent based on their system to play two of the three games (PacMan, PacManMaze and VizDoom) and compare its results to two other systems (an LSTM-based one and a model-based RL one, WorldModel). Their system gets the best score in Pacman, but unfortunately the version that does better than WorldModel is the one that doesn't have the memory module whose usefulness is a central claim of their paper. WorldModel outperforms their system in VizDoom by ~300 points, though their system without the memory module "solves" VizDoom (i.e. scores just a bit over 750, namely 765) and they claim that it's the first GAN-based system to achieve this.

A bit thin on the ground (especially when it comes to supporting the claim about the usefulness of the memory module), but the way neural nets research and especially RL research is going (from what I've seen; not my area of expertise), that's the kind of result that you can expect will get your paper published these days.


I don't know where you got the idea that the purpose of an academic paper is to produce the schematics for a working commercial product. The fact that it's novel and interesting is sufficient grounds for producing a paper about it. Why do you think projects like these are solely the domain of blog posts?


That is not true for academic papers in computer science, and particularly in machine learning, where contributions are expected to have some sort of practical application that is at least possible to foresee.

Indeed, a common criticism against papers in machine learning is "what is a real-world application of this approach?". If you can't answer that convincingly then it's unlikely your paper gets published, particularly if it has rather mediocre results otherwise, as is the case with this paper (see my earlier comment about the comparison to the two other systems which is a bit of an anticlimax).

In fact, the paper (well, the preprint version) makes a clear attempt to motivate the work with a practical application, saying that it aims to "learn a simulator [for robotic agents] by simply watching an agent interact with an environment".


> That is not true for academic papers in computer science, and particularly in machine learning, where contributions are expected to have some sort of practical application that is at least possible to foresee.

This is 100% not true for computer science and I don't know why you think it's true. There are entire branches of computer science (e.g. complexity theory) that are highly theoretical and are certainly not geared towards immediate practical application.

Even in more applied branches of computer science like ML there are many papers that are geared more towards the fundamental research side of things.

> saying that it aims to "learn a simulator [for robotic agents] by simply watching an agent interact with an environment".

Saying that a paper could lead to a useful application down the line is not the same as saying that the paper is intended to describe the schematics for a specific useful application.


Well, of course it's not 100% true; who said anything about 100%? There are theoretical branches of computer science, but most of computer science is not purely theoretical. Certainly machine learning (and, more broadly speaking, much of AI) is like I say.

I'm not sure who brought up a schematic? The bit I quoted is taken from the paper and it's a clear attempt to motivate the paper's approach as having a practical application. To clarify, do you disagree with that?


A lot of the ML techniques we're putting to use today came out of papers from the 80s. These techniques weren't practical to do much beyond toy examples, but they were proven to work. Now, we're able to use those very techniques to teach a machine to identify cat videos with 75% accuracy using unsupervised learning! [0]

---

[0]: https://www.wired.com/2012/06/google-x-neural-network/


A great sales pitch for data-center GPUs.


The ai has created a functional internal model of an external system based on limited sensory input. That is essentially a kind of consciousness.


> The ai has created a functional internal model of an external system based on limited sensory input.

Sure.

> That is essentially a kind of consciousness.

That is a flying leap.


Not at all.

It just means some people poked a pile of a bazillion if-else statements long enough, until the desired result came out. I believe everyone here knows the xkcd I am referring to.

Still, I think it's an interesting result. But, as someone on Fridman's podcast said the other day: recent advances in AI have all been thanks to advances in computing power, not any new ideas. The future of AI seems to depend on Moore's Law more than anything.


What makes you think the pile of if-else statements isn't conscious?


Because it's a pile of if-else statements shuffled until it was able to perform a single specialized thing. Any housefly would be a thousand times as "conscious".


What makes you so certain a house fly doesn’t have a form of rudimentary consciousness? It might be a spectrum.


I assume that a housefly does have a basic form of consciousness. It's not going to write Shakespeare, but it probably wants to mate. (I know people whose life goals don't appear to extend much further.)


How about a computer? It has if else statements too.


This research reminds me of one of their April Fool’s videos [1] with a gaming USB stick that you plug into your computer to offer various gaming enhancements. One of which is a “Ghostplay” feature where an AI program can mimic you after training on your gameplay.

I remember that video because it always struck me as not being far off from reality. It’ll make the current set of online game cheats/hacks seem like child’s play if it can be done effectively (read: human-like and undetectable). Maybe then a server-running AI will be needed to detect AI-like gameplay mimicry.

[1] https://youtu.be/smM-Wdk2RLQ


Something similar to what you describe is already being done. Game video is streamed to a Raspberry Pi Zero that sits between the real mouse and the Gaming PC and adjusts the mouse movements. Google for "Valorant RPi0" for videos to see this in action.

Here is an excerpt from a now deleted Reddit comment where the setup is described:

> "My pi is connected to my pc as a HID device with all the descriptors matching a legitimate mouse. My actual mouse is not connected to my pc and the movements are relayed through to the pi over wifi (UDP). My pc records my screen and streams info to my pi over wifi which my pi happily adjusts as seen in the video."


There are cheats that work solely off of game recordings?


There are cheats that work just with input modification: FPS games often have a recoil simulation which requires you to adjust your aim downwards as you're firing (and it's predictable). You can make a hardware cheat device which just replays this motion when you click a button to fire, improving your aim substantially.


In many non-FPS games, of course. Even in FPS games you can make an aimbot (admittedly a crappy one) with nothing but AutoHotkey. The reason is that in most FPS games the enemy is highlighted in a specific color, so you just move the mouse until that color is centered and shoot.


And neural network upsampling, this can't be true, right?

https://www.youtube.com/watch?v=0X1RtXCvPFQ


Servers are already running AI to try and counter cheating in quite a few games.


2018 video on Valve's use of deep learning for anti-cheat in CSGO https://www.youtube.com/watch?v=ObhK8lUfIlc


Neat, I wasn’t aware of this application of DL. Very cool!


Aside, but I'm sad this is an acceptable use of the term AI, to mean "big statistics".


I find it amusing that you seem sad because one hyped term was used, only to go on to prefix the word statistics with "big". Statistics has been comfortable with large quantities since its inception.


I agree with you, ‘big data’ is silly too.


AI these days just means principal component analysis


But if the AI is functionally indistinguishable from a human player (human-like and undetectable), does it even matter really? https://xkcd.com/810/

(Conversely, if they're _not_ functionally indistinguishable from a human player, then that means it's possible to write an AI or other algorithm to detect them.)


AI-based detection systems have to be fairly conservative about banning accounts. so there will probably be a transition period where the clientside AI mimics a human well enough not to get banned, but makes enough "dumb computer" type mistakes to get noticed by other human players. this would be very annoying for the humans.

if the AI is truly indistinguishable from a human player, then it sort of depends on the human player we are comparing to. if the AI plays like the best human players in the world, this would be a big problem for games without a good ELO system.


You don't let chess engines compete in the chess world championship for humans


>> does it even matter really?

Because, even if acting like a human, AI can play/fight 24/7. Many online games still have leader boards that can be climbed through steady play rather than short-term bursts of skill.


One more reason to fix those games, I guess.


I made a video explaining the details of this paper if you are interested! https://youtu.be/H8F6J7mYyz0


Very interesting video! How does the GameGAN image discriminator work? I don't know much about GANs, but it seems to me that classifying Pac-Man images as real or fake would be a very hard problem because of the low resolution.

Suppose PacMan is in a corridor and there is a dot to the left of him and three to the right: ".P..." That is plausible because PacMan hasn't eaten any dots yet. But if the second dot to the right is missing: ".P._." something is wrong because the first dot to the right is still there and PacMan can't jump over dots. It must be very hard to get the discriminator to understand that the second image is fake.


Thanks for watching bjourne! There are three different discriminators: the first does realism for the game, the second does action-conditioned discrimination (making sure the generator takes in the action for its next-frame prediction), and the third does temporal realism with a 3D CNN.

Interesting scenario, it would be a cool experiment to walk through the latent space and try to find that kind of frame in the GAN latent space. I'm sure you could find it, but the question might be how far off the generated data's manifold is it.
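If it helps to see the idea in code, here is a minimal PyTorch sketch of what a three-discriminator setup along those lines might look like. This is purely illustrative: the layer sizes, action encoding, and clip handling are my own assumptions, not GameGAN's actual architecture.

    import torch
    import torch.nn as nn

    class FrameDiscriminator(nn.Module):
        """Scores a single frame for visual realism."""
        def __init__(self, channels=3):
            super().__init__()
            self.net = nn.Sequential(
                nn.Conv2d(channels, 32, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
                nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
                nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, 1),
            )
        def forward(self, frame):                      # (B, C, H, W)
            return self.net(frame)

    class ActionConditionedDiscriminator(nn.Module):
        """Scores whether frame_t -> frame_{t+1} is consistent with the action."""
        def __init__(self, channels=3, num_actions=5):
            super().__init__()
            self.conv = nn.Sequential(
                nn.Conv2d(2 * channels, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
                nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            )
            self.head = nn.Linear(64 + num_actions, 1)
        def forward(self, frame_t, frame_next, action_onehot):
            h = self.conv(torch.cat([frame_t, frame_next], dim=1))
            return self.head(torch.cat([h, action_onehot], dim=1))

    class TemporalDiscriminator(nn.Module):
        """Scores a short clip of frames with a 3D CNN for temporal coherence."""
        def __init__(self, channels=3):
            super().__init__()
            self.net = nn.Sequential(
                nn.Conv3d(channels, 32, kernel_size=(3, 4, 4),
                          stride=(1, 2, 2), padding=(1, 1, 1)), nn.LeakyReLU(0.2),
                nn.AdaptiveAvgPool3d(1), nn.Flatten(), nn.Linear(32, 1),
            )
        def forward(self, clip):                       # (B, C, T, H, W)
            return self.net(clip)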


Thanks for doing this. I stumbled upon one of your previous videos a while ago and really liked it. I'm super indebted to YouTubers like yourself that take the time to distill and explain complex topics, papers, and new research.


Thank you so much!


Thought experiment: Is my brain also generating reality without a physics engine and is filling in details that it thinks should be there? If true, what are the advantages/disadvantages of it?


Absolutely.

A few times a year I might wake up in the middle of the night and — at first glance — perceive a goblin sitting on a nearby chair. As more of me wakes up, I realise it’s a pile of my clothes.

Now I’m learning German, I have the experience of listening to a conversation in German, thinking I’ve heard a particular word, asking the locals what it means, and finding neither of them noticed that word being used — but I genuinely hear the unspoken word, so I know a pre-conscious part of my mind is filtering all the German I hear through a vocabulary which is too small.

My mother has Alzheimer’s, and one of the bigger surprises was realising quite how many reality-models we have by way of her losing them at different times. When I was caring for her, star jumps terrified her, she forgot what windows look like at night and how to count past 5, and she lost both object permanence and the concept of left: https://kitsunesoftware.wordpress.com/2018/01/31/alzheimers/

One of the advantages of having a model of the world is it allows you to predict the future.

The only disadvantage I can think of is that it is comparable to branch prediction failure in a pipelined CPU — for example it can be very tempting to apply that to fellow minds, and when those minds don’t think like you, you can annoy them no end by telling them (implicitly or explicitly) that you know better than they do what they do and don’t want. Or you might become convinced that wearing lucky pants causes your sports team to win/your lottery numbers to come up. Or etc.


Yes, absolutely. You can catch a baseball because your brain predicts where it will go before your eyes see its actual movement. This may be a good place to start learning more: https://en.wikipedia.org/wiki/Direct_and_indirect_realism


Another fun example of how your brain predicts physics stuff is when you misjudge what something weighs. For example, if you subconsciously judge a glass is full of water but it's actually empty, as you try to lift it your arm will fly up awkwardly. This all happens without conscious thought.

Don't tell me it never happened to you!


I knocked myself over once by lifting a solid double-corrugated cardboard box that I thought had a mini-lathe in it. I got ready to lift with both legs and promptly threw the box into my face and fell over.

That was one of the weirdest experiences I think I've had.


Also when you think there's a step, and there isn't one.


How is that "not a physics engine"?


It's more general than a physics engine in that it's predicting an expected outcome...it just so happens that the outcome here is largely dictated by physics. (At some level they all are but some more obviously than others)


It's ridiculously bad at modelling quickly rotating objects.


The other answers to your comment are great, but I also want to point out that you can feel your brain doing that if you play some video games for a long time. You can feel that you develop an intuition for the way the game will behave even if the in-game physics are very different from our real world, showing the plasticity of our brain. If we had a fully hardcoded physics simulator in our head we couldn't "think with portals".


This reminds me of a story when VR was just becoming big. Someone had been playing the game "Job Simulator" in the restaurant level where you play as a cook in the kitchen of a restaurant. The player was constantly looking in the fridge for things and realized they didn't have to open the door since the game didn't stop their head from clipping through. So they would just stick their head through the door when they needed to check.

Cut to a few hours later when they took a break and wanted a snack. So they went to their kitchen and tried to look in their real world fridge, and immediately slammed their head into the door as they tried to stick their head through it without thinking...


If I play a racing game for a while, I find it pretty difficult to drive safely right afterward.

I have not actually had any incidents, but I have to consciously not crash into things, or drive wildly.


I've had the same experience after playing FPS games for extended periods of time. Everything feels off.


I felt exactly this when playing Portal for the first time when it came out. It felt like nothing before or after. Highly recommended.


Here's a quick, fun demonstration of this. Close your left eye, then stare at some spot. Now move a finger around the right side of that spot (arm half extended, maybe ~20-30 degrees to the right).

If you manage to not move your eye while paying attention to the finger, you'll find that it doesn't look quite right when you put it over the blind spot.

With your finger you know it should be there, so something finger-like fills the blind spot. If instead of a finger you try with something like a black dot, you'll see it disappear completely.


This is a compression artifact. You cannot process everything that is going on, so your brain fills in the missing or uncertain parts with a similar (abstract) computation that it already has experienced in a similar manner. It is abstract in the sense that it only replicates some aspects of the real thing. Perceiving such a compression/abstraction is better for increasing reproductive success than receiving just noise when data is missing or has low resolution.


It means the original programmers of PAC-MAN didn't "understand" the game either :-). There was never an "engine"


Yes. Many optical illusions will prove that. Or try some psychedelics. I assume it's beneficial for survival.


It would be super interesting to see if this knowledge generalizes with transfer learning. For example, after seeing 50,000 episodes of PacMan, would the GAN be able to recreate Space Invaders with just 5,000 extra episodes?


Interesting to explore how this can be used. You still need to develop a game with an actual engine first, to generate all the samples for training the AI how to generate frames without an engine.

But maybe if training sample size can be considerably reduced, game making can open up to far less technical people. Instead of developing with an engine, game makers could show the AI "when I press this button, change the scene like this", and the AI could try to figure out the rules from given samples.


Seems like it's basically simulating an environment which it has agency to affect. To the extent that games are simulating some universe, it seems like this could also be applicable for simulating at least one universe that has a game engine we can't quite reproduce yet.


Programming then becomes teaching.


The point is this brings us closer to demonstrating AI which is in some sense grounded in reality, as opposed to some kind of "Clever Hans" or Searle's "Chinese Room." The network has learned how to "imagine playing out a scenario" much like a human player might use his own mental model of the game to consider various strategies to attempt, perhaps with the real game.

That the network apparently has a mental model which is rich enough to simulate actual gameplay implies that it hasn't just fit a static caricature of the game which immediately breaks on an unseen move. That is, it has a non-trivial "conceptualization" of its environment which goes beyond simple pattern matching or nearest-neighbor matching.


> Trained on 50,000 episodes of the game

Meanwhile an actual human will probably need only about 5 games to fully understand how the game works and reacts.


I think a few more.

Early games will establish things like "can ghosts go through each other", "if I hit the wall do I die" (like in Snake; though on the Acorn Electron you could eat the wall sometimes!), "can ghosts go through the shortcut", "how long do pills last", "do all pills last the same amount of time".

Then later efforts would be needed to know things like "what algorithm do the ghosts follow", "does the same colour ghost always follow the same algo", "do ghost algos change when you eat a fruit", "what order do the fruit go in" ...

Of course you could guess a lot of these, but that's not what "fully understand how the game works" means.


Something being glossed over is the difference between being able to play the game and being able to implement the game.

Also, Nvidia's thing doesn't do the fruits and possibly not even eating ghosts or points and probably not the ghost algos.


Fundamentally, this is because we have already been trained on years of living in and reacting to the real world, which, to some abstract degree, the game mimics. The Nvidia model starts with nothing.


So you are saying that if you first trained it on a different game with similar real-world mechanics, then training it on this game would require far fewer iterations?


Pac-Man isn't a purely abstract game, it's loaded with human cultural references. Pac-Man himself is a creature who loves eating food: mostly white pellets but also various kinds of fruit and of course the iconic power pellets. Fruit of course is something humans immediately identify as something good to eat and in the game they're rewarded with points which are denoted by numbers on the screen that grow larger.

It almost needn't be said that human culture puts a great deal of emphasis on making numbers grow larger, particularly those associated with personal wealth and economic growth. To a computer without this context, why should there be any reason to prefer larger numbers over smaller ones? Heck, if you're going all the way and denying the AI direct access to the game's memory, then it's got to learn to recognize the digits first before it can even begin to decide whether it's good to increase them.

Even more abstractly, the game relies on the human concepts of good and bad. It's good to eat the fruit and pellets to increase your score counter. It's bad to let a ghost catch pac-man, causing him to die and lose a life. Without being taught, human children can identify all of these elements but the computer would seem to have a more difficult task.


This is called transfer learning and it is an active area of research. It doesn't work nearly as well as it "should" if "AI" were really artificial "human intelligence".


It might. My understanding is that it is common to take trained models from one problem and reuse parts of them on other problems in a similar domain to speed up training.
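For readers who haven't seen it, the usual pattern looks something like this (a minimal sketch using torchvision's pretrained ResNet as an example; whether anything similar transfers to a GAN-based game model like this one is exactly the open question above):

    import torch.nn as nn
    from torchvision import models

    # Start from features learned on ImageNet.
    model = models.resnet18(pretrained=True)

    # Freeze the pretrained backbone so only the new head is trained at first.
    for param in model.parameters():
        param.requires_grad = False

    # Replace the final layer for the new task (e.g. 10 classes in the new domain).
    model.fc = nn.Linear(model.fc.in_features, 10)
    # Train model.fc on the new data; optionally unfreeze and fine-tune later.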


No, in humans the cycles involved in walking and 3d orienting carry over, so several thousand iterations of pre-game and a few of game.


This is a tired argument; experiments on humans with random textures show only a 10x slowdown, not 10,000x.


Indeed. Some work has actually been done to try to quantify how important various types of human prior knowledge are for learning video games: https://arxiv.org/pdf/1802.10217.pdf


Human intelligence basically evolved (as far as we know) by retarding childhood development so we had neural plasticity for longer, to learn more. Crazy that we're giving our already-dumb AIs the extra challenge of learning things from nothing and doing so quickly.


I wouldn't say "fully" https://www.gamasutra.com/view/feature/3938/the_pacman_dossi...

& also that humans come with a huge amount of training beforehand. We've usually experienced eating & ghost stories & moving around in 3D space before playing Pac-Man.


OTOH if you told a current AI a ghost story it wouldn't be able to do anything useful with that info in a Pac-Man context.


Not a very convincing argument. Even when Pac-Man was new and video games barely existed, it took no time to get how it works.


Because it resembles a world you are very familiar with. Let's say it's something we don't have experience with, for example docking two satellites under orbital mechanics: humans would struggle because their intuition is wrong. AI may do better at this.


and how many games does a human need before they can turn what they've seen into a Pac-Man game playable by another human? ...lol


The same. Once you understand how it works, it's just about coding. Coding Tetris never required anyone to play it millions of times either.


For those confused, this is just a recreation of an animated image of Pac-Man, not a playable recreation of the actual game.

Impressive still, but not exactly what the title leads you to believe.


It's not quite as playable as the full pacman game, but it is more than an animation since it's conditioned on the action taken


Is it? It says the model keeps track of user input to generate appropriate frames.



So how large is the generated model compared to the Pac-Man binary? (edit: which was 16KB)


I wonder what the copyright implications of this are. Obviously it's Pacman, but none of the source is the same.


Game rules aren't copyrightable. The visuals are a copy but obviously fair use, and it's obviously trademark use, and Nvidia probably has a marketing partnership with NAMCO for this


"“We were blown away when we saw the results, in disbelief that AI could recreate the iconic PAC-MAN experience without a game engine,” said Koichiro Tsutsumi from BANDAI NAMCO Research Inc., the research development company of the game’s publisher BANDAI NAMCO Entertainment Inc., which provided the PAC-MAN data to train GameGAN. “This research presents exciting possibilities to help game developers accelerate the creative process of developing new level layouts, characters and even games.”"

Emphasis mine.

Also, while I disagree that this would be real "fair use" (rather than Internet Pundit fair use, which is virtually indistinguishable from "I want this, so it must be legal") without permission, it wouldn't matter anyhow. If nVidia did this without permission and tried to distribute exactly what we saw, it would also be a trademark violation. Pacman is almost certainly in the top 1% of defended trademarks, if not .1% or .01%... not a great thing to try to use without permission.


I'm not convinced the visuals would be fair use. However it was certainly done in partnership with Namco. It's celebrating Pac-Man's 40th anniversary and says at the bottom of the article: PAC-MAN™ & ©BANDAI NAMCO Entertainment Inc.

Plus there is a massive giveaway within the article that Namco approved this:

We were blown away when we saw the results, in disbelief that AI could recreate the iconic PAC-MAN experience without a game engine,” said Koichiro Tsutsumi from BANDAI NAMCO Research Inc., the research development company of the game’s publisher BANDAI NAMCO Entertainment Inc., which provided the PAC-MAN data to train GameGAN. “This research presents exciting possibilities to help game developers accelerate the creative process of developing new level layouts, characters and even games.”


The design of their GAN looks super complex, and Pacman is pretty simple... I kinda wonder if they meta-overfit the game (ie the Pacman game engine which they say is missing is sorta hidden in the structure of the GAN architecture that they start off with)


How long until this method evolves to solve business requirements? (Argument could be made that the game is a set of business requirements)

Imagine visual programming, but instead of connecting systems and dealing with high-level schemas, the stakeholder just defines the data flow, and the model creates a (hopefully) more extendable system than current visual programming systems.

But the complexity/black-box nature of some neural nets (and the CPU designed by AI that was enormously complex) may carry over to this use case. Meaning we can create systems that are more efficient, more easily, but we ourselves can't interpret the entire system, let alone each piece, sufficiently.


In most real world business situations we can't even define the requirements and constraints in a manner amenable to AI solutions. It only works for the most trivial issues.


I've never had a harder time explaining something this impressive to my gf.


So AI can learn to play the games, but can it learn to create them? That is, can it create compelling cultural novelty?

The most successful examples I've heard are computer generated classical music, but generating abstract art like classical music is simpler than generating art that relies on coherent language and narrative, like opera, or the blues, or a good novel.


There has been research on it. I forgot the source and can’t find it easily, but if you email Julian Togelius, he might know. He doesn’t focus on game AI research and is the only name I can remember at the moment.

Edit: oh, Togelius also seemed to have researched it.

https://www.google.com/amp/s/www.technologyreview.com/2017/1...


Cool. Thanks for that link!


This seems less impressive than the Mario recreation from a few years ago[0] -- I must be missing something because this is a simpler game that was trained on significantly more data, yet the result is comparatively worse.

[0] https://www.theverge.com/2017/9/10/16276528/ai-video-games-g...


What's the total number of possible states in Pac-Man?


The number of possible pellet configurations on the grid is astronomical on its own, then you have pacman positions and ghost positions at least. I don't know the exact value but my estimate is "a very big number".
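A back-of-the-envelope count just for the pellets (assuming the standard arcade maze with 240 dots and 4 power pellets, counting each as independently present or eaten, and ignoring reachability, positions, modes, timers, score, and lives):

    # Naive count: each of the 244 pellets treated as independently present or eaten.
    pellets = 240 + 4
    pellet_configs = 2 ** pellets
    print(f"~10^{len(str(pellet_configs)) - 1} pellet configurations")   # ~10^73
    # Factor in Pac-Man and ghost positions, ghost modes, timers, etc. and
    # "a very big number" is about right.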


It's not so much the total number, but the dimensionality.

Eg if you have a blob that can move in x and y coordinates, it doesn't get more complicated just because you move from 32-bit floating point numbers to 64-bit floating point numbers. Even though the number of states squares.


Does anyone know if we can combine knowledge from different domains when training to play games? Example: have a language model read pieces of text about the game, then feed the output of that into a model that's doing normal reinforcement learning. Maybe train them both at the same time?

The hope is that the model can combine inputs from different domains and generalize better.


I predict methods like this will be used to create a simulation environment that approximates the real world. Reinforcement learning folks are running out of game engines to win inside of. I believe the best way to transcend this coming limitation will be to take a learning-based approach similar to this.


I mean, it's pretty clear that we are already living in such simulation, right?


What evidence do you have? How would the world look different, if we weren't in a simulation?


I wonder, does it also reproduce the four unique behaviours of the ghosts? (their movement patterns are all programmed differently, one is random, one goes away from you, and I'm not sure about the rest)


For anyone interested in how the ghosts' AI works: https://youtu.be/ataGotQ7ir8 I doubt the AI actually recreated this.


Please help me understand the point. Is it to show that nVidia chips can be used to create game AI using GANs? Or is it to aid in game design? Wouldn't a human be in a better position to do both?


If I'm understanding correctly, the machine learning model was able to create a game executable from information about a game.


From the article, it sounds to me more like the AI itself is rendering individual frames of the game. It takes the last frame and the current state of the gamepad as input, and outputs the next frame. They've effectively replaced the entire game code with a neural network.
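A minimal sketch of the loop being described, with the trained network standing in for the game engine (`generator` and `read_gamepad` here are hypothetical placeholders, and GameGAN's real model also carries an internal memory state that this glosses over):

    import torch

    def play(generator, initial_frame, read_gamepad, num_steps=1000):
        frame = initial_frame                    # (1, C, H, W) tensor
        with torch.no_grad():
            for _ in range(num_steps):
                action = read_gamepad()          # e.g. one-hot over up/down/left/right/noop
                frame = generator(frame, action) # the network predicts the next frame
                yield frame                      # display it; no game engine involved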


How do they know it's fully functional without trying out how it reacts in every possible situation (which isn't really feasible)?


I guess they should train an AI to play it too


So wait, is this playable, or is it just capable of reproducing video of gameplay?


If the old patterns don't work, it's not really Pac-Man.


hmm, train a bunch of UI elements and interactions and feed it underlying codebases and see what comes back... lol.


That’s actually been done at least a couple years back.

https://towardsdatascience.com/code2pix-deep-learning-compil...


dumb question but could ML be made to copy an OS? since this has a learning/input ability with rules.


I'd love to see more gameplay. In fact, I'd love to play it myself. I just can't believe it replicates the original gameplay exactly.





