C-based Virtual Machine for Notch's DCPU-16 (github.com/swetland)
128 points by coderdude on April 4, 2012 | 89 comments



There's also a VM written in Go by kballard: https://github.com/kballard/dcpu16.go/

And a disassembler in JavaScript by evilpie: https://gist.github.com/2300590

Edit:

CoffeeScript, too (via steverb): https://github.com/Janiczek/cfs-dcpu16

@sp332: I feexed it


I started one in JavaScript (As an exercise to learn JS beyond DOM manipulation): https://github.com/tscho/dcpu16-js


Awesome man! You solved quite a few problems I had working on mine. (Not sure I understand everything though)

I checked your code, fixed a few things. It now works for: 7c01 0030 7de1 1000 0020 7803 1000 https://github.com/fbulens/dcpu16-js

I'll continue tomorrow (Europe here).


Glad you found it helpful! I merged in your changes and added an HTML interface.


If anyone's interested, I've repackaged kballard's Go VM so you can compile it and it will run a sample Hello World program, outputting it to stdout: http://github.com/M4v3R/dcpu16.go/


Not quite as fancy as an emulator, but I've put together a JavaScript assembler: https://github.com/AlexNisnevich/dcpu16-assembler



Another one in coffeescript: https://github.com/itsbth/DCPU.coffee

Seems to work, but I have no idea if it's correct, as I don't have much to test it on.


Yet another one in CoffeeScript: https://gist.github.com/2308725


And a disassembler in Go (by me): https://bitbucket.org/1player/godcpu16


An enum with the opcodes would have been a lot nicer. I guess there's no time to lose when you're racing the rest of the world to build stuff for a CPU specified on one page of paper.


I'm sure pull requests would be graciously considered. :)


Sorry, this came off a bit rude. I just saw that the Go version has the same magic numbers all over instead of a straightforward enum that would have cost nothing, so I felt it was worth mentioning.

On the other hand, I feel that building all kinds of stuff around Notch's MMO CPU is just an outlet for all of the excitement that has built up. I guess it's up to Mojang to deliver and stop people from drifting off course.


I think you're a bit prejudiced against "magic numbers". If a number is used only once in the entire program, why bother giving it an enum? Just write a comment explaining what it is. That's what I did with the Go implementation.


I am not prejudiced; I just know the problem. Enums provide type safety, they integrate with the tools we use, the debugger knows about them, and many languages let you get a string representation of them for pretty debug output. And best of all, they are completely free.
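
For the DCPU-16, the whole set of basic opcodes fits in a few lines of C. Something like this sketch (the identifiers are mine; the numeric values come from Notch's spec):

    /* Sketch: naming the basic opcodes instead of scattering bare numbers.
       Identifiers are illustrative, not any particular repo's code. */
    enum dcpu_opcode {
        OP_NONBASIC = 0x0,
        OP_SET = 0x1, OP_ADD = 0x2, OP_SUB = 0x3,
        OP_MUL = 0x4, OP_DIV = 0x5, OP_MOD = 0x6,
        OP_SHL = 0x7, OP_SHR = 0x8,
        OP_AND = 0x9, OP_BOR = 0xa, OP_XOR = 0xb,
        OP_IFE = 0xc, OP_IFN = 0xd, OP_IFG = 0xe, OP_IFB = 0xf,
    };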


Looking at this C code, it could be made more self-documenting with a few constants (in an enum if you want) for the values rather than just writing them inline. Comments are fine, but it seems like most of the values are used twice.


An enum is a type of comment that can be enforced by the compiler and is likely less typing than adding /* regular comments */.


It can only be enforced if it's used more than once.


> If a number is used only once in the entire program, why bother giving it an enum?

Software changes, and is often changed by someone other than the original author.


I hate magic numbers; I believe comments should be a last resort. The code should be self-documenting. It's about making the code easier to read and understand. One thing that I learned with CS is that it's all about building higher abstractions, bringing the logic closer to the domain you are modeling. If there is one word that sums up CS, I would say abstraction. That's why we have high-level languages, OOP, operating systems, folders and files, etc.


Normally an enum is better.

Here an enum is not a good idea if you have a switch statement like:

    switch (val) { /* val is uint8_t */ case OP_XXX: break; case OP_YYY: break; case OP_ZZZ: break; }

It looks nice, but when debugging you have no idea which case you're going to hit when you set a breakpoint at the switch, because val is not an enum type, so the debugger can't make the job easier for you.

Now it is much better:

    switch (val) { /* val is uint8_t */ case 0: /* XXX */ break; case 1: /* YYY */ break; case 2: /* ZZZ */ break; }


> It looks nice, but when debugging you have no idea which case you're going to hit when you set a breakpoint at the switch, because val is not an enum type, so the debugger can't make the job easier for you.

So... move forward a step?


Things could be more complicated than that.

The val may be a corrupted value that isn't covered by any of the listed cases. However, you will be able to identify what's going wrong right away by comparing the value with the listed cases.

Yes, it is rare, but sometimes I do wish I hadn't used an enum.


    putting three spaces in front of a line formats it as code and allows use of *
not putting those spaces means that text in asterisks becomes italics


Well, I think of it more as "fan art", if you will, than a serious project. I was trying to aim for a "one page" implementation of a one-page spec (given how generally clean and tidy the design is) because it was fun.


And I'm pretty sure the O register isn't supposed to be set in all cases. And what was lit[] supposed to be for?


Yeah, think you're right. I'm out of spec there. That's what you get for trying to be cute. Fixed now.


I think your lit table also implies you can write to literals. So I can 'SET 2, 0x1f; SET A, 2' and now A is 0x1f.


There's a check in the writeback phase that prevents that.


Ah-ha, yes there is. Sorry.


He needs 'lit' so dcpu_opr() can uniformly return a pointer for all results, including literal values.
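
Roughly the shape of the idea, as a sketch with made-up field names rather than the repo's exact code: the decoder always hands back a pointer, so literal operands need a place in memory to point at.

    #include <stdint.h>

    struct dcpu {
        uint16_t r[8], pc, sp, o;
        uint16_t ram[65536];
        uint16_t lit[0x20];    /* backing store for literal operands 0x00..0x1f */
    };

    static uint16_t *dcpu_opr(struct dcpu *d, uint16_t code) {
        if (code < 0x08) return &d->r[code];                /* register */
        if (code < 0x10) return &d->ram[d->r[code & 7]];    /* [register] */
        if (code >= 0x20) {                                 /* literal 0x00..0x1f */
            d->lit[code - 0x20] = code - 0x20;
            return &d->lit[code - 0x20];
        }
        /* ... remaining operand forms elided ... */
        return &d->lit[0];
    }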


Suggestion: Sorry, I don't have the time right now, but someone should start and curate a GitHub project containing specs, test cases, expected results, etc.

A nice addition, until Notch's actual design is known, would be to specify a simple community-created "hardware" spec: say memory areas for 40x25 text, 80x25 text, 320x200 monochrome, etc. and a low page (say the first 0xff bytes) for switching between those modes. That way, code could be easily shared and tested.


Notch tweeted a screenshot (sorry, having trouble finding it) that seemed to show a 32x16 text grid. I'm assuming that's what 0x8000.. maps to.


Like a framebuffer? I guess it would be easier to specify an inevitable syscall mechanism.


I'd think just a simple standard for 80x25 text would be a great start...


I think Notch is thinking the same thing: https://twitter.com/#!/notch/statuses/184933588754628609

I'm guessing it's 80x25, 2000 words starting at 0x8000, high byte is the attribute and low byte is the character. Just like a CGA/EGA/VGA display (except that started at 0xA0000). But it's sort of silly to speculate at such an early stage; for instance, is this emulated on the server? client? both, kept in sync somehow? If on the server, do you have to constantly send framebuffer updates?
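
If that guess turned out to be right (80x25 cells from 0x8000, attribute in the high byte, character in the low byte), writing a cell would be a one-liner; purely speculative sketch, of course:

    #include <stdint.h>

    /* Speculative layout: 80x25 words starting at 0x8000,
       high byte = attribute, low byte = character. */
    #define VRAM_BASE 0x8000
    #define VRAM_COLS 80

    static void put_cell(uint16_t *ram, int row, int col, uint8_t ch, uint8_t attr) {
        ram[VRAM_BASE + row * VRAM_COLS + col] = (uint16_t)(attr << 8) | ch;
    }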


Since it's text, I would have started it at 0xB000, I think it was. That's where I believe the text mode on old XT/AT systems used to pull it from. It's been way too long since I've dealt with that for me to remember properly.


Oh yeah. 0xB8000. (You'd use a segmented address of 0xB800:0000, but the "segment" part is counted in 16-byte "paragraphs", so the logical "flat" address was equivalent to 0xB8000).


That was it! Boy does that bring back memories. Some bad ones, some good ones.

I can only imagine how fast putting text on the screen can be if you did it with some of the vector instructions in a modern processor....


I remember using a Hercules graphics card exactly like this...


A calling convention would also be useful.


Related, for all of the people who've already sunk a bunch of work into this: the official lore/backstory says that the whole premise was instigated by a Big/Little Endian mixup, and that the DCPU is supposed to be Little Endian...

But from Notch's spec document:

"Instructions are 1-3 words long and are fully defined by the first word. In a basic instruction, the lower four bits of the first word of the instruction are the opcode, and the remaining twelve bits are split into two six bit values, called a and b. a is always handled by the processor before b, and is the lower six bits. In bits (with the least significant being last), a basic instruction has the format: bbbbbbaaaaaaoooo"

In other words, the current DCPU spec is Big Endian. So I wonder if Notch will be rewriting the spec, even though from all accounts it seems that he's already implemented the emulator.
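
Whatever the endianness ends up being, the packing of a basic instruction word itself is unambiguous; a C decoder would pull the fields out roughly like this (a sketch, not any particular implementation's code):

    #include <stdint.h>

    /* Decode a basic instruction word per the quoted format: bbbbbbaaaaaaoooo */
    static void decode(uint16_t w, unsigned *o, unsigned *a, unsigned *b) {
        *o = w & 0x000f;          /* lower four bits: opcode  */
        *a = (w >> 4) & 0x003f;   /* next six bits: operand a */
        *b = (w >> 10) & 0x003f;  /* top six bits: operand b  */
    }

    /* e.g. 0x7c01 ("SET A, 0x30") decodes to o=0x1 (SET),
       a=0x00 (register A), b=0x1f (next word literal) */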


The spec example

   For example, to do a 32 bit add of 0x12345678 and 0xaabbccdd, do this:
      SET [0x1000], 0x5678    ; low word
      SET [0x1001], 0x1234    ; high word
suggests he's thinking in little-endian.

I don't think it's relevant to this spec though: there's no way for a DCPU program to detect the CPU's endianness. Imagine it's C with 16-bit "char"s and no larger int sizes.


Endianness doesn't really mean anything in the context of a single word (except maybe on serial lines). What made you conclude the above suggests big endian?


It'd be funny to see the first programs running in the game make the same mistake. Maybe the inconsistency in Notch's spec is purposeful.


little endian words, big endian instructions?


I'm sure right now someone somewhere is working on an FPGA implementation of the DCPU-16.


That would be a hell of a hack if Notch ran the backend on a handful of high-speed gate arrays.


Already done. I was asked to do it for my undergrad thesis.


tell us more, do you have a link?


I have a feeling this game will be the driving force in making me read up on this area of CS.


From my read of this, we do not yet have code that accurately emulates the number of cycles per instruction, which will probably turn out to be critical in the final game.


Most instructions take one cycle. The ones that don't, disconcertingly, have the results available immediately and then deduct cycles. I don't know if this will be fixed later or not. https://twitter.com/#!/notch/status/187449161216573441


As far as I can tell, I think the cycle count is intended to model the cost of cycles against your 100KHz budget. I'm not sure I think it's worth the complexity.

Given that Notch seems to want to continuously operate one or more virtual CPUs for every user, if it were me, I'd favor making the execution model as simple and cheap as possible.


Why do you care when the result is available if the next instruction will run after the wait is complete?

I guess it matters if another CPU can read your memory and depends on the cycle count to synchronize...


It also matters if writing to the memory causes an actuator to... act.

Picture if memory location 7 is mapped to how much energy to pump into the laser cannon.


In which case, having the interaction happen when you send the command, and not at a command-specific point during the execution, should simplify things?


If it's supposed to take 3 clock cycles to execute an instruction, then it's "unfair/unrealistic" to really activate the laser cannon instantly, and then make the processor wait 3 clock cycles before it begins to execute the next instruction.

It'd be "more fair/realistic" if it took 3 clock cycles, and THEN the memory was overwritten.

Note: this is debating the timing of a 100 kHz simulated processor, within a game... so, the discussion is kind of inherently pedantic. =)
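
If you did want the "fair" version, one cheap way is to compute the result when the instruction is decoded but only commit the writeback once the cycle cost has elapsed; a sketch with made-up names:

    #include <stdint.h>

    /* Sketch: defer the writeback so a 3-cycle instruction takes effect
       on the third tick of the 100 kHz budget, not instantly. */
    struct pending {
        uint16_t *dst;    /* where the result goes            */
        uint16_t value;   /* what to write                    */
        int cycles;       /* cycles left before it is visible */
    };

    static void tick(struct pending *p) {
        if (p->dst && --p->cycles == 0) {
            *p->dst = p->value;   /* writeback happens here */
            p->dst = 0;           /* free to decode the next instruction */
        }
    }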


Note that this VM can invoke behaviour that is explicitly left undefined by the C11 standard. For example, this DCPU-16 program can cause a shift by an amount greater than the width of uint16_t:

    SET A, 0x1
    SHL A, 0x100
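
A portable emulator would want to clamp or special-case the count before doing the C-level shift, e.g. something along these lines (a sketch, not the repo's actual code):

    #include <stdint.h>

    /* Sketch: shifting a 32-bit value by >= 32 is undefined in C11, so
       clamp the count; O receives the bits shifted out the top. */
    static uint16_t dcpu_shl(uint16_t a, uint16_t b, uint16_t *o) {
        uint32_t wide = (b >= 32) ? 0 : ((uint32_t)a << b);
        *o = (uint16_t)(wide >> 16);
        return (uint16_t)wide;
    }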


The Java Language Specification indeed states that the shift amount should be ANDed with 31 or 63, depending on the type of the left-hand operand. I guess Notch did not intend this behavior, however.

http://docs.oracle.com/javase/specs/jls/se7/html/jls-15.html...


This will, of course, set both A and O to 0.


Depends how left shift is implemented: on some processors (including x86), only the lowest order bits of the number of positions to shift by are used. In Java:

    char A = 1;
    A = (char) ((A << 32) & 0xFFFF);
    System.out.println(Integer.toHexString(A));

actually outputs 1.


I have a terrible feeling I'm missing something here


Notch, creator of Minecraft, has announced a new space-themed MMO in which players can write code against a 16-bit machine architecture; the code will be executed to fly your ship around, even while you're not connected to the game.

This is a C implementation of that machine architecture.

I don't think we have any info on sensors or actuators for the ships though, yet...


Someone should make something analogous to the Gertboard for the Raspberry Pi [0], except virtual, to solve that problem.

[0] http://www.raspberrypi.org/archives/411


The next game from Notch (creator of Minecraft) is called 0x10c and features a 16-bit CPU. More details can be found at http://0x10c.com/


No, sorry, you are missing nothing. This is actually happening.


The first thing I'm doing with this is making it play minecraft.


The first world you should test it on is one in which someone's built a control panel/interface for the CPU.


and it was turtles all the way down...


I made one in C++: https://github.com/isamgray/DCPU-16-CPP I'm new to writing emulators and such, so I used the already-made C one as a sort of "guide". No copy/paste of code, though. I plan on adding a disassembler soon (I already have most of the code written).

I hope someone finds it useful!


It would seem like some kind of compiler is going to be written before the first beta is even released. It's like a space race!


I know the architecture was built to be very easy to emulate, so I'd like to see how it compares to a PyPy JIT version :)


It would probably be worth it to temporarily define one of the reserved opcodes as putchar.


Yes, "temporarily".

It would definitely be a fitting nod to the history of the 6502 (which inspired this processor) to have a variety of incompatible undocumented instructions. Maybe not a good thing, but... fitting.


Video RAM starts at 0x8000.


True, but there's no information on the display modes or the framebuffer format.


Instead of using dcpu.skip, why not just (d->pc += !res)? That way you don't have to call dcpu_step() again before getting to the next instruction, and it more accurately follows the way the hardware would likely work.


Instructions can be 1-3 words depending on opcode and operand types. I was using the skip register to avoid having to decode instructions just to skip 'em. Turns out that my operand decoder is side-effecty (adjusts SP for PUSH/POP operands), so I had to go this route anyway. Code updated.
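
For reference, instruction length is computable from the operand codes alone (per the spec's operand table, codes 0x10-0x17, 0x1e, and 0x1f each pull in an extra word), so the skip can avoid the side-effecty decoder entirely; a rough sketch, not the exact code:

    #include <stdint.h>

    /* Words consumed by an operand beyond the instruction word itself. */
    static int opr_words(unsigned code) {
        return ((code >= 0x10 && code <= 0x17) || code == 0x1e || code == 0x1f) ? 1 : 0;
    }

    /* Total words occupied by the instruction in w, without touching SP. */
    static int insn_words(uint16_t w) {
        unsigned o = w & 0xf, a = (w >> 4) & 0x3f, b = (w >> 10) & 0x3f;
        if (o == 0)                   /* non-basic: opcode lives in a, operand in b */
            return 1 + opr_words(b);
        return 1 + opr_words(a) + opr_words(b);
    }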


Ah, thanks. I forgot about that.


Someone should implement C on the DCPU-16 before launch, so the assembly language isn't necessary. Mid- to high-level languages FTW!


Can anyone help me understand how this will work with the game? I don't see anything about flying virtual ships around...


As far as I can understand, the virtual ships have programmable hardware.


CoffeeScript-based interpreter:

https://github.com/Janiczek/cfs-dcpu16



How long till someone installs a Linux distro on the new game's simulated CPU?


Hopefully/probably very long, since the CPU in question is quite small. 64 KB of total memory (code and data) is rather cramped for running Linux. Also, it has no I/O and no interrupts. It'd have to be a pretty radically small OS kernel.


Is there an assembler?




