C-based Virtual Machine for Notch's DCPU-16 (github.com/swetland)
128 points by coderdude on April 4, 2012 | 89 comments



There's also a VM written in Go by kballard: https://github.com/kballard/dcpu16.go/

And a disassembler in JavaScript by evilpie: https://gist.github.com/2300590

Edit:

CoffeeScript, too (via steverb): https://github.com/Janiczek/cfs-dcpu16

@sp332: I feexed it


I started one in JavaScript (As an exercise to learn JS beyond DOM manipulation): https://github.com/tscho/dcpu16-js


Awesome man! You solved quite a few problems I had working on mine. (Not sure I understand everything though)

I checked your code, fixed a few things. It now works for: 7c01 0030 7de1 1000 0020 7803 1000 https://github.com/fbulens/dcpu16-js

I'll continue tomorrow (Europe here).


Glad you found it helpful! I merged in your changes and added an HTML interface.


If anyone's interested, I've repackaged kballard's Go VM so you can compile it and it will run a sample Hello World program, outputting it to stdout: http://github.com/M4v3R/dcpu16.go/


Not quite as fancy as an emulator, but I've put together a JavaScript assembler: https://github.com/AlexNisnevich/dcpu16-assembler



Another one in coffeescript: https://github.com/itsbth/DCPU.coffee

Seems to work, but I have no idea if it's correct, as I don't have much to test it on.


Yet another one in CoffeeScript: https://gist.github.com/2308725


And a disassembler in Go (by me): https://bitbucket.org/1player/godcpu16


An enum with the opcodes would have been a lot nicer. I guess there's no time to lose when you're racing the rest of the world to build stuff for a CPU specified on one page of paper.


I'm sure pull requests would be graciously considered. :)


Sorry, this came off a bit rude. I just saw that the Go version has the same magic numbers all over instead of a straightforward enum that would have cost nothing, so I felt it was worth mentioning.

On the other hand, I feel that building all kinds of stuff around Notch's MMO CPU is just an outlet for all of the excitement that has built up. I guess it's up to Mojang to deliver and stop people from drifting off course.


I think you're a bit prejudiced against "magic numbers". If a number is used only once in the entire program, why bother giving it an enum? Just write a comment explaining what it is. That's what I did with the Go implementation.


I am not prejudiced; I just know the problem. Enums provide type safety, they integrate with the tools we use, the debugger knows about them, and many languages let you get a string representation of them for pretty debug output. And best of all, they are completely free.
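
For the DCPU-16, the whole set of basic opcodes fits in a few lines of C. Something like this sketch (the identifiers are mine; the numeric values come from Notch's spec):

    /* Sketch: naming the basic opcodes instead of scattering bare numbers.
       Identifiers are illustrative, not any particular repo's code. */
    enum dcpu_opcode {
        OP_NONBASIC = 0x0,
        OP_SET = 0x1, OP_ADD = 0x2, OP_SUB = 0x3,
        OP_MUL = 0x4, OP_DIV = 0x5, OP_MOD = 0x6,
        OP_SHL = 0x7, OP_SHR = 0x8,
        OP_AND = 0x9, OP_BOR = 0xa, OP_XOR = 0xb,
        OP_IFE = 0xc, OP_IFN = 0xd, OP_IFG = 0xe, OP_IFB = 0xf,
    };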


Looking at this C code, it could be made more self-documenting with a few constants (in an enum if you want) for the values rather than just writing them inline. Comments are fine, but it seems like most of the values are used twice.


An enum is a type of comment that can be enforced by the compiler and is likely less typing than adding /* regular comments */.


It can only be enforced if it's used more than once.


> If a number is used only once in the entire program, why bother giving it an enum?

Software changes, and is often changed by someone other than the original author.


I hate magic numbers; I believe comments should be a last resort. The code should be self-documenting. It's about making the code easier to read and understand. One thing that I learned with CS is that it's all about building higher abstractions, bringing the logic closer to the domain you are modeling. If there is one word that sums up CS, I would say abstraction. That's why we have high-level languages, OOP, operating systems, folders and files, etc.


Normally an enum is better.

Here an enum is not a good idea if you have a switch statement like:

    switch (val) { /* val is uint8_t */ case OP_XXX: break; case OP_YYY: break; case OP_ZZZ: break; }

It looks nice, but when debugging you have no idea which case you're going to hit when you set a breakpoint at the switch, because val is not an enum type, so the debugger can't make the job easier for you.

Now it is much better:

    switch (val) { /* val is uint8_t */ case 0: /* XXX */ break; case 1: /* YYY */ break; case 2: /* ZZZ */ break; }


> It looks nice, but when debugging you have no idea which case you're going to hit when you set a breakpoint at the switch, because val is not an enum type, so the debugger can't make the job easier for you.

So... move forward a step?


Things could be more complicated than that.

The val may be a corrupted value that isn't covered by any of the listed cases. However, you will be able to identify what's going wrong right away by comparing the value with the listed cases.

Yes, it is rare, but sometimes I do wish I hadn't used an enum.


    putting three spaces in front of a line formats it as code and allows use of *
not putting those spaces means that text in asterisks becomes italics


Well, I think of it more as "fan art", if you will, than a serious project. I was trying to aim for a "one page" implementation of a one-page spec (given how generally clean and tidy the design is) because it was fun.


And I'm pretty sure the O register isn't supposed to be set in all cases. And what was lit[] supposed to be for?


Yeah, think you're right. I'm out of spec there. That's what you get for trying to be cute. Fixed now.


I think your lit table also implies you can write to literals. So I can 'SET 2, 0x1f; SET A, 2' and now A is 0x1f.


There's a check in the writeback phase that prevents that.


Ah-ha, yes there is. Sorry.


He needs 'lit' so dcpu_opr() can uniformly return a pointer for all results, including literal values.
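
Roughly the shape of the idea, as a sketch with made-up field names rather than the repo's exact code: the decoder always hands back a pointer, so literal operands need a place in memory to point at.

    #include <stdint.h>

    struct dcpu {
        uint16_t r[8], pc, sp, o;
        uint16_t ram[65536];
        uint16_t lit[0x20];    /* backing store for literal operands 0x00..0x1f */
    };

    static uint16_t *dcpu_opr(struct dcpu *d, uint16_t code) {
        if (code < 0x08) return &d->r[code];                /* register */
        if (code < 0x10) return &d->ram[d->r[code & 7]];    /* [register] */
        if (code >= 0x20) {                                 /* literal 0x00..0x1f */
            d->lit[code - 0x20] = code - 0x20;
            return &d->lit[code - 0x20];
        }
        /* ... remaining operand forms elided ... */
        return &d->lit[0];
    }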


Suggestion: Sorry, I don't have the time right now, but someone should start and curate a GitHub project containing specs, test cases, expected results, etc.

A nice addition, until Notch's actual design is known, would be to specify a simple community-created "hardware" spec: say memory areas for 40x25 text, 80x25 text, 320x200 monochrome, etc. and a low page (say the first 0xff bytes) for switching between those modes. That way, code could be easily shared and tested.


Notch tweeted a screenshot (sorry, having trouble finding it) that seemed to show a 32x16 text grid. I'm assuming that's what 0x8000.. maps to.


Like a framebuffer? I guess it would be easier to specify an inevitable syscall mechanism.


I'd think just a simple standard for 80x25 text would be a great start...


I think Notch is thinking the same thing: https://twitter.com/#!/notch/statuses/184933588754628609

I'm guessing it's 80x25, 2000 words starting at 0x8000, high byte is the attribute and low byte is the character. Just like a CGA/EGA/VGA display (except that started at 0xA0000). But it's sort of silly to speculate at such an early stage; for instance, is this emulated on the server? client? both, kept in sync somehow? If on the server, do you have to constantly send framebuffer updates?
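
If that guess turned out to be right (80x25 cells from 0x8000, attribute in the high byte, character in the low byte), writing a cell would be a one-liner; purely speculative sketch, of course:

    #include <stdint.h>

    /* Speculative layout: 80x25 words starting at 0x8000,
       high byte = attribute, low byte = character. */
    #define VRAM_BASE 0x8000
    #define VRAM_COLS 80

    static void put_cell(uint16_t *ram, int row, int col, uint8_t ch, uint8_t attr) {
        ram[VRAM_BASE + row * VRAM_COLS + col] = (uint16_t)(attr << 8) | ch;
    }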


Since it's text, I would have started it at 0xB000, I think it was. That's where I believe the text mode on old XT/AT systems used to pull it from. It's been way too long since I've dealt with that for me to remember properly.


Oh yeah. 0xB8000. (You'd use a segmented address of 0xB800:0000, but the "segment" part is counted in 16-byte "paragraphs", so the logical "flat" address was equivalent to 0xB8000).


That was it! Boy does that bring back memories. Some bad ones, some good ones.

I can only imagine how fast putting text on the screen can be if you did it with some of the vector instructions in a modern processor....


I remember using a Hercules graphics card exactly like this...


A calling convention would also be useful.


Related, for all of the people who've already sunk a bunch of work into this: the official lore/backstory says that the whole premise was instigated by a Big/Little Endian mixup, and that the DCPU is supposed to be Little Endian...

But from Notch's spec document:

"Instructions are 1-3 words long and are fully defined by the first word. In a basic instruction, the lower four bits of the first word of the instruction are the opcode, and the remaining twelve bits are split into two six bit values, called a and b. a is always handled by the processor before b, and is the lower six bits. In bits (with the least significant being last), a basic instruction has the format: bbbbbbaaaaaaoooo"

In other words, the current DCPU spec is Big Endian. So I wonder if Notch will be rewriting the spec, even though from all accounts it seems that he's already implemented the emulator.
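
Whatever the endianness ends up being, the packing of a basic instruction word itself is unambiguous; a C decoder would pull the fields out roughly like this (a sketch, not any particular implementation's code):

    #include <stdint.h>

    /* Decode a basic instruction word per the quoted format: bbbbbbaaaaaaoooo */
    static void decode(uint16_t w, unsigned *o, unsigned *a, unsigned *b) {
        *o = w & 0x000f;          /* lower four bits: opcode  */
        *a = (w >> 4) & 0x003f;   /* next six bits: operand a */
        *b = (w >> 10) & 0x003f;  /* top six bits: operand b  */
    }

    /* e.g. 0x7c01 ("SET A, 0x30") decodes to o=0x1 (SET),
       a=0x00 (register A), b=0x1f (next word literal) */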


The spec example

   For example, to do a 32 bit add of 0x12345678 and 0xaabbccdd, do this:
      SET [0x1000], 0x5678    ; low word
      SET [0x1001], 0x1234    ; high word
suggests he's thinking in little-endian.

I don't think it's relevant to this spec though: there's no way for a DCPU program to detect the CPU's endianness. Imagine it's C with 16-bit "char"s and no larger int sizes.


Endianness doesn't really mean anything in the context of a single word (except maybe on serial lines). What made you conclude the above suggests big endian?


It'd be funny to see the first programs running in the game make the same mistake. Maybe the inconsistency in Notch's spec is purposeful.


little endian words, big endian instructions?


I'm sure right now someone somewhere is working on an FPGA implementation of the DCPU-16.


That would be a hell of a hack if Notch ran the backend on a handful of high-speed gate arrays.


Already done. I was asked to do it for my undergrad thesis.


tell us more, do you have a link?


I have a feeling this game will be the driving force in making me read up on this area of CS.


From my read of this, we do not yet have code that accurately emulates the number of cycles per instruction, which will probably turn out to be critical in the final game.


Most instructions take one cycle. The ones that don't, disconcertingly, have the results available immediately and then deduct cycles. I don't know if this will be fixed later or not. https://twitter.com/#!/notch/status/187449161216573441


As far as I can tell, I think the cycle count is intended to model the cost of cycles against your 100KHz budget. I'm not sure I think it's worth the complexity.

Given that Notch seems to want to continuously operate one or more virtual CPUs for every user, if it were me, I'd favor making the execution model as simple and cheap as possible.


Why do you care when the result is available if the next instruction will run after the wait is complete?

I guess it matters if another CPU can read your memory and depends on the cycle count to synchronize...


It also matters if writing to the memory causes an actuator to... act.

Picture if memory location 7 is mapped to how much energy to pump into the laser cannon.


In which case, having the interaction happen when you send the command, and not at a command-specific point during the execution, should simplify things?


If it's supposed to take 3 clock cycles to execute an instruction, then it's "unfair/unrealistic" to really activate the laser cannon instantly, and then make the processor wait 3 clock cycles before it begins to execute the next instruction.

It'd be "more fair/realistic" if it took 3 clock cycles, and THEN the memory was overwritten.

Note: this is debating the timing of a 100 kHz simulated processor, within a game... so, the discussion is kind of inherently pedantic. =)
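
If you did want the "fair" version, one cheap way is to compute the result when the instruction is decoded but only commit the writeback once the cycle cost has elapsed; a sketch with made-up names:

    #include <stdint.h>

    /* Sketch: defer the writeback so a 3-cycle instruction takes effect
       on the third tick of the 100 kHz budget, not instantly. */
    struct pending {
        uint16_t *dst;    /* where the result goes            */
        uint16_t value;   /* what to write                    */
        int cycles;       /* cycles left before it is visible */
    };

    static void tick(struct pending *p) {
        if (p->dst && --p->cycles == 0) {
            *p->dst = p->value;   /* writeback happens here */
            p->dst = 0;           /* free to decode the next instruction */
        }
    }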


Note that this VM can invoke behaviour that is explicitly left undefined by the C11 standard. For example, this DCPU-16 program can cause a shift by an amount greater than the width of uint16_t:

    SET A, 0x1
    SHL A, 0x100
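
A portable emulator would want to clamp or special-case the count before doing the C-level shift, e.g. something along these lines (a sketch, not the repo's actual code):

    #include <stdint.h>

    /* Sketch: shifting a 32-bit value by >= 32 is undefined in C11, so
       clamp the count; O receives the bits shifted out the top. */
    static uint16_t dcpu_shl(uint16_t a, uint16_t b, uint16_t *o) {
        uint32_t wide = (b >= 32) ? 0 : ((uint32_t)a << b);
        *o = (uint16_t)(wide >> 16);
        return (uint16_t)wide;
    }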


The Java Language Specification indeed states that the shift amount should be ANDed with 31 or 63, depending on the type of the left-hand operand. I guess Notch did not intend this behavior, however.

http://docs.oracle.com/javase/specs/jls/se7/html/jls-15.html...


This will, of course, set both A and O to 0.


Depends how left shift is implemented: on some processors (including x86), only the lowest order bits of the number of positions to shift by are used. In Java:

    char A = 1;
    A = (char) ((A << 32) & 0xFFFF);
    System.out.println(Integer.toHexString(A));

actually outputs 1.


I have a terrible feeling I'm missing something here


Notch, creator of Minecraft, has announced a new space-themed MMO in which players can write code against a 16-bit machine architecture; the code will be executed to fly your ship around, even while you're not connected to the game.

This is a C implementation of that machine architecture.

I don't think we have any info on sensors or actuators for the ships though, yet...


Someone should make something analogous to the Gertboard for the Raspberry Pi [0], except virtual, to solve that problem.

[0] http://www.raspberrypi.org/archives/411


The next game from Notch (creator of Minecraft) is called 0x10c and features a 16-bit CPU. More details can be found at http://0x10c.com/


No, sorry, you are missing nothing. This is actually happening.


The first thing I'm doing with this is making it play minecraft.


The first world you should test it on is one in which someone's built a control panel/interface for the CPU.


and it was turtles all the way down...


I made one in C++: https://github.com/isamgray/DCPU-16-CPP I'm new to writing emulators and such, so I used the already-made C one as a sort of "guide". No copy/paste of code, though. I plan on adding a disassembler soon (I already have most of the code written).

I hope someone finds it useful!


It would seem like some kind of compiler is going to be written before the first beta is even released. It's like a space race!


I know the architecture was built to be very easy to emulate, so I'd like to see how it compares to a PyPy JIT version :)


It would probably be worth it to temporarily define one of the reserved opcodes as putchar.


Yes, "temporarily".

It would definitely be a fitting nod to the history of the 6502 (which inspired this processor) to have a variety of incompatible undocumented instructions. Maybe not a good thing, but... fitting.


Video RAM starts at 0x8000.


True, but there's no information on the display modes or the framebuffer format.


Instead of using dcpu.skip, why not just (d->pc += !res)? That way you don't have to call dcpu_step() again before getting to the next instruction, and it more accurately follows the way the hardware would likely work.


Instructions can be 1-3 words depending on opcode and operand types. I was using the skip register to avoid having to decode instructions just to skip 'em. Turns out that my operand decoder is side-effecty (adjusts SP for PUSH/POP operands), so I had to go this route anyway. Code updated.
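
For reference, instruction length is computable from the operand codes alone (per the spec's operand table, codes 0x10-0x17, 0x1e, and 0x1f each pull in an extra word), so the skip can avoid the side-effecty decoder entirely; a rough sketch, not the exact code:

    #include <stdint.h>

    /* Words consumed by an operand beyond the instruction word itself. */
    static int opr_words(unsigned code) {
        return ((code >= 0x10 && code <= 0x17) || code == 0x1e || code == 0x1f) ? 1 : 0;
    }

    /* Total words occupied by the instruction in w, without touching SP. */
    static int insn_words(uint16_t w) {
        unsigned o = w & 0xf, a = (w >> 4) & 0x3f, b = (w >> 10) & 0x3f;
        if (o == 0)                   /* non-basic: opcode lives in a, operand in b */
            return 1 + opr_words(b);
        return 1 + opr_words(a) + opr_words(b);
    }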


Ah, thanks. I forgot about that.


Someone should implement C on the DCPU-16 before launch, so the assembly language isn't necessary. Mid- to high-level languages FTW!


Can anyone help me understand how this will work with the game? I don't see anything about flying virtual ships around...


As far as I can understand, the virtual ships have programmable hardware.


CoffeeScript-based interpreter:

https://github.com/Janiczek/cfs-dcpu16



How long till someone installs a Linux distro on the new game's simulated CPU?


Hopefully/probably very long, since the CPU in question is quite small. 64 KB of total memory (code and data) is rather cramped for running Linux. Also, it has no I/O and no interrupts. It'd have to be a pretty radically small OS kernel.


Is there an assembler?




