My goal wasn't to make a ton of money. It was to build good computers.-- Steve Wozniak
I have a big interest in various CPU architectures. At the end of the day, they are all equivalent Turing machines. Ignoring time and space constraints, what can be calculated by the biggest supercomputer can also be calculated by the smallest processor one can create. If you have the ability to execute instructions in order and also to test and branch to a new instruction, you have a processor that is "Turing complete." Everything else is just gravy. I like gravy. It adds flavor to the blandest of foods. With all the variety of CPU architectures created over the last 70 years, we have lots of gravy to choose from. Sampling as many flavors as we can is a worthwhile exercise.
Around 1976 or so, the Great Woz (Steven Wozniak of Apple) was writing the BASIC interpreter for the Apple II in machine language. The 6502 processor used in the Apple II is the epitome of 8 bit processors. If you are writing a program, say, a BASIC interpreter, that needs to access more than 256 bytes (addressable by 8 bits) you are going to need to handle 16 bit data. The 6502 makes this, uh, challenging. Woz took a bold step, especially for the time; he created a virtual machine: Sweet 16.
Virtual machines are quite common now. Java depends on a virtual machine, javavm, and Microsoft's .NET architecture is a virtual machine, too. There are many more examples that I could list. But, in 1976, running a VM (virtual machine) on a 1 MHz 6502 was somewhat unusual.
I have known of Sweet16 for years. But, at the time the Apple II was being sold I couldn't afford one so I never looked much into it. Recently I came across some documentation for Sweet16 at 6502.org and started looking into it. I was truly struck by the simplicity and elegance of this machine. It is a simple machine and the articles by Woz give some indications of the thought process that went into it. He needed to solve a problem and he solved it in typical Woz style: simple and elegant.
I'm interpreting what I have read here. These are my thoughts on how it came to be. Woz needed to do quite a bit of 16 bit operations moving things around or finding stuff in a 64K (16 bit) address space. Doing so with straight 6502 code was cumbersome. He wanted a 16 bit "Dream Machine" and so he created one perfectly suited to the purpose. Of course, being a VM, it would be much slower than direct 6502 code so it wasn't intended as a general purpose CPU. When I started looking at it I realized that it would be very useful for a lot more than handling memory. It's a very nice architecture, simple yet powerful. It's missing a few things that would be very useful for a CPU, but not many. The instruction set is clean and powerful and uses one to three bytes for all instructions. It saves a lot of space that way. I don't know how much Woz used it in writing his Integer BASIC, but I imagine it made the code considerably cleaner and smaller. The source code for his implementation is online and it is a marvel in tight, efficient coding. I decided I wanted to play with this architecture and Sweet2016 was born.
I have two computers in my collection that use a 6502: an Apple ][e and a Commodore VIC-20. I could have fired up one of those to play around with Sweet16. That would be fun... for a while. But it would have become challenging in a short time. Like the Dark Side of the Force, the power of a modern PC is hard to refuse. Largely due to the clean design of Sweet16 I had an emulator/VM cranked out in about a day.
So, let's describe the architecture a bit. Although the term RISC (reduces instruction set computer) had not been invented yet, the design is rather RISC-like. It deviates from the generally accepted definition quite a bit, but does have several similarities. First it definitely has a reduced instruction set! As implemented in the Apple II it has 28 instructions. It also has lots of registers:16. That wouldn't really be considered a lot now, but at the time was a bit unusual. The registers are mostly symmetric, with the same operations being applicable to them all in most cases. There are some exceptions, like register 0 (R0) being an accumulator. That is highly non-RISC-like. But, the program counter and stack pointer, as well as condition codes and status, are held in general registers as is rather common in RISC machines. It's byte addressed and uses variable-length instructions. However, that makes it very code efficient, with a large number of the instructions using a single byte, one using 3 bytes, and the rest using 2. The architecture reminds me of a cross between a DEC PDP-11 and a Data General Nova. If you squint just right, you can see a hint of 6502 in the background, which should probably be expected.
As far as I know, the machine has never been implemented in hardware. That's a shame, as I think it would be a relatively simple one to build with the TTL MSI ( Transistor Transistor Logic, Medium Scale Integration) that most mini-computers of the day were built with. I suspect it would be a good performer, outrunning much of it's competition. I'm actually strongly considering trying to implement it myself either in an FPGA or with actual TTL chips, once I get some experience with it.
Here is the instruction set as implemented by Woz: // TODO: put ISA here Non-Register Instructions Register Instructions (x indicates register code, 0 to F)
All branches are relative, going forward up to 127 bytes or backward up to 128 bytes from the incremented PC after the branch instruction. Results of a comparison are held in R13. R14 is used as status register, holding the carry bit and the index of the last result register. R15 is the program counter and R12 is the stack pointer for Branch to Subroutine. If you are familiar with the ARM architecture, much of this probably looks very familiar! It is interesting that the ARM architecture was also designed by people who were looking to extend / replace the 6502 architecture for a coprocessor to go into the BBC micro. I do wonder if they were influenced by Sweet16!
As I mentioned, the design isn't intended or even very well suited as-is to being a general purpose CPU. It wasn't intended to be. It was designed for a specific purpose and works very well for that. However, to make it a nice CPU would not take much modification. Right now, before really playing with it, I see three things that need to be added. First, we need a logic instruction or two. We "might" could get by without adding any, using arithmetic instructions (how?), but it would be painful. Second, a shift-right would be really nice. We don't need a shift-left (why not?) And third, a subroutine call to any place in memory would really be a good idea. We don't need a jump to any place in memory (why not?) With the design as it stands, all fifteen of the designated register opcodes are used and there are three non- register opcodes unused. Most of the extensions I mentioned would be best implemented as register instructions, but we could work around that.
Some of the used opcodes could be done away with easily enough and replaced with ones that are more generally useful. The RTN instruction isn't really needed at all. It was a way to escape Sweet16 mode and go back to native 6502 code. The Branch on Minus 1 and Branch on Not Minus 1 probably wouldn't be missed too much. I think we could add the instructions we really need without changing the flavor much. We shall see.
One glaring omission for a general purpose CPU is the lack of interrupts. Interrupts wouldn't have made much sense for the original purpose, but we should look into adding them. Again, I don't think it would be very difficult. The implementation might not be optimum, but in a VM it won't make a lot of difference, and even in hardware I think we would be OK. This will require further study.
Like I said, I have hacked together a quick and dirty VM for this thing. So far I have implemented the basic, unextended, instruction set, minus the RTN and BRK instructions. I am posting the program here for you to play with, and will post updates as they become available. I'm currently working on an assembler and a way to load assembled hex files. Right now, you have to put a program in by typing the hex code into the source. It does, however, have a trace facility to allow you to single step through your program. Eventually I plan to have an online assembler and source-code tracing so you can tell it to load an assembly language file, assemble it, and run and debug it in source code, complete with single-step and breakpoints.
coming soon...