June 2, 2013
For a very long time I have wanted to write a complete language compiler. At first I had no
idea how to go about it. So I studied. And studied some more. I finally felt like I knew
enough to get started and I wrote a couple of simple toy compilers for very simple languages.
They didn't do much, but served as a proof of concept. I moved on to other projects and didn't
complete what I started.
A couple of years ago I decided to pick it up again. Now, there are plenty of compilers in the world, and I wanted to make something useful. The competition is tough. Since Arduino, with its C/C++ based compiler, was getting popular, I decided that what the world really needed was a good, simple BASIC compiler for embedded systems. As much as I like C and C++, I just don't think they are good languages for beginning programmers to start with, especially in embedded systems. Many of the questions and problems I have seen on Let's Make Robots deal with the gotchas of C and C++, which encourages me to think there is a need for something simpler. There is at least one very good BASIC compiler for the AVR, but it isn't free. So I started the emBASIC project to provide a good-quality, easy-to-use, free, and open-source BASIC compiler.
Once I made that decision, I began to design the language. I decided what I wanted it to have and what I didn't want it to have. I wrote up a grammar and began coding. I had recently become a convert to object-oriented programming, and C++ seemed like the best way to go. But I soon hit a stumbling block: I didn't know the right way to write an object-oriented parser. So I went back to studying. In the meantime I decided I wasn't really happy with the language as I had designed it, so I threw it out and decided to start over. This is that restart.
Here is the plan. The compiler should be useful and simple, easy for beginners and powerful enough for professionals. It should produce good code and be easily portable to different development systems and target processors. The code should be well structured, well written, and object-oriented. It should be a good model for someone wanting to learn about compilers without being overwhelming like gcc. The language should be based loosely on early Microsoft BASIC interpreters and compilers, but updated to be a "modern" language: no GOTO, GOSUB, or line numbers! The language and compiler should take advantage of modern hardware and development tools to ease development and produce better code faster, but the compiler itself should be written entirely from the ground up instead of being generated with compiler-writing tools.
At this point the project is very much a work in progress. The basic structure is laid out and some parts are working, like the scanner (lexical analyzer) and good chunks of the parser. There is a lot of work left to do, and what is done needs a lot of cleaning up. But the basics of what is here are sound. I have not made any attempt to make the code small. I chose clarity over size, since the compiler won't be a large application and will run on large machines (PCs). Besides, I believe in the adage "get it working, then optimize where needed." In any case, here is what I have so far.