Compiler Rebuild

We cannot put it off any longer. We need to rebuild our compiler.

Work in Progress

This document will be continually updated.

Status

The compiler we have works, but it is much slower than required. It has outgrown its initial purpose of bootstrapping the Cosmos project. We tried to upgrade it to speed it up, but the consensus is that we need to start over.

We can reuse much of the general design and many of the classes, but the core parts need to be rebuilt from scratch. In fact, most of the "busy work" can be reused. Our overall structure is good; it is the "glue" that needs to be rebuilt.

Design Goals

The compiler needs to be fast. Every bit of time saved is recovered quickly because builds are frequent. The code, however, must also remain maintainable.

The compiler will be built using independent modules. This will allow concurrent development and profiling of individual parts of the system.

Concurrent Development

Build, profile, refine, repeat.

Build for threading.

Module - IL Scanner

The IL Scanner is the entry point for the compiler. Eventually it will support multiple scanning modes, such as one scan per assembly, or walk and scan with everything reduced to a monolithic output.

Initially it must implement the monolithic walk and scan, and even after other modes are supported this must always remain an option. There will be times when dynamic loading is not available and a monolithic output must be used. One example is a boot loader that does the dynamic loading itself. Using the monolithic option, syslinux can boot a monolithic Cosmos bootloader, which can then dynamically load the intended Cosmos build.

Walk and scan is how the current compiler operates. It is given a starting point in an assembly and the IL is scanned starting at that point. All dependent methods and referenced assemblies are found and compiled into one monolithic output.
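As a rough illustration of walk and scan, the sketch below treats the input as a call graph and collects everything reachable from a starting method. The method names and the `CALL_GRAPH` table are hypothetical; a real scanner would discover these edges by decoding the IL of each method rather than reading them from a dictionary.

```python
from collections import deque

# Hypothetical call graph: method name -> methods it references.
# In a real scanner these edges would come from decoding the IL of
# each method; they are hard-coded here purely for illustration.
CALL_GRAPH = {
    "Kernel.Main": ["Console.WriteLine", "Kernel.Init"],
    "Kernel.Init": ["Memory.Setup"],
    "Console.WriteLine": ["Console.Write"],
    "Console.Write": [],
    "Memory.Setup": [],
    "Unused.Helper": ["Console.Write"],
}


def walk_and_scan(entry_point, call_graph):
    """Return every method reachable from entry_point, in discovery order."""
    seen = {entry_point}
    queue = deque([entry_point])
    order = []
    while queue:
        method = queue.popleft()
        order.append(method)
        for callee in call_graph.get(method, []):
            if callee not in seen:  # scan each method only once
                seen.add(callee)
                queue.append(callee)
    return order


reachable = walk_and_scan("Kernel.Main", CALL_GRAPH)
```

Only the methods in `reachable` would go into the monolithic output; `Unused.Helper` is never visited, so it is left out.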

The IL scanner should only do the scanning of input assemblies and no output processing. The IL scanner only determines what parts of a source assembly will be compiled.

IL Token Classes

We must avoid creating monolithic source code in the compiler, full of if or case statements. The current compiler has a base IL class and a descendant class for each IL operation. This design works very well; however, there is significant overhead in creating instances of these classes for every individual IL operation in the input. We knew this from the beginning, and it has always been part of the plan to keep this encapsulation yet make it more efficient.

Instead of creating an instance for each IL operation in the input, cached instances that do not store instance data should be used. For example, there will be a base class, ILOperation, and each IL operation will have a descendant class. Each class should also have an overridden method which returns its IL byte code. On startup, a list indexed by byte code will be created, holding one instance of each class. When an IL operation is encountered, the instance of the handler class can be quickly looked up and a method called to process the IL operation, including retrieval of any extra data in the input stream. These classes then communicate with the assembler.
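A minimal sketch of this lookup scheme follows. Only the ILOperation name comes from the text above; the method names (`opcode`, `process`), the `emit` callback standing in for the assembler, and the emitted strings are assumptions made for illustration. The byte values are the standard single-byte IL opcodes for nop, ldc.i4.s, add and ret. Two-byte opcodes are not handled here.

```python
class ILOperation:
    """Stateless base handler; one shared instance exists per opcode."""

    def opcode(self):
        raise NotImplementedError

    def process(self, stream, pos, emit):
        """Handle the operation at stream[pos]; return the next position."""
        raise NotImplementedError


class Nop(ILOperation):
    def opcode(self):
        return 0x00

    def process(self, stream, pos, emit):
        emit("nop")
        return pos + 1


class LdcI4S(ILOperation):
    def opcode(self):
        return 0x1F

    def process(self, stream, pos, emit):
        # ldc.i4.s carries one extra signed byte in the input stream.
        value = int.from_bytes(stream[pos + 1:pos + 2], "little", signed=True)
        emit(f"push {value}")
        return pos + 2


class Add(ILOperation):
    def opcode(self):
        return 0x58

    def process(self, stream, pos, emit):
        emit("add")
        return pos + 1


class Ret(ILOperation):
    def opcode(self):
        return 0x2A

    def process(self, stream, pos, emit):
        emit("ret")
        return pos + 1


def build_table():
    """Create one cached instance per handler class, indexed by byte code."""
    table = [None] * 256
    for cls in ILOperation.__subclasses__():
        handler = cls()
        table[handler.opcode()] = handler
    return table


HANDLERS = build_table()  # built once at startup


def scan(stream):
    """Dispatch each operation to its cached handler; collect emitted text."""
    out = []
    pos = 0
    while pos < len(stream):
        handler = HANDLERS[stream[pos]]
        if handler is None:
            raise ValueError(f"unsupported opcode 0x{stream[pos]:02X}")
        pos = handler.process(stream, pos, out.append)
    return out


result = scan(bytes([0x00, 0x1F, 0x05, 0x1F, 0xFE, 0x58, 0x2A]))
```

Because the handlers hold no per-operation state, scanning allocates nothing per IL operation: position and operand data travel through arguments and return values instead.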

Assembler

Our current assembler (at last glance a few months back) was in pretty good shape. The only major changes should be to check for optimizations around memory consumption and loops, and to change the output classes to be cached instances like the IL classes rather than creating a new instance each and every time.

The assembler should use x# as much as possible, and eventually use x# exclusively as x# evolves, with the assembler classes on the back side of x#. The assembler should not be specific to x86, although x86 is our first target.

Implementation Phases

In order to enable other Cosmos users we must implement this in a phased approach.

Phase 1 - Monolithic Output

Phase 1 is very similar to how our existing compiler works now, but with rebuilt code for compiling.

Goals

Phase 2 - Internal Assembler

Completion of our internal assembler to allow bypass of nasm.

Phase 3 - Dynamic Loading

The dynamic loader will be written using the phase 1 approach. The boot chain will be syslinux to bootloader to Cosmos build.

Phase 4 - Optimizations of input and output

Evaluate MOSA at this stage and perform a "cook off" test.
