
The new Innovatic microprocessor architecture combines the best of the CISC and RISC worlds. The basic idea is to make an extremely efficient processor architecture with an extremely high code density: a processor that is usable from the smallest 8-bit smartcard and embedded applications up to very powerful 64-bit PCs. This enables reuse of software and hardware and, with that, very big savings in development costs. No microprocessor family today covers such a big span.

The new architecture has the following properties:

  • Simple and cheap to produce.

  • A simple and easy-to-learn 8-bit instruction set.

  • Stack-based architecture. This makes it very easy for a compiler to utilize the processor 100 %, so that very efficient code is generated. Because there is nothing to save and restore during subroutine/procedure calls, the performance loss with interrupts and object-oriented programming is not as big as on traditional register-based computers.

  • A code density even better than the best CISC processors and typically 2-6 times better than most other RISC processors! This is the key to the new, lightweight computer world.

    The high code density makes it economically feasible to base part of the memory system in a PC on fast but rather expensive static RAM (SRAM) as a supplement to the usual cheap but slow dynamic RAM (DRAM). In this way, the most critical parts of the programs and data may be loaded into SRAM. This increases the speed considerably compared to today's PCs, because such a human-controlled split is much more efficient than the automatic cache mechanisms used on today's PCs to hold a limited amount (typically 128-512 kbyte) of the most recently fetched instructions and data. Besides, it is much simpler, and there are no problems keeping the memory and the data cache synchronized. An SRAM uses 6 (or 8) transistors per bit, whereas a DRAM uses only one transistor and one capacitor. The die size for an SRAM is therefore approximately 3-4 times bigger than for a DRAM of the same capacity, which makes it more expensive, but if the code density is increased correspondingly so that less memory is needed, this compensates for the cost! Besides, SRAM does not need any refresh. This makes it possible to let the computer "go to sleep" with a very low power consumption and still maintain some data in the SRAM. The data in the DRAM is lost a few milliseconds after the refresh stops, so it must be saved on e.g. a hard disk before power-down.

    The real bottleneck in a modern computer system is the memory. A 1.4 GHz RISC processor, for example, needs all data and instructions within 0.7 ns to maintain full speed. However, with DRAM it takes approximately 40-60 ns to get a word if it is not available in a memory pipeline or in a cache (a cache miss). This is the case regardless of the type of memory and the bus speed! The advantage of modern RAM types such as burst RAM, synchronous DRAM (SDRAM), double data rate DRAM (DDR DRAM), Rambus DRAM (RDRAM) etc. is only that they contain a pipeline system, which makes it possible to fetch the following words extremely fast, e.g. eight 16-bit words in 10 ns for RDRAM, but this is of course only important if these words are actually needed! For data access, which is usually quite random, this is often not the case. However, to try to utilize the pipeline, a PC always fetches 4-8 words at a time, which causes a 25-75 % overhead, so with random access over big data areas the real speed may be as low as 10-15 MHz! The high speed of today's DRAMs, like 800 MHz for RDRAM, is just a pseudo speed, which can only be utilized under very special circumstances. Switching to SRAM, on the other hand, gives real power, as the access time is at least 3 times shorter. The only type of data access that can really benefit from a fast DRAM memory architecture is the transfer of big data areas, e.g. by means of DMA (Direct Memory Access).

    Regarding the program, it is of course much more likely that the following words are needed, provided the program is not written in such a way that there are too many jumps and subroutine calls! However, a high code density is far more important than a high bus speed and exotic memory types. With e.g. a 64-bit bus and 8-bit instructions, 8 instructions are fetched at a time, and if on average one data access (a read or a write in memory) occurs for every 1 1/2 instructions, as with the Innovatic architecture, the overhead for fetching instructions is only 19 %. This is so little that it may be ignored. Even with infinitely fast instruction fetch, the user might not notice the difference. However, many modern RISC processors have a fixed instruction length of 32 bits, and because of the load/store architecture, which makes it necessary to load all data into the internal registers before they can be used in calculations, they typically use 2 instructions for each data access. This means that with a 64-bit bus the overhead for fetching instructions will be approximately 100 %, so unless the computer has an extremely big, and expensive, cache that contains the whole program, the performance loss caused by the low code density is almost a factor of 2!

  • An efficiency and speed as good as traditional 32-bit RISC processors.

  • A digital filter performance close to that of many dedicated digital signal processors (DSP). This is especially important for many of today's embedded applications.

  • High efficiency and speed even at low clock frequencies and without a cache. This makes the architecture well suited to low-cost and power-saving applications and to processing of very big data areas such as 3D graphics, high-resolution image processing, computer simulations etc., where the data cannot be contained in a cache.

  • Relative addressing over the full address range, so that it is no longer necessary to relocate a program before it can be executed. This saves the space needed for the relocation table and the time needed for relocation, so that programs may be started as fast as they can be fetched from e.g. a flash disk or a hard disk!

  • Usable with both von Neumann architecture (common data and program memory), which is used on a PC, and Harvard architecture (separate data and program memory), which is common in embedded applications.
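
The instruction-fetch overhead figures quoted in the code-density item above (19 % and approximately 100 %), and the cost of a DRAM cache miss at 1.4 GHz, can be reproduced with a little arithmetic. The following is only a sketch of that arithmetic. The function names are hypothetical, and the model assumes that every data access costs one bus cycle and that every instruction fetch fills the whole bus width with useful instructions.

```python
from fractions import Fraction


def fetch_overhead(bus_bits, instr_bits, instrs_per_data_access):
    """Instruction-fetch bus cycles per data-access bus cycle.

    ASSUMPTION (not from the source): one bus cycle per data access, and
    each instruction fetch delivers bus_bits / instr_bits usable instructions.
    """
    instrs_per_fetch = Fraction(bus_bits, instr_bits)
    # One fetch feeds instrs_per_fetch instructions, which together perform
    # instrs_per_fetch / instrs_per_data_access data accesses.
    return Fraction(instrs_per_data_access) / instrs_per_fetch


def stall_cycles(clock_hz, access_ns):
    """Clock cycles that pass while waiting access_ns for a single word."""
    return access_ns * 1e-9 * clock_hz


# 8-bit instructions, one data access per 1 1/2 instructions, 64-bit bus
stack_code = fetch_overhead(64, 8, Fraction(3, 2))
# 32-bit load/store RISC, 2 instructions per data access, 64-bit bus
risc_code = fetch_overhead(64, 32, 2)

print(f"8-bit instructions:  {float(stack_code):.0%} fetch overhead")
print(f"32-bit instructions: {float(risc_code):.0%} fetch overhead")
print(f"Cache miss at 1.4 GHz: {stall_cycles(1.4e9, 40):.0f}-"
      f"{stall_cycles(1.4e9, 60):.0f} clock cycles")
```

Under these assumptions the overhead is 3/16 (about 19 %) for the 8-bit instruction stream and exactly 100 % for the 32-bit load/store case, and a 40-60 ns DRAM access corresponds to roughly 56-84 clock cycles at 1.4 GHz.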

Some RISC processors use a very long instruction length of 64 or 128 bits: the so-called VLIW (Very Long Instruction Word) processors. The purpose is to be able to do more operations simultaneously by means of several parallel execution units, where each unit uses its own part of the instruction word. The disadvantage is that the instructions need to be grouped together in a precise order that fits the processor architecture. This requires an extremely advanced compiler and makes assembler programming impossible. Besides, it puts extremely high demands on the memory, because a new, long instruction word should be available on each clock cycle or else nothing is gained. In fact, the VLIW architecture is only suited for systems with a very small amount of code and data, such as digital signal processors (DSP), where both instructions and data can be contained in an on-chip level 1 cache.

In practice, the extremely complicated architectures used in many of today's microprocessors often enhance the performance by only a few percent, at the expense of power consumption, reliability and price! If, for example, the complexity of a processor is increased 10 times and the chip is heated to 40 °C above the ambient temperature, which is very common for today's PC processors due to the high complexity and clock speed, the total reliability of the processor is reduced 160 times!
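
The text does not say how the factor of 160 is derived. One reading that reproduces it, offered here only as an assumption, is that the failure rate scales linearly with complexity and doubles for every 10 °C of temperature rise (a common rule of thumb). The function name and the `doubling_step_c` parameter below are hypothetical.

```python
def relative_failure_rate(complexity_factor, temp_rise_c, doubling_step_c=10.0):
    """Failure rate relative to a baseline chip.

    ASSUMPTIONS (not stated in the source): failure rate is proportional to
    complexity and doubles for every doubling_step_c degrees of temperature rise.
    """
    return complexity_factor * 2 ** (temp_rise_c / doubling_step_c)


# 10 times the complexity, 40 °C above ambient: 10 * 2**4
print(relative_failure_rate(10, 40))  # 160.0
```

With a different temperature rule, or a non-linear dependence on complexity, the result would differ, so the figure should be read as an order-of-magnitude illustration.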

This page is updated March 13th 2006