Technology Documents

Our Core Technology

Codeplay build advanced optimizing compiler technology for a wide range of novel multi-core and general-purpose processors.

Our technology is the result of more than 10 years of research, development and customer feedback in commercial C/C++, shader and OpenCL compilers. Over the years we have adapted our optimizing compiler technology, tools and their integrations to a variety of very different and often specialized single- and multi-core processor systems. This has made our technology very flexible, widely applicable and highly configurable. We can customize this technology to support new processors and programming languages, and to integrate into third-party development toolchains. Our experience also allows us to advise on compiler-friendly instruction sets and processor designs.

Customizable core compiler technology

Our technology consists of a set of core components that can each be configured individually for custom processor systems. This allows our compilers to be integrated with established toolchains, whether very simple or complex.

The C/C++ front-end provides compatibility with non-standard language extensions from Microsoft and GNU. It has optional language plugins to support HLSL, OpenCL, AltiVec™ and other extensions. Combining these extensions with C/C++ enables programmers to establish powerful programming models to suit different processors. The front-end also ships with configurable implementations of the MS and/or Itanium® C++ ABI (http://www.codesourcery.com/public/cxx-abi/), for example with options to generate less code than GCC. Custom calling conventions can be defined in a program to support interfacing with existing assembly code or to configure register usage across function calls. The built-in inline assembler enables the seamless integration (and even optimization) of existing inline assembly code. To reduce time-consuming debugging, we have incorporated various warning and advice messages that we have found important for avoiding programming errors.

The main optimizer implements vectorization and all standard optimizations, as well as a large number of specialized optimizations, for example for vector processors. The register allocator can handle a large number of specialized (vector) registers as well as small register sets (such as on x86). The instruction scheduler can schedule multiple instructions per cycle (multiple issue) if required. Various output file and debugging formats are supported.

Using these core components we have built optimizing C/C++ compilers for x86 processors (various AMD and Intel targets, with support for SSE/3DNow!™), for PlayStation®2 (the MIPS-based Emotion Engine (EE) and the Vector Unit (VU)) and for other client-confidential architectures. The requirements for optimization and for code and data generation differ significantly between these processors. For example, specialized processors such as the PlayStation®2 Vector Unit were not designed to execute programs written in standard C/C++: they have no byte addressing, limited memory size and so on. We have found solutions to work around these and other restrictions. Our tools integrate with Visual Studio and GCC.
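To illustrate the kind of restriction mentioned above, here is a minimal sketch of how byte access can be emulated on memory that is only addressable in whole words. It is a generic shift-and-mask technique, not a description of our compiler's actual code generation or of the Vector Unit hardware. The 32-bit word size, the little-endian byte order within a word and the function names load_byte and store_byte are all assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumption: memory is addressable only in 32-bit words, and byte 0 of a
// word is its least significant byte. Both function names are hypothetical.

// Reads one byte by loading the containing word and shifting it into place.
std::uint8_t load_byte(const std::uint32_t* words, std::size_t byte_index) {
    assert(words != nullptr);
    const std::uint32_t word = words[byte_index / 4];
    const unsigned shift = static_cast<unsigned>(byte_index % 4) * 8u;
    return static_cast<std::uint8_t>((word >> shift) & 0xFFu);
}

// Writes one byte with a read-modify-write of the containing word.
void store_byte(std::uint32_t* words, std::size_t byte_index,
                std::uint8_t value) {
    assert(words != nullptr);
    const unsigned shift = static_cast<unsigned>(byte_index % 4) * 8u;
    const std::uint32_t mask = std::uint32_t{0xFFu} << shift;
    std::uint32_t& word = words[byte_index / 4];
    // Clear the target byte, then insert the new value.
    word = (word & ~mask) | (static_cast<std::uint32_t>(value) << shift);
}
```

Every byte store becomes a load, a mask and a store of a full word, which is one reason code generation requirements differ so much between such a processor and a byte-addressable one like x86.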

Offload™ C++ - taking C++ to Multicore

Offload™ C++ enhances C++ to support an incremental, non-disruptive and type-safe migration of C++ code to homogeneous and, in particular, heterogeneous multicore systems. A simple programming model using a small set of language extensions enables the programmer to quickly offload code to accelerator processors and instantly verify the result. Offloading code with Offload™ C++ is simple and hence suitable for non-specialist programmers. All that usually needs to be done is to mark out a block of code that should be offloaded. The Offload™ C++ compiler then does all the hard work: automatic separate compilation (also known as function or call-graph duplication, which is required to compile standard C++ across different instruction sets and memory spaces on heterogeneous processors), linking accelerator code with host processor code, and so on. Offload™ C++ is scalable: write your code once and run it on more cores later. Offload™ C++ can easily be disabled: the language extensions can be hidden inside macros that define nothing on other compilers.
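The macro approach described above can be sketched as follows. This is an illustration only: the detection macro HYPOTHETICAL_OFFLOAD_COMPILER, the token hypothetical_offload_keyword and the function scale_all are placeholders invented for the example, and the real keyword spelling should be taken from the Offload™ C++ documentation. On any other compiler the macro expands to nothing, so the marked block is compiled as an ordinary C++ compound statement and runs on the host.

```cpp
#include <cassert>
#include <cstddef>

// Placeholder names: neither the detection macro nor the keyword below is the
// real Offload C++ spelling. They only show where the real ones would go.
#if defined(HYPOTHETICAL_OFFLOAD_COMPILER)
  #define OFFLOAD_BLOCK hypothetical_offload_keyword
#else
  #define OFFLOAD_BLOCK /* defines nothing on other compilers */
#endif

// Multiplies every element by `factor` in place. The loop is the code that
// has been marked out for offloading.
void scale_all(float* values, std::size_t count, float factor) {
    assert(values != nullptr || count == 0);
    OFFLOAD_BLOCK {
        for (std::size_t i = 0; i < count; ++i) {
            values[i] *= factor;
        }
    }
}
```

Because the marked block is still valid standard C++ once the macro is empty, the same source file can be built with and without offloading, which makes it straightforward to compare the results.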

Offload™ C++ has been successfully applied to a number of commercial PlayStation®3 games. We ship implementations of Offload™ C++ as part of our Offload™ Multicore programming systems for Cell Linux and PlayStation®3. The Offload™ product website has detailed walk-through examples showing the process from applying offload blocks to performance-tuning the code. We can also customize our Offload™ C++ tools for other systems, for example by fitting specific native backends to our Offload™ C++ compiler. If you already have a C compiler for your processor, Offload™ C++ can be used almost immediately through our C++ to C translation technology.

Need an optimizing compiler for your processor - fast?

We have built C/C++ and shader compilers for many unusual and specialized (GPU-like) architectures. We can usually provide customers with compiler prototypes for their hardware very quickly, mostly within a few weeks. Just send us the specification of your processor and tell us your requirements and we'll do the rest.

From our experience with previous projects we are able to provide processor designers with valuable feedback on their processor. For example, the instruction schedule generated for certain applications depends greatly on the capabilities of the processor (number of registers, instructions per cycle etc.). Once we have put together a compiler prototype for a new processor, we are able to give further and more detailed recommendations to help optimize the processor design.