Tuesday, October 18, 2022

C Minus Minus

C-- (pronounced C minus minus) is a C-like programming language. Its creators, functional programming researchers Simon Peyton Jones and Norman Ramsey, designed it to be generated mainly by compilers for very high-level languages rather than written by human programmers. Unlike many other intermediate languages, its representation is plain ASCII text, not bytecode or another binary format.[1][2]

There are two main branches: the original C--, whose specification reached version 2 in 2005, and Cmm, the dialect used by GHC.

C-- is a "portable assembly language", designed to ease the implementation of compilers that produce high-quality machine code. This is done by delegating low-level code generation and program optimization to a C-- compiler. The language's syntax borrows heavily from C while omitting or changing standard C features such as variadic functions, pointer syntax, and aspects of C's type system, because they get in the way of essential C-- features and make code generation harder.

The name of the language is an in-joke, indicating that C-- is a reduced form of C, in the same way that C++ is basically an expanded form of C. (In C, -- and ++ mean "decrement" and "increment," respectively.)

Work on C-- began in the late 1990s. Since writing a custom code generator is a challenge in itself, and the compiler backends available to researchers at that time were complex and poorly documented, several projects had written compilers which generated C code (for instance, the original Modula-3 compiler). However, C is a poor choice for functional languages: it neither guarantees tail-call optimization nor supports accurate garbage collection or efficient exception handling. C-- is a tightly defined, simpler alternative to C which supports all of these. Its most innovative feature is a run-time interface which allows writing of portable garbage collectors, exception handling systems and other run-time features which work with any C-- compiler.

The first version of C-- was released in April 1998 as an MSRA paper,[1] accompanied by a January 1999 paper on garbage collection.[2] A revised manual was posted in HTML form in May 1999.[6] Two sets of major changes proposed in 2000 by Norman Ramsey ("Proposed Changes") and Christian Lindig ("A New Grammar") led to C-- version 2, which was finalized around 2004 and officially released in 2005.[3]

Type system

The C-- type system is designed to reflect constraints imposed by hardware rather than conventions imposed by higher-level languages. A value stored in a register or memory may have only one type: bit-vector. However, bit-vector is a polymorphic type which comes in several widths, e.g. bits8, bits32, or bits64. A separate family of 32- or 64-bit floating-point types is also supported. In addition to the bit-vector type, C-- provides a boolean type bool, which can be computed by expressions and used for control flow but cannot be stored in a register or memory. As in an assembly language, any higher type discipline, such as distinctions between signed, unsigned, float, and pointer, is imposed by the C-- operators or other syntactic constructs. C-- is not type-checked, nor does it enforce or check the calling convention.[3]: 28

C-- version 2 removes the distinction between bit-vector and floating-point types. These types can be annotated with a string "kind" tag to distinguish, among other things, a variable's integer vs float typing and its storage behavior (global or local). The former is useful on targets that have separate registers for integer and floating-point values. Special types for pointers and the native word were introduced, although they are mapped to a bit-vector with a target-dependent length.[3]: 10 

Implementations

The specification page of C-- lists a few implementations of C--. The "most actively developed" compiler, Quick C--, was abandoned in 2013.[7]

Haskell

Some developers of C--, including Simon Peyton Jones, João Dias, and Norman Ramsey, work or have worked on GHC, whose development has led to extensions in the C-- language, forming the Cmm dialect which uses the C preprocessor for ergonomics.[4]

GHC backends are responsible for further transforming C-- into executable code, via LLVM IR, slow C, or directly through the built-in native backend.[8] Despite the original intention, GHC does perform many of its generic optimizations on C--. As with other compiler IRs, the C-- representation can be dumped for debugging.[9] Target-specific optimizations are, of course, performed later by the backend.

References

  1. Nordin, Thomas; Jones, Simon Peyton; Iglesias, Pablo Nogueira; Oliva, Dino (1998-04-23). "The C-- Language Reference Manual".
  2. Reig, Fermin; Ramsey, Norman; Jones, Simon Peyton (1999-01-01). "C--: a portable assembly language that supports garbage collection": 1–28.
  3. Ramsey, Norman; Jones, Simon Peyton. "The C-- Language Specification, Version 2.0" (PDF). Retrieved 11 December 2019.
  4. "GHC Commentary: What the hell is a .cmm file?".
  5. "An improved LLVM backend".
  6. Nordin, Thomas; Jones, Simon Peyton; Iglesias, Pablo Nogueira; Oliva, Dino (1999-05-23). "The C-- Language Reference Manual".
  7. "C-- Downloads". http://www.cs.tufts.edu. Retrieved 11 December 2019.
  8. "GHC Backends".
  9. "Debugging compilers with optimization fuel".
