Thursday, June 30, 2022

Clang IR (CIR): A New IR for Clang

Clang IR (CIR)

Clang IR (CIR) is a new IR for Clang. The RFC on LLVM’s Discourse goes into depth about the project’s motivation, status, and design choices.

The source of truth for CIR is found at https://github.com/facebookincubator/clangir.

The main branch contains a stack of commits that is occasionally rebased on top of upstream LLVM, which is tracked in the latest-upstream-llvm branch.


Getting started

Git repo

$ git clone https://github.com/facebookincubator/clangir.git llvm-project

Remote

Alternatively, add a remote to an existing llvm-project checkout and fetch it:

$ cd llvm-project
$ git remote add fbi git@github.com:facebookincubator/clangir.git
$ git fetch fbi
$ git checkout -b clangir fbi/main

Building

To enable CIR-related functionality, add mlir and clang to the CMake list of enabled projects and do a regular LLVM build.

... -DLLVM_ENABLE_PROJECTS="clang;mlir;..." ...

See LLVM’s getting-started documentation for general instructions on how to build LLVM.

For example, building and installing a CIR-enabled clang on macOS could look like:

$ CLANG=`xcrun -f clang`
$ INSTALLDIR=/tmp/install-llvm

$ cd llvm-project/llvm
$ mkdir build-release; cd build-release
$ /Applications/CMake.app/Contents/bin/cmake -GNinja \
 -DCMAKE_BUILD_TYPE=Release \
 -DCMAKE_INSTALL_PREFIX=${INSTALLDIR} \
 -DLLVM_ENABLE_ASSERTIONS=ON \
 -DLLVM_TARGETS_TO_BUILD="X86" \
 -DLLVM_ENABLE_PROJECTS="clang;mlir" \
 -DCMAKE_CXX_COMPILER=${CLANG}++ \
 -DCMAKE_C_COMPILER=${CLANG} ../
$ ninja install

Check for cir-tool to confirm all is fine:

$ /tmp/install-llvm/bin/cir-tool --help

Running tests

Tests are an important part of preventing regressions and covering new functionality. There are multiple ways to run CIR tests.

The more aggressive (slower) one:

CIR-specific test targets using ninja:

$ ninja check-clang-cir
$ ninja check-clang-cir-codegen

Using lit from the build directory (the relative path below assumes the build directory sits directly under llvm-project):

$ cd build
$ ./bin/llvm-lit -a ../clang/test/CIR

How to contribute

Any change to the project should be made through GitHub pull requests. Anyone is welcome to contribute!


Documentation

Operations

cir.alloca (::mlir::cir::AllocaOp)

Defines a scope-local variable

Syntax:

operation ::= `cir.alloca` $type `,` `cir.ptr` type($addr) `,` `[` $name `,` $init `]` attr-dict

The cir.alloca operation defines a scope-local variable.

Initialization style must be one of:

  • uninitialized
  • paraminit: alloca to hold a function argument
  • callinit: Call-style initialization (C++98)
  • cinit: C-style initialization with assignment
  • listinit: Direct list-initialization (C++11)

The result type is a pointer to the input’s type.

Example:

// int count = 3;
%0 = cir.alloca i32, !cir.ptr<i32>, ["count", cinit] {alignment = 4 : i64}

// int *ptr;
%1 = cir.alloca !cir.ptr<i32>, cir.ptr <!cir.ptr<i32>>, ["ptr", uninitialized] {alignment = 8 : i64}
...
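Seen from the C++ side, the initialization styles listed above correspond to ordinary declaration syntax. The sketch below shows one declaration per style. The function names are hypothetical, and the style noted next to each declaration is inferred from the list above, not checked against the CIR code generator.

```cpp
// Hypothetical examples: one local variable per initialization style.
constexpr int init_styles(int arg) {  // 'arg' is a function argument: paraminit
  int a = 3;                          // C-style initialization with assignment: cinit
  int b(4);                           // call-style initialization (C++98): callinit
  int c{5};                           // direct list-initialization (C++11): listinit
  return arg + a + b + c;
}

// Not constexpr: before C++20 a constexpr function cannot declare an
// uninitialized local variable.
int uninitialized_then_assigned() {
  int count;  // no initializer: uninitialized
  count = 3;  // a later assignment to the same variable
  return count;
}
```

In every case the variable gets one scope-local slot; only the recorded initialization style differs.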

Attributes:

Attribute  MLIR Type                   Description
type       ::mlir::TypeAttr            any type attribute
name       ::mlir::StringAttr          string attribute
init       ::mlir::cir::InitStyleAttr  initialization style
alignment  ::mlir::IntegerAttr         64-bit signless integer attribute whose minimum value is 0

Results:

Result Description
addr CIR pointer type

cir.binop (::mlir::cir::BinOp)

Binary operations (arith and logic)

Syntax:

operation ::= `cir.binop` `(` $kind `,` $lhs `,` $rhs  `)` `:` type($lhs) attr-dict

cir.binop performs the binary operation according to the specified opcode kind: [mul, div, rem, add, sub, shl, shr, and, xor, or].

It takes two input operands and produces one result; all three must have the same type.

%7 = cir.binop(add, %1, %2) : i32
%8 = cir.binop(mul, %3, %4) : i8
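As a source-level illustration, each opcode kind lines up with one C++ binary operator. The helper below is a hypothetical sketch that applies them in the order listed above; it uses int throughout so that both operands and the result share one type, as cir.binop requires. The kind noted in each comment is an assumed correspondence.

```cpp
// Hypothetical helper: one C++ operator per cir.binop kind, all on 'int'.
constexpr int binop_kinds(int x, int y) {
  int r = x * y;  // mul
  r = r / y;      // div
  r = r % y;      // rem
  r = r + x;      // add
  r = r - y;      // sub
  r = r << 1;     // shl
  r = r >> 1;     // shr
  r = r & x;      // and
  r = r ^ y;      // xor
  r = r | x;      // or
  return r;
}
```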

Traits: SameOperandsAndResultType, SameTypeOperands

Interfaces: NoSideEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes:

Attribute  MLIR Type                   Description
kind       ::mlir::cir::BinOpKindAttr  binary operation (arith and logic) kind

Operands:

Operand Description
lhs any type
rhs any type

Results:

Result Description
result any type

cir.brcond (::mlir::cir::BrCondOp)

Conditional branch

Syntax:

operation ::= `cir.brcond` $cond
              $destTrue (`(` $destOperandsTrue^ `:` type($destOperandsTrue) `)`)?
              `,`
              $destFalse (`(` $destOperandsFalse^ `:` type($destOperandsFalse) `)`)?
              attr-dict

cir.brcond %cond, ^bb0, ^bb1 branches to block ‘bb0’ if %cond (which must be of type !cir.bool) evaluates to true; otherwise it branches to ‘bb1’.

Example:

  ...
    cir.brcond %a, ^bb3, ^bb4
  ^bb3:
    cir.return
  ^bb4:
    cir.yield

Traits: SameVariadicOperandSize, Terminator

Interfaces: BranchOpInterface, NoSideEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Operands:

Operand Description
cond CIR bool type
destOperandsTrue any type
destOperandsFalse any type

Successors:

Successor Description
destTrue any successor
destFalse any successor

cir.br (::mlir::cir::BrOp)

Unconditional branch

Syntax:

operation ::= `cir.br` $dest (`(` $destOperands^ `:` type($destOperands) `)`)? attr-dict

cir.br branches unconditionally to a block. It is used to represent C/C++ gotos and general block branching.

Example:

  ...
    cir.br ^bb3
  ^bb3:
    cir.return

Traits: Terminator

Interfaces: BranchOpInterface, NoSideEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Operands:

Operand Description
destOperands any type

Successors:

Successor Description
dest any successor

cir.cast (::mlir::cir::CastOp)

Conversion between values of different types

Syntax:

operation ::= `cir.cast` `(` $kind `,` $src `:` type($src) `)`
              `,` type($result) attr-dict

Applies the usual C/C++ conversion rules between values. Currently supported kinds:

  • int_to_bool
  • array_to_ptrdecay
  • integral

This is effectively a subset of the rules from llvm-project/clang/include/clang/AST/OperationKinds.def. Note that some of the conversions aren’t implemented in terms of cir.cast; lvalue-to-rvalue, for instance, is modeled as a regular cir.load.

%4 = cir.cast (int_to_bool, %3 : i32), !cir.bool
...
%x = cir.cast(array_to_ptrdecay, %0 : !cir.ptr<!cir.array<i32 x 10>>), !cir.ptr<i32>
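In source terms, the three supported kinds cover conversions that C++ usually performs implicitly. The function below is a hypothetical sketch with one conversion per kind; the kind named in each comment is an assumed correspondence, not output from the tool.

```cpp
// Hypothetical example: one implicit conversion per supported cast kind.
constexpr long cast_kinds(int x) {
  int arr[10] = {};
  arr[0] = x;
  bool nonzero = x;    // int -> bool: int_to_bool
  int *first = arr;    // int[10] -> int*: array_to_ptrdecay
  long wide = *first;  // int -> long: integral (reading *first is a load, not a cast)
  return nonzero ? wide : -1;
}
```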

Interfaces: NoSideEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes:

Attribute  MLIR Type                  Description
kind       ::mlir::cir::CastKindAttr  cast kind

Operands:

Operand Description
src any type

Results:

Result Description
result any type

cir.cmp (::mlir::cir::CmpOp)

Compare two values and produce a boolean result

Syntax:

operation ::= `cir.cmp` `(` $kind `,` $lhs `,` $rhs  `)` `:` type($lhs) `,` type($result) attr-dict

cir.cmp compares two input operands of the same type and produces a cir.bool result. The kinds of comparison available are: [lt, gt, ge, eq, ne].

%7 = cir.cmp(gt, %1, %2) : i32, !cir.bool

Traits: SameTypeOperands

Interfaces: NoSideEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes:

Attribute  MLIR Type                   Description
kind       ::mlir::cir::CmpOpKindAttr  compare operation kind

Operands:

Operand Description
lhs any type
rhs any type

Results:

Result Description
result any type

cir.cst (::mlir::cir::ConstantOp)

Defines a CIR constant

Syntax:

operation ::= `cir.cst` `(` custom<ConstantValue>($value) `)` attr-dict `:` type($res)

The cir.cst operation turns a literal into an SSA value. The data is attached to the operation as an attribute.

  %0 = cir.cst(42 : i32) : i32
  %1 = cir.cst(4.2 : f32) : f32
  %2 = cir.cst(nullptr : !cir.ptr<i32>) : !cir.ptr<i32>

Traits: ConstantLike

Interfaces: NoSideEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes:

Attribute  MLIR Type          Description
value      ::mlir::Attribute  any attribute

Results:

Result Description
res any type

cir.get_global (::mlir::cir::GetGlobalOp)

Get the address of a global variable

Syntax:

operation ::= `cir.get_global` $name `:` `cir.ptr` type($addr) attr-dict

The cir.get_global operation retrieves the address pointing to a named global variable. If the global variable is marked constant, writing to the resulting address (such as through a cir.store operation) is undefined. Resulting type must always be a !cir.ptr<...> type.

Example:

%x = cir.get_global @foo : !cir.ptr<i32>

Interfaces: NoSideEffect (MemoryEffectOpInterface), SymbolUserOpInterface

Effects: MemoryEffects::Effect{}

Attributes:

Attribute  MLIR Type                  Description
name       ::mlir::FlatSymbolRefAttr  flat symbol reference attribute

Results:

Result Description
addr CIR pointer type

cir.global (::mlir::cir::GlobalOp)

Declares or defines a global variable

Syntax:

operation ::= `cir.global` ($sym_visibility^)?
              (`constant` $constant^)?
              $sym_name
              custom<GlobalOpTypeAndInitialValue>($sym_type, $initial_value)
              attr-dict

The cir.global operation declares or defines a named global variable.

The backing memory for the variable is allocated statically and is described by the type of the variable.

The operation is a declaration if no initial_value is specified; otherwise it is a definition.

The global variable can also be marked constant using the constant unit attribute. Writing to such constant global variables is undefined.

Symbol visibility is defined in terms of MLIR’s visibility, and C/C++ linkage types are still TBD.

Example:

// Public and constant variable with initial value.
cir.global public constant @c : i32 = 4;

Interfaces: Symbol

Attributes:

Attribute       MLIR Type            Description
sym_name        ::mlir::StringAttr   string attribute
sym_visibility  ::mlir::StringAttr   string attribute
sym_type        ::mlir::TypeAttr     any type attribute
initial_value   ::mlir::Attribute    any attribute
constant        ::mlir::UnitAttr     unit attribute
alignment       ::mlir::IntegerAttr  64-bit signless integer attribute

cir.if (::mlir::cir::IfOp)

The if-then-else operation

The cir.if operation represents an if-then-else construct for conditionally executing two regions of code. The operand is a cir.bool type.

Examples:

cir.if %b  {
  ...
} else {
  ...
}

cir.if %c  {
  ...
}

cir.if %c  {
  ...
  cir.br ^a
^a:
  cir.yield
}

cir.if defines no values and the ‘else’ can be omitted. cir.yield must explicitly terminate the region if it has more than one block.
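The two shapes of cir.if, with and without an ‘else’ region, mirror the two shapes of a C++ if statement. A minimal sketch with hypothetical function names; the comparison in each condition is what would produce the bool operand.

```cpp
// 'if' without 'else': only the then region exists.
constexpr int max_of(int a, int b) {
  int result = b;
  if (a > b) {  // condition: a 'gt' comparison yielding a bool
    result = a;
  }
  return result;
}

// 'if' with 'else': both regions exist and exactly one of them runs.
constexpr int abs_value(int x) {
  int result = 0;
  if (x < 0) {  // condition: an 'lt' comparison yielding a bool
    result = -x;
  } else {
    result = x;
  }
  return result;
}
```

Neither if statement produces a value of its own; results flow out through the local variable, which matches the statement that cir.if defines no values.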

Traits: AutomaticAllocationScope, NoRegionArguments, RecursiveSideEffects

Interfaces: RegionBranchOpInterface

Operands:

Operand Description
condition CIR bool type

cir.load (::mlir::cir::LoadOp)

Load value from memory address

Syntax:

operation ::= `cir.load` (`deref` $isDeref^)? $addr `:` `cir.ptr` type($addr) `,`
              type($result) attr-dict

cir.load reads a value (lvalue to rvalue conversion) given an address backed up by a cir.ptr type. A unit attribute deref can be used to mark the resulting value as used by another operation to dereference a pointer.

Example:


// Read from local variable, address in %0.
%1 = cir.load %0 : !cir.ptr<i32>, i32

// Load address from memory at address %0. %3 is used by at least one
// operation that dereferences a pointer.
%3 = cir.load deref %0 : cir.ptr <!cir.ptr<i32>>, !cir.ptr<i32>
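At the source level, reading through a pointer variable involves two reads: one to fetch the pointer itself and one to fetch the value it points to. The sketch below is a hypothetical function; going by the description above, the first of the two loads is the one that would carry the deref marker.

```cpp
// Hypothetical example: '*ptr' needs two loads.
constexpr int read_through(int value) {
  int local = value;
  int *ptr = &local;
  // Load 1: read the pointer stored in 'ptr'; its result is then dereferenced.
  // Load 2: read the int stored at the address that pointer holds.
  return *ptr;
}
```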

Attributes:

Attribute  MLIR Type         Description
isDeref    ::mlir::UnitAttr  unit attribute

Operands:

Operand Description
addr CIR pointer type

Results:

Result Description
result any type

cir.loop (::mlir::cir::LoopOp)

Loop

Syntax:

operation ::= `cir.loop` $kind
              `(`
              `cond` `:` $cond `,`
              `step` `:` $step
              `)`
              $body
              attr-dict

cir.loop represents C/C++ loop forms. It defines three regions:

  • cond: region that can contain multiple blocks, terminated by a regular cir.yield when control should yield back to the parent, and by cir.yield continue when execution continues to another region. The destination region depends on the loop form specified.
  • step: region with one block, containing the code that computes the loop step; it must be terminated with cir.yield.
  • body: region for the loop’s body; it can contain an arbitrary number of blocks.

The loop form (for, while, or dowhile) must also be specified; each form implies the order in which the loop’s regions execute.

  // while (true) {
  //  i = i + 1;
  // }
  cir.loop while(cond :  {
    cir.yield continue
  }, step :  {
    cir.yield
  })  {
    %3 = cir.load %1 : cir.ptr <i32>, i32
    %4 = cir.cst(1 : i32) : i32
    %5 = cir.binop(add, %3, %4) : i32
    cir.store %5, %1 : i32, cir.ptr <i32>
    cir.yield
  }
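The region execution order implied by each loop form follows the C/C++ semantics of the corresponding statement. The sketch below uses hypothetical function names; each function counts how many times its body runs, which makes the difference between the forms visible when the condition is false from the start.

```cpp
// for: cond, then body, then step, then back to cond.
constexpr int runs_for(int n) {
  int runs = 0;
  for (int i = 0; i < n; i = i + 1) {
    runs = runs + 1;
  }
  return runs;
}

// while: cond, then body, then back to cond; there is no step code.
constexpr int runs_while(int n) {
  int runs = 0;
  while (runs < n) {
    runs = runs + 1;
  }
  return runs;
}

// dowhile: body first, then cond, so the body always runs at least once.
constexpr int runs_dowhile(int n) {
  int runs = 0;
  do {
    runs = runs + 1;
  } while (runs < n);
  return runs;
}
```

With n equal to 0, the for and while forms never enter the body, while the dowhile form enters it once.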

Traits: NoRegionArguments, RecursiveSideEffects

Interfaces: LoopLikeOpInterface, RegionBranchOpInterface

Attributes:

Attribute  MLIR Type                    Description
kind       ::mlir::cir::LoopOpKindAttr  Loop kind

cir.ptr_stride (::mlir::cir::PtrStrideOp)

Pointer access with stride

Syntax:

operation ::= `cir.ptr_stride` `(` $base `:` type($base) `,` $stride `:` type($stride) `)`
              `,` type($result) attr-dict

Given a base pointer as operand, provides a new pointer after applying a stride. Currently only used for array subscripts.

%3 = cir.cst(0 : i32) : i32
%4 = cir.ptr_stride(%2 : !cir.ptr<i32>, %3 : i32), !cir.ptr<i32>
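An array subscript is the source construct behind this operation: values[index] is defined in C and C++ as *(values + index). The sketch below spells out those steps with hypothetical names; the mapping of each line to a CIR operation is an assumption.

```cpp
// Hypothetical example: an array subscript as base pointer plus stride.
constexpr int element_at(int index) {
  int values[4] = {10, 20, 30, 40};
  int *base = values;           // base pointer, after array-to-pointer decay
  int *strided = base + index;  // new pointer: base advanced by 'index' elements
  return *strided;              // same element as values[index]
}
```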

Traits: SameFirstOperandAndResultType

Interfaces: NoSideEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Operands:

Operand Description
base any type
stride integer

Results:

Result Description
result any type

cir.return (::mlir::cir::ReturnOp)

Return from function

Syntax:

operation ::= `cir.return` ($input^ `:` type($input))? attr-dict

The “return” operation represents a return operation within a function. The operation takes an optional operand and produces no results. The operand type must match the signature of the function that contains the operation.

  func @foo() -> i32 {
    ...
    cir.return %0 : i32
  }

Traits: HasParent<FuncOp, ScopeOp, IfOp, SwitchOp, LoopOp>, Terminator

Operands:

Operand Description
input any type

cir.scope (::mlir::cir::ScopeOp)

Represents a C/C++ scope

cir.scope contains one region and defines a strict “scope” for all new values produced within its blocks.

Its region can contain an arbitrary number of blocks but usually has just one. cir.yield is the required terminator; it can be left implicit when the region has a single block.

A resulting value can also be specified, though this is not currently used; together with cir.yield, it should help represent lifetime extension out of short-lived scopes in the future.
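In C++ source, the construct being modeled is a block that introduces its own local names. A minimal sketch with a hypothetical function; which source blocks actually become a cir.scope is not stated here, so treat the correspondence as an assumption.

```cpp
// Hypothetical example: an inner block with its own local variable.
constexpr int scoped_sum(int x) {
  int result = x;
  {
    int tmp = x * 2;  // 'tmp' exists only inside this block
    result = result + tmp;
  }
  return result;  // 'tmp' is no longer in scope here
}
```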

Traits: AutomaticAllocationScope, NoRegionArguments, RecursiveSideEffects

Interfaces: RegionBranchOpInterface

Results:

Result Description
results any type

cir.store (::mlir::cir::StoreOp)

Store value to memory address

Syntax:

operation ::= `cir.store` $value `,` $addr attr-dict `:` type($value) `,` `cir.ptr` type($addr)

cir.store stores a value (first operand) to the memory address specified in the second operand.

Example:

// Store a function argument to local storage, address in %0.
cir.store %arg0, %0 : i32, !cir.ptr<i32>

Operands:

Operand Description
value any type
addr CIR pointer type

cir.struct_element_addr (::mlir::cir::StructElementAddr)

Get the address of a member of a struct

The cir.struct_element_addr operation gets the address of a particular named member from the input struct.

Example:

!22struct2EBar22 = type !cir.struct<"struct.Bar", i32, i8>
...
%0 = cir.alloca !22struct2EBar22, cir.ptr <!22struct2EBar22>
...
%1 = cir.struct_element_addr %0, "Bar.a"
%2 = cir.load %1 : cir.ptr <i32>, i32
...
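A C++ counterpart of the example above might look like the following sketch. The struct mirrors the i32 and i8 members of struct.Bar, assuming a target where int is 32 bits and char is 8 bits; the second member name and the function name are hypothetical.

```cpp
// Mirrors !cir.struct<"struct.Bar", i32, i8>; member 'b' is an illustrative name.
struct Bar {
  int a;
  char b;
};

constexpr int read_member_a(int value) {
  Bar bar{value, 'x'};
  int *addr_of_a = &bar.a;  // address of the named member
  return *addr_of_a;        // load through that address
}
```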

Attributes:

Attribute    MLIR Type           Description
member_name  ::mlir::StringAttr  string attribute

Operands:

Operand Description
struct_addr CIR pointer type

Results:

Result Description
result CIR pointer type

cir.switch (::mlir::cir::SwitchOp)

Switch operation

Syntax:

operation ::= `cir.switch` custom<SwitchOp>(
              $regions, $cases, $condition, type($condition)
              )
              attr-dict

The cir.switch operation represents C/C++ switch functionality for conditionally executing multiple regions of code. The operand to a switch is an integral condition value.

A variadic list of “case” attribute operands and regions track the possible control flow within cir.switch. A case must be in one of the following forms:

  • equal, <constant>: equality of the second case operand against the condition.
  • anyof, [constant-list]: equal to any of the values in the list that follows.
  • default: any other value.

Each case region must be explicitly terminated.

Examples:

cir.switch (%b : i32) [
  case (equal, 20) {
    ...
    cir.yield break
  },
  case (anyof, [1, 2, 3] : i32) {
    ...
    cir.return ...
  },
  case (default) {
    ...
    cir.yield fallthrough
  }
]
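A C++ switch with the same shape as the example above could look like the sketch below. The function name and return values are hypothetical, and the case form noted in each comment is an assumed correspondence.

```cpp
// Hypothetical function shaped like the cir.switch example.
constexpr int classify(int b) {
  int result = 0;
  switch (b) {
  case 20:  // a single constant: equal
    result = 1;
    break;  // leave the switch
  case 1:
  case 2:
  case 3:      // several constants sharing one body: anyof
    return 2;  // this body ends in a return, not a break
  default:     // any other value
    result = 3;
    // No break: control falls through to the end of the switch.
  }
  return result;
}
```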

Traits: AutomaticAllocationScope, NoRegionArguments, RecursiveSideEffects, SameVariadicOperandSize

Interfaces: RegionBranchOpInterface

Attributes:

Attribute  MLIR Type          Description
cases      ::mlir::ArrayAttr  cir.switch case array attribute

Operands:

Operand Description
condition integer

cir.yield (::mlir::cir::YieldOp)

Terminate CIR regions

Syntax:

operation ::= `cir.yield` ($kind^)? ($args^ `:` type($args))? attr-dict

The cir.yield operation terminates regions on different CIR operations: cir.if, cir.scope, cir.switch and cir.loop.

It may yield SSA values; the semantics of how the values are yielded are defined by the parent operation. Note: there are currently no uses of cir.yield with operands; they should be helpful to represent lifetime extension out of short-lived scopes in the future.

Optionally, cir.yield can be annotated with extra kind specifiers:

  • break: breaks out of the innermost cir.switch or cir.loop; cannot be used unless nested inside one of these operations.
  • fallthrough: execution falls to the next region in the cir.switch case list. Only available inside cir.switch regions.
  • continue: only allowed under cir.loop; continues execution at the next loop step.

As a general rule, cir.yield must be explicitly used whenever a region has more than one block and no terminator, or within cir.switch regions not cir.return terminated.

Example:

cir.if %4 {
  ...
  cir.yield
}

cir.switch (%5) [
  case (equal, 3) {
    ...
    cir.yield fallthrough
  }, ...
]

cir.loop (cond : {...}, step : {...}) {
  ...
  cir.yield continue
}

Traits: HasParent<IfOp, ScopeOp, SwitchOp, LoopOp>, ReturnLike, Terminator

Attributes:

Attribute  MLIR Type                     Description
kind       ::mlir::cir::YieldOpKindAttr  yield kind

Operands:

Operand Description
args any type

Passes

-cir-lifetime-check: Check lifetime safety and generate diagnostics

This pass relies on a lifetime analysis pass and uses the diagnostics mechanism to report to the user. It does not change any code.

Options

-history : List of history styles to emit as part of diagnostics. Supported styles: {all|null|invalid}
-remarks : List of remark styles to enable as part of diagnostics. Supported styles: {all|pset}

-cir-merge-cleanups: Remove unnecessary branches to cleanup blocks

The canonicalize pass is too aggressive for CIR when the pipeline is used for C/C++ analysis. This pass runs some rewrites for scopes, merging some blocks and eliminating unnecessary control flow.



from Hacker News https://ift.tt/EIX8YLv
