Friday, April 23, 2021

Can you tell an assembly language when you see one?

Assembly programming is nowadays seen as niche at best, and more often than not as needlessly meticulous, demanding, and wasteful even for its niches.

Assembly is hard. It is unfriendly. Programming in assembly language is slow and error-prone. This is conventional wisdom.

Unfortunately, these days this wisdom is mostly nurtured by people who have little or no idea of what modern assembly languages look like. Assembly programming didn't stay in the 50s; it evolved along with high-level languages, incorporating elements of structured, functional, and object-oriented programming. It plays well with modern APIs and DOMs. It is, of course, conceptually low-level, but you can build rather high-level abstractions on top of it as well.

In fact, I'm not even sure that anyone can easily tell assembly code from some high-level code without googling. Can you?

1. GUI

Here is a piece of code. It creates a window with WinAPI and starts a message processing loop for it.

Please read it and conclude if it's written in some variant of assembly or in some high-level language.

WinMain proc hInst:HINSTANCE,hPrevInst:HINSTANCE,CmdLine:LPSTR,CmdShow:DWORD
    LOCAL wc:WNDCLASSEX       ; create local variables on stack 
    LOCAL msg:MSG
    LOCAL hwnd:HWND

    mov   wc.cbSize,SIZEOF WNDCLASSEX      ; fill values in members of wc 
    mov   wc.style, CS_HREDRAW or CS_VREDRAW
    mov   wc.lpfnWndProc, OFFSET WndProc
    mov   wc.cbClsExtra,NULL
    mov   wc.cbWndExtra,NULL
    push  hInstance
    pop   wc.hInstance
    mov   wc.hbrBackground,COLOR_WINDOW+1
    mov   wc.lpszMenuName,NULL
    mov   wc.lpszClassName,OFFSET ClassName
    invoke LoadIcon,NULL,IDI_APPLICATION
    mov   wc.hIcon,eax
    mov   wc.hIconSm,eax
    invoke LoadCursor,NULL,IDC_ARROW
    mov   wc.hCursor,eax
    invoke RegisterClassEx, addr wc        ; register our window class 
    invoke CreateWindowEx,NULL,\
        ADDR ClassName, ADDR AppName,\
        WS_OVERLAPPEDWINDOW,\
        CW_USEDEFAULT, CW_USEDEFAULT,\
        CW_USEDEFAULT, CW_USEDEFAULT,\
        NULL, NULL, hInst, NULL
    mov   hwnd,eax
    invoke ShowWindow, hwnd,CmdShow        ; display our window on desktop 
    invoke UpdateWindow, hwnd              ; refresh the client area

    .WHILE TRUE                            ; Enter message loop 
        invoke GetMessage, ADDR msg,NULL,0,0
        .BREAK .IF (!eax)
        invoke TranslateMessage, ADDR msg
        invoke DispatchMessage, ADDR msg
    .ENDW
    mov     eax,msg.wParam                 ; return exit code in eax 
    ret
WinMain endp

This is written in MASM32, which is conceptually a set of macros and libraries on top of the Microsoft Macro Assembler. It works wonders with WinAPI, and it's very easy to get started with. And while creating and maintaining huge applications with it remains an issue, making simple, clean, and fast things is a pleasure, not a burden.

Source: Iczelion's Win32 Assembly Homepage, Tutorial 3: A Simple Window. http://win32assembly.programminghorizon.com/tut3.html

2. Libraries

This is an example of a one-function library. The function “add” simply adds two integers and returns their sum.

(module
  (func $add (param $lhs i32) (param $rhs i32) (result i32)
    local.get $lhs
    local.get $rhs
    i32.add)
  (export "add" (func $add))
)

This is WebAssembly. Although the main idea behind it is to provide binary code for the web regardless of the source language, it is also a legitimate assembly language in its own right. You can write programs for the web in it directly, so it can be the source language itself.

Source: WebAssembly examples referenced from the official site: https://developer.mozilla.org/en-US/docs/WebAssembly.

3. Algorithms

This is an implementation of the TPK algorithm. It contains a function, a few loops, an array, and some console output.

c@VA t@IC x@½C y@RC z@NC
INTEGERS +5 →c
        →t
    +t      TESTA Z
    -t
            ENTRY Z
SUBROUTINE 6→z
    +tt→y→z
    +tx→y→x
+z+cx   CLOSE WRITE 1

a@/½ b@MA c@GA d@OA e@PA f#HA i@VE x@ME
INTEGERS +20 →b +10 →c +400 →d +999 →e +1 →f
LOOP 10n
    n→x
+b-x→x
    x→q
SUBROUTINE 5 →aq
REPEAT n
    +c →i
LOOP 10n
    +an SUBROUTINE 1 →y
    +d-y TESTA Z
    +i SUBROUTINE 3
    +e SUBROUTINE 4
            CONTROL X
            ENTRY Z
    +i SUBROUTINE 3
    +y SUBROUTINE 4
            ENTRY X
    +i→f→i
REPEAT n
ENTRY A CONTROL A WRITE 2 START 2

This is not assembly. It is Glennie's AUTOCODE — one of the first high-level languages ever.

Source: The Early Development Of Programming Languages by Donald E. Knuth, Luis Trabb Pardo, 1976.

By the way, TPK stands for the authors' initials, Trabb Pardo and Knuth, with a nod to the word “typical”. It's not a real algorithm; it was made up to showcase a bunch of languages on a single example.
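For readers who have not met TPK before, here is a minimal Python sketch of the algorithm as it is usually stated: take a list of numbers (eleven in the classic version), walk it in reverse, apply f(t) = sqrt(|t|) + 5t^3, and report either the result or a "too large" marker when it exceeds 400. The names f and tpk and the "TOO LARGE" string are illustrative choices of this sketch, not taken from the AUTOCODE listing, which works on integers and prints 999 instead.

```python
import math


def f(t):
    # The TPK function: square root of the absolute value plus five times the cube.
    return math.sqrt(abs(t)) + 5 * t ** 3


def tpk(values, limit=400):
    # Walk the input in reverse order and collect (index, result) pairs.
    # Results above the limit are replaced by a marker string.
    results = []
    for index in range(len(values) - 1, -1, -1):
        y = f(values[index])
        results.append((index, "TOO LARGE" if y > limit else y))
    return results


if __name__ == "__main__":
    for index, result in tpk([0, 1, 4, 5]):
        print(index, result)
```

The function returns its results instead of printing them so that the sketch is easy to check; the classic version reads its input from the console and prints as it goes.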

4. Structured programming

Here is an example that computes the horizontal sum of a vector.

v0 = my_vector              // we want the horizontal sum of this
int64 r0 = get_len ( v0 )
int64 r0 = round_u2 ( r0 )
float v0 = set_len ( r0 , v0 )
while ( uint64 r0 > 4) {
        uint64 r0 >>= 1
        float v1 = shift_reduce ( r0 , v0 )
        float v0 = v1 + v0
}
// The sum is now a scalar in v0  

This is ForwardCom assembly language. Agner Fog, who is, among other things, the author of popular optimization manuals and a living inspiration for all of us, proposes this syntax as friendlier to programmers. The syntax does not matter to the computer, but since most people find C more convenient than bare opcodes, it is probably a good idea to make assembly programming more C-like.

Source: code examples from ForwardCom: An open-standard instruction set for high-performance microprocessors by Agner Fog.
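The loop in the ForwardCom listing halves the vector on each pass and adds the upper half onto the lower half until a single element remains. A rough Python sketch of the same idea follows, assuming element counts instead of byte lengths and a plain list instead of a vector register; the name horizontal_sum is hypothetical and nothing here is part of the ForwardCom toolchain.

```python
def horizontal_sum(values):
    # Round the length up to a power of two, as round_u2 does in the listing.
    n = 1
    while n < len(values):
        n *= 2

    # Pad with zeros so every pass can fold the upper half onto the lower half.
    lanes = list(values) + [0.0] * (n - len(values))

    # Each pass halves the active length and adds the two halves element-wise.
    while n > 1:
        n //= 2
        lanes = [lanes[i] + lanes[i + n] for i in range(n)]

    # The sum is now the single remaining element.
    return lanes[0]


if __name__ == "__main__":
    print(horizontal_sum([1.0, 2.0, 3.0, 4.0, 5.0]))
```

On real vector hardware each pass is a single shift plus a single add across all lanes, which is why the number of passes grows with the logarithm of the vector length.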

5. More structured programming

This is a solution to the N-queens problem with console output. It has very little platform dependency, but it doesn't have a lot of high-level features like classes, templates, or built-in containers either.

GET "LIBHDR"

GLOBAL $(
        COUNT: 200
        ALL: 201
$)

LET TRY(LD, ROW, RD) BE
        TEST ROW = ALL THEN
                COUNT := COUNT + 1
        ELSE $(
                LET POSS = ALL & ~(LD | ROW | RD)
                UNTIL POSS = 0 DO $(
                        LET P = POSS & -POSS
                        POSS := POSS - P
                        TRY(LD + P << 1, ROW + P, RD + P >> 1)
                $)
        $)

LET START() = VALOF $(
        ALL := 1
        FOR I = 1 TO 12 DO $(
                COUNT := 0
                TRY(0, 0, 0)
                WRITEF("%I2-QUEENS PROBLEM HAS %I5 SOLUTIONS*N", I, COUNT)
                ALL := 2 * ALL + 1
        $)
        RESULTIS 0
$)

This is not assembly. This is BCPL, the language that inspired B and, through it, C; and C, as you might know, in turn inspired C++, Java, C#, and even some of JavaScript. It is a high-level language, and it's not even all that old: it appeared after Fortran, Algol, Cobol, Lisp, and APL. Its strong point at the time was simplicity, not innovation. Still, the first hello world program is reputed to have been written in BCPL. So was the first MUD, a text-based ancestor of today's MMORPGs.

Source: BCPL, Wikipedia, the free encyclopedia.
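The BCPL listing relies on a bitmask trick: the three arguments of TRY record which columns and diagonals are already attacked, and POSS & -POSS picks the lowest free square in the current row. Here is a Python sketch of the same technique, assuming a board size passed as a parameter instead of the global ALL; the names count_queens and try_row are illustrative.

```python
def count_queens(n):
    # A bitmask with the n lowest bits set: one bit per column.
    all_cols = (1 << n) - 1

    def try_row(ld, cols, rd):
        # ld and rd hold squares attacked along the two diagonals,
        # cols holds columns that already contain a queen.
        if cols == all_cols:
            return 1  # every column is filled, so this is a solution
        count = 0
        poss = all_cols & ~(ld | cols | rd)  # free squares in this row
        while poss:
            p = poss & -poss  # isolate the lowest set bit
            poss -= p
            # Diagonal attacks shift by one column as we move to the next row.
            count += try_row((ld | p) << 1, cols | p, (rd | p) >> 1)
        return count

    return try_row(0, 0, 0)


if __name__ == "__main__":
    for size in range(1, 9):
        print(size, count_queens(size))
```

The structure mirrors the BCPL version closely; the main difference is that the count is returned from the recursion instead of being accumulated in a global.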

6. OOP (the one with classes and methods)

Here is a .NET assembly (not to be confused with assembly as in “assembly language”). It consists of one module with one class having one method that prints “Hello World” to the console output.

// Metadata version: v2.0.50215
.assembly extern mscorlib
{
  .publickeytoken = (B7 7A 5C 56 19 34 E0 89 )
  .ver 2:0:0:0
}
.assembly sample
{
  .custom instance void [mscorlib]System.Runtime.CompilerServices
    .CompilationRelaxationsAttribute::.ctor(int32) =
      ( 01 00 08 00 00 00 00 00 )
  .hash algorithm 0x00008004
  .ver 0:0:0:0
}
.module sample.exe
// MVID: {A224F460-A049-4A03-9E71-80A36DBBBCD3}
.imagebase 0x00400000
.file alignment 0x00000200
.stackreserve 0x00100000
.subsystem 0x0003       // WINDOWS_CUI
.corflags 0x00000001    //  ILONLY
// Image base: 0x02F20000

// =============== CLASS MEMBERS DECLARATION ===================

.class public auto ansi beforefieldinit Hello
       extends [mscorlib]System.Object
{
  .method public hidebysig static void  Main(string[] args) cil managed
  {
    .entrypoint
    // Code size       13 (0xd)
    .maxstack  8
    IL_0000:  nop
    IL_0001:  ldstr      "Hello World!"
    IL_0006:  call       void [mscorlib]System.Console::WriteLine(string)
    IL_000b:  nop
    IL_000c:  ret
  } // end of method Hello::Main

  .method public hidebysig specialname rtspecialname
          instance void  .ctor() cil managed
  {
    // Code size       7 (0x7)
    .maxstack  8
    IL_0000:  ldarg.0
    IL_0001:  call       instance void [mscorlib]System.Object::.ctor()
    IL_0006:  ret
  } // end of method Hello::.ctor

} // end of class Hello

Yes, this is an assembly written in assembly. It is written in ILAsm (IL Assembler) syntax, which assembles into .NET intermediate language.

You can use everything .NET has to provide with it: GUI, database access, networking, all while having very low-level control over details. Yes, it might seem overly verbose and needlessly specific, but it is still a rather mighty variant of assembly. Apart from classes and methods, it has exceptions and strings as native types.

Source: https://docs.microsoft.com/en-us/dotnet/framework/tools/ilasm-exe-il-assembler.

A shameless plug: a few years ago I did my take on introducing macros to ILAsm making MILasm. It's a working proof of concept. Fun to play with, not quite ready for production due to some inherent performance issues.

7. OOP (the one with objects and messages)

This is a TCP server example. It has objects and methods, and it works in its own environment.

Namespace current addSubspace: #SimpleTCP!
Namespace current: SimpleTCP!

"A simple TCP server"
Object subclass: #Server
  instanceVariableNames: 'serverSocket socketHandler'
  classVariableNames: ''
  poolDictionaries: ''
  category: ''!

!Server class methodsFor: 'instance creation'!

new: aServerSocket handler: aHandler
  | simpleServer |
  simpleServer := super new.
  simpleServer socket: aServerSocket.
  simpleServer handler: aHandler.
  simpleServer init.
  ^simpleServer
!!

!Server methodsFor: 'initialization'!

init
  ^self
!!

!Server methodsFor: 'accessing'!

socket
  ^serverSocket
!

socket: aServerSocket
  serverSocket := aServerSocket.
  ^self
!

handler
  ^socketHandler
!

handler: aHandler
  socketHandler := aHandler.
  ^self
!!

!Server methodsFor: 'running'!

run
  | s |
  [
    serverSocket waitForConnection.
    s := (serverSocket accept).
    self handle: s
  ] repeat
!

!Server methodsFor: 'handling'!

handle: aSocket
  socketHandler handle: aSocket
!!
    

Conclusion

Assembly programming is not necessarily all about processor instructions and registers anymore. Yes, it always starts at ground level, but it can be built up with functions, classes, and macros to be as high-level as you want.

Programming in assembly is not always that hard, and it is not always that meticulous. You just have to pick the right level to be at.



from Hacker News https://ift.tt/2QslBWa
