Sunday, August 30, 2020

Sum Types in Julia and Rust

As I have mentioned in other posts on this blog (and surely many would agree here), I strongly believe that a great start for designing a program is figuring out good representations for the real world problem one tries to solve, i.e. defining appropriate types. One helpful technique for defining types are so called sum types. Interestingly, they are used completely differently in different programming languages which I would like to briefly demonstrate in this article using Rust and Julia as examples.

Remark: Julia and Rust are my favourite programming languages and they are particularly suited to demonstrate my point. That should not stop you from reading if you use neither of these languages, though! First, both have a rather intuitive syntax and the code shown here is really basic, so you should be able to follow along. Second, you can probably substitute your inheritance-using language for Julia (e.g. Java) or your tagged-union-using language for Rust (e.g. Haskell). Even if not, you might learn something anyway!

Scenario

Suppose, you are creating an intergalactic game where players can inhabit three different star systems: Our sun (the solarians), the polar star (the polarians) and Alpha Centauri (the centaurians). An obvious first step would be to define types for each of these three classes:

In Julia:

struct Solarian end
struct Polarian end
struct Centaurian end

and in Rust (I will stick to this order if not stated otherwise):

struct Solarian;
struct Polarian;
struct Centaurian;

(You would add some fields to those structs, of course, but that’s not the point here.)

Now, let’s say you need to store the player value in another object like a game state:

struct GameState
    daytime::UInt
    player::???
end
struct GameState {
    daytime: u64,
    player: ???
}

What type would you use for player in each case? And what about

function homestar(player::???)::String
    ...
end
fn homestar(player: ???) -> String {
    ...
}

These are the questions we’re here for.

Strategy 1: Inheritance / subtyping

Julia

This is very straightforward in Julia. Define

and make the structs from above subtypes of that:

struct Solarian <: Player end
struct Polarian <: Player end
struct Centaurian <: Player end

Dynamic dispatch then allows you to write:

homestar(player::Solarian) = "sun"
homestar(player::Polarian) = "polar star"
homestar(player::Centaurian) = "Alpha Centauri"
homestar(player::Player) = "unknown" # or `error("Unknown player class")`

Rust

Rust does not have subtyping but you can define relations between types via traits (similar to interfaces in other languages). So, we can try

trait Player {
    fn homestar(&self) -> String;
}

and then

impl Player for Solarian {
    fn homestar(&self) -> String {
        String::from("sun")
    }
}

But what about the GameState struct? We can make use of trait objects, about which we only know that they implement a certain trait. We especially don’t know the object’s size so we need to reference to it. If you’re familiar with Rust, you know that this either means using a “normal” reference (& or &mut) and dealing with life time annotations (not my favourite, to be honest) or using a Boxed value. If you’re not familiar with that, it doesn’t really matter here anyway. GameState could look like this:

struct GameState {
    daytime: u64,
    player: Box<dyn Player>
}

This works, but does restrict us in what we can do with the Player trait, as it needs to be object safe. For example, greeting another player of an arbitrary class

trait Player {
    fn homestar(&self) -> String;
    fn greet<T: Player>(&self, other: &T);
}

would not work anymore.

This would be easy in Julia, though:

greet(self::Polarian, other::Player) = println("Hi!")
greet(self::Polarian, other::Solarian) = println("Your star doesn't even consist of three separate stars? Pathetic!")

Strategy 1b: Constrained generics

Both Julia and Rust offer defining generic types with type parameters. Additionally, one can put constraints on those parameters. This allows us to do a smiliar thing as in strategy 1 but without hiding the concrete player type in the GameState struct:

abstract type Player end
struct Solarian <: Player end

struct GameState{P <: Player}
    daytime::UInt
    player::P
end

or

trait Player { ... }
struct Solarian;
impl Player for Solarian { ... }

struct GameState<P: Player> {
    daytime: u64,
    player: P
}

This approach can be considered more performant in general, as the compiler can emit code for concrete types (monomorphisation) instead of carrying around a clumsy super type that keeps track of subtypes and virtual method tables. (Instances of type Player in Julia and of type Box<dyn Player> in Rust work quite similiarly, as far as I understand.)

However, there is a major concern with this approach: abstraction leaking. Users of the GameState struct now have to decide what concrete type to choose for the parameter P, very strictly so in Rust, not so much in Julia. This can be okay, but often enough GameState is the right place to deal with that question and really should hide that detail.

So as a rule of thumb: Consider generics for this problem “deep down” in your code but don’t miss the abstraction level where it needs to be kept inside.

Strategy 2: Enumerations

So far, the approaches were not really satisfying for Rust. But enumerations come to the rescue! Enumerations (or enums) are types that consist of a concrete and explicit list of possible values (so called variants).

Rust

In Rust, those values can even have their own data which is perfect for our situation:

enum PlayerClass {
    Sol(Solarian),
    Pol(Polarian),
    Cent(Centaurian)
}

struct GameState {
    daytime: u64,
    player: PlayerClass
}

impl PlayerClass {
    fn homestar(&self) -> String {
        use PlayerClass::*;
        match self {
            Sol(_) => String::from("sun"),
            ...
        }
    }

    fn greet(&self, other: &PlayerClass) {
        use PlayerClass::*;
        match (self, other) {
            (Cent(_), Pol(_)) => println!("Hello, fellow three-star-systemer!"),
            ...
        };
    }
}

This indeed solves all our previous problems and is hence the go-to pattern for this situation.

Julia

Julia does technically have enums, but not really.

@enum PlayerClass Sol Pol Cent

exists and it will give you a type PlayerClass with exactly these three possible values but enums are not a first class citizen. (It is quite interesting to look at @macroexpand @enum Foo A B C though, to see how enums “work” in Julia.) But that’s not even relevant for us, more importantly, enum variants in Julia are not able to carry additional data with them! So all the information in the fields of the Solarian, Polarian, and Centaurian struct can not be build into the PlayerClass enum. There is no way to sensibely implement

function takedamage!(player::PlayerClass)
    ...
end

whereas it is trivial in Rust:

impl PlayerClass {
    fn takedamage(&mut self) {
        match &mut self {
            PlayerClass::Sol(solarian) => solarian.health -= 42,
            ...
        };
    }
}

assuming the struct Solarian has a field health: u64.

Conclusion

You might have guessed it by now, but the general advice is: Use subtyping in Julia and enums in Rust. One caveat though: Enums can not be extended from the “outside”. If you plan to let users of your library add new type variants, you will have to use some kind of trait approach.

If you are somewhat familiar with Rust or Julia then a large part of this article is probably very obvious to you. “Of course I use subtyping, that’s what everybody loves Julia for!” or “Of course I use enums, they’re one of the key features of Rust!” you might say.

However, I find it quite surprising that these two – seemingly rather different – patterns can be used in the same situation. If you are (without loss of generality) a Julia developer, being aware of the corresponding pattern in Rust might help you design appropriate types faster, should you ever get into that situation; and vice versa.



from Hacker News https://ift.tt/32HUmcj

No comments:

Post a Comment

Note: Only a member of this blog may post a comment.