Monday, August 30, 2021

In Go, pointers (mostly) don't go with slices in practice

When I wrote about why it matters that map values are unaddressable in Go, there were a set of Twitter replies from Sean Barrett:

Knowing none of the details & not being a go programmer, I would have guessed that map values aren't addressable because they're in a dynamically-sized hash table so they need to get relocated behind the user's back; getting the address of a value slot would break that.

But I'd also have assumed Go has dynamically-extensible arrays, and the same argument would apply in that case, so maybe not?

This sparked an article about how Go maps store their values and keys, so today I'm writing about the second part of Barrett's reply, about "dynamically-extensible arrays", because the situation here in Go is peculiar (especially from the perspective of a C or C++ programmer trying to extend their intuitions to Go). Put simply, Go has pointers and it has something like dynamically extensible arrays, but in practice you can't use pointers to slices or slice elements. Trying to combine the two is a recipe for pain, confusion, and weird problems.

On the surface, things look straightforward. The Go version of dynamically extensible arrays are slices. Slices and elements of slices are among Go's addressable values, so both of the following pointer creations are legal:

var s []int
s = append(s, 10, 20, 30)
// pointer to a slice element
//  and the slice
pe := &s[0]
ps := &s

At this moment you can dereference pe and ps and get the results you expect, including if you modify s[0] with eg 's[0] = 100'. Where things go off the rails is if you do anything else with the slice s, such as:

s = append(s, 50)
// return the slice from a function
return s

There are two problems. The first problem, possibly exposed by the append(), is that slice elements actually live in an anonymous backing array. Modifying the size of a slice (such as by appending another element to it) may create a new version of this anonymous backing array, and when the array is reallocated, any pointers to the old one aren't updated to point to the new one and so won't see any changes to it. So if you have the following code:

pe = &s[0]
s = append(s, 50)
s[0] = 100

The value of '*pe' may or may not now be 100, depending on whether the append() created a new version of the backing array.

The second problem is that slices themselves are passed, returned, and copied by value, which doesn't quite do what you might think because slices are lightweight things. A slice is a length, a reference to the anonymous backing array, and a capacity. Copying the slice copies these three, but doesn't copy the anonymous backing array itself, which means that many slices can refer to the same anonymous backing array (and yes this can get confusing and create fun problems).

When you take a pointer to a slice, you get a pointer to the current version of this tuple of information for the slice. This pointer may or may not refer to a slice that anyone else is using; for instance:

ps := &s
s = append(s, 50)

At this point, '*ps' may or may not be the same thing as 's', and so it might or might not have the new '50' element at the end. The more time that passes between taking a pointer to a slice and the slice being further manipulated, the less likely it is that 'ps' points to anything useful. If the slice 's' is returned from a function the return is copy by value, and so ps definitely no longer points to the live slice that the caller is using, although ps might have the same length and refer to the same anonymous backing array.

This leads to the situation mentioned on Twitter by Tom Cheng:

> regular GC keeps pointers to the old version alive if necessary.

Wait... what? so if i get a pointer into an array, then resize the array, then get a pointer to the same index, i'll have 2 valid pointers to 2 completely different objects??

(For 'array', read 'dynamically extensible array', so a slice. The answer is yes.)

It's possible to use pointers to slices or to slice elements in limited situations, if you're very careful with what you're doing with them (or know exactly what you're doing and why). But in general, pointers to slices and slice elements don't do what you want.

Honestly, this is a strange and peculiar situation, although Go programmers have acclimatized to it. To programmers from other languages, such as C or C++, the concept of pointers to dynamically extensible arrays seems like a perfectly decent idea that surely should exist and work in Go. Well, it exists, and it "works" in the sense that it yields results and doesn't crash your program, but it doesn't "work" in the sense of doing what you'd actually want.



from Hacker News https://ift.tt/3mEqE3u

No comments:

Post a Comment

Note: Only a member of this blog may post a comment.