HTTP does a pretty good job staying out of everyone's way.
If you're reading this article, there's a solid chance it was delivered to you over HTTP. Even if you're reading this from an RSS reader or something. And you didn't even have to think about it!
"Not having to think about it" is certainly a measure of success for a given technology. By contrast, I think about Bluetooth a lot. I wish I didn't.
If my personal life is relatively devoid of thinking about HTTP, the same cannot be said of my professional life, wherein I maintain an HTTP/S/2 proxy.
Usually with HTTP, you control at least the client or the server. So you can trust those to some extent. Or, should trust not suffice, you can always get in there and add instrumentation until either you, or the problem, meets their maker.
But when you write a proxy, you can trust no one.
Not even yourself.
Bug reports usually come in the form of "So... stuff appears to be going wrong. We're not sure why. We suspect the proxy."
And who wouldn't!
It's just sitting there, in the middle of everything, taunting you.
It's consuming CPU cycles for reasons that seem dubious at best – why is it spending so much on syscalls? Does it really cost that much to do TLS? I thought TLS was fast now.
So of course, when something is misbehaving, and you don't have client logs, (because you almost never have client logging), and the server logs don't have anything, because they hate you, yes, you personally, what else are you supposed to blame?
As is often the case, this article is motivated by a particularly gnarly bug I've found at work - but it involves HTTP/2, and so, we must start at the beginning.
I briefly considered taking you on a tour of Ethernet and IP, but it turns out I've already done that. So let's skip over that part.
For our purposes, the internet really is a series of tubes.
Every "peer" gets an IP address (no, we're not talking about NAT), and you can establish outbound TCP connections to another peer (which is what a client usually does) or listen for and accept incoming TCP connections (which is what a server usually does).
Once that's done, you get a bidirectional socket that you can read bytes from and write bytes to. You get in-order, reliable delivery - by which I mean there's checksums and retransmission involved.
Because there's a lot of peers and, thus, a lot of packets, there's mechanisms to make sure someone doesn't ruin it for everyone: these include congestion control (to protect the network) and flow control (to protect the receiver), and they're both out of scope, but it's nice to know they've thought about it.
HTTP/1.1 is a delightfully simple protocol, if you ignore most of it.
It merely involves opening a TCP connection to some server, and writing some text to it - and then we get some text back!
Using nc, here netcat-openbsd, we can open a TCP connection to neverssl.com, on port 80 (oh yeah, there's ports - 65536 of them, a whole u16's worth), and just speak handwritten HTTP:
Shell session
$ printf 'HEAD / HTTP/1.1\r\nHost: neverssl.com\r\nConnection: close\r\n\r\n' | nc neverssl.com 80
HTTP/1.1 200 OK
Date: Tue, 13 Sep 2022 19:10:46 GMT
Server: Apache/2.4.53 ()
Upgrade: h2,h2c
Connection: Upgrade, close
Last-Modified: Wed, 29 Jun 2022 00:23:33 GMT
ETag: "f79-5e28b29d38e93"
Accept-Ranges: bytes
Content-Length: 3961
Vary: Accept-Encoding
Content-Type: text/html; charset=UTF-8
Here's a more readable version of the payload we sent:
Shell session
HEAD / HTTP/1.1
Host: neverssl.com
Connection: close
Where every line ends with \r\n, also known as CRLF, for Carriage Return + Line Feed. That's right, HTTP is based on teletypes, which are just remote typewriters.
nc did a DNS lookup as a favor, which means it turned neverssl.com into an IP address, but we could've just as well done it ourselves, using something like dig:
Shell session
$ dig +short A neverssl.com
34.223.124.45
Or, for IPv6:
Shell session
$ dig +short AAAA neverssl.com
2600:1f13:37c:1400:ba21:7165:5fc7:736e
And we could've used either of these in place of neverssl.com in our nc invocation, and things would've worked just as well. I'm just not printing it here in case the IP address does change later and everyone is confused because the one-liner no longer works.
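If you'd rather not shell out to nc, the same exchange can be approximated with nothing but Rust's standard library. This is only a sketch: `build_request` and `send_plaintext` are names made up for illustration, port 80 is hard-coded, and there's no timeout or size limit anywhere.

```rust
use std::io::{self, Read, Write};
use std::net::TcpStream;

/// Builds a body-less HTTP/1.1 request header. Every line ends with CRLF,
/// and an empty line marks the end of the header.
fn build_request(method: &str, path: &str, host: &str) -> String {
    format!("{method} {path} HTTP/1.1\r\nHost: {host}\r\nConnection: close\r\n\r\n")
}

/// Roughly what `printf ... | nc host 80` does: resolve the name, connect,
/// write the request, then read until the server closes the connection.
fn send_plaintext(host: &str, request: &str) -> io::Result<String> {
    // `connect` performs the DNS lookup for us, just like nc did.
    let mut stream = TcpStream::connect((host, 80))?;
    stream.write_all(request.as_bytes())?;

    let mut response = Vec::new();
    stream.read_to_end(&mut response)?;
    // Lossy conversion: the response is not guaranteed to be valid UTF-8.
    Ok(String::from_utf8_lossy(&response).into_owned())
}
```

Calling `send_plaintext("neverssl.com", &build_request("HEAD", "/", "neverssl.com"))` should give back something close to the nc output above, assuming the site is still up.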
All of what we sent is called the "HTTP header", which contains the request line:
HEAD / HTTP/1.1
...itself made up of the "method" (sometimes called "verb"), here HEAD, then the path, here /, and the HTTP protocol version, which is a fixed string which is always set to HTTP/1.1 and nothing else.
IT'S SET TO HTTP/1.1 AND NOTHING ELSE.
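For what it's worth, pulling those three parts out of a request line could look like the sketch below. `RequestLine` and `parse_request_line` are illustrative names, and rejecting every version other than HTTP/1.1 is a simplification made for this example, not something real servers are obliged to do.

```rust
#[derive(Debug, PartialEq)]
struct RequestLine<'a> {
    method: &'a str,
    path: &'a str,
    version: &'a str,
}

/// Splits `METHOD SP PATH SP VERSION CRLF` into its three parts.
/// Returns `None` for anything that does not have exactly that shape.
fn parse_request_line(line: &str) -> Option<RequestLine<'_>> {
    let line = line.strip_suffix("\r\n")?;
    let mut parts = line.split(' ');
    let method = parts.next()?;
    let path = parts.next()?;
    let version = parts.next()?;

    // Reject extra tokens and empty components (e.g. doubled spaces).
    if parts.next().is_some() || method.is_empty() || path.is_empty() {
        return None;
    }
    // Assumption for this sketch: we only speak HTTP/1.1.
    if version != "HTTP/1.1" {
        return None;
    }
    Some(RequestLine { method, path, version })
}
```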
You might hope the request method is case-insensitive, like header names are (spoiler: RFC 7230 says it's case-sensitive), which would mean we should be able to...
Shell session
$ printf 'head / HTTP/1.1\r\nHost: neverssl.com\r\nConnection: close\r\n\r\n' | nc neverssl.com 80
HTTP/1.1 501 Not Implemented
Date: Tue, 13 Sep 2022 20:11:53 GMT
Server: Apache/2.4.53 ()
Allow: OPTIONS,HEAD,GET,POST,TRACE
Content-Length: 203
Connection: close
Content-Type: text/html; charset=iso-8859-1

<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>501 Not Implemented</title>
</head><body>
<h1>Not Implemented</h1>
<p>head not supported for current URL.<br />
</p>
</body></html>
Mhh, nope, nevermind.
The path determines the resource we want to operate on: if we specify a resource that doesn't exist, we might get a response with a status of "404", which means "not found", or "doesn't exist but if we returned a 403 you'd know something of that name exists and our customers might dislike that so all you get is a 404":
Shell session
$ printf 'HEAD /etc/passwd HTTP/1.1\r\nHost: neverssl.com\r\nConnection: close\r\n\r\n' | nc neverssl.com 80
HTTP/1.1 404 Not Found
Date: Tue, 13 Sep 2022 20:15:42 GMT
Server: Apache/2.4.53 ()
Connection: close
Content-Type: text/html; charset=iso-8859-1
We're not done with our request payload yet! We sent:
Host: neverssl.com
This is actually a requirement for HTTP/1.1, and was one of its big selling points compared to, uh...
AhAH! Drew yourself into a corner didn't you.
...Gopher? I guess?
Anyway this lets you host different websites at the same IP address, which is fortunate because, in a world of proxies and CDNs and stuff, this is pretty much the default scenario.
Of course, not all servers check it:
Shell session
$ printf 'HEAD / HTTP/1.1\r\nHost: fasterthanli.me\r\nConnection: close\r\n\r\n' | nc neverssl.com 80 | head -1
HTTP/1.1 200 OK
But some do!
Shell session
$ printf 'HEAD / HTTP/1.1\r\nHost: fasterthanli.me\r\nConnection: close\r\n\r\n' | nc example.org 80 | head -1
HTTP/1.1 404 Not Found
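To make the "different websites at the same IP address" idea a bit more concrete, here's a minimal sketch of how a server or proxy might pick a backend from the Host header. The function names, the route table and the backend address in the usage note are all invented for illustration; a real implementation would also have to deal with ports, duplicate Host headers and absolute-form request targets.

```rust
use std::collections::HashMap;

/// Finds the value of the Host header in a raw request header.
/// Header names are case-insensitive, so `host:` matches too.
fn host_header(header: &str) -> Option<&str> {
    header
        .split("\r\n")
        .skip(1) // skip the request line
        .take_while(|line| !line.is_empty()) // stop at the empty line
        .filter_map(|line| line.split_once(':'))
        .find(|(name, _)| name.eq_ignore_ascii_case("host"))
        .map(|(_, value)| value.trim())
}

/// Looks up the backend for a request, or `None` if we don't serve that host
/// (in which case a 404, like example.org's, seems reasonable).
fn pick_backend<'a>(routes: &'a HashMap<String, String>, header: &str) -> Option<&'a str> {
    // Host names are case-insensitive: normalize before the lookup.
    let host = host_header(header)?.to_ascii_lowercase();
    routes.get(&host).map(String::as_str)
}
```

With a table mapping "example.org" to a hypothetical backend like "10.0.0.1:8080", a request carrying `Host: example.org` resolves to that backend, while `Host: fasterthanli.me` resolves to nothing.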
If we wanted, we could send a body with our request: it's simply an arbitrary payload made of.. bytes. We'd put it after the empty line. After the \r\n\r\n.
We just have to specify how long it is – at least, that's the simple way to do it.
Something like this:
POST / HTTP/1.1
Host: example.org
Content-Length: 27

Take, eat; this is my body.
That allows the server to know the difference between "a client sent the whole request body then went away" and "a client yeeted its connection to us mid-request body".
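Here's a small sketch of what that distinction looks like from the receiving side, using only the standard library. `read_body` and its `max_len` parameter are made up for this example; the cap is there because blindly allocating whatever Content-Length a stranger announces is its own kind of trouble.

```rust
use std::io::{self, Read};

/// Reads exactly `content_length` bytes of body from `reader`.
/// If the peer goes away early, `read_exact` reports `UnexpectedEof`,
/// which is how we tell "sent everything" from "yeeted mid-body".
fn read_body<R: Read>(
    reader: &mut R,
    content_length: usize,
    max_len: usize,
) -> io::Result<Vec<u8>> {
    if content_length > max_len {
        return Err(io::Error::new(io::ErrorKind::InvalidInput, "body too large"));
    }
    let mut body = vec![0u8; content_length];
    reader.read_exact(&mut body)?;
    Ok(body)
}
```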
example.org, one of the few domains I'm allowed to use for illustrative purposes, doesn't actually do anything interesting with our request body:
Shell session
$ printf 'POST / HTTP/1.1\r\nHost: example.org\r\nConnection: close\r\nContent-Length: 27\r\n\r\nTake, eat; this is my body.' | nc example.org 80 | head -15
HTTP/1.1 200 OK
Accept-Ranges: bytes
Cache-Control: max-age=604800
Content-Type: text/html; charset=UTF-8
Date: Tue, 13 Sep 2022 20:28:11 GMT
Etag: "3147526947"
Expires: Tue, 20 Sep 2022 20:28:11 GMT
Last-Modified: Thu, 17 Oct 2019 07:18:26 GMT
Server: EOS (vny/0452)
Content-Length: 1256
Connection: close

<!doctype html>
<html>
<head>
..but it's nice to see a response from another HTTP server.
You can see that responses look fairly similar to requests: they have headers too. Some tell us the media type and character encoding of the response body, for example:
Content-Type: text/html; charset=UTF-8
This is a remnant from the before times, when not everything was UTF-8.
But Amos, even today, not everyth-
LALALA can't hear you. Anyway.
This server also lets us know what time it is, in case we're lost:
Date: Tue, 13 Sep 2022 20:28:11 GMT
There's a bunch of caching stuff too, but, to the relief of everyone, myself included, we're not talking about caching today.
Just like request bodies, response bodies have a corresponding content-length header, and the idea is exactly the same: you want to be sure you've read the whole thing by the time the server closes the connection on you.
Which, speaking of, the server only closes the connection because we asked it to, with a Connection: close request header.
If we don't, it keeps the connection open, ready for another request:
$ printf 'HEAD / HTTP/1.1\r\nHost: example.org\r\n\r\nHEAD / HTTP/1.1\r\nHost: example.org\r\nConnection: close\r\n\r\n' | nc example.org 80
HTTP/1.1 200 OK
Content-Encoding: gzip
Accept-Ranges: bytes
Age: 475287
Cache-Control: max-age=604800
Content-Type: text/html; charset=UTF-8
Date: Tue, 13 Sep 2022 20:33:50 GMT
Etag: "3147526947+gzip"
Expires: Tue, 20 Sep 2022 20:33:50 GMT
Last-Modified: Thu, 17 Oct 2019 07:18:26 GMT
Server: ECS (dcb/7F39)
X-Cache: HIT
Content-Length: 648

HTTP/1.1 200 OK
Content-Encoding: gzip
Accept-Ranges: bytes
Age: 475287
Cache-Control: max-age=604800
Content-Type: text/html; charset=UTF-8
Date: Tue, 13 Sep 2022 20:33:50 GMT
Etag: "3147526947+gzip"
Expires: Tue, 20 Sep 2022 20:33:50 GMT
Last-Modified: Thu, 17 Oct 2019 07:18:26 GMT
Server: ECS (dcb/7F39)
X-Cache: HIT
Content-Length: 648
Connection: close
Here's our request payload, for readability:
HEAD / HTTP/1.1
Host: example.org

HEAD / HTTP/1.1
Host: example.org
Connection: close
The most-used HTTP method is no doubt GET, which just means "gimme that". HEAD is just like GET, except you're telling the server not to send a response body - you're only interested in the response headers.
POST lets us submit forms, or upload stuff. DELETE lets us delete stuff. OPTIONS is used, among other things, for Cross-Origin Resource Sharing preflight requests.
There's other methods, which aren't as interesting, so let's not.
I wanted to talk about chunked transfer encoding, but it's a bit hard to give you a one-liner that works, so instead you're gonna have to trust me: imagine you're uploading a file as it's being generated.
You don't yet know how large it's going to be. So you can't send a Content-Length header. What do?
You send chunks! Like so:
POST / HTTP/1.1
Host: example.org
Connection: close
Transfer-Encoding: chunked

4
Help
C
I am chunked
0

Every chunk is prefixed by L\r\n, where L is the length of the next chunk, formatted as hexadecimal ("I am chunked" is 12 bytes long, hence, 0xC), and the chunk data itself is followed by another \r\n.
To signal that you're done, you send a chunk of length 0. Here too, the server can tell whether the client went away in the middle of sending the body, or after it was done. Some clients don't stick around to find out how you felt about their payload, you know how it is.
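Since I couldn't give you a one-liner, here's a rough sketch of the framing in Rust instead. `encode_chunk` and `decode_chunked` are illustrative names, and the decoder deliberately ignores chunk extensions and trailers, both of which exist in the spec and both of which you'd have to handle for real.

```rust
/// Frames one chunk: hex length, CRLF, data, CRLF.
/// Encoding an empty slice produces the terminating `0\r\n\r\n`.
fn encode_chunk(data: &[u8]) -> Vec<u8> {
    let mut out = format!("{:X}\r\n", data.len()).into_bytes();
    out.extend_from_slice(data);
    out.extend_from_slice(b"\r\n");
    out
}

/// Decodes a complete chunked body. Returns `None` if the framing is
/// invalid or if the input stops before the zero-length chunk, which is
/// the "client went away mid-body" case.
fn decode_chunked(mut input: &[u8]) -> Option<Vec<u8>> {
    let mut body = Vec::new();
    loop {
        // Chunk size line: hexadecimal digits up to the first CRLF.
        let line_end = input.windows(2).position(|w| w == b"\r\n")?;
        let size_str = std::str::from_utf8(&input[..line_end]).ok()?;
        let size = usize::from_str_radix(size_str, 16).ok()?;
        input = &input[line_end + 2..];

        if size == 0 {
            // Last chunk: only the final CRLF should follow (no trailers here).
            return input.starts_with(b"\r\n").then_some(body);
        }

        // Chunk data must be followed by its own CRLF.
        let end = size.checked_add(2)?;
        if input.len() < end || &input[size..end] != b"\r\n" {
            return None;
        }
        body.extend_from_slice(&input[..size]);
        input = &input[end..];
    }
}
```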
HTTP/1.1 is a text-based, human-readable format. The lines of the request and response header are terminated by CRLF (\r\n), and contain various bits of metadata about the request or response.
Requests are made over TCP connections. In HTTP/1.1, multiple requests can be made over the same connection, one after the other. After the header, the body begins (if there is one), and can either be written "all at once", or in chunks, prefixed with their hexadecimal length.
We're far from done, but let's take a moment to consider how we would go about proxying HTTP/1.1.
Let's assume we're a CDN (Content Delivery Network) or an ADN (Application Delivery Network): we operate "edge nodes" at various locations around the world and proxy the requests back to some "worker nodes".
First, we need to accept TCP connections. As the first line of defense against attacks, this already raises questions: do we rate-limit? If so, how? Limiting the overall number of connections we're willing to service concurrently is fairly easy, but it gets harder if you want to limit "per IP address" or "per AS".
Not impossible, just harder.
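As a very rough illustration of "per IP address" limiting, here's a sketch of a connection counter. Everything about it is an assumption: the `ConnLimiter` name, the two limits, and the idea that you call `try_accept` right after accepting a connection and `release` when it closes. A real edge node would need this to be shared across threads, and probably smarter about IPv6 prefixes.

```rust
use std::collections::HashMap;
use std::net::IpAddr;

/// Tracks open connections overall and per client IP.
struct ConnLimiter {
    max_total: usize,
    max_per_ip: usize,
    total: usize,
    per_ip: HashMap<IpAddr, usize>,
}

impl ConnLimiter {
    fn new(max_total: usize, max_per_ip: usize) -> Self {
        Self { max_total, max_per_ip, total: 0, per_ip: HashMap::new() }
    }

    /// Returns true if the connection may proceed, and records it.
    fn try_accept(&mut self, ip: IpAddr) -> bool {
        if self.total >= self.max_total {
            return false; // global limit: the easy part
        }
        let count = self.per_ip.entry(ip).or_insert(0);
        if *count >= self.max_per_ip {
            return false; // per-IP limit: the slightly harder part
        }
        *count += 1;
        self.total += 1;
        true
    }

    /// Must be called once for every connection that `try_accept` allowed.
    fn release(&mut self, ip: IpAddr) {
        if let Some(count) = self.per_ip.get_mut(&ip) {
            *count -= 1;
            let idle = *count == 0;
            self.total -= 1;
            if idle {
                // Drop idle entries so the map does not grow forever.
                self.per_ip.remove(&ip);
            }
        }
    }
}
```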
But let's ignore that part - for each connection we accept on port 80, we must be ready to speak HTTP/1.1. That means reading the HTTP request header (that contains both the "request line" and "request headers").
And then, we have to make our first choice. Let's say we receive this:
GET / HTTP/1.1
Host: fantastic-app.example.org
We already know which "app" this request is meant for – we could start establishing a connection to the app right now, and start writing a similar request header to it!
But the internet is a cold and scary place. As far as network protocols are concerned, anyway.
A well-behaved, innocent client would send a few more headers and then an empty line, indicating the end of the HTTP header.
But a malicious client could do a number of things!
It could, for example, send a header of infinite length:
GET / HTTP/1.1
Host: fantastic-app.example.org
Mwahahah: You will never stop parsing this header because I am never going to stop sending it. Your buffer will grow and grow, until your HTTP server consumes all available resources and dies in the fiery flames of an OOM. Unless you're specifically looking out for that, of course. Point is, this header will never end. In a real-life attack, this header would probably be generated, but in this case, it's hand-written. Because isn't it much more fun that way? Anyway I'll continue until I get a TCP connection reset, indicating your server's untimely death - until it restarts, and I do the same. We are not so different, you and I. Both shovelling bytes, day in, day out. What does it matter what our purpose is, as long as we can send and receive bytes? After all, isn't it what life is _truly_ about? Who can tell. Have you crashed yet? No. Very well, let's read from Webster's dictionary: aba, abacate, abacisci, aback, abactinal, abaculi, abacus (haha, classic), abaft, abaiss.. ooh, almost sent a non-ASCII character there, can't have that in a header value. Abalone, abandon, abandoner, abarthroses, abase... you're still here? Damn, this is more work than I thought. Fine. I give up. You stay online. You stay online and you be the bravest little server you can. Don't let others tell you what you can and can't do, you hear? You're going to be just fine, little server. Just fine. Me? Don't worry about me. Where I'm going, we don't need servers.
Which is why most servers protect against that - they'll just send back an HTTP 431 or something. 414 if it's the URI that's too long. Or just a generic 400 if they don't feel like being specific.
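"Specifically looking out for that" can be as simple as refusing to buffer more than some fixed amount of header. Here's a sketch: `read_header`, `HeaderError` and the `max_len` budget are names invented for this example, and reading one byte at a time is only tolerable if the reader is buffered (say, wrapped in a `BufReader`).

```rust
use std::io::{self, Read};

#[derive(Debug, PartialEq)]
enum HeaderError {
    /// The header exceeded our budget: a 431 (or 400) is in order.
    TooLarge,
    /// The peer closed the connection before the empty line.
    ClosedEarly,
    Io(io::ErrorKind),
}

/// Reads up to and including the `\r\n\r\n` that ends the header,
/// never buffering more than `max_len` bytes.
fn read_header<R: Read>(reader: &mut R, max_len: usize) -> Result<Vec<u8>, HeaderError> {
    let mut header = Vec::new();
    let mut byte = [0u8; 1];
    loop {
        if header.len() >= max_len {
            return Err(HeaderError::TooLarge);
        }
        match reader.read(&mut byte) {
            Ok(0) => return Err(HeaderError::ClosedEarly),
            Ok(_) => header.push(byte[0]),
            Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
            Err(e) => return Err(HeaderError::Io(e.kind())),
        }
        if header.ends_with(b"\r\n\r\n") {
            return Ok(header);
        }
    }
}
```

Because it stops right after the empty line, whatever follows (the body, or a pipelined request) is left in the reader for later.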
That's just one thing attackers can do!
They can also send reasonably-sized headers, but lots and lots and lots and lots and lots of them.
Actually, why complicate things? They could simply open a TCP connection and just... sit there. Doing nothing. And if they open lots and lots and lots of connections, you'll run into some arbitrary limit, maybe it's the maximum number of open file descriptors, maybe it's the amount of memory you have.
But maybe you set a timeout on how long you're willing to sit there with the connection all idle: after all, if someone's connecting, they must already know what they want, correct?
In which case the attacker can fall back to sending a reasonable amount of HTTP request headers, of reasonable size, but sending them one... byte... at... a... time... slowly....... very slowly.
And that's a Slowloris attack. They can also do that with the request body to great effect! (If they can find an endpoint that accepts POST requests, for example).
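The catch is that an idle timeout alone doesn't help here, since the connection is never idle for long. One possible countermeasure is a deadline for the whole header, sketched below. `read_header_by` is a made-up name, and `SlowClient` is just a stand-in that drips one byte per read; on a real `TcpStream` you'd also want `set_read_timeout` with the remaining time, because a blocking read could otherwise sail past the deadline.

```rust
use std::io::{self, Read};
use std::time::{Duration, Instant};

/// Reads a full header, but gives up once `deadline` has passed,
/// no matter how regularly the bytes trickle in.
fn read_header_by<R: Read>(reader: &mut R, deadline: Instant) -> io::Result<Vec<u8>> {
    let mut header = Vec::new();
    let mut byte = [0u8; 1];
    while !header.ends_with(b"\r\n\r\n") {
        if Instant::now() >= deadline {
            return Err(io::Error::new(io::ErrorKind::TimedOut, "header took too long"));
        }
        if reader.read(&mut byte)? == 0 {
            return Err(io::ErrorKind::UnexpectedEof.into());
        }
        header.push(byte[0]);
    }
    Ok(header)
}

/// Stand-in for a Slowloris-style client: one byte per read, after a delay.
struct SlowClient {
    data: Vec<u8>,
    pos: usize,
    delay: Duration,
}

impl Read for SlowClient {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        std::thread::sleep(self.delay);
        if self.pos == self.data.len() || buf.is_empty() {
            return Ok(0);
        }
        buf[0] = self.data[self.pos];
        self.pos += 1;
        Ok(1)
    }
}
```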
So, for all these reasons and more, we probably want to wait until we've received the full HTTP request header before establishing a connection to the "backend" or "app" or "upstream" or whatever you want to call it.
But then, more decisions await.
What balance do you want to strike between "fidelity" (how accurately you reproduce the client's request) and "safety" (enforcing rules)?
For example, RFC 9110 doesn't allow the use of NUL characters in header values. But legacy applications might rely on that! They might treat header values as an opaque byte string, using the full 0-255 range.
What should your proxy do? Return HTTP 400 on byte values > 127 (outside the ASCII range), or pass them through as-is?
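One way to make that decision explicit is to turn it into a policy knob. The sketch below is just one possible policy, with invented names (`Policy`, `header_value_ok`): bytes above 127 are a matter of taste, while NUL, CR and LF are always rejected here, because forwarding those is how header injection happens.

```rust
#[derive(Clone, Copy)]
enum Policy {
    /// Only tab and printable ASCII are accepted.
    Strict,
    /// Bytes 0x80..=0xFF are passed through as opaque data.
    Lenient,
}

/// Decides whether a header value may be forwarded under `policy`.
fn header_value_ok(value: &[u8], policy: Policy) -> bool {
    value.iter().all(|&b| match b {
        0 | b'\r' | b'\n' => false,   // rejected under every policy in this sketch
        b'\t' | 0x20..=0x7E => true,  // tab, space and visible ASCII
        0x80..=0xFF => matches!(policy, Policy::Lenient),
        _ => false,                   // remaining control characters, DEL
    })
}
```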
What if the client sends a Content-Length: 0 header, or no Content-Length header at all, but then follows up with a request body? Do you send it to the backend? Ignore it? Reject the request altogether?
What if the backend returns an HTTP 204 status, which means "no content", but it also sends a Content-Length header with a body? Do you strip the response body when replying to the client, or do you proxy it as-is?
Speaking of bodies – what kind of buffering do you do? Do you wait until you've received the whole request body to relay it to the server? What if it's really large? That might not make sense.
What about the response body though? Especially if you're intent on caching it?
What about transfer encoding? If the backend answers in chunked transfer encoding, do you always reply to the client in kind? Is it okay to turn a non-chunked body into a chunked one? What if the client has specific requirements?
What about content encoding? Bodies can be compressed with gzip or brotli, for example. Do you decompress request bodies? Do you pass them as-is? What about response bodies? What accept-encoding do you send to the backend? Do you want them to try and compress the response body, or is that your job, as a CDN/ADN/edge?
If you're caching, do you cache bodies uncompressed? Compressed? With which algorithm? Is it a good idea to decompress from brotli on-the-fly for clients that don't support its encoding? Is it better to recompress to gzip on-the-fly or pay the cost of storing another copy as gzip?
Getting someone to write an HTTP proxy is not covered by the Geneva convention, but, as you can see, maybe it should be.
HTTP/1.1 proxies need to be opinionated, because there are a LOT of ways in which the specifications (plural) can be interpreted. A variety of attacks can be performed against HTTP endpoints.
We've barely touched on the security implications, focusing mostly on resource exhaustion, but as with any protocol, user input cannot be trusted, and we must validate everything, set timeouts, and enforce limits aggressively.
But, we suffer HTTP for good reason: it's ubiquitous.
Let's get meta! I said you're probably reading this article over HTTP, and so we can read that article from the command line, just as we did before with neverssl.com and example.org:
Shell session
$ printf 'GET /articles/the-http-crash-course-nobody-asked-for HTTP/1.1\r\nHost: fasterthanli.me\r\nConnection: close\r\n\r\n' | nc fasterthanli.me 80
HTTP/1.1 301 Moved Permanently
location: https://fasterthanli.me/articles/the-http-crash-course-nobody-asked-for
server: Fly/54d1d920f (2022-09-30)
via: 1.1 fly.io
fly-request-id: 01GES4PVF73NH3QDMM6HHHQBTJ-cdg
content-length: 0
date: Fri, 07 Oct 2022 11:53:50 GMT
That's... that's not the article.
Ah right, it redirects to the HTTPS version. Well that's a bummer. I mean, no! It's good! Pretty much any exchange over the internet should be secured in some way, and TLS (Transport Layer Security, not thread-local storage) is a decent way to do that.
Wait, why do we want to secure this particular exchange? Does reading this article put someone at legal risk somehow? Should they be worried?
Oh, no no. Cryptography isn't illegal anymore. Well, for now.
At the time of this writing it's fine to read this article. Check with your local legislator. It's easier to see the point of encryption if you think about accessing your bank's online interface to, uhh, pay your bills, or count your money Scrooge McDuck style.
But if you care about privacy at all (and you should), encryption should be the default. If all you use Signal for is "sensitive communications", and everything else is plaintext, that's pretty much a big HEY HERE'S WHAT I HAVE TO HIDE sign to any potential attacker.
Also, cryptography isn't foolproof, and there's always the occasional backdoor introduced on purpose because some intelligence agency thought nobody should be able to snoop but them! And then said backdoors end up being used by other actors who aren't "the good guys", and also it turns out the good guys maybe weren't good in the first place and...
Alright, okay, point taken, so — do we write a TLS implementation now or...?
Oh no, absolutely not. Normally I'd use rustls but here in the command line I'll just take what we probably already have installed, which is OpenSSL, which you can tell is old because it still has "SSL" in the name.
SSL 2.0 was deprecated in 2011, and SSL 3.0 in 2015 because, woof — for those keeping score.
The openssl command-line utility comes with an s_client subcommand (see its manpage for more info) that lets us establish a TLS connection with a server, and encrypts from stdin / decrypts to stdout transparently.
So we can just swap nc (which does raw TCP) with openssl s_client and now we can read this article from the command line:
Shell session
$ printf 'GET /articles/the-http-crash-course-nobody-asked-for HTTP/1.1\r\nHost: fasterthanli.me\r\nConnection: close\r\n\r\n' | openssl s_client -verify_quiet -quiet -connect fasterthanli.me:443 | grep Bluetooth -A 2 -B 2
you didn't even have to think about it!</p>
<p>"Not having to think about it" is certainly a measure of success for a given technology. By contrast, <a href="https://twitter.com/fasterthanlime/status/1568236103966007296">I think about Bluetooth a lot</a>. I wish I didn't.</p>
This right here is HTML, a markup language that briefly cosplayed as XML but is now its own thing, that you should definitely not parse with regular expressions.
Luckily, this is out of scope for this article.
You know what isn't, though? Rust. Because if we want to go past HTTP/1.1, we're going to need more than just the command line.
Wait, doesn't curl have an --http2 option?
curl? What's a curl? Can you eat it? Do you wear it as a hat? Does it run on my smart watch?
Actually chances are, it d-
Let's start the easy way! The reqwest crate lets us do that at a relatively high level, so, let's.
Shell session
$ cargo new --bin crash
     Created binary (application) `crash` package
We'll want an async runtime:
Shell session
$ cargo add tokio -F full
    Updating crates.io index
      Adding tokio v1.21.2 to dependencies.
             Features:
(cut)
And of course, reqwest:
Shell session
$ cargo add reqwest
(cut)
And then, this code:
Rust code
// in `src/main.rs`

#[tokio::main]
async fn main() {
    let response = reqwest::get("http://example.org").await.unwrap();
    println!(
        "Got HTTP {}, with headers: {:#?}",
        response.status(),
        response.headers()
    );

    let body = response.text().await.unwrap();

    let num_lines = 10;
    println!("First {num_lines} lines of body:");
    for line in body.lines().take(num_lines) {
        println!("{line}");
    }
}
Shell session
$ cargo run
(cut)
     Running `target/debug/crash`
Got HTTP 200 OK, with headers: {
    "age": "471421",
    "cache-control": "max-age=604800",
    "content-type": "text/html; charset=UTF-8",
    "date": "Fri, 07 Oct 2022 13:50:20 GMT",
    "etag": "\"3147526947+ident\"",
    "expires": "Fri, 14 Oct 2022 13:50:20 GMT",
    "last-modified": "Thu, 17 Oct 2019 07:18:26 GMT",
    "server": "ECS (dcb/7F39)",
    "vary": "Accept-Encoding",
    "x-cache": "HIT",
    "content-length": "1256",
}
First 10 lines of body:
<!doctype html>
<html>
<head>
    <title>Example Domain</title>

    <meta charset="utf-8" />
    <meta http-equiv="Content-type" content="text/html; charset=utf-8" />
    <meta name="viewport" content="width=device-width, initial-scale=1" />
    <style type="text/css">
    body {
Not much to say here! We did a plaintext HTTP/1.1 request, and we got a UTF-8-encoded HTML response. So far so good.
Now let's do the same, but with HTTPS (which, right now, is just HTTP/1.1 over TLS). All we have to do is change our URL to start with https://:
Rust code
let response = reqwest::get("https://example.org").await.unwrap();
The output is exactly the same: we might hit a different server, the date would be different, but the payload is the exact same, and looking at the response headers, there's really no way to tell we even used HTTPS.
How is reqwest able to speak TLS? Let's find out:
Shell session
$ cargo tree -i openssl
openssl v0.10.42
└── native-tls v0.2.10
    ├── hyper-tls v0.5.0
    │   └── reqwest v0.11.12
    │       └── crash v0.1.0 (/home/amos/bearcove/crash)
    ├── reqwest v0.11.12 (*)
    └── tokio-native-tls v0.3.0
        ├── hyper-tls v0.5.0 (*)
        └── reqwest v0.11.12 (*)
Oh! It just uses openssl (by default).
It feels weird that it just.. worked. Just like that. I don't trust it.
Let's do a little packet capture with tcpdump so we can see what's going down in Wireshark. The network interface I use for the internet on this VM is called enp0s3, so, in one terminal:
Shell session
$ sudo tcpdump -i enp0s3 -s 65536 -w /shared/crash.cap
tcpdump: listening on enp0s3, link-type EN10MB (Ethernet), snapshot length 65536 bytes
(at this point it just waits)
And in another:
Shell session
$ cd crash/
$ cargo run --quiet
(cut)
Then, Ctrl-C in the tcpdump terminal:
Shell session
^C215 packets captured
229 packets received by filter
0 packets dropped by kernel
And then we can open it in Wireshark! It was a bit noisy so I had to find the right TCP stream (and filter with right click -> "Follow" -> "TCP Stream"), and then we get this:
First we have the TCP handshake: SYN, SYN-ACK, and ACK. Then my computer (in this case, 10.0.2.15, oh no you have my IP, don't hack me!) sends a TLS ClientHello, and that's what's focused just so you can see what's in there: one noteworthy part is the "server name" extension, which is set to example.org, and that's how I know it's actually the right TCP stream!
Next up we can see the server replies with a ServerHello, and then... there's a bunch of Application Data, which are opaque. If we click on them, we just see the encrypted version:
So, well, uhh... we don't know what's in there, but we can tell it's using TLS at least, which is what we wanted to verify.
If we go back to the insecure http:// URL real quick, we see a different picture:
This time, we still see the TCP handshake, but then Wireshark is able to decode the following packets as plaintext HTTP. It even parses the headers, the body, everything!
It's time to move down one level of abstraction.
reqwest uses hyper, and so, that seems like the next logical place to go to.
Shell session
$ cargo rm reqwest
(cut)
$ cargo add hyper -F client,tcp,http1
(cut)
Rust code
// in `src/main.rs`

#[tokio::main]
async fn main() {
    let response = hyper::Client::new()
        .get("http://example.org".parse().unwrap())
        .await
        .unwrap();
    println!(
        "Got HTTP {}, with headers: {:#?}",
        response.status(),
        response.headers()
    );

    let body = response.body();
    println!("Body: {:?}", body);
}
Shell session
$ cargo run --quiet
Got HTTP 200 OK, with headers: {
    "age": "479710",
    "cache-control": "max-age=604800",
    "content-type": "text/html; charset=UTF-8",
    "date": "Fri, 07 Oct 2022 16:14:16 GMT",
    "etag": "\"3147526947+ident\"",
    "expires": "Fri, 14 Oct 2022 16:14:16 GMT",
    "last-modified": "Thu, 17 Oct 2019 07:18:26 GMT",
    "server": "ECS (dcb/7F5E)",
    "vary": "Accept-Encoding",
    "x-cache": "HIT",
    "content-length": "1256",
}
Body: Body(Streaming)
Hey, that's eerily similar! And it wasn't even that bad!
Say, that's uhh.. that's not the body at all.
Oh right, the body. Well, in reqwest's API, you could already see the separation between "header" and "body", because we had to .await twice: once to get the header, and another time to get the whole body as text.
The HTTP header is not to be confused with HTTP headers.
An HTTP header begins with a start-line (a request-line for requests, or a status-line for responses) and is followed by zero or more header fields. See RFC 9112, Section 2.1 for details.
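To make the header-versus-headers distinction concrete, here's a sketch that splits a raw response into its start-line and its fields. `ResponseHeader` and `split_header` are illustrative names, and the parsing is deliberately naive: no duplicate handling, no validation of field names.

```rust
/// "The header": one start-line plus any number of header fields.
struct ResponseHeader<'a> {
    status_line: &'a str,
    fields: Vec<(&'a str, &'a str)>,
}

/// Splits everything before the empty line into a status-line and fields.
/// Returns `None` if the header is incomplete or a field has no colon.
fn split_header(raw: &str) -> Option<ResponseHeader<'_>> {
    // The empty line (CRLF CRLF) separates the header from the body.
    let (header, _body) = raw.split_once("\r\n\r\n")?;
    let mut lines = header.split("\r\n");
    let status_line = lines.next()?;
    let fields = lines
        .map(|line| line.split_once(':').map(|(name, value)| (name, value.trim())))
        .collect::<Option<Vec<_>>>()?;
    Some(ResponseHeader { status_line, fields })
}
```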
Hyper does give us a handle to the body immediately, and then we can choose what to do with it. Using the lowest-level interface available, we can poll it for data, getting one buffer at a time:
Rust code
// in `src/main.rs`

use hyper::body::HttpBody;
use std::pin::Pin;

#[tokio::main]
async fn main() {
    let response = hyper::Client::new()
        .get("http://example.org".parse().unwrap())
        .await
        .unwrap();

    let mut body = response.into_body();
    while let Some(buffer) = std::future::poll_fn(|cx| Pin::new(&mut body).poll_data(cx)).await {
        let buffer = buffer.unwrap();
        println!("Read {} bytes", buffer.len());
    }
}
With this code, sometimes we get this:
Shell session
$ cargo run --quiet
Read 1256 bytes
And sometimes:
Shell session
$ cargo run --quiet
Read 1125 bytes
Read 131 bytes
Which is interesting! We can see how it corresponds to the size of the reads performed against the TCP socket with strace:
Shell session
$ strace -ff ./target/debug/crash 2>&1 | grep -E 'recvfrom|Read [0-9]+ bytes'
[pid 174829] recvfrom(9, "$\241\201\200\0\1\0\1\0\0\0\0\7example\3org\0\0\34\0\1\300\f\0"..., 2048, 0, {sa_family=AF_INET6, sin6_port=htons(53), sin6_flowinfo=htonl(0), inet_pton(AF_INET6, "fdaa:0:a0c3::3", &sin6_addr), sin6_scope_id=0}, [28]) = 57
[pid 174829] recvfrom(9, "G\276\201\200\0\1\0\1\0\0\0\0\7example\3org\0\0\1\0\1\300\f\0"..., 65536, 0, {sa_family=AF_INET6, sin6_port=htons(53), sin6_flowinfo=htonl(0), inet_pton(AF_INET6, "fdaa:0:a0c3::3", &sin6_addr), sin6_scope_id=0}, [28]) = 45
[pid 174813] recvfrom(9, "HTTP/1.1 200 OK\r\nAge: 135402\r\nCa"..., 8192, 0, NULL, NULL) = 1591
[pid 174812] write(1, "Read 1256 bytes\n", 16 <unfinished ...>
[pid 174827] write(4, "\1\0\0\0\0\0\0\0", 8Read 1256 bytes
And here's the version in two reads:
Shell session
$ strace -ff ./target/debug/crash 2>&1 | grep -E 'recvfrom|Read [0-9]+ bytes'
[pid 175754] recvfrom(9, "2\375\201\200\0\1\0\1\0\0\0\0\7example\3org\0\0\1\0\1\300\f\0"..., 2048, 0, {sa_family=AF_INET6, sin6_port=htons(53), sin6_flowinfo=htonl(0), inet_pton(AF_INET6, "fdaa:0:a0c3::3", &sin6_addr), sin6_scope_id=0}, [28]) = 45
[pid 175754] recvfrom(9, "\v\376\201\200\0\1\0\1\0\0\0\0\7example\3org\0\0\34\0\1\300\f\0"..., 65536, 0, {sa_family=AF_INET6, sin6_port=htons(53), sin6_flowinfo=htonl(0), inet_pton(AF_INET6, "fdaa:0:a0c3::3", &sin6_addr), sin6_scope_id=0}, [28]) = 57
[pid 175740] recvfrom(9, "HTTP/1.1 200 OK\r\nAge: 481205\r\nCa"..., 8192, 0, NULL, NULL) = 1460
[pid 175737] write(1, "Read 1125 bytes\n", 16 <unfinished ...>
Read 1125 bytes
[pid 175740] recvfrom(9, "sking for permission.</p>\n <p"..., 8192, 0, NULL, NULL) = 131
[pid 175737] write(1, "Read 131 bytes\n", 15 <unfinished ...>
[pid 175740] shutdown(9, SHUT_WRRead 131 bytes
Notice how the first read is 1460 bytes? I guess that's the MSS for the path from my VM to example.org! How fun!
But how come sometimes we get it all in one read?
Ah, because the kernel does its own buffering.
Anyway, the poll_data code is kinda scary, using the recently-stabilized std::future::poll_fn construct, which you really shouldn't have to know about unless you really want to.
Instead, we can use a standard interface: streams! Which are like iterators, but asynchronous.
We just need to add a feature to hyper:
Shell session
$ cargo add hyper -F stream
    Updating crates.io index
      Adding hyper v0.14.20 to dependencies.
(cut)
(Doing cargo add for an existing dependency adds to the already-present features so we don't need to list tcp, client, http1, etc. again)
And also add a convenience crate to use streams, since very little of async is actually in the standard library:
Shell session
$ cargo add futures (cut)
And now our code can look like this:
Rust code
use futures::TryStreamExt;

#[tokio::main]
async fn main() {
    let response = hyper::Client::new()
        .get("http://example.org".parse().unwrap())
        .await
        .unwrap();

    let mut body = response.into_body();
    while let Some(buffer) = body.try_next().await.unwrap() {
        println!("Read {} bytes", buffer.len());
    }
}
And it does the exact same thing, except, we could now use combinators on the body if we wanted, and we didn't have to know much about how futures work in Rust.
In our case though, we don't really need to read the body in a streaming fashion, we're fine collecting it all and handling it as a single buffer, so we can do this:
Rust code
#[tokio::main]
async fn main() {
    let response = hyper::Client::new()
        .get("http://example.org".parse().unwrap())
        .await
        .unwrap();

    let body = hyper::body::to_bytes(response.into_body()).await.unwrap();
    println!("body is {} bytes", body.len());
}
And now it looks somewhat closer to the reqwest version. It's not doing the same thing yet though: we're getting a slice of bytes, not a UTF-8 encoded string. If we want a true equivalent, we'll do this:
Rust code
#[tokio::main]
async fn main() {
    let response = hyper::Client::new()
        .get("http://example.org".parse().unwrap())
        .await
        .unwrap();

    let body = String::from_utf8(
        hyper::body::to_bytes(response.into_body())
            .await
            .unwrap()
            .to_vec(),
    )
    .unwrap();
    println!("response body: {body}");
}
And you can kinda see all the failure points at the .unwrap() callsites. Which, by the way, means "turn this Result<T, E> into a T or die trying", and by die I mean "print a backtrace/stacktrace and quit the program".
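To make the "or die trying" part concrete, here's a small std-only sketch. The `parse_length` helper is hypothetical (it isn't part of hyper or of the code above); it just returns a `Result` so we can compare unwrapping it with matching on it.

```rust
use std::num::ParseIntError;

/// Parses a header value such as "1256" into a number.
fn parse_length(value: &str) -> Result<usize, ParseIntError> {
    value.trim().parse::<usize>()
}

fn main() {
    // `unwrap` on an `Ok` simply hands back the inner value.
    let n = parse_length("1256").unwrap();
    println!("parsed {n}");

    // Matching lets the program decide what to do instead of panicking.
    // Calling `.unwrap()` on this one would end the program right here.
    match parse_length("lots") {
        Ok(n) => println!("parsed {n}"),
        Err(e) => println!("not a number: {e}"),
    }
}
```

In a function that itself returns a `Result`, the `?` operator is the usual middle ground: it bubbles the error up to the caller instead of panicking.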
Client::get() can fail because we might not be able to establish a TCP connection, we might not be able to write our HTTP/1.1 request, the server might not respond with HTTP/1.1 at all, or it might close the connection early.
body::to_bytes() can fail because, again, the server could close the connection early. Or it could be using chunked transfer encoding and send invalid chunk prefixes.
String::from_utf8 can fail because the body might not actually be valid UTF-8, it could just be arbitrary binary spaghetti.
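That last failure mode is easy to reproduce without any network at all. Here's a minimal sketch using only the standard library; `decode_strict` and `decode_lossy` are names I made up for the two approaches.

```rust
use std::string::FromUtf8Error;

/// Strict decoding: refuses anything that isn't valid UTF-8.
fn decode_strict(bytes: Vec<u8>) -> Result<String, FromUtf8Error> {
    String::from_utf8(bytes)
}

/// Lossy decoding: replaces invalid sequences with U+FFFD instead of failing.
fn decode_lossy(bytes: &[u8]) -> String {
    String::from_utf8_lossy(bytes).into_owned()
}

fn main() {
    // 0xff can never appear in valid UTF-8, so this stands in for "binary spaghetti".
    let spaghetti = vec![b'h', 0xff];

    match decode_strict(spaghetti.clone()) {
        Ok(s) => println!("valid: {s}"),
        Err(e) => println!("invalid: {e}"),
    }
    println!("lossy: {}", decode_lossy(&spaghetti));
}
```

Which one you want depends on whether a mangled body is better than no body at all.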
Still, that code isn't too bad. Why do we need to bother with reqwest again?
I'm not sure! Let's try doing HTTPS now:
Rust code
let response = hyper::Client::new()
    // 👇 was http
    .get("https://example.org".parse().unwrap())
    .await
    .unwrap();
Shell session
$ cargo run --quiet
thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: hyper::Error(Connect, "invalid URL, scheme is not http")', src/main.rs:6:10
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
Oh! There's a panic from one of those famous unwrap calls.
Shell session
$ RUST_BACKTRACE=1 cargo run --quiet thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: hyper::Error(Connect, "invalid URL, scheme is not http")', src/main.rs:6:10 stack backtrace: 0: rust_begin_unwind at /rustc/a55dd71d5fb0ec5a6a3a9e8c27b2127ba491ce52/library/std/src/panicking.rs:584:5 1: core::panicking::panic_fmt at /rustc/a55dd71d5fb0ec5a6a3a9e8c27b2127ba491ce52/library/core/src/panicking.rs:142:14 2: core::result::unwrap_failed at /rustc/a55dd71d5fb0ec5a6a3a9e8c27b2127ba491ce52/library/core/src/result.rs:1814:5 3: core::result::Result<T,E>::unwrap at /rustc/a55dd71d5fb0ec5a6a3a9e8c27b2127ba491ce52/library/core/src/result.rs:1107:23 4: crash::main:: at ./src/main.rs:3:20 5: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll at /rustc/a55dd71d5fb0ec5a6a3a9e8c27b2127ba491ce52/library/core/src/future/mod.rs:91:19 6: tokio::park::thread::CachedParkThread::block_on:: at /home/amos/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.21.2/src/park/thread.rs:267:54 7: tokio::coop::with_budget:: at /home/amos/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.21.2/src/coop.rs:102:9 8: std::thread::local::LocalKey<T>::try_with at /rustc/a55dd71d5fb0ec5a6a3a9e8c27b2127ba491ce52/library/std/src/thread/local.rs:445:16 9: std::thread::local::LocalKey<T>::with at /rustc/a55dd71d5fb0ec5a6a3a9e8c27b2127ba491ce52/library/std/src/thread/local.rs:421:9 10: tokio::coop::with_budget at /home/amos/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.21.2/src/coop.rs:95:5 11: tokio::coop::budget at /home/amos/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.21.2/src/coop.rs:72:5 12: tokio::park::thread::CachedParkThread::block_on at /home/amos/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.21.2/src/park/thread.rs:267:31 13: tokio::runtime::enter::Enter::block_on at /home/amos/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.21.2/src/runtime/enter.rs:152:13 14: 
tokio::runtime::scheduler::multi_thread::MultiThread::block_on at /home/amos/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.21.2/src/runtime/scheduler/multi_thread/mod.rs:79:9 15: tokio::runtime::Runtime::block_on at /home/amos/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.21.2/src/runtime/mod.rs:492:44 16: crash::main at ./src/main.rs:15:5 17: core::ops::function::FnOnce::call_once at /rustc/a55dd71d5fb0ec5a6a3a9e8c27b2127ba491ce52/library/core/src/ops/function.rs:248:5 note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
And there's a backtrace.
So, wait, hyper can't do HTTPS?
I mean.. it must be able to, right? Since reqwest can, and reqwest uses hyper?
That's correct! We just have to explicitly enable it. And this time, we'll use rustls, not OpenSSL:
Shell session
$ cargo add hyper-rustls (cut)
Rust code
#[tokio::main]
async fn main() {
    let conn = hyper_rustls::HttpsConnectorBuilder::new()
        .with_native_roots()
        .https_or_http()
        .enable_http1()
        .build();

    let client = hyper::Client::builder().build::<_, hyper::Body>(conn);

    let response = client
        .get("https://example.org".parse().unwrap())
        .await
        .unwrap();

    let body = String::from_utf8(
        hyper::body::to_bytes(response.into_body())
            .await
            .unwrap()
            .to_vec(),
    )
    .unwrap();
    println!("response body: {body}");
}
The output is exactly the same, so I'm not going to show it here.
Okay yeah now it is a little long-winded.
Yes, hence the use for something higher-level like reqwest. But you know what's cool about going lower-level?
No, control! Now, for example, we can enable a very fun rustls option.
Shell session
$ cargo add rustls
Rust code
use std::sync::Arc;

use hyper_rustls::ConfigBuilderExt;
use rustls::{ClientConfig, KeyLogFile};

#[tokio::main]
async fn main() {
    let mut client_config = ClientConfig::builder()
        .with_safe_defaults()
        .with_native_roots()
        .with_no_client_auth();
    // this is the fun option
    client_config.key_log = Arc::new(KeyLogFile::new());

    let conn = hyper_rustls::HttpsConnectorBuilder::new()
        .with_tls_config(client_config)
        .https_or_http()
        .enable_http1()
        .build();

    let client = hyper::Client::builder().build::<_, hyper::Body>(conn);

    let response = client
        .get("https://example.org".parse().unwrap())
        .await
        .unwrap();

    let body = String::from_utf8(
        hyper::body::to_bytes(response.into_body())
            .await
            .unwrap()
            .to_vec(),
    )
    .unwrap();
    println!("response body: {body}");
}
And now for the grand reveal... again we'll run tcpdump in a terminal, and in the other:
Shell session
$ SSLKEYLOGFILE=/shared/sslkeylogfile cargo run (cut)
Tada!
All I needed was to give Wireshark the path to our sslkeylogfile, which rustls wrote for us (that's what KeyLogFile does), and which looks like this btw:
CLIENT_HANDSHAKE_TRAFFIC_SECRET 7c0414ee6d236f73fd5f382bd35608a5e3c8d513c5b86574eb4a4f02209f1726 d477d1ae27cbb94ff7068f187540f9534dc8407e4e1ce4e5d76504799d094ffb2ee4038889d2b8fb870f236db46e6c6b
SERVER_HANDSHAKE_TRAFFIC_SECRET 7c0414ee6d236f73fd5f382bd35608a5e3c8d513c5b86574eb4a4f02209f1726 3d5102a8a2f61187df77b5acd18f255fc62f29750d6ee9d0204ccdb088d814a5823b4f091d762c490bb3cd2f802cc4b4
CLIENT_TRAFFIC_SECRET_0 7c0414ee6d236f73fd5f382bd35608a5e3c8d513c5b86574eb4a4f02209f1726 bfe83697e414f522fbefb2a7685133bc4a6680277f84e43b9d18c5cbebff3401641d8b6b09a4c2e21c704868dd67c358
SERVER_TRAFFIC_SECRET_0 7c0414ee6d236f73fd5f382bd35608a5e3c8d513c5b86574eb4a4f02209f1726 92aadcdc3987693e5d6650cae166dccee5b305fc52e70901d093d3fa2f8f274d7b773b762e8888a4b23f6831a4c4d30e
EXPORTER_SECRET 7c0414ee6d236f73fd5f382bd35608a5e3c8d513c5b86574eb4a4f02209f1726 0be36d7656f908a04847f3d09de9429d611ab539f761a1a96f70674c7efe588ac0829fd11b980490f7816cb5e5f2b61c
You can set that up for yourself in Edit -> Preferences -> Protocols -> TLS (you can type to jump to it), and from there, set the "(Pre)-Master-Secret log filename".
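The key log format itself is pleasantly boring: one entry per line in the file, made of a label, the client random in hex, and the secret in hex, separated by spaces. Here's a rough sketch of reading one entry back. `KeyLogEntry` and `parse_key_log_line` are hypothetical names, the hex values in `main` are shortened placeholders, and the parser doesn't validate the hex at all.

```rust
/// One entry of an NSS-style key log file.
#[derive(Debug)]
struct KeyLogEntry<'a> {
    label: &'a str,
    client_random: &'a str,
    secret: &'a str,
}

/// Splits a single key log line into its three fields.
/// Returns `None` for blank lines, `#` comments, and malformed lines.
fn parse_key_log_line(line: &str) -> Option<KeyLogEntry<'_>> {
    let line = line.trim();
    if line.is_empty() || line.starts_with('#') {
        return None;
    }

    let mut parts = line.split_ascii_whitespace();
    let entry = KeyLogEntry {
        label: parts.next()?,
        client_random: parts.next()?,
        secret: parts.next()?,
    };
    // Anything after the third field means this isn't a line we understand.
    if parts.next().is_some() {
        return None;
    }
    Some(entry)
}

fn main() {
    // Shortened, made-up hex values: real ones are much longer.
    let line = "CLIENT_TRAFFIC_SECRET_0 7c0414ee bfe83697";
    println!("{:?}", parse_key_log_line(line));
}
```

Note how every entry for a given connection shares the same client random: that's how a tool can match secrets to the right TLS session in a capture.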
Here's what the client sent:
0000   47 45 54 20 2f 20 48 54 54 50 2f 31 2e 31 0d 0a   GET / HTTP/1.1..
0010   68 6f 73 74 3a 20 65 78 61 6d 70 6c 65 2e 6f 72   host: example.or
0020   67 0d 0a 0d 0a                                    g....
And here's what the server replied with:
0000   48 54 54 50 2f 31 2e 31 20 32 30 30 20 4f 4b 0d   HTTP/1.1 200 OK.
0010   0a 41 67 65 3a 20 34 38 34 38 39 34 0d 0a 43 61   .Age: 484894..Ca
0020   63 68 65 2d 43 6f 6e 74 72 6f 6c 3a 20 6d 61 78   che-Control: max
0030   2d 61 67 65 3d 36 30 34 38 30 30 0d 0a 43 6f 6e   -age=604800..Con
0040   74 65 6e 74 2d 54 79 70 65 3a 20 74 65 78 74 2f   tent-Type: text/
0050   68 74 6d 6c 3b 20 63 68 61 72 73 65 74 3d 55 54   html; charset=UT
0060   46 2d 38 0d 0a 44 61 74 65 3a 20 46 72 69 2c 20   F-8..Date: Fri, 
0070   30 37 20 4f 63 74 20 32 30 32 32 20 31 37 3a 30   07 Oct 2022 17:0
0080   35 3a 30 30 20 47 4d 54 0d 0a 45 74 61 67 3a 20   5:00 GMT..Etag: 
0090   22 33 31 34 37 35 32 36 39 34 37 2b 69 64 65 6e   "3147526947+iden
00a0   74 22 0d 0a 45 78 70 69 72 65 73 3a 20 46 72 69   t"..Expires: Fri
00b0   2c 20 31 34 20 4f 63 74 20 32 30 32 32 20 31 37   , 14 Oct 2022 17
00c0   3a 30 35 3a 30 30 20 47 4d 54 0d 0a 4c 61 73 74   :05:00 GMT..Last
00d0   2d 4d 6f 64 69 66 69 65 64 3a 20 54 68 75 2c 20   -Modified: Thu, 
00e0   31 37 20 4f 63 74 20 32 30 31 39 20 30 37 3a 31   17 Oct 2019 07:1
00f0   38 3a 32 36 20 47 4d 54 0d 0a 53 65 72 76 65 72   8:26 GMT..Server
0100   3a 20 45 43 53 20 28 64 63 62 2f 37 46 36 30 29   : ECS (dcb/7F60)
0110   0d 0a 56 61 72 79 3a 20 41 63 63 65 70 74 2d 45   ..Vary: Accept-E
0120   6e 63 6f 64 69 6e 67 0d 0a 58 2d 43 61 63 68 65   ncoding..X-Cache
0130   3a 20 48 49 54 0d 0a 43 6f 6e 74 65 6e 74 2d 4c   : HIT..Content-L
0140   65 6e 67 74 68 3a 20 31 32 35 36 0d 0a 0d 0a      ength: 1256....
This is all pretty familiar. I feel like we have a fairly good handle on what HTTP/1.1 looks like right now.
Which means we can probably do it ourselves.
First let's get rid of hyper:
Shell session
$ cargo rm hyper
    Removing hyper from dependencies
$ cargo rm hyper-rustls
    Removing hyper-rustls from dependencies
We'll skip the plaintext part, since we're already at the "setting up rustls ourselves" stage. Because we'll want to be able to use the AsyncRead / AsyncWrite traits, we'll want to pull in tokio-rustls as well.
Shell session
$ cargo add tokio-rustls
    Updating crates.io index
      Adding tokio-rustls v0.23.4 to dependencies.
Because we need to tell rustls which certificates to trust, we need to point it to a set of "certificate roots". We could use the Mozilla set or just rely on whatever the OS has installed. Let's go with the latter:
Shell session
$ cargo add rustls-native-certs
    Updating crates.io index
      Adding rustls-native-certs v0.6.2 to dependencies.
I also would like some nice error types by default, so, let's grab color-eyre:
Shell session
$ cargo add color-eyre
    Updating crates.io index
      Adding color-eyre v0.6.2 to dependencies
And because we're getting into serious business, let's also pull in tracing and tracing-subscriber:
Shell session
$ cargo add tracing tracing-subscriber (cut)
Finally, because writing a parser by hand is no fun, let's pull in nom:
Shell session
$ cargo add nom
    Updating crates.io index
      Adding nom v7.1.1 to dependencies.
Here's the main structure of our program:
Rust code
// in `src/main.rs`

use std::{str::FromStr, sync::Arc};

use color_eyre::eyre::eyre;
use nom::Offset;
use rustls::{Certificate, ClientConfig, KeyLogFile, RootCertStore};
use tokio::{
    io::{AsyncReadExt, AsyncWriteExt},
    net::TcpStream,
};
use tracing::info;
use tracing_subscriber::{filter::targets::Targets, layer::SubscriberExt, util::SubscriberInitExt};

mod http11;

#[tokio::main]
async fn main() -> color_eyre::Result<()> {
    color_eyre::install().unwrap();

    let filter_layer =
        Targets::from_str(std::env::var("RUST_LOG").as_deref().unwrap_or("info")).unwrap();
    let format_layer = tracing_subscriber::fmt::layer();
    tracing_subscriber::registry()
        .with(filter_layer)
        .with(format_layer)
        .init();

    info!("Establishing a TCP connection...");
    let stream = TcpStream::connect("example.org:443").await?;

    info!("Setting up TLS root certificate store");
    let mut root_store = RootCertStore::empty();
    for cert in rustls_native_certs::load_native_certs()? {
        root_store.add(&Certificate(cert.0))?;
    }

    let mut client_config = ClientConfig::builder()
        .with_safe_defaults()
        .with_root_certificates(root_store)
        .with_no_client_auth();
    client_config.key_log = Arc::new(KeyLogFile::new());
    let connector = tokio_rustls::TlsConnector::from(Arc::new(client_config));

    info!("Performing TLS handshake");
    let mut stream = connector.connect("example.org".try_into()?, stream).await?;

    info!("Sending HTTP/1.1 request");
    let req = [
        "GET / HTTP/1.1",
        "host: example.org",
        "user-agent: cool-bear/1.0",
        "connection: close",
        "",
        "",
    ]
    .join("\r\n"); // allocates gratuitously which is fine for a sample
    stream.write_all(req.as_bytes()).await?;

    info!("Reading HTTP/1.1 response");

    let mut accum: Vec<u8> = Default::default();
    let mut rd_buf = [0u8; 1024];

    let (body_offset, res) = loop {
        let n = stream.read(&mut rd_buf[..]).await?;
        info!("Read {n} bytes");
        if n == 0 {
            return Err(eyre!(
                "unexpected EOF (server closed connection during headers)"
            ));
        }

        accum.extend_from_slice(&rd_buf[..n]);

        match http11::response(&accum) {
            Err(e) => {
                if e.is_incomplete() {
                    info!("Need to read more, continuing");
                    continue;
                } else {
                    return Err(eyre!("parse error: {e}"));
                }
            }
            Ok((remain, res)) => {
                let body_offset = accum.offset(remain);
                break (body_offset, res);
            }
        };
    };

    info!("Got HTTP/1.1 response: {:#?}", res);
    let mut body_accum = accum[body_offset..].to_vec();
    // header names are case-insensitive, let's get it right. we're assuming
    // that the absence of content-length means there's no body, and also we
    // don't support chunked transfer encoding.
    let content_length = res
        .headers
        .iter()
        .find(|(k, _)| k.eq_ignore_ascii_case("content-length"))
        .map(|(_, v)| v.parse::<usize>().unwrap())
        .unwrap_or_default();

    while body_accum.len() < content_length {
        let n = stream.read(&mut rd_buf[..]).await?;
        info!("Read {n} bytes");
        if n == 0 {
            return Err(eyre!("unexpected EOF (peer closed connection during body)"));
        }

        body_accum.extend_from_slice(&rd_buf[..n]);
    }

    info!("===== Response body =====");
    info!("{}", String::from_utf8_lossy(&body_accum));

    Ok(())
}
At this point, since we're pretty familiar with the protocol itself, I want to take some time to talk about the buffering strategy.
We have rd_buf, 1KiB on the stack, that's used for making reads, which are then copied into accum. That essentially adds a memcpy per read; we could easily do better by growing accum and reading directly into it.
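Here's roughly what "growing accum and reading directly into it" could look like. This is a sketch using the blocking `std::io::Read` trait rather than tokio's `AsyncRead`, so that it runs on its own; `read_into` is a made-up helper, and it pays for zero-filling the new space on every call.

```rust
use std::io::Read;

/// Reads up to `max` more bytes from `src` straight into the tail of `accum`.
/// Returns how many bytes were read (0 means EOF).
fn read_into(src: &mut impl Read, accum: &mut Vec<u8>, max: usize) -> std::io::Result<usize> {
    let old_len = accum.len();
    // Grow the buffer with zeroes so the reader has initialized space to fill.
    accum.resize(old_len + max, 0);
    let res = src.read(&mut accum[old_len..]);
    // Shrink back to what was actually filled, even if the read failed.
    let n = match &res {
        Ok(n) => *n,
        Err(_) => 0,
    };
    accum.truncate(old_len + n);
    res
}

fn main() -> std::io::Result<()> {
    // A byte slice implements `Read`, which makes it a handy stand-in for a socket.
    let mut src: &[u8] = b"HTTP/1.1 200 OK\r\ncontent-length: 0\r\n\r\n";
    let mut accum = Vec::new();
    loop {
        let n = read_into(&mut src, &mut accum, 16)?;
        println!("Read {n} bytes, accum is now {} bytes", accum.len());
        if n == 0 {
            break;
        }
    }
    Ok(())
}
```

No intermediate buffer, no extra copy. The zero-fill is the remaining waste, and it's only there to avoid handing uninitialized memory to the reader.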
We could do even better if we were dealing with the headache that is uninitialized data (it's left as an exercise to you, the reader). I took a quick look and apparently by using bytes::BytesMut it's not as much of a headache as I thought? Still, that's not what I'm focused on in this article. You do it.
I don't think I've said it explicitly, so here goes: we don't control the amount of data we read off of a socket. We can limit it (here we read at most 1024 bytes), but we could be getting data one byte at a time. There is something cleverer to be done parsing-wise, probably, but the solution here is to just accumulate the whole header (and some of the body, accidentally) into a single buffer, and have a "streaming parser".
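To see why accumulating matters, here's a tiny simulation of the worst case, where data trickles in one byte per read. It doesn't parse anything: `header_end` is a hypothetical helper that only looks for the blank line ending the header, and it rescans the whole buffer every time, which is exactly the kind of thing a cleverer approach would avoid.

```rust
/// Returns the offset just past the blank line that ends the header, if the
/// buffer contains one yet. `None` means "incomplete, keep reading".
fn header_end(buf: &[u8]) -> Option<usize> {
    buf.windows(4)
        .position(|w| w == b"\r\n\r\n")
        .map(|pos| pos + 4)
}

fn main() {
    let wire = b"HTTP/1.1 200 OK\r\ncontent-length: 2\r\n\r\nhi";
    let mut accum = Vec::new();

    // Worst case: each "read" hands us a single byte.
    for (reads, byte) in wire.iter().enumerate() {
        accum.push(*byte);
        if let Some(end) = header_end(&accum) {
            println!(
                "header complete after {} reads, body starts at offset {end}",
                reads + 1
            );
            break;
        }
    }
}
```

With bigger reads, the same check would fire while part of the body is already sitting in the buffer, which is the "and some of the body, accidentally" part.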
The parser isn't as streaming as I'd like it to be, but it does the job: it gracefully indicates whether the HTTP header is complete or not. And it gets that property simply by using nom's streaming built-in parsers:
Rust code
// in `src/http11.rs`

use nom::{
    bytes::streaming::{tag, take_until, take_while1},
    character::is_digit,
    combinator::{map_res, opt},
    sequence::{preceded, terminated},
    IResult,
};

#[derive(Debug)]
pub struct Response<'a> {
    pub status: u16,
    pub status_text: &'a str,

    // header names/values could be non-UTF-8, but let's not care for this sample.
    // we are however careful not to use a HashMap, since headers can repeat.
    pub headers: Vec<(&'a str, &'a str)>,
}

const CRLF: &str = "\r\n";

// Looks like `HTTP/1.1 200 OK\r\n` or `HTTP/1.1 404 Not Found\r\n`
pub fn response(i: &[u8]) -> IResult<&[u8], Response<'_>> {
    let (i, _) = tag("HTTP/1.1 ")(i)?;
    let (i, status) = terminated(u16_text, ws)(i)?;
    let (i, status_text) =
        map_res(terminated(take_until(CRLF), tag(CRLF)), std::str::from_utf8)(i)?;

    let mut res = Response {
        status,
        status_text,
        headers: Default::default(),
    };

    let mut i = i;
    loop {
        if let (i, Some(_)) = opt(tag(CRLF))(i)? {
            // end of headers
            return Ok((i, res));
        }

        let (i2, (name, value)) = header(i)?;
        res.headers.push((name, value));
        i = i2;
    }
}

/// Parses a single header line
fn header(i: &[u8]) -> IResult<&[u8], (&str, &str)> {
    let (i, name) = map_res(terminated(take_until(":"), tag(":")), std::str::from_utf8)(i)?;
    let (i, value) = map_res(
        preceded(ws, terminated(take_until(CRLF), tag(CRLF))),
        std::str::from_utf8,
    )(i)?;

    Ok((i, (name, value)))
}

/// Parses whitespace (not including newlines)
fn ws(i: &[u8]) -> IResult<&[u8], ()> {
    let (i, _) = take_while1(|c| c == b' ')(i)?;
    Ok((i, ()))
}

/// Parses text as a u16
fn u16_text(i: &[u8]) -> IResult<&[u8], u16> {
    let f = take_while1(is_digit);
    let f = map_res(f, std::str::from_utf8);
    let mut f = map_res(f, |s| s.parse());
    f(i)
}
Well, let's check that it actually works:
Shell session
$ cargo run Compiling crash v0.1.0 (/home/amos/bearcove/crash) Finished dev [unoptimized + debuginfo] target(s) in 1.89s Running `target/debug/crash` 2022-10-09T16:56:38.643152Z INFO crash: Establishing a TCP connection... 2022-10-09T16:56:38.789985Z INFO crash: Setting up TLS root certificate store 2022-10-09T16:56:38.810252Z INFO crash: Performing TLS handshake 2022-10-09T16:56:39.035877Z INFO crash: Sending HTTP/1.1 request 2022-10-09T16:56:39.035973Z INFO crash: Reading HTTP/1.1 response 2022-10-09T16:56:39.142733Z INFO crash: Read 1024 bytes 2022-10-09T16:56:39.142865Z INFO crash: Got HTTP/1.1 response: Response { status: 200, status_text: "OK", headers: [ ( "Age", "515711", ), ( "Cache-Control", "max-age=604800", ), ( "Content-Type", "text/html; charset=UTF-8", ), ( "Date", "Sun, 09 Oct 2022 16:56:39 GMT", ), ( "Etag", "\"3147526947+ident\"", ), ( "Expires", "Sun, 16 Oct 2022 16:56:39 GMT", ), ( "Last-Modified", "Thu, 17 Oct 2019 07:18:26 GMT", ), ( "Server", "ECS (dcb/7F83)", ), ( "Vary", "Accept-Encoding", ), ( "X-Cache", "HIT", ), ( "Content-Length", "1256", ), ( "Connection", "close", ), ], } 2022-10-09T16:56:39.142994Z INFO crash: Read 586 bytes 2022-10-09T16:56:39.143027Z INFO crash: ===== Response body ===== 2022-10-09T16:56:39.143053Z INFO crash: <!doctype html> <html> <head> <title>Example Domain</title> (cut)
And it does! How cool is that? We've actually got a pretty decent start for a real HTTP/1.1 implementation here. It supports a subset of the spec, needs timeouts, and is a bit particular about what it'll accept, but I've seen worse.
Before we move on, let's take a quick look at some numbers: I've moved TLS configuration out of the way (we're not measuring that), and added code like this in various places, just to measure how long things take:
Rust code
info!("Performing a DNS lookup...");
let before = Instant::now();
let addr = "example.org:443"
    .to_socket_addrs()?
    .next()
    .ok_or_else(|| eyre!("Failed to resolve address for example.org:443"))?;
println!("{:?} DNS lookup", before.elapsed());

info!("Establishing a TCP connection...");
let before = Instant::now();
let stream = TcpStream::connect(addr).await?;
println!("{:?} TCP connect", before.elapsed());
Oh also, yeah, for measuring purposes, I've separated DNS lookup from the actual TCP handshake. You can do better than that with a crate like trust-dns-resolver, but that's out of scope for this article.
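If sprinkling `Instant::now()` everywhere gets old, the pattern can be wrapped in a small helper. This is only a sketch for synchronous steps: `timed` is a name I made up, and for the async steps you'd still have to `.await` between the two clock reads. The literal IP address is there so the example runs without touching DNS.

```rust
use std::net::ToSocketAddrs;
use std::time::{Duration, Instant};

/// Runs `f`, prints how long it took in the same "<duration> <label>" shape
/// as the measurements above, and returns both the result and the duration.
fn timed<T>(label: &str, f: impl FnOnce() -> T) -> (T, Duration) {
    let before = Instant::now();
    let out = f();
    let elapsed = before.elapsed();
    println!("{elapsed:?} {label}");
    (out, elapsed)
}

fn main() -> std::io::Result<()> {
    // Assumption: a literal IP stands in for example.org:443, so no lookup
    // actually leaves the machine.
    let (addrs, _elapsed) = timed("address lookup", || "127.0.0.1:443".to_socket_addrs());
    let addr = addrs?.next();
    println!("resolved to {addr:?}");
    Ok(())
}
```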
Let's see what we got!
Shell session
$ for i in $(seq 1 3); do echo "-----------------------"; RUST_LOG=warn cargo run --quiet; done
-----------------------
12.70664ms DNS lookup
88.916729ms TCP connect
242.988959ms TLS handshake
49.64µs Request send
212.122418ms Response header read
15.93µs Response body read
-----------------------
47.281324ms DNS lookup
125.635306ms TCP connect
199.093467ms TLS handshake
27.21µs Request send
289.471602ms Response header read
20.93µs Response body read
-----------------------
50.376377ms DNS lookup
92.435129ms TCP connect
288.879476ms TLS handshake
52.74µs Request send
200.708618ms Response header read
18.4µs Response body read
What conclusions can we draw from these numbers?
None. All benchmarks are lies. There's only sadness ahead.
Conclusion number 1: something's up with my DNS setup. I would expect this lookup to be cached. This is thankfully out of scope.
Conclusion number 2: request send is in the microseconds because the kernel does its own buffering. This is how long it takes to make a syscall, not how long it takes to actually reach the example.org server.
Conclusion number 3: the response body read is in the microseconds because the body is small, and it fits in the read we already did for the response header. If we did a request with a larger body (try it!), the numbers would look quite different.
But also... those numbers are pretty bad, right?
There's a couple things we can do here: we can fix my DNS setup (for the time being), we can switch to a release build (cargo run --release), we can use a site other than example.org which should be closer / more reliably low-latency, but even then:
Shell session
$ for i in $(seq 1 3); do echo "-----------------------"; RUST_LOG=warn cargo run --release --quiet; done
-----------------------
390.53µs DNS lookup
12.203892ms TCP connect
14.378385ms TLS handshake
2.93µs Request send
116.984718ms Response header read
510ns Response body read
-----------------------
375.14µs DNS lookup
18.963162ms TCP connect
13.734487ms TLS handshake
3.79µs Request send
123.921931ms Response header read
400ns Response body read
-----------------------
382.33µs DNS lookup
10.026293ms TCP connect
15.45463ms TLS handshake
3.47µs Request send
169.120957ms Response header read
460ns Response body read
There's one last thing we forgot to do: set TCP_NODELAY, which disables Nagle's algorithm.
Previously, the TCP stack would wait until we have "enough data", or "some amount of time has passed", before actually sending the data to the server. With TCP_NODELAY set, our numbers are looking better:
Shell session
$ for i in $(seq 1 3); do echo "-----------------------"; RUST_LOG=warn cargo run --release --quiet; done
-----------------------
371.14µs DNS lookup
21.606747ms TCP connect
14.028551ms TLS handshake
22.77µs Request send
18.990437ms Response header read
370ns Response body read
-----------------------
396.29µs DNS lookup
33.659129ms TCP connect
14.904545ms TLS handshake
17.32µs Request send
19.642322ms Response header read
360ns Response body read
-----------------------
384.77µs DNS lookup
23.117568ms TCP connect
19.567509ms TLS handshake
29.81µs Request send
19.120578ms Response header read
320ns Response body read
These look a lot better!
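For reference, flipping that switch is a one-liner. The sketch below uses the standard library's blocking `TcpStream` against a throwaway local listener so it can run anywhere; tokio's `TcpStream` exposes a `set_nodelay` method as well. The `connect_nodelay` helper is hypothetical.

```rust
use std::net::{SocketAddr, TcpListener, TcpStream};

/// Connects to `addr` and disables Nagle's algorithm on the resulting socket.
fn connect_nodelay(addr: SocketAddr) -> std::io::Result<TcpStream> {
    let stream = TcpStream::connect(addr)?;
    stream.set_nodelay(true)?;
    Ok(stream)
}

fn main() -> std::io::Result<()> {
    // A local listener on an OS-assigned port stands in for the real server.
    let listener = TcpListener::bind("127.0.0.1:0")?;
    let stream = connect_nodelay(listener.local_addr()?)?;
    println!("TCP_NODELAY enabled: {}", stream.nodelay()?);
    Ok(())
}
```

The option has to be set on the TCP socket itself, before it gets wrapped in a TLS stream.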
But you can still see how one would want to not use connection: close, and use persistent HTTP connections instead.
This is not the same as HTTP pipelining, which I will not discuss, out of spite.
And just like that, we're done with HTTP/1.1. There's a lot more to it, but I feel like we have enough baggage to move on to the next major version of the protocol: HTTP/2.
There's two ways I write articles: either I write about something I already know and I have to feign ignorance, and make up mistakes for didactic purposes. Or I write about something I've actually never done before, and the article acts as structured notes on what I learn.
We've just moved decisively from the former into the latter.
I have been doing HTTP/2, a lot, but I was shielded from the truth by hyper.
So, let's do that first. We're not going to go through reqwest first: as we've seen, its higher-level niceties don't buy us much in this particular scenario.
Shell session
$ cargo rm tokio-rustls nom
$ cargo add hyper -F client,http1,http2,tcp
$ cargo add hyper-rustls -F http2
And, well, let's get started!
Things look much like our HTTP/1.1 hyper example, with all the nice things we've added on top:
Rust code
use std::{str::FromStr, sync::Arc}; use hyper::{Client, Request}; use rustls::{Certificate, ClientConfig, KeyLogFile, RootCertStore}; use tracing::info; use tracing_subscriber::{filter::targets::Targets, layer::SubscriberExt, util::SubscriberInitExt}; #[tokio::main] async fn main() -> color_eyre::Result<()> { color_eyre::install().unwrap(); let filter_layer = Targets::from_str(std::env::var("RUST_LOG").as_deref().unwrap_or("info")).unwrap(); let format_layer = tracing_subscriber::fmt::layer(); tracing_subscriber::registry() .with(filter_layer) .with(format_layer) .init(); info!("Setting up TLS root certificate store"); let mut root_store = RootCertStore::empty(); for cert in rustls_native_certs::load_native_certs()? { root_store.add(&Certificate(cert.0))?; } let mut client_config = ClientConfig::builder() .with_safe_defaults() .with_root_certificates(root_store) .with_no_client_auth(); client_config.key_log = Arc::new(KeyLogFile::new()); let connector = hyper_rustls::HttpsConnectorBuilder::new() .with_tls_config(client_config) .https_only() .enable_http2() .build(); let client = Client::builder() .http2_only(true) .build::<_, hyper::Body>(connector); let req = Request::get("https://example.org").body(hyper::Body::empty())?; info!("Performing HTTP/2 request..."); let res = client.request(req).await?; info!("Response header: {:?}", res); info!("Reading response body..."); let body = hyper::body::to_bytes(res.into_body()).await?; info!("Response body is {} bytes", body.len()); Ok(()) }
Phew, some bit of luck that example.org supports HTTP/2, eh? Did you plan that one out?
Well.. does it? Let's do some packet capture to find out...
Ah, it sure does! I can see HTTP2 as the decoded protocol, so it must be HTTP2.
We can see that TLS is still used, and it's still done over TCP. That's good! That's familiar.
Good thing we have a nice higher-level abstraction to protect us from the implementation details!
Not for long, cool bear... not for long.
See, the thing is, most of the things that are interesting about HTTP/2 are not implemented in the hyper crate, they're implemented in the h2 crate.
And moving just one level of abstraction lower lets us peer into some of the details of HTTP/2, as a softer introduction of what's to come.
Shell session
$ cargo rm hyper-rustls hyper
(cut)
$ cargo add h2 tokio-rustls
(cut)
Everything up until the TLS connection stage is the same. We can pass our "TLS socket" to the h2 crate directly.
Rust code
use std::{net::ToSocketAddrs, str::FromStr, sync::Arc}; use color_eyre::eyre::eyre; use rustls::{Certificate, ClientConfig, KeyLogFile, RootCertStore}; use tokio::net::TcpStream; use tracing::info; use tracing_subscriber::{filter::targets::Targets, layer::SubscriberExt, util::SubscriberInitExt}; #[tokio::main] async fn main() -> color_eyre::Result<()> { color_eyre::install().unwrap(); let filter_layer = Targets::from_str(std::env::var("RUST_LOG").as_deref().unwrap_or("info")).unwrap(); let format_layer = tracing_subscriber::fmt::layer(); tracing_subscriber::registry() .with(filter_layer) .with(format_layer) .init(); info!("Setting up TLS"); let mut root_store = RootCertStore::empty(); for cert in rustls_native_certs::load_native_certs()? { root_store.add(&Certificate(cert.0))?; } let mut client_config = ClientConfig::builder() .with_safe_defaults() .with_root_certificates(root_store) .with_no_client_auth(); client_config.key_log = Arc::new(KeyLogFile::new()); let connector = tokio_rustls::TlsConnector::from(Arc::new(client_config)); let addr = "example.org:443" .to_socket_addrs()? .next() .ok_or_else(|| eyre!("Failed to resolve address for example.org:443"))?; info!("Establishing TCP connection..."); let stream = TcpStream::connect(addr).await?; info!("Establishing TLS session..."); let stream = connector.connect("example.org".try_into()?, stream).await?; info!("Establishing HTTP/2 connection..."); let (_send_req, conn) = h2::client::handshake(stream).await?; tokio::spawn(conn); info!("Now what?"); Ok(()) }
The API here is interesting: we get a SendRequest handle, and a Connection. The Connection bit implements Future and we have to poll it for the other handles to make any progress. We achieve that here by simply spawning it onto the tokio runtime ("starting a background task").
And now, we can use the SendRequest bit to... send a request! That's where we find out about a common crate between hyper and h2, the http crate.
Shell session
$ cargo add http
    Updating crates.io index
      Adding http v0.2.8 to dependencies
And that's very interesting from a design standpoint. For example, there's nothing preventing us from shoving a transfer-encoding header, or a connection header in there, even though that makes no sense in HTTP/2.
But, HeaderMap vaguely behaves like a hash map, and so there's only so much correctness that is being enforced, regardless of what I think.
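If we wanted that correctness check ourselves, it could be a small function run over the headers before sending. The sketch below is std-only and hypothetical (it is not something the http or h2 crates expose under this name); the list of connection-specific headers is my reading of RFC 7540, section 8.1.2.2, where `te` is only allowed with the value `trailers`.

```rust
/// Returns true if this header has no business being in an HTTP/2 message.
fn is_connection_specific(name: &str, value: &str) -> bool {
    // Header names are case-insensitive, so compare accordingly.
    const FORBIDDEN: [&str; 5] = [
        "connection",
        "keep-alive",
        "proxy-connection",
        "transfer-encoding",
        "upgrade",
    ];
    if FORBIDDEN.iter().any(|f| name.eq_ignore_ascii_case(f)) {
        return true;
    }
    // `te` is the one exception: it may appear, but only as `te: trailers`.
    name.eq_ignore_ascii_case("te") && !value.trim().eq_ignore_ascii_case("trailers")
}

fn main() {
    let headers = [
        ("user-agent", "cool-bear/1.0"),
        ("connection", "close"),
        ("transfer-encoding", "chunked"),
        ("te", "trailers"),
    ];
    for (name, value) in headers {
        let verdict = if is_connection_specific(name, value) {
            "not allowed in HTTP/2"
        } else {
            "fine"
        };
        println!("{name}: {value} => {verdict}");
    }
}
```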
Anyway, let's send a request!
Rust code
// (rest of `main` omitted)

info!("Establishing TLS session...");
let stream = connector.connect("example.org".try_into()?, stream).await?;

info!("Establishing HTTP/2 connection...");
let (mut send_req, conn) = h2::client::handshake(stream).await?;
tokio::spawn(conn);

info!("Sending HTTP/2 request...");
let req = http::Request::builder()
    .uri("https://example.org/")
    .body(())?;
let (res, _req_body) = send_req.send_request(req, true)?;
let res = res.await?;
info!("Got HTTP/2 response {res:?}");

let mut body = res.into_body();
let mut body_len = 0;
while let Some(chunk) = body.data().await.transpose()? {
    body_len += chunk.len();
}
info!("Got HTTP/2 response body of {body_len} bytes");
Just as before, this should pretty much work out of the box:
Shell session
$ cargo run
   Compiling crash v0.1.0 (/home/amos/bearcove/crash)
    Finished dev [unoptimized + debuginfo] target(s) in 3.01s
     Running `target/debug/crash`
2022-10-09T18:48:20.367727Z INFO crash: Setting up TLS
2022-10-09T18:48:20.417851Z INFO crash: Establishing TCP connection...
2022-10-09T18:48:20.549700Z INFO crash: Establishing TLS session...
2022-10-09T18:48:20.730001Z INFO crash: Establishing HTTP/2 connection...
2022-10-09T18:48:20.730173Z INFO crash: Sending HTTP/2 request...
Error:
   0: connection error detected: frame with invalid size

Location:
   src/main.rs:55

Backtrace omitted.
Run with RUST_BACKTRACE=1 environment variable to display it.
Run with RUST_BACKTRACE=full to include source snippets.
Oh.
Mhh.
Say amos, you're still connecting to port 443, right?
Well yeah! It's "just https".
And how does the server know you want to speak HTTP2, exactly?
Ohhhh right. So there's essentially two ways. One is the Upgrade header, that goes like so:
> OPTIONS / HTTP/1.1
> Host: server.example.org
> Connection: Upgrade, HTTP2-Settings
> Upgrade: h2c
> HTTP2-Settings: <base64url encoding of HTTP/2 SETTINGS payload>

< HTTP/1.1 101 Switching Protocols
< Connection: Upgrade
< Upgrade: h2c

(HTTP/2 traffic ensues)
This exchange is adapted from RFC 7540. You can also use other verbs, making one HTTP/1.1 request, and using HTTP/2 for subsequent requests, but as the RFC notes, a large initial request can block the use of the connection for further requests.
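As for what goes into that HTTP2-Settings header: per RFC 7540, it's the payload of a SETTINGS frame (a list of 16-bit identifiers each followed by a 32-bit value), base64url-encoded with the trailing padding dropped. Here's a std-only sketch with a hand-rolled encoder; `settings_payload` and `base64url` are hypothetical helpers, and the settings in `main` are example values, not anything this article's client sends.

```rust
const ALPHABET: &[u8; 64] =
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_";

/// URL-safe base64 without `=` padding.
fn base64url(input: &[u8]) -> String {
    let mut out = String::new();
    for chunk in input.chunks(3) {
        // Pack up to three bytes into the top 24 bits of a u32.
        let mut n = 0u32;
        for (i, b) in chunk.iter().enumerate() {
            n |= (*b as u32) << (16 - 8 * i);
        }
        // 1 byte -> 2 chars, 2 bytes -> 3 chars, 3 bytes -> 4 chars.
        for i in 0..chunk.len() + 1 {
            let idx = (n >> (18 - 6 * i)) & 0x3f;
            out.push(ALPHABET[idx as usize] as char);
        }
    }
    out
}

/// Serializes settings as (identifier, value) pairs, big-endian, 6 bytes each.
fn settings_payload(settings: &[(u16, u32)]) -> Vec<u8> {
    let mut out = Vec::with_capacity(settings.len() * 6);
    for (id, value) in settings {
        out.extend_from_slice(&id.to_be_bytes());
        out.extend_from_slice(&value.to_be_bytes());
    }
    out
}

fn main() {
    // Example values: 0x3 = MAX_CONCURRENT_STREAMS, 0x4 = INITIAL_WINDOW_SIZE,
    // 0x2 = ENABLE_PUSH.
    let payload = settings_payload(&[(0x3, 100), (0x4, 0x4000_0000), (0x2, 0)]);
    println!("HTTP2-Settings: {}", base64url(&payload));
}
```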
But that's for plaintext HTTP/2, known affectionately as h2c ("c" for "cleartext", same meaning as "plaintext").
We're already doing TLS, and TLS lets us do ALPN, which stands for Application-Layer Protocol Negotiation: we can tell the server we're willing to speak h2 if they are, just like we use a TLS 1.2 extension to say we can do TLS 1.3 if they're game.
All we're really missing is this line:
Rust code
client_config.alpn_protocols = vec![b"h2".to_vec()];
In the real world, we'd probably want to pass both h2 and http/1.1, so we could fall back to HTTP/1.1 if we wanted, but here we only care about HTTP/2.
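On the wire, that list of protocols ends up in a TLS extension as a series of length-prefixed names, itself prefixed by a 16-bit total length (this is my reading of RFC 7301). rustls does the encoding for us, so the sketch below is purely illustrative, and `encode_alpn` is a made-up function.

```rust
/// Encodes ALPN protocol names as a ProtocolNameList: a 16-bit total length,
/// then each name preceded by its own 8-bit length.
/// Returns `None` if a name is empty or too long, or if the list overflows.
fn encode_alpn(protocols: &[&str]) -> Option<Vec<u8>> {
    let mut list = Vec::new();
    for proto in protocols {
        // Each name must be between 1 and 255 bytes long.
        let len = u8::try_from(proto.len()).ok().filter(|l| *l > 0)?;
        list.push(len);
        list.extend_from_slice(proto.as_bytes());
    }

    let total = u16::try_from(list.len()).ok()?;
    let mut out = total.to_be_bytes().to_vec();
    out.extend_from_slice(&list);
    Some(out)
}

fn main() {
    // Order expresses preference: h2 first, HTTP/1.1 as the fallback.
    let encoded = encode_alpn(&["h2", "http/1.1"]).expect("valid protocol names");
    println!("{encoded:02x?}");
}
```

The server picks one protocol from that list and echoes it back in its own handshake message, which is how both sides agree before any HTTP bytes are exchanged.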
And it works:
Shell session
$ cargo run --quiet
2022-10-17T19:43:43.684054Z INFO crash: Setting up TLS
2022-10-17T19:43:43.697403Z INFO crash: Establishing TCP connection...
2022-10-17T19:43:43.780303Z INFO crash: Establishing TLS session...
2022-10-17T19:43:43.949577Z INFO crash: Establishing HTTP/2 connection...
2022-10-17T19:43:43.949817Z INFO crash: Sending HTTP/2 request...
2022-10-17T19:43:44.116467Z INFO crash: Got HTTP/2 response Response { status: 200, version: HTTP/2.0, headers: {"age": "560541", "cache-control": "max-age=604800", "content-type": "text/html; charset=UTF-8", "date": "Mon, 17 Oct 2022 19:43:44 GMT", "etag": "\"3147526947+ident\"", "expires": "Mon, 24 Oct 2022 19:43:44 GMT", "last-modified": "Thu, 17 Oct 2019 07:18:26 GMT", "server": "ECS (dcb/7EA3)", "vary": "Accept-Encoding", "x-cache": "HIT", "content-length": "1256"}, body: RecvStream { inner: FlowControl { inner: OpaqueStreamRef { stream_id: StreamId(1), ref_count: 2 } } } }
2022-10-17T19:43:44.116612Z INFO crash: Got HTTP/2 response body of 1256 bytes
The Debug
implementation of the body here is interesting: it shows a FlowControl
struct, which has an OpaqueStreamRef
. This is one of the big differences between HTTP/1.1 and HTTP/2. Although they both build upon TCP, HTTP/2 brings stream multiplexing.
We can treat our program like a black box and poke at it with strace to verify that this is the case. Here it is, changed to send five requests concurrently over one connection:
Rust code
// omitted: TLS setup, etc.

info!("Establishing HTTP/2 connection...");
let (mut send_req, conn) = h2::client::handshake(stream).await?;
tokio::spawn(conn);

let (tx, mut rx) = tokio::sync::mpsc::channel::<color_eyre::Result<()>>(1);
for i in 0..5 {
    let req = http::Request::builder()
        .uri("https://example.org/")
        .body(())?;
    let (res, _req_body) = send_req.send_request(req, true)?;

    let fut = async move {
        let mut body = res.await?.into_body();
        info!("{i}: received headers");
        let mut body_len = 0;
        while let Some(chunk) = body.data().await.transpose()? {
            body_len += chunk.len();
        }
        info!("{i}: received body ({body_len} bytes)");
        Ok::<_, color_eyre::Report>(())
    };

    let tx = tx.clone();
    tokio::spawn(async move { _ = tx.send(fut.await).await });
}

drop(tx);
while let Some(res) = rx.recv().await {
    res?;
}
Shell session
$ cargo build --quiet && strace -e connect ./target/debug/crash
2022-10-17T19:58:14.593076Z INFO crash: Setting up TLS
2022-10-17T19:58:14.610245Z INFO crash: Performing DNS lookup
connect(9, {sa_family=AF_UNIX, sun_path="/var/run/nscd/socket"}, 110) = -1 ENOENT (No such file or directory)
connect(9, {sa_family=AF_UNIX, sun_path="/var/run/nscd/socket"}, 110) = -1 ENOENT (No such file or directory)
connect(9, {sa_family=AF_INET6, sin6_port=htons(53), sin6_flowinfo=htonl(0), inet_pton(AF_INET6, "fdaa::3", &sin6_addr), sin6_scope_id=0}, 28) = 0
connect(9, {sa_family=AF_INET6, sin6_port=htons(0), sin6_flowinfo=htonl(0), inet_pton(AF_INET6, "2606:2800:220:1:248:1893:25c8:1946", &sin6_addr), sin6_scope_id=0}, 28) = 0
connect(9, {sa_family=AF_UNSPEC, sa_data="\0\0\0\0\0\0\0\0\0\0\0\0\0\0"}, 16) = 0
connect(9, {sa_family=AF_INET, sin_port=htons(0), sin_addr=inet_addr("93.184.216.34")}, 16) = 0
2022-10-17T19:58:14.612669Z INFO crash: Establishing TCP connection...
connect(9, {sa_family=AF_INET6, sin6_port=htons(443), sin6_flowinfo=htonl(0), inet_pton(AF_INET6, "2606:2800:220:1:248:1893:25c8:1946", &sin6_addr), sin6_scope_id=0}, 28) = -1 EINPROGRESS (Operation now in progress)
2022-10-17T19:58:14.695729Z INFO crash: Establishing TLS session...
2022-10-17T19:58:14.865204Z INFO crash: Establishing HTTP/2 connection...
2022-10-17T19:58:15.032358Z INFO crash: 0: received headers
2022-10-17T19:58:15.032444Z INFO crash: 0: received body (1256 bytes)
2022-10-17T19:58:15.033636Z INFO crash: 2: received headers
2022-10-17T19:58:15.033714Z INFO crash: 2: received body (1256 bytes)
2022-10-17T19:58:15.033830Z INFO crash: 3: received headers
2022-10-17T19:58:15.033873Z INFO crash: 3: received body (1256 bytes)
2022-10-17T19:58:15.033924Z INFO crash: 4: received headers
2022-10-17T19:58:15.033857Z INFO crash: 1: received headers
2022-10-17T19:58:15.033988Z INFO crash: 1: received body (1256 bytes)
2022-10-17T19:58:15.033998Z INFO crash: 4: received body (1256 bytes)
+++ exited with 0 +++
We can see in those logs there's a single connect
call to port 443 (it also happens to use IPv6, good job example.org
!).
And we can see that the headers and bodies are received out of order, through that single TCP connection!
What we don't see are the HTTP/2 messages being exchanged. Luckily, this article isn't over yet.
First off, let's get rid of the h2
crate:
Shell session
$ cargo rm h2
And reach out for a couple crates again, for encoding and decoding:
Shell session
$ cargo add byteorder
    Updating crates.io index
      Adding byteorder v1.4.3 to dependencies.
$ cargo add nom
    Updating crates.io index
      Adding nom v7.1.1 to dependencies.
$ cargo add enum-repr
    Updating crates.io index
      Adding enum-repr v0.2.6 to dependencies.
$ cargo add bytes
    Updating crates.io index
      Adding bytes v1.2.1 to dependencies.
We'll start by writing decoding and encoding code for HTTP/2 frames, in a separate module:
Rust code
// in `src/h2.rs`

use std::{
    fmt,
    ops::{Deref, DerefMut},
};

use enum_repr::EnumRepr;
use nom::{
    combinator::map_res,
    number::streaming::{be_u24, be_u8},
    sequence::tuple,
    IResult,
};
use tokio::io::{AsyncWrite, AsyncWriteExt};

/// This is sent by h2 clients after negotiating over ALPN, or when doing h2c.
pub const PREFACE: &[u8] = b"PRI * HTTP/2.0\r\n\r\nSM\r\n\r\n";

/// See https://httpwg.org/specs/rfc9113.html#FrameTypes
#[EnumRepr(type = "u8")]
#[derive(Debug)]
pub enum FrameType {
    Data = 0,
    Headers = 1,
    Priority = 2,
    RstStream = 3,
    Settings = 4,
    PushPromise = 5,
    Ping = 6,
    GoAway = 7,
    WindowUpdate = 8,
    Continuation = 9,
}

/// See https://httpwg.org/specs/rfc9113.html#FrameHeader
#[derive(Debug)]
pub struct Frame {
    pub frame_type: FrameType,
    pub flags: u8,
    pub reserved: u8,
    pub stream_id: u32,
    pub payload: OpaquePayload,
}

/// This is just used to avoid dumping the entire payload in the [fmt::Debug]
/// implementation of [Frame].
#[derive(Default)]
pub struct OpaquePayload(pub Vec<u8>);

impl Deref for OpaquePayload {
    type Target = Vec<u8>;

    fn deref(&self) -> &Self::Target {
        &self.0
    }
}

impl DerefMut for OpaquePayload {
    fn deref_mut(&mut self) -> &mut Self::Target {
        &mut self.0
    }
}

impl fmt::Debug for OpaquePayload {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.debug_struct("OpaquePayload")
            .field("len", &self.0.len())
            .finish()
    }
}

impl Frame {
    /// Create a new frame with the given type and stream ID.
    pub fn new(frame_type: FrameType, stream_id: u32) -> Self {
        Self {
            frame_type,
            flags: 0,
            reserved: 0,
            stream_id,
            payload: Default::default(),
        }
    }

    /// Parse a frame from the given slice. This also takes the payload from the
    /// slice, and copies it to the heap, which may not be ideal for a production
    /// implementation.
    pub fn parse(i: &[u8]) -> IResult<&[u8], Self> {
        let (i, (length, frame_type, flags, (reserved, stream_id))) = tuple((
            be_u24,
            map_res(be_u8, |u| {
                FrameType::from_repr(u).ok_or(nom::error::ErrorKind::OneOf)
            }),
            be_u8,
            parse_reserved_and_stream_id,
        ))(i)?;

        let (i, payload) = nom::bytes::streaming::take(length)(i)?;

        let frame = Frame {
            frame_type,
            flags,
            reserved,
            stream_id,
            payload: OpaquePayload(payload.to_vec()),
        };
        Ok((i, frame))
    }

    /// Writes a frame to an [AsyncWrite].
    pub async fn write(&self, w: &mut (dyn AsyncWrite + Unpin)) -> color_eyre::Result<()> {
        let mut header = [0u8; 9];
        {
            use byteorder::{BigEndian, WriteBytesExt};
            let mut header = &mut header[..];
            header.write_u24::<BigEndian>(self.payload.len() as _)?;
            header.write_u8(self.frame_type.repr())?;
            header.write_u8(self.flags)?;
            header.write_u32::<BigEndian>(self.stream_id)?;
        }

        // We could be doing vectored I/O here, but there's no
        // `write_all_vectored` method in [AsyncWriteExt]
        w.write_all(&header).await?;
        w.write_all(&self.payload).await?;

        Ok(())
    }
}

/// See https://httpwg.org/specs/rfc9113.html#FrameHeader - the first bit
/// is reserved, and the rest is a 31-bit stream id
fn parse_reserved_and_stream_id(i: &[u8]) -> IResult<&[u8], (u8, u32)> {
    fn reserved(i: (&[u8], usize)) -> IResult<(&[u8], usize), u8> {
        nom::bits::streaming::take(1_usize)(i)
    }

    fn stream_id(i: (&[u8], usize)) -> IResult<(&[u8], usize), u32> {
        nom::bits::streaming::take(31_usize)(i)
    }

    nom::bits::bits(tuple((reserved, stream_id)))(i)
}
This is not a complete parser by any stretch of the imagination, but it should be enough to get the h2 server to send us some frames back!
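To make the 9-byte frame header layout a bit more concrete, here's a minimal standard-library-only sketch of the same decoding, without nom. `FrameHeader` and `parse_frame_header` are names invented for this illustration; the layout itself (24-bit length, 8-bit type, 8-bit flags, 1 reserved bit, 31-bit stream ID) comes from RFC 9113.

```rust
/// The fixed 9-byte header that starts every HTTP/2 frame.
/// Hypothetical type for illustration, separate from the article's `Frame`.
#[derive(Debug, PartialEq, Eq)]
pub struct FrameHeader {
    /// 24-bit payload length
    pub length: u32,
    /// Raw frame type, e.g. 4 for SETTINGS
    pub frame_type: u8,
    pub flags: u8,
    /// Top bit of the last four bytes
    pub reserved: bool,
    /// Remaining 31 bits
    pub stream_id: u32,
}

/// Decodes a frame header, returning `None` if fewer than 9 bytes are available.
pub fn parse_frame_header(buf: &[u8]) -> Option<FrameHeader> {
    let h = buf.get(..9)?;
    let last_word = u32::from_be_bytes([h[5], h[6], h[7], h[8]]);
    Some(FrameHeader {
        // Pad the 24-bit length with a leading zero byte to read it as a u32.
        length: u32::from_be_bytes([0, h[0], h[1], h[2]]),
        frame_type: h[3],
        flags: h[4],
        reserved: last_word >> 31 == 1,
        stream_id: last_word & 0x7FFF_FFFF,
    })
}
```

Feeding it `[0, 0, 30, 4, 0, 0, 0, 0, 0]` yields a length of 30, type 4 (SETTINGS), no flags, and stream ID 0.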
Rust code
// in `src/main.rs`

use std::{net::ToSocketAddrs, str::FromStr, sync::Arc};

use bytes::BytesMut;
use color_eyre::eyre::eyre;
use nom::Offset;
use rustls::{Certificate, ClientConfig, KeyLogFile, RootCertStore};
use tokio::{
    io::{AsyncReadExt, AsyncWriteExt},
    net::TcpStream,
};
use tracing::info;
use tracing_subscriber::{filter::targets::Targets, layer::SubscriberExt, util::SubscriberInitExt};

use crate::h2::{Frame, FrameType};

mod h2;

#[tokio::main]
async fn main() -> color_eyre::Result<()> {
    // this is just a trick to get rust-analyzer to complete the body of the
    // function better. there's still issues with auto-completion within
    // functions, see https://github.com/rust-lang/rust-analyzer/issues/13355
    real_main().await
}

async fn real_main() -> color_eyre::Result<()> {
    color_eyre::install().unwrap();

    let filter_layer =
        Targets::from_str(std::env::var("RUST_LOG").as_deref().unwrap_or("info")).unwrap();
    let format_layer = tracing_subscriber::fmt::layer();
    tracing_subscriber::registry()
        .with(filter_layer)
        .with(format_layer)
        .init();

    info!("Setting up TLS");
    let mut root_store = RootCertStore::empty();
    for cert in rustls_native_certs::load_native_certs()? {
        root_store.add(&Certificate(cert.0))?;
    }

    let mut client_config = ClientConfig::builder()
        .with_safe_defaults()
        .with_root_certificates(root_store)
        .with_no_client_auth();
    client_config.key_log = Arc::new(KeyLogFile::new());
    client_config.alpn_protocols = vec![b"h2".to_vec()];

    let connector = tokio_rustls::TlsConnector::from(Arc::new(client_config));

    info!("Performing DNS lookup");
    let addr = "example.org:443"
        .to_socket_addrs()?
        .next()
        .ok_or_else(|| eyre!("Failed to resolve address for example.org:443"))?;

    info!("Establishing TCP connection...");
    let stream = TcpStream::connect(addr).await?;

    info!("Establishing TLS session...");
    let mut stream = connector.connect("example.org".try_into()?, stream).await?;

    info!("Establishing HTTP/2 connection...");

    info!("Writing preface");
    stream.write_all(h2::PREFACE).await?;

    let settings = Frame::new(FrameType::Settings, 0);
    info!("> {settings:?}");
    settings.write(&mut stream).await?;

    let mut buf: BytesMut = Default::default();
    loop {
        info!("Reading frame ({} bytes so far)", buf.len());
        if stream.read_buf(&mut buf).await? == 0 {
            info!("connection closed!");
            return Ok(());
        }

        let slice = &buf[..];
        let frame = match Frame::parse(slice) {
            Ok((rest, frame)) => {
                buf = buf.split_off(slice.offset(rest));
                frame
            }
            Err(e) => {
                if e.is_incomplete() {
                    // keep reading!
                    continue;
                }
                panic!("parse error: {e}");
            }
        };

        info!("< {frame:?}");
    }
}
And indeed it does!
Shell session
$ cargo run
   Compiling crash v0.1.0 (/home/amos/bearcove/crash)
    Finished dev [unoptimized + debuginfo] target(s) in 2.09s
     Running `target/debug/crash`
2022-10-20T09:29:21.260528Z INFO crash: Setting up TLS
2022-10-20T09:29:21.282568Z INFO crash: Performing DNS lookup
2022-10-20T09:29:21.295450Z INFO crash: Establishing TCP connection...
2022-10-20T09:29:21.390187Z INFO crash: Establishing TLS session...
2022-10-20T09:29:21.591178Z INFO crash: Establishing HTTP/2 connection...
2022-10-20T09:29:21.591227Z INFO crash: Writing preface
2022-10-20T09:29:21.591293Z INFO crash: > Frame { frame_type: Settings, flags: 0, reserved: 0, stream_id: 0, payload: OpaquePayload { len: 0 } }
2022-10-20T09:29:21.591362Z INFO crash: Reading frame (0 bytes so far)
2022-10-20T09:29:21.705711Z INFO crash: < Frame { frame_type: Settings, flags: 0, reserved: 0, stream_id: 0, payload: OpaquePayload { len: 30 } }
2022-10-20T09:29:21.705789Z INFO crash: Reading frame (13 bytes so far)
2022-10-20T09:29:21.798666Z INFO crash: < Frame { frame_type: WindowUpdate, flags: 0, reserved: 0, stream_id: 0, payload: OpaquePayload { len: 4 } }
2022-10-20T09:29:21.798737Z INFO crash: Reading frame (9 bytes so far)
^C
The SETTINGS frame we send is empty, but the server's SETTINGS frame isn't:
HyperText Transfer Protocol 2
    Stream: SETTINGS, Stream ID: 0, Length 30
        Length: 30
        Type: SETTINGS (4)
        Flags: 0x00
        0... .... .... .... .... .... .... .... = Reserved: 0x0
        .000 0000 0000 0000 0000 0000 0000 0000 = Stream Identifier: 0
        Settings - Header table size : 4096
        Settings - Max concurrent streams : 100
        Settings - Initial Windows size : 1048576
        Settings - Max frame size : 16384
        Settings - Max header list size : 16384
We could pretty easily parse those, but let's ignore them for now: we're only planning on making a simple GET request, so we should be well below those limits.
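For reference, parsing them really would be short. Here's a hedged sketch using only the standard library: per RFC 9113, a SETTINGS payload is a sequence of 6-byte entries, each a 16-bit identifier followed by a 32-bit value. The function names `parse_settings` and `setting_name` are invented for this example and aren't part of the article's code.

```rust
/// Splits a SETTINGS payload into (identifier, value) pairs.
/// Returns `None` if the length is not a multiple of 6, which the RFC
/// treats as a frame size error. Hypothetical helper for illustration.
pub fn parse_settings(payload: &[u8]) -> Option<Vec<(u16, u32)>> {
    if payload.len() % 6 != 0 {
        return None;
    }
    let settings = payload
        .chunks_exact(6)
        .map(|c| {
            let id = u16::from_be_bytes([c[0], c[1]]);
            let value = u32::from_be_bytes([c[2], c[3], c[4], c[5]]);
            (id, value)
        })
        .collect();
    Some(settings)
}

/// Maps the identifiers defined in RFC 9113 to their names.
pub fn setting_name(id: u16) -> &'static str {
    match id {
        0x1 => "SETTINGS_HEADER_TABLE_SIZE",
        0x2 => "SETTINGS_ENABLE_PUSH",
        0x3 => "SETTINGS_MAX_CONCURRENT_STREAMS",
        0x4 => "SETTINGS_INITIAL_WINDOW_SIZE",
        0x5 => "SETTINGS_MAX_FRAME_SIZE",
        0x6 => "SETTINGS_MAX_HEADER_LIST_SIZE",
        _ => "unknown",
    }
}
```

Five settings at 6 bytes each account for the 30-byte payload the server sent us.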
The server also sends us a WINDOW_UPDATE frame:
HyperText Transfer Protocol 2
    Stream: WINDOW_UPDATE, Stream ID: 0, Length 4
        Length: 4
        Type: WINDOW_UPDATE (8)
        Flags: 0x00
        0... .... .... .... .... .... .... .... = Reserved: 0x0
        .000 0000 0000 0000 0000 0000 0000 0000 = Stream Identifier: 0
        0... .... .... .... .... .... .... .... = Reserved: 0x0
        .000 0000 0000 1111 0000 0000 0000 0001 = Window Size Increment: 983041
...giving us a generous 983041-byte (~960 KiB) increment to the flow control window, which applies to the whole connection (the stream identifier is 0). Since the window starts at 65535 bytes, that brings it to exactly 1048576 bytes (1 MiB), which is how much we can send before the server needs to raise our allowance.
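The arithmetic is easy to check by hand, or with a few lines of Rust. This is only a sketch: `parse_window_increment` and `DEFAULT_INITIAL_WINDOW` are names made up here, and the 4-byte payload layout (1 reserved bit, then a 31-bit increment) is taken from RFC 9113.

```rust
/// The connection flow-control window starts at 65535 bytes (RFC 9113).
pub const DEFAULT_INITIAL_WINDOW: u32 = 65_535;

/// Reads the increment out of a WINDOW_UPDATE payload, masking off the
/// reserved top bit. Returns `None` unless the payload is exactly 4 bytes.
/// Hypothetical helper for illustration.
pub fn parse_window_increment(payload: &[u8]) -> Option<u32> {
    match payload {
        [a, b, c, d] => Some(u32::from_be_bytes([*a, *b, *c, *d]) & 0x7FFF_FFFF),
        _ => None,
    }
}
```

The payload from the capture is `00 0f 00 01`, which decodes to 983041, and 65535 + 983041 is 1048576.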
Even though we're not really planning on processing these settings, it's a good idea to acknowledge them, by sending another SETTINGS frame with the ack
flag set (0x01).
To make sure we're not acknowledging the server's own acknowledgement frame, we have to start thinking about flags.
We'll reach out for enumflags2, which conveniently comes with conversions between the underlying representation (here u8
), and a BitFlags<T>
type, letting us test for flag presence, set flags, and show both the numeric value and actual flags in its debug implementation.
Shell session
$ cargo add enumflags2
    Updating crates.io index
      Adding enumflags2 v0.7.5 to dependencies.
To avoid matching on frame type several times, we'll rename our existing FrameType
enum to RawFrameType
:
// in `src/h2.rs`

/// See https://httpwg.org/specs/rfc9113.html#FrameTypes
#[EnumRepr(type = "u8")]
#[derive(Debug)]
pub enum RawFrameType {
    Data = 0,
    Headers = 1,
    // (cut)
    Continuation = 9,
}
And introduce a new FrameType
enum, that contains the flags we know about:
Rust code
// in `src/h2.rs`

/// Typed flags for various frame types
#[derive(Debug)]
pub enum FrameType {
    Data,
    Headers,
    Priority,
    RstStream,
    Settings(BitFlags<SettingsFlags>),
    PushPromise,
    Ping,
    GoAway,
    WindowUpdate,
    Continuation,
}
And then we'll define our SettingsFlags
:
Rust code
// in `src/h2.rs`

use enumflags2::{bitflags, BitFlags};

/// See https://httpwg.org/specs/rfc9113.html#SETTINGS
#[bitflags]
#[repr(u8)]
#[derive(Copy, Clone, Debug, PartialEq, Eq)]
pub enum SettingsFlags {
    Ack = 0x01,
}
We can now trivially add helpers to convert from (RawFrameType, u8)
to FrameType
and back:
Rust code
// in `src/h2.rs`

impl FrameType {
    fn encode(&self) -> (RawFrameType, u8) {
        match self {
            FrameType::Data => (RawFrameType::Data, 0),
            FrameType::Headers => (RawFrameType::Headers, 0),
            FrameType::Priority => (RawFrameType::Priority, 0),
            FrameType::RstStream => (RawFrameType::RstStream, 0),
            FrameType::Settings(f) => (RawFrameType::Settings, f.bits()),
            FrameType::PushPromise => (RawFrameType::PushPromise, 0),
            FrameType::Ping => (RawFrameType::Ping, 0),
            FrameType::GoAway => (RawFrameType::GoAway, 0),
            FrameType::WindowUpdate => (RawFrameType::WindowUpdate, 0),
            FrameType::Continuation => (RawFrameType::Continuation, 0),
        }
    }

    fn decode(ty: RawFrameType, flags: u8) -> Self {
        match ty {
            RawFrameType::Data => FrameType::Data,
            RawFrameType::Headers => FrameType::Headers,
            RawFrameType::Priority => FrameType::Priority,
            RawFrameType::RstStream => FrameType::RstStream,
            RawFrameType::Settings => {
                FrameType::Settings(BitFlags::<SettingsFlags>::from_bits_truncate(flags))
            }
            RawFrameType::PushPromise => FrameType::PushPromise,
            RawFrameType::Ping => FrameType::Ping,
            RawFrameType::GoAway => FrameType::GoAway,
            RawFrameType::WindowUpdate => FrameType::WindowUpdate,
            RawFrameType::Continuation => FrameType::Continuation,
        }
    }
}
Note that these aren't fallible — unknown frame types are handled when parsing RawFrameType
, and unknown flags are simply ignored.
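If "ignored" feels hand-wavy, here is roughly what truncating unknown flag bits amounts to, written with plain integers instead of enumflags2. It's a sketch of the idea rather than the crate's implementation, and the constant and function names are invented for this example.

```rust
/// The only flag defined for SETTINGS frames is ACK.
pub const SETTINGS_ACK: u8 = 0x01;
/// Union of every flag bit we know about for SETTINGS frames.
pub const SETTINGS_KNOWN_FLAGS: u8 = SETTINGS_ACK;

/// Keeps the bits we know about and silently drops everything else.
pub fn truncate_settings_flags(raw: u8) -> u8 {
    raw & SETTINGS_KNOWN_FLAGS
}

/// True if the ACK bit is set, regardless of any unknown bits around it.
pub fn is_ack(raw: u8) -> bool {
    truncate_settings_flags(raw) & SETTINGS_ACK != 0
}
```

A flags byte of `0xFF` truncates to `0x01`, so a peer setting bits we have never heard of doesn't make decoding fail.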
The Frame
struct should refer to the strongly-typed FrameType
now, and should no longer have a flags
field:
Rust code
/// See https://httpwg.org/specs/rfc9113.html#FrameHeader
#[derive(Debug)]
pub struct Frame {
    // was `RawFrameType` 👇 (after rename)
    pub frame_type: FrameType,
    // removed: `flags: u8`
    pub reserved: u8,
    pub stream_id: u32,
    pub payload: OpaquePayload,
}
The Frame::new
method should be updated too, but that's left as an exercise to the reader. Frame::parse
should also now decode:
Rust code
impl Frame {
    /// Parse a frame from the given slice. This also takes the payload from the
    /// slice, and copies it to the heap, which may not be ideal for a production
    /// implementation.
    pub fn parse(i: &[u8]) -> IResult<&[u8], Self> {
        let (i, (length, frame_type, flags, (reserved, stream_id))) = tuple((
            be_u24,
            map_res(be_u8, |u| {
                RawFrameType::from_repr(u).ok_or(nom::error::ErrorKind::OneOf)
            }),
            be_u8,
            parse_reserved_and_stream_id,
        ))(i)?;

        let (i, payload) = nom::bytes::streaming::take(length)(i)?;

        // 👇 new!
        let frame_type = FrameType::decode(frame_type, flags);

        let frame = Frame {
            frame_type,
            reserved,
            stream_id,
            payload: OpaquePayload(payload.to_vec()),
        };
        Ok((i, frame))
    }
}
And Frame::write
should encode:
Rust code
impl Frame {
    /// Writes a frame to an [AsyncWrite].
    pub async fn write(&self, w: &mut (dyn AsyncWrite + Unpin)) -> color_eyre::Result<()> {
        let mut header = [0u8; 9];
        {
            use byteorder::{BigEndian, WriteBytesExt};
            let mut header = &mut header[..];
            header.write_u24::<BigEndian>(self.payload.len() as _)?;
            let (ty, flags) = self.frame_type.encode();
            header.write_u8(ty.repr())?;
            header.write_u8(flags)?;
            header.write_u32::<BigEndian>(self.stream_id)?;
        }

        // etc.
    }
}
Say, Amos, aren't we getting carried away? Isn't this all throwaway code?
Oh sure, we could just as well have had a frame.flags & 0x01 != 0
or something in main.rs
, but it's actually less mental overhead for me to set up those nice abstractions, even for throwaway code.
It makes everything more readable, and it makes it harder to "hold incorrectly", which is an important upside of Rust.
Each frame type has its own set of flags, and with that setup, we cannot accidentally mix them up. We also get a nice Debug
implementation.
Let me update main.rs
real quick, and you'll see:
Rust code
// don't pass any flags 👇
let settings = Frame::new(FrameType::Settings(Default::default()), 0);
info!("> {settings:?}");
settings.write(&mut stream).await?;
Shell session
$ cargo run
(cut)
2022-10-20T10:17:14.722354Z INFO crash: Writing preface
2022-10-20T10:17:14.722400Z INFO crash: > Frame { frame_type: Settings(BitFlags<SettingsFlags>(0b0)), reserved: 0, stream_id: 0, payload: OpaquePayload { len: 0 } }
2022-10-20T10:17:14.722466Z INFO crash: Reading frame (0 bytes so far)
2022-10-20T10:17:14.833748Z INFO crash: < Frame { frame_type: Settings(BitFlags<SettingsFlags>(0b0)), reserved: 0, stream_id: 0, payload: OpaquePayload { len: 30 } }
2022-10-20T10:17:14.833821Z INFO crash: Reading frame (13 bytes so far)
2022-10-20T10:17:14.964432Z INFO crash: < Frame { frame_type: WindowUpdate, reserved: 0, stream_id: 0, payload: OpaquePayload { len: 4 } }
2022-10-20T10:17:14.964502Z INFO crash: Reading frame (9 bytes so far)
Okay, well... there's no flags so far, but, just you wait. In our read loop, if we get a settings frame that doesn't have the ACK flag, we send an ACK ourselves:
Rust code
info!("< {frame:?}");

if let FrameType::Settings(flags) = &frame.frame_type {
    if !flags.contains(SettingsFlags::Ack) {
        info!("Acknowledging server settings");
        let settings = Frame::new(FrameType::Settings(SettingsFlags::Ack.into()), 0);
        info!("> {settings:?}");
        settings.write(&mut stream).await?;
    }
}
And with that, it's time to send a request! The way we do that is with a HEADERS frame, which has a bunch of interesting flags:
Rust code
/// See https://httpwg.org/specs/rfc9113.html#rfc.section.6.2
#[bitflags]
#[repr(u8)]
#[derive(Copy, Clone, Debug, PartialEq, Eq)]
pub enum HeadersFlags {
    Priority = 0x20,
    Padded = 0x08,
    EndHeaders = 0x04,
    EndStream = 0x01,
}
(The FrameType
enum, along with FrameType::encode
and FrameType::decode
, must be adjusted as well.)
The payload of the HEADERS
frame is a Header Block Fragment, which is encoded with HPACK, that has its own RFC, RFC 7541.
Luckily, there are also existing Rust implementations, which we'll just go ahead and use. The hpack crate hasn't been touched in a few years, so let's hope it still works.
Shell session
$ cargo add hpack
    Updating crates.io index
      Adding hpack v0.3.0 to dependencies.
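To give a taste of what HPACK does under the hood, here's a small sketch of two of its building blocks as described in RFC 7541: the prefix-integer encoding (section 5.1) and the "indexed header field" representation that points into the static table (section 6.1). The function names are invented for this example, and nothing here claims to match the bytes the hpack crate actually emits; an encoder is free to pick other representations.

```rust
/// Encodes `value` as an HPACK integer with a `prefix_bits`-bit prefix.
/// `high_bits` carries the bits above the prefix in the first byte and is
/// assumed to be zero inside the prefix. Hypothetical helper for illustration.
pub fn encode_hpack_int(value: u32, prefix_bits: u8, high_bits: u8) -> Vec<u8> {
    assert!((1..=8).contains(&prefix_bits), "prefix must be 1 to 8 bits");
    let max_prefix: u32 = (1u32 << prefix_bits) - 1;
    if value < max_prefix {
        // Small values fit entirely in the prefix.
        return vec![high_bits | value as u8];
    }
    // Otherwise fill the prefix with ones, then emit the remainder
    // 7 bits at a time, least significant group first.
    let mut out = vec![high_bits | max_prefix as u8];
    let mut rest = value - max_prefix;
    while rest >= 128 {
        out.push((rest % 128) as u8 | 0x80);
        rest /= 128;
    }
    out.push(rest as u8);
    out
}

/// An indexed header field: a 1 bit followed by a 7-bit-prefix table index.
pub fn indexed_header(index: u32) -> Vec<u8> {
    encode_hpack_int(index, 7, 0x80)
}
```

In the static table, index 2 is `:method: GET` and index 7 is `:scheme: https`, so each of those headers can be expressed as a single byte (`0x82` and `0x87`).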
Mhh do we need to acknowledge the server SETTINGS frame before sending our HEADERS frame?
I don't think so. Here's what the RFC says:
To avoid unnecessary latency, clients are permitted to send additional frames to the server immediately after sending the client connection preface, without waiting to receive the server connection preface.
But then:
It is important to note, however, that the server connection preface SETTINGS frame might include settings that necessarily alter how a client is expected to communicate with the server. Upon receiving the SETTINGS frame, the client is expected to honor any settings established. In some configurations, it is possible for the server to transmit SETTINGS before the client sends additional frames, providing an opportunity to avoid this issue.
And by "some configurations", I'm assuming they mean the Upgrade mechanism, which is not the way we're doing HTTP/2 anyway, so... I guess let's see what happens.
Our request must happen over a new stream. Client-initiated streams must be odd-numbered, and there are no streams so far, so let's pick 1. We don't have a request body, so we can set END_STREAM
(which will half-close the stream), and we only have a few headers, that will definitely fit in the initial HEADERS
frame, so we can set END_HEADERS
.
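As a sanity check on those choices, here's a standalone sketch (standard library only) that hands out odd stream IDs and serializes the 9-byte header for such a HEADERS frame. `ClientStreamIds` and `encode_frame_header` are hypothetical names for this example; the constants are the type and flag values from RFC 9113.

```rust
pub const TYPE_HEADERS: u8 = 0x1;
pub const FLAG_END_STREAM: u8 = 0x01;
pub const FLAG_END_HEADERS: u8 = 0x04;

/// Hands out client-initiated stream IDs: 1, 3, 5, ...
pub struct ClientStreamIds {
    next: u32,
}

impl ClientStreamIds {
    pub fn new() -> Self {
        Self { next: 1 }
    }

    /// Returns the next odd stream ID, or `None` once the 31-bit space runs out.
    pub fn allocate(&mut self) -> Option<u32> {
        if self.next > 0x7FFF_FFFF {
            return None;
        }
        let id = self.next;
        self.next += 2;
        Some(id)
    }
}

/// Serializes a frame header: 24-bit length, type, flags, then the reserved
/// bit (left at 0) and the 31-bit stream ID. Returns `None` if a value
/// doesn't fit in its field.
pub fn encode_frame_header(
    payload_len: u32,
    frame_type: u8,
    flags: u8,
    stream_id: u32,
) -> Option<[u8; 9]> {
    if payload_len > 0x00FF_FFFF || stream_id > 0x7FFF_FFFF {
        return None;
    }
    let len = payload_len.to_be_bytes();
    let id = stream_id.to_be_bytes();
    Some([len[1], len[2], len[3], frame_type, flags, id[0], id[1], id[2], id[3]])
}
```

With both flags set, the flags byte is `0x05`, so a HEADERS frame on stream 1 with, say, a 90-byte payload starts with `00 00 5a 01 05 00 00 00 01`.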
Our complete request sending code now looks like:
Rust code
info!("Writing preface");
stream.write_all(h2::PREFACE).await?;

let settings = Frame::new(FrameType::Settings(Default::default()), 0);
info!("> {settings:?}");
settings.write(&mut stream).await?;

let mut encoder = hpack::Encoder::new();
let headers: &[(&[u8], &[u8])] = &[
    (b":method", b"GET"),
    (b":path", b"/"),
    (b":scheme", b"https"),
    (b":authority", b"example.org"),
    (b"user-agent", b"fasterthanlime/http-crash-course"),
    // http://www.gnuterrypratchett.com/
    (b"x-clacks-overhead", b"GNU Terry Pratchett"),
];
let mut headers_frame = Frame::new(
    FrameType::Headers(HeadersFlags::EndHeaders | HeadersFlags::EndStream),
    1,
);
headers_frame.payload.0 = encoder.encode(headers.iter().copied());
info!("> {headers_frame:?}");
headers_frame.write(&mut stream).await?;
Before we try it out, let's also add flags for the DATA
frame, which should be used by the server to send us the response body:
Rust code
/// See https://httpwg.org/specs/rfc9113.html#DATA
#[bitflags]
#[repr(u8)]
#[derive(Copy, Clone, Debug, PartialEq, Eq)]
pub enum DataFlags {
    Padded = 0x08,
    EndStream = 0x01,
}
As before, the FrameType
enum, and its encode
and decode
methods should be adjusted.
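We won't be handling the Padded flag in this article, but for the record, here's a sketch of what stripping padding from a DATA payload could look like. Per RFC 9113, a padded payload starts with a one-byte pad length, followed by the data, followed by that many bytes of padding. The function name `data_without_padding` is made up, and this is a standalone illustration rather than something wired into our `Frame` type.

```rust
pub const DATA_FLAG_END_STREAM: u8 = 0x01;
pub const DATA_FLAG_PADDED: u8 = 0x08;

/// Returns the application data of a DATA frame payload, with the pad
/// length byte and trailing padding removed when the PADDED flag is set.
/// Returns `None` when the padding is as long as the payload or longer,
/// which the RFC treats as a protocol error.
pub fn data_without_padding(flags: u8, payload: &[u8]) -> Option<&[u8]> {
    if flags & DATA_FLAG_PADDED == 0 {
        return Some(payload);
    }
    let (&pad_len, rest) = payload.split_first()?;
    let pad_len = usize::from(pad_len);
    if pad_len >= payload.len() {
        return None;
    }
    Some(&rest[..rest.len() - pad_len])
}
```

For example, the padded payload `[2, b'h', b'i', 0, 0]` carries just `hi`.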
And now, for the moment of truth:
Shell session
$ cargo run
   Compiling crash v0.1.0 (/home/amos/bearcove/crash)
    Finished dev [unoptimized + debuginfo] target(s) in 2.32s
     Running `target/debug/crash`
2022-10-20T12:39:16.427502Z INFO crash: Setting up TLS
2022-10-20T12:39:16.447999Z INFO crash: Performing DNS lookup
2022-10-20T12:39:16.460366Z INFO crash: Establishing TCP connection...
2022-10-20T12:39:16.547827Z INFO crash: Establishing TLS session...
2022-10-20T12:39:16.771621Z INFO crash: Establishing HTTP/2 connection...
2022-10-20T12:39:16.771665Z INFO crash: Writing preface
2022-10-20T12:39:16.771710Z INFO crash: > Frame { frame_type: Settings(BitFlags<SettingsFlags>(0b0)), reserved: 0, stream_id: 0, payload: OpaquePayload { len: 0 } }
2022-10-20T12:39:16.771909Z INFO crash: > Frame { frame_type: Headers(BitFlags<HeadersFlags>(0b101, EndStream | EndHeaders)), reserved: 0, stream_id: 1, payload: OpaquePayload { len: 90 } }
2022-10-20T12:39:16.772016Z INFO crash: Reading frame (0 bytes so far)
2022-10-20T12:39:16.893181Z INFO crash: < Frame { frame_type: Settings(BitFlags<SettingsFlags>(0b0)), reserved: 0, stream_id: 0, payload: OpaquePayload { len: 30 } }
2022-10-20T12:39:16.893253Z INFO crash: Acknowledging server settings
2022-10-20T12:39:16.893294Z INFO crash: > Frame { frame_type: Settings(BitFlags<SettingsFlags>(0b1, Ack)), reserved: 0, stream_id: 0, payload: OpaquePayload { len: 0 } }
2022-10-20T12:39:16.893408Z INFO crash: Reading frame (13 bytes so far)
2022-10-20T12:39:16.979533Z INFO crash: < Frame { frame_type: WindowUpdate, reserved: 0, stream_id: 0, payload: OpaquePayload { len: 4 } }
2022-10-20T12:39:16.979605Z INFO crash: Reading frame (12 bytes so far)
2022-10-20T12:39:16.979897Z INFO crash: < Frame { frame_type: Settings(BitFlags<SettingsFlags>(0b1, Ack)), reserved: 0, stream_id: 0, payload: OpaquePayload { len: 0 } }
2022-10-20T12:39:16.979966Z INFO crash: Reading frame (67 bytes so far)
2022-10-20T12:39:16.980128Z INFO crash: < Frame { frame_type: Headers(BitFlags<HeadersFlags>(0b100, EndHeaders)), reserved: 0, stream_id: 1, payload: OpaquePayload { len: 175 } }
2022-10-20T12:39:16.980189Z INFO crash: Reading frame (9 bytes so far)
2022-10-20T12:39:17.021026Z INFO crash: Reading frame (11 bytes so far)
2022-10-20T12:39:17.021079Z INFO crash: Reading frame (256 bytes so far)
2022-10-20T12:39:17.021119Z INFO crash: Reading frame (512 bytes so far)
2022-10-20T12:39:17.021152Z INFO crash: Reading frame (1024 bytes so far)
2022-10-20T12:39:17.021188Z INFO crash: < Frame { frame_type: Data(BitFlags<DataFlags>(0b1, EndStream)), reserved: 0, stream_id: 1, payload: OpaquePayload { len: 1256 } }
2022-10-20T12:39:17.021225Z INFO crash: Reading frame (0 bytes so far)
^C
That looks very promising. Because the HEADERS
frame we get back has the END_HEADERS
flag set, I'm guessing all the response headers fit in there - we can simply decode them.
And then, it also seems like the whole response fits in the following DATA
frame, because it has the END_STREAM
flag set.
So, let's read those responses!
Outside our read loop, we can make an hpack decoder, and inside the loop, we'll handle HEADERS
and DATA
frames:
Rust code
let mut decoder = hpack::Decoder::new();

let mut buf: BytesMut = Default::default();
loop {
    info!("Reading frame ({} bytes so far)", buf.len());
    if stream.read_buf(&mut buf).await? == 0 {
        info!("connection closed!");
        return Ok(());
    }

    let slice = &buf[..];
    let frame = match Frame::parse(slice) {
        Ok((rest, frame)) => {
            buf = buf.split_off(slice.offset(rest));
            frame
        }
        Err(e) => {
            if e.is_incomplete() {
                // keep reading!
                continue;
            }
            panic!("parse error: {e}");
        }
    };

    info!("< {frame:?}");

    match &frame.frame_type {
        FrameType::Settings(flags) => {
            if !flags.contains(SettingsFlags::Ack) {
                info!("Acknowledging server settings");
                let settings = Frame::new(FrameType::Settings(SettingsFlags::Ack.into()), 0);
                info!("> {settings:?}");
                settings.write(&mut stream).await?;
            }
        }
        FrameType::Headers(flags) => {
            assert!(
                !flags.contains(HeadersFlags::Padded),
                "padding not supported"
            );
            assert!(
                !flags.contains(HeadersFlags::Priority),
                "priority not supported"
            );
            assert!(
                flags.contains(HeadersFlags::EndHeaders),
                "continuation frames not supported"
            );

            let headers = decoder.decode(&frame.payload.0).unwrap();
            for (name, value) in headers {
                info!(
                    "response header: {}: {}",
                    String::from_utf8_lossy(&name),
                    String::from_utf8_lossy(&value)
                );
            }
        }
        FrameType::Data(flags) => {
            assert!(!flags.contains(DataFlags::Padded), "padding not supported");
            assert!(
                flags.contains(DataFlags::EndStream),
                "streaming response bodies not supported"
            );

            let response_body = String::from_utf8_lossy(&frame.payload.0);
            info!(
                "response body: {}",
                &response_body[..std::cmp::min(100, response_body.len())]
            );

            info!("All done!");
            return Ok(());
        }
        _ => {
            // ignore other types of frames
        }
    }
}
And... let's try it out!
Shell session
$ cargo run
   Compiling crash v0.1.0 (/home/amos/bearcove/crash)
    Finished dev [unoptimized + debuginfo] target(s) in 2.15s
     Running `target/debug/crash`
2022-10-20T12:55:38.526593Z INFO crash: Setting up TLS
2022-10-20T12:55:38.547786Z INFO crash: Performing DNS lookup
2022-10-20T12:55:38.597677Z INFO crash: Establishing TCP connection...
2022-10-20T12:55:38.706586Z INFO crash: Establishing TLS session...
2022-10-20T12:55:38.897806Z INFO crash: Establishing HTTP/2 connection...
2022-10-20T12:55:38.897853Z INFO crash: Writing preface
2022-10-20T12:55:38.897915Z INFO crash: > Frame { frame_type: Settings(BitFlags<SettingsFlags>(0b0)), reserved: 0, stream_id: 0, payload: OpaquePayload { len: 0 } }
2022-10-20T12:55:38.898052Z INFO crash: > Frame { frame_type: Headers(BitFlags<HeadersFlags>(0b101, EndStream | EndHeaders)), reserved: 0, stream_id: 1, payload: OpaquePayload { len: 90 } }
2022-10-20T12:55:38.898135Z INFO crash: Reading frame (0 bytes so far)
2022-10-20T12:55:38.986090Z INFO crash: < Frame { frame_type: Settings(BitFlags<SettingsFlags>(0b0)), reserved: 0, stream_id: 0, payload: OpaquePayload { len: 30 } }
2022-10-20T12:55:38.987852Z INFO crash: Acknowledging server settings
2022-10-20T12:55:38.987898Z INFO crash: > Frame { frame_type: Settings(BitFlags<SettingsFlags>(0b1, Ack)), reserved: 0, stream_id: 0, payload: OpaquePayload { len: 0 } }
2022-10-20T12:55:38.987999Z INFO crash: Reading frame (13 bytes so far)
2022-10-20T12:55:39.079214Z INFO crash: < Frame { frame_type: WindowUpdate, reserved: 0, stream_id: 0, payload: OpaquePayload { len: 4 } }
2022-10-20T12:55:39.079273Z INFO crash: Reading frame (12 bytes so far)
2022-10-20T12:55:39.079338Z INFO crash: < Frame { frame_type: Settings(BitFlags<SettingsFlags>(0b1, Ack)), reserved: 0, stream_id: 0, payload: OpaquePayload { len: 0 } }
2022-10-20T12:55:39.079377Z INFO crash: Reading frame (67 bytes so far)
2022-10-20T12:55:39.079403Z INFO crash: < Frame { frame_type: Headers(BitFlags<HeadersFlags>(0b100, EndHeaders)), reserved: 0, stream_id: 1, payload: OpaquePayload { len: 175 } }
2022-10-20T12:55:39.088713Z INFO crash: response header: :status: 200
2022-10-20T12:55:39.088747Z INFO crash: response header: age: 524935
2022-10-20T12:55:39.088793Z INFO crash: response header: cache-control: max-age=604800
2022-10-20T12:55:39.088819Z INFO crash: response header: content-type: text/html; charset=UTF-8
2022-10-20T12:55:39.088834Z INFO crash: response header: date: Thu, 20 Oct 2022 12:55:39 GMT
2022-10-20T12:55:39.088850Z INFO crash: response header: etag: "3147526947+ident"
2022-10-20T12:55:39.088865Z INFO crash: response header: expires: Thu, 27 Oct 2022 12:55:39 GMT
2022-10-20T12:55:39.088881Z INFO crash: response header: last-modified: Thu, 17 Oct 2019 07:18:26 GMT
2022-10-20T12:55:39.088908Z INFO crash: response header: server: ECS (dcb/7F84)
2022-10-20T12:55:39.088923Z INFO crash: response header: vary: Accept-Encoding
2022-10-20T12:55:39.088939Z INFO crash: response header: x-cache: HIT
2022-10-20T12:55:39.088968Z INFO crash: response header: content-length: 1256
2022-10-20T12:55:39.088985Z INFO crash: Reading frame (9 bytes so far)
2022-10-20T12:55:39.126261Z INFO crash: Reading frame (11 bytes so far)
2022-10-20T12:55:39.126332Z INFO crash: Reading frame (256 bytes so far)
2022-10-20T12:55:39.126367Z INFO crash: Reading frame (512 bytes so far)
2022-10-20T12:55:39.126411Z INFO crash: Reading frame (1024 bytes so far)
2022-10-20T12:55:39.126460Z INFO crash: < Frame { frame_type: Data(BitFlags<DataFlags>(0b1, EndStream)), reserved: 0, stream_id: 1, payload: OpaquePayload { len: 1256 } }
2022-10-20T12:55:39.126519Z INFO crash: response body: <!doctype html>
<html>
<head>
    <title>Example Domain</title>

    <meta charset="utf-8" />
    <m
2022-10-20T12:55:39.126552Z INFO crash: All done!
And that's it!
We've made an HTTP/2 request all by ourselves, re-using "only" DNS, TCP, TLS, and HPACK.
You want to throw Ethernet in there for good measure or?
Nah, I'm good. Now that we've implemented HTTP/1.1 and HTTP/2 (minimally), we can properly compare them.
In HTTP/1.1, the "Host" header is used to pick "which of the virtual servers behind this IP address we're trying to reach". And we mentioned that it can be an issue if it doesn't match the hostname passed in the SNI extension during the TLS handshake.
HTTP/2 has that same issue (except the :authority pseudo-header is used instead), and more! Because HTTP/2 allows pushing responses, before the client even knows it wants them.
Here's what RFC 9113 says about it:
Server push was designed to allow a server to improve client-perceived performance by predicting what requests will follow those that it receives, thereby removing a round trip for them.
For example, a request for HTML is often followed by requests for stylesheets and scripts referenced by that page. When these requests are pushed, the client does not need to wait to receive the references to them in the HTML and issue separate requests.
But also:
In practice, server push is difficult to use effectively, because it requires the server to correctly anticipate the additional requests the client will make, taking into account factors such as caching, content negotiation, and user behavior.
Errors in prediction can lead to performance degradation, due to the opportunity cost that the additional data on the wire represents. In particular, pushing any significant amount of data can cause contention issues with responses that are more important.
In practice, the world is moving towards early hints instead, which works over HTTP/1.1, 2, and 3, although RFC 8297 warns about compatibility:
Some clients might have issues handling a 103 (Early Hints) response, because informational responses are rarely used in reply to requests not including an Expect header field.
In particular, an HTTP/1.1 client that mishandles an informational response as a final response is likely to consider all responses to the succeeding requests sent over the same connection to be part of the final response. Such behavior might constitute a cross-origin information disclosure vulnerability in case the client multiplexes requests to different origins onto a single persistent connection.
Therefore, a server might refrain from sending 103 (Early Hints) responses over HTTP/1.1 unless the client is known to handle informational responses correctly.
Isn't that fun? I haven't even talked about informational responses in this article yet, and it's already long enough that I'll get e-mails about how it should've been a series instead (the chunked transfer-encoding of fasterthanlime writing).
Besides being hard to use properly, HTTP/2 push can also be used for evil. Servers can push responses for any :authority, and it's up to the client to check that the server is, in fact, authoritative.
So just making your server well-behaved doesn't protect your web application. Every client implementation must be secure against that (or disable HTTP/2 push altogether, which I'd probably advise doing, at this point).
The happy path of HTTP/2 is pretty clear: a single HTTP/2 connection is "simply" a bundle of (bidirectional) streams, which feel a lot like TCP connections, except the concept of headers (and trailers) is baked into the framing format, so there's a lot less opportunity for accidentally interpreting part of the headers as the body, or vice versa.
Chunked transfer encoding doesn't really exist in HTTP/2 anymore, or rather, it's the default: each "chunk" is a DATA frame, and the last one has the END_STREAM flag set, whether it's empty or not.
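To make that concrete, here's a minimal sketch of cutting a body into DATA frames. The DataFrame struct and the body_to_data_frames function are made up for this illustration (they're not from hyper, h2, or the code earlier in this article), and the frame is reduced to the three fields that matter here:

```rust
/// One HTTP/2 DATA frame, reduced to what matters for this sketch.
/// (Hypothetical type: real frames also carry padding, a 9-byte header, etc.)
#[derive(Debug, PartialEq)]
pub struct DataFrame {
    pub stream_id: u32,
    pub end_stream: bool,
    pub payload: Vec<u8>,
}

/// Splits `body` into DATA frames of at most `max_frame_size` bytes.
/// Only the last frame carries END_STREAM. An empty body still produces
/// one (empty) frame, so that END_STREAM has somewhere to live.
pub fn body_to_data_frames(stream_id: u32, body: &[u8], max_frame_size: usize) -> Vec<DataFrame> {
    assert!(max_frame_size > 0, "max_frame_size must be non-zero");

    if body.is_empty() {
        return vec![DataFrame {
            stream_id,
            end_stream: true,
            payload: Vec::new(),
        }];
    }

    // Ceiling division, so we know which chunk is the last one.
    let chunk_count = (body.len() + max_frame_size - 1) / max_frame_size;

    body.chunks(max_frame_size)
        .enumerate()
        .map(|(i, chunk)| DataFrame {
            stream_id,
            end_stream: i + 1 == chunk_count,
            payload: chunk.to_vec(),
        })
        .collect()
}
```

With a 1256-byte body and a 1024-byte frame limit, this gives two frames (1024 and 232 bytes), and only the second one has end_stream set.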
HTTP/2 doesn't use hexadecimal encoding for numbers anymore (which HTTP/1.1 used for chunk lengths), and... one would hope it doesn't use decimal encoding anymore, but...
Actually yeah, what happens to the content-length header?
...that's an excellent question! Which reminds me, as of HTTP/2, all header names are lowercased, which is the Correct Decision (don't @ me):
Field names MUST be converted to lowercase when constructing an HTTP/2 message.
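As a rough sketch of what that means for an implementation: the two helper functions below are hypothetical names, and the check is a simplification based on the character ranges RFC 9113 forbids in field names (0x00-0x20, 0x41-0x5a, 0x7f-0xff). It ignores the extra rules about where a colon may appear.

```rust
/// Returns true if `name` passes a simplified HTTP/2 field-name check:
/// non-empty, no control characters or spaces (0x00-0x20), no uppercase
/// ASCII (0x41-0x5a), nothing at or above 0x7f.
/// (Hypothetical helper; colon placement rules are not checked here.)
pub fn is_valid_h2_field_name(name: &str) -> bool {
    !name.is_empty()
        && name
            .bytes()
            .all(|b| b > 0x20 && b < 0x7f && !b.is_ascii_uppercase())
}

/// Converts an HTTP/1.1-style field name for use in an HTTP/2 message.
pub fn to_h2_field_name(name: &str) -> String {
    name.to_ascii_lowercase()
}
```

So "Content-Length" is rejected as-is, and accepted once it's been through to_h2_field_name.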
The content-length header is still a thing, if you want to, but it must be correct, or else the message is malformed:
A request or response that includes message content can include a content-length header field. A request or response is also malformed if the value of a content-length header field does not equal the sum of the DATA frame payload lengths that form the content...
Of course there's a catch:
...unless the message is defined as having no content. For example, 204 or 304 responses contain no content, as does the response to a HEAD request. A response that is defined to have no content, as described in Section 6.4.1 of HTTP, MAY have a non-zero content-length header field, even though no content is included in DATA frames.
204 here being "No Content", and 304 being "Not Modified" (the response to a conditional request, one that tells the server which version of the cacheable resource we already have).
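Here's roughly what that rule looks like as a check, as a sketch under my own naming: ContentLengthCheck and check_content_length don't come from any library, and the caller is assumed to already know whether the message is "defined as having no content" (204, 304, response to HEAD).

```rust
#[derive(Debug, PartialEq)]
pub enum ContentLengthCheck {
    Ok,
    Malformed { declared: u64, received: u64 },
}

/// Compares a declared content-length against the DATA frames received.
///
/// - `declared`: the parsed content-length header, if there was one
/// - `data_frame_lens`: payload length of each DATA frame on the stream
/// - `defined_as_no_content`: true for 204/304 responses and responses
///   to HEAD (the caller decides this; it's an assumption of this sketch)
pub fn check_content_length(
    declared: Option<u64>,
    data_frame_lens: &[u64],
    defined_as_no_content: bool,
) -> ContentLengthCheck {
    let received: u64 = data_frame_lens.iter().sum();

    match declared {
        // No header, nothing to compare against.
        None => ContentLengthCheck::Ok,
        // The catch: a non-zero content-length is fine when the message
        // is defined to have no content and indeed carried none.
        Some(_) if defined_as_no_content && received == 0 => ContentLengthCheck::Ok,
        Some(declared) if declared == received => ContentLengthCheck::Ok,
        Some(declared) => ContentLengthCheck::Malformed { declared, received },
    }
}
```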
For request bodies, the same applies: you can set a content-length header, but you better make sure it matches the data you send. Chunked transfer encoding is also deprecated / the default here, they're just DATA frames on the same stream, in the client->server direction.
Things get more interesting when you start thinking about what happens when a LOT of concurrent requests are done over the same HTTP/2 connection.
Before we even have to think about flow control, there's the question of stream IDs. We have 2^31 stream IDs available (that's about 2 billion). Can we run out of them?
Yes we can. Without even sending 2 billion requests. The only requirement when generating a new stream ID is that it must be "numerically greater than all the streams that the initiating endpoint has opened or reserved".
What happens when we run out of stream identifiers? It's easy for a client: they just open another HTTP/2 connection and go on about their business. For a server, it's a bit more awkward: they have to politely ask the client to go away. With a GOAWAY frame.
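A small sketch of the client side of this, assuming the usual convention that client-initiated streams use odd identifiers. ClientStreamIds and its methods are names invented for the example; real implementations track a lot more state per stream.

```rust
/// Stream identifiers are 31-bit unsigned integers.
pub const MAX_STREAM_ID: u32 = (1 << 31) - 1;

/// Hands out client-initiated stream IDs: odd, strictly increasing.
/// (Hypothetical type, for illustration only.)
pub struct ClientStreamIds {
    next: u32,
}

impl ClientStreamIds {
    pub fn new() -> Self {
        Self { next: 1 }
    }

    /// Returns the next stream ID, or `None` once the ID space is
    /// exhausted and the client has to open a new connection.
    pub fn allocate(&mut self) -> Option<u32> {
        if self.next > MAX_STREAM_ID {
            return None;
        }
        let id = self.next;
        self.next += 2; // stay on odd numbers; cannot overflow a u32 here
        Some(id)
    }

    /// Skips ahead. That's legal: IDs only have to be greater than every
    /// ID used so far, they don't have to be consecutive. This is how a
    /// connection can run out of IDs long before 2 billion requests.
    pub fn skip_to(&mut self, id: u32) {
        let odd = id | 1;
        if odd > self.next {
            self.next = odd;
        }
    }
}
```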
This is interesting, because it means HTTP/2 can send errors out-of-band. In HTTP/1.1 land, if there's an error while generating the body, you're out of luck. Well, you can send some trailers maybe.
But just because an HTTP/2 server sends a GOAWAY, doesn't mean the whole connection can be thrown away! There might still be several requests or responses in-flight, which must be allowed to gracefully complete.
And there's a smörgåsbord of potential race conditions here: if the client receives a GOAWAY, how does it know which of the requests it has sent have been accepted by the server? Well, it knows because in the GOAWAY payload, a "last stream ID" is specified (along with an error code and some optional, additional debug data).
So, from the perspective of the server, we might have:
- Receive HEADERS for stream 1
- Receive HEADERS for stream 3
- Receive HEADERS for stream 5
- Decide to do graceful shutdown, send GOAWAY with last stream id 5
And from the perspective of the client, we might have:
- Send HEADERS for stream 1
- Send HEADERS for stream 3
- Send HEADERS for stream 5
- Send HEADERS for stream 7
- Send HEADERS for stream 9
- Send HEADERS for stream 11
- Receive GOAWAY with last stream id of 5
And the client should know that requests 7, 9, and 11 will not be processed by the server and should be retried elsewhere.
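The client-side bookkeeping for that scenario can be sketched like this. The GoAway struct and both functions are hypothetical names; the payload layout (a reserved bit plus a 31-bit last stream ID, a 32-bit error code, then optional debug data) is the one from RFC 9113.

```rust
#[derive(Debug, PartialEq)]
pub struct GoAway {
    pub last_stream_id: u32,
    pub error_code: u32,
    pub debug_data: Vec<u8>,
}

/// Parses a GOAWAY frame payload (frame header already stripped).
/// Returns `None` if the payload is too short to be valid.
pub fn parse_goaway(payload: &[u8]) -> Option<GoAway> {
    if payload.len() < 8 {
        return None;
    }
    // The top bit is reserved and must be ignored on receipt.
    let last_stream_id =
        u32::from_be_bytes([payload[0], payload[1], payload[2], payload[3]]) & 0x7fff_ffff;
    let error_code = u32::from_be_bytes([payload[4], payload[5], payload[6], payload[7]]);
    Some(GoAway {
        last_stream_id,
        error_code,
        debug_data: payload[8..].to_vec(),
    })
}

/// Splits in-flight stream IDs into (may have been processed, safe to retry).
pub fn split_on_goaway(in_flight: &[u32], last_stream_id: u32) -> (Vec<u32>, Vec<u32>) {
    in_flight
        .iter()
        .copied()
        .partition(|&id| id <= last_stream_id)
}
```

For the sequence above, splitting streams 1 through 11 at last stream ID 5 leaves 1, 3 and 5 to complete on this connection, and 7, 9 and 11 to be retried on another one.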
This isn't something that can be really achieved with HTTP/1.1. If the TCP connection is closed without returning a response header, there's no telling what the server actually did. Returning a 429 (Too Many Requests) or a 503 (Service Unavailable) may indicate that a request is safe to retry, but that's backend-dependent. If you're writing a proxy, well... you need special response headers to indicate whether those came from the app you're proxying to, or the proxy itself.
I love how optimistic some web documentation is, by the way: did you know the Retry-After header for 503 indicates "how long the service is expected to be unavailable" (according to MDN)? Does anyone use this for scheduled maintenance, and return a value other than "idk try again in a couple seconds I guess"?
If you do, please send me a message. I always love to be proven wrong.
Flow control is a rich bug area with HTTP/2. We've already seen that either peer can send WINDOW_UPDATE frames whenever they feel like it. But they can also send a SETTINGS_INITIAL_WINDOW_SIZE parameter as part of the SETTINGS frame, which, as you may or may not remember, the client doesn't need to wait for before sending requests.
That means the initial window size can be retroactively changed, and the relevant passage from the RFC is just delicious:
A change to SETTINGS_INITIAL_WINDOW_SIZE can cause the available space in a flow-control window to become negative. A sender MUST track the negative flow-control window and MUST NOT send new flow-controlled frames until it receives WINDOW_UPDATE frames that cause the flow-control window to become positive.
For example, if the client sends 60 KB immediately on connection establishment and the server sets the initial window size to be 16 KB, the client will recalculate the available flow-control window to be -44 KB on receipt of the SETTINGS frame. The client retains a negative flow-control window until WINDOW_UPDATE frames restore the window to being positive, after which the client can resume sending.
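The arithmetic in that example is worth spelling out. Here's a sketch of a sender-side window tracker, using a signed integer precisely because the window is allowed to go negative. SendWindow is a made-up type; 65,535 is the default initial window size from the RFC.

```rust
/// Default initial flow-control window, per RFC 9113.
pub const DEFAULT_INITIAL_WINDOW: i64 = 65_535;

/// Sender-side view of one stream's flow-control window.
/// (Hypothetical type, for illustration only.)
pub struct SendWindow {
    initial: i64,
    available: i64,
}

impl SendWindow {
    pub fn new() -> Self {
        Self {
            initial: DEFAULT_INITIAL_WINDOW,
            available: DEFAULT_INITIAL_WINDOW,
        }
    }

    pub fn available(&self) -> i64 {
        self.available
    }

    /// New flow-controlled frames may only be sent while this is true.
    pub fn can_send(&self) -> bool {
        self.available > 0
    }

    /// Call after sending `len` bytes of DATA.
    pub fn on_data_sent(&mut self, len: u32) {
        self.available -= i64::from(len);
    }

    /// Call when the peer's SETTINGS changes SETTINGS_INITIAL_WINDOW_SIZE:
    /// the window moves by the difference between old and new values.
    pub fn on_initial_window_size(&mut self, new_initial: u32) {
        let delta = i64::from(new_initial) - self.initial;
        self.initial = i64::from(new_initial);
        self.available += delta;
    }

    /// Call when a WINDOW_UPDATE for this stream arrives.
    pub fn on_window_update(&mut self, increment: u32) {
        self.available += i64::from(increment);
    }
}
```

Sending 60 KB leaves 65,535 - 61,440 = 4,095 bytes. Shrinking the initial window to 16 KB moves the window by 16,384 - 65,535 = -49,151, which lands on -45,056 bytes: exactly the -44 KB from the RFC.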
Furthermore, the size of the flow-control window for a stream can be changed by sending a SETTINGS frame, but again, since everything is asynchronous, the peer that sent that frame must be prepared to receive more data than allowed by the new window size, until the SETTINGS frame is acknowledged.
But what if the SETTINGS frame is never acknowledged?
Ah, well the RFC covers that:
If the sender of a SETTINGS frame does not receive an acknowledgment within a reasonable amount of time, it MAY issue a connection error (Section 5.4.1) of type SETTINGS_TIMEOUT. In setting a timeout, some allowance needs to be made for processing delays at the peer; a timeout that is solely based on the round-trip time between endpoints might result in spurious errors.
A less nuclear option is to yeet a single stream (the one that's going over its flow-control window):
The receiver MAY instead send a RST_STREAM with an error code of FLOW_CONTROL_ERROR for the affected streams.
This is important because, again: in HTTP/1.1, you can simply close the TCP connection, but in HTTP/2, you must act on the stream level, where all streams share the same TCP connection.
And that's neat if you're monitoring your HTTP endpoints because, in HTTP/1.1, if you change your mind about making a specific HTTP request for example, all you can do is close the TCP connection in the middle of the server streaming the response back to you.
The server will never know if you went away by choice, or through an unfortunate sequence of events (loss of connectivity, OOM-kill, etc.). But with HTTP/2, you can send a RST_STREAM with the CANCEL error code.
(Chances are your HTTP/2 client won't do that, but it could. I like having at least the option of doing things properly).
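For reference, a RST_STREAM frame is tiny: the usual 9-byte frame header followed by a 4-byte error code. Here's a sketch of encoding one by hand. The function name is mine; the frame type and error code values are the ones assigned in RFC 9113.

```rust
/// Frame type for RST_STREAM.
pub const FRAME_TYPE_RST_STREAM: u8 = 0x3;

/// A few of the error codes mentioned in this article.
pub const ERROR_FLOW_CONTROL: u32 = 0x3;
pub const ERROR_REFUSED_STREAM: u32 = 0x7;
pub const ERROR_CANCEL: u32 = 0x8;

/// Encodes a complete RST_STREAM frame: 9-byte header + 4-byte payload.
/// (Hypothetical helper, for illustration only.)
pub fn encode_rst_stream(stream_id: u32, error_code: u32) -> [u8; 13] {
    let mut frame = [0u8; 13];
    // 24-bit payload length: always 4 for RST_STREAM.
    frame[0..3].copy_from_slice(&[0, 0, 4]);
    frame[3] = FRAME_TYPE_RST_STREAM;
    // RST_STREAM defines no flags.
    frame[4] = 0;
    // Reserved bit (cleared) + 31-bit stream identifier.
    frame[5..9].copy_from_slice(&(stream_id & 0x7fff_ffff).to_be_bytes());
    // Payload: the 32-bit error code.
    frame[9..13].copy_from_slice(&error_code.to_be_bytes());
    frame
}
```

Cancelling stream 5 is then just encode_rst_stream(5, ERROR_CANCEL), written to the same connection everything else is going over.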
That's not the end of flow control "fun". In section 10.5 (Denial-of-Service Considerations) of RFC 9113, we learn about whole new categories of things to fear.
Some are pretty basic: spamming SETTINGS frames (each of them requiring a separate acknowledgement), or PING frames, or sending WINDOW_UPDATE frames with a tiny increment, forcing the other peer to generate a lot of tiny DATA frames.
But some of the most "entertaining" ones have to do with the interaction of TCP and HTTP/2 flow control:
An attacker can provide large amounts of flow-control credit at the HTTP/2 layer but withhold credit at the TCP layer, preventing frames from being sent. An endpoint that constructs and remembers frames for sending without considering TCP limits might be subject to resource exhaustion.
And of course, because compression is involved in headers now, a whole other series of attacks becomes relevant, like CRIME, which stands for "Compression Ratio Info-leak Made Easy", and involves guessing a header value by chosen plaintext:
(..) the attacker being able to observe the size of the ciphertext sent by the browser while at the same time inducing the browser to make multiple carefully crafted web connections to the target site.
The attacker then observes the change in size of the compressed request payload, which contains both the secret cookie that is sent by the browser only to the target site, and variable content created by the attacker, as the variable content is altered.
When the size of the compressed content is reduced, it can be inferred that it is probable that some part of the injected content matches some part of the source, which includes the secret content that the attacker desires to discover.
As I said. Very entertaining.
Originally I started writing this article to tell y'all about a funny bug we had at work, but it's been a month, and two things happened:
- I decided it would be a swell idea to let this article turn into "let's implement H1+H2 from semi-scratch"
- We've had other, gnarlier H2 bugs that I don't have explanations for, yet.
Looking back at the original (internal) write-up I did for the bug, I'm not sure I even understand it anymore. I've tried coming up with a small reproduction for this article, but I failed at both the "small" and "reproduction" aspects of that endeavor.
So instead, we shall be looking at another fun bug!
Here's my test code:
TOML markup
# in `h2-repro/Cargo.toml`

[package]
name = "h2-repro"
version = "0.1.0"
edition = "2021"

[dependencies]
color-eyre = "0.6.2"
hyper = { version = "0.14.20", features = ["client", "server", "http2", "tcp"] }
tokio = { version = "1.21.2", features = ["full"] }
tracing = "0.1.37"
tracing-subscriber = { version = "0.3.16" }
Rust code
// in `h2-repro/src/main.rs`

use std::{convert::Infallible, net::TcpListener, str::FromStr, time::Duration};

use color_eyre::eyre;
use hyper::{
    body::Bytes,
    client::HttpConnector,
    service::{make_service_fn, service_fn},
    Body, Client, Method, Request, Response,
};
use tokio::sync::mpsc;
use tracing::{error, info};
use tracing_subscriber::{filter::Targets, layer::SubscriberExt, util::SubscriberInitExt};

#[tokio::main]
async fn main() {
    real_main().await.unwrap()
}

async fn real_main() -> eyre::Result<()> {
    color_eyre::install().unwrap();

    let filter_layer =
        Targets::from_str(std::env::var("RUST_LOG").as_deref().unwrap_or("info")).unwrap();
    let format_layer = tracing_subscriber::fmt::layer();
    tracing_subscriber::registry()
        .with(filter_layer)
        .with(format_layer)
        .init();

    let h2_max_streams: u32 = std::env::var("H2_MAX_STREAMS")
        .map(|s| s.parse().unwrap())
        .unwrap_or(50);
    let h2_requests = std::env::var("H2_REQUESTS")
        .map(|s| s.parse().unwrap())
        .unwrap_or(100);
    info!("{h2_requests} requests on {h2_max_streams} streams");
    info!("(Set $H2_REQUESTS and $H2_MAX_STREAMS environment variables to adjust)");

    // Same test twice: first over HTTP/1.1, then over HTTP/2
    run_test(false, h2_max_streams, h2_requests).await?;
    run_test(true, h2_max_streams, h2_requests).await?;

    Ok(())
}

async fn run_test(h2_only: bool, h2_max_streams: u32, h2_requests: u32) -> eyre::Result<()> {
    let prefix = if h2_only { "H2" } else { "H1" };

    // Port 0: let the OS pick a free port for each run
    let ln = TcpListener::bind("[::]:0")?;
    let addr = ln.local_addr()?;

    let server = hyper::server::Server::from_tcp(ln)?
        .http2_max_concurrent_streams(h2_max_streams)
        .http2_only(h2_only)
        .serve(make_service_fn(|_conn| async {
            Ok::<_, Infallible>(service_fn(sample_endpoint))
        }));

    let _server_jh = tokio::spawn(async move {
        server.await.unwrap();
    });

    let client = Client::builder().http2_only(h2_only).build_http::<Body>();

    let (tx, mut rx) = mpsc::channel::<eyre::Result<()>>(4096);
    // One byte more than HTTP/2's default initial window size (65535)
    let body = Bytes::from(vec![0u8; 65535 + 1]);

    async fn do_one_request(req: Request<Body>, client: Client<HttpConnector>) -> eyre::Result<()> {
        let res = client.request(req).await?;
        _ = hyper::body::to_bytes(res.into_body()).await?;
        Ok(())
    }

    // Fire off all requests concurrently, sharing the same client
    for _ in 0..h2_requests {
        let req = Request::builder()
            .uri(format!("http://{addr}"))
            .method(Method::POST)
            .body(Body::from(body.clone()))?;
        let fut = do_one_request(req, client.clone());
        let tx = tx.clone();
        tokio::spawn(async move { _ = tx.send(fut.await).await });
    }
    drop(tx);

    // Count completions, giving up if nothing happens for 500ms
    let mut complete_reqs = 0;
    while let Ok(Some(res)) = tokio::time::timeout(Duration::from_millis(500), rx.recv()).await {
        res?;
        complete_reqs += 1;
    }

    if complete_reqs != h2_requests {
        error!("{prefix}: Stuck at {complete_reqs} / {h2_requests}");
    } else {
        info!("{prefix}: Completed {complete_reqs} / {h2_requests}");
    }

    Ok(())
}

async fn sample_endpoint(req: Request<Body>) -> Result<Response<Body>, Infallible> {
    // Read the whole request body before responding
    let (_parts, req_body) = req.into_parts();
    hyper::body::to_bytes(req_body).await.unwrap();

    let res = Response::new("hi there".into());
    Ok(res)
}
This "simply" makes many concurrent requests with the same hyper::Client to a hyper server, over HTTP/1.1, and then HTTP/2. (Separate clients, separate servers).
We can get it to work reliably:
Shell session
$ RUST_BACKTRACE=1 H2_MAX_STREAMS=100 H2_REQUESTS=100 cargo run --release --quiet
2022-10-20T18:35:28.635803Z INFO h2_repro: 100 requests on 100 streams
2022-10-20T18:35:28.635826Z INFO h2_repro: (Set $H2_REQUESTS and $H2_MAX_STREAMS environment variables to adjust)
2022-10-20T18:35:28.717719Z INFO h2_repro: H1: Completed 100 / 100
2022-10-20T18:35:28.814155Z INFO h2_repro: H2: Completed 100 / 100
$ RUST_BACKTRACE=1 H2_MAX_STREAMS=100 H2_REQUESTS=100 cargo run --release --quiet
2022-10-20T18:35:30.478822Z INFO h2_repro: 100 requests on 100 streams
2022-10-20T18:35:30.478842Z INFO h2_repro: (Set $H2_REQUESTS and $H2_MAX_STREAMS environment variables to adjust)
2022-10-20T18:35:30.589255Z INFO h2_repro: H1: Completed 100 / 100
2022-10-20T18:35:30.692042Z INFO h2_repro: H2: Completed 100 / 100
$ RUST_BACKTRACE=1 H2_MAX_STREAMS=100 H2_REQUESTS=100 cargo run --release --quiet
2022-10-20T18:35:31.202745Z INFO h2_repro: 100 requests on 100 streams
2022-10-20T18:35:31.202769Z INFO h2_repro: (Set $H2_REQUESTS and $H2_MAX_STREAMS environment variables to adjust)
2022-10-20T18:35:31.287174Z INFO h2_repro: H1: Completed 100 / 100
2022-10-20T18:35:31.430318Z INFO h2_repro: H2: Completed 100 / 100
What's interesting is what happens if the maximum number of streams is set lower than the number of requests we make.
Sometimes, the server actually gets to send a RST_STREAM with error code REFUSED_STREAM:
Shell session
$ RUST_BACKTRACE=1 H2_MAX_STREAMS=50 H2_REQUESTS=100 cargo run --release --quiet
2022-10-20T18:35:46.732954Z INFO h2_repro: 100 requests on 50 streams
2022-10-20T18:35:46.732986Z INFO h2_repro: (Set $H2_REQUESTS and $H2_MAX_STREAMS environment variables to adjust)
2022-10-20T18:35:46.782711Z INFO h2_repro: H1: Completed 100 / 100
The application panicked (crashed).
Message: called `Result::unwrap()` on an `Err` value:
0: http2 error: stream error received: refused stream before processing any application logic
1: stream error received: refused stream before processing any application logic
Location: src/main.rs:69
(cut)
In my view, this is a bug: I think hyper::Client should retry the request. But I'm sympathetic to the argument that hyper has already encoded headers, and probably started sending the request body, and there's no trait bound for "replaying bodies", and it doesn't want to do its own buffering there, so, sure.
Some other times, though, something much more fun happens:
Shell session
$ RUST_BACKTRACE=1 H2_MAX_STREAMS=50 H2_REQUESTS=100 cargo run --release --quiet
2022-10-20T18:40:30.826371Z INFO h2_repro: 100 requests on 50 streams
2022-10-20T18:40:30.826391Z INFO h2_repro: (Set $H2_REQUESTS and $H2_MAX_STREAMS environment variables to adjust)
2022-10-20T18:40:30.909172Z INFO h2_repro: H1: Completed 100 / 100
2022-10-20T18:40:31.491769Z ERROR h2_repro: H2: Stuck at 51 / 100
It gets stuck!
Shell session
$ RUST_BACKTRACE=1 H2_MAX_STREAMS=200 H2_REQUESTS=240 cargo run --release --quiet
2022-10-20T18:41:28.854054Z INFO h2_repro: 240 requests on 200 streams
2022-10-20T18:41:28.854078Z INFO h2_repro: (Set $H2_REQUESTS and $H2_MAX_STREAMS environment variables to adjust)
2022-10-20T18:41:28.981241Z INFO h2_repro: H1: Completed 240 / 240
2022-10-20T18:41:29.571869Z ERROR h2_repro: H2: Stuck at 222 / 240
The test loses patience after 500 milliseconds, but believe me, it never gets unstuck.
Shell session
$ RUST_BACKTRACE=1 H2_MAX_STREAMS=100 H2_REQUESTS=150 cargo run --release --quiet
2022-10-20T18:41:52.152636Z INFO h2_repro: 150 requests on 100 streams
2022-10-20T18:41:52.152658Z INFO h2_repro: (Set $H2_REQUESTS and $H2_MAX_STREAMS environment variables to adjust)
2022-10-20T18:41:52.206723Z INFO h2_repro: H1: Completed 150 / 150
2022-10-20T18:41:52.775941Z ERROR h2_repro: H2: Stuck at 143 / 150
And now you're going to explain why that happens, right?
Oh no. I've done quite enough for now. That issue has been open for quite a while now. Maybe someone else will crack it!
Unlike HTTP/1.1, HTTP/2 has a conformance testing tool, h2spec. But it should be considered more of a starting point: there's plenty of opportunity for unfortunate interactions after that.
Thanks for following me into the depths of HTTP/1.1 and HTTP/2. Apologies for not covering HTTP/3, although, in fairness, it's probably as involved as both its predecessors added together.
Researching this article, as always, gave me simultaneously much-needed humility, and a renewed motivation to experiment further with implementing protocols from scratch (or nearly scratch).
I've been working on a new H1/H2 implementation in Rust, with very specific design objectives:
- Only target Linux
- Use io_uring for asynchronous I/O (through tokio-uring right now)
- Use rustls for TLS handshakes, then kTLS (see this PR)
- Control memory usage carefully, using a fixed-size buffer pool
- Provide visibility into the exact state of H1/H2 connections
The requirements are so different from hyper/h2, that they're not at all competitors. It's also still very early stages: HTTP/1.1 barely works, and I just got a (pretty poor implementation of) chunked transfer-encoding last week.
What's interesting about building on top of io_uring is that a lot of the async interfaces I was used to don't work or make sense anymore. A lot has been written about what the "proper" Rust interface for io_uring should be: the short answer is that as soon as you submit an operation, the buffer is no longer owned by the application: it's owned by the kernel.
So if you look at tokio-uring's TcpStream::write method for example, you'll see it takes a T: IoBuf, owned, and returns a BufResult<usize, T>, which is just a (Result<usize>, T).
Which means, in practice, you find yourself writing a lot of code like:
Rust code
let mut buf = get_a_buf();
let res;
(res, buf) = src.read(buf).await;
res?;
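If you want to play with that ownership dance without io_uring in the picture, here's a synchronous stand-in using only the standard library. To be clear, this is not tokio-uring's API: read_owned and read_some are hypothetical functions, and a blocking Read never actually needs to own the buffer. It only mimics the shape: the buffer goes in by value and comes back alongside the result.

```rust
use std::io::{self, Read};

/// Same shape as the result type described above: the operation's
/// result, plus the buffer, handed back to the caller.
pub type BufResult<T, B> = (io::Result<T>, B);

/// Takes ownership of `buf` for the duration of the read, then returns
/// it whether the read succeeded or not.
pub fn read_owned<R: Read>(src: &mut R, mut buf: Vec<u8>) -> BufResult<usize, Vec<u8>> {
    let res = src.read(&mut buf);
    (res, buf)
}

/// Reads up to 16 bytes from `src`, using the "give the buffer away,
/// get it back" pattern.
pub fn read_some(src: &mut impl Read) -> io::Result<Vec<u8>> {
    let mut buf = vec![0u8; 16];
    let res;
    // `buf` is moved into the call, then re-assigned from its result.
    (res, buf) = read_owned(src, buf);
    let n = res?;
    buf.truncate(n);
    Ok(buf)
}
```

The important part is that the buffer comes back even when the operation fails, so it can be returned to a pool rather than dropped.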
The naive solution is to use Vec<u8> as a buffer type. A slightly less naive solution is to use something like Bytes from the bytes crate (which is actually what hyper uses throughout).
But even then, Bytes does more work than we need, because it's Send, so it uses atomic reference-counting under the hood. tokio-uring has its own start method to start a runtime, and it's effectively a current_thread tokio runtime: things don't need to be Send because there's only ever one userland thread.
If you want to utilize multiple cores, you can "simply" start one tokio-uring runtime per core. Using socket options like SO_REUSEADDR and SO_REUSEPORT, accepted connections can be spread across workers. This isn't the only solution, or necessarily the right one, but it's an easy one.
And now, if you're able to do most of the processing for a connection in its own thread, limiting interactions with process-wide state as much as possible, you don't need atomic reference counting — regular reference counting works.
I've always wondered what the true cost of the Arc<GlobalState> model is for high-traffic hyper applications. I was surprised to find out that Arc is not so free, under certain circumstances.
Similarly, most mutexes can now transform into RefCells, which still enforce "only one single mutable reference to something at any given time", but never spin, never block, never yield to another thread.
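Here's a minimal sketch of what that looks like with per-thread state: Rc instead of Arc, RefCell instead of Mutex. ConnStats and the two functions are invented for this example. The trade-off is visible in the second function: where a mutex would make you wait, a RefCell reports the conflict immediately (or panics, if you use borrow_mut directly).

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Some state shared between tasks that all live on the same thread.
/// (Hypothetical type, for illustration only.)
#[derive(Default)]
pub struct ConnStats {
    pub requests: u64,
}

/// Bumps the counter. No atomics, no locking, but this panics if the
/// state is already borrowed elsewhere.
pub fn record_request(stats: &Rc<RefCell<ConnStats>>) {
    stats.borrow_mut().requests += 1;
}

/// Non-panicking variant: returns false if the state is currently
/// borrowed, instead of blocking the way a mutex would.
pub fn try_record_request(stats: &Rc<RefCell<ConnStats>>) -> bool {
    match stats.try_borrow_mut() {
        Ok(mut s) => {
            s.requests += 1;
            true
        }
        Err(_) => false,
    }
}
```

Since Rc<RefCell<T>> is not Send, the compiler refuses to let it cross threads, which is exactly the guarantee a one-runtime-per-core design relies on.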
Inevitably, there'll still be some global state to interact with — I'm hoping communicating over channels with some "controller" runtime won't end up being prohibitively expensive.
Another thing I've found frustrating while working with hyper is how little internal state it exposes. The complex lifecycle of some HTTP connections is somewhat lost behind abstractions like tower::Service, and most of it is "tasks spawned on the runtime". Even with custom acceptors, custom body types, etc., I've been stumped multiple times.
This, of course, makes sense for the de-facto standard HTTP implementation for Rust: hyper's public API has been remarkably stable, and changing it now would be an enormous undertaking that would no doubt make a lot of noise (not necessarily in a good way).
I'm a big fan of the "let users see as much of the internal state as possible" school of thought. It makes unknown unknowns a lot easier to chase down. That's what I'm going for with this specific exploration.
Anyway! I'm really excited about this work, and I do think more folks should play around with their own implementation of HTTP and adjacent protocols: Rust is truly an excellent language for it.
The inner workings of HTTP/2 were intimidating and mysterious before I wrote this article, and now they feel somewhat approachable.
I hope this article did the same for you — that's the whole reason I started writing "seriously" in the first place :)
Thanks for reading, and as always: take excellent care of yourself.