Friday, April 23, 2021

Solving the Vaccine Data Problem

VaccinateCA, the non-profit I have been leading for the last few months, has expanded to Vaccinate The States. (Consider it a beta; we will keep improving it, but it can save lives today, so today it goes up.)

We have the country’s largest and best public data set on covid-19 vaccine availability. We get it by directly gathering intelligence (largely by calling healthcare providers), collating public sources, brokering data between other projects, and applying some engineering and operations elbow grease. We work with the federal government, some states and counties, and the world’s largest publishers (such as Google Maps).

The vaccination effort remains critically important in America. Half of our neighbors have not been vaccinated yet. The faster we can help them, the faster normal life returns and the faster we can use America’s manufacturing capacity to save lives elsewhere.

Here’s what we’ve learned, and how you can help.

How this started

Back in January, vaccine availability was chaotic. Distribution and allocation decisions were made at multiple levels of government and private enterprise via processes and systems which did not interoperate. Eligibility decisions were guided by a patchwork of policies and often overridden by individual healthcare workers. The situation was a mess.

Our friends and loved ones reported that they had to visit upwards of a dozen websites to figure out where they might be able to get a vaccination. They were calling 20+ locations sequentially and swapping tips via messenger. This clearly could not stand.

There were some whispers that someone in a position of authority was going to launch a Grand Unified Site which had all the information on it Any Day Now. I thought this was unlikely, and thought that the private sector would likely have to step into the gap. I tweeted that this would make a good hackathon-style project for interested technologists.

Within hours, the brightest team of people I’ve ever had the pleasure of working with were building it on Discord, and making all the right decisions. So I joined up for an evening, then a weekend, and then one thing led to another and I ended up as CEO.

The core insight of VaccinateCA on Day 1 was that, if doses were actually being injected, the person actually doing the injecting knew both a) whether or not they had more doses and b) to whom they could administer them. Essentially all other data in the ecosystem were stale or lies. Promulgated county policy? Doesn’t matter if a pharmacy chain isn’t following it. Reported stock numbers? Doesn’t matter if the database disagrees with the pharmacist on how many doses are left in the bottle.

While many waited for a grand technical solution to the issue, we started with the scrappy startup version: call locations which might have doses. There are, after all, a finite number of pharmacies and hospitals. At this point we were unknown, uncredentialed, unaided volunteers who had no special connections or expertise, so we did exactly what a vaccine seeker would do: pick up the phone book and start dialing.

I’ll never forget my first call, at 2 AM in Tokyo, to a pharmacy in California. “Excuse me, can I speak to the pharmacist?” “One moment.” … “Yes, how can I help you?” “Iwaswonderingifyouhadthecovidvaccine.” “I’m afraid we don’t have it yet, but we expect to get it in maybe two weeks.” “Thank you. If a patient who was 65 wanted it, what would they need to do?” “Register on the county website. Here’s the address: … “ “Thank you.”

And just like that, we learned that you don’t have to be anybody special to get useful information out of a pharmacy by calling their published number.

We did things the scrappy startup way, putting call results into Airtable (a wonderful product, by the way). Our website, for the first few months, was static HTML generated by Jekyll with a bit of JS on the frontend. The API was a JSON file in the cloud exported from Airtable on a cronjob.
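
For the curious, here’s roughly what that pipeline looks like, as a minimal sketch; the base ID, table name, and output filename are hypothetical placeholders, not our actual configuration.

```python
# Minimal sketch: export an Airtable table to a static JSON file.
# Base ID, table name, and output path are hypothetical placeholders.
import json
import os

import requests

API_KEY = os.environ["AIRTABLE_API_KEY"]
BASE_ID = "appEXAMPLE"   # hypothetical
TABLE = "Locations"      # hypothetical

def fetch_all_records():
    """Page through the Airtable REST API and collect every record."""
    url = f"https://api.airtable.com/v0/{BASE_ID}/{TABLE}"
    headers = {"Authorization": f"Bearer {API_KEY}"}
    records, offset = [], None
    while True:
        params = {"offset": offset} if offset else {}
        resp = requests.get(url, headers=headers, params=params, timeout=30)
        resp.raise_for_status()
        payload = resp.json()
        records.extend(r["fields"] for r in payload["records"])
        offset = payload.get("offset")
        if not offset:
            return records

if __name__ == "__main__":
    # A cronjob would run this periodically and sync the file to cloud storage.
    with open("locations.json", "w") as f:
        json.dump(fetch_all_records(), f)
```

Run something like that every few minutes from a cronjob, copy the file to a CDN bucket, and you have a perfectly serviceable read-only API for a static frontend.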

And it started working, very quickly. We began to get anecdotal reports of successful vaccination (generally of a user’s parents) within 24 hours of launching. Within a few days, it was obvious that the little hackathon project was much more impactful than we had ever expected, and so we intensified efforts, becoming a “real” non-profit, hiring some folks onto the project full-time, and quickly iterating on strategies.

We’ve since served millions of users, had hundreds of thousands of them get vaccinated, and become the best public data set on vaccine availability in California… and soon the nation.

How this has evolved

This has been the fastest-moving startup I’ve ever worked on, both because of the speed at which we were working and the speed at which reality changed out from under us. We were trying to give away the most anticipated product launch in history; our competition had 50% week-over-week growth and was the world’s leading expert in viral marketing.

As we got a bit of press for being a volunteer-led effort while there were not high-quality government-backed efforts available, we were worried about being told not to interfere. Gradually, the opposite happened. First quietly, and then formally, various official parties in California started to talk to us and ask what we were seeing in the data.

This is probably surprising, but true: the formal vaccination efforts also had a data problem. Governments did not always know precisely where the vaccines were being administered, or under precisely which eligibility criteria. Ground truth was critical for vaccine seekers but no one had it for them.

The formal effort didn’t know to what extent formally promulgated policies were being followed; its visibility into compliance was spotty and often lagged. Pharmacists often learned of changes in eligibility criteria by reading about them in the paper a few days after they had formally changed.

So we started developing some backchannels to report individual incidents that our phone operations discovered, and also passed over data sets and analyses in an ad hoc manner. We were able to e.g. unstick hundreds of doses in a particular county which were chillin’ in freezers because the local health department had, in all the hustle and bustle, missed a single CSV upload and therefore wasn’t scheduling appointments at a dozen pharmacies. (No one noticed because there was no infrastructure to check. It was no one’s job.)

As we developed a compelling data set and an operating model which kept it relatively fresh and accurate, we started to have success in convincing publishers to adopt it. It has backed Google Maps for the last few months; try [covid vaccine near me] if you live in California and look at the citations.

Our effort began helping healthcare providers directly. Increasingly, as we called pharmacies to ask them if they had the vaccine, they said “No, but check Vaccinate CA dot com. They list everywhere in the county that has it.”

We had originally thought this was going to be an OSS-style project and expected that interested groups might clone it in other states. This didn’t prove to be maximally viable, partially because we didn’t have the cycles to manage the OSS project and partially because this is not solely a software project. The really hard part is an operationally intensive effort to make hundreds or thousands of calls a day, and most volunteer projects couldn’t sustain that cadence for months.

Nonetheless, there have been many important efforts by technologists to gather and expose vaccine data, like VaccineSpotter. We have ended up in an emerging data broker role between community efforts and large consumers of this data, such as large publishers (e.g. Google) and government initiatives.

We also work with formal data sources, such as VaccineFinder, the CDC-blessed national initiative run out of Boston Children’s Hospital. They have a pipeline for updates directly from pharmacies’ internal systems; we have the ability to combine that list with other sites like community walk-in clinics that VaccineFinder was never designed to accommodate. Being an independent non-profit affords us some freedom to report ground truth directly in ways that the government often cannot. This is useful when work needs to be done quickly.

Once it became clear that we had a compelling data set, governments started to ask us how to integrate it into their own operations. Counties in California wanted data to use on their own sites. We said we’d do them one better, and built them an embed, so that it would stay up-to-date without additional operational toil. (Here’s Alameda County’s; our map is halfway down the page.)

Many governments need a months-long RFP process to get software written. We had a working URL 30 minutes after they emailed us, and localized it to 7 languages the following day. We will shortly have something similar working nationally.

Expanding nationally

We have an extremely compelling data set in California, as a result of working on it for the last 100 days. We have wanted for months to expand nationally, but the default strategy of building from zero again was daunting.

VaccineFinder helped us get a leg up on this. We were also accelerated by the efforts of individual states and county health departments. It turns out that the most common publishing platform in America for vaccine availability is, I kid you not, a Facebook page. There are literally thousands of them.

We have a group of “web bankers” who, with some assistance from cronjobs and purpose-built UIs, reload those pages frequently, key in new vaccination locations, and queue them up for direct verification if required. We then fan the information out from the originating agency’s post to all consumers of our data set (our site, Google Maps, participating government entities, etc.)
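
The cronjob assistance is nothing fancy. Here’s a sketch of the change-detection half of that workflow; the URLs and file path are placeholders, and real Facebook pages need more careful handling than a bare fetch.

```python
# Sketch: poll a list of source pages on a cronjob and flag anything that
# changed for a human web banker to review. URLs and file path are placeholders.
import hashlib
import json

import requests

SOURCES = [
    "https://example-county.gov/vaccine-updates",    # hypothetical
    "https://www.facebook.com/ExampleCountyHealth",  # hypothetical
]
STATE_FILE = "page_hashes.json"

def check_for_changes():
    try:
        with open(STATE_FILE) as f:
            seen = json.load(f)
    except FileNotFoundError:
        seen = {}
    changed = []
    for url in SOURCES:
        body = requests.get(url, timeout=30).text
        digest = hashlib.sha256(body.encode()).hexdigest()
        if seen.get(url) != digest:
            changed.append(url)  # queue for human review / data entry
            seen[url] = digest
    with open(STATE_FILE, "w") as f:
        json.dump(seen, f)
    return changed

if __name__ == "__main__":
    for url in check_for_changes():
        print("Needs review:", url)
```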

We also do a metric shedload of web scraping. A lot of county health departments are rocking Dreamweaver on an ancient machine. Ahh, Windows 95, those were the days… Technical archeology brings back memories.
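
To give a flavor of the scraping itself, here’s an illustrative example; the URL and HTML structure are invented, since every county’s page is different.

```python
# Illustrative scraper for a hypothetical county page listing vaccination sites.
# The URL and the HTML structure are made up; every real county is different.
import requests
from bs4 import BeautifulSoup

URL = "https://example-county.gov/covid-vaccine-locations"  # hypothetical

def scrape_locations():
    resp = requests.get(URL, timeout=30)
    resp.raise_for_status()
    soup = BeautifulSoup(resp.text, "html.parser")
    locations = []
    # Assume each site is a table row: name, address, scheduling link.
    for row in soup.select("table#locations tr")[1:]:
        cells = [c.get_text(strip=True) for c in row.find_all("td")]
        if len(cells) >= 2:
            locations.append({"name": cells[0], "address": cells[1]})
    return locations

if __name__ == "__main__":
    for loc in scrape_locations():
        print(loc)
```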

Why this effort needs to exist

For structural reasons, the challenge presented by covid-19 is pathologically misaligned with how the government writes and, more importantly, procures software. The pandemic moves faster than public digital infrastructure.

Our nation made the political decision to administer the vaccination effort very locally. This resulted in thousands of county health departments, most with no software capability to speak of, being in charge of availability criteria, stock levels, and allocation decisions.

State and local governments invested in separate systems which didn’t talk to each other. The bid process gives only one chance to get a system working; adapting to the situation at launch required months. The federal government repurposed vaccination infrastructure built around the yearly flu shot and not optimized for quickly responding to day-to-day updates on an intensely fluid situation. Political and legal considerations sometimes blocked government-to-government collaborative efforts.

We should have seen this coming. I should have seen this coming. I regret that I didn’t start work on this project last year. We could have had a national site ready the day of the first shot.

This is a lesson for next time. It took America two days—two days—to develop a covid-19 vaccine. That is a triumph worth celebrating. Next time, we should match our world-leading scientific expertise with the relatively pedestrian web application the country needs.

The continued criticality of the vaccination effort

There has been a turn in the narrative from “the vaccination effort is a disaster” to “the vaccination effort is going very well”, largely because the vaccine is now frequently available to people who write about national political issues.

This is not nearly good enough.

The experience of getting the vaccine is still abominable, particularly for Americans who are characteristically underserved by the healthcare economy and by government. Sites which take appointments still run out of them in minutes (and they’re routinely canceled when promised shipments fail to arrive on time). The government does not specialize in conversion optimization and it shows. Primary healthcare providers are often in the dark about the vaccination effort; patients often assume (wrongly) that they will be contacted when it is “their turn.”

We still have to vaccinate half the country. We have to bring the technology industry’s expertise to bear on the conversion problem. It can’t be harder to get vaccinated than it is to buy something on Amazon or sign up for Facebook.

We have to decrease the barriers to vaccination and deliver this everywhere throughout America. We have to publicize walk-in sites which are available outside of work hours. We have to surface the many vaccines available in underserved neighborhoods which are not on official maps, sometimes due to poor organization and sometimes, God help us all, as a considered policy choice.

We have to move faster. Most American lives which will be saved by the vaccination campaign have already been saved, due to the intensely higher risk among senior citizens. That should not cause a decrease in urgency. Many who are not yet vaccinated will die or suffer brutal (and sometimes lingering) bouts of covid-19. The longer it takes to approach herd immunity, the more disruption to daily life and the economy we will suffer.

Perhaps most importantly, the longer America struggles with covid-19, the later we will make the choice to help others. India is currently dealing with a prompt national healthcare crisis. Lower-income countries are looking at multi-year timelines until they can be vaccinated. Even relatively well-off nations like Japan and much of the EU are months behind the US on the vaccination curve. The faster we accelerate our efforts, the faster the best parts of the US vaccination response get to work on saving lives elsewhere.

Every day matters. Every dose matters.

You can help

We’re currently hard at work expanding our national data set to the coverage levels of our California data set. Our rate-limiting step is ingesting new vaccination locations for contact. They’re scattered across literally thousands of URLs using hundreds of CMSes/data architectures/etc.

If you can write scrapers, you can turn some of those URLs into structured data to feed our pipeline. We’ll perform a concordance (“deduping, with style”) with our other data sources, put them into the queue for validation, and publish them to our site and our partners. For more detail, see GitHub.
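
As a rough illustration of that concordance step (the field names and matching logic are simplified stand-ins, not our production schema), the core idea is to normalize each record into a comparable key so that the same physical location reported by several sources collapses into one entry:

```python
# Toy "concordance" pass: collapse records from multiple sources that describe
# the same physical location. Field names here are illustrative only.
import re

def normalize_key(record):
    """Build a crude matching key from name + address."""
    name = re.sub(r"[^a-z0-9]", "", record["name"].lower())
    addr = re.sub(r"[^a-z0-9]", "", record["address"].lower())
    return (name, addr)

def concord(records):
    merged = {}
    for rec in records:
        key = normalize_key(rec)
        # Merge non-empty fields; later sources fill gaps or overwrite.
        merged.setdefault(key, {}).update({k: v for k, v in rec.items() if v})
    return list(merged.values())

if __name__ == "__main__":
    sources = [
        {"name": "CVS Pharmacy #123", "address": "1 Main St, Anytown CA", "phone": ""},
        {"name": "CVS Pharmacy #123", "address": "1 Main St., Anytown, CA", "phone": "555-0100"},
    ]
    print(concord(sources))  # one merged record, not two
```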

We’ve got a small team here working through things, and will probably cover 95%+ of sites in the nation within 4-6 weeks even without help, but a few hours of your engineering time can help accelerate the work. It is a high leverage way to help your community quickly emerge from this crisis. We welcome programmers of all skill levels; this is not rocket science.

We are also hiring a few mid-career engineers. This is not the typical startup gig; you’ll get no equity, your employer will burn through their funding as quickly as possible, our TAM is declining by millions daily, and successful execution in your job duties will see all of us unemployed within months. If that sounds interesting, drop us a line and we’ll get back to you in the next few days as the dust settles.

We also have opportunities for volunteering if you’d like to help call pharmacies, join the web bankers, or work on other tasks.


