Saturday, July 23, 2022

How to think about task estimation

This is hopefully the first in a “series” of posts about estimation. Each post will be reasonably standalone, and I’m not building to any specific conclusion, but it will be an ongoing theme for a while.

“How long will this take?” is an extremely common question, and is equally commonly dreaded.

It’s true across all walks of life - both at work and at home - but personally I know this mostly from software development, where it’s a normal part of the job. Also it almost always sucks. Most developers I talk to hate estimation. As well they should - estimation as commonly practised is normally a poorly implemented waste of time.

The core problem is that nobody has ever really told you what an estimate is, or how to think about estimation. As a result, this question makes as much sense as “Can you florble the grobnab for me?”, and you do your best to emulate what you’ve seen other people do without ever knowing what a grobnab is and why one might want to florble it.

This is made worse by the fact that there’s really quite a lot to estimation. Doing estimation perfectly requires a detailed understanding of the problem, a solid intuition for probability and statistics, and carefully navigating a whole bunch of uncomfortable feelings and complicated social pressures in order to make good trade offs.

I sometimes joke “Estimation is easy! All you need to understand is software development, statistics, politics, therapy, negotiation, and business, and then estimation makes perfect sense. What’s the problem? Oh also your manager needs to understand all of those too.”

Fortunately you don’t actually need to be able to estimate perfectly in order for it to be useful. You’re already doing estimation and while there’s some debate as to how useful that is, it is still more useful than not doing it, and there are a number of easy ways to improve on it that I hope to teach you in this and future posts.

In this post in particular, I’m going to talk a little bit about what it actually means to florble a grobnab, sorry, estimate a task. The core problem that we’re going to talk about is that estimation (deliberately, and correctly) combines two quite different things: how large a task is and how uncertain you are about that. Depending on what problem you’re actually trying to solve, you need to combine these in different ways.

The big problem with task estimation is that it’s not actually one thing. You’re being asked “How long will this take?”, and expected to give a single number as an answer, but the truth is there is no single answer. For example, consider the questions “How long will this typically take?” and “How long might this take if things don’t go as smoothly as expected?”. These obviously have different answers.

The software developer defence is that every software development task is doing something new, but this isn’t really true. Most software development is actually fairly routine, and the problem isn’t specific to software development. Any big project has uncertainty in it - e.g. physical building projects are just as (if not more) prone to estimates being wildly off, even when the task in question is fairly routine.

So the first answer to “How long will this take?” is really “Why do you want to know?”, and depending on what the answer to that question is, you’ll give a very different answer to the first question.

Most of the time when you’re using estimates, you’re trying to answer questions like the following:

  1. How long do we expect this task to take?

  2. Will this task get done by a given deadline?

These sound like almost the same question, but they’re actually very different, because they respond to uncertainty, and to changes in the task, differently.

If we want to know how long we expect the task to take, what we’re really asking is “how much on average does this task cost?”. This allows us to decide whether the task is worth doing. If you do things that on average make you more revenue than they cost, you make a profit. That’s business, that is.

For personal projects it’s less formally about profit, but there’s still that same sort of underlying cost-benefit analysis - e.g. a home renovation project might be clearly worth it if you expect it to take a couple of days, and clearly not worth it if you expect it to take a couple of months.

In particular, you can often ignore or discount unlikely events for this. If there’s an outside chance that the task will blow up and prove much harder than you expect, and you’ll only find this out a day or two into the project, that’s probably OK - you can just abandon the project at that point! Rare but recoverable risks don’t typically factor into cost planning all that much.

When you ask whether something is going to get done by a deadline, on the other hand, you’ve already decided it’s worth doing. What you’re interested in is instead mostly the uncertainty, and this is what’s often ignored (to everyone’s detriment).

People assume that if you estimate that a task will take 10 days, and the deadline is 12 days away, that means everything is fine and you don’t need to worry about it, but this isn’t true at all. A task might typically take 10 days, but if something unlikely but not terribly rare comes up in the middle of it, it might blow up and take twice that. What you are interested in here is not how long the task will typically take, but something approaching a worst case scenario for the task.

In order to get more of a feel for the problems with estimation, consider the following toy problem: Flip a coin. If it’s heads, you’re done. If it’s tails, you repeat the task tomorrow. You will keep doing this until you get heads.

This isn’t a very good model of work - on the one hand it’s too tidy, and on the other it has more uncertainty than is found in most tasks. However, these two features make it a good model for understanding some problems of estimation that do occur in real work.

If this model is too abstract, think of the coin flip as representing some sort of research. You spend a day trying to figure out if a particular approach, a particular vendor, etc. will work for your problem. Once you’ve got the right approach, the task is easy, but you don’t yet know what the right approach is, and all of the hard work is in figuring that out.

Now, what is your estimate for how long this task will take? In fact, let’s make this easy. Which of the following three is the best estimate?

  1. Zero days (just the time to toss the coin, which is basically free)

  2. One day

  3. A week

Trick question, sorry, these are all perfectly reasonable answers, and you can’t know which one is right until we ask that crucial question: Why do you want to know?

It will take 0 days far more often than any other length of time - it will take 0 days half the time, one day a quarter of the time, two days an eighth of the time, etc. So 0 days is the most likely time for it to take (this estimate is called the “mode” in statistics). This is almost never what you want.

It will take less than a day half the time and a day or more the other half, so one day is a median (strictly, any value between zero and one day splits the outcomes in half). Also, if you did this many times and averaged how long it took each time, you would expect that average to be about one (this estimate is called the mean. It happens to coincide with that median here, but in general it’s not likely to).

Why a week? Well, because 99% of the time it will be done within seven tosses, which is a week counting today, but within six tosses only about 98.4% of the time (this estimate is called the 99th percentile, or 99%-ile for short), so a week is the number you can truthfully say you’re 99% confident you can get it done in.
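If you want to check these numbers yourself, here is a minimal Python sketch of the coin model. It assumes one particular way of counting: the duration is the number of tails you toss, so heads today counts as 0 days. The function names are made up for illustration. Under this convention the 99%-ile comes out at 6 days after today, which is the seventh toss and so the seventh calendar day.

```python
from fractions import Fraction


def prob_done_within(days: int) -> Fraction:
    """P(the task is finished at most `days` days after today).

    Finishing within `days` days means at least one heads in the first
    `days + 1` tosses, i.e. 1 - (1/2)^(days + 1).
    """
    if days < 0:
        return Fraction(0)
    return 1 - Fraction(1, 2) ** (days + 1)


def percentile(p: Fraction) -> int:
    """Smallest number of days d with P(duration <= d) >= p."""
    d = 0
    while prob_done_within(d) < p:
        d += 1
    return d


def truncated_mean(max_days: int = 200) -> float:
    """Mean duration, summing d * P(duration == d) over a long finite range."""
    total = sum(d * Fraction(1, 2) ** (d + 1) for d in range(max_days + 1))
    return float(total)


# Mode: P(duration == d) halves with every extra day, so 0 days is most likely.
print("P(0 days):", prob_done_within(0))              # 1/2
print("P(1 day or more):", 1 - prob_done_within(0))   # 1/2
print("mean (approx):", truncated_mean())
print("99%-ile in days after today:", percentile(Fraction(99, 100)))
print("P(done within 5 days):", float(prob_done_within(5)))
print("P(done within 6 days):", float(prob_done_within(6)))
```

Exact fractions are used so that the comparison against 99% isn’t muddied by floating point rounding. None of this is needed for real tasks, where you don’t know the distribution this precisely.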

So which of these estimates is the best estimate? Well, why do you want to know?

So suppose you’re going to get rewarded $10,000 for getting a heads on that coin - that’s a pretty good deal for an average of one day of work, right? So it’s a task worth doing.

In contrast, the average cost tells you almost nothing about whether you’re going to hit a deadline. Say we have a deadline for two days from now - someone who is told that the task will take a day might reasonably assume that this means that you’ll hit the deadline no problem because of advanced mathematical facts such as 1 < 2. But in fact, all that’s required to miss the deadline is to toss tails three times in a row. This happens one time in eight, or a bit over 10% of the time. If you were to do this once a week, you’d expect to miss the deadline (on average!) 6.5 times per year. Depending on how much is riding on hitting those deadlines, this may or may not be a big deal.
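The arithmetic here is easy to reproduce. The following is a small Python sketch, assuming a deadline of N days from now means you get tosses today and on each of the next N days, and that you run the task 52 times a year. Both function names are hypothetical.

```python
def prob_miss_deadline(deadline_days: int) -> float:
    """P(still not done after the toss on day `deadline_days`).

    Missing means tails on every toss from today (day 0) through the
    deadline day, which is deadline_days + 1 tosses in a row.
    """
    return 0.5 ** (deadline_days + 1)


def expected_misses_per_year(deadline_days: int, runs_per_year: int = 52) -> float:
    """Average number of missed deadlines if you repeat the task all year."""
    return runs_per_year * prob_miss_deadline(deadline_days)


print(prob_miss_deadline(2))        # 0.125, i.e. one time in eight
print(expected_misses_per_year(2))  # 6.5
```

Changing `deadline_days` shows how quickly the risk falls as the deadline moves out, which is the information the single “one day” estimate hides.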

In contrast, having the 99%-ile is reasonably useful for this question: Will we make that deadline? Well, we might, but we’re certainly not overwhelmingly confident.

In fact none of these estimates is really the right tool for this question. What we actually want to know is this: How likely are we to make our deadline? More on that in a future post.

In the other direction, the 99%-ile is almost useless for estimating whether a task is worth doing. Sure this task will sometimes take that long, but that’s not representative of its typical cost, which is almost 7 times lower. The 99%-ile is a very cautious estimate, and if you use that then you will typically be leaving money on the table.

This is why the most important question when estimating is “Why do you want to know?”, and any attempt to give an estimate in the absence of that is meaningless: An estimate must take into account both the size of the task, and your uncertainty around that size, and how you combine those two depends entirely on what problem you’re trying to solve.

The task of the previous section is, of course, a toy. Real tasks don’t actually work like that, because you don’t have nearly as clearly defined an understanding of the uncertainty in them.

As a result, you sadly can’t be quite as precise in your estimates as we were in our coin tossing model. The coin tossing model is the best case scenario for how accurate our estimates can be, and most real world estimates will be even harder than that.

I’ll have to leave a detailed toolkit for dealing with this for later articles, but rather than leave you completely in the dark, let me provide you with some useful rules of thumb.

The first is that you can estimate the 99%-ile as follows: imagine that the task goes about as badly as you’ve seen a task go in the last couple of years. Think back to recentish horror stories and ask “What went wrong there and how long would this take if that went wrong here?”. If you’ve had a long career replete with horror stories, don’t necessarily think of this as the worst problem you’ve seen ever - that will tend to make your estimate more pessimistic than you want - last couple of years is enough. The number you get here is your estimate for the 99%-ile.

Estimating the 99%-ile is often enough. If you’re just trying to figure out whether you can meet a deadline, you can stop here. If your 99%-ile takes you past the deadline, you’ll probably miss the deadline (hopefully I’ll have more advice about what to do then in a future article!). If your 99%-ile has you done before the deadline, you can expect to be fine.

Another reason you might stop here is that if the task is still worth doing if it takes as long as your 99%-ile, the task is obviously worth doing and you might as well just do it. You might need a better estimate for planning purposes, or you might just want to go ahead and get it done. It’s up to you.

You can usually get the median (the number that it will take less than about half the time and more than about half the time) by asking the question “How long will this typically take?” - these are probably the estimates you’re already giving, and they’re probably pretty good as medians, it’s just that the median is rarely what you want.

Once you have these, if you want to do a mean estimate you can. There’s a technique called three point estimation that lets you take your median and 99%-ile (and a best case estimate) and gives you a mean.
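As a rough illustration, here is a Python sketch of the two formulas most commonly given for three point estimation: the PERT weighted average and the plain triangular average. Which one you would use is my assumption, not something this post prescribes. Both formulas are normally stated in terms of the most likely value, so plugging in the median is an approximation. The example numbers (2, 4 and 10 days) are invented.

```python
def pert_mean(best_case: float, typical: float, worst_case: float) -> float:
    """PERT-style weighted mean: (a + 4m + b) / 6.

    `typical` stands in for the most likely value; using a median here
    is an approximation, not part of the standard formula.
    """
    if not best_case <= typical <= worst_case:
        raise ValueError("expected best_case <= typical <= worst_case")
    return (best_case + 4 * typical + worst_case) / 6


def triangular_mean(best_case: float, typical: float, worst_case: float) -> float:
    """Mean of a triangular distribution: (a + m + b) / 3."""
    if not best_case <= typical <= worst_case:
        raise ValueError("expected best_case <= typical <= worst_case")
    return (best_case + typical + worst_case) / 3


# Invented example: best case 2 days, typically 4 days, 99%-ile 10 days.
print(round(pert_mean(2, 4, 10), 2))
print(round(triangular_mean(2, 4, 10), 2))
```

Both results sit above the typical value, because the long tail towards the worst case drags the mean up. The triangular version gives the extremes more weight than the PERT version does.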

I like three point estimation as an idea, but I confess I’ve never actually used it. Partly this is because I learned about it after it was directly useful to me, and partly because I don’t think mean estimation is quite useful enough to advocate for it. Estimating the mean isn’t usually what you want. It’s very useful for asking if a task is worth doing, but generally speaking there’s not much uncertainty as to whether a task is worth doing - either it’s obviously worth doing, obviously not worth doing, or you’ll put it behind the huge backlog of more plausibly worthwhile tasks and only get to it when you’ve completed all the higher priority tasks.

That being said, if you’re doing sprint planning, and you absolutely have to do it through estimating individual tasks (I’m hoping to propose a better solution later, although ideally I’d like to play test it with some people first), give three point estimates a try. The mean is the only type of estimate that makes any sort of sense for this problem. Unfortunately the sense it makes is “least bad”.

The correct answer to “How large is this task?” is “Why do you want to know?”. Unfortunately, this is relatively rarely the welcome answer. So let me leave you here with an answer that will be better received.

First, figure out the 99%-ile. Practice this a bit beforehand so this is easy, and err on the side of overestimating if necessary. For example, suppose that number is 10 days.

Now your answer is something along the lines of the following: “It might take as long as 10 days. It’s probably less than that, but I’ll need an hour or so to think it through to be sure, and we might need to talk through some more of the details”.

Chances are, that answer is good enough if you’re just being asked for an off-the-cuff estimate. It gives a number that is good enough for a quick go/no-go decision, and makes the work of proper estimation explicit. It’s not good enough for detailed estimation, but it gets you out of the most stressful estimation situation (doing it under instant time pressure) and gives you space to think through the problem properly.

How do you think through the problem properly? Ah, well, that’s a topic for another time I’m afraid.

Do you like learning about this sort of thing? Why not learn about it from me directly! I’ve started offering various courses on the sorts of skills you need as a software developer, including one on estimation. You can learn more about them at consulting.drmaciver.com/courses.

I’m also available for a wide variety of other consulting and coaching services for software companies. Have a read through the consulting site and/or drop me an email at david@drmaciver.com if you want to know more.


The cover image is a scrum task board, taken by Flickr user Logan Ingalls and released under Attribution 2.0 Generic (CC BY 2.0)



