Thoughts on software estimation
1. Why do we estimate?
Estimations are part of our everyday life. We estimate how long the bus ride to work will be so that we know what time we should wake up. Drivers estimate how big a spot by the curb is to decide if they can park there or not. Engineers estimate how much time they need to get a building done so that they can charge the clients, pay the builders, and so on.
So, looking at these examples, I’d say that we estimate to predict something that could usually be measured (time, distance, size, etc.) without actually having to measure it. We predict either because measuring isn’t worth the effort, or because by the time we finished measuring, the knowledge would no longer be useful.
With this prediction in hand we are able to make decisions. So I think that we can say that we estimate stuff to plan our actions.
2. Estimating projects
Let’s take the engineer as an example. I am not an engineer, but since my father is a builder, I think I can say a few things about it. When an engineer has to estimate a building project, the differences between one project and another will probably boil down to the materials, the size, the number of builders working on it, and a few more variables that are likely to be easy to quantify. Once the engineer has all the information he needs, it’s just a matter of calculation. There aren’t many different ways of building a wall or a pillar, after all. Of course, it is an estimation, so it may still prove wrong. That’s why it’s called an estimation and not a vision-from-my-crystal-ball.
Now, in software development we also need to estimate how long it will take, how many developers/QAs we will need, and so on, but the tricky thing is: how do we measure software? We don’t have materials, area, size. What units are going to be used as input for our calculations to come up with a reliable estimation?
Well, there aren’t any. Software development, unlike classical engineering, isn’t measurable, because it’s intellectual and creative work. Writing software for a bank is not the same thing as writing a mobile phone game. While we work on a development project we learn about the client’s domain, their needs, their particularities. And each client has their own details, even when they belong to the same type of industry. This makes every development project a learning project. If we are still learning something, we don’t really have the knowledge until the work is done, so our estimations won’t be very accurate.
Not to mention the differences in technology, in the knowledge of the team members, and in how we deal with the client’s people to gather requirements and get completed functionality approved.
If you ever hear anyone saying that software development is an exact science, tell them: “Lies!”. It’s a lot about learning, finding different solutions for different problems and *a lot* of dealing with people, not so similar to the classical build-my-house project.
3. So what do we do?
Well, that’s a good question, since we can’t really say “we won’t estimate this software project”. We still need to plan it, and as we agreed, estimations are needed for planning. So, where I come from, we don’t estimate how big a piece of software is, nor how long it’s going to take us to write it. We estimate how complex it seems to be, and most importantly, these estimations are relative, not absolute. We split the work into “stories”, small and independent pieces of work, and assign each story a number of “story points”, a completely abstract unit of measurement.
If I tell you that I will need 4 hours to wash your car, this is an absolute estimation, since 4 hours is a fixed amount of something that we all know how to measure and have a common understanding of (hours). Now, if we agree that a given piece of functionality for your software is 2 story points, that estimation is relative: it means the piece is twice as complex to write as another piece of functionality that is 1 story point. But a “story point” on its own doesn’t really mean anything.
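To make the “relative, not absolute” idea concrete, here is a minimal sketch in Python. The story names and point values are made up for illustration, and the two helper functions are hypothetical. It only shows that what story points carry is the ratio between stories: put the whole backlog on a different, equally arbitrary scale and the ratios stay the same.

```python
from fractions import Fraction

# Hypothetical backlog: story names and point values are invented for illustration.
backlog = {"login form": 1, "password reset": 2, "audit report": 5}


def relative_complexity(points, story, reference):
    """Return how many times more complex `story` is than `reference`."""
    return Fraction(points[story], points[reference])


def rescale(points, factor):
    """Express the same backlog on a different, equally arbitrary scale."""
    return {name: value * factor for name, value in points.items()}


ratio = relative_complexity(backlog, "password reset", "login form")
scaled_ratio = relative_complexity(rescale(backlog, 10), "password reset", "login form")

# The ratio is 2 on either scale: the unit itself carries no meaning.
print(ratio, scaled_ratio)
```

The absolute numbers (1 and 2, or 10 and 20) are interchangeable. Only the proportion between them is information.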
It’s true that there is a statistical correlation between story points and time, as we can see on the chart below, but how much time a story point is worth on average varies quite a lot depending on the team, the project, the domain, the technology and so on. The relative proportions between the points still hold, though.
4. Estimating software
So now that we can measure the effort to write software using its complexity in story points, we can start estimating our work. As we already agreed, development is a lot about learning, so we can infer that the longer we work on a project with the same team and environment, the more accurate our estimations will be, since we will know more about that project’s details. That’s why we ideally avoid estimating a whole project at once at the beginning. It’s better to estimate it in parts, for instance by estimating a bucket of stories for each planned release or each planned iteration.
After a few iterations we can get a grasp of the average number of story points the team can complete in a given amount of time. It’s important to note that a change in any of these variables (team, project or environment) can and will affect this correlation. Once we have this average, we can start predicting how long it will take us to complete a big set of stories.
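As a rough sketch of that arithmetic, assuming we track the points completed per iteration: the function names and the numbers below are hypothetical, and a plain average is only one of several ways a team could summarize its history.

```python
import math
from statistics import mean


def average_velocity(completed_points):
    """Average number of story points completed per iteration."""
    return mean(completed_points)


def iterations_needed(remaining_points, velocity):
    """Whole iterations needed to finish the remaining points at this velocity."""
    if velocity <= 0:
        raise ValueError("velocity must be positive")
    return math.ceil(remaining_points / velocity)


# Hypothetical history: points completed in the last four iterations.
history = [18, 22, 20, 20]
velocity = average_velocity(history)         # (18 + 22 + 20 + 20) / 4 = 20
forecast = iterations_needed(130, velocity)  # ceil(130 / 20) = 7

print(velocity, forecast)
```

The forecast only holds while the team, the project and the environment stay the same. If any of them changes, the history should be collected again before the average is trusted.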
Now, that sounds very good in theory, but in real life, depending on the contract and the relationship with the client, we usually need to give our clients an estimation before we actually start working. So… what do we do? This has to be the billion-dollar question of software development lately. And the more I talk to people about it, the more it seems like there’s no satisfying answer.
When the client is known and there is already a strong relationship between the parties, it is easier to get a flexible contract, usually in a time-and-materials fashion. That allows us to adjust the correlation between complexity and time as the project advances, and to negotiate deadline, scope and team size so that we deliver whatever the client needs the most as soon as we can. This really requires mutual trust and collaboration, which is what agile is all about. But the world is not 100% agile, just as humans are not 100% trustworthy, so in my experience these contracts are not the most common out there, especially when we are trying to win a new client.
At the other end of the rope lies the least flexible type of contract: fixed price and fixed scope. The client tells you “here’s what I need”, and you tell them “I need X months to get that done, and it will cost you Y money units”. Needless to say, there’s a big risk involved in this type of contract. If the estimation is too low, either you will need to work overtime to get the work done or the deadline won’t be met. In the worst cases, both happen. If the estimation is too high, the client may prefer a cheaper potential partner and leave you. If they do accept your terms, there will probably be a big waste of money on the client’s part, and of time on both sides (see Parkinson’s Law). Not to mention the requirement changes that, regardless of what the client says, will come up during development.
Most of the other types of contracts I have seen lie somewhere between these two or are a mix of them. One example is offering the potential client a couple of iterations to develop a prototype, for free. Once the prototype is ready, if the client likes what they see, the team will know much more and can produce more reliable estimations for a fixed-price contract. Another option is a contract with a fixed deadline but variable scope, where the team commits to delivering at least X story points. Or one with a fixed scope but a variable deadline, depending on what’s more important for that client.
6. Managing estimations
That said, I have had many discussions with colleagues about how to manage existing estimations, mostly with regard to whether we should re-estimate stories or not.
This can become a rather heated debate, but my opinion is that we should try to avoid re-estimating stories unless we’re really far off.
I really like Mike Cohn’s take on the subject. I don’t think I can explain it as well as he does, so just take a look at it!
The basic idea is that we have knowledge-before-the-fact and knowledge-after-the-fact, and we shouldn’t mix them on our backlog, since we will need a normalized set of data to plan our future work on.
The problem arises when we use estimations not only for planning but also, because of the contract type, for charging the client. In that case, leaving the estimations untouched may not be an option. If the estimations are not tied to the project costs, informing the client that a given story will take longer than planned may suffice. Since the development is already in progress at that point, an estimation in time units may even be accurate enough, and more useful.
In my opinion, software estimation techniques are quite fair nowadays. The problem is not how we estimate software but how we charge our clients.
Estimations are called estimations for a reason: they are not supposed to be the truth written in stone, and contracts based on them are quite risky.
I have the impression that once we all share a common understanding that software development is not like a classical engineering project, many things will become simpler, and we will stop feeling that we work in the software-estimation industry instead of the software-development one (at least I feel like that at times).
The engineering part of our work is done by the compilers and interpreters, not by the developers.