Jul 31, 2020

Generally speaking, when a software development organization says that they’re going agile or practicing agile development, they mean they’re following the Scrum methodology. Most jokes about doing software development have nothing to do with the idiosyncrasies of programming itself (Gary Bernhardt’s classic “Wat” presentation notwithstanding), but rather are about dealing with Scrum. I get the frustration. As fun as it is to make fun of Scrum (and I’ve made my fair share of jokes about it), I really do think it’s a good development process (when done well), and that we would do well to keep that in mind. So having said that, here’s my attempt to defend Scrum.

The most common response to people complaining about Scrum is that everyone’s bad experiences with the process stem from Scrum not being done right. For people that like Scrum as a process, it’s a reflex, and for people that don’t like Scrum, it’s a cliche. The biggest problem that Scrum has is that it has very distinctive “feature checkboxes” that make it easy for people to say they’re doing Scrum without coming close to doing Scrum well. Combine this with the fact that Scrum (and other agile processes) allow and even encourage teams to do things “their way,” and not only is it easy for a team to implement the process badly, but it’s easy to implement it badly while convincing themselves they’re optimizing it for their organization.

The best ways to screw up Scrum

I want to spend a little bit here on places where I think Scrum organizations go wrong. Obviously I’m not a Scrum master (certified or otherwise, but the fact that you have to take a class to get this right is certainly not a point in Scrum’s favor), so my point of view is limited, although I do try to keep in mind the various concerns being balanced when decisions get made. That said, if you look at the textbook example of Scrum, it’s easy to find people doing things very differently without a clear evolutionary chain from doing standard Scrum to that point. I’ve seen organizations start this far off the defined path by immediately declaring that’s what works best for them.

The most egregious example of “Scrum done wrong” is just arbitrarily declaring that you’re a Scrum organization because you have a standup every morning where you give status updates. I’ve never worked somewhere that was that bad, but it’s a good illustration of a company that’s basically cargo-culting Scrum because the big companies do it. There’s really no salvaging this situation; just bail as soon as you can.

A particularly blatant example of this is when the Scrum “team” isn’t enough to build and ship self-contained code. It’s one thing if you’re a services-based organization and work involves external (to your team) services – software isn’t written in a bubble. But the biggest offender that I’ve seen is when a team can’t go end-to-end with writing, testing, deploying, and supporting software that’s entirely inside their scope. One of Scrum’s defining features is supposed to be the cross-functional teams, which means the team should be able to do anything they need to do to build and run their part of the organization’s larger codebase. That’s usually one of the first things to get dropped when companies decide to start moving to Scrum. They have their silos, and while they may be willing to have QA mix with the developers, operations are a separate thing, and there will be no mingling on that front. I actually have a personal conspiracy theory that the whole DevOps movement is really an attempt to fix bad Scrum organizations. Scrum teams should be able to perform the entire software development process within their area of responsibility, up to and including deployments.

One mistake I’ve seen people make while claiming to be “agile” is adding tickets to the sprint mid-sprint (for simplicity, everything I’m about to say on this topic applies to significantly changing a ticket mid-sprint too). That’s a huge no-no. The whole point of the sprint structure is that you have frequent periods where you can re-evaluate your priorities. You can deal with the changed priorities and new work then, and not before. Once you decide what things you’re going to do during a sprint and commit to the sprint, you’re locked in until the end of that sprint. The only exception to this rule is if something literally blows up in production. There are still rules in “agile,” and it’s up to the team (and its direct management) to enforce them.

Another common abuse of Scrum is skipping parts because you don’t feel like doing them. Generally, this takes the form of skipping the various “ceremonies,” like retrospectives or sprint planning. Other common variations of this are doing them so rarely that you might as well not be doing them at all, or adding a few minutes to standup and calling it the “ceremony.” First of all, call them what they are – meetings. Secondly, they’re in the process for a reason. Suck it up and do them. Looking at the list of meetings per sprint may be frustrating (it certainly feels like a massive time drain that could have been spent coding), but I’ve never heard of people saying they work somewhere that did Scrum well while skipping steps. The point of the meetings is to keep everyone on the same page (daily Scrum, sprint planning, grooming, and demos), make sure any questions about the work that needs doing get answered (grooming, sprint planning), and to tighten the specifics of your company’s process and improve (retrospectives). For some reason, retrospectives always seem to be the first to go. I find it odd that a software development team would adopt a methodology that’s all about a tight feedback loop from users while simultaneously deliberately refusing to take advantage of that same benefit internally. I hate meetings as much as anybody else, but grooming meetings are requirements meetings, and not taking those seriously causes problems later. In the end you’re going to have to make the time to do some actual planning and scoping. Scrum may emphasize keeping the planning short-term and the scoping small, but you have to actually plan and scope.

A subtle sign that an organization isn’t really on board with the whole Scrum thing is that there’s no interest in a faster shipping cadence, at least among the actual development teams. If you’re doing Scrum correctly, then a story being “done” means that it is ready to go into production. If you’re keeping sprints short, that means you’re going to have stuff that’s user-ready on a pretty regular basis. As a friendly reminder, features that are merged in and ready to ship, but aren’t in the hands of your users, are features that are not making you money, not closing sales, and not retaining customers. In fact, unshipped features might as well never have been made in the first place. Seeing as it feels really bad to build something and have nobody ever use it, at the very least the team responsible for building and running the software should be pushing to get things out the door and to where their hard work can pay off.

In places that are starting out with Scrum, there’s likely cultural friction to shipping early, often, and before every possible facet of something is complete and polished, which is why I prefer to focus on the attitudes of the teams actually doing the work. Changing the minds of management and overall company culture takes time, and a lot of work to help show that risks can be mitigated. That means convincing them of the counter-intuitive fact that if their deployments are brittle, high-stress, high-risk affairs, then it’s because they’re not deploying enough. It means spending time building the automation, tooling, and test coverage to support frequent deployments. That means tickets for work that isn’t user-facing, and that is harder to demo for the business side of things. Basically the only thing stopping you from shipping features as soon as they’re done (or at least at the end of every sprint) should be a business decision (e.g. you want to do a big marketing blitz to announce the change, launch it at a keynote, etc.), rather than a technical decision.

Continuing on from that last point, an apparent indication of an organization doing Scrum wrong is that they ignore technical debt and infrastructure work. The basic idea here is that because product owners are in charge of prioritizing the backlog and the work that needs to be done, anything that isn’t a user-facing feature gets ignored in favor of stuff that demos well (either internally or when done by the sales team). I’m going to be honest here – this isn’t a failure of Scrum, or even the organization’s way of doing Scrum. This issue is on the developers for failing to make the business case for why the unsexy work needs to be done, even if it means putting features on the backburner. Technical and infrastructure work is valuable to the business – it enables your software to handle higher usage without having to spend more on servers, makes your software more reliable so you have less downtime (or enables your services to better handle external dependencies being down so you don’t have to worry about them so much), or gives you the ability to build new features faster. All of these are good, and increase the amount of money your software can bring to an organization, but in general we’re really bad about articulating that in business terms. As a result, the work sinks to the bottom of the backlog, never to see an active sprint.

When Scrum is done right

Now that we’ve gone over the stuff you’re likely to see when people aren’t really doing Scrum, here are the things that I usually see when it’s done well. The first is that they’re going through all the meetings – daily standups, backlog grooming, sprint planning, retrospectives, and demos. Not only that, but they’re going through the full set for every single sprint. I know I went off earlier about skipping these being a sign that Scrum isn’t going to work, but having worked at places where some of the meetings were skipped and at places where they scheduled everything, things actually seemed to run more smoothly at the places with the extra meetings. It turns out that a large part of making Scrum work is communication and rapid feedback, and those meetings are designed to encourage those very things.

The next indication that Scrum’s working out is the team’s response to tickets that are initially slated to be a lot of points (and by “a lot of points,” I mean “more than 5”). In those situations, the team immediately starts looking at ways to break the ticket up into a bunch of smaller tickets with a more limited scope. In fact, the easiest work always seems to be the tickets with the longest description of the acceptance criteria covering the smallest possible amount of work. That’s not an accident. You have a handy, extensive list of test cases (that you can hopefully automate while you’re coding), and you’re keeping the scope of your changes small. It can make the tickets seem tedious, but there’s no ambiguity (which is particularly nice when the ticket moves through development to QA and then the final work is seen by the product owner responsible for creating it), and you don’t have to figure out requirements on the fly because something came up as you were writing the code.

Where daily stand-up meetings are the hallmark of an organization that’s at least paying lip service to Scrum, the defining trait of a team that’s actually doing Scrum well is that they hit their commitments consistently. Whatever your opinions on burndown charts, velocities, pointing, etc., at the end of the day Scrum is about regularly making realistic short-term goals and hitting them sprint after sprint. That makes sense – Scrum is a way of managing software development, so it stands to reason that if it works the way it’s supposed to then things get done. You can’t argue with results, and when a team is regularly getting results at a sustainable pace, there’s no arguing with how they do things. Good Scrum gets things done.

Speaking of holding all of the meetings and getting things done, in good Scrum organizations, those retrospectives that you’re supposed to have (and people like to drop at the first excuse) produce actual work designed to make the process of writing software smoother and more reliable. Not only are they coming up with things that can make shipping software better (or help them ship better software), but they’re actually doing them. That means the stuff that’s being discussed at retrospectives is specific, and there’s a way to determine whether or not it worked, and how well. I’ve seen places where retrospectives were rare (at best), and the stuff mentioned in there seemed more like filler because you’re having the meeting so you need to put something in that part of the Confluence template. Here’s an example to illustrate the difference – in a company that’s not doing Scrum well, you get infrequent retrospectives that lead to things you can do better like “Improve communication between development and QA to help the testing process.” That’s meaningless because anything could count as “doing it,” and there’s no clear indication of whether it helped or not. Here’s a better run at the same thing: “To reduce the back-and-forth when QA starts testing a story, please be sure to include a URL for the build being tested, as well as any details needed to look up test data needed to test the story, in the ticket before indicating that the story is ready to be tested.” There are specific things to do (put the URL of a deployed instance with your changes in the ticket, and make sure to include anything else QA would need to be able to find test data in the system), and something everyone can observe to confirm that things got better (there are fewer questions and messages/emails between the developer who made the change and the person testing the change). And since retrospectives are being held at the end of every sprint, everyone will know in a few weeks if that worked.

The last sign I’ve noticed in jobs that were doing Scrum well is that they kept sprints short. Sprints are supposed to be 4 weeks max, and 2 weeks is the standard. 2 weeks should probably be the real maximum sprint length. Anything longer than that and you lose the motivation to keep stories and tasks limited in scope. Remember, one of the biggest selling points of Scrum is that you work in small chunks and iterate rapidly. The longer you drag sprints out, the less rapidly you’re iterating. You also lose the benefits that came with having to make 2-week sprints work – every piece of work needed to be small and limited in scope, which makes it easy to review, easy to test, and less likely to trigger merge conflicts, which can cause unintentional bugs if you aren’t careful. Smaller-scoped tickets mean that the actual changes are likely to be done in shorter amounts of time, giving you the time to do things like write tests for the stories you worked on. Those help prevent regressions and catch unintentional side effects at the very beginning of the development process, which helps speed up your ability to ship new and useful things because you’re not stopping to go back and fix things that you thought worked last week.

I mentioned that I thought 2-week sprints should be as long as it gets – I’ve talked to developers that have worked in 1-week sprint shops, and while I haven’t had a chance to do that, it does sound like an impressive way of doing things. Basically the flow is something like this: Development is done on Monday, maybe stretching into Tuesday morning. By the time you’re done with lunch on Tuesday, all the tickets have an associated pull request. Peer reviews, and maybe even peer testing, happen Tuesday afternoon, and wrap up no later than Wednesday. Wednesday you do your sprint planning or backlog grooming, and everything’s in test by that afternoon. Thursday and Friday handle testing, and you’re done and ready to release at the end of the week. What are developers doing from Tuesday morning on, you may ask? Well, aside from following up on comments on the pull requests or from testing, they’re also looking at the first draft of stories in the backlog. Remember, you basically have one workday to get everything in, so stories need to be at a bare minimum of scope, and the list of things that need to be done to fulfill the acceptance criteria needs to not only be thorough, it actually needs to be prescriptive by the time it’s ready to be pulled into a sprint. That means someone needs to go through the stories and figure out how to do them and include that information in the tickets (down to the “Add this line at this point in this file” level). This process is the hardest part of software development to estimate, and now it’s being done outside the confines of a sprint, which reduces the risk of underestimating a story and not having things done in time. By the time a ticket is groomed, everyone knows exactly what to do to resolve it, and it’s just a matter of writing up the code (and if the person who groomed it is also the person who has it assigned to them, there’s a very good chance that code is mostly already written and just needs to be cleaned up before being put into a pull request).

Scrum does have problems

I’m not saying that Scrum is perfect or the only system that development organizations should use. It has its flaws and quirks, the most notable of which is the “points” system that goes with all the stories. The problem is that there’s no definition for how points work. To understand what I mean, think about any sport – I’m going to use basketball. “Points” in that sport is a function of the number of times the ball goes through the hoop. For a regular shot, that’s 2 points, but from far enough out, it’s 3 points, and for a free throw it’s 1 point. That means without having seen a game, I can still look at a score and have a rough understanding of the level of effort it took to get to that score, even if I don’t know the exact combination of field goals, 3-point shots, and free throws it took to get there. The problem with “points” in Scrum is that there’s no definition of what constitutes 1 point, what constitutes 3 points, etc.

In theory, you could make it a function of time, but that’s actually a Scrum no-no. The idea is that this keeps estimates relative to each other, independent of how long it would take a specific developer to do the work (e.g. it would likely take me far less time to make a change in a system I’m very familiar with than someone who hasn’t seen the codebase before). The argument for points being independent of time is that if story A would be about twice as much work as story B, regardless of which developer is doing the work, then you can make A’s points twice B’s regardless of whether one developer can do story B in 1 hour (and A in 2) and another can do story B in an hour and a half (and story A in 3). My issue with that philosophy is, first, that if something will take a different developer longer, then they’re going to adjust the point estimate upward, and second, that there’s still no definition for a base level of effort. This issue is why I’ve sat in planning and grooming meetings when trying to figure out how many points a story should be and advocated scoring them under Whose Line Is It Anyway? rules – “Everything’s made up and the points don’t matter.”
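The relative-sizing argument can be sketched with the numbers from the example above. This is only an illustration in Python; the developer labels and the `relative_size` helper are made up for the sketch and aren’t part of any Scrum tooling.

```python
# Hours each (hypothetical) developer would need per story,
# taken from the example in the text: B in 1 hour / A in 2,
# versus B in 1.5 hours / A in 3.
hours = {
    "developer_1": {"A": 2.0, "B": 1.0},
    "developer_2": {"A": 3.0, "B": 1.5},
}


def relative_size(estimates, story, baseline):
    """Return how many times bigger `story` is than `baseline` for one developer."""
    return estimates[story] / estimates[baseline]


# The absolute hours differ per developer, but the A-to-B ratio does not.
ratios = {dev: relative_size(est, "A", "B") for dev, est in hours.items()}
print(ratios)
```

Both developers agree that A is twice the size of B, which is the case for time-independent points. What the sketch can’t tell you is how many points B should be in the first place, which is exactly the missing baseline I’m complaining about.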

Another personal gripe is that all the stories in Scrum are supposed to be “user stories,” starting with the phrase “As a user…” That creates the problem I discussed with Scrum done wrong about technical debt, infrastructure, and internal tooling stories being disregarded or ignored – they don’t directly translate into the user’s point of view. That doesn’t mean they aren’t important. For instance, take a system that your company wanted to build and get out to market because it had potential, but you wanted to get it out quickly to gauge customer interest (a good practice and something that Scrum encourages). As it turns out, customers were interested and the product does well, but in the interest of getting version 1.0.0 out the door, you wrote something simple that isn’t going to scale well when there’s a massive spike in traffic, or even under the regular volume of users if it continues to sell like it is now. So you need to re-architect your bottlenecks to handle heavier traffic. How do you write that user story, exactly? “As a user, I want the internal data processing pipeline (that I don’t actually know about because I have no idea how this software is written and don’t care so long as it works) to be able to process incoming data more efficiently and to have more responsive queries?” Yes, all users want their software to be fast, but the stories that deal with the internal changes to make that happen don’t necessarily do anything new from a user’s perspective. So they get ignored for “As a user, I want a shiny new feature” or “As a user, I want this feature to do this other thing instead” stories.

Personally, I try to write stories in the third person (no “as a user” nonsense), and start with a bit of history, so they flow as a “When we wrote the software, we did {X} because {Y}, but now {things} so we need to change {stuff to do goes here}. {How that will benefit users goes here}.” For user-facing features and changes, you can basically spit out sales talking points for how that benefits users. For internal changes, this is where you show the business case for doing something even though the user won’t directly see a difference when it’s shipped – something like “This change enables us to identify exactly what data has changed recently, and lets us eliminate the slow queries to find that data in the first place, keeping the system from slowing down when there’s a spike in traffic, which has caused customer complaints to support in the past.”
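If it helps to see that story format as something concrete, here’s a minimal sketch in Python that fills in the template. The field names and the sample values for the first four fields are hypothetical and only there for illustration; the benefit sentence is the one quoted above.

```python
# The story format from the paragraph above, with named placeholders.
# Field names (did, reason, change, work, benefit) are illustrative only.
STORY_TEMPLATE = (
    "When we wrote the software, we did {did} because {reason}, "
    "but now {change} so we need to change {work}. {benefit}"
)


def write_story(did, reason, change, work, benefit):
    """Fill in the third-person story template and return the finished text."""
    return STORY_TEMPLATE.format(
        did=did, reason=reason, change=change, work=work, benefit=benefit
    )


# Hypothetical internal-change story; only `benefit` comes from the text.
benefit = (
    "This change enables us to identify exactly what data has changed recently, "
    "and lets us eliminate the slow queries to find that data in the first place, "
    "keeping the system from slowing down when there's a spike in traffic, "
    "which has caused customer complaints to support in the past."
)
story = write_story(
    did="a full scan to find recently changed data",
    reason="it was the quickest thing to ship",
    change="traffic has grown",
    work="how changed data is tracked",
    benefit=benefit,
)
print(story)
```

The useful part is that `benefit` is a required argument: a story can’t be written without stating the business case, whether or not the change is user-facing.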

If you really want to hear discussions about ways to do Scrum well, I recommend the MetaCast podcast, and in particular digging through the back catalog for the episodes on an “agile toolbox” for organizations starting to do agile development. Both hosts have a lot of experience building agile processes (specifically using Scrum) in companies, and their discussions regularly go back to principles for informing the agile approach to situations.

As fun (and easy) as it is to make fun of Scrum as a software development process, it’s not a bad way of building software – if it’s done correctly. Probably Scrum’s biggest problem is that it’s so easy to imitate without actually doing it correctly. Doing Scrum well takes a lot of time, and on paper (or even at first blush), it seems like more time than it’s worth. But if you’re willing to go through all the steps, parts, and yes, even meetings, you start to realize that everything seems to run a little smoother. That every time you sit down to plan the next couple of weeks out, those 2-week plans are on-schedule a lot more often. That having a conscious focus on “how do we do better” is leading to doing better. Scrum is a pretty good way to organize development work, and I think that’s a point that needs acknowledgement sometimes. But all that being said, the memes are great, and don’t stop them for anything.

Posted at 11:45 AM