2

Summary:

In an organization that wants to use an agile/Scrum software development process, but that also has a large production support / operations burden with an expectation for quick response times, what's the best way to ensure an adequate level of production support / operations without throwing all teams' sprints into chaos?

Details:

I currently work in a software organization which consists of several teams that (have been attempting with various degrees of success to) use what is basically an agile Scrum process. We produce a set of websites that our customers use to run their businesses, and we bill them for new development they request and for ongoing maintenance and support.

Frequently, things come up during sprints that the business would like to see addressed quickly. Examples include:

  • The website is not working in production and our client is on the phone with support trying to find a resolution
  • Some bug has been discovered to have caused some bad database entries that require manual correction soon
  • There is some audit going on where the customer needs us to produce unusual report datasets
  • A new client missed a business-critical feature request and needs it worked and pushed ASAP
  • Automated or manual testing finds a regression defect after the sprint that introduced it and we need to fix it before an upcoming release

I am aware that the correct response to these kinds of things is to work them in the following sprint, get the team to modify the current sprint or cancel the sprint. But each of these has some problems we'd like to avoid:

  • Work in the following sprint - management buys into agile to a degree, but our paying customers don't care that we are agile, they just want a quick response to their issues.
  • Modify the current sprint - this can work from time to time, especially for missed feature requests, but for standard operations type stuff - small requests - this would bog down the team in a lot of process.
  • Cancel the sprint - this too has been done but it hurts our ability to deliver consistently in the long term... obviously doing this by default means just giving up on agile/Scrum, but then what do we replace it with?

One thought that has occurred to us is that we could perhaps have a new team, which uses a different process - Kanban? Something else? - to handle all of this kind of thing, and allow the agile/Scrum teams to remain more stable.

  • Is this a good idea or a bad idea?
  • Is there a name for this or any literature on how to do it well?

Note: we do have a non-technical customer support team who can help with usage, configuration and basic troubleshooting and workarounds. This has more to do with issues that require technical skills to investigate or resolve - code, database, etc.

Patrick87
  • The question is interesting, but without an objective definition of “best way” and “chaos” it seems opinion based, and therefore out of scope. Moreover asking for literature and resources is also out of scope. – Christophe Jan 08 '22 at 00:05
  • #1 definitely needs to be tracked and #2 should definitely lower your velocity for actual work, leading, with proper communication to management (hopefully) to #3 better understanding of need to build in quality to reduce operational pain in the future. I worked on a team that had 1 (of about 7) people dedicated to "on-call" - that was that person's task for the sprint, it rotated - when things got so bad the team needed TWO of 7 dedicated to "on-call" we FINALLY got management to wake up and give us back some of our velocity so we could schedule necessary "engr debt" fixes OVER their wants. – davidbak Jan 10 '22 at 18:16

5 Answers

5

You have two options: project a level of support and plan for it at each team's Sprint Planning, or create a support team that operates on its own cadence (perhaps a just-in-time flow).

If you have a good idea of the volume of support and operations work each Sprint, you can allocate time each Sprint for the teams to do that work. If you have historical data, you can use it to estimate how many support requests come in per unit of time and how long they take to resolve, then distribute that load among the teams. Then, at Sprint Planning, you need to set aside a certain amount of time for refinement, a certain amount of time for support, and the rest can be dedicated to design and development work. You'd also have to account for unplanned events, like sick leave on the team. When the team pulls work into a Sprint, they only fill up the time allocated to design and development and make sure that the Sprint Goals are likely achievable within that timebox.
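As a rough illustration of that arithmetic, here is a minimal sketch in Python. The function name, its parameters, and the 10% refinement share are all assumptions made for the example, not part of Scrum or of any tool; the estimate simply multiplies the average number of requests per Sprint by the average hours per request.

```python
from statistics import mean


def plan_sprint_capacity(team_hours, past_request_counts, past_hours_per_request,
                         refinement_fraction=0.10, absence_hours=0.0):
    """Split one team's Sprint hours into support, refinement, and development.

    All names and defaults here are hypothetical, chosen for illustration.
    """
    # Hours actually available once known absences (leave, holidays) are removed.
    available = team_hours - absence_hours
    # Expected support load: average requests per Sprint times average hours each.
    expected_support = mean(past_request_counts) * mean(past_hours_per_request)
    # Time reserved for backlog refinement, as a share of available hours.
    refinement = available * refinement_fraction
    # Whatever remains is what the team can fill with design and development work.
    development = max(available - expected_support - refinement, 0.0)
    return {
        "support": expected_support,
        "refinement": refinement,
        "development": development,
    }


# Example: 300 team hours, 20 hours of planned leave, and three past Sprints
# that saw 8, 12, and 10 support requests at roughly 3 to 5 hours each.
plan = plan_sprint_capacity(300, [8, 12, 10], [3.0, 5.0], absence_hours=20)
```

A plain average is the simplest possible estimate. If the support volume varies a lot from Sprint to Sprint, planning against a higher percentile of the historical data would be a more cautious choice.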

The other option is to dedicate a team to support and operations work. They would have the ability to handle and resolve all incoming support requests. When the support request volume is low, they would also be in a position to handle other system-wide operational tasks: infrastructure upgrades, vulnerability mitigations, build pipeline maintenance, automation. Lower priority operational tasks can be maintained in a queue of work and can be done when someone is not working on a support request. You can establish a dedicated team for this or you can rotate the team that does this kind of work regularly.

If you have a large amount of time spent on support requests, then I'd lean toward the dedicated team, at least at first.

Regardless of the approach, I would recommend reducing the amount of these support requests that come into the teams. Performing root cause analysis and correcting the issues that lead to them would reduce the time that you spend on the support requests. Instead of just fixing the bugs, figure out why the bugs are not detected within the design and development process and improve the process. Define the work to allow customers to self-service their audit needs rather than coming to the development team. Figure out why the product managers are not able to elicit critical feature requests from stakeholders and get them developed before they must be done ASAP.

Thomas Owens
2

Ultimately the one that works is

Modify the current sprint - this can work from time to time, especially for missed feature requests, but for standard operations type stuff - small requests - this would bog down the team in a lot of process.

Part of the point of Agile is not having to stick to a big upfront plan. If you're doing that you're just doing a waterfall every week.

However, as you say, the "ceremony" around adding work takes up time itself. So what has worked for me in the past is simply putting an "unplanned support" ticket in every sprint, allocated to (say) half a person-day or 5-10% of the total team time. You can then rotate that around individuals who are the first point of contact for these unplanned requests. As soon as something takes more than an hour or two to deal with, it should be elevated to a full ticket with time allocation of its own.
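To make the bookkeeping concrete, here is a minimal sketch in Python of how such an "unplanned support" ticket might be tracked. The class, its field names, and the return values are hypothetical; the two-hour threshold stands in for "an hour or two", and the budget would come from whatever share of team time you settle on.

```python
from dataclasses import dataclass, field


@dataclass
class UnplannedSupportTicket:
    """Per-sprint 'unplanned support' budget (all names are hypothetical)."""
    budget_hours: float
    escalate_after_hours: float = 2.0  # assumed value for "an hour or two"
    spent: dict = field(default_factory=dict)

    def log(self, request_id: str, hours: float) -> str:
        """Record time against a request and report where the work should live."""
        self.spent[request_id] = self.spent.get(request_id, 0.0) + hours
        if self.spent[request_id] > self.escalate_after_hours:
            # Too big for the catch-all ticket: promote it to a full ticket
            # with a time allocation of its own.
            return "escalate"
        if sum(self.spent.values()) > self.budget_hours:
            # The sprint's allowance is used up: a signal to revisit its size.
            return "over_budget"
        return "ok"


# Example: a 20-hour allowance, roughly 5% of a 400-hour team sprint.
ticket = UnplannedSupportTicket(budget_hours=20.0)
first = ticket.log("bad-db-entries", 1.5)   # still within the threshold
second = ticket.log("bad-db-entries", 1.0)  # 2.5 hours in total, so it escalates
```

For simplicity, hours logged before a request is escalated stay counted against the allowance in this sketch.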

pjc50
1

Whatever process you choose you essentially have the same problem. "Development of new features is too slow".

Scrum, and agile in general, tries to maximise developer productivity and align with business planning by focusing on one thing at a time. Obviously there is some slack in the red tape and you can squeeze in the odd extra 'emergency' thing here and there. But in the end you are going to need to hire more developers if you want more work done.

I know you aren't going to want to hear it, but most of the things you mention can be dealt with by managing client expectations better. You don't demand to see Bill Gates when Microsoft Word doesn't support your new business process the way you would like it to. Not because it's technically any different from your website, but because of your relationship with the software provider.

Instead of saying "We will put your request in the backlog", say "We have lots of other work on at the moment". If you are being asked for custom reports frequently, publish a price for it which will pay for a reporting team. List the supported features of your product when you sell it, and stop salespeople saying it does everything and you will add anything the customer needs.

Ewan
0

we could perhaps have a new team, which uses a different process - Kanban? Something else? - to handle all of this kind of thing

[...]

Is this a good idea or a bad idea?

You have a customer need: work that needs doing on a timescale quicker than your sprint cadence. On the basis that this would fulfil that need, it's a good idea and one which I've seen work well.

The problem you'll have is that nobody wants to be on the "support team": it's generally high pressure, relatively low interest work. Rather than trying to have a separate team, consider rotating developers from the main teams through the "support" role.

Philip Kendall
0

Running to fixed-duration Sprints containing a pre-agreed workload simply does not fit with any realistic "Support" model. "Stuff" happens as and when it wants to happen and it cares not a jot about your precious Sprint cadence, your prioritised Backlogs, your Ceremonies or anything else.

To paraphrase Murphy's Law:

If something can go wrong, it will go wrong and at the worst possible time.

... usually in the early hours of the morning, just to really catch you on the hop.

You will, almost certainly, have to split the team in two, into "Development" and "Support" functions but, if you do, make absolutely certain that you regularly and frequently rotate your people between the two functions. In Agile, you're supposed to fix bugs before delivering new features. The same sort of logic should be applied here. Your Developers have to "earn" their time Developing by working in the trenches, doing the Support work. If nothing else, they will quickly learn that they don't like working with broken and unreliable code and, when they get back onto the Development side of things, that experience should actually drive your code quality up!

Phill W.