14

A bit of background: we are a small team (of 5) of RAD developers responsible for internal software development in a big non-software company. "Internal software" varies from a desktop .NET application using MS SQL Server as a backend, to Python scripts running in the background, to MS Word documents and templates - a zoo of technologies.

The whole team consists of all-rounders able to get the requirements from users, code them up, test and deploy into production. Once the software is in production it is looked after by another team, but it is usually easy for us to intervene if something goes wrong.

All sounds good so far, but there is a problem: being a RAD team we have to release often, and not a day goes by without us releasing new versions of one or two applications (or it could be a script, an updated Word document, a C++ console app, etc.) into production. We do developer testing and also involve end users by letting them run the software in a UAT environment ...

... but bugs creep into production anyway. Users do understand that these bugs and the occasional instability are the price they pay for getting what they want really quickly, but at the same time it got us thinking: perhaps we could improve our development or release practices to improve the stability of the software and reduce the number of bugs we introduce when adding new functionality.

The good thing is that we don't really have much process in the first place, so it should be easy to start improving. The bad thing is that, being a small RAD team, we don't really have much time or resources to implement something big. We have been thinking about the following initiatives and would welcome any feedback, tips, hints and suggestions.

  1. Currently some of the applications are released into production straight after developer testing, bypassing user acceptance testing. That practice should be discontinued, and even a small change has to be tested by an end user. Each application will have a dedicated beta tester selected from the end users. Only after the beta tester has OK'd the new release is it promoted from the test to the production environment.

  2. We don't conduct code reviews, but we'll start doing them before any of us checks in a changeset. I was also thinking about a "rollout review": basically one of the developers has to sit next to the other and watch him/her do the software rollout (copy binaries, update configs, add a new table to the database, etc.). A rollout usually only takes 5-10 minutes, so the "rollout review" won't take much time.

  3. How to minimise the rollback time when a new release proves buggy enough to be pulled out of production and replaced by a good previous version. We do store a history of all releases (as binaries) to make it easy to go one version back. Although it is a quick "overwrite the newly released binaries with the previous version's binaries", it is still a manual process, which is error-prone and at times stressful: "what if the rollback fails and renders the system unusable instead of buggy?"

This is where we ran out of ideas, and we'd like to get your feedback on these. If you could share some simple release / dev process improvement advice, that would be awesome.

Dipan Mehta
PeterT

5 Answers

13

+1 for touching upon a great subject. When we follow a "release early, release often" line of development, things really pick up pace, and as the momentum builds many such issues arise (as you described) which we are otherwise not very prepared to cope with. The worst fear is when people see speed as an enemy of good work.

I have seen very limited literature on this; however, this is what we practice, and it definitely helps:

1. Effective Bug tracking
Make bug tracking more effective. We not only keep a list of bugs and tick them off; when a bug is closed, we must record certain things like "was the problem reproducible?", "is this a permanent solution or a workaround?" and "what is the root cause of the trouble?". This preserves knowledge of what happened and when this bug was last seen. This is key to ensuring that bugs do not recur often.

2. Define key fall back points
We all know that bugs will arrive, so we need to provide an effective fall-back that works most of the time. Time and again (about 1 in every 10 releases in our case) we designate a release that works everywhere in the most reliable manner as a fall-back point. The total number of releases can be large, but if anything goes wrong, the fall-backs are a select few and you don't have to fall back any further. One of the simplest ways to identify the best fall-back is to see which earlier release has been running longest in production without many issues.

3. Distinguish risky and stable or small bug fix releases
When we know we have major algorithm changes, it is more likely that bugs will creep in through scenarios that were not foreseen, whereas at other times the issues are very small (or well understood) and carry little risk. Do NOT club such functionality and simple bug fixes into the same release. Always release the smaller bug fixes first, so they can go wherever required. Make dedicated releases for special feature sets; at worst you can deprecate that feature, but all the other important bug fixes are still available in the prior release.

4. Branch for significant feature development
Anything that involves changes affecting the design must be done separately on a branch. Larger development doesn't get completed as quickly as smaller bug fixes. Intermediate commits containing 'partial' work on a feature that is not yet in use are a potential source of bugs: bugs that wouldn't have arisen if the full work for the feature had been completed atomically. Hence these are bugs we wouldn't have to solve if we had used branches.

5. Always plan release which are theme based
Many a time, different bugs arrive against different releases, but it is best to group bugs (and features) that affect similar modules: this eases repeat work and minimizes the number of bugs originating from that work. Always prepare the release road-map well in advance; bugs keep pouring in, and they fall into different target releases so that a good group of bugs can be fixed together in one release. When similar bugs are combined, it always gives better insight into contradicting scenarios.

6. Extend any new release first to a few customers
In our case, we test it at a couple of sites first, and all other sites get a release only when there is a demand for it. Many a time some (or most) users jump only from one stable release to another stable release.

7. Regression Testing
Along the lines of bugs being collected, build a regression suite. Also, if possible, mark critical bugs and their tests as most important, so that they become the minimum qualifying criteria to be tested before a release candidate becomes a release.

8. Pause and reflect
When many things go at full speed, there should be time to put on the brakes: take a pause and have releases that are functionally no better. In fact, have a holiday from releases for some time (the duration is inversely proportional to frequency). For example, many a time we have so-called "clean-up" releases which achieve nothing new from a functionality point of view, but which help greatly in keeping the code maintainable. Most such releases are great fall-back points, such that you almost never need to recall the history prior to them.

9. Perhaps the most strange
I find this one difficult to implement often, but it is a sure-shot good trick. Swap the owners of certain modules. When people are asked to do code reviews, not much comes out of the practice. But when you have to seriously deal with that new code, when you swap authors, potential "bad" ailments get noticed quickly, well before they start polluting the code. Of course, this reduces speed, but if you do this often, chances are that people master various parts of the code and learn about the whole product, which is otherwise very difficult to teach.

10. Last but not the least
Learn to go back to the whiteboard often. Re-think: if this feature had been part of our initial design, how would we have thought of the design at that time? Sometimes the biggest challenge with incremental work is just that we are too constrained by the order of functionality we built first, and quite often can't go back to basics. The trick is to keep looking at how we would generalize rather than merely accommodate this new feature or scenario. This requires that the design remains current, and that happens only if we go back to the drawing board often. Also, as new-generation programmers join in, they become part of the think tank rather than just putting patches around.

EDIT
11. Keep track of work-arounds and design gaps.
Quite often we are under pressure of timelines to fix the bug and release to production. However, when the bug is at the design level, quite a few things need to change, yet people will often fix it with short-cuts to meet the deadline. This is OK. However, as such work-arounds multiply, the code becomes fragile. Keep special track of how many design gaps have already gone in. Typically, when you negotiate the timelines with the project manager, it is best to make him/her commit that we shall deliver this short-cut to save production, but we shall also get the time and resources for a permanent solution.

Dipan Mehta
  • Kudos. This answer is much better than most of the online tutorials – Ubermensch Jan 17 '12 at 04:16
  • These are some very useful and important tools when helping "agile-resistant" teams to learn how to be Agile without necessarily committing everything all at once to changing the incumbent methodology. Your 9th point is effectively offering an opportunity to review code, without needing a formal review process or switching to pair programming, but requires a no-blame-no-pride mindset in order to avoid unnecessary friction developing. When branching however, I'd further suggest minimizing this to a single branch with the aim of merging back into the mainline as early as possible... – S.Robins Jan 17 '12 at 04:20
  • @DipanMehta The question seemed to be from a new-comer and it warranted an answer that could give him a broad perspective to build upon existing things in spite of being too specific and your answer is really close to it. – Ubermensch Jan 17 '12 at 04:23
  • ... as managing multiple branches can become seriously problematic as time passes, so you would want to keep your branched changes small and suited to resolve a specific problem, merge, re-branch, etc. A good version control system with support for workspaces and which differentiates between a versioned "promote" and an unversioned "keep" can help to avoid branching altogether. IMHO however, it's better to get the process right, and then find tools to fit, rather than match processes to tools. – S.Robins Jan 17 '12 at 04:25
  • +1 for " it's better to get the process right, and then find tools to fit, rather than match processes to tools" – Ubermensch Jan 17 '12 at 04:29
4

You've already identified that you know there are problems with your processes which affect the quality of your software, and while this question will provoke a range of answers, my suggestion would be to look at the topic of software engineering and try and learn what developers in the main are finding themselves doing more and more of in that area. I suggest you start reading a few good resources to get yourself kicked off. A few that come to mind:

  • Lean Software Development by Mary and Tom Poppendieck provides a great read for people interested in learning how to identify "waste", and what to do about changing processes to become leaner and more efficient.
  • Head First Software Development by Dan Pilone and Russ Miles is a bit like one of those "for dummies" books at first glance, but by looking a little past the chaotic presentation style, it contains most of the information relating to the basics of software engineering and has a brief write up about Test Driven Development.
  • Introducing BDD is Dan North's page about getting into Behaviour Driven Development, or perhaps you'd prefer a BDD Wiki. These are starter references for BDD and you will probably want to look into tools and frameworks to help you. The important thing to understand is that BDD is effectively TDD taken to a higher conceptual level. It allows you to think about testing as you are thinking about specifications, and to test in the same language you use when you write specs. The frameworks generally integrate with other unit testing frameworks, so you get the best of both worlds if you decide that your testing might not necessarily benefit from the BDD syntax.
  • Wikipedia's Agile Software Development Article is a good primer all about agile software development, and provides a number of useful references and links to articles by some of the development community's more respected people.

In order to improve HOW you work, you need to allow yourself to be completely open-minded, and willing to step well outside your comfort zone in order to learn to improve the things that you do without clinging to certain concepts that you may find are more comfortable to hang on to. Speaking from personal experience, this is probably the hardest thing to do, or to encourage in others.

Change is hard at the best of times, even if you feel you are actively seeking change, so the best advice that I can really give you is to look at the various Agile methodologies that have been developed over the years to familiarize yourself with the practices that are considered to be most important (e.g. Unit Testing, Continuous Integration, Refactoring, etc.), and then pick the methodology that seems closest to what you and your team will feel most comfortable with. Once you've made your decision, tune the practices and your development process so that they suit how your team would prefer to work, keeping in mind those lean principles and how you wish to work so that your team can produce the greatest value with the least waste. Finally, always look forward yet allow yourselves to second-guess your choices just enough to always seek to improve your processes over time, and introduce the changes to your processes in stages, so that they will be easier to swallow, and will become second nature to you as time passes and your processes are slowly improved.

If you feel your processes merely need tweaking, yet you are concerned that your tool chain isn't quite keeping up with your needs, then perhaps look to improvements there. At a minimum, a continuous integration product (such as Continuum, Cruise Control or Hudson) and an issue tracking system (such as Jira or Redmine) should be a priority to help you automate some of your build and release processes, and to keep your bugs and feature requests in check.

The reality is that no matter how RAD your processes are, if you don't put in the investment of time to getting things "right" for your team, your problems will only continue to grow with time, and your perception of available time will shrink accordingly. Big changes are usually out of the question when under heavy time pressure, but try and give yourself a little "wiggle room" in order to put systems into place to help you to take baby steps towards your team's vision of an ideal methodology.

S.Robins
  • I was referring to our team as team of "RAD" developers to emphasize the fact that we are in the business of "Rapid Application Development" where the development cycles are extremely short. So it is got nothing to do with RAD tools or IDEs. Thanks for your reply. – PeterT Jan 18 '12 at 10:17
  • @PeterT: Ah! My apologies for the misunderstanding. I must have skimmed your 3rd paragraph and missed the context. I'll edit my answer to suit, however the advice in the main still remains in context. :-) – S.Robins Jan 18 '12 at 10:42
3

I also work in a small dev team (only 2 of us) and we experienced issues similar to those you have mentioned. The main issue for us was that we both tend to work on separate tasks, and it was becoming too common for us to complete a task/feature, test it (tested by the developer only) and release quickly. This was resulting in a lot of small releases, with users reporting small bugs that should easily have been picked up in testing.

In order to improve our process I started by introducing a Kanban board. The board was very simple to start with and only had a few columns (setup using a whiteboard, index cards and coloured magnets):

Backlog | To Do | Done

However, this quickly evolved to mirror our actual process:

Backlog | Development | Dev. Test | UAT | Done | Released

Along with the board we have a rule that each Task/Feature must be tested by another developer (as well as by the developer that implemented the feature). So by the time a card reaches the 'Done' column it should have been tested by at least 2 developers and also User Acceptance Tested.

There are a lot of other benefits to using Kanban. For us it has improved communication and helped us to share knowledge, as we are both involved to some degree in each task. It has also improved our release process, as we can now see exactly what tasks/features are ready to release/done and can sometimes hold off on releasing if we know other tasks will be done soon. For people outside the team, the board also acts as a quick reference to see what tasks we have scheduled, current work in progress and what was recently released.

Just as an aside, we use coloured magnets (one per developer) to flag the card we are currently working on, but another option is to add swim lanes (rows), one per developer, and place Kanban cards in the relevant swim lanes. Again this helps as a quick reference to see what each developer is currently working on.

Other links I found useful:

Kanban Applied to Software Development: from Agile to Lean

A Kanban System for Software Engineering - Video

Hopefully Kanban would address point 1 in your question. In relation to code reviews, we do this at the dev testing stage so that any changes required after review are dev tested again before going to UAT. Rolling back depends on your environment, but we deploy application files to Terminal Servers using batch files which rename the current files and copy across new files from a central server; we can roll back fairly easily by placing the backup (previous files) in the central location and re-running the scripts.

Matt F
2

Whenever I hear about defects, my first questions are about where the defects originate and where they get detected and removed. Defect Removal Efficiency is a good way of measuring and tracking this. By knowing where defects originate and working to improve the processes at those phases, you can reduce the time and cost of a project. It's well known that it's cheaper to fix defects closer to their point of injection, so once you know where the defects are coming from, you can then look at changing activities to improve those phases.
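As a rough sketch of how this could be tracked: Defect Removal Efficiency is commonly computed as the share of all known defects that were caught before release. The snippet below assumes a hypothetical export from a bug tracker in which each defect records the phase it was introduced in and the phase it was detected in; the phase names and the sample data are made up for illustration.

```python
from collections import Counter

# Hypothetical sample data: (phase where the defect was introduced,
#                            phase where it was detected)
defects = [
    ("requirements", "uat"),
    ("coding", "dev_test"),
    ("coding", "production"),
    ("deployment", "production"),
    ("coding", "dev_test"),
]


def defect_removal_efficiency(records):
    """Share of all known defects that were caught before production."""
    if not records:
        raise ValueError("no defects recorded")
    caught = sum(1 for _, detected in records if detected != "production")
    return caught / len(records)


dre = defect_removal_efficiency(defects)

# Which phases do the escaped defects come from? These are the phases
# whose checks are probably worth strengthening first.
escaped_by_origin = Counter(
    origin for origin, detected in defects if detected == "production"
)

print(f"DRE: {dre:.0%}")
print(escaped_by_origin.most_common())
```

With the sample data, three of the five defects were caught before production, so the DRE is 60%. Tracking the same number release after release shows whether a process change is actually helping.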

Once you have information about where the defects are coming from, you can look at exactly what techniques and technologies you want to apply. Reviews of requirements, design, and code, automated tests, static analysis, continuous integration, and more extensive user-driven testing might be options that you should look at, depending on what phases generate defects.

To expand on your desire for code reviews, you should also consider different levels of code reviews based on the priority and risk of a module. Low risk, low priority modules might not need a code review at all, or perhaps just a simple desk check, where another developer just reads the code on his/her own and provides comments, would work. Other code review techniques include pair programming, walkthroughs, critiques, and inspections with various numbers of developers.

For the purposes of rolling back, I would look to automate that process using some kind of script to make it faster and less error-prone. In a perfect world, I would want to increase the quality of shipped products such that it isn't necessary to roll back, and you can achieve this. Having the capability is still a good idea, though; just make it as painless as possible.
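A minimal sketch of what such a script might look like, assuming each release is archived as a folder named after its version and the application runs from a single "live" folder. The function name, folder layout and the `.staging` / `.retired` suffixes are hypothetical. The idea is to do the slow copy into a staging folder first and only then swap folders, so a failed copy leaves the running version untouched.

```python
import shutil
from pathlib import Path


def rollback(live_dir: Path, archive_dir: Path, version: str) -> None:
    """Replace live_dir with the archived release `version`.

    The archived files are first copied to a staging folder next to the
    live one; the live folder is only touched once that copy succeeded.
    """
    source = archive_dir / version
    if not source.is_dir():
        raise FileNotFoundError(f"no archived release {version!r} in {archive_dir}")

    staging = live_dir.with_name(live_dir.name + ".staging")
    retired = live_dir.with_name(live_dir.name + ".retired")

    # Clear leftovers from an earlier, interrupted run.
    for leftover in (staging, retired):
        if leftover.exists():
            shutil.rmtree(leftover)

    shutil.copytree(source, staging)  # slow step; live folder still untouched
    live_dir.rename(retired)          # quick swap: move the buggy release aside
    try:
        staging.rename(live_dir)      # ...and put the known-good one in place
    except OSError:
        retired.rename(live_dir)      # swap failed: restore what was running
        raise
    shutil.rmtree(retired)


# Example call (paths are placeholders):
# rollback(Path("C:/apps/orders"), Path("C:/releases/orders"), "1.4.2")
```

The renames are only quick when all folders are on the same volume, and on Windows they will fail while files in the live folder are in use, so the application would need to be stopped first. Database changes are not covered here and would need their own rollback step.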

Thomas Owens
1

As others have pointed out, adding regression testing will help to avoid the same defects appearing in the future. However, if you're encountering new defects, then it may be time to add assertions (a.k.a. contracts) to the code that specify the pre-conditions, post-conditions, and invariants of the classes and methods.

For example, if you have a class where a method can only accept numbers between 10 and 25 (this is called the pre-condition), you would add an assert statement at the beginning of the method. When this assertion fails, the program will crash immediately and you'll be able to follow the chain of methods that led to that failure.
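A small sketch of that example in Python; the function name and what it computes are hypothetical, only the 10 to 25 range comes from the paragraph above.

```python
def remaining_percent(discount: int) -> int:
    """Return the percentage of the price left after a discount."""
    # Pre-condition: callers must pass a discount between 10 and 25.
    assert 10 <= discount <= 25, f"discount out of range: {discount}"

    result = 100 - discount

    # Post-condition: given the pre-condition, the result is within 75..90.
    assert 75 <= result <= 90, f"unexpected result: {result}"
    return result


print(remaining_percent(10))

# remaining_percent(30) would raise AssertionError at the first assert,
# and the traceback shows the chain of calls that passed the bad value.
```

Note that Python drops `assert` statements when run with the `-O` flag, so checks that must also hold in production are better written as explicit `if ...: raise ValueError(...)`.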

Python, PHP and other programming languages are dynamically typed and don't add many conditions to methods. All that's needed for something to work is that it implements a particular method. I suspect that you need more conditions on your methods. You need to define and test to ensure that a method can actually work in its environment.

For C/C++ programs, I found that adding assertions to test memory allocation was very helpful in reducing the number of memory leaks in the program.

  • Well, I agree that asserts/post/pre-conditions checking is a good programming practice, and will eventually pay off, but my question was aimed at improving the quality of the very frequent releases, not the quality of the code in general. – PeterT Jan 18 '12 at 10:26
  • It'll pay off right away because you'll have to start with adding asserts/condition-checking in each release for the new features/bug-fixes. It'd be a huge task to add asserts to the whole project in one go ;p –  Jan 18 '12 at 19:39
  • There is a thing with asserts though: what if we got it wrong? What if we thought the method should only accept numbers between 10 and 25, but in reality it is OK to widen the range to [0;50], and this was only found after a new release had been rolled out and been in production for a day? If the method in question is a low-level one used in many places, there is not much we can do but re-release with a fix. However, if we had not added the assertion at the method level and used a higher-level try-catch block instead, we could get away with only part of the functionality .... – PeterT Jan 19 '12 at 14:37
  • ... not being available, so we could buy ourselves some time to make a "proper", or call it "scheduled", release one week later. I think you see my point. Thank you for your comment. – PeterT Jan 19 '12 at 14:39