Welcome

What works and what doesn't work in software development? For the first 10 years or so of my career, we followed a strict waterfall development model. Then in 2008 we started switching to an agile development model. In 2011 we added DevOps principles and practices. Our product team has also become increasingly global. This blog is about our successes and failures, and what we've learned along the way.



The postings on this site are my own and don't necessarily represent IBM's positions, strategies or opinions. The comments may not represent my opinions, and certainly do not represent IBM in any way.

Monday, August 16, 2010

Doing Planning Poker Remotely?

Has anyone out there worked out a clean way to do Planning Poker remotely? We have development teams with people in multiple locations, so we conduct our planning sessions via conference call and screen sharing.

If you're not that familiar with Planning Poker, it's a quick way to size stories without getting down to the level of person-days. Story point sizings are based on your past experience - an 8-point story should take roughly as much design/dev/test/doc effort as an 8-point story you've finished in the past. You talk for a few minutes about a story, and then everyone in the planning session does a "secret ballot" where they choose a story point sizing. Then you discuss the largest and smallest sizings, and come to a consensus on what the sizing should be. You can use the story point sizings to plan how much work you can do in an entire release, across several sprints.

There's a Planning Poker plug-in for Sametime. The plug-in was OK - you could set up a planning session with a group of people, and enter the stories to be sized. Then when you selected a story, each person could submit their estimate, and it wouldn't show the estimates until everyone had submitted. But it's often a problem for people to get the plug-in installed and working. For example, if people had the wrong version of Sametime, or if they were using a Mac, it wouldn't work for them.

Another thing we have done is a standard group chat where everyone just types in their estimates, and everyone can see them right away. There are a couple of problems with that - for one thing, some people (mostly newbies) will refuse to submit their estimates until other people (usually the team lead / scrum master) submit theirs. Also, once a few people submit their estimates, all of the other estimates seem to magically become the same number, which means that people are changing their answers based on what they already see on the screen. Sigh.

A third thing I've seen scrum masters do is set up a second display where they have an individual chat session with each person, and no one can see the sizing estimates except for the scrum master. That works reasonably well, but it's still kludgy.
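To make the "secret ballot" rule concrete, here is a minimal sketch of what any remote tool would need to enforce: estimates stay hidden until everyone has voted, and then the lowest and highest voters are singled out for discussion. The class name, method names, and sample people are all made up for illustration; this is not how the Sametime plug-in was implemented.

```python
class PokerRound:
    """One sizing round: estimates stay hidden until every participant has voted."""

    def __init__(self, story, participants):
        self.story = story
        self.participants = set(participants)
        self._votes = {}

    def submit(self, person, points):
        if person not in self.participants:
            raise ValueError(f"{person} is not in this session")
        self._votes[person] = points  # a later vote replaces an earlier one

    def waiting_on(self):
        """Names of the people who have not voted yet."""
        return sorted(self.participants - set(self._votes))

    def reveal(self):
        """Return all votes, or None while anyone still has to vote."""
        if self.waiting_on():
            return None
        return dict(self._votes)

    def outliers(self):
        """People with the lowest and highest estimates, who talk first."""
        votes = self.reveal()
        if votes is None:
            return None
        low, high = min(votes.values()), max(votes.values())
        return {
            "low": sorted(p for p, v in votes.items() if v == low),
            "high": sorted(p for p, v in votes.items() if v == high),
        }


# Hypothetical session with three people
poker = PokerRound("Export report as PDF", ["Ana", "Ben", "Chitra"])
poker.submit("Ana", 3)
poker.submit("Ben", 8)
print(poker.reveal())      # None: Chitra has not voted, so nothing is shown
poker.submit("Chitra", 5)
print(poker.reveal())
print(poker.outliers())    # Ana (low) and Ben (high) explain their reasoning
```

The point of the design is that nobody, including the newbies, can wait to see the team lead's number before choosing their own.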

Can anyone recommend a better way to do this? What do you do?

Friday, June 4, 2010

Working with a remote team

I'm having a little adventure this month - we just hired five contractors in Bucharest, Romania to work on my current project. They're new to our product, and we have to get them to a point where they're doing productive work very quickly. It's a challenge for sure.

I've worked with remote teams before, and this experience is reinforcing a few rules of thumb that I have:
  • Time zones matter. Romania is 7 hours ahead of us, which means that when my day starts at 9 AM, theirs ends an hour later, at 10 AM my time. So, we've blocked off 9-10 each day to talk to each other. This is significantly easier than working with teams in India or China, where the difference is close to 12 hours. (Meetings at 8 PM? 7 AM?) However, it's significantly more difficult than working with teams in Brazil, which are only 1 hour off, or even teams in Ireland, which are 5 hours off. The Romanian team is working while I'm asleep, so I have to make sure they all have enough to do before I stop working the night before. Also, if they get stuck on something, it's easy to lose a day. So to keep them productive, ideally each of them should have a few things they could be working on. That way if they're blocked on one task they can work on another until we sync up again.
  • You should be in the same building as the people you work with most closely. Yes, you can talk to someone 4-5 times a day over the phone, but it's significantly more efficient if you can talk in person. Especially if your work sometimes involves drawing things, like design diagrams or GUI prototypes. It's also very helpful if your work involves making difficult decisions, because you can read people's body language in person. Reading body language can help you see if someone is sure or unsure of what they're saying.
  • Teach the remote team to be as independent as possible. If they get stuck on something, they need to be able to work past the problem on their own, so you don't lose a day. Also, it helps if they're working on their own set of files/features, so they're not interfering with people who they can't communicate with immediately.
  • Language skills are critical. One of the worst feelings is explaining something to a technical team on the phone, and hearing crickets in response. Did they understand what you said or not? Are they talking to each other on mute? Even a thick accent can make communication difficult, because you may have to ask someone to repeat themselves over and over. After a few tries, people may give up and try to guess what the other person is saying. That can lead to misunderstandings. You can mitigate this somewhat by speaking more slowly and carefully, and by rephrasing what you heard the other person say. But this just slows down communication, so it's best to hire people you can communicate with easily in the first place.
  • Meeting people in person, even once, helps a lot. There are people that I spent a week with in training sessions three years ago. I still go to them first with a problem, and they are more likely to help me than someone I've never met. And vice versa. In some ways, I feel like I know them better than people I've worked with for years over the phone. Meeting people face to face naturally creates a certain comfort level and familiarity that's very difficult to develop otherwise.
  • Training takes a lot of time and money. This sounds obvious, but it takes a long time to get a new set of people up to speed. On our product this ranges from a month to a year or more, depending on the specific assignment. The hidden cost is that the people doing the training are not able to spend as much time on their other assignments. This is an argument for retaining existing employees whenever possible! Also, when searching for contractors or new hires, you should put a high premium on people who already have a skill set close to what you need, even if they cost more. One other tip in this area - the time to train five or six people at once is similar to the time required to train one or two, so you should try to add people to a project in batches, not one at a time.
  • Record your training and demo sessions. We record most of our training sessions and demos (end of sprint demos and customer demos). We use Lotus Live/Sametime Unyte/Webdialogs, or Camtasia Studio, to record screen sharing along with the sound from the conference call. We post them to a shared server and link to them from our development wiki. We use those recorded sessions over and over again. They come in handy for training new developers and testers, creating beta scenarios, training sales and support teams, creating new training classes, etc. We get pretty good reuse out of the slide decks too.
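The time zone arithmetic from the first tip is easy to check with a few lines of code. This is only a sketch: it assumes both teams work 9 AM to 5 PM local time (my assumption, not a measured fact), it uses whole-day hour offsets rather than real time zone data, and the function name is made up.

```python
def overlap_in_my_time(offset_hours, my_day=(9, 17), their_day=(9, 17)):
    """Shared working hours on my clock as (start, end), or None if there are none.

    offset_hours: how many hours the remote team is ahead of me
    (negative if they are behind). Assumes offsets between -12 and +12.
    """
    # Convert the remote team's working day to my clock.
    their_start = their_day[0] - offset_hours
    their_end = their_day[1] - offset_hours
    start = max(my_day[0], their_start)
    end = min(my_day[1], their_end)
    return (start, end) if start < end else None


print(overlap_in_my_time(7))    # Romania: (9, 10), a single shared hour
print(overlap_in_my_time(5))    # Ireland: (9, 12)
print(overlap_in_my_time(1))    # Brazil: (9, 16)
print(overlap_in_my_time(12))   # roughly India or China: None
```

With no overlap at all, somebody has to take a call outside normal working hours, which is where the 8 PM and 7 AM meetings come from.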

Anyone else have some favorite tips? Maybe you could help me!

Wednesday, April 7, 2010

Struggling with the Product Backlog

One area that our team consistently struggles with is the product backlog. This is not a new problem; even before we adopted agile development processes we had problems with this.

Our legacy system for tracking customer requirements is a Notes database. Notes does some things well, but this particular database is often a black hole. Here are my complaints with it:
  1. The search functionality is terrible. We have literally hundreds of requirements, so a search feature is really important. I can't get a full text search to work at all; maybe it's not enabled.
  2. It doesn't provide an easy way to prioritize requirements to get the most important to bubble up to the top.
  3. Almost anyone who works with customers can open requirements. Some people are not particularly good at writing up requirements. So, they may think that they've done their job by opening a requirement, but the product leadership has no idea what they're really asking for. Some people also don't fill in contact information for the customer requesting the feature, so it's difficult to go back to the source for more details.
  4. There's no good way to find very similar or overlapping requirements. We might have five separate requirements for the same thing, and no way of seeing that.
  5. It's not managed well. Nobody goes through the painful process of looking at all of the requirements to see which ones no longer apply, or which ones have become more important since they were opened.
We have started using Rational Team Concert (RTC or Jazz) for managing requirements. However, we still have this legacy database to deal with, and I'm pretty sure no one has the time or inclination to copy everything from Notes into RTC. Besides, would that even be a good idea? RTC at least has a good search function, and it's easy to prioritize things. But RTC doesn't fix problems 3 through 5.
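For problem 4, one rough idea is to compare the wording of every pair of requirements and flag the pairs that share most of their words. The sketch below uses Jaccard similarity (shared words divided by total distinct words). It assumes the requirement text has already been exported from Notes or RTC into a plain dictionary; the IDs, sample text, and threshold are all hypothetical, and neither tool does this for you as far as I know.

```python
import re
from itertools import combinations


def words(text):
    """Lower-case set of the alphabetic words in a requirement."""
    return set(re.findall(r"[a-z]+", text.lower()))


def similar_pairs(requirements, threshold=0.5):
    """requirements: dict of id -> text. Returns (id_a, id_b, score) tuples
    whose Jaccard similarity is at least threshold, best matches first."""
    tokens = {rid: words(text) for rid, text in requirements.items()}
    pairs = []
    for a, b in combinations(tokens, 2):
        union = tokens[a] | tokens[b]
        if not union:
            continue  # both requirements are empty; nothing to compare
        score = len(tokens[a] & tokens[b]) / len(union)
        if score >= threshold:
            pairs.append((a, b, round(score, 2)))
    return sorted(pairs, key=lambda p: p[2], reverse=True)


# Hypothetical requirements
reqs = {
    "R1": "Export report to PDF",
    "R2": "Export the report as PDF",
    "R3": "Single sign-on for admin console",
}
print(similar_pairs(reqs))  # [('R1', 'R2', 0.5)]
```

Word overlap misses requirements that describe the same thing in different words, so at best this gives a person a shorter list of candidates to review.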

So, what happens when the product backlog is poorly managed? We start planning a release and ask the product owner what should go into it. Some items are obvious and get added to the release backlog right away. Then the blank stares begin. How can you even tackle the task of looking through the rest of the requirements to find the best and most important ones?

We end up developing whatever the product leadership has been hearing about the most in the past few months. Sometimes this is good enough, sometimes not. For example, if the support team is consistently hearing the same complaints, but not bringing them forward to product management, those complaints might not be addressed.

Something like an idea cloud would be nice. It could show us the keywords that are coming up most often in the requirements. That would help us find common themes, and overlapping requirements. But I have no idea how we could get that working with Notes or RTC.
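The counting behind an idea cloud is simple once the requirement text is out of the database. Here is a minimal sketch, assuming the requirements have been exported as a list of plain strings. The stop-word list and sample requirements are made up, and this says nothing about how to get the data out of Notes or RTC in the first place.

```python
import re
from collections import Counter

# Small, hypothetical stop-word list; a real one would be much longer.
STOP_WORDS = {"the", "and", "for", "with", "that", "this", "allow", "support"}


def keyword_counts(requirements, top=10):
    """Count how many requirements mention each keyword, most common first."""
    counts = Counter()
    for text in requirements:
        # Use a set so a word repeated inside one requirement counts once.
        found = set(re.findall(r"[a-z]+", text.lower()))
        counts.update(w for w in found if len(w) > 2 and w not in STOP_WORDS)
    return counts.most_common(top)


# Hypothetical requirements
reqs = [
    "Allow export of reports to PDF",
    "Export the audit report as PDF",
    "Support single sign-on for the admin console",
]
for word, n in keyword_counts(reqs, top=5):
    print(word, n)
```

Note that "report" and "reports" are counted separately here. A stemmer would merge them, at the cost of another dependency.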

Getting people to categorize their requirements when they open them could help too. At least that would help us group similar requirements together. RTC lets you enter keywords on stories, but that's freeform, so people may use different keywords for the same idea.

I would love to get some advice on how to improve this situation. Especially problems 3, 4, and 5.

Monday, March 22, 2010

Is Shift-Left Agile? And Death By Build

Sometimes people confuse "shift-left" and "agile" software development practices. They both involve testing earlier, and talking to customers earlier, but that's pretty much all they have in common.

"Shift-left" practices are ones that help you find and fix problems earlier. You can think of the software development life cycle as a time line, with gathering requirements on the left, then design, code, test, and field maintenance to the right. The earlier (and farther to the left) you find a problem, the less it will cost you to fix it. It takes extra time and effort to find problems earlier, but it's worth it in the long run. For example, if you create JUnit test cases for most of your new code, you will invest more time up front, but you'll almost certainly find some bugs even before the code is handed off to the testers.

Showing your code-in-progress to customers on a regular basis is also a "shift-left" practice, because you may find out that you are not delivering what the customers want early enough to change your design.

"Shift-left" practices can help make your project more agile, because if you catch bugs earlier, you don't need to have such a long test cycle at the end of a release. If you automate your tests and run the automated tests with every build, you can also find newly-introduced bugs in old code more easily, and cut down on the amount of regression testing and bug-fixing you have to do later.

But you can slow down a project with "shift-left" processes too. I know of a project where their build process takes 30+ hours. I'm assuming it takes so long because they're trying to do too much automated code analysis and automated unit testing as part of the build process. (My product is quite large, requiring several CDs, and it only takes about an hour to build everything.) Then, after the 30-hour build, they start their BVT (build verification test), which also takes many hours. The goal of testing during the build process is to catch bugs that escaped unit test before they get to the test team, which is a classic "shift-left" goal. But because the build process is so slow and painful, they only run it once a week! So if a code change misses a build, it's delayed by an entire week. And if someone's new code breaks the build (heaven forbid), the entire project is delayed by a week.

So what did they do to avoid accidentally delaying the project by a week? They implemented a heavy process of code reviews to make sure that no one integrates code that will break the build. Code reviews and inspections are another "shift-left" tool. So now you have to make your code changes by Wednesday. On Wednesday you have to find a team leader to review your code changes and approve them so they're ready for the Thursday build. Then the build and BVT processes run on Thursday, Friday, Saturday, and code is ready to be tested on Monday.

This is the opposite of an agile process. Yet, strangely enough, this team claims that they are following agile software development processes, specifically Scrum. I'm sure the people who designed their development process had the best intentions. However, one of the more important tenets of agile software development is that you have to get your code into the hands of testers, and customers, as soon as possible. When it can easily take a week or more just to get new code to the test team, you're losing many of the benefits of agile development processes.

Don't get me wrong, shift-left and agile processes can work together quite nicely. But they can also work against each other if you're not careful!

Does anyone else have any shift-left horror stories to share?

Wednesday, March 17, 2010

Open Source Software

Thanks to Husain for this topic suggestion...
"What Software Development Methodology can best fit in delivering an open Source Software??? After doing a little research I found no better than the Agile method!!"

Open Source (and Community Source) software is special, in that much of it is developed in tiny pieces by people working independently. It does lend itself to agile development, in that the requirements are not all collected in advance. Open source development cannot be done in a waterfall fashion, where all of the design for a release is done at one time, then all of the development, then all of the testing.

The one open source project I worked on used a project backlog, where people would pick up the next high-priority work item (story) that interested them. Someone would write a bit of code, test it, document it if needed, and submit it. Then the changes were approved and integrated by the project owner, and released to the public within a few weeks. So, adding a new feature/story was like a mini-sprint. However, the project did not use time-boxed iterations with fixed start and end dates. Each person was on their own schedule. It was very agile.

I think this highlights the fact that Agile software development methodologies are not one-size-fits-all. What works for a typical open source project would not work for my current project.

Our product consists of several components that depend upon each other. To implement my new feature, I may have to ask someone from another component team to write some new code for me. And someone else may use the new code I'm writing in their component. Because we are adding major new features across the board, we usually need to install the same version of every component, or they will not work together. We also need to plan some features a few months in advance: component A will implement the first piece, then component B will use that new feature and implement the second piece, and then component C will use features from A and B to implement the third piece. Our customers expect A, B, and C to all work together, so we need some time to stabilize the code and test all of the pieces together. So we have a couple of months at the end of a release where no one is allowed to make any major changes, and only bug fixes go in. We also use this time to test the code on additional platforms, and in different languages. This is less agile, but it's not realistic to expect us to ship a complex product like this without some dedicated time to stabilize everything and clean things up.

I have some stories about a team I know of that calls themselves agile, but really isn't. It's pretty silly what they do, really. More on this later...

Monday, March 8, 2010

When is a story "done done"?

When exactly is a story "done done", ready to be checked off the product backlog?

On the one hand, saying that a story is "done" too soon will leave your project with hidden debt. If you're still writing code for a story after you've said that it's done, then it's not done! And if you're still writing code in the next sprint for something that was theoretically done in the previous sprint, then a few bad things are happening:
  • Your velocity for the previous sprint will look too high.
  • Your velocity for the current sprint will look too low.
  • You may or may not be "in sync" with the test team and product management on what's really done and what still needs to be tested.
  • You lose the benefits of time-boxed iterations.

But what about things like Bidi enablement, visual design clean-up, or extensive logging? I would argue that much of this code hygiene work can and should be done toward the end of the release. Get the new function and risky changes out there so they have time to mature. Then save a sprint or so at the end for clean-up work.

I believe it's reasonable to create a story like "Add Bidi support to the following areas: ...". It's testable, and it's new function. Plus, you get yourself into a Bidi-enablement mode, and you can make a single pass through the code making the same changes everywhere. This is good because it decreases the amount of context-switching your brain has to do.

On the other hand, I believe that logging/tracing, JUnit testing, and factoring out text for translation should be done before a story is "done". Logging/tracing make it easier to debug the code from the beginning, so you don't waste time finding where the bugs are. JUnit testing finds bugs early, so you can fix them more quickly and easily. And factoring out messages is easier to do while you're in the code; you'll inevitably miss some translatable text if you try to do it all later.

Performance testing is a tricky one. You can't leave it until the end because you may find that you need to make significant changes to your algorithm to improve performance, and then everything will need to be re-tested. But you could ship a product with new function that takes too long to run, if you ran out of time for performance testing. If a new feature warrants performance testing, I would recommend doing the performance testing one sprint after the new code goes in.

In my current project, we say a story is "done done" when all of the function test scenarios have been executed, and it has no severity 1 (the function doesn't work, and there's no work-around), severity 2 (the function has bugs that you can work around), or must-fix (as determined by the testers) defects that are still not closed. It may have severity 3 (small problem) or 4 (annoyance) defects. It also has to be demoed at the end of the sprint.
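Written out as a check, our rule looks something like the sketch below. The class and field names are hypothetical; they are not taken from RTC or any other tool, and the sample story is invented.

```python
from dataclasses import dataclass, field


@dataclass
class Defect:
    severity: int            # 1 = no work-around ... 4 = annoyance
    must_fix: bool = False   # flagged by the testers
    closed: bool = False


@dataclass
class Story:
    title: str
    scenarios_total: int     # function test scenarios planned
    scenarios_executed: int
    demoed: bool             # shown at the end-of-sprint demo
    defects: list = field(default_factory=list)


def is_done_done(story):
    """True when all scenarios have run, the story was demoed, and no
    severity 1, severity 2, or must-fix defect is still open."""
    blocking = [
        d for d in story.defects
        if not d.closed and (d.severity <= 2 or d.must_fix)
    ]
    all_tested = story.scenarios_executed >= story.scenarios_total
    return all_tested and story.demoed and not blocking


# Hypothetical story: one closed severity 1 defect, one open severity 3
story = Story("Export report as PDF", 12, 12, True,
              [Defect(1, closed=True), Defect(3)])
print(is_done_done(story))   # True: only a small problem is still open

story.defects.append(Defect(4, must_fix=True))
print(is_done_done(story))   # False: an open must-fix defect blocks it
```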

I know someone who works on a project where there is a long list of criteria to be met before a story is "done done done" as they call it. In addition to code hygiene, it also has to go through system test, translation, accessibility testing, and so on. As a result, no story is marked as "done" until the product is about to go out the door. This makes their story points meaningless. It's overkill!

On the other hand, it's not appropriate to say that a story is "done" just because you've completed unit testing on it. If the testing is not completed by the last day of the sprint, the story needs to be moved to the next sprint as debt. I've seen people try to fudge their way out of this... "well, this story only has one open defect, so we should get credit for it". We need to be at peace with sprint-to-sprint debt. The good news is that the stories that are almost done should be closed quickly at the beginning of the next sprint, so your velocity for the next sprint will be higher. Over time your story point velocity will average out to an accurate number.
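Here is a small sketch of how that averaging works. A story only earns its points in the sprint where it actually closes, so a story that slips makes one sprint look low and the next look high, while the average stays put. The numbers and function names are invented for illustration.

```python
from collections import defaultdict


def velocity_by_sprint(stories):
    """stories: iterable of (points, sprint_closed) pairs, where
    sprint_closed is None for a story that is still open."""
    totals = defaultdict(int)
    for points, sprint_closed in stories:
        if sprint_closed is not None:
            totals[sprint_closed] += points
    return dict(sorted(totals.items()))


def average_velocity(stories):
    """Mean points closed per sprint, over sprints that closed something."""
    per_sprint = velocity_by_sprint(stories)
    if not per_sprint:
        return 0.0
    return sum(per_sprint.values()) / len(per_sprint)


# Hypothetical team that planned 20 points in each of two sprints.
# The 8-point story planned for sprint 1 slipped and closed in sprint 2.
stories = [(5, 1), (7, 1), (8, 2), (5, 2), (8, 2), (7, 2)]
print(velocity_by_sprint(stories))   # {1: 12, 2: 28}
print(average_velocity(stories))     # 20.0
```

One simplification to be aware of: a sprint that closes nothing at all does not show up in the average here, which would flatter the number.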

I'd love to hear what some other teams are using as their "done done" criteria.

Wednesday, March 3, 2010

Using Story Points for Sprint Planning?

Our team is still trying to get the hang of using story points. Until very recently, we weren't using them at all. I think this is for a few reasons:
  • Story points are relative to each other, not to the calendar. This makes them more abstract than person-days, so they're a little hard to wrap your head around at first.
  • We weren't in the habit of playing planning poker to put points on our stories. On the rare occasion we did put points onto stories, they were really just bastardized person-days (1 story point = 1 person-day).
  • Because we didn't have story points on things we had already completed, we couldn't use previously completed stories as a reference to put points onto new stories.
  • Because we didn't have any historical data on how many story points we closed in previous sprints, we had no idea what our velocity was, so we couldn't use story points or velocities to help plan the next release.
  • Story points aren't detailed enough for sprint planning. If you know that you have two weeks to code, and 3 people writing the code, how many story points can you commit to in the sprint? It doesn't have any meaning.

It's a vicious cycle: story points aren't useful if you haven't used them before... so people don't try to use them... so you don't gather the historical data to make them useful...

The problem is that we were spending too much time doing sizing estimates. As soon as you try to size something in terms of person-weeks, that implies that you have a pretty good idea what the low-level tasks are, and how long it will take to complete them. Honestly, it wouldn't be unusual for us to invest a person-week into sizing a story, when you consider how many people were pulled into each sizing effort for a few hours apiece. So we were told that we had to start using story points, and stop spending so much time on the sizing estimates. In planning poker, you rarely spend more than 2-3 minutes sizing a story.

People scoffed at the idea that you could plan a sprint without getting down to the task level and estimating how long it would take to implement each story, though.

Then someone found this article:
http://blog.mountaingoatsoftware.com/why-i-dont-use-story-points-for-sprint-planning

Now, I think we will be using story points for release planning (meaning, planning far into the future), and task hours/person-days for planning each individual sprint.

This means that we'll have to take a leap of faith. We have to start assigning points to stories before we start working on them. Once we've done that for a couple of sprints, we'll have some idea what our velocity is. Then we can start to use that information for planning at the release level. We'll see how it goes.
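Once there is some history, the release-level arithmetic is short: add up the points on the backlog and divide by the average velocity. The sketch below uses invented numbers and a made-up function name; it is a back-of-the-envelope estimate, not a planning tool.

```python
import math


def sprints_needed(backlog_points, past_velocities):
    """Estimate how many sprints it takes to finish the backlog,
    using the mean of the velocities from past sprints."""
    if not past_velocities:
        raise ValueError("need at least one completed sprint for a velocity")
    velocity = sum(past_velocities) / len(past_velocities)
    if velocity <= 0:
        raise ValueError("velocity must be positive")
    return math.ceil(sum(backlog_points) / velocity)


# Hypothetical backlog: 63 points in total, average velocity of 20
backlog = [8, 5, 13, 3, 8, 5, 13, 8]
history = [18, 22, 20]
print(sprints_needed(backlog, history))   # 4
```

This only tells you how many sprints to expect. Within each sprint we would still plan in task hours or person-days.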

So, what does your team do with story points and sprint planning? Is it working?

What is AgileFall?

AgileFall is a tongue-in-cheek term for a software development model where you are trying to be agile, but you keep falling into waterfall development habits. For example, a team practicing AgileFall may say:
  • We have sprints, but they are four or six weeks long.
  • We try to do most of the development at the beginning of the sprint and most of the testing at the end of the sprint.
  • We have a product backlog with priorities, but we start working on a release with a long list of features already committed to the business.
  • We know our product release dates several months in advance. There's a long list of target dates that must be met before the release date.
  • We test new features as they are developed in each sprint, but we also have a lonnnng system / globalization / translation / integration test cycle at the end of the release.
  • We sometimes spend too much time up front designing the software before we even start prototyping it.
  • We do our initial sizings in person-weeks, because we're not comfortable with story points and velocities yet.
  • When management asks us for a rough sizing, we might spend days working on that sizing effort, because we feel like we need to really understand the tasks involved before we "commit" to a sizing.
  • People get upset (or defensive) when stories are not completed in the sprint they were planned for.
  • We demo our software in development to customers, but only after we're happy with it. By the time we get around to having a demo, it might be too late to change much.

Is your team practicing AgileFall? In what ways?

Introduction

Welcome to my AgileFall blog! I've been creating software since 1995. I have a Bachelor's of Science in Computer Science from Duke, and a Master's of Science in Computer Science from UNC, where I focused on Distributed Computing and Software Engineering. I studied software engineering theory, including the waterfall and agile models, in school. For the first 10 years or so of my career at IBM, we followed a strict waterfall development model. Then about 2 years ago we started switching to an agile development model. Our development and support team has also become increasingly global. This blog is about our successes and failures.