Marco Craveiro: 01/01/2016

Nerd Food: Interesting…

Time to flush all those tabs again. Some interesting stuff I bumped into recently-ish.

Finance, Economics, Politics

Why Big Oil Should Kill Itself: This is a really, really interesting article. The gist of it is that the entire logic around oil exploration is now a fallacy and it makes more economic sense to simply give up looking for oil because all oil that is left is just to expensive to commercialise. It also has a very interesting take on the valuation of oil companies (and sources of take overs) but I won't spoil it for you. If you are into oil (or against it), its a must read.
Oil Goes Nonlinear: Short but thought provoking. I don't tend to agree with Krugman on a lot of things, but quite like this analysis.
Africa’s Boom Is Over: And the bad news continue. Totally spot on analysis of what will befall us.
American Spring: interesting take on the state of affairs of American politics. Not sure I agree with everything, but definitely food for thought. "Statistically speaking, what are the odds that the two most qualified candidates to be president out of 300 million people are siblings? Or married?" Indeed.
A Year of Sovereign Defaults?: Very good and very scary. This has to be on the cards, the only question is the timing.
Really rich people are suddenly paying quite a bit more in taxes: some good news on the equality front I guess. But not quite sure it makes much of a difference in the big scheme of US things.
Argentina's 'little trees' getting chopped down by new president: Seems like Argentina is going to go through yet another turbulent period, with some good and bad news coming out. Interesting take on the impact to the less well off of the new policies. The chap is certainly a doer, it seems: A fast start.

Startups et al.

The WTF Economy: Tim O'Reilly (of publishing house fame) is setting up a conference in the future of work. Sounds extremely interesting. Hopefully, they will have a section dedicated to the developing world. Source: Tim O'Reilly (Twitter)
Bitcoin is Being Hot-Wired for Settlement: Garzik is at it again. Interesting news of cryptos in the settlement front. Source: Jeff Garzik (Twitter)
Elon Musk’s Billion-Dollar AI Plan Is About Far More Than Saving the World: So it seems Musk and Altman want to ensure AI plays nice. Not quite sure he's right on this one. Steven Levy's version is available here, more of an interview with Musk.
License to (Not) Drive: Levy gets to try the Google self-driving cars. Very interesting.
Hire Literally Anyone: Extremely interesting. I always thought the existing hiring practices are not very well thought out, but this article makes me realise that the flaws are deeper than I expected. While we are on this topic, this ain't too bad either: How to Hire.
The resolution of the Bitcoin experiment: Great - nay - insanely great analysis on the state of affairs in the BTC world. Spoken with authority.
On the dangers of a blockchain monoculture: Pours more petrol in the raging BTC fire. Very interesting. Never saw BTC as a monoculture, but actually it so is.
The Final Days of the Bitcoin Foundation?: And yet some more on the BTC impending doom. I just gotta stop reading about it now, the whole saga is far too depressing. Lets hope the technology survives where humans failed.
IBM Talks Open Ledger Project, Bright Future for Blockchain: Still trying to catch my breath on all the BTC articles coming out, and lo-and-behold, I missed the whole Open Ledger thing.
Apploitation in a city of instaserfs: Scary. Very scary. Reminiscent of the older Mirani article The secret to the Uber economy is wealth inequality. San Francisco is becoming more like Mumbai and that is not good news.

General Coding

Feeding Graph databases - a third use-case for modern log management platforms: Very interesting ideas on how to use logging data in a graph database. Sounds extremely counter-intuitive, and then you start reading at which point its like "Damn, why didn't I think of that before!". Source: Hacker News
Moores law hits the roof: Seems like the exponential function is revealing itself as a sigmoid, as everyone knew it would. Some of the cracks that are already present in Moore's law. Interesting to note that a transistor is now only a few silicon atoms wide - meaning we can't really make it much smaller. Source: Hacker News
No, I Don't Want To Configure Your App: Call to arms to get us all thinking on just how many configuration knobs you need to use something. Source: Hacker News
Your IDE Is Killing You: Somewhat preaching to the choir, since I am an Emacs user of old, but still a very cogent argument on why relying too much on IDEs is not a good thing. Source: Bruno Antunes (twitter)
Starters and Maintainers: The different personas around an open source project. Interesting, its good to be aware of which hat you are wearing when.
I Moved to Linux and It’s Even Better Than I Expected: A feel good story about the Linux desktop. Given how slowly things are progressing on that front, we all need one of these some times to cheer us up. Main value of the article though.

Databases

Encrypted databases with ZeroDB: I'm not exactly impressed with the technology itself, but more with the ideas one can extract from it. Briefly: what if the database only stores encrypted data, which only each client can decrypt? This is certainly a very useful thing for certain types of information and a PostgreSQL extension would be most useful. Source: Hacker News
Introduction to PostgreSQL physical storage: Great article on Postgres low-level details. One to read if you want to get serious about the Elephant but are not yet in the know.
Schema based versioning and deployment for PostgreSQL: Tips on how to manage versions for your stored procs, and also contains links for table management. For those of us not totally taken by NoSQL.

C++

Lessons learnt from 10+ years with actors in C++: The voice of experience talks about what they learned from using Actors over more than a decade. Worth reading if you are into that pattern.
Automating a C++ program from a Node.js web app: If you are considering exposing your C++ code into JS, this is a series of posts to read.
Starting a tech startup with C++: lots of libraries I never heard of and an insight on the performance differences between python and c++.
Writing modern C++ servers using Wangle: The follow up to the previous post, explaining how to write servers with Facebook technologies.
I want my pony! Or why you cannot have C++ exceptions with a stack trace: very interesting. Since I started using Boost.Exception I never missed the stack traces either. Source: Hacker News

Layman Science

Why String Theory Is Not A Scientific Theory: Doesn't say a lot of new things, but its good to remind ourselves on what exactly do we mean when we say "Science". This would save us from a lot of grief, such as considering Economics as a Science.
The cold fusion horizon: … talking about Science, I was surprised to find out that people are still talking seriously about cold fusion. Interesting article, because it takes the flip side of the Science coin: nothing should not be science unless it is not using the scientific method. Whilst up til now cold fusion has been more of a hoax, we should not discredit people who work on it provided they are following scientific principles. Who knows, they may be right in the end. Science is all about long-shots.

Other

Exit Sandman: Neil Gaiman goes in-depth with Overture, one of 2015's best comics: For the Sandman fans, the new (and last) Sandman book is all the rage. A great interview by the man himself.
Dear Zachary: bumped into this via Wait But Why, and, as usual, great tip. Fantastic documentary.
Solaris: Always wanted to watch this Tarkovsky movie and now it seems it is available online! This is part of an initiative described by Open Culture here. Source: Bruno Antunes (twitter)
Wittgenstein: A Wonderful Life: Found a Wittgenstein documentary, but sadly haven't had time to watch it just yet. In my watch list though.
Je ne suis pas Charlie: Haven't yet watched it but seems thought-provoking. Watch listed.
Tulipa Ruiz - Efêmera - Album Completo: New musical find in the Brazilian space (Portuguese).

Created: 2016-01-18 Mon 12:49

Emacs 24.5.1 (Org mode 8.2.10)

Validate

Nerd Food: On Product Backlog

Would be be good to have a better bug-tracking setup? Yes. But I think it takes man-power, and it would take something *fundamentally* better than bugzilla. -- Linus

Many developers in large companies tend to be exposed to a strange variation of agile which I like to call "Enterprise Grade Agile", but I've also heard it called "Fragile" and, most aptly, "Cargo-Cult Agile". However you decide to name the phenomena, the gist of it is that these setups contain nearly all of the ceremony of agile - including stand-ups, sprint planning, retrospectives and so on - but none of its spirit. Tweets such as this are great at capturing the essence of the problem:

Top tip: if you need to bring a notepad to the daily stand up to tell us what you did yesterday that's too many details
— Fran Buontempo (@fbuontempo) January 12, 2016

Once you start having that nagging feeling of doing things "because you are told to", and once your stand-ups become more of a status report to the "project manager" and/or "delivery manager" - the existence of which, in itself, is rather worrying - your Cargo Cult Agile alarm bells should start ringing. As I see it, agile is a toolbox with a number of tools, and they only start to add value once you've adapted them to your personal circumstances. The fitness function that determines if a tool should be used is how much value it adds to all (or at least most) of its users. If it does not, the tool must be further adapted or removed altogether. And, crucially, you learn about agile tools by using them and by reflecting on the lessons learned. There is no other way.

This post is one such exercise and the tool I'd like to reflect on is the Product Backlog. Now, before you read through the whole rant, its probably worth saying that this post takes a slightly narrow and somewhat "advanced" view of agile, with a target audience of those already using it. If you require a more introductory approach, you are probably better off looking at other online resources such as How to learn Scrum in 10 minutes and clean your house in the process. Having said that, I'll try to define terms best I can to make sure we are all on the same page.

Working Definition

Once your company has grokked the basics of agile and starts to move away from those lengthy specification documents - those that no one reads properly until implementation and those that never specified anything the customer wanted, but everything we thought the customer wanted and then some - you will start to use the product backlog in anger. And that's when you will realise that it is not quite as simple as memorising text books.

So what do the "text books" say? Let's take a fairly typical definition - this one from Scrum:

The agile product backlog in Scrum is a prioritized features list, containing short descriptions of all functionality desired in the product. When applying Scrum, it's not necessary to start a project with a lengthy, upfront effort to document all requirements. Typically, a Scrum team and its product owner begin by writing down everything they can think of for agile backlog prioritization. This agile product backlog is almost always more than enough for a first sprint. The Scrum product backlog is then allowed to grow and change as more is learned about the product and its customers.¹

This is a good working definition, which will suffice for the purposes of this post. It is deceptively simple. However, as always, one must remember Yogi Berra: "In theory, there is no difference between theory and practice. But in practice, there is."

Potmenkin Product Backlogs

Many teams finish reading one such definition, find it amazingly inspiring, install the "agile plug-in" on their bug-tracking software of choice and then furiously start typing in those tickets. But if you look closely, you'd be hard-pressed to find any difference between the bug tickets of old versus the "stories" in the new and improved "product backlog" that apparently you are now using.

This is a classic management disconnect, whereby a renaming exercise is applied and suddenly, Potemkin village-style, we are now in with the kool kids and our company suddenly becomes a modern and desirable place to work. But much like Potemkin villages were not designed for real people to live in, so "Potmenkin Product Backlogs" are not designed to help you manage the lifecycle of a real product; they are there to give you the appearance of doing said management, for the purposes of reporting to the higher eschelons and so that you can tell stakeholders that "their story has been added to the product backlog for prioritisation".

Alas, very soon you will find that the bulk of the "user stories" are nothing but glorified one-liners that no one seems to recall what exactly they're supposed to mean, and those few elaboratedly detailed tickets end up rotting because they keep being deprioritised and now describe a world long gone. Soon enough you will find that your sprint planning meetings will cover less and less of the product backlog - after all, who is able to prioritise this mess? Some stories don't even make any sense! The final act is when all stories worked on are stories raised directly on the sprint backlog, and the product backlog is nothing but the dumping ground for the stories that didn't make it on a given sprint. At this stage, the product backlog is in such a terrible mess that no one looks at it, other than for the occasional historic search for valuable details on how a bug was fixed. Eventually the product backlog is zeroed - maybe a dozen or so of the most recent stories make it through the cull - and the entire process begins anew. Alas, enlightenment is never achieved, so you are condemned to repeat this cycle for all eternity.

As expected, the Potmenkin Product Backlog adds very little value - in fact it can be argued that it detracts value - but it must be kept because "agile requires a product backlog".

Bug-Trackers: Lessons From History

In order to understand the difficulties with a product backlog, we turn next to their logical predecessors: bug-tracking systems such as Bugzilla or Jira. This post starts with a quote from the kernel's Benevolent Dictator that illustrates the problem with these. Linus has long taken the approach that there is no need for a bug-tracker in kernel development, although he does not object if someone wants to use one for a subsystem. You may think this is a very primitive approach but in some ways it is also a very modern approach, very much in line with agile; if you have a bug-tracking system which is taking time away from developers without providing any value, you should remove the bug-tracking system. In kernel development, there simply is no space for ceremony - or, for that matter, for anything which slows things down².

All of which begs the question: what makes bug-tracking systems so useless? From experience, there are a few factors:

they are a "fire and forget" capture system. Most users only care about entering new data, rather than worrying about the lifecycle of a ticket. Very few places have some kind of "ticket quality control" which ensures that the content of the ticket is vaguely sensible, and those who do suffer from another problem:
they require dedicated teams. By this I don't just mean running the bug-tracking software - which you will most likely have to do in a proprietary shop; I also mean the entire notion of Q&A and Testing as separate from development, with reams of people dedicated to setting "environments" up (and keeping them up!), organising database restores and other such activities that are incompatible with current best practices of software development.
they are temples of ceremony: a glance at the myriad of fields you need to fill in - and the rules and permutations required to get them exactly right - should be sufficient to put off even the most ardent believer in process. Most developers end up memorising some safe incantation that allows them to get on with life, without understanding the majority of the data they are entering.
as the underlying product ages, you will be faced with the sad graph of software death. The main problem is that resources get taken away from systems as they get older, a phenomena that manifests itself as a growth in the delta between the number of open tickets against the number of closed tickets. This is actually a really useful metric but one that is often ignored.³.

And what of the newest iterations on this venerable concept such as GitHub Issues? Well, clearly they solve a number of the problems above - such as lowering the complexity and cost barriers - and certainly they do serve a very useful purpose: they allow the efficient management of user interactions. Every time I create an issue - such as this one - it never ceases to amaze me how easily the information flows within GitHub projects; one can initiate comms with the author(s) or other users with zero setup - something that previously required mailinglist membership, opening an account on a bug-tracker and so forth. We now take all of this for granted, of course, but it is important to bear in mind that many open source projects would probably not even have any form of user interaction support, were it not for GitHub. After all, most of them are a one-person shop with very little disposable time, and it makes no sense to spend part of that time maintaining infrastructure for the odd person or two who may drop by to chat.

However, for all of its glory, it is also important to bear in mind that GitHub Issues is not a product backlog solution. What I mean by this is that the product backlog must be owned by the team that owns the product and, as we shall see, it must be carefully groomed if it is to be continually useful. This is at loggerheads with allowing free flow of information from users. Your Issues will eventually be filled up with user requests and questions which you may not want to address, or general discussions which may or may not have a story behind it. They are simply different tools for different jobs, albeit with an overlap in functionality.

So, history tells us what does not work. But is the product backlog even worth all this hassle?

Voyaging Through Strange Seas of Thought

One of the great things about agile is how much it reflects on itself; a strange loop of sorts. Presentations such as Kevlin Henney's The Architecture of Uncertainty are part of this continual process of discovery and understanding, and provide great insights about the fundamental nature of the development process. The product backlog plays - or should play - a crucial role exactly because of this uncertain nature of software development. We can explain this by way of a device.

Imagine that you start off by admitting that you know very little about what it is that you are intending to do and that the problem domain you are about to explore is vast and complex. In this scenario, the product backlog is the sum total of the knowledge gained whilst exploring this space that has yet not been transformed into source code. Think of it like the explorer's maps in the fifteen-hundreds. In those days, "users" knew that much of it was incorrect and a great part was sketchy and ill-defined, but it was all you had. Given that the odds of success were stacked against you, you'd hold that map pretty tightly while the storms were raging about you. Those that made it back would provide corrections and amendments and, over time, the maps eventually converged with the real geography.

The product backlog does something similar, but of course, the space you are exploring does not have a fixed geometry or topography and your knowledge of the problem domain can actively change the domain itself too - an unavoidable consequence of dealing with pure thought stuff. But the general principle applies. Thus, in the same way a code base is precious because it embodies the sum total knowledge of a domain - heck, in many ways it is the sum total knowledge of a domain! - so the product backlog is precious because it captures all the known knowledge of these yet-to-be-explored areas. In this light, you can understand statements such as this:

When your product backlog is empty, your product is dead - @KevlinHenney #agileotb
— Marc Johnson (@marcjohnson) September 4, 2014

So, if the backlog is this important, how should one manage it?

Works For Me, Guv!

Up to this point - whilst we were delving into the problem space - we have been dealing with a fairly general argument, likely applicable to many. Now, as we enter the solution space, I'm afraid I will have to move from the general to the particular and talk only about the specific circumstances of my one-man-project Dogen. You can find Dogen's product backlog here.

This may sound like a bit of a cop out, you may say, and not without reason: how on earth are you supposed to extrapolate conclusions from a one-person open source project to a team of N working on a commercial product? However, it is also important to take into account what I said at the start: agile is what you make of it. I personally think of it as a) the smallest amount of processes required to make your development process work smoothly and b) and the continual improvement of those processes. Thus, there are no one-size-fits-all solutions; all one can do is to look at others for ideas. So, lets look at my findings⁴.

The first and most important thing I did to help me manage my product backlog was to use a simple text file in Org Mode notation. Clearly, this is not a setup that is workable for a development team much larger than a set of one, or one that doesn't use Emacs (or Vim). But for my particular circumstances it has worked wonders:

the product backlog is close to the code, so wherever you go, you take it with you. This means you can always search the product backlog and - most importantly - add to it wherever you are and whenever an idea happens to come by. I use this flexibility frequently.
the Org Mode interface makes it really easy to move stories up and down (order is taken to mean priority here) and to create "buckets" of stories according to whatever categorisation you decide to use, up to any level of nesting. At some point you end up converging to a reasonable level of nesting, of course. It is surprising how one can manage very large amounts of stories thanks to this flexible tree structure.
it's trivial to move stories in and out of a sprint, keeping track of all changes to a story - they are just text that can be copy and pasted and committed.
Org Mode provides a very capable tagging system. I first started by overusing these, but when tagging got too fine grained it became unmaintainable. Now we use too few - just epic and story - so this will have to change again in the near future. For example, it should be trivial to add tags for different components in the system or to mark stories as bugs or features, etc. Searching then allows you to see a subset of the stories that match those labels.

A second decision which has proven to be a very good one has been to groom the product backlog very often. And by this I don't just mean a cursory look, but a deep inspection of all stories, fixing them where required. Again, the choice of format has proved very helpful:

it is easy to mark all stories as "non-reviewed" or some other suitable tag in Org Mode, and then unmark them as one finishes the groom - thereby ensuring all stories get some attention. As the product backlog becomes larger, a full groom could take multiple sprints, but this is not an issue once you understand its value and the cost of having it rot.
because the product backlog is with the code, any downtime can be used for grooming; those idle weekends or that long wait at the airport are perfect candidates to get a few stories looked at. Time spent waiting for the build is also a good candidate.
you get an HTML representation of the Org Mode file for free in GitHub, meaning you can read your backlog from your phone. And with the new editing functionality, you can also edit stories too.

Thirdly, I decided to take a "multi-pass" approach at managing the story lifecycle. These are some of the key aspects of this lifecycle management:

stories can only be captured if they are aligned with the vision. This filter saves me from adding all sorts of ideas which are just too "out of the left field" to be of practical use, but keeps those that may sound crazy are but aligned with the vision.
stories can only be captured if there is no "prior art". I always perform a number of searches in the backlog to look for anything which covers similar ground. If found, I append to that.
new stories tend to start with very little content - just the minimum required to allow resetting state back to the idea I was trying to capture. Due to this, very little gets lost. At this point, we have a "proto-story".
as time progresses, I end up having more ideas on this space, and I update the story with those ideas - mainly bullet points with one liners and links.
at some point the story begins to mature; there is enough on it that we can convert the "proto-story" to a full blown story. After a number of grooms, the story becomes fully formed and is then a candidate to be moved to a sprint backlog for implementation. It may stay in this state ad-infinitum, with periodic updates just to make sure it does not rot.
A candidate story can still get refined: trimmed in scope, re-targeted, or even cancelled because it no longer fits with the current architecture or even the vision. Cancelled stories are important because we may come back to them - its just very unlikely that we do.
every sprint has a "sprint mission"⁵. When we start to move stories into the sprint backlog, we look for those which resonate with the sprint mission. Not all of them are fully formed, and the work on the sprint can entail the analysis required to create a full blown story. But many will be implementable directly off of the product backlog.
some times I end up finding related threads in multiple stories and decide to merge them. Merging of related stories is done by simply copying and pasting them into a single story; over time, with the multiple passes done in the grooms, we end up again with a single consistent story.

What all of this means is that a story can evolve over time in the product backlog, only to become the exact thing you need at a given sprint; at that point you benefit from the knowledge and insight gained over that long period of time. Some stories in Dogen's backlog have been there for years, and when I finally get to them, I find them extremely useful. Remember: they are a map to the unknown space you are exploring.

With all of this machinery in place, we've ended up with a very useful product backlog for Dogen - one that certainly adds a lot of value. Don't take me wrong, the cost of maintenance is high and I'd rather be coding instead of maintaining the product backlog, especially given the limited resources. But I keep it because I can see on a daily basis how much it improves the overall quality of the development process. It is a price I find worth paying, given what I get in return.

Final Thoughts

This post was an attempt to summarise some of the thoughts I've been having on the space of product backlogs. One of its main objectives was to try to convey the importance of this tool, and to provide ideas on how you can improve the management of your own product backlog by discussing the approach I have taken with Dogen.

If you have any suggestions or want to share your own tips on how to manage your product backlog please reach me on the comments section - there is always space for improvement.

Footnotes:

Source: Scrum Product Backlog, Mountain Goat Software.

A topic which I covered some time ago here: On Evolutionary Methodology. It is also interesting to see how the kernel processes are organised for speed: How 4.4's patches got to the mainline.

Another topic which I also covered here some time ago: On Maintenance.

⁴

I am self-plagiarising a little bit here and rehashing some of the arguments I've used before in Lessons in Incremental Coding, mainly from section DVCS to the Core.

⁵

See the current sprint backlog for an example.

Created: 2016-01-17 Sun 23:55