Sunday, October 27, 2013

ACA/Obamacare website team develops new extension to Brooks' Law

*** Updated: The IRS system to support Obamacare/ACA does work.  As a result of which we learn of yet another hole in

Some day the political furor over the introduction of government-run health care will simmer down, and it will either start working or we will be stuck with it anyway (since that's the way government works).  At that point, academe will be free to climb down from its partisan cheerleading role and do some actual research into the issues that surface every month or so [or perhaps into the reasons why the issues do not surface until then], at which point this program will be included in every textbook as the defining case study on governance issues.  Meanwhile, you can look into governance issues via this blog ;-) ***

In "The Mythical Man-Month", Fred Brooks posited what has come to be known as "Brooks' Law": adding resources to a late project simply makes it later.  The reasoning is that the existing team now has to stop much of what it was doing to explain what is going on to the newbies rather than getting on with the actions that they already know they need to take.
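For the numerically inclined, here is a minimal sketch of the arithmetic behind that reasoning. Brooks notes that if everyone on a team must coordinate with everyone else, the number of pairwise communication channels is n(n-1)/2, so the coordination burden grows much faster than the headcount. The function name and the team sizes below are purely illustrative assumptions, not figures from the ACA project.

```python
def communication_paths(team_size: int) -> int:
    """Pairwise communication channels in a team of the given size: n(n-1)/2."""
    if team_size < 0:
        raise ValueError("team_size must be non-negative")
    return team_size * (team_size - 1) // 2


if __name__ == "__main__":
    # Hypothetical team sizes, chosen only to show how quickly the channels grow.
    for size in (5, 8, 12):
        print(f"{size} people -> {communication_paths(size)} channels")
```

Growing a team from 5 to 8 people adds three heads but nearly triples the channels, which is one way to see why the newcomers can cost more time than they contribute at first.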

The Affordable Care Act (better known as Obamacare) website project has recently undertaken what it calls a "surge" -- somewhat ironic, considering Obama's opposition to the surges in Iraq and Afghanistan -- to correct the problems in the ACA healthcare website and enrollment process.

At first sight this would appear to be an interesting and very public case study in whether Brooks' Law can be overcome. But Brooks' Law talks about technical resources, and it does not appear that the repair concept is about adding any significant numbers of contractors (i.e. developers or testers).  On the contrary, the contractors (in a display that was certainly unique in my experience) threw their contract-holders under the bus and claimed that the only resource lacking was project management skills on the part of the government.  Indeed, the primary change appears to have been the assignment of OMB Deputy Director Jeff Zients to give the matter his attention.  At least he has evidently taken one of the elementary steps of project management, which is to hold a review to determine what the plan actually is, what the status is, and what the issues are.  Elementary, perhaps, but we are to gather that this concept is entirely novel for this program.

So it would seem that HHS and OMB have developed an extension to Brooks' Law - we can call it the Zients Law:  "while the addition of resources could delay a project, bringing a project manager on board may well be a necessary condition of getting a project back under control."

*** Update:  Within days, Zients announced that the web site would achieve full operating capability in 30 days, a breathtakingly remarkable assertion. Even simple things in government take more than 30 days. At the end of 30 days there was no announcement of anything: the manual enrollment process continued, and Zients slunk back to OMB with no fanfare.   In January 2014 the enrollment figures zoomed -- as the result of a back-end process moving millions of people from Medicaid to ACA.

To be fair, Zients, or somebody, did put a lot of project management things into place.  A new contractor was brought on board at the turn of the year and they must have done a bang-up job because within weeks (March), more or less in the nick of time to meet the enrollment deadline, the website apparently handled several million last-minute on-line enrollments, thereby proving both the financial viability of the program itself and of course the technical viability of the website ***

I'll go out on a limb here and say that if (as claimed) senior executives and appointees never inquired into the status of this administration's number one domestic policy initiative, then they are so incompetent as to deserve removal for that.  On the other hand, if the project managers at any level were fully aware of the issues and chose not to pass that information up the chain, or worse to alter it, then they need to be terminated - not "allowed to retire", but terminated without future pensions and be darned glad that they are not being subjected to recoupment of the cost of this botched effort.

*** Update: CIOs are required to certify the condition of their major investments on the Federal IT Dashboard. It's managed by OMB -- which is part of the Executive Office of the President -- and it's quite a big deal. Lateness is not tolerated.  Except, apparently, sometimes.  The Dashboard, which is publicly available -- usually -- reveals that in January 2013 the project was suddenly listed as RED because -- well, because it wasn't passing planned tests.  The following month it went back to green.  In August 2013 (right about the time that the issues with the September rollout HAD to have become obvious to anybody working on it) the ENTIRE Federal dashboard reporting process simply stopped.  Coincidence? Well, I guess you can't show a RED on a dashboard that isn't being updated, now can you?  Maybe Zients, as the OMB Deputy Director, knows how that happened.   Anyway, in April the data stream started back up again.

So who's been fired?  Well, nobody.  A couple of people have been retired in regular order and Secretary Sebelius moved on amid the usual political and media swooning about her greatness.  The contractor was eventually replaced by another based on its track record of having done a similar project for one of the states (well, there's a good idea -- except that later it transpired that the state's exchange wasn't doing so well either). And 6 months later we learn that the system has been making millions of dollars of erroneous payments of subsidies.  It is not the system's fault that the subsidies themselves are also coming under scrutiny as to whether they are legal at all.  How do we know any of this?  Because the IRS system supporting ACA, which does work, is finding discrepancies in tax returns between the filers' incomes, the subsidies to which they are entitled, and the subsidies they have been getting.  So the technical solution is not impossible to attain; the IRS has already figured it out. ***

Thursday, October 17, 2013

Metadata more important than the payload? Exhibit A.

As noted in an earlier post, FedEx founder Fred Smith regarded the company's data about its packages as more important than the packages themselves.  How can this be?  Isn't the reason people use FedEx that they trust the company to handle their treasured packages with a bit more care and accountability than the alternative?  As with all worthy epigrams, the seemingly nonsensical and heretical remark contains some deep truths.  Of course I expect FedEx to value the actual package.  The point is that there is not much real value in taking very good care of the package if you don't actually remember where you put it.

This one strikes home for me, because I have experienced the opposite effect.  This isn't really a bash on United Airlines - after 18 years I've gotten over it - sort of, although apparently I remember every grim detail so maybe not.  But it is a great illustration of Fred Smith's point.

When my wife and I left for the UK on our honeymoon, the trousseau came along in two checked suitcases.  I only needed one for my clothes, and you should have no trouble guessing which one of the 3 bags actually made it to the conveyor belt in Heathrow.  Bad enough that we had to go purchase emergency gear in one of the most expensive cities in the world, but then because United (yes, you get named here) had absolutely no idea where these bags were, and very little system-wide data visibility, we had to go stopping into their office almost every day for 2 weeks in order to try and get some information and to pick up our daily ration of $20 which of course does not last more than a few minutes in London.  Needless to say, my better half was not showing up for multiple nice events in the same outfit, so I was in hock for several nice Peter Jones outfits by the time the trip ended.  Ouch.  But it gets worse.

The only consistent information we got was that all the bags they cannot reunite with their owners are eventually carted off by policy to the Bag Mountain in Chicago, so at least after 3-4 weeks we would know where the bags were ... if, of course, United still had them under control at all.  On returning to Dulles airport, it occurred to me to stop by the baggage desk in case there was an update on the situation from within the US where maybe the interoffice communications were more effective.  By golly, yes!! The bags had been found and flown off that very morning ... to Chicago.  This was actually recorded in the same electronic file that documented the effort made to date to find the bags, so the only thing I could not work out is why (being in possession of my itinerary) they didn't just hold them and give them to me that very day.  The airline promised to get them back from Chicago on the next flight and I went home shaking my head.  At least the new clothes would last a long time, and indeed they did.

The adventure wasn't quite over.  I got home to find a dozen messages on my answering machine (that'll date you!) beginning an hour BEFORE the original flight saying they had my bags at Dulles and would I be home to accept delivery.  What??!!  How does THAT not show up in the bag record?  So at any time in the prior 2 weeks they could have put the bags on a flight to the UK, which would have saved me a couple of thousand dollars and endless aggravation.

The metadata was definitely more valuable than the actual package.

I don't think there's value in a bunch of comments flaming United Airlines in particular, but I'd love to have your comments with examples of other cases where the metadata is indeed more important than the payload.

Sunday, October 13, 2013

Has the IT world really reached a global consensus design?

At the Akamai annual meeting, FedEx CIO Rob Carter gave an engaging talk on the role of IT in a global business.  Three aspects of that talk stuck with me, two of which explain themselves and the other being worth some exposition:
  • (Paraphrasing) "The organization's philosophical approach to the management of business is encapsulated in the operational processes they use to carry out their business."  This topic deserves its own discussion space.
  • (Again paraphrasing) - Founder Fred Smith's observation that "the information about the package is more important than the package itself". Anybody who has ever had an airline lose a bag gets this: it doesn't really matter whether the bag is safe or not if the airline cannot find it.  I've lived that nightmare, and the hassle was worth far more than the value of the package.
  • Dominant design.
What is Dominant Design?  When all else fails, consult Wikipedia.  Here's the layman's version.  At some point emerging industry leaders with different solutions gradually get winnowed down through a market process based on choice and/or cost.  At that point, suddenly a single solution-type is discovered to have become the norm.  The previous solution-differentiated market leaders dissolve and the new market leaders become those that are using the new norm, with the providers who delivered that norm in the first place usually (but not always, see "Tandy" and IBM) dominating that clique.

Mr. Carter believes that the IT market, at least at the level of being a global info-structure, has reached that point of enduring paradigm for each of the major layers of the architecture:

  • Servers and calculation engines, and their underlying processors, have obeyed Moore's Law since 1975.  The geometric effect seems impossible to continue, yet it does.  Actual circuits continue towards nanoscale. With virtualization we have worked out how to scale without buying more boxes.
  • Network Fabric (WAN and LAN) presents the illusion of an alphabet soup of supported protocols but the reality is that TCP/IP is the pervasive standard.  As for carrying capacity, it has not been long since DSL replaced phone modems; now we expect 1Gb (and growing); the Moore's Law curve seems to apply in this space also.
  • Storage is another area showing exponential yet stable growth, with the original concept of transient memory being steadily eclipsed by persistent memory.  This was an area that seemed to have issues (cost, speed, size) not so long ago, but now pervasive memory technology appears to have crossed the bridge.  This area is also exhibiting Moore's Law behavior.  It's DC so not everybody is allowed to carry USB drives (ahem) but look at the SD cards everybody is buying for their smart phones as Exhibit A! 
  • Software is moving away from last decade's focus on application rationalization (portfolio management) to a general acceptance of application stores and SaaS.  Buy what you want or just rent it for a while; the "build" option is finally being seen as the approach of last resort.

His premises seem unarguable in the short term.  For the middle term, I have to wonder whether it is possible that the changes still to come can be just as creative as those we have witnessed over the past 25 years.  If they are, then these technical approaches too will one day seem as ludicrous as the 8-track cassette.

Your thoughts?  Are we there yet?

Thursday, October 3, 2013

Handling the irrevocable decision

This particular week, with no Washington Redskins game and no government (and yet the world continues to turn ...), we who reside in the Chesapeake area have the free time to ponder the mysteries of decision-making and governance.  That is what you were going to do, right?  Well, if you spent any time thinking "what in the heck were those guys thinking?", that's all about governance.

Executive-level choices are not easy, or they would be made by junior clerks.  Worse, they will not come to fruition for some time, leaving lots of opportunities for second-guessing.  Government and industry alike use processes built around the idea of gradual approaches to the objective, permitting periodic re-assessment. Nonetheless, it is a brave analyst who will insist on derailing a program at a milestone review, a dynamic that significantly weakens the value of those reviews; and, as in the current examples, it is not uncommon that decisions are made that essentially preclude re-thinking.

For those from other regions, the Redskins'  superstar rookie quarterback Robert Griffin (RG-III) was massively injured in his first season.  This spring, long before anyone could know whether a full recovery was possible, the organization announced that RG-III would be the starter in the fall no matter what. That choice also meant that the team would have to go through the pre-season (quite successfully) with a different leader; then adjust to the return of an unpracticed RG-III.  That adjustment has (so far) cost 3 of their first 4 regular games against very weak opponents.  Now for the part we don't know: did the early decision permit an orderly march to a future that is not yet apparent but just needs a couple more weeks to prove itself?  Or, with the season at risk before it is a quarter over, is it time to rethink the approach, and if so - how? If they change now, will the team end up going back to square one while re-synching?  Some time between late November and February, we will find out how the actual strategy worked out; but if it fails, we will never really know whether alternative choices would have worked any better.

*** Update: It didn't work out.  Throughout the 2013 season, RG-III was a shadow of his pre-injury self even though the rest of the team was uncharacteristically healthy.  Later in the season he was benched to see if the backup could get the team going, but with morale in the toilet it didn't make much difference either way.  As the last game ended, the coaching staff was fired.  The team owner resisted a strong push to bring in RG-III's former college coach. ***

Separating ourselves from the political content, the government shutdown offers similar decision-making puzzles.  With both parties having equally but oppositely apocalyptic views of what might happen if the other party gets its way, both of the party leaders adopted a public policy of "no compromise".  That admits of only one outcome: eventually someone must back down and thereby "lose".  That loser will surely be finished as party leader, and may well have led their entire party to effective annihilation; ironically, "winning" will gain almost no change in the status quo at all and does not relieve the winners' practical problems.  As with the Redskins, leaders elected to lock in, well in advance of any practical deadline, to a decision that effectively nullifies other options, and after a certain point any change of plan becomes pretty much impractical.

*** Update: the government shutdown lasted 16 days in 2013 (the government's fiscal 2014). House leader John Boehner backed down completely, having gained absolutely nothing from the battle and losing some of what was already agreed to.  He survived that fight but now faces for the first time in many years a combative challenge for his House seat, which he is having to spend millions to defend.  Just as with the 1995-96 shutdowns, the country got along just fine with only the essential personnel and saved a lot of money which is now being credited as "deficit reduction".  It saved a lot more when the looming specter of global warming resulted in an unusual number of snow-related shut-downs through the winter of 2013-2014.  You can't make this stuff up. ***

When you step back from the content issues, these two situations bring up similar questions with regard to governance:

  • How far in advance do we really have to lock into a decision?  At what point does lack of a committed decision create more confusion than having to change a decision?  How much detailed effort should we put into a concept that hasn't really been endorsed?
  • How deep into the future should we look when analyzing a particular short-term decision?
  • How do we identify that a decision needs rethinking without being condemned as a trouble-maker? 
  • How should we approach apparently irreversible decisions, as opposed to the usual ones that can be approached incrementally?
  • How can we be confident that the current fiasco really is less of a problem than what would happen if we moved to a different solution?
  • Intervening too early at the first sign of difficulty will eventually lead to stifling all innovations.  How do we tell whether early setbacks indicate growing pains of a bright new future, or the slide into worse results to come?  
  • If we are planning to shift gears, how do we assess the disruption effect?  Our economic analysis methods address Plan A and Plan B, and the partial costs involved in a shift.  We do not really  have a strong methodology for assessing the practical implications of making the change either within those programs or on other related activities, but for the most part decision-makers are making implicit subjective assessments of that disruption when they cling to something that is obviously not going too well rather than consider alternative approaches.