It’s way too early to know what really happened with the botched launch of Healthcare.gov. We don’t know how it will all play out in years to come, or what its impact will be on the evolution of the Affordable Care Act (ACA), on election results over the next few years, or on President Obama’s legacy. Depending on how it all turns out, this will be either just a chapter in future books on the history of the ACA and the Obama administration, or the subject of major books and investigative reports.
Almost everyone who’s been involved with the development of complex IT systems knows how wrong things can sometimes go. So, when serious problems do happen, we are eager to learn the lessons that might help us avoid similar problems in the future. It’s quite possible that Healthcare.gov and the ACA’s overall IT system are such complex outliers (technically, organizationally and politically) that any lessons learned might apply to few other projects. But, given the increasing complexity of private and public sector IT systems, the lessons are worth thinking about.
I like the way Clay Shirky, an NYU faculty member as well as an author and consultant, framed the problem in a very interesting blog post, Healthcare.gov and the Gulf Between Planning and Reality. He writes about the gulf between those charged with planning the overall rollout of the ACA and health care exchanges and the realities of trying to get such a complex system designed, built and launched in a short amount of time. It’s essentially a tale of “failure is not an option” versus the messy world of highly complex IT systems. While the post is focused on the launch of Healthcare.gov, it can also be read as a more general discussion of the kinds of problems often encountered with highly complex IT-based projects when a management decision to win a deal at all costs comes back to haunt the implementation of the project.
The paragraph responsible for Shirky’s sinking feeling was part of an October 12 NY Times article, From the Start, Signs of Trouble at Health Portal. According to the article, the warnings came from CMS deputy CIO Henry Chao, the chief digital architect for the new online insurance marketplace. In response, his superior told him:
“. . . in effect, that failure was not an option, according to people who have spoken with him. Nor was rolling out the system in stages or on a smaller scale, as companies like Google typically do so that problems can more easily and quietly be fixed. Former government officials say the White House, which was calling the shots, feared that any backtracking would further embolden Republican critics who were trying to repeal the health care law.”
“The idea that failure is not an option is a fantasy version of how non-engineers should motivate engineers,” adds Shirky. “Failure is always an option. Engineers work as hard as they do because they understand the risk of failure.” In his opinion, neither technology, talent, budgets nor the government’s bureaucratic processes are the main culprits here. Rather, this is a management and cultural problem. As a result of the huge political pressures they were under, top administration officials did not feel that they could seriously address the possibility that things might go wrong.
Other articles paint a similar picture, such as this recent one in the WSJ’s CIO Journal:
“It was on a cold, sunny day in Baltimore last January that Curt Kwak, chief information officer of the Washington Health Benefit Exchange, first realized that the signature feature of President Obama’s Affordable Care Act could be in trouble. That day, at a status review meeting of CIOs of state health exchanges, he learned that many of his peers were far behind where they should have been. According to Mr. Kwak, several of his peers hadn’t yet selected a systems integrator - tech vendors who play crucial roles in fitting together the multiple components of health insurance exchanges that allow consumers to select and enroll in health plans.”
Why did the administration, as well as several states, wait so long to start planning the ACA system, including the health care exchanges? Ezekiel Emanuel, an oncologist, vice provost and professor at the University of Pennsylvania, and former White House advisor on health policy, said in a good article on the subject that the administration did not want to release detailed regulations and specifications on the exchange in the middle of the 2012 election campaign, in order to avoid political controversies. “This may have been a smart political move in the short term, but it left the administration scrambling to get the IT infrastructure together in time, robbing it of an opportunity to adequately consult with independent experts, test the site and fix any problems before it opened to the public.”
But then came the reality, which Shirky describes as the painful tradeoff between features, quality and time.
“When a project cannot meet all three goals - a situation Healthcare.gov was clearly in by March - something will give. If you want certain features at a certain level of quality, you’d better be able to move the deadline. If you want overall quality by a certain deadline, you’d better be able to simplify, delay, or drop features. And if you have a fixed feature list and deadline, quality will suffer. . . You can slip deadlines, reduce features, or, as a last resort, just launch and see what breaks. . . That just happened to this administration’s signature policy goal.”
The inability of a troubled project to meet all three goals simultaneously almost feels like the complex systems equivalent of the Heisenberg uncertainty principle: it’s impossible to simultaneously determine the exact position and velocity of an atomic particle with any great degree of accuracy, no matter how good your measurement tools are. The tradeoff is clearly not a scientific principle but a set of guidelines based on decades of experience; even so, there seem to be intrinsic limits to our ability to fix troubled IT projects, no matter how hard we try.
In The Mythical Man-Month, noted computer scientist and software engineer Fred Brooks introduced one of the most important concepts in complex IT systems: adding manpower to a late software project makes it later. Brooks’ Law, as his concept became known, remains as true today as when it was first formulated almost 40 years ago.
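One piece of the intuition behind Brooks’ Law is that coordination overhead grows much faster than headcount: if every member of a team has to coordinate with every other member, the number of pairwise communication paths is n(n-1)/2. The short Python sketch below only illustrates that arithmetic; the function name is made up for this example, and it deliberately ignores the other cost Brooks describes, the time needed to bring new people up to speed.

```python
def communication_paths(team_size: int) -> int:
    """Number of pairwise communication paths in a team of `team_size` people.

    Hypothetical helper for illustration: assumes every person must
    coordinate with every other person, giving n * (n - 1) / 2 paths.
    """
    if team_size < 0:
        raise ValueError("team_size must be non-negative")
    return team_size * (team_size - 1) // 2


if __name__ == "__main__":
    # Compare how paths grow as a team is doubled.
    for size in (5, 10, 20):
        print(size, "people ->", communication_paths(size), "paths")
```

Under this simplified model, doubling a team from 5 to 10 people more than quadruples the number of paths, from 10 to 45.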
Over the years, we have learned that there are limits to our ability to plan complex IT projects in advance. You need a good design, architecture and overall project plan, but you also need the flexibility to learn as you go and make trade-offs as appropriate. Most such projects are therefore released in stages, with alpha and beta phases that start testing the system with a select and relatively small number of users. Such early testing uncovers not only software bugs but also design flaws that users have trouble with.
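As a rough illustration of what releasing in stages can look like in practice, here is a minimal Python sketch of a percentage-based rollout gate. It is not a description of how Healthcare.gov or any particular company does it; the function names and the stage percentages are assumptions made up for this example. Each user is hashed into a stable bucket from 0 to 99, and a feature is shown only to users whose bucket falls below the current rollout percentage.

```python
import hashlib

# Hypothetical stage plan: share of users (in percent) admitted at each stage.
STAGES = [1, 5, 25, 100]


def rollout_bucket(user_id: str) -> int:
    """Map a user id to a stable bucket in the range 0..99."""
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % 100


def is_enabled(user_id: str, percent: int) -> bool:
    """Return True if this user falls inside the current rollout percentage."""
    return rollout_bucket(user_id) < percent


if __name__ == "__main__":
    users = [f"user-{i}" for i in range(1000)]
    for percent in STAGES:
        admitted = sum(is_enabled(u, percent) for u in users)
        print(f"{percent:>3}% stage: {admitted} of {len(users)} users admitted")
```

Because the bucket depends only on the user id, anyone admitted at an early stage stays admitted as the percentage grows, so problems found with the first small group can be fixed before the next, larger group arrives.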
Another important lesson is that all parties involved in a complex, high-risk project must have a good working relationship. All available information on the status of the project should be shared, so there are few last-minute surprises. Tradeoff decisions and project adjustments should involve all key members of the team. Behind most seriously troubled projects lies not only a gulf between planning and reality, but a lack of the close collaboration and overall good will necessary to make the project succeed.
It’s hard to imagine a more politically contentious project than the ACA. The administration was worried that any glitches uncovered while testing the system as part of the usual staged release cycle would give further ammunition to those trying to kill the ACA altogether. They may have felt that slipping deadlines and reducing features prior to the October 1 launch were not politically feasible, and that they therefore had no choice but to launch anyway and hope for the best. Did they make the right decisions? We’ll find out in the fullness of time.
Perhaps the lesson is that sweeping public policy change that requires systematic integration of disparate IT systems is an exercise in hubris.
The reality may be to think big but craft the implementation locally. If a policy vision cannot be broken into small execution elements under local control, it is not a clear vision, but wishful thinking.
Google, Amazon, et al. do not have 30-year-old infrastructure with legacy code for which the source does not exist. This is not uncommon in state government and even in the public sector. Integration at this scale must include a clear view of the complexity of the task, and adapt its implementation accordingly. I do not think this clarity was evident when the legislation was passed, let alone when the rules were promulgated by HHS.
Posted by: BRCDbreams | December 10, 2013 at 11:13 AM