The Phoenix Project

This book could be summed up as “The story of a project that catches fire”.

For roughly the first half of the book, it is a descent into the hell of IT project management: Bill Palmer finds himself propelled from head of operations to head of the IT department; his first day in the role confronts him with a payroll error, blamed on a SAN upgrade (but, spoiler alert, actually caused by an IT security process that obfuscated each employee's national identification number, because storing it is not legal).

Next come the repeated failures of the company's biggest IT project, which has been launched but is already 3 years late and $20 million over budget.

The nice touch is that each issue is analyzed and a solution is proposed; in the end, we arrive at the DevOps movement, whose ideal is the shortest possible time to market. To get there, we have to give up big releases that bundle many awaited features but only ship every 9 to 12 months.

Every small saving made along the way in a project turns into technical debt.

“The Phoenix Project” is a novel about IT: Bill Palmer is propelled to the head of the IT department following a huge payroll problem. We then follow him through every problem he runs into, each one followed by proposals for improvement.

It is sometimes a bit too sentimental, but nearly all of its ideas are worth following.

I mostly copied out passages that interested me. The quotations below are taken as-is from the book, with the page number.

Organization

IT is pure knowledge work, and so therefore, all your work is like that of an artisan. Therefore, there’s no place for standardization, documented work procedures, and all that high falutin’ rigor and discipline that you claimed to hold so near and dear.

Your job as VP of IT is to ensure the fast, predictable and uninterrupted flow of planned work that delivers value to the business, while minimizing the impact and disruption of unplanned work, so you can provide stable, predictable and secure IT service. (p. 91)

However carefully everything is planned, there are constant unplanned interruptions and emergencies that have to be handled quickly, and they push back all the other tasks accordingly. The manager's job is to make sure that the work that improves the business gets done and is affected as little as possible by the emergencies arriving in parallel.

The only thing more dangerous than a developer is a developer conspiring with security. The two working together gives us means, motive and opportunity. (p. 39)

By reducing the number of projects in flight, we are keeping lanes of work. (p. 275)

Technical debt

A few short passages about technical debt, unplanned work, and the shortcuts some people in the company take to bypass the official steps.

Unplanned work might be called anti-work, since it further highlights its destructive and avoidable nature. “That’s why it’s so important to know where unplanned work comes from”. (p. 161)

Unplanned work kills your ability to do planned work.

Technical debt comes from taking shortcuts, which may make sense in the short term. But like financial debt, the compounding interest costs grow over time. If an organization doesn’t pay down its technical debt, every calorie in the organization can be spent just paying interest, in the form of unplanned work. (p. 195)
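The compounding effect described in this passage can be made concrete with a toy model. The sketch below is only an illustration under stated assumptions: the function name `interest_share`, the 25% rate and the hour figures are all hypothetical and do not come from the book. It assumes that each period the debt generates unplanned work equal to a fixed fraction of the outstanding debt, and that nothing is ever paid down.

```python
def interest_share(capacity_hours: float, debt_hours: float,
                   rate: float, periods: int) -> list[float]:
    """Fraction of team capacity eaten by unplanned work in each period.

    Hypothetical model: every period, the outstanding debt produces
    `rate * debt` hours of unplanned work ("interest"), and since the debt
    is never paid down, that interest is added to the debt.
    """
    shares = []
    for _ in range(periods):
        interest = debt_hours * rate
        # Capacity cannot be consumed beyond 100%.
        shares.append(min(interest / capacity_hours, 1.0))
        debt_hours += interest  # unpaid interest compounds
    return shares


if __name__ == "__main__":
    # Assumed figures: 400 hours of capacity per period, 200 hours of debt.
    for period, share in enumerate(interest_share(400, 200, 0.25, 8), start=1):
        print(f"period {period}: {share:.0%} of capacity spent on unplanned work")
```

With these assumed numbers the share starts at 12.5% and grows every period, which is the "paying interest" dynamic the quotation warns about.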

People think that just because IT doesn’t use motor oil and carry physical packages, that it doesn’t need preventive maintenance. Because the work and the cargo that IT carries is invisible. Preventive oil changes and vehicle maintenance policies are like preventive vendor patches and change management policies.

To paraphrase the passage above: “if you do not take care of maintaining your tools, your tools will remind you of it themselves (at the worst possible moment)”.

Management and moving parts

In order to control the system, we need to reduce the number of moving parts.

Work in progress goes down

Due date performance goes up, as work in progress goes down.
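One way to see why less work in progress helps due dates is Little's Law from queueing theory, which the book does not name in this passage: in a stable system, average lead time equals work in progress divided by throughput. The sketch below is a minimal illustration; the function name and the figures (5 tasks completed per week) are assumptions chosen for the example.

```python
def average_lead_time(wip: int, throughput_per_week: float) -> float:
    """Little's Law: average lead time = work in progress / throughput."""
    if throughput_per_week <= 0:
        raise ValueError("throughput must be positive")
    return wip / throughput_per_week


if __name__ == "__main__":
    # Same team, same throughput; only the number of tasks in flight changes.
    for wip in (40, 20, 10):
        weeks = average_lead_time(wip, throughput_per_week=5)
        print(f"WIP={wip:>2} -> average lead time of {weeks} weeks")
```

At a constant throughput, halving the work in flight halves the average time a task spends in the system, so promised dates become easier to hit.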

Take some of our most frequent service requests, document exactly what the steps are and which resources can execute them, and time how long each operation takes.

We must convince the business that IT is capable of not just screwing up less often, but helping all of the business win.

There are three internal control objectives (p. 252):

to gain assurance for reliability of financial reporting

compliance with laws and regulations

efficiency and effectiveness of operations.

Infrastructure and development

Infrastructure should be treated as code. Environment creation may be integrated into the development process.

“Value stream map” = estimate the time needed for each step, through the deployment pipeline. This should be allowed through a common build procedure.
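As a rough illustration of what such a map can yield, the sketch below adds up per-step times for a hypothetical pipeline. The step names, the hour figures and the `Step` / `summarize` names are assumptions made for the example, not values from the book. It separates the time spent working on an item from the time the item waits in a queue, which is where most of the lead time usually hides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    process_hours: float  # time spent actually working on the item
    wait_hours: float     # time the item sits in a queue before this step


def summarize(steps: list[Step]) -> dict[str, float]:
    """Total lead time and flow efficiency (process time / lead time)."""
    process = sum(s.process_hours for s in steps)
    wait = sum(s.wait_hours for s in steps)
    lead = process + wait
    efficiency = process / lead if lead > 0 else 0.0
    return {"lead_hours": lead, "process_hours": process,
            "flow_efficiency": efficiency}


# Hypothetical deployment pipeline, in hours.
PIPELINE = [
    Step("build", process_hours=1, wait_hours=2),
    Step("test", process_hours=4, wait_hours=20),
    Step("deploy", process_hours=3, wait_hours=70),
]

if __name__ == "__main__":
    print(summarize(PIPELINE))
    slowest = max(PIPELINE, key=lambda s: s.wait_hours)
    print(f"longest queue is in front of: {slowest.name}")
```

In this made-up pipeline only 8 of the 100 hours are actual work, so the first thing to improve is the waiting in front of the deploy step rather than the speed of any individual task.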

“Months before the product launch, the code is already in production. It’s merely a flag toggled and it’s already tested by internal users/testers” (p. 352).
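The idea in this quotation is usually called a feature flag (or feature toggle). The sketch below is a minimal, hypothetical version: the flag name `new_checkout`, the `User` type and the rule that internal users always see the feature are assumptions for illustration, not details given in the book.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    name: str
    internal: bool = False  # assumed marker for employees and testers


def is_enabled(flag: str, user: User, flags: dict[str, bool]) -> bool:
    """Decide whether `user` sees the code path guarded by `flag`."""
    if flag not in flags:
        return False   # unknown flags stay off for everyone
    if user.internal:
        return True    # internal testers exercise the code before launch
    return flags[flag]  # customers only see it once the flag is toggled


def checkout(user: User, flags: dict[str, bool]) -> str:
    # Both code paths are deployed; the flag selects one at run time.
    if is_enabled("new_checkout", user, flags):
        return "new checkout flow"
    return "legacy checkout flow"


if __name__ == "__main__":
    flags = {"new_checkout": False}
    print(checkout(User("tester", internal=True), flags))
    print(checkout(User("customer"), flags))
    flags["new_checkout"] = True  # launch day: no deployment, just a toggle
    print(checkout(User("customer"), flags))
```

The launch then becomes a configuration change rather than a deployment, and the new path has already run in production for internal users.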

ITIL = automation around change, configuration and release management.

The Three Ways explained

(copied and pasted from the author's text)

In the Phoenix Project, we describe the underpinning principles that all the DevOps patterns can be derived from as “The Three Ways”. It is intended to describe the values and philosophies that guide DevOps processes and practices.

The First Way is about the left-to-right flow of work from Development to IT Operations to the customer. In order to maximize flow, we need small batch sizes and intervals of work, never passing defects to downstream work centers, and to constantly optimize for the global goals (as opposed to local goals, such as Dev feature completion rates, Test find/fix ratios, or Ops availability measures).

The necessary practices include continuous build, integration and deployment, creating environments on demand, limiting work in process, and building safe systems and organizations that are safe to change.

The Second Way is about the constant flow of fast feedback from right-to-left at all stages of the value stream, amplifying it to ensure that we can prevent problems from happening again or enable faster detection and recovery. By doing this, we create quality at the source, creating or embedding knowledge where we need it.

The necessary practices include “stopping the production line” when our builds and tests fail in the deployment pipeline; constantly elevating the improvement of daily work over daily work; creating fast automated test suites to ensure that code is always in a potentially deployable state; creating shared goals and shared pain between Development and IT Operations; and creating pervasive production telemetry so that everyone can see whether code and environments are operating as designed and that customer goals are being met.

The Third Way is about creating a culture that fosters two things: continual experimentation, which requires taking risks and learning from success and failure, and understanding that repetition and practice is the prerequisite to mastery.

Experimentation and risk taking are what enable us to relentlessly improve our system of work, which often requires us to do things very differently than how we’ve done it for decades. And when things go wrong, our constant repetition and daily practice is what allows us to have the skills and habits that enable us to retreat back to a place of safety and resume normal operations.

The necessary practices include creating a culture of innovation and risk taking (as opposed to fear and mindless order taking) and high trust (as opposed to low trust, command-and-control), allocating at least twenty percent of Development and IT Operations cycles towards non-functional requirements, and constant reinforcements that improvements are encouraged and celebrated.

Team, good practices and feedback

We need short and quick cycle times to continually integrate feedback from the market place.

(for example, through A/B testing)
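A/B testing needs each user to land in the same variant on every visit. The sketch below shows one common way to do that, by hashing the user and experiment identifiers. It is an illustration rather than something described in the book, and the function name and identifiers are hypothetical.

```python
import hashlib


def assign_variant(user_id: str, experiment: str,
                   variants: tuple[str, ...] = ("A", "B")) -> str:
    """Deterministically map a user to one variant of an experiment.

    Hashing the experiment name together with the user id keeps the
    assignment stable across visits and independent between experiments.
    """
    key = f"{experiment}:{user_id}".encode("utf-8")
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:8], "big") % len(variants)
    return variants[index]


if __name__ == "__main__":
    counts = {"A": 0, "B": 0}
    for n in range(1000):
        counts[assign_variant(f"user-{n}", "checkout-button-color")] += 1
    print(counts)  # roughly an even split between the two variants
```

Because the assignment is a pure function of its inputs, no per-user state has to be stored, and the results of each variant can be compared as soon as market feedback comes in.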

A great team performs best when they practice: practice creates habits, habits create mastery of any process or skill. Repetition, especially for things that require teamwork, creates trust and transparency. (p. 275)

Until code is in production, no value is actually being generated, because it’s merely work in progress stuck in the system.

“Don’t be the idiot that failed because he didn’t ask for help” (p. 336)

The five dysfunctions of a team:

Absence of trust

Fear of conflict

Lack of commitment

Avoidance of accountability

Inattention to results.

Information flow

In any system of work, the theoretical ideal is single-piece flow, which maximizes throughput and minimizes variance.

“The flow of work should ideally go in one direction: forward. When I see work going backward, I think ‘waste’. It might be because of defects, lack of specification, rework, … Regardless, it’s something we should fix.” (p. 285)

To integrate into the development process

Features
Stability
Security
Scalability
Manageability
Operability
Continuity.

CIA:

Confidentiality
Integrity
Availability
