Sunday, September 18, 2022

Securing the Supply Chain of Nothing

The Cybersecurity and Infrastructure Security Agency (CISA), the National Security Agency (NSA), and the Office of the Director of National Intelligence (ODNI) recently released a document entitled, “Securing the Software Supply Chain – Recommended Practices Guide for Developers.” I hoped the document might shed light on practical, perhaps even novel, ways for the private sector to increase systems resilience to supply chain attacks. The authors are respected authorities, and the topic is salient to the public.

Instead, the document’s guidance contains a mixture of impractical, confusing, confused, and even dangerous recommendations.

There is a collective ignis fatuus in the information security community that it is the “job” of an organization’s employees to prioritize security above all else. This fallacy holds us back from achieving better defense outcomes. Unfortunately, “Securing the Software Supply Chain” calcifies this falsehood.

Therefore, I have written this rebuttal in the form of ten objections:

  1. Slowing down software delivery does not help security, it hurts it
  2. There is an underlying paradox (the “Thinking Machine” paradox)
  3. Most enterprises have no chance of implementing this
  4. Most enterprises will not want to implement this
  5. Security vendor bolt-on solutions are overemphasized
  6. Relevant security and infrastructure innovation is omitted
  7. Inaccuracies about software delivery practices and basic terminology
  8. Confusing, contradictory messages from the authoring agencies
  9. Omission of second order effects and underlying causal factors
  10. Dangerous absolution of security vendors’ own software security

Objection #1: Slowing down software delivery does not help security, it hurts it

Empirical evidence from the enterprise software engineering world makes it clear that speed benefits not only business outcomes (like release velocity) but also security outcomes (like shorter time to recovery and faster fixes for security issues). “Securing the Software Supply Chain” instead effectively recommends against speed, arguing for slowness and additional hurdles and frictions throughout the software delivery process.

The guidance is for a centralized security team to impose significant restrictions on software engineering activity, thereby enforcing security as the top priority. For instance, IDEs (and plug-ins) are treated as threats to be “preapproved, validated, and scanned for vulnerabilities before being incorporated onto any developer machine.”

Speed, and even reliability, are treated as justified casualties for security’s sake. Given that the mission of intelligence agencies is chiefly national security, they rightly view obstruction in the name of security as worthwhile. Their goal is not to grow market share, expand their customer base, or improve profit margins by shipping more software, faster. This means their perspective simply does not translate to the private sector, unless we, as a society, decide a corporation must serve national security rather than its shareholders.

The result of this guidance – if the recommendations were implemented by, say, the Fortune 500 – would be that software struggles to get built. Software engineers would quit (and have quit over lesser inconveniences in the private sector). Whatever its budget, an enterprise following these recommendations will build software slowly, and the software will be of poorer quality – a direct detriment to the goal of higher quality, more secure software.

Objection #2: There is an underlying paradox (the “Thinking Machine Paradox”)

There is a logical inconsistency – perhaps even a paradox – presented: developers cannot be trusted, but automation is discouraged in favor of manual processes run by humans. If there is paranoia about “unintended flaws” introduced by developers, then why give them more opportunity to introduce them by recommending manual processes?

If there is concern that their user credentials will beget malefaction and mistakes, then why discourage service accounts, which are inherently decoupled from specific user identities? Without service accounts, if the human user leaves or is fired, automated activity remains tied to their personal credentials – which is a dangerous game to play.

The guide goes to great lengths to paint the picture of the developer as an insider threat – whether purposefully malicious or “poorly trained” – but then explicitly espouses manual build and release processes. So individual software engineers are simultaneously not to be trusted, yet they should be the ones to perform and approve software delivery activities?

Human brains are not great at executing tasks the same way every time. Computers are much better at that! And yet, while the guide warns about the dangers of human mistakes, it wants us to rely on those humans for repeatable tasks rather than automating them.
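To make this concrete, here is a minimal sketch (in Go) of the kind of repeatable verification step – checking a build artifact against a pinned digest – that a machine will perform identically every time and a human will eventually fumble. The artifact path and expected digest are hypothetical placeholders, not anything taken from the guide.

```go
// verify_artifact.go: a repeatable verification step better left to a machine.
// The artifact path and expected digest are hypothetical placeholders.
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"io"
	"log"
	"os"
)

func main() {
	const artifact = "dist/app.tar.gz" // hypothetical build output
	// hypothetical pinned digest (this one is the SHA-256 of an empty file)
	const expected = "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"

	f, err := os.Open(artifact)
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()

	h := sha256.New()
	if _, err := io.Copy(h, f); err != nil {
		log.Fatal(err)
	}
	got := hex.EncodeToString(h.Sum(nil))

	if got != expected {
		log.Fatalf("checksum mismatch: got %s, want %s", got, expected)
	}
	fmt.Println("artifact digest verified")
}
```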

Objection #3: Most enterprises have no chance of implementing this

Most enterprises have no chance of implementing the recommendations in “Securing the Software Supply Chain.” It is allegedly meant as a reference guide for developers, but it is really a reference guide for no one other than an intelligence agency with the same goals and resources as the NSA.

This is a criticism often made about Google: they propose advice that works for them and their titanic budget and pool of talent without considering the constraints and tradeoffs faced by “mere mortals.” CISA, the NSA, and the ODNI have fallen into a similar trap.

There are numerous recommendations that are impractical for enterprises, and not just the absurd one of disallowing internet access in “development” and “engineering” systems. For instance, if enterprises documented everything that a piece of software performs, it would be equivalent to writing it twice (and the documentation would inevitably differ from the source code); enterprises would likely be better off with no documentation at all and just reading the source code.

As another example, they also recommend that “Fuzzing should be performed on all software components during development.” If fuzzing all software components during development were a strict requirement, enterprises might never ship software again. They also recommend “Using a testing approach…to ensure that repaired vulnerabilities are truly fixed against all possible compromises.” If enterprise software engineering teams knew the graph of possible compromises, why would we need all of this guidance?
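For a sense of scale, here is a minimal sketch of a single native Go fuzz target (Go 1.18+ built-in fuzzing), using the standard library’s url.Parse as a stand-in component. “Fuzzing all software components during development” means writing, seeding, running, and triaging a harness like this for every parser, decoder, and input boundary in the codebase.

```go
// url_fuzz_test.go: one native Go fuzz target (Go 1.18+), run with
//   go test -fuzz=FuzzURLParse -fuzztime=30s
// url.Parse stands in for any component that accepts untrusted input.
package fuzzdemo

import (
	"net/url"
	"testing"
)

func FuzzURLParse(f *testing.F) {
	// Seed corpus: a few representative inputs to guide the fuzzer.
	f.Add("https://example.com/path?q=1")
	f.Add("")

	f.Fuzz(func(t *testing.T, s string) {
		// Minimal property: parsing arbitrary input must not panic.
		// Meaningful invariants are extra, per-component work.
		_, _ = url.Parse(s)
	})
}
```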

Objection #4: Most enterprises will not want to implement this

Most enterprises will also not want to implement this. The recommendations do not scale, are not aligned to enterprise software delivery priorities, and erode productivity.

Intelligence agencies, whose mission is national security, have no choice but to implement a paradigm like the one described because the alternative is simply not developing software. Enterprises in the private sector, whose mission is making money, do not face the same constraint; the constraint they face is, instead, the resources at their disposal. To support their mission, it is generally better to spend on revenue- and profit-generating activities than on those that obstruct or erode them (like the recommendations in this document).

The top priority among enterprise customers is usually not security, either, regardless of whether the enterprise is B2B or B2C. Security will only be the top concern of customers if the primary customer is an intelligence agency, which is true of very few enterprises, especially within the Fortune 500/1000.

Through this lens, advice like “If possible, the engineering network should have no direct access to the internet” and “If possible, development systems should not have access to the Internet” suggests a mental model of for-profit enterprises that significantly differs from reality. Similarly, decrying “ease of development” features is not a reasonable position in the reality of enterprise software development; more constructive would be to suggest that such features must be considered as part of the system design and protected behind appropriate access controls and audit logging.

Some recommendations would even be considered “bad practice” by software engineers from a reliability and resilience perspective, and therefore rejected. The guide suggests using “a temporary SSH key” to allow admin access into production systems, whereas Ops engineers and SREs often prefer immutable infrastructure specifically to disallow the use of SSH, which helps with reliability (and cuts off a tempting mechanism for attackers).

Objection #5: Security vendor bolt-on solutions are overemphasized

There is a pervasive overemphasis on vendor tooling with near-complete omission of non-vendor solutions. Specifically, the document touts a laundry list of bolt-on commercial solutions by incumbent security vendors – IAST, DAST, RASP, SCA, EDR, IDS, AV, SIEM, DLP, “machine learning-based protection” and more – often repeatedly singing their praises. Rather than providing constructive, sane advice on automating processes and making them repeatable and maintainable, they recommend a smorgasbord of bolt-on tools.

The guidance is explicit in discouraging open-source software as well. There is also little about security through design, such as the D.I.E. triad. Unfortunately, this gives the impression that security vendors successfully lobbied for their inclusion in the document, which calls into question its neutrality.

In fact, by promoting these commercial security tools, they promote dangerous advice like manual release processes. For instance, they recommend: “Before shipping the package to customers, the developer should perform binary composition analysis to verify the contents of the package.” But developers should not be performing package releases themselves if the desired outcome is high quality, secure software (see Objection #2).

As another example, they recommend that “SAST and DAST should be performed on all code prior to check-in and for each release…” But how is an enterprise supposed to perform DAST/SAST on code before it is checked in? It is “shift left” taken ad absurdum. It is only one step further leftward from running the tools inside developers’ brains as they brainstorm what code to write.

But there is no mention of the need to integrate DAST/SAST into developer workflows and ensure that speed is still upheld. In enterprise software security, the success of a bolt-on solution depends on either usability or coercion.

If you make the secure way the easy way and ensure that software engineers are not required to deviate unduly from their workflows to interact with the security solution, then it is quite likely to beget better security outcomes (or at least not worse productivity outcomes). The alternative is to mandate that a solution must be used; if it is unusable, then it will be bypassed to ensure work still gets done, unless there is sufficient coercion to enforce its use.

When the bolt-on recommendations are combined with their advice elsewhere to disconnect development systems from the internet, it raises the question: How do you use the SCA tools, among others, if the dev systems are not connected to the internet?

“Basic hygiene” is arguably better than any of these bolt-on options, including things like:

  • Knowing what dependencies are present
  • Being purposeful about what goes into your software
  • Choosing a tech stack you can understand and maintain
  • Choosing tools that are appropriate for the software you are building

What does “being purposeful about what goes into your software” mean? It means:

  • Including dependencies as part of design, rather than implementation
  • Being cautious about adding dependencies
  • Knowing why you’ve included other libraries and services
  • Understanding the packaging concerns of your dependencies; for example, if you include a dependency, what does it cost to feed the beast in terms of operationalizing and shipping it?
  • If you take on a dependency in another team’s service, what are their SLOs? Do you trust them? Is it a stable system? Or have there been problems?
  • If it’s an open-source library, is it maintained by one person? A team? A company with a support contract you can purchase? A company with a support contract you already have in place? Can you see its updating and patching history?

In essence, the answer is not: “never take on dependencies”; the answer is to understand what your dependencies are.
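As one hedged illustration of “knowing what dependencies are present,” the sketch below walks a Go module’s go.mod (using golang.org/x/mod/modfile) and reports direct versus indirect requirements. A real inventory would also cover the transitive module graph (e.g. `go list -m all`), vendored code, and whatever other ecosystems the enterprise ships.

```go
// depreport.go: a first pass at "knowing what dependencies are present"
// for a Go module; a fuller inventory would also cover the transitive
// module graph, vendored code, and non-Go dependencies.
package main

import (
	"fmt"
	"log"
	"os"

	"golang.org/x/mod/modfile"
)

func main() {
	data, err := os.ReadFile("go.mod")
	if err != nil {
		log.Fatal(err)
	}
	mod, err := modfile.Parse("go.mod", data, nil)
	if err != nil {
		log.Fatal(err)
	}

	var direct, indirect int
	for _, r := range mod.Require {
		kind := "direct"
		if r.Indirect {
			kind, indirect = "indirect", indirect+1
		} else {
			direct++
		}
		fmt.Printf("%-9s %s %s\n", kind, r.Mod.Path, r.Mod.Version)
	}
	fmt.Printf("\n%d direct, %d indirect requirements\n", direct, indirect)
}
```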

Overall, the recommended mitigations are all about outputs rather than outcomes, about security theater rather than actually securing anything and verifying it with evidence. It is clear the guidance does not consider organizations that ship more than once per quarter; in fact, the authors seem to view fast software delivery as undesirable. They are mistaken (as per Objection #1).

Objection #6: Relevant security and infrastructure innovation is omitted

As mentioned in Objection #5, the document ignores a wealth of innovation in the private sector on software security over the past decade. The guidance seems to take the stance that the NSA/CISA/ODNI way is better than the private sector’s status quo, which is a false dichotomy. In fact, companies like SolarWinds have admitted their slowness was a detriment and have since modernized their practices — including the use of open source — to achieve improved security.

The fact that none of those innovations were included suggests an insular examination of the problem at hand, which erodes the intellectual neutrality of the document. There are many kinds of security and infrastructure innovations from the past decade I would expect to see in a reference guide like this (and I have no doubt there are many others worthy of inclusion, too).

The guidance would also be strengthened by considering survey data from the private sector, such as the recent Golang survey (which has a section dedicated to Security, including fuzzing) and GitHub’s Octoverse Security report from 2020 (their Dependabot tool is also arguably a glaring omission).

As another example, the document cautions against “allowing an adversary to use backdoors within the remote environment to access and modify source code within an otherwise protected organization infrastructure.” This dismisses the last 30+ years of source code management (SCM) systems.

You cannot just up and change source code without people noticing in modern software delivery. Even Subversion is built on the idea of tracking deltas; if you change some code, there exists a record of what was changed, when, and by whom. Most development workflows configure the SCM system to require peer approval before merging changes to important branches. It is worrisome if the authoring agencies are unaware of this given it has been the status quo for decades; if their vendors do not exhibit these practices, then this suggests a serious problem with federal procurement.
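As a small, hedged illustration of that record-keeping, the snippet below (the file path is hypothetical) shells out to git log to print who changed a file and when – exactly the attribution trail the guide’s scenario imagines an adversary modifying code without.

```go
// changelog.go: surface the attribution record an SCM keeps for a file --
// commit, author, date, and subject for every change that touched it.
// The file path is a hypothetical placeholder.
package main

import (
	"fmt"
	"log"
	"os/exec"
)

func main() {
	const path = "internal/release/build.go" // hypothetical tracked file

	// %h = short commit, %an = author, %ad = date, %s = commit subject
	out, err := exec.Command("git", "log", "--date=short",
		"--pretty=format:%h  %an  %ad  %s", "--", path).Output()
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(string(out))
}
```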

Objection #7: Inaccuracies about software delivery practices and basic terminology

There are consistent misunderstandings and inaccuracies throughout the document about modern software delivery practices, including about basic terminology such as CI/CD, orchestration, nightly builds, code repositories, and more. This is part of a larger cultural problem in information security: trying to regulate what one does not understand.

The reference guide does not seem to understand who does what in enterprise engineering teams. Product and engineering management do not define security practices and procedures today, because their priorities are not security but instead whatever success metrics correspond to the business logic under their purview (usually related to revenue and customer adoption). The characterization of QA is particularly perplexing and suggests a significantly different QA discipline exists in intelligence agencies than does in the private sector.

If enterprises were to follow the advice that “software development group managers should ensure that the development process prevents the intentional and unintentional injection of… design flaws into production code,” then software might never be released again in the private sector. The guide also seems to believe that software engineers are unfamiliar with the concept of feature creep, as if that is not the unfortunate default in product engineering today.

There are also simply perplexing statements. For instance, “An exception to [adjusting the system themselves] would be when an administrator has to fix the mean time to failure in production.” (p. 30, emphasis mine). I do not know what they mean by this and struggle even to guess. This and other confusing passages tarnish the intellectual credibility of the guide.

The guidance is inaccurate even in areas that should be the authors’ expertise, like cryptography. “The cryptographic signature validates that the software has not been tampered with and was authored by the software supplier.” No, it doesn’t. It validates that the software was signed with a key the supplier obtained at some point; it says little about the security properties of the software in question. For instance, there is an anti-cheat driver for the game Genshin Impact whose signing certificate still hasn’t been revoked despite the driver being vulnerable to a privilege escalation bug.
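To spell out what is and is not being claimed, here is a minimal sketch using Go’s crypto/ed25519: verification succeeds for any bytes the key holder signed, vulnerable driver or not. Possession of a signing key, not the quality of the signed code, is all that gets proven.

```go
// sigcheck.go: what a signature check actually proves.
// ed25519.Verify answers exactly one question: were these bytes signed by
// the holder of this private key? It says nothing about whether the signed
// code is free of exploitable bugs.
package main

import (
	"crypto/ed25519"
	"crypto/rand"
	"fmt"
	"log"
)

func main() {
	pub, priv, err := ed25519.GenerateKey(rand.Reader)
	if err != nil {
		log.Fatal(err)
	}

	// A stand-in "release artifact" that is signed but trivially flawed;
	// the signature still verifies.
	artifact := []byte("driver with an exploitable privilege-escalation bug")
	sig := ed25519.Sign(priv, artifact)

	fmt.Println("signature valid:", ed25519.Verify(pub, artifact, sig)) // true
}
```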

Finally, in what is more of an inaccuracy about enterprise security than software delivery, the authors refer to “zero-day vulnerabilities” as an “easy compromise vector.” This may be true for the Equation Group and their black budget funding but is not true from the perspective of most cybercriminals and, therefore, enterprises.

Enterprises are still wrestling with leaked credentials, social engineering, and misconfigurations (see Objection #8). For a Fortune 500, it is a victory if attackers are forced to use 0day to compromise you; you’ve exhausted all their other options, which should be considered a rightful accomplishment relative to the security status quo.

Objection #8: Confusing, contradictory messages from the authoring agencies

It is confusing to see that the same agency (CISA) that emphasized the need for repeatability and scalability last year only mentions the importance of repeatability once, and that another authoring agency (NSA) stated in its report from January 2020 that supply chain vulnerabilities require a high level of sophistication to exploit while having low prevalence in the wild.

However, this guidance makes it seem like software engineering activities should be designed with supply chain vulnerabilities as the highest weighted factor. Misconfigurations are given scarce mention, despite the NSA citing them as most prevalent and most likely to be exploited.

In the aforementioned guide by CISA, they highlight some of the benefits of cloud computing and automation. But in this reference guide, they indicate that the cloud is more dangerous than on-premises systems, without explaining why it might be so.

Much of the language is confusing, too, and never receives clarification. What is a “high risk” defect? What does “cybersecurity hygiene” mean in the context of development environments? They insist that “security defects should be fixed,” but what defines a defect? That it exists? That it’s exploitable? That it’s likely a target for criminals? Nation states? It remains unclear.

Objection #9: Omission of second order effects and underlying causal factors

There is no consideration of second order effects or underlying causal factors, such as organizational incentives and production pressures. (There is one mention, quite in passing, that there might be various constraints the developer faces). This ignores the rich literature around resilience in complex systems as well as behavioral science more generally. In fact, the recommendations are arguably the opposite of adaptive capacity and reflect rigidity, which is seen as a trap in resilience science.

If enterprises are to attempt implementing these recommendations (which I very much discourage them from doing), then guidance on how to achieve them despite vertiginous social constraints is essential. Much of what is outlined will be irrelevant if incentives are not changed.

There is also no discussion of user experience (UX) considerations when implementing these suggestions, which, perhaps more than what is implemented, will influence security outcomes; an unusable workflow will be bypassed. Because the guidance ignores the complexities of software delivery activities and accepts convenient explanations for developer behavior, the resulting advice is often unrealistic.

There is a missed opportunity to discuss making the secure way the easy way, the importance of minimizing friction in developer workflows, the need to integrate with tooling like IDEs, and so forth. This absence results in a guide that feels both shallow and hollow.

Objection #10: Dangerous absolution of security vendors’ own software security

There is ample discussion of the need to scan software being built or leveraged from third parties, and how a long list of commercial security tools can support this endeavor (see objection #5). Curiously, there is no mention of the need for those tools to be scanned, such as performing SCA for your EDR or anti-virus bolt-on solution.

Security tools usually require privileged access to your systems, whether operating as a kernel module or requiring read/write (R/W) access to critical systems. Combined with the long history of critical vulnerabilities in security tools themselves, this is a rather troubling omission. This is all aside from the numerous software reliability problems engendered by endpoint security tools, which also fail to receive mention. Engineering teams notoriously despise endpoint security on servers for valid reasons: kernel panics, CPU performance tanking, and other weird failures that are exasperating to debug.

Given the paranoia about IDEs and other developer tools, it feels strange that security vendors receive absolution and nary a caveat regarding their code security. Where is the recommendation for a proctology exam on the security tools they recommend? Do all of their recommendations for countless hurdles in the way of code deployment apply to patches, too? They are software changes, after all. If they do not apply, then that is a massive loophole for attackers to exploit.

Another surprise is that “anti-virus” is listed as being capable of detecting things like DLL injection; recent empirical data suggests otherwise even for EDR, which is considered leagues ahead of anti-virus solutions. Again, it gives the impression that the guidance is biased in favor of security vendors rather than offering a more complete set of strategic options.

Conclusion

If you read this guidance and implement it in your enterprise organization, you will end up securing the supply chain of nothing. Your engineering organization will dismiss you as an ideologue, a fool, or both; otherwise, you will have constrained software engineering activities to such a degree that it makes more sense to abandon delivering software altogether. In fairness, the most secure software is the software never written. But, in a world whose progress is now closely coupled with software delivery, we can do better than this document suggests.


Thanks to Coda Hale, James Turnbull, Dr. Nicole Forsgren, Ryan Petrich, and Zac Duncan for their feedback when drafting this post.


