BeyondCorp is dead, long live BeyondCorp

With the US government’s recent memo on Zero Trust Cybersecurity Principles, there’s renewed interest (and investment) from organizations in adopting zero trust architectures. BeyondCorp, Google’s initial implementation which spawned the pursuit of zero trust in general, is still the guiding star for many organizations. It would seem that the authors of the US government’s memo have, just like the rest of the security industry, read the BeyondCorp whitepapers — and heavily based their strategy on BeyondCorp.

In reality, however, no organization has successfully implemented a fully zero trust architecture, and many proponents of zero trust — including the US government — have missed a key component: devices. Let’s ignore the memo’s recommendations on DNSSEC and STARTTLS, and focus just on the zero trust architecture.

What is zero trust architecture?

Traditional network architecture relied on a network perimeter to separate trusted from untrusted users: trusted employees inside a firewall vs. potentially untrusted parties outside of it. In a zero trust architecture, an individual’s location (specifically, which network they are on) is no longer the sole factor that determines whether they are trusted; other context is used to decide whether they can access a given application. There is no longer such a thing as a privileged, physical, corporate network.

(Most discussion about zero trust these days refers to users accessing an enterprise’s internal services, applications, or machines, rather than one service connecting to another service. This is unfortunate, as service-to-service communication is extremely relevant: if you validate and allow an individual to access a low-risk application, but that application can then make calls to a high-risk application, you haven’t effectively protected your high-risk applications. If you’re only protecting the front door, but the killer is already inside the house… well, you’re not going to have a good day.)
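
To make the service-to-service point concrete, here is a minimal sketch (my own illustration, not something from the BeyondCorp papers). It treats applications as nodes in a call graph and computes what a user effectively reaches once they are let into a single application. The application names and the `CALLS` graph are hypothetical.

```python
from collections import deque

# Hypothetical call graph: which application is allowed to call which others.
CALLS = {
    "wiki": ["search"],
    "search": ["payroll-db"],
    "payroll-db": [],
    "lunch-menu": [],
}


def effective_reach(entry_app, calls):
    """Return every application reachable from entry_app via service calls."""
    seen = {entry_app}
    queue = deque([entry_app])
    while queue:
        app = queue.popleft()
        for callee in calls.get(app, []):
            if callee not in seen:
                seen.add(callee)
                queue.append(callee)
    return seen


# A user who was only approved for the low-risk wiki still reaches payroll-db.
print(sorted(effective_reach("wiki", CALLS)))
```

If the access decision only looks at the entry application, the real exposure is the whole reachable set, which is why each hop between services needs its own policy check.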

We can think of access controls for each application in terms of network segmentation. Each application is in its own segment (microsegmentation), instead of broad concentric circles of increasingly trusted applications. It’s not that a VPN is bad per se; it’s that any entry point or gateway that grants access to a broad set of applications, instead of only a specific application, is bad.

BeyondCorp, first introduced in a paper in 2014, is Google’s original, specific implementation from which the broader generalized set of principles (and analyst categories) for zero trust architecture emerged.

In BeyondCorp, the idea is that applications are available directly from the public web, not inside a trusted network, and that every request to access an application is a policy decision, based on the user, device, and application. Each of these can fall along a spectrum:

User trust: Determined using a user identity (SSO token) and user database (identity provider). A user can be challenged to prove that they are trusted, such as requiring them to sign in or requiring a hardware second factor.
Device trust: Determined using a device inventory, device identity (certificate), and device measurements. A device might be untrusted, like a hotel lobby computer in a foreign country, or might be more trusted, like a fully patched Chromebook, using an MDM, in the company’s device registry, with a device certificate protected by a TPM, also used for secure boot to verify the device’s OS, connecting from the same physical network as the company’s head office.
Application policy: Each application can define a policy with a different fine-grained set of requirements for access, including different minimum requirements for user and device trust. If there is insufficient trust when a user tries to access an application, a challenge can be issued to the user to further authenticate.
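
The three inputs above can be sketched as a single per-request decision. This is a simplified illustration under my own assumptions, not Google’s implementation: the `Trust` levels, the `AppPolicy` fields, and the rule that a device shortfall results in a denial (because a user can be challenged on the spot, but a device cannot easily be swapped) are all hypothetical.

```python
from dataclasses import dataclass
from enum import IntEnum


class Trust(IntEnum):
    """Hypothetical trust levels; real systems may use finer-grained tiers."""
    NONE = 0
    LOW = 1
    MEDIUM = 2
    HIGH = 3


@dataclass(frozen=True)
class AppPolicy:
    """Minimum trust an application requires (assumed shape)."""
    min_user_trust: Trust
    min_device_trust: Trust


def decide(user_trust, device_trust, policy):
    """Return 'allow', 'challenge', or 'deny' for one access request."""
    # Assumption: device trust cannot be raised mid-request, so a shortfall
    # here is a denial rather than a challenge.
    if device_trust < policy.min_device_trust:
        return "deny"
    # User trust can be raised, e.g. by re-authenticating or using a
    # hardware second factor, so a shortfall here triggers a challenge.
    if user_trust < policy.min_user_trust:
        return "challenge"
    return "allow"


payroll = AppPolicy(min_user_trust=Trust.HIGH, min_device_trust=Trust.HIGH)
print(decide(Trust.MEDIUM, Trust.HIGH, payroll))  # challenge
print(decide(Trust.HIGH, Trust.LOW, payroll))     # deny
```

The point of the sketch is that the decision is made per request and per application, with no reference to which network the request came from.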

This is an incredibly complex, sophisticated model for how to manage access to internal applications. BeyondCorp is the gold standard of what we should all aim for when designing zero trust architectures, including what the US government is now mandating.

There’s one problem: a fully zero trust architecture is incredibly difficult and incredibly expensive to deploy, and arguably, no one has ever achieved it.

Even Google’s BeyondCorp isn’t a fully zero trust architecture

Google adopted its zero trust architecture of BeyondCorp gradually, targeting both greenfield and brownfield applications, over the span of several years. Today, still, BeyondCorp isn’t 100% rolled out at Google. It never will be. There will always be a gap of enterprise applications or new tools being introduced which require work to integrate.

As with any system, there are exceptions, and those exceptions become more and more expensive (and therefore less worthwhile) to address. To quote another of the BeyondCorp papers:

Despite all the best efforts to define, roll out, measure, and enforce controls, you may inevitably face the harsh reality that 100% uniform control deployment is a mythical state where unicorns frolic unconcerned about malware and state-sponsored attackers.

Google has already invested substantial resources, over many, many years, in developing BeyondCorp:

Fully achieving the goals outlined in this paper (and the more general goals of BeyondCorp) requires significant resources.

If it’s “significant resources” at Google scale, it must be a massive investment. I would venture to say that if a VP knew the initial cost and time it would actually take, they might not have made that investment. (It’s likely more than the $9.9M the Office of Personnel Management has set aside.) For Google, there was a clear reason to invest; but for a lot of other organizations, there isn’t.

I don’t want to give the impression that Google’s implementation of BeyondCorp isn’t successful. It is: it’s a project that has significantly changed how Google operates and improved its security. It has introduced a new model for how to think about network security in the industry (zero trust architecture) and birthed multiple analyst categories, as well as many, many startups. It’s a multi-year initiative in security teams at every major tech company. The US government’s memo cites it in everything but name.

So, despite zero trust’s popularity, it shouldn’t come as a surprise that if Google hasn’t fully succeeded, other companies haven’t either.

Security teams will laugh if you say you’re implementing a zero trust architecture

As with other groundbreaking research coming out of top tech companies, when the BeyondCorp paper was published, startups were created with the goal of reproducing such a zero trust architecture to make it available to anyone. They tried to offer, and I quote, “BeyondCorp… for the Cloud Native organization”, “BeyondCorp outside of Google”, and “BeyondCorp for the rest of us”, amongst others. They bought beyondcorp.com. They created an Alliance. The first BeyondCorp paper (there are many) was released in 2014 — but it’s 2022 now, so shouldn’t you, too, have a zero trust architecture?

What happened? Startups are going to startup, by failing, or getting acquired and absorbed. Today, you can use an OpenSSH ProxyCommand config to authenticate your SSH sessions using Okta, and you can limit access to an application from a device based on IP address and whether screen lock is enabled. That’s better, but not saying much, unfortunately.

The reality is if you say you’re “doing zero trust” to a security professional today, they’ll assume you’re naïve, and haven’t realized what you’re actually signing yourself up for. They’ve tried to use the tools already available in the market themselves to get closer to a zero trust architecture, and faced too many challenges. They’ve seen the market get excited, companies be born and die, and yet nothing really changed. If multiple companies registered in Delaware already died trying to make this happen, why would a company headquartered in Virginia be successful? This might be a rare case of security professionals being realistic, not overly pessimistic (as they are often portrayed).

Not to depress you even further, but it’s even worse. The tools that are on the market today aren’t even doing the hard part of zero trust yet.

Everyone seems to have missed the bit about devices

Recall that Google’s BeyondCorp has three pieces: users, devices, and application policies. But — you may have noticed — the heftiest section of that, by a long shot, is devices. Device trust should be determined using a device inventory, device identity (like a certificate), and device measurements.

So, what does the US government memo say in its barely one and a half pages on devices? That you should inventory your assets, and have government-wide endpoint detection and response. There’s more written on MFA than the whole topic of devices! (Not that MFA isn’t important.)

Why are devices so important? If you recall, if a user has insufficient trust to access a specific application, Google’s BeyondCorp can require the user to perform a challenge, such as re-authenticating or using a hardware second factor. But what’s completely impractical? Asking the user to change the device they are on! (Even my favourite Googler has 5 laptops at home, but only some of them work, even fewer are for work, and most of them are in the other room.) So, when you’re making a decision to authorize a specific access request, you have some data about the user, but ideally, you have as much data about the device as possible, since you can’t get any more. Most of what you can infer about the level of trust for a given connection comes from the device, not the user.

What should you, and the US government, be doing to measure device trust? To poorly summarize: have an inventory of your fleet, and use an MDM to measure OS version, patch level, and encryption status. Yes, the memo got this part. (By the way, have you tried to buy an MDM that covers more than three OSes lately?)

But also, use a device certificate that is specific to each device, protected by the machine’s Trusted Platform Module (TPM). To do that, you only need enough of an understanding of TPMs to implement secure boot to verify the device’s OS, as well as protected device certificates (sorry, NSA, I don’t think that your implementation in Java, which was susceptible to Log4j, will do), and the ability to run an enterprise CA for those certificates, one that is available whenever your employees need to access any application. You already run one of those for your SSH certificates, right?
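
As a rough sketch of how those signals might combine, the function below maps facts about a device to a trust tier. The field names and the four tiers are assumptions of mine for illustration; neither the memo nor the BeyondCorp papers prescribe them, and a real system would gather these facts from an inventory, an MDM, and certificate validation rather than from booleans.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DeviceFacts:
    """Hypothetical facts gathered about a device at request time."""
    in_inventory: bool          # present in the company's device registry
    os_patched: bool            # reported by the MDM
    disk_encrypted: bool        # reported by the MDM
    cert_key_in_tpm: bool       # device certificate key is TPM-protected
    secure_boot_verified: bool  # OS verified at boot


def device_trust_tier(d):
    """Map device facts to a tier; the tier names are illustrative only."""
    if not d.in_inventory:
        return "untrusted"
    if not (d.os_patched and d.disk_encrypted):
        return "known"
    if not (d.cert_key_in_tpm and d.secure_boot_verified):
        return "managed"
    return "hardware-bound"


hotel_lobby_pc = DeviceFacts(False, False, False, False, False)
patched_chromebook = DeviceFacts(True, True, True, True, True)
print(device_trust_tier(hotel_lobby_pc))      # untrusted
print(device_trust_tier(patched_chromebook))  # hardware-bound
```

An inventory plus an MDM only gets a device as far as the "managed" tier in this sketch; the top tier depends on the TPM-backed pieces that are the hard part.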

Where does that leave us?

So, someone at the US government also read the BeyondCorp papers, and also wants a true, working, zero trust architecture. Don’t we all. And I suspect that the government mandating this (by 2024!) won’t make it true: there isn’t a tech company where a top-down mandate like this would work today, and I have no reason to believe the US government can do better.

These are the right goals. As an industry, we can continue to build pieces of an ideal zero trust solution. A solution that includes:

User authentication, based on single sign-on, and hardware second factors;
Device authentication, based on a device registry, a hardware-bound device identity, and measured device characteristics like secure boot;
Application policies, so that each application is microsegmented and enforces its own policies, with no single point of failure;
No public point of entry to the network;
End-to-end encryption with world-class cryptography;
… and meets all the traditional enterprise requirements, like audit logging or SAML integration.

I don’t think anyone today is positioned to build every part of that solution. It’s far too much for even an organization like the US government to get right in only 3 years. We’ll still be pulling together components piecemeal, where they exist.

The US government memo feels not like a mandate for zero trust, but a mandate to insert “zero trust” in all of our marketing. Here’s hoping we get a few of the missing pieces out of it too.