Zero Trust Architecture in Microsoft and Google BeyondCorp

 

Page content

Zero Trust Architecture in Microsoft

By 2020, Microsoft identified four core scenarios to achieve zero trust. These scenarios satisfy the requirements for strong identity, enrollment in device management, and device health validation. It also made way for alternative access for un-managed devices and validation for application health.

Microsoft Zero Trust Architecture

The initial scope for implementing zero trust focused on common corporate services used in the Microsoft enterprise by information workers, employees, partners, and vendors. Their zero trust implementation focused on the core set of applications that Microsoft employees use on a day‑to‑day basis and platforms that would include iPhones, Androids, Macs, and Windows. Microsoft took a structured approach toward zero trust that has spanned many years. Their road map was organized by phases that include an identity‑driven security solution phase that centers on securing user identity with strong authentication, as well as the elimination of passwords. It also included the verify device health phase, the verify access phase, and verify services. In the verify identity phase Microsoft began the zero trust journey implementing two‑factor authentication via smart cards for all users to access the corporate network remotely.

The Zero Trust model

Based on the principle of verified trust—in order to trust, you must first verify—Zero Trust eliminates the inherent trust that is assumed inside the traditional corporate network. Zero Trust architecture reduces risk across all environments by establishing strong identity verification, validating device compliance prior to granting access, and ensuring least privilege access to only explicitly authorized resources.

Zero Trust requires that every transaction between systems (user identity, device, network, and applications) be validated and proven trustworthy before the transaction can occur. In an ideal Zero Trust environment, the following behaviors are required:

  • Identities are validated and secure with multi-factor authentication everywhere.
  • Devices are managed and validated as healthy.
  • Telemetry is pervasive.
  • Least privilege access is enforced.

Zero Trust Architecture in Google BeyondCorp

BeyondCorp began as an internal Google initiative to enable every employee to work from untrusted networks without the use of a VPN. BeyondCorp is used fully by Googlers every day to provide user‑ and device‑based authentication and authorization for Google’s core infrastructure. The primary components of it would include single sign‑on, access proxies, access control engines, user and device inventory, security policy, and trust repository. Connecting from a particular network must not determine which services you can access. Access to services is granted based on what Google knows about you and your device. All accesses to services must be authenticated, authorized, and encrypted. When you look at their PIN test rules of conduct for their cloud consumers, which is about two sentences, you can see the effect of what zero trust accomplished by means of their BeyondCorp mission. If we were to look closely at the BeyondCorp architecture, it contains four primary components. Data sources maintains management agents, certificate authorities, asset inventories, and exceptions to hard rules. Access intelligence maintains the device inventory service, which also houses the trust inferrer, which is a system that continually analyzes and annotates device state. The system sets the maximum trust tier accessible by the device and assigns the VLAN to be used by the device on the corporate network. These data are recorded in the device inventory. Gateways house network switches, interactive logins, and web proxies. And finally, resources maintain VLANs, code repositories, and bug trackers. Resources are accessed via gateways such as SSH servers, web proxies, or 802.1X‑ enabled networks. Gateways perform authorization actions such as enforcing a minimum trust tier or assigning a VLAN at the point of user log in.

What is BeyondCorp?

BeyondCorp is Google’s implementation of the zero trust model. It builds upon a decade of experience at Google, combined with ideas and best practices from the community. By shifting access controls from the network perimeter to individual users, BeyondCorp enables secure work from virtually any location without the need for a traditional VPN.

BeyondCorp began as an internal Google initiative to enable every employee to work from untrusted networks without the use of a VPN. Now, BeyondCorp is used by most Googlers every day to provide user- and device-based authentication and authorization for Google’s core infrastructure and corporate resources.

Components of BeyondCorp

BeyondCorp allows for single sign-on, access control policies, access proxy, and user- and device-based authentication and authorization. The BeyondCorp principles are:

  • Access to services must not be determined by the network from which you connect
  • Access to services is granted based on contextual factors from the user and their device
  • Access to services must be authenticated, authorized, and encrypted

BeyondCorp consists of many cooperating components to ensure that only appropriately authenticated devices and users are authorized to access the requisite enterprise applications. Each component is described below

BeyondCorp components and access flow

Device Inventory Database

BeyondCorp uses the concept of a “managed device,” which is a device that is procured and actively managed by the enterprise. Only managed devices can access corporate applications. A device tracking and procurement process revolving around a device inventory database is one cornerstone of this model. As a device progresses through its life cycle, Google keeps track of changes made to the device. This information is monitored, analyzed, and made available to other parts of BeyondCorp. Because Google has multiple inventory databases, a meta-inventory database is used to amalgamate and normalize device information from these multiple sources, and to make the information available to downstream components of BeyondCorp. With this meta-inventory in place, we have knowledge of all devices that need to access our enterprise.

Device Identity

All managed devices need to be uniquely identified in a way that references the record in the Device Inventory Database. One way to accomplish this unique identification is to use a device certificate that is specific to each device. To receive a certificate, a device must be both present and correct in the Device Inventory Database. The certificate is stored on a hardware or software Trusted Platform Module (TPM) or a qualified certificate store. A device qualification process validates the effectiveness of the certificate store, and only a device deemed sufficiently secure can be classed as a managed device. These checks are also enforced as certificates are renewed periodically. Once installed, the certificate is used in all communications to enterprise services. While the certificate uniquely identifies the device, it does not single-handedly grant access privileges. Instead, it is used as a key to a set of information regarding the device.

User and Group Database

BeyondCorp also tracks and manages all users in a User Database and a Group Database. This database system tightly integrates with Google’s HR processes that manage job categorization, usernames, and group memberships for all users. As employees join the company, change roles or responsibilities, or leave the company, these databases are updated. This system informs BeyondCorp of all appropriate information about users that need to access our enterprise.

Single Sign-On System

An externalized, single sign-on (SSO) system is a centralized user authentication portal that validates primary and secondfactor credentials for users requesting access to our enterprise resources. After validating against the User Database and Group Database, the SSO system generates short-lived tokens that can be used as part of the authorization process for specific resources.

Deployment of an Unprivileged Network

To equate local and remote access, BeyondCorp defines and deploys an unprivileged network that very closely resembles an external network, although within a private address space. The unprivileged network only connects to the Internet, limited infrastructure services (e.g., DNS, DHCP, and NTP), and configuration management systems such as Puppet. All client devices are assigned to this network while physically located in a Google building. There is a strictly managed ACL (Access Control List) between this network and other parts of Google’s network.

Internet-Facing Access Proxy

All enterprise applications at Google are exposed to external and internal clients via an Internet-facing access proxy that enforces encryption between the client and the application. The access proxy is configured for each application and provides common features such as global reachability, load balancing, access control checks, application health checks, and denial-of-service protection. This proxy delegates requests as appropriate to the back-end application after the access control checks (described below) complete.

Any modern Web application deployed at scale employs front-end infrastructure, which is typically a combination of load balancers and/or reverse HTTP proxies. Enterprise Web applications are no exception, and the front-end infrastructure provides the ideal place to deploy policy enforcement points. As such, Google’s front-end infrastructure occupies a critical position in BeyondCorp’s enforcement of access policies. The main components of Google’s front-end infrastructure are a fleet of HTTP/HTTPS reverse proxies called Google Front Ends. GFEs provide a number of benefits, such as load balancing and TLS handling “as a service.” As a result, Web application back ends can focus on serving requests and largely ignore the details of how requests are routed. BeyondCorp leverages the GFE as a logically centralized point of access policy enforcement. Funneling requests in this manner led us to naturally extend the GFE to provide other features, including self-service provisioning, authentication, authorization, and centralized logging. The resulting extended GFE is called the Access Proxy (AP).

Public DNS Entries

All of Google’s enterprise applications are exposed externally and are registered in public DNS with a CNAME pointing the applications at the Internet-facing access proxy.

Trust Inference for Devices and Users

The level of access given to a single user and/or a single device can change over time. By interrogating multiple data sources, we are able to dynamically infer the level of trust to assign to a device or user. This level of trust can then be used by the Access Control Engine (described below) as part of its decision process. For example, a device that has not been updated with a recent OS patch level might be relegated to a reduced level of trust. A particular class of device, such as a specific model of phone or tablet, might be assigned a particular trust level. A user accessing applications from a new location might be assigned a different trust level. We use both static rules and heuristics to ascertain these levels of trust.

Access Control Engine

An Access Control Engine within the access proxy provides service-level authorization to enterprise applications on a per-request basis. The authorization decision makes assertions about the user, the groups to which the user belongs, the device certificate, and artifacts of the device from the Device Inventory Database. If necessary, the Access Control Engine can also enforce location-based access control. The inferred level of trust in the user and the device is also included in the authorization decision. For example, access to Google’s bug tracking system can be restricted to full-time engineers using an engineering device. Access to a finance application can be restricted to fulltime and part-time employees in the finance operations group using managed non-engineering devices. The Access Control Engine can also restrict parts of an application in different ways. For example, viewing an entry in our bug tracking system might require less strict access control than updating or searching the same bug tracking system.

References