Exploring Next-Generation Network Isolation and Access Control

Overview

Security domain isolation is one of the most common and fundamental problems in enterprise security. Today the main method of isolation is network isolation (physical isolation being especially important). For a very small company, a VPC in the cloud already provides basic separation between the office network and the production network. For large companies that run their own IDCs and network infrastructure, or even build their own cloud, it is a basic yet complex piece of security engineering. "Basic" here does not mean that it lacks technical depth; it means that it sits at the foundation of the security system. If the foundation is weak, nothing built on top of it will be solid.

The goal of isolation is to limit how far a risk can spread and to confine the damage to a single security domain. It is similar to the watertight compartments of a ship: when one compartment floods, the ship does not sink. From the attacker's point of view, network isolation stops an attacker from moving further after compromising a single point. For more on the defensive perspective, see the chapter on defense in depth in the "Advanced Guide to Internet Enterprise Security".

Network isolation is usually divided into (1) IDC isolation, (2) office network isolation, and (3) access control between the office network and the IDC. Because security policy is sensitive, this article does not disclose detailed strategies or implementation plans.

IDC isolation

Let us start with IDC isolation. Companies with an unremarkable security practice usually divide things like this: the office network has no internal isolation, and the production network has no internal isolation either. Even some companies with multi-billion-dollar market values work this way. Such a division is convenient for operations, but from a security standpoint it amounts to doing nothing. A more common practice is shown below (an example from an e-commerce company):

First, the IDC as a whole is split into public cloud and private cloud; next, production and test environments are separated within the private cloud; finally, the office network is isolated from the IDC. This granularity is of course relatively coarse, and each security domain is in practice divided further. Operations channels and data channels are set up between security domains, and data must be desensitized and audited when it moves between them.

Let's look at another example, the security domains of a large Internet company:

Here the channel between OA and the IDC is defined at a finer granularity: it is the part marked as the internal operations network in the diagram. The operations bastion host, for example, belongs to the internal operations network. The idea is to place the operations channel, the data channel, and the test environment all on the internal operations network, so the whole can be understood as three large security domains: OA, test (the internal operations channel), and production.

Beyond the major domestic Internet companies, here is an example from an international e-commerce and cloud computing giant:

The office network contains the test environment and is isolated from the production environment by the release system. Within production, apart from what compliance mandates (such as key management), there is not much isolation, presumably for the sake of flexible use of network resources. Staging can be understood as a pre-release environment at the end of CI. On the cloud computing side there are also the global isolation concepts of Region (regions such as North America and Asia-Pacific have almost no interconnection with each other) and AZ (Availability Zone, which can be understood roughly as a group within the same IDC that shares the same disaster recovery level).

The examples above basically represent the security domain isolation methodology of the vast majority of Internet and cloud computing companies worldwide. Taking the strengths of each, we put together a fairly general example of security domain division, as shown below:

For most companies this is basically enough, provided the work is done carefully and there is no strong need to be aggressive or technologically leading. However, network isolation is inherently at odds with elastic computing. The finer the isolation, the more it hampers rapid scale-out, service orchestration, and resource reclamation. In a massive IDC operations environment, it becomes an obstacle to fully automated resource management.

On the other hand, although rules such as the separation of production and test are clearly defined, many problems arise in daily use. R&D at Internet companies does not follow the traditional waterfall model, and dedicated testers are not necessarily available in sufficient numbers. For fast-expanding businesses in particular, many processes and rules sit in a gray zone, and the test environment may not be able to satisfy every test requirement, such as full-link stress testing. The challenge is that security isolation gets in the way of business requirements, yet services cannot be interrupted.
As a result, many workarounds appear, such as going straight to the production environment and bypassing the scenarios and rules that security had planned, and in the end the isolation looks somewhat illusory.

Thinking about it more carefully, I believe network isolation is a product of the last century. The concept has existed as long as network security itself, and it came out of an era when the Internet was undeveloped and massive IDCs did not exist, so the model may be partly (not wholly or absolutely) out of date. It was only when I came across Google's paradigm that a completely new idea took shape: instead of network-based isolation, access control is implemented through application-level isolation, as shown in the following figure:

We analyze this approach on several levels.

First, at Google's scale the IDC can only be managed through automation. Manual operation is impossible, and so, of course, is manual approval of ACL policies. Fully automated cluster management is the precondition for highly elastic computing: machine installation and initialization, bringing machines online, deployment of application instances, and finally taking machines offline, wiping their data, and reclaiming resources are all automated, so excessive isolation would also constrain capacity. Historically, these automated tasks were carried out through the sshd service and required root privileges. Google considered this a huge security risk and redesigned it as an admin service that runs on all VMs and container instances. The admin service is essentially an RPC service that supports authentication, is ACL-driven and audited, and needs only the minimum permissions required to do its job. In effect, commands that used to travel through an SSH pipe become a series of RPC calls to the admin service, and every parameter of each call can be audited.
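
The idea of replacing a root shell with audited, ACL-checked calls can be illustrated with a minimal sketch. This is not Google's admin service or its API: the class `AdminService`, the caller names, and the `wipe_disk` operation are all hypothetical, the RPC transport is left out, and the caller identity is assumed to have been authenticated already.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set, Tuple


@dataclass
class AdminService:
    """Hypothetical stand-in for an RPC admin service (illustration only)."""
    # ACL: authenticated caller identity -> method names it may invoke.
    acl: Dict[str, Set[str]]
    # Registered operations: method name -> handler taking keyword parameters.
    handlers: Dict[str, Callable[..., str]] = field(default_factory=dict)
    # Audit trail of (caller, method, parameters, outcome).
    audit_log: List[Tuple[str, str, dict, str]] = field(default_factory=list)

    def register(self, method: str, handler: Callable[..., str]) -> None:
        self.handlers[method] = handler

    def call(self, caller: str, method: str, **params) -> str:
        """Authorize, run, and audit one call; every parameter is recorded."""
        if method not in self.handlers:
            self.audit_log.append((caller, method, params, "unknown_method"))
            raise KeyError(method)
        if method not in self.acl.get(caller, set()):
            self.audit_log.append((caller, method, params, "denied"))
            raise PermissionError(f"{caller} may not call {method}")
        result = self.handlers[method](**params)
        self.audit_log.append((caller, method, params, "ok"))
        return result


def wipe_disk(device: str) -> str:
    # Placeholder: a real handler would do the work with only the privileges
    # this one task needs, instead of going through a root shell.
    return f"wiped {device}"


service = AdminService(acl={"decommission-bot": {"wipe_disk"}})
service.register("wipe_disk", wipe_disk)
```

A call such as `service.call("decommission-bot", "wipe_disk", device="/dev/sdb")` succeeds and is logged with its parameters, while the same call from an identity that is not in the ACL raises `PermissionError` and is logged as denied.
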
That, then, is the first piece of background: cluster management at Google is fully automated and runs over audited RPC calls rather than SSH.

Second, this mechanism works on one premise: the Google IDC intranet carries only RPC, unlike other companies whose intranets carry MySQL, SSH, RPC, HTTP, and other protocols. Controlling access to RPC services is therefore equivalent to controlling access to the entire attack surface. An ordinary IDC intranet with a mix of protocols does not meet this precondition. That does not make the approach useless there, but the other protocols clearly still expose attack surface and can still be used for intranet penetration and lateral movement. It also reflects how much more deeply Google has developed its own technology stack than most corporate infrastructures, so readers should recognize that this is not an all-encompassing solution.

Third, for the working principle of this mechanism, refer to the right half of the figure. Service invocations are authenticated at the RPC layer. For example, front-end services of type A can only access back-end services of type B. This can be refined further: front-end service A of business X can only access back-end service B of business X and cannot access the back-end services of business Y. Separating test from production only requires defining test and production as different business categories. From the attacker's point of view, if you take over a machine on the intranet and pull out a scanner to sweep other machines, the routes are reachable, but without the corresponding RPC permissions you have no token to access other applications, which is equivalent to being isolated.

Unfortunately, the precondition of converging the IDC intranet onto a single RPC protocol does not hold for the vast majority of companies, so how widely this form can be put into practice remains open to discussion. Still, I think it represents a future direction: security isolation fit for very large-scale IDCs cannot be achieved by simply drawing security domains and approving ACLs by hand.
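
The third point (front-end A of business X may call back-end B of business X and nothing else) can be sketched as a default-deny rule check. This is only an illustration under stated assumptions: `ServiceIdentity`, `is_allowed`, and the rule set are hypothetical names, and the identities are assumed to have been authenticated by the RPC layer before the check runs.

```python
from dataclasses import dataclass
from typing import Set, Tuple


@dataclass(frozen=True)
class ServiceIdentity:
    """Identity a service presents on every RPC (assumed already authenticated)."""
    business: str  # e.g. "X", "Y", or a separate category such as "X-test"
    service: str   # e.g. "frontend-A", "backend-B"


# An allowed (caller, callee) pair. Anything not listed is denied.
Rule = Tuple[ServiceIdentity, ServiceIdentity]


def is_allowed(rules: Set[Rule], caller: ServiceIdentity, callee: ServiceIdentity) -> bool:
    """Default-deny check; network reachability plays no part in the decision."""
    return (caller, callee) in rules


fe_x = ServiceIdentity("X", "frontend-A")
be_x = ServiceIdentity("X", "backend-B")
be_y = ServiceIdentity("Y", "backend-B")
# Test is modeled as just another business category, as described above.
be_x_test = ServiceIdentity("X-test", "backend-B")
# A compromised machine running a scanner has no rule that names it.
scanner = ServiceIdentity("unknown", "scanner")

RULES: Set[Rule] = {(fe_x, be_x)}
```

With these rules, `is_allowed(RULES, fe_x, be_x)` is true, while calls from `fe_x` to `be_y` or `be_x_test`, and any call from `scanner`, are refused even though the hosts can route to each other.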

Office network isolation

In the "Advanced Guide to Internet Enterprise Security" I described one way to divide the OA network, as shown below:

First, IT applications (if they sit on the internal network) are separated from desktop users, and desktop users are assigned to dynamic VLANs according to their functional department. This is the traditional way of dividing security domains, and if the policies are kept tight it works well.

Google's BeyondCorp (https://cloud.google.com/beyondcorp/) takes an unconventional path and is essentially a product of office applications moving to the cloud and going mobile.

The figure shows the BeyondCorp mechanism: by identifying the current state of the device, it uses a dynamic access control policy to decide which OA applications the device may currently access. The most essential differences between this mechanism and the traditional OA security domain are:

Traditional access control policies are based on IP / MAC.
The BeyondCorp model is based on device / account.
Access control in the traditional model is static; in BeyondCorp it is dynamic.
The traditional model is an ACL; BeyondCorp is closer to a risk-control mechanism.

For companies with a highly mobile workforce, where cloud-hosted applications and SSO are already in place, making BeyondCorp a reality only takes an asset inventory and a gateway modified to support a risk-control engine. For the many companies that are not highly mobile, if traditional security domain division, network monitoring, and endpoint security management are already done well, the cost of forcing a conversion to BeyondCorp is very high, and I tend to think the ROI may not be enough to support the security team's performance goals.
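
The contrast between a static IP / MAC ACL and a dynamic device / account decision can be sketched as follows. This is not BeyondCorp's actual policy engine: the device attributes, the scoring in `trust_level`, the application names, and the entitlement table are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Set


@dataclass(frozen=True)
class Device:
    device_id: str
    managed: bool         # enrolled in the company asset inventory
    disk_encrypted: bool
    os_patched: bool


def trust_level(device: Device) -> int:
    """Score the device from its current state; meant to be re-run per request."""
    if not device.managed:
        return 0
    return 1 + int(device.disk_encrypted) + int(device.os_patched)


# Hypothetical minimum trust level required by each OA application.
REQUIRED_LEVEL: Dict[str, int] = {"wiki": 1, "code-review": 2, "finance": 3}

# Hypothetical per-account entitlements.
ENTITLEMENTS: Dict[str, Set[str]] = {
    "alice": {"wiki", "code-review", "finance"},
    "bob": {"wiki"},
}


def can_access(account: str, device: Device, app: str) -> bool:
    """Decide from account and device state; source IP and MAC are never consulted."""
    if app not in REQUIRED_LEVEL:
        return False
    if app not in ENTITLEMENTS.get(account, set()):
        return False
    return trust_level(device) >= REQUIRED_LEVEL[app]


healthy = Device("laptop-1", managed=True, disk_encrypted=True, os_patched=True)
unpatched = Device("laptop-1", managed=True, disk_encrypted=True, os_patched=False)
personal = Device("phone-9", managed=False, disk_encrypted=True, os_patched=True)
```

The same account on the same laptop gets a different answer once the device falls behind on patches, which is the "dynamic" part: `can_access("alice", healthy, "finance")` is allowed, `can_access("alice", unpatched, "finance")` is not, and an unmanaged device is refused everywhere.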

Between OA and IDC
There are a few necessary channels:

     SSH remote access channel: goes through a jump server, with full auditing and permission revocation.
     Data security environment: data development, BI reporting, and similar work in which staff need to touch the data warehouse is done here, usually with auditing through something like a virtual desktop.
     Data transmission channel: when data must be uploaded or brought back for debugging, testing, or other reasons, this may only happen over a designated transmission channel and must meet desensitization and audit requirements.
     Code release channel: large Internet companies usually have their own release systems, sometimes even with a scheme that maps developers' local disks, so the security of this channel is handled inside the release system.
     Infrastructure management channel: a special version of the operations channel that is integrated with operations management tools or automation platforms and needs to reach all IDC resources.
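
One way to picture the list above is as a small registry in which each approved channel carries the controls it must satisfy, and anything outside the registry is refused. The channel keys and control labels below are hypothetical shorthand for the requirements just listed, not an actual policy.

```python
from typing import Dict, FrozenSet, Set

# Channel name -> controls that must be in effect before traffic may pass.
# The labels are hypothetical shorthand for the requirements listed above.
CHANNELS: Dict[str, FrozenSet[str]] = {
    "ssh": frozenset({"jump_server", "full_audit"}),
    "data_environment": frozenset({"virtual_desktop_audit"}),
    "data_transfer": frozenset({"desensitized", "audited"}),
    "code_release": frozenset({"release_system"}),
    "infra_management": frozenset({"ops_platform"}),
}


def missing_controls(channel: str, controls_in_effect: Set[str]) -> Set[str]:
    """Return the controls still missing; raise if the channel is not approved."""
    if channel not in CHANNELS:
        raise ValueError(f"no approved OA-IDC channel named {channel!r}")
    return set(CHANNELS[channel]) - controls_in_effect


def may_pass(channel: str, controls_in_effect: Set[str]) -> bool:
    """Traffic passes only on an approved channel with every control satisfied."""
    return not missing_controls(channel, controls_in_effect)
```

For example, a data upload that has been audited but not desensitized is held back: `missing_controls("data_transfer", {"audited"})` reports `desensitized` as outstanding, and a channel name that is not registered raises `ValueError`.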

Summary

For most enterprises, the traditional method of dividing security domains is still adequate and will remain so for at least a few years. For enterprises with large IDCs, however, elastic computing and security can come into conflict. Following Google's model requires highly automated cluster management, a high level of service governance, a highly unified technology stack, and an extremely converged set of intranet protocols. If the OA network is to adopt the BeyondCorp model, the degree of cloud-based office work and mobility needs to be relatively high.

Taking a step back and looking at a company's security program as a whole, pursuing the ultimate in network isolation is not the right strategy if there are many shortcomings elsewhere. It is better to put resources (for larger security teams, including in-house development resources) into higher-priority areas, such as mitigation capabilities at the infrastructure level.

From the perspective of a senior engineer (a high P level), pursuing technological leadership is part of the job. For a security manager, risk management and ROI are always the core of security work. So as of the end of 2017, most companies have no need to copy Google's model. Unfortunately, the company where I work happens to meet the basic conditions for it, so the security team finds itself on that road.