Validating and Restoring Defense in Depth Using Attack Graphs
by user
Comments
Transcript
Validating and Restoring Defense in Depth Using Attack Graphs
Validating and Restoring Defense in Depth Using Attack Graphs Richard Lippmann, Kyle Ingols, Chris Scott, Keith Piwowarski, Kendra Kratkiewicz, Mike Artz, Robert Cunningham MIT Lincoln Laboratory 244 Wood Street Lexington, Massachusetts 02420-9108 Email: [email protected] Abstract— Defense in depth is a common strategy that uses layers of firewalls to protect Supervisory Control and Data Acquisition (SCADA) subnets and other critical resources on enterprise networks. A tool named NetSPA is presented that analyzes firewall rules and vulnerabilities to construct attack graphs. These show how inside and outside attackers can progress by successively compromising exposed vulnerable hosts with the goal of reaching critical internal targets. NetSPA generates attack graphs and automatically analyzes them to produce a small set of prioritized recommendations to restore defense in depth. Field trials on networks with up to 3,400 hosts demonstrate that firewalls often do not provide defense in depth due to misconfigurations and critical unpatched vulnerabilities on hosts. In all cases, a small number of recommendations was provided to restore defense in depth. Simulations on networks with up to 50,000 hosts demonstrate that this approach scales well to enterprise-size networks. I. I NTRODUCTION Defense in depth is a common strategy used to protect critical resources on enterprise networks as well as Supervisory Control and Data Acquisition (SCADA) and other process control subnets. It relies primarily on multiple layers of firewalls between protected systems and the internet. There is often a perimeter firewall between a corporate enterprise network and the internet, internal firewalls protect subnets for separate enterprise units, and deeper internal firewalls protect critical subnets. It is difficult to verify the protection this affords against cyber attacks because firewall rules are complex and vulnerabilities on hosts exposed through firewalls are constantly being discovered and patched. This paper describes a tool named NetSPA (NETwork Security and Planning Architecture) that verifies and, if This work is sponsored by the United States Air Force under Air Force Contract FA8721-05-C-0002. Opinions, interpretations, conclusions and recommendations are those of the authors and are not necessarily endorsed by the United States Government. 1 of 10 necessary, provides suggestions to restore defense in depth for large enterprise networks. It provides a comprehensive solution that is not currently available due to the limitations of current security tools. For example Network vulnerability scanners, such as Nessus [1], discover hundreds or thousands of vulnerabilities on even small networks but do not indicate which of these enable an attacker to progress through a network to reach critical resources. A firewall ruleset evaluator identifies common misconfigurations, such as rules permitting arbitrary inbound traffic to a common server port (e.g. [2]), but again does not indicate which rules permit an attacker to reach critical resources. NetSPA uses attack graphs to measure and maintain network security. Many different representations have been proposed for attack graphs (e.g. [3]–[9]). The representation we use is illustrated in Fig. 1. This attack graph starts at a root node that represents the attacker starting host. The root node in this graph is the upper node labeled “attacker.” Edges represent a specific instance of a vulnerability that an attacker is able to exploit. An edge connects a source node where the attack originated to a target node representing a new privilege provided by the attack. The new target node represents either increased privilege on an alreadycompromised host or new privilege on a new host. Multiple edges connect two nodes in an attack graph when there are multiple vulnerabilities that achieve the same goal and any one is sufficient to compromise the target host. This formally makes our attack graph a multigraph. An attack graph for a network shows all hosts that can be compromised by an attacker starting at a specific location and the sequences of actions that permit these compromises. This attack graph was generated using NetSPA from data collected at an actual site containing the network shown in Fig. 2. This site contains Division and Office local area networks (LANs) separated from a larger outside enterprise network by the upper perimeter firewall. The Office LAN is in addition separated from the Division LAN by the lower External Network Attacker Host … Perimeter Firewall Division LAN … Internal Firewall Office LAN Office DMZ Fig. 1. Simplified Attack Graph for Real Network Fig. 2. internal firewall and there is an Office DMZ subnet that contains servers accessible from hosts on the Division LAN. SCADA devices or other critical resources would be located on the Office LAN subnet to provide the greatest protection from attackers on the external network. There are more than 200 hosts on the Division LAN and the external network but fewer than 10 hosts on the Office LAN. The original attack graph generated for this site contained thousands of nodes and edges and could not have been easily displayed in this paper. This complex graph was automatically simplified to create the graph shown on the left. In this graph, nodes represent many hosts compromised at the same level from the same source node and edges represent all vulnerabilities that can be used to compromise any of these hosts. The attack graph is simplified also to only show hosts that can be compromised at the “system” level that corresponds a UNIX root or Windows administrator level. The attack graph shows that an attacker on the external network (Node N1) can directly compromise 93 hosts in the external network by exploiting 88 vulnerabilities across these hosts and that one server (Node N2) can also be compromised through the upper perimeter firewall. Following these hosts, the division LAN server (Node N2) can be used as a stepping stone to compromise 125 hosts in the Division LAN by exploiting 103 vulnerabilities across these hosts. This server can also compromise a server in the Office DMZ (Node N664) using any one of four vulnerabilities. Finally, the server in the Office DMZ can be used as another stepping stone to compromise one host in the Office LAN using any of two vulnerabilities (Node N1982). Nodes for the two last machines have high numbers because they represent numbers from the original much larger attack graph before simplification. This simple graph shows that two stepping-stone hosts enable an outside 2 of 10 Topology of Real Network attacker to compromise 220 hosts at this site and to reach into the most protected Office LAN in the network. Attack paths through firewalls shown in Fig. 1 were unexpected. They demonstrate that even three layers of firewall rules (counting the LAN DMZ) are not providing defense in depth. Paths through all the firewalls were enabled by only ten critical unpatched vulnerabilities from among the thousands discovered by vulnerability scanners on these networks. This paper focuses on using NetSPA to verify the security of existing networks and, if necessary, create a prioritized list of recommendations for system administrators that provide the greatest improvement in network security by blocking the most destructive attack paths first. For example, in the above attack graph the first recommendation is to patch vulnerabilities on the DNS server (Node N2). An automatic system was developed to create recommendations because attack graphs are often large, complex, and difficult to interpret visually. The rest of the paper covers the NetSPA system in detail, describes results obtained on operational networks, and discusses related work. Further details on the NetSPA system are available in a recent report [10]. II. N ET SPA I NPUTS AND TASKS A block diagram of NetSPA is shown in Fig. 3. The left side of this figure shows that NetSPA automatically imports the following into an internal database: 1) Vulnerability Scans, such as those produced by Nessus [1], that list where vulnerabilities are in the network and provide information on individual hosts and open ports. Data Preprocesssing Vulnerability Scanners (Nessus, ISS, ...) Vulnerability Information (NVD, CVE, ..) Attack Graph Generation Relational Database Recommendations and Results Firewall Configuration Files (Sidewinder, Checkpoint, ...) Attack Graph Analysis Topology Information, Asset Values, ... Fig. 3. Reachability Analysis NetSPA System Block Diagram 2) Vulnerability Databases, such as NVD [11], that describe the prerequisites for and the effects of exploiting vulnerabilities. 3) Firewall Rules, such as Sidewinder rulesets, that describe how traffic may or may not flow through a filtering device. 4) Topology Information that specifies how firewalls and hosts from vulnerability scans are connected together. The first three items in the list can be obtained automatically and imported, but topology information needs to be provided by hand. This information is limited, doesn’t change often, and is relatively easy to enter. The right side of Fig. 3 shows that NetSPA uses imported data to perform three main tasks. These are (1) Compute reachability, (2) Create an attack graph, and (3) Generate recommendations. Algorithms to perform these three tasks that are efficient in time and space are described below. III. R EACHABILITY C OMPUTATION Reachability is a critical prerequisite of attack graph construction. It refers to determining which ports on all other hosts in a network can be reached via TCP or UDP connections from all interfaces on all hosts in a network. This involves a complex analysis of firewall rules and the network topology to make sure there is at least one path between the two hosts of interest to the port of interest. Prior work often explicitly assumed reachability information was provided (e.g. [7], [12]). This is reasonable for theoretical analyses, but we have found empirically that computing reachability can sometimes be more costly than computing attack graphs. Alternatively, some systems (e.g. [3], [13]) assume that reachability can be determined using information from network vulnerability scanners. This approach does not scale to large networks and will miss reachability 3 of 10 available to attackers because firewalls frequently allow connections only from specific source IP addresses. The straightforward approach to reachability analysis involves filling in elements in a reachability matrix. Each row in the matrix represents a source interface, and each column represents a destination port. Each cell of the matrix holds a Boolean value, indicating whether or not reachability exists between the source interface and the destination port. Precomputing this matrix is both spaceintensive (roughly quadratic in the number of hosts in the network) and time-intensive. Correctly computing a cell’s value is a computationally intensive task because each cell represents a potential traffic flow in the network. Our system determines if the network permits or denies the traffic by traversing the network model, crossing filtering devices and evaluating firewall rules when encountered. Two major improvements dramatically improved performance over the past year. The first improvement was to compute outbound reachability (rows in the matrix) only for hosts the attacker is actually able to compromise at a user or administrator level. Reachability from the non-compromised hosts is irrelevant to the attack graph, and this simple optimization saves large amounts of time when analyzing well-secured networks. The second improvement involves grouping hosts with similar reachability matrix entries. This is accomplished by separating the traditional reachability matrix into multiple submatrices for traffic within an individual subnet and traffic that crosses between subnets. Imagine a small hypothetical network with two subnets separated by a firewall. A full reachability matrix for this network is shown in Fig. 4. Interfaces and ports in the first subnet are listed before interfaces and ports in the second. The upper left and lower right submatrices represent intra-subnet reachability, and the remaining two submatrices are inter-subnet reachability. Interfaces that are fully interconnected without any in- Target Host/Port Fig. 4. Unfiltered Subnet 1 Filtered 1 -> 2 Filtered 2 -> 1 Unfiltered Subnet 2 Source Interface Source Interface Target Host/Port Full Reachability Matrix Unfiltered Subnet 1 Filtered 1 -> 2 Filtered 2 -> 1 Unfiltered Subnet 2 Fig. 5. Reachability Matrix with Filtered and Unfiltered Domains tervening filtering are placed in unfiltered reachability domains. Unfiltered reachability domains are formed for all interfaces in a subnet when there is no filtering between these interfaces. This method treats submatrices of the reachability matrix as individual subrows without losing information, as shown by the upper left and lower right subrows in Fig. 5. This method effectively reduces each unfiltered submatrix of dimensions m × n to a subrow of length n. Interfaces that are in the same unfiltered domain and are treated identically by all filtering rules are placed in filtered reachability domains. These interfaces have identical reachability to destinations outside of their unfiltered reachability domains. This method also makes it possible to treat additional submatrices of the reachability matrix as individual subrows, as shown in Fig. 5. For each set of interfaces in a filtered reachability domain, reachability needs to be calculated for only one interface of the set (one row of the submatrix), and all other members of the set share that reachability information. The example of Fig. 5 shows a single filtered reachability domain in subnet one, and three filtered reachability domains in subnet two. Two of those three domains are singletons and apply to only one interface. Every interface with reachability information belongs to one unfiltered and one filtered reachability domain. Filtered reachability domains are constructed automatically using firewall rule sets to identify hosts that are treated identically by all firewalls as described in [10]. In most networks, the two types of reachability domains reduce computation and storage requirements substantially. If our sample network had 200 hosts in each subnet with one port open on each, then the number of unique entries in the reachability matrix starts at 400 × 400 = 160, 000 without reachability domains. With reachability domains, the number of unique entries could drop as low as 800. This reduces storage and computation by more than two orders of magnitude. 4 of 10 IV. S ELECTING S OURCE AND D ESTINATION A DDRESSES A naive approach to creating attack graphs would consider only IP addresses of physical hosts in a network as source and destination addresses. This may miss exploitable attack paths for two reasons. First, firewalls can include Network Address Translation (NAT) rules to translate destination IP addresses that don’t exist on any physical host into a physical address. An attacker using such a destination address may be able to penetrate a firewall to an internal host accessed using destination NAT rules. This problem is addressed by adding a virtual destination host on each subnet in analyzed networks. This host includes all IP addresses mentioned in destination NAT rules across all firewalls. It forces our algorithms to reach any hosts that can only be accessed through destination NAT rules. A second limitation of using only IP addresses of physical hosts is that an attacker can sometimes spoof the source IP address of a host to use an IP address specifically allowed to pass through a firewall. An attacker using such a source address may be able to penetrate a firewall using custom pass rules inserted to satisfy some current or past requirement. This issue is addressed by allowing a user to select a virtual source host as the attacker starting node. In NetSPA, the attacker starting location can be the IP address of an actual physical host or a virtual host with a list of source IP addresses selected specifically to penetrate firewalls. A virtual host includes IP addresses found in any firewall rule and a representative address for each address range referenced by any firewall rule. These source addresses exercise all firewall rules and represent an attacker with complete knowledge of all firewall configurations. When a virtual attacker host is used, a worst-case assumption is made that an attacker can use any source IP address and attempt to bypass as many network restrictions as possible. V. V ULNERABILITY E VALUATION Attack graph generation requires knowledge of the prerequisites required for vulnerability exploitation and of the effect of exploitation on attacker privileges and the network. Previous work developed languages to represent attack prerequisites and effects. JIGSAW [14], LAMBDA [15], and CAML [16], for example, use complicated models of prerequisites and attack effects. Other attack graph papers (e.g. [5], [7], [13]) define their own models. None of these approaches have been applied to large networks with many vulnerabilities and many require complex hand analysis of vulnerabilities to extract the information required. We developed a simpler approach that has been successfully applied to large networks with hundreds of vulnerabilities. Our approach uses an attacker model that is based solely on malicious actions. If a vulnerability exists, and the attacker can reach the vulnerable port, then it is assumed that the attacker can successfully exploit the vulnerability to its fullest extent. No attempt is made to model exploitation details as in [13] such as how exploit scripts are downloaded, installed, and executed because these tasks can be performed in many different ways. We take a similar approach to vulnerability classification. Prerequisites and post-conditions used for vulnerabilities are shown in Table I. The only prerequisite is locality and there are only four effects. This simplified classification is used because this information can be automatically extracted from existing data sources. Locality specifies the location of the attacker when exploiting a vulnerability and effect specifies the privilege level obtained by an attacker (Administrator, User, Other, or DoS). Initially, we thought that locality and effect information could easily be extracted from existing vulnerability inTABLE I V ULNERABILITY C LASSIFICATIONS Locality Effect Local Exploit only from the vulnerable machine itself. Remote Exploit remotely over the network. Administrator Administrator- or Root-level access to vulnerable host. User User- or guest-level access to vulnerable host. Other Confidentiality and/or integrity loss, e.g. read files, corrupt limited files, learn about software versions running on a host. DoS Target service or host disabled with no access to host. 5 of 10 formation sources. Vulnerability scanners such as Nessus provide information on discovered vulnerabilities, and when CVE identifiers [17] are provided, they can be used to crossreference vulnerability databases, such as NVD [11] and Bugtraq [18]. Unfortunately we found that CVE identifiers are not always provided by vulnerability scanners, that databases are error-prone and often inconsistent, and that information is often provided as free-form text designed for human, not computer, consumption. We used machine learning techniques to develop a pattern classifier that unifies all of the available data, including human-readable text strings, and determines the locality and effect of vulnerabilities. A total of 215 vulnerabilities found on actual networks were first analyzed by hand to determine their locality and effect and features were extracted from Nessus, the NVD database, and the CVE text description of these vulnerabilities. Features included the values of categorical and binary data fields used by Nessus and the NVD database to characterize vulnerabilities. They also included binary features that indicated the presence of common generic phrases in descriptive text such as “execute arbitrary code” and “gain system privileges.” Different types of classifiers were evaluated using the LNKnet pattern classification software package [19] with the goal of correctly determining locality and effect. A logistic regression classifier performed as well as more complex decision tree and support vector machine classifiers when measuring performance using 10-fold crossvalidation testing. The expected generalization error of this classifier estimated using such testing is nearly 100% correct for locality and better than 90% correct for effect. The small number of effect errors was caused primarily by missing values in the NVD vulnerability database fields and ambiguous text descriptions. Although the classifier makes some errors, the consensus opinion provided is still better than that provided by any one of the individual data sources. It has been used successfully in many field trials to obtain locality and effect information for vulnerabilities that have not been hand-analyzed by a security expert who has examined all relevant information. There will always be some errors in vulnerability classification whether made by a automated classifier or a human expert. NetSPA is flexible enough to allow rapid regeneration of attack graphs to examine the effect of changing the locality and effect of specific critical vulnerabilities and to regenerate attack graphs when classifications change. VI. ATTACK G RAPH G ENERATION The simple network shown in Fig. 6 illustrates how graphs are constructed. This network contains five hosts, A, B , C , D , and X , collocated on a hub. A is the attacker’s starting location. The other four hosts have a remote-toadministrator vulnerability v1 , shown in graphs with a solid edge. Additionally, X is able to remotely log on as the administrator to B , C , and D. This is denoted v 2 , because an attacker can use this capability as a remoteto-administrator vulnerability after compromising X . It is shown as a dashed edge in graphs. A. Full Graph A full attack graph that shows all possible paths to compromise all vulnerable hosts has been generated in many previous studies (e.g. [5]–[9]). A full graph for the simple network is shown in Fig. 7. Unfortunately, it is impractical to generate a full graph even for modest networks. The number of nodes in the full graph grows combinatorially as O(h!), where h is the number of vulnerable non-attacker hosts in the network. B. Predictive Graph After exploring a number of alternative graph types and generative algorithms, we developed a graph, called a predictive graph, that can be built efficiently and has two critical properties. First, it determines all hosts that can be compromised by an attacker from a given starting location. Second, without regenerating the graph, it correctly predicts the effect of removing vulnerabilities. Removing vulnerabilities could correspond to patching vulnerable software, altering firewall rules to block access to a vulnerable service, or removing the vulnerable software altogether. The effect of removing vulnerabilities can be determined by removing the associated edges from the graph and observing which nodes are made unreachable from the starting location. Because this does not involve regenerating the graph, but simply editing the graph, we call this a predictive graph. A predictive graph must make correct predictions when any combination of edges is removed. The predictive graphs we build are directed and acyclic. The predictive graph corresponding to the network in Fig. 6 is shown in Fig. 8. Detailed pseudo code for the A Attacker Fig. 6. X System Administrator B User Desktop C User Desktop D User Desktop ABCDX Sample Network 6 of 10 predictive graph algorithm is available in [10]. A predictive graph is built in a breadth-first fashion from the root. To extend the graph from a given node, all of the vulnerabilities on all of the vulnerable ports that the attacker can reach from that node are explored. If the vulnerability yields a level of access not yet achieved on the path from the root to the current node, and the vulnerability instance has not been exploited by any node along the path from the root to the current node, then the graph records the exploit and adds the resulting node to the breadth-first queue. The second condition, checking for the vulnerability’s prior use by any parent node, is the pruning condition that creates a predictive graph. The correctness of this pruning condition, which we have named dynamic pruning, is shown in [10]. Redundant paths in a full graph are eliminated in a predictive graph. The remaining structure fulfills the predictive requirement. For example, consider an analysis of the network in Fig. 6 to determine the effect of patching vulnerability v1 on host B . Designate this edge/host pair as E(v1 , B) where, more generally, E(v, h) is the edge/host pair indicating that vulnerability v is exploited on host h. If every E(v1 , B) edge/host pair is removed from the predictive graph, then the graph correctly shows that B is still vulnerable due to the path A → E(v1 , X) → E(v2 , B) starting from the attacking host. It is difficult to create computation bounds for predictive graphs because the graph size depends on the network topology, firewall settings, and location of vulnerable hosts and the attacker. On a flat network with h hosts interconnected with no filtering devices, computation is bounded by O(h2 log h) which is roughly quadratic in the number of hosts. Simulation studies described below and field trials on actual networks suggest that complexity often grows less than quadratically even in complex enterprise networks. Predictive graphs were successfully computed for all actual and simulated networks used in this paper. These graphs, however, may become impractical to compute in some rare situations. One is when there are many firewall layers and when many different hosts are exposed and can be compromised directly through each firewall. A variant of a predictive graph, called the node-predictive graph, mitigates this problem. A node-predictive graph accurately predicts the effect of completely patching any combination of nodes, where nodes represent either individual hosts or groups of hosts. Node-predictive graphs are built using the predictive graph method with an algorithm called dynamic host collapse, or DHC. DHC places hosts into host groups, grouping them when the hosts have equivalent inbound compromisability. In other words, if two hosts can be compromised at the same level (user, administrator, other, DoS) from every host the attacker has already compromised, Level 0 (Root) Level 1 Level 2 Level 3 Level 4 C D D C B X C B D X B C X C D D A X C B B A D C X D B D C X B Fig. 7. Vulnerability v1 Vulnerability v2 B X ABCDX Full Graph Fig. 8. then the hosts are equivalent from the point of view of the attack graph. This also means that if one host in a group can be compromised, then all hosts in that group can be compromised. A host group is treated like a single host by the attack graph generator and host groups are consistent across the graph. A detailed discussion of node-predictive graphs is available in [10]. VII. ATTACK G RAPH A NALYSIS Attack graphs are often too complex for hand analysis. To address this problem, an analysis system was developed that automatically provides system administrators with an easily understood, prioritized list of recommended network changes. The first step of creating recommendations requires computing a measure called the Network Compromise Percentage, or NCP from the attack graph. The NCP is the percentage of the hosts on the network on which the attacker has obtained user or administrator-level access. An NCP of 0% indicates no compromises were possible; an NCP of 100% indicates compromise of every host in the network. Hosts that provide critical network resources such as servers or that interface to SCADA devices can be given higher weights, or “asset values,” in this calculation related to their importance as described in [10]. In this analysis it is assumed that all hosts are equally important. The NCP 7 of 10 ABCDX Predictive Graph assigns a simple, easy-to-understand number to an attack graph. The analysis system hypothesizes patches to hosts and evaluates their effects on the network’s security. Each hypothetical patch or group of patches is called a recommendation. Recommendations are host-based, rather than vulnerability-based, to make them easy for an administrator to understand and implement. Per-host recommendations are generated by traversing the complete attack graph and recording all edges that can be used to compromise each host. This creates a list of exploitable vulnerabilities for each host and ignores vulnerabilities that the attacker cannot directly exploit. Group recommendations are also created that represent the impact of patching several hosts at once. This analysis begins at the attacker’s starting node. For each child node (each representing a host the attacker can compromise directly), we compute the set of all hosts compromised below the child node. We then collect the child nodes into groups. A child node N c joins the recommendation group if the group already contains a node N d such that the set of hosts compromised below N c has a non-null intersection with the set of hosts compromised below N d . The algorithm thus forms groups of hosts which are stepping-stones to some common subset of additional hosts. Any singleton group is recursively explored, treating the single node in the group as the attacker’s starting node and running the algorithm again. Per-host and group recommendations make it possible to find and report both single-host and multiplehost stepping stones in the attack graph. Predictive graphs make it possible to predict the impact of recommendations without rebuilding the graph. This is accomplished by removing edges that target the host or hosts the recommendation suggests patching. Once the edges are removed from the graph, the graph’s new NCP is computed. A recommendation’s value is the difference, ∆NCP, between the original and new NCP. The collection of recommendations is sorted by ∆NCP and presented as an ordered list, in order of effectiveness. The recommendation which reduces the NCP value the most is displayed first. VIII. F IELD T RIALS AND S CALING A. Field Trials NetSPA has been evaluated on three real networks under the guidance and cooperation of system administrators of these networks. One network was at a military site and two were at our laboratory. One of the laboratory networks is shown in Fig. 2. This network was analyzed multiple times over a few years. One of the resulting attack graphs is shown in Fig. 1. For all networks, NetSPA required only a few minutes to build predictive graphs and generate recommendations. Initial vulnerability scans typically found vulnerabilities on hundreds to thousands of computers. When firewall information was available, NetSPA discovered previously unknown critical attack paths that were only uncovered by automatically gathering and analyzing both vulnerability and firewall configuration information. Some of these paths were surprising and unexpected. Recommendations NetSPA produced restored defense in depth by patching a small number of vulnerabilities on stepping-stone hosts that enabled attackers to penetrate a network. Hand examination of firewall rules and vulnerabilities identified by NetSPA recommendations revealed two common security problems. The first problem, also noted by [2], was misconfigured firewalls. Examples discovered include old legacy rules that unnecessarily allowed access from outside IP addresses or subnets and firewall rules where internal and external addresses are incorrectly interchanged. These were discovered by NetSPA’s reachability analysis. Incorrect rules were primarily indicated by starting virtual IP addresses created as described in section IV that allowed attackers to penetrate firewalls. The second problem discovered by NetSPA was recently discovered, but unpatched, vulnerabilities on hosts that were exposed through firewalls. NetSPA identified a small number of critical vulnerabilities that needed to be patched to restore security. In these and 8 of 10 other large networks it is often too costly to immediately patch every vulnerability found on every host whenever a new vulnerability is announced. NetSPA alleviates this problem by identifying a small number of critical vulnerabilities that can be more readily patched to maintain network security. B. Simulated Scaling Experiments Simulation studies were performed to evaluate scaling performance using networks with up to 50,000 hosts. An enterprise network model was used that contains five enclaves where each enclave models a company or university site. Each enclave contains a perimeter firewall, hosts in a DMZ, hosts in an administrative LAN, hosts in multiple IPv4 class C subnets, and hosts behind internal firewalls protecting additional class C subnets. The number of hosts in simulations was increased by scaling up the number of hosts in enclave subnets while keeping the number of subnets and enclaves constant. All experiments were performed on a single processor Pentium 4 1.80-GHz machine with 1024 MB PC133 SDRAM, running Microsoft Windows XP. All times reported here are the overall totals, including database load, reachability computation, attack graph generation and analysis, and output. Total memory usage required by NetSPA never exceeded 250 MB. Simulation parameters were adjusted to represent a simple and a complex network. The simple network has 10 open ports per host and 10 IP-specific filtering rules per perimeter firewall for a total of 50 rules in all firewalls. The complex network has 30 open ports per host and 400 IP-specific rules per perimeter firewall for a total of 2,000 rules in all firewalls. An additional experiment was run with an attacker starting from one fixed IP address instead of trying all IP addresses selected for the virtual attacker host. For these tests, firewall rules permitted the same level of attacker progress in either case, ensuring reachability computation was the only difference between the two simple enterprise cases. In all simulations, half of all hosts on the inside of each enclave contained a remote-to-administrator vulnerability and half contained no vulnerabilities. In addition, the perimeter firewall allowed one path into a simulated DMZ subnet and one path out of this subnet into the interior of each enclave. The lowest curve in Fig. 9 shows the total NetSPA computation time for an attacker starting from one specific external IP address for the five-site simple enterprise network. This can be compared to the next higher curve in this figure, which is the total computation time for the enterprise network using the default external virtual attacker host that explores paths created with all IP addresses that occur in 10000.00 Seconds 1000.00 Complex Network (Overall) 100.00 Simple Network (Overall) Simple, Single Source-IP Network (Overall) 10.00 1000 10000 100000 Total Hosts Fig. 9. Growth as the Number of Hosts (Enterprise Network) any firewall rule. The total computational time is smaller with only a single starting IP address because reachability from the attacker’s starting location has to be computed only from one IP address. An attack graph that includes only one IP starting address might be of interest if attackers were forced to use one or more fixed starting addresses. Overall run times for the simple enterprise network are roughly a minute for networks with up to 2,000 hosts and less than an hour for simple enterprise networks with up to 32,000 hosts. These are the most efficient attack graph generation times we are aware of. In fact, we are aware of no past paper on attack graphs where attack graphs are computed for networks with more than a few hundred hosts. In addition, the slopes of all curves indicate that computation grows less than quadratically in the number of hosts for this simulated enterprise network. This also is the lowest computational complexity growth rate for attack graphs that we are aware of. The upper curve of Fig. 9, shows the overall run time for the complex network. The overall run time with 10,000 hosts increases to roughly three hours for the complex network from 10 minutes for the simple network with a similar number of hosts. These run times are practical for even large enterprise networks. The increase in run times is caused by additional starting addresses for the virtual attacker host, additional firewall rules that need to be analyzed, and additional target ports that must be considered when computing reachability. IX. R ELATED W ORK A comprehensive annotated review of past research on attack graphs is provided in [20]. Although research has made significant progress, no past paper on attack graphs 9 of 10 that we are aware of reports on the analysis of real networks with more than hundreds of hosts. The two most capable past systems for network vulnerability analysis are the Topological Vulnerability Analysis (TVA) tool [3], [13] and the MulVAL Logic-based Network Security Analyzer [21]. The MulVAL system [21] uses hostbased data collection to accurately identify vulnerabilities on each host in a network and assumes reachability information is provided. It has been tested using data from a real network with three hosts and simple simulated networks obtained by duplicating each of these hosts. The goals of this system are similar to those of NetSPA. NetSPA, however, explicitly computes reachability, automatically imports vulnerability information, and has been used on real networks with thousands of hosts. The Topological Vulnerability Analysis (TVA) tool [13] was used to construct attack graphs for a small 17-host network, but scaling was poor. These tools rely on algorithms described in [12] where the the computation to compute attack graphs is bounded by O(h 6 ), where h represents the number of hosts in a network. Although this is polynomial and not combinatorial, as stated in [12], this approach will only scale to networks with at most “tens or hundreds of hosts.” Recent papers [3] suggest that scaling for the TVA tool has been improved, but run times or scaling results for large networks have not been presented. Other past systems used full attack graphs [5], [6], [9] and can not scale to large networks because the size of these graphs is combinatorial in the number of hosts. Past experiments [7] also demonstrate that model checking exhibits poor scaling for even small simulated networks with only ten’s of hosts. One commercial tool named Skybox View [22] has been developed that uses attack graphs to analyze networks. Only general information is available at the company’s web site [22] and a patent has been issued that describes the basic approach [23]. To the best of our understanding, this system computes what we call host-compromised attack graphs that determine only the shortest path to compromise hosts from a given starting location. A host-compromised graph is not predictive. It is thus more computationally expensive to formulate and evaluate recommendations using such graphs because a new graph must be built to determine which hosts can be compromised following each possible recommendation. X. C ONCLUSIONS AND F UTURE W ORK A tool named NetSPA (NETwork Security Planning Architecture) was developed to verify and, if necessary, restore the security of large enterprise networks. During field trials, NetSPA successfully imported vulnerability scanner and firewall configuration information and was able to produce attack graphs and make recommendations in only a few minutes for three actual networks with 200 to 3,400 hosts. Unexpected attack paths were generated that exposed incorrect firewall rules and critical unpatched vulnerabilities on hosts exposed through firewalls. The small number of highpriority recommendations provided by NetSPA were followed to successfully restore the security of these networks. Studies using simulated enterprise networks demonstrate that NetSPA can successfully analyze complex networks with than 50,000 hosts using general-purpose computing hardware. Computing reachability between all hosts was often more computationally intensive than constructing attack graphs. Although many past studies assume reachability information is available, much of our effort has focused on making reachability computation more efficient and accurate by grouping hosts into filtered and unfiltered reachability domains, emulating filtering devices, computing reachability on demand, and using efficient data structures and algorithms. Future work is planned in several areas. Attack graph building and reachability computations support most attacks found by network vulnerability scanners, but do not easily support some attacks that have multiple prerequisites. For example, they do not directly support trust relationship attacks where an attacker obtains a prerequisite for another attack (e.g. a password) from one host and then uses this prerequisite to compromise another remote host. They also do not directly model situations where an attacker compromises a firewall and changes the firewall rules, because this changes the network reachability. Client side attacks, where a client is compromised when connecting to a malicious server, are also not modeled. We are exploring attack graph modifications that would permit us to handle these attack types. In addition, we are exploring other uses of attack graphs. One is to perform “what-if” experiments and predictively explore the effect of changes in network security policy and of adding hypothetical “zero-day” vulnerabilities. A second is to use attack graphs to filter alerts from intrusion detection systems and firewalls as suggested by [15] and others. ACKNOWLEDGMENTS We would like to thank unnamed system administrators who helped us perform field trials, Peter Mell and Tim Grance who support the NIST ICAT and NVD databases, and the many developers of Nessus. R EFERENCES [1] “Nessus, The Nessus Security Scanner,” http://www.nessus.org. [2] A. Wool, “A quantitative study of firewall configuration errors,” Computer, vol. 37, no. 6, pp. 62–67, 2004. [3] S. Noel and S. Jajodia, “Understanding complex network attack graphs through clustered adjacency matrices,” in ACSAC ’05: Proceedings of the 21st Annual Computer Security Applications Conference, Tucson, AZ, 2005, pp. 160–169. [4] W. Li, “An approach to graph-based modeling of network exploitations,” Ph.D. dissertation, Mississippi State University, 2005. [5] L. P. Swiler et al., “Computer-attack graph generation tool,” in Proceedings DARPA Information Survivability Conference and Exposition (DISCEX II), Los Alamitos, CA, 2001, pp. 307–321. [6] T. Tidwell et al., “Modeling internet attacks,” in Proceedings of the Second Annual IEEE SMC Information Assurance Workshop, West Point, NY, 2001, pp. 54–59. [7] O. Sheyner et al., “Automated generation and analysis of attack graphs,” in Proceedings of the 2002 IEEE Symposium on Security and Privacy, Oakland, CA, 2002, pp. 273–284. [8] M. Artz, “NETspa, a network security planning architecture,” Master’s thesis, Massachusetts Institute of Technology, 2002. [9] R. Ortalo, Y. Deswarte, and M. Kaaniche, “Experimenting with quantitative evaluation tools for monitoring operational security,” IEEE Trans. Software Eng., vol. 25, no. 5, pp. 633–650, 1999. [10] R. P. Lippmann et al., “Evaluating and strengthening enterprise network security using attack graphs,” MIT Lincoln Laboratory, Lexington, MA, Tech. Rep., 2005, ESC-TR-2005-064. [11] P. Mel, T. Grance, et al., “NVD, The National Vulnerability Database,” National Institute of Standards and Technology, http://nvd.nist.gov. [12] P. Ammann, D. Wijesekera, and S. Kaushik, “Scalable, graph-based network vulnerability analysis,” in Proceedings of the 9th ACM Conference on Computer and Communications Security. ACM Press, 2002, pp. 217–224. [13] S. Jajodia, S. Noel, and B. O’Berry, “Topological analysis of network attack vulnerability,” in Managing Cyber Threats: Issues, Approaches, and Challenges, A. L. V. Kumar, J. Srivastava, Ed. Kluwer Academic Publisher, 2003, ch. 5. [14] S. Templeton and K. Levitt, “A requires/provides model for computer attacks,” in Proceedings of the 2000 Workshop on New Security Paradigms. New York, NY: ACM Press, 2001. [15] F. Cuppens and R. Ortalo, “LAMBDA: A language to model a database for detection of attacks,” in Proceedings of the Third International Workshop on Recent Advances in Intrusion Detection, 2000, pp. 197–216. [16] S. Cheung, U. Lindqvist, et al., “Modeling multistep cyber attacks for scenario recognition,” in Proceedings of the Third DARPA Information Survivability Conference and Exposition (DISCEX III), 2003, pp. 284–292. [17] “CVE, Common Vulnerabilities and Exposures Dictionary, The MITRE Corporation,” http://cve.mitre.org. [18] “The SecurityFocus Vulnerability Database, SecurityFocus Symantec Corporation,” http://www.securityfocus.com/archive/1. [19] R. P. Lippmann, L. Kukolich, and E. Singer, “LNKnet: Neural network, machine learning, and statistical software for pattern classification,” Lincoln Laboratory Journal, vol. 6, no. 2, pp. 249– 268, 1993. [20] R. P. Lippmann and K. W. Ingols, “An annotated review of past papers on attack graphs,” MIT Lincoln Laboratory, Lexington, MA, Tech. Rep., 2005, ESC-TR-2005-054. [21] X. Ou, S. Govindavajhala, and A. W. Appel, “Mulval: A logicbased network security analyzer,” in Proceedings of the 14th Usenix Security Symposium, 2005. [22] “Skybox Security, Inc.” http://www.skyboxsecurity.com. [23] G. Cohen et al., “System and method for risk detection and analysis in a computer network,” United States Patent 6,952,779, October 2005. 10 of 10