Anycast enumeration and geolocation approaches - Marc
Scuola Politecnica e delle Scienze di Base
Corso di Laurea Magistrale in Ingegneria Informatica (Master's Degree Course in Computer Engineering)

Anycast enumeration and geolocation approaches
Master's degree, academic year 2013/2014
Supervisors: Prof. Antonio PESCAPE, Prof. Dario ROSSI
Candidate: Danilo CICALESE, student ID M63000196

Context
Anycast: many hosts, one IP!
With anycast, multiple hosts can share the same IP address. When a packet is sent to an anycast address, the network delivers it to the topologically closest host.
Who is using IP anycast?
Content Delivery Networks, e.g., EdgeCast, CloudFlare.
Root DNS.
Google Public DNS.

Motivation and contribution
State of the art: geolocation techniques fail with anycast IPs.
Where is Google's 8.8.8.8? Who do you believe?
United States (freegeoip.net).
Mountain View, California (IP2Location).
New York, New York (Geobytes).
United States (Maxmind).
Broomfield, Colorado (IPligence).
Our contribution is a methodology to:
Determine if a service uses IP anycast.
Enumerate replicas sharing the same IP address.
Geolocate those replicas.

Methodology
Measure: latency (PlanetLab, RIPE, …).
Detect and enumerate: solve MIS, either optimum (brute force) or 5-approximation (greedy), ….
Geolocate: classification, maximum likelihood.
Iterate: feedback.

Detection
Each latency measurement is converted into a geographic area, considering the speed of light in an optical fiber.
Two vantage points are reaching two different instances if their areas do not overlap: a packet cannot travel faster than the speed of light!

Enumeration: Greedy algorithm

Geolocation and iteration
Locations at city granularity:
A 1 ms difference in latency measurement corresponds to a 100 km disc in geodesic distance terms.
Internet Service Providers and system administrators often use machine names that map to the city they are serving, e.g., IATA and IXP codes.
Location metrics:
Distance from the border.
Internet user population.
Geolocation error:
The percentage of correct classifications (i.e., geolocations).
The mean geolocation error in kilometers.

Experimental validation
In this section we limit our analysis to 200 PlanetLab vantage points.
We validate our methodology against publicly available ground truth:
F, K, I, L DNS root servers.
DNS CHAOS queries.
Enumeration: the greedy solver is in most cases just as good as the brute-force solution, and it is much faster (hundreds of milliseconds vs. thousands of seconds).
Geolocation: equal importance is given to the distance from the border and the Internet user population.

Measurement campaign
Datasets:
RIPE 6000 nodes: 122 countries, 2168 ASes.
RIPE 500 nodes: random selection.
RIPE 200 nodes: stratified selection, at least 100 km distant from each other.
Results:
Using the full dataset, it is possible to enumerate 76% of the anycast instances and geolocate 80% of them.
Random selection provides poor results, while stratified selection achieves results comparable to those of the full dataset.

Comparison with the state of the art
Enumeration: [1] is directly and quantitatively comparable, as it uses DNS root servers as a case study.
Geolocation: [2] uses Client-Centric Geolocation (CCG). It is only qualitatively comparable, as it targets Google's infrastructure.

REFERENCES:
[1] X. Fan, J. Heidemann and R. Govindan, “Evaluating anycast in the Domain Name System,” in Proc. IEEE INFOCOM, 2013.
[2] M. Calder, X. Fan, Z. Hu, E. Katz-Bassett, J. Heidemann and R.
Govindan, “Mapping the expansion of Google’s serving infrastructure,” in Proc. ACM IMC, 2013.

Conclusions
We propose a novel methodology to detect, enumerate and geolocate anycast replicas.
Our methodology does not rely on protocol-specific information.
Fewer vantage points suffice to provide recall and accuracy similar to those of large-scale techniques.

FUTURE WORK:
Refine the methodology.
Selection of the vantage points.
Internet anycast census.
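
Illustrative sketch (not part of the original slides)
The detection and greedy enumeration steps summarized in the Detection and Enumeration slides can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the thesis implementation: the propagation speed of 200 km/ms, the haversine distance, the (lat, lon, rtt_ms) tuple layout, the increasing-radius ordering and all function names are assumptions made here for illustration. Each round-trip time becomes a disc around its vantage point, two discs that do not overlap imply two distinct replicas, and a greedy pass collects a set of pairwise non-overlapping discs as a lower bound on the number of replicas.

```python
import math

# Assumption: light in fiber covers about 200 km per millisecond (roughly 2/3 of c),
# so a 1 ms round-trip time bounds the one-way distance at about 100 km.
FIBER_KM_PER_MS = 200.0
EARTH_RADIUS_KM = 6371.0


def disc_radius_km(rtt_ms):
    """Upper bound on the distance to the replica, given a round-trip time."""
    return FIBER_KM_PER_MS * rtt_ms / 2.0


def geodesic_km(a, b):
    """Great-circle (haversine) distance between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def discs_overlap(m1, m2):
    """Each measurement is (lat, lon, rtt_ms) for one vantage point.

    Two discs overlap when the vantage points are no farther apart
    than the sum of the two radii.
    """
    gap = geodesic_km(m1[:2], m2[:2])
    return gap <= disc_radius_km(m1[2]) + disc_radius_km(m2[2])


def greedy_enumerate(measurements):
    """Greedy independent set of discs.

    Assumption of this sketch: discs are visited by increasing radius,
    and a disc is kept only if it overlaps none of the discs kept so far.
    """
    chosen = []
    for m in sorted(measurements, key=lambda m: m[2]):
        if all(not discs_overlap(m, c) for c in chosen):
            chosen.append(m)
    return chosen


def is_anycast(measurements):
    """Two non-overlapping discs cannot contain the same single host."""
    return len(greedy_enumerate(measurements)) >= 2
```

With hypothetical measurements taken from vantage points near Paris, Tokyo, New York and London, greedy_enumerate keeps the three mutually distant discs and skips the London disc, which overlaps the smaller Paris one. The number of discs kept is only a lower bound on the number of replicas, which is why adding well-spread vantage points improves recall.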