Apdex Implementation at AOL (Session 45A: Apdex Case Studies)
Transcript
Apdex Implementation at AOL
Session 45A: Apdex Case Studies
CMG International Conference, San Diego, California, December 5, 2007
Eric Goldsmith, Operations Architect, [email protected]

Slide 2: Our Environment
- Operations organization
- Measuring Web site performance from a customer-centric view
  - Full page load, measured from outside the datacenter
  - Multiple geographic locations
- Goals
  - Short-term: identify product issues and outages
  - Long-term: achieve uniform geographic performance, in parity with competitors

Slide 3: Current Metrics & Shortcomings
- Response time and availability often don't tell the whole user-experience story
- Reported as averages, which hide variance and are skewed by outliers
- Reported in absolute numbers, with no context of a target (goal) value

Slide 4: Goals of Apdex Use
- An inclusive view of performance, availability, and data distribution
- "Building in" of a target, and normalization of the data around it
- Performance is evaluated qualitatively against a target

Slide 5: Data Source and Collection
- Using a commercial third-party tool to gather measurements from multiple geographic locations
- Data of interest for our Apdex calculations:
  1. Date/time
  2. Measurement value
  3. Success/error (error = Frustrated)
  4. Test location
- Data collection is batched (daily)

Slide 6: Calculation and Graphing in Excel
- Calculate a sub-score for each row (data point):
  - If error: score = 0
  - Else if measurement <= T: score = 1
  - Else if measurement <= F: score = 0.5
  - Else: score = 0
- Define the interval over which to calculate the Apdex score
  - Hourly, daily, weekly, etc.
  - Segregate by location, if desired
  - The Apdex spec recommends more than 100 data points per interval
- Then calculate the overall Apdex score for the interval:
  - = SUM(sub-scores) / COUNT(measurements)
- Get fancy with DSUM() and DCOUNT(): database lookups simplify segregation by date, location, etc.
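The per-row scoring and interval aggregation above translate directly out of the spreadsheet. Below is a minimal Python sketch of the same logic, assuming each measurement carries the four fields listed on the Data Source slide; the Sample type and function names are illustrative, not part of the original Excel workbook, and F is taken as 4T per the Apdex specification.

```python
from dataclasses import dataclass
from typing import Iterable

@dataclass
class Sample:
    timestamp: str    # date/time of the measurement
    seconds: float    # full page load time
    is_error: bool    # errors count as Frustrated
    location: str     # geographic test location

def sub_score(sample: Sample, T: float) -> float:
    """Per-row sub-score: 1 (Satisfied), 0.5 (Tolerating), or 0 (Frustrated)."""
    F = 4 * T                      # Frustrated threshold, 4T per the Apdex spec
    if sample.is_error:
        return 0.0
    if sample.seconds <= T:
        return 1.0
    if sample.seconds <= F:
        return 0.5
    return 0.0

def apdex(samples: Iterable[Sample], T: float) -> float:
    """Apdex score for one interval: sum(sub-scores) / count(measurements)."""
    samples = list(samples)
    if not samples:
        raise ValueError("no measurements in interval")
    return sum(sub_score(s, T) for s in samples) / len(samples)
```

Grouping the samples by day (and by location, if desired) before calling apdex() plays the same role as the DSUM()/DCOUNT() lookups in the spreadsheet.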
Slide 7: Target 'T' Determination
- We chose our targets based on competitor performance
- For a given Web site, identify its target competitor (which may be the site itself)
- The 'T' marker method we chose initially was based on "Best Time Multiple": "Measure average response time from a 'good' location, then add 50% to build in tolerance for other locations"
- Instead, we averaged data from all locations; our thinking was that the 50% inflation wasn't necessary because of the natural diversity of data drawn from multiple geographic locations

Slide 8: Example Results Presentation
[Chart: "Performance - National." Daily Apdex scores for sites A, B, and C with T = 1.1, 1-31 Aug 2007, plotted against the Excellent/Good/Fair/Poor/Unacceptable rating bands.]

Slide 9: Example Results Presentation, cont'd
[Chart: "Performance - Regional." Daily Apdex scores for A-East, A-West, B-East, B-West, C-East, and C-West with T = 1.1, 1-31 Aug 2007, against the same rating bands.]

Slide 10: Problems with Our Initial T
- Initial results were promising, but as we examined data over time, the Apdex results didn't always correlate well with observations
[Chart: "Performance - West." Daily Apdex scores for A-West, B-West, and C-West with T = 1.1, 1-31 Aug 2007. Callouts: the target competitor never achieves the Excellent level, and a significant performance change is not reflected (see next slide).]

Slide 11: Example of Initial T Problem
- 44% reduction in average load time, but the Apdex score didn't change
[Chart: "West Coast Page Load Time," before and after the change, with the T and F thresholds marked, 12-17 Aug 2007.]

Slide 12: Plan B
- We experimented with various T determination techniques, and eventually settled on the "Empirical Data" method: "Find the T that results in the proper Apdex for a well-studied group"
- In our environment: for a given Web site, identify its target competitor (which may be the site itself)
  - The performance of this competitor is defined as "Excellent"
- Determine the smallest T such that the competitor's Apdex score remains Excellent for a period of time (at least 1 month)
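A minimal sketch of the "Empirical Data" search just described, reusing the apdex() helper from the earlier sketch: sweep candidate T values and keep the smallest one for which the target competitor's daily score stays in the Excellent band over the whole study period. The 0.94 boundary is the standard Apdex rating threshold for Excellent; the function and variable names are illustrative assumptions.

```python
from typing import List, Optional

def smallest_excellent_T(daily_batches: List[List[Sample]],
                         candidate_Ts: List[float],
                         excellent: float = 0.94) -> Optional[float]:
    """Smallest candidate T that keeps the competitor's daily Apdex score at or
    above the Excellent boundary for every day in the study period (the slides
    suggest using at least one month of data)."""
    for T in sorted(candidate_Ts):
        if all(apdex(day, T) >= excellent for day in daily_batches):
            return T
    return None   # no candidate keeps the competitor Excellent for the whole period

# Example: sweep T from 0.5 s to 3.0 s in 0.1 s steps over a month of daily batches.
# T = smallest_excellent_T(month_of_daily_batches,
#                          [round(0.5 + 0.1 * i, 1) for i in range(26)])
```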
Slide 13: New T
- With the new T, the Apdex results correlate better with observations
[Chart: "Performance - West." Daily Apdex scores for A-West, B-West, and C-West with T = 1.6, 1-31 Aug 2007. Callouts: the target competitor now achieves the Excellent level, and the performance change is now reflected.]

Slide 14: Changing T
- Define a technique for reevaluating T on an ongoing basis, but don't change T too often
- Suggestions for reevaluating T:
  - Quarterly, looking at the prior 3 months of data
  - When a significant product change occurs
  - When requested (by the business)

Slide 15: Example - T Change
[Chart: "Performance - National." Daily Apdex scores for A, B, and C, comparing the old T (1.1) against the new T (1.6), August-September 2007, against the rating bands.]

Slide 16: Apdex vs. Other Metrics
[Chart: "Performance - National." Daily Apdex scores for A, B, and C with T = 1.6, September 2007.]

Slide 17: Apdex vs. Performance & Availability
[Charts: daily Apdex score, performance (seconds), and availability (percent) for A, B, and C, 1-30 Sep 2007. Callout "Deep Dive 1": virtually no change in Apdex for B, despite a large change in performance and availability. Callout "Deep Dive 2": Apdex shows B performing better than A, while the performance and availability charts show the opposite.]

Slide 18: Deep Dive 1
- Virtually no change in Apdex for B, despite a large change in performance and availability
[Chart: "Performance - National." Individual measurements for B with the B average and the T and F thresholds, 12-15 Sep 2007.]
Daily counts annotated on the chart (parenthesized values are each category's contribution to the Apdex score):
  Satisfied    Tolerating   Frustrated   Apdex
  377 (0.53)   314 (0.22)   22           0.75
  422 (0.60)   195 (0.14)   88           0.74
  419 (0.59)   219 (0.15)   73           0.74
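To reproduce the kind of side-by-side view used in these deep dives, the same daily batch of measurements can be summarized three ways at once: average load time, availability, and the Apdex score with its Satisfied/Tolerating/Frustrated counts. This is an illustrative sketch built on the Sample and apdex() helpers above, not the output of the commercial measurement tool.

```python
from typing import List

def daily_summary(samples: List[Sample], T: float) -> dict:
    """Traditional metrics and Apdex for one day of measurements.

    An average (or an availability percentage) can stay flat while the Apdex
    score moves, because Apdex reflects how the whole distribution sits
    relative to T and 4T rather than a single mean value."""
    ok = [s for s in samples if not s.is_error]
    return {
        "avg_seconds": sum(s.seconds for s in ok) / len(ok) if ok else float("nan"),
        "availability_pct": 100.0 * len(ok) / len(samples),
        "satisfied": sum(1 for s in ok if s.seconds <= T),
        "tolerating": sum(1 for s in ok if T < s.seconds <= 4 * T),
        "frustrated": len(samples) - sum(1 for s in ok if s.seconds <= 4 * T),
        "apdex": apdex(samples, T),
    }
```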
Slide 19: Deep Dive 2
- Apdex shows B performing better than A, while the performance and availability charts show the opposite
[Chart: "Performance - National." Individual measurements for A and B with their averages and the T and F thresholds, 19-23 Sep 2007, annotated with daily Satisfied/Tolerating/Frustrated counts and the resulting Apdex scores.]

Slide 20: Closing Thoughts
- We're still exploring the application of Apdex in an Operations organization
  - Can Apdex be used to identify the day-to-day "issues" traditionally identified through analysis of performance and availability metrics?
  - Or is it better suited as a method of performance representation for the business side of the house?
- Interesting to calculate: what would it take for a product to achieve the next "band" of performance?
  - What performance level do I need to move from Poor to Fair?
  - Helps in establishing interim targets (see the sketch after the closing slide)

Slide 21: Thank You
Questions?
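As a small postscript to the closing thought about interim targets: with the scoring rule above, each Tolerating sample that becomes Satisfied adds 0.5 to the sub-score sum, so the number of conversions needed to reach the next band follows directly from the current score and the sample count. The band boundary used in the example is an assumption taken from the standard Apdex rating scale, and the function name is illustrative.

```python
import math

def conversions_to_next_band(current_score: float, n_measurements: int,
                             next_band_boundary: float) -> int:
    """Tolerating samples that must become Satisfied to reach the next band.

    Each conversion adds 0.5 / n_measurements to the Apdex score, so the gap
    to the boundary determines the count (assuming enough Tolerating samples
    exist to convert)."""
    gap = next_band_boundary - current_score
    if gap <= 0:
        return 0
    return math.ceil(gap * n_measurements / 0.5)

# Example: a product scoring 0.62 (Poor) on 500 daily measurements would need
# conversions_to_next_band(0.62, 500, 0.70) == 80 Tolerating samples to become
# Satisfied in order to reach the Fair boundary at 0.70.
```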