BEST PRACTICES FOR DEPLOYING EMC XTREMIO ALL-FLASH STORAGE WITH BROCADE GEN 5 SAN FABRICS

ABSTRACT

This paper provides guidance on best practices for EMC XtremIO deployments with Brocade Storage Area Network (SAN) fabrics. SANs must be designed so applications can take full advantage of the extremely low latency, high IO, and bandwidth capabilities of all-flash arrays (AFAs) such as XtremIO.

August 2015

Authors:
Marcus Thordal, Director, Solutions Marketing, Brocade
Anika Suri, Systems Engineer, Brocade
Victor Salvacruz, Corporate Solutions Architect, EMC XtremIO

EMC WHITE PAPER

To learn more about how EMC products, services, and solutions can help solve your business and IT challenges, contact your local representative or authorized reseller, visit www.emc.com, or explore and compare products in the EMC Store.

Copyright © 2015 EMC Corporation. All Rights Reserved. EMC believes the information in this publication is accurate as of its publication date. The information is subject to change without notice. The information in this publication is provided "as is." EMC Corporation makes no representations or warranties of any kind with respect to the information in this publication, and specifically disclaims implied warranties of merchantability or fitness for a particular purpose. Use, copying, and distribution of any EMC software described in this publication requires an applicable software license. For the most up-to-date listing of EMC product names, see EMC Corporation Trademarks on EMC.com. VMware, vCenter, and vRealize are registered trademarks or trademarks of VMware, Inc. in the United States and/or other jurisdictions. All other trademarks used herein are the property of their respective owners.

Part Number H14475

© 2015, Brocade Communications Systems, Inc. All Rights Reserved. ADX, Brocade, Brocade Assurance, the B-wing symbol, DCX, Fabric OS, HyperEdge, ICX, MLX, MyBrocade, OpenScript, The Effortless Network, VCS, VDX, Vplane, and Vyatta are registered trademarks, and Fabric Vision and vADX are trademarks of Brocade Communications Systems, Inc., in the United States and/or in other countries. Other brands, products, or service names mentioned may be trademarks of others.

Notice: This document is for informational purposes only and does not set forth any warranty, expressed or implied, concerning any equipment, equipment feature, or service offered or to be offered by Brocade. Brocade reserves the right to make changes to this document at any time, without notice, and assumes no responsibility for its use. This informational document describes features that may not be currently available. Contact a Brocade sales office for information on feature and product availability. Export of technical data contained in this document may require an export license from the United States government. The authors and Brocade Communications Systems, Inc. assume no liability or responsibility to any person or entity with respect to the accuracy of this document or any loss, cost, liability, or damages arising from the information contained herein or the computer programs that accompany it. The product described by this document may contain open source software covered by the GNU General Public License or other open source license agreements. To find out which open source software is included in Brocade products, view the licensing terms applicable to the open source software, and obtain a copy of the programming source code, please visit http://www.brocade.com/support/oscd.
TABLE OF CONTENTS

EXECUTIVE SUMMARY
  AUDIENCE
  TERMINOLOGY
SAN DESIGN CONSIDERATIONS
  SHARED VERSUS DEDICATED SAN FOR XTREMIO TOPOLOGY
  APPLICATION WORKLOADS
    VDI
    OLTP
    VSI
  SCALE OF THE SAN
  SAN TOPOLOGY REDUNDANCY AND RESILIENCY
  SAN DESIGN GUIDELINES
  SAN DESIGN EXAMPLE COMBINING TRADITIONAL NON-FLASH AND EMC XTREMIO CLUSTERS
OPERATIONS AND MAINTENANCE
  ZONING
  MONITORING AND ANALYTICS
  BROCADE SAN CONTENT PACK FOR VMWARE VREALIZE LOG INSIGHT
  BROCADE SAN HEALTH AND EMC MITREND
  HOW SAN HEALTH WORKS
  COMPREHENSIVE REPORTING
  EMC STORAGE ANALYTICS
BUSINESS CONTINUITY AND BACKUP
EMC XTREMIO STORAGE CONSIDERATIONS
  SWITCH CONNECTIVITY AND ZONING
  LUN QUEUE
  STORAGE TUNING
  MONITORING THE EXISTING XTREMIO ARRAY
APPENDIX A: REFERENCES
  EMC
  BROCADE

EXECUTIVE SUMMARY

With a new generation of all-flash arrays (AFAs) providing unprecedented storage performance, storage area networks (SANs) must be designed so applications can take full advantage of the extremely low latency, high IO, and bandwidth capabilities of the arrays. In this paper, we provide guidance on best practices for EMC® XtremIO™ deployments with Brocade SAN fabrics.
Whether you are deploying AFA storage for dedicated or mixed-application workloads or adding XtremIO cluster(s) to an existing SAN, this paper provides a methodology based on joint testing by EMC and Brocade.

AUDIENCE

This guide is intended for IT data storage architects and SAN administrators who are responsible for storage or SAN design based on the EMC XtremIO storage system and Brocade® Gen 5 Fibre Channel SAN fabrics.

TERMINOLOGY

Below are some commonly used terms and abbreviations that you will find throughout this document.

    AFA       All-flash array
    ICL       Inter-Chassis Links, also referred to as UltraScale ICLs: dedicated ports for connectivity between DCX 8510 directors, providing up to 64 Gbps of throughput per link
    ISL       Inter-Switch Links: provide connectivity between two Fibre Channel switches via E_Ports
    OLTP      Online transaction processing
    VDI       Virtual desktop infrastructure
    VSI       Virtual server infrastructure
    X-Brick   The storage appliance that is the building block of an XtremIO storage system; X-Bricks can be clustered to scale performance and capacity
    XDP       XtremIO Data Protection
    XIOS      XtremIO Operating System
    DAE       Disk-array enclosure

Table 1. Terminology

SAN DESIGN CONSIDERATIONS

When designing the SAN for an all-flash array, the main factors to consider are the application workloads and the intended scalability, redundancy, and resiliency requirements. In this paper, we discuss these considerations in more detail and show how they guide the SAN design decision process.

SHARED VERSUS DEDICATED SAN FOR XTREMIO TOPOLOGY

When deploying XtremIO cluster(s) into an existing SAN, the topology design is already in place, and the main design consideration is adequate sizing of the ISLs between the edge switches where the servers are connected and the core switches where the X-Bricks are connected. In implementations where the XtremIO clusters are deployed to serve specific latency- and IO-intensive applications such as OLTP database servers, connecting both servers and X-Bricks to the core backbone switch(es) can be advantageous.

Figure 1. IO-intensive hosts connected at core

When deploying dedicated SANs for XtremIO cluster storage services, the SAN design can be tailored directly to the application workloads and scalability requirements. In the next sections, we discuss how different application types and scalability requirements influence SAN design decisions.

APPLICATION WORKLOADS

VDI

When the application workloads are well known and the AFA serves few (one or two) application types, reasonably precise estimates of the optimal fan-in ratio of compute to storage are possible. For example, if the AFA is used for a VDI application, each compute node (server) hosts a fixed number of VDI instances with a known IO profile. This profile determines the number of IOs per server, which correlates to a fixed number of servers per storage port and the number of X-Bricks necessary for the XtremIO cluster. A worked fan-in estimate follows below.

OLTP

In the case of an OLTP database application, the database runs on one or a few clustered servers, which determines the number of servers per storage port and per XtremIO cluster. ISLs can then be sized accordingly to provide the necessary capacity between the servers and the XtremIO cluster(s).

VSI

When the AFA serves multiple applications, it becomes more difficult to determine the fan-in ratio. To ensure adequate capacity between servers and XtremIO storage clusters, it is necessary to size the ISLs between the SAN switches more conservatively, provisioning more bandwidth than when the application workload is well known. In this way, administrators can mitigate the potential for negative performance impact in a shared SAN or mixed-application environment.
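To make the fan-in arithmetic concrete, here is a minimal sketch that estimates servers per storage port for a VDI workload with a known IO profile. All inputs (desktops per server, IOPS per desktop, and the per-port IOPS budget) are hypothetical planning figures, not XtremIO or Brocade specifications.

    # Illustrative fan-in estimate for a well-known VDI workload.
    # All figures are hypothetical planning inputs, not product specifications.

    desktops_per_server = 150      # VDI instances hosted per compute node (assumed)
    iops_per_desktop = 30          # steady-state IOPS per desktop (assumed profile)
    port_iops_budget = 50_000      # IOPS allowed per storage port (assumed budget)

    server_iops = desktops_per_server * iops_per_desktop        # 4,500 IOPS per server
    servers_per_storage_port = port_iops_budget // server_iops  # ~11 servers per port

    print(f"Each server drives ~{server_iops} IOPS")
    print(f"Plan for roughly {servers_per_storage_port} servers per storage port")

With inputs like these, the number of storage ports (and hence X-Bricks) follows from dividing the total server count by the resulting ratio.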
SCALE OF THE SAN

A best practice for SAN infrastructure design is to plan for a three- to five-year lifespan for the deployed solution. Considerations for the length of a SAN infrastructure's life include equipment depreciation, the limited predictability of business transformation and development, and technology refreshes and improvements. A clear understanding of application needs over the same period determines the range of scalability and flexibility necessary in the design.

SAN TOPOLOGY REDUNDANCY AND RESILIENCY

Redundancy refers to duplicated components in a system (in this case the SAN), whereas resiliency refers to the capability to continue operating when failures occur. So while redundancy supports resiliency, the degree of redundancy determines the level of resiliency. While SANs are deployed with redundant fabrics, each fabric can also be designed without any single point of failure and thus be resilient by design. Designing fabrics to be resilient requires dual-core or full-mesh topologies.

Figure 2 is an example of a redundant SAN design, and Figure 3 shows an example of a redundant and resilient SAN design.

Figure 2. Redundant SAN fabrics
Figure 3. Redundant and resilient SAN fabrics

For most XtremIO deployments, a core-edge topology is best suited to meet the needs of scalability and uniform IO performance. Figure 4 provides an example of a core-edge topology with X-Bricks connected at the core and hosts connected at the edge. Edge switches may be placed at the Top of Rack (ToR) or Middle of Row/End of Row (MoR/EoR), depending on the server density in the racks and the port density of the edge switch used in the design.

Figure 4. Core-edge SAN fabric design

For small XtremIO deployments (a single cluster with one or two X-Bricks throughout the lifecycle of the environment and limited requirements for scale), a simple collapsed design with a single backbone switch in each fabric, with hosts and X-Bricks directly connected, will be sufficient. Figure 5 shows an example of a collapsed design with hosts and storage directly connected to a backbone switch.

Figure 5. Collapsed SAN fabric design

For very large scale XtremIO deployments with multiple X-Brick clusters, a full-mesh UltraScale ICL design is recommended. This design, which uses the ICLs on the DCX 8510 platform for switch fabric backplane interconnect between the switches, combines completely flexible placement of hosts and storage with a maximum level of scalability and uniform IO performance. Figure 6 shows an example of the full-mesh UltraScale ICL design, which enables hosts and storage to be placed independently across racks with the DCXs placed MoR/EoR.

Figure 6. UltraScale SAN fabric design

For an overview of SAN topology design best practices, see:
http://www.brocade.com/downloads/documents/best_practice_guides/san-design-best-practices.pdf

SAN DESIGN GUIDELINES

Designing SANs fully dedicated to XtremIO using the three SAN designs reviewed in the previous section is often a straightforward sizing exercise. Calculate the ISL bandwidth necessary for the server density and the number of X-Bricks in each XtremIO cluster, combined with the anticipated server and storage growth during the lifecycle of the computing infrastructure.

• For core-edge designs, a good rule of thumb is to design with ISL bandwidth equal to the total capacity of all X-Brick ports that are accessible (zoned) to the hosts attached to each ToR switch; see the sketch after this list.
• For collapsed SAN designs, calculate the number of host and storage ports for the lifetime of the infrastructure.
• For UltraScale ICL designs, ICL capacity must be deployed to match the total number of X-Brick ports that are not local to each DCX Backbone.
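As a minimal illustration of the core-edge rule of thumb, the sketch below sizes one edge switch's ISLs to match the aggregate bandwidth of the X-Brick ports its hosts are zoned to. The port count is an example input; 16 Gbps is the Gen 5 Fibre Channel line rate.

    # Rule-of-thumb ISL sizing for a core-edge design: provision ISL bandwidth
    # equal to the aggregate bandwidth of the X-Brick ports zoned to the hosts
    # on this edge switch. The port count below is an example input.
    import math

    zoned_xbrick_ports = 8    # X-Brick ports visible to hosts on this edge switch
    port_speed_gbps = 16      # Gen 5 Fibre Channel line rate

    required_isl_gbps = zoned_xbrick_ports * port_speed_gbps      # 128 Gbps
    isls_needed = math.ceil(required_isl_gbps / port_speed_gbps)  # 8 x 16G ISLs

    print(f"Provision {required_isl_gbps} Gbps of ISL bandwidth per edge switch")
    print(f"That is {isls_needed} x {port_speed_gbps}G ISLs")

At matching link speeds the rule reduces to one ISL per zoned X-Brick port; the explicit bandwidth form also covers mixed-speed links.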
SAN DESIGN EXAMPLE COMBINING TRADITIONAL NON-FLASH AND EMC XTREMIO CLUSTERS

Designing SANs with a mix of traditional non-flash disk systems and XtremIO clusters is more complex. In the following example, the solution is sized on the assumption that the SAN infrastructure will be reevaluated after three years but must sustain business demands for a total of five years. This assumption leaves room for evaluating new technology and for a phased technology refresh and/or redesign, including the scalability to absorb unanticipated growth and demands for advanced fabric services. The solution is expanded continuously in phases so that there is always approximately 15 percent unused port capacity (that is, 85 percent utilization) as well as available slots in the DCXs. In this way, the SAN infrastructure becomes a business enabler rather than a business inhibitor.

As illustrated in Figure 7, the fabric is a core-edge architecture with a single DCX Backbone.

Figure 7. Single core-edge backbone architecture

This architecture meets the initial connectivity requirements. With incremental growth of servers and storage, new servers are connected through edge switches and storage is connected at the core, as shown in Figure 8.

Figure 8. Scaling by adding edge switches (and port blades on the backbone as needed)

In our example, the SAN is scaled to provide connectivity for 400 servers with access to both traditional non-flash and XtremIO storage. The split per fabric is 36 traditional non-flash storage ports plus two distinct XtremIO clusters, each with a single X-Brick requiring four storage ports at the core (in each fabric). This configuration results in an overall host-to-storage-port ratio (fan-in) of approximately 10:1. With anticipated incremental growth of 30 percent year over year in servers and storage during a five-year lifespan, the solution scales from 400 hosts (and 44 total storage ports) to 1,150 hosts, 104 traditional non-flash storage ports, and 24 XtremIO ports across two XtremIO clusters with six X-Bricks per cluster, for a total of approximately 1,598 used ports including ISLs. The design can scale well beyond this example using a dual-core or ICL design, as discussed in the previous section.

Note: Although the host-to-storage-port ratio, growth rate, and port utilization objectives may differ in your environment, we use this calculation example to illustrate how to project and accommodate the growth of a SAN fabric. The sketch below reproduces the compounding arithmetic.
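This minimal sketch compounds the example's 30 percent annual growth. Table 2 below rounds each year to planning-friendly values, so its figures differ slightly from the raw compounding shown here, and its year-one storage port count (44) reflects the fixed 36 + 8 port split rather than an exact 10:1 ratio.

    # Project five years of fabric growth at 30% per year, as in the example.
    # Table 2 rounds to planning-friendly values, so small deviations are expected.

    hosts = 400
    growth = 1.30
    fan_in = 10   # approximate host-to-storage-port ratio from the example

    for year in range(1, 6):
        storage_ports = round(hosts / fan_in)
        print(f"Year {year}: ~{hosts:>4} hosts, ~{storage_ports:>3} storage ports")
        hosts = round(hosts * growth)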
The key building blocks and design assumptions are:

• Edge switch: Brocade 6520 (Gen 5 switch with a total of 96 ports). On each edge switch, the planned port allocation is 60 ports for host connections and 12 ports for ISL connections to the backbone core. The remaining 24 ports on each edge switch are reserved as a buffer for unanticipated server growth or additional ISL bandwidth needs.
• Backbone core: 48-port blades (FC16-48) are used for storage and ISLs. Over the lifetime of the solution, 286 ports will be consumed on the DCX Backbone core.

Table 2 shows how the fabric grows year by year at the anticipated growth rate of 30 percent per year.

                               Year 1   Year 2   Year 3   Year 4   Year 5
    Edge hosts                    400      520      680      880     1150
    Traditional storage ports      36       48       64       86      104
    XtremIO ports                   8       12       16       20       24
    Total storage ports            44       60       80      106      128
    ISLs                           56       72       96      120      160
    Total edge ports              456      592      776     1000     1310
    Total core ports              100      132      176      226      288
    Edge switches                   6        8       11       14       18
    DCX 8510-8 Backbones            1        1        1        1        1
    FC16-48 port blades             3        4        5        6        8
    Ports used                    556      724      952     1226     1598
    Ports unused                  164      236      346      164      514
    Port utilization              77%      75%      73%      88%      76%
    Total ports per fabric        720      960     1296     1390     2112

Table 2. Fabric growth year by year

OPERATIONS AND MAINTENANCE

Administration, maintenance, and provisioning of a SAN fabric are essential to a good fabric design. The following are Brocade and EMC best practices for provisioning and monitoring SAN fabrics.

ZONING

Brocade Best Practice for Zoning

Zoning is a fabric-based service in SANs that groups host and storage nodes that need to communicate. It allows nodes to communicate with each other only if they are members of the same zone. Nodes can be members of multiple zones, allowing a great deal of flexibility when implementing a SAN using zoning. Zoning not only prevents a host from unauthorized access to storage assets, but also stops undesired host-to-host communication and fabric-wide Registered State Change Notification (RSCN) disruptions. Brocade recommends that users always implement zoning, even if they are also using LUN masking. PWWN identification for zoning is also recommended, for both security and operational consistency. For details, please refer to the Brocade SAN Fabric Administration Best Practices Guide.

Zone membership is primarily based on the need for a host to access a storage port. Hosts rarely need to interact directly with each other, and storage ports never initiate SAN traffic, by virtue of their nature as targets. Zones can be grouped by array, by host operating system, by application, or by location within the data center.

The recommended grouping method for zoning is single initiator zoning (SIZ), sometimes called "single HBA zoning." With SIZ, each zone contains only a single host bus adapter (HBA) and one or more storage ports; a minimal sketch follows below. It is recommended that you use separate zones for tape and disk traffic when an HBA carries both types of traffic. SIZ is optimal because it prevents any host-to-host interaction and limits RSCNs to just the zones that need the information within the RSCN.
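Here is a minimal sketch of SIZ: one zone per host HBA, each containing that HBA plus the storage ports it should reach. The WWPNs, target aliases, and the z_<hba> naming convention are hypothetical, for illustration only.

    # Single initiator zoning (SIZ) sketch: one zone per host HBA, each with
    # that HBA's PWWN plus the storage ports it needs. All identifiers below
    # are hypothetical examples.

    host_hbas = {
        "esx01_hba1": "10:00:00:00:c9:aa:00:01",    # example PWWNs
        "esx01_hba2": "10:00:00:00:c9:aa:00:02",
    }
    xtremio_targets = ["X1_SC1_FC1", "X1_SC2_FC1"]  # target aliases for this fabric

    # One zone per initiator: the HBA plus all of its target ports.
    zones = {f"z_{hba}": [pwwn] + xtremio_targets
             for hba, pwwn in host_hbas.items()}

    for name, members in zones.items():
        print(name, "->", "; ".join(members))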
EMC XtremIO Best Practice for Zoning

For zoning against an XtremIO storage array, it is recommended that at least two HBAs per host be available. It is also recommended that you zone initiators to all available storage ports. The recommended maximum number of paths to storage ports per host is 16, as described in the tables below. To ensure balanced utilization of the full range of XtremIO resources, use all storage ports uniformly across all hosts and clusters.

XtremIO storage arrays currently support a maximum of eight X-Bricks. Each X-Brick comprises two storage controllers, each of which has two Fibre Channel (FC) ports, designated FC1 and FC2. In the tables below, XN_SCN_FCN refers to the FC target ports on each XtremIO storage controller. In a dual X-Brick configuration, a host may have up to eight paths per device. Figure 9 displays the logical connection scheme for eight paths.

Figure 9. Logical connection schemes for eight paths

Table 3. Two X-Bricks and two HBAs (eight paths per host)

    HBA     Ports per cluster
    HBA1    X1_SC1_FC1, X1_SC2_FC1, X2_SC1_FC1, X2_SC2_FC1
    HBA2    X1_SC1_FC2, X1_SC2_FC2, X2_SC1_FC2, X2_SC2_FC2

Table 4. Two X-Bricks and four HBAs (16 paths per host)

    HBA     Ports per cluster
    HBA1    X1_SC1_FC1, X1_SC2_FC1, X2_SC1_FC1, X2_SC2_FC1
    HBA2    X1_SC1_FC2, X1_SC2_FC2, X2_SC1_FC2, X2_SC2_FC2
    HBA3    X1_SC1_FC1, X1_SC2_FC1, X2_SC1_FC1, X2_SC2_FC1
    HBA4    X1_SC1_FC2, X1_SC2_FC2, X2_SC1_FC2, X2_SC2_FC2

For more details, please refer to the XtremIO Host Configuration Guide, available on the EMC support website. The sketch below checks a host's path count against these recommendations.
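As a minimal check against the 16-path recommendation, this sketch computes paths per host on the assumption that each HBA is zoned to half of the cluster's FC ports, matching the connection schemes in Tables 3 and 4 (HBA1 reaches the FC1 ports, HBA2 the FC2 ports, and so on).

    # Check a host's path count against the recommended 16-path maximum.
    # Assumes each HBA reaches half of the cluster's FC ports, matching the
    # connection schemes in Tables 3 and 4.

    MAX_PATHS = 16   # recommended ceiling per host

    def paths_per_host(num_hbas: int, num_xbricks: int) -> int:
        ports_per_cluster = num_xbricks * 4   # 2 controllers x 2 FC ports each
        return num_hbas * (ports_per_cluster // 2)

    for hbas, bricks in [(2, 2), (4, 2), (2, 4)]:
        paths = paths_per_host(hbas, bricks)
        status = "OK" if paths <= MAX_PATHS else "exceeds recommendation"
        print(f"{hbas} HBAs, {bricks} X-Bricks -> {paths} paths ({status})")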
MONITORING AND ANALYTICS

With Brocade FC SANs, dashboards and reports can be configured to show only the most relevant data, enabling administrators to prioritize their actions more efficiently and maintain network performance.

Fabric Vision and MAPS

Brocade Fabric Vision technology combines capabilities from the Gen 5 Fibre Channel ASIC, Fabric OS® (FOS), and Brocade Network Advisor to address the challenges of network monitoring, maintaining 100 percent uptime, and ensuring network health and top performance. Fabric Vision is not a single product; it comprises a suite of features and technologies that help administrators address problems before they impact operations, accelerate new application deployments, and reduce operational costs. It provides visibility and insight across the storage network through innovative diagnostic, monitoring, and management technologies. Effective use of Fabric Vision is critical in environments with low latency, high IO processing, and flash-enabled storage, where network bottlenecks or slow-drain issues can quickly escalate to impact storage response times and application performance. Fabric Vision helps prevent issues from occurring and enables fast troubleshooting and resolution when issues do occur.

Fabric Vision Components

Fabric Vision comprises a combination of technologies (a conceptual sketch of the MAPS rule model follows after this list):

1. ClearLink Diagnostics: Diagnostic Port (D_Port) provides loopback test capabilities for link latency and distance measurement at the optical and electrical level to validate the integrity and performance of optics and cabling, ensuring signal and optical quality and optimal performance across SAN and WAN connections. Pre-validating the integrity of cables and optics with ClearLink prior to deployment identifies potential support issues before they occur and enhances the resiliency of high-performance fabrics. This matters in particular for all-flash array performance, where any impurity in the physical infrastructure can impact performance.

2. Monitoring and Alerting Policy Suite (MAPS): A policy-based monitoring and alerting tool that proactively monitors the health and performance of the SAN infrastructure based on predefined policies covering more than 170 customizable rules, ensuring application uptime and availability. Administrators who want a pristine network can set an "Aggressive" policy level, whose rules and actions use strict thresholds to minimize the possibility of data errors. By tailoring MAPS policies, you can monitor all-flash ports more closely to identify any performance degradation faster.

3. Flow Vision: A comprehensive tool that allows administrators to identify, monitor, and analyze specific application and data flows in order to maximize performance, avoid congestion, and optimize resources. Flow Vision consists of:
   o Flow Monitoring: Monitors specified traffic flows from source to destination through the SAN
   o Flow Generator: Generates traffic between any two ports in a Gen 5 fabric
   o Flow Mirroring: Captures packet data as it flows through the SAN, then displays and analyzes the captured packets' data
   Flow Vision is best suited to temporary use while troubleshooting high-latency conditions, as continual use results in the collection of large amounts of diagnostic data. It can be used as needed to verify performance for the most demanding applications.

4. Fabric Performance Impact (FPI) Monitoring: Identifies and alerts administrators to device or ISL congestion and high levels of latency in the fabric, which can severely impact all-flash array performance. FPI Monitoring provides visualization of bottlenecks and identifies slow-drain devices and the impacted hosts and storage.

5. At-a-Glance Dashboard: Includes customizable health and performance dashboard views, providing all critical information on one screen. Dashboard widgets worth monitoring include errors on all-flash-array-facing ports, top 10 flows, memory usage, and port health.

6. Forward Error Correction (FEC): Automatically detects and recovers from bit errors, enhancing transmission reliability and performance. FEC can reduce latency significantly by avoiding the need to retransmit frames with bit errors.

7. Credit Loss Recovery: Automatically detects and recovers buffer credit loss at the virtual channel level, providing protection against performance degradation and enhancing application availability.

8. COMPASS (Configuration and Operational Monitoring Policy Automation Services Suite): An automated configuration and operational monitoring policy tool that enforces configuration consistency across the fabric and monitors changes, simplifying SAN configuration and alerting you to changes. In medium to large environments, this can prevent inadvertent changes to switch configurations that might alter the preferred parameters set across the fabric to optimize performance.

Figure 10. Sample MAPS dashboard widgets

9. VMware® vRealize™ Log Insight: Brocade has worked closely with VMware to offer deeper investigation capabilities for root-cause analysis and remediation of SANs in virtualized environments. The Brocade SAN Content Pack eliminates noise from millions of events and amplifies critical SAN alerts to accelerate troubleshooting with actionable analytics. Faster troubleshooting frees time to proactively drive the value of IT resources. Leveraging Brocade Fabric Vision technology provides thorough knowledge of the behavior and health of Brocade SAN fabrics over time, making it possible to identify and remediate patterns impacting virtual machine performance and application responsiveness.
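To make the MAPS idea from item 2 concrete, here is a conceptual sketch of the rule-threshold-action model. This is not the Fabric OS or MAPS API; the metric names, thresholds, and actions are invented for illustration.

    # Conceptual sketch of policy-based monitoring in the MAPS style: each rule
    # pairs a metric threshold with an action. Not the real MAPS API; all
    # metric names and thresholds below are illustrative.
    from dataclasses import dataclass

    @dataclass
    class Rule:
        metric: str
        threshold: float
        action: str

    # A strict, "Aggressive"-style policy (hypothetical values).
    aggressive_policy = [
        Rule("crc_errors_per_min", 1.0, "alert"),
        Rule("port_utilization_pct", 80.0, "alert"),
        Rule("tx_latency_ms", 2.0, "fence_port"),
    ]

    # One sampling interval's counters for an all-flash-facing port (example data).
    sampled = {"crc_errors_per_min": 0.0,
               "port_utilization_pct": 91.5,
               "tx_latency_ms": 0.4}

    for rule in aggressive_policy:
        value = sampled.get(rule.metric)
        if value is not None and value > rule.threshold:
            print(f"{rule.metric}={value} exceeds {rule.threshold}: {rule.action}")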
BROCADE SAN CONTENT PACK FOR VMWARE VREALIZE LOG INSIGHT

The Brocade SAN Content Pack enables faster resolution of networking issues, reducing downtime and IT costs while improving operational efficiency. It includes:

• Intelligently classified events presented in context within the VMware vCenter™ Log Insight dashboards
• Predefined queries, alerts, and fields that can be customized to specific environments, offering both simplicity and flexibility
• Increased visibility of issues across the SAN network and improved operational efficiency through Brocade Fabric Vision technology integration

For details, please refer to the Brocade and VMware Technology Alliance at http://www.brocade.com/partnerships/technologyalliance-partners/partner-details/vmware/index.page

BROCADE SAN HEALTH AND EMC MITREND

Brocade SAN Health is a free software utility designed to securely audit and analyze SAN environments. It allows the user to perform critical tasks such as:

• Taking inventory of devices, switches, firmware versions, and fabrics
• Capturing and displaying historical performance data
• Checking zoning and switch configurations against best practices
• Assessing performance statistics and error conditions
• Producing detailed graphical reports and diagrams
• Reporting areas for improvement

HOW SAN HEALTH WORKS

SAN Health helps the user focus on optimizing SANs rather than manually tracking SAN components. A wide variety of features makes it easier to collect data, identify potential issues, and check results over time.

COMPREHENSIVE REPORTING

SAN Health utilizes two main components: a data capture application and a back-end report processing engine. After the capture of switch diagnostic data finishes, the back-end reporting process automatically generates a Visio topology diagram and a detailed snapshot report on the user's SAN configuration. This report contains summary information about the entire SAN as well as specific details about fabrics, switches, and individual ports. Other useful items in the report include alerts, historical performance graphs, and recommended best practices. Because SAN Health provides a point-in-time snapshot of the SAN, Brocade recommends using it to track traffic pattern changes in weekly or monthly increments. With a built-in scheduler, the user can run SAN Health after primary business hours for added safety and convenience. Additional detailed information, sample reports, Visio diagrams, and a list of supported devices are available at http://www.brocade.com/sanhealth.

EMC Mitrend takes the comprehensive output from SAN Health and creates an easy-to-understand summary presentation to help the customer make important decisions about the SAN environment. Using SAN Health data, Mitrend recommends consolidation and technology refresh options. Together, SAN Health and Mitrend provide the information customers need to quickly monitor the health of their SANs and make essential daily and long-term decisions.

Figure 11. Sample reports published with Brocade SAN Health
Figure 12. Sample diagram published with Brocade SAN Health

EMC STORAGE ANALYTICS

VMware vRealize Operations Manager is a software product that collects performance and capacity data from monitored software and hardware resources. EMC Storage Analytics provides vRealize Operations Manager with valuable information (such as performance and capacity metrics) via an EMC adapter, enabling analytics, reporting, and alert monitoring against an XtremIO array. XtremIO-specific alerts were added as of Revision 03; XtremIO topology, metrics, and dashboards have been available since Revision 01 with vRealize Operations Manager 6.0.1.
EMC Storage Analytics uses existing vCenter features to aggregate data from multiple sources and processes that data with proprietary analytic algorithms. With this information, it gives operations staff, analysts, and ultimately decision makers the insight to act proactively rather than fight fires reactively.

BUSINESS CONTINUITY AND BACKUP

EMC customers and solution architects have a wide range of tools at their disposal for sizing performance capacity (IOPS, bandwidth, or latency) and storage capacity. These tools can factor in the data reduction achievable through deduplication and compression for typical enterprise applications and use cases. XtremIO also integrates with other EMC data protection, disaster recovery, and backup products and solutions. These solutions can span applications running in both physical (bare metal) and virtual environments. As of XIOS version 4.0, more flexibility and options are available for interoperability and integration.

A storage assessment or business continuity study will answer the questions: How many X-Bricks should be deployed for existing applications per XtremIO cluster? And how many X-Bricks should comprise an XtremIO cluster at a given site, or at multiple sites where applicable?

For high availability, mobility, tiering, replication, disaster recovery, and backup, EMC XtremIO is fully integrated with EMC VPLEX®, EMC RecoverPoint®, and EMC Data Domain®.

EMC VPLEX is a continuous availability and data mobility platform that enables mission-critical applications to remain up and running during a variety of planned and unplanned downtime scenarios, within a data center and across data centers. VPLEX permits painless, non-disruptive data movement, taking non-distributed technologies and enabling them to function across arrays and across distance.

EMC RecoverPoint is an operational and disaster recovery solution that provides concurrent local and remote replication, with continuous data protection for any-point-in-time recovery, from XtremIO to any storage array. Please refer to this data sheet for details: http://www.emc.com/collateral/solution-overview/h13248-data-protection-emc-xtremio-so.pdf

EMC Data Domain deduplication systems are disk-based inline deduplication appliances and gateways that provide data protection and disaster recovery for enterprise environments.

EMC XTREMIO STORAGE CONSIDERATIONS

As of XIOS 4.0, cluster expansion is fully supported. For example, the number of X-Bricks can be increased from two to four, or up to the supported maximum of eight X-Bricks per cluster. Through XDP's automatic data distribution, data for existing applications is automatically rebalanced across all X-Bricks comprising the array. Performance then spans all of the X-Bricks in the cluster, ensuring uniform utilization of resources (software modules as well as hardware resources).

SWITCH CONNECTIVITY AND ZONING

Switch and switch port connectivity and zoning best practice calls for balanced utilization of all storage ports per host, while not exceeding 16 paths to the array. This should be considered when implementing the new SAN design. The steps might include removing heavily reused storage ports (on existing X-Bricks) from existing zones and adding new storage ports (from new X-Bricks) to them.

LUN QUEUE

It is generally recommended that you configure more than a single LUN on the XtremIO storage system. This ensures increased I/O queueing across the various paths already created via zoning, on a per-LUN basis. A good rule of thumb is to configure at least four LUNs per application or per host, as the sketch below illustrates.
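A minimal sketch of why multiple LUNs raise usable queue depth: the I/Os a host can keep in flight scale with LUNs x paths x per-LUN-path queue depth. The queue depth used here is a common HBA default assumed for illustration, not an XtremIO-mandated setting.

    # Outstanding I/O scales with LUNs x paths x per-LUN-path queue depth.
    # Queue depth of 32 is a common HBA default, assumed here for illustration.

    luns_per_host = 4          # rule of thumb from the text
    paths_per_lun = 8          # dual X-Brick, two HBAs (Table 3)
    queue_depth_per_path = 32  # assumed per-LUN, per-path queue depth

    multi_lun = luns_per_host * paths_per_lun * queue_depth_per_path
    single_lun = paths_per_lun * queue_depth_per_path

    print(f"4 LUNs: up to {multi_lun} I/Os in flight; 1 LUN: up to {single_lun}")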
STORAGE TUNING

XtremIO has no knobs for storage tuning, since every application has full access to all resources of the array. XtremIO provides recommendations and guidelines for various platforms in the XtremIO Storage Host Configuration Guide.

MONITORING THE EXISTING XTREMIO ARRAY

On an existing XtremIO cluster, in addition to the readily available metrics provided for SAN clients, metrics are also provided for the back-end components. Most importantly, the resource utilization of the software modules handling the data path is reported. This is important for assessing the consumption of existing applications against the array. For example, if an array consistently shows over 90 percent utilization, it is very likely close to reaching its maximum performance capacity.

APPENDIX A: REFERENCES

EMC

EMC Storage Analytics 3.1.1 Installation and User Guide
EMC Storage Analytics 3.1.1 Release Notes
XtremIO Storage Array User Guide
XtremIO Release Notes
XtremIO Storage Array Software and Upgrade Guide
XtremIO Performance and Data Services for Oracle Databases
XtremIO 3.0.1, 3.0.2, and 3.0.3 Storage Array Pre-Installation Checklist
XtremIO 3.0.3 Global Services Product Support Bulletin
XtremIO 3.0.1, 3.0.2, and 3.0.3 Storage Array Software Installation and Upgrade Guide

BROCADE

Brocade SAN Fabric Administration Best Practices Guide
(http://www.brocade.com/downloads/documents/best_practice_guides/sanadmin-best-practices-bp.pdf)
Brocade Fabric OS Administrator's Guide
(http://www.brocade.com/downloads/documents/product_manuals/B_SAN/FOS_AdminGd_v730.pdf)