Physical Layer Architecture Recommendations

The following recommendations concern aspects of Layer 1, known as the physical layer. This is where the rubber meets the road, bridging the gap between the logical world of software and the physical world of electronics and transmission systems.

Use dedicated switches

Although it may be possible to use existing network infrastructure for a new cluster, we recommend deploying dedicated switches and uplinks for Hadoop where possible. This has several benefits, including isolation and security, cluster growth capacity, and stronger guarantees that traffic from Hadoop and Spark won’t saturate existing network links.

Consider a cluster as an appliance

This is related to the previous point, but it is helpful to think of a cluster as a whole, rather than as a collection of servers to be added to your network. When organizations purchase a cluster as an appliance, installation becomes a relatively straightforward matter of supplying space, network connectivity, cooling, and power; the internal connectivity usually isn’t a concern. Architecting and building your own cluster means you necessarily need to be concerned with internal details, but the appliance mindset (thinking of the cluster as a single thing) is still appropriate.

Manage oversubscription

The performance of a cluster network is largely driven by the level of oversubscription at the switches, that is, the ratio of the bandwidth the attached servers can demand to the uplink bandwidth available to carry it. Cluster software can drive a network to capacity, so the network should be designed to minimize oversubscription. Cluster software performs best when oversubscription is kept to around 3:1 or better (that is, a lower ratio).
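As a rough illustration, the oversubscription ratio of a single switch can be estimated by dividing the total server-facing bandwidth by the total uplink bandwidth. The sketch below assumes that simple definition; the function name `oversubscription_ratio` and the port counts in the example are hypothetical, not taken from any particular switch.

```python
from fractions import Fraction


def oversubscription_ratio(server_ports: int, server_gbps: int,
                           uplink_ports: int, uplink_gbps: int) -> Fraction:
    """Return server-facing bandwidth divided by uplink bandwidth for one switch."""
    if uplink_ports <= 0 or uplink_gbps <= 0:
        raise ValueError("a switch needs at least one uplink with positive speed")
    downstream = server_ports * server_gbps   # bandwidth the servers can demand
    upstream = uplink_ports * uplink_gbps     # bandwidth available to leave the switch
    return Fraction(downstream, upstream)


# Hypothetical top-of-rack switch: 48 servers at 10 Gb/s, 4 uplinks at 40 Gb/s.
ratio = oversubscription_ratio(48, 10, 4, 40)
print(f"{float(ratio):.1f}:1")  # 480 / 160 -> 3.0:1
```

Halving the uplinks in this example doubles the ratio to 6:1, which is well outside the 3:1 guideline.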

Consider InfiniBand carefully

Hadoop clusters can be deployed using InfiniBand (IB) as the Layer 1 technology, but this is uncommon outside of Hadoop appliances. At the time of this writing, InfiniBand isn’t supported natively by services such as Hadoop and Spark. Features such as remote direct memory access (RDMA) are thus left unused, making the use of IP over InfiniBand (IPoIB) essential. As a consequence, the performance of InfiniBand is significantly reduced, making the higher speeds of InfiniBand less relevant. InfiniBand also introduces a secondary network interface to cluster servers, making them multihomed. Finally, the relative scarcity of InfiniBand skills in the market and the cost in comparison to Ethernet make the technology more difficult to adopt and maintain.

Use high-speed cables

Clusters are commonly cabled using copper cables. These are available in a number of standards, known as categories, which specify the maximum cable length and maximum speed at which a cable can be used.

Since the cost increase between cable types is negligible when compared to servers and switches, it makes sense to choose the highest-rated cable possible. At the time of this writing, the recommendation is to use Category 7a cable, which offers speeds of up to 40 Gb/s with a maximum distance of 100 meters (for solid core cables; 55 meters for stranded).

Fiber optic cables offer superior performance in terms of bandwidth and distance compared to copper, but at increased cost. They can be used to cable servers, but they are more often used for the longer-distance links that connect switches in different racks. At this time, the recommendation is to use OM3 optical cabling or better, which allows speeds up to 100 Gb/s.

Use high-speed networking

The days of connecting cluster servers at 1 Gb/s are long gone. Nowadays, almost all clusters should connect servers using 10 Gb/s or better. For larger clusters that use multiple switches, 40 Gb/s should be considered the minimum speed for the links that interconnect switches. Even with 40 Gb/s speeds, link aggregation is likely to be required to maintain an acceptable degree of oversubscription.
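To see why link aggregation tends to be needed, it can help to work out how many aggregated uplinks a switch requires to stay within a target oversubscription ratio. The following is a minimal sketch under the assumption that oversubscription is simply server-facing bandwidth divided by uplink bandwidth; the function name `uplinks_needed` and the example port counts are hypothetical.

```python
import math


def uplinks_needed(server_ports: int, server_gbps: int,
                   uplink_gbps: int, target_ratio: int = 3) -> int:
    """Smallest number of aggregated uplinks keeping oversubscription <= target_ratio."""
    if server_ports < 0 or min(server_gbps, uplink_gbps, target_ratio) <= 0:
        raise ValueError("port count must be non-negative; speeds and ratio must be positive")
    downstream = server_ports * server_gbps
    # Each uplink can "cover" uplink_gbps * target_ratio of server-facing bandwidth.
    return max(1, math.ceil(downstream / (uplink_gbps * target_ratio)))


# Hypothetical rack of 48 servers with 40 Gb/s uplinks and a 3:1 target.
print(uplinks_needed(48, 10, 40))  # 10 Gb/s servers -> 4 uplinks
print(uplinks_needed(48, 25, 40))  # 25 Gb/s servers -> 10 uplinks
```

Even in the 10 Gb/s case a single 40 Gb/s uplink is not enough, so several links have to be aggregated to reach the target.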

Consider hardware placement

We recommend racking servers in predictable locations, such as always placing master nodes at the top of the rack or racking servers in ascending name/IP order. This strategy can help to reduce the likelihood that a server is misidentified during cluster maintenance activities. Better yet, use labels and keep documentation up to date. Ensure that racks are colocated when considering stacked networks, since the stacking cables are short. Remember that server network cables may need to be routed between racks in this case. Ensure that racks are located no more than 100 meters apart when deploying optical cabling.
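One way to keep placement predictable is to generate the rack layout from the inventory rather than decide it by hand. The sketch below is only an illustration: it assumes 1U servers, a 42U rack filled from the top down, and IP addresses of a single version, and the function name `rack_layout` and the hostnames are hypothetical. It uses Python's standard `ipaddress` module so that addresses sort numerically rather than as text.

```python
import ipaddress


def rack_layout(hosts: dict[str, str], top_unit: int = 42) -> list[tuple[int, str, str]]:
    """Assign rack units from the top down in ascending IP order.

    Returns (rack_unit, hostname, ip) tuples; assumes every server is 1U.
    """
    if len(hosts) > top_unit:
        raise ValueError("more servers than rack units")
    # Sort by parsed address so that 10.0.0.9 comes before 10.0.0.10.
    ordered = sorted(hosts.items(), key=lambda item: ipaddress.ip_address(item[1]))
    return [(top_unit - i, name, ip) for i, (name, ip) in enumerate(ordered)]


# Hypothetical inventory: the master has the lowest address, so it lands at the top.
inventory = {
    "worker02": "10.0.0.12",
    "master01": "10.0.0.2",
    "worker01": "10.0.0.11",
}
for unit, name, ip in rack_layout(inventory):
    print(f"U{unit:02d}  {name}  {ip}")
```

The same output can double as the text for rack labels and as a starting point for the documentation.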

Don’t connect clusters to the internet

Use cases that require a cluster to be directly addressable on the public internet are rare. Since clusters often contain valuable, sensitive information, most should be deployed on secured internal networks, away from prying eyes, possibly using a leased line. Good information security policy says to minimize the attack surface of any system, and clusters such as Hadoop are no exception. When absolutely required, internet-facing clusters should be deployed behind firewalls and secured using Kerberos, Transport Layer Security (TLS), and encryption.

Published on Fri 12 April 2019 by Tracey Mann in Networking with tag(s): architecture