Wrong answers in advance – 2, or wrong answers that many people want to hear in a technical interview: Networks

For the lazy: some people still think that Etherchannel/LAG is the best thing ever invented in networking.

Part 0. Recruitment and HR 2024 – what’s new
Wrong answers in advance – 1, or how we get past the first hiring boss – HR
Wrong answers in advance – 2, or wrong answers that many people want to hear in a technical interview: Networks.
(planned) Wrong answers in advance – 3, or wrong answers that many want to hear in a technical interview: SDLC/DevOps.

Introduction for those who are not familiar with the theory
Consider a simple IPv4 network: Client (server, workstation) = switch1 = switch2 = gateway. The client initially does not know the gateway’s MAC address and broadcasts an ARP request: “Everyone here – who has the gateway IP?”. Switch 1 also does not yet know which addresses live behind which ports, so it floods this frame out all ports except the one it arrived on. Switch 2 does the same, the gateway responds, and after that it gets less interesting.
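For the curious, a minimal sketch (Windows, standard NetTCPIP cmdlets; the gateway address 192.168.1.1 is a placeholder for your own) that lets you watch this resolution happen:

```powershell
# Flush the IPv4 neighbor (ARP) cache so the next packet forces a fresh
# ARP request (run elevated). 192.168.1.1 is a placeholder gateway address.
Get-NetNeighbor -AddressFamily IPv4 | Remove-NetNeighbor -Confirm:$false

# One ping to the gateway: the OS must first broadcast
# "who has 192.168.1.1?" to learn the gateway's MAC address
Test-Connection 192.168.1.1 -Count 1

# The resolved MAC address now sits in the neighbor cache
Get-NetNeighbor -AddressFamily IPv4 | Where-Object IPAddress -eq '192.168.1.1'
```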
But what happens if we pull out the wire between the switches? Obviously, switch 1 can no longer talk to switch 2: NOTHING WORKS.
Let’s add two wires! Port 1 of switch 1 to port 1 of switch 2, and port 2 to port 2. What happens? Switch 1 floods a broadcast frame to switch 2 over link 1.1→1.2; switch 2, following the same rule, sends it out its port 2.2; switch 1, not expecting such treachery, accepts the frame on its port 2 and does as instructed – floods a broadcast frame whose destination is not in its MAC table out all ports except the one it arrived on. And off we go: frames multiply, everything is down, accounting cannot open 1C. A broadcast storm on a ring – this is lecture 1 or 2 of any first-year networking course.
What to do? First, the STP protocol appeared long ago (along with its descendants, plus storm-control, BPDU guard and similar settings): in such situations the second (third, and so on) port is simply blocked, and everyone is happy. But that still leaves only one working wire, and we want two. Enter Etherchannel/LAG: 2 (3, 4, up to 8) ports are combined into one virtual port with balancing magic inside; there are two common IEEE 802.3ad flavors – static and LACP – and a dozen vendor-specific ones. Everything is clearly documented. Until we start thinking about the pros and cons of Stack and MLAG – article 1, article 2 and so on, all the way down to IP fabrics.
Result: in the general networking case, Etherchannel/LAG is good and reliable. Even if you shoot yourself in one leg, you still have the other.

What about the clients?
After all, clients in the initial setup also have one wire: pull it out, or the port fails – and that’s it, no connection. Can we use two? We can. Once upon a time, back in the Windows Server 2003 days, it worked like this: install the vendor’s driver + software, two physical ports are assembled into one virtual port, and even some kind of balancing across several data streams works. Good, reliable.

Everything changes with the advent of virtualization and more
Already in Windows Server 2012 R2, and maybe even earlier, in 2012 (when this appeared in Linux I have not checked), there are three teaming and balancing options: Static Teaming, Switch Independent (SET), LACP.
As you can easily see, a Switch Independent (SET) mode appears, where all the configuration is done on the server side and the balancing happens there too. A LAG on the switch side is not always necessary – see reddit or Dell.
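As an illustration, a minimal PowerShell sketch of both approaches (adapter and switch names are placeholders; note that SET proper is built into the Hyper-V virtual switch starting with Windows Server 2016, while the classic LBFO team offers the three modes listed above):

```powershell
# Classic LBFO team (Windows Server 2012+): choose one of the three modes.
# "NIC1"/"NIC2" are placeholder adapter names.
New-NetLbfoTeam -Name "Team1" -TeamMembers "NIC1", "NIC2" `
    -TeamingMode SwitchIndependent   # or: Static / Lacp

# Switch Embedded Teaming (Windows Server 2016+): the team lives inside the
# Hyper-V virtual switch; no LAG is configured on the physical switch at all.
New-VMSwitch -Name "SETswitch" -NetAdapterName "NIC1", "NIC2" `
    -EnableEmbeddedTeaming $true
```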
The pros and cons are known: LAG sort of pretends to monitor that both ports are working (which does not always hold in real situations), while SET requires you to monitor port state yourself. Monitoring is needed in both cases anyway: dropped and errored packets must be watched, as must signal-to-noise and traffic-error levels. And yes, sometimes Cisco can put on a show with the Nexus that ends in a call to TAC, and in other places, between a Cisco Nexus and anything else, you can get a full spoonful of problems too.
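For example, here is roughly what such monitoring can poll on the Windows side (standard NetAdapter cmdlets, no teaming-specific assumptions):

```powershell
# Error and discard counters per physical adapter - these need watching
# regardless of whether LAG or SET is in use
Get-NetAdapter -Physical |
    Get-NetAdapterStatistics |
    Select-Object Name, ReceivedPacketErrors, OutboundPacketErrors,
        ReceivedDiscardedPackets, OutboundDiscardedPackets
```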

With virtualization comes the virtual switch – for MS this is described in Hyper-V Cluster NIC Teaming.
For VMware (Broadcom) – in the documentation on the standard and distributed virtual switches.

What is the difference between a regular (physical) switch and a virtual one?
First of all, exactly what was described above: in the normal case you cannot build a ring through the MS or Broadcom virtual switch, because the virtual switch does not forward broadcast packets between physical NICs. This is written directly in the Broadcom documentation (“Typically VMware vSwitches (Standard and Distributed)…”) – see Understanding the BPDU Filter feature in vSphere (2047822),
and therefore STP must be disabled – STP can cause temporary loss of network connectivity when a failover or failback event occurs (1003804)
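As a sketch of the host-side setting (assuming PowerCLI; the host name is a placeholder – KB 2047822 describes this BPDU Filter, which is controlled by the advanced option Net.BlockGuestBPDU):

```powershell
# Enable the guest BPDU filter (Net.BlockGuestBPDU = 1) described in KB 2047822.
# Requires the VMware.PowerCLI module; names are placeholders.
Connect-VIServer -Server vcenter.example.local

$vmhost = Get-VMHost "esx01.example.local"
Get-AdvancedSetting -Entity $vmhost -Name "Net.BlockGuestBPDU" |
    Set-AdvancedSetting -Value 1 -Confirm:$false
```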

For MS, the same is described in Network Recommendations for Hyper-V Cluster in Windows Server 2012, Windows Server 2016: NIC Teaming Functionality, Step-by-Step – Deploy Switch Embedded Teaming (SET)

For Linux there are similar options, described in the RH documentation – Chapter 3. Configuring network bonding.

But then how do we get fault tolerance and balancing? After all, everything said above about LAG is so good and so well documented. Let’s do LAG!

Let’s skip the fact that the LACP protocol only works if you buy a vDS, while a Virtual Standard Switch (vSS) supports only static aggregation (1001938). But why do LACP at all? Standard NIC Teaming already allows balancing, both active-active and active-standby – see Teaming and Failover Policy. There are at least two full articles about the pros and cons of both solutions – one, two.
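To make the comparison concrete, a hedged PowerCLI sketch of both teaming variants on a standard vSwitch (host, vSwitch and vmnic names are placeholders):

```powershell
# Requires the VMware.PowerCLI module; all names are placeholders.
Connect-VIServer -Server vcenter.example.local
$vsw = Get-VMHost "esx01.example.local" | Get-VirtualSwitch -Name "vSwitch0"

# Active-active without any LAG: "route based on originating virtual port"
Get-NicTeamingPolicy -VirtualSwitch $vsw |
    Set-NicTeamingPolicy -LoadBalancingPolicy LoadBalanceSrcId

# Active-standby variant: explicit failover order, one uplink on standby
Get-NicTeamingPolicy -VirtualSwitch $vsw |
    Set-NicTeamingPolicy -MakeNicActive "vmnic0" -MakeNicStandby "vmnic1"
```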

Let me quote an excerpt from the first of those articles:
Load Based Teaming (LBT): can rebalance traffic based on the load of the physical ports. LACP cannot.
LACP: harder to configure (more operations) on both the host and the switches.
LACP: harder to monitor on the ESXi side.

RDMA, SDS and management come into play
RDMA – Remote Direct Memory Access
SDS – Software-Defined Storage

RDMA on Windows gives a 20–30% speed gain (To RDMA, or not to RDMA – that is the question); for SDS, vSAN gets both a speed gain and a reduction in processor load,
but:
RDMA does NOT work with LAG, and in addition:

Basic information about Remote Direct Memory Access (RDMA) in vSphere: PVRDMA does not support NIC teaming. The HCA must be the only uplink on the vSphere (Distributed) Switch.
vSAN with RDMA does not support LACP or IP-hash-based NIC teaming. vSAN with RDMA does support NIC failover.
Agents: If Link Aggregation Control Protocol (LACP) is used on an ESXi host that is a member of a vSAN cluster, do not restart the ESXi management agents with the services.sh command.

That is: either LAG, with its absence of (real) balancing and added administrative complexity, or RDMA, with balancing (including vSAN in Air gap mode) and somewhat fewer problems.

For MS Windows, read Manage SMB Multichannel and SMB Direct using SET – i.e. without LAG: you can use RDMA-capable network adapters with Switch Embedded Teaming (SET) starting with Windows Server 2016.
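A minimal sketch following the pattern of that document (adapter and switch names are placeholders; RDMA-capable NICs are assumed):

```powershell
# SET switch over two RDMA-capable NICs (placeholder names), no LAG anywhere
New-VMSwitch -Name "SETswitch" -NetAdapterName "NIC1", "NIC2" `
    -EnableEmbeddedTeaming $true

# Host vNICs for SMB traffic on top of the SET switch
Add-VMNetworkAdapter -ManagementOS -Name "SMB_1" -SwitchName "SETswitch"
Add-VMNetworkAdapter -ManagementOS -Name "SMB_2" -SwitchName "SETswitch"

# Enable RDMA on the host vNICs and check that SMB sees RDMA-capable interfaces
Enable-NetAdapterRdma "vEthernet (SMB_1)", "vEthernet (SMB_2)"
Get-SmbClientNetworkInterface | Where-Object RdmaCapable
```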

Limitations
RDMA has its limits, and the first of them is distance. Not in the sense that RDMA (RoCE) cannot be carried over a hundred kilometers – it can.
But because of the speed loss (without optimization) and the possible packet loss at such distances: possible, yes, but not always worth it.
And for a stretched vSAN cluster, RDMA simply does not work:

vSAN over RDMA does not support stretched or two-node vSAN clusters.
vSAN cluster sizes limited to 32 hosts
vSAN cluster must not be running the vSAN iSCSI services
vSAN cluster must not be running in a stretched cluster configuration
vSAN cluster must be using RDMA over Layer 2. RDMA over Layer 3 is not supported.
vSAN cluster running vSAN over RDMA is not supported in disaggregated vSAN topologies such as vSAN Max.
vSAN cluster cannot use a NIC teaming policy based on IP Hash or any active/active configuration where traffic is balanced across two or more uplinks.

Stretched Clusters Network Design v7
What Are vSAN Stretched Clusters v8
vSAN Frequently Asked Questions (FAQ)

Note: the point that a vSAN-over-RDMA cluster must not run the vSAN iSCSI services is not clearly documented – the line is present in the vSAN Frequently Asked Questions (FAQ), but in Characteristics of vSAN iSCSI Network (v8) I do not see it.

There could be a section here about how painful all of the above is in Linux, and about vendor support, and separately about how some products manage to violate the first commandment – do not create a loop inside a virtual switch – but this is already five pages. Trying to dive into networking with VXLAN and the way networks are organized in containers would add another 20 pages.

Summing up
Despite everything above, the only answer often accepted as “correct” to the question “how do we make a fault-tolerant cluster of anything” is: let’s build a LAG! Everything above is newfangled lies, 640K 10G ought to be enough for everyone, and for 25/100G we have neither the money nor the specialists.

Homework
Pros and Cons of Air Gap Network Configurations with vSAN
NFS Multi-Connections in vSphere 8.0 Update 1
