The ERSPAN Past and Present of Mylinking™ Network Visibility

The most common tool for network monitoring and troubleshooting today is Switch Port Analyzer (SPAN), also known as port mirroring. It lets us monitor network traffic out of band, in bypass mode, without interfering with services on the live network, and it sends a copy of the monitored traffic to local or remote devices such as sniffers, IDSs, or other network analysis tools.

Some typical uses are:

• Troubleshoot network problems by tracking control/data frames;

• Analyze latency and jitter by monitoring VoIP packets;

• Analyze latency by monitoring network interactions;

• Detect anomalies by monitoring network traffic.

SPAN traffic can be mirrored locally to other ports on the same source device, or mirrored remotely to other network devices that are Layer 2 adjacent to the source device (RSPAN).

Today we are going to talk about ERSPAN (Encapsulated Remote Switch Port Analyzer), a remote traffic monitoring technology whose mirrored traffic can be carried across Layer 3 IP networks. As the name suggests, it extends SPAN with encapsulated remote mirroring.

Basic operation principles of ERSPAN

First, let's take a look at ERSPAN's features:

• A copy of the packet from the source port is sent to the destination server for parsing through Generic Routing Encapsulation (GRE). The physical location of the server is not restricted.

• With the help of the User Defined Field (UDF) feature of the chip, the expert-level extended list can match at any offset of 1 to 126 bytes from a base field, so session keywords can be matched to visualize sessions such as the TCP three-way handshake and RDMA sessions;

• Supports a configurable sampling rate;

• Supports limiting the captured packet length (Packet Slicing), which reduces the load on the target server.

With these features, you can see why ERSPAN is an essential tool for monitoring networks inside data centers today.
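The sampling and packet-slicing features listed above can be pictured with a minimal sketch. It assumes a deterministic 1-in-N sampling policy and a fixed slice length; the function name `sample_and_slice` and its parameters are illustrative only, and a real switch does this in hardware with its own policy.

```python
from typing import Iterable, Iterator


def sample_and_slice(frames: Iterable[bytes], sample_every: int, slice_len: int) -> Iterator[bytes]:
    """Yield every N-th frame, truncated to at most slice_len bytes."""
    if sample_every < 1 or slice_len < 1:
        raise ValueError("sample_every and slice_len must be positive")
    for index, frame in enumerate(frames):
        if index % sample_every == 0:   # assumed policy: keep 1 frame out of every N
            yield frame[:slice_len]     # packet slicing: keep only the leading bytes


# Ten 200-byte frames, keep 1 in 4, slice to 64 bytes.
frames = [bytes([i]) * 200 for i in range(10)]
mirrored = list(sample_and_slice(frames, sample_every=4, slice_len=64))
print(len(mirrored), [len(f) for f in mirrored])
```

With these inputs only frames 0, 4, and 8 are copied, and each copy is 64 bytes long, which is the reduction in load on the target server that the feature list refers to.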

ERSPAN's main functions can be summarized in two aspects:

• Session visibility: use ERSPAN to collect all newly created TCP and Remote Direct Memory Access (RDMA) sessions and send them to the back-end server for display;

• Network troubleshooting: Captures network traffic for fault analysis when a network problem occurs.

To do this, the source network device needs to filter the traffic of interest to the user out of the massive data stream, make a copy, and encapsulate each copied frame in a special "superframe container". The container carries enough additional information for the copy to be routed correctly to the receiving device, and for the receiving device to extract and fully recover the original monitored traffic.

The receiving device can be another server that supports decapsulating ERSPAN packets.

Encapsulating ERSPAN packets

The ERSPAN Type and Packet Format Analysis

ERSPAN packets are encapsulated using GRE and forwarded to any IP addressable destination over Ethernet. ERSPAN is currently mainly used on IPv4 networks, and IPv6 support will be a requirement in the future.

The general encapsulation structure of ERSPAN is shown in the following capture of mirrored ICMP packets:

Encapsulation structure of ERSPAN

The ERSPAN protocol has evolved over a long period of time. As its capabilities grew, several versions emerged, called "ERSPAN Types". Different Types have different frame header formats.

The Type is identified by the Version field, the first field of the ERSPAN header:

ERSPAN header version

In addition, the Protocol Type field in the GRE header also indicates the internal ERSPAN Type. The Protocol Type field 0x88BE indicates ERSPAN Type II, and 0x22EB indicates ERSPAN Type III.
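As a rough sketch of how a receiver might use the GRE Protocol Type to tell the Types apart, the function below inspects the first four bytes of a GRE header. The two protocol values come from the text above. Treating a 0x88BE header without the sequence-number flag as Type I is an assumption based on the Type I and Type II descriptions in this article, and the function name is illustrative.

```python
import struct

GRE_PROTO_ERSPAN_II = 0x88BE    # Type II (assumed to be shared by Type I)
GRE_PROTO_ERSPAN_III = 0x22EB   # Type III
GRE_S_BIT = 0x1000              # "Sequence Number Present" flag in the GRE header


def erspan_type_from_gre(gre: bytes) -> str:
    """Guess the ERSPAN Type from the first 4 bytes of a GRE header."""
    if len(gre) < 4:
        raise ValueError("a GRE header needs at least 4 bytes")
    flags_version, proto = struct.unpack("!HH", gre[:4])
    if proto == GRE_PROTO_ERSPAN_III:
        return "Type III"
    if proto == GRE_PROTO_ERSPAN_II:
        # Assumption: Type I sets every GRE flag to 0, so no sequence number follows.
        return "Type II" if flags_version & GRE_S_BIT else "Type I"
    return "not ERSPAN"


for sample in ("100088be", "000088be", "100022eb", "00000800"):
    print(sample, "->", erspan_type_from_gre(bytes.fromhex(sample)))
```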

1. Type I

The ERSPAN frame of Type I encapsulates IP and GRE directly over the header of the original mirrored frame. This encapsulation adds 38 bytes to the original frame: 14 (MAC) + 20 (IP) + 4 (GRE). The advantage of this format is its compact header size, which reduces the cost of transmission. However, because it sets the GRE Flags and Version fields to 0, it carries no extended fields. Type I is not widely used, so we will not go into more detail.

The GRE header format of Type I is as follows:

GRE header format I

2. Type II

In Type II, the C, R, K, S, s, Recur, Flags, and Version fields in the GRE header are all 0 except the S field. The Sequence Number field is therefore present in the Type II GRE header. With it, the receiver can restore the order of GRE packets, so a burst of packets that arrive out of order because of a network fault can still be sorted.

The GRE header format of Type II is as follows:

GRE header format II
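To make the role of the sequence number concrete, here is a small sketch that packs an 8-byte GRE header with only the S bit set and then uses the sequence numbers to re-sort headers and report gaps. The helper names are illustrative, and sequence-number wraparound is deliberately ignored to keep the example short.

```python
import struct

GRE_S_BIT = 0x1000             # "Sequence Number Present" flag
GRE_PROTO_ERSPAN_II = 0x88BE


def build_gre_type2(seq: int) -> bytes:
    """Pack an 8-byte GRE header: flags/version, protocol type, sequence number."""
    return struct.pack("!HHI", GRE_S_BIT, GRE_PROTO_ERSPAN_II, seq & 0xFFFFFFFF)


def gre_sequence(header: bytes) -> int:
    """Read the sequence number from a GRE header that has the S bit set."""
    flags, _proto, seq = struct.unpack("!HHI", header[:8])
    if not flags & GRE_S_BIT:
        raise ValueError("GRE header carries no sequence number")
    return seq


def restore_order(headers):
    """Return (headers sorted by sequence number, missing sequence numbers)."""
    ordered = sorted(headers, key=gre_sequence)
    if not ordered:
        return [], []
    seen = {gre_sequence(h) for h in ordered}
    missing = [s for s in range(min(seen), max(seen) + 1) if s not in seen]
    return ordered, missing


arrived = [build_gre_type2(s) for s in (3, 1, 4, 6, 2)]
ordered, missing = restore_order(arrived)
print([gre_sequence(h) for h in ordered], missing)
```

In this run the receiver recovers the order 1, 2, 3, 4, 6 and can tell that packet 5 never arrived.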

In addition, the ERSPAN Type II frame format adds an 8-byte ERSPAN header between the GRE header and the original mirrored frame.

The ERSPAN header format for Type II is as follows:

ERSPAN header format II
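The 8-byte Type II header can be decoded with a few shifts and masks. The sketch below assumes the bit layout given in the publicly available ERSPAN draft (Ver 4 bits, VLAN 12, COS 3, En 2, T 1, Session ID 10, then 12 reserved bits and a 20-bit Index); the class and function names are illustrative.

```python
import struct
from typing import NamedTuple


class ErspanType2Header(NamedTuple):
    version: int
    vlan: int
    cos: int
    encap: int
    truncated: int
    session_id: int
    index: int


def parse_erspan_type2(data: bytes) -> ErspanType2Header:
    """Decode the 8-byte ERSPAN Type II header (layout assumed from the ERSPAN draft)."""
    if len(data) < 8:
        raise ValueError("ERSPAN Type II header needs 8 bytes")
    word1, word2 = struct.unpack("!II", data[:8])
    return ErspanType2Header(
        version=word1 >> 28,              # 4 bits
        vlan=(word1 >> 16) & 0x0FFF,      # 12 bits
        cos=(word1 >> 13) & 0x7,          # 3 bits
        encap=(word1 >> 11) & 0x3,        # 2 bits (En)
        truncated=(word1 >> 10) & 0x1,    # 1 bit (T)
        session_id=word1 & 0x3FF,         # 10 bits
        index=word2 & 0xFFFFF,            # low 20 bits, upper 12 are reserved
    )


header = parse_erspan_type2(bytes.fromhex("1064000500000007"))
print(header)
```

The sample bytes decode to version 1, VLAN 100, session ID 5, and index 7.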

Finally, immediately following the original mirrored frame is the standard 4-byte Ethernet cyclic redundancy check (CRC) code.

CRC

It is worth noting that in practice the mirrored frame does not contain the FCS field of the original frame; instead, a new CRC value is calculated over the entire ERSPAN packet. This means the receiving device cannot verify the CRC of the original frame, and we can only assume that only uncorrupted frames are mirrored.
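The consequence of recalculating the CRC can be shown with a short sketch. It assumes the trailing code is the standard Ethernet CRC-32, which `zlib.crc32` computes and which is appended least significant byte first. The `b"ERSPAN-HEADERS"` prefix is only a placeholder for the real outer headers.

```python
import struct
import zlib


def append_crc(data: bytes) -> bytes:
    """Append the Ethernet-style CRC-32 of data, least significant byte first."""
    return data + struct.pack("<I", zlib.crc32(data) & 0xFFFFFFFF)


def crc_ok(data_with_crc: bytes) -> bool:
    """Check that the last 4 bytes match the CRC-32 of everything before them."""
    body, crc = data_with_crc[:-4], data_with_crc[-4:]
    return struct.pack("<I", zlib.crc32(body) & 0xFFFFFFFF) == crc


original = append_crc(bytes(60))                          # frame with a valid FCS
damaged = bytes([original[0] ^ 0xFF]) + original[1:]      # corrupted in transit
print("damaged original passes:", crc_ok(damaged))

# Mirroring drops the original FCS and computes a new CRC over the whole packet.
mirrored = append_crc(b"ERSPAN-HEADERS" + damaged[:-4])
print("mirrored copy passes:", crc_ok(mirrored))
```

The damaged frame fails its own check, yet the mirrored copy passes, because the new CRC only protects the ERSPAN packet on its way to the receiver.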

3. Type III

Type III introduces a larger and more flexible composite header to address increasingly complex and diverse network monitoring scenarios, including but not limited to network management, intrusion detection, and performance and delay analysis. These scenarios need to know all the original parameters of the mirrored frame, including those that are not present in the original frame itself.

The ERSPAN Type III composite header includes a mandatory 12-byte header and an optional 8-byte platform-specific subheader.
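Putting the header sizes mentioned so far side by side gives a quick estimate of the per-frame overhead of each Type. This is a back-of-the-envelope sketch: it excludes the trailing 4-byte CRC, and it assumes that the Type III GRE header carries a 4-byte sequence number just as Type II does, which the text above does not state.

```python
ETHERNET_HEADER = 14   # outer MAC header
IPV4_HEADER = 20       # outer IPv4 header without options
GRE_BASE = 4           # GRE flags/version + protocol type
GRE_SEQUENCE = 4       # GRE sequence number (Type II; assumed for Type III too)

OUTER = ETHERNET_HEADER + IPV4_HEADER + GRE_BASE

ERSPAN_OVERHEAD = {
    "I": OUTER,                                   # no ERSPAN header at all
    "II": OUTER + GRE_SEQUENCE + 8,               # 8-byte ERSPAN header
    "III": OUTER + GRE_SEQUENCE + 12,             # mandatory 12-byte header
    "III+subheader": OUTER + GRE_SEQUENCE + 12 + 8,
}


def mirrored_length(original_len: int, erspan_type: str) -> int:
    """Length of the mirrored packet, excluding the trailing CRC."""
    return original_len + ERSPAN_OVERHEAD[erspan_type]


for name, extra in ERSPAN_OVERHEAD.items():
    print(f"Type {name}: +{extra} bytes, 64-byte frame becomes {mirrored_length(64, name)}")
```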

The ERSPAN header format for Type III is as follows:

ERSPAN header format III

Again, a 4-byte CRC follows the original mirrored frame.

CRC

As can be seen from the header format of Type III, in addition to retaining the Ver, VLAN, COS, T and Session ID fields on the basis of Type II, many special fields are added, such as:

• BSO: used to indicate the load integrity of data frames carried through ERSPAN. 00 is a good frame, 11 is a bad frame, 01 is a short frame, and 10 is an oversized frame;

• Timestamp: exported from the hardware clock synchronized with the system time. This 32-bit field supports at least 100 microseconds of Timestamp granularity;

• P and Frame Type (FT): the former indicates whether ERSPAN carries an Ethernet protocol frame (PDU frame), and the latter indicates whether ERSPAN carries an Ethernet frame or an IP packet.

• HW ID: unique identifier of the ERSPAN engine within the system;

• Gra (Timestamp Granularity): specifies the granularity of the timestamp. For example, 00b represents 100-microsecond granularity, 01b represents 100-nanosecond granularity, 10b represents IEEE 1588 granularity, and 11b requires a platform-specific subheader to achieve higher granularity.

• Platf ID and Platform Specific Info: the Platform Specific Info field has different formats and contents depending on the Platf ID value.

Port ID Index
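The mandatory 12-byte Type III header can be decoded in the same way as the Type II header. The sketch below assumes the bit layout from the publicly available ERSPAN draft; the SGT, D, and O fields are not described in this article and appear only because the layout needs them. The BSO and Gra code points follow that draft as well, and all names are illustrative.

```python
import struct

# Code points assumed from the ERSPAN draft.
BSO_MEANING = {0b00: "good frame", 0b01: "short frame",
               0b10: "oversized frame", 0b11: "bad frame"}
GRA_MEANING = {0b00: "100 microseconds", 0b01: "100 nanoseconds",
               0b10: "IEEE 1588", 0b11: "platform-specific"}


def parse_erspan_type3(data: bytes) -> dict:
    """Decode the mandatory 12-byte ERSPAN Type III header."""
    if len(data) < 12:
        raise ValueError("ERSPAN Type III header needs 12 bytes")
    word1, timestamp, word3 = struct.unpack("!III", data[:12])
    return {
        "version": word1 >> 28,
        "vlan": (word1 >> 16) & 0x0FFF,
        "cos": (word1 >> 13) & 0x7,
        "bso": BSO_MEANING[(word1 >> 11) & 0x3],
        "truncated": bool((word1 >> 10) & 0x1),
        "session_id": word1 & 0x3FF,
        "timestamp": timestamp,
        "sgt": word3 >> 16,                          # not covered in the article
        "pdu_frame": bool((word3 >> 15) & 0x1),      # P
        "frame_type": (word3 >> 10) & 0x1F,          # FT
        "hw_id": (word3 >> 4) & 0x3F,
        "direction": (word3 >> 3) & 0x1,             # D, not covered in the article
        "granularity": GRA_MEANING[(word3 >> 1) & 0x3],
        "has_subheader": bool(word3 & 0x1),          # O, optional subheader present
    }


fields = parse_erspan_type3(bytes.fromhex("200a0803000003e800000052"))
print(fields)
```

The sample bytes decode to version 2, VLAN 10, a short frame, session ID 3, timestamp 1000, HW ID 5, and 100-nanosecond granularity with no subheader.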

It should be noted that the header fields above can be used in regular ERSPAN applications, even when mirroring error frames or BPDU frames, while preserving the original trunk encapsulation and VLAN ID. In addition, timestamps and other key information fields can be added to each ERSPAN frame during mirroring.

With ERSPAN's own feature headers, we can achieve a more refined analysis of network traffic. All that remains is to attach the corresponding ACL to the ERSPAN session to match the network traffic we are interested in.

ERSPAN Implements RDMA Session Visibility

Let's take an example of using ERSPAN technology to achieve RDMA session visualization in an RDMA scenario:

RDMA: Remote Direct Memory Access enables the network adapter of server A to read and write the memory of server B by using intelligent network interface cards (iNICs) and switches, achieving high bandwidth, low latency, and low resource utilization. It is widely used in big data and high-performance distributed storage scenarios.

RoCEv2: RDMA over Converged Ethernet Version 2. The RDMA data is encapsulated in the UDP Header. The destination port number is 4791.

Daily operation and maintenance of RDMA requires collecting a large amount of data, which is used to establish day-to-day baseline (watermark) levels, to raise alarms on anomalies, and to locate the cause of problems. Combined with ERSPAN, massive data can be captured quickly to obtain microsecond-level forwarding quality data and the protocol interaction status of the switching chip. Through statistics and analysis of this data, RDMA end-to-end forwarding quality can be assessed and predicted.

To achieve RDMA session visualization, ERSPAN needs to match keywords of RDMA interaction sessions when mirroring traffic, and for that we need to use the expert-level extended list.

Expert-level extended list matching field definition:

The UDF consists of five fields: UDF keyword, base field, offset field, value field, and mask field. Limited by the capacity of hardware entries, a total of eight UDFs can be used. One UDF can match a maximum of two bytes.

• UDF keyword: UDF1 ... UDF8, the eight keywords of the UDF matching field

• Base field: identifies the start position of the UDF matching field. The following base values are supported:

L4_header (applicable to RG-S6520-64CQ)

L5_header (applicable to RG-S6510-48VS8CQ)

• Offset: indicates the offset based on the base field. The value ranges from 0 to 126

• Value field: the value to match. It is used together with the mask field to configure the specific value to be matched. The valid width is two bytes

• Mask field: the mask. The valid width is two bytes

(Note: if multiple entries are used in the same UDF matching field, the base and offset fields must be the same.)
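The field definitions above can be read as a simple masked comparison on two bytes. The sketch below is a software illustration of that idea, not the switch's implementation. It assumes that a UDF matches when the two bytes at base + offset, ANDed with the mask, equal the value ANDed with the mask; `udf_match` and `base_start` are illustrative names.

```python
def udf_match(packet: bytes, base_start: int, offset: int, value: int, mask: int) -> bool:
    """Compare the 2 bytes at base_start + offset against value under mask."""
    start = base_start + offset
    field = packet[start:start + 2]
    if len(field) < 2:          # packet too short to contain the field
        return False
    return (int.from_bytes(field, "big") & mask) == (value & mask)


# Synthetic packet: 14-byte Ethernet + 20-byte IPv4 header, so the L4 (UDP)
# header is assumed to start at byte 34; the byte after the 8-byte UDP header is 0x81.
L4_START = 34
packet = bytes(L4_START) + bytes(8) + b"\x81\x00" + bytes(10)

print(udf_match(packet, L4_START, 8, 0x8100, 0xFF00))   # first byte is 0x81
print(udf_match(packet, L4_START, 8, 0x1100, 0xFF00))   # first byte is not 0x11
```

A mask of 0xFF00 means only the first of the two bytes takes part in the comparison.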

The two key packets associated with RDMA session status are Congestion Notification Packet (CNP) and Negative Acknowledgment (NAK):

The former is generated by the RDMA receiver after it receives an ECN-marked packet from the switch (marked when the egress buffer reaches its threshold), and it contains information about the flow or QP causing congestion. The latter is a response message indicating that packet loss occurred in the RDMA transmission.

Let's look at how to match these two messages using the expert-level extended list:

RDMA CNP

expert access-list extended rdma

permit udp any any any any eq 4791 udf 1 l4_header 8 0x8100 0xFF00 (Matching RG-S6520-64CQ)

permit udp any any any any eq 4791 udf 1 l5_header 0 0x8100 0xFF00 (Matching RG-S6510-48VS8CQ)

RDMA NAK

expert access-list extended rdma

permit udp any any any any eq 4791 udf 1 l4_header 8 0x1100 0xFF00 udf 2 l4_header 20 0x6000 0xFF00 (Matching RG-S6520-64CQ)

permit udp any any any any eq 4791 udf 1 l5_header 0 0x1100 0xFF00 udf 2 l5_header 12 0x6000 0xFF00 (Matching RG-S6510-48VS8CQ)
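For readers who want to check a capture on the analysis server, the same matches can be approximated in software. The sketch below mirrors the rules above: UDP destination port 4791, then the first byte after the UDP header (0x81 in the first rule set, 0x11 in the second) and, for the second rule set, the byte at L5 offset 12 (0x60). Reading 0x81 as the CNP opcode and 0x11 with 0x60 as a NAK follows the RoCEv2 and InfiniBand header definitions and is stated here as an assumption; the function and helper names are illustrative.

```python
import struct

ROCEV2_UDP_PORT = 4791
UDP_HEADER_LEN = 8
OPCODE_CNP = 0x81      # matched at l4_header 8 / l5_header 0 in the first rule set
OPCODE_ACK = 0x11      # matched at the same offset in the second rule set
SYNDROME_NAK = 0x60    # matched at l4_header 20 / l5_header 12


def classify_rocev2(udp_datagram: bytes) -> str:
    """Classify a UDP datagram (header + payload) as CNP, NAK, or something else."""
    if len(udp_datagram) <= UDP_HEADER_LEN:
        return "other"
    (dst_port,) = struct.unpack("!H", udp_datagram[2:4])
    if dst_port != ROCEV2_UDP_PORT:
        return "other"
    l5 = udp_datagram[UDP_HEADER_LEN:]
    if l5[0] == OPCODE_CNP:
        return "CNP"
    if l5[0] == OPCODE_ACK and len(l5) > 12 and l5[12] == SYNDROME_NAK:
        return "NAK"
    return "other RoCEv2"


def udp(dst_port: int, payload: bytes) -> bytes:
    """Build a minimal UDP datagram for testing (checksum left at 0)."""
    return struct.pack("!HHHH", 49152, dst_port, UDP_HEADER_LEN + len(payload), 0) + payload


cnp = udp(4791, bytes([0x81]) + bytes(11))
nak = udp(4791, bytes([0x11]) + bytes(11) + bytes([0x60]) + bytes(3))
print(classify_rocev2(cnp), classify_rocev2(nak))
```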

As a final step, you can visualize the RDMA session by attaching the expert-level extended list to the appropriate ERSPAN session.

Final thoughts

As data center networks grow larger, network traffic grows more complex, and operation and maintenance requirements grow more sophisticated, ERSPAN has become one of the indispensable tools.

As O&M automation advances, technologies such as NETCONF, RESTCONF, and gRPC have become popular among engineers working on automated network O&M. Using gRPC as the underlying protocol for sending back mirrored traffic also has many advantages. For example, because it is based on HTTP/2, it supports streaming push over a single connection. With ProtoBuf encoding, messages can be roughly half the size of their JSON equivalents, making data transmission faster and more efficient. Just imagine: if you use ERSPAN to mirror the streams of interest and then send them to the analysis server over gRPC, will it not greatly improve the capability and efficiency of automated network operation and maintenance?


Post time: May-10-2022