
Best Practices for EVPN MC-LAG (unnumbered BGP)

This document introduces the recommended baseline solution, configuration guide, and maintenance guide in detail for data center series switches in EVPN MC-LAG scenarios, to achieve VXLAN Anycast Gateway functionality under leaf-spine infrastructure.

This manual is intended for project planning, design and implementation personnel. They are expected to:

  • Be familiar with Asterfusion data center series switches.
  • Be familiar with EVPN MC-LAG fundamentals.

EVPN MC-LAG is a dual-homed, dual-active VXLAN distributed gateway solution widely adopted in data center environments. By combining the advantages of EVPN and MC-LAG, it provides high reliability in access scenarios, including load balancing, fault tolerance, and independent node upgrades.

The two Leaf nodes connected to the same device are called MC-LAG peers. One peer takes the active role and the other the standby role: the node with the smaller local IP address is designated the active endpoint, and the node with the larger local IP address serves as the standby endpoint. The distinction between active and standby applies only to the control plane. In the forwarding plane, the MC-LAG peers have equal status and each determines traffic forwarding paths from its own local forwarding decisions.
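
The role rule above can be sketched in a few lines of Python. This is only an illustration of the comparison, not the switch implementation; the function name `elect_roles` is hypothetical, and the addresses are the VLAN 4094 addresses used later in this document.

```python
import ipaddress

def elect_roles(ip_a: str, ip_b: str) -> dict:
    """Return {local_ip: role}; the numerically smaller local IP becomes active."""
    a = ipaddress.ip_address(ip_a)
    b = ipaddress.ip_address(ip_b)
    if a == b:
        raise ValueError("MC-LAG peers must use different local IP addresses")
    # Compare as addresses, not strings, so 10.0.0.9 sorts before 10.0.0.10.
    active, standby = (ip_a, ip_b) if a < b else (ip_b, ip_a)
    return {active: "active", standby: "standby"}

print(elect_roles("40.94.0.2", "40.94.0.1"))
```

The comparison is numeric rather than textual, which matters when the last octets have different digit counts.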

ICCP (Inter-Chassis Communication Protocol), defined in RFC 7275, is used in this scheme to establish connections between MC-LAG peers, synchronize interface statuses and table entries, and negotiate roles.

The keep-alive link is a heartbeat detection link, typically implemented as a direct Layer 3 connection between MC-LAG peers, over which heartbeat packets are sent periodically. It is used for transmitting ICCP control protocol packets, synchronizing table information, establishing MC-LAG peer relationships, and performing configuration consistency checks.

The peer-link is a direct physical link between MC-LAG peers, used to forward traffic when a downstream link fails. It is generally recommended that the peer-link and the keep-alive link share the same link, for operational efficiency and resource optimization.

The DAD (Dual Active Detection) link is a Layer 3 link used by MC-LAG peers to send dual-active detection packets. It applies to scenarios in which the peer-link and the keep-alive link are shared. When the keep-alive link is detected as disconnected, the system automatically shuts down all interfaces on the standby node except logical interfaces, management ports, and peer-link interfaces.
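
The shutdown rule described above can be expressed as a small filter. The sketch below only restates that rule; the `Interface` class, the `kind` labels, and the function name are assumptions made for illustration and do not correspond to any switch API.

```python
from dataclasses import dataclass

@dataclass
class Interface:
    name: str
    kind: str  # assumed labels: "physical", "logical", "management", "peer-link"

# Interface kinds that stay up on the standby node, per the text above.
KEPT_UP = {"logical", "management", "peer-link"}

def interfaces_to_shut(role: str, keepalive_ok: bool, interfaces: list) -> list:
    """Names of interfaces shut down after the keep-alive link is lost."""
    if keepalive_ok or role != "standby":
        return []  # only the standby node reacts, and only on keep-alive loss
    return [i.name for i in interfaces if i.kind not in KEPT_UP]

ports = [
    Interface("ethernet 0/0", "physical"),
    Interface("ethernet 0/48", "physical"),
    Interface("loopback 0", "logical"),
    Interface("link-aggregation 9999", "peer-link"),
]
print(interfaces_to_shut("standby", False, ports))
```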

The port-channels on the MC-LAG peers that connect to the same server or host are called member lags.

MC-LAG can operate in two modes: dual-active and active-standby. In dual-active mode, both the active and the standby node forward traffic, sharing load across the member lags. In active-standby mode, only the active node normally forwards traffic; the standby node blocks traffic until a failover occurs. The current MC-LAG solution operates in dual-active mode only; active-standby mode is not supported.

A VTEP (VXLAN Tunnel Endpoint) is an edge node in a VXLAN network, responsible for encapsulating and decapsulating VXLAN packets and communicating with the underlying physical network. In a distributed gateway architecture, the VTEP also serves as a gateway, providing layer 3 routing functionality for traffic within the VXLAN overlay.
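
To make the encapsulation step concrete, the sketch below builds and parses the 8-byte VXLAN header defined in RFC 7348 (an 8-bit flags field with the VNI-valid bit 0x08, 24 reserved bits, a 24-bit VNI, and 8 reserved bits). It covers only the VXLAN header, not the outer Ethernet/IP/UDP headers a real VTEP also adds, and the function names are hypothetical.

```python
import struct

VXLAN_FLAG_VNI_VALID = 0x08  # "I" flag: the VNI field is valid

def vxlan_header(vni: int) -> bytes:
    """Build the 8-byte VXLAN header for a 24-bit VNI."""
    if not 0 <= vni < 2 ** 24:
        raise ValueError("VNI must fit in 24 bits")
    # Word 1: flags (8 bits) + reserved (24 bits); word 2: VNI (24 bits) + reserved (8 bits)
    return struct.pack("!II", VXLAN_FLAG_VNI_VALID << 24, vni << 8)

def encapsulate(inner_frame: bytes, vni: int) -> bytes:
    """Prepend the VXLAN header to an inner Ethernet frame."""
    return vxlan_header(vni) + inner_frame

def decapsulate(payload: bytes) -> tuple:
    """Return (vni, inner_frame) from a VXLAN payload."""
    flags_word, vni_word = struct.unpack("!II", payload[:8])
    if not (flags_word >> 24) & VXLAN_FLAG_VNI_VALID:
        raise ValueError("VNI-valid flag is not set")
    return vni_word >> 8, payload[8:]

print(vxlan_header(100).hex())
```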

The VXLAN anycast gateway is an architecture used to achieve more efficient and flexible routing and bridging functions in a VXLAN network. By distributing gateway functionality across multiple VTEP nodes, this approach allows each VTEP to perform layer 2/layer 3 forwarding independently, thereby enhancing overall performance and resiliency.

A typical EVPN MC-LAG network is shown in the figure below.

Description: Service servers are dual-connected to Server Leaf nodes, which set up EVPN MC-LAG as layer 3 anycast gateways. Spine nodes are connected to Leaf nodes, running routing protocols for underlay reachability. The following table shows the recommended deployment approach for this solution.

Table 1 Recommended deployment

Items and recommended deployment approaches:

Interconnection of Leaf and Spine
- BGP is the most commonly used routing protocol in data center topologies. We recommend BGP for underlay route reachability. To separate underlay and overlay routes, we recommend using the interconnect ports to establish underlay BGP neighbors and the Loopback0 IP to establish overlay BGP neighbors.
- It is recommended to use high-speed interfaces for interconnection between Leaf and Spine. You can use physical Layer 3 ports, or bundle them into a link aggregation group to increase uplink bandwidth.
- It is recommended that Spine nodes are configured with the same ASN to minimize unnecessary route exchanges, thus reducing resource usage.
- A pair of MC-LAG peers must be configured with the same ASN.

Deploy peer-link and keep-alive link
- It is recommended to use high-speed interfaces for interconnection and to configure static aggregation links as the peer-link on the MC-LAG peers.
- It is recommended to share the peer-link with the keep-alive link.
- The peer-link shall be configured in tagged mode and added to both the dedicated VLAN for the keep-alive link and the service VLANs for hosts. The IP address of the dedicated VLAN interface (VLANIF) is used as the communication endpoint for the ICCP connection.

Deploy DAD link
- It is recommended to use an independent low-speed interface as the DAD link. The solution does not support using the management port as the DAD link.
- It is recommended to configure the keep-alive link first and confirm that the MC-LAG keep-alive status is active before configuring the DAD link. This prevents node ports from being placed in error-down incorrectly because dual-active detection was activated prematurely.
- The DAD link shall not share a physical or logical link with the peer-link or the keep-alive link.

Deploy MC-LAG member lags
- It is recommended to use low-speed interfaces for MC-LAG member lags. Multiple members can be configured.
- For reliability, it is recommended to use LACP dynamic aggregation. You can enable fast-rate (short timeout) on both sides to speed up fault convergence.

Servers/switches connected to Leaf nodes
- On the server side, it is recommended to configure the bond in mode 4 (802.3ad, load sharing).
- If the servers are installed via PXE, LACP fallback must be enabled on the port-channel of one of the connected Leaf nodes.

Layer 3 VXLAN anycast gateway
- When deploying Layer 3 VXLAN gateways on Leaf nodes, all nodes should be configured with the same IP and MAC addresses on the VLANs shared between them.
- For a pair of MC-LAG peers, the MAC addresses configured on VRFs (Virtual Routing and Forwarding instances) bound to the same Layer 3 VNI (L3VNI) shall be identical.
- It is recommended to configure the Loopback1 IP as the local EVPN VTEP IP. For a pair of MC-LAG peers, the VTEP IP shall be identical.
- This solution also supports Layer 3 connectivity with access nodes when required. Routing protocols between Leaf peers and access nodes are established by enabling the Unique-IP feature on the Leaf nodes and configuring different primary IP and MAC addresses for the VLANs as required.

Load Balance
- In a data center network, each node usually has a different BGP ASN, so routes received from different nodes carry different AS paths. It is therefore necessary to enable multipath on the nodes that receive these routes to load-balance service traffic.

Fault Tolerance
- It is recommended to configure monitor link groups on the Leaf switches to guarantee fast failover in case of uplink or node failures.
- It is recommended to configure startup-delay on the uplink interfaces of MC-LAG nodes. This keeps the uplink interfaces down until MC-LAG information synchronization is complete, minimizing packet loss during traffic switchback after a node reboot or failure recovery.
- It is recommended to configure BGP graceful restart (GR) and BGP max-med on-startup on Leaf and Spine switches as needed, to ensure fast failover of routing protocols in case of node-level failures.

A commonly used EVPN MC-LAG network is shown in Figure 2, with a VXLAN distributed gateway deployment and service servers dual-homed to the Server Leaf nodes.

The IP address planning for each interface is shown in the following table.

Table 2 IP address planning for interfaces

| Node | Interface | IP address | Node | Interface | IP address |
| Spine1 | Loopback 0 | 172.16.1.165/32 | Spine2 | Loopback 0 | 172.16.1.167/32 |
| Leaf1 | Ethernet 0/47 | 40.95.0.1/30 | Leaf2 | Ethernet 0/47 | 40.95.0.2/30 |
| | Loopback 0 | 172.16.1.179/32 | | Loopback 0 | 172.16.1.166/32 |
| | Loopback 1 | 172.16.2.179/32 | | Loopback 1 | 172.16.2.179/32 |
| | Vlan10 | 10.10.0.1/24 | | Vlan10 | 10.10.0.1/24 |
| | Vlan20 | 10.20.0.1/24 | | Vlan20 | 10.20.0.1/24 |
| | Vlan30 | 10.30.0.179/24 (Primary IP), 10.30.0.1/24 (Secondary IP) | | Vlan30 | 10.30.0.166/24 (Primary IP), 10.30.0.1/24 (Secondary IP) |
| | Vlan4094 | 40.94.0.1/30 | | Vlan4094 | 40.94.0.2/30 |
| Leaf3 | Ethernet 0/47 | 40.95.0.1/30 | Leaf4 | Ethernet 0/47 | 40.95.0.2/30 |
| | Loopback 0 | 172.16.1.170/32 | | Loopback 0 | 172.16.1.162/32 |
| | Loopback 1 | 172.16.2.170/32 | | Loopback 1 | 172.16.2.170/32 |
| | Vlan10 | 10.10.0.1/24 | | Vlan10 | 10.10.0.1/24 |
| | Vlan20 | 10.20.0.1/24 | | Vlan20 | 10.20.0.1/24 |
| | Vlan30 | 10.30.0.170/24 (Primary IP), 10.30.0.1/24 (Secondary IP) | | Vlan30 | 10.30.0.162/24 (Primary IP), 10.30.0.1/24 (Secondary IP) |
| | Vlan4094 | 40.94.0.1/30 | | Vlan4094 | 40.94.0.2/30 |
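
The addressing plan follows two rules for an MC-LAG pair: the anycast gateway VLANs and the Loopback 1 VTEP address are identical on both peers, while Loopback 0 is unique per node. A minimal Python sketch of that check, using the Leaf1/Leaf2 values from Table 2, might look as follows; the `PLAN` structure and `check_pair` function are hypothetical helpers, not part of any vendor tooling.

```python
# Leaf1/Leaf2 values copied from Table 2.
PLAN = {
    "Leaf1": {"Loopback 0": "172.16.1.179/32", "Loopback 1": "172.16.2.179/32",
              "Vlan10": "10.10.0.1/24", "Vlan20": "10.20.0.1/24"},
    "Leaf2": {"Loopback 0": "172.16.1.166/32", "Loopback 1": "172.16.2.179/32",
              "Vlan10": "10.10.0.1/24", "Vlan20": "10.20.0.1/24"},
}
SHARED = ("Loopback 1", "Vlan10", "Vlan20")  # must be identical on both peers
UNIQUE = ("Loopback 0",)                     # must differ between peers

def check_pair(a: dict, b: dict) -> list:
    """Return a list of human-readable problems; empty means the plan is consistent."""
    problems = []
    for intf in SHARED:
        if a[intf] != b[intf]:
            problems.append(f"{intf}: {a[intf]} != {b[intf]} (must match)")
    for intf in UNIQUE:
        if a[intf] == b[intf]:
            problems.append(f"{intf}: both peers use {a[intf]} (must differ)")
    return problems

print(check_pair(PLAN["Leaf1"], PLAN["Leaf2"]))
```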

The configuration baseline of this document only involves the networking devices in the Spine-Leaf part of Figure 2. Other networking equipment configuration is skipped.

Table 3 Configuration Overview

Configure Spine nodes:

  • Configure interconnect interfaces and loopback
  • Configure underlay BGP neighbors
  • Configure overlay BGP neighbors

Configure Leaf nodes:

  • Configure interconnect interfaces and loopback
  • Configure underlay BGP neighbors
  • Configure overlay BGP neighbors
  • Configure MC-LAG
  • Configure GW and VRF instances
  • Configure downstream cross-device aggregation groups
  • Configure EVPN and VXLAN mapping
  • Configure monitor link group
  • (Optional) Configure for layer 3 service connectivity with user-side
  • (Optional) Optimize ARP parameters
  • (Optional) Configure cross-VRF communication

The detailed configuration procedures below use Leaf1, Leaf2, Spine1 and Spine2 as examples. The configurations of Leaf3 and Leaf4 are similar to those of Leaf1 and Leaf2.

Configure interconnect interfaces and loopback

Table 4 Configure interconnect interfaces and loopback on Spine nodes

Configure the IP addresses of the interfaces connected to Leaf nodes according to Table 2.

Spine1 and Spine2 (identical configuration):
interface ethernet 0/0
description to_Leaf1
ipv6 use-link-local
!
interface ethernet 0/4
description to_Leaf2
ipv6 use-link-local
!

Configure the IP address of Loopback 0, which will be used as the router-ID as well as the source interface for establishing BGP neighbors.

Spine1:
interface loopback 0
ip address 172.16.1.165/32
!

Spine2:
interface loopback 0
ip address 172.16.1.167/32
!

Underlay Layer 3 reachability between Leaf and Spine nodes is achieved by running eBGP on the interconnecting physical interfaces, while the overlay BGP sessions are established between Loopback0 addresses. Create two BGP peer groups on all Spine nodes.

  • PEER_to_Leaf: used to establish underlay BGP neighbors with Leaf nodes.
  • PEER_to_Leaf_EVPN: used to establish overlay BGP neighbors with Leaf nodes.

Table 5 Configure underlay BGP neighbors on Spine nodes

Create a route-map and an IP prefix list to advertise the Loopback addresses.

Spine1 and Spine2 (identical configuration):
route-map advertise_loopback permit 10
match ip address prefix-list loopback
exit
!
ip prefix-list loopback seq 10 permit 172.16.1.0/24 ge 24
ip prefix-list loopback seq 20 permit 172.16.2.0/24 ge 24
!

Configure the BGP ASN and the router-ID. Enable BGP GR, max-med on-startup, and multipath.

Spine1:
router bgp 65165
bgp router-id 172.16.1.165
no bgp ebgp-requires-policy
bgp bestpath as-path multipath-relax
bgp max-med on-startup 120
bgp graceful-restart

Spine2:
router bgp 65165
bgp router-id 172.16.1.167
no bgp ebgp-requires-policy
bgp bestpath as-path multipath-relax
bgp max-med on-startup 120
bgp graceful-restart

Create a peer-group named PEER_to_Leaf and enable BFD.

Spine1 and Spine2 (identical configuration):
neighbor PEER_to_Leaf peer-group
neighbor PEER_to_Leaf remote-as external
neighbor PEER_to_Leaf bfd
neighbor ethernet 0/0 interface peer-group PEER_to_Leaf
neighbor ethernet 0/4 interface peer-group PEER_to_Leaf
neighbor ethernet 0/8 interface peer-group PEER_to_Leaf
neighbor ethernet 0/12 interface peer-group PEER_to_Leaf
!

Advertise Loopback IPs.

Spine1 and Spine2 (identical configuration):
address-family ipv4 unicast
redistribute connected route-map advertise_loopback
exit-address-family
!
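
The prefix list `loopback` configured above decides which connected routes are redistributed into BGP. The sketch below models how such entries match, assuming the FRR/Cisco-style semantics in which `ge 24` without `le` accepts any prefix inside the entry whose length is between 24 and 32. The helper names are hypothetical, and this is a model of the matching rule rather than the switch code.

```python
import ipaddress

def entry_permits(route: str, entry: str, ge=None, le=None) -> bool:
    """Model one IPv4 prefix-list entry with optional ge/le bounds."""
    net = ipaddress.ip_network(route)
    base = ipaddress.ip_network(entry)
    if ge is None and le is None:
        return net == base                       # no bounds: exact match only
    low = ge if ge is not None else base.prefixlen
    high = le if le is not None else 32          # "ge" alone extends up to /32
    return net.subnet_of(base) and low <= net.prefixlen <= high

# The two entries of prefix-list "loopback" from the configuration above.
LOOPBACK_LIST = [("172.16.1.0/24", 24), ("172.16.2.0/24", 24)]

def advertised(route: str) -> bool:
    """True if any entry of the list permits the route."""
    return any(entry_permits(route, entry, ge=ge) for entry, ge in LOOPBACK_LIST)

for r in ("172.16.1.165/32", "172.16.2.179/32", "10.10.0.0/24"):
    print(r, advertised(r))
```

Under these assumptions the /32 Loopback 0 and Loopback 1 addresses match, while service subnets and the keep-alive or DAD /30 networks do not.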

Table 6 Configure overlay BGP neighbors on Spine nodes

Create a peer-group named PEER_to_Leaf_EVPN. Enable the BGP EVPN address-family and specify Loopback 0 as the source interface.

Spine1:
router bgp 65165
neighbor PEER_to_Leaf_EVPN peer-group
neighbor PEER_to_Leaf_EVPN remote-as external
neighbor PEER_to_Leaf_EVPN ebgp-multihop 5
neighbor PEER_to_Leaf_EVPN update-source 172.16.1.165
neighbor 172.16.1.179 peer-group PEER_to_Leaf_EVPN
neighbor 172.16.1.166 peer-group PEER_to_Leaf_EVPN
neighbor 172.16.1.170 peer-group PEER_to_Leaf_EVPN
neighbor 172.16.1.135 peer-group PEER_to_Leaf_EVPN
!
address-family ipv4 unicast
no neighbor PEER_to_Leaf_EVPN activate
exit-address-family
!
address-family l2vpn evpn
neighbor PEER_to_Leaf_EVPN activate
advertise-all-vni
exit-address-family
exit
!

Spine2:
router bgp 65165
neighbor PEER_to_Leaf_EVPN peer-group
neighbor PEER_to_Leaf_EVPN remote-as external
neighbor PEER_to_Leaf_EVPN ebgp-multihop 5
neighbor PEER_to_Leaf_EVPN update-source 172.16.1.167
neighbor 172.16.1.179 peer-group PEER_to_Leaf_EVPN
neighbor 172.16.1.166 peer-group PEER_to_Leaf_EVPN
neighbor 172.16.1.170 peer-group PEER_to_Leaf_EVPN
neighbor 172.16.1.135 peer-group PEER_to_Leaf_EVPN
!
address-family ipv4 unicast
no neighbor PEER_to_Leaf_EVPN activate
exit-address-family
!
address-family l2vpn evpn
neighbor PEER_to_Leaf_EVPN activate
advertise-all-vni
exit-address-family
exit
!

Configure interconnect interfaces and loopback

Table 7 Configure interconnect interfaces and loopback on Leaf nodes

Configure the IP address of the interface interconnected with Spine.

Leaf1 and Leaf2 (identical configuration):
interface ethernet 0/48
description to_Spine1
ipv6 use-link-local
!
interface ethernet 0/52
description to_Spine2
ipv6 use-link-local
!

Configure the IP address of Loopback0 as the router-ID as well as the source interface for establishing BGP neighbors.

Leaf1:
interface loopback 0
ip address 172.16.1.179/32
!

Leaf2:
interface loopback 0
ip address 172.16.1.166/32
!

Configure the IP address of Loopback1 as the local VTEP IP.

Leaf1 and Leaf2 (identical configuration):
interface loopback 1
ip address 172.16.2.179/32
!

Create two BGP peer groups on all Leaf nodes.

  • PEER_to_Spine: used to establish underlay BGP neighbors with Spine nodes.
  • PEER_to_Spine_EVPN: used to establish overlay BGP neighbors with Spine nodes.

Table 8 Configure underlay BGP neighbors on Leaf nodes

Create a route-map and an IP prefix list to advertise the Loopback addresses.

Leaf1 and Leaf2 (identical configuration):
route-map advertise_loopback permit 10
match ip address prefix-list loopback
exit
!
ip prefix-list loopback seq 10 permit 172.16.1.0/24 ge 24
ip prefix-list loopback seq 20 permit 172.16.2.0/24 ge 24
!

Configure the BGP ASN and router-ID. Enable BGP GR, max-med on-startup, and multipath.

Leaf1:
router bgp 65100
bgp router-id 172.16.1.179
no bgp ebgp-requires-policy
bgp bestpath as-path multipath-relax
bgp max-med on-startup 120
bgp graceful-restart

Leaf2:
router bgp 65100
bgp router-id 172.16.1.166
no bgp ebgp-requires-policy
bgp bestpath as-path multipath-relax
bgp max-med on-startup 120
bgp graceful-restart

Create a peer-group named PEER_to_Spine and enable BFD.

Leaf1 and Leaf2 (identical configuration):
neighbor PEER_to_Spine peer-group
neighbor PEER_to_Spine remote-as external
neighbor PEER_to_Spine bfd
neighbor ethernet 0/48 interface peer-group PEER_to_Spine
neighbor ethernet 0/52 interface peer-group PEER_to_Spine
!

Advertise Loopback IPs.

Leaf1 and Leaf2 (identical configuration):
address-family ipv4 unicast
redistribute connected route-map advertise_loopback
neighbor PEER_to_Spine route-map advertise_loopback out
exit-address-family
exit
!

After finishing the configurations above, you can check the underlay BGP status with the show ip bgp summary command:

mclag-leaf-1# show ip bgp summary
IPv4 Unicast Summary (VRF default):
BGP router identifier 172.16.1.179, local AS number 65100 vrf-id 0
BGP table version 9
RIB entries 13, using 2392 bytes of memory
Peers 2, using 1447 KiB of memory
Peer groups 2, using 128 bytes of memory
Neighbor V AS MsgRcvd MsgSent TblVer InQ OutQ Up/Down State/PfxRcd PfxSnt Desc
ethernet 0/48 4 65165 133 133 0 0 0 01:58:22 4 7 N/A
ethernet 0/52 4 65165 131 133 0 0 0 01:58:11 4 7 N/A
Total number of neighbors 2

Pay attention to two fields in the output.

  1. The field “State/PfxRcd” shows the state of the BGP session. If the state is Idle/Connect/Active, it means that the BGP session establishment is abnormal. If it is displayed as a number, it means that the BGP session has established successfully, and the number is the count of route prefixes received from BGP peer.
  2. The field “Up/Down” shows the time duration for which the BGP session has been in the current state.
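
The two fields described above can also be checked programmatically. The following Python sketch parses the neighbor rows of the sample output shown earlier. It assumes the column layout of that sample (eleven columns after the neighbor name, each a single token), which is why the neighbor name, which may contain a space, is taken from the left-over leading tokens. The function name is hypothetical.

```python
SAMPLE = """\
Neighbor V AS MsgRcvd MsgSent TblVer InQ OutQ Up/Down State/PfxRcd PfxSnt Desc
ethernet 0/48 4 65165 133 133 0 0 0 01:58:22 4 7 N/A
ethernet 0/52 4 65165 131 133 0 0 0 01:58:11 4 7 N/A
Total number of neighbors 2
"""

def parse_neighbors(text: str) -> list:
    """Extract neighbor, Up/Down and State/PfxRcd from a BGP summary."""
    rows, in_table = [], False
    for line in text.splitlines():
        if line.startswith("Neighbor"):
            in_table = True          # rows start after the header line
            continue
        if not in_table or not line.strip() or line.startswith("Total"):
            continue
        tokens = line.split()
        state = tokens[-3]           # State/PfxRcd column
        rows.append({
            "neighbor": " ".join(tokens[:-11]),
            "up_down": tokens[-4],
            # A number means the session is established and counts received prefixes.
            "established": state.isdigit(),
            "prefixes_received": int(state) if state.isdigit() else None,
        })
    return rows

for row in parse_neighbors(SAMPLE):
    print(row)
```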

Table 9 Configure overlay BGP neighbors on Leaf nodes

Create a peer-group named PEER_to_Spine_EVPN. Enable the BGP EVPN address-family and specify Loopback 0 as the source interface.

Leaf1:
router bgp 65100
neighbor PEER_to_Spine_EVPN peer-group
neighbor PEER_to_Spine_EVPN remote-as external
neighbor PEER_to_Spine_EVPN ebgp-multihop 5
neighbor PEER_to_Spine_EVPN update-source 172.16.1.179
neighbor 172.16.1.165 peer-group PEER_to_Spine_EVPN
neighbor 172.16.1.167 peer-group PEER_to_Spine_EVPN
!
address-family ipv4 unicast
no neighbor PEER_to_Spine_EVPN activate
exit-address-family
!
address-family l2vpn evpn
neighbor PEER_to_Spine_EVPN activate
advertise-all-vni
exit-address-family
exit
!

Leaf2:
router bgp 65100
neighbor PEER_to_Spine_EVPN peer-group
neighbor PEER_to_Spine_EVPN remote-as external
neighbor PEER_to_Spine_EVPN ebgp-multihop 5
neighbor PEER_to_Spine_EVPN update-source 172.16.1.166
neighbor 172.16.1.165 peer-group PEER_to_Spine_EVPN
neighbor 172.16.1.167 peer-group PEER_to_Spine_EVPN
!
address-family ipv4 unicast
no neighbor PEER_to_Spine_EVPN activate
exit-address-family
!
address-family l2vpn evpn
neighbor PEER_to_Spine_EVPN activate
advertise-all-vni
exit-address-family
exit
!

After finishing the configurations above, you can check the overlay BGP status with the show bgp l2vpn evpn summary command.

mclag-leaf-1# show bgp l2vpn evpn summary
BGP router identifier 172.16.1.179, local AS number 65100 vrf-id 0
BGP table version 0
RIB entries 151, using 27 KiB of memory
Peers 2, using 1447 KiB of memory
Peer groups 1, using 64 bytes of memory
Neighbor V AS MsgRcvd MsgSent TblVer InQ OutQ Up/Down State/PfxRcd PfxSnt Desc
172.16.1.165 4 65165 196 328 0 0 0 00:05:52 2153 2181 N/A
172.16.1.167 4 65165 136 332 0 0 0 00:05:53 2087 2181 N/A
Total number of neighbors 2

Table 10 Configure MC-LAG on Leaf nodes

Configure the VLANIF for establishing MC-LAG.

Leaf1:
vlan 4094
!
interface vlan 4094
ip address 40.94.0.1/30
exit
!

Leaf2:
vlan 4094
!
interface vlan 4094
ip address 40.94.0.2/30
exit
!

Configure the peer-link and add it to the VLAN.

Leaf1 and Leaf2 (identical configuration):
interface link-aggregation 9999
switchport trunk vlan 4094
exit
!
interface ethernet 0/72
link-aggregation-group 9999
storm-suppress broadcast packets 1000
storm-suppress multicast packets 1000
storm-suppress unknown-unicast packets 1000
exit
!
interface ethernet 0/76
link-aggregation-group 9999
storm-suppress broadcast packets 1000
storm-suppress multicast packets 1000
storm-suppress unknown-unicast packets 1000
exit
!

Configure an MC-LAG domain.

Leaf1:
mclag domain 1
local-address 40.94.0.1
peer-address 40.94.0.2
peer-link link-aggregation 9999
commit
!

Leaf2:
mclag domain 1
local-address 40.94.0.2
peer-address 40.94.0.1
peer-link link-aggregation 9999
commit
!

Configure the DAD link.

Leaf1:
interface ethernet 0/47
ip address 40.95.0.1/30
exit
!
mclag domain 1
dad local-address 40.95.0.1
dad peer-address 40.95.0.2
commit
!

Leaf2:
interface ethernet 0/47
ip address 40.95.0.2/30
exit
!
mclag domain 1
dad local-address 40.95.0.2
dad peer-address 40.95.0.1
commit
!

After completing the above configuration, you can check the MC-LAG state by using the show mclag state command.

mclag-leaf-1# show mclag state
The MCLAG's keepalive is: OK
MCLAG info sync is: completed
Domain id: 1
MCLAG session Channel: Primary channel
VRF Name: default
consistency Check Action: idle
Local Ip: 40.94.0.1
Peer Ip: 40.94.0.2
Dad Local Ip: 40.95.0.1
Dad Peer Ip: 40.95.0.2
Peer Link Interface: lag 9999
Keepalive time: 1
Dad Detection Delay: 15
Dad Recovery Delay Mlag Intf: 60
Dad Recovery Delay Non Mlag Intf: 0
Dad VRF Name: default
Dad Status: dual-active
session Timeout : 15
Peer Link Mac: 60:eb:5a:01:10:b1
System Mac: 60:eb:5a:01:10:b1
Peer Mac: 60:eb:5a:01:10:c1
Admin Role: None
Role: Active
MCLAG Interface:
Loglevel: NOTICE

Focus on the following fields:

  1. The MCLAG's keepalive is: shows the current MC-LAG status. ERROR means that MC-LAG establishment failed; OK means that MC-LAG was established successfully.
  2. Role: the role of the current device in MC-LAG, either Active or Standby.
  3. Dad Status: the DAD status. single-active means that DAD establishment failed; dual-active means that DAD was established successfully, after which the Standby device's ports are placed in error-down when the MC-LAG status becomes abnormal.
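
The three checks above lend themselves to a simple script. The sketch below parses "key: value" lines as they appear in the sample show mclag state output and reports which of the three fields look unhealthy. The function names are hypothetical, and the expected values are taken only from the sample output and the descriptions in this section.

```python
SAMPLE = """\
The MCLAG's keepalive is: OK
MCLAG info sync is: completed
Peer Link Mac: 60:eb:5a:01:10:b1
Dad Status: dual-active
session Timeout : 15
Role: Active
MCLAG Interface:
"""

def parse_mclag_state(text: str) -> dict:
    """Split each line at the first colon so MAC addresses stay intact."""
    state = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            state[key.strip()] = value.strip()
    return state

def health_issues(state: dict) -> list:
    """Return descriptions of fields that deviate from the healthy values."""
    issues = []
    if state.get("The MCLAG's keepalive is") != "OK":
        issues.append("keepalive is not OK")
    if state.get("Role") not in ("Active", "Standby"):
        issues.append("role has not been negotiated")
    if state.get("Dad Status") != "dual-active":
        issues.append("DAD is not established")
    return issues

print(health_issues(parse_mclag_state(SAMPLE)))
```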

Table 11 Configure GW and VRF instances on Leaf nodes

Create VLANs for service traffic forwarding.

Leaf1 and Leaf2 (identical configuration):
vlan 10
!
vlan 20
!

Disable forwarding for ARP packets.

Leaf1 and Leaf2 (identical configuration):
arp broadcast disable

Create VRF instances. The MAC addresses of the VRFs sharing the L3-VNI of the MC-LAG peer shall be the same. The VRFs cannot be the same between different VTEPs.

Leaf1 and Leaf2 (identical configuration):
vrf 10123
mac 00:00:00:01:23:00
exit-vrf

Configure the layer 3 GW and set up the ARP proxy in EVPN mode. The IP and MAC of the shared VLANIF on the MC-LAG peers are required to be the same.

Leaf1 and Leaf2 (identical configuration):
interface vlan 10
mac-address 00:00:00:10:00:00
vrf 10123
ip address 10.10.0.1/24
arp proxy mode evpn
!
interface vlan 20
mac-address 00:00:00:20:00:00
vrf 10123
ip address 10.20.0.1/24
arp proxy mode evpn
!

(Optional) For silent hosts, it is required to enable the ARP proxy extension feature.

Leaf1 and Leaf2 (identical configuration):
interface vlan 10
arp proxy extend reply
arp proxy extend request
!

Add the peer-link to the service VLANs.

Leaf1 and Leaf2 (identical configuration):
interface link-aggregation 9999
switchport trunk vlan 10
switchport trunk vlan 20
!

Configure downstream cross-device aggregation groups

Table 12 Configure downstream cross-device aggregation groups on Leaf nodes

Create dynamic aggregation groups and enable fast-rate.

Leaf1 and Leaf2 (identical configuration):
interface link-aggregation 100
lacp fast-rate
commit
!
interface link-aggregation 101
lacp fast-rate
commit
!

Add the member lags to the MC-LAG domain.

Leaf1 and Leaf2 (identical configuration):
mclag domain 1
member lag 100
member lag 101
!

(Optional) If the servers are installed via PXE, the fallback feature is required to be enabled on one of the Leaf nodes.

Leaf1:
interface link-aggregation 100
lacp fallback
commit
!
interface link-aggregation 101
lacp fallback
commit
!

Leaf2:
-

Add the aggregation groups to the service VLANs.

Leaf1 and Leaf2 (identical configuration):
interface link-aggregation 100
switchport trunk vlan 10
!
interface link-aggregation 101
switchport trunk vlan 20
!

Configure the member ports of the LAGs and enable storm suppression for BUM traffic.

Leaf1 and Leaf2 (identical configuration):
interface ethernet 0/0
link-aggregation-group 100
storm-suppress broadcast packets 1000
storm-suppress multicast packets 1000
storm-suppress unknown-unicast packets 1000
!
interface ethernet 0/1
link-aggregation-group 101
storm-suppress broadcast packets 1000
storm-suppress multicast packets 1000
storm-suppress unknown-unicast packets 1000
!

After finishing the configurations above, you can check the port-channel status with the show link-aggregation summary command.

mclag-leaf-1# show link-aggregation summary
Flags: A - active, I - inactive, Up - up, Dw - Down, N/A - not available,
S - selected, D - deselected, * - not synced
No. Team Dev Protocol Ports Description
----- --------------- --------------- --------------- -------------
0100 lag 100 LACP(A)(Up) 0/0 (S) N/A
0101 lag 101 LACP(A)(Up) 0/1 (S) N/A
9999 lag 9999 LACP(A)(Up) 0/72 (S) N/A
0/76 (S)

In the example above, “Up” means the current state of the port-channel is normal, and “S” means the member port is selected.

Table 13 Configure EVPN and VXLAN mapping on Leaf nodes

Configure the EVPN local VTEP IP. The VTEP IP of the MC-LAG peers shall be the same.

Leaf1 and Leaf2 (identical configuration):
interface vxlan 0
source 172.16.2.179
exit

Configure layer 2 VXLAN mapping.

Leaf1 and Leaf2 (identical configuration):
vlan 10
vni 100
!
vlan 20
vni 200
!

Configure layer 3 VXLAN mapping.

Leaf1 and Leaf2 (identical configuration):
vrf 10123
vni 10000
exit-vrf
!

After completing the configuration, you can check the VXLAN tunnels with the show vxlan tunnel command.

mclag-leaf-1# show vxlan tunnel
+--------------+-------+--------+-------+
| RemoteVTEP | VNI | VLAN | VRF |
+==============+=======+========+=======+
| 172.16.2.170 | 100 | 10 | |
+--------------+-------+--------+-------+
| 172.16.2.170 | 200 | 20 | |
+--------------+-------+--------+-------+
| 172.16.2.170 | 300 | 30 | |
+--------------+-------+--------+-------+
| 172.16.2.170 | 10000 | | 10123 |
+--------------+-------+--------+-------+
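
When many tunnels are present, it can help to separate Layer 2 VNIs (rows with a VLAN) from Layer 3 VNIs (rows with a VRF). The sketch below parses the bordered table format of the sample output above into dictionaries. It assumes the four-column layout shown in that sample, and the function name is hypothetical.

```python
SAMPLE = """\
+--------------+-------+--------+-------+
| RemoteVTEP | VNI | VLAN | VRF |
+==============+=======+========+=======+
| 172.16.2.170 | 100 | 10 | |
+--------------+-------+--------+-------+
| 172.16.2.170 | 10000 | | 10123 |
+--------------+-------+--------+-------+
"""

def parse_tunnels(text: str) -> list:
    """Turn each data row into a dict keyed by the header cells."""
    header, rows = None, []
    for line in text.splitlines():
        if not line.startswith("|"):
            continue                      # skip the +---+ border lines
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        if header is None:
            header = cells                # first pipe row is the header
            continue
        rows.append(dict(zip(header, cells)))
    return rows

tunnels = parse_tunnels(SAMPLE)
l2_vnis = [t["VNI"] for t in tunnels if t["VLAN"]]
l3_vnis = [t["VNI"] for t in tunnels if t["VRF"]]
print(l2_vnis, l3_vnis)
```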

The purpose of configuring a monitor link group is to safeguard the switchover in case of link or node failure and to reduce packet loss. When all uplink ports change from up to down, the downlink ports automatically go down; when some uplink ports are restored, the downlink ports wait for a delay period before coming back up.

The purpose of configuring interface startup delay is to reduce packet loss during traffic switchback after a node reboots or recovers from a failure. After the node reboots and recovers, its physical interfaces wait for a set period before coming up; the default startup delay for all interfaces is 150 seconds.

In large-scale terminal scenarios (more than 1K terminals in the entire network), the MC-LAG peers need to synchronize a large number of table entries after startup. If the uplink ports have recovered and traffic has switched back before table synchronization is complete, packet loss occurs. It is therefore recommended to configure a startup delay of 300 seconds or longer on the uplink ports and to keep the default delay of 150 seconds on the remaining interfaces, so that the peer-link and keep-alive link recover before the uplink ports and enough time is reserved for MC-LAG table synchronization.
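
The timing argument above can be made explicit with a short Python sketch. It uses the default delay of 150 seconds and the recommended uplink delay of 300 seconds from the text, and computes how long the peer-link is up before the first uplink recovers. The function names and the port lists are illustrative assumptions.

```python
DEFAULT_DELAY_S = 150  # default startup delay stated in the text
UPLINK_DELAY_S = 300   # recommended minimum for uplink ports

def startup_delays(interfaces, uplinks, uplink_delay=UPLINK_DELAY_S) -> dict:
    """Assign the longer delay to uplinks and the default to everything else."""
    return {name: (uplink_delay if name in uplinks else DEFAULT_DELAY_S)
            for name in interfaces}

def sync_window(delays: dict, uplinks, peer_link) -> int:
    """Seconds between the last peer-link port coming up and the first uplink coming up."""
    return min(delays[u] for u in uplinks) - max(delays[p] for p in peer_link)

def downlinks_forced_down(uplink_up: dict) -> bool:
    """Monitor link rule: downlinks go down only when every uplink is down."""
    return not any(uplink_up.values())

delays = startup_delays(["0/47", "0/48", "0/52", "0/72", "0/76"], {"0/48", "0/52"})
print(sync_window(delays, {"0/48", "0/52"}, {"0/72", "0/76"}))
```

With these values the MC-LAG peers have a 150-second window to synchronize table entries over the peer-link before traffic returns through the uplinks.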

Table 14 Configure Monitor Link on Leaf nodes

Create a monitor link group and specify the delay time.

Leaf1 and Leaf2 (identical configuration):
monitor-link-group aster 60
!

Specify the physical ports connected to the Spine nodes as uplink ports and configure an interface startup delay of 300 seconds.

Leaf1 and Leaf2 (identical configuration):
interface ethernet 0/48
monitor-link aster uplink
startup-delay 300
!
interface ethernet 0/52
monitor-link aster uplink
startup-delay 300
!

Specify the physical ports connected to servers as downlink ports.

Leaf1 and Leaf2 (identical configuration):
interface ethernet 0/0
monitor-link aster downlink
!
interface ethernet 0/1
monitor-link aster downlink
!

After completing the configuration, you can check the monitor link group with the show monitor-link command.

mclag-leaf-1# show monitor-link
+---------------+---------+----------------+------------------+-------------+------------+
| Group Name | Delay | Uplink Ports | Downlink Ports | LACP LAGs | Networks |
+===============+=========+================+==================+=============+============+
| aster | 60 | 0/48 | 0/0 | | |
| | | 0/52 | 0/1 | | |
+---------------+---------+----------------+------------------+-------------+------------+

After completing the above configuration, you can check the startup delay of each interface with the show interface startup_delay command.

mclag-leaf-1# show interface startup_delay
Port start up delay time
---------- ---------------------
0/47 150
0/48 300
0/52 300
0/56 150
0/60 150
0/64 150
0/68 150
0/72 150
0/76 150
……
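The recommendation above (uplink ports at 300 seconds or more, other ports at the default) can be checked offline against captured command output. The following is a minimal Python sketch, assuming the output has the two-column layout shown above; the function names and the way the text is captured are hypothetical and not part of the switch software.

```python
def parse_startup_delay(output):
    """Parse 'show interface startup_delay' text into {port: delay_seconds}."""
    delays = {}
    for line in output.splitlines():
        fields = line.split()
        # Data rows are assumed to look like "0/48 300"; headers, separator
        # lines and the trailing ellipsis are skipped.
        if len(fields) == 2 and "/" in fields[0] and fields[1].isdigit():
            delays[fields[0]] = int(fields[1])
    return delays


def uplinks_with_short_delay(delays, uplinks, minimum=300):
    """Return uplink ports whose delay is missing or below the recommended minimum."""
    return [port for port in uplinks if delays.get(port, 0) < minimum]


SAMPLE = """\
Port start up delay time
---------- ---------------------
0/47 150
0/48 300
0/52 300
0/56 150
"""

if __name__ == "__main__":
    delays = parse_startup_delay(SAMPLE)
    # An empty list means every listed uplink meets the 300-second recommendation.
    print(uplinks_with_short_delay(delays, ["0/48", "0/52"]))
```

The uplink port list is passed in explicitly because the startup delay output alone does not say which ports are uplinks; that information comes from show monitor-link.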

(Optional) Configure for layer 3 service connectivity with user-side


User-side devices usually attach to MC-LAG nodes with static routing. To support flexible, rapid deployment and fast service growth, this solution also allows dynamic routing protocol sessions between MC-LAG nodes and user-side devices. To do so, enable the Unique-IP function for the corresponding VLAN on both MC-LAG peers. Unique-IP on MC-LAG currently supports two modes, diff_mac and same_mac.

  • In diff_mac mode, the VLANIF of the shared VLAN is configured with a different IP address and a different MAC address on each MC-LAG peer.
  • In same_mac mode, the VLANIFs on the two Leaf peers are configured with different IP addresses and the same MAC address.

When a routing protocol is established between the MC-LAG peers over the peer-link, diff_mac mode is required. When a routing protocol is established between an MC-LAG device and an access-side device, both modes are acceptable, but same_mac mode is recommended. Below is a configuration example for establishing a BGP neighbor relationship between an MC-LAG device and the access-side device.

Table 15 (Optional) Configure for layer 3 service connectivity with user-side

Create a VLAN and enable Unique-IP. Configure different primary IP addresses for establishing routing protocols and the same secondary IP address as the VLAN gateway.

Leaf1:
vlan 30
!
interface vlan 30
arp proxy mode evpn
mac-address 00:00:00:30:00:00
vrf 10123
ip address 10.30.0.179/24
ip address 10.30.0.1/24 secondary
!
mclag domain 1
unique-ip vlan 30 same_mac
!
interface link-aggregation 9999
switchport trunk vlan 30
!

Leaf2:
vlan 30
!
interface vlan 30
arp proxy mode evpn
mac-address 00:00:00:30:00:00
vrf 10123
ip address 10.30.0.166/24
ip address 10.30.0.1/24 secondary
!
mclag domain 1
unique-ip vlan 30 same_mac
!
interface link-aggregation 9999
switchport trunk vlan 30
!

Add the LAG to the service VLAN.

Leaf1:
interface link-aggregation 101
switchport trunk vlan 30

Leaf2:
interface link-aggregation 101
switchport trunk vlan 30

Create a BFD profile. The recommended BFD transmit and receive interval is 1000 ms, and the local detection multiplier is 3 by default.

Leaf1:
bfd
profile bfd_to_tor
transmit-interval 1000
receive-interval 1000
exit
!
exit
!

Leaf2:
bfd
profile bfd_to_tor
transmit-interval 1000
receive-interval 1000
exit
!
exit
!

To simplify the configuration, create a peer group for interacting with the user-side device, and enable BFD.

Leaf1:
router bgp 65100 vrf 10123
no bgp ebgp-requires-policy
neighbor PEER_to_Tor peer-group
neighbor PEER_to_Tor remote-as external
neighbor PEER_to_Tor ebgp-multihop 255
neighbor PEER_to_Tor bfd profile bfd_to_tor
neighbor 10.30.0.3 peer-group PEER_to_Tor
bgp listen range 10.30.0.0/24 peer-group PEER_to_Tor
!

Leaf2:
router bgp 65100 vrf 10123
no bgp ebgp-requires-policy
neighbor PEER_to_Tor peer-group
neighbor PEER_to_Tor remote-as external
neighbor PEER_to_Tor ebgp-multihop 255
neighbor PEER_to_Tor bfd profile bfd_to_tor
neighbor 10.30.0.3 peer-group PEER_to_Tor
bgp listen range 10.30.0.0/24 peer-group PEER_to_Tor
!

Advertise IPv4 address family routes to BGP EVPN neighbors.

Leaf1:
address-family l2vpn evpn
advertise ipv4 unicast
exit-address-family
exit
!

Leaf2:
address-family l2vpn evpn
advertise ipv4 unicast
exit-address-family
exit
!
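The same_mac rule used in Table 15 (same MAC address and same gateway address on both peers, different addresses for the routing sessions) is easy to get wrong when the two configurations are edited by hand. The following Python sketch shows one way to check a pair of VLANIF settings before deployment. The dictionary layout and the function name are hypothetical; the values are copied from the table.

```python
def check_same_mac(leaf1, leaf2):
    """Return a list of problems found in a same_mac Unique-IP VLANIF pair."""
    problems = []
    # same_mac mode: both peers share one MAC address on the VLANIF.
    if leaf1["mac"].lower() != leaf2["mac"].lower():
        problems.append("MAC addresses differ")
    # Each peer needs its own address for the routing protocol session.
    if leaf1["primary"] == leaf2["primary"]:
        problems.append("primary IP addresses must be different")
    # The secondary address is the shared VLAN gateway.
    if leaf1["secondary"] != leaf2["secondary"]:
        problems.append("secondary (gateway) IP addresses differ")
    return problems


LEAF1 = {"mac": "00:00:00:30:00:00", "primary": "10.30.0.179/24", "secondary": "10.30.0.1/24"}
LEAF2 = {"mac": "00:00:00:30:00:00", "primary": "10.30.0.166/24", "secondary": "10.30.0.1/24"}

if __name__ == "__main__":
    # An empty list means the pair is consistent with same_mac mode.
    print(check_same_mac(LEAF1, LEAF2))
```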

After configuration, you can check the BGP status with the show ip bgp [vrf vrf_name] summary command.

Leaf1:

mclag-leaf-1# show ip bgp vrf 10123 summary
IPv4 Unicast Summary (VRF vrf 10123):
BGP router identifier 10.200.0.1, local AS number 65100 vrf-id 211
BGP table version 671
RIB entries 51, using 9384 bytes of memory
Peers 1, using 723 KiB of memory
Peer groups 1, using 64 bytes of memory
Neighbor V AS MsgRcvd MsgSent TblVer InQ OutQ Up/Down State/PfxRcd PfxSnt Desc
10.30.0.3 4 65236 1872 1770 0 0 0 00:10:59 25 27 N/A
Total number of neighbors 1

Leaf2:

mclag-leaf-2# show ip bgp vrf 10123 summary
IPv4 Unicast Summary (VRF vrf 10123):
BGP router identifier 10.200.0.1, local AS number 65100 vrf-id 94
BGP table version 10342916
RIB entries 51, using 9384 bytes of memory
Peers 1, using 723 KiB of memory
Peer groups 1, using 64 bytes of memory
Neighbor V AS MsgRcvd MsgSent TblVer InQ OutQ Up/Down State/PfxRcd PfxSnt Desc
10.30.0.3 4 65236 27110 44103 0 0 0 00:13:16 27 27 N/A
Total number of neighbors 1
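In the summaries above, an established session shows a prefix count in the State/PfxRcd column, while a session that is not established shows a state name there instead. The following Python sketch uses that to list the established neighbors from captured output. It assumes the column order shown above; the function name is hypothetical.

```python
def established_neighbors(summary):
    """Map neighbor address -> received prefix count for established sessions."""
    result = {}
    in_table = False
    for line in summary.splitlines():
        fields = line.split()
        if fields[:2] == ["Neighbor", "V"]:
            in_table = True  # neighbor rows follow the column header
            continue
        if not in_table or len(fields) < 10:
            continue
        state = fields[9]  # State/PfxRcd column
        if state.isdigit():
            result[fields[0]] = int(state)
    return result


SAMPLE = """\
Neighbor V AS MsgRcvd MsgSent TblVer InQ OutQ Up/Down State/PfxRcd PfxSnt Desc
10.30.0.3 4 65236 1872 1770 0 0 0 00:10:59 25 27 N/A
Total number of neighbors 1
"""

if __name__ == "__main__":
    print(established_neighbors(SAMPLE))
```

A neighbor that is missing from the returned mapping is either absent from the summary or not yet established, which is the condition to investigate before relying on the session.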

By default, this series of switches sends 5 ARP aging probes, and the ARP aging time is 300 seconds. In large-scale scenarios (when the number of hosts in the whole network reaches 1K), if the switch receives a large number of ARP packets from hosts in a short time, the CPU may not be able to process them in time, so some ARP entries age out and packets are lost. In this case, you can increase the number of ARP probes to 10 and the aging time to 600 or 1200 seconds to ease the pressure on the CPU and avoid packet loss.

Table 16 (Optional) Optimize ARP parameters on Leaf nodes

Set ARP probe times.

Leaf1:
arp probe times 10
!

Leaf2:
arp probe times 10
!

Set ARP aging time.

Leaf1:
arp timeout 600
!

Leaf2:
arp timeout 600
!

(Optional) Configure cross-VRF communication


Assume that VLAN10 belongs to Vrf10123 and VLAN20 belongs to Vrf2520 (the configuration steps are omitted here). Because the two VLANs are in different VRFs, route leaking must be configured so that hosts in VLAN10 and VLAN20 can communicate or exchange routes across VRFs.

Table 17 (Optional) Configure cross-VRF communication

Configure Vrf10123 and Vrf2520 to leak routes to each other in the BGP IPv4 address family.

Leaf1:
router bgp 65100 vrf 2520
!
address-family ipv4 unicast
import vrf 10123
exit-address-family
exit
!
router bgp 65100 vrf 10123
address-family ipv4 unicast
import vrf 2520
exit-address-family
exit
!

Leaf2:
router bgp 65100 vrf 2520
!
address-family ipv4 unicast
import vrf 10123
exit-address-family
exit
!
router bgp 65100 vrf 10123
address-family ipv4 unicast
import vrf 2520
exit-address-family
exit
!

(Optional) Leak static routes, directly connected routes, and kernel routes from Vrf10123 to Vrf2520 if required.

Leaf1:
router bgp 65100 vrf 10123
!
address-family ipv4 unicast
redistribute static
redistribute connected
redistribute kernel
exit-address-family
exit

Leaf2:
router bgp 65100 vrf 10123
!
address-family ipv4 unicast
redistribute static
redistribute connected
redistribute kernel
exit-address-family
exit

(Optional) Advertise leaked IPv4 routes in Vrf2520 to the remote VTEP if required.

Leaf1:
router bgp 65100 vrf 2520
address-family l2vpn evpn
advertise ipv4 unicast
exit-address-family
exit
!

Leaf2:
router bgp 65100 vrf 2520
address-family l2vpn evpn
advertise ipv4 unicast
exit-address-family
exit
!
(Optional) Configure a route map to filter the routes if required. For example, the configuration below allows only the leaked prefix 25.10.12.0/24 into Vrf2520 and filters out the other leaked routes.

Leaf1:
ip prefix-list IP_LIST_V4 seq 4294967295 deny 0.0.0.0/0 le 32
ip prefix-list IP_LIST_V4 seq 10 permit 25.10.12.0/24 le 24
!
bgp extcommunity-list expanded EXTCOM_RT_Vrf2520 seq 5 permit .*:2520
!
route-map VRF2520_v4 permit 5
match extcommunity EXTCOM_RT_Vrf2520
exit
!
route-map VRF2520_v4 permit 10
match ip address prefix-list IP_LIST_V4
exit
!
router bgp 65100 vrf 2520
!
address-family ipv4 unicast
table-map VRF2520_v4
exit-address-family
exit
!

Leaf2:
ip prefix-list IP_LIST_V4 seq 4294967295 deny 0.0.0.0/0 le 32
ip prefix-list IP_LIST_V4 seq 10 permit 25.10.12.0/24 le 24
!
bgp extcommunity-list expanded EXTCOM_RT_Vrf2520 seq 5 permit .*:2520
!
route-map VRF2520_v4 permit 5
match extcommunity EXTCOM_RT_Vrf2520
exit
!
route-map VRF2520_v4 permit 10
match ip address prefix-list IP_LIST_V4
exit
!
router bgp 65100 vrf 2520
!
address-family ipv4 unicast
table-map VRF2520_v4
exit-address-family
exit
!
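The two IP_LIST_V4 entries above are listed with the catch-all deny first, but prefix-list entries are evaluated in ascending sequence-number order, so seq 10 is tested before seq 4294967295. The following Python sketch models that evaluation so the effect on a given route can be checked by hand. It assumes the usual first-match-wins semantics with an implicit deny and models only the `le` bound; the tuple layout and function name are illustrative.

```python
import ipaddress

# (seq, action, prefix, le) tuples mirroring IP_LIST_V4 from the table above.
IP_LIST_V4 = [
    (4294967295, "deny", "0.0.0.0/0", 32),
    (10, "permit", "25.10.12.0/24", 24),
]


def evaluate(prefix_list, route):
    """Return 'permit' or 'deny' for a route; the first matching entry wins."""
    candidate = ipaddress.ip_network(route)
    # Sorting the tuples orders the entries by sequence number.
    for _seq, action, prefix, le in sorted(prefix_list):
        entry = ipaddress.ip_network(prefix)
        # The route must fall inside the entry prefix and its mask length
        # must lie between the entry's own length and the "le" bound.
        length_ok = entry.prefixlen <= candidate.prefixlen <= le
        if length_ok and candidate.subnet_of(entry):
            return action
    return "deny"  # implicit deny when no entry matches


if __name__ == "__main__":
    for route in ("25.10.12.0/24", "25.10.12.0/25", "10.30.0.0/24"):
        print(route, evaluate(IP_LIST_V4, route))
```

Only 25.10.12.0/24 itself matches the permit entry; longer prefixes inside it and all other routes fall through to the deny entry.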

Table 18 Display MC-LAG status

| Purpose | Command |
| --- | --- |
| Display the MC-LAG status. | show mclag status |
| Display the MC-LAG parameter consistency check results. | show mclag consistency_check_result [number] |

Table 19 Display interfaces status

| Purpose | Command |
| --- | --- |
| Display all interfaces status. | show interface summary |
| Display port-channel status. | show link-aggregation summary |
| Display IP configuration and status of layer 3 ports. | show ip interfaces |
| Display error-down interfaces. | show interface errdown |
| Display VLAN configurations. | show vlan summary |
| Display delayed startup information for interfaces. | show interface startup_delay |
| Display monitor-link configuration. | show monitor-link |
| Display counter statistics. | show counters interface |

Table 20 Display table entries commonly used

| Purpose | Command |
| --- | --- |
| Display local MAC entries. | show mac-address |
| Display remote MAC entries. You can specify the VTEP IP. | show vxlan remotemac {all\|A.B.C.D} |
| Display local and remote ARP entries. | show arp |
| Display interface counter statistics. | show counters interface |
| Display underlay routing information. | show ip route |
| Display overlay routing information. | show ip route vrf vrf_name |
| Display underlay BGP neighbors. | show ip bgp [vrf vrf_name] summary |
| Display overlay BGP neighbors. | show bgp l2vpn evpn summary |
| Display all BGP neighbors. | show bgp [vrf vrf_name] summary |
| Display VXLAN tunnel establishment status. | show vxlan tunnel |
| Display local VXLAN mapping information. | show vxlan map |
| Display EVPN type-1 routes. | show bgp l2vpn evpn route type ead |
| Display EVPN type-2 routes. | show bgp l2vpn evpn route type macip |
| Display routes advertised to BGP IPv4 neighbors. | show ip bgp neighbors A.B.C.D advertised-routes |
| Display all routes received from BGP IPv4 neighbors. The soft-reconfiguration inbound feature must be enabled first. | show ip bgp neighbors A.B.C.D received-routes |
| Display routes advertised to BGP EVPN neighbors. | show ip bgp l2vpn evpn neighbors A.B.C.D advertised-routes |
| Display all routes received from BGP EVPN neighbors. | show ip bgp l2vpn evpn neighbors A.B.C.D routes |

Following the procedure below to upgrade the Spine and Leaf nodes significantly reduces the service impact of a node upgrade in an EVPN MC-LAG scenario.

  1. Back up the configuration files. Back up the configuration file of your system to a server or locally. The path on the switch is /etc/sonic/config_db.json.
  2. Collect the table entries.
  • For Spine nodes, collect the BGP neighbor and route information before upgrading.
  • For Leaf nodes, collect MC-LAG, ARP, MAC, LAG, VRF, VNI, BGP, route, VXLAN tunnel, ES, and other information before upgrading, so that you can verify afterwards that the upgraded switch is operating normally.
  3. Transfer the image to the switch. Get the latest software image and its MD5 checksum, then log in to the switch and copy the software image from the remote server to the target switch via SCP. For example:
mclag-leaf-1# scp source sonic@10.250.0.243:/AsterNOS/release/AsterNOS_V3.1_R0407P00-FL.bin target .
The authenticity of host '10.250.0.243 (10.250.0.243)' can't be established.
ED25519 key fingerprint is SHA256:gpANANn/+MH0zXnIR/3yXO0v0bdFkGD0lZwrqUEUKyE.
This key is not known by any other names.
Are you sure you want to continue connecting (yes/no/[fingerprint])? yes
Warning: Permanently added '10.250.0.243' (ED25519) to the list of known hosts.
sonic@10.250.0.243's password:
AsterNOS_V3.1_R0407P00-FL.bin 100% 1425MB 82.8MB/s 00:17
leaf1# system ls
AsterNOS_V3.1_R0407P00-FL.bin

After the transfer is complete, check the MD5 checksum of the image. If it does not match the published value, the file is incomplete or was corrupted during transfer and must be transferred again.

mclag-leaf-1# system md5sum AsterNOS_V3.1_R0407P00-FL.bin
fed40a54f42fa54ced69c99e4311ba7e AsterNOS_V3.1_R0407P00-FL.bin
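The same comparison can be done on the server that hosts the image before it is copied, which avoids transferring a file that is already damaged. The following Python sketch uses only the standard library; the function names are illustrative and the expected value is whatever checksum was published with the image.

```python
import hashlib


def md5_of_file(path, chunk_size=1024 * 1024):
    """Compute the MD5 hex digest of a file without loading it into memory at once."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def image_is_intact(path, expected_md5):
    """Return True when the file's MD5 matches the published checksum."""
    return md5_of_file(path) == expected_md5.strip().lower()
```

Reading in chunks matters here because the image in the example above is about 1.4 GB.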

The following is an example of the specific procedures for upgrading Spine1.

  1. Manually switch the network-side traffic to Spine2 first. How to do this depends on the routing protocols in the network; for BGP, you can enable graceful-shutdown to lower the priority of routes advertised from Spine1.

| Description | Command |
| --- | --- |
| Enter the configuration view. | configure terminal |
| Enter the BGP configuration view. | router bgp asn |
| Enable BGP graceful-shutdown. | bgp graceful-shutdown |
| Exit configuration view. | end |

  2. Verify that there is no traffic passing through Spine1.

| Description | Command |
| --- | --- |
| Clear the statistics. | clear counters interface |
| Display interfaces counter statistics. | show counters interface |

  3. Save the configuration.

| Description | Command |
| --- | --- |
| Save the configuration. | write |

  4. Install the new image and reboot.

| Description | Command |
| --- | --- |
| List the files in your directory. | system ls |
| Install the new image. | image update bin-file |
| Confirm that the image is successfully installed. | show image |
| Reboot the switch. | reboot |

  5. After the reboot, wait about 6 minutes, then verify the operating status of the upgraded switch.

| Description | Command |
| --- | --- |
| Check if the current version is the expected one. | show version |
| Check that the containers on the switch are operating normally. | system docker ps |
| Check the configuration. | show running-config |
| Check whether the physical interfaces are up. | show interface summary |

  6. Check the BGP sessions and routing entries.

| Description | Command |
| --- | --- |
| Check underlay BGP neighbor status. | show ip bgp summary |
| Check overlay BGP neighbor status. | show bgp l2vpn evpn summary |
| Check routing table. | show ip route |

  7. Restore network-side traffic.

| Description | Command |
| --- | --- |
| Enter the configuration view. | configure terminal |
| Enter the BGP configuration view. | router bgp asn |
| Disable BGP graceful-shutdown. | no bgp graceful-shutdown |
| Exit configuration view. | end |
| Save the configuration. | write |

  8. Check whether the service has been restored. Verify that traffic is restored on Spine1.

| Description | Command |
| --- | --- |
| Display interfaces counter statistics. | show counters interface |

The upgrade of Spine1 is now complete. Repeat the above procedure for Spine2.

The following is an example of the specific procedures for upgrading Leaf1.

  1. Log in to Leaf1 and Leaf2 and check the status of the aggregation ports to make sure that no dual-homed host under MC-LAG is currently attached through a single link.

| Description | Command |
| --- | --- |
| Display port-channel status. | show link-aggregation summary |

  2. Manually switch the network-side traffic to Leaf2 first. How to do this depends on the routing protocols in the network; for BGP, you can enable graceful-shutdown to lower the priority of routes advertised from Leaf1.

| Description | Command |
| --- | --- |
| Enter the configuration view. | configure terminal |
| Enter the BGP configuration view. | router bgp asn |
| Enable BGP graceful-shutdown. | bgp graceful-shutdown |
| Exit configuration view. | end |

  3. On Leaf1, bring all LAG interfaces in use down at the protocol level and then shut them down, so that all network-side and user-side traffic switches to Leaf2.

| Description | Command |
| --- | --- |
| Enter LAG configuration view. | interface link-aggregation lag-id |
| Set the LAGIF to be down on protocol level. | lacp graceful-down |
| Shut down the LAGIF. | shutdown |
| Exit configuration view. | end |

  4. Verify that there is no traffic passing through Leaf1.

| Description | Command |
| --- | --- |
| Clear the statistics. | clear counters interface |
| Display interfaces counter statistics. | show counters interface |

  5. Save the configuration.

| Description | Command |
| --- | --- |
| Save the configuration. | write |

  • Be sure that you have saved your configuration before installing a new image.

  6. Install the new image and reboot.

| Description | Command |
| --- | --- |
| List the files in your directory. | system ls |
| Install the new image. | image update bin-file |
| Confirm that the image is successfully installed. | show image |
| Reboot the switch. | reboot |

  7. After the reboot, wait about 6 minutes, then verify the operating status of the upgraded switch.

| Description | Command |
| --- | --- |
| Check if the current version is the expected one. | show version |
| Check that the containers on the switch are operating normally. | system docker ps |
| Check the configuration. | show running-config |
| Check whether the physical interfaces are up. | show interface summary |

  8. Check the BGP sessions, check that the MC-LAG is in the OK state, and check that the VXLAN tunnel has been established.

| Description | Command |
| --- | --- |
| Check underlay BGP neighbor status. | show ip bgp [vrf vrf_name] summary |
| Check overlay BGP neighbor status. | show bgp l2vpn evpn summary |
| Display the MC-LAG status. | show mclag state |
| Check the VXLAN tunnel. | show vxlan tunnel |

  9. Restore user-side traffic by bringing the Leaf downlink ports back up.

| Description | Command |
| --- | --- |
| Enter LAG configuration view. | interface link-aggregation lag-id |
| Start up the LAGIF. | no shutdown |

  10. Check that the ARP, MAC, and routing table entries on Leaf1 are fully synchronized with Leaf2.

| Description | Command |
| --- | --- |
| Display local MAC entries. | show mac-address |
| Display remote MAC entries. | show vxlan remotemac {all\|A.B.C.D} |
| Display local and remote ARP entries. | show arp |
| Display remote hosts routing table entries. | show ip route [vrf vrf_name] |

  11. Restore network-side traffic.

| Description | Command |
| --- | --- |
| Enter the configuration view. | configure terminal |
| Enter the BGP configuration view. | router bgp asn |
| Disable BGP graceful-shutdown. | no bgp graceful-shutdown |
| Exit configuration view. | end |
| Save the configuration. | write |

  12. Check whether the service has been restored. Verify that traffic is restored on Leaf1, and check with the network administrator to make sure that services are normal.

| Description | Command |
| --- | --- |
| Display interfaces counter statistics. | show counters interface |

The upgrade of Leaf1 is now complete. Repeat the above procedure for Leaf2.

Seamless Migration from Standalone to MC-LAG


To migrate a standalone Leaf node to an MC-LAG dual-homing mode (i.e., expanding an existing operational Leaf with a peer node to build a high-availability architecture), operators should follow a standardized workflow:

Configuration Deployment -> Logical Isolation -> Physical Integration -> State Synchronization -> Traffic Switchover.

This ensures minimal service impact and mitigates loop risks. Assuming Leaf1 is currently operational and Leaf2 is being added to form the MC-LAG pair:

  1. Configuration Deployment: Power on Leaf2 and gain access via the management network or console port. Complete the baseline configurations for BGP, MC-LAG, VRF, and Layer 3 gateways (refer to Section Configure Leaf Nodes) to ensure configuration consistency between Leaf1 and Leaf2.

  2. Logical Isolation: Configure graceful-shutdown on Leaf2 based on the active routing protocols. This ensures that Leaf2 advertises routes with lower priority, preventing it from attracting northbound traffic before the MC-LAG forwarding tables are fully synchronized.

| Description | Command |
| --- | --- |
| Enter the configuration view. | configure terminal |
| Enter the BGP configuration view. | router bgp asn |
| Enable BGP graceful-shutdown. | bgp graceful-shutdown |
| Exit configuration view. | end |

  3. Establish Physical Connectivity: After completing the configurations, connect the physical links from Leaf2 to Leaf1 (peer-link) and to the Spine layer. Cabling only after the configuration is complete prevents traffic from being diverted to the new node prematurely during the configuration phase.

  4. Verify Control Plane Integrity: Check and confirm that BGP peering is established and that the MC-LAG status is "OK" (synchronized/up). Inspect the VXLAN tunnel information.

| Description | Command |
| --- | --- |
| Check whether the physical interfaces are up. | show interface summary |
| Check the underlay BGP neighbor status. | show ip bgp [vrf vrf_name] summary |
| Check the overlay BGP neighbor status. | show bgp l2vpn evpn summary |
| Check the MC-LAG status. | show mclag state |
| Check the VXLAN tunnel status. | show vxlan tunnel |

  5. Host Connectivity & Table Validation: Configure the server bond (LAG) and connect the cables between Leaf2 and the servers. Verify that the LAG status on Leaf2 is "Up" and ensure that the ARP, MAC, and routing tables are fully synchronized with Leaf1 (the number of table entries on Leaf1 and Leaf2 should be identical under normal conditions).

| Description | Command |
| --- | --- |
| Check the operational state of link-aggregation group interfaces. | show link-aggregation summary |
| Check the routing entries. | show ip route vrf all |
| Check the ARP entries. | show arp |
| Check the FDB entries. | show mac-address |
| Check the remote MAC entries. | show vxlan remotemac {all\|A.B.C.D} |

  6. Traffic Cutover: Disable graceful-shutdown on Leaf2. This restores standard routing priority, allowing northbound and southbound traffic to transition smoothly onto the dual-homed path.

| Description | Command |
| --- | --- |
| Enter the configuration view. | configure terminal |
| Enter the BGP configuration view. | router bgp asn |
| Disable BGP graceful-shutdown. | no bgp graceful-shutdown |
| Exit configuration view. | end |
| Save the configuration. | write |

  7. Traffic and Service Verification: Verify that traffic forwarding on Leaf2 is functioning correctly, and coordinate with the Network Management System (NMS) or operations team to confirm that all services are operational.

| Description | Command |
| --- | --- |
| Display interfaces counter statistics. | show counters interface |

  8. Configuration Persistence: Save the configuration to ensure all changes are persistent across reboots.

| Description | Command |
| --- | --- |
| Save the configuration. | write |
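The table validation step in this procedure expects the number of ARP, MAC, and routing entries on Leaf1 and Leaf2 to be identical under normal conditions. The following Python sketch compares two sets of entry counts and reports the tables that differ. It assumes the counts have already been collected from each switch by some other means; the dictionary keys and the function name are hypothetical.

```python
def table_mismatches(leaf1, leaf2):
    """Return {table: (leaf1_count, leaf2_count)} for tables whose counts differ."""
    tables = sorted(set(leaf1) | set(leaf2))
    return {
        name: (leaf1.get(name), leaf2.get(name))
        for name in tables
        # A table missing on one side shows up as None and counts as a mismatch.
        if leaf1.get(name) != leaf2.get(name)
    }


if __name__ == "__main__":
    leaf1_counts = {"arp": 120, "mac": 118, "route": 51}
    leaf2_counts = {"arp": 120, "mac": 97, "route": 51}
    # Only the MAC table differs in this made-up example.
    print(table_mismatches(leaf1_counts, leaf2_counts))
```

An empty result is the condition to wait for before disabling graceful-shutdown on Leaf2.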