Best Practices for EVPN MC-LAG (unnumbered BGP)
Preface

This document describes in detail the recommended baseline solution, configuration guide, and maintenance guide for data center series switches in EVPN MC-LAG scenarios, used to provide VXLAN anycast gateway functionality in a leaf-spine infrastructure.

Intended Audience

This manual is intended for project planning, design, and implementation personnel. Readers are expected to:
- Be familiar with Asterfusion data center series switches.
- Be familiar with EVPN MC-LAG fundamentals.

EVPN MC-LAG Overview

EVPN MC-LAG is a dual-homed, dual-active VXLAN distributed gateway solution widely adopted in data center environments. By combining the advantages of EVPN and MC-LAG, the solution provides high reliability in access scenarios, supporting load balancing, fault tolerance, independent node upgrades, and other functions.

Basic Concepts

MC-LAG peer
The two leaf nodes connected to the same downstream device are called MC-LAG peers. The nodes take an active and a standby role: the node with the smaller local IP address is designated as the active endpoint, and the one with the larger local IP address serves as the standby endpoint. The distinction between active and standby applies only to the control plane; in the forwarding plane, MC-LAG peers operate with equal status, independently determining traffic forwarding paths based on their local forwarding decisions.

ICCP
ICCP (Inter-Chassis Communication Protocol), defined in RFC 7275, is used in this solution to establish ICCP connections between MC-LAG peers in order to synchronize interface status and table entries and to perform role negotiation.

Keep-alive link
The keep-alive link is a heartbeat detection link, typically implemented as a direct Layer 3 connection between the MC-LAG peers, over which heartbeat packets are sent periodically. It is used for transmitting ICCP control packets, synchronizing table information, establishing MC-LAG peer relationships, and implementing configuration consistency checks.

Peer link
The peer link is a direct physical link between MC-LAG peers, used to forward traffic when a downstream link fails. It is generally recommended to share the peer link with the keep-alive link for operational efficiency and resource optimization.

DAD link
The DAD (dual-active detection) link is a Layer 3 link over which MC-LAG peers exchange dual-active detection packets; it applies to scenarios where the peer link and keep-alive link are shared. When the keep-alive link is detected as down, the system automatically shuts down all interfaces on the standby node except logical interfaces, management ports, and peer-link interfaces.

MC-LAG member LAG
The port channels on the MC-LAG peers connecting to the same servers or hosts are called member LAGs.

MC-LAG working mode
MC-LAG operates in two modes: dual-active mode and active-standby mode. In dual-active mode, both the active and the standby node forward traffic, achieving load sharing across member LAGs. In active-standby mode, normally only the active node forwards traffic; the standby node remains inactive (blocking traffic) until a failover occurs. Currently, this MC-LAG solution works in dual-active mode and does not support active-standby mode.

VTEP
A VTEP (VXLAN Tunnel Endpoint) is an edge node in a VXLAN network, responsible for encapsulating and decapsulating VXLAN packets and communicating with the underlying physical network. In a distributed gateway architecture, the VTEP also serves as a gateway, providing Layer 3 routing for traffic within the VXLAN overlay.

VXLAN anycast gateway
The VXLAN anycast gateway is an architecture used to achieve more efficient and flexible routing and bridging in a VXLAN network. By distributing gateway functionality across multiple VTEP nodes, this approach allows each VTEP to perform Layer 2/Layer 3 forwarding independently, enhancing overall performance and resiliency.

Networking Solution

A typical EVPN MC-LAG network is shown in the figure below. Service servers are dual-homed to server leaf nodes, which form EVPN MC-LAG pairs acting as Layer 3 anycast gateways. Spine nodes connect to the leaf nodes and run routing protocols for underlay reachability. The following table shows the recommended deployment approach for this solution.

Table 1: Recommended deployment

Interconnection of leaf and spine:
- BGP is the most commonly used routing protocol in data center topologies; we recommend BGP for underlay route reachability. To separate underlay and overlay routes, use the interconnect port IPs to establish underlay BGP neighbors and the loopback0 IPs to establish overlay BGP neighbors.
- Use high-speed interfaces for leaf-spine interconnection. You can configure them as physical Layer 3 ports, or bundle them into a link aggregation group to increase uplink bandwidth.
- It is recommended that all spine nodes be configured with the same ASN to minimize unnecessary route exchanges and reduce resource usage. A pair of MC-LAG peers is required to be configured with the same ASN.

Deploy peer link and keep-alive link:
- Use high-speed interfaces for interconnection and configure static aggregation links as the peer link between MC-LAG peers.
- Share the peer link with the keep-alive link. The peer link should be configured in tagged mode and added both to the dedicated VLAN for the keep-alive link and to the service VLANs for hosts. The IP address of the dedicated VLAN interface (VLANIF) is used as the communication endpoint for the ICCP connection.

Deploy DAD link:
- Use an independent low-speed interface as the DAD link. The solution does not support using the management port as the DAD link.
- Configure the keep-alive link first and ensure the MC-LAG keep-alive status is active before configuring the DAD link, to prevent node ports from being incorrectly error-downed by premature activation of the dual-active detection mechanism.
- The DAD link must not share a physical or logical link with the peer link or keep-alive link.

Deploy MC-LAG member LAGs:
- Use low-speed interfaces for MC-LAG member LAGs; multiple members can be configured. For reliability, use LACP dynamic aggregation. You can enable fast rate (short timeout) on both sides to improve fault convergence.

Servers/switches connected to leaf nodes:
- Use bond4 (load sharing) for access on the server side when configuring the bond. If the servers are installed via PXE, the LACP fallback feature must be enabled on the port channel of one of the connected leaf nodes.

Layer 3 VXLAN anycast gateway:
- When deploying Layer 3 VXLAN gateways on the leaf nodes, all nodes should be configured with the same IP and MAC addresses on the VLANs shared between them. For a pair of MC-LAG peers, the MAC addresses configured on VRFs (virtual routing and forwarding instances) bound to the same Layer 3 VNI (L3VNI) must be identical.
- Configure the loopback1 IP as the local EVPN VTEP IP. For a pair of MC-LAG peers, the VTEP IP must be identical.
- The solution also provides Layer 3 connectivity when required: routing protocols can be established between leaf peers and access nodes by enabling the unique-IP feature on the leaf nodes and configuring different primary IP and MAC addresses for the VLANs in question.

Load balance:
- In a data center network, the BGP ASN of each node is usually different, so the AS paths of routes received from different nodes differ as well. Therefore, enable the multi-path functionality on nodes that receive routes to achieve load balancing of services.

Fault tolerance:
- Configure monitor link groups on the leaf switches to guarantee fast failover in case of uplink or node failures.
- Configure startup delay on the uplink interfaces of MC-LAG nodes. This keeps the uplink interfaces down until MC-LAG information synchronization is complete, minimizing packet loss during traffic switchback after node reboots or failure recovery.
- Configure the BGP graceful restart (GR) and BGP max-med on-startup features on leaf and spine switches according to actual needs, to ensure fast routing-protocol failover in case of node-level failures.

Notes

There are currently seven bonding modes available on the server side (bond0 to bond6), with bond0, bond1, and bond4 being the most commonly used. bond0 operates in load-balancing (round-robin) mode, which requires static aggregation configuration on the switch; bond1 operates in active-backup mode, which requires no configuration on the switch except VLANs; bond4 operates in LACP mode, requiring dynamic aggregation configuration on the switch. In this solution, bond4 is recommended on the server side.

BGP graceful restart, defined in RFC 4724, specifies mechanisms that allow a BGP speaker to continue forwarding data packets along known routes while routing protocol information is being restored. This feature helps reduce route flapping and unnecessary churn in the forwarding tables, improving network stability.

BGP max-med on-startup advertises routes with the maximum MED value while a BGP session is starting up. This enables other switches to prefer routes from other BGP sessions while a speaker is being restored, reducing packet loss during traffic cutback.

Typical Configuration Example

Topology

A commonly used EVPN MC-LAG network is shown in Figure 2, with VXLAN distributed gateways deployed and service servers dual-homed to the server leaf nodes. The IP address planning for each interface is shown in the following table.

Table 2: IP address planning for interfaces

Spine1:
  loopback0   172.16.1.165/32
Spine2:
  loopback0   172.16.1.167/32
Leaf1:
  ethernet 0/47   40.95.0.1/30
  loopback0   172.16.1.179/32
  loopback1   172.16.2.179/32
  vlan10      10.10.0.1/24
  vlan20      10.20.0.1/24
  vlan30      10.30.0.179/24 (primary IP), 10.30.0.1/24 (secondary IP)
  vlan4094    40.94.0.1/30
Leaf2:
  ethernet 0/47   40.95.0.2/30
  loopback0   172.16.1.166/32
  loopback1   172.16.2.179/32
  vlan10      10.10.0.1/24
  vlan20      10.20.0.1/24
  vlan30      10.30.0.166/24 (primary IP), 10.30.0.1/24 (secondary IP)
  vlan4094    40.94.0.2/30
Leaf3:
  ethernet 0/47   40.95.0.1/30
  loopback0   172.16.1.170/32
  loopback1   172.16.2.170/32
  vlan10      10.10.0.1/24
  vlan20      10.20.0.1/24
  vlan30      10.30.0.170/24 (primary IP), 10.30.0.1/24 (secondary IP)
  vlan4094    40.94.0.1/30
Leaf4:
  ethernet 0/47   40.95.0.2/30
  loopback0   172.16.1.162/32
  loopback1   172.16.2.170/32
  vlan10      10.10.0.1/24
  vlan20      10.20.0.1/24
  vlan30      10.30.0.162/24 (primary IP), 10.30.0.1/24 (secondary IP)
  vlan4094    40.94.0.2/30

Configuration Overview

The configuration baseline in this document covers only the networking devices in the spine-leaf part of Figure 2; the configuration of other equipment is skipped.

Table 3: Configuration overview

Configure spine nodes:
1. Configure interconnect interfaces and loopback.
2. Configure underlay BGP neighbors.
3. Configure overlay BGP neighbors.

Configure leaf nodes:
1. Configure interconnect interfaces and loopback.
2. Configure underlay BGP neighbors.
3. Configure overlay BGP neighbors.
4. Configure MC-LAG.
5. Configure gateway and VRF instances.
6. Configure downstream cross-device aggregation groups.
7. Configure EVPN and VXLAN mapping.
8. Configure monitor link group.
9. (Optional) Configure Layer 3 service connectivity with the user side.
10. (Optional) Optimize ARP parameters.
11. (Optional) Configure cross-VRF communication.

Taking Leaf1, Leaf2, Spine1, and Spine2 as examples (the configurations of Leaf3 and Leaf4 are similar to those of Leaf1 and Leaf2), the detailed configuration procedures are given below.

Configure Spine Nodes

Configure interconnect interfaces and loopback

Table 4: Configure interconnect interfaces and loopback on spine nodes

Configure the IP addresses of the interfaces connected to the leaf nodes according to Table 2.

Spine1:
  interface ethernet 0/0
   description to-leaf1
   ipv6 use link-local
  !
  interface ethernet 0/4
   description to-leaf2
   ipv6 use link-local
  !

Spine2:
  interface ethernet 0/0
   description to-leaf1
   ipv6 use link-local
  !
  interface ethernet 0/4
   description to-leaf2
   ipv6 use link-local
  !

Configure the IP address of loopback0, which is used as the router ID as well as the source interface for establishing BGP neighbors.

Spine1:
  interface loopback 0
   ip address 172.16.1.165/32
  !

Spine2:
  interface loopback 0
   ip address 172.16.1.167/32
  !
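The interconnect interfaces above carry no IPv4 addresses: with IPv6 link-local enabled, the unnumbered eBGP sessions peer over the addresses each port derives from its MAC. As an illustration (a sketch, not vendor code), the modified EUI-64 derivation commonly used for such link-local addresses looks like this:

```python
# Sketch (not vendor code): derive the IPv6 link-local address typically
# auto-generated for a port via modified EUI-64 when link-local addressing
# is enabled. Unnumbered eBGP peers over these addresses, so no per-link
# IPv4 subnet is needed.
import ipaddress

def link_local_from_mac(mac: str) -> ipaddress.IPv6Address:
    octets = [int(b, 16) for b in mac.split(":")]
    octets[0] ^= 0x02                                   # flip the universal/local bit
    eui64 = octets[:3] + [0xFF, 0xFE] + octets[3:]      # insert ff:fe in the middle
    suffix = int.from_bytes(bytes(eui64), "big")
    return ipaddress.IPv6Address((0xFE80 << 112) | suffix)

print(link_local_from_mac("60:eb:5a:01:10:b1"))  # fe80::62eb:5aff:fe01:10b1
```

The example MAC is the system MAC shown later in the show mclag state output; any port MAC works the same way.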
Configure underlay BGP neighbors

Underlay Layer 3 reachability between leaf and spine nodes is achieved by running eBGP on the interconnecting physical interfaces, while the overlay network is built on loopback0. Create two BGP peer groups on all spine nodes:
- peer-to-leaf: used to establish underlay BGP neighbors with the leaf nodes.
- peer-to-leaf-evpn: used to establish overlay BGP neighbors with the leaf nodes.

Table 5: Configure underlay BGP neighbors on spine nodes

Create a route map and an IP prefix list to advertise the loopbacks.

Spine1 and Spine2:
  route-map advertise-loopback permit 10
   match ip address prefix-list loopback
   exit
  !
  ip prefix-list loopback seq 10 permit 172.16.1.0/24 ge 24
  ip prefix-list loopback seq 20 permit 172.16.2.0/24 ge 24
  !

Configure the BGP ASN and the router ID; enable the BGP GR, max-med on-startup, and multi-path features.

Spine1:
  router bgp 65165
   bgp router-id 172.16.1.165
   no bgp ebgp-requires-policy
   bgp bestpath as-path multipath-relax
   bgp max-med on-startup 120
   bgp graceful-restart

Spine2:
  router bgp 65165
   bgp router-id 172.16.1.167
   no bgp ebgp-requires-policy
   bgp bestpath as-path multipath-relax
   bgp max-med on-startup 120
   bgp graceful-restart

Create a peer group named peer-to-leaf and enable BFD.

Spine1 and Spine2:
   neighbor peer-to-leaf peer-group
   neighbor peer-to-leaf remote-as external
   neighbor peer-to-leaf bfd
   neighbor ethernet 0/0 interface peer-group peer-to-leaf
   neighbor ethernet 0/4 interface peer-group peer-to-leaf
   neighbor ethernet 0/8 interface peer-group peer-to-leaf
   neighbor ethernet 0/12 interface peer-group peer-to-leaf
  !

Advertise the loopback IPs.

Spine1 and Spine2:
   address-family ipv4 unicast
    redistribute connected route-map advertise-loopback
   exit-address-family
  !

Configure overlay BGP neighbors

Table 6: Configure overlay BGP neighbors on spine nodes

Create a peer group named peer-to-leaf-evpn, enable the BGP EVPN address family, and specify loopback0 as the source interface.

Spine1:
  router bgp 65165
   neighbor peer-to-leaf-evpn peer-group
   neighbor peer-to-leaf-evpn remote-as external
   neighbor peer-to-leaf-evpn ebgp-multihop 5
   neighbor peer-to-leaf-evpn update-source 172.16.1.165
   neighbor 172.16.1.179 peer-group peer-to-leaf-evpn
   neighbor 172.16.1.166 peer-group peer-to-leaf-evpn
   neighbor 172.16.1.170 peer-group peer-to-leaf-evpn
   neighbor 172.16.1.162 peer-group peer-to-leaf-evpn
  !
   address-family ipv4 unicast
    no neighbor peer-to-leaf-evpn activate
   exit-address-family
  !
   address-family l2vpn evpn
    neighbor peer-to-leaf-evpn activate
    advertise-all-vni
   exit-address-family
   exit
  !

Spine2: identical, except that the source interface is its own loopback0:
   neighbor peer-to-leaf-evpn update-source 172.16.1.167

Notes

To establish a BGP neighbor on a loopback address, you must configure an eBGP multihop value greater than 1 and designate the source interface.

Configure Leaf Nodes

Configure interconnect interfaces and loopback

Table 7: Configure interconnect interfaces and loopback on leaf nodes

Configure the IP addresses of the interfaces interconnected with the spines.

Leaf1 and Leaf2:
  interface ethernet 0/48
   description to-spine1
   ipv6 use link-local
  !
  interface ethernet 0/52
   description to-spine2
   ipv6 use link-local
  !

Configure the IP address of loopback0, used as the router ID as well as the source interface for establishing BGP neighbors.

Leaf1:
  interface loopback 0
   ip address 172.16.1.179/32
  !

Leaf2:
  interface loopback 0
   ip address 172.16.1.166/32
  !

Configure the IP address of loopback1 as the local VTEP IP (identical on both MC-LAG peers).

Leaf1 and Leaf2:
  interface loopback 1
   ip address 172.16.2.179/32
  !

Configure underlay BGP neighbors

Create two BGP peer groups on all leaf nodes:
- peer-to-spine: used to establish underlay BGP neighbors with the spine nodes.
- peer-to-spine-evpn: used to establish overlay BGP neighbors with the spine nodes.

Table 8: Configure underlay BGP neighbors on leaf nodes

Create a route map and an IP prefix list to advertise the loopbacks.

Leaf1 and Leaf2:
  route-map advertise-loopback permit 10
   match ip address prefix-list loopback
   exit
  !
  ip prefix-list loopback seq 10 permit 172.16.1.0/24 ge 24
  ip prefix-list loopback seq 20 permit 172.16.2.0/24 ge 24
  !

Configure the BGP ASN and router ID; enable the BGP GR, max-med on-startup, and multi-path features.

Leaf1:
  router bgp 65100
   bgp router-id 172.16.1.179
   no bgp ebgp-requires-policy
   bgp bestpath as-path multipath-relax
   bgp max-med on-startup 120
   bgp graceful-restart

Leaf2:
  router bgp 65100
   bgp router-id 172.16.1.166
   no bgp ebgp-requires-policy
   bgp bestpath as-path multipath-relax
   bgp max-med on-startup 120
   bgp graceful-restart

Create a peer group named peer-to-spine and enable BFD.

Leaf1 and Leaf2:
   neighbor peer-to-spine peer-group
   neighbor peer-to-spine remote-as external
   neighbor peer-to-spine bfd
   neighbor ethernet 0/48 interface peer-group peer-to-spine
   neighbor ethernet 0/52 interface peer-group peer-to-spine
  !

Advertise the loopback IPs.

Leaf1 and Leaf2:
   address-family ipv4 unicast
    redistribute connected route-map advertise-loopback
    neighbor peer-to-spine route-map advertise-loopback out
   exit-address-family
   exit
  !
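The prefix-list entries above use "ge 24", so any route inside 172.16.1.0/24 or 172.16.2.0/24 with a prefix length of at least 24 is permitted, which is how the /32 loopback0 and loopback1 host routes get advertised. A minimal sketch of this matching rule (assuming standard prefix-list semantics; not vendor code):

```python
# Sketch of ip prefix-list matching semantics (assumed standard behavior):
# "permit 172.16.1.0/24 ge 24" matches any route whose prefix lies inside
# 172.16.1.0/24 and whose length is >= 24, e.g. the /32 loopback routes.
import ipaddress

def prefix_list_permits(route: str, entry: str, ge: int) -> bool:
    r = ipaddress.ip_network(route)
    e = ipaddress.ip_network(entry)
    return r.subnet_of(e) and r.prefixlen >= ge

print(prefix_list_permits("172.16.1.179/32", "172.16.1.0/24", 24))  # True
print(prefix_list_permits("40.94.0.0/30", "172.16.1.0/24", 24))     # False
```

This is why interconnect and peer-link subnets (40.94.0.0/30, 40.95.0.0/30) are not leaked into the underlay by the redistribute statement.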
After finishing the configurations above, you can check the underlay BGP status with the command show ip bgp summary.

mclag-leaf-1# show ip bgp summary

IPv4 Unicast Summary (VRF default):
BGP router identifier 172.16.1.179, local AS number 65100 VRF ID 0
BGP table version 9
RIB entries 13, using 2392 bytes of memory
Peers 2, using 1447 KiB of memory
Peer groups 2, using 128 bytes of memory

Neighbor       V  AS     MsgRcvd MsgSent TblVer InQ OutQ Up/Down  State/PfxRcd PfxSnt Desc
ethernet 0/48  4  65165  133     133     0      0   0    01:58:22 4            7      N/A
ethernet 0/52  4  65165  131     133     0      0   0    01:58:11 4            7      N/A

Total number of neighbors 2

There are two fields worth noting in this output. The field "State/PfxRcd" shows the state of the BGP session: if it reads Idle/Connect/Active, session establishment is abnormal; if it shows a number, the session has been established successfully and the number is the count of route prefixes received from the BGP peer. The field "Up/Down" shows how long the BGP session has been in its current state.

Configure overlay BGP neighbors

Table 9: Configure overlay BGP neighbors on leaf nodes

Create a peer group named peer-to-spine-evpn, enable the BGP EVPN address family, and specify loopback0 as the source interface.

Leaf1:
  router bgp 65100
   neighbor peer-to-spine-evpn peer-group
   neighbor peer-to-spine-evpn remote-as external
   neighbor peer-to-spine-evpn ebgp-multihop 5
   neighbor peer-to-spine-evpn update-source 172.16.1.179
   neighbor 172.16.1.165 peer-group peer-to-spine-evpn
   neighbor 172.16.1.167 peer-group peer-to-spine-evpn
  !
   address-family ipv4 unicast
    no neighbor peer-to-spine-evpn activate
   exit-address-family
  !
   address-family l2vpn evpn
    neighbor peer-to-spine-evpn activate
    advertise-all-vni
   exit-address-family
   exit
  !

Leaf2: identical, except that the source interface is its own loopback0:
   neighbor peer-to-spine-evpn update-source 172.16.1.166

After finishing the configurations above, you can check the overlay BGP status with the command show bgp l2vpn evpn summary.

mclag-leaf-1# show bgp l2vpn evpn summary
BGP router identifier 172.16.1.179, local AS number 65100 VRF ID 0
BGP table version 0
RIB entries 151, using 27 KiB of memory
Peers 2, using 1447 KiB of memory
Peer groups 1, using 64 bytes of memory

Neighbor      V  AS     MsgRcvd MsgSent TblVer InQ OutQ Up/Down  State/PfxRcd PfxSnt Desc
172.16.1.165  4  65165  196     328     0      0   0    00:05:52 2153         2181   N/A
172.16.1.167  4  65165  136     332     0      0   0    00:05:53 2087         2181   N/A

Total number of neighbors 2

Configure MC-LAG

Table 10: Configure MC-LAG on leaf nodes

Configure the VLANIF for establishing MC-LAG.

Leaf1:
  vlan 4094
  !
  interface vlan 4094
   ip address 40.94.0.1/30
   exit
  !

Leaf2:
  vlan 4094
  !
  interface vlan 4094
   ip address 40.94.0.2/30
   exit
  !

Configure the peer link and add it to the VLAN.

Leaf1 and Leaf2:
  interface link-aggregation 9999
   switchport trunk vlan 4094
   exit
  !
  interface ethernet 0/72
   link-aggregation group 9999
   storm-suppress broadcast packets 1000
   storm-suppress multicast packets 1000
   storm-suppress unknown-unicast packets 1000
   exit
  !
  interface ethernet 0/76
   link-aggregation group 9999
   storm-suppress broadcast packets 1000
   storm-suppress multicast packets 1000
   storm-suppress unknown-unicast packets 1000
   exit
  !

Configure an MC-LAG domain.

Leaf1:
  mclag domain 1
   local address 40.94.0.1
   peer address 40.94.0.2
   peer-link link-aggregation 9999
   commit
  !

Leaf2:
  mclag domain 1
   local address 40.94.0.2
   peer address 40.94.0.1
   peer-link link-aggregation 9999
   commit
  !

Configure the DAD link.

Leaf1:
  interface ethernet 0/47
   ip address 40.95.0.1/30
   exit
  !
  mclag domain 1
   dad local address 40.95.0.1
   dad peer address 40.95.0.2
   commit
  !

Leaf2:
  interface ethernet 0/47
   ip address 40.95.0.2/30
   exit
  !
  mclag domain 1
   dad local address 40.95.0.2
   dad peer address 40.95.0.1
   commit
  !

Notes

Only one MC-LAG domain is supported.

After completing the above configuration, you can check the MC-LAG state with the show mclag state command.

mclag-leaf-1# show mclag state
The MCLAG's keepalive is: OK
MCLAG info sync is: completed
Domain id: 1
MCLAG session channel: primary channel
VRF name: default
Consistency check action: idle
Local ip: 40.94.0.1
Peer ip: 40.94.0.2
Dad local ip: 40.95.0.1
Dad peer ip: 40.95.0.2
Peer link interface: lag 9999
Keepalive time: 1
Dad detection delay: 15
Dad recovery delay mlag-intf: 60
Dad recovery delay non-mlag-intf: 0
Dad vrf name: default
Dad status: dual-active
Session timeout: 15
Peer link mac: 60:eb:5a:01:10:b1
System mac: 60:eb:5a:01:10:b1
Peer mac: 60:eb:5a:01:10:c1
Admin role: None
Role: Active
MCLAG interface loglevel: notice

Mainly focus on the following information:
- The current MC-LAG status is shown in the line "The MCLAG's keepalive is". If it shows ERROR, there is an exception in the establishment of MC-LAG; if it shows OK, MC-LAG has been established successfully.
- "Role" indicates the role of the current device in MC-LAG, categorized into Active and Standby.
- "Dad status" indicates the DAD status. If it is single-active, there is an exception in DAD establishment; if it is dual-active, DAD has been established successfully, and thereafter the standby device will be error-downed when the MC-LAG status is in exception.

Configure GW and VRF Instances

Table 11: Configure GW and VRF instances on leaf nodes

Create VLANs for service traffic forwarding.

Leaf1 and Leaf2:
  vlan 10
  !
  vlan 20
  !

Disable broadcast forwarding of ARP packets.

Leaf1 and Leaf2:
  arp broadcast disable

Create VRF instances. The MAC addresses of the VRFs bound to the same L3VNI on a pair of MC-LAG peers must be identical, and must not be reused between different VTEPs.

Leaf1 and Leaf2:
  vrf 10123
   mac 00:00:00:01:23:00
   exit-vrf

Configure the Layer 3 gateways and set up ARP proxy in EVPN mode. The IP and MAC of the shared VLANIFs on the MC-LAG peers are required to be the same.

Leaf1 and Leaf2:
  interface vlan 10
   mac-address 00:00:00:10:00:00
   vrf 10123
   ip address 10.10.0.1/24
   arp proxy mode evpn
  !
  interface vlan 20
   mac-address 00:00:00:20:00:00
   vrf 10123
   ip address 10.20.0.1/24
   arp proxy mode evpn
  !

(Optional) For silent hosts, the ARP proxy extension feature must be enabled.

Leaf1 and Leaf2:
  interface vlan 10
   arp proxy extend reply
   arp proxy extend request
  !

Add the peer link to the service VLANs.

Leaf1 and Leaf2:
  interface link-aggregation 9999
   switchport trunk vlan 10
   switchport trunk vlan 20
  !
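Several attributes configured above must match exactly on both MC-LAG peers (VLANIF IP and MAC, the VRF MAC per L3VNI, the VTEP IP), or the anycast gateway misbehaves. The idea of a pre-deployment consistency check can be sketched as follows; the function, dictionaries, and key names are purely illustrative, not a vendor API:

```python
# Hypothetical pre-deployment sanity check (illustrative only): verify that
# the anycast-gateway attributes that must match on a pair of MC-LAG peers
# actually do, per the recommendations above.
def gateway_mismatches(peer_a: dict, peer_b: dict) -> list:
    """Return the keys whose values differ between the two peers."""
    keys = set(peer_a) | set(peer_b)
    return sorted(k for k in keys if peer_a.get(k) != peer_b.get(k))

leaf1 = {"vlan10": ("10.10.0.1/24", "00:00:00:10:00:00"),
         "vrf10123_mac": "00:00:00:01:23:00",
         "vtep_ip": "172.16.2.179"}
leaf2 = {"vlan10": ("10.10.0.1/24", "00:00:00:10:00:00"),
         "vrf10123_mac": "00:00:00:01:23:00",
         "vtep_ip": "172.16.2.179"}
print(gateway_mismatches(leaf1, leaf2))  # [] -> consistent
```

In practice the switches perform their own configuration consistency checks over the keep-alive link; a script like this only mirrors that idea for offline review of intended configurations.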
Notes

The command arp broadcast disable is supported only in version R0408P00 and later. If the switch returns an error indicating that the command is not supported, use the following commands instead:

  policy-map type copp copp-system-policy
   class copp-system-arp
    trap action trap

Silent terminals are hosts that do not actively send ARP requests.

Configure Downstream Cross-Device Aggregation Groups

Table 12: Configure downstream cross-device aggregation groups on leaf nodes

Create dynamic aggregation groups and enable fast rate.

Leaf1 and Leaf2:
  interface link-aggregation 100
   lacp fast-rate
   commit
  !
  interface link-aggregation 101
   lacp fast-rate
   commit
  !

Join them as member LAGs of the MC-LAG.

Leaf1 and Leaf2:
  mclag domain 1
   member lag 100
   member lag 101
  !

(Optional) If the servers are installed via PXE, the fallback feature must be enabled on one of the leaf nodes.

  interface link-aggregation 100
   lacp fallback
   commit
  !
  interface link-aggregation 101
   lacp fallback
   commit
  !

Add the aggregation groups to the service VLANs.

Leaf1 and Leaf2:
  interface link-aggregation 100
   switchport trunk vlan 10
  !
  interface link-aggregation 101
   switchport trunk vlan 20
  !

Configure the members of the LAGs and enable storm suppression for BUM traffic.

Leaf1 and Leaf2:
  interface ethernet 0/0
   link-aggregation group 100
   storm-suppress broadcast packets 1000
   storm-suppress multicast packets 1000
   storm-suppress unknown-unicast packets 1000
  !
  interface ethernet 0/1
   link-aggregation group 101
   storm-suppress broadcast packets 1000
   storm-suppress multicast packets 1000
   storm-suppress unknown-unicast packets 1000
  !

After finishing the configurations above, you can check the port channel status with the command show link aggregation summary.

mclag-leaf-1# show link aggregation summary
Flags: A - active, I - inactive, Up - up, Dw - down, N/A - not available,
       S - selected, D - deselected, * - not synced

  No    Team Dev   Protocol     Ports     Description
  0100  lag 100    LACP(A)(Up)  0/0 (S)   N/A
  0101  lag 101    LACP(A)(Up)  0/1 (S)   N/A
  9999  lag 9999   LACP(A)(Up)  0/72 (S)  N/A
                                0/76 (S)

In the example above, "Up" means the current state of the port channel is normal, and "S" means the member port is selected.

Configure EVPN and VXLAN Mapping

Table 13: Configure EVPN and VXLAN mapping on leaf nodes

Configure the EVPN local VTEP IP. The VTEP IP of the MC-LAG peers must be the same.

Leaf1 and Leaf2:
  interface vxlan 0
   source 172.16.2.179
   exit

Configure the Layer 2 VXLAN mapping.

Leaf1 and Leaf2:
  vlan 10
   vni 100
  !
  vlan 20
   vni 200
  !

Configure the Layer 3 VXLAN mapping.

Leaf1 and Leaf2:
  vrf 10123
   vni 10000
   exit-vrf
  !
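On the wire, each VLAN or VRF mapped above is identified by its VNI inside an 8-byte VXLAN header. As an aside, the header layout defined in RFC 7348 can be sketched like this (illustrative, not vendor code):

```python
# Sketch: build the 8-byte VXLAN header (RFC 7348) for a given VNI, e.g.
# VNI 100 for VLAN 10 or L3VNI 10000 for VRF 10123 as mapped above.
# Flags byte 0x08 marks the VNI field as valid; the 24-bit VNI occupies
# the upper bits of the second 32-bit word, followed by a reserved byte.
import struct

def vxlan_header(vni: int) -> bytes:
    assert 0 <= vni < 2**24, "VNI is a 24-bit field"
    return struct.pack("!II", 0x08 << 24, vni << 8)

print(vxlan_header(100).hex())    # 0800000000006400
print(vxlan_header(10000).hex())  # 0800000000271000
```

The 24-bit VNI field is why valid VNI values range from 0 to 16777215, far beyond the 4094 VLANs available locally.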
After the configurations, you can check the VXLAN tunnels with the command `show vxlan tunnel`:

```
mclag-leaf-1# show vxlan tunnel
+--------------+-------+--------+-------+
| RemoteVtep   | VNI   | VLAN   | VRF   |
+==============+=======+========+=======+
| 172.16.2.170 | 100   | 10     |       |
+--------------+-------+--------+-------+
| 172.16.2.170 | 200   | 20     |       |
+--------------+-------+--------+-------+
| 172.16.2.170 | 300   | 30     |       |
+--------------+-------+--------+-------+
| 172.16.2.170 | 10000 |        | 10123 |
+--------------+-------+--------+-------+
```

### Configure Monitor Link Group

The purpose of configuring an interface linkage (monitor link) group is to safeguard the switchover on link or node failure and to reduce packet loss. When all uplink ports change from up to down, the downlink ports are automatically brought down; when some uplink ports recover, the downlink ports are delayed for a period of time before coming back up.

The purpose of configuring interface delayed startup is to reduce packet loss in the traffic cut-back scenario when a node reboots or recovers from a failure. After the node reboots and recovers, its physical interfaces are delayed for a certain period of time before coming up; the default delayed-startup time for all interfaces is 150 s.

In large-scale terminal scenarios (when the number of terminals in the entire network exceeds 1K), the MC-LAG peers need to synchronize a large number of table entries after startup. If the upstream ports recover and traffic cuts back before table-entry synchronization completes, packets will be lost. It is therefore recommended to configure a delay time of 300 seconds or longer on the uplink ports and keep the default delay of 150 seconds on the remaining interfaces, so that the peer-link and keep-alive links recover before the uplink ports and sufficient time is reserved for MC-LAG table-entry synchronization.

**Table 14: Configure monitor link on leaf nodes**

1. Create a monitor-link group and
specify the delay time (identical on Leaf1 and Leaf2):

```
monitor link group aster 60
!
```

2. Specify the physical ports connected to the spine as uplink ports and configure an interface startup delay of 300 seconds (identical on Leaf1 and Leaf2):

```
interface ethernet 0/48
 monitor link aster uplink
 startup delay 300
!
interface ethernet 0/52
 monitor link aster uplink
 startup delay 300
!
```

3. Specify the physical ports connected to the servers as downlink ports (identical on Leaf1 and Leaf2):

```
interface ethernet 0/0
 monitor link aster downlink
!
interface ethernet 0/1
 monitor link aster downlink
!
```

After the configurations, you can check the monitor link with the command `show monitor link`:

```
mclag-leaf-1# show monitor link
+------------+-------+--------------+----------------+-----------+----------+
| Group Name | Delay | Uplink Ports | Downlink Ports | LACP LAGs | Networks |
+============+=======+==============+================+===========+==========+
| aster      | 60    | 0/48         | 0/0            |           |          |
|            |       | 0/52         | 0/1            |           |          |
+------------+-------+--------------+----------------+-----------+----------+
```

After completing the above configuration, you can check the startup delay time of the interfaces with the command `show interface startup delay`:

```
mclag-leaf-1# show interface startup delay
Port   Startup delay time
-----  ------------------
0/47   150
0/48   300
0/52   300
0/56   150
0/60   150
0/64   150
0/68   150
0/72   150
0/76   150
......
```

### (Optional) Configure Layer 3 Service Connectivity with the User Side

Usually, user-side devices are attached to MC-LAG nodes with static routing. To cope with flexible and rapid deployment scenarios and meet the needs of rapid service growth, this solution can also establish dynamic routing protocols between MC-LAG nodes and user-side devices. Specifically, the unique-IP function must be enabled for the corresponding VLAN on the pair of MC-LAG peers. The current unique-IP feature on the
MC-LAG supports both diff-MAC and same-MAC modes. In diff-MAC mode, the shared VLAN on the MC-LAG peers must be configured with different IP and MAC addresses. In same-MAC mode, the two leaf peers must be configured with different IPs but the same MAC address under the VLAN interfaces. When establishing a routing protocol between the MC-LAG peers over the peer-link, diff-MAC mode is required. When establishing a routing protocol between an MC-LAG device and an access-side device, both modes are acceptable, but same-MAC mode is recommended. Below is a configuration example for establishing a BGP neighbor relationship between an MC-LAG device and an access-side device.

**Table 15: (Optional) Configure Layer 3 service connectivity with the user side**

1. Create a VLAN and enable unique IP. Configure different master IPs for establishing routing protocols and the same secondary IP as the VLAN gateway.

Leaf1:

```
vlan 30
!
interface vlan 30
 arp proxy mode evpn
 mac address 00:00:00:30:00:00
 vrf 10123
 ip address 10.30.0.179/24
 ip address 10.30.0.1/24 secondary
!
mclag domain 1
 unique ip vlan 30 same mac
!
interface link aggregation 9999
 switchport trunk vlan 30
!
```

Leaf2:

```
vlan 30
!
interface vlan 30
 arp proxy mode evpn
 mac address 00:00:00:30:00:00
 vrf 10123
 ip address 10.30.0.166/24
 ip address 10.30.0.1/24 secondary
!
mclag domain 1
 unique ip vlan 30 same mac
!
interface link aggregation 9999
 switchport trunk vlan 30
!
```

2. Add the LAG to the service VLAN (identical on Leaf1 and Leaf2):

```
interface link aggregation 101
 switchport trunk vlan 30
!
```

3. Create a BFD profile. The recommended BFD transmit and receive interval is 1000 ms; the local detection multiplier is 3 by default (identical on Leaf1 and Leaf2):

```
bfd
 profile bfd-to-tor
  transmit interval 1000
  receive interval 1000
 exit
!
exit
!
```
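For context on the interval recommendation above: a BFD session is declared down after roughly (receive interval) × (detection multiplier) with no packets received. A quick sketch of that arithmetic, using the profile's 1000 ms interval and the default multiplier of 3 (the function name is illustrative):

```python
def bfd_detection_time_ms(rx_interval_ms, multiplier=3):
    """Approximate worst-case failure detection time: the session is
    declared down after `multiplier` consecutive missed intervals."""
    return rx_interval_ms * multiplier

# With the recommended 1000 ms interval and default multiplier of 3,
# a user-side failure is detected within about 3 seconds.
print(bfd_detection_time_ms(1000))  # 3000
```

Lowering the interval detects failures faster at the cost of more control-plane load per session.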
4. To simplify the configuration, create a peer group for interacting with the user-side device, and enable BFD (identical on Leaf1 and Leaf2):

```
router bgp 65100 vrf 10123
 no bgp ebgp-requires-policy
 neighbor peer-to-tor peer-group
 neighbor peer-to-tor remote-as external
 neighbor peer-to-tor ebgp-multihop 255
 neighbor peer-to-tor bfd profile bfd-to-tor
 neighbor 10.30.0.3 peer-group peer-to-tor
 bgp listen range 10.30.0.0/24 peer-group peer-to-tor
!
```

5. Advertise IPv4 address-family routes to BGP EVPN neighbors (identical on Leaf1 and Leaf2):

```
 address-family l2vpn evpn
  advertise ipv4 unicast
 exit-address-family
exit
!
```

After the configurations, you can check the BGP status with the command `show ip bgp [vrf <vrf-name>] summary`.

Leaf1:

```
mclag-leaf-1# show ip bgp vrf 10123 summary

IPv4 Unicast Summary (VRF vrf 10123):
BGP router identifier 10.200.0.1, local AS number 65100 VRF ID 211
BGP table version 671
RIB entries 51, using 9384 bytes of memory
Peers 1, using 723 KiB of memory
Peer groups 1, using 64 bytes of memory

Neighbor   V  AS     MsgRcvd  MsgSent  TblVer  InQ  OutQ  Up/Down   State/PfxRcd  PfxSnt  Desc
10.30.0.3  4  65236  1872     1770     0       0    0     00:10:59  25            27      N/A

Total number of neighbors 1
```

Leaf2:

```
mclag-leaf-2# show ip bgp vrf 10123 summary

IPv4 Unicast Summary (VRF vrf 10123):
BGP router identifier 10.200.0.1, local AS number 65100 VRF ID 94
BGP table version 10342916
RIB entries 51, using 9384 bytes of memory
Peers 1, using 723 KiB of memory
Peer groups 1, using 64 bytes of memory

Neighbor   V  AS     MsgRcvd  MsgSent  TblVer  InQ  OutQ  Up/Down   State/PfxRcd  PfxSnt  Desc
10.30.0.3  4  65236  27110    44103    0       0    0     00:13:16  27            27      N/A

Total number of neighbors 1
```

### (Optional) Optimize ARP Parameters
By default, the number of ARP aging probes on this series of switches is 5, and the ARP aging time is 300 seconds. In large-scale scenarios (when the number of hosts in the whole network reaches 1K or more), if the switch receives a large number of ARP packets from hosts in a short time, it may not be able to process them all in time due to CPU limitations; some ARP entries may then age out, causing packet loss. In that case, you can increase the ARP probe count to 10 and increase the aging time to 600 or 1200 seconds to ease the pressure on the CPU and avoid packet loss.

**Table 16: (Optional) Optimize ARP parameters on leaf nodes**

1. Set the ARP probe count (identical on Leaf1 and Leaf2):

```
arp probe times 10
!
```

2. Set the ARP aging time (identical on Leaf1 and Leaf2):

```
arp timeout 600
!
```

### (Optional) Configure Cross-VRF Communication

Assume that VLAN 10 is located in VRF 10123 and VLAN 20 is located in VRF 2520 (the configuration steps are skipped here). Since VLAN 10 and VLAN 20 are located in different VRFs, you need to configure route leaking for cross-VRF communication to achieve host intercommunication or route exchange between VLAN 10 and VLAN 20.

**Table 17: (Optional) Configure cross-VRF communication**

1. Configure VRF 10123 and VRF 2520 to leak routes to each other with the BGP IPv4 address family (identical on Leaf1 and Leaf2):

```
router bgp 65100 vrf 2520
!
 address-family ipv4 unicast
  import vrf 10123
 exit-address-family
exit
!
router bgp 65100 vrf 10123
 address-family ipv4 unicast
  import vrf 2520
 exit-address-family
exit
!
```
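Conceptually, `import vrf` copies one VRF's BGP routes into another VRF's table while local routes keep precedence. The toy model below illustrates that effect only; plain dictionaries stand in for per-VRF routing tables, and the names and prefixes are made up for illustration, not taken from the switch implementation.

```python
# Toy model: each VRF keeps its own routing table (prefix -> next hop).
vrf_10123 = {"10.10.0.0/24": "vlan10-gw"}
vrf_2520 = {"10.20.0.0/24": "vlan20-gw"}

def import_vrf(dst, src):
    """Leak routes from src into dst without overwriting dst's own
    routes, mirroring the effect of 'import vrf' in Table 17."""
    for prefix, nexthop in src.items():
        dst.setdefault(prefix, nexthop)

# Mutual leaking, as configured above (snapshot src to avoid feedback).
import_vrf(vrf_2520, dict(vrf_10123))
import_vrf(vrf_10123, dict(vrf_2520))

print(sorted(vrf_10123))  # ['10.10.0.0/24', '10.20.0.0/24']
print(sorted(vrf_2520))   # ['10.10.0.0/24', '10.20.0.0/24']
```

After the leak, hosts behind VLAN 10 and VLAN 20 have routes to each other's subnets in both VRFs.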
2. (Optional) Leak static routes, directly connected routes, and kernel routes from VRF 10123 to VRF 2520, if required (identical on Leaf1 and Leaf2):

```
router bgp 65100 vrf 10123
!
 address-family ipv4 unicast
  redistribute static
  redistribute connected
  redistribute kernel
 exit-address-family
exit
```

3. (Optional) Advertise the leaked IPv4 routes in VRF 2520 to the remote VTEP, if required (identical on Leaf1 and Leaf2):

```
router bgp 65100 vrf 2520
 address-family l2vpn evpn
  advertise ipv4 unicast
 exit-address-family
exit
!
```

4. (Optional) Configure a route map to filter the routes, if required. For example, keep the segment route 25.10.12.0/24 in VRF 2520 from being leaked (identical on Leaf1 and Leaf2):

```
ip prefix-list ip-list-v4 seq 4294967295 deny 0.0.0.0/0 le 32
ip prefix-list ip-list-v4 seq 10 permit 25.10.12.0/24 le 24
!
bgp extcommunity-list expanded extcom-rt-vrf2520 seq 5 permit 2520
!
route-map vrf2520-v4 permit 5
 match extcommunity extcom-rt-vrf2520
exit
!
route-map vrf2520-v4 permit 10
 match ip address prefix-list ip-list-v4
exit
!
router bgp 65100 vrf 2520
!
 address-family ipv4 unicast
  table-map vrf2520-v4
 exit-address-family
exit
!
```
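As a reading aid for the prefix-list above: entries are evaluated in ascending sequence order, the first match wins, and an unmatched prefix is implicitly denied. A minimal sketch of that evaluation logic follows; the helper and its data structures are illustrative, not the switch's implementation.

```python
import ipaddress

# (seq, action, prefix, ge, le) mirrors 'ip prefix-list' semantics.
ENTRIES = [
    (10, "permit", "25.10.12.0/24", 24, 24),    # seq 10: permit 25.10.12.0/24 le 24
    (4294967295, "deny", "0.0.0.0/0", 0, 32),   # highest seq: deny everything else
]

def evaluate(prefix):
    """Evaluate a prefix against the list: lowest sequence first,
    first match wins, implicit deny at the end."""
    net = ipaddress.ip_network(prefix)
    for _seq, action, match, ge, le in sorted(ENTRIES):
        m = ipaddress.ip_network(match)
        if net.subnet_of(m) and ge <= net.prefixlen <= le:
            return action
    return "deny"  # implicit deny

print(evaluate("25.10.12.0/24"))  # permit
print(evaluate("10.30.0.0/24"))   # deny
```

Note that `le 24` on a /24 base prefix matches only that exact prefix; a more-specific /25 inside 25.10.12.0/24 would fall through to the deny entry.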
## Maintenance

### Common Maintenance Commands

**Table 18: Display MC-LAG status**

- Display the MC-LAG status: `show mclag status`
- Display the MC-LAG parameter consistency check results: `show mclag consistency check result [number]`

**Table 19: Display interface status**

- Display the status of all interfaces: `show interface summary`
- Display port-channel status: `show link aggregation summary`
- Display the IP configuration and status of Layer 3 ports: `show ip interfaces`
- Display error-down interfaces: `show interface errdown`
- Display VLAN configurations: `show vlan summary`
- Display delayed-startup information for interfaces: `show interface startup delay`
- Display the monitor link configuration: `show monitor link`
- Display counter statistics: `show counters interface`

**Table 20: Display commonly used table entries**

- Display local MAC entries: `show mac address`
- Display remote MAC entries (you can specify the VTEP IP): `show vxlan remotemac { all | A.B.C.D }`
- Display local and remote ARP entries: `show arp`
- Display interface counter statistics: `show counters interface`
- Display underlay routing information: `show ip route`
- Display overlay routing information: `show ip route vrf <vrf-name>`
- Display underlay BGP neighbors: `show ip bgp [vrf <vrf-name>] summary`
- Display overlay BGP neighbors: `show bgp l2vpn evpn summary`
- Display all BGP neighbors: `show bgp [vrf <vrf-name>] summary`
- Display VXLAN tunnel establishment status: `show vxlan tunnel`
- Display local VXLAN mapping information: `show vxlan map`
- Display EVPN Type-1 routes: `show bgp l2vpn evpn route type ead`
- Display EVPN Type-2 routes: `show bgp l2vpn evpn route type macip`
- Display routes advertised to
BGP IPv4 neighbors: `show ip bgp neighbors A.B.C.D advertised-routes`
- Display all routes received from BGP IPv4 neighbors (the feature `soft-reconfiguration inbound` must be enabled first): `show ip bgp neighbors A.B.C.D received-routes`
- Display routes advertised to BGP EVPN neighbors: `show ip bgp l2vpn evpn neighbors A.B.C.D advertised-routes`
- Display all routes received from BGP EVPN neighbors: `show ip bgp l2vpn evpn neighbors A.B.C.D routes`

### How to Upgrade

By following the procedure below to upgrade the spine and leaf nodes, you can significantly reduce the impact on service during node upgrades in an EVPN multihoming scenario.

#### Preparation for Upgrade

1. Back up the configuration files. Back up the configuration file of your system to a server or locally; the path on the switch is /etc/sonic/config_db.json.

2. Collect the table entries. For spine nodes, collect the BGP neighbor and route information before upgrading. For leaf nodes, collect the MC-LAG, ARP, MAC, LAG, VRF, VNI, BGP, route, VXLAN tunnel, ES, and other information before upgrading, so that you can verify that the status of the upgraded switch is normal afterwards.

3. Transfer the image to the switch. Get the latest software image and its MD5 checksum.

**Notes:** `AsterNOS-V3.1-RxxxPxx-FL.bin` is for the CX308P-48Y-N-V2 and CX532P-N-V2; `AsterNOS-V3.1-RxxxPxx.bin` is for the other models.

Log in to the switch and copy the software image from the remote server to the target switch via SCP. For example:

```
mclag-leaf-1# scp source sonic@10.250.0.243:/asternos/release/AsterNOS-V3.1-R0407P00-FL.bin target
The authenticity of host '10.250.0.243 (10.250.0.243)' can't be established.
ED25519 key fingerprint is SHA256:gpanann/+mh0zxnir/3yxo0v0bdfkgd0lzwrqueukye.
This key is not known by any other names.
Are you sure you want to continue connecting (yes/no/[fingerprint])?
```
```
yes
Warning: Permanently added '10.250.0.243' (ED25519) to the list of known hosts.
sonic@10.250.0.243's password:
AsterNOS-V3.1-R0407P00-FL.bin    100%  1425MB  82.8MB/s  00:17
```

```
leaf1# system ls
AsterNOS-V3.1-R0407P00-FL.bin
```

After the transfer is complete, check the MD5 of the image. If it is not the same as the given value, the file is incomplete or was corrupted during transfer, and it must be transferred again.

```
mclag-leaf-1# system md5sum AsterNOS-V3.1-R0407P00-FL.bin
fed40a54f42fa54ced69c99e4311ba7e  AsterNOS-V3.1-R0407P00-FL.bin
```

#### Upgrade Spine Nodes

The following is an example of the procedure for upgrading Spine1.

1. First, manually switch the network-side traffic to Spine2, depending on the routing protocols in the network. For BGP, you can enable graceful shutdown to lower the priority of the routes advertised from Spine1.

- Enter the configuration view: `configure terminal`
- Enter the BGP configuration view: `router bgp <asn>`
- Enable BGP graceful shutdown: `bgp graceful-shutdown`
- Exit the configuration view: `end`

2. Verify that there is no traffic passing through Spine1.

- Clear the statistics: `clear counters interface`
- Display interface counter statistics: `show counters interface`

3. Save the configuration.

- Save the configuration: `write`

**Notes:** Be sure that you have saved your configuration before installing a new image.

4. Install the new image and reboot.

- List the files in your directory: `system ls`
- Install the new image: `image update <bin-file>`
- Confirm that the image is successfully installed: `show image`
- Reboot the switch: `reboot`

5. After the reboot, wait about 6 minutes and then verify the operational status of the upgraded switch.

- Check whether the current version is the expected one: `show version`
- Check that the containers on the switch are operating
normally: `system docker ps`
- Check the configuration: `show running-config`
- Check whether the physical interfaces are up: `show interface summary`

6. Check the BGP sessions and routing entries.

- Check underlay BGP neighbor status: `show ip bgp summary`
- Check overlay BGP neighbor status: `show bgp l2vpn evpn summary`
- Check the routing table: `show ip route`

7. Restore the network-side traffic.

- Enter the configuration view: `configure terminal`
- Enter the BGP configuration view: `router bgp <asn>`
- Disable BGP graceful shutdown: `no bgp graceful-shutdown`
- Exit the configuration view: `end`
- Save the configuration: `write`

8. Check whether the service has been restored. Verify that traffic is restored on Spine1.

- Display interface counter statistics: `show counters interface`

The upgrade of Spine1 is now done; repeat the above procedure for Spine2.

#### Upgrade Leaf Nodes

**Notes:** There is no requirement on the order in which the leaf nodes are upgraded, but it is recommended to leave a gap of at least 10 minutes between upgrades. Before upgrading, make sure that the dual-homed hosts attached to the MC-LAG are not in a single-homed state. If there is any unusual packet loss during the upgrade, revert the action in time and contact the personnel involved; we will provide technical support.

The following is an example of the procedure for upgrading Leaf1.

1. Log in to the Leaf1 and Leaf2 switches and check the status of the aggregation ports to make sure that the dual-homed hosts attached to the MC-LAG are not in a single-homed state.

- Display port-channel status: `show link aggregation summary`

2. First, manually switch the network-side traffic to Leaf2, depending on the routing protocols in the network. For BGP, you can enable graceful shutdown to lower the priority
of the routes advertised from Leaf1.

- Enter the configuration view: `configure terminal`
- Enter the BGP configuration view: `router bgp <asn>`
- Enable BGP graceful shutdown: `bgp graceful-shutdown`
- Exit the configuration view: `end`

3. Manually bring down all in-use LAG interfaces on Leaf1 at the protocol level and then shut them down, thereby switching all network-side and user-side traffic to Leaf2.

- Enter the LAG configuration view: `interface link aggregation <lag-id>`
- Bring the LAG interface down at the protocol level: `lacp graceful down`
- Shut down the LAG interface: `shutdown`
- Exit the configuration view: `end`

4. Verify that there is no traffic passing through Leaf1.

- Clear the statistics: `clear counters interface`
- Display interface counter statistics: `show counters interface`

5. Save the configuration.

- Save the configuration: `write`

**Notes:** Be sure that you have saved your configuration before installing a new image.

6. Install the new image and reboot.

- List the files in your directory: `system ls`
- Install the new image: `image update <bin-file>`
- Confirm that the image is successfully installed: `show image`
- Reboot the switch: `reboot`

7. After the reboot, wait about 6 minutes and then verify the operational status of the upgraded switch.

- Check whether the current version is the expected one: `show version`
- Check that the containers on the switch are operating normally: `system docker ps`
- Check the configuration: `show running-config`
- Check whether the physical interfaces are up: `show interface summary`

8. Check the BGP sessions, check that the MC-LAG is in the OK state, and check that the VXLAN tunnels have been established.

- Check underlay BGP neighbor status: `show ip bgp [vrf <vrf-name>] summary`
- Check overlay BGP neighbor status: `show bgp l2vpn evpn summary`
- Display the MC-LAG status: `show mclag state`
- Check the VXLAN tunnel:
`show vxlan tunnel`

9. Restore the user-side traffic: bring the leaf downlink ports back up.

- Enter the LAG configuration view: `interface link aggregation <lag-id>`
- Start up the LAG interface: `no shutdown`

10. Check that the ARP, MAC, and routing table entries on Leaf1 are fully synchronized with Leaf2.

- Display local MAC entries: `show mac address`
- Display remote MAC entries: `show vxlan remotemac { all | A.B.C.D }`
- Display local and remote ARP entries: `show arp`
- Display the routing entries of remote hosts: `show ip route [vrf <vrf-name>]`

**Notes:** The entries on Leaf1 must be fully synchronized before switching the network-side traffic back; otherwise there will be packet loss.

11. Restore the network-side traffic.

- Enter the configuration view: `configure terminal`
- Enter the BGP configuration view: `router bgp <asn>`
- Disable BGP graceful shutdown: `no bgp graceful-shutdown`
- Exit the configuration view: `end`
- Save the configuration: `write`

12. Check whether the service has been restored. Verify that traffic is restored on Leaf1, and also check with the network administrator to make sure that services are normal.

- Display interface counter statistics: `show counters interface`

The upgrade of Leaf1 is now done; repeat the above procedure for Leaf2.

### Seamless Migration from Standalone to MC-LAG

To migrate a standalone leaf node to MC-LAG dual-homing mode (i.e., expanding an existing operational leaf with a peer node to build a high-availability architecture), operators should follow a standardized workflow: configuration deployment -> logical isolation -> physical integration -> state synchronization -> traffic switchover. This ensures minimal service impact and mitigates loop risks. Assume Leaf1 is currently operational and Leaf2 is being added to form the MC-LAG pair.

1. Configuration deployment. Power on Leaf2 and gain access via the management network or console port. Complete the baseline
configurations for BGP, MC-LAG, VRF, and Layer 3 gateways (refer to the section "Configure Leaf Nodes") to ensure configuration consistency between Leaf1 and Leaf2.

**Notes:** For single-homed hosts, the corresponding VLANs must be allowed on the peer-link of Leaf2.

2. Logical isolation. Configure graceful shutdown on Leaf2 based on the active routing protocols. This ensures that Leaf2 advertises routes with lower priority, preventing it from attracting northbound traffic before the MC-LAG forwarding tables are fully synchronized.

- Enter the configuration view: `configure terminal`
- Enter the BGP configuration view: `router bgp <asn>`
- Enable BGP graceful shutdown: `bgp graceful-shutdown`
- Exit the configuration view: `end`

3. Establish physical connectivity. After completing the configurations, connect the physical links from Leaf2 to Leaf1 (the peer-link) and to the spine layer. This prevents traffic from being prematurely diverted to the new node during the configuration phase.

4. Verify control-plane integrity. Check and confirm that BGP peering is established and that the MC-LAG status is "OK" (synchronized/up). Inspect the VXLAN tunnel information.

- Check whether the physical interfaces are up: `show interface summary`
- Check the underlay BGP neighbor status: `show ip bgp [vrf <vrf-name>] summary`
- Check the overlay BGP neighbor status: `show bgp l2vpn evpn summary`
- Check the MC-LAG status: `show mclag state`
- Check the VXLAN tunnel status: `show vxlan tunnel`

5. Host connectivity and table validation. Configure the server bond (LAG) and connect the cables between Leaf2 and the servers. Verify that the LAG status on Leaf2 is "Up" and ensure that the ARP, MAC, and routing tables are fully synchronized with Leaf1 (the number of table entries on Leaf1 and Leaf2 should be identical under normal conditions).

- Check the operational state of the link-aggregation group interfaces: `show link aggregation summary`
- Check the
routing entries: `show ip route vrf all`
- Check the ARP entries: `show arp`
- Check the FDB entries: `show mac address`
- Check the remote MAC entries: `show vxlan remotemac { all | A.B.C.D }`

6. Traffic cutover. Disable graceful shutdown on Leaf2. This restores standard routing priority, allowing northbound and southbound traffic to transition smoothly onto the dual-homed path.

- Enter the configuration view: `configure terminal`
- Enter the BGP configuration view: `router bgp <asn>`
- Disable BGP graceful shutdown: `no bgp graceful-shutdown`
- Exit the configuration view: `end`
- Save the configuration: `write`

**Notes:** The table entries must be fully synchronized between Leaf1 and Leaf2 before switching the network-side traffic; otherwise there will be packet loss.

7. Traffic and service verification. Verify that traffic forwarding on Leaf2 is functioning correctly, and coordinate with the network management system (NMS) or operations team to confirm that all services are operational.

- Display interface counter statistics: `show counters interface`

8. Configuration persistence. Save the configuration to ensure all changes persist across reboots.

- Save the configuration: `write`
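The synchronization prerequisite above boils down to comparing table-entry counts collected from both peers. Below is a minimal sketch of such a comparison; the counts here are placeholders that you would gather yourself from the show commands listed above, and the helper name is illustrative.

```python
def entries_in_sync(leaf1_counts, leaf2_counts):
    """Return the names of tables whose entry counts differ between peers."""
    tables = set(leaf1_counts) | set(leaf2_counts)
    return sorted(t for t in tables
                  if leaf1_counts.get(t) != leaf2_counts.get(t))

# Placeholder counts, e.g. tallied from 'show arp', 'show mac address',
# and 'show ip route vrf all' on each node.
leaf1 = {"arp": 1024, "mac": 980, "routes": 512}
leaf2 = {"arp": 1024, "mac": 975, "routes": 512}

print(entries_in_sync(leaf1, leaf2))  # ['mac'] -> wait before cutover
```

An empty result suggests the peers are synchronized and the cutover can proceed; any listed table means synchronization is still in progress.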
