# CX864E-N Best Practices for Medium-to-Large Scale AI Compute Backend Fabric
## Preface

This guide provides a detailed, standardized networking solution, configuration guidance, and maintenance manual for building a medium-to-large scale AI compute backend fabric. The solution implements a 2-tier Clos network using Asterfusion CX864E-N switches, based on a rail-optimized architecture.

## Target Audience

This guide is intended for solution planners, designers, and on-site implementation engineers who are familiar with Asterfusion data center switches, RoCE, PFC, ECN, and related technologies.

## Overview

The rail-optimized architecture is recommended for deploying the backend fabric of medium-to-large scale AI clusters. Its key design is to connect the same-indexed NICs of every server to the same leaf switch, ensuring that multi-node GPU communication completes in the fewest possible hops. In this design, traffic between GPU nodes can use internal NVSwitch paths and reach its destination in a single network hop without crossing multiple switches, avoiding additional latency. The details are as follows:

- **Intra-server**: The 8 GPUs connect to the NVSwitch via the NVLink bus, achieving low-latency intra-server communication and reducing scale-out network transmission pressure.
- **Server-to-leaf**: All servers follow a uniform cabling rule; NICs connect to the leaf switches according to the rule "NIC1→Leaf1, NIC2→Leaf2, ...".
- **Network layer**: Leaf and spine switches are fully meshed in a 2-tier Clos architecture.

> **Note**: NVSwitch is a high-speed switching chip by NVIDIA for scale-up networks, enabling GPUs to communicate at maximum NVLink speeds.

## Typical Configuration Example

This example illustrates an AI cluster consisting of 64 compute nodes (256 GPUs total, 4 per server). The deployment includes 6 CX864E-N switches: 2 spine nodes and 4 leaf nodes. Key design principles:

- **Rail-optimized access**: Each GPU connects to a dedicated NIC; NICs follow the "NIC *n* to Leaf *n*" rule, with an independent subnet per rail.
- **2-tier Clos fabric**: Leaf and spine switches are fully meshed. Leveraging IPv6 link-local addresses, unnumbered BGP neighbors are established to exchange rail subnet routes, eliminating the need for IP planning on interconnect interfaces.
- **1:1 oversubscription**: To ensure non-blocking transport, the oversubscription ratio on leaf switches is strictly maintained at 1:1 (see the sketch after the planning tables below).
- **Unified lossless fabric**: Easy RoCE and advanced load-balancing features are enabled on both leaf and spine nodes.

## Network Topology

> **Note**: For deployment convenience, it is recommended to connect the upper half of the leaf interfaces to servers and the lower half to spines.

The AS number, loopback, and gateway VLAN IP planning for each node is as follows.

Table 1: AS number and Loopback IP planning

| Device Name | AS Number | Loopback 0 IP Address |
|-------------|-----------|-----------------------|
| Leaf1  | 65111 | 10.1.0.111/32 |
| Leaf2  | 65112 | 10.1.0.112/32 |
| Leaf3  | 65113 | 10.1.0.113/32 |
| Leaf4  | 65114 | 10.1.0.114/32 |
| Spine1 | 65115 | 10.1.0.115/32 |
| Spine2 | 65116 | 10.1.0.116/32 |

Table 2: Gateway VLAN IP planning

| Device Name | VLAN ID | Gateway IP Address |
|-------------|---------|--------------------|
| Leaf1 | 101 | 10.10.1.1/25   |
| Leaf2 | 102 | 10.10.1.129/25 |
| Leaf3 | 103 | 10.10.2.1/25   |
| Leaf4 | 104 | 10.10.2.129/25 |

## Configuration Overview

Table 3: Configuration overview

| Task | Configuration Roadmap |
|------|-----------------------|
| Leaf node | 1. (Optional) Configure NIC-side interface breakout. 2. Configure the gateway VLAN and IP addresses. 3. Configure BGP for L3 connectivity. 4. Enable Easy RoCE. 5. Configure ARS. |
| Spine node | 1. Configure BGP for L3 connectivity. 2. Enable Easy RoCE. 3. Configure ARS and the hash seed. |
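As referenced above, the following minimal Python sketch makes the cabling rule and the per-leaf 1:1 bandwidth budget concrete. All constants mirror the 64-node example; the helper names are hypothetical and nothing here is device-specific.

```python
# Illustrative sketch of the rail cabling rule and the 1:1 oversubscription
# budget for the 64-node example above. Constants mirror the example
# topology; nothing here queries a device.

SERVERS = 64            # compute nodes
NICS_PER_SERVER = 4     # one 400G NIC per GPU (one rail per NIC)
SPINES = 2
NIC_SPEED_G = 400       # per-NIC speed, Gbps
UPLINK_SPEED_G = 800    # spine-facing port speed, Gbps

# Rail rule: NIC n of every server lands on Leaf n.
def leaf_for_nic(nic_index: int) -> str:
    return f"Leaf{nic_index + 1}"

cabling = {f"NIC{n + 1}": leaf_for_nic(n) for n in range(NICS_PER_SERVER)}
print(cabling)  # {'NIC1': 'Leaf1', 'NIC2': 'Leaf2', 'NIC3': 'Leaf3', 'NIC4': 'Leaf4'}

# 1:1 oversubscription: each leaf's uplink capacity must equal its
# server-facing capacity.
downlink_g = SERVERS * NIC_SPEED_G            # 64 x 400G = 25600G per leaf
uplink_ports = downlink_g // UPLINK_SPEED_G   # 32 x 800G spine-facing ports
print(f"per leaf: {downlink_g}G down, {uplink_ports} x {UPLINK_SPEED_G}G up "
      f"({uplink_ports // SPINES} ports to each spine)")
```

The resulting 32 spine-facing 800G ports per leaf line up with the half-and-half port split recommended in the topology note above.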
## Configuring Leaf Switches

### (Optional) Configure NIC-Side Interface Breakout

When connecting 400G NICs to CX864E-N switches, split each downlink 800G port into two 400G interfaces.

Table 4: Interface breakout configuration (Leaf1)

**Enter global configuration mode**

```
configure terminal
```

**Break out the server-facing (upper) 800G ports**

```
interface range ethernet 0/0-0/248
breakout 2x400g[200g]
!
```

Single-port alternative:

```
interface ethernet 0/0
breakout 2x400g[200g]
!
```

After completing the configuration, verify the interface status using the `show interface summary` command.

### Gateway VLAN and IP Configuration

Table 5: VLAN and interface IP configuration (Leaf1)

**Set the hostname**

```
hostname leaf1
```

**Configure the gateway VLAN**

```
vlan 101
!
interface vlan 101
ip address 10.10.1.1/25
!
```

**Assign downlink ports**

```
interface range ethernet 0/0-0/252
switchport access vlan 101
!
```

If the current version does not support batch configuration:

```
interface ethernet 0/0
switchport access vlan 101
!
```

Verify the VLAN configuration using the `show vlan summary` command.

### BGP Configuration for L3 Connectivity

Enable the IPv6 link-local feature on leaf-spine interfaces to establish unnumbered BGP neighbors.

Table 6: BGP neighbor configuration on the leaf (Leaf1)

**Enable IPv6 link-local**

```
interface range ethernet 0/256-0/504
ipv6 use link-local
!
```

If the current version does not support batch configuration:

```
interface ethernet 0/256
ipv6 use link-local
!
```

**Configure Loopback 0**

```
interface loopback 0
ip address 10.1.0.111/32
!
```

**Global BGP settings**

```
router bgp 65111
bgp router-id 10.1.0.111
no bgp ebgp-requires-policy
bgp bestpath as-path multipath-relax
bgp max-med on-startup 120
bgp graceful-restart
```

**Unnumbered peer group**

```
neighbor peer-unnumber-bgp peer-group
neighbor peer-unnumber-bgp remote-as external
neighbor range ethernet 0/256-0/504 interface peer-group peer-unnumber-bgp
```

If the current version does not support batch configuration:

```
neighbor peer-unnumber-bgp peer-group
neighbor peer-unnumber-bgp remote-as external
neighbor ethernet 0/256 interface peer-group peer-unnumber-bgp
neighbor ethernet 0/264 interface peer-group peer-unnumber-bgp
```

**Route advertisement**

```
address-family ipv4 unicast
redistribute connected
exit-address-family
!
```

Verify the BGP configuration and status using the `show bgp summary` command.
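Unnumbered BGP needs no interconnect IP planning because every Ethernet interface automatically derives a unique IPv6 link-local address and BGP peers over those addresses. The sketch below derives a modified EUI-64 link-local address from a MAC address, one common derivation method; the switch's actual interface identifiers may differ, so treat this purely as an illustration of the mechanism.

```python
# Minimal sketch of why unnumbered BGP needs no interconnect IP planning:
# each interface derives a unique IPv6 link-local address (fe80::/64),
# here via modified EUI-64 from its MAC, so neighbors can peer without
# any configured interconnect IPs. Illustrative only.
import ipaddress

def link_local_from_mac(mac: str) -> ipaddress.IPv6Address:
    octets = [int(b, 16) for b in mac.split(":")]
    octets[0] ^= 0x02                              # flip the universal/local bit
    eui64 = octets[:3] + [0xFF, 0xFE] + octets[3:] # insert FF:FE in the middle
    suffix = int.from_bytes(bytes(eui64), "big")
    return ipaddress.IPv6Address((0xFE80 << 112) | suffix)

print(link_local_from_mac("00:1b:21:aa:bb:cc"))    # fe80::21b:21ff:feaa:bbcc
```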
### Easy RoCE Configuration

The CX-N series switches support queues 0-7 (8 queues in total). Queue 3 and queue 4 are lossless (at most two lossless queues are supported), while the others are lossy. The default template uses the system default DSCP mapping, enables PFC and ECN on queues 3 and 4, and sets strict-priority (SP) scheduling for queues 6 and 7.

When creating a template, you can specify three parameters:

- **Cable length**: the cable length, which affects PFC and ECN parameter calculation. Options: 5m / 40m / 100m / 300m. If the exact length is unavailable, choose the closest value (e.g., choose 5m for a 10m cable).
- **Incast level**: the traffic incast model, which affects PFC parameter calculation. Options: low (e.g., 1:1) / medium (e.g., 3:1) / high (e.g., 10:1). Low is typically used for a GPU backend fabric.
- **Traffic model**: the business type (throughput-sensitive, latency-sensitive, or balanced), which affects ECN parameter calculation. Options: throughput / latency / balance. Balance and throughput are typically used for a GPU backend fabric.

If the provided lossless RoCE configuration does not fully suit your scenario, refer to the RoCE fine-tuning documentation (docid: vjrveqpakcp gczqmzf3v).

Table 7: Enabling Easy RoCE (Leaf1)

**(Optional) Modify the lossless queues (requires save and reload to take effect)**

```
no priority-flow-control enable 3
no priority-flow-control enable 4
priority-flow-control enable <queue-id>
write
reload
```

**Select an Easy RoCE template and apply it to all interfaces**

```
qos roce lossless cable-length 5m incast-level low traffic-model throughput
qos service-policy roce-lossless-5m-low-throughput
```

Verify the RoCE configuration using the `show qos roce` command:

```
leaf1# show qos roce
NOTICE: displaying configurations of in-use RoCE profiles

==> RoCE profile: roce-lossless-5m-low-throughput | RoCE policy-map: roce-lossless-5m-low-throughput-400g <==
+--------------------+----------------+-----------------------------------------------------+
|                    | Operational    | Description                                         |
+====================+================+=====================================================+
| Mode               | lossless       | QoS RoCE mode                                       |
+--------------------+----------------+-----------------------------------------------------+
| Status             | bind 0/0-0/252 | QoS RoCE binding status                             |
+--------------------+----------------+-----------------------------------------------------+
| Cable Length       | 5m             | Cable length in meters for QoS RoCE lossless config |
+--------------------+----------------+-----------------------------------------------------+
| Congestion Control |                |                                                     |
|   Congestion Mode  | ECN            | Congestion control mode                             |
|   Enabled TC       | 3,4            | Congestion control enabled traffic classes         |
|   Max Threshold    | 10094080       | Congestion control max threshold                    |
|   Min Threshold    | 2000000        | Congestion control min threshold                    |
+--------------------+----------------+-----------------------------------------------------+
| PFC                |                |                                                     |
|   PFC Priority     | 3,4            | PFC enabled switch priorities                       |
|   TX Status        | enabled        | PFC TX status                                       |
|   RX Status        | enabled        | PFC RX status                                       |
+--------------------+----------------+-----------------------------------------------------+
| Trust              |                |                                                     |
|   Trust Mode       | DSCP           | Trust setting for packet classification             |
+--------------------+----------------+-----------------------------------------------------+

====> RoCE DSCP->SP mapping configurations <====
+-------------------------+-----------------+
| DSCP                    | Switch Priority |
+=========================+=================+
| 0,1,2,3,4,5,6,7         | 0               |
| 8,9,10,11,12,13,14,15   | 1               |
| 16,17,18,19,20,21,22,23 | 2               |
| 24,25,26,27,28,29,30,31 | 3               |
| 32,33,34,35,36,37,38,39 | 4               |
| 40,41,42,43,44,45,46,47 | 5               |
| 48,49,50,51,52,53,54,55 | 6               |
| 56,57,58,59,60,61,62,63 | 7               |
+-------------------------+-----------------+

====> RoCE SP->TC mapping & ETS configurations <====
+-----------------+------+--------+
| Switch Priority | Mode | Weight |
+=================+======+========+
| 6               | SP   |        |
| 7               | SP   |        |
+-----------------+------+--------+

====> PFC profile configurations <====
+----------------------------------------------+-----------------+
| Profile Name                                 | Switch Priority |
+==============================================+=================+
| egress_lossless_profile                      | 3,4             |
| egress_lossy_profile                         | 0,1,2,5,6,7     |
| ingress_lossy_profile                        | 0,1,2,5,6,7     |
| pg_lossless_10000_40m_profile                | 3,4             |
| roce-lossless-5m-low-throughput-400g profile | 3,4             |
| roce-lossless-5m-low-throughput-800g profile | 3,4             |
+----------------------------------------------+-----------------+
```

The output for the 800G policy-map (`roce-lossless-5m-low-throughput-800g`) has the same structure; only the binding range (Status: bind 0/256-0/504) and the ECN thresholds (Max Threshold: 11261952, Min Threshold: 2231378) differ.
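In the default mapping shown above, DSCP values map to switch priorities in blocks of eight, which is equivalent to `priority = dscp >> 3`. The minimal sketch below reproduces that mapping; the sample DSCP values (e.g., 26 for RoCE data) reflect common host-side marking practice and are an assumption, not a switch requirement.

```python
# Reproduce the default DSCP -> switch-priority mapping from the output
# above: 8 consecutive DSCP values share one priority, i.e. dscp >> 3.
# With queues 3 and 4 lossless, hosts marking RoCE traffic with e.g.
# DSCP 26 land in lossless queue 3 (an assumed host-side convention).

def switch_priority(dscp: int) -> int:
    assert 0 <= dscp <= 63
    return dscp >> 3   # 8 DSCP values per priority

LOSSLESS = {3, 4}
for dscp in (0, 26, 35, 48):
    sp = switch_priority(dscp)
    kind = "lossless" if sp in LOSSLESS else "lossy"
    print(f"DSCP {dscp:2d} -> priority {sp} ({kind})")
```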
### ARS (Adaptive Routing Switch) Configuration

The deployment logic for ARS follows three phases: create ARS instances → bind next-hop groups → fine-tune the idle time.

**Architectural relationship.** It is essential to understand that ARS instances and next-hop groups (ECMP groups) maintain a one-to-one mapping.

- At the spine layer, each leaf switch advertises unique routes. For example, the ECMP group for routes advertised by Leaf1 consists of all physical links connecting the spine to Leaf1. Consequently, the spine requires a dedicated next-hop group for each leaf, and the number of ARS instances on a spine switch must match the total number of leaf switches.
- At the leaf layer, all routes advertised by the other leafs share the same ECMP members (the uplink paths to Spine1 and Spine2). Therefore, a leaf switch requires only a single ARS instance to manage all northbound traffic.

**Binding destination networks.** After creating the instances, associate the destination network segments with their corresponding ARS instances. For Spine1, each next-hop group targets the links to one leaf; for the instance facing Leaf1, you only need to specify the Loopback 0 IP of Leaf1 as the destination. For Leaf1, the next-hop group targets the uplinks to both spines; specifying the Loopback 0 IP of any other leaf in the cluster binds that traffic to the corresponding ARS instance.
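A compact way to hold the instance-to-group relationship in mind is the mapping below; the dictionary names are hypothetical and model intent only, not device state.

```python
# Illustrative model of the ARS instance <-> next-hop-group mapping
# described above. Values mirror the example topology's loopback plan.
leaf_loopbacks = {
    "leaf1": "10.1.0.111/32", "leaf2": "10.1.0.112/32",
    "leaf3": "10.1.0.113/32", "leaf4": "10.1.0.114/32",
}

# Spine: one ARS instance per leaf, keyed by that leaf's loopback,
# because each leaf's routes resolve to a distinct group of links.
spine_instances = {lo: f"to-{leaf}" for leaf, lo in leaf_loopbacks.items()}

# Leaf1: every remote route shares the same spine uplinks, so a single
# instance suffices; binding any one remote loopback (leaf2's here, as
# in Table 8 below) identifies that shared ECMP group.
leaf1_instances = {leaf_loopbacks["leaf2"]: "to-spine"}

print(len(spine_instances), "instances on each spine")  # 4
print(len(leaf1_instances), "instance on each leaf")    # 1
```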
**Idle time calibration.** The idle time determines the granularity at which a flow is split into a series of flowlets: a flow split is triggered whenever the inter-frame gap exceeds this interval. It is recommended to set the idle time to RTT/2. Start with the system default and fine-tune based on real-time traffic load (see the sketch after the table below):

- Increase the idle time if significant packet reordering is detected at the endpoints.
- Decrease the idle time if load distribution between the leaf and spine layers appears unbalanced.

> **Note**: RTT (round-trip time) is the total time required for a data packet to travel from the sender to the receiver and back again.

Table 8: ARS configuration (Leaf1)

**Enable ARS**

```
ars profile
```

**Configure the instance**

```
ars instance to-spine
idle-time 10
!
```

**Bind the next-hop group**

```
ars nexthop group 10.1.0.112/32 instance to-spine
```

Verify the ARS configuration using the `show ars instance` command:

```
leaf1# show ars instance
Instance Name  Assign Mode          Idle Time  Max Flows  Binding Configs               Nexthop Group Members  Member Count
-------------  -------------------  ---------  ---------  ----------------------------  ---------------------  ------------
to-spine       per-flowlet quality  10         512        10.1.0.112/32 in-vrf default  N/A                    N/A
```

The Nexthop Group Members and Member Count fields reflect the actual next-hop group members and member count once the route becomes reachable.
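The sketch referenced above: a minimal model (illustrative only, not switch firmware) of how an arrival sequence splits into flowlets once an inter-packet gap exceeds the configured idle time. This gap-based splitting is what lets ARS re-select a path between flowlets without reordering packets inside one.

```python
# Minimal flowlet-splitting model: a new flowlet begins whenever the gap
# between consecutive packets of a flow exceeds the configured idle time.

IDLE_TIME_US = 10  # matches the example configuration above (units assumed)

def split_flowlets(arrival_times_us):
    flowlets, current = [], []
    for t in arrival_times_us:
        if current and t - current[-1] > IDLE_TIME_US:
            flowlets.append(current)   # gap > idle time: start a new flowlet
            current = []
        current.append(t)
    if current:
        flowlets.append(current)
    return flowlets

print(split_flowlets([0, 2, 4, 30, 31, 60]))  # -> [[0, 2, 4], [30, 31], [60]]
```

A larger idle time merges bursts into fewer, longer flowlets (fewer path changes, less reordering risk); a smaller one produces more flowlets (finer load balancing), matching the tuning guidance above.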
## Configuring Spine Switches

### BGP Configuration for L3 Connectivity

Table 9: BGP neighbor configuration on the spine (Spine1)

**Set the hostname**

```
hostname spine1
```

**Enter global configuration mode**

```
configure terminal
```

**Enable IPv6 link-local**

```
interface range ethernet 0/0-0/504
ipv6 use link-local
!
```

If the current version does not support batch configuration:

```
interface ethernet 0/0
ipv6 use link-local
!
```

**Configure Loopback 0**

```
interface loopback 0
ip address 10.1.0.115/32
!
```

**Global BGP settings**

```
router bgp 65115
bgp router-id 10.1.0.115
no bgp ebgp-requires-policy
bgp bestpath as-path multipath-relax
bgp max-med on-startup 120
bgp graceful-restart
```

**Unnumbered peer group**

```
neighbor peer-unnumber-bgp peer-group
neighbor peer-unnumber-bgp remote-as external
neighbor range ethernet 0/0-0/504 interface peer-group peer-unnumber-bgp
```

If the current version does not support batch configuration:

```
neighbor peer-unnumber-bgp peer-group
neighbor peer-unnumber-bgp remote-as external
neighbor ethernet 0/0 interface peer-group peer-unnumber-bgp
neighbor ethernet 0/8 interface peer-group peer-unnumber-bgp
```

Verify the BGP configuration and status using the `show bgp summary` command.

### Easy RoCE Configuration

Table 10: Enabling Easy RoCE (Spine1)

**(Optional) Modify the lossless queues (requires save and reload to take effect)**

```
no priority-flow-control enable 3
no priority-flow-control enable 4
priority-flow-control enable <queue-id>
write
reload
```

**Select an Easy RoCE template and apply it to all interfaces**

```
qos roce lossless cable-length 5m incast-level low traffic-model throughput
qos service-policy roce-lossless-5m-low-throughput
```

Verify the RoCE configuration using the `show qos roce` command.

### ARS and Hash Seed Configuration

As previously described, the spine node requires a dedicated ARS instance for each leaf node; each instance is then bound to its corresponding next-hop group by specifying the Loopback 0 IP of that leaf.

The purpose of configuring a hash seed is to mitigate hash polarization (also known as hash imbalance), a phenomenon in which traffic remains unevenly distributed across available paths after undergoing multiple stages of hashing. Hash polarization is most prevalent in Clos topologies. It typically arises when multi-tier switches use identical ASICs for ECMP, since these often employ the same hashing algorithm by default; the second-tier switches then fail to effectively redistribute traffic that was already hashed by the first tier, leading to sub-optimal bandwidth utilization and hot spots on certain links. The issue is resolved by adjusting the hash factors or the hash seed on devices at different network layers so that each stage produces distinct hashing results (see the sketch after Table 11 below).

Table 11: ARS and hash seed configuration (Spine1)

**Enable ARS**

```
ars profile
```

**Configure the instances**

```
ars instance to-leaf1
idle-time 10
!
ars instance to-leaf2
idle-time 10
!
ars instance to-leaf3
idle-time 10
!
ars instance to-leaf4
idle-time 10
!
```

**Bind the next-hop groups**

```
ars nexthop group 10.1.0.111/32 instance to-leaf1
ars nexthop group 10.1.0.112/32 instance to-leaf2
ars nexthop group 10.1.0.113/32 instance to-leaf3
ars nexthop group 10.1.0.114/32 instance to-leaf4
```

**Configure the hash seed**

```
hash seed 1234
```

Verify the ARS configuration using the `show ars instance` command.
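The sketch referenced above reproduces polarization in miniature with plain Python hashing (the ASIC's real hash function differs): a second tier that reuses the first tier's hash sees only pre-filtered flows and puts them all on one link, while a distinct seed restores an even split.

```python
# Toy illustration of hash polarization across two ECMP tiers.
import collections

def pick_link(flow: int, seed: int = 0) -> int:
    """Choose one of two equal-cost links by hashing the flow id."""
    return hash((flow, seed)) & 1

flows = range(10_000)
# Tier 1 (leaf) hashes every flow; consider only flows sent out link 0.
via_link0 = [f for f in flows if pick_link(f) == 0]

# Tier 2 (spine) hashing the same fields with the same function and seed:
same_seed = collections.Counter(pick_link(f) for f in via_link0)
# Tier 2 with a distinct hash seed:
new_seed = collections.Counter(pick_link(f, seed=1234) for f in via_link0)

print("spine, same seed:", dict(same_seed))  # every flow lands on link 0
print("spine, new seed: ", dict(new_seed))   # roughly even split
```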
## Maintenance

### RoCE Parameter Adjustment and Optimization

When the default configuration is insufficient, use the following commands to optimize performance.

#### Modify DSCP Mapping

Table 12: Modifying the DSCP mapping

| Step | Command |
|------|---------|
| Check the running config for the DSCP map name | `show running-config` |
| Enter global configuration mode | `configure terminal` |
| Enter the DSCP map configuration view | `diffserv-map type ip-dscp <roce-lossless-diffserv-map-name>` |
| Map a specific DSCP to a CoS value | `ip-dscp <dscp-value> cos <cos-value>` |
| Map all DSCP values to a default CoS | `default cos <cos-value>` |
| Use the system default DSCP mapping | `default copy` |

> **Note**: The CoS value represents the queue ID the packet is mapped to.

#### Modify the Queue Scheduling Policy

If the interface has been bound to a lossless RoCE policy, unbind it before modifying.

Table 13: Modifying the queue scheduling policy

| Step | Command |
|------|---------|
| Check the running config for the policy name | `show running-config` |
| Enter global configuration mode | `configure terminal` |
| Enter the lossless RoCE policy view | `policy-map <roce-lossless-policy-name>` |
| Configure SP-mode scheduling | `queue-scheduler priority queue <queue-id>` |
| Configure DWRR-mode scheduling | `queue-scheduler queue-limit percent <queue-weight> queue <queue-id>` |

#### Adjust PFC and ECN Thresholds

ECN thresholds are adjusted via `min-th`, `max-th`, and `probability`:

- `min-th` sets the lower absolute threshold for ECN marking (bytes).
- `max-th` sets the upper absolute threshold for ECN marking (bytes).
- `probability` sets the maximum marking probability [1-100].

PFC thresholds are adjusted via the dynamic threshold coefficient `dynamic-th`:

$$\text{PFC threshold} = 2^{\text{dynamic-th}} \times \text{remaining available buffer}$$

Other parameters can remain unchanged during modification. Recommended values for the CX864E-N (see the worked example after Table 14):

- PFC `dynamic-th`: 1, 2, 3
- WRED min (bytes): 1,000,000 / 2,000,000 / 3,000,000
- WRED max (bytes): 8,000,000 / 10,000,000 / 12,000,000
- WRED probability (%): 10, 30, 50, 70, 90

> **Note**: Try ECN adjustment first, then PFC. Follow the principle WRED min < WRED max < PFC XON < PFC XOFF. This ensures ECN triggers rate adjustment early during congestion to avoid unnecessary PFC, while still allowing PFC to trigger promptly when necessary to prevent packet loss.

Table 14: Adjusting PFC and ECN thresholds

| Operation | Command |
|-----------|---------|
| Get the WRED and buffer template names | `show running-config` |
| Enter global configuration mode | `configure terminal` |
| Enter the ECN configuration view | `wred <roce-lossless-ecn-name>` |
| Adjust ECN thresholds | `mode ecn gmin <min-th> gmax <max-th> gprobability <probability>` |
| Enter the PFC configuration view | `buffer profile <roce-lossless-profile-name>` |
| Adjust PFC thresholds | `mode lossless dynamic <dynamic-th> size <size> xoff <xoff> xon-offset <xon-offset>` |
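The worked example referenced above applies the dynamic PFC threshold formula and checks the recommended threshold ordering. The remaining-buffer figure is assumed for illustration; only the formula, the recommended `dynamic-th` values, and the ordering principle come from the text.

```python
# Worked example of: PFC threshold = 2^dynamic_th * remaining buffer.

def pfc_xoff_threshold(dynamic_th: int, remaining_buffer_bytes: int) -> int:
    return (2 ** dynamic_th) * remaining_buffer_bytes

remaining = 8_000_000                     # assumed free shared buffer (bytes)
for dyn_th in (1, 2, 3):                  # recommended CX864E-N values
    print(f"dynamic-th={dyn_th}: PFC XOFF at "
          f"{pfc_xoff_threshold(dyn_th, remaining):,} bytes")

# Sanity-check the recommended ordering so ECN reacts before PFC:
wred_min, wred_max = 2_000_000, 10_000_000   # mid-range recommended values
assert wred_min < wred_max < pfc_xoff_threshold(1, remaining)
```

With these assumed numbers, ECN marking begins at 2 MB, saturates at 10 MB, and PFC XOFF would fire only at 16 MB, matching the "ECN first, PFC as backstop" principle.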
### Common O&M Commands

#### Interface Status Maintenance

Table 15: Interface status information

| Operation | Command |
|-----------|---------|
| View interface status | `show interface summary` |
| View Layer 3 interface IP configuration and status | `show ip interfaces` |
| View VLAN configuration | `show vlan summary` |
| View interface counter statistics | `show counters interface` |

#### Common Table Entry Maintenance

Table 16: Common table entries

| Operation | Command |
|-----------|---------|
| View LLDP neighbor information | `show lldp neighbor { summary \| interface <interface-name> }` |
| View the local MAC address table | `show mac address` |
| View the local ARP table | `show arp` |
| View BGP neighbor status | `show bgp summary` |
| View the local routing table | `show ip route` |

#### RoCE Statistics Maintenance

Table 17: RoCE statistics

| Operation | Command |
|-----------|---------|
| View RoCE configuration | `show qos roce [ all \| summary \| <roce-profile-name> ]` |
| View interface and policy binding | `show interface policy-map` |
| View RoCE-related queue statistics | `show counters qos roce interface ethernet <interface-name> queue <queue-id>` |
| Clear RoCE statistics on all interfaces | `clear counters qos roce` |
| View PFC counters | `show counters priority-flow-control` |
| Clear PFC counters | `clear counters priority-flow-control` |
| View ECN counters | `show counters ecn` |
| Clear ECN counters | `clear counters ecn` |

#### ARS Configuration Maintenance

Table 18: ARS configuration and status

| Operation | Command |
|-----------|---------|
| View ARS profile configuration | `show ars profile` |
| View ARS instance configuration and bindings | `show ars instance` |

## Appendix: Configuration Files (Sample)

**Leaf1**

```
!
hostname leaf1
!
interface loopback 0
ip address 10.1.0.111/32
!
# to server
!
interface range ethernet 0/0-0/248
breakout 2x400g[200g]
!
# to spine
!
interface range ethernet 0/256-0/504
ipv6 use link-local
!
# vlan
!
interface vlan 101
ip address 10.10.1.1/25
exit
!
interface range ethernet 0/0-0/252
switchport access vlan 101
!
# bgp
!
router bgp 65111
bgp router-id 10.1.0.111
no bgp ebgp-requires-policy
bgp bestpath as-path multipath-relax
bgp max-med on-startup 120
bgp graceful-restart
neighbor peer-unnumber-bgp peer-group
neighbor peer-unnumber-bgp remote-as external
neighbor range ethernet 0/256-0/504 interface peer-group peer-unnumber-bgp
!
address-family ipv4 unicast
redistribute connected
exit-address-family
exit
!
# easy roce
!
qos roce lossless cable-length 5m incast-level low traffic-model throughput
qos service-policy roce-lossless-5m-low-throughput
!
# ars
!
ars profile
!
ars instance to-spine
idle-time 10
!
ars nexthop group 10.1.0.112/32 instance to-spine
!
```

**Leaf2**

```
!
hostname leaf2
!
interface loopback 0
ip address 10.1.0.112/32
!
# to server
!
interface range ethernet 0/0-0/248
breakout 2x400g[200g]
!
# to spine
!
interface range ethernet 0/256-0/504
ipv6 use link-local
!
# vlan
!
interface vlan 102
ip address 10.10.1.129/25
exit
!
interface range ethernet 0/0-0/252
switchport access vlan 102
!
# bgp
!
router bgp 65112
bgp router-id 10.1.0.112
no bgp ebgp-requires-policy
bgp bestpath as-path multipath-relax
bgp max-med on-startup 120
bgp graceful-restart
neighbor peer-unnumber-bgp peer-group
neighbor peer-unnumber-bgp remote-as external
neighbor range ethernet 0/256-0/504 interface peer-group peer-unnumber-bgp
!
address-family ipv4 unicast
redistribute connected
exit-address-family
exit
!
# easy roce
!
qos roce lossless cable-length 5m incast-level low traffic-model throughput
qos service-policy roce-lossless-5m-low-throughput
!
# ars
!
ars profile
!
ars instance to-spine
idle-time 10
!
ars nexthop group 10.1.0.111/32 instance to-spine
!
```

**Leaf3**

```
!
hostname leaf3
!
interface loopback 0
ip address 10.1.0.113/32
!
# to server
!
interface range ethernet 0/0-0/248
breakout 2x400g[200g]
!
# to spine
!
interface range ethernet 0/256-0/504
ipv6 use link-local
!
# vlan
!
interface vlan 103
ip address 10.10.2.1/25
exit
!
interface range ethernet 0/0-0/252
switchport access vlan 103
!
# bgp
!
router bgp 65113
bgp router-id 10.1.0.113
no bgp ebgp-requires-policy
bgp bestpath as-path multipath-relax
bgp max-med on-startup 120
bgp graceful-restart
neighbor peer-unnumber-bgp peer-group
neighbor peer-unnumber-bgp remote-as external
neighbor range ethernet 0/256-0/504 interface peer-group peer-unnumber-bgp
!
address-family ipv4 unicast
redistribute connected
exit-address-family
exit
!
# easy roce
!
qos roce lossless cable-length 5m incast-level low traffic-model throughput
qos service-policy roce-lossless-5m-low-throughput
!
# ars
!
ars profile
!
ars instance to-spine
idle-time 10
!
ars nexthop group 10.1.0.114/32 instance to-spine
!
```
**Leaf4**

```
!
hostname leaf4
!
interface loopback 0
ip address 10.1.0.114/32
!
# to server
!
interface range ethernet 0/0-0/248
breakout 2x400g[200g]
!
# to spine
!
interface range ethernet 0/256-0/504
ipv6 use link-local
!
# vlan
!
interface vlan 104
ip address 10.10.2.129/25
exit
!
interface range ethernet 0/0-0/252
switchport access vlan 104
!
# bgp
!
router bgp 65114
bgp router-id 10.1.0.114
no bgp ebgp-requires-policy
bgp bestpath as-path multipath-relax
bgp max-med on-startup 120
bgp graceful-restart
neighbor peer-unnumber-bgp peer-group
neighbor peer-unnumber-bgp remote-as external
neighbor range ethernet 0/256-0/504 interface peer-group peer-unnumber-bgp
!
address-family ipv4 unicast
redistribute connected
exit-address-family
exit
!
# easy roce
!
qos roce lossless cable-length 5m incast-level low traffic-model throughput
qos service-policy roce-lossless-5m-low-throughput
!
# ars
!
ars profile
!
ars instance to-spine
idle-time 10
!
ars nexthop group 10.1.0.113/32 instance to-spine
!
```

**Spine1**

```
!
hostname spine1
!
interface loopback 0
ip address 10.1.0.115/32
!
# to leaf
!
interface range ethernet 0/0-0/504
ipv6 use link-local
!
# bgp
!
router bgp 65115
bgp router-id 10.1.0.115
no bgp ebgp-requires-policy
bgp bestpath as-path multipath-relax
bgp max-med on-startup 120
bgp graceful-restart
neighbor peer-unnumber-bgp peer-group
neighbor peer-unnumber-bgp remote-as external
neighbor range ethernet 0/0-0/504 interface peer-group peer-unnumber-bgp
!
# easy roce
!
qos roce lossless cable-length 5m incast-level low traffic-model throughput
qos service-policy roce-lossless-5m-low-throughput
!
# ars
!
ars profile
!
ars instance to-leaf1
idle-time 10
!
ars instance to-leaf2
idle-time 10
!
ars instance to-leaf3
idle-time 10
!
ars instance to-leaf4
idle-time 10
!
ars nexthop group 10.1.0.111/32 instance to-leaf1
!
ars nexthop group 10.1.0.112/32 instance to-leaf2
!
ars nexthop group 10.1.0.113/32 instance to-leaf3
!
ars nexthop group 10.1.0.114/32 instance to-leaf4
!
# hash
hash seed 1234
```

**Spine2**

```
!
hostname spine2
!
interface loopback 0
ip address 10.1.0.116/32
!
# to leaf
!
interface range ethernet 0/0-0/504
ipv6 use link-local
!
# bgp
!
router bgp 65116
bgp router-id 10.1.0.116
no bgp ebgp-requires-policy
bgp bestpath as-path multipath-relax
bgp max-med on-startup 120
bgp graceful-restart
neighbor peer-unnumber-bgp peer-group
neighbor peer-unnumber-bgp remote-as external
neighbor range ethernet 0/0-0/504 interface peer-group peer-unnumber-bgp
!
# easy roce
!
qos roce lossless cable-length 5m incast-level low traffic-model throughput
qos service-policy roce-lossless-5m-low-throughput
!
# ars
!
ars profile
!
ars instance to-leaf1
idle-time 10
!
ars instance to-leaf2
idle-time 10
!
ars instance to-leaf3
idle-time 10
!
ars instance to-leaf4
idle-time 10
!
ars nexthop group 10.1.0.111/32 instance to-leaf1
!
ars nexthop group 10.1.0.112/32 instance to-leaf2
!
ars nexthop group 10.1.0.113/32 instance to-leaf3
!
ars nexthop group 10.1.0.114/32 instance to-leaf4
!
# hash
hash seed 1234
```
