Re: [gn4-3-wp6-t1-wb-RARE] new features --- experimenting..... :)


  • From: mc36 <>
  • To: "" <>, "" <>
  • Subject: Re: [gn4-3-wp6-t1-wb-RARE] new features --- experimenting..... :)
  • Date: Mon, 1 Feb 2021 17:45:29 +0100

hi,
just a quick follow-up: i left the iperf running at 80mbps, so i just
crossed 1tb received and 100gb transmitted to cpu (*) on all of my
boxes that use rare/p4emu or, mostly, rare/dpdk, while the replicator,
the core, is the only rare/tofino guy... just a fun fact, the lowest
budget node is mediapc, a 50cm3 fanless 3yo brix gb-bace-3000, which
used 134 minutes of its celeron n3000 time to achieve this with the
libpcap packet engine...
another interesting fact is that the bfd didn't go down during the
experiment....
regards,
cs

*: i figured out that i'm punting unknown mcast up to the cpu to be
able to register it in pim at the rp...
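
for reference, a back-of-the-envelope check of those numbers (a minimal
python sketch, assuming the counters below are bytes, that the stream ran
at a steady 80mbps for the roughly 5 hours since the previous mail, and
that all 7 replicas from the mroute land on the same physical port):

# rough check of the counters below (a sketch only; assumes byte counters,
# a steady 80 mbit/s stream for ~5 hours, and 7 replicas on the same port)
rate_bps = 80e6      # offered iperf rate, bits per second
replicas = 7         # targets configured in the core mroute (previous mail)
hours = 5            # roughly the time since the previous mail

rx_bytes = rate_bps / 8 * replicas * hours * 3600
print(f"expected rx on the physical port: {rx_bytes / 1e12:.2f} tb")  # ~1.26 tb
# in the same ballpark as the ~1.3 tb rx seen on mediapc's sdn1 below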



core#show interfaces hwsummary
interface    state  tx            rx            drop
bundle1      up     0             0             0
bundle1.158  up     155835306788  156107118311  0
bundle1.159  up     155576574044  53981103      0
bundle1.160  up     155635661917  201388859     0
bundle1.161  up     155821629908  812193305     0
bundle1.162  up     155582320733  233152722     0
bundle1.163  up     155787462368  4785391       0
bundle1.164  up     155993712296  5144032       0
sdn12        up     0             12359538      0
sdn2         up     0             14305827      0
sdn4         up     0             60410642      0
sdn6         up     0             20155345      0

core#


noti#show interfaces hwsummary
interface     state  tx            rx             drop
hairpin11     up     0             0              0
hairpin12     up     0             0              0
hairpin12.14  up     0             0              0
hairpin21     up     0             0              0
hairpin22     up     0             0              0
hairpin22.21  up     0             0              0
sdn1          up     150631008334  1046915410847  1035152917815
sdn1.158      up     149015163723  149786876586   0
sdn1.170      up     5369464       8428954        0
sdn1.171      up     4879289       4222127        0
sdn1.174      up     5020357       3929331        0
sdn1.175      up     4701933       30120704       0
sdn1.176      up     4530530       149756678      0
sdn2          up     1293930       0              0
sdn901        up     40055060      32133180       0
sdn999        up     466878187     150236827066   0

noti#

mediapc#show interfaces hwsummary
interface  state  tx         rx             drop
sdn1       up     116277173  1310611660188  1134839909610
sdn1.159   up     60882546   163580699471   0
sdn1.174   up     6768420    7748905        0
sdn1.178   up     8238565    8905750        0
sdn1.180   up     6771725    7684060        0
sdn1.183   up     1306702    0              0
sdn1.184   up     1804653    5939314        0
sdn1.196   up     31444234   41451808       0
sdn901     up     4691295    4732516        0
sdn902     up     6386666    6423518        0
sdn903     up     4876672    4911883        0
sdn999     up     1386986    1641251        0

mediapc#show interfaces summary
interface  state  tx         rx            drop
loopback0  up     7003451    0             0
loopback8  up     0          0             0
template1  admin  272        0             1222391
template2  admin  116        0             1222391
ethernet0  up     427646474  163209040000  0
sdn1       up     375985217  162941476737  215
sdn1.159   up     58820249   162232874728  831
sdn1.174   up     20430286   24434592      831
sdn1.178   up     21660044   24974687      831
sdn1.180   up     20492845   24423652      831
sdn1.183   up     4050178    0             695
sdn1.184   up     114585     15846930      0
sdn1.196   up     240151502  175228672     831
sdn901     up     14970101   15040781      665
sdn902     up     14940175   15034027      665
sdn903     up     14939488   15019851      665
sdn999     up     608726     155860        278

mediapc#
mediapc#show ipv4 bfd inet neighbor
interface  address     state  uptime    clients
sdn1.159   10.1.1.230  up     04:51:17  lsrp
sdn1.174   10.1.1.177  up     1d5h      lsrp
sdn1.178   10.1.1.157  up     07:53:15  lsrp
sdn1.180   10.1.1.150  up     1d5h      lsrp
sdn1.196   10.1.1.73   up     05:00:28  lsrp
sdn901     10.8.2.18   up     1d5h      pvrp
sdn902     10.8.2.22   up     1d5h      pvrp
sdn903     10.8.2.14   up     1d5h      pvrp

mediapc#



mc36@mediapc:~$ neofetch | grep Host
Host: GB-BACE-3000 1.x
mc36@mediapc:~$ ps aux | grep p4emu
root 804651 29.2 0.1 624316 4744 ? Sl 09:51 134:15
/rtr/p4emu.bin 127.0.0.1 9080 0 veth0b veth1b veth2b veth3b veth4b enp2s0
mc36 982318 0.0 0.0 9704 720 pts/0 S+ 17:31 0:00 grep p4emu
mc36@mediapc:~$


On 2/1/21 12:49 PM, mc36 wrote:
hi,

unbelievable, but this code works in real life!!!! :)

i loaded the new code to my home stordis and reverted a small portion
of my network from bier (which now goes via the cpulabel feature) to
plain old pim (1).... the core box immediately went into distribution
mode (2) and programmed it to the hw, and my notebook was also notified
(3) via pim about the needed stream... then i started a stream from my
notebook (4) and observed the counters (5) on my notebook: i found the
130mbps of traffic egressing the sdn1.158 subinterface, obviously
counted on the sdn1 main interface as well... the receive part is more
interesting: i have 940mbps rx on sdn1, which is mostly dropped,
and 130mbps rx on sdn1.158... now compare that to the mroute (2) from
the core and see that 130 * 7 (this many replicas are configured) ≈ 940!!!!
that is, the core box replicates my multicast stream in hardware!!!!!

the same can be observed on the core in the hwsummary (6), where i had
9gb rx on bundle1.158 and 9gb tx on every subinterface, so it really
replicated! also the hwsummary from my notebook (7) indicates the 63gb
drop on sdn1, so it really arrived here.... finally, the switchport (8)
of the stordis also has a huge amount of mcast rxed....
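
a quick numeric check of the above (a python sketch; the 130mbps,
940mbps, 9gb and 63gb figures are the ones read off blocks (5), (6) and
(7) below):

# per-stream rate times replica count should give the rx seen on the trunk port
per_stream_mbps = 130     # tx/rx rate on sdn1.158, from (5)
replicas = 7              # targets in the core mroute, from (2)
print(per_stream_mbps * replicas)   # ~910 mbit/s, i.e. roughly the 940 mbit/s rx on sdn1

# the same ratio shows up in the byte counters: ~9 gb tx per core subinterface
# in (6) and ~63 gb dropped on the notebook's sdn1 in (7): one copy kept, six dropped
print(9 * replicas)                 # 63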

regards,
cs



1:
core(cfg-if)#show config-differences
interface template1
 no ipv4 pim join-source loopback0
 no ipv4 pim bier-tunnel 28
 ipv4 multicast static-group 232.6.6.6 10.8.255.1
 exit

core(cfg-if)#

noti(cfg)#show config-differences
interface template1
 no ipv4 pim join-source loopback0
 no ipv4 pim bier-tunnel 110
 exit

noti(cfg)#



2:
core(cfg-if)#show ipv4 mroute inet
source      group      interface    upstream    targets
10.8.255.1  232.6.6.6  bundle1.158  10.1.1.233  loopback0 template1
bundle1.158 bundle1.159 bundle1.160 bundle1.161 bundle1.162 bundle1.163
bundle1.164

core(cfg-if)#


3:
noti(cfg)#show ipv4 mroute inet
source      group      interface  upstream    targets
10.8.255.1  232.6.6.6  sdn999     10.8.255.1  sdn1.158

noti(cfg)#



4:
mc36@noti:~$ iperf -u -c 232.6.6.6 -T 128 -b 100M -i1 -t9999



5:
noti(cfg)#show interfaces hwtraffic
interface     state  tx        rx        drop
hairpin11     up     0         0         0
hairpin12     up     0         0         0
hairpin12.14  up     0         0         0
hairpin21     up     0         0         0
hairpin22     up     0         0         0
hairpin22.21  up     0         0         0
sdn1          up     13532235  94683819  93660400
sdn1.158      up     13386149  13529135  0
sdn1.170      up     709       883       0
sdn1.171      up     577       283       0
sdn1.174      up     826       456       0
sdn1.175      up     617       251       0
sdn1.176      up     689       719       0
sdn2          up     0         0         0
sdn901        up     2750      1429      0
sdn999        up     2176      13489976  0

noti(cfg)#


6:
core(cfg-if)#show interfaces hwsummary
interface    state  tx          rx          drop
bundle1      up     0           0           0
bundle1.158  up     8974768051  8934623100  0
bundle1.159  up     8956997974  3413812     0
bundle1.160  up     8960428143  9674501     0
bundle1.161  up     8973865146  54862879    0
bundle1.162  up     8957181567  15587653    0
bundle1.163  up     8965954026  464373      0
bundle1.164  up     8983713126  569344      0
sdn12        up     0           1114288     0
sdn2         up     0           1299535     0
sdn4         up     0           900564      0
sdn6         up     0           1812658     0

core(cfg-if)#


7:
noti(cfg)#show interfaces hwsummary
interface     state  tx           rx           drop
hairpin11     up     0            0            0
hairpin12     up     0            0            0
hairpin12.14  up     0            0            0
hairpin21     up     0            0            0
hairpin22     up     0            0            0
hairpin22.21  up     0            0            0
sdn1          up     13057392511  64385390230  63198446748
sdn1.158      up     9044587565   9162675875   0
sdn1.170      up     52653820     18237843     0
sdn1.171      up     3032062      2666448      0
sdn1.174      up     3086540      2408920      0
sdn1.175      up     3821865069   188910177    0
sdn1.176      up     10578348     243701499    0
sdn2          up     790114       0            0
sdn901        up     25067677     19427454     2502
sdn999        up     412014029    13041826360  92312752

noti(cfg)#




8:
labor#show interface counters port-channel 7
Port:  Po7
Tx Collisions:          0
Tx Ucast:               88,449,261
Tx Mcast:               15,126,523
Tx Bcast:               77,154
Tx Jumbo:               8,353,874
Tx Pkts:                103,652,938
Tx Bytes:               45,290,174,152
Tx Errors:              0
Tx Discards:            0
Rx Ucast:               88,649,255
Rx Mcast:               58,437,586
Rx Bcast:               6,994
Rx Jumbo:               57,402,232
Rx Alignment:           0
Rx UnderSize:           0
Rx 64Pkts:              548,118
Rx 65-127Pkts:          54,235,928
Rx 128-255Pkts:         11,825,131
Rx 256-511Pkts:         4,180,215
Rx 512-1023Pkts:        1,394,053
Rx 1024to1518Pkts:      17,508,165
Rx Pkts:                147,093,835
Rx Bytes:               119,421,896,809
Rx Errors:              0
Rx Discards:            0

labor#





On 2/1/21 10:57 AM, mc36 wrote:
hi,
i just finished (1) the huge refactoring described in my previous mail!
please find attached the fresh test runs with all the dataplanes; the
p4 based ones now use both the ingress and egress pipes, which was a
prerequisite for continuing to add nexthop based multicast features to
them... so for now, i'll continue by removing the 'not applicable'
marks from the p4 results of the last few test runs....
regards,
cs


https://bitbucket.software.geant.org/projects/RARE/repos/rare/commits/05e54cb91a2f5b3ede7f0e0459109b81c689fe47

On 1/30/21 8:07 AM, mc36 wrote:
hi,

please find attached the latest test runs with bmv2...
no new features, but the bmv2 p4 code got that huge [1]
refactoring that i described in the previous mail.
that is, the decapsulation & routing decision happens
in ingress and the encapsulation happens in egress.

now i'll proceed with the tofino code too; it'll be a
bit more tricky, as that one has two completely separate
stages, each with its own parser, match-actions and deparser,
whereas bmv2 shares all the metadata and headers
between the stages...

and after that, when everything passes again on tofino too,
i'll start adding to both p4 codes the newest multicast feature
that triggered this whole rework: the lsm edge&core...
(the dpdk code already has it all, so at least i can see it working:)

and when i'm done with the lsm, i'll proceed to bier (not
the beer but [2] :), which could be a game-changer for
those who use multicast, because it fully eliminates tree
building in the core elements, and as far as
i know, we'll be the first ones to have it in hw...
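
to illustrate why it eliminates the tree state (a toy python sketch,
not the rare code; the neighbor names and bitmasks are made up):

# toy bier forwarding at one core node: each egress router owns one bit
# position, the packet carries a bitstring of wanted egresses, and the node
# only keeps a per-neighbor forwarding bitmask (f-bm), as in rfc 8279;
# no per-group tree state is needed
fbm = {
    "nbrA": 0b0000111,   # egresses 1..3 reached via nbrA
    "nbrB": 0b0111000,   # egresses 4..6 via nbrB
    "nbrC": 0b1000000,   # egress 7 via nbrC
}

def forward(bitstring):
    remaining = bitstring
    for nbr, mask in fbm.items():
        if remaining & mask:
            # one copy towards nbr, carrying only the bits it serves
            print(f"copy to {nbr} with bitstring {remaining & mask:07b}")
            remaining &= ~mask   # those egresses are covered now

forward(0b1010011)   # egresses 1, 2, 5 and 7 want this packet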

regards,
cs

1:
https://github.com/frederic-loui/RARE/commit/dfb3ff2f3d52dc58f6c38d2b2ae12ed74cc10302

2: https://tools.ietf.org/html/rfc8279

On 1/28/21 9:02 PM, mc36 wrote:
hi,
please find attached the fresh runs with dpdk.
the news is that the label switched multicast features have arrived.
that basically means mldp p2mp, mldp mp2mp, rsvp-te p2mp,
pim/igmp-mldp interworking, mldp based mvpn and friends...

regarding the bmv2 and tofino dataplanes, there was a long
conversation about it on the intel community forum (1), and finally
we concluded that we should move away from the current
ingress-pipe-only model and do the encapsulation
exclusively in the egress pipeline...
it'll free up some space in the ingress for more
simultaneous features (or bigger lookup tables)
and will provide the flexibility needed for
the lsm. the tricky part here is that for pure
multicast, there is no nexthop involved in the
flooding, and as a quick hack, i replicated the
vlan-out table to the egress... but in the case of lsm,
we'll also need the nexthop rewrite info, because lsm
is basically unicast on the link, and i mostly
want to get it done cleanly too, so nothing is left
but to move everything nexthop-related to the egress...
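
as a rough model of that split (a python toy, not the actual p4 code;
the tables and names below are made up): ingress only decides which
ports get a copy, egress builds each copy's headers, and for lsm that
also means the per-nexthop rewrite:

# toy model of the ingress/egress split: ingress decides where copies go,
# egress decides what headers each copy gets (illustrative values only)
vlan_out = {1: 158, 2: 159, 3: 160}        # egress port -> vlan tag (stand-in for the vlan-out table)
nexthop_mac = {1: "00:11:22:33:44:01",     # egress port -> nexthop rewrite, needed for lsm
               2: "00:11:22:33:44:02",
               3: "00:11:22:33:44:03"}

def ingress(pkt):
    # decapsulation + routing decision only: pick the ports to replicate to
    return [1, 2, 3]

def egress(pkt, port, lsm):
    copy = dict(pkt, vlan=vlan_out[port])
    if lsm:
        # lsm is unicast on the link, so each copy also gets its nexthop mac
        copy["dst_mac"] = nexthop_mac[port]
    return copy

pkt = {"group": "232.6.6.6"}
copies = [egress(pkt, p, lsm=True) for p in ingress(pkt)]
print(copies)
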
regards,
cs

1:
https://community.intel.com/t5/Intel-Connectivity-Research/ingress-vs-egress-processing/m-p/1249943#M2025


