[RARE-users] about the dpdk forwarder....


  • From: mc36 <>
  • To: Carmen Misa Moreira <>, "" <>
  • Subject: [RARE-users] about the dpdk forwarder....
  • Date: Wed, 24 Aug 2022 19:01:55 +0200

hi,

well, i should have written this down before, but as you'll see, i'm also
unsure about the outcome of most of these...
so, to tune dpdk for a given hardware, first one has to read the box
manual to see the wiring topology...
cpus nowadays "contain" the memory controllers and the pcie lanes, that
is, if you have a multi-socket box, then you first have to plan for it:
which cpu socket will handle which nic, and then you'll have to populate
enough memory slots of that cpu so that the packet buffers plus the nic's
shared rx-tx rings stay local to that cpu... on a single-socket system you
only have to deal with the thread mapping... (which you also have to do on
a multi-socket system, btw)
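
to illustrate the locality point, here is a minimal sketch (not from the
freertr tree, just plain stock dpdk calls) that prints which numa node
each port and each lcore sits on, so mismatches are easy to spot by eye:

/* minimal sketch, not freeRtr code: print the numa node of every dpdk
 * port and of every enabled lcore so remote port/lcore pairs stand out */
#include <stdio.h>
#include <rte_eal.h>
#include <rte_ethdev.h>
#include <rte_lcore.h>

int main(int argc, char **argv) {
    if (rte_eal_init(argc, argv) < 0)
        return 1;
    uint16_t port;
    RTE_ETH_FOREACH_DEV(port)
        printf("port %u is on numa node %d\n",
               (unsigned)port, rte_eth_dev_socket_id(port));
    unsigned lcore;
    RTE_LCORE_FOREACH(lcore)
        printf("lcore %u is on numa node %u\n",
               lcore, rte_lcore_to_socket_id(lcore));
    return 0;
}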

after the re-assembly of the box, things are much simpler:

https://github.com/rare-freertr/freeRtr/blob/master/misc/native/p4emu_dpdk.c#L323

i know that these parameters seem cryptic, but for a basic cpe you'll only
have to deal with the "[port rxcore txcore]" part... here, if you want the
dpdk nic #1 to receive on thread #2 and transmit on thread #3, then the
parameter will be "1 2 3", and you can simply continue it with 2 4 5 3 6 7
and so on... a thread can handle any number of rx or tx or both ports, but
on a 2x10g+8x1g setup i found it more performant to dedicate a thread to a
single 10g port and leave its pair (sibling) thread alone... there are
other cpus and other workloads, so it's something one has to consider...
one more thing to consider: the pair threads seem to be useful for running
the control plane, but on a heavily loaded system, pinning the control
plane to a dedicated set of threads obviously helps, as it won't pollute
either the i- or the d-side, nor the l1/l2/l3 caches, and you also won't
have to task switch between the processes...
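
to make the triples a bit more concrete, here is the example above written
out as a small c table; this is only an illustration of the mapping
described here, not the actual parser in p4emu_dpdk.c:

/* illustration only: the "[port rxcore txcore]" triples from the example
 * above ("1 2 3 2 4 5 3 6 7"), written out as a table */
#include <stdio.h>

struct portmap {
    int port;    /* dpdk port number */
    int rxcore;  /* thread that receives on this port */
    int txcore;  /* thread that transmits on this port */
};

static const struct portmap maps[] = {
    { 1, 2, 3 },  /* nic #1: rx on thread #2, tx on thread #3 */
    { 2, 4, 5 },  /* nic #2: rx on thread #4, tx on thread #5 */
    { 3, 6, 7 },  /* nic #3: rx on thread #6, tx on thread #7 */
};

int main(void) {
    for (unsigned i = 0; i < sizeof(maps) / sizeof(maps[0]); i++)
        printf("port %d: rx core %d, tx core %d\n",
               maps[i].port, maps[i].rxcore, maps[i].txcore);
    return 0;
}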

if this mixed mode is not enough, then the dpdk forwarder has another
mode, where all of the above stays the same, but here you dedicate some
threads to forwarding exclusively... for the "[port rxcore txcore]" part,
please please do it in a pair: you're here because you want the speed, so
don't let anything else steal the two threads from the core! also consider
that in this mode the ports are not doing the packet forwarding
themselves; instead, after a really quick hashing, they pass the packets
to the dedicated forwarder cores:
https://github.com/rare-freertr/freeRtr/blob/master/misc/native/p4emu_fwd.h#L40
this is just a pre-hash, while the ecmp hash still considers everything;
this one is only used to distribute the flows between the forwarder
threads...
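
as a rough illustration of that idea (not the code behind the link above,
just a sketch with made-up names): fold a few addressing fields of the
packet into one value and take it modulo the number of forwarder threads,
so one flow always lands on the same thread while different flows spread
out:

/* sketch of the idea only, not the actual pre-hash in p4emu_fwd.h:
 * a very cheap hash over a few flow fields picks a forwarder thread */
#include <stdint.h>

static unsigned pick_forwarder(uint32_t src_addr, uint32_t dst_addr,
                               uint16_t src_port, uint16_t dst_port,
                               unsigned forwarder_threads) {
    uint32_t h = src_addr ^ dst_addr ^ ((uint32_t)src_port << 16) ^ dst_port;
    h ^= h >> 16;                      /* mix the halves a bit */
    return h % forwarder_threads;
}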

the rest of the parameters are mostly useless imho, but if you scroll a
bit down in the first link you'll find the variable names... these are
mostly the dpdk sample names with their defaults, except for our very own
one: burst_sleep...

both of the above modes consider this parameter, which is passed to
https://man7.org/linux/man-pages/man2/nanosleep.2.html
a negative value skips this part completely, and if you want maximum
performance and minimum jitter, that's obviously what you need to do...
but when it's enabled, i also try some tricks, that is, i won't nanosleep
if there were packets within a short while, but still...
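
roughly, the logic looks like this (a sketch of what is described here,
not the forwarder's actual loop; poll_ports and the idle threshold are
made up, and the value is assumed to be nanoseconds):

/* sketch of the idea only: poll, and only nanosleep once the ports have
 * been idle for a while; a negative burst_sleep disables sleeping */
#include <time.h>

extern int poll_ports(void);           /* hypothetical: returns rx'd packet count */

static void poll_loop(long burst_sleep) {
    int idle_rounds = 0;
    for (;;) {
        if (poll_ports() > 0) {
            idle_rounds = 0;           /* packets seen: keep busy-polling */
            continue;
        }
        if (burst_sleep < 0)
            continue;                  /* sleeping disabled: pure busy poll */
        if (++idle_rounds < 1000)
            continue;                  /* recent traffic: skip the sleep (threshold is arbitrary) */
        struct timespec ts = { 0, burst_sleep };   /* value taken as nanoseconds */
        nanosleep(&ts, NULL);
    }
}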

the reason why i did not set the default to -1 is that when the lcores
don't nanosleep, they run at 100%...
if you allow them to sleep something around 1000, then you get terrible,
horrible jitter but 5% cpu at 1gbps...
so instead i settled on the current default value, which now does roughly
25% when idling...
and there is one more thing: if i run the dpdk in a vm, the hypervisor
always sees 100% thread usage whenever the value is below 1000...

long story short, autotuning for dpdk seems np-hard to me, but as usual, i
welcome patches... :)

thanks,
cs


