Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rebase on master 23-Jul-2024 #129

Draft
wants to merge 834 commits into
base: ipdk-latest
Choose a base branch
from
Draft

Conversation

ffoulkes
Copy link

@ffoulkes ffoulkes commented Jul 23, 2024

  1. Synced master branch from the mothership.

  2. Rebased ipdk-latest onto the master branch.

  3. OvS builds successfully in non-P4 mode and passes the OVS test suite.

  4. OvS builds successfully --with-ovsp4rt=stubs and passes the OVS test suite.

JM1 and others added 30 commits February 15, 2024 22:24
In a scenario where OVN does load balancing and then SNAT with a OVS
userspace datapath [0], the recirc_depth may be greater than 6. In
that case, ovs-vswitchd might drop packets and raise warnings:

  dpif_netdev|WARN|Packet dropped. Max recirculation depth exceeded.

Increasing MAX_RECIRC_DEPTH to 8 solves this issue.

[0] https://github.com/ovn-org/ovn/blob/dd5cd73e3df1bfb1a215cb45d1e2e03eff1d049a/tests/system-ovn-kmod.at#L740

Reported-at: https://issues.redhat.com/browse/FDP-251
Acked-by: Simon Horman <[email protected]>
Signed-off-by: Jakob Meng <[email protected]>
Signed-off-by: Ilya Maximets <[email protected]>
The cited commit removed direct call to RSTP module from a callback,
but we can still enter the module after going through a patch port
to a different bridge via ofproto_dpif_send_packet().

Partially revert the change going back to a recursive mutex.

Adding the same test for both RSTP and STP.  While STP unit tests
do catch the same problem for STP (if STP mutex changed to be
non-recursive), they are not actually using the same callback function
as ovs-vswitchd, so it makes sense to test the implementation in
ovs-vswitchd itself as well.

Fixes: 6b90bc5 ("lib/rstp: Remove lock recursion.")
Reported-at: https://mail.openvswitch.org/pipermail/ovs-discuss/2024-February/052925.html
Reported-by: Huangzhidong <[email protected]>
Acked-by: Simon Horman <[email protected]>
Signed-off-by: Ilya Maximets <[email protected]>
Previously if an OVS configuration nested multiple layers of UDP tunnels
like VXLAN or GENEVE on top of each other through netdev-linux
interfaces, the vnet header would be incorrectly set to the outermost
UDP tunnel layer instead of the intermediary tunnel layer.

This resulted in the middle UDP tunnel not checksum offloading properly.

Fixes: 85bcbbe ("userspace: Enable tunnel tests with TSO.")
Reported-by: David Marchand <[email protected]>
Signed-off-by: Mike Pattrick <[email protected]>
Signed-off-by: Ilya Maximets <[email protected]>
Previously a change was added to the vnet prepend code to solve for the
case where no L4 checksum offloading was needed but the L3 checksum
hadn't been calculated. But the added check didn't properly account
for IPv6 traffic.

Fixes: 85bcbbe ("userspace: Enable tunnel tests with TSO.")
Reported-by: David Marchand <[email protected]>
Signed-off-by: Mike Pattrick <[email protected]>
Signed-off-by: Ilya Maximets <[email protected]>
Previously some packets were excluded from the tunnel mark if they
weren't L4. However, this causes problems with multi encapsulated
packets like arp.

Due to these flags being set, additional checks are required in checksum
modification code.

Fixes: 084c808 ("userspace: Support VXLAN and GENEVE TSO.")
Reported-by: David Marchand <[email protected]>
Signed-off-by: Mike Pattrick <[email protected]>
Signed-off-by: Ilya Maximets <[email protected]>
Previously a gap existed in the tunnel system tests where only ICMP and
TCP traffic was tested. However, the code paths used for UDP traffic are
different then either of those and should also be tested.

Some of the modified tests had previously checked for TCP with ncat but
didn't include an appropriate check for ncat support. That check was
added to these tests.

Signed-off-by: Mike Pattrick <[email protected]>
Signed-off-by: Ilya Maximets <[email protected]>
Beside the date update, moving the mdb NEWS entry to a more
appropriate place - ovs-appctl section.

Acked-by: Simon Horman <[email protected]>
Signed-off-by: Ilya Maximets <[email protected]>
This changes add statistics for when a thread is put into stop state.
For example with the following:

kill -STOP $(pidof ovs-vswitchd); sleep 1; kill -CONT $(pidof ovs-vswitchd);

Acked-by: Simon Horman <[email protected]>
Signed-off-by: Eelco Chaudron <[email protected]>
The userspace conntrack only supported hash for port selection.
With the patch, both userspace and kernel datapath support the random
flag.

The default behavior remains the same, that is, if no flags are
specified, hash is selected.

Signed-off-by: Paolo Valerio <[email protected]>
Acked-by: Aaron Conole <[email protected]>
Signed-off-by: Simon Horman <[email protected]>
The patch, when 'persistent' flag is specified, makes the IP selection
in a range persistent across reboots.

Signed-off-by: Paolo Valerio <[email protected]>
Acked-by: Aaron Conole <[email protected]>
Signed-off-by: Simon Horman <[email protected]>
Currently, failures of pthread_* functions are printed to stderr
only and then OVS aborts.  These error messages are hard to find
and may be even just lost.

Use VLOG_ABORT() instead.  It will do the same thing, but will try
to log the error to the log file and syslog first, if configured.

Using VLOG_ABORT() instead of VLOG_FATAL() to preserve the abort()
logic and not just exit with a failure code, because it's likely
we want a core dump if one of these function failed.  For example,
we would like to have a stack trace in a core dump in case a mutex
lock failed with 'deadlock avoided'.

Acked-by: Simon Horman <[email protected]>
Acked-by: Eelco Chaudron <[email protected]>
Signed-off-by: Ilya Maximets <[email protected]>
It's not a system test as it runs with dummy datapath and ports
and it has nothing to do with layer 3 tunnels.

It should be with other userspace tunnel tests.

While moving also making it a little nicer visually and less error
prone by requesting port numbers for all the ports.

Acked-by: Mike Pattrick <[email protected]>
Acked-by: Eelco Chaudron <[email protected]>
Signed-off-by: Ilya Maximets <[email protected]>
To mimic what kernel routing subsystem does [1], add a local route
entry for every dummy IP address. This helps with OVN testing multiple
chassis on a single host and allows to run better unit tests for
userspace tunnels without adding route entries manually.  This is also
the only way to add 'local' route entries that are required for testing
'local_ip' functionality with native tunnels in userspace datapath
because route lookup will reject non-local source IPs.

There seems to be no way to explicitly remove an IP address from
netdev-dummy, hence no code path to handle route entry cleanup.
The port itself can be removed, but our tests do not normally do that.
Removal can be implemented later if necessary.

[1]: http://linux-ip.net/html/routing-tables.html#routing-table-local

"If the machine has several IP addresses on one Ethernet interface,
there will be a route to each locally hosted IP in the local routing
table. This is a normal side effect of bringing up an IP address on
an interface under linux."

Acked-by: Eelco Chaudron <[email protected]>
Signed-off-by: Ihar Hrachyshka <[email protected]>
Co-authored-by: Ilya Maximets <[email protected]>
Signed-off-by: Ilya Maximets <[email protected]>
Local IP is taken into account only in case of IPv4 address, IPv6
source is not checked.  That leads to source being ignored during the
route lookup and ultimately packets encapsulated with a source IP
found during a route lookup, which is likely the wrong one.

Even worse, after encapsulation we have a difference between the
tunnel metadata that contains a correct source IP and the generated
actions that used a wrong source IP.  This means that if there are
OpenFlow rules in a bridge where packet goes after encapsulation,
we may match on rules that do not correspond to the actual packet
we have.

Add the check for IPv6 source address before the route lookup.

Tests added to check that we're actually using the configured local_ip
as a source address in the packet.  Also adding the same test for IPv4,
since apparently we don't have any tests covering this functionality
for userspace tunnels.

This issue also affects the case where source address is set via
OpenFlow, e.g. 'set_filed:2001:beef::88->tun_ipv6_src', but it's just
a different way of populating the tunnel metadata that doesn't depend
on a tunnel to be native or kernel one.  So, not adding extra tests
for this case for now.

Fixes: 8e4e458 ("ofproto-dpif-xlate: makes OVS native tunneling honor tunnel-specified source addresses")
Reported-at: https://mail.openvswitch.org/pipermail/ovs-discuss/2024-February/052938.html
Reported-by: Derrick Lim <[email protected]>
Acked-by: Eelco Chaudron <[email protected]>
Signed-off-by: Ilya Maximets <[email protected]>
Avoid unnecessary thread creation as no upcalls are generated,
resulting in idle threads waiting for process termination.

This optimization significantly reduces memory usage, cutting it
by half on a 128 CPU/thread system during testing, with the number
of threads reduced from 95 to 0.

Acked-by: Mike Pattrick <[email protected]>
Signed-off-by: Eelco Chaudron <[email protected]>
Some network cards support inner checksum offloading but not outer
checksum offloading. Currently OVS will resolve that outer checksum but
allows the network card to resolve the inner checksum, invalidating the
outer checksum in the process.

Now if we can't offload outer checksums, we don't offload inner either.

Reported-at: https://issues.redhat.com/browse/FDP-363
Fixes: 084c808 ("userspace: Support VXLAN and GENEVE TSO.")
Signed-off-by: Mike Pattrick <[email protected]>
Signed-off-by: Ilya Maximets <[email protected]>
IANAL, but I think we can extend the copyright attached
to documentation to cover the current year: we are still
actively working on the documentation.

Signed-off-by: Simon Horman <[email protected]>
Acked-by: Mike Pattrick <[email protected]>
Acked-by: Eelco Chaudron <[email protected]>
Correct spelling errors in .rst files flagged by codespell.

Also correct some minor grammar errors in nearby documentation.

Signed-off-by: Simon Horman <[email protected]>
Acked-by: Mike Pattrick <[email protected]>
Acked-by: Eelco Chaudron <[email protected]>
The Anuket was formed by a merger of OPNFV and CNTT [1].

Also, VswitchPerf, aka vsperf, formerly an OPNFV project,
has been renamed ViNePerf [2].

Update links and documentation accordingly.

The old links were broken, this was flagged by make check-docs

[1] https://anuket.io/news/2021/01/27/lf-networking-launches-anuket-an-open-source-project-to-accelerate-infrastructure-compliance-interoperability-and-5g-deployments/
[2] https://docs.opnfv.org/projects/vineperf/en/latest/release/release-notes/release-notes.html

Signed-off-by: Simon Horman <[email protected]>
Acked-by: Mike Pattrick <[email protected]>
Acked-by: Eelco Chaudron <[email protected]>
Update link to OCF Resource Agents documentation as the existing link
is broken. Also, use HTTPS.

Broken link flagged by make check-docs

Signed-off-by: Simon Horman <[email protected]>
Acked-by: Mike Pattrick <[email protected]>
Acked-by: Eelco Chaudron <[email protected]>
This updates links to several upstream Kernel documents.

1. Lore is now the canonical archive for the netdev mailing list

2. net-next is now maintained by the netdev team,
   of which David Miller is currently a member,
   rather than only by David.

   Also, use HTTPS rather than HTTP.

3. The Netdev FAQ has evolved into the Netdev Maintainer Handbook.

4. The Kernel security document link was dead,
   provide the current canonical location for this document instead.

1., 2. & 3. Found by inspection
4. Flagged by check-docs

Signed-off-by: Simon Horman <[email protected]>
Acked-by: Mike Pattrick <[email protected]>
Acked-by: Eelco Chaudron <[email protected]>
With a new runner update, GitHub Actions had a kernel update.
And it seems like something changed between kernels 6.2 and 6.5
so this test now fails very frequently.

I can reproduce the same issue on RHEL 9, and I can't reproduce
it on Ubuntu 23.04 (kernel 6.2).

The test is creating a NAT with a single address+port pair in
an attempt to simulate an address space exhaustion.  It is
expected that a first connection with wget leaves a conntrack
entry in a TIME_WAIT state and the second wget should fail
as long as this entry remains, because the only available
address+port pair is already taken.

However, very frequently (not always!) the second connection
replaces the first conntrack entry with a new one and connection
succeeds.  There is still only one connection in the conntrack
at any single moment in time, so there is seemingly no issue with
the NAT, but the behavior is unexpected and the test fails.

The issue is likely introduced by a new kernel feature that
allows to evict connections that are in the process of closing:
  https://lore.kernel.org/netdev/[email protected]/

Disable the test in CI until we figure out how to fix it.

Acked-by: Simon Horman <[email protected]>
Acked-by: Paolo Valerio <[email protected]>
Acked-by: Eelco Chaudron <[email protected]>
Signed-off-by: Ilya Maximets <[email protected]>
Trace attempts to process all the recirculations.  However, if there
is a recirculation loop, i.e. if every recirculation generates another
recirculation, this process will never stop.  It will grind until the
trace fills the system memory.

A simple reproducer:

  make sandbox
  ovs-vsctl add-br br0
  ovs-vsctl add-port br0 p1
  ovs-ofctl add-flow br0 "table=0,in_port=p1,ip,actions=ct(table=0)"
  ovs-appctl ofproto/trace br0 in_port=p1,ip

Limit the number of recirculations trace is processing with a fairly
arbitrary number - 4096 (loosely based on the resubmit limit, but
they are not actually related).

Not adding a test for this since it's only for a trace, but also
because the test may lead to OOM event in a system if the test fails,
which is not nice.

Fixes: e6bc8e7 ("ofproto/trace: Add support for tracing conntrack recirculation")
Reported-by: Jaime Caamaño Ruiz <[email protected]>
Acked-by: Simon Horman <[email protected]>
Signed-off-by: Ilya Maximets <[email protected]>
In order to properly balance bond traffic, ofproto/bond periodically
reads usage statistics of the post-recirculation rules (which are added
to a hidden internal table).

To do that, each "struct bond_entry" (which represents a hash within a
bond) stores the last seen statistics for its rule. When a hash is moved
to another member (due to a bond rebalance or the previous member going
down), the rule is typically just modified, i.e: same match different
actions. In this case, statistics are preserved and accounting continues
to work.

However, if the rule gets completely deleted (e.g: when all bond members
go down) and then re-created, the new rule will have 0 tx_bytes but its
associated entry will still store a non-zero last-seen value.
This situation leads to an overflow of the delta calculation (computed
as [current_stats_value - last_seen_value]), which can affect traffic
as the hash will be considered to carry a lot of traffic and rebalancing
will kick in.

In order to fix this situation, reset the value of last seen statistics
on rule deletion.

Implementation notes:
Modifying pr_tx_bytes requires write-locking the global rwlock but a
lockless version of update_recirc_rules was being maintained to avoid locking
on bon_unref().
Considering the small impact of locking during bond removal, removing the
lockless version and relying on clang's thread safety analysis is preferred.

Also, folding Ilya's [1], i.e: fixing thread safety annotation in
update_recirc_rules() to require holding write-lock.

[1]
https://patchwork.ozlabs.org/project/openvswitch/patch/[email protected]/

Reported-at: openvswitch/ovs-issues#319
Co-authored-by: Ilya Maximets <[email protected]>
Signed-off-by: Adrian Moreno <[email protected]>
Signed-off-by: Ilya Maximets <[email protected]>
We need to know exact linking / compilation errors in order
to fix issues.  We could have uploaded it as an artifact,
but it seems easier to just print it out for now.

Acked-by: Simon Horman <[email protected]>
Acked-by: Alin-Gabriel Serdean <[email protected]>
Signed-off-by: Ilya Maximets <[email protected]>
There is no chmod or 'mkdir -m' support on Windows, so setting file
permissions for keys and certificates doesn't actually work.

Implementing them using icacls utility instead.

ovs-pki script currently only uses 0700 and 0750 modes, so only those
(and 0600) are implemented.  NTFS ACLs on Windows are fairly different
and more complex in comparison with Unix file permissions, so it's hard
to implement these functions in a generic way.  The script will fail if
it will encounter an unknown mode.

0700 is implemented as a F (full access) for 'Creator Owner' with no
other permissions.  0750 has an additional RX (read+execute) for the
'Creator Group'.  0600 is implemented the same as 0700, since it
doesn't matter for this use case to have or not to have an executable
or traversal permissions managed separately from everything else and
it would be a little overly verbose to give all the permissions except
for X.

Inheritance rules are set to (OI)(CI), so the folder itself, subfolders
and files in a folder inherit those ACEs.

'umask' also doesn't work on Windows.  Instead, moving the private key
output files to a temporary folder that has restricted access already
configured.  The file will inherit these restricted ACEs.
It should not be necessary to set explicit permissions for these
files since moving them within the same volume should preserve ACEs.
However, it might be safer to chmod them directly as well, just in
case.  Windows administrators will still have to be careful with
private keys, because file copies do not preserve permissions and
moves to different volumes do not preserve them as well.  'robocopy'
with flags to copy security should be used in these cases.  We may
want to re-implement 'mv' with 'robocopy' if that becomes a problem
in the future.

There is one more place where umask is used in the script for
creation of a self-signed certificate, but it is not actually needed
there since the resulted certificate doen't need to be private, so
not changing this part for now.

Tested with running an empty 'make check' in AppVeyor and examining
permissions for files in tests/pki:

       Files          |   Linux    |             Windows
 ---------------------+------------+--------------------------------------
 controllerca         | drwxr-xr-x | NT AUTHORITY\SYSTEM:(I)(OI)(CI)(F)
 switchca             |            | BUILTIN\Administrators:(I)(OI)(CI)(F)
 *ca\certs            |            | BUILTIN\Users:(I)(OI)(CI)(RX)
 *ca\crl              |            | BUILTIN\Users:(I)(CI)(AD)
 *ca\newcerts         |            | BUILTIN\Users:(I)(CI)(WD)
                      |            | APPVEYOR-VM\appveyor:(I)(F)
                      |            | CREATOR OWNER:(I)(OI)(CI)(IO)(F)
 ---------------------+------------+--------------------------------------
 stamp                | -rw-r--r-- | NT AUTHORITY\SYSTEM:(I)(F)
 test-cert.pem        |            | BUILTIN\Administrators:(I)(F)
 test-req.pem         |            | BUILTIN\Users:(I)(RX)
 test2-cert.pem       |            | APPVEYOR-VM\appveyor:(I)(F)
 test2-req.pem        |            |
 *ca\ca.cnf           |            |
 *ca\cacert.pem       |            |
 *ca\careq.pem        |            |
 *ca\crlnumber        |            |
 *ca\index.txt*       |            |
 *ca\serial*          |            |
 *ca\newcerts\*.pem   |            |
 ---------------------+------------+--------------------------------------
 controllerca\private | drwx------ | APPVEYOR-VM\appveyor:(F)
 switchca\private     |            | CREATOR OWNER:(OI)(CI)(IO)(F)
 ---------------------+------------+--------------------------------------
 test-privkey.pem     | -rw------- | APPVEYOR-VM\appveyor:(F)
 test2-privkey.pem    |            |
 *ca\private\cakey.pem|            |

We can see that private folders and keys have only a full access from
their owners.  Other files and folders have some extra inherited ACEs
from a containing folder.

Acked-by: Simon Horman <[email protected]>
Acked-by: Alin-Gabriel Serdean <[email protected]>
Signed-off-by: Ilya Maximets <[email protected]>
OpenSSL 1.1.0 changed the library names from libeay32 and ssleay32 to
standard libssl and libcrypto.  All the versions of OpenSSL that used
old names reached their official EoL, so it should be safe to just
migrate to new names.  They can still be supported via premium support
option, but I don't think that is important for us.

Also, OpenSSL installers for older versions had the following folder
structure:

  C:\OPENSSL-WIN64\
  +---bin
  +---include
  |   +---openssl
  +---lib
      |   libeay32.lib
      |   ssleay32.lib
      +---VC
              libeay32MD.lib
              libeay32MDd.lib
              libeay32MT.lib
              libeay32MTd.lib
              ssleay32MD.lib
              ssleay32MDd.lib
              ssleay32MT.lib
              ssleay32MTd.lib

With newer OpenSSL 3+ the structure is different:

  C:\OPENSSL-WIN64
  +---bin
  +---include
  |   +---openssl
  +---lib
      +---VC
          +---x64
              +---MD
              |       libcrypto.lib
              |       libssl.lib
              +---MDd
              |       libcrypto.lib
              |       libssl.lib
              +---MT
              |       libcrypto.lib
              |       libssl.lib
              +---MTd
                      libcrypto.lib
                      libssl.lib

Basically, instead of one generic library in the lib folder and a bunch
of differently named versions of it for different type of linkage, we
now have multiple instances of the library located in different folders
based on the linkage type.  So, we have to provide an exact path in
order to find the library.

'lib/VC/x64/MT' was chosen in this patch since it is a way used for
building in build-aux/ccl.
MD stands for dynamic linking, MT is static, 'd' stands for debug
versions of the libraries.

While at it, fixing documentation examples to point to Win64 default
installation folder.

Acked-by: Simon Horman <[email protected]>
Acked-by: Alin-Gabriel Serdean <[email protected]>
Signed-off-by: Ilya Maximets <[email protected]>
OpenSSL 1.0.2u is long deprecated and not available for download.
So, our CI never actually downloads it and uses whatever is in the
OpenSSL-Win64 folder provided by AppVeyor.  Luckily, it happens to
be OpenSSL 1.0.2u today.

The oldest supported version of OpenSSL upstream today is 3.0.
And it is an LTS version.  3.1 and 3.2 are not LTS.

Use OpenSSL 3.0 for testing instead.

This commit does a few things to achieve that:

1. Removes the folder provided by AppVeyor.  This way we will fail
   the build if something goes wrong instead of silently using
   OpenSSL version provided by AppVeyor.

2. Obtains the JSON description of available releases and downloads
   the latest minor version of OpenSSL 3.0 64-bit.  With this approach
   we should not need to update the download link that frequently.
   New minor releases will be picked up automatically.  They should
   not have any breaking changes, so should be fine to use in CI.
   OpenSSL 3.0 is supported until at least Sep 2026.

   The JSON file is an official file referenced on the:
        https://slproweb.com/products/Win32OpenSSL.html
   So, it should be safe to use.

3. Executes the downloaded installer with 'Start-Process -Wait' to
   properly wait for installation to finish instead of just sleeping
   for 30 seconds.

4. Caches the downloaded installer, so we're not downloading 300 MB
   on each CI run as that is not nice to do.  We know the hash of the
   latest version, so we will re-download only when the binary changes,
   i.e. on a new minor release.

   For the cache to work we need to introduce the 'install' phase,
   because caches are populated after 'init', but before 'install'.
   Alternatively, we could have just renamed 'init' to 'install',
   but I think it's a little nicer to have separate phases, and we
   can also move 'windows-prepare.sh' to the install phase.

   Cache is also invalidated whenever appveyor.yml changes.

Acked-by: Simon Horman <[email protected]>
Acked-by: Alin-Gabriel Serdean <[email protected]>
Signed-off-by: Ilya Maximets <[email protected]>
Since the patch-set that included [1] there has been a policy of using
the term member for bonds, LACP, and bundle contexts.  This is
consistent with the more recently adopted policy of using the inclusive
naming word list v1 [2, 3].

This patch addresses two instances where the term member should be used
in vswitch.xml. It does not address instances of alternative wording
that require code updates, which can addressed as follow-up activity.

[1] 91fc374 ("Eliminate use of term "slave" in bond, LACP, and bundle contexts.")
[2] df5e5cf ("Documentation: Add section on inclusive language.")
[3] https://inclusivenaming.org/word-lists/

Signed-off-by: Simon Horman <[email protected]>
Acked-by: Kevin Traynor <[email protected]>
Acked-by: Eelco Chaudron <[email protected]>
Recently OVS adopted a policy of using the inclusive naming word list v1
[1, 2].

This patch addresses the use of the term master repository by
using the term main repository instead.

This is as distinct from addressing the use of a master branch,
which remains as a follow-up task.

[1] df5e5cf ("Documentation: Add section on inclusive language.")
[2] https://inclusivenaming.org/word-lists/

Signed-off-by: Simon Horman <[email protected]>
Acked-by: Kevin Traynor <[email protected]>
Acked-by: Eelco Chaudron <[email protected]>
n-sandeep and others added 30 commits July 23, 2024 08:22
Changes include
- Receiving IPv6 address from OVS tunnel configuration and call ovs
  sidecar API's
- Changes to support Static MAC entries programming to the target.

Signed-off-by: Sandeep N <[email protected]>
Signed-off-by: Derek G Foster <[email protected]>
This PR includes
- Basic logic to include a unique bridge ID for each bridge creation.
- Basic logic to include a unique SRC port ID for each vxlan tunnel creation.
- Additional intelligence to read each port (including tunnel port)
  configuration and extract vlan id, vlan mode.
- Configure other new P4 tables for
  vxlan_encap (V4 and V6)
  vxlan_encap_pop_vlan (V4 and V6)
  vxlan_decap (V4 and V6)
  vxlan_decap_push_vlan (V4 and V6)
  tunnel_term (V4 and V6)
  rx_tunnel (V4 and V6)
  vlan_push
  vlan_pop
  tunnel_src_port
  vsi_src_port
- Delete P4 tables when port is deleted (including vxlan) or bridge
  table.

Signed-off-by: Sandeep N <[email protected]>
Signed-off-by: Derek G Foster <[email protected]>
During port/bridge delete CFG value is garbage intermittently, this leads to
undefined beavior while deleting p4 target rules.
This review fixes garbage value issue during delete case.

Signed-off-by: Sandeep N <[email protected]>
* Runtime option to disable OvS offload to P4 target

Control OVS offload with an environment variable during runtime.
If env variable OVS_P4_OFFLOAD=false, then disable OVS offload, else
if OVS_P4_OFFLOAD is not set or OVS_P4_OFFLOAD is any value other
than false, then by default enable OVS offload.

Signed-off-by: Sandeep N <[email protected]>

* Addressing review comments and fixing UT defects

This change introduces a mutex and this mutex is taken before
making a p4runtime call to program FDB entry. This is to avoid multiple
parallel calls when OvS creates multiple revalidator and handler threads
which tries to make a p4runtime call when a MAC is learnt.

Signed-off-by: Sandeep N <[email protected]>

---------

Signed-off-by: Sandeep N <[email protected]>
We are not handling all delete port notifications as the
OVSD configuration will be invalid/NULL when we receive delete of
multiple bridges, ports in a single transaction.
During delete path we are not looking for interface DB cfg,
process delete notification with internal structures irrespective of
iface->cfg is valid or not

Signed-off-by: Sandeep N <[email protected]>
Signed-off-by: Satish Pitchikala <[email protected]>
Signed-off-by: Satish Pitchikala <[email protected]>
Add support for updating P4 tables for maintaining IP and mac addresses
learnt from a flow with ARP response. This will be used when reconstructing L2
after IPSEC packet is decrypted

Signed-off-by: nupuruttarwar <[email protected]>
* Introduce option to specify grpc server address for ovs-p4rt

This patch does the following:
- Introduce the option to specify grpc server address for ovs-p4rt
  client. It can be specified using --grpc-addr or -g option when
  starting ovs-vswitchd. Default grpc address is localhost
- Introduce checks to program FDB learnt entries to hardware only
  if it's a  new entry. Stress tests shows that unnecessary calls to grpc
  server to program entries that are already present throws duplicate
  entry errors and reduces the performance and also results into
  packet loss intermittently

Signed-off-by: nupuruttarwar <[email protected]>
Co-authored-by: Derek G Foster <[email protected]>
Signed-off-by: Derek G Foster <[email protected]>
- Added #includes to define all the symbols used by ovs-p4rt.h.

Signed-off-by: Derek G Foster <[email protected]>
- Used clang-format to format file.

- Alphabetized function declarations.

Signed-off-by: Derek G Foster <[email protected]>
- Moved ovs-p4rt.h from the include/openvswitch folder to
  a new include/ovsp4rt folder. Updated the buildsystem
  to reflect the change.

- Updated #include paths from "openswitch/ovs-p4rt.h"
  to "ovsp4rt/ovs-p4rt.h".

- Implemented an install-data-hook to define a soft link
  from openvswitch/ovs-p4rt.h to ovsp4rt/ovs-p4rt.h, to
  maintain backward compatibility. Modified the dist-hook-git
  target to ignore the symlink.

Signed-off-by: Derek Foster <[email protected]>
- Renamed the functions that ovs-p4rt publishes for use by OVS to
  conform better to C naming conventions.

- Added .vscode/ and .vslick/ to .gitignore.

Signed-off-by: Derek Foster <[email protected]>
1. Added or updated Intel copyright notices.
2. Corrected #elif directives to #else in ofproto-dpif-xlate.
3. Added missing P4OVS conditionals in lib/netdev.h and
   lib/netdev-vport.c.
4. Removed trailing whitespace.

Changes #2 and #3 fix compilation errors.
Change #3 fixes the eight unit tests that used to fail.

Signed-off-by: Derek Foster <[email protected]>
)

- Manually terminate grpc addr string to address Coverity hit
- Revise coverity fix (#119)

Signed-off-by: Sabeel Ansari <[email protected]>
Signed-off-by: Derek Foster <[email protected]>
Co-authored-by: Derek G Foster <[email protected]>
- Removed symbolic link from include/openvswitch/ovs-p4rt.h to
  include/ovsp4rt/ovs-p4rt.h.

Signed-off-by: Derek Foster <[email protected]>
- Renamed the global `grpc_addr` variable to `p4ovs_grpc_addr`.
  Symbols we introduce into the global namespace should be
  prefixed to make their affinity clear and to reduce the
  likelihood of collision with other global names.

Signed-off-by: Derek Foster <[email protected]>
Signed-off-by: Derek Foster <[email protected]>
- Replaced OVS_CHECK_P4OVS with OVS_CHECK_OVSP4RT.

- Implemented --with-ovsp4rt[=stugs] command-line flag to enable
  new P4 build modes.

- Implemented OVSP4RT and LEGACY_P4OVS automake conditionals to
  specify whether to build in OVSP4RT mode or legacy P4OVS mode.

- Moved p4ovs.c and p4ovs.h to the lib directory.

Signed-off-by: Derek Foster <[email protected]>
- Downgraded three VLOG_ERR messages to VLOG_DBG. This fixes all
  but one of the test failures in the P4OVS build.

- Shortened one of the downgraded error messages.

Signed-off-by: Derek Foster <[email protected]>
- The mac_learning_static_none_move counter was being incremented
  twice for the same packet, causing one of the tests to fail.

  Discovered that the P4 code makes its own call to the function,
  believing it to be a predicate with no side effects.

  Solved the problem by having the function return a Boolean
  value indicating whether a static mac port move was being
  attempted. The non-P4 call site in mac_learning_update()
  increments the counter if the returned value is True. The P4
  call site in xlate_normal() ignores the returned value.

Signed-off-by: Derek Foster <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.