Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Do not kernelize rtp streams without confirmed SSRC #1855

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

Den4t
Copy link
Contributor

@Den4t Den4t commented Sep 2, 2024

Hi !

The story is - caller is a Cisco Webex video сonference device (videocodec).
Cisco device initiates a call, opens audio/video streams and also offers screen sharing as H.264 video, screen sharing can start optionally at any time during the call. The rtpengine is media proxy at the perimeter, callee is somewhere in the public and rtpengine converts streams from Cisco RTP to SRTP for remote party.

Now the issue - screen sharing stream randomly stops working after some time in a call, audio and video streams work properly.
During debugging we noticed that if screen sharing starts almost at the beginning of the call it works fine.
Screen sharing start time in working scenario correlates with learning phase interval, so we tried to turn off kernelization, after that the screen sharing issues were completely gone.

After more debugging, we found out that rtpengine kernelizes screen sharing stream with empty SSRC list.
It happens because Cisco in the beginning of media stream sends several packets which are detected by rtpengine as RTP V1 (probably for the NAT traversal).These packets are ignored by rtpengine but affect the decision to kernelize stream. What happens next - because stream to caller is SRTP, kernel module tries to track ROC. While ROC is zero stream functions as expected, but once ROC gets incremented - the stream becomes broken. We see two cases of this - first is FEC, it works as separate SSRC in the same stream so ROC must be tracked separately for each SSRC, in our case rtpengine does not distingush them and incrementation of ROC for one SSRC immediately breaks the second one as SRTP authentication header for it becomes invalid. Second is the session timer in signaling, which fires re-INVITE and unkernelizes then kernelizes stream again, once screen sharing stream is unkenelized rtpengine reports a “new ingress SSRC” and, as we assume, clears ROC for the stream to zero, it also breaks SRTP authentication headers for SSRC where ROC was incremented at least once.

So i think the solution in this case - do not kernelize RTP stream without valid SSRC, this PR adds a strict check for that in learning phase. We tested changes with version 11.5, but we think the changes will work as expected in version 12 as well.

@rfuchs
Copy link
Member

rfuchs commented Sep 2, 2024

Thanks for the analysis. This situation should actually be handled by the kernel module passing through unknown SSRCs to user space, and the daemon then setting up the appropriate tracking for the kernel module. IOW there should be an "unkernelize" event, immediately followed by another "kernelize" event, now with the new SSRCs being tracked. I suppose this doesn't happen?

@Den4t
Copy link
Contributor Author

Den4t commented Sep 2, 2024

IOW there should be an "unkernelize" event, immediately followed by another "kernelize" event, now with the new SSRCs being tracked. I suppose this doesn't happen?

Does not happend when ROC incremented during mixing FEC (if present) with primary media payload,
but happend when when new offer (after re-INVITE) fired.

Yes, i was thinking about kernel module, target_find_ssrc function look in ssrc array index 0 at first lines:

// returns: -1 = no SSRCs were given, -2 = SSRCs were given but SSRC not found
static int target_find_ssrc(struct rtpengine_target *g, uint32_t ssrc) {
        int ssrc_idx;
                
        if (unlikely(!g->target.ssrc[0]))
                return -1;

and for streams without SSRC filled allways return -1, i think it is leaved here for non RTP streams (i can mistake), for example:
m=application XXXXX UDP/BFCP *
This streams also kenelized, i can see this in logs, this seems to be normal behavior, so i decided to check exactly
RTP media transport in user space for valid SSRC before kernelize. In our case (with patch applied) when first non RTP packets
arrive the kernelizing is not happened, but when screen sharing stream is actually opened this stream are kernelized.

@rfuchs
Copy link
Member

rfuchs commented Sep 3, 2024

IOW there should be an "unkernelize" event, immediately followed by another "kernelize" event, now with the new SSRCs being tracked. I suppose this doesn't happen?

Does not happend when ROC incremented during mixing FEC (if present) with primary media payload, but happend when when new offer (after re-INVITE) fired.

ROC updates should be transparent either way, but a new SSRC appearing should trigger it, without or without re-invite.

Yes, i was thinking about kernel module, target_find_ssrc function look in ssrc array index 0 at first lines:

...

and for streams without SSRC filled allways return -1, i think it is leaved here for non RTP streams (i can mistake), for example: m=application XXXXX UDP/BFCP * This streams also kenelized, i can see this in logs, this seems to be normal behavior,

Yes, I think there's a bit of a mix-up in the kernel code, that it doesn't clearly distinguish between streams which are expected to have no SSRCs or unknown SSRCs, and streams which are required to have known SSRCs (definitely required for SRTP).

So in the case of SRTP (or generally RTP), a newly appearing SSRC should

  1. get the packet passed through to user space
  2. user space should unkernelize
  3. user space should set up SSRC contexts and handle the packet
  4. user space should re-kernelize

I'll leave this PR open for now and see if I can pin down the underlying issue.

sipwise-jenkins pushed a commit that referenced this pull request Sep 4, 2024
If a stream has been pushed to the kernel from anything other than RTP,
even though RTP is expected, we get a forwarding entries without any
SSRCs. This is valid, but once actual RTP is received, it needs to be
passed on to user space, so that SSRC contexts can be set up.

Possible fix for #1855

Change-Id: I51b82d3cf79cf66780fdde154bebe56e0f43174b
@rfuchs
Copy link
Member

rfuchs commented Sep 4, 2024

I could reproduce this (or at least I think so) and 1b187b5 seems to fix the issue. Are you able to confirm?

@Den4t
Copy link
Contributor Author

Den4t commented Sep 4, 2024

Hi, Richard !

I think i can fix the problem in kernel module.

I conducted a test lasting 6 hours, there were no problems, the flow was kernelized as expected,
including during new offers. I attached a log filtered by the port of the stream from the rpengine side (port 40828),
noted specific places in it, as well as module statistics before the end of the call, it shows that ROC
was incremented correctly, including for FEC.

The fix is simple:
master...Den4t:rtpengine:roc-kernel

Its meaning is that the target_find_src function is called only when a valid RTP stream is fixed at the input,
accordingly, a valid SSRC must be present, therefore, if the SSRC list for the stream is empty, then you should not return an error,
instead you need to transfer the packet to the user space.

list_anon.txt
log_anon.txt

@rfuchs
Copy link
Member

rfuchs commented Sep 4, 2024

The fix is simple: master...Den4t:rtpengine:roc-kernel

This is actually similar to my own approach above, at least in effect. I'll have to think about whether getting rid of the distinction between "SSRC not found" and "no SSRCs given" is valid though. There is a possible use case of allowing RTP forwarding without specifying any SSRCs. Perhaps an additional bit flag is needed (in addition to the existing rtp one) so that the user space can signal that SSRCs are expected to exist. I'll let you know.

sipwise-jenkins pushed a commit that referenced this pull request Sep 4, 2024
If a stream has been pushed to the kernel from anything other than RTP,
even though RTP is expected, we get a forwarding entries without any
SSRCs. This is valid, but once actual RTP is received, it needs to be
passed on to user space, so that SSRC contexts can be set up.

Possible fix for #1855

Change-Id: I51b82d3cf79cf66780fdde154bebe56e0f43174b
@Den4t
Copy link
Contributor Author

Den4t commented Sep 4, 2024

I could reproduce this (or at least I think so) and 1b187b5 seems to fix the issue. Are you able to confirm?

Missed this, sorry, will try tomorrow.

@rfuchs
Copy link
Member

rfuchs commented Sep 4, 2024

Updated to a0b705e to add an explicit flag

@Den4t
Copy link
Contributor Author

Den4t commented Sep 5, 2024

Updated to a0b705e to add an explicit flag

Hi, Richard !

I tested these changes, the results are the same as in the previous test, everything works fine.

Note: i tested at 11.5

sipwise-jenkins pushed a commit that referenced this pull request Sep 5, 2024
If a stream has been pushed to the kernel from anything other than RTP,
even though RTP is expected, we get a forwarding entries without any
SSRCs. This is valid, but once actual RTP is received, it needs to be
passed on to user space, so that SSRC contexts can be set up.

Possible fix for #1855

Change-Id: I51b82d3cf79cf66780fdde154bebe56e0f43174b
(cherry picked from commit a0b705e)
sipwise-jenkins pushed a commit that referenced this pull request Sep 5, 2024
If a stream has been pushed to the kernel from anything other than RTP,
even though RTP is expected, we get a forwarding entries without any
SSRCs. This is valid, but once actual RTP is received, it needs to be
passed on to user space, so that SSRC contexts can be set up.

Possible fix for #1855

Change-Id: I51b82d3cf79cf66780fdde154bebe56e0f43174b
(cherry picked from commit a0b705e)
(cherry picked from commit c3dd3dc)
sipwise-jenkins pushed a commit that referenced this pull request Sep 5, 2024
If a stream has been pushed to the kernel from anything other than RTP,
even though RTP is expected, we get a forwarding entries without any
SSRCs. This is valid, but once actual RTP is received, it needs to be
passed on to user space, so that SSRC contexts can be set up.

Possible fix for #1855

Change-Id: I51b82d3cf79cf66780fdde154bebe56e0f43174b
(cherry picked from commit a0b705e)
sipwise-jenkins pushed a commit that referenced this pull request Sep 5, 2024
If a stream has been pushed to the kernel from anything other than RTP,
even though RTP is expected, we get a forwarding entries without any
SSRCs. This is valid, but once actual RTP is received, it needs to be
passed on to user space, so that SSRC contexts can be set up.

Possible fix for #1855

Change-Id: I51b82d3cf79cf66780fdde154bebe56e0f43174b
(cherry picked from commit a0b705e)
(cherry picked from commit 48bae07)
sipwise-jenkins pushed a commit that referenced this pull request Sep 10, 2024
If a stream has been pushed to the kernel from anything other than RTP,
even though RTP is expected, we get a forwarding entries without any
SSRCs. This is valid, but once actual RTP is received, it needs to be
passed on to user space, so that SSRC contexts can be set up.

Possible fix for #1855

Change-Id: I51b82d3cf79cf66780fdde154bebe56e0f43174b
(cherry picked from commit a0b705e)
(cherry picked from commit c3dd3dc)
sipwise-jenkins pushed a commit that referenced this pull request Sep 10, 2024
If a stream has been pushed to the kernel from anything other than RTP,
even though RTP is expected, we get a forwarding entries without any
SSRCs. This is valid, but once actual RTP is received, it needs to be
passed on to user space, so that SSRC contexts can be set up.

Possible fix for #1855

Change-Id: I51b82d3cf79cf66780fdde154bebe56e0f43174b
(cherry picked from commit a0b705e)
(cherry picked from commit c3dd3dc)
(cherry picked from commit b3f2685)
sipwise-jenkins pushed a commit that referenced this pull request Sep 10, 2024
If a stream has been pushed to the kernel from anything other than RTP,
even though RTP is expected, we get a forwarding entries without any
SSRCs. This is valid, but once actual RTP is received, it needs to be
passed on to user space, so that SSRC contexts can be set up.

Possible fix for #1855

Change-Id: I51b82d3cf79cf66780fdde154bebe56e0f43174b
(cherry picked from commit a0b705e)
(cherry picked from commit c3dd3dc)
(cherry picked from commit b3f2685)
sipwise-jenkins pushed a commit that referenced this pull request Sep 10, 2024
If a stream has been pushed to the kernel from anything other than RTP,
even though RTP is expected, we get a forwarding entries without any
SSRCs. This is valid, but once actual RTP is received, it needs to be
passed on to user space, so that SSRC contexts can be set up.

Possible fix for #1855

Change-Id: I51b82d3cf79cf66780fdde154bebe56e0f43174b
(cherry picked from commit a0b705e)
(cherry picked from commit c3dd3dc)
(cherry picked from commit b3f2685)
(cherry picked from commit 6b76b49)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants