Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(trace): add drop-only option #632

Merged
merged 10 commits into from
Sep 27, 2024
Merged

Conversation

linglilongyi
Copy link
Contributor

@linglilongyi linglilongyi commented Sep 13, 2024

Background

#524
Currently dae trace dumps huge amount of packets, most of them are not necessary. In most cases, we care about packets that are dropped by kernel, aka processed by kernel function kfree_skb_reason, or called by kfree_skbmem but without consume_skb.

Checklist

Full Changelogs

  • Add --drop-only option to where dae only prints the packets dropped by kernel.
  • dae trace now print all the skbs of a packet until the packet is freed by kernel, whether it runs with --drop-only or not.

Issue Reference

Closes #524

Test Result

start tracing
ffff9e800e629d00 mark=0 netns=4026531840 if=2(ens18) proc=735(/usr/bin/dae) 142.250.197.174:80 > 192.168.1.101:33334 tcp_flags=.F payload_len=0 tcp_v4_early_demux
ffff9e800e629d00 mark=0 netns=4026531840 if=2(ens18) proc=735(/usr/bin/dae) 142.250.197.174:80 > 192.168.1.101:33334 tcp_flags=.F payload_len=0 ip_route_input_noref
ffff9e800e629d00 mark=0 netns=4026531840 if=2(ens18) proc=735(/usr/bin/dae) 142.250.197.174:80 > 192.168.1.101:33334 tcp_flags=.F payload_len=0 ip_route_input_slow
ffff9e800e629d00 mark=0 netns=4026531840 if=2(ens18) proc=735(/usr/bin/dae) 142.250.197.174:80 > 192.168.1.101:33334 tcp_flags=.F payload_len=0 fib_validate_source
ffff9e800e629d00 mark=0 netns=4026531840 if=2(ens18) proc=735(/usr/bin/dae) 142.250.197.174:80 > 192.168.1.101:33334 tcp_flags=.F payload_len=0 __fib_validate_source
ffff9e800e629d00 mark=0 netns=4026531840 if=2(ens18) proc=735(/usr/bin/dae) 142.250.197.174:80 > 192.168.1.101:33334 tcp_flags=.F payload_len=0 ip_local_deliver
ffff9e800e629d00 mark=0 netns=4026531840 if=2(ens18) proc=735(/usr/bin/dae) 142.250.197.174:80 > 192.168.1.101:33334 tcp_flags=.F payload_len=0 ip_local_deliver_finish
ffff9e800e629d00 mark=0 netns=4026531840 if=2(ens18) proc=735(/usr/bin/dae) 142.250.197.174:80 > 192.168.1.101:33334 tcp_flags=.F payload_len=0 ip_protocol_deliver_rcu
ffff9e800e629d00 mark=0 netns=4026531840 if=2(ens18) proc=735(/usr/bin/dae) 142.250.197.174:80 > 192.168.1.101:33334 tcp_flags=.F payload_len=0 raw_local_deliver
ffff9e800e629d00 mark=0 netns=4026531840 if=2(ens18) proc=735(/usr/bin/dae) 142.250.197.174:80 > 192.168.1.101:33334 tcp_flags=.F payload_len=0 tcp_v4_rcv
ffff9e800e629d00 mark=0 netns=4026531840 if=2(ens18) proc=735(/usr/bin/dae) 142.250.197.174:80 > 192.168.1.101:33334 tcp_flags=.F payload_len=0 tcp_v4_fill_cb
ffff9e800e629d00 mark=0 netns=4026531840 if=2(ens18) proc=735(/usr/bin/dae) 142.250.197.174:80 > 192.168.1.101:33334 tcp_flags=.F payload_len=0 tcp_timewait_state_process
ffff9e800e629d00 mark=0 netns=4026531840 if=2(ens18) proc=735(/usr/bin/dae) 142.250.197.174:80 > 192.168.1.101:33334 tcp_flags=.F payload_len=0 tcp_v4_send_ack
ffff9e800e629d00 mark=0 netns=4026531840 if=2(ens18) proc=735(/usr/bin/dae) 142.250.197.174:80 > 192.168.1.101:33334 tcp_flags=.F payload_len=0 ip_send_unicast_reply
ffff9e800e629d00 mark=0 netns=4026531840 if=2(ens18) proc=735(/usr/bin/dae) 142.250.197.174:80 > 192.168.1.101:33334 tcp_flags=.F payload_len=0 __ip_options_echo
ffff9e800e629d00 mark=0 netns=4026531840 if=2(ens18) proc=735(/usr/bin/dae) 142.250.197.174:80 > 192.168.1.101:33334 tcp_flags=.F payload_len=0 security_skb_classify_flow
ffff9e800e629d00 mark=0 netns=4026531840 if=2(ens18) proc=735(/usr/bin/dae) 142.250.197.174:80 > 192.168.1.101:33334 tcp_flags=.F payload_len=0 bpf_lsm_xfrm_decode_session
ffff9e800e629d00 mark=0 netns=4026531840 if=2(ens18) proc=735(/usr/bin/dae) 142.250.197.174:80 > 192.168.1.101:33334 tcp_flags=.F payload_len=0 kfree_skb_reason(SKB_DROP_REASON_NOT_SPECIFIED)
ffff9e800e629d00 mark=0 netns=4026531840 if=2(ens18) proc=735(/usr/bin/dae) 142.250.197.174:80 > 192.168.1.101:33334 tcp_flags=.F payload_len=0 skb_release_head_state
ffff9e800e629d00 mark=0 netns=4026531840 if=2(ens18) proc=735(/usr/bin/dae) 142.250.197.174:80 > 192.168.1.101:33334 tcp_flags=.F payload_len=0 skb_release_data
ffff9e800e629d00 mark=0 netns=4026531840 if=2(ens18) proc=735(/usr/bin/dae) 142.250.197.174:80 > 192.168.1.101:33334 tcp_flags=.F payload_len=0 kfree_skbmem

Copy link
Member

@jschwinger233 jschwinger233 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

除了两个小问题其他都还可以。

分割线之后的内容并不需要在这个 PR 里修改。


我自己这半年来的 trace 经历会觉得再细粒度一点的过滤会更有帮助。对一个运行透明代理的linux 来说 skb 可以按照来源划分为 lan/wan,也可以按照生命结束事件分为 local consumed/netdev emitted/dropped。

你这 PR 之后我来做这个更细粒度的划分。因为你已经把 skb 按照地址聚合好了,所以之后都很好改,感谢。

trace/trace.go Outdated Show resolved Hide resolved
switch sym.Name {
case "__kfree_skb","kfree_skbmem":
// most skb end in the call of kfree_skbmem
if !dropOnly || slices.Contains(skb2symNames[event.Skb],"kfree_skb_reason") {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

所以就算不 drop only 现在也会按照 skb 聚合、等 kfree 再输出。应该也没问题。

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

确实改变了原来的行为模式,我在changelog里补充一下

cmd/trace.go Outdated Show resolved Hide resolved
- refactor: also merge some line to simplify the code
Copy link
Contributor

@dae-prow dae-prow bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧪 Since the PR has been fully tested, please consider merging it.

@mzz2017 mzz2017 added this to the dae 0.8 milestone Sep 27, 2024
@mzz2017
Copy link
Contributor

mzz2017 commented Sep 27, 2024

@linglilongyi @jschwinger233 是否已经 ready to merge?

@jschwinger233
Copy link
Member

yes in my opinion

@mzz2017 mzz2017 merged commit b814105 into daeuniverse:main Sep 27, 2024
22 checks passed
@linglilongyi linglilongyi deleted the drop-only branch September 29, 2024 02:44
@dae-prow dae-prow bot mentioned this pull request Oct 11, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Enhancement] Troubeshooting tools
4 participants