Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PV frontends: vdispl: prepare DRM driver for upstreaming #6

Open
aanisov opened this issue May 22, 2017 · 0 comments
Open

PV frontends: vdispl: prepare DRM driver for upstreaming #6

aanisov opened this issue May 22, 2017 · 0 comments
Assignees

Comments

@aanisov
Copy link

aanisov commented May 22, 2017

Be so kind to end up with PRs to the correspondent repositories to get PV DRM funtional

@andr2000 andr2000 changed the title PV DRM Frontend should be implemented. PV frontends: implement DRM frontend Jul 7, 2017
@andr2000 andr2000 changed the title PV frontends: implement DRM frontend PV frontends: DRM: prepare for upstreaming Jul 7, 2017
@andr2000 andr2000 changed the title PV frontends: DRM: prepare for upstreaming PV frontends: DRM: prepare DRM driver for upstreaming Jul 7, 2017
@andr2000 andr2000 changed the title PV frontends: DRM: prepare DRM driver for upstreaming PV frontends: vdispl: prepare DRM driver for upstreaming Jul 7, 2017
andr2000 pushed a commit to andr2000/linux that referenced this issue Dec 22, 2017
Thomas reported that 'perf buildid-list' gets a SEGFAULT due to NULL
pointer deref when he ran it on a data with namespace events.  It was
because the buildid_id__mark_dso_hit_ops lacks the namespace event
handler and perf_too__fill_default() didn't set it.

  Program received signal SIGSEGV, Segmentation fault.
  0x0000000000000000 in ?? ()
  Missing separate debuginfos, use: dnf debuginfo-install audit-libs-2.7.7-1.fc25.s390x bzip2-libs-1.0.6-21.fc25.s390x elfutils-libelf-0.169-1.fc25.s390x
  +elfutils-libs-0.169-1.fc25.s390x libcap-ng-0.7.8-1.fc25.s390x numactl-libs-2.0.11-2.ibm.fc25.s390x openssl-libs-1.1.0e-1.1.ibm.fc25.s390x perl-libs-5.24.1-386.fc25.s390x
  +python-libs-2.7.13-2.fc25.s390x slang-2.3.0-7.fc25.s390x xz-libs-5.2.3-2.fc25.s390x zlib-1.2.8-10.fc25.s390x
  (gdb) where
  #0  0x0000000000000000 in ?? ()
  #1  0x00000000010fad6a in machines__deliver_event (machines=<optimized out>, machines@entry=0x2c6fd18,
      evlist=<optimized out>, event=event@entry=0x3fffdf00470, sample=0x3ffffffe880, sample@entry=0x3ffffffe888,
      tool=tool@entry=0x1312968 <build_id.mark_dso_hit_ops>, file_offset=1136) at util/session.c:1287
  #2  0x00000000010fbf4e in perf_session__deliver_event (file_offset=1136, tool=0x1312968 <build_id.mark_dso_hit_ops>,
      sample=0x3ffffffe888, event=0x3fffdf00470, session=0x2c6fc30) at util/session.c:1340
  #3  perf_session__process_event (session=0x2c6fc30, session@entry=0x0, event=event@entry=0x3fffdf00470,
      file_offset=file_offset@entry=1136) at util/session.c:1522
  xen-troops#4  0x00000000010fddde in __perf_session__process_events (file_size=11880, data_size=<optimized out>,
      data_offset=<optimized out>, session=0x0) at util/session.c:1899
  xen-troops#5  perf_session__process_events (session=0x0, session@entry=0x2c6fc30) at util/session.c:1953
  xen-troops#6  0x000000000103b2ac in perf_session__list_build_ids (with_hits=<optimized out>, force=<optimized out>)
      at builtin-buildid-list.c:83
  xen-troops#7  cmd_buildid_list (argc=<optimized out>, argv=<optimized out>) at builtin-buildid-list.c:115
  xen-troops#8  0x00000000010a026c in run_builtin (p=0x1311f78 <commands+24>, argc=argc@entry=2, argv=argv@entry=0x3fffffff3c0)
      at perf.c:296
  xen-troops#9  0x000000000102bc00 in handle_internal_command (argv=<optimized out>, argc=2) at perf.c:348
  xen-troops#10 run_argv (argcp=<synthetic pointer>, argv=<synthetic pointer>) at perf.c:392
  xen-troops#11 main (argc=<optimized out>, argv=0x3fffffff3c0) at perf.c:536
  (gdb)

Fix it by adding a stub event handler for namespace event.

Committer testing:

Further clarifying, plain using 'perf buildid-list' will not end up in a
SEGFAULT when processing a perf.data file with namespace info:

  # perf record -a --namespaces sleep 1
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 2.024 MB perf.data (1058 samples) ]
  # perf buildid-list | wc -l
  38
  # perf buildid-list | head -5
  e2a171c7b905826fc8494f0711ba76ab6abbd604 /lib/modules/4.14.0-rc3+/build/vmlinux
  874840a02d8f8a31cedd605d0b8653145472ced3 /lib/modules/4.14.0-rc3+/kernel/arch/x86/kvm/kvm-intel.ko
  ea7223776730cd8a22f320040aae4d54312984bc /lib/modules/4.14.0-rc3+/kernel/drivers/gpu/drm/i915/i915.ko
  5961535e6732a8edb7f22b3f148bb2fa2e0be4b9 /lib/modules/4.14.0-rc3+/kernel/drivers/gpu/drm/drm.ko
  f045f54aa78cf1931cc893f78b6cbc52c72a8cb1 /usr/lib64/libc-2.25.so
  #

It is only when one asks for checking what of those entries actually had
samples, i.e. when we use either -H or --with-hits, that we will process
all the PERF_RECORD_ events, and since tools/perf/builtin-buildid-list.c
neither explicitely set a perf_tool.namespaces() callback nor the
default stub was set that we end up, when processing a
PERF_RECORD_NAMESPACE record, causing a SEGFAULT:

  # perf buildid-list -H
  Segmentation fault (core dumped)
  ^C
  #

Reported-and-Tested-by: Thomas-Mich Richter <[email protected]>
Signed-off-by: Namhyung Kim <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Cc: Hari Bathini <[email protected]>
Cc: Hendrik Brueckner <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas-Mich Richter <[email protected]>
Fixes: f3b3614 ("perf tools: Add PERF_RECORD_NAMESPACES to include namespaces related info")
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
andr2000 pushed a commit to andr2000/linux that referenced this issue Dec 22, 2017
rdma_nl_rcv_msg() checks to see if it should use the .dump() callback
or the .doit() callback.  The check is done with this check:

if (flags & NLM_F_DUMP) ...

The NLM_F_DUMP flag is two bits (NLM_F_ROOT | NLM_F_MATCH).

When an RDMA_NL_LS message (response) is received, the bit used for
indicating an error is the same bit as NLM_F_ROOT.

NLM_F_ROOT == (0x100) == RDMA_NL_LS_F_ERR.

ibacm sends a response with the RDMA_NL_LS_F_ERR bit set if an error
occurs in the service.  The current code then misinterprets the
NLM_F_DUMP bit and trys to call the .dump() callback.

If the .dump() callback for the specified request is not available
(which is true for the RDMA_NL_LS messages) the following Oops occurs:

[ 4555.960256] BUG: unable to handle kernel NULL pointer dereference at
   (null)
[ 4555.969046] IP:           (null)
[ 4555.972664] PGD 10543f1067 P4D 10543f1067 PUD 1033f93067 PMD 0
[ 4555.979287] Oops: 0010 [#1] SMP
[ 4555.982809] Modules linked in: rpcrdma ib_isert iscsi_target_mod
target_core_mod ib_iser libiscsi scsi_transport_iscsi ib_ipoib rdma_ucm ib_ucm
ib_uverbs ib_umad rdma_cm ib_cm iw_cm dm_mirror dm_region_hash dm_log dm_mod
dax sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp kvm irqbypass
crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel crypto_simd
glue_helper cryptd hfi1 rdmavt iTCO_wdt iTCO_vendor_support ib_core mei_me
lpc_ich pcspkr mei ioatdma sg shpchp i2c_i801 mfd_core wmi ipmi_si ipmi_devintf
ipmi_msghandler acpi_power_meter acpi_pad nfsd auth_rpcgss nfs_acl lockd grace
sunrpc ip_tables ext4 mbcache jbd2 sd_mod mgag200 drm_kms_helper syscopyarea
sysfillrect sysimgblt fb_sys_fops ttm igb ahci crc32c_intel ptp libahci
pps_core drm dca libata i2c_algo_bit i2c_core
[ 4556.061190] CPU: 54 PID: 9841 Comm: ibacm Tainted: G          I
4.14.0-rc2+ xen-troops#6
[ 4556.069667] Hardware name: Intel Corporation S2600WT2/S2600WT2, BIOS
SE5C610.86B.01.01.0008.021120151325 02/11/2015
[ 4556.081339] task: ffff880855f42d00 task.stack: ffffc900246b4000
[ 4556.087967] RIP: 0010:          (null)
[ 4556.092166] RSP: 0018:ffffc900246b7bc8 EFLAGS: 00010246
[ 4556.098018] RAX: ffffffff81dbe9e0 RBX: ffff881058bb1000 RCX:
0000000000000000
[ 4556.105997] RDX: 0000000000001100 RSI: ffff881058bb1320 RDI:
ffff881056362000
[ 4556.113984] RBP: ffffc900246b7bf8 R08: 0000000000000ec0 R09:
0000000000001100
[ 4556.121971] R10: ffff8810573a5000 R11: 0000000000000000 R12:
ffff881056362000
[ 4556.129957] R13: 0000000000000ec0 R14: ffff881058bb1320 R15:
0000000000000ec0
[ 4556.137945] FS:  00007fe0ba5a38c0(0000) GS:ffff88105f080000(0000)
knlGS:0000000000000000
[ 4556.147000] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 4556.153433] CR2: 0000000000000000 CR3: 0000001056f5d003 CR4:
00000000001606e0
[ 4556.161419] Call Trace:
[ 4556.164167]  ? netlink_dump+0x12c/0x290
[ 4556.168468]  __netlink_dump_start+0x186/0x1f0
[ 4556.173357]  rdma_nl_rcv_msg+0x193/0x1b0 [ib_core]
[ 4556.178724]  rdma_nl_rcv+0xdc/0x130 [ib_core]
[ 4556.183604]  netlink_unicast+0x181/0x240
[ 4556.187998]  netlink_sendmsg+0x2c2/0x3b0
[ 4556.192392]  sock_sendmsg+0x38/0x50
[ 4556.196299]  SYSC_sendto+0x102/0x190
[ 4556.200308]  ? __audit_syscall_entry+0xaf/0x100
[ 4556.205387]  ? syscall_trace_enter+0x1d0/0x2b0
[ 4556.210366]  ? __audit_syscall_exit+0x209/0x290
[ 4556.215442]  SyS_sendto+0xe/0x10
[ 4556.219060]  do_syscall_64+0x67/0x1b0
[ 4556.223165]  entry_SYSCALL64_slow_path+0x25/0x25
[ 4556.228328] RIP: 0033:0x7fe0b9db2a63
[ 4556.232333] RSP: 002b:00007ffc55edc260 EFLAGS: 00000293 ORIG_RAX:
000000000000002c
[ 4556.240808] RAX: ffffffffffffffda RBX: 0000000000000010 RCX:
00007fe0b9db2a63
[ 4556.248796] RDX: 0000000000000010 RSI: 00007ffc55edc280 RDI:
000000000000000d
[ 4556.256782] RBP: 00007ffc55edc670 R08: 00007ffc55edc270 R09:
000000000000000c
[ 4556.265321] R10: 0000000000000000 R11: 0000000000000293 R12:
00007ffc55edc280
[ 4556.273846] R13: 000000000260b400 R14: 000000000000000d R15:
0000000000000001
[ 4556.282368] Code:  Bad RIP value.
[ 4556.286629] RIP:           (null) RSP: ffffc900246b7bc8
[ 4556.293013] CR2: 0000000000000000
[ 4556.297292] ---[ end trace 8d67abcfd10ec209 ]---
[ 4556.305465] Kernel panic - not syncing: Fatal exception
[ 4556.313786] Kernel Offset: disabled
[ 4556.321563] ---[ end Kernel panic - not syncing: Fatal exception
[ 4556.328960] ------------[ cut here ]------------

Special case RDMA_NL_LS response messages to call the appropriate
callback.

Additionally, make sure that the .dump() callback is not NULL
before calling it.

Fixes: 647c75a ("RDMA/netlink: Convert LS to doit callback")
Reviewed-by: Mike Marciniszyn <[email protected]>
Reviewed-by: Kaike Wan <[email protected]>
Reviewed-by: Alex Estrin <[email protected]>
Signed-off-by: Michael J. Ruhl <[email protected]>
Reviewed-by: Shiraz Saleem <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
andr2000 pushed a commit to andr2000/linux that referenced this issue Mar 1, 2018
It was reported by Sergey Senozhatsky that if THP (Transparent Huge
Page) and frontswap (via zswap) are both enabled, when memory goes low
so that swap is triggered, segfault and memory corruption will occur in
random user space applications as follow,

kernel: urxvt[338]: segfault at 20 ip 00007fc08889ae0d sp 00007ffc73a7fc40 error 6 in libc-2.26.so[7fc08881a000+1ae000]
 #0  0x00007fc08889ae0d _int_malloc (libc.so.6)
 #1  0x00007fc08889c2f3 malloc (libc.so.6)
 #2  0x0000560e6004bff7 _Z14rxvt_wcstoutf8PKwi (urxvt)
 #3  0x0000560e6005e75c n/a (urxvt)
 xen-troops#4  0x0000560e6007d9f1 _ZN16rxvt_perl_interp6invokeEP9rxvt_term9hook_typez (urxvt)
 xen-troops#5  0x0000560e6003d988 _ZN9rxvt_term9cmd_parseEv (urxvt)
 xen-troops#6  0x0000560e60042804 _ZN9rxvt_term6pty_cbERN2ev2ioEi (urxvt)
 xen-troops#7  0x0000560e6005c10f _Z17ev_invoke_pendingv (urxvt)
 xen-troops#8  0x0000560e6005cb55 ev_run (urxvt)
 xen-troops#9  0x0000560e6003b9b9 main (urxvt)
 xen-troops#10 0x00007fc08883af4a __libc_start_main (libc.so.6)
 xen-troops#11 0x0000560e6003f9da _start (urxvt)

After bisection, it was found the first bad commit is bd4c82c ("mm,
THP, swap: delay splitting THP after swapped out").

The root cause is as follows:

When the pages are written to swap device during swapping out in
swap_writepage(), zswap (fontswap) is tried to compress the pages to
improve performance.  But zswap (frontswap) will treat THP as a normal
page, so only the head page is saved.  After swapping in, tail pages
will not be restored to their original contents, causing memory
corruption in the applications.

This is fixed by refusing to save page in the frontswap store functions
if the page is a THP.  So that the THP will be swapped out to swap
device.

Another choice is to split THP if frontswap is enabled.  But it is found
that the frontswap enabling isn't flexible.  For example, if
CONFIG_ZSWAP=y (cannot be module), frontswap will be enabled even if
zswap itself isn't enabled.

Frontswap has multiple backends, to make it easy for one backend to
enable THP support, the THP checking is put in backend frontswap store
functions instead of the general interfaces.

Link: http://lkml.kernel.org/r/[email protected]
Fixes: bd4c82c ("mm, THP, swap: delay splitting THP after swapped out")
Signed-off-by: "Huang, Ying" <[email protected]>
Reported-by: Sergey Senozhatsky <[email protected]>
Tested-by: Sergey Senozhatsky <[email protected]>
Suggested-by: Minchan Kim <[email protected]>	[put THP checking in backend]
Cc: Konrad Rzeszutek Wilk <[email protected]>
Cc: Dan Streetman <[email protected]>
Cc: Seth Jennings <[email protected]>
Cc: Tetsuo Handa <[email protected]>
Cc: Shaohua Li <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Johannes Weiner <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: Shakeel Butt <[email protected]>
Cc: Boris Ostrovsky <[email protected]>
Cc: Juergen Gross <[email protected]>
Cc: <[email protected]>	[4.14]
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
andr2000 pushed a commit to andr2000/linux that referenced this issue Mar 13, 2018
Currently we can crash perf record when running in pipe mode, like:

  $ perf record ls | perf report
  # To display the perf.data header info, please use --header/--header-only options.
  #
  perf: Segmentation fault
  Error:
  The - file has no samples!

The callstack of the crash is:

    0x0000000000515242 in perf_event__synthesize_event_update_name
  3513            ev = event_update_event__new(len + 1, PERF_EVENT_UPDATE__NAME, evsel->id[0]);
  (gdb) bt
  #0  0x0000000000515242 in perf_event__synthesize_event_update_name
  #1  0x00000000005158a4 in perf_event__synthesize_extra_attr
  #2  0x0000000000443347 in record__synthesize
  #3  0x00000000004438e3 in __cmd_record
  xen-troops#4  0x000000000044514e in cmd_record
  xen-troops#5  0x00000000004cbc95 in run_builtin
  xen-troops#6  0x00000000004cbf02 in handle_internal_command
  xen-troops#7  0x00000000004cc054 in run_argv
  xen-troops#8  0x00000000004cc422 in main

The reason of the crash is that the evsel does not have ids array
allocated and the pipe's synthesize code tries to access it.

We don't force evsel ids allocation when we have single event, because
it's not needed. However we need it when we are in pipe mode even for
single event as a key for evsel update event.

Fixing this by forcing evsel ids allocation event for single event, when
we are in pipe mode.

Signed-off-by: Jiri Olsa <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
andr2000 pushed a commit to andr2000/linux that referenced this issue May 23, 2018
syzbot caught an infinite recursion in nsh_gso_segment().

Problem here is that we need to make sure the NSH header is of
reasonable length.

BUG: MAX_LOCK_DEPTH too low!
turning off the locking correctness validator.
depth: 48  max: 48!
48 locks held by syz-executor0/10189:
 #0:         (ptrval) (rcu_read_lock_bh){....}, at: __dev_queue_xmit+0x30f/0x34c0 net/core/dev.c:3517
 #1:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #1:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #2:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #2:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #3:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #3:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 xen-troops#4:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 xen-troops#4:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 xen-troops#5:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 xen-troops#5:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 xen-troops#6:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 xen-troops#6:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 xen-troops#7:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 xen-troops#7:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 xen-troops#8:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 xen-troops#8:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 xen-troops#9:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 xen-troops#9:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 xen-troops#10:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 xen-troops#10:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 xen-troops#11:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 xen-troops#11:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 xen-troops#12:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 xen-troops#12:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 xen-troops#13:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 xen-troops#13:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 xen-troops#14:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 xen-troops#14:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 xen-troops#15:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 xen-troops#15:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 xen-troops#16:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 xen-troops#16:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 xen-troops#17:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 xen-troops#17:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 xen-troops#18:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 xen-troops#18:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 xen-troops#19:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 xen-troops#19:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 xen-troops#20:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 xen-troops#20:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 xen-troops#21:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 xen-troops#21:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 xen-troops#22:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 xen-troops#22:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 xen-troops#23:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 xen-troops#23:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 xen-troops#24:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 xen-troops#24:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 xen-troops#25:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 xen-troops#25:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 xen-troops#26:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 xen-troops#26:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 xen-troops#27:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 xen-troops#27:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 xen-troops#28:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 xen-troops#28:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 xen-troops#29:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 xen-troops#29:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 xen-troops#30:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 xen-troops#30:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 xen-troops#31:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 xen-troops#31:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
dccp_close: ABORT with 65423 bytes unread
 xen-troops#32:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 xen-troops#32:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 xen-troops#33:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 xen-troops#33:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 xen-troops#34:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 xen-troops#34:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 xen-troops#35:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 xen-troops#35:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 xen-troops#36:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 xen-troops#36:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 xen-troops#37:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 xen-troops#37:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 xen-troops#38:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 xen-troops#38:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 xen-troops#39:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 xen-troops#39:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 xen-troops#40:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 xen-troops#40:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 xen-troops#41:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 xen-troops#41:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 xen-troops#42:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 xen-troops#42:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 xen-troops#43:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 xen-troops#43:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 xen-troops#44:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 xen-troops#44:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 xen-troops#45:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 xen-troops#45:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 xen-troops#46:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 xen-troops#46:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 xen-troops#47:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 xen-troops#47:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
INFO: lockdep is turned off.
CPU: 1 PID: 10189 Comm: syz-executor0 Not tainted 4.17.0-rc2+ xen-troops#26
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x1b9/0x294 lib/dump_stack.c:113
 __lock_acquire+0x1788/0x5140 kernel/locking/lockdep.c:3449
 lock_acquire+0x1dc/0x520 kernel/locking/lockdep.c:3920
 rcu_lock_acquire include/linux/rcupdate.h:246 [inline]
 rcu_read_lock include/linux/rcupdate.h:632 [inline]
 skb_mac_gso_segment+0x25b/0x720 net/core/dev.c:2789
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 __skb_gso_segment+0x3bb/0x870 net/core/dev.c:2865
 skb_gso_segment include/linux/netdevice.h:4025 [inline]
 validate_xmit_skb+0x54d/0xd90 net/core/dev.c:3118
 validate_xmit_skb_list+0xbf/0x120 net/core/dev.c:3168
 sch_direct_xmit+0x354/0x11e0 net/sched/sch_generic.c:312
 qdisc_restart net/sched/sch_generic.c:399 [inline]
 __qdisc_run+0x741/0x1af0 net/sched/sch_generic.c:410
 __dev_xmit_skb net/core/dev.c:3243 [inline]
 __dev_queue_xmit+0x28ea/0x34c0 net/core/dev.c:3551
 dev_queue_xmit+0x17/0x20 net/core/dev.c:3616
 packet_snd net/packet/af_packet.c:2951 [inline]
 packet_sendmsg+0x40f8/0x6070 net/packet/af_packet.c:2976
 sock_sendmsg_nosec net/socket.c:629 [inline]
 sock_sendmsg+0xd5/0x120 net/socket.c:639
 __sys_sendto+0x3d7/0x670 net/socket.c:1789
 __do_sys_sendto net/socket.c:1801 [inline]
 __se_sys_sendto net/socket.c:1797 [inline]
 __x64_sys_sendto+0xe1/0x1a0 net/socket.c:1797
 do_syscall_64+0x1b1/0x800 arch/x86/entry/common.c:287
 entry_SYSCALL_64_after_hwframe+0x49/0xbe

Fixes: c411ed8 ("nsh: add GSO support")
Signed-off-by: Eric Dumazet <[email protected]>
Cc: Jiri Benc <[email protected]>
Reported-by: syzbot <[email protected]>
Acked-by: Jiri Benc <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
andr2000 pushed a commit to andr2000/linux that referenced this issue May 23, 2018
When the module is removed the led workqueue is destroyed in the remove
callback, before the led device is unregistered from the led subsystem.

This leads to a NULL pointer derefence when the led device is
unregistered automatically later as part of the module removal cleanup.
Bellow is the backtrace showing the problem.

  BUG: unable to handle kernel NULL pointer dereference at           (null)
  IP: __queue_work+0x8c/0x410
  PGD 0 P4D 0
  Oops: 0000 [#1] SMP NOPTI
  Modules linked in: ccm edac_mce_amd kvm_amd kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel aes_x86_64 joydev crypto_simd asus_nb_wmi glue_helper uvcvideo snd_hda_codec_conexant snd_hda_codec_generic snd_hda_codec_hdmi snd_hda_intel asus_wmi snd_hda_codec cryptd snd_hda_core sparse_keymap videobuf2_vmalloc arc4 videobuf2_memops snd_hwdep input_leds videobuf2_v4l2 ath9k psmouse videobuf2_core videodev ath9k_common snd_pcm ath9k_hw media fam15h_power ath k10temp snd_timer mac80211 i2c_piix4 r8169 mii mac_hid cfg80211 asus_wireless(-) snd soundcore wmi shpchp 8250_dw ip_tables x_tables amdkfd amd_iommu_v2 amdgpu radeon chash i2c_algo_bit drm_kms_helper syscopyarea serio_raw sysfillrect sysimgblt fb_sys_fops ahci ttm libahci drm video
  CPU: 3 PID: 2177 Comm: rmmod Not tainted 4.15.0-5-generic xen-troops#6+dev94.b4287e5bem1-Endless
  Hardware name: ASUSTeK COMPUTER INC. X555DG/X555DG, BIOS 5.011 05/05/2015
  RIP: 0010:__queue_work+0x8c/0x410
  RSP: 0018:ffffbe8cc249fcd8 EFLAGS: 00010086
  RAX: ffff992ac6810800 RBX: 0000000000000000 RCX: 0000000000000008
  RDX: 0000000000000000 RSI: 0000000000000008 RDI: ffff992ac6400e18
  RBP: ffffbe8cc249fd18 R08: ffff992ac6400db0 R09: 0000000000000000
  R10: 0000000000000040 R11: ffff992ac6400dd8 R12: 0000000000002000
  R13: ffff992abd762e00 R14: ffff992abd763e38 R15: 000000000001ebe0
  FS:  00007f318203e700(0000) GS:ffff992aced80000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 0000000000000000 CR3: 00000001c720e000 CR4: 00000000001406e0
  Call Trace:
   queue_work_on+0x38/0x40
   led_state_set+0x2c/0x40 [asus_wireless]
   led_set_brightness_nopm+0x14/0x40
   led_set_brightness+0x37/0x60
   led_trigger_set+0xfc/0x1d0
   led_classdev_unregister+0x32/0xd0
   devm_led_classdev_release+0x11/0x20
   release_nodes+0x109/0x1f0
   devres_release_all+0x3c/0x50
   device_release_driver_internal+0x16d/0x220
   driver_detach+0x3f/0x80
   bus_remove_driver+0x55/0xd0
   driver_unregister+0x2c/0x40
   acpi_bus_unregister_driver+0x15/0x20
   asus_wireless_driver_exit+0x10/0xb7c [asus_wireless]
   SyS_delete_module+0x1da/0x2b0
   entry_SYSCALL_64_fastpath+0x24/0x87
  RIP: 0033:0x7f3181b65fd7
  RSP: 002b:00007ffe74bcbe18 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
  RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f3181b65fd7
  RDX: 000000000000000a RSI: 0000000000000800 RDI: 0000555ea2559258
  RBP: 0000555ea25591f0 R08: 00007ffe74bcad91 R09: 000000000000000a
  R10: 0000000000000000 R11: 0000000000000206 R12: 0000000000000003
  R13: 00007ffe74bcae00 R14: 0000000000000000 R15: 0000555ea25591f0
  Code: 01 00 00 02 0f 85 7d 01 00 00 48 63 45 d4 48 c7 c6 00 f4 fa 87 49 8b 9d 08 01 00 00 48 03 1c c6 4c 89 f7 e8 87 fb ff ff 48 85 c0 <48> 8b 3b 0f 84 c5 01 00 00 48 39 f8 0f 84 bc 01 00 00 48 89 c7
  RIP: __queue_work+0x8c/0x410 RSP: ffffbe8cc249fcd8
  CR2: 0000000000000000
  ---[ end trace 7aa4f4a232e9c39c ]---

Unregistering the led device on the remove callback before destroying the
workqueue avoids this problem.

https://bugzilla.kernel.org/show_bug.cgi?id=196097

Reported-by: Dun Hum <[email protected]>
Cc: [email protected]
Signed-off-by: João Paulo Rechi Vita <[email protected]>
Signed-off-by: Darren Hart (VMware) <[email protected]>
andr2000 pushed a commit to andr2000/linux that referenced this issue Nov 27, 2018
Increase kasan instrumented kernel stack size from 32k to 64k. Other
architectures seems to get away with just doubling kernel stack size under
kasan, but on s390 this appears to be not enough due to bigger frame size.
The particular pain point is kasan inlined checks (CONFIG_KASAN_INLINE
vs CONFIG_KASAN_OUTLINE). With inlined checks one particular case hitting
stack overflow is fs sync on xfs filesystem:

 #0 [9a0681e8]  704 bytes  check_usage at 34b1fc
 #1 [9a0684a8]  432 bytes  check_usage at 34c710
 #2 [9a068658]  1048 bytes  validate_chain at 35044a
 #3 [9a068a70]  312 bytes  __lock_acquire at 3559fe
 xen-troops#4 [9a068ba8]  440 bytes  lock_acquire at 3576ee
 xen-troops#5 [9a068d60]  104 bytes  _raw_spin_lock at 21b44e0
 xen-troops#6 [9a068dc8]  1992 bytes  enqueue_entity at 2dbf72
 xen-troops#7 [9a069590]  1496 bytes  enqueue_task_fair at 2df5f0
 xen-troops#8 [9a069b68]  64 bytes  ttwu_do_activate at 28f438
 xen-troops#9 [9a069ba8]  552 bytes  try_to_wake_up at 298c4c
 xen-troops#10 [9a069dd0]  168 bytes  wake_up_worker at 23f97c
 xen-troops#11 [9a069e78]  200 bytes  insert_work at 23fc2e
 xen-troops#12 [9a069f40]  648 bytes  __queue_work at 2487c0
 xen-troops#13 [9a06a1c8]  200 bytes  __queue_delayed_work at 24db28
 xen-troops#14 [9a06a290]  248 bytes  mod_delayed_work_on at 24de84
 xen-troops#15 [9a06a388]  24 bytes  kblockd_mod_delayed_work_on at 153e2a0
 xen-troops#16 [9a06a3a0]  288 bytes  __blk_mq_delay_run_hw_queue at 158168c
 xen-troops#17 [9a06a4c0]  192 bytes  blk_mq_run_hw_queue at 1581a3c
 xen-troops#18 [9a06a580]  184 bytes  blk_mq_sched_insert_requests at 15a2192
 xen-troops#19 [9a06a638]  1024 bytes  blk_mq_flush_plug_list at 1590f3a
 xen-troops#20 [9a06aa38]  704 bytes  blk_flush_plug_list at 1555028
 xen-troops#21 [9a06acf8]  320 bytes  schedule at 219e476
 xen-troops#22 [9a06ae38]  760 bytes  schedule_timeout at 21b0aac
 xen-troops#23 [9a06b130]  408 bytes  wait_for_common at 21a1706
 xen-troops#24 [9a06b2c8]  360 bytes  xfs_buf_iowait at fa1540
 xen-troops#25 [9a06b430]  256 bytes  __xfs_buf_submit at fadae6
 xen-troops#26 [9a06b530]  264 bytes  xfs_buf_read_map at fae3f6
 xen-troops#27 [9a06b638]  656 bytes  xfs_trans_read_buf_map at 10ac9a8
 xen-troops#28 [9a06b8c8]  304 bytes  xfs_btree_kill_root at e72426
 xen-troops#29 [9a06b9f8]  288 bytes  xfs_btree_lookup_get_block at e7bc5e
 xen-troops#30 [9a06bb18]  624 bytes  xfs_btree_lookup at e7e1a6
 xen-troops#31 [9a06bd88]  2664 bytes  xfs_alloc_ag_vextent_near at dfa070
 xen-troops#32 [9a06c7f0]  144 bytes  xfs_alloc_ag_vextent at dff3ca
 xen-troops#33 [9a06c880]  1128 bytes  xfs_alloc_vextent at e05fce
 xen-troops#34 [9a06cce8]  584 bytes  xfs_bmap_btalloc at e58342
 xen-troops#35 [9a06cf30]  1336 bytes  xfs_bmapi_write at e618de
 xen-troops#36 [9a06d468]  776 bytes  xfs_iomap_write_allocate at ff678e
 xen-troops#37 [9a06d770]  720 bytes  xfs_map_blocks at f82af8
 xen-troops#38 [9a06da40]  928 bytes  xfs_writepage_map at f83cd6
 xen-troops#39 [9a06dde0]  320 bytes  xfs_do_writepage at f85872
 xen-troops#40 [9a06df20]  1320 bytes  write_cache_pages at 73dfe8
 xen-troops#41 [9a06e448]  208 bytes  xfs_vm_writepages at f7f892
 xen-troops#42 [9a06e518]  88 bytes  do_writepages at 73fe6a
 xen-troops#43 [9a06e570]  872 bytes  __writeback_single_inode at a20cb6
 xen-troops#44 [9a06e8d8]  664 bytes  writeback_sb_inodes at a23be2
 xen-troops#45 [9a06eb70]  296 bytes  __writeback_inodes_wb at a242e0
 xen-troops#46 [9a06ec98]  928 bytes  wb_writeback at a2500e
 xen-troops#47 [9a06f038]  848 bytes  wb_do_writeback at a260ae
 xen-troops#48 [9a06f388]  536 bytes  wb_workfn at a28228
 xen-troops#49 [9a06f5a0]  1088 bytes  process_one_work at 24a234
 xen-troops#50 [9a06f9e0]  1120 bytes  worker_thread at 24ba26
 xen-troops#51 [9a06fe40]  104 bytes  kthread at 26545a
 xen-troops#52 [9a06fea8]             kernel_thread_starter at 21b6b62

To be able to increase the stack size to 64k reuse LLILL instruction
in __switch_to function to load 64k - STACK_FRAME_OVERHEAD - __PT_SIZE
(65192) value as unsigned.

Reported-by: Benjamin Block <[email protected]>
Reviewed-by: Heiko Carstens <[email protected]>
Signed-off-by: Vasily Gorbik <[email protected]>
Signed-off-by: Martin Schwidefsky <[email protected]>
andr2000 pushed a commit to andr2000/linux that referenced this issue Jan 29, 2019
Ido Schimmel says:

====================
mlxsw: Various fixes

This patchset contains small fixes in mlxsw and one fix in the bridge
driver.

Patches #1-xen-troops#4 perform small adjustments in PCI and FID code following
recent tests that were performed on the Spectrum-2 ASIC.

Patch xen-troops#5 fixes the bridge driver to mark FDB entries that were added by
user as such. Otherwise, these entries will be ignored by underlying
switch drivers.

Patch xen-troops#6 fixes a long standing issue in mlxsw where the driver
incorrectly programmed static FDB entries as both static and sticky.

Patches xen-troops#7-xen-troops#8 add test cases for above mentioned bugs.

Please consider patches #1, #2 and xen-troops#4 for stable.
====================

Signed-off-by: David S. Miller <[email protected]>
andr2000 pushed a commit to andr2000/linux that referenced this issue Mar 14, 2019
When option CONFIG_KASAN is enabled toghether with ftrace, function
ftrace_graph_caller() gets in to a recursion, via functions
kasan_check_read() and kasan_check_write().

 Breakpoint 2, ftrace_graph_caller () at ../arch/arm64/kernel/entry-ftrace.S:179
 179             mcount_get_pc             x0    //     function's pc
 (gdb) bt
 #0  ftrace_graph_caller () at ../arch/arm64/kernel/entry-ftrace.S:179
 #1  0xffffff90101406c8 in ftrace_caller () at ../arch/arm64/kernel/entry-ftrace.S:151
 #2  0xffffff90106fd084 in kasan_check_write (p=0xffffffc06c170878, size=4) at ../mm/kasan/common.c:105
 #3  0xffffff90104a2464 in atomic_add_return (v=<optimized out>, i=<optimized out>) at ./include/generated/atomic-instrumented.h:71
 xen-troops#4  atomic_inc_return (v=<optimized out>) at ./include/generated/atomic-fallback.h:284
 xen-troops#5  trace_graph_entry (trace=0xffffffc03f5ff380) at ../kernel/trace/trace_functions_graph.c:441
 xen-troops#6  0xffffff9010481774 in trace_graph_entry_watchdog (trace=<optimized out>) at ../kernel/trace/trace_selftest.c:741
 xen-troops#7  0xffffff90104a185c in function_graph_enter (ret=<optimized out>, func=<optimized out>, frame_pointer=18446743799894897728, retp=<optimized out>) at ../kernel/trace/trace_functions_graph.c:196
 xen-troops#8  0xffffff9010140628 in prepare_ftrace_return (self_addr=18446743592948977792, parent=0xffffffc03f5ff418, frame_pointer=18446743799894897728) at ../arch/arm64/kernel/ftrace.c:231
 xen-troops#9  0xffffff90101406f4 in ftrace_graph_caller () at ../arch/arm64/kernel/entry-ftrace.S:182
 Backtrace stopped: previous frame identical to this frame (corrupt stack?)
 (gdb)

Rework so that the kasan implementation isn't traced.

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Anders Roxell <[email protected]>
Acked-by: Dmitry Vyukov <[email protected]>
Tested-by: Dmitry Vyukov <[email protected]>
Acked-by: Steven Rostedt (VMware) <[email protected]>
Cc: Andrey Ryabinin <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
andr2000 pushed a commit to andr2000/linux that referenced this issue Apr 4, 2019
…_map

Detected via gcc's ASan:

  Direct leak of 2048 byte(s) in 64 object(s) allocated from:
    6     #0 0x7f606512e370 in __interceptor_realloc (/usr/lib/x86_64-linux-gnu/libasan.so.5+0xee370)
    7     #1 0x556b0f1d7ddd in thread_map__realloc util/thread_map.c:43
    8     #2 0x556b0f1d84c7 in thread_map__new_by_tid util/thread_map.c:85
    9     #3 0x556b0f0e045e in is_event_supported util/parse-events.c:2250
   10     xen-troops#4 0x556b0f0e1aa1 in print_hwcache_events util/parse-events.c:2382
   11     xen-troops#5 0x556b0f0e3231 in print_events util/parse-events.c:2514
   12     xen-troops#6 0x556b0ee0a66e in cmd_list /home/changbin/work/linux/tools/perf/builtin-list.c:58
   13     xen-troops#7 0x556b0f01e0ae in run_builtin /home/changbin/work/linux/tools/perf/perf.c:302
   14     xen-troops#8 0x556b0f01e859 in handle_internal_command /home/changbin/work/linux/tools/perf/perf.c:354
   15     xen-troops#9 0x556b0f01edc8 in run_argv /home/changbin/work/linux/tools/perf/perf.c:398
   16     xen-troops#10 0x556b0f01f71f in main /home/changbin/work/linux/tools/perf/perf.c:520
   17     xen-troops#11 0x7f6062ccf09a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2409a)

Signed-off-by: Changbin Du <[email protected]>
Reviewed-by: Jiri Olsa <[email protected]>
Cc: Alexei Starovoitov <[email protected]>
Cc: Daniel Borkmann <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Steven Rostedt (VMware) <[email protected]>
Fixes: 8989605 ("perf tools: Do not put a variable sized type not at the end of a struct")
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
andr2000 pushed a commit to andr2000/linux that referenced this issue Apr 4, 2019
Detected with gcc's ASan:

  Direct leak of 66 byte(s) in 5 object(s) allocated from:
      #0 0x7ff3b1f32070 in __interceptor_strdup (/usr/lib/x86_64-linux-gnu/libasan.so.5+0x3b070)
      #1 0x560c8761034d in collect_config util/config.c:597
      #2 0x560c8760d9cb in get_value util/config.c:169
      #3 0x560c8760dfd7 in perf_parse_file util/config.c:285
      xen-troops#4 0x560c8760e0d2 in perf_config_from_file util/config.c:476
      xen-troops#5 0x560c876108fd in perf_config_set__init util/config.c:661
      xen-troops#6 0x560c87610c72 in perf_config_set__new util/config.c:709
      xen-troops#7 0x560c87610d2f in perf_config__init util/config.c:718
      xen-troops#8 0x560c87610e5d in perf_config util/config.c:730
      xen-troops#9 0x560c875ddea0 in main /home/changbin/work/linux/tools/perf/perf.c:442
      xen-troops#10 0x7ff3afb8609a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2409a)

Signed-off-by: Changbin Du <[email protected]>
Reviewed-by: Jiri Olsa <[email protected]>
Cc: Alexei Starovoitov <[email protected]>
Cc: Daniel Borkmann <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Steven Rostedt (VMware) <[email protected]>
Cc: Taeung Song <[email protected]>
Fixes: 20105ca ("perf config: Introduce perf_config_set class")
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
andr2000 pushed a commit to andr2000/linux that referenced this issue Apr 4, 2019
Detected with gcc's ASan:

  Direct leak of 4356 byte(s) in 120 object(s) allocated from:
      #0 0x7ff1a2b5a070 in __interceptor_strdup (/usr/lib/x86_64-linux-gnu/libasan.so.5+0x3b070)
      #1 0x55719aef4814 in build_id_cache__origname util/build-id.c:215
      #2 0x55719af649b6 in print_sdt_events util/parse-events.c:2339
      #3 0x55719af66272 in print_events util/parse-events.c:2542
      xen-troops#4 0x55719ad1ecaa in cmd_list /home/changbin/work/linux/tools/perf/builtin-list.c:58
      xen-troops#5 0x55719aec745d in run_builtin /home/changbin/work/linux/tools/perf/perf.c:302
      xen-troops#6 0x55719aec7d1a in handle_internal_command /home/changbin/work/linux/tools/perf/perf.c:354
      xen-troops#7 0x55719aec8184 in run_argv /home/changbin/work/linux/tools/perf/perf.c:398
      xen-troops#8 0x55719aeca41a in main /home/changbin/work/linux/tools/perf/perf.c:520
      xen-troops#9 0x7ff1a07ae09a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2409a)

Signed-off-by: Changbin Du <[email protected]>
Reviewed-by: Jiri Olsa <[email protected]>
Cc: Alexei Starovoitov <[email protected]>
Cc: Daniel Borkmann <[email protected]>
Cc: Masami Hiramatsu <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Steven Rostedt (VMware) <[email protected]>
Fixes: 40218da ("perf list: Show SDT and pre-cached events")
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
andr2000 pushed a commit to andr2000/linux that referenced this issue Apr 4, 2019
…r-free issue

The evlist should be destroyed before the perf session.

Detected with gcc's ASan:

  =================================================================
  ==27350==ERROR: AddressSanitizer: heap-use-after-free on address 0x62b000002e38 at pc 0x5611da276999 bp 0x7ffce8f1d1a0 sp 0x7ffce8f1d190
  WRITE of size 8 at 0x62b000002e38 thread T0
      #0 0x5611da276998 in __list_del /home/work/linux/tools/include/linux/list.h:89
      #1 0x5611da276d4a in __list_del_entry /home/work/linux/tools/include/linux/list.h:102
      #2 0x5611da276e77 in list_del_init /home/work/linux/tools/include/linux/list.h:145
      #3 0x5611da2781cd in thread__put util/thread.c:130
      xen-troops#4 0x5611da2cc0a8 in __thread__zput util/thread.h:68
      xen-troops#5 0x5611da2d2dcb in hist_entry__delete util/hist.c:1148
      xen-troops#6 0x5611da2cdf91 in hists__delete_entry util/hist.c:337
      xen-troops#7 0x5611da2ce19e in hists__delete_entries util/hist.c:365
      xen-troops#8 0x5611da2db2ab in hists__delete_all_entries util/hist.c:2639
      xen-troops#9 0x5611da2db325 in hists_evsel__exit util/hist.c:2651
      xen-troops#10 0x5611da1c5352 in perf_evsel__exit util/evsel.c:1304
      xen-troops#11 0x5611da1c5390 in perf_evsel__delete util/evsel.c:1309
      xen-troops#12 0x5611da1b35f0 in perf_evlist__purge util/evlist.c:124
      xen-troops#13 0x5611da1b38e2 in perf_evlist__delete util/evlist.c:148
      xen-troops#14 0x5611da069781 in cmd_top /home/changbin/work/linux/tools/perf/builtin-top.c:1645
      xen-troops#15 0x5611da17d038 in run_builtin /home/changbin/work/linux/tools/perf/perf.c:302
      xen-troops#16 0x5611da17d577 in handle_internal_command /home/changbin/work/linux/tools/perf/perf.c:354
      xen-troops#17 0x5611da17d97b in run_argv /home/changbin/work/linux/tools/perf/perf.c:398
      xen-troops#18 0x5611da17e0e9 in main /home/changbin/work/linux/tools/perf/perf.c:520
      xen-troops#19 0x7fdcc970f09a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2409a)
      xen-troops#20 0x5611d9ff35c9 in _start (/home/work/linux/tools/perf/perf+0x3e95c9)

  0x62b000002e38 is located 11320 bytes inside of 27448-byte region [0x62b000000200,0x62b000006d38)
  freed by thread T0 here:
      #0 0x7fdccb04ab70 in free (/usr/lib/x86_64-linux-gnu/libasan.so.5+0xedb70)
      #1 0x5611da260df4 in perf_session__delete util/session.c:201
      #2 0x5611da063de5 in __cmd_top /home/changbin/work/linux/tools/perf/builtin-top.c:1300
      #3 0x5611da06973c in cmd_top /home/changbin/work/linux/tools/perf/builtin-top.c:1642
      xen-troops#4 0x5611da17d038 in run_builtin /home/changbin/work/linux/tools/perf/perf.c:302
      xen-troops#5 0x5611da17d577 in handle_internal_command /home/changbin/work/linux/tools/perf/perf.c:354
      xen-troops#6 0x5611da17d97b in run_argv /home/changbin/work/linux/tools/perf/perf.c:398
      xen-troops#7 0x5611da17e0e9 in main /home/changbin/work/linux/tools/perf/perf.c:520
      xen-troops#8 0x7fdcc970f09a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2409a)

  previously allocated by thread T0 here:
      #0 0x7fdccb04b138 in calloc (/usr/lib/x86_64-linux-gnu/libasan.so.5+0xee138)
      #1 0x5611da26010c in zalloc util/util.h:23
      #2 0x5611da260824 in perf_session__new util/session.c:118
      #3 0x5611da0633a6 in __cmd_top /home/changbin/work/linux/tools/perf/builtin-top.c:1192
      xen-troops#4 0x5611da06973c in cmd_top /home/changbin/work/linux/tools/perf/builtin-top.c:1642
      xen-troops#5 0x5611da17d038 in run_builtin /home/changbin/work/linux/tools/perf/perf.c:302
      xen-troops#6 0x5611da17d577 in handle_internal_command /home/changbin/work/linux/tools/perf/perf.c:354
      xen-troops#7 0x5611da17d97b in run_argv /home/changbin/work/linux/tools/perf/perf.c:398
      xen-troops#8 0x5611da17e0e9 in main /home/changbin/work/linux/tools/perf/perf.c:520
      xen-troops#9 0x7fdcc970f09a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2409a)

  SUMMARY: AddressSanitizer: heap-use-after-free /home/work/linux/tools/include/linux/list.h:89 in __list_del
  Shadow bytes around the buggy address:
    0x0c567fff8570: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
    0x0c567fff8580: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
    0x0c567fff8590: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
    0x0c567fff85a0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
    0x0c567fff85b0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  =>0x0c567fff85c0: fd fd fd fd fd fd fd[fd]fd fd fd fd fd fd fd fd
    0x0c567fff85d0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
    0x0c567fff85e0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
    0x0c567fff85f0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
    0x0c567fff8600: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
    0x0c567fff8610: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  Shadow byte legend (one shadow byte represents 8 application bytes):
    Addressable:           00
    Partially addressable: 01 02 03 04 05 06 07
    Heap left redzone:       fa
    Freed heap region:       fd
    Stack left redzone:      f1
    Stack mid redzone:       f2
    Stack right redzone:     f3
    Stack after return:      f5
    Stack use after scope:   f8
    Global redzone:          f9
    Global init order:       f6
    Poisoned by user:        f7
    Container overflow:      fc
    Array cookie:            ac
    Intra object redzone:    bb
    ASan internal:           fe
    Left alloca redzone:     ca
    Right alloca redzone:    cb
  ==27350==ABORTING

Signed-off-by: Changbin Du <[email protected]>
Reviewed-by: Jiri Olsa <[email protected]>
Cc: Alexei Starovoitov <[email protected]>
Cc: Daniel Borkmann <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Steven Rostedt (VMware) <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
andr2000 pushed a commit to andr2000/linux that referenced this issue Apr 4, 2019
The array str[] should have six elements.

  =================================================================
  ==4322==ERROR: AddressSanitizer: global-buffer-overflow on address 0x56463844e300 at pc 0x564637e7ad0d bp 0x7f30c8c89d10 sp 0x7f30c8c89d00
  READ of size 8 at 0x56463844e300 thread T9
      #0 0x564637e7ad0c in __ordered_events__flush util/ordered-events.c:316
      #1 0x564637e7b0e4 in ordered_events__flush util/ordered-events.c:338
      #2 0x564637c6a57d in process_thread /home/changbin/work/linux/tools/perf/builtin-top.c:1073
      #3 0x7f30d173a163 in start_thread (/lib/x86_64-linux-gnu/libpthread.so.0+0x8163)
      xen-troops#4 0x7f30cfffbdee in __clone (/lib/x86_64-linux-gnu/libc.so.6+0x11adee)

  0x56463844e300 is located 32 bytes to the left of global variable 'flags' defined in 'util/trace-event-parse.c:229:26' (0x56463844e320) of size 192
  0x56463844e300 is located 0 bytes to the right of global variable 'str' defined in 'util/ordered-events.c:268:28' (0x56463844e2e0) of size 32
  SUMMARY: AddressSanitizer: global-buffer-overflow util/ordered-events.c:316 in __ordered_events__flush
  Shadow bytes around the buggy address:
    0x0ac947081c10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    0x0ac947081c20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    0x0ac947081c30: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    0x0ac947081c40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    0x0ac947081c50: 00 00 00 00 00 00 00 00 f9 f9 f9 f9 00 00 00 00
  =>0x0ac947081c60:[f9]f9 f9 f9 00 00 00 00 00 00 00 00 00 00 00 00
    0x0ac947081c70: 00 00 00 00 00 00 00 00 00 00 00 00 f9 f9 f9 f9
    0x0ac947081c80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    0x0ac947081c90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    0x0ac947081ca0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    0x0ac947081cb0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  Shadow byte legend (one shadow byte represents 8 application bytes):
    Addressable:           00
    Partially addressable: 01 02 03 04 05 06 07
    Heap left redzone:       fa
    Freed heap region:       fd
    Stack left redzone:      f1
    Stack mid redzone:       f2
    Stack right redzone:     f3
    Stack after return:      f5
    Stack use after scope:   f8
    Global redzone:          f9
    Global init order:       f6
    Poisoned by user:        f7
    Container overflow:      fc
    Array cookie:            ac
    Intra object redzone:    bb
    ASan internal:           fe
    Left alloca redzone:     ca
    Right alloca redzone:    cb
  Thread T9 created by T0 here:
      #0 0x7f30d179de5f in __interceptor_pthread_create (/usr/lib/x86_64-linux-gnu/libasan.so.5+0x4ae5f)
      #1 0x564637c6b954 in __cmd_top /home/changbin/work/linux/tools/perf/builtin-top.c:1253
      #2 0x564637c7173c in cmd_top /home/changbin/work/linux/tools/perf/builtin-top.c:1642
      #3 0x564637d85038 in run_builtin /home/changbin/work/linux/tools/perf/perf.c:302
      xen-troops#4 0x564637d85577 in handle_internal_command /home/changbin/work/linux/tools/perf/perf.c:354
      xen-troops#5 0x564637d8597b in run_argv /home/changbin/work/linux/tools/perf/perf.c:398
      xen-troops#6 0x564637d860e9 in main /home/changbin/work/linux/tools/perf/perf.c:520
      xen-troops#7 0x7f30cff0509a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2409a)

Signed-off-by: Changbin Du <[email protected]>
Reviewed-by: Jiri Olsa <[email protected]>
Cc: Alexei Starovoitov <[email protected]>
Cc: Daniel Borkmann <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Steven Rostedt (VMware) <[email protected]>
Cc: Jiri Olsa <[email protected]>
Fixes: 16c66bc ("perf top: Add processing thread")
Fixes: 68ca5d0 ("perf ordered_events: Add ordered_events__flush_time interface")
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
andr2000 pushed a commit to andr2000/linux that referenced this issue Apr 4, 2019
Using gcc's ASan, Changbin reports:

  =================================================================
  ==7494==ERROR: LeakSanitizer: detected memory leaks

  Direct leak of 48 byte(s) in 1 object(s) allocated from:
      #0 0x7f0333a89138 in calloc (/usr/lib/x86_64-linux-gnu/libasan.so.5+0xee138)
      #1 0x5625e5330a5e in zalloc util/util.h:23
      #2 0x5625e5330a9b in perf_counts__new util/counts.c:10
      #3 0x5625e5330ca0 in perf_evsel__alloc_counts util/counts.c:47
      xen-troops#4 0x5625e520d8e5 in __perf_evsel__read_on_cpu util/evsel.c:1505
      xen-troops#5 0x5625e517a985 in perf_evsel__read_on_cpu /home/work/linux/tools/perf/util/evsel.h:347
      xen-troops#6 0x5625e517ad1a in test__openat_syscall_event tests/openat-syscall.c:47
      xen-troops#7 0x5625e51528e6 in run_test tests/builtin-test.c:358
      xen-troops#8 0x5625e5152baf in test_and_print tests/builtin-test.c:388
      xen-troops#9 0x5625e51543fe in __cmd_test tests/builtin-test.c:583
      xen-troops#10 0x5625e515572f in cmd_test tests/builtin-test.c:722
      xen-troops#11 0x5625e51c3fb8 in run_builtin /home/changbin/work/linux/tools/perf/perf.c:302
      xen-troops#12 0x5625e51c44f7 in handle_internal_command /home/changbin/work/linux/tools/perf/perf.c:354
      xen-troops#13 0x5625e51c48fb in run_argv /home/changbin/work/linux/tools/perf/perf.c:398
      xen-troops#14 0x5625e51c5069 in main /home/changbin/work/linux/tools/perf/perf.c:520
      xen-troops#15 0x7f033214d09a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2409a)

  Indirect leak of 72 byte(s) in 1 object(s) allocated from:
      #0 0x7f0333a89138 in calloc (/usr/lib/x86_64-linux-gnu/libasan.so.5+0xee138)
      #1 0x5625e532560d in zalloc util/util.h:23
      #2 0x5625e532566b in xyarray__new util/xyarray.c:10
      #3 0x5625e5330aba in perf_counts__new util/counts.c:15
      xen-troops#4 0x5625e5330ca0 in perf_evsel__alloc_counts util/counts.c:47
      xen-troops#5 0x5625e520d8e5 in __perf_evsel__read_on_cpu util/evsel.c:1505
      xen-troops#6 0x5625e517a985 in perf_evsel__read_on_cpu /home/work/linux/tools/perf/util/evsel.h:347
      xen-troops#7 0x5625e517ad1a in test__openat_syscall_event tests/openat-syscall.c:47
      xen-troops#8 0x5625e51528e6 in run_test tests/builtin-test.c:358
      xen-troops#9 0x5625e5152baf in test_and_print tests/builtin-test.c:388
      xen-troops#10 0x5625e51543fe in __cmd_test tests/builtin-test.c:583
      xen-troops#11 0x5625e515572f in cmd_test tests/builtin-test.c:722
      xen-troops#12 0x5625e51c3fb8 in run_builtin /home/changbin/work/linux/tools/perf/perf.c:302
      xen-troops#13 0x5625e51c44f7 in handle_internal_command /home/changbin/work/linux/tools/perf/perf.c:354
      xen-troops#14 0x5625e51c48fb in run_argv /home/changbin/work/linux/tools/perf/perf.c:398
      xen-troops#15 0x5625e51c5069 in main /home/changbin/work/linux/tools/perf/perf.c:520
      xen-troops#16 0x7f033214d09a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2409a)

His patch took care of evsel->prev_raw_counts, but the above backtraces
are about evsel->counts, so fix that instead.

Reported-by: Changbin Du <[email protected]>
Cc: Alexei Starovoitov <[email protected]>
Cc: Daniel Borkmann <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Steven Rostedt (VMware) <[email protected]>
Link: https://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
andr2000 pushed a commit to andr2000/linux that referenced this issue Apr 4, 2019
…_event_on_all_cpus test

  =================================================================
  ==7497==ERROR: LeakSanitizer: detected memory leaks

  Direct leak of 40 byte(s) in 1 object(s) allocated from:
      #0 0x7f0333a88f30 in __interceptor_malloc (/usr/lib/x86_64-linux-gnu/libasan.so.5+0xedf30)
      #1 0x5625e5326213 in cpu_map__trim_new util/cpumap.c:45
      #2 0x5625e5326703 in cpu_map__read util/cpumap.c:103
      #3 0x5625e53267ef in cpu_map__read_all_cpu_map util/cpumap.c:120
      xen-troops#4 0x5625e5326915 in cpu_map__new util/cpumap.c:135
      xen-troops#5 0x5625e517b355 in test__openat_syscall_event_on_all_cpus tests/openat-syscall-all-cpus.c:36
      xen-troops#6 0x5625e51528e6 in run_test tests/builtin-test.c:358
      xen-troops#7 0x5625e5152baf in test_and_print tests/builtin-test.c:388
      xen-troops#8 0x5625e51543fe in __cmd_test tests/builtin-test.c:583
      xen-troops#9 0x5625e515572f in cmd_test tests/builtin-test.c:722
      xen-troops#10 0x5625e51c3fb8 in run_builtin /home/changbin/work/linux/tools/perf/perf.c:302
      xen-troops#11 0x5625e51c44f7 in handle_internal_command /home/changbin/work/linux/tools/perf/perf.c:354
      xen-troops#12 0x5625e51c48fb in run_argv /home/changbin/work/linux/tools/perf/perf.c:398
      xen-troops#13 0x5625e51c5069 in main /home/changbin/work/linux/tools/perf/perf.c:520
      xen-troops#14 0x7f033214d09a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2409a)

Signed-off-by: Changbin Du <[email protected]>
Reviewed-by: Jiri Olsa <[email protected]>
Cc: Alexei Starovoitov <[email protected]>
Cc: Daniel Borkmann <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Steven Rostedt (VMware) <[email protected]>
Fixes: f30a79b ("perf tools: Add reference counting for cpu_map object")
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
andr2000 pushed a commit to andr2000/linux that referenced this issue Apr 4, 2019
  =================================================================
  ==7506==ERROR: LeakSanitizer: detected memory leaks

  Direct leak of 13 byte(s) in 3 object(s) allocated from:
      #0 0x7f03339d6070 in __interceptor_strdup (/usr/lib/x86_64-linux-gnu/libasan.so.5+0x3b070)
      #1 0x5625e53aaef0 in expr__find_other util/expr.y:221
      #2 0x5625e51bcd3f in test__expr tests/expr.c:52
      #3 0x5625e51528e6 in run_test tests/builtin-test.c:358
      xen-troops#4 0x5625e5152baf in test_and_print tests/builtin-test.c:388
      xen-troops#5 0x5625e51543fe in __cmd_test tests/builtin-test.c:583
      xen-troops#6 0x5625e515572f in cmd_test tests/builtin-test.c:722
      xen-troops#7 0x5625e51c3fb8 in run_builtin /home/changbin/work/linux/tools/perf/perf.c:302
      xen-troops#8 0x5625e51c44f7 in handle_internal_command /home/changbin/work/linux/tools/perf/perf.c:354
      xen-troops#9 0x5625e51c48fb in run_argv /home/changbin/work/linux/tools/perf/perf.c:398
      xen-troops#10 0x5625e51c5069 in main /home/changbin/work/linux/tools/perf/perf.c:520
      xen-troops#11 0x7f033214d09a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2409a)

Signed-off-by: Changbin Du <[email protected]>
Cc: Alexei Starovoitov <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Daniel Borkmann <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Steven Rostedt (VMware) <[email protected]>
Fixes: 0751673 ("perf tools: Add a simple expression parser for JSON")
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
andr2000 pushed a commit to andr2000/linux that referenced this issue Apr 4, 2019
  =================================================================
  ==20875==ERROR: LeakSanitizer: detected memory leaks

  Direct leak of 1160 byte(s) in 1 object(s) allocated from:
      #0 0x7f1b6fc84138 in calloc (/usr/lib/x86_64-linux-gnu/libasan.so.5+0xee138)
      #1 0x55bd50005599 in zalloc util/util.h:23
      #2 0x55bd500068f5 in perf_evsel__newtp_idx util/evsel.c:327
      #3 0x55bd4ff810fc in perf_evsel__newtp /home/work/linux/tools/perf/util/evsel.h:216
      xen-troops#4 0x55bd4ff81608 in test__perf_evsel__tp_sched_test tests/evsel-tp-sched.c:69
      xen-troops#5 0x55bd4ff528e6 in run_test tests/builtin-test.c:358
      xen-troops#6 0x55bd4ff52baf in test_and_print tests/builtin-test.c:388
      xen-troops#7 0x55bd4ff543fe in __cmd_test tests/builtin-test.c:583
      xen-troops#8 0x55bd4ff5572f in cmd_test tests/builtin-test.c:722
      xen-troops#9 0x55bd4ffc4087 in run_builtin /home/changbin/work/linux/tools/perf/perf.c:302
      xen-troops#10 0x55bd4ffc45c6 in handle_internal_command /home/changbin/work/linux/tools/perf/perf.c:354
      xen-troops#11 0x55bd4ffc49ca in run_argv /home/changbin/work/linux/tools/perf/perf.c:398
      xen-troops#12 0x55bd4ffc5138 in main /home/changbin/work/linux/tools/perf/perf.c:520
      xen-troops#13 0x7f1b6e34809a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2409a)

  Indirect leak of 19 byte(s) in 1 object(s) allocated from:
      #0 0x7f1b6fc83f30 in __interceptor_malloc (/usr/lib/x86_64-linux-gnu/libasan.so.5+0xedf30)
      #1 0x7f1b6e3ac30f in vasprintf (/lib/x86_64-linux-gnu/libc.so.6+0x8830f)

Signed-off-by: Changbin Du <[email protected]>
Reviewed-by: Jiri Olsa <[email protected]>
Cc: Alexei Starovoitov <[email protected]>
Cc: Daniel Borkmann <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Steven Rostedt (VMware) <[email protected]>
Fixes: 6a6cd11 ("perf test: Add test for the sched tracepoint format fields")
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
andr2000 pushed a commit to andr2000/linux that referenced this issue Apr 4, 2019
gpio-aspeed implements support for PIN_CONFIG_INPUT_DEBOUNCE. As of
v5.1-rc1 we're seeing the following when booting a Romulus BMC kernel:

> [   21.373137] ------------[ cut here ]------------
> [   21.374545] WARNING: CPU: 0 PID: 1 at drivers/gpio/gpio-aspeed.c:834 unregister_allocated_timer+0x38/0x94
> [   21.376181] No timer allocated to offset 74
> [   21.377672] CPU: 0 PID: 1 Comm: swapper Not tainted 5.1.0-rc1-dirty xen-troops#6
> [   21.378800] Hardware name: Generic DT based system
> [   21.379965] Backtrace:
> [   21.381024] [<80107d44>] (dump_backtrace) from [<80107f78>] (show_stack+0x20/0x24)
> [   21.382713]  r7:8038b720 r6:00000009 r5:00000000 r4:87897c64
> [   21.383815] [<80107f58>] (show_stack) from [<80656398>] (dump_stack+0x20/0x28)
> [   21.385042] [<80656378>] (dump_stack) from [<80115f1c>] (__warn.part.3+0xb4/0xdc)
> [   21.386253] [<80115e68>] (__warn.part.3) from [<80115fb0>] (warn_slowpath_fmt+0x6c/0x90)
> [   21.387471]  r6:00000342 r5:807f8758 r4:80a07008
> [   21.388278] [<80115f48>] (warn_slowpath_fmt) from [<8038b720>] (unregister_allocated_timer+0x38/0x94)
> [   21.389809]  r3:0000004a r2:807f8774
> [   21.390526]  r7:00000000 r6:0000000a r5:60000153 r4:0000004a
> [   21.391601] [<8038b6e8>] (unregister_allocated_timer) from [<8038baac>] (aspeed_gpio_set_config+0x330/0x48c)
> [   21.393248] [<8038b77c>] (aspeed_gpio_set_config) from [<803840b0>] (gpiod_set_debounce+0xe8/0x114)
> [   21.394745]  r10:82ee2248 r9:00000000 r8:87b63a00 r7:00001388 r6:87947320 r5:80729310
> [   21.396030]  r4:879f64a0
> [   21.396499] [<80383fc8>] (gpiod_set_debounce) from [<804b4350>] (gpio_keys_probe+0x69c/0x8e0)
> [   21.397715]  r7:845d94b8 r6:00000001 r5:00000000 r4:87b63a1c
> [   21.398618] [<804b3cb4>] (gpio_keys_probe) from [<8040eeec>] (platform_dev_probe+0x44/0x80)
> [   21.399834]  r10:00000003 r9:80a3a8b0 r8:00000000 r7:00000000 r6:80a7f9dc r5:80a3a8b0
> [   21.401163]  r4:8796bc10
> [   21.401634] [<8040eea8>] (platform_drv_probe) from [<8040d0d4>] (really_probe+0x208/0x3dc)
> [   21.402786]  r5:80a7f8d0 r4:8796bc10
> [   21.403547] [<8040cecc>] (really_probe) from [<8040d7a4>] (driver_probe_device+0x130/0x170)
> [   21.404744]  r10:0000007b r9:8093683c r8:00000000 r7:80a07008 r6:80a3a8b0 r5:8796bc10
> [   21.405854]  r4:80a3a8b0
> [   21.406324] [<8040d674>] (driver_probe_device) from [<8040da8c>] (device_driver_attach+0x68/0x70)
> [   21.407568]  r9:8093683c r8:00000000 r7:80a07008 r6:80a3a8b0 r5:00000000 r4:8796bc10
> [   21.408877] [<8040da24>] (device_driver_attach) from [<8040db14>] (__driver_attach+0x80/0x150)
> [   21.410327]  r7:80a07008 r6:8796bc10 r5:00000001 r4:80a3a8b0
> [   21.411294] [<8040da94>] (__driver_attach) from [<8040b20c>] (bus_for_each_dev+0x80/0xc4)
> [   21.412641]  r7:80a07008 r6:8040da94 r5:80a3a8b0 r4:87966f30
> [   21.413580] [<8040b18c>] (bus_for_each_dev) from [<8040dc0c>] (driver_attach+0x28/0x30)
> [   21.414943]  r7:00000000 r6:87b411e0 r5:80a33fc8 r4:80a3a8b0
> [   21.415927] [<8040dbe4>] (driver_attach) from [<8040bbf0>] (bus_add_driver+0x14c/0x200)
> [   21.417289] [<8040baa4>] (bus_add_driver) from [<8040e2b4>] (driver_register+0x84/0x118)
> [   21.418652]  r7:80a60ae0 r6:809226b8 r5:80a07008 r4:80a3a8b0
> [   21.419652] [<8040e230>] (driver_register) from [<8040fc28>] (__platform_driver_register+0x3c/0x50)
> [   21.421193]  r5:80a07008 r4:809525f8
> [   21.421990] [<8040fbec>] (__platform_driver_register) from [<809226d8>] (gpio_keys_init+0x20/0x28)
> [   21.423447] [<809226b8>] (gpio_keys_init) from [<8090128c>] (do_one_initcall+0x80/0x180)
> [   21.424886] [<8090120c>] (do_one_initcall) from [<80901538>] (kernel_init_freeable+0x1ac/0x26c)
> [   21.426354]  r8:80a60ae0 r7:80a60ae0 r6:8093685c r5:00000008 r4:809525f8
> [   21.427579] [<8090138c>] (kernel_init_freeable) from [<8066d9a0>] (kernel_init+0x18/0x11c)
> [   21.428819]  r10:00000000 r9:00000000 r8:00000000 r7:00000000 r6:00000000 r5:8066d988
> [   21.429947]  r4:00000000
> [   21.430415] [<8066d988>] (kernel_init) from [<801010e8>] (ret_from_fork+0x14/0x2c)
> [   21.431666] Exception stack(0x87897fb0 to 0x87897ff8)
> [   21.432877] 7fa0:                                     00000000 00000000 00000000 00000000
> [   21.434446] 7fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
> [   21.436052] 7fe0: 00000000 00000000 00000000 00000000 00000013 00000000
> [   21.437308]  r5:8066d988 r4:00000000
> [   21.438102] ---[ end trace d7d7ac3a80567d0e ]---

We only hit unregister_allocated_timer() if the argument to
aspeed_gpio_set_config() is 0, but we can't be calling through
gpiod_set_debounce() from gpio_keys_probe() unless the gpio-keys button has a
non-zero debounce interval.

Commit 6581eaf ("gpio: use new gpio_set_config() helper in more places")
spreads the use of gpio_set_config() to the debounce and transitory
state configuration paths. The implementation of gpio_set_config() is:

> static int gpio_set_config(struct gpio_chip *gc, unsigned offset,
> 			   enum pin_config_param mode)
> {
> 	unsigned long config = { PIN_CONF_PACKED(mode, 0) };
>
> 	return gc->set_config ? gc->set_config(gc, offset, config) : -ENOTSUPP;
> }

Here it packs its own config value with a fixed argument of 0; this is
incorrect behaviour for implementing the debounce and transitory functions, and
the debounce and transitory gpio_set_config() call-sites now have an undetected
type mismatch as they both already pack their own config parameter (i.e. what
gets passed is not an `enum pin_config_param`). Indeed this can be seen in the
small diff for 6581eaf:

> diff --git a/drivers/gpio/gpiolib.c b/drivers/gpio/gpiolib.c
> index de595fa31a1a..1f239aac43df 100644
> --- a/drivers/gpio/gpiolib.c
> +++ b/drivers/gpio/gpiolib.c
> @@ -2725,7 +2725,7 @@ int gpiod_set_debounce(struct gpio_desc *desc, unsigned debounce)
>         }
>
>         config = pinconf_to_config_packed(PIN_CONFIG_INPUT_DEBOUNCE, debounce);
> -       return chip->set_config(chip, gpio_chip_hwgpio(desc), config);
> +       return gpio_set_config(chip, gpio_chip_hwgpio(desc), config);
>  }
>  EXPORT_SYMBOL_GPL(gpiod_set_debounce);
>
> @@ -2762,7 +2762,7 @@ int gpiod_set_transitory(struct gpio_desc *desc, bool transitory)
>         packed = pinconf_to_config_packed(PIN_CONFIG_PERSIST_STATE,
>                                           !transitory);
>         gpio = gpio_chip_hwgpio(desc);
> -       rc = chip->set_config(chip, gpio, packed);
> +       rc = gpio_set_config(chip, gpio, packed);
>         if (rc == -ENOTSUPP) {
>                 dev_dbg(&desc->gdev->dev, "Persistence not supported for GPIO %d\n",
>                                 gpio);

Revert commit 6581eaf ("gpio: use new gpio_set_config() helper in
more places") to restore correct behaviour for gpiod_set_debounce() and
gpiod_set_transitory().

Cc: Thomas Petazzoni <[email protected]>
Signed-off-by: Andrew Jeffery <[email protected]>
Signed-off-by: Bartosz Golaszewski <[email protected]>
andr2000 pushed a commit to andr2000/linux that referenced this issue Aug 13, 2020
Ido Schimmel says:

====================
mlxsw fixes

This patch set contains various fixes for mlxsw.

Patches #1-#2 fix two trap related issues introduced in previous cycle.

Patches #3-xen-troops#5 fix rare use-after-frees discovered by syzkaller. After
over a week of fuzzing with the fixes, the bugs did not reproduce.

Patch xen-troops#6 from Amit fixes an issue in the ethtool selftest that was
recently discovered after running the test on a new platform that
supports only 1Gbps and 10Gbps speeds.
====================

Signed-off-by: David S. Miller <[email protected]>
andr2000 pushed a commit to andr2000/linux that referenced this issue Aug 13, 2020
I compiled with AddressSanitizer and I had these memory leaks while I
was using the tep_parse_format function:

    Direct leak of 28 byte(s) in 4 object(s) allocated from:
        #0 0x7fb07db49ffe in __interceptor_realloc (/lib/x86_64-linux-gnu/libasan.so.5+0x10dffe)
        #1 0x7fb07a724228 in extend_token /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:985
        #2 0x7fb07a724c21 in __read_token /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:1140
        #3 0x7fb07a724f78 in read_token /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:1206
        xen-troops#4 0x7fb07a725191 in __read_expect_type /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:1291
        xen-troops#5 0x7fb07a7251df in read_expect_type /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:1299
        xen-troops#6 0x7fb07a72e6c8 in process_dynamic_array_len /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:2849
        xen-troops#7 0x7fb07a7304b8 in process_function /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:3161
        xen-troops#8 0x7fb07a730900 in process_arg_token /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:3207
        xen-troops#9 0x7fb07a727c0b in process_arg /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:1786
        xen-troops#10 0x7fb07a731080 in event_read_print_args /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:3285
        xen-troops#11 0x7fb07a731722 in event_read_print /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:3369
        xen-troops#12 0x7fb07a740054 in __tep_parse_format /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:6335
        xen-troops#13 0x7fb07a74047a in __parse_event /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:6389
        xen-troops#14 0x7fb07a740536 in tep_parse_format /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:6431
        xen-troops#15 0x7fb07a785acf in parse_event ../../../src/fs-src/fs.c:251
        xen-troops#16 0x7fb07a785ccd in parse_systems ../../../src/fs-src/fs.c:284
        xen-troops#17 0x7fb07a786fb3 in read_metadata ../../../src/fs-src/fs.c:593
        xen-troops#18 0x7fb07a78760e in ftrace_fs_source_init ../../../src/fs-src/fs.c:727
        xen-troops#19 0x7fb07d90c19c in add_component_with_init_method_data ../../../../src/lib/graph/graph.c:1048
        xen-troops#20 0x7fb07d90c87b in add_source_component_with_initialize_method_data ../../../../src/lib/graph/graph.c:1127
        xen-troops#21 0x7fb07d90c92a in bt_graph_add_source_component ../../../../src/lib/graph/graph.c:1152
        xen-troops#22 0x55db11aa632e in cmd_run_ctx_create_components_from_config_components ../../../src/cli/babeltrace2.c:2252
        xen-troops#23 0x55db11aa6fda in cmd_run_ctx_create_components ../../../src/cli/babeltrace2.c:2347
        xen-troops#24 0x55db11aa780c in cmd_run ../../../src/cli/babeltrace2.c:2461
        xen-troops#25 0x55db11aa8a7d in main ../../../src/cli/babeltrace2.c:2673
        xen-troops#26 0x7fb07d5460b2 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x270b2)

The token variable in the process_dynamic_array_len function is
allocated in the read_expect_type function, but is not freed before
calling the read_token function.

Free the token variable before calling read_token in order to plug the
leak.

Signed-off-by: Philippe Duplessis-Guindon <[email protected]>
Reviewed-by: Steven Rostedt (VMware) <[email protected]>
Link: https://lore.kernel.org/linux-trace-devel/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
otyshchenko1 pushed a commit to otyshchenko1/linux that referenced this issue Nov 22, 2021
The exit function fixes a memory leak with the src field as detected by
leak sanitizer. An example of which is:

Indirect leak of 25133184 byte(s) in 207 object(s) allocated from:
    #0 0x7f199ecfe987 in __interceptor_calloc libsanitizer/asan/asan_malloc_linux.cpp:154
    #1 0x55defe638224 in annotated_source__alloc_histograms util/annotate.c:803
    #2 0x55defe6397e4 in symbol__hists util/annotate.c:952
    xen-troops#3 0x55defe639908 in symbol__inc_addr_samples util/annotate.c:968
    xen-troops#4 0x55defe63aa29 in hist_entry__inc_addr_samples util/annotate.c:1119
    xen-troops#5 0x55defe499a79 in hist_iter__report_callback tools/perf/builtin-report.c:182
    xen-troops#6 0x55defe7a859d in hist_entry_iter__add util/hist.c:1236
    xen-troops#7 0x55defe49aa63 in process_sample_event tools/perf/builtin-report.c:315
    xen-troops#8 0x55defe731bc8 in evlist__deliver_sample util/session.c:1473
    xen-troops#9 0x55defe731e38 in machines__deliver_event util/session.c:1510
    xen-troops#10 0x55defe732a23 in perf_session__deliver_event util/session.c:1590
    xen-troops#11 0x55defe72951e in ordered_events__deliver_event util/session.c:183
    xen-troops#12 0x55defe740082 in do_flush util/ordered-events.c:244
    xen-troops#13 0x55defe7407cb in __ordered_events__flush util/ordered-events.c:323
    xen-troops#14 0x55defe740a61 in ordered_events__flush util/ordered-events.c:341
    xen-troops#15 0x55defe73837f in __perf_session__process_events util/session.c:2390
    xen-troops#16 0x55defe7385ff in perf_session__process_events util/session.c:2420
    ...

Signed-off-by: Ian Rogers <[email protected]>
Acked-by: Namhyung Kim <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kajol Jain <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Martin Liška <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
otyshchenko1 pushed a commit to otyshchenko1/linux that referenced this issue Nov 22, 2021
To clear a user buffer we cannot simply use memset, we have to use
clear_user().  With a virtio-mem device that registers a vmcore_cb and
has some logically unplugged memory inside an added Linux memory block,
I can easily trigger a BUG by copying the vmcore via "cp":

  systemd[1]: Starting Kdump Vmcore Save Service...
  kdump[420]: Kdump is using the default log level(3).
  kdump[453]: saving to /sysroot/var/crash/127.0.0.1-2021-11-11-14:59:22/
  kdump[458]: saving vmcore-dmesg.txt to /sysroot/var/crash/127.0.0.1-2021-11-11-14:59:22/
  kdump[465]: saving vmcore-dmesg.txt complete
  kdump[467]: saving vmcore
  BUG: unable to handle page fault for address: 00007f2374e01000
  #PF: supervisor write access in kernel mode
  #PF: error_code(0x0003) - permissions violation
  PGD 7a523067 P4D 7a523067 PUD 7a528067 PMD 7a525067 PTE 800000007048f867
  Oops: 0003 [#1] PREEMPT SMP NOPTI
  CPU: 0 PID: 468 Comm: cp Not tainted 5.15.0+ xen-troops#6
  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.14.0-27-g64f37cc530f1-prebuilt.qemu.org 04/01/2014
  RIP: 0010:read_from_oldmem.part.0.cold+0x1d/0x86
  Code: ff ff ff e8 05 ff fe ff e9 b9 e9 7f ff 48 89 de 48 c7 c7 38 3b 60 82 e8 f1 fe fe ff 83 fd 08 72 3c 49 8d 7d 08 4c 89 e9 89 e8 <49> c7 45 00 00 00 00 00 49 c7 44 05 f8 00 00 00 00 48 83 e7 f81
  RSP: 0018:ffffc9000073be08 EFLAGS: 00010212
  RAX: 0000000000001000 RBX: 00000000002fd000 RCX: 00007f2374e01000
  RDX: 0000000000000001 RSI: 00000000ffffdfff RDI: 00007f2374e01008
  RBP: 0000000000001000 R08: 0000000000000000 R09: ffffc9000073bc50
  R10: ffffc9000073bc48 R11: ffffffff829461a8 R12: 000000000000f000
  R13: 00007f2374e01000 R14: 0000000000000000 R15: ffff88807bd421e8
  FS:  00007f2374e12140(0000) GS:ffff88807f000000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 00007f2374e01000 CR3: 000000007a4aa000 CR4: 0000000000350eb0
  Call Trace:
   read_vmcore+0x236/0x2c0
   proc_reg_read+0x55/0xa0
   vfs_read+0x95/0x190
   ksys_read+0x4f/0xc0
   do_syscall_64+0x3b/0x90
   entry_SYSCALL_64_after_hwframe+0x44/0xae

Some x86-64 CPUs have a CPU feature called "Supervisor Mode Access
Prevention (SMAP)", which is used to detect wrong access from the kernel
to user buffers like this: SMAP triggers a permissions violation on
wrong access.  In the x86-64 variant of clear_user(), SMAP is properly
handled via clac()+stac().

To fix, properly use clear_user() when we're dealing with a user buffer.

Link: https://lkml.kernel.org/r/[email protected]
Fixes: 997c136 ("fs/proc/vmcore.c: add hook to read_from_oldmem() to check for non-ram pages")
Signed-off-by: David Hildenbrand <[email protected]>
Acked-by: Baoquan He <[email protected]>
Cc: Dave Young <[email protected]>
Cc: Baoquan He <[email protected]>
Cc: Vivek Goyal <[email protected]>
Cc: Philipp Rudo <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
otyshchenko1 pushed a commit to otyshchenko1/linux that referenced this issue May 10, 2022
While handling PCI errors (AER flow) driver tries to
disable NAPI [napi_disable()] after NAPI is deleted
[__netif_napi_del()] which causes unexpected system
hang/crash.

System message log shows the following:
=======================================
[ 3222.537510] EEH: Detected PCI bus error on PHB#384-PE#800000 [ 3222.537511] EEH: This PCI device has failed 2 times in the last hour and will be permanently disabled after 5 failures.
[ 3222.537512] EEH: Notify device drivers to shutdown [ 3222.537513] EEH: Beginning: 'error_detected(IO frozen)'
[ 3222.537514] EEH: PE#800000 (PCI 0384:80:00.0): Invoking
bnx2x->error_detected(IO frozen)
[ 3222.537516] bnx2x: [bnx2x_io_error_detected:14236(eth14)]IO error detected [ 3222.537650] EEH: PE#800000 (PCI 0384:80:00.0): bnx2x driver reports:
'need reset'
[ 3222.537651] EEH: PE#800000 (PCI 0384:80:00.1): Invoking
bnx2x->error_detected(IO frozen)
[ 3222.537651] bnx2x: [bnx2x_io_error_detected:14236(eth13)]IO error detected [ 3222.537729] EEH: PE#800000 (PCI 0384:80:00.1): bnx2x driver reports:
'need reset'
[ 3222.537729] EEH: Finished:'error_detected(IO frozen)' with aggregate recovery state:'need reset'
[ 3222.537890] EEH: Collect temporary log [ 3222.583481] EEH: of node=0384:80:00.0 [ 3222.583519] EEH: PCI device/vendor: 168e14e4 [ 3222.583557] EEH: PCI cmd/status register: 00100140 [ 3222.583557] EEH: PCI-E capabilities and status follow:
[ 3222.583744] EEH: PCI-E 00: 00020010 012c8da2 00095d5e 00455c82 [ 3222.583892] EEH: PCI-E 10: 10820000 00000000 00000000 00000000 [ 3222.583893] EEH: PCI-E 20: 00000000 [ 3222.583893] EEH: PCI-E AER capability register set follows:
[ 3222.584079] EEH: PCI-E AER 00: 13c10001 00000000 00000000 00062030 [ 3222.584230] EEH: PCI-E AER 10: 00002000 000031c0 000001e0 00000000 [ 3222.584378] EEH: PCI-E AER 20: 00000000 00000000 00000000 00000000 [ 3222.584416] EEH: PCI-E AER 30: 00000000 00000000 [ 3222.584416] EEH: of node=0384:80:00.1 [ 3222.584454] EEH: PCI device/vendor: 168e14e4 [ 3222.584491] EEH: PCI cmd/status register: 00100140 [ 3222.584492] EEH: PCI-E capabilities and status follow:
[ 3222.584677] EEH: PCI-E 00: 00020010 012c8da2 00095d5e 00455c82 [ 3222.584825] EEH: PCI-E 10: 10820000 00000000 00000000 00000000 [ 3222.584826] EEH: PCI-E 20: 00000000 [ 3222.584826] EEH: PCI-E AER capability register set follows:
[ 3222.585011] EEH: PCI-E AER 00: 13c10001 00000000 00000000 00062030 [ 3222.585160] EEH: PCI-E AER 10: 00002000 000031c0 000001e0 00000000 [ 3222.585309] EEH: PCI-E AER 20: 00000000 00000000 00000000 00000000 [ 3222.585347] EEH: PCI-E AER 30: 00000000 00000000 [ 3222.586872] RTAS: event: 5, Type: Platform Error (224), Severity: 2 [ 3222.586873] EEH: Reset without hotplug activity [ 3224.762767] EEH: Beginning: 'slot_reset'
[ 3224.762770] EEH: PE#800000 (PCI 0384:80:00.0): Invoking
bnx2x->slot_reset()
[ 3224.762771] bnx2x: [bnx2x_io_slot_reset:14271(eth14)]IO slot reset initializing...
[ 3224.762887] bnx2x 0384:80:00.0: enabling device (0140 -> 0142) [ 3224.768157] bnx2x: [bnx2x_io_slot_reset:14287(eth14)]IO slot reset
--> driver unload

Uninterruptible tasks
=====================
crash> ps | grep UN
     213      2  11  c000000004c89e00  UN   0.0       0      0  [eehd]
     215      2   0  c000000004c80000  UN   0.0       0      0
[kworker/0:2]
    2196      1  28  c000000004504f00  UN   0.1   15936  11136  wickedd
    4287      1   9  c00000020d076800  UN   0.0    4032   3008  agetty
    4289      1  20  c00000020d056680  UN   0.0    7232   3840  agetty
   32423      2  26  c00000020038c580  UN   0.0       0      0
[kworker/26:3]
   32871   4241  27  c0000002609ddd00  UN   0.1   18624  11648  sshd
   32920  10130  16  c00000027284a100  UN   0.1   48512  12608  sendmail
   33092  32987   0  c000000205218b00  UN   0.1   48512  12608  sendmail
   33154   4567  16  c000000260e51780  UN   0.1   48832  12864  pickup
   33209   4241  36  c000000270cb6500  UN   0.1   18624  11712  sshd
   33473  33283   0  c000000205211480  UN   0.1   48512  12672  sendmail
   33531   4241  37  c00000023c902780  UN   0.1   18624  11648  sshd

EEH handler hung while bnx2x sleeping and holding RTNL lock
===========================================================
crash> bt 213
PID: 213    TASK: c000000004c89e00  CPU: 11  COMMAND: "eehd"
  #0 [c000000004d477e0] __schedule at c000000000c70808
  #1 [c000000004d478b0] schedule at c000000000c70ee0
  #2 [c000000004d478e0] schedule_timeout at c000000000c76dec
  xen-troops#3 [c000000004d479c0] msleep at c0000000002120cc
  xen-troops#4 [c000000004d479f0] napi_disable at c000000000a06448
                                        ^^^^^^^^^^^^^^^^
  xen-troops#5 [c000000004d47a30] bnx2x_netif_stop at c0080000018dba94 [bnx2x]
  xen-troops#6 [c000000004d47a60] bnx2x_io_slot_reset at c0080000018a551c [bnx2x]
  xen-troops#7 [c000000004d47b20] eeh_report_reset at c00000000004c9bc
  xen-troops#8 [c000000004d47b90] eeh_pe_report at c00000000004d1a8
  xen-troops#9 [c000000004d47c40] eeh_handle_normal_event at c00000000004da64

And the sleeping source code
============================
crash> dis -ls c000000000a06448
FILE: ../net/core/dev.c
LINE: 6702

   6697  {
   6698          might_sleep();
   6699          set_bit(NAPI_STATE_DISABLE, &n->state);
   6700
   6701          while (test_and_set_bit(NAPI_STATE_SCHED, &n->state))
* 6702                  msleep(1);
   6703          while (test_and_set_bit(NAPI_STATE_NPSVC, &n->state))
   6704                  msleep(1);
   6705
   6706          hrtimer_cancel(&n->timer);
   6707
   6708          clear_bit(NAPI_STATE_DISABLE, &n->state);
   6709  }

EEH calls into bnx2x twice based on the system log above, first through
bnx2x_io_error_detected() and then bnx2x_io_slot_reset(), and executes
the following call chains:

bnx2x_io_error_detected()
  +-> bnx2x_eeh_nic_unload()
       +-> bnx2x_del_all_napi()
            +-> __netif_napi_del()

bnx2x_io_slot_reset()
  +-> bnx2x_netif_stop()
       +-> bnx2x_napi_disable()
            +->napi_disable()

Fix this by correcting the sequence of NAPI APIs usage,
that is delete the NAPI after disabling it.

Fixes: 7fa6f34 ("bnx2x: AER revised")
Reported-by: David Christensen <[email protected]>
Tested-by: David Christensen <[email protected]>
Signed-off-by: Manish Chopra <[email protected]>
Signed-off-by: Ariel Elior <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
otyshchenko1 pushed a commit to otyshchenko1/linux that referenced this issue May 17, 2022
Current DP driver implementation has adding safe mode done at
dp_hpd_plug_handle() which is expected to be executed under event
thread context.

However there is possible circular locking happen (see blow stack trace)
after edp driver call dp_hpd_plug_handle() from dp_bridge_enable() which
is executed under drm_thread context.

After review all possibilities methods and as discussed on
https://patchwork.freedesktop.org/patch/483155/, supporting EDID
compliance tests in the driver is quite hacky. As seen with other
vendor drivers, supporting these will be much easier with IGT. Hence
removing all the related fail safe code for it so that no possibility
of circular lock will happen.
Reviewed-by: Stephen Boyd <[email protected]>
Reviewed-by: Douglas Anderson <[email protected]>
Reviewed-by: Dmitry Baryshkov <[email protected]>

======================================================
 WARNING: possible circular locking dependency detected
 5.15.35-lockdep xen-troops#6 Tainted: G        W
 ------------------------------------------------------
 frecon/429 is trying to acquire lock:
 ffffff808dc3c4e8 (&dev->mode_config.mutex){+.+.}-{3:3}, at:
dp_panel_add_fail_safe_mode+0x4c/0xa0

 but task is already holding lock:
 ffffff808dc441e0 (&kms->commit_lock[i]){+.+.}-{3:3}, at: lock_crtcs+0xb4/0x124

 which lock already depends on the new lock.

 the existing dependency chain (in reverse order) is:

 -> xen-troops#3 (&kms->commit_lock[i]){+.+.}-{3:3}:
        __mutex_lock_common+0x174/0x1a64
        mutex_lock_nested+0x98/0xac
        lock_crtcs+0xb4/0x124
        msm_atomic_commit_tail+0x330/0x748
        commit_tail+0x19c/0x278
        drm_atomic_helper_commit+0x1dc/0x1f0
        drm_atomic_commit+0xc0/0xd8
        drm_atomic_helper_set_config+0xb4/0x134
        drm_mode_setcrtc+0x688/0x1248
        drm_ioctl_kernel+0x1e4/0x338
        drm_ioctl+0x3a4/0x684
        __arm64_sys_ioctl+0x118/0x154
        invoke_syscall+0x78/0x224
        el0_svc_common+0x178/0x200
        do_el0_svc+0x94/0x13c
        el0_svc+0x5c/0xec
        el0t_64_sync_handler+0x78/0x108
        el0t_64_sync+0x1a4/0x1a8

 -> #2 (crtc_ww_class_mutex){+.+.}-{3:3}:
        __mutex_lock_common+0x174/0x1a64
        ww_mutex_lock+0xb8/0x278
        modeset_lock+0x304/0x4ac
        drm_modeset_lock+0x4c/0x7c
        drmm_mode_config_init+0x4a8/0xc50
        msm_drm_init+0x274/0xac0
        msm_drm_bind+0x20/0x2c
        try_to_bring_up_master+0x3dc/0x470
        __component_add+0x18c/0x3c0
        component_add+0x1c/0x28
        dp_display_probe+0x954/0xa98
        platform_probe+0x124/0x15c
        really_probe+0x1b0/0x5f8
        __driver_probe_device+0x174/0x20c
        driver_probe_device+0x70/0x134
        __device_attach_driver+0x130/0x1d0
        bus_for_each_drv+0xfc/0x14c
        __device_attach+0x1bc/0x2bc
        device_initial_probe+0x1c/0x28
        bus_probe_device+0x94/0x178
        deferred_probe_work_func+0x1a4/0x1f0
        process_one_work+0x5d4/0x9dc
        worker_thread+0x898/0xccc
        kthread+0x2d4/0x3d4
        ret_from_fork+0x10/0x20

 -> #1 (crtc_ww_class_acquire){+.+.}-{0:0}:
        ww_acquire_init+0x1c4/0x2c8
        drm_modeset_acquire_init+0x44/0xc8
        drm_helper_probe_single_connector_modes+0xb0/0x12dc
        drm_mode_getconnector+0x5dc/0xfe8
        drm_ioctl_kernel+0x1e4/0x338
        drm_ioctl+0x3a4/0x684
        __arm64_sys_ioctl+0x118/0x154
        invoke_syscall+0x78/0x224
        el0_svc_common+0x178/0x200
        do_el0_svc+0x94/0x13c
        el0_svc+0x5c/0xec
        el0t_64_sync_handler+0x78/0x108
        el0t_64_sync+0x1a4/0x1a8

 -> #0 (&dev->mode_config.mutex){+.+.}-{3:3}:
        __lock_acquire+0x2650/0x672c
        lock_acquire+0x1b4/0x4ac
        __mutex_lock_common+0x174/0x1a64
        mutex_lock_nested+0x98/0xac
        dp_panel_add_fail_safe_mode+0x4c/0xa0
        dp_hpd_plug_handle+0x1f0/0x280
        dp_bridge_enable+0x94/0x2b8
        drm_atomic_bridge_chain_enable+0x11c/0x168
        drm_atomic_helper_commit_modeset_enables+0x500/0x740
        msm_atomic_commit_tail+0x3e4/0x748
        commit_tail+0x19c/0x278
        drm_atomic_helper_commit+0x1dc/0x1f0
        drm_atomic_commit+0xc0/0xd8
        drm_atomic_helper_set_config+0xb4/0x134
        drm_mode_setcrtc+0x688/0x1248
        drm_ioctl_kernel+0x1e4/0x338
        drm_ioctl+0x3a4/0x684
        __arm64_sys_ioctl+0x118/0x154
        invoke_syscall+0x78/0x224
        el0_svc_common+0x178/0x200
        do_el0_svc+0x94/0x13c
        el0_svc+0x5c/0xec
        el0t_64_sync_handler+0x78/0x108
        el0t_64_sync+0x1a4/0x1a8

Changes in v2:
-- re text commit title
-- remove all fail safe mode

Changes in v3:
-- remove dp_panel_add_fail_safe_mode() from dp_panel.h
-- add Fixes

Changes in v5:
--  [email protected]

Changes in v6:
--  fix Fixes commit ID

Fixes: 8b2c181 ("drm/msm/dp: add fail safe mode outside of event_mutex context")
Reported-by: Douglas Anderson <[email protected]>
Signed-off-by: Kuogee Hsieh <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Rob Clark <[email protected]>
otyshchenko1 pushed a commit to otyshchenko1/linux that referenced this issue May 17, 2022
Recent commit that modified fib route event handler to handle events
according to their priority introduced use-after-free[0] in mp->mfi pointer
usage. The pointer now is not just cached in order to be compared to
following fib_info instances, but is also dereferenced to obtain
fib_priority. However, since mlx5 lag code doesn't hold the reference to
fin_info during whole mp->mfi lifetime, it could be used after fib_info
instance has already been freed be kernel infrastructure code.

Don't ever dereference mp->mfi pointer. Refactor it to be 'const void*'
type and cache fib_info priority in dedicated integer. Group
fib_info-related data into dedicated 'fib' structure that will be further
extended by following patches in the series.

[0]:

[  203.588029] ==================================================================
[  203.590161] BUG: KASAN: use-after-free in mlx5_lag_fib_update+0xabd/0xd60 [mlx5_core]
[  203.592386] Read of size 4 at addr ffff888144df2050 by task kworker/u20:4/138

[  203.594766] CPU: 3 PID: 138 Comm: kworker/u20:4 Tainted: G    B             5.17.0-rc7+ xen-troops#6
[  203.596751] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
[  203.598813] Workqueue: mlx5_lag_mp mlx5_lag_fib_update [mlx5_core]
[  203.600053] Call Trace:
[  203.600608]  <TASK>
[  203.601110]  dump_stack_lvl+0x48/0x5e
[  203.601860]  print_address_description.constprop.0+0x1f/0x160
[  203.602950]  ? mlx5_lag_fib_update+0xabd/0xd60 [mlx5_core]
[  203.604073]  ? mlx5_lag_fib_update+0xabd/0xd60 [mlx5_core]
[  203.605177]  kasan_report.cold+0x83/0xdf
[  203.605969]  ? mlx5_lag_fib_update+0xabd/0xd60 [mlx5_core]
[  203.607102]  mlx5_lag_fib_update+0xabd/0xd60 [mlx5_core]
[  203.608199]  ? mlx5_lag_init_fib_work+0x1c0/0x1c0 [mlx5_core]
[  203.609382]  ? read_word_at_a_time+0xe/0x20
[  203.610463]  ? strscpy+0xa0/0x2a0
[  203.611463]  process_one_work+0x722/0x1270
[  203.612344]  worker_thread+0x540/0x11e0
[  203.613136]  ? rescuer_thread+0xd50/0xd50
[  203.613949]  kthread+0x26e/0x300
[  203.614627]  ? kthread_complete_and_exit+0x20/0x20
[  203.615542]  ret_from_fork+0x1f/0x30
[  203.616273]  </TASK>

[  203.617174] Allocated by task 3746:
[  203.617874]  kasan_save_stack+0x1e/0x40
[  203.618644]  __kasan_kmalloc+0x81/0xa0
[  203.619394]  fib_create_info+0xb41/0x3c50
[  203.620213]  fib_table_insert+0x190/0x1ff0
[  203.621020]  fib_magic.isra.0+0x246/0x2e0
[  203.621803]  fib_add_ifaddr+0x19f/0x670
[  203.622563]  fib_inetaddr_event+0x13f/0x270
[  203.623377]  blocking_notifier_call_chain+0xd4/0x130
[  203.624355]  __inet_insert_ifa+0x641/0xb20
[  203.625185]  inet_rtm_newaddr+0xc3d/0x16a0
[  203.626009]  rtnetlink_rcv_msg+0x309/0x880
[  203.626826]  netlink_rcv_skb+0x11d/0x340
[  203.627626]  netlink_unicast+0x4cc/0x790
[  203.628430]  netlink_sendmsg+0x762/0xc00
[  203.629230]  sock_sendmsg+0xb2/0xe0
[  203.629955]  ____sys_sendmsg+0x58a/0x770
[  203.630756]  ___sys_sendmsg+0xd8/0x160
[  203.631523]  __sys_sendmsg+0xb7/0x140
[  203.632294]  do_syscall_64+0x35/0x80
[  203.633045]  entry_SYSCALL_64_after_hwframe+0x44/0xae

[  203.634427] Freed by task 0:
[  203.635063]  kasan_save_stack+0x1e/0x40
[  203.635844]  kasan_set_track+0x21/0x30
[  203.636618]  kasan_set_free_info+0x20/0x30
[  203.637450]  __kasan_slab_free+0xfc/0x140
[  203.638271]  kfree+0x94/0x3b0
[  203.638903]  rcu_core+0x5e4/0x1990
[  203.639640]  __do_softirq+0x1ba/0x5d3

[  203.640828] Last potentially related work creation:
[  203.641785]  kasan_save_stack+0x1e/0x40
[  203.642571]  __kasan_record_aux_stack+0x9f/0xb0
[  203.643478]  call_rcu+0x88/0x9c0
[  203.644178]  fib_release_info+0x539/0x750
[  203.644997]  fib_table_delete+0x659/0xb80
[  203.645809]  fib_magic.isra.0+0x1a3/0x2e0
[  203.646617]  fib_del_ifaddr+0x93f/0x1300
[  203.647415]  fib_inetaddr_event+0x9f/0x270
[  203.648251]  blocking_notifier_call_chain+0xd4/0x130
[  203.649225]  __inet_del_ifa+0x474/0xc10
[  203.650016]  devinet_ioctl+0x781/0x17f0
[  203.650788]  inet_ioctl+0x1ad/0x290
[  203.651533]  sock_do_ioctl+0xce/0x1c0
[  203.652315]  sock_ioctl+0x27b/0x4f0
[  203.653058]  __x64_sys_ioctl+0x124/0x190
[  203.653850]  do_syscall_64+0x35/0x80
[  203.654608]  entry_SYSCALL_64_after_hwframe+0x44/0xae

[  203.666952] The buggy address belongs to the object at ffff888144df2000
                which belongs to the cache kmalloc-256 of size 256
[  203.669250] The buggy address is located 80 bytes inside of
                256-byte region [ffff888144df2000, ffff888144df2100)
[  203.671332] The buggy address belongs to the page:
[  203.672273] page:00000000bf6c9314 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x144df0
[  203.674009] head:00000000bf6c9314 order:2 compound_mapcount:0 compound_pincount:0
[  203.675422] flags: 0x2ffff800010200(slab|head|node=0|zone=2|lastcpupid=0x1ffff)
[  203.676819] raw: 002ffff800010200 0000000000000000 dead000000000122 ffff888100042b40
[  203.678384] raw: 0000000000000000 0000000080200020 00000001ffffffff 0000000000000000
[  203.679928] page dumped because: kasan: bad access detected

[  203.681455] Memory state around the buggy address:
[  203.682421]  ffff888144df1f00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[  203.683863]  ffff888144df1f80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[  203.685310] >ffff888144df2000: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[  203.686701]                                                  ^
[  203.687820]  ffff888144df2080: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[  203.689226]  ffff888144df2100: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[  203.690620] ==================================================================

Fixes: ad11c4f ("net/mlx5e: Lag, Only handle events from highest priority multipath entry")
Signed-off-by: Vlad Buslov <[email protected]>
Reviewed-by: Maor Dickman <[email protected]>
Reviewed-by: Leon Romanovsky <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
otyshchenko1 pushed a commit to otyshchenko1/linux that referenced this issue Jun 20, 2022
see warning:
| drivers/net/ethernet/amd/xgbe/xgbe-drv.c:2787:43: warning: format specifies
| type 'unsigned short' but the argument has type 'int' [-Wformat]
|        netdev_dbg(netdev, "Protocol: %#06hx\n", ntohs(eth->h_proto));
|                                      ~~~~~~     ^~~~~~~~~~~~~~~~~~~

Variadic functions (printf-like) undergo default argument promotion.
Documentation/core-api/printk-formats.rst specifically recommends
using the promoted-to-type's format flag.

Also, as per C11 6.3.1.1:
(https://www.open-std.org/jtc1/sc22/wg14/www/docs/n1548.pdf)
`If an int can represent all values of the original type ..., the
value is converted to an int; otherwise, it is converted to an
unsigned int. These are called the integer promotions.`

Since the argument is a u16 it will get promoted to an int and thus it is
most accurate to use the %x format specifier here. It should be noted that the
`xen-troops#6` formatting sugar does not alter the promotion rules.

Link: ClangBuiltLinux#378
Signed-off-by: Justin Stitt <[email protected]>
Reviewed-by: Nick Desaulniers <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
otyshchenko1 pushed a commit to otyshchenko1/linux that referenced this issue Oct 5, 2022
We have been hitting the following lockdep splat with btrfs/187 recently

  WARNING: possible circular locking dependency detected
  5.19.0-rc8+ torvalds#775 Not tainted
  ------------------------------------------------------
  btrfs/752500 is trying to acquire lock:
  ffff97e1875a97b8 (btrfs-treloc-02#2){+.+.}-{3:3}, at: __btrfs_tree_lock+0x24/0x110

  but task is already holding lock:
  ffff97e1875a9278 (btrfs-tree-01/1){+.+.}-{3:3}, at: __btrfs_tree_lock+0x24/0x110

  which lock already depends on the new lock.

  the existing dependency chain (in reverse order) is:

  -> #2 (btrfs-tree-01/1){+.+.}-{3:3}:
	 down_write_nested+0x41/0x80
	 __btrfs_tree_lock+0x24/0x110
	 btrfs_init_new_buffer+0x7d/0x2c0
	 btrfs_alloc_tree_block+0x120/0x3b0
	 __btrfs_cow_block+0x136/0x600
	 btrfs_cow_block+0x10b/0x230
	 btrfs_search_slot+0x53b/0xb70
	 btrfs_lookup_inode+0x2a/0xa0
	 __btrfs_update_delayed_inode+0x5f/0x280
	 btrfs_async_run_delayed_root+0x24c/0x290
	 btrfs_work_helper+0xf2/0x3e0
	 process_one_work+0x271/0x590
	 worker_thread+0x52/0x3b0
	 kthread+0xf0/0x120
	 ret_from_fork+0x1f/0x30

  -> #1 (btrfs-tree-01){++++}-{3:3}:
	 down_write_nested+0x41/0x80
	 __btrfs_tree_lock+0x24/0x110
	 btrfs_search_slot+0x3c3/0xb70
	 do_relocation+0x10c/0x6b0
	 relocate_tree_blocks+0x317/0x6d0
	 relocate_block_group+0x1f1/0x560
	 btrfs_relocate_block_group+0x23e/0x400
	 btrfs_relocate_chunk+0x4c/0x140
	 btrfs_balance+0x755/0xe40
	 btrfs_ioctl+0x1ea2/0x2c90
	 __x64_sys_ioctl+0x88/0xc0
	 do_syscall_64+0x38/0x90
	 entry_SYSCALL_64_after_hwframe+0x63/0xcd

  -> #0 (btrfs-treloc-02#2){+.+.}-{3:3}:
	 __lock_acquire+0x1122/0x1e10
	 lock_acquire+0xc2/0x2d0
	 down_write_nested+0x41/0x80
	 __btrfs_tree_lock+0x24/0x110
	 btrfs_lock_root_node+0x31/0x50
	 btrfs_search_slot+0x1cb/0xb70
	 replace_path+0x541/0x9f0
	 merge_reloc_root+0x1d6/0x610
	 merge_reloc_roots+0xe2/0x260
	 relocate_block_group+0x2c8/0x560
	 btrfs_relocate_block_group+0x23e/0x400
	 btrfs_relocate_chunk+0x4c/0x140
	 btrfs_balance+0x755/0xe40
	 btrfs_ioctl+0x1ea2/0x2c90
	 __x64_sys_ioctl+0x88/0xc0
	 do_syscall_64+0x38/0x90
	 entry_SYSCALL_64_after_hwframe+0x63/0xcd

  other info that might help us debug this:

  Chain exists of:
    btrfs-treloc-02#2 --> btrfs-tree-01 --> btrfs-tree-01/1

   Possible unsafe locking scenario:

	 CPU0                    CPU1
	 ----                    ----
    lock(btrfs-tree-01/1);
				 lock(btrfs-tree-01);
				 lock(btrfs-tree-01/1);
    lock(btrfs-treloc-02#2);

   *** DEADLOCK ***

  7 locks held by btrfs/752500:
   #0: ffff97e292fdf460 (sb_writers#12){.+.+}-{0:0}, at: btrfs_ioctl+0x208/0x2c90
   #1: ffff97e284c02050 (&fs_info->reclaim_bgs_lock){+.+.}-{3:3}, at: btrfs_balance+0x55f/0xe40
   #2: ffff97e284c00878 (&fs_info->cleaner_mutex){+.+.}-{3:3}, at: btrfs_relocate_block_group+0x236/0x400
   xen-troops#3: ffff97e292fdf650 (sb_internal#2){.+.+}-{0:0}, at: merge_reloc_root+0xef/0x610
   xen-troops#4: ffff97e284c02378 (btrfs_trans_num_writers){++++}-{0:0}, at: join_transaction+0x1a8/0x5a0
   xen-troops#5: ffff97e284c023a0 (btrfs_trans_num_extwriters){++++}-{0:0}, at: join_transaction+0x1a8/0x5a0
   xen-troops#6: ffff97e1875a9278 (btrfs-tree-01/1){+.+.}-{3:3}, at: __btrfs_tree_lock+0x24/0x110

  stack backtrace:
  CPU: 1 PID: 752500 Comm: btrfs Not tainted 5.19.0-rc8+ torvalds#775
  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.13.0-2.fc32 04/01/2014
  Call Trace:

   dump_stack_lvl+0x56/0x73
   check_noncircular+0xd6/0x100
   ? lock_is_held_type+0xe2/0x140
   __lock_acquire+0x1122/0x1e10
   lock_acquire+0xc2/0x2d0
   ? __btrfs_tree_lock+0x24/0x110
   down_write_nested+0x41/0x80
   ? __btrfs_tree_lock+0x24/0x110
   __btrfs_tree_lock+0x24/0x110
   btrfs_lock_root_node+0x31/0x50
   btrfs_search_slot+0x1cb/0xb70
   ? lock_release+0x137/0x2d0
   ? _raw_spin_unlock+0x29/0x50
   ? release_extent_buffer+0x128/0x180
   replace_path+0x541/0x9f0
   merge_reloc_root+0x1d6/0x610
   merge_reloc_roots+0xe2/0x260
   relocate_block_group+0x2c8/0x560
   btrfs_relocate_block_group+0x23e/0x400
   btrfs_relocate_chunk+0x4c/0x140
   btrfs_balance+0x755/0xe40
   btrfs_ioctl+0x1ea2/0x2c90
   ? lock_is_held_type+0xe2/0x140
   ? lock_is_held_type+0xe2/0x140
   ? __x64_sys_ioctl+0x88/0xc0
   __x64_sys_ioctl+0x88/0xc0
   do_syscall_64+0x38/0x90
   entry_SYSCALL_64_after_hwframe+0x63/0xcd

This isn't necessarily new, it's just tricky to hit in practice.  There
are two competing things going on here.  With relocation we create a
snapshot of every fs tree with a reloc tree.  Any extent buffers that
get initialized here are initialized with the reloc root lockdep key.
However since it is a snapshot, any blocks that are currently in cache
that originally belonged to the fs tree will have the normal tree
lockdep key set.  This creates the lock dependency of

  reloc tree -> normal tree

for the extent buffer locking during the first phase of the relocation
as we walk down the reloc root to relocate blocks.

However this is problematic because the final phase of the relocation is
merging the reloc root into the original fs root.  This involves
searching down to any keys that exist in the original fs root and then
swapping the relocated block and the original fs root block.  We have to
search down to the fs root first, and then go search the reloc root for
the block we need to replace.  This creates the dependency of

  normal tree -> reloc tree

which is why lockdep complains.

Additionally even if we were to fix this particular mismatch with a
different nesting for the merge case, we're still slotting in a block
that has a owner of the reloc root objectid into a normal tree, so that
block will have its lockdep key set to the tree reloc root, and create a
lockdep splat later on when we wander into that block from the fs root.

Unfortunately the only solution here is to make sure we do not set the
lockdep key to the reloc tree lockdep key normally, and then reset any
blocks we wander into from the reloc root when we're doing the merged.

This solves the problem of having mixed tree reloc keys intermixed with
normal tree keys, and then allows us to make sure in the merge case we
maintain the lock order of

  normal tree -> reloc tree

We handle this by setting a bit on the reloc root when we do the search
for the block we want to relocate, and any block we search into or COW
at that point gets set to the reloc tree key.  This works correctly
because we only ever COW down to the parent node, so we aren't resetting
the key for the block we're linking into the fs root.

With this patch we no longer have the lockdep splat in btrfs/187.

Signed-off-by: Josef Bacik <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: David Sterba <[email protected]>
otyshchenko1 pushed a commit to otyshchenko1/linux that referenced this issue Oct 5, 2022
The SRv6 layer allows defining HMAC data that can later be used to sign IPv6
Segment Routing Headers. This configuration is realised via netlink through
four attributes: SEG6_ATTR_HMACKEYID, SEG6_ATTR_SECRET, SEG6_ATTR_SECRETLEN and
SEG6_ATTR_ALGID. Because the SECRETLEN attribute is decoupled from the actual
length of the SECRET attribute, it is possible to provide invalid combinations
(e.g., secret = "", secretlen = 64). This case is not checked in the code and
with an appropriately crafted netlink message, an out-of-bounds read of up
to 64 bytes (max secret length) can occur past the skb end pointer and into
skb_shared_info:

Breakpoint 1, seg6_genl_sethmac (skb=<optimized out>, info=<optimized out>) at net/ipv6/seg6.c:208
208		memcpy(hinfo->secret, secret, slen);
(gdb) bt
 #0  seg6_genl_sethmac (skb=<optimized out>, info=<optimized out>) at net/ipv6/seg6.c:208
 #1  0xffffffff81e012e9 in genl_family_rcv_msg_doit (skb=skb@entry=0xffff88800b1f9f00, nlh=nlh@entry=0xffff88800b1b7600,
    extack=extack@entry=0xffffc90000ba7af0, ops=ops@entry=0xffffc90000ba7a80, hdrlen=4, net=0xffffffff84237580 <init_net>, family=<optimized out>,
    family=<optimized out>) at net/netlink/genetlink.c:731
 #2  0xffffffff81e01435 in genl_family_rcv_msg (extack=0xffffc90000ba7af0, nlh=0xffff88800b1b7600, skb=0xffff88800b1f9f00,
    family=0xffffffff82fef6c0 <seg6_genl_family>) at net/netlink/genetlink.c:775
 xen-troops#3  genl_rcv_msg (skb=0xffff88800b1f9f00, nlh=0xffff88800b1b7600, extack=0xffffc90000ba7af0) at net/netlink/genetlink.c:792
 xen-troops#4  0xffffffff81dfffc3 in netlink_rcv_skb (skb=skb@entry=0xffff88800b1f9f00, cb=cb@entry=0xffffffff81e01350 <genl_rcv_msg>)
    at net/netlink/af_netlink.c:2501
 xen-troops#5  0xffffffff81e00919 in genl_rcv (skb=0xffff88800b1f9f00) at net/netlink/genetlink.c:803
 xen-troops#6  0xffffffff81dff6ae in netlink_unicast_kernel (ssk=0xffff888010eec800, skb=0xffff88800b1f9f00, sk=0xffff888004aed000)
    at net/netlink/af_netlink.c:1319
 xen-troops#7  netlink_unicast (ssk=ssk@entry=0xffff888010eec800, skb=skb@entry=0xffff88800b1f9f00, portid=portid@entry=0, nonblock=<optimized out>)
    at net/netlink/af_netlink.c:1345
 xen-troops#8  0xffffffff81dff9a4 in netlink_sendmsg (sock=<optimized out>, msg=0xffffc90000ba7e48, len=<optimized out>) at net/netlink/af_netlink.c:1921
...
(gdb) p/x ((struct sk_buff *)0xffff88800b1f9f00)->head + ((struct sk_buff *)0xffff88800b1f9f00)->end
$1 = 0xffff88800b1b76c0
(gdb) p/x secret
$2 = 0xffff88800b1b76c0
(gdb) p slen
$3 = 64 '@'

The OOB data can then be read back from userspace by dumping HMAC state. This
commit fixes this by ensuring SECRETLEN cannot exceed the actual length of
SECRET.

Reported-by: Lucas Leong <[email protected]>
Tested: verified that EINVAL is correctly returned when secretlen > len(secret)
Fixes: 4f4853d ("ipv6: sr: implement API to control SR HMAC structure")
Signed-off-by: David Lebrun <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
otyshchenko1 pushed a commit to otyshchenko1/linux that referenced this issue Oct 5, 2022
The system call gate area counts as kernel text but trying
to install a kprobe in this area fails with an Oops later on.
To fix this explicitly disallow the gate area for kprobes.

Found by syzkaller with the following reproducer:
perf_event_open$cgroup(&(0x7f00000001c0)={0x6, 0x80, 0x0, 0x0, 0x0, 0x0, 0x80ffff, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, @perf_config_ext={0x0, 0xffffffffff600000}}, 0xffffffffffffffff, 0x0, 0xffffffffffffffff, 0x0)

Sample report:
BUG: unable to handle page fault for address: fffffbfff3ac6000
PGD 6dfcb067 P4D 6dfcb067 PUD 6df8f067 PMD 6de4d067 PTE 0
Oops: 0000 [#1] PREEMPT SMP KASAN NOPTI
CPU: 0 PID: 21978 Comm: syz-executor.2 Not tainted 6.0.0-rc3-00363-g7726d4c3e60b-dirty xen-troops#6
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
RIP: 0010:__insn_get_emulate_prefix arch/x86/lib/insn.c:91 [inline]
RIP: 0010:insn_get_emulate_prefix arch/x86/lib/insn.c:106 [inline]
RIP: 0010:insn_get_prefixes.part.0+0xa8/0x1110 arch/x86/lib/insn.c:134
Code: 49 be 00 00 00 00 00 fc ff df 48 8b 40 60 48 89 44 24 08 e9 81 00 00 00 e8 e5 4b 39 ff 4c 89 fa 4c 89 f9 48 c1 ea 03 83 e1 07 <42> 0f b6 14 32 38 ca 7f 08 84 d2 0f 85 06 10 00 00 48 89 d8 48 89
RSP: 0018:ffffc900088bf860 EFLAGS: 00010246
RAX: 0000000000040000 RBX: ffffffff9b9bebc0 RCX: 0000000000000000
RDX: 1ffffffff3ac6000 RSI: ffffc90002d82000 RDI: ffffc900088bf9e8
RBP: ffffffff9d630001 R08: 0000000000000000 R09: ffffc900088bf9e8
R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000001
R13: ffffffff9d630000 R14: dffffc0000000000 R15: ffffffff9d630000
FS:  00007f63eef63640(0000) GS:ffff88806d000000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: fffffbfff3ac6000 CR3: 0000000029d90005 CR4: 0000000000770ef0
PKRU: 55555554
Call Trace:
 <TASK>
 insn_get_prefixes arch/x86/lib/insn.c:131 [inline]
 insn_get_opcode arch/x86/lib/insn.c:272 [inline]
 insn_get_modrm+0x64a/0x7b0 arch/x86/lib/insn.c:343
 insn_get_sib+0x29a/0x330 arch/x86/lib/insn.c:421
 insn_get_displacement+0x350/0x6b0 arch/x86/lib/insn.c:464
 insn_get_immediate arch/x86/lib/insn.c:632 [inline]
 insn_get_length arch/x86/lib/insn.c:707 [inline]
 insn_decode+0x43a/0x490 arch/x86/lib/insn.c:747
 can_probe+0xfc/0x1d0 arch/x86/kernel/kprobes/core.c:282
 arch_prepare_kprobe+0x79/0x1c0 arch/x86/kernel/kprobes/core.c:739
 prepare_kprobe kernel/kprobes.c:1160 [inline]
 register_kprobe kernel/kprobes.c:1641 [inline]
 register_kprobe+0xb6e/0x1690 kernel/kprobes.c:1603
 __register_trace_kprobe kernel/trace/trace_kprobe.c:509 [inline]
 __register_trace_kprobe+0x26a/0x2d0 kernel/trace/trace_kprobe.c:477
 create_local_trace_kprobe+0x1f7/0x350 kernel/trace/trace_kprobe.c:1833
 perf_kprobe_init+0x18c/0x280 kernel/trace/trace_event_perf.c:271
 perf_kprobe_event_init+0xf8/0x1c0 kernel/events/core.c:9888
 perf_try_init_event+0x12d/0x570 kernel/events/core.c:11261
 perf_init_event kernel/events/core.c:11325 [inline]
 perf_event_alloc.part.0+0xf7f/0x36a0 kernel/events/core.c:11619
 perf_event_alloc kernel/events/core.c:12059 [inline]
 __do_sys_perf_event_open+0x4a8/0x2a00 kernel/events/core.c:12157
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x38/0x90 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x7f63ef7efaed
Code: 02 b8 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b0 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f63eef63028 EFLAGS: 00000246 ORIG_RAX: 000000000000012a
RAX: ffffffffffffffda RBX: 00007f63ef90ff80 RCX: 00007f63ef7efaed
RDX: 0000000000000000 RSI: ffffffffffffffff RDI: 00000000200001c0
RBP: 00007f63ef86019c R08: 0000000000000000 R09: 0000000000000000
R10: ffffffffffffffff R11: 0000000000000246 R12: 0000000000000000
R13: 0000000000000002 R14: 00007f63ef90ff80 R15: 00007f63eef43000
 </TASK>
Modules linked in:
CR2: fffffbfff3ac6000
---[ end trace 0000000000000000 ]---
RIP: 0010:__insn_get_emulate_prefix arch/x86/lib/insn.c:91 [inline]
RIP: 0010:insn_get_emulate_prefix arch/x86/lib/insn.c:106 [inline]
RIP: 0010:insn_get_prefixes.part.0+0xa8/0x1110 arch/x86/lib/insn.c:134
Code: 49 be 00 00 00 00 00 fc ff df 48 8b 40 60 48 89 44 24 08 e9 81 00 00 00 e8 e5 4b 39 ff 4c 89 fa 4c 89 f9 48 c1 ea 03 83 e1 07 <42> 0f b6 14 32 38 ca 7f 08 84 d2 0f 85 06 10 00 00 48 89 d8 48 89
RSP: 0018:ffffc900088bf860 EFLAGS: 00010246
RAX: 0000000000040000 RBX: ffffffff9b9bebc0 RCX: 0000000000000000
RDX: 1ffffffff3ac6000 RSI: ffffc90002d82000 RDI: ffffc900088bf9e8
RBP: ffffffff9d630001 R08: 0000000000000000 R09: ffffc900088bf9e8
R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000001
R13: ffffffff9d630000 R14: dffffc0000000000 R15: ffffffff9d630000
FS:  00007f63eef63640(0000) GS:ffff88806d000000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: fffffbfff3ac6000 CR3: 0000000029d90005 CR4: 0000000000770ef0
PKRU: 55555554
==================================================================

Link: https://lkml.kernel.org/r/[email protected]

cc: "Naveen N. Rao" <[email protected]>
cc: Anil S Keshavamurthy <[email protected]>
cc: "David S. Miller" <[email protected]>
Cc: [email protected]
Fixes: 1da177e ("Linux-2.6.12-rc2")
Acked-by: Masami Hiramatsu (Google) <[email protected]>
Signed-off-by: Christian A. Ehrhardt <[email protected]>
Signed-off-by: Steven Rostedt (Google) <[email protected]>
otyshchenko1 pushed a commit to otyshchenko1/linux that referenced this issue Oct 5, 2022
If an AF_PACKET socket is used to send packets through ipvlan and the
default xmit function of the AF_PACKET socket is changed from
dev_queue_xmit() to packet_direct_xmit() via setsockopt() with the option
name of PACKET_QDISC_BYPASS, the skb->mac_header may not be reset and
remains as the initial value of 65535, this may trigger slab-out-of-bounds
bugs as following:

=================================================================
UG: KASAN: slab-out-of-bounds in ipvlan_xmit_mode_l2+0xdb/0x330 [ipvlan]
PU: 2 PID: 1768 Comm: raw_send Kdump: loaded Not tainted 6.0.0-rc4+ xen-troops#6
ardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-1.fc33
all Trace:
print_address_description.constprop.0+0x1d/0x160
print_report.cold+0x4f/0x112
kasan_report+0xa3/0x130
ipvlan_xmit_mode_l2+0xdb/0x330 [ipvlan]
ipvlan_start_xmit+0x29/0xa0 [ipvlan]
__dev_direct_xmit+0x2e2/0x380
packet_direct_xmit+0x22/0x60
packet_snd+0x7c9/0xc40
sock_sendmsg+0x9a/0xa0
__sys_sendto+0x18a/0x230
__x64_sys_sendto+0x74/0x90
do_syscall_64+0x3b/0x90
entry_SYSCALL_64_after_hwframe+0x63/0xcd

The root cause is:
  1. packet_snd() only reset skb->mac_header when sock->type is SOCK_RAW
     and skb->protocol is not specified as in packet_parse_headers()

  2. packet_direct_xmit() doesn't reset skb->mac_header as dev_queue_xmit()

In this case, skb->mac_header is 65535 when ipvlan_xmit_mode_l2() is
called. So when ipvlan_xmit_mode_l2() gets mac header with eth_hdr() which
use "skb->head + skb->mac_header", out-of-bound access occurs.

This patch replaces eth_hdr() with skb_eth_hdr() in ipvlan_xmit_mode_l2()
and reset mac header in multicast to solve this out-of-bound bug.

Fixes: 2ad7bf3 ("ipvlan: Initial check-in of the IPVLAN driver.")
Signed-off-by: Lu Wei <[email protected]>
Reviewed-by: Eric Dumazet <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants