Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Buffer Overflow in the searchBaseLayer Method #578

Open
BlaiseMuhirwa opened this issue Aug 2, 2024 · 0 comments
Open

Buffer Overflow in the searchBaseLayer Method #578

BlaiseMuhirwa opened this issue Aug 2, 2024 · 0 comments

Comments

@BlaiseMuhirwa
Copy link

I was testing out HNSW on SIFT and realized that sometimes we get a buffer overflow. See the following backtrace

==89680==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x607000000844 at pc 0x000100031a37 bp 0x7ff7bfefd750 sp 0x7ff7bfefd748
READ of size 4 at 0x607000000844 thread T0
    #0 0x100031a36 in hnswlib::HierarchicalNSW<float>::searchBaseLayer(unsigned int, void const*, int) hnswalg.h:282
    #1 0x100029073 in hnswlib::HierarchicalNSW<float>::addPoint(void const*, unsigned long, int) hnswalg.h:1384
    #2 0x100005c7d in run_test(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&, int, bool) dump_graph.cpp:137
    #3 0x10000a33b in main dump_graph.cpp:197
    #4 0x7ff81942c365 in start+0x795 (dyld:x86_64+0xfffffffffff5c365)

0x607000000845 is located 0 bytes after 69-byte region [0x607000000800,0x607000000845)
allocated by thread T0 here:
    #0 0x100b18a20 in wrap_malloc+0xa0 (libclang_rt.asan_osx_dynamic.dylib:x86_64h+0xdca20)
    #1 0x100028877 in hnswlib::HierarchicalNSW<float>::addPoint(void const*, unsigned long, int) hnswalg.h:1334
    #2 0x100005c7d in run_test(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&, int, bool) dump_graph.cpp:137
    #3 0x10000a33b in main dump_graph.cpp:197
    #4 0x7ff81942c365 in start+0x795 (dyld:x86_64+0xfffffffffff5c365)

SUMMARY: AddressSanitizer: heap-buffer-overflow hnswalg.h:282 in hnswlib::HierarchicalNSW<float>::searchBaseLayer(unsigned int, void const*, int)
Shadow bytes around the buggy address:
  0x607000000580: 00 00 00 00 05 fa fa fa fa fa 00 00 00 00 00 00
  0x607000000600: 00 00 05 fa fa fa fa fa 00 00 00 00 00 00 00 00
  0x607000000680: 05 fa fa fa fa fa 00 00 00 00 00 00 00 00 05 fa
  0x607000000700: fa fa fa fa 00 00 00 00 00 00 00 00 05 fa fa fa
  0x607000000780: fa fa 00 00 00 00 00 00 00 00 05 fa fa fa fa fa
=>0x607000000800: 00 00 00 00 00 00 00 00[05]fa fa fa fa fa 00 00
  0x607000000880: 00 00 00 00 00 00 05 fa fa fa fa fa 00 00 00 00
  0x607000000900: 00 00 00 00 05 fa fa fa fa fa 00 00 00 00 00 00
  0x607000000980: 00 00 05 fa fa fa fa fa 00 00 00 00 00 00 00 00
  0x607000000a00: 05 fa fa fa fa fa 00 00 00 00 00 00 00 00 05 fa
  0x607000000a80: fa fa fa fa 00 00 00 00 00 00 00 00 05 fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==89680==ABORTING
(lldb) AddressSanitizer report breakpoint hit. Use 'thread info -s' to get extended information about the report.
Process 89680 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = Heap buffer overflow
    frame #0: 0x0000000100b233b0 libclang_rt.asan_osx_dynamic.dylib__asan::AsanDie()
libclang_rt.asan_osx_dynamic.dylib__asan::AsanDie:
->  0x100b233b0 <+0>: pushq  %rbp
    0x100b233b1 <+1>: movq   %rsp, %rbp
    0x100b233b4 <+4>: pushq  %rbx
    0x100b233b5 <+5>: pushq  %rax
Target 0: (dump_graph) stopped.

After taking a careful look, I noticed that this is coming from the following line where we try to prefetch the visited list marker for the next item:

_mm_prefetch((char *) (visited_array + *(datal + j + 1)), _MM_HINT_T0);

A fix for this is to check that we're not going to go out of bounds. Something like this

#ifdef USE_SSE
        // prevent going out of bounds
        if (j + 1 < size) {
          tableint next_candidate_id = *(datal + j + 1);
          _mm_prefetch((char *)(visited_array + next_candidate_id), _MM_HINT_T0);
          _mm_prefetch(getDataByInternalId(next_candidate_id), _MM_HINT_T0);
        }
#endif

I can submit a PR for this!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant