chore(iast): add local debug scripts to find leaks (#9318)
Add to the repository the scripts that @juanjux and I use to debug C++ leaks.

This folder (scripts/iast/) contains some scripts to check the memory usage of native code.

### 1. Build the docker image

```sh
docker build . -f docker/Dockerfile_py311_debug_mode -t python_311_debug
```

### 2. Run the docker container

#### 2.1. Run the container with the script to find references (this script will run the memory usage check)

```sh
docker run --rm -it -v ${PWD}:/ddtrace python_311_debug /bin/bash -c "cd /ddtrace && scripts/iast/run_references.sh"
>> References: 1003
>> References: 2
>> References: 2
>> References: 2
>> References: 2
>> References: 2
```

#### 2.2. Run the container with the script with memray usage check

```sh
docker run --rm -it -v ${PWD}:/ddtrace python_311_debug /bin/bash -c "cd /ddtrace && scripts/iast/run_memray.sh"
google-chrome file://$PWD/memray-flamegraph-lel.html
```

#### 2.3. Run the container with the script with Max RSS

```sh
docker run --rm -it -v ${PWD}:/ddtrace python_311_debug /bin/bash -c "cd /ddtrace && scripts/iast/run_memory.sh"
>> Round 0 Max RSS: 41.9453125
>> 42.2109375
```

## Checklist

- [x] Change(s) are motivated and described in the PR description
- [x] Testing strategy is described if automated tests are not included in the PR
- [x] Risks are described (performance impact, potential for breakage, maintainability)
- [x] Change is maintainable (easy to change, telemetry, documentation)
- [x] [Library release note guidelines](https://ddtrace.readthedocs.io/en/stable/releasenotes.html) are followed or label `changelog/no-changelog` is set
- [x] Documentation is included (in-code, generated user docs, [public corp docs](https://github.com/DataDog/documentation/))
- [x] Backport labels are set (if [applicable](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting))
- [x] If this PR changes the public interface, I've notified `@DataDog/apm-tees`.
- [x] If change touches code that signs or publishes builds or packages, or handles credentials of any kind, I've requested a review from `@DataDog/security-design-and-guidance`.

## Reviewer Checklist

- [x] Title is accurate
- [x] All changes are related to the pull request's stated goal
- [x] Description motivates each change
- [x] Avoids breaking [API](https://ddtrace.readthedocs.io/en/stable/versioning.html#interfaces) changes
- [x] Testing strategy adequately addresses listed risks
- [x] Change is maintainable (easy to change, telemetry, documentation)
- [x] Release note makes sense to a user of the library
- [x] Author has acknowledged and discussed the performance implications of this PR as reported in the benchmarks PR comment
- [x] Backport labels are set in a manner that is consistent with the [release branch maintenance policy](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting)

---------

Co-authored-by: erikayasuda <[email protected]>
1 parent f471e6b, commit 9e3bd1f
Showing 12 changed files with 24,653 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,91 @@ | ||
# DEV: Use `debian:slim` instead of an `alpine` image to support installing wheels from PyPI | ||
# this drastically improves test execution time since python dependencies don't all | ||
# have to be built from source all the time (grpcio takes forever to install) | ||
FROM debian:buster-20221219-slim | ||
|
||
# http://bugs.python.org/issue19846 | ||
# > At the moment, setting "LANG=C" on a Linux system *fundamentally breaks Python 3*, and that's not OK. | ||
ENV LANG C.UTF-8 | ||
|
||
# https://support.circleci.com/hc/en-us/articles/360045268074-Build-Fails-with-Too-long-with-no-output-exceeded-10m0s-context-deadline-exceeded- | ||
ENV PYTHONUNBUFFERED=1 | ||
# Configure paths for the CPython debug build (source checkout and install prefix) | |
ENV PYTHON_SOURCE=/root/python_source | ||
ENV PYTHON_DEBUG=/root/env/python_debug | ||
ENV PATH=$PATH:${PYTHON_DEBUG}/bin | ||
ENV PYTHON_CONFIGURE_OPTS=--enable-shared | ||
|
||
RUN \ | ||
# Install system dependencies | ||
apt-get update \ | ||
&& apt-get install -y --no-install-recommends \ | ||
apt-transport-https \ | ||
build-essential \ | ||
ca-certificates \ | ||
clang-format \ | ||
curl \ | ||
git \ | ||
gnupg \ | ||
jq \ | ||
libbz2-dev \ | ||
libenchant-dev \ | ||
libffi-dev \ | ||
liblzma-dev \ | ||
libmemcached-dev \ | ||
libncurses5-dev \ | ||
libncursesw5-dev \ | ||
libpq-dev \ | ||
libreadline-dev \ | ||
libsasl2-dev \ | ||
libsqlite3-dev \ | ||
libsqliteodbc \ | ||
libssh-dev \ | ||
libssl-dev \ | ||
patch \ | ||
python-openssl\ | ||
unixodbc-dev \ | ||
wget \ | ||
zlib1g-dev \ | ||
valgrind \ | ||
# Cleaning up apt cache space | ||
&& rm -rf /var/lib/apt/lists/* | ||
|
||
# Build CPython from source with debug options | |
# `--with-pydebug`: [debug build](https://docs.python.org/3/using/configure.html#python-debug-build) that enables reference counting statistics, assertions and extra sanity checks | |
# `--with-valgrind`: Enable Valgrind support (default is no). | |
# `--without-pymalloc`: Python has a pymalloc allocator optimized for small objects (smaller or equal to 512 bytes) with a short lifetime. We disable it so that it does not hide memory errors | |
RUN git clone --depth 1 --branch v3.11.6 https://github.com/python/cpython/ "${PYTHON_SOURCE}" \ | ||
&& cd ${PYTHON_SOURCE} \ | ||
&& ./configure --with-pydebug --without-pymalloc --with-valgrind --prefix ${PYTHON_DEBUG} \ | ||
&& make OPT=-g \ | ||
&& make install \ | ||
&& cd - | ||
|
||
RUN python3.11 -m pip install -U pip \ | ||
&& python3.11 -m pip install six cattrs setuptools cython wheel cmake pytest pytest-cov hypothesis pytest-memray\ | ||
memray==1.12.0 \ | ||
requests==2.31.0 \ | ||
"attrs>=20" \ | |
"bytecode>=0.14.0" \ | |
cattrs \ | |
"ddsketch>=3.0.0" \ | |
"envier~=0.5" \ | |
"opentelemetry-api>=1" \ | |
"protobuf>=3" \ | |
"six>=1.12.0" \ | |
typing_extensions \ | |
"xmltodict>=0.12" | |
|
||
|
||
CMD ["/bin/bash"] | ||
#docker build . -f docker/Dockerfile_py311_debug_mode -t python_311_debug | ||
#docker run --rm -it -v ${PWD}:/ddtrace python_311_debug | ||
# | ||
# Now, you can check IAST leaks: | ||
#cd /ddtrace | ||
#export PATH=$PATH:$PWD | ||
#export PYTHONPATH=$PYTHONPATH:$PWD | ||
#export PYTHONMALLOC=malloc | ||
#python3.11 ddtrace/appsec/_iast/leak.py | ||
#python3.11 -m memray run --trace-python-allocators --native -o lel.bin -f prueba.py | ||
#python3.11 -m memray flamegraph lel.bin --leaks -f |
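Before running the leak checks, it can help to confirm that the interpreter in the image really is the debug build configured above. The following is a minimal sketch, assuming a POSIX build where `sysconfig` exposes the `WITH_PYMALLOC` and `WITH_VALGRIND` configure results; `describe_build` is a hypothetical helper and not part of this commit.

```python
import sys
import sysconfig


def describe_build():
    """Summarize the configure options that matter for leak hunting."""
    return {
        # sys.gettotalrefcount is only present in builds with reference
        # debugging, which --with-pydebug turns on.
        "pydebug": hasattr(sys, "gettotalrefcount"),
        # Unset or 0 when the interpreter was configured --without-pymalloc.
        "pymalloc": bool(sysconfig.get_config_var("WITH_PYMALLOC")),
        # Set when the interpreter was configured --with-valgrind.
        "valgrind": bool(sysconfig.get_config_var("WITH_VALGRIND")),
    }


if __name__ == "__main__":
    for option, enabled in describe_build().items():
        print(f"{option}: {enabled}")
```

On the image built from this Dockerfile one would expect `pydebug` and `valgrind` to be true and `pymalloc` to be false; a stock distribution interpreter will usually report the opposite.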
@@ -0,0 +1,91 @@ | ||
# DEV: Use `debian:slim` instead of an `alpine` image to support installing wheels from PyPI | ||
# this drastically improves test execution time since python dependencies don't all | ||
# have to be built from source all the time (grpcio takes forever to install) | ||
FROM debian:buster-20221219-slim | ||
|
||
# http://bugs.python.org/issue19846 | ||
# > At the moment, setting "LANG=C" on a Linux system *fundamentally breaks Python 3*, and that's not OK. | ||
ENV LANG C.UTF-8 | ||
|
||
# https://support.circleci.com/hc/en-us/articles/360045268074-Build-Fails-with-Too-long-with-no-output-exceeded-10m0s-context-deadline-exceeded- | ||
ENV PYTHONUNBUFFERED=1 | ||
# Configure paths for the CPython debug build (source checkout and install prefix) | |
ENV PYTHON_SOURCE=/root/python_source | ||
ENV PYTHON_DEBUG=/root/env/python_debug | ||
ENV PATH=$PATH:${PYTHON_DEBUG}/bin | ||
ENV PYTHON_CONFIGURE_OPTS=--enable-shared | ||
|
||
RUN \ | ||
# Install system dependencies | ||
apt-get update \ | ||
&& apt-get install -y --no-install-recommends \ | ||
apt-transport-https \ | ||
build-essential \ | ||
ca-certificates \ | ||
clang-format \ | ||
curl \ | ||
git \ | ||
gnupg \ | ||
jq \ | ||
libbz2-dev \ | ||
libenchant-dev \ | ||
libffi-dev \ | ||
liblzma-dev \ | ||
libmemcached-dev \ | ||
libncurses5-dev \ | ||
libncursesw5-dev \ | ||
libpq-dev \ | ||
libreadline-dev \ | ||
libsasl2-dev \ | ||
libsqlite3-dev \ | ||
libsqliteodbc \ | ||
libssh-dev \ | ||
libssl-dev \ | ||
patch \ | ||
python-openssl\ | ||
unixodbc-dev \ | ||
wget \ | ||
zlib1g-dev \ | ||
valgrind \ | ||
# Cleaning up apt cache space | ||
&& rm -rf /var/lib/apt/lists/* | ||
|
||
# Build CPython from source with debug options | |
# `--with-pydebug`: [debug build](https://docs.python.org/3/using/configure.html#python-debug-build) that enables reference counting statistics, assertions and extra sanity checks | |
# `--with-valgrind`: Enable Valgrind support (default is no). | |
# `--without-pymalloc`: Python has a pymalloc allocator optimized for small objects (smaller or equal to 512 bytes) with a short lifetime. We disable it so that it does not hide memory errors | |
RUN git clone --depth 1 --branch v3.12.3 https://github.com/python/cpython/ "${PYTHON_SOURCE}" \ | ||
&& cd ${PYTHON_SOURCE} \ | ||
&& ./configure --with-pydebug --without-pymalloc --with-valgrind --prefix ${PYTHON_DEBUG} \ | ||
&& make OPT=-g \ | ||
&& make install \ | ||
&& cd - | ||
|
||
RUN python3.12 -m pip install -U pip \ | ||
&& python3.12 -m pip install six cattrs setuptools cython wheel cmake pytest pytest-cov hypothesis pytest-memray\ | ||
memray==1.12.0 \ | ||
requests==2.31.0 \ | ||
"attrs>=20" \ | |
"bytecode>=0.14.0" \ | |
cattrs \ | |
"ddsketch>=3.0.0" \ | |
"envier~=0.5" \ | |
"opentelemetry-api>=1" \ | |
"protobuf>=3" \ | |
"six>=1.12.0" \ | |
typing_extensions \ | |
"xmltodict>=0.12" | |
|
||
|
||
CMD ["/bin/bash"] | ||
#docker build . -f docker/Dockerfile_py311_debug_mode -t python_311_debug | ||
#docker run --rm -it -v ${PWD}:/ddtrace python_311_debug | ||
# | ||
# Now, you can check IAST leaks: | ||
#cd /ddtrace | ||
#export PATH=$PATH:$PWD | ||
#export PYTHONPATH=$PYTHONPATH:$PWD | ||
#export PYTHONMALLOC=malloc | ||
#python3.12 ddtrace/appsec/_iast/leak.py | ||
#python3.12 -m memray run --trace-python-allocators --native -o lel.bin -f prueba.py | ||
#python3.12 -m memray flamegraph lel.bin --leaks -f |
@@ -0,0 +1,12 @@ | ||
export PATH=$PATH:$PWD | ||
export PYTHONPATH=$PYTHONPATH:$PWD | ||
export PYTHON_VERSION=python3.11 | ||
export PYTHONMALLOC=malloc | ||
export DD_COMPILE_DEBUG=true | ||
export DD_TRACE_ENABLED=true | ||
export DD_IAST_ENABLED=true | ||
export _DD_IAST_DEBUG=true | ||
export DD_IAST_REQUEST_SAMPLING=100 | ||
export _DD_APPSEC_DEDUPLICATION_ENABLED=false | ||
export DD_INSTRUMENTATION_TELEMETRY_ENABLED=true | ||
export DD_REMOTE_CONFIGURATION_ENABLED=false |
@@ -0,0 +1,77 @@ | ||
This folder (scripts/iast/) contains some scripts to check memory usage of native code. | ||
|
||
## How to use | ||
|
||
### 1. Build the docker image | ||
|
||
```sh | ||
docker build . -f docker/Dockerfile_py311_debug_mode -t python_311_debug | ||
``` | ||
|
||
### 2. Run the docker container | ||
|
||
#### 2.1. Run the container with the script to find references (this script will run the memory usage check) | ||
|
||
```sh | ||
docker run --rm -it -v ${PWD}:/ddtrace python_311_debug /bin/bash -c "cd /ddtrace && source scripts/iast/.env && \ | ||
sh scripts/iast/run_references.sh" | ||
>> References: 1003 | ||
>> References: 2 | ||
>> References: 2 | ||
>> References: 2 | ||
>> References: 2 | ||
>> References: 2 | ||
``` | ||
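The output above shows a reference count that settles after the first round. The reference script itself does not appear in this part of the diff, so the following is only a sketch of the general idea, using `sys.getrefcount` on a throwaway object: a count that keeps growing from call to call would point at a reference that is never released. The names `refcount_delta`, `leaky` and `clean` are hypothetical.

```python
import sys


def refcount_delta(obj, action):
    """Return how many extra references to `obj` remain after `action(obj)`."""
    before = sys.getrefcount(obj)
    action(obj)
    return sys.getrefcount(obj) - before


_leaked = []  # stands in for native code that forgets to release a reference


def leaky(obj):
    _leaked.append(obj)  # keeps one extra reference alive per call


def clean(obj):
    holders = [obj] * 1000  # temporary references, dropped before returning
    del holders


if __name__ == "__main__":
    sample = object()
    print("References:", refcount_delta(sample, clean))
    print("References:", refcount_delta(sample, leaky))
```

Comparing deltas rather than absolute counts keeps the check independent of how many references the interpreter itself holds at the time of the call.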
|
||
#### 2.2. Run the container with the script with memray usage check | ||
|
||
```sh | ||
docker run --rm -it -v ${PWD}:/ddtrace python_311_debug /bin/bash -c "cd /ddtrace && source scripts/iast/.env && \ | ||
sh scripts/iast/run_memray.sh" | ||
google-chrome file://$PWD/memray-flamegraph-lel.html | ||
``` | ||
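memray is a third-party profiler and its flame graph is opened in a browser. For a quick check without it, the standard-library `tracemalloc` module can serve as a rough substitute. This is a different technique from the memray run above, and it only sees allocations made through Python's allocators, not raw `malloc` or `new` calls inside native code. The helper name `traced_growth` is hypothetical.

```python
import tracemalloc


def traced_growth(func, rounds):
    """Return the growth in traced bytes after calling `func` `rounds` times."""
    tracemalloc.start()
    try:
        func()  # warm-up call so one-time caches are not counted as growth
        before, _peak = tracemalloc.get_traced_memory()
        for _ in range(rounds):
            func()
        after, _peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return after - before


if __name__ == "__main__":
    kept = []
    print("leaky:", traced_growth(lambda: kept.append(bytearray(10_000)), 100))
    print("clean:", traced_growth(lambda: bytearray(10_000), 100))
```

A function whose growth scales with the number of rounds is worth a closer look with memray or Valgrind, which can also attribute the allocations to native frames.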
|
||
#### 2.3. Run the container with the script with Max RSS | ||
|
||
```sh | ||
docker run --rm -it -v ${PWD}:/ddtrace python_311_debug /bin/bash -c "cd /ddtrace && source scripts/iast/.env && \ | ||
sh scripts/iast/run_memory.sh" | ||
>> Round 0 Max RSS: 41.9453125 | ||
>> 42.2109375 | ||
``` | ||
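The figures above are consistent with the peak resident set size reported by the kernel in KiB and divided by 1024, although whether the script obtains them this way is an assumption. A minimal sketch of that measurement with the standard-library `resource` module, assuming Linux units (macOS reports bytes instead); `max_rss_mb` and `run_rounds` are hypothetical names:

```python
import resource


def max_rss_mb():
    """Peak resident set size of this process in MiB (Linux reports KiB)."""
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss / 1024


def run_rounds(func, rounds, calls_per_round):
    """Call `func` repeatedly and record the peak RSS after each round."""
    peaks = []
    for round_number in range(rounds):
        for _ in range(calls_per_round):
            func()
        peaks.append(max_rss_mb())
        print(f"Round {round_number} Max RSS: {peaks[-1]}")
    return peaks


if __name__ == "__main__":
    run_rounds(lambda: "-".join(["hiroot1234"] * 3), rounds=3, calls_per_round=1000)
```

Because the value is a high-water mark it never decreases; a leak shows up as a peak that keeps climbing round after round instead of flattening out.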
|
||
#### 2.4. Run the container with valgrind | ||
|
||
- `--tool`: default: memcheck, other options: cachegrind, callgrind, helgrind, drd, massif, dhat, lackey, none, exp-bbv | ||
- memcheck: | ||
- `--leak-check`: options no/summary/yes/full (`yes` is the same as `full`) | |
- massif: heap profiler, see below | |
- `--track-origins`: when set to `yes`, Memcheck tracks the origin of uninitialised values, which slows the run down and uses more memory | |
- `--suppressions`: path to our suppression file: `scripts/iast/valgrind-python.supp` | |
- `--log-file`: Valgrind reports a lot of information, so we store it in a file and analyze the reports carefully afterwards | |
|
||
docker run --rm -it -v ${PWD}:/ddtrace python_311_debug /bin/bash -c "cd /ddtrace && source scripts/iast/.env && \ | ||
valgrind --tool=memcheck --leak-check=full --log-file=scripts/iast/valgrind_bench_overload.out --track-origins=yes \ | ||
--suppressions=scripts/iast/valgrind-python.supp --show-leak-kinds=all \ | ||
python3.11 scripts/iast/test_leak_functions.py 100" | ||
|
||
##### Understanding results of memcheck | ||
|
||
Valgrind Memcheck reports traces from every C and C++ file it sees, and most of them come from the Python core. Some of | |
these could be memory leaks triggered by our Python code, but we can't interpret them at the moment, so all of them are | |
listed in the suppression file. | |
|
||
|
||
Valid traces from our C files look like this: | |
``` | ||
==324555== 336 bytes in 1 blocks are possibly lost in loss record 4,806 of 5,852 | ||
==324555== at 0x483DD99: calloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so) | ||
==324555== by 0x40149CA: allocate_dtv (dl-tls.c:286) | ||
==324555== by 0x40149CA: _dl_allocate_tls (dl-tls.c:532) | ||
==324555== by 0x486E322: allocate_stack (allocatestack.c:622) | ||
==324555== by 0x486E322: pthread_create@@GLIBC_2.2.5 (pthread_create.c:660) | ||
==324555== by 0xFBF078E: ??? (in /root/ddtrace/native-core.so) | ||
==324555== by 0x19D312C7: ??? | ||
==324555== by 0x1FFEFEFAFF: ??? | ||
==324555== | ||
``` |
@@ -0,0 +1,76 @@ | ||
import os | ||
import random | ||
import subprocess | ||
|
||
import requests | ||
|
||
from ddtrace.appsec._iast._utils import _is_iast_enabled | ||
|
||
|
||
if _is_iast_enabled(): | ||
from ddtrace.appsec._iast._taint_tracking import OriginType | ||
from ddtrace.appsec._iast._taint_tracking import taint_pyobject | ||
|
||
|
||
def test_doit(): | ||
origin_string1 = "hiroot" | ||
|
||
if _is_iast_enabled(): | ||
tainted_string_2 = taint_pyobject( | ||
pyobject="1234", source_name="abcdefghijk", source_value="1234", source_origin=OriginType.PARAMETER | ||
) | ||
else: | ||
tainted_string_2 = "1234" | ||
|
||
string1 = str(origin_string1) # String with 1 propagation range | ||
string2 = str(tainted_string_2) # String with 1 propagation range | ||
|
||
string3 = string1 + string2 # 2 propagation ranges: hiroot1234 | ||
string4 = "-".join([string3, string3, string3]) # 6 propagation ranges: hiroot1234-hiroot1234-hiroot1234 | ||
string5 = string4[0:20] # 1 propagation range: hiroot1234-hiroot123 | ||
string6 = string5.title() # 1 propagation range: Hiroot1234-Hiroot123 | ||
string7 = string6.upper() # 1 propagation range: HIROOT1234-HIROOT123 | ||
string8 = "%s_notainted" % string7 # 1 propagation range: HIROOT1234-HIROOT123_notainted | ||
string9 = "notainted_{}".format(string8) # 1 propagation range: notainted_HIROOT1234-HIROOT123_notainted | ||
string10 = "nottainted\n" + string9 # 2 propagation ranges: nottainted\nnotainted_HIROOT1234-HIROOT123_notainted | |
string11 = string10.splitlines()[1] # 1 propagation range: notainted_HIROOT1234-HIROOT123_notainted | ||
string12 = string11 + "_notainted" # 1 propagation range: notainted_HIROOT1234-HIROOT123_notainted_notainted | ||
string13 = string12.rsplit("_", 1)[0] # 1 propagation range: notainted_HIROOT1234-HIROOT123_notainted | ||
|
||
try: | ||
# Path traversal vulnerability | ||
m = open("/" + string13 + ".txt") | ||
_ = m.read() | ||
except Exception: | ||
pass | ||
|
||
try: | ||
# Command Injection vulnerability | ||
_ = subprocess.Popen("ls " + string9) | ||
except Exception: | ||
pass | ||
|
||
try: | ||
# SSRF vulnerability | ||
requests.get("http://" + "foobar") | ||
# urllib3.request("GET", "http://" + "foobar") | ||
except Exception: | ||
pass | ||
|
||
# Weak Randomness vulnerability | ||
_ = random.randint(1, 10) | ||
|
||
# os path propagation | ||
string14 = os.path.join(string13, "a") # 1 propagation range: notainted_HIROOT1234-HIROOT123_notainted/a | ||
string15 = os.path.split(string14)[0] # 1 propagation range: notainted_HIROOT1234-HIROOT123_notainted | ||
string16 = os.path.dirname( | ||
string15 + "/" + "foobar" | ||
) # 1 propagation range: notainted_HIROOT1234-HIROOT123_notainted | ||
string17 = os.path.basename("/foobar/" + string16) # 1 propagation range: notainted_HIROOT1234-HIROOT123_notainted | ||
string18 = os.path.splitext(string17 + ".jpg")[0] # 1 propagation range: notainted_HIROOT1234-HIROOT123_notainted | ||
string19 = os.path.normcase(string18) # 1 propagation range: notainted_HIROOT1234-HIROOT123_notainted | ||
string20 = os.path.splitdrive(string19)[1] # 1 propagation range: notainted_HIROOT1234-HIROOT123_notainted | ||
|
||
expected = "notainted_HIROOT1234-HIROOT123_notainted" # noqa: F841 | ||
# assert string20 == expected | ||
return string20 |
@@ -0,0 +1 @@ | ||
memray==1.12.0 |
@@ -0,0 +1,3 @@ | ||
PYTHON="${PYTHON_VERSION:-python3.11}" | ||
$PYTHON -m pip install -r scripts/iast/requirements.txt | ||
$PYTHON scripts/iast/test_leak_functions.py 1000000 |
@@ -0,0 +1,4 @@ | ||
PYTHON="${PYTHON_VERSION:-python3.11}" | ||
$PYTHON -m pip install -r scripts/iast/requirements.txt | ||
$PYTHON -m memray run --trace-python-allocators --aggregate --native -o lel.bin -f scripts/iast/test_leak_functions.py 100 | ||
$PYTHON -m memray flamegraph lel.bin --leaks -f |
@@ -0,0 +1,4 @@ | ||
PYTHON="${PYTHON_VERSION:-python3.11}" | ||
# $PYTHON setup.py build_ext --inplace | ||
${PYTHON} -m pip install -r scripts/iast/requirements.txt | ||
${PYTHON} -m ddtrace.commands.ddtrace_run ${PYTHON} scripts/iast/test_references.py |