Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat!: Detect anomalous Datadog traces; upgrade to Python 3.11 #757

Merged
merged 2 commits into from
Aug 5, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion .github/workflows/ci.yml
Original file line number Diff line number Diff line change
Expand Up @@ -10,7 +10,7 @@ jobs:
runs-on: ubuntu-latest
strategy:
matrix:
python-version: ['3.8', '3.12']
python-version: ['3.11', '3.12']
tox-env: [quality, django42]
steps:
- uses: actions/checkout@v4
Expand Down
2 changes: 1 addition & 1 deletion .github/workflows/publish.yml
Original file line number Diff line number Diff line change
Expand Up @@ -16,7 +16,7 @@ jobs:
- name: setup python
uses: actions/setup-python@v4
with:
python-version: 3.8
python-version: 3.11

- name: Install pip
run: pip install -r requirements/pip.txt
Expand Down
2 changes: 1 addition & 1 deletion .github/workflows/test_publish.yml
Original file line number Diff line number Diff line change
Expand Up @@ -10,7 +10,7 @@ jobs:
- uses: actions/checkout@v4
- uses: actions/setup-python@v4
with:
python-version: '3.8'
python-version: '3.11'
architecture: x64
- name: Install pip
run: pip install -r requirements/pip.txt
Expand Down
10 changes: 10 additions & 0 deletions CHANGELOG.rst
Original file line number Diff line number Diff line change
Expand Up @@ -14,6 +14,16 @@ Change Log
Unreleased
~~~~~~~~~~

[4.0.0] - 2024-08-05
~~~~~~~~~~~~~~~~~~~~
Changed
-------
* Dropped support for Python 3.8; only testing with 3.11 and above now.

Added
-----
* ``DatadogDiagnosticMiddleware`` can now detect and log anomalous traces, enabled by Waffle flag ``datadog.diagnostics.detect_anomalous_trace``

[3.6.0] - 2024-07-24
~~~~~~~~~~~~~~~~~~~~
Added
Expand Down
2 changes: 1 addition & 1 deletion README.rst
Original file line number Diff line number Diff line change
Expand Up @@ -24,7 +24,7 @@ One Time Setup
cd edx-arch-experiments

# Set up a virtualenv using virtualenvwrapper with the same name as the repo and activate it
mkvirtualenv -p python3.8 edx-arch-experiments
mkvirtualenv -p python3.11 edx-arch-experiments

Local testing
~~~~~~~~~~~~~
Expand Down
2 changes: 1 addition & 1 deletion docs/conf.py
Original file line number Diff line number Diff line change
Expand Up @@ -469,7 +469,7 @@ def get_version(*file_paths):

# Example configuration for intersphinx: refer to the Python standard library.
intersphinx_mapping = {
'python': ('https://docs.python.org/3.8', None),
'python': ('https://docs.python.org/3.11', None),
'django': ('https://docs.djangoproject.com/en/2.2/', 'https://docs.djangoproject.com/en/2.2/_objects/'),
'model_utils': ('https://django-model-utils.readthedocs.io/en/latest/', None),
}
Expand Down
2 changes: 1 addition & 1 deletion edx_arch_experiments/__init__.py
Original file line number Diff line number Diff line change
Expand Up @@ -2,4 +2,4 @@
A plugin to include applications under development by the architecture team at 2U.
"""

__version__ = '3.6.0'
__version__ = '4.0.0'
7 changes: 0 additions & 7 deletions edx_arch_experiments/datadog_diagnostics/README.rst
Original file line number Diff line number Diff line change
Expand Up @@ -4,10 +4,3 @@ Datadog Diagnostics
When installed in the LMS as a plugin app, the ``datadog_diagnostics`` app adds additional logging for debugging our Datadog integration.

This is intended as a temporary situation while we debug the `trace concatenation issue <https://github.com/edx/edx-arch-experiments/issues/692>`_.

Usage
*****

In LMS:

- Install ``edx-arch-experiments`` as a dependency
33 changes: 32 additions & 1 deletion edx_arch_experiments/datadog_diagnostics/middleware.py
Original file line number Diff line number Diff line change
@@ -1,14 +1,31 @@
"""
Diagnostic middleware for Datadog.

To use, install edx-arch-experiments and add
``edx_arch_experiments.datadog_diagnostics.middleware.DatadogDiagnosticMiddleware``
to ``MIDDLEWARE``, then set the below settings as needed.
"""

import logging
import time

from django.core.exceptions import MiddlewareNotUsed
from edx_toggles.toggles import WaffleFlag

log = logging.getLogger(__name__)

# .. toggle_name: datadog.diagnostics.detect_anomalous_trace
# .. toggle_implementation: WaffleFlag
# .. toggle_default: False
# .. toggle_description: Enables logging of anomalous Datadog traces for web requests.
# .. toggle_warning: This is a noisy feature and so it should only be enabled
# for a percentage of requests.
# .. toggle_use_cases: temporary
# .. toggle_creation_date: 2024-08-01
# .. toggle_target_removal_date: 2024-11-01
# .. toggle_tickets: https://github.com/edx/edx-arch-experiments/issues/692
DETECT_ANOMALOUS_TRACE = WaffleFlag('datadog.diagnostics.detect_anomalous_trace', module_name=__name__)

# .. toggle_name: datadog.diagnostics.log_root_span
# .. toggle_implementation: WaffleFlag
# .. toggle_default: False
Expand Down Expand Up @@ -43,6 +60,8 @@ def __init__(self, get_response):
# If import fails, don't even load this middleware.
raise MiddlewareNotUsed # pylint: disable=raise-missing-from

self.worker_start_epoch = time.time()

def __call__(self, request):
return self.get_response(request)

Expand All @@ -63,9 +82,21 @@ def log_diagnostics(self, request):
"""
Contains all the actual logging logic.
"""
local_root_span = self.dd_tracer.current_root_span()

if DETECT_ANOMALOUS_TRACE.is_enabled():
root_duration_s = local_root_span.duration
if root_duration_s is not None:
worker_run_time_s = time.time() - self.worker_start_epoch
log.warning(
f"Anomalous Datadog local root span (duration already set): "
f"id = {local_root_span.trace_id:x}; "
f"duration = {root_duration_s:0.3f} sec; "
f"worker age = {worker_run_time_s:0.3f} sec"
)

if LOG_ROOT_SPAN.is_enabled():
route_pattern = getattr(request.resolver_match, 'route', None)
local_root_span = self.dd_tracer.current_root_span()
current_span = self.dd_tracer.current_span()
# pylint: disable=protected-access
log.info(
Expand Down
110 changes: 110 additions & 0 deletions edx_arch_experiments/datadog_diagnostics/tests/test_middleware.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,110 @@
"""
Tests for diagnostic middleware.
"""

import re
from unittest.mock import Mock, patch

import ddt
import ddtrace
from django.test import TestCase

from ..middleware import DETECT_ANOMALOUS_TRACE, LOG_ROOT_SPAN, DatadogDiagnosticMiddleware


def fake_view(_request):
"""Fake get_response for middleware."""


@ddt.ddt
class TestDatadogDiagnosticMiddleware(TestCase):
"""Tests for DatadogDiagnosticMiddleware."""

def setUp(self):
self.middleware = DatadogDiagnosticMiddleware(fake_view)

def run_middleware(self):
"""Run the middleware using a fake request."""
resolver = Mock()
resolver.route = "/some/path"
request = Mock()
request.resolver_match = resolver
self.middleware.process_view(request, None, None, None)

@patch('edx_arch_experiments.datadog_diagnostics.middleware.log.error')
def test_log_diagnostics_error_only_once(self, mock_log_error):
"""
If the log_diagnostics function is broken, only log the error once.
The method should still be called every time in case it is still doing
useful work before the error, though.
"""

bad_method = Mock(side_effect=lambda request: 1/0)
self.middleware.log_diagnostics = bad_method

self.run_middleware()
self.run_middleware()

# Called twice
assert len(bad_method.call_args_list) == 2

# But only log once
mock_log_error.assert_called_once_with(
"Encountered error in DatadogDiagnosticMiddleware (suppressing further errors): "
"ZeroDivisionError('division by zero')"
)

@ddt.data(
# Feature disabled
(False, False),
(False, True),
# Enabled, but nothing anomalous
(True, False),
# Enabled and anomaly detected
(True, True),
)
@ddt.unpack
@patch('edx_arch_experiments.datadog_diagnostics.middleware.log.warning')
def test_anomalous_trace(self, enabled, cause_anomaly, mock_log_warning):
with (
patch.object(DETECT_ANOMALOUS_TRACE, 'is_enabled', return_value=enabled),
patch.object(LOG_ROOT_SPAN, 'is_enabled', return_value=False),
# Need at least two levels of spans in order to fake
# an anomaly. (Otherwise current_root_span returns None.)
ddtrace.tracer.trace("test_local_root"),
ddtrace.tracer.trace("inner_span"),
):
if cause_anomaly:
ddtrace.tracer.current_root_span().finish()
self.run_middleware()

if enabled and cause_anomaly:
mock_log_warning.assert_called_once()
log_msg = mock_log_warning.call_args_list[0][0][0] # first arg of first call
assert re.fullmatch(
r"Anomalous Datadog local root span \(duration already set\): "
r"id = [0-9A-Fa-f]+; duration = [0-9]\.[0-9]{3} sec; worker age = [0-9]\.[0-9]{3} sec",
log_msg
)
else:
mock_log_warning.assert_not_called()

@patch('edx_arch_experiments.datadog_diagnostics.middleware.log.info')
def test_log_root_span(self, mock_log_info):
with (
patch.object(DETECT_ANOMALOUS_TRACE, 'is_enabled', return_value=False),
patch.object(LOG_ROOT_SPAN, 'is_enabled', return_value=True),
# Need at least two levels of spans for interesting logging
ddtrace.tracer.trace("test_local_root"),
ddtrace.tracer.trace("inner_span"),
):
self.run_middleware()

mock_log_info.assert_called_once()
log_msg = mock_log_info.call_args_list[0][0][0] # first arg of first call
assert re.fullmatch(
r"Datadog span diagnostics: Route = /some/path; "
r"local root span = name='test_local_root' .*; "
r"current span = name='inner_span' .*",
log_msg
)
60 changes: 29 additions & 31 deletions requirements/base.txt
Original file line number Diff line number Diff line change
@@ -1,16 +1,16 @@
#
# This file is autogenerated by pip-compile with Python 3.12
# This file is autogenerated by pip-compile with Python 3.11
# by the following command:
#
# make upgrade
#
asgiref==3.7.2
asgiref==3.8.1
# via django
attrs==23.2.0
# via
# jsonschema
# referencing
certifi==2024.2.2
certifi==2024.7.4
# via requests
cffi==1.16.0
# via
Expand All @@ -22,11 +22,11 @@ click==8.1.7
# via
# code-annotations
# edx-django-utils
code-annotations==1.6.0
code-annotations==1.8.0
# via edx-toggles
cryptography==42.0.5
cryptography==43.0.0
# via pyjwt
django==3.2.24
django==4.2.14
# via
# -c https://raw.githubusercontent.com/edx/edx-lint/master/edx_lint/files/common_constraints.txt
# -r requirements/base.in
Expand All @@ -47,75 +47,73 @@ django-waffle==4.1.0
# edx-django-utils
# edx-drf-extensions
# edx-toggles
djangorestframework==3.14.0
djangorestframework==3.15.2
# via
# -r requirements/base.in
# drf-jwt
# edx-drf-extensions
dnspython==2.6.1
# via pymongo
drf-jwt==1.19.2
# via edx-drf-extensions
edx-codejail==3.3.3
edx-codejail==3.4.1
# via -r requirements/base.in
edx-django-utils==5.10.1
edx-django-utils==5.15.0
# via
# -r requirements/base.in
# edx-drf-extensions
# edx-toggles
edx-drf-extensions==10.2.0
edx-drf-extensions==10.3.0
# via -r requirements/base.in
edx-opaque-keys==2.5.1
edx-opaque-keys==2.10.0
# via edx-drf-extensions
edx-toggles==5.1.1
edx-toggles==5.2.0
# via -r requirements/base.in
idna==3.6
idna==3.7
# via requests
jinja2==3.1.3
jinja2==3.1.4
# via code-annotations
jsonschema==4.21.1
jsonschema==4.23.0
# via -r requirements/base.in
jsonschema-specifications==2023.12.1
# via jsonschema
markupsafe==2.1.5
# via jinja2
newrelic==9.7.0
newrelic==9.12.0
# via edx-django-utils
pbr==6.0.0
# via stevedore
psutil==5.9.8
psutil==6.0.0
# via edx-django-utils
pycparser==2.21
pycparser==2.22
# via cffi
pyjwt[crypto]==2.8.0
pyjwt[crypto]==2.9.0
# via
# drf-jwt
# edx-drf-extensions
pymongo==3.13.0
pymongo==4.8.0
# via edx-opaque-keys
pynacl==1.5.0
# via edx-django-utils
python-slugify==8.0.4
# via code-annotations
pytz==2024.1
# via
# django
# djangorestframework
pyyaml==6.0.1
# via code-annotations
referencing==0.33.0
referencing==0.35.1
# via
# jsonschema
# jsonschema-specifications
requests==2.31.0
requests==2.32.3
# via edx-drf-extensions
rpds-py==0.18.0
rpds-py==0.19.1
# via
# jsonschema
# referencing
semantic-version==2.10.0
# via edx-drf-extensions
six==1.16.0
# via edx-codejail
sqlparse==0.4.4
sqlparse==0.5.1
# via django
stevedore==5.2.0
# via
Expand All @@ -124,11 +122,11 @@ stevedore==5.2.0
# edx-opaque-keys
text-unidecode==1.3
# via python-slugify
typing-extensions==4.10.0
typing-extensions==4.12.2
# via edx-opaque-keys
urllib3==2.2.1
urllib3==2.2.2
# via requests

# The following packages are considered to be unsafe in a requirements file:
setuptools==69.1.1
setuptools==72.1.0
# via -r requirements/base.in
Loading
Loading