From baf811543d70a198a929a42378cbd6bb47e3be5b Mon Sep 17 00:00:00 2001
From: Oliver Ruebel
Date: Tue, 5 Mar 2024 10:08:27 -0800
Subject: [PATCH] Add network docs (#35)

Fix #24
- Add documentation about the network tracking and how to use it in the benchmarks
- Update the ``NetworkTracker`` class to track the total time directly, rather than having the ``network_activity_tracker`` track the time separately. This change addresses the following issues: 1) ``NetworkTracker`` was missing the total time, so its results differed from those of ``network_activity_tracker``, and 2) ``NetworkTracker.asv_network_statistics`` was not being updated in the ``network_activity_tracker``, so the timing result was not being recorded.
- Update ``NetworkTracker`` and ``network_activity_tracker`` to allow the user to optionally set the process ID to track. This will be useful if/when we need to run code we want to profile in a separate process (e.g., when running in node.js)

---------

Co-authored-by: Cody Baker <51133164+CodyCBakerPhD@users.noreply.github.com>
---
 README.md | 3 +-
 docs/development.rst | 5 +--
 docs/network_tracking.rst | 24 +++++++++++
 docs/running_benchmarks.rst | 4 +-
 docs/writing_benchmarks.rst | 45 ++++++++++++++++++++-
 src/nwb_benchmarks/core/_network_tracker.py | 42 +++++++++++++------
 6 files changed, 102 insertions(+), 21 deletions(-)
 create mode 100644 docs/network_tracking.rst

diff --git a/README.md b/README.md
index d155a57..c4ed222 100644
--- a/README.md
+++ b/README.md
@@ -17,6 +17,5 @@ pip install -r docs/requirements-rtd.txt
 then build the docs by executing the command...
 ```
-mkdir -p docs/build/html
-sphinx-build -M html docs docs/build/html
+sphinx-build -M html docs docs/build
 ```
diff --git a/docs/development.rst b/docs/development.rst
index 93f48ba..4565ac5 100644
--- a/docs/development.rst
+++ b/docs/development.rst
@@ -112,7 +112,4 @@ which is also indented for improved human readability and line-by-line GitHub tr
 If this ``results`` folder eventually becomes too large for Git to reasonably handle, we will explore options to share via other data storage services.
-Network Tracking
-----------------
-
-Stay tuned https://github.com/NeurodataWithoutBorders/nwb_benchmarks/issues/24
+.. include:: network_tracking.rst
diff --git a/docs/network_tracking.rst b/docs/network_tracking.rst
new file mode 100644
index 0000000..9029de3
--- /dev/null
+++ b/docs/network_tracking.rst
@@ -0,0 +1,24 @@
+.. _network-tracking:
+
+Network Tracking
+----------------
+
+The network tracking is implemented as part of the ``nwb_benchmarks.core`` module and consists of the following main components:
+
+* ``CaptureConnections`` : This class uses the ``psutil`` library to capture network connections and map the connections to process IDs (PIDs). This information is then used downstream to filter network traffic packets by PID, which allows us to distinguish between network traffic generated by us versus other processes running on the same system. See `core/_capture_connections.py `_
+* ``NetworkProfiler`` : This class uses the ``tshark`` command line tool (and ``pyshark`` package) to capture the network traffic (packets) generated by all processes on the system. In combination with ``CaptureConnections`` we can then filter the captured packets to retrieve the packets generated by a particular PID via the ``get_packets_for_connections`` function. See `core/_network_profiler.py `_
+* ``NetworkStatistics`` : This class provides functions for processing the network packets captured by the ``NetworkProfiler`` to compute basic network statistics, such as the number of packets sent/received or the size of the data up/downloaded. The ``get_statistics`` function provides a convenient method to retrieve all the metrics via a single function call. See `core/_network_statistics.py `_
+* ``NetworkTracker`` and ``network_activity_tracker`` : The ``NetworkTracker`` class, and corresponding ``network_activity_tracker`` context manager, build on the functionality implemented in the above modules to make it easy to track and compute network statistics over a given time span during code execution.
+
+.. note::
+
+  ``CaptureConnections`` uses `psutil.net_connections() `_, which requires sudo/root access on macOS and AIX.
+
+.. note::
+
+  Running the network tracking generates additional threads/processes in order to capture traffic while the main code is running: **1)** ``NetworkProfiler.start_capture`` generates a ``subprocess`` for running the ``tshark`` command line tool, which is terminated when ``NetworkProfiler.stop_capture`` is called, and **2)** ``CaptureConnections`` implements a ``Thread`` that runs in the background. The ``NetworkTracker`` automatically starts and terminates these processes/threads, so a user typically does not need to manage these directly.
+
+Typical usage
+^^^^^^^^^^^^^
+
+In most cases, users will use the ``NetworkTracker`` or ``network_activity_tracker`` to track network traffic and statistics as illustrated in :ref:`network-tracking-benchmarks`.
diff --git a/docs/running_benchmarks.rst b/docs/running_benchmarks.rst
index 3c0801b..6914dd1 100644
--- a/docs/running_benchmarks.rst
+++ b/docs/running_benchmarks.rst
@@ -14,7 +14,9 @@ use `psutil net_connections