Draft updated doc for profiler #1144

Open: wants to merge 4 commits into base: master (changes from 2 commits)

47 changes: 39 additions & 8 deletions man/profile.doc
@@ -8,11 +8,22 @@ information-gathering component built into the kernel,%
setitimer() using the \const{SIGPROF} signal and one
using Windows Multi Media (MM) timers. On other
systems the profiler is not provided.}
and a presentation component which is defined in the \pllib{statistics}
and a presentation component which is defined in the \pllib{prolog_profile}
library. The latter can be hooked, which is used by the XPCE module
\pllib{swi/pce_profile} to provide an interactive graphical
frontend for the results.

The information gathering component can be run in two modes, controllable using
the \const{ports} option. The first mode (\const{ports(false)}) minimizes the
memory required to store the call tree, but only accumulates the
\emph{call} port count for each predicate. (See \secref{debugoverview} for more information
on the Prolog ``Byrd Box Model'' or ``4 Port Model''.) This mode is most beneficial for long
profiler sessions on running applications. The second mode (\const{ports(true)})
accumulates correct counts on all four ports, but can require considerably more
memory (code dependent) and is better suited to detailed analysis
(performance tuning) of contained pieces of code, e.g., at the individual query level.
(See \secref{profilegather} for more detail.)
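As a rough illustration of what counts on all four ports represent, the following Python sketch records per-predicate counters and checks one consistency property of the 4 Port Model: assuming a goal runs to completion without exceptions, every entry into the box (call or redo) is matched by a departure (exit or fail). The `PortCounts` class and its field names are hypothetical and are not part of the profiler.

```python
from dataclasses import dataclass


@dataclass
class PortCounts:
    """Hypothetical per-predicate counters for the four Byrd box ports."""
    call: int = 0
    exit: int = 0
    redo: int = 0
    fail: int = 0

    def is_balanced(self) -> bool:
        # Entries (call + redo) must equal departures (exit + fail),
        # assuming the goal ran to completion without exceptions.
        return self.call + self.redo == self.exit + self.fail


# A goal called once that yields three solutions on backtracking and
# then finally fails: 1 call, 3 exits, 3 redos, 1 fail.
counts = PortCounts(call=1, exit=3, redo=3, fail=1)
print(counts.is_balanced())
```

A check of this kind only makes sense when all four counts are collected; with only the call count there is nothing to balance.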

\subsection{Profiling predicates}
\label{sec:profiling-predicates}

@@ -36,10 +47,29 @@ to show_profile/1.
\termitem{time}{+Which}
If \arg{Which} is \const{cpu} (default), collect CPU timing
statistics. If \const{wall}, collect wall time statistics
based on a 5 millisecond sampling rate. Wall time statistics
based on the specified sampling rate (see below). Wall time statistics
can be useful if \arg{Goal} calls blocking system calls.
\end{description}

\begin{description}
\termitem{ports}{+Bool}
If \const{true}, counts on all ports are accumulated at the possible
expense of added memory overhead. If \const{false}, only the \emph{call} port
counts are accumulated, but memory overhead is reduced in some situations.
If neither value is specified, the default is the ``classic'' view which
Member

Is the "classic" view different from specifying either ports(true) or ports(false)? If so, I suggest adding a ports(classic) option.

minimizes the use of memory but displays all port counts, some of which
may be incorrect (\emph{call} port counts are always correct).
\end{description}

\begin{description}
\termitem{sampling_rate}{+Rate}
\arg{Rate} is a numeric value between 1 and 1000 specifying the number of
samples per second (default 200) to be used when gathering timing information.
Note that the accuracy of the timing results is largely independent of the
sampling rate as long as a sufficient number of samples (code dependent)
are collected.
\end{description}
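The relationship between the sampling rate and the interval between samples is simple arithmetic: the default of 200 samples per second corresponds to the 5 millisecond interval that the earlier text mentioned. A minimal Python sketch, in which the function name `sampling_interval_ms` is hypothetical and the range check mirrors the 1 to 1000 limits stated above:

```python
def sampling_interval_ms(rate: float = 200) -> float:
    """Return the time between samples in milliseconds.

    `rate` is in samples per second; the 1..1000 range and the default
    of 200 are taken from the option description, not from the kernel.
    """
    if not 1 <= rate <= 1000:
        raise ValueError("rate must be between 1 and 1000 samples/sec")
    return 1000.0 / rate


print(sampling_interval_ms())      # default rate
print(sampling_interval_ms(1000))  # fastest allowed rate
```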

\predicate{show_profile}{1}{+Options}
This predicate first calls prolog:show_profile_hook/1. If XPCE is
loaded, this hook is used to activate a GUI interface to visualise the
@@ -144,12 +174,12 @@ While the program executes under the profiler, the system builds a
kernel: one that starts a new goal (\emph{profCall}), one that tells the
system which goal is resumed after an \emph{exit} (\emph{profExit}) and
one that tells the system which goal is resumed after a \emph{fail}
(i.e., which goal is used to \emph{retry} (\emph{profRedo})). The
profCall() function finds or creates the subnode for the argument
(i.e., which goal is used to \emph{retry} (\emph{profFail})).
Member

use \exam{...} for examples (not \emph{...})

Member

\const{} is for constants (identifiers, atoms, ...), \exam{} is for (example) inline code fragments (i.e., generic "code"). Functions are written name() (will be styled automatically). Terms may be written as \term{name}{args}, especially if the arguments are placeholders.

Contributor Author

Thanks for proofreading; I'll update the PR to fix the typos.

Regarding "classic" mode: perhaps I'm overly pedantic, but I think a mode which displays incorrect information in some situations is a mistake. It's (too?) easy to enable that mode just by omitting the ports option so, personally, I'd rather not legitimize it further in the doc.

As for the TeX conventions used by SWIP, I just tried to mimic what was there previously, as I don't really understand the nuances or what changes (if any) are required to what's been submitted. This is just a draft, so I assume the editor-in-chief will have the final say for the official version.

Member

@kamahen, Mar 11, 2023

I'm still confused: if I specify ports(false), do I get the current (somewhat wrong) behaviour? Do I get a different behaviour if I don't specify either ports(false) or ports(true)?

If ports(false) gives the same behaviour as not specifying the ports option at all, then there's no problem. If this isn't the case, I don't like it because the API would have no way of specifying the default behaviour.

[I'll leave the markup fixing to Jan -- I often get it wrong because there's no documentation about the markup and I got lost while trying to read the TeX macros]

Contributor Author

There are three different behaviours at the moment. The API (predicate '$profile'/4) supports all three. The user doc "should" explain how each one is specified in the profile/2 predicate (no ports option defaults to "classic" mode).

I'd rather there were just two modes: ports(true) and ports(false); that's how I intend to use it.

\emph{profCall()} finds or creates the subnode for the argument
predicate below the current node, increments the call-count of this link
and returns the sub-node which is recorded in the Prolog stack-frame.
Choice-points are marked with the current profiling node. profExit() and
profRedo() pass the profiling node where execution resumes.
Choice-points are marked with the current profiling node. \emph{profExit()} and
\emph{profFail()} pass the profiling node where execution resumes.

Just using the above algorithm would create a much too big tree due to
recursion. For this reason the system performs detection of recursion.
@@ -159,8 +189,9 @@ detected. For example, call/1 can call a predicate that uses call/1
itself. This can be viewed as a recursive invocation, but this is
generally not desirable. Recursion is currently assumed if the same
predicate \emph{with the same parent} appears higher in the call-graph.
Early experience with some non-trivial programs are
promising.
Acting on mutual recursion causes inaccuracies in the port counts for
\emph{profExit} and \emph{profFail}, which motivates the \const{ports} option
of profile/2 to control which strategy is used for a profiling session.
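The find-or-create step and the recursion rule described in this section can be sketched in Python. This is only one reading of the prose, not the kernel's C code: the names `Node`, `prof_call` and `depth` are hypothetical, and the ancestor search implements one interpretation of "the same predicate with the same parent appears higher in the call-graph".

```python
class Node:
    """Hypothetical call-tree node: one predicate below a given parent."""

    def __init__(self, predicate, parent=None):
        self.predicate = predicate
        self.parent = parent
        self.children = {}  # predicate name -> Node
        self.calls = 0      # call count of the link from parent to this node


def prof_call(current, predicate):
    """Find or create the subnode for `predicate` below `current`."""
    child = current.children.get(predicate)
    if child is None:
        # Assumed recursion rule: reuse an ancestor that is the same
        # predicate and whose parent is the same predicate as `current`.
        ancestor = current
        while ancestor is not None:
            if (ancestor.predicate == predicate
                    and ancestor.parent is not None
                    and ancestor.parent.predicate == current.predicate):
                child = ancestor
                break
            ancestor = ancestor.parent
        if child is None:
            child = Node(predicate, current)
            current.children[predicate] = child
    child.calls += 1
    return child


def depth(node):
    """Number of links between `node` and the root."""
    d = 0
    while node.parent is not None:
        node = node.parent
        d += 1
    return d


# Mutual recursion: a calls b calls a calls b ... for 100 calls in total.
root = Node("<root>")
node = root
for pred in ["a", "b"] * 50:
    node = prof_call(node, pred)
print(depth(node))
```

Without the ancestor search the tree would grow one level per call; with it, the 100 nested calls above are folded into three nodes. The sketch covers only the call side; which node receives the exit and fail counts once a call has been folded into an ancestor is left out.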

The last part of the profiler collects statistics on the CPU time
used in each node. On systems providing setitimer() with