DTrace based on BPF Implementation Plan
This document provides details on the implementation of DTrace on top of existing Linux kernel tracing facilities. It is meant as a guideline for project planning, and provides (to the extent possible) the order of implementation of various components, features, and capabilities of DTrace on BPF.
The document is organized by high level features, providing more detailed implementation information for each of these features. Generally, the implementation will follow this break-up of functionality, but it is certainly anticipated that some deviations will occur. One concrete example would be the string datatype. While the implementation of values of this datatype can be isolated to specific parts of the DTrace on Linux source code, support for this datatype is present in many different areas. Therefore, the implementation of the string datatype can either be done by providing all functionality as a single deliverable, or it can be done by implementing parts of the string datatype support for features that are planned as a specific deliverable, thereby providing a partial string datatype implementation. This document will be updated as these practical decisions about deliverables are made.
There are two approaches possible for implementing built-in variables. We can either generate code directly from the compiler whenever a built-in variable is encountered (which makes it impossible to cache the values unless we make the generated code even more fancy), or we can implement a get_bvar() function to which the compiler will generate calls whenever a built-in variable value is needed. That function could provide for caching of values. The latter approach is currently preferred.
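The preferred approach can be sketched in plain C as follows. This is only an illustration under stated assumptions: the `BVAR_*` identifiers, the `bvar_cache` structure, and `slow_lookup()` (a userspace stand-in for the BPF helper calls a real implementation would make) are hypothetical names, not taken from the DTrace source.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical built-in variable IDs (not the real DTrace values). */
enum { BVAR_CURCPU, BVAR_PID, BVAR_TIMESTAMP, BVAR_MAX };

/* Cache that lives for one probe firing: a valid flag and a slot per variable. */
struct bvar_cache {
    uint8_t  valid[BVAR_MAX];
    uint64_t value[BVAR_MAX];
};

static unsigned lookups;    /* counts expensive lookups, for demonstration only */

/* Stand-in for BPF helper calls; the returned constants are made up. */
static uint64_t slow_lookup(int id)
{
    lookups++;
    switch (id) {
    case BVAR_CURCPU:    return 3;        /* e.g. bpf_get_smp_processor_id() */
    case BVAR_PID:       return 1234;     /* e.g. bpf_get_current_pid_tgid() >> 32 */
    case BVAR_TIMESTAMP: return 1000000;  /* e.g. bpf_ktime_get_ns() */
    default:             return 0;
    }
}

/* Called at the start of each probe firing. */
static void bvar_cache_reset(struct bvar_cache *c)
{
    memset(c, 0, sizeof(*c));
}

/* The function the compiler would generate calls to. */
static uint64_t get_bvar(struct bvar_cache *c, int id)
{
    assert(id >= 0 && id < BVAR_MAX);
    if (!c->valid[id]) {
        c->value[id] = slow_lookup(id);
        c->valid[id] = 1;
    }
    return c->value[id];
}
```

With this shape, a clause that reads the same built-in variable twice pays for one lookup only, which is the caching benefit that direct code generation would not provide as easily.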
Name | Target Release | In Release | Description | Comments |
---|---|---|---|---|
arg0 .. arg9 | 2.0.0-1.2 | Raw probe argument | dctx.argv[0] .. dctx.argv[9] | |
args | 2.0.0-1.11 | Mapped (possibly translated) arguments | ||
caller | 2.0.0-1.6 | Function in which the probe fired | ||
curcpu | 2.0.0-1.2 | CPU on which the probe fired | Use BPF helper: bpf_get_smp_processor_id() | |
curthread | 2.0.0-1.2 | Task in which the probe fired | Use BPF helper: bpf_get_current_task() | |
epid | 2.0.0-1.2 | Enabled probe ID | dctx->mst->epid (resolved at program load time) | |
errno | 2.0.0-1.6 | Return code of the current system call. It will only have a non-zero value during the processing of the syscall return probe. In all other cases it is 0. | Ensure that the current probe is a syscall return probe, and retrieve the value as dctx->mst->argv[0]. Since it is (currently) difficult to determine which probe we're executing for, it might be easier to populate an errno field in dctx->mst from the prologue of a syscall return probe (in all other cases it needs to be 0), and just return that. | |
execname | 2.0.0-1.10 | Executable name of the current task | Use curthread->comm | |
gid | 2.0.0-1.2 | Group ID of the current task | Use BPF helper: bpf_get_current_uid_gid() | |
id | 2.0.0-1.3 | Probe ID | dctx->mst->prid (resolved at program load time) | |
ipl | Interrupt level | We need to be able to determine whether we are in an interrupt or not. | ||
pid | 2.0.0-1.2 | Process ID of current task | Use BPF helper: bpf_get_current_pid_tgid() | |
ppid | 2.0.0-1.3 | Parent process ID of current task | BPF code to read task->real_parent->pid | |
probefunc | 2.0.0-1.6 | Probe description: function name | The plan is to provide a BPF map that maps EPID to probe ID and probe description names | |
probemod | 2.0.0-1.6 | Probe description: module name | The plan is to provide a BPF map that maps EPID to probe ID and probe description names | |
probename | 2.0.0-1.6 | Probe description: probe name | The plan is to provide a BPF map that maps EPID to probe ID and probe description names | |
probeprov | 2.0.0-1.6 | Probe description: provider name | The plan is to provide a BPF map that maps EPID to probe ID and probe description names | |
stackdepth | 2.0.0-1.6 | |||
tid | 2.0.0-1.2 | Use BPF helper: bpf_get_current_pid_tgid() | ||
timestamp | 2.0.0-1.2 | Use BPF helper: bpf_ktime_get_ns() | ||
ucaller | 2.0.0-1.6 | Userspace function calling function where probe fires | ||
uid | 2.0.0-1.2 | Use BPF helper: bpf_get_current_uid_gid() | ||
uregs | 2.0.0-1.12 (kernel 5.15 and later), 2.0.0-1.13 (earlier kernels) | Newer kernels have bpf_task_pt_regs(). Older kernels would need to implement this explicitly for the architecture dtrace runs on. | ||
ustackdepth | 2.0.0-1.6 | |||
vtimestamp | ||||
walltimestamp | 2.0.0-1.6 |
Name | In Release | Description | Comments |
---|---|---|---|
Initial value | 2.0.0-1.0 | DTrace documentation specifies that the initial value of clause-local variables is unspecified. BPF enforces a strict store-before-load policy. In DTrace v2 clause-local variables are always initialized as 0. | |
2.0.0-1.5 | DESIGN CHANGE: No longer applicable. | ||
Allocate space | 2.0.0-1.5 | ||
. (1) size <= 8 bytes | 2.0.0-1.0 | Clause-local variables of 8 bytes or less are allocated on the stack as a 64-bit (8 bytes) value. | |
2.0.0-1.5 | DESIGN CHANGE: Clause-local variables of any size are stored as a sequence of bytes in a BPF map value. | ||
. (2) size > 8 bytes | 2.0.0-1.5 | Clause-local variables of more than 8 bytes are allocated as a sequence of bytes in the 'lvars' BPF map value, each aligned at a 64-bit (8 byte) boundary. | |
. Size-align data | 2.0.0-1.6 | DESIGN CHANGE: Ensure that variable values are stored at their proper alignment rather than all aligned at 8-byte boundaries. | |
Get value | 2.0.0-1.5 | ||
. (1) by value | 2.0.0-1.0 | Only for values of 8 bytes or less. | |
2.0.0-1.5 | For values of any size. | ||
. (2) by reference | 2.0.0-1.5 | ||
. . (a) size <= 8 | 2.0.0-1.5 | DESIGN CHANGE: Reference to the storage location of the clause-local variable in the 'lvars' BPF map value. | |
. . (b) size > 8 | 2.0.0-1.5 | Reference to the storage location of the clause-local variable in the 'lvars' BPF map value. | |
Set value | 2.0.0-1.5 | ||
. (1) size <= 8 | 2.0.0-1.0 | Direct store to the stack location. | |
2.0.0-1.5 | DESIGN CHANGE: Direct store to the storage location of the clause-local variable in the 'lvars' BPF map value. | ||
. (2) size > 8 bytes | 2.0.0-1.5 | memcpy() to the storage location of the clause-local variable in the 'lvars' BPF map value. | |
Name | In Release | Description | Comments |
---|---|---|---|
Initial value | 2.0.0-1.0 | DTrace documentation specifies that the initial value of clause-local variables is unspecified. BPF enforces a strict store-before-load policy. In DTrace v2 clause-local variables are always initialized as 0. | |
2.0.0-1.5 | DESIGN CHANGE: No longer applicable. | ||
Allocate space | 2.0.0-1.5 | ||
. (1) size <= 8 bytes | 2.0.0-1.0 | Clause-local variables of 8 bytes or less are allocated on the stack as a 64-bit (8 bytes) value. | |
2.0.0-1.5 | DESIGN CHANGE: Clause-local variables of any size are stored as a sequence of bytes in a BPF map value. | ||
. (2) size > 8 bytes | 2.0.0-1.5 | Clause-local variables of more than 8 bytes are allocated as a sequence of bytes in the 'gvars' BPF map value, each aligned at a 64-bit (8 byte) boundary. | |
. Size-align data | 2.0.0-1.6 | DESIGN CHANGE: Ensure that variable values are stored at their proper alignment rather than all aligned at 8-byte boundaries. | |
Get value | 2.0.0-1.5 | ||
. (1) by value | 2.0.0-1.0 | Only for values of 8 bytes or less. | |
2.0.0-1.5 | For values of any size. | ||
. (2) by reference | 2.0.0-1.5 | ||
. . (a) size <= 8 | 2.0.0-1.5 | DESIGN CHANGE: Reference to the storage location of the clause-local variable in the 'gvars' BPF map value. | |
. . (b) size > 8 | 2.0.0-1.5 | Reference to the storage location of the clause-local variable in the 'gvars' BPF map value. | |
Set value | 2.0.0-1.5 | ||
. (1) size <= 8 | 2.0.0-1.0 | Direct store to the stack location. | |
2.0.0-1.5 | DESIGN CHANGE: Direct store to the storage location of the clause-local variable in the 'gvars' BPF map value. | ||
. (2) size > 8 bytes | 2.0.0-1.5 | memcpy() to the storage location of the clause-local variable in the 'gvars' BPF map value. | |
Global associative arrays | 2.0.0-1.10 |
Name | In Release | Description | Comments |
---|---|---|---|
Key calculation | 2.0.0-1.9 | This requires a formula that converts the tuple of key values into a unique index into the thread-local storage variable table. | |
Initial value | 2.0.0-1.9 | Per DTrace documentation, the initial value is undefined. | |
Allocate space | |||
. (1) size <= 8 bytes | 2.0.0-1.9 | DESIGN CHANGE: All sizes will be handled the same (see below). | |
. (2) size > 8 bytes | 2.0.0-1.9 | ||
Get value | |||
. (1) by value | 2.0.0-1.9 | ||
. (2) by reference | 2.0.0-1.9 | ||
. . (a) size <= 8 | 2.0.0-1.9 | DESIGN CHANGE: All sizes will be handled the same (see below). | |
. . (b) size > 8 | 2.0.0-1.9 | ||
Set value | |||
. (1) size <= 8 | 2.0.0-1.9 | DESIGN CHANGE: All sizes will be handled the same (see below). | |
. (2) size > 8 bytes | 2.0.0-1.9 | ||
TLS associative arrays | 2.0.0-1.10 |
Name | In Release | Description | Comments |
---|---|---|---|
Consolidated string table | 2.0.0-1.6 | While every DIFO (compiled clause) has its own string table, a global one is necessary to avoid duplicating string constants between BPF programs. | |
Add probe description components to string table | 2.0.0-1.6 | Support for probeprov, probemod, probefunc, and probename requires those strings (for all enabled probes) to be added to the string table. | |
Create 'probes' BPF map | 2.0.0-1.6 | The mapping from probe ID to probeprov, probemod, probefunc, and probename is to be stored in a BPF map so clauses can retrieve the correct values. | |
String length handling | 2.0.0-1.6 | | |
(1) Variable-length prefix | 2.0.0-1.6 | String lengths are stored as variable-length integers ahead of the actual string data bytes. | |
(2) Fixed length prefix | 2.0.0-1.7 | DESIGN CHANGE: Using variable-length integers to store the string length as prefix to the actual string data bytes has been replaced with a fixed 2-byte length prefix to work around BPF verifier complexity issues. | |
(3) No length prefix | 2.0.0-1.10 | DESIGN CHANGE: Use bpf_probe_read_str() to determine string length (copying into unused memory at the end of the string constant table). This allows us to remove the string length prefix altogether. | |
NULL strings support | 2.0.0-1.13 | DTrace strings are fixed size char arrays. NULL (which can be assigned to a string variable) needs a representation and encoding/decoding upon store/load. | |
NULL pointer support | 2.0.0-1.13 | NULL pointers for DTrace strings (see above) and kernel strings need to be supported correctly in all operations. Where necessary, appropriate faults are to be reported. | |
String constants | | | |
Inline constants | 2.0.0-1.6 | Support for string constants explicitly included in source code (i.e. "...") | |
Dynamic ref'd | 2.0.0-1.6 | Support for dynamically referenced string constants (e.g. probename) | |
Dynamic strings | | | |
Kernel strings | 2.0.0-1.11 | Support to retrieve (load) strings from kernel addresses (depends on temporary strings) | |
Userspace strings | 2.0.0-1.11 | Support to retrieve (load) strings from userspace addresses (depends on temporary strings) | |
Functions | | | |
index() | 2.0.0-1.8 | Same as strstr(): blocked by nested loops not accepted by the BPF verifier (hard limit on the number of instructions) | |
rindex() | 2.0.0-1.8 | Same as strstr(): blocked by nested loops not accepted by the BPF verifier (hard limit on the number of instructions) | |
strchr() | 2.0.0-1.8 | Limited functionality, needs tricks in code generation | |
strjoin() | 2.0.0-1.7 | Store sum of lengths, memcpy() 1st string (omit NUL), and memcpy() 2nd string (incl. NUL) | |
strlen() | 2.0.0-1.6, 2.0.0-1.10 | 2.0.0-1.6: Use dt_vint2int(). 2.0.0-1.10: Use bpf_probe_read_str(). | |
strrchr() | 2.0.0-1.8 | Limited functionality, needs tricks in code generation | |
strstr() | 2.0.0-1.8 | Blocked by nested loops not accepted by the BPF verifier (hard limit on the number of instructions) | |
strtok() | 2.0.0-1.9 | Special version of strstr(), same issues with the verifier | |
substr() | 2.0.0-1.7 | Store length of slice, memcpy() slice, and append NUL | |
Due to limitations in the BPF implementation and the strict validation rules of the BPF verifier, string handling requires custom BPF functions. DTrace is designed around a concept of fixed length string space which makes the implementation easier but it also wastes a lot of space. In addition, it would be very expensive to have to perform a strlen() operation as part of string manipulation operations.
Prior to version 2.0.0-1.7, strings were stored as a variable-length integer encoding its length, followed by a 0-terminated sequence of data bytes. This provided a compact format that eliminates the need to recalculate the length of the string.
As of version 2.0.0-1.7, string lengths are stored as a fixed length (2-byte) prefix, followed by a 0-terminated sequence of data bytes. This change was found to be necessary because the BPF verifier had trouble coping with the variable length prefix, causing it to re-evaluate portions of code many more times than necessary. This caused the BPF verification process to reach the limit of 1 million evaluated instructions rather quickly.
As of version 2.0.0-1.10, strings will no longer have their length stored as a prefix to the character data. While the prefix was useful for more optimized operation, the need to limit it to 2 bytes imposed a limitation on strings in DTrace that is not part of the design. In addition, it poses a complication once support is added for reading strings from kernel and userspace memory, because those strings will not have a length prefix. The new design uses the bpf_probe_read_str() BPF helper to copy the source string into unused memory at the end of the string constant table. The BPF helper copies bytes up to the first occurrence of a NUL character, or up to a given number of bytes, whichever is less. Since bpf_probe_read_str() returns the number of bytes copied, including the NUL byte, the source string size is determined as the return value of the BPF helper minus 1.
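The length calculation described above can be illustrated with a small userspace sketch. `read_str_sim()` is a hypothetical stand-in that mimics only the documented return convention of bpf_probe_read_str() (bytes written, including the terminating NUL); it is not the helper itself, and `str_len_via_read()` is likewise an illustrative name.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Userspace stand-in for bpf_probe_read_str(): copy at most size bytes
 * (including the terminating NUL) from src to dst, always NUL-terminate,
 * and return the number of bytes written including the NUL.
 * The real helper can also return a negative error code; that case is
 * not modelled here.
 */
static long read_str_sim(char *dst, size_t size, const char *src)
{
    size_t i;

    assert(size > 0);
    for (i = 0; i < size - 1 && src[i] != '\0'; i++)
        dst[i] = src[i];
    dst[i] = '\0';
    return (long)(i + 1);
}

/* String length = return value minus 1 (the NUL byte is not counted). */
static size_t str_len_via_read(char *scratch, size_t size, const char *src)
{
    return (size_t)read_str_sim(scratch, size, src) - 1;
}
```

Note that when the source string does not fit in the scratch area, the result is the truncated length, which matches the fixed maximum string size that DTrace already imposes.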
The implementation of associative arrays in DTrace based on BPF can learn a lot from the legacy implementation. The limiting conditions are nearly identical. We must work with a limited storage pool that is set aside ahead of time, and we need to be able to support dynamic allocation and deallocation without sleeping. Static analysis of the D code being compiled allows us to determine the maximum element size. The legacy implementation limits allocations for associative array elements to the calculated maximum element size so that all allocations are of the same size.
The main challenge for DTrace based on BPF is the implementation of tuples as index value for associative arrays. Functionality must be implemented to convert a tuple into a semi-unique numeric value that can be used to index a BPF hashmap. This means we need to implement a hash function that takes a tuple and calculates a numeric id based on the component values of the tuple. An associative array can contain elements with index tuples of varying size.
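One way to picture the tuple handling is to concatenate the tuple components into a fixed-size, zero-padded key buffer, so that tuples of varying size still produce keys of a single size. The sketch below is an illustration only: `TUPLE_KEY_SIZE`, `struct tuple_key`, and the helper functions are hypothetical names, and FNV-1a is used purely as an example hash (when the key is handed to a BPF hash map, the map does its own hashing).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical maximum key size; in practice derived from static analysis. */
#define TUPLE_KEY_SIZE 32

struct tuple_key {
    uint8_t bytes[TUPLE_KEY_SIZE];  /* concatenated components, zero-padded */
    size_t  used;                   /* bytes filled so far */
};

static void tuple_key_init(struct tuple_key *k)
{
    memset(k, 0, sizeof(*k));
}

/* Append one tuple component; returns -1 if it does not fit. */
static int tuple_key_add(struct tuple_key *k, const void *elem, size_t len)
{
    if (len > TUPLE_KEY_SIZE - k->used)
        return -1;
    memcpy(k->bytes + k->used, elem, len);
    k->used += len;
    return 0;
}

/* Example only: 64-bit FNV-1a over the whole padded key. */
static uint64_t tuple_key_hash(const struct tuple_key *k)
{
    uint64_t h = 14695981039346656037ULL;
    size_t i;

    for (i = 0; i < TUPLE_KEY_SIZE; i++) {
        h ^= k->bytes[i];
        h *= 1099511628211ULL;
    }
    return h;
}
```

Because a hash is only semi-unique, two different tuples can collide. Using the padded byte sequence itself as the map key avoids that problem, at the cost of a larger key.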
Name | In Release | Description | Comments |
---|---|---|---|
Generic dynamic variable support | 2.0.0-1.10 | Dynamic variables were implemented for TLS variables. This can be made more generic so it can be used for associative array element storage as well. This is consistent with the legacy implementation. | |
Tuple-to-id mapping | 2.0.0-1.10 | Use a BPF hash map (tuples) that uses the concatenation of the tuple's elements as a key, and stores the address of the map value as the associated value. This guarantees that the tuple is associated with a unique numeric value. This will be used as key into the dynamic variable storage map. |
Done 2.0.0-1.10 |
---|
Name | In Release | Description | Comments |
---|---|---|---|
Implement aggregations as variables | 2.0.0-1.4 | Aggregations were implemented as a special kind of action. In the new design, they are a special kind of variable. This requires compiler and disassembler changes. | |
Allocate space | 2.0.0-1.4 | Aggregations are allocated as one or more 8-byte chunks (each storing a uint64_t) in the aggs BPF per-CPU map value. | |
Data update mechanism | 2.0.0-1.4 | Aggregation data is to be updated on a per-CPU basis (i.e. the CPU a probe executes on) in a lock-free, wait-free manner. We use a latch sequence protected data-pair approach. | |
Consumer processing | 2.0.0-1.4 | Processing aggregation data requires retrieving the aggs BPF map and aggregating the data across the active CPUs. | |
Functions | 2.0.0-1.4 | ||
. avg() | 2.0.0-1.4 | ||
. count() | 2.0.0-1.4 | ||
. llquantize() | 2.0.0-1.4 | ||
. lquantize() | 2.0.0-1.4 | ||
. max() | 2.0.0-1.4 | ||
. min() | 2.0.0-1.4 | ||
. quantize() | 2.0.0-1.4 | ||
. stddev() | 2.0.0-1.4 | ||
. sum() | 2.0.0-1.4 | ||
Implement tuple indexed aggregations | 2.0.0-1.11 | Turn the tuple into a unique ID and then use that ID as the index. | |
Aggregations used to be implemented with their own specific per-cpu output buffer (alongside the regular per-cpu trace data output buffer). Space was allocated in the per-cpu buffer for the different aggregation data items (one or more uint64_t values), and the execution of aggregation actions caused these values to be updated. When the consumer retrieved the aggregation buffer for a specific cpu, a buffer swap would take place to ensure that further probe firing would record aggregation data in a cleared buffer while the consumer processed the data in the now inactive buffer.
In the BPF-based design, aggregations are stored in the singleton map value of a per-CPU BPF map (index 0). The map value will be allocated with enough space to hold all aggregations that are in use. Aggregation actions will update the values allocated to the aggregation they operate on. When the consumer is ready to process aggregation data, it will perform a bpf_map_lookup_elem() to retrieve the data for all CPUs at once.
There is a potential complexity: the legacy implementation for DTrace could ensure that the buffer swap did not happen in the midst of probe processing on that specific CPU. The new design needs to guard against processing incomplete data (e.g. for aggregation functions that use 2 data items, one has been updated while the other has not).
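A latch-protected data pair is one way to guard against that. The sketch below follows the general sequence-latch pattern (similar in spirit to the Linux kernel's seqcount latch) and is an assumption about how such a scheme can look, not the code DTrace generates. The names `struct agg_avg`, `agg_avg_update()`, and `agg_avg_read()` are hypothetical. It models avg(), which keeps two data items (a count and a total).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* avg() needs a data pair: the number of values and their sum. */
struct pair {
    uint64_t count;
    uint64_t total;
};

struct agg_avg {
    atomic_uint_fast64_t latch;  /* sequence number; low bit selects the stable copy */
    struct pair copy[2];
};

/*
 * Writer side (one writer per CPU).  While copy[0] is being modified the
 * latch is odd, so readers use copy[1]; while copy[1] is being modified
 * the latch is even, so readers use copy[0].
 */
static void agg_avg_update(struct agg_avg *a, uint64_t val)
{
    atomic_fetch_add(&a->latch, 1);     /* odd: readers switch to copy[1] */
    a->copy[0].count++;
    a->copy[0].total += val;
    atomic_fetch_add(&a->latch, 1);     /* even: readers switch to copy[0] */
    a->copy[1] = a->copy[0];
}

/*
 * Reader side (the consumer).  Retry if the latch moved while copying.
 * A production version would also need race-free accesses to the copies
 * themselves; plain struct copies are used here to keep the sketch short.
 */
static struct pair agg_avg_read(struct agg_avg *a)
{
    struct pair p;
    uint_fast64_t seq;

    do {
        seq = atomic_load(&a->latch);
        p = a->copy[seq & 1];
    } while (atomic_load(&a->latch) != seq);
    return p;
}
```

The reader never blocks the writer, and the writer never waits for a reader, which fits the lock-free, wait-free requirement for code that runs in probe context.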
Name | In Release | Description | Comments |
---|---|---|---|
Speculation buffer management | 2.0.0-1.8 | New functionality for userspace. | |
Speculation ID implementation | 2.0.0-1.8 | ||
Speculation actions | |||
. commit() | 2.0.0-1.8 | Action | |
. discard() | 2.0.0-1.8 | Action | |
. speculate() | 2.0.0-1.8 | Action | |
. speculation() | 2.0.0-1.8 | Subroutine |
Speculative tracing in the legacy implementation used kernel-side buffers for tracing data, and it would copy the tracing data to the main output buffer upon commit (or discard the buffer). This model is difficult to implement without kernel modifications because of how much work is required at the kernel level.
A new implementation will be used for speculative tracing where the handling of speculations is managed at the userspace level instead. When a speculation is started, a unique tag ID will be generated and all tracing data that is generated from that point forward will be tagged with this ID. As userspace processes the output buffer, it will append any tagged data to its respective speculation buffer. When a commit() is encountered, the stored data will be processed as part of the main tracing data stream. If a discard() is encountered, the stored data is dropped.
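The userspace side of this scheme can be sketched as a small record processor. All names here (`struct rec`, `struct consumer`, `consume()`, the fixed `MAX_SPECS` and `MAX_RECS` limits, and the convention that ID 0 means "not speculative") are assumptions made for illustration; the actual record layout and buffer management in DTrace may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_SPECS 4     /* hypothetical number of speculation IDs */
#define MAX_RECS  16    /* hypothetical per-buffer record capacity */

enum rec_kind { REC_DATA, REC_COMMIT, REC_DISCARD };

/* One record from the trace output buffer; spec_id 0 means not speculative. */
struct rec {
    enum rec_kind kind;
    uint32_t      spec_id;
    int           value;
};

struct consumer {
    int    out[MAX_RECS];                  /* main tracing data stream */
    size_t nout;
    int    spec[MAX_SPECS + 1][MAX_RECS];  /* one buffer per speculation ID */
    size_t nspec[MAX_SPECS + 1];
};

static void emit(struct consumer *c, int value)
{
    if (c->nout < MAX_RECS)
        c->out[c->nout++] = value;
}

static void consume(struct consumer *c, const struct rec *r)
{
    uint32_t id = r->spec_id;
    size_t i;

    assert(id <= MAX_SPECS);
    switch (r->kind) {
    case REC_DATA:
        if (id == 0)
            emit(c, r->value);                       /* regular data */
        else if (c->nspec[id] < MAX_RECS)
            c->spec[id][c->nspec[id]++] = r->value;  /* tagged: hold back */
        break;
    case REC_COMMIT:
        for (i = 0; i < c->nspec[id]; i++)           /* replay held data */
            emit(c, c->spec[id][i]);
        c->nspec[id] = 0;
        break;
    case REC_DISCARD:
        c->nspec[id] = 0;                            /* drop held data */
        break;
    }
}
```

Feeding the records `data(1)`, `spec1(10)`, `spec2(20)`, `spec1(11)`, `discard(2)`, `commit(1)`, `data(2)` through `consume()` leaves `1, 10, 11, 2` in the main stream: the committed speculation appears at the point of the commit, and the discarded one never appears.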
Name | Target Release | In Release | Description | Comments |
---|---|---|---|---|
breakpoint() | ||||
chill(int) | ||||
clear(@) | 2.0.0-1.12 | |||
commit(int) | 2.0.0-1.8 | See Speculations. | ||
denormalize(@) | 2.0.0-1.5 | |||
discard(int) | 2.0.0-1.8 | See Speculations. | ||
exit(int) | 2.0.0-1.0 | |||
freopen(@, ...) | 2.0.0-1.3 | Variant of the printf() action. | ||
ftruncate() | 2.0.0-1.9 | |||
func(uintptr_t) | 2.0.0-1.6 | |||
jstack([uint32_t], [uint32_t]) | ||||
mod(uintptr_t) | 2.0.0-1.6 | |||
normalize(@, [uint64]) | 2.0.0-1.5 | |||
panic() | ||||
pcap(void*, int) | 2.0.0-1.14 | It is unclear how we can do this without kernel support, unless BPF already provides access to this data. | ||
print() | 2.0.0-1.14 | |||
printa(@, ...) | 2.0.0-1.4 | |||
printf(string, ...) | 2.0.0-1.2 | |||
raise(int) | 2.0.0-1.2 | Send a signal to the running task. Use BPF helper: bpf_send_signal() | ||
setopt(const char *, [const char *]) | 2.0.0-1.11 | |||
speculate(int) | 2.0.0-1.8 | See Speculations. | ||
stack([uint32_t]) | 2.0.0-1.6 | |||
stop() | ||||
sym(uintptr_t) | ||||
system(@, ...) | 2.0.0-1.3 | Variant of the printf() action. | ||
trace(@) | 2.0.0-1.0, 2.0.0-1.2, 2.0.0-1.6, 2.0.0-1.10 | 2.0.0-1.0: Raw numeric values as signed 64-bit integers. 2.0.0-1.2: [Orabug: 31407534] Enhancement of DTrace v1 behaviour by providing consistent output based on sign and bit-width of the value. 2.0.0-1.6: Adding support for strings. 2.0.0-1.10: Adding support for arrays, structs, and unions. | Numeric values only. | |
tracemem(@, size_t, [size_t]) | 2.0.0-1.12 | |||
trunc(@, [uint64_t]) | 2.0.0-1.14 | |||
uaddr(uintptr_t) | 2.0.0-1.6 | |||
ufunc(uintptr_t) | 2.0.0-1.6 | |||
umod(uintptr_t) | 2.0.0-1.6 | |||
ustack([uint32_t], [uint32_t]) | 2.0.0-1.6 | |||
usym(uintptr_t) | 2.0.0-1.6 |
Name | Target Release | In Release | Description | Comments |
---|---|---|---|---|
alloca | 2.0.0-1.10 | |||
basename | 2.0.0-1.9 | |||
bcopy | 2.0.0-1.10 | See BPF function memcpy(). Depends on alloca(), which depends on dynamic memory management | ||
cleanpath | 2.0.0-1.14 | |||
copyin | 2.0.0-1.11 | |||
copyinstr | 2.0.0-1.11 | |||
copyinto | 2.0.0-1.11 | |||
copyout | 2.0.0-1.12 | |||
copyoutstr | 2.0.0-1.12 | |||
d_path | Added for Linux - we need to investigate if it is still needed. | Linux 5.9 adds a BPF d_path helper. See 6e22ab9da793 "bpf: Add d_path helper". Note the limitations imposed by that d_path helper (it needs to run without locking, so the supported cases are limited to a few). | ||
dirname | 2.0.0-1.9 | |||
getmajor | 2.0.0-1.10 | |||
getminor | 2.0.0-1.10 | |||
htonl | 2.0.0-1.8 | |||
htonll | 2.0.0-1.8 | |||
htons | 2.0.0-1.8 | |||
index | 2.0.0-1.8 | See #Strings | ||
inet_ntoa | 2.0.0-1.10 | |||
inet_ntoa6 | 2.0.0-1.14 | |||
inet_ntop | 2.0.0-1.14 | |||
link_ntop | 2.0.0-1.14 | |||
lltostr | 2.0.0-1.8 | |||
msgdsize | Specific to Solaris - won't get implemented. | |||
msgsize | Specific to Solaris - won't get implemented. | |||
mutex_owned | 2.0.0-1.10 | |||
mutex_owner | 2.0.0-1.10 | |||
mutex_type_adaptive | 2.0.0-1.10 | |||
mutex_type_spin | 2.0.0-1.10 | |||
ntohl | 2.0.0-1.8 | |||
ntohll | 2.0.0-1.8 | |||
ntohs | 2.0.0-1.8 | |||
progenyof | 2.0.0-1.10 | |||
rand | 2.0.0-1.9 | |||
rindex | 2.0.0-1.8 | See #Strings | ||
rw_iswriter | 2.0.0-1.10 | |||
rw_read_held | 2.0.0-1.10 | |||
rw_write_held | 2.0.0-1.10 | |||
speculation | 2.0.0-1.8 | See Speculations. | ||
strchr | 2.0.0-1.8 | See #Strings | ||
strjoin | 2.0.0-1.7 | See #Strings | ||
strlen | 2.0.0-1.6 | See #Strings | ||
strrchr | 2.0.0-1.8 | See #Strings | ||
strstr | 2.0.0-1.8 | See #Strings | ||
strtok | 2.0.0-1.9 | See #Strings | ||
substr | 2.0.0-1.7 | See #Strings |
They are working as well as in V1. We use the kernel backtrace capability (BPF helper function).
DONE (2.0.0-1.6) |
---|
Name | In Release | Description | Comments |
---|---|---|---|
stack space | 2.0.0-1.6 | ||
string space | 2.0.0-1.7 | ||
alloca space | 2.0.0-1.10 |
Various internal functions such as string manipulation functions and stack trace retrieval need temporary storage. The D language supports explicit allocation of some temporary storage (for the duration of a clause) using the alloca() function. This represents two distinct cases of scratch space needs. The legacy implementation used a rather simplistic but guaranteed approach: all storage requests were cumulative. This means that there was no reuse of any scratch space within the execution of an action. This made space management very easy but it also wasted space.
Since DTrace based on BPF uses entire clauses rather than actions as compilation units, the waste of scratch space is more significant. On the other hand, because strings have a known maximum size and stack trace depth has an upper limit, we can determine a fixed storage size that is sufficient to satisfy the needs of string functions (even in the common case of nested function calls) and of stack trace retrieval functions. We also know that an expression cannot combine string functions and stack trace functions, so the same scratch space can be used to support both needs.
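For reference, the cumulative approach attributed to the legacy implementation amounts to a bump allocator: every request advances an offset, and nothing is reused until the whole area is reset. The sketch below is an illustration under assumptions; `SCRATCH_SIZE`, `struct scratch`, the function names, and the 8-byte alignment are all hypothetical choices, not values from the DTrace source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SCRATCH_SIZE 256    /* hypothetical size of the scratch area */

struct scratch {
    uint8_t mem[SCRATCH_SIZE];
    size_t  next;           /* offset of the first unused byte */
};

/* Cumulative allocation: space is handed out once and never reused. */
static void *scratch_alloc(struct scratch *s, size_t len)
{
    size_t start = (s->next + 7) & ~(size_t)7;  /* assumed 8-byte alignment */

    if (start > SCRATCH_SIZE || len > SCRATCH_SIZE - start)
        return NULL;                            /* out of scratch space */
    s->next = start + len;
    return s->mem + start;
}

/* Everything is released at once, e.g. when the unit of execution ends. */
static void scratch_reset(struct scratch *s)
{
    s->next = 0;
}
```

The cost of this simplicity is visible in the offset arithmetic: two 10-byte requests consume 26 bytes of the area, and that space stays unavailable until the reset, which is why a fixed, shared region sized for the worst case is attractive when whole clauses are the compilation unit.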
DONE (2.0.0-1.10) |
---|
Description needed
DONE (2.0.0-1.2) |
---|
Description needed
DONE (2.0.0-1.2) |
---|
DONE - BEGIN and END probes
Name | In Release | Description | Comments |
---|---|---|---|
Create probes | 2.0.0-1.0 | ||
Determine probe info | 2.0.0-1.0 | ||
Implement trampoline | 2.0.0-1.0 | ||
Probe cleanup | 2.0.0-1.0 | ||
BEGIN probe semantics | 2.0.0-1.3 | BEGIN probe must fire before all other probes | |
END probe semantics | 2.0.0-1.3 | END probe must fire after all other probes |
DONE - ERROR probe
Name | In Release | Description | Comments |
---|---|---|---|
Create probe | 2.0.0-1.5 | ||
Determine probe info | 2.0.0-1.5 | ||
Implement trampoline | 2.0.0-1.5 |
DONE - profile probes
Name | In Release | Description | Comments |
---|---|---|---|
Create probe | 2.0.0-1.2 | ||
Determine probe info | 2.0.0-1.2 | ||
Implement trampoline | 2.0.0-1.2 |
DONE - tick probes
Name | In Release | Description | Comments |
---|---|---|---|
Create probe | 2.0.0-1.2 | ||
Determine probe info | 2.0.0-1.2 | ||
Implement trampoline | 2.0.0-1.2 |
Name | In Release | Description | Comments |
---|---|---|---|
Discover probe points | 2.0.0-0.8 | ||
Create probes | 2.0.0-0.8 | ||
Determine probe info | 2.0.0-0.8 | ||
Implement trampoline | 2.0.0-0.8 |
Name | In Release | Description | Comments |
---|---|---|---|
Discover probe points | 2.0.0-0.8 | ||
Create probes | 2.0.0-0.8 | ||
Determine probe info | 2.0.0-0.8 | ||
Implement trampoline | 2.0.0-0.8 | ||
Probe cleanup | 2.0.0-1.0 |
Name | Target Release | In Release | Description | Comments |
---|---|---|---|---|
Discover probe points | 2.0.0-0.8 | |||
Create probes | 2.0.0-0.8 | |||
Determine probe info | 2.0.0-0.8 | |||
Implement trampoline | 2.0.0-0.8 | |||
Decode arguments from context based on type | 2.0.0-1.11 | |||
Standard DTrace probes | Provide the standard set of DTrace SDT probes. | This requires a kernel patch. Where DTrace v1 provided a DTrace-specific way to define SDT probes and kernel tracepoints were exposed using that same mechanism, DTrace v2 will need to do the opposite: implement the DTrace SDT probes as kernel tracepoints. It may be possible to re-use some of the existing kernel tracepoints (possibly requiring argument mapping and translators). | ||
Standard SDT providers | See detailed list here: [DTrace Statically Defined Tracepoints](https://github.com/oracle/dtrace-utils/wiki/DTrace:-Statically-Defined-Tracepoints-(SDT)) | |||
(1)io | 2.0.0-1.14 | |||
(2)ip | 2.0.0-1.14 | |||
(3)lockstat | 2.0.0-1.13 (partial) | |||
(4)nfsv3 | ||||
(5)nfsv4 | ||||
(6)proc | 2.0.0-1.12 (partial), 2.0.0-1.13 (complete) | |||
(7)sched | 2.0.0-1.13 (partial) | |||
(8)sysevent | ||||
(9)tcp | ||||
(10)udp | ||||
(11)sysevent |
Name | In Release | Description | Comments |
---|---|---|---|
Discover probe points | 2.0.0-1.12 | ||
Create probes | 2.0.0-1.12 | ||
Determine probe info | 2.0.0-1.12 | ||
Implement trampoline | 2.0.0-1.12 | ||
Probe cleanup | 2.0.0-1.12 | ||
The implementation of the PID provider (function boundary tracing for userspace and arbitrary instruction tracing for userspace) is built upon the uprobes support provided by the Linux kernel through the perf event sub-system. Because PID probes are grouped in process-specific providers whereas the underlying uprobes are inode based, a level of indirection is needed to implement PID probes in DTrace. This is done by associating the process-specific probes with their corresponding uprobes-based "real" probe (belonging to a generic, system-wide "pid" provider). In other words, when a pid probe is requested for pid1234:a.out:func:entry, we first create a pid-provider probe pid:0x12ab:func:entry (representing the underlying uprobe) if it does not exist yet. We then associate the pid1234:a.out:func:entry probe with that pid:0x12ab:func:entry probe.
When the final probe program for that pid:0x12ab:func:entry probe is being constructed, clauses for all associated probes are included in the program, with a conditional preceding each to ensure the correct clauses are executed based on the PID of the process that triggered the probe.
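The shape of such a combined program can be sketched in plain C as a list of clauses, each guarded by a PID check. This is a model of the control flow only: `struct pid_clause`, `run_uprobe_program()`, and the two sample clauses are hypothetical, and the real program is generated BPF code rather than a table of function pointers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef void (*clause_fn)(int *fired);

/* One clause of a process-specific probe attached to a shared uprobe. */
struct pid_clause {
    uint32_t  pid;  /* process the associated pidNNN probe belongs to */
    clause_fn fn;   /* clause body */
};

/* Sample clause bodies; they only record that they ran. */
static void clause_a(int *fired) { *fired += 1; }
static void clause_b(int *fired) { *fired += 10; }

/*
 * Program for the underlying uprobe: every associated clause is present,
 * each preceded by a conditional on the PID of the triggering process.
 */
static void run_uprobe_program(const struct pid_clause *cl, size_t n,
                               uint32_t cur_pid, int *fired)
{
    size_t i;

    assert(cl != NULL && fired != NULL);
    for (i = 0; i < n; i++)
        if (cl[i].pid == cur_pid)
            cl[i].fn(fired);
}
```

With clauses registered for PIDs 1234, 5678, and 1234 again, a firing in process 1234 runs the first and third clause, a firing in process 5678 runs only the second, and a firing in any other process runs none.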
Additional changes that are needed for the pid provider include changing the notification mechanism for process death to use an eventfd so that the epoll_wait() that waits for trace buffer data will also be triggered for process death notifications.
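The eventfd idea can be illustrated with a minimal Linux-only sketch that uses the standard eventfd(2) and epoll(7) interfaces. The tag values and the helper names `watch_fd()`, `notify_death()`, and `wait_tag()` are hypothetical and chosen for this example; how DTrace wires this into its consumer loop is not shown.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/epoll.h>
#include <sys/eventfd.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical tags that let the wait loop tell event sources apart. */
enum { TAG_TRACE_BUF = 1, TAG_PROC_DEATH = 2 };

/* Register fd with the epoll instance, remembering which source it is. */
static int watch_fd(int epfd, int fd, uint32_t tag)
{
    struct epoll_event ev = { .events = EPOLLIN, .data.u32 = tag };

    assert(epfd >= 0 && fd >= 0);
    return epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev);
}

/* Called by the process-monitoring code when a traced process dies. */
static int notify_death(int efd)
{
    uint64_t one = 1;

    return write(efd, &one, sizeof(one)) == (ssize_t)sizeof(one) ? 0 : -1;
}

/* Wait for one event; returns 1 and sets *tag, 0 on timeout, -1 on error. */
static int wait_tag(int epfd, int timeout_ms, uint32_t *tag)
{
    struct epoll_event ev;
    int n = epoll_wait(epfd, &ev, 1, timeout_ms);

    if (n == 1)
        *tag = ev.data.u32;
    return n;
}
```

The trace buffer file descriptors would be registered with `TAG_TRACE_BUF` in the same epoll instance, so that a single epoll_wait() call wakes up for either trace data or a process death notification.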
Name | In Release | Description | Comments |
---|---|---|---|
Discover probe points | 2.0.0-1.5 | ||
Create probes | 2.0.0-1.5 | ||
Implement dynamic provider creation | 2.0.0-1.5 | ||
Implement dynamic probe creation | 2.0.0-1.5 | ||
Implement probe and provider cleanup upon error during compilation | 2.0.0-1.5 | ||
Implement meta-probe support | 2.0.0-1.5 | ||
Determine probe info | 2.0.0-1.5 | ||
Implement trampoline. Implement provider-controlled clause call code generation | 2.0.0-1.5 | ||
Probe cleanup | 2.0.0-1.5 | ||
Offset Probes | 2.0.0-1.14 |
Name | Target Release | In Release | Description | Comments |
---|---|---|---|---|
Discover probe points | 2.0.0-1.11 | |||
Create probes | 2.0.0-1.11 | |||
Determine probe info | 2.0.0-1.11 | |||
Implement trampoline | 2.0.0-1.11 | |||
Probe cleanup | 2.0.0-1.11 | |||
Is Enabled Probes | ||||
Translated Probe args | ||||
Blocking ioctl | ||||
Systemwide probing | ||||
Wildcard probes | 2.0.0-1.13 | |||
Ability to operate on probes inserted by other tools |
The cpc (CPU Performance Counter) provider on Solaris allowed users to specify probes that sample hardware counters such as cache misses. Typical usage is for understanding which executables or call stacks are responsible for important performance bottlenecks. On Linux, it should be relatively easy to take advantage of perf_event_open() to supply such functionality. The availability of probes is platform-specific, although there are also some cross-platform names that can be used for more generic events.
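As a rough sketch of what using perf_event_open() for this could involve, the code below prepares a sampling hardware event and opens it on one CPU. The function names `cpc_attr_init()` and `cpc_open()` are hypothetical, the choice of event and sample period is left to the caller, and attaching a BPF program to the resulting file descriptor is not shown. Opening system-wide events normally requires elevated privileges.

```c
#include <assert.h>
#include <linux/perf_event.h>
#include <stdint.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Describe a hardware event that is sampled once every 'period' occurrences. */
static void cpc_attr_init(struct perf_event_attr *attr, uint64_t hw_event,
                          uint64_t period)
{
    assert(period > 0);
    memset(attr, 0, sizeof(*attr));
    attr->type = PERF_TYPE_HARDWARE;
    attr->size = sizeof(*attr);
    attr->config = hw_event;        /* e.g. PERF_COUNT_HW_CACHE_MISSES */
    attr->sample_period = period;
    attr->disabled = 1;             /* enable once a handler is attached */
}

/*
 * glibc provides no wrapper for perf_event_open(), so use syscall().
 * pid = -1 with cpu >= 0 means: all tasks on the given CPU.
 * Returns a file descriptor, or -1 with errno set.
 */
static int cpc_open(struct perf_event_attr *attr, int cpu)
{
    return (int)syscall(SYS_perf_event_open, attr, -1, cpu, -1, 0UL);
}
```

The generic `PERF_COUNT_HW_*` constants correspond to the cross-platform event names mentioned above, while platform-specific events would use other `type` and `config` values.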
Name | In Release | Description | Comments |
---|---|---|---|
Discover probe points | 2.0.0-1.12 | ||
Create probes | 2.0.0-1.12 | ||
Determine probe info | 2.0.0-1.12 | ||
Implement trampoline | 2.0.0-1.12 | ||
Probe cleanup | 2.0.0-1.12 | ||
Probe args | 2.0.0-1.12 |
Name | Type | Target Release | In Release | Description | Comments |
---|---|---|---|---|---|
aggpercpu | Compile-time | 2.0.0-1.14 | Aggregate per CPU. | ||
aggrate | Dynamic runtime | 2.0.0-1.13 | Rate of aggregation reading. | Value type: time | |
aggsize | Runtime | 2.0.0-1.11 | Aggregation buffer size. | Value type: size | |
aggsortkey | Dynamic runtime | 2.0.0-1.11 | Sort aggregations by key. | false or true | |
aggsortkeypos | Dynamic runtime | 2.0.0-1.11 | Number of the aggregation key on which to sort. | Value type: scalar | |
aggsortpos | Dynamic runtime | 2.0.0-1.11 | Number of the aggregation variable on which to sort. | Value type: scalar | |
aggsortrev | Dynamic runtime | 2.0.0-1.11 | Sort aggregations in reverse order. | false or true | |
amin | Compile-time | 2.0.0-1.0 | Stability attribute minimum. | Value type: string | |
argref | Compile-time | 2.0.0-1.0 | Do not require all macro arguments to be used. | ||
bpflog | Runtime | 2.0.0-1.7 | Enable printing of the BPF verifier log even when loading programs was successful. | ||
bpflogsize | Runtime | 2.0.0-1.5 | BPF verifier log size. | Value type: size | |
bufpolicy | Runtime | Buffer policy. | fill, ring, or switch | ||
bufresize | Runtime | Buffer resizing policy. | auto or manual | ||
bufsize | Runtime | 2.0.0-1.2 |
Principal buffer size (equivalent to dtrace -b). | Value type: size | |
cleanrate | Runtime | Cleaning rate. | Value type: time | ||
core | Compile-time | 2.0.0-1.0 |
Enable core dumping by dtrace. | ||
cpp | Compile-time | 2.0.0-1.11 |
Use cpp to pre-process the input file. | ||
cpparg | Compile-time | 2.0.0-1.0 |
Pass an additional argument to cpp. | ||
cpphdrs | Compile-time | 2.0.0-1.0 |
Specify the -H option to cpp to print the name of each header file that is used. | ||
cpppath | Compile-time | 2.0.0-1.0 |
Specify the path name of cpp. | Value type: string | |
cpu | Runtime | 2.0.0-1.14 | CPU on which to enable tracing. | Value type: scalar | |
ctfpath | Compile-time | 1.1.0-1 |
Specify the path name of vmlinux.ctf. | Value type: string | |
ctypes | Compile-time | 0.1 |
Write out Compact Type Format (CTF) definitions of all C types used in a program at the end of a D compilation run. | Value type: string | |
debug | Compile-time | 0.1 |
Enable DTrace debugging mode (equivalent to setting the environment variable DTRACE_DEBUG). | ||
debugassert | Compile-time | 1.0.2 |
Enable DTrace asserts. mutexes: Use error-checking mutexes. | ||
defaultargs | Compile-time | 0.1 |
Allow references to unspecified macro arguments. Use 0 as the value for an unspecified argument. | ||
define | Compile-time | 0.1 |
Define a macro name and optional value in the form name[=value] (equivalent to dtrace -D). | Value type: string | |
destructive | Runtime | 0.1 |
Allow destructive actions (equivalent to dtrace -w). | ||
disasm | Compile-time | 2.0.0-1.0 | Specify the disassembler listings to print for dtrace -S. | Value type: scalar | |
droptags | Compile-time | 2.0.0-1.13 | Specifies that drop tags are used. | ||
dtypes | Compile-time | 0.1 |
Write out CTF definitions of all D types that are used in a program at the end of a D compilation run. | Value type: string | |
dynvarsize | Runtime | 2.0.0-1.9 |
Dynamic variable space size. | Value type: size | |
empty | Compile-time | 0.1 |
Permit compilation of empty D source files. | ||
errtags | Compile-time | 0.1 |
Prefix default error message with error tags. | ||
evaltime | Compile-time |
Control when DTrace halts a new process after grabbing it. exec: Halt the process immediately after exec(). main: Halt after constructor execution, immediately before main(). For stripped binaries, main and postinit are silently converted to preinit because a symbol table is required to locate main(). postinit: Equivalent to main. preinit: Halt after initialization of the dynamic linker loader (ld.so) and before constructor invocation (default behavior). For statically linked binaries, preinit is equivalent to exec, and it might not skip ld.so initialization, which can happen after main(). For stripped, statically linked binaries, both postinit and main are equivalent to preinit, because the main symbol cannot be looked up if there is no symbol table. |
exec, main, postinit, or preinit | ||
flowindent | Dynamic runtime | 2.0.0-1.2 |
Indent function entry and prefix with ->. Unindent function return and prefix with <-. Indent system call entry and prefix with =>. Unindent system call return and prefix with <=. Equivalent to dtrace -F. | ||
grabanon | Runtime | ||||
incdir | Compile-time | 2.0.0-1.12 |
Add a #include directory to the preprocessor search path (equivalent to dtrace -I). | Value type: string | |
iregs | Compile-time | 2.0.0-1.13 | Size of the DTrace Intermediate Format (DIF) integer register set. The default value is 8. | Value type: scalar | |
jstackframes | Runtime | ||||
jstackstrsize | Runtime | ||||
kdefs | Compile-time | 0.1 |
Do not permit unresolved kernel symbols. | ||
knodefs | Compile-time | 0.1 |
Permit unresolved kernel symbols. | ||
late | Compile-time |
Specify whether references to dynamic translators are permitted: dynamic: Allow references to dynamic translators. static: Require translators to be statically defined. |
dynamic or static | ||
lazyload | Compile-time | Specify that the DTrace Object Format (DOF) should be lazily loaded rather than actively loaded. | false or true | ||
ldpath | Compile-time | 2.0.0-1.12 |
Specify the path of the dynamic linker loader (ld). | Value type: string | |
libdir | Compile-time | 2.0.0-1.12 |
Add a library directory to the library search path. | Value type: string | |
linkmode | Compile-time | 2.0.0-1.13 |
Specify the symbol linking mode that is used by the assembler when processing external symbol references: dynamic: All symbols are treated as dynamic. kernel: Kernel symbols are treated as static and user symbols are treated as dynamic. static: All symbols are treated as static. |
dynamic, kernel, or static | |
linktype | Compile-time | 2.0.0-1.13 |
Specify the output file type: dof: Produce a standalone DOF file. elf: Produce an ELF file that contains DOF. |
dof or elf | |
lockmem | Runtime | 2.0.0-1.11 |
Locked pages limit. | Value type: scalar | |
maxframes | Runtime | 2.0.0-1.6 |
Maximum number of stack frames reported by the kernel. | Value type: scalar | |
modpath | Compile-time | 2.0.0-1.12 |
Module path. The default path is /lib/modules/version. | Value type: string | |
nolibs | Compile-time | 2.0.0-0.8 |
Do not process D system libraries. | ||
noresolve | Runtime | 1.1.0-1 |
Disable automatic resolution of userspace symbols. | ||
nspec | Runtime | 2.0.0-1.13 | Number of speculations. | Value type: scalar | |
pcapsize | Runtime | 1.2.0-1 |
Maximum packet data capture size. | Value type: size | |
pgmax | Compile-time | Limit on the number of threads that DTrace can grab for tracing. The default value is 8. | Value type: scalar | ||
preallocate | Compile-time | 0.1 |
Amount of memory to preallocate. | Value type: scalar | |
procfspath | Compile-time | 1.0.2 |
Path to the procfs file system. The default path is /proc. | Value type: string | |
pspec | Compile-time | 2.0.0-1.13 | Interpret ambiguous specifiers as probe names. | ||
quiet | Dynamic runtime | 1.0.2 |
Output only explicitly traced data (equivalent to dtrace -q). | ||
quietresize | Dynamic runtime | 1.0.2 |
Suppress buffer-resize messages. | ||
rawbytes | Dynamic runtime | 1.0.2 |
Always print trace output in hexadecimal. | ||
scratchsize | Runtime | 2.0.0-1.10 |
Scratch memory size. | Value type: size | |
specsize | Runtime | 2.0.0-1.8 |
Speculation buffer size. | Value type: size | |
stackframes | Runtime | 2.0.0-1.6 |
Number of stack frames. | Value type: scalar | |
stackindent | Dynamic runtime | 1.0.2 |
Number of white space characters to use when indenting stack and ustack output. | Value type: scalar | |
statusrate | Runtime | Rate of status checking. | Value type: time | ||
stdc | Compile-time | 2.0.0-1.13 | Specify ISO C conformance settings for the preprocessor when invoking cpp with the -C option. The a, c, and t settings include the -std=gnu99 option (conformance with the 1999 C standard, including GNU extensions). The s setting includes the -traditional-cpp option (conformance with K&R C). | a, c, s, or t | |
strip | Compile-time | 2.0.0-1.13 | Strip non-loadable sections from the program. | ||
strsize | Runtime | 1.0.2 |
String size. | Value type: size | |
switchrate | Dynamic runtime | Rate of buffer switching. | Value type: time | ||
syslibdir | Compile-time | 1.0.2 |
Path name of system libraries. | Value type: string | |
sysslice | Compile-time | 1.0.2 |
Name of the system slice. | Value type: string | |
tree | Compile-time | 2.0.0-0.8 |
Value of the DTrace tree dump bitmap. | Value type: scalar | |
tregs | Compile-time | 2.0.0-1.12 |
Size of the DIF tuple register set. The default value is 8. | Value type: scalar | |
udefs | Compile-time | Do not permit unresolved user symbols. | |||
undef | Compile-time | 0.1 |
Undefine a symbol when invoking the preprocessor. Equivalent to dtrace -U. | Value type: string | |
unodefs | Compile-time | Permit unresolved user symbols. | |||
useruid | Compile-time | 1.0.2 |
First UID that is not in the system range. | Value type: scalar |
|
ustackframes | Runtime | 2.0.0-1.6 |
Number of user-land stack frames. | Value type: scalar | |
verbose | Compile-time | 2.0.0-1.11 |
DIF verbose mode, which shows each compiled DIF object (DIFO). | ||
version | Compile-time | 2.0.0-1.12 |
Request a specific version of the native DTrace library. | Value type: string | |
zdefs | Compile-time | 0.1 |
Permit probe definitions that match zero probes. |
When debugging a low-level problem or measuring system performance, you might need to enable probes that are associated with internal operating system routines, such as functions in the kernel, rather than probes that are associated with more stable interfaces, such as system calls. The available data at probe locations deep within the software stack is often a collection of implementation artifacts rather than more stable data structures, such as those associated with Oracle Linux system call interfaces. To assist you with writing stable D programs, DTrace provides a facility for translating implementation artifacts into stable data structures that are accessible from your D program statements. A translator is a collection of D assignment statements provided by the supplier of an interface. Translators can be used to translate an input expression into an object of the struct type.
Name | Target Release | In Release | Description | Comments |
---|---|---|---|---|
Define translators | ||||
Translated probe arguments | ||||
xlate operator | ||||
xlate of a single member | ||||
xlate of an entire struct | ||||
Translator types | ||||
Static translators | ||||
Dynamic translators |
Name | Target Release | In Release | Description | Comments |
---|---|---|---|---|
Compilation of D clauses into BPF code | 2.0.0-0.1 | |||
BPF probe program as trampoline to compiled D clauses | 2.0.0-0.1 | |||
Linking generated BPF code with pre-compiled BPF support functions |
2.0.0-1.0 | |||
Integration of the clause predicate as conditional into the main clause code |
2.0.0-1.2 | |||
Support probe specifications with wildcards |
2.0.0-1.2 | |||
Support more than one clause for a given probe |
2.0.0-1.2 | |||
Store the DTrace BPF context somewhere other than the BPF stack |
2.0.0-1.2 | |||
Provide correct buffer consumption semantics | 2.0.0-1.3 | |||
Use default action only for empty clause | 2.0.0-1.3 | |||
Hashtab rewrite | 2.0.0-1.9 | |||
Read data from kernel and userspace pointers | 2.0.0-1.11 | |||
Generic probe aliasing | ||||
Drop support | Various cases where probe data cannot be reported should be reported as 'drops' (part of the probe status). |
NOTE: this is just a summary of the table below. |
DTrace on Linux is no longer using the in-kernel support code for DTrace and instead uses existing Linux kernel tracing features such as BPF. The D compiler (primarily the code generator and the assembler) will now generate BPF code. Whereas before clauses were divided into actions and each action was compiled into its own D program, the entire clause is now being compiled into BPF code.
The assembler requires modifications as well, to handle BPF-specific relocations and to account for the compilation of complete clauses. The disassembler also needs to be modified to support and output BPF code.
DONE (2.0.0-0.1) |
---|
BPF programs are invoked from trace event triggers with a probe-type-specific BPF context. DTrace D clauses are expected to execute in a consistent context, especially because a single clause can be associated with probes of different types. To support this configuration, the provider code generates a probe-specific trampoline that uses the BPF context to populate a generic DTrace context. The trampoline then calls a predicate function (if the clause has a predicate), and if the predicate function returns true, it calls the generated clause function.
DONE (2.0.0-0.1) |
---|
There are common code sequences that are necessary for the implementation of D code using BPF, such as retrieving or setting the value of a variable of a particular kind, or manipulating strings. Many of these can be implemented in C and compiled into BPF code. These functions are linked together into the bpf_dlib.o ELF object. Dynamically generated BPF code references these pre-compiled BPF support functions as calls to external functions, which means that entries are added to the BPF symbol relocation table. These entries are resolved during a special linking stage that is executed prior to loading the BPF programs into the kernel.
During this linking stage the relocation records that reference external symbols are processed. The following high level mechanism is implemented:
1. For each relocation record, determine the pre-compiled BPF support function symbol it references.
2. If the symbol has already been seen, continue with the next relocation record (go to step 1).
3. If the symbol is new, do the following:
   a. Mark the symbol as seen.
   b. Append the executable code for the symbol to the BPF program we are processing.
   c. Process the relocation records for this symbol per the algorithm described here.
At the conclusion of this recursive process, there should not be any unresolved symbols left. The BPF program will contain all executable code necessary for executing the entire program.
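The linking steps above can be sketched in a few lines of Python. This is only an illustration of the recursive walk, not the libdtrace implementation: the `Func` type, the `link()` function, and the symbol names `func_a`, `func_b`, and `func_c` are all hypothetical, and the sketch omits the patching of call offsets that a real linker would also perform.

```python
from dataclasses import dataclass, field


@dataclass
class Func:
    """Hypothetical stand-in for a function: its code plus the external symbols it calls."""
    code: list
    relocs: list = field(default_factory=list)


def link(program, dlib):
    """Return the program's code followed by the code of every support function it reaches."""
    linked = list(program.code)
    seen = set()

    def process(relocs):
        for sym in relocs:                # step 1: symbol referenced by this relocation
            if sym in seen:               # step 2: already linked in, move on
                continue
            if sym not in dlib:
                raise KeyError(f"unresolved symbol: {sym}")
            seen.add(sym)                 # step 3a: mark the symbol as seen
            func = dlib[sym]
            linked.extend(func.code)      # step 3b: append its executable code
            process(func.relocs)          # step 3c: recurse into its own relocations

    process(program.relocs)
    return linked


# Hypothetical support library; func_c calls back into func_a to show that
# the "seen" set also protects against cycles.
dlib = {
    "func_a": Func(["a0", "a1"], ["func_c"]),
    "func_b": Func(["b0"], ["func_c"]),
    "func_c": Func(["c0"], ["func_a"]),
}
clause = Func(["clause0"], ["func_a", "func_b", "func_a"])
print(link(clause, dlib))
```

Marking a symbol as seen before recursing is what keeps each support function from being appended more than once, even when several callers (or a call cycle) reference it.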
DONE (2.0.0-1.0) |
---|
The predicate can perform most operations that the main clause code can do, including accessing variables of all three kinds and calling subroutines. Many of these operations require access to the DTrace BPF context. However, only the main program (dt_program) was being generated with the appropriate prologue code to ensure that BPF support functions would be able to operate in a consistent execution context. While it would be possible to resolve this issue by generating the predicate as a function with prologue and epilogue as well, the more logical solution is to simply include the predicate in the main clause code as a conditional. The main program becomes:
PROLOGUE
if (!predicate), goto exit
MAIN PROGRAM CODE
EPILOGUE
exit: return 0
DONE (2.0.0-1.2) |
---|
(See: DTrace D Compiler redesign)
DONE (2.0.0-1.2) |
---|
(See: DTrace D Compiler redesign)
DONE (2.0.0-1.2) |
---|
The BPF JIT compiler generates code that depends on a single BPF stack (maximum size right now limited to 512 bytes), whereas the BPF interpreter provides a BPF stack for each function that is executing in a call chain. So, for interpreted code, each function in the call chain has its own BPF stack. The quite limited stack size for JIT-compiled code caused stack overrun conditions when executing D clauses because the DTrace BPF context (stored on the stack) was consuming too much precious stack space.
The solution is to decouple the DTrace BPF context from the DTrace machine state. While the DTrace BPF context must still reside on the stack, the DTrace machine state can be stored elsewhere. The choice has been made to use the 'mem' BPF map that is used to provide memory for the output buffer. The first portion of the memory used for the map value is set aside to hold the DTrace machine state, and the remainder is for the output buffer.
DONE (2.0.0-1.2) |
---|
The processing of the trace buffers must adhere to specific DTrace buffer consumption semantics. The buffer that contains the BEGIN probe trace data must be processed first, so that the BEGIN probe will always be reported first in output. This means that we need to know what CPU the BEGIN probe fired on. Similarly, the buffer that contains the END probe trace data must be processed last, so that the END probe will always be reported last in output. This means we need to know what CPU the END probe fired on.
When processing the buffer for the BEGIN probe trace data it is important to also process any ERROR probe firings that relate to the BEGIN probe. On the other hand, ERROR probe trace data that is found in that same buffer but that does not correspond to the BEGIN probe firing must be processed after the BEGIN probe processing completes.
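The ordering rules above can be illustrated with a small Python sketch. It is a simplified model rather than the consumer code: the `Record` type, the `cause` field (naming the probe whose clause raised the error), and `consume_order()` are hypothetical, and ERROR records tied to END get no special treatment because the text only describes the BEGIN case.

```python
from typing import NamedTuple, Optional


class Record(NamedTuple):
    cpu: int
    probe: str                   # "BEGIN", "END", "ERROR", or any other probe name
    cause: Optional[str] = None  # for ERROR records: the probe whose clause faulted


def consume_order(buffers, begin_cpu, end_cpu):
    """buffers maps a CPU id to its list of records in firing order."""
    def is_begin(rec):
        return rec.probe == "BEGIN" or (rec.probe == "ERROR" and rec.cause == "BEGIN")

    def is_end(rec):
        return rec.probe == "END"

    # First: BEGIN and the ERROR firings that relate to it, from BEGIN's CPU.
    first = [r for r in buffers.get(begin_cpu, []) if is_begin(r)]
    # Last: END, from END's CPU.
    last = [r for r in buffers.get(end_cpu, []) if is_end(r)]
    # In between: everything else, including unrelated ERROR records that
    # happen to sit in the same buffer as BEGIN.
    middle = []
    for cpu in sorted(buffers):
        middle.extend(r for r in buffers[cpu] if not is_begin(r) and not is_end(r))
    return first + middle + last


buffers = {
    0: [Record(0, "syscall"), Record(0, "END")],
    1: [Record(1, "BEGIN"), Record(1, "ERROR", "BEGIN"),
        Record(1, "tick"), Record(1, "ERROR", "tick")],
}
for rec in consume_order(buffers, begin_cpu=1, end_cpu=0):
    print(rec)
```

The sketch makes the dependency explicit: the consumer cannot produce this order without knowing which CPU the BEGIN and END probes fired on.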
DONE (2.0.0-1.3) |
---|
The "default action" is used for empty clauses. After the BPF port, it was also used for clauses that were not empty but simply had no data-recording actions. One can argue whether or not that is reasonable, but it is a significant departure from established precedent. Rather reasonable scripts would perform very differently with the new behavior. So, revert to the legacy behavior.
DONE (2.0.0-1.3) |
---|
Replace a number of hand-rolled fixed-size hashtables internal to DTrace with the auto-resizing dt_htab. The most user-visible effect of this cleanup is that symbol lookup is now many times faster: since DTrace does many symbol lookups at startup, starting DTrace is now several seconds quicker than it was. (Nick)
DONE (2.0.0-1.9) |
---|
BPF does not allow us to read data directly from kernel and userspace addresses using a load instruction. We need to use the bpf_probe_read() BPF helper (and possibly bpf_probe_read_kernel() and bpf_probe_read_user() for architectures that require it).
Name | In Release | Description | Comments |
---|---|---|---|
Dynamic strings | |||
Kernel strings | 2.0.0-1.11 |
Support to retrieve (load) strings from kernel addresses (depends on temporary strings) | |
Userspace strings | 2.0.0-1.11 |
Support to retrieve (load) strings from userspace addresses (depends on temporary strings) | |
Numeric values | |||
Kernel values | 2.0.0-1.11 |
Support retrieving scalars from kernel addresses | |
Userspace values | 2.0.0-1.11 |
Support retrieving scalars from userspace addresses |
DONE (2.0.0-1.11) |
---|
The implementation of SDT probes requires a mechanism to re-use existing probes in a different DTrace context. A given SDT probe will register itself as a dependent of the underlying probe that provides the firing mechanism (and access to data to be used as arguments). The trampoline callback of the provider is used to generate code that transforms the argument data of the underlying probe into the argument data that the SDT probe provides. The trampoline can also be used to generate code that implements pre-conditions to determine whether the firing of the underlying probe should cause the dependent probe to be reported as firing as well. One example of this need is found with the exec-failure and exec-success probes, which both depend on the syscall::execve:return probe.
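As a rough model of this mechanism, the Python sketch below attaches dependent probes to an underlying probe, each with a pre-condition and an argument transform. It is an assumption-laden illustration: in DTrace these pieces are generated as BPF code in the trampoline rather than run as callbacks, the `Probe` class is hypothetical, and the argument conventions used for exec-success and exec-failure (return value in the first argument, error number passed to exec-failure) are chosen only for the example.

```python
class Probe:
    """Hypothetical model of an underlying probe with a list of dependent probes."""

    def __init__(self, name):
        self.name = name
        self.dependents = []  # (dependent name, pre-condition, argument transform)

    def add_dependent(self, name, precondition, transform):
        self.dependents.append((name, precondition, transform))

    def fire(self, args):
        """Return the (probe name, arguments) pairs reported for one firing."""
        fired = [(self.name, list(args))]
        for name, precondition, transform in self.dependents:
            saved = list(args)           # each dependent works on its own copy
            if precondition(saved):      # should this dependent fire as well?
                fired.append((name, transform(saved)))
        return fired


execve_return = Probe("syscall::execve:return")
# Assumed convention for this example: args[0] holds the syscall return value.
execve_return.add_dependent("exec-success", lambda a: a[0] == 0, lambda a: [])
execve_return.add_dependent("exec-failure", lambda a: a[0] != 0, lambda a: [-a[0]])

print(execve_return.fire([0]))
print(execve_return.fire([-2]))
```

Giving each dependent its own copy of the arguments mirrors the idea of saving and restoring probe arguments so that one dependent's transform cannot affect another.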
Name | Target Release | In Release | Description | Comments |
---|---|---|---|---|
Implement rawtp provider | 2.0.0-1.12 | Make raw tracepoints available as probes in DTrace. |
||
Save/restore probe arguments | 2.0.0-1.12 |
Save/restore a copy of probe arguments so dependent probes can modify the main copy. |
||
Implement dependent probes | 2.0.0-1.12 |
Dependent probes are added to a list in the underlying probe. The trampoline for each dependent probe is generated during the program construct phase of the underlying probe. | ||
Implement dependent probe priorities | Some SDT probes (dependents of the same underlying probe) must fire in a well-defined order. This requires a mechanism to know the order in which the dependent probes should be added to the underlying probe. |
IN PROGRESS |
---|
DTrace sometimes fails to record tracing data for a probe firing. This could be due to the output buffer not having enough space to record the tracing data, or it could be due to running out of dynamic variables or speculations, etc. Such failures to record tracing data are called 'record drops'. The producer should keep track of how many records are dropped for each category and make this information available to the consumer. The consumer will typically emit warning messages to report the record drops.
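The producer/consumer split described above might look like the following Python sketch. The class names, the drop category strings, and the warning message format are all hypothetical and exist only to show the bookkeeping: the producer keeps running totals per category (and per CPU where that applies), and the consumer reports only the drops that are new since its previous check.

```python
from collections import Counter


class Producer:
    """Keeps a running drop count per (category, cpu); cpu is None if not per-CPU."""

    def __init__(self):
        self.drops = Counter()

    def drop(self, kind, cpu=None):
        self.drops[(kind, cpu)] += 1


class Consumer:
    """Reports drops that occurred since the previous check."""

    def __init__(self):
        self.seen = Counter()

    def check(self, producer):
        warnings = []
        for (kind, cpu), total in producer.drops.items():
            new = total - self.seen[(kind, cpu)]
            if new > 0:
                where = "" if cpu is None else f" on CPU {cpu}"
                warnings.append(f"{new} {kind} drop(s){where}")
                self.seen[(kind, cpu)] = total
        return warnings


producer = Producer()
consumer = Consumer()
producer.drop("principal buffer", cpu=0)
producer.drop("principal buffer", cpu=0)
producer.drop("dynamic variable")
for message in consumer.check(producer):
    print(message)
```

Keeping totals on the producer side and deltas on the consumer side means a missed status check does not lose drops; they are simply reported on the next one.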
Name | Target Release | In Release | Description | Comments |
---|---|---|---|---|
Principal buffer drops | 2.0.0-1.13 | DTrace reports this as a per-CPU counter. The perf event output buffer provides data on events that could not be written to the buffer (drops) but we need to distinguish between principal buffer drops and speculation drops (because speculative tracing data is passed to the consumer using the principal buffer). Therefore, we do need to keep track of these drops ourselves. |
||
Aggregation buffer drops | 2.0.0-1.13 | DTrace reports this as a per-CPU counter. |
||
Dynamic variable drops | 2.0.0-1.13 | |||
Dynamic variable drops (rinsing) | N/A | |||
Dynamic variable drops (dirty) | N/A | |||
Speculations drops | 2.0.0-1.13 | This counter is incremented when there is no space in the principal buffer to record the speculative tracing data for a clause. |
||
Speculations drops (busy) | 2.0.0-1.13 | This counter is incremented when a speculation could not be allocated because all are in use and at least one of them was being committed/discarded. |
||
Speculations drops (unavailable) | 2.0.0-1.13 | This counter is incremented when a speculation could not be allocated because all are in use. |
||
String table overflow | This reports a jstack()/ustack() string table overflow. |
|||
Double error (error-in-error) | This counter is incremented when an error occurs during ERROR probe clause execution. |
IN PROGRESS |
---|
Done: 2.0.0-1.14 |
---|