Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Restructure littlefs to not use recursion, measure stack usage #658

Merged
merged 22 commits into from
Apr 11, 2022

Commits on Mar 11, 2022

  1. Added size-sort options to scripts/code.py

    Now with -s/--sort and -S/--reverse-sort for sorting the functions by
    size.
    
    You may wonder why add reverse-sort, since its utility doesn't seem
    worth the cost to implement (these are just helper scripts after all),
    the reason is that reverse-sort is quite useful on the command-line,
    where scrollback may be truncated, and you only care about the larger
    entries.
    
    Outside of the command-line, normal sort is prefered.
    
    Fortunately the difference is just the sign in the sort key.
    
    Note this conflicts with the short --summary flag, so that has been
    removed.
    geky committed Mar 11, 2022
    Configuration menu
    Copy the full SHA
    b045436 View commit details
    Browse the repository at this point in the history
  2. Split out scripts/code.py into scripts/code.py and scripts/data.py

    This is to avoid unexpected script behavior even though data.py should
    always return 0 bytes for littlefs. Maybe a check for this should be
    added to CI?
    geky committed Mar 11, 2022
    Configuration menu
    Copy the full SHA
    2cdabe8 View commit details
    Browse the repository at this point in the history
  3. Configuration menu
    Copy the full SHA
    f5286ab View commit details
    Browse the repository at this point in the history
  4. Added coverage-sort to scripts/coverage.py

    scripts/coverage.py was missed originally because it's not ran as often
    as the others. Since it requires run-time info, it's usually only used
    in CI.
    geky committed Mar 11, 2022
    Configuration menu
    Copy the full SHA
    20c58dc View commit details
    Browse the repository at this point in the history
  5. Added scripts/stack.py for viewing stack usage

    Note this detects loops (recursion), and renders this as infinity.
    Currently littlefs does have a single recursive function and you can see
    how this infects the full call graph. Eventually this should be removed.
    geky committed Mar 11, 2022
    Configuration menu
    Copy the full SHA
    f4c7af7 View commit details
    Browse the repository at this point in the history
  6. Changed script's CSV formats to allow for merging different measurements

    - size  -> code_size
    - size  -> data_size
    - frame -> stack_frame
    - limit -> stack_limit
    - hits  -> coverage_hits
    - count -> coverage_count
    geky committed Mar 11, 2022
    Configuration menu
    Copy the full SHA
    d7582ef View commit details
    Browse the repository at this point in the history
  7. Added scripts/structs.py for getting sizes of structs

    Note this does include internal structs, so this should probably
    be limited to informative purposes.
    geky committed Mar 11, 2022
    Configuration menu
    Copy the full SHA
    0a2ff3b View commit details
    Browse the repository at this point in the history
  8. Added make *-diff rules, quick commands to compare sizes

    This required a patch to the --diff flag for the scripts to ignore
    a missing file. This enables the useful one liner for making comparisons
    with potentially missing previous versions:
    
        ./scripts/code.py lfs.o -d lfs.o.code.csv -o lfs.o.code.csv
    
        function (0 added, 0 removed)            old     new    diff
        TOTAL                                  25476   25476      +0
    
    One downside, these previous files are easy to delete as a part of make
    clean, which limits their usefulness for comparing configuration
    changes...
    geky committed Mar 11, 2022
    Configuration menu
    Copy the full SHA
    50ad2ad View commit details
    Browse the repository at this point in the history

Commits on Mar 20, 2022

  1. Some improvements to size scripts

    - Added -L/--depth argument to show dependencies for scripts/stack.py,
      this replaces calls.py
    - Additional internal restructuring to avoid repeated code
    - Removed incorrect diff percentage when there is no actual size
    - Consistent percentage rendering in test.py
    geky committed Mar 20, 2022
    Configuration menu
    Copy the full SHA
    eb8be9f View commit details
    Browse the repository at this point in the history
  2. Added ./script/summary.py

    A full summary of static measurements (code size, stack usage, etc) can now
    be found with:
    
        make summary
    
    This is done through the combination of a new ./scripts/summary.py
    script and the ability of existing scripts to merge into existing csv
    files, allowing multiple results to be merged either in a pipeline, or
    in parallel with a single ./script/summary.py call.
    
    The ./scripts/summary.py script can also be used to quickly compare
    different builds or configurations. This is a proper implementation
    of a similar but hacky shell script that has already been very useful
    for making optimization decisions:
    
        $ ./scripts/structs.py new.csv -d old.csv --summary
        name (2 added, 0 removed)               code             stack            structs
        TOTAL                                  28648 (-2.7%)      2448               1012
    
    Also some other small tweaks to scripts:
    
    - Removed state saving diff rules. This isn't the most useful way to
      handle comparing changes.
    
    - Added short flags for --summary (-Y) and --files (-F), since these
      are quite often used.
    geky committed Mar 20, 2022
    Configuration menu
    Copy the full SHA
    55b3c53 View commit details
    Browse the repository at this point in the history
  3. A few more tweaks to scripts

    - Changed `make summary` to show a one line summary
    - Added `make lfs.csv` rule, which is useful for finding more info with
      other scripts
    - Fixed small issue in ./scripts/summary.py
    - Added *.ci (callgraph) and *.csv (script output) to CI
    geky committed Mar 20, 2022
    Configuration menu
    Copy the full SHA
    7ea2b51 View commit details
    Browse the repository at this point in the history
  4. Added new scripts to CI results

    - Added to GitHub statuses (61 results)
    
    - Reworked generated release table to include these (16 results, only thumb)
    
    These also required a surprisingly large number of other changes:
    
    - Bumbed CI Ubuntu version 18.04 -> 20.04, 22.04 is already on the
      horizon but not usable in GitHub yet
    
    - Manualy upgrade to GCC v10, this is required for the -fcallgraph-info
      flag that scripts/stack.py uses.
    
    - Increased paginated status queries to 100 per-page. If we have more
      statuses than this the status diffs may get much more complicated...
    
    - Forced whitespace in generated release table to always be nbsp. GitHub
      tables get scrunched rather ugly without this, prefering margins to
      readable tables.
    
    - Added limited support for "∞" results, since this is returned by
      ./scripts/stack.py for recursive functions.
    
    As a side-note, this increases the number of statuses reported
    per-commit from 6 to 61, so hopefully that doesn't cause any problems...
    geky committed Mar 20, 2022
    Configuration menu
    Copy the full SHA
    9d54603 View commit details
    Browse the repository at this point in the history
  5. Fixed spurious encoding error

    Using errors=replace in python utf-8 decoding makes these scripts more
    resilient to underlying errors, rather than just throwing an unhelpfully
    generic decode error.
    geky committed Mar 20, 2022
    Configuration menu
    Copy the full SHA
    e4adefd View commit details
    Browse the repository at this point in the history
  6. Fixed spurious CI failure caused by multiple writers to .o files

    GCC is a bit frustrating here, it really wants to generate every file in
    a single command, which _is_ more efficient if our build system could
    leverage this. But -fcallgraph-info is a rather novel flag, so we can't
    really rely on it for generally compiling and testing littlefs.
    
    The multi-file output gets in the way when we want an explicitly
    separate rule for callgraph-info generation. We can't generate the
    callgraph-info without generating the objects files.
    
    This becomes a surprsing issue when parallel building (make -j) is used!
    Suddenly we might end up with both the .o and .ci rules writing to .o
    files, which creates a really difficult to track down issue of corrupted
    .o files.
    
    The temporary solution is to use an order-only prerequisite. This still
    ends up building the .o files twice, but it's an acceptable tradeoff for
    not requiring the -fcallgraph-info for all builds.
    geky committed Mar 20, 2022
    Configuration menu
    Copy the full SHA
    3b495ba View commit details
    Browse the repository at this point in the history
  7. Cleaned up make clean

    geky committed Mar 20, 2022
    Configuration menu
    Copy the full SHA
    563af5f View commit details
    Browse the repository at this point in the history
  8. Limit ./scripts/structs.py to report structs in local .h files

    This requires parsing an additional section of the dwarfinfo (--dwarf=rawlines)
    to get the declaration file info.
    
    ---
    
    Interpreting the results of ./scripts/structs.py reporting is a bit more
    complicated than other scripts, structs aren't used in a consistent
    manner so the cost of a large struct depends on the context in which it
    is used.
    
    But that being said, there really isn't much reason to report
    internal-only structs. These structs really only exist for type-checking
    in internal algorithms, and their cost will end up reflected in other RAM
    measurements, either stack, heap, or other.
    geky committed Mar 20, 2022
    Configuration menu
    Copy the full SHA
    8475c80 View commit details
    Browse the repository at this point in the history
  9. In CI, determine loop devices dynamically to avoid conflicts with Ubu…

    …ntu snaps
    
    Introduced when updating CI to Ubuntu 20.04, Ubuntu snaps consume
    loop devices, which conflict with out assumption that /dev/loop0
    will always be unused. Changed to request a dynamic loop device from
    losetup, though it would have been nice if Ubuntu snaps allocated
    from the last device or something.
    geky committed Mar 20, 2022
    Configuration menu
    Copy the full SHA
    316b019 View commit details
    Browse the repository at this point in the history
  10. Changed./scripts/struct.py to organize by header file

    Avoids redundant counting of structs shared in multiple .c files, which
    is very common. This is different from the other scripts,
    code.py/data.py/stack.py, but this difference makes sense as struct
    declarations have a very different lifetime.
    geky committed Mar 20, 2022
    Configuration menu
    Copy the full SHA
    fe8f3d4 View commit details
    Browse the repository at this point in the history
  11. Fixed Popen deadlock issue in test.py

    As noted in Python's subprocess library:
    
    > This will deadlock when using stdout=PIPE and/or stderr=PIPE and the
    > child process generates enough output to a pipe such that it blocks
    > waiting for the OS pipe buffer to accept more data.
    
    Curiously, this only became a problem when updating to Ubuntu 20.04
    in CI (python3.6 -> python3.8).
    geky committed Mar 20, 2022
    Configuration menu
    Copy the full SHA
    554e4b1 View commit details
    Browse the repository at this point in the history
  12. Removed recursion from commit/relocate code path

    lfs_dir_commit originally relied heavily on tail-recursion, though at
    least one path (through relocations) was not tail-recursive, and could
    cause unbounded stack usage in extreme cases of bad blocks. (Keep in
    mind even extreme cases of bad blocks should be in scope for littlefs).
    
    In order to remove recursion from this code path, several changed were
    raequired:
    
    - The lfs_dir_compact logic had to be somewhat inverted. Instead of
      first compacting and then resolving issues such as relocations and
      orphans, the overarching lfs_dir_commit now contains a state-machine
      which after committing or compacting handles the extra changes to the
      filesystem in a single, non-recursive loop
    
    - Instead of fixing all relocations recursively, >1 relocation requires
      defering to a full deorphan step. This step is unfortunately an
      additional n^2 process. It also required some changes to lfs_deorphan
      in order to ignore intentional orphans created as an intermediary in
      lfs_mkdir. Maybe in the future we should remove these.
    
    - Tail recursion normally found in lfs_fs_deorphan had to be rewritten
      as a loop which restarts any time a new commit causes a relocation.
      This does show that the algorithm may not terminate, but only if every
      block is bad, which will eventually cause littlefs to run out of
      blocks to write to.
    geky committed Mar 20, 2022
    Configuration menu
    Copy the full SHA
    84da4c0 View commit details
    Browse the repository at this point in the history
  13. Removed recursion in file read/writes

    This mostly just required separate functions for "lfs_file_rawwrite" and
    "lfs_file_flushedwrite", since lfs_file_flush recursively invokes
    lfs_file_rawread and lfs_file_rawwrite.
    
    This comes at a code cost, but gives us bounded and measurable RAM usage
    on this code path.
    geky committed Mar 20, 2022
    Configuration menu
    Copy the full SHA
    fedf646 View commit details
    Browse the repository at this point in the history
  14. Removed recursion from lfs_dir_traverse

    lfs_dir_traverse is a bit unpleasant in that it is inherently a
    recursive function, but without a strict bound of 4 calls (commit -> filter ->
    move -> filter), and efforts to unroll the recursion comes at a
    signification code cost.
    
    It turns out the best solution I've found so far is to simple create an
    explicit stack with an explicit bound of 4 calls (or more accurately,
    3 pushed frames).
    
    ---
    
    This actually highlights one of the bigger flaws in littlefs right now,
    which is that this function, lfs_dir_traverse, takes O(n^2) disk reads
    to traverse.
    
    Note that LFS_FROM_MOVE can only occur once per commit, which is why
    this code is O(n^2) and not O(n^4).
    geky committed Mar 20, 2022
    Configuration menu
    Copy the full SHA
    8109f28 View commit details
    Browse the repository at this point in the history