Optimize import substitution with zero-allocation ImportPathIter #100

Open
wants to merge 1 commit into
base: master

Conversation

CrazyRoka

Background

While profiling the Bevy game engine's "many cubes" example with the DHAT memory profiler, I found a significant number of allocations in the import substitution method. Specifically, to_owned() was called frequently and accounted for 2.34% of all allocated memory blocks.

Changes

This PR introduces a custom ImportPathIter enum (similar to the Either enum from the either crate) and refactors the substitute_identifiers function to use this new iterator. The main goals were to reduce allocations and improve overall performance.

Key changes include:

  • Introduced the ImportPathIter enum to handle different types of iterators, which avoids allocating a new Vec or Box.
  • Refactored substitute_identifiers to use ImportPathIter, reducing allocations.
  • Replaced vector cloning with an iterator-based approach for better performance.
  • Implemented minor optimizations, such as using contains instead of find for quote checks.
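
The enum-iterator idea in the list above can be sketched as follows. This is only an illustration, not the code from this PR: the variant names, the item type, and the helper `path_segments` are hypothetical. It shows how an enum wrapping two concrete iterator types lets one function return either of them without a `Vec` or a `Box<dyn Iterator>`.

```rust
use std::iter::Once;
use std::str::Split;

/// Hypothetical two-variant iterator, in the spirit of `either::Either`.
/// The real `ImportPathIter` in the PR may have different variants.
enum ImportPathIter<'a> {
    /// A single, unsplit segment (for example a quoted path).
    Single(Once<&'a str>),
    /// Segments of a `::`-separated path.
    Split(Split<'a, &'static str>),
}

impl<'a> Iterator for ImportPathIter<'a> {
    type Item = &'a str;

    fn next(&mut self) -> Option<Self::Item> {
        // Delegate to whichever concrete iterator this value holds.
        match self {
            ImportPathIter::Single(it) => it.next(),
            ImportPathIter::Split(it) => it.next(),
        }
    }
}

/// Hypothetical helper: yields borrowed segments without heap allocation.
fn path_segments(path: &str) -> ImportPathIter<'_> {
    // `contains` is enough when only presence matters, not the position.
    if path.contains('"') {
        ImportPathIter::Single(std::iter::once(path))
    } else {
        ImportPathIter::Split(path.split("::"))
    }
}
```

Under these assumptions, `path_segments("a::b::c")` yields `"a"`, `"b"`, `"c"` as slices of the input, while a path containing a quote is yielded whole. The enum is stored inline by the caller, so no heap allocation is needed to unify the two branches.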

Impact

While the targeted code path accounted for only 0.02% of the total bytes allocated by the program, it was responsible for a much larger share of allocation calls, so this optimization provides a noticeable improvement:

  • Before: to_owned() was responsible for 2.34% of overall allocated memory blocks
  • After: The function no longer appears in profiling results, indicating successful optimization

Testing

Before

  ├─▶ PP 1.5/9 (2 children) {
  │     Total:     385,755 bytes (0.02%, 26,305.12/s) in 53,169 blocks (2.34%, 3,625.66/s), avg size 7.26 bytes, avg lifetime 2.46 µs (0% of program duration)
  │     At t-gmax: 0 bytes (0%) in 0 blocks (0%), avg size 0 bytes
  │     At t-end:  0 bytes (0%) in 0 blocks (0%), avg size 0 bytes
  │     Allocated at {
  │       #1: 0x5564ed256876: <alloc::alloc::Global as core::alloc::Allocator>::allocate (alloc/src/alloc.rs:243:9)
  │       #2: 0x5564ed256876: alloc::raw_vec::RawVec<T,A>::try_allocate_in (alloc/src/raw_vec.rs:230:45)
  │       #3: 0x5564ed256876: alloc::raw_vec::RawVec<T,A>::with_capacity_in (alloc/src/raw_vec.rs:158:15)
  │       #4: 0x5564ed256876: alloc::vec::Vec<T,A>::with_capacity_in (src/vec/mod.rs:699:20)
  │       #5: 0x5564ed256876: <T as alloc::slice::hack::ConvertVec>::to_vec (alloc/src/slice.rs:162:25)
  │       #6: 0x5564ed256876: alloc::slice::hack::to_vec (alloc/src/slice.rs:111:9)
  │       #7: 0x5564ed256876: alloc::slice::<impl [T]>::to_vec_in (alloc/src/slice.rs:441:9)
  │       #8: 0x5564ed256876: alloc::slice::<impl [T]>::to_vec (alloc/src/slice.rs:416:14)
  │       #9: 0x5564ed256876: alloc::slice::<impl alloc::borrow::ToOwned for [T]>::to_owned (alloc/src/slice.rs:823:14)
  │       #10: 0x5564ed256876: alloc::str::<impl alloc::borrow::ToOwned for str>::to_owned (alloc/src/str.rs:211:62)
  │       #11: 0x5564ed256876: naga_oil::compose::parse_imports::substitute_identifiers (src/compose/parse_imports.rs:135:47)
  │     }
  │   }
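
The trace above bottoms out in `str::to_owned`, with an average block size of about 7 bytes. The following sketch shows the general pattern that produces such a profile and the borrowed alternative. It is not the code at parse_imports.rs:135; the function names are hypothetical and chosen only for illustration.

```rust
/// Returns owned copies: one heap allocation per non-empty segment.
fn segments_owned(path: &str) -> Vec<String> {
    path.split("::").map(|s| s.to_owned()).collect()
}

/// Returns slices borrowed from `path`: no per-segment allocation.
fn segments_borrowed<'a>(path: &'a str) -> impl Iterator<Item = &'a str> + 'a {
    path.split("::")
}

/// True if `part` points into the memory occupied by `whole`.
fn is_subslice(whole: &str, part: &str) -> bool {
    let start = whole.as_ptr() as usize;
    let p = part.as_ptr() as usize;
    p >= start && p + part.len() <= start + whole.len()
}
```

Every item from `segments_borrowed` points into the original string, whereas each `String` from `segments_owned` lives in its own small heap block, which is the kind of short-lived allocation DHAT reports here.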

After

I ran the application again and conducted profiling. The to_owned() function was no longer present in the profiling results, confirming that we've successfully addressed the issue.

@robtfm
Collaborator

robtfm commented Aug 18, 2024

does it have any impact on time? i think optimizing out individual string allocations is almost certainly not worthwhile.

the extra code complexity is quite minor though, if it was any more complex i think i would object more.
