Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Loading and general core infrastructure micro-optimization #228

Open
3 of 7 tasks
gotmachine opened this issue May 26, 2024 · 0 comments
Open
3 of 7 tasks

Loading and general core infrastructure micro-optimization #228

gotmachine opened this issue May 26, 2024 · 0 comments
Labels
kspPerformance Possible performance improvement in KSP

Comments

@gotmachine
Copy link
Contributor

gotmachine commented May 26, 2024

This is intended to be a meta-issue tracking possible performance improvements by micro-optimizing stock methods.
The focus is mainly to try get ride of some overhead during part compilation and scene switches.

Note that frame time or performance numbers are obtained from deep profiling sessions, so the figures will be heavily skewed for methods having a deep call stack, and more representative for methods at the bottom of call stacks. For reference, here is the unity profiler (2019.4) deep profile dump I used, this only span over part compilation, and is in a more or less stock install (no KSPCF) :
https://drive.google.com/file/d/1MI_D634zUTt16F8VVJnFZ7maM5vgCxZN/view?usp=sharing

  • FlightGlobals.fetch will repeatedly call FindObjectOfType(typeof(FlightGlobals) when the instance doesn't exist, causing a very large overhead when the instance doesn't exist, such as during part compilation. Fixing this is a 20 to 40 % improvement on part compilation, depending on the mod set.
  • GameDatabase.GetModelPrefab() works by comparing the resuested url with UnityEngine.Object.name (which is extremely slow) in the list of all model GameObjects. 4-6% of part compilation call time (and 150+ KB of allocations). Backing that method with a Dictionary<string, GameObject> allow to eliminate the overhead entirely.
  • GameDatabase.GetTexture() and GameDatabase.GetTextureInfo() are also slow to a lesser extent. While they don't have the overhead of calling UnityEngine.Object.name, they still work by iterating on the whole TextureInfo list. The dictionary approach is however not as beneficial here, because those methods allow to get a texture by its case-insensitive url, so we have to fallback to iterating on the list with a case insensitive string comparison if the requested url isn't correctly cased. The net result was still beneficial in my tests, but not as much as I hoped due to that slow path being taken relatively often.
  • Part.Awake() and PartModule.Awake(), and specifically, Part.ModularSetup() and PartModule.ModularSetup() are really bad. Those methods are responsible for initializing the part and modules BaseFieldList, BaseEventList and BaseActionList, setting up the whole reflection shebang involved in the [KSPField], [KSPEvent] and [KSPAction] underlying infrastructure. Overall, setting that infrastructure is roughly the two thirds of the overall part compilation time. The issue is kinda two-fold. The first time ModularSetup() is called on specific Type, KSP parse all the type members in search of the mentioned attributes, to create a ReflectedAttributes object, which is then cached in a static Dictionary<Type, ReflectedAttributes>. This is nice, as it means those heavy reflection operations (which are roughly 1/4 (???) of the part compilation time) won't be needed on further instantiations of the same module type, but this is still quite costly. It might be worth investigating if the member and attribute parsing can be made a bit faster (most of time seems to be spent in repeated GetCustomAttributes() calls). But even when this cache has been built, the setup of the Base*List object is very reflection heavy. It seems despite some things being cached, a lot isn't, and there is a lot of delegate creation happening. It might be worth investigating better approach based on cached delegates build through Reflection.Emit. A major roadblock to improving things here is that the whole infrastructure is heavily based on generics and we might end up not being able to patch it with Harmony, although maybe a more sensible option would be to swap the stock classes for custom derived ones, this needs to be assessed as well.
  • PartLoader.CreatePartIcon() instantiate a full copy of the part, causing everything on it to be initialized (all components and modules awake, etc), then strips the copy of basically everything aside from the model. this is 15-20% of the part compilation time. The instantiated part manipulation can be made lot more efficient, but unfortunately, 90% of the time is just the part instantiation itself, specifically the Awake() of the part and of the modules. It would be possible to eliminate most of that overhead by instantiating a disabled copy to avoid Awake() from running, but this break any Awake() code manipulating the model state, which effectively happen quite quite often in stock, and also in various plugins (not to mention the icon manipulation capacities offered by overriding PartModule.OnIconCreate()). In the end, the best fix would be to not generate part instances at all for part icons, but replacing them with static images, which would have to be done anyway for the on-demand part texture loading project. This being said, improving the situation with PartModule.ModularSetup() would also improve the situation here, this where the the bulk of the Awake() overhead is.
  • Model loading is relying on byte-by-byte reading through a BinaryReader, which is awfully inefficient when backed by a MemoryStream. It allocate an array of bytes for reading every non-byte value (int, float, etc), which is especially bad when parsing meshes. Ideally, we would just iterate over the raw byte array. I'm relatively confident this could shave model loading time by a good 50 %, maybe more (model loading being 5-15% of total loading time depending on the mod set).
  • Probably not huge gains, but still, rewriting all the struct deserializers (such as for Vector3, Quaternion...) to use smarter string manipulation methods is likely to compound to decent improvements, and that would be beneficial in many other places. Patching the individual parsing methods the ConfigNode or KSPUtil classes might not end up being very beneficial due to the call overhead introduce by harmony, but there is definitely something to do about BaseField.ReadPvt(). Just eliminating the fieldType.IsValueType check would already be a good improvement.

On a side note, the fact that the vast majority of the part compilation time is spent in the various components Awake() methods mean that the whole idea (that I was very skeptical with) of trying to "compile" assetbundles out of the prefabs for future loadings just wouldn't improve anything.

@gotmachine gotmachine added the kspPerformance Possible performance improvement in KSP label May 26, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kspPerformance Possible performance improvement in KSP
Development

No branches or pull requests

1 participant