Performance | Mesh > 100K slow to process & memory hungry #7
Comments
Here's a slightly more formal performance test:

    import timeit

    import numpy as np
    import trimesh
    # assuming the package name; adjust the import to wherever simplify_mesh lives
    from quad_mesh_simplify import simplify_mesh

    def decimate_sphere(sphere):
        simplified = simplify_mesh(np.array(sphere.vertices),
                                   np.array(sphere.faces, dtype=np.uint32),
                                   len(sphere.vertices) // 2)
        return simplified

    def test_large_object_decimation():
        for i in range(1, 7):
            sphere = trimesh.creation.icosphere(subdivisions=i, radius=1.0, color=None)
            t = timeit.timeit(lambda: decimate_sphere(sphere), number=1)
            print(f'Decimated {len(sphere.vertices)} vertices in {t} seconds')
Yes, I know that the performance is quite bad. The problem is that I needed the ability to reduce the mesh to an exact number of vertices, not just to collapse every vertex below a certain error threshold. So I sort the errors each iteration to remove the pair with the smallest error next. This needs a lot of memory to organise the data efficiently: for example, I keep the errors on a heap, because I only ever need the smallest; for the rest, I don't care how they are sorted. The original paper also reports better reduction times, but I couldn't reach them. In addition, I needed the exact interpolation between the feature vectors, so I couldn't use the closed-form solution for the new optimal vertex. Instead, for each edge I have to try 10 candidate positions and pick the best one. There are implementations that neither precalculate the errors nor sort them; they just merge every vertex pair below a given threshold. This is faster and does not need as much memory, but it is not as accurate (see here). Since that would be a completely different implementation, I just implemented everything according to my needs.
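To make the data-structure point concrete, here is a minimal sketch (illustrative only, with hypothetical helper callables rather than the library's internals, and with stale-entry handling omitted) of driving the collapses from a min-heap of pair errors until an exact vertex count is reached:

    import heapq
    import itertools

    def greedy_collapse(valid_pairs, error_of, collapse, n_vertices, target_vertices):
        # Hypothetical helpers: error_of(pair) returns the quadric error of a
        # candidate pair, collapse(pair) merges it and returns the pairs whose
        # error changed. One heap entry is kept per valid pair, which is where
        # most of the extra memory goes.
        counter = itertools.count()                # tie-breaker so pairs are never compared
        heap = [(error_of(p), next(counter), p) for p in valid_pairs]
        heapq.heapify(heap)
        while n_vertices > target_vertices and heap:
            _err, _, pair = heapq.heappop(heap)    # always the currently smallest error
            affected = collapse(pair)
            n_vertices -= 1
            for p in affected:                     # re-insert neighbours with refreshed errors
                heapq.heappush(heap, (error_of(p), next(counter), p))
        return n_vertices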
I think I follow the first point. Your use case requires that you hit a specific vertex count with high accuracy, so you need to ensure you're removing the highest-error nodes, and you have to re-sort after every iteration to ensure you truly are removing the highest-error nodes until you hit your target. For your second point, it sounds like the calculated output features are interpolated from the values of the collapsed source vertices and the new vertex position. I'm not clear on why you need to "try 10 times each edge." Does this impact the performance when there are no features provided? One thing I was curious about was the impact of the threshold parameter. BTW, please feel free to close this if you feel this issue doesn't align with the goals of your library.
Actually, I am removing the lowest, because the error states how much the shape is altered by a collapse. 10 is just an arbitrary number I chose. As described in the paper, the optimal new position is the solution of a linear problem; for that, a matrix has to be invertible, which is not always the case. And because I would also need to find out how the two original vertices are "mixed" into the new one, I skip this step, which would be roughly O(1) (if the matrix is invertible), and always evaluate 10 linear combinations of the original vertices => O(10). So this is one downside of my implementation. There would be the opportunity to use the exact solution in the case where no features are provided; this might speed up the reduction a bit. And currently, the threshold is used to allow vertices that are not linked by an edge to collapse (see paper). Therefore, it increases the number of "valid pairs" that may be collapsed. Lastly, it is a current limitation that very large meshes cannot be reduced quickly, so I'll keep this issue open. ;) Thank you for your huge interest and support!
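To illustrate the O(10) candidate search described above (a sketch with my own naming, not the library's code), each candidate position is a linear combination of the two endpoints, its cost is the quadric form v^T Q v from the paper, and the features are interpolated with the same weight so they stay consistent with the chosen position:

    import numpy as np

    def best_candidate(v1, v2, Q, f1=None, f2=None, steps=10):
        # Pick the best of `steps` linear combinations of v1 and v2 under the
        # combined 4x4 quadric Q. Features, if given, are interpolated with the
        # same weight, which is why the closed-form optimum is skipped.
        best_err, best_pos, best_feat = np.inf, None, None
        for t in np.linspace(0.0, 1.0, steps):
            pos = (1.0 - t) * v1 + t * v2
            v = np.append(pos, 1.0)                 # homogeneous coordinates
            err = float(v @ Q @ v)                  # quadric error v^T Q v
            if err < best_err:
                best_err, best_pos = err, pos
                if f1 is not None and f2 is not None:
                    best_feat = (1.0 - t) * f1 + t * f2
        return best_pos, best_feat, best_err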
I was thinking about the accuracy vs. performance trade-off you mentioned. What if there were some sort of batch-size setting? By way of example, with a value of 1, every time a collapse happens the heap will be re-evaluated, but with a value of 100 a batch of 100 vertices could be removed before re-evaluating. I'm not sure if this sort of slider would be useful to you, but I think, without realizing it, this trade-off has become very useful to me. The high-quality results for smaller objects truly outperform other libraries I've tried. On the other hand, as objects get larger things become unusable and I need to switch to a different library.
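Something like the following is what I had in mind (purely a sketch; the parameter name and helpers are hypothetical and not part of the library, and it reuses the (error, tie-breaker, pair) heap entries from the earlier sketch): pop a batch of the currently cheapest pairs, collapse them, and only then refresh the errors.

    import heapq

    def collapse_batch(heap, collapse, refresh, batch_size=1):
        # batch_size=1 keeps the current accurate behaviour; larger values
        # trade accuracy for speed because errors go stale within a batch.
        batch = [heapq.heappop(heap) for _ in range(min(batch_size, len(heap)))]
        for _err, _tie, pair in batch:
            collapse(pair)
        refresh(heap)          # e.g. recompute affected errors and re-heapify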
Sorry for the late response. The suggestion is worth thinking about, but I see one large problem: I am currently using a min-heap, so I need to "re-sort" it each time, meaning that after removing the current root (the minimum element) I need to find the next-smallest error. One could replace it with a fully sorted list, but that would increase the cost of initialising the data structure. Maybe one could implement batches > 1 by looking at the children (v1, v2) of the root, selecting the smaller one and contracting it (say v1). Afterwards, compare v2 to the children of v1 and contract the smallest, and so on. But maybe one has to rebuild the tree structure after each batch. What do you think?
Currently, after each iteration the heap has to be restored. So replacing the current heap with a pairing heap could be beneficial and would simplify the batch implementation. If you want to, please give it a try ;).
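For context, a pairing heap gets its advantage from an O(1) meld: re-inserting a whole batch of updated errors is just a few merges rather than many sift operations. A minimal, generic sketch (not tied to this library):

    class PairingNode:
        # A pairing-heap node: a key (the error), a payload, and a list of children.
        def __init__(self, key, value):
            self.key, self.value, self.children = key, value, []

    def meld(a, b):
        # Merge two pairing heaps in O(1): the root with the larger key
        # becomes a child of the other root.
        if a is None:
            return b
        if b is None:
            return a
        if a.key <= b.key:
            a.children.append(b)
            return a
        b.children.append(a)
        return b

    def insert(heap, key, value):
        return meld(heap, PairingNode(key, value))

    def delete_min(heap):
        # Two-pass pairing of the children restores the heap after removing the root.
        children = heap.children
        paired = [meld(children[i], children[i + 1] if i + 1 < len(children) else None)
                  for i in range(0, len(children), 2)]
        new_root = None
        for node in reversed(paired):
            new_root = meld(new_root, node)
        return heap.key, heap.value, new_root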
But as for the required memory, I don't see any opportunity to improve it any further.
@jannessm I know you're quite busy, but I just wanted to capture my findings when processing larger meshes. I've found that when decimating meshes larger than ~100K triangles, decimation grinds to a halt.
For example:
Some informal measurements:
The closest comparison to this library that I have is the decimation implementation in OpenMesh. It doesn't have all the same features; for example, it doesn't seem to handle edge preservation as well, so it may not be a fair performance comparison. Nonetheless, it decimates this sphere in less than 1 second and, I believe, uses less than 1GB of memory.