Saving many entities with relationships using projections causes poor performance and increases Forseti deadlocks errors when compared to OGM #2666
Comments
Thanks for the comprehensive report. The difference you are observing between SDN and Neo4j-OGM when it comes to updating/saving comes from the fact that Spring Data Neo4j has no cache or dirty-tracking mechanism. We chose this direction based on our experience with Neo4j-OGM, to avoid an additional virtual graph on the application side when there is already a Java object graph that should be written to the database. I think there is some potential in knowing which relationships already exist and which are new, to improve the save process even more. E.g. I am currently trying to avoid the merge altogether for existing relationships.
Thanks for the detailed response! We almost never use We also are curious about the ordering of relationship type creation. We have noticed both in OGM and SDN that the order in which different relationships get created is not consistent.
Saving this entity means it will need to save a relationship to Is ordering something that can be configured with SDN? Thanks! |
Although technically it would be possible to have a fixed order of property parsing (plain property + relationship definitions), this order would be fixed for every user. This would be done by a
Thanks! It sounds like, if we wanted to control the order in which relationship types are created across all entities, we would have to use a separate implementation in the meantime, until SDN supports some sort of ordering.
Also, where would we pass in the
did you mean ordering shouldn't be a solution? We know from Neo4j that ordering the |
There is currently no way to provide a custom Also, I just realised that I got you wrong on the ordering in general. I was talking about the properties on the |
We are using:
spring-data-neo4j 6.3.5
neo4j 4.3.10
As described in the reference doc, saving relationships for multiple entities tends to generate and run several queries.
It goes something like:
-to-many relationship) using the CREATE and UNION MATCH statement
-to-many relationships).
We can see all these queries being logged in the Neo4j `query.log`.
Assume we have a model
and we wanted to save multiple `Root` entities including their relationships to `Child` or `OtherChild`. We are just interested in creating the relationship to existing entities, so assume the `Child` and `OtherChild` entities are already persisted in the DB.
We currently run something like:
where the projection is defined as
This works as expected: it creates all the root entities and correctly creates the relationships with `MERGE (startNode)-[:FIRST_REL_TYPE]->(endNode)` etc.
However, it doesn't seem to scale in terms of performance when the number of root entities and relationships to save is large. We noticed from the Neo4j logs that SDN appears to run an update-property query for each related entity of each root entity. It also runs a separate `MERGE` query for each root entity for each relationship type. This is in contrast to OGM, which appeared to do a "batch" `UNWIND` `MERGE` query per relationship type. SDN appears to do that only for the -to-many relationship type and for the root entities' properties.
We have tried removing the ids from the projections like so:
but at a minimum the queries will always update the `id`, because we are using generated id values for all of our entities. I suppose we could stop using generated id values, but that is not a solution we can consider.
We have also noticed duplicate queries being run if, for example, multiple root entities reference the same related entity:
in this example, a query to update the properties of `Child` will run twice.
Also, if the root entity is not new, all existing relationships are first deleted, which seems inefficient if some of them are expected to just be re-created in the next `MERGE` query.
Is there a way we can avoid generating update queries for the related entity properties if we just want to create the relationship between existing entities?
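To illustrate the duplicate-query concern, here is a minimal sketch of collapsing repeated references to the same related entity before any per-entity query is issued. This is not SDN code: `RelatedEntityDeduplicator`, `Ref` and `distinctByLabelAndId` are hypothetical names, and the sketch assumes a related entity can be identified by its label plus its generated `id`.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper (not part of SDN): keeps one reference per related entity. */
final class RelatedEntityDeduplicator {

    /** Minimal stand-in for a related entity such as Child; only label and id matter here. */
    static final class Ref {
        final String label;
        final long id;

        Ref(String label, long id) {
            this.label = label;
            this.id = id;
        }
    }

    /** Returns one Ref per (label, id) pair, preserving first-seen order. */
    static List<Ref> distinctByLabelAndId(List<Ref> refs) {
        Map<String, Ref> seen = new LinkedHashMap<>();
        for (Ref ref : refs) {
            // Assumption: label + id uniquely identifies a persisted node.
            seen.putIfAbsent(ref.label + "#" + ref.id, ref);
        }
        return new ArrayList<>(seen.values());
    }

    public static void main(String[] args) {
        // Two roots pointing at the same Child produce two references to Child 1.
        List<Ref> refs = List.of(new Ref("Child", 1), new Ref("Child", 1), new Ref("OtherChild", 1));
        System.out.println(distinctByLabelAndId(refs).size());
    }
}
```

With a step like this, a `Child` referenced by several `Root` entities would be touched once per batch rather than once per reference.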
Furthermore, we have noticed an increase in Neo4j Forseti deadlock errors when compared to OGM. From our investigation and from the official Neo4j documentation, we can reduce the number of deadlock errors by ordering our `SET` and `MERGE` queries, and whenever possible we should use a `CREATE` query instead of `MERGE` to create relationships. Currently, SDN doesn't seem to follow a consistent order, even when ordering the relationships inside the projection interfaces.
Is there a way we can tell SDN to use `CREATE` instead of `MERGE` dynamically at run time? Also, is there a way to avoid updating related entity properties and to order the `MERGE` queries so that the same order is preserved between restarts of the app?
I did notice there were existing issues that seem to report similar findings:
#2235
#2636
#2593
#2588
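The ordering idea mentioned above (running relationship writes in a consistent order) can be sketched on the application side as follows. This is only an illustration under assumptions: `RelationshipOrdering` and `Rel` are hypothetical names, `SECOND_REL_TYPE` is a made-up second relationship type, and whether such an ordering actually reduces deadlocks depends on how the resulting queries are executed.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical helper (not part of SDN): sorts pending relationship writes deterministically. */
final class RelationshipOrdering {

    /** A pending relationship between two already persisted nodes. */
    static final class Rel {
        final String type;
        final long startId;
        final long endId;

        Rel(String type, long startId, long endId) {
            this.type = type;
            this.startId = startId;
            this.endId = endId;
        }
    }

    /** Orders by relationship type, then start node id, then end node id. */
    static final Comparator<Rel> STABLE_ORDER = Comparator
            .comparing((Rel r) -> r.type)
            .thenComparingLong(r -> r.startId)
            .thenComparingLong(r -> r.endId);

    /** Returns a sorted copy, so the order is the same on every run and after every restart. */
    static List<Rel> sorted(List<Rel> rels) {
        List<Rel> copy = new ArrayList<>(rels);
        copy.sort(STABLE_ORDER);
        return copy;
    }

    public static void main(String[] args) {
        List<Rel> rels = List.of(
                new Rel("SECOND_REL_TYPE", 1, 2),
                new Rel("FIRST_REL_TYPE", 2, 1),
                new Rel("FIRST_REL_TYPE", 1, 5));
        for (Rel r : sorted(rels)) {
            System.out.println(r.type + " " + r.startId + " -> " + r.endId);
        }
    }
}
```

Because the comparator depends only on the data, not on reflection or map iteration order, concurrent transactions that write overlapping relationships would process them in the same sequence.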
Will these performance and deadlocking issues be resolved in upcoming versions?
Is there an alternative, more performant, way that avoids running so many queries to save relationships for root entities?
Currently, we have decided to define our own custom `UNWIND` `@Query()` repository queries instead of using the template `Neo4jOperations`, `saveAll()`, `saveAllAs()` etc. methods.
Something like:
which allows us to batch create relationships while avoiding the need to update related entity properties. We also have control over whether to use the `CREATE` clause instead of `MERGE` to avoid deadlocks.
We have seen significant improvements both in performance and in the reduction of Forseti deadlock errors when saving a large number of root entities.
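For readers who want a concrete picture of this kind of batching, here is a minimal sketch of preparing the parameter rows for a single `UNWIND` query. It is not our actual repository code: `RelationshipBatch`, `toRows` and the parameter names `rows`, `rootId` and `childId` are hypothetical, and the Cypher text assumes `Root` and `Child` nodes that are matched on an `id` property.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical helper: prepares parameters for one batched relationship query. */
final class RelationshipBatch {

    // Assumed Cypher: one statement creates every relationship in the batch.
    // CREATE is used instead of MERGE, so the caller must know the relationships do not exist yet.
    static final String CREATE_RELATIONSHIPS =
            "UNWIND $rows AS row "
          + "MATCH (r:Root {id: row.rootId}) "
          + "MATCH (c:Child {id: row.childId}) "
          + "CREATE (r)-[:FIRST_REL_TYPE]->(c)";

    /** Builds one parameter row per (rootId, childId) pair. */
    static List<Map<String, Object>> toRows(Map<Long, List<Long>> childIdsByRootId) {
        List<Map<String, Object>> rows = new ArrayList<>();
        childIdsByRootId.forEach((rootId, childIds) -> {
            for (Long childId : childIds) {
                rows.add(Map.of("rootId", rootId, "childId", childId));
            }
        });
        return rows;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> rows = toRows(Map.of(1L, List.of(10L, 11L)));
        System.out.println(rows.size() + " rows for: " + CREATE_RELATIONSHIPS);
    }
}
```

The resulting list would then be bound to the `$rows` parameter of the custom query. Only ids travel to the database, so no properties of the related entities are rewritten.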
We are still wondering, however, if there are any other recommendations we can implement using SDN that can help in a similar way and will allow us to delegate relationship creation back to SDN.
Thank you!