-
Notifications
You must be signed in to change notification settings - Fork 528
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[JENKINS-58085] Fix issue where stages are shown as waiting to start when they are running/completed #2017
[JENKINS-58085] Fix issue where stages are shown as waiting to start when they are running/completed #2017
Changes from 1 commit
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change | ||||
---|---|---|---|---|---|---|
|
@@ -687,7 +687,8 @@ public List<BluePipelineNode> union(List<FlowNodeWrapper> previousNodes, final L | |||||
|
||||||
boolean graphsAreCompatible = true; | ||||||
|
||||||
Map<String, FlowNodeWrapper> newNodes = new HashMap<>(previousNodes.size()); // indexed by id | ||||||
// A map from the ID of a node from previousNodes to the corresponding node for the current build. | ||||||
Map<String, FlowNodeWrapper> newNodes = new HashMap<>(previousNodes.size()); | ||||||
|
||||||
// Start with the currently-executing nodes | ||||||
for (FlowNodeWrapper currentNodeWrapper : nodes) { | ||||||
|
@@ -710,14 +711,14 @@ public List<BluePipelineNode> union(List<FlowNodeWrapper> previousNodes, final L | |||||
|
||||||
// New wrapper object based on current execution | ||||||
FlowNodeWrapper newNodeWrapper = new FlowNodeWrapper( | ||||||
oldNodeWrapper.getNode(), // Use old node because we want to show the final id not intermediary | ||||||
currentNodeWrapper.getNode(), | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This is the main fix. The old comment doesn't make sense to me. My understanding is that in this case, we have a real node from the current Pipeline. Its ID will never change, even when the build completes, so we already have the "final" ID. Because the code checks for a match using There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Maybe we should just delete There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The root problem is the node ids do change during a run, thanks to the way this code walks the graph backwards instead of forwards. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Changing this will fix some of the bad graphs, and break others that are currently working. I used the old node because it's from a finished run, which has the "final" node ids. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
The ID for an actual Pipeline If you mean the problem is that a single
The problem is that expecting the node IDs from a different build to match the node IDs in the current build is absolutely not something you can count on. It happens to work in some cases, but Further on in this code, where we use the IDs from the old build in what we return, is also a bit suspect, because one of the old IDs might refer to a different node in the current Pipeline, but since those nodes are marked as having not started it seems to not really matter. It would be safer to make the node IDs for the fake There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Ok, I came up with two minimal test cases where nodes that show up while the build is in progress do not appear in the final graph. Both of them involve a Here is the Declarative test case:
Here is the scripted test case:
In both cases, There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Ok, looking into those test cases, the main difference in the graph of There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I couldn't reproduce all of the situations that cause node drift reliably, it's definitely timing-sensitive. I think it occurs only when parallel stages are just starting or just ending, such that the nodes exist but not all the events for those nodes do yet. I don't really have more detail to give, other than my conclusion that it'd be quicker building a new graph visitor that uses the newer pipeline API via feature flag rather than trying to debug the exact event sequence that triggers it. Fresh eyes may be able to find it though. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Another possibility might be adopting a different id scheme, that is tied to the structure of the graph (akin to an xpath or something) rather than just reusing the ids given to the pipeline nodes by Jenkins. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yeah, from what I can tell, the problem for my test cases from #2017 (comment) is that Line 321 in db8d55d
Because of that, That said, my change doesn't affect these cases as far as I can tell, they are broken with or without my fix, and the problem is the Line 705 in db8d55d
Because the overall condition is true, the graphs will be marked as incompatible, and so For now, I can't think of an example where my changes will make things worse, so I am going to mark this PR as ready for review. If we come up with an example where it does make things worse, I am happy to look into it, and if there are any regressions from the PR, I am happy to look into them as well. |
||||||
currentNodeWrapper.getStatus(), | ||||||
currentNodeWrapper.getTiming(), | ||||||
currentNodeWrapper.getInputStep(), | ||||||
run | ||||||
); | ||||||
|
||||||
newNodes.put(nodeId, newNodeWrapper); | ||||||
newNodes.put(oldNodeWrapper.getId(), newNodeWrapper); | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Ok, so with some more testing, it turns out that this is actually the only change needed to fix the test case I am working with. The problem is that |
||||||
|
||||||
} else { | ||||||
// Node does not exist in previous graph, user probably changed pipleine definition. | ||||||
|
@@ -729,9 +730,9 @@ public List<BluePipelineNode> union(List<FlowNodeWrapper> previousNodes, final L | |||||
// Walk the old graph, create new wrappers for any stages not yet started | ||||||
if (graphsAreCompatible) { | ||||||
for (FlowNodeWrapper oldNodeWrapper : previousNodes) { | ||||||
final String nodeId = oldNodeWrapper.getId(); | ||||||
final String oldNodeId = oldNodeWrapper.getId(); | ||||||
|
||||||
if (!newNodes.containsKey(nodeId)) { | ||||||
if (!newNodes.containsKey(oldNodeId)) { | ||||||
FlowNodeWrapper newNodeWrapper = new FlowNodeWrapper( | ||||||
oldNodeWrapper.getNode(), | ||||||
new NodeRunStatus(null, null), | ||||||
|
@@ -740,17 +741,17 @@ public List<BluePipelineNode> union(List<FlowNodeWrapper> previousNodes, final L | |||||
run | ||||||
); | ||||||
|
||||||
newNodes.put(nodeId, newNodeWrapper); | ||||||
newNodes.put(oldNodeId, newNodeWrapper); | ||||||
} | ||||||
} | ||||||
} | ||||||
|
||||||
// Re-create edges and parentage based on previous run | ||||||
if (graphsAreCompatible) { | ||||||
for (FlowNodeWrapper oldNodeWrapper : previousNodes) { | ||||||
final String nodeId = oldNodeWrapper.getId(); | ||||||
final String oldNodeId = oldNodeWrapper.getId(); | ||||||
|
||||||
FlowNodeWrapper newNodeWrapper = newNodes.get(nodeId); | ||||||
FlowNodeWrapper newNodeWrapper = newNodes.get(oldNodeId); | ||||||
|
||||||
for (FlowNodeWrapper oldEdge : oldNodeWrapper.edges) { | ||||||
FlowNodeWrapper newEdge = newNodes.get(oldEdge.getId()); | ||||||
|
@@ -773,8 +774,8 @@ public List<BluePipelineNode> union(List<FlowNodeWrapper> previousNodes, final L | |||||
// code that makes too many assumptions | ||||||
|
||||||
for (FlowNodeWrapper oldNodeWrapper : previousNodes) { | ||||||
final String nodeId = oldNodeWrapper.getId(); | ||||||
FlowNodeWrapper newNodeWrapper = newNodes.get(nodeId); | ||||||
final String oldNodeId = oldNodeWrapper.getId(); | ||||||
FlowNodeWrapper newNodeWrapper = newNodes.get(oldNodeId); | ||||||
newList.add(new PipelineNodeImpl(newNodeWrapper, () -> parent, run)); | ||||||
} | ||||||
|
||||||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought it would be useful to clarify that the IDs here come from the old build, not the new build, and I updated the names of the variables that get put into the map to try to make that clearer as well.