-
Notifications
You must be signed in to change notification settings - Fork 834
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Do not perform domain substitutions within comments #1026
base: master
Are you sure you want to change the base?
Do not perform domain substitutions within comments #1026
Conversation
5931d0a
to
84d04d2
Compare
2384438
to
b070bab
Compare
Covers C/C++, Java, Javascript and Python file extensions
b070bab
to
41d6306
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm okay with the overall idea here (due to the practical problems caused by trying to keep this script simple and non-discretionary in the past, such as the Android manifest schema URL problem a while back), but IMO the implementation is not quite ideal.
One thing that comes to mind is this sort of pre- and post- processing before actual substitution shouldn't be inlined with the actual substitution logic. I think this should be abstracted, since this is the first time that we're introducing pre- and post- substitution logic, and I think there's potential for additional processing in the future. I'll need to think more about how I'd like to organize the code.
My second concern is regarding the implementation. For starters, the regex for the non-(Python and GN) files don't apply to all the selected file types. Secondly, I'm not 100% convinced the regex will work in all scenarios, especially if character escapes get involved. Since domain substitution is an essential component, I'd rather play it safe and implement AST parsing. This should allow us to properly support docstrings as well. I can take a look at how this can be implemented for the filetypes listed here.
As a general rule I avoid adding code for the "just in case" situation, but if you have some use-cases to leverage already then sure - why not.
What do you mean here? I am genuinely curious about some bug I might not have spotted.
That would indeed be a very advanced and complete solution. |
I think the AndroidManifest schema problem is another such example. Also, anything in
Actually, I was mistaken about this. I didn't realize all those languages supported both of those commenting types. However, I also realized your code will break if the comment delimiter appears in strings. I don't know if any substitutions could be affected by this, though. |
Yes, I am aware, I decided to go with this risk. I have inspected the produced diff (now much smaller thanks to the code in this PR) a few times (even if not thoroughly) and it seemed correct. |
What issue is tried to be solved by not performing domain substitution on comments? If it's the timestamps, wasn't that taken care of in 1a0e163? |
@Luka-Filipovic I create git commits after performing the domain substitutions and there are several MBs of unwanted changes; I also think that changing the timestamp alone won't prevent the file from being rebuilt if one if its comments changed. |
What percentage of files from lists have links only in comments? Not doing domsub in test files could also help with your problem, and it can be made only by parsing the file name. When comments and test files are cut out, what could be the next step in improving this process?
If that's the case (I didn't test it myself yet), would #1185 help the problem? |
I already covered this by excluding all test files/directories I could identify.
Can't say, but if you make a
ninja uses timestamps like make, there's an open ninja issue about using file characteristics. The problem is when you build Chromium and then run domain substitution: setting all files to a precedent timestamp to trick ninja into not rebuilding those files is wrong, as you cannot tell if only comments or code were changed. Thus what will happen is that there will be no advantage in building Chromium and then your modified version with patches and domain substitution, as most files will be rebuilt because of the changes. ccache mitigates this to some extent, but it still leaves room for improvement; with the patch I proposed here I made this process of re-using unchanged object files more efficient. As for #1185 I actually think it's better to apply domain substitution after the other patches, so that they can be used individually without dependencies on the big domain substitution patch (which I treat as a patch, not different from any other). |
That's interesting. What use-case do you have where you'd want to build Chromium once first, apply domain substitution, then re-run the build? I envisioned only these two workflows for domain substitution:
Of these two workflows, the timestamp manipulation logic is useful only for the incremental build workflow. |
The Bromite build process is a bit like that:
The Chromium builds are important because before a bug is submitted for Bromite it needs to be tested against vanilla Chromium first (yes, we often incur in upstream bugs which become visible downstream because of specific feature/GN flags combinations). In general you want to make the best out of ninja, C++, Java own invalidation systems, and ccache's one; the changes in the comments paddle against all of them as at least the parser must run to find out that nothing changed. |
Covers C/C++, Java, Javascript and Python file extensions
This change will prevent domain substitutions to happen within comments or comments blocks, thus reducing the amount of unnecessary file changes when applying domain substitution.
This is what I had in mind for issue #849.