Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[IO-670] refine IOUtils.contentEqualsIgnoreEOL(Reader, Reader) #118

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

XenoAmess
Copy link
Contributor

@XenoAmess XenoAmess commented Jun 1, 2020

refine IOUtils.contentEquals(Reader, Reader) from simply wrapped by BufferedReader to something better.

@XenoAmess XenoAmess changed the title refine IOUtils.contentEquals(Reader, Reader) [IO-670] refine IOUtils.contentEquals(Reader, Reader) Jun 1, 2020
@XenoAmess
Copy link
Contributor Author

XenoAmess commented Jun 1, 2020

need I make a performance test case to show the performance?

@XenoAmess
Copy link
Contributor Author

I also want to change other contentEquals functions in IOUtils (using similar way), but I want to listen to your advices first.

@coveralls
Copy link

coveralls commented Jun 1, 2020

Coverage Status

Coverage increased (+0.005%) to 89.4% when pulling eb25f9a on xenoamess-fork:refine_contentEquals into c54bf68 on apache:master.

@garydgregory
Copy link
Member

@XenoAmess

  • When you create a PR, please do not leave the description empty. It makes it harder to review.
  • If a PR claims a performance improvement, it must back it up; some components do this using JMH microbenchmarks, for example please see Apache Commons BCEL, Crypto, CSV, Lang, Math, RNG.

@XenoAmess
Copy link
Contributor Author

XenoAmess commented Jun 1, 2020

@garydgregory

@XenoAmess

  • When you create a PR, please do not leave the description empty. It makes it harder to review.

OK

  • If a PR claims a performance improvement, it must back it up; some components do this using JMH microbenchmarks, for example please see Apache Commons BCEL, Crypto, CSV, Lang, Math, RNG.

I have only experiences using Jprofiler and I don't know how to embed a test with performance in junit.
I'll learn about JMH microbenchmarks.
thanks.
update: I learned how to use it, thanks for your recommendation.

also, need I create more tests for testing wether this function is correct?
or tests that already exist, which for the older function, is good enough and I need not create more?

@XenoAmess
Copy link
Contributor Author

@garydgregory
I will use this as example and change its content and add the test into this repo if you don't mind.
https://github.com/apache/commons-lang/blob/5cdac9cfd5a74b0a52ebde32798b973c6edbaa79/src/test/java/org/apache/commons/lang3/HashSetvBitSetTest.java

@XenoAmess
Copy link
Contributor Author

@garydgregory BTW, how can I show the performance result generated?
simply paste it here?

@XenoAmess

This comment has been minimized.

@XenoAmess

This comment has been minimized.

@XenoAmess
Copy link
Contributor Author

@garydgregory
Hi gary.
please find some time for this pr.
If this pr make sence, then I'll renew other contentEquals functions in the same class.
thx.

@XenoAmess
Copy link
Contributor Author

XenoAmess commented Jun 8, 2020

OK, according to 72H lazy consensus, I think nobody strongly against this change.
and I will continue changing codes like this in other IOUtils.contentEquals overloads today.

@garydgregory
Copy link
Member

Just FYI, your expectations for 72 hour consensus might be slightly off, it usually applies to certain kinds of votes, not parts of PRs.

@XenoAmess
Copy link
Contributor Author

XenoAmess commented Jun 9, 2020

Just FYI, your expectations for 72 hour consensus might be slightly off, it usually applies to certain kinds of votes, not parts of PRs.

@garydgregory
Oh really... Then I might should delay it...
Anyway, I've already finished what I want in this pr for now.
Now all the 3 contentEquals functions are remaked.
I also added some tests.
So if anybody want to review it this might be a good time.

@sebbASF
Copy link
Contributor

sebbASF commented Jul 29, 2020

The code needs comments.
It's not clear what the variables are doing, nor what the algorithm is.

In the case of contentEquals(InputStream...) and contentEquals(Reader ...) it would be enough to comment one and refer the reader to the other for details as they are relatively short and use exactly the same logic AFAICT. I don't suppose there is any way to share the code?

The other method is now enormous, and really ought to be refactored if possible.
It certainly needs careful comments to describe the algorithm.

Also what does -1 mean? I assume it is the EOF marker, in which case why not use the EOF constant?

@XenoAmess
Copy link
Contributor Author

XenoAmess commented Jul 29, 2020

Hi.
I added comments for a function, and refined performance test recently.
full test at: https://pastebin.ubuntu.com/p/m9PK52gznf/
In short:

Benchmark                                                                   Mode  Cnt           Score           Error  Units
IOUtilsContentEqualsPerformanceTest.testContentEqualsForFileNew             avgt   25     4021167.663 ?    30682.529  ns/op

IOUtilsContentEqualsPerformanceTest.testContentEqualsForFileOld             avgt   25     4396755.117 ?    45747.957  ns/op

IOUtilsContentEqualsPerformanceTest.testContentEqualsForStringNew           avgt   25   464646861.938 ?  8169109.450  ns/op

IOUtilsContentEqualsPerformanceTest.testContentEqualsForStringOld           avgt   25  2895667654.000 ? 77500989.791  ns/op

IOUtilsContentEqualsPerformanceTest.testContentEqualsIgnoreEOLForFileNew    avgt   25     4124001.545 ?    56479.453  ns/op

IOUtilsContentEqualsPerformanceTest.testContentEqualsIgnoreEOLForFileOld    avgt   25     4190892.835 ?    55878.484  ns/op

IOUtilsContentEqualsPerformanceTest.testContentEqualsIgnoreEOLForStringNew  avgt   25  3165287610.267 ?398266582.335  ns/op

IOUtilsContentEqualsPerformanceTest.testContentEqualsIgnoreEOLForStringOld  avgt   25  6108671322.000 ? 68331589.659  ns/op

Considering about FileReader itself is a time coster(and too slow that might hide the contentEquals's runing time), so I mainly use StringReader 's result for showing performance gain.
For contentEquals, it is about (2895667654.000/464646861.938) - 100% = 523% faster.
For contentEqualsIgnoreEOL, it is (6108671322.000/3165287610.267) - 100% = 92% faster.

@XenoAmess

This comment has been minimized.

src/main/java/org/apache/commons/io/IOUtils.java Outdated Show resolved Hide resolved
src/main/java/org/apache/commons/io/IOUtils.java Outdated Show resolved Hide resolved
src/main/java/org/apache/commons/io/IOUtils.java Outdated Show resolved Hide resolved
src/main/java/org/apache/commons/io/IOUtils.java Outdated Show resolved Hide resolved
src/main/java/org/apache/commons/io/IOUtils.java Outdated Show resolved Hide resolved
src/main/java/org/apache/commons/io/IOUtils.java Outdated Show resolved Hide resolved
@sebbASF
Copy link
Contributor

sebbASF commented Jul 29, 2020

It occurs to me that the code is effectively buffering the output from a BufferedReader (or BufferedInputStream).
One would expect these classes to be reasonably fast, however the JVM has to do locking and other checks in order to support multi-threading. It has to do this for each read() call.

Rather than implement the buffering directly in the compare methods, it might be better to implement a generic, non-threadsafe buffered reader/stream that can be used in situations such as these. [No need to wrap the original input in a buffered version first]. It would be a non-threadsafe version of BufferedReader/InputStream.

The original code should then work without any change other than to add the filter.

@XenoAmess
Copy link
Contributor Author

XenoAmess commented Jul 29, 2020

@sebbASF

It occurs to me that the code is effectively buffering the output from a BufferedReader (or BufferedInputStream).

Yes, this is the main trick.
For that two not-so-long function, I used another trick to make it even faster by using same checkIndex for two buffers.

One would expect these classes to be reasonably fast, however the JVM has to do locking and other checks in order to support multi-threading. It has to do this for each read() call.

Rather than implement the buffering directly in the compare methods, it might be better to implement a generic, non-threadsafe filter that can be used in situations such as these.

The original code should then work without any change other than to add the filter.

I used the filter idea you mentioned and re-write another implementation, named contentEqualsIgnoreEOLNew2.
You can see it in the latest commit.
A fast performance test shows that it is SLOWER than my giant function (called contentEqualsIgnoreEOLNew1), but still very much FASTER than the original implementation in commons-io.

And I worried if we really split it out to class, it may become even slower.
(still, faster than original, of course.)

But, I do agree with your idea, a non-thread-safe-but-faster-BufferedReader is valuable in many cases.
Should we find some time to implement it?

@XenoAmess
Copy link
Contributor Author

will re-run the performance test and see how it can get after 10s warm-up.
If the performance lost is not so much, I will recommand contentEqualsIgnoreEOLNew2 as it is really shorter.

@sebbASF
Copy link
Contributor

sebbASF commented Jul 29, 2020

It looks to me like contentEqualsIgnoreEOLNew2 is using InputReader - I cannot see any buffering?

@XenoAmess
Copy link
Contributor Author

XenoAmess commented Jul 29, 2020

@sebbASF

It looks to me like contentEqualsIgnoreEOLNew2 is using InputReader - I cannot see any buffering?

it is embeded.
I used charArray1 and charArray2 as buffer, and make it a filter-like-check everytime we read some chars, to make sure all "\r\n" become "\n", and all "\r" become "\n" (actually not all, but the basic idea is like this).

@sebbASF
Copy link
Contributor

sebbASF commented Jul 29, 2020

I see. I would be clearer to use a separate class to implement this.

However the method inputOnlyHaveCRLForEOF uses read() on an unbuffered input class.

Note that IO has a CircularBufferInputStream which may be suitable for at least one of the cases. Should be easy enough to implement the Reader equivalent.

@XenoAmess
Copy link
Contributor Author

XenoAmess commented Jul 29, 2020

Hi.
I just found a bug in contentEqualsIgnoreEOLNew2.
Will redo it tomorrow.


Actually will redo it anyway, as I thought it good to split it out to some class structure, rather than repeat two times in the function.

@XenoAmess

This comment has been minimized.

@XenoAmess
Copy link
Contributor Author

Hi.
I simplified the codes today, and fixed the bugs found yesterday.
full test at: https://pastebin.ubuntu.com/p/XNj26hJZYM/
In short:

Benchmark                                                                                  Mode  Cnt           Score           Error  Units
IOUtilsContentEqualsPerformanceTest.testContentEqualsForFileNew                            avgt    5      396240.533 ?     36799.033  ns/op
IOUtilsContentEqualsPerformanceTest.testContentEqualsForFileOld                            avgt    5      433584.396 ?     29788.600  ns/op
IOUtilsContentEqualsPerformanceTest.testContentEqualsForStringNew                          avgt    5   459169873.636 ?  57372500.268  ns/op
IOUtilsContentEqualsPerformanceTest.testContentEqualsForStringOld                          avgt    5  2819825050.000 ? 357333711.445  ns/op
IOUtilsContentEqualsPerformanceTest.testContentEqualsIgnoreEOLForFileNew                   avgt    5      413149.875 ?    248377.525  ns/op
IOUtilsContentEqualsPerformanceTest.testContentEqualsIgnoreEOLForFileNew2                  avgt    5      418270.135 ?     60833.578  ns/op
IOUtilsContentEqualsPerformanceTest.testContentEqualsIgnoreEOLForFileOld                   avgt    5      405679.398 ?     47700.407  ns/op
IOUtilsContentEqualsPerformanceTest.testContentEqualsIgnoreEOLForStringNew                 avgt    5  2592415555.000 ? 113553422.979  ns/op
IOUtilsContentEqualsPerformanceTest.testContentEqualsIgnoreEOLForStringNew2                avgt    5  2038896808.000 ?1221539599.791  ns/op
IOUtilsContentEqualsPerformanceTest.testContentEqualsIgnoreEOLForStringOld                 avgt    5  5809328550.000 ?1231523322.706  ns/op
IOUtilsContentEqualsPerformanceTest.testSpecialCaseForContentEqualsIgnoreEOLForStringNew   avgt    5        2524.221 ?       177.439  ns/op
IOUtilsContentEqualsPerformanceTest.testSpecialCaseForContentEqualsIgnoreEOLForStringNew2  avgt    5        7650.172 ?      1566.588  ns/op
IOUtilsContentEqualsPerformanceTest.testSpecialCaseForContentEqualsIgnoreEOLForStringOld   avgt    5    86502907.284 ?  12466755.564  ns/op

I'm amazed that ContentEqualsIgnoreEOLNew2 runs even faster after the simplify.
So I think it is a good time for a re-review.

@sebbASF
I implimented the BufferedInput classes we discussed yesterday.
Please have a look.

@sebbASF
Copy link
Contributor

sebbASF commented Aug 5, 2020

Do you have any proof that using UnsyncBufferedReader etc in contentEquals will result in a slow-down unless further changes are made to contentEquals?

If so, what further changes are needed to contentEquals, and what is the speed improvement?

As to any renaming, that can be done later.

@garydgregory
Copy link
Member

Undynchronized should presumably be Unsynchronized?

Phone keyboard! Right, today we have:

  • org.apache.commons.io.input.UnsynchronizedByteArrayInputStream
  • org.apache.commons.io.output.UnsynchronizedByteArrayOutputStream

However, I think that makes the class names rather long.
I think Unsync is clear enough as a prefix (and avoids the issue of whether to use -ized or -ised)

@XenoAmess
Copy link
Contributor Author

XenoAmess commented Aug 5, 2020

@sebbASF

Do you have any proof that using UnsyncBufferedReader etc in contentEquals will result in a slow-down unless further changes are made to contentEquals?

Yes, I ran a rough performance test on a same pc, and the result be time 9e8.(the current version of this pr shows 4.6e8, and master 2.8e9)
If you need a full detailed performance test I can find some time to run one.

If so, what further changes are needed to contentEquals, and what is the speed improvement?

That trick only be possible when we comparing two(or several) InputReaders.
I shared an index on them, thus the two buffer's index be a same local variable.

@XenoAmess XenoAmess requested a review from sebbASF August 17, 2020 12:29
@XenoAmess
Copy link
Contributor Author

Hi.
Is there anything that I should do/fix for this pr?

@garydgregory
Copy link
Member

I think would like to/will soon-ish bring in the underlying Unsynchronized* classes first since they seem like useful general building blocks. Then this PR can be rebased. I would like to do this work at some point this week or at the weekend. Stay tuned.

@XenoAmess
Copy link
Contributor Author

I think would like to/will soon-ish bring in the underlying Unsynchronized* classes first since they seem like useful general building blocks. Then this PR can be rebased. I would like to do this work at some point this week or at the weekend. Stay tuned.

@garydgregory got it. Thanks.

@XenoAmess
Copy link
Contributor Author

@garydgregory Weekend now. Any news?:)

@XenoAmess
Copy link
Contributor Author

@garydgregory Weekend now. Any news?:)

ping? :)

@garydgregory
Copy link
Member

I will look over all Commons PRs starting tomorrow over the next while I have some time off from work... please be patient.

@garydgregory
Copy link
Member

@garydgregory
got it.
new added class renamed as:

org.apache.commons.io.input.buffer.UnsyncBufferedInputStream
org.apache.commons.io.input.buffer.UnsyncBufferedReader
org.apache.commons.io.input.buffer.LineEndUnifiedBufferedReader

The Java folks make it sounds like these are superfluous in https://bugs.openjdk.java.net/browse/JDK-4097272, so I think we need to see a performance test that shows there is a clear performance benefit to adding those as valuable on their own.

Perhaps our addition of UnsynchronizedByteArrayInputStream was a mistake.

@XenoAmess
Copy link
Contributor Author

@garydgregory
got it.
new added class renamed as:
org.apache.commons.io.input.buffer.UnsyncBufferedInputStream
org.apache.commons.io.input.buffer.UnsyncBufferedReader
org.apache.commons.io.input.buffer.LineEndUnifiedBufferedReader

The Java folks make it sounds like these are superfluous in https://bugs.openjdk.java.net/browse/JDK-4097272, so I think we need to see a performance test that shows there is a clear performance benefit to adding those as valuable on their own.

Perhaps our addition of UnsynchronizedByteArrayInputStream was a mistake.

@garydgregory
Hi.
I already listed some performance test results 3 months ago, in this thread. please look back and see them.
isn't the performance test in the list enough?
If so , what other types of performance test do you need?

@garydgregory
Copy link
Member

garydgregory commented Jan 3, 2021

Hello @XenoAmess
We already have a lot of changes for the next release, so I want to manage expectations such that I would I prefer to get out 3.12 before making even more big changes like these.

But still, let's continue this thread. Starting with the lowest-level bits: we need to justify the addition of the misnamed Unsync* classes, the prefix should be Unsynchronized like our existing UnsynchronizedByteArrayInputStream, which I've already mentioned. Perhaps our addition of UnsynchronizedByteArrayInputStream was a mistake since the Java folks make it sounds like these are superfluous in https://bugs.openjdk.java.net/browse/JDK-4097272, so I think we need to see a performance test that shows there is a clear performance benefit to adding those as valuable on their own.

We need performances test that show the differences, if any, between the JRE's classes and our proposed Unsynchronized versions. Since you propose two such classes UnsyncBufferedInputStream and UnsyncBufferedReader, that's two new tests. Or did I miss these here?

I think you should create a new PR for just these two new classes and their tests. This will make the work simpler for everyone when reviewing and testing.

TY.

@XenoAmess
Copy link
Contributor Author

Hello @XenoAmess
We already have a lot of changes for the next release, so I want to manage expectations such that I would I prefer to get out 3.12 before making even more big changes like these.

But still, let's continue this thread. Starting with the lowest-level bits: we need to justify the addition of the misnamed Unsync* classes, the prefix should be Unsynchronized like our existing UnsynchronizedByteArrayInputStream, which I've already mentioned. Perhaps our addition of UnsynchronizedByteArrayInputStream was a mistake since the Java folks make it sounds like these are superfluous in https://bugs.openjdk.java.net/browse/JDK-4097272, so I think we need to see a performance test that shows there is a clear performance benefit to adding those as valuable on their own.

We need performances test that show the differences, if any, between the JRE's classes and our proposed Unsynchronized versions. Since you propose two such classes UnsyncBufferedInputStream and UnsyncBufferedReader, that's two new tests. Or did I miss these here?

I think you should create a new PR for just these two new classes and their tests. This will make the work simpler for everyone when reviewing and testing.

TY.

@garydgregory Hi. I done the performance test at #184.
The two classes is faster, but not as that faster as we thought.
Detailed performance tests results: https://pastebin.ubuntu.com/p/TtDxYw4WVG/

@XenoAmess
Copy link
Contributor Author

@garydgregory unsynced buffered classes deleted.
re-run performance tests, results at https://pastebin.ubuntu.com/p/ZpZFHRMQkh/
please do a re-review.
Thanks.

asfgit pushed a commit that referenced this pull request Jan 24, 2021
This is based on the PR #118 by
XenoAmess but only for this one method.
asfgit pushed a commit that referenced this pull request Jan 25, 2021
This is based on the PR #118 by
XenoAmess but only for this one method.
Copy link
Member

@garydgregory garydgregory left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @XenoAmess
Please see git master and rebase. I updated with your changes for InputStream and Reader but not for the EOL APIs. If you still want those in, please rebase and follow the pattern I established with the *Benchmark classes.
TY! -)
PS: I gave you credit in the commit message and changes.xml 👍

@XenoAmess XenoAmess force-pushed the refine_contentEquals branch 2 times, most recently from 620bc26 to 1f7750a Compare February 9, 2021 17:25
@XenoAmess
Copy link
Contributor Author

Hi @XenoAmess
Please see git master and rebase. I updated with your changes for InputStream and Reader but not for the EOL APIs. If you still want those in, please rebase and follow the pattern I established with the *Benchmark classes.
TY! -)
PS: I gave you credit in the commit message and changes.xml 👍

rebased. benchmark redone.

@garydgregory garydgregory changed the title [IO-670] refine IOUtils.contentEquals(Reader, Reader) [IO-670] refine IOUtils.contentEqualsIgnoreEOL(Reader, Reader) Feb 26, 2021
@XenoAmess
Copy link
Contributor Author

@garydgregory rebased. now passed new checkstyle.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants