make slowEqual not slow #128

dominictarr · 2021-02-25T00:38:12Z

there is a method in bipf to generate the createSeekPath method

https://github.com/ssbc/bipf/blob/master/index.js#L299-L317

it generates source code and creates a new function because it can generate flatter, faster, code without closure scoping.

then you won't need the indexType option, because you can just use the path.

This reverts commit 873940d.

Barbarrosa · 2021-04-11T20:41:35Z

Based on the benchmarks in #146, the suggested change here doesn't appear to improve slowEqual performance. This may differ between NodeJS versions, though.

staltz · 2021-04-12T08:58:40Z

I don't trust createSeekPath because it uses Function constructor with strings, and has security issues similar to eval(). Given that our current slowEqual is for prototyping (maybe its name should have been slowInsecureEqual, but now is too late to issue a breaking change) and equal is for production, and that it's not hard to hand-write args for equal, I don't see a reason why we should speed up slowEqual.

Barbarrosa · 2021-04-13T00:52:06Z

If someone's passing user input as slowEqual's first arg, then they probably have a lot of other problems in their application. I see what you mean though, it escalates the risk beyond accessing unauthorized paths to potentially crashing the application or exploiting any JSON.stringify bugs for RCE.

dominictarr · 2021-04-15T03:07:34Z

I disagree. I think this code is safe because Buffer.from throws if the input is not a string, ArrayBuffer, or Array of numbers. ArrayBuffer is never returned by JSON.stringify. null, true, etc will return a seekPath that is throws. We can trust JSON.stringify's escaping. Sure, this would be vulnerable if JSON.stringify had a bug... but JSON.stringify is trustworthy, 100% specified and solid.

they can't do insert a comment then real code after it, in the style of SQL injection.

should add a check that every item in target is a string,
or if you wanna be truely paranoid, enforce that keys are /^[a-zA-Z_][a-zA-Z_0-9]+$/ then surround with quotes.
then it would be impossible to write js except a variable name.

Barbarrosa · 2021-04-15T06:07:45Z

@dominictarr It would only take a JSON.stringify exploit that allows a self-invoking function to circumvent the Buffer.from call, and JSON.stringify has had exploits like this before. I would have considered it overprotective since the dev using this library would have to trust bad inputs already for this exploit, but the point seems moot since the refactor doesn't appear to deliver any performance improvement in NodeJS 12 (and I bet it will fare worse in later versions).

Also - if you follow the CVE link, you may find one of the mentioned pages specifically mentions the JSON.stringify exploit in relation to RCE.

dominictarr · 2021-04-15T07:07:32Z

to make that bug actually work, the stringify needs to be running with a reviver function. which this one isn't. or, I think if you passed it an object with a custom toValue method, but that can't be done remotely (only in the local js vm)

to attack this you'd need to get JSON.stringify to actually output the wrong thing. slowEqual takes a single string and splits on '.' ... then passes the result to createSeekPath it has to get an array of strings... that means you'd need an attack string that causes JSON.stringify to output a string that is incorrectly escaped

dominictarr · 2021-04-15T07:11:58Z

but yes, this doesn't matter unless it's actually faster or not... I was confused about the PR, it seemed to say that the benchmarks didn't use createSeekPath?

the docs say that slowEqual is slow... but is there a benchmark that shows that? maybe slow equal isn't slow? or there are other things that make the impact of slow equal insignificant, but when I was developing this stuff I recall this stuff had significant impact

Barbarrosa · 2021-04-15T07:46:59Z

@dominictarr I setup two pull requests in my fork to trigger the benchmark, and slowEqual performed more slowly than equal both with and without this change.

dominictarr · 2021-04-19T00:39:13Z

is one of those results the hand written query? is it "Query 1 big index"? how big is the log? are there records that match and also do not match?

btw, why are you putting it inside the if that might return undefined? it returns -1 to mean not found because this can be checked with a single operator, and doesn't change return types. When you have a polymorphic function it isn't as optimizable, so better to avoid this. Also, createSeekPath tries to avoid using closures then you added one back ;)

now, js perf can be very finnicky and the engine optimizations might have changed since I wrote that, you have to actually measure this... so I could be wrong...

but in bipf/test/perf the difference between seekPath and createSeekPath does make a difference
(createSeekPath is faster) https://github.com/ssbc/bipf/blob/master/test/perf.js#L136-L154 so it's slower in this benchmark there must be something else going on...

Barbarrosa · 2021-04-19T01:05:57Z

@dominictarr Looks like the difference is that the JITDB benchmark factors the compilation time into the benchmark & JITDB wasn't caching the compiled code. If we update JITDB to cache/memoize bipf.createSeekPath, then the second and third runs may show better performance.

Barbarrosa · 2021-04-19T01:48:42Z

@dominictarr I introduced a static module-level LRU cache optimization to slowEquals so we can better gauge the impact of bipf.createSeekPath on performance. To make sure it's a fair benchmark, I also included the LRU optimization on the change w/o bipf.createSeekPath.

LRU cache

Part	Speed	Heap Change	Samples
Query 1 big index with slowEqual (1st run)	1422.81ms ± 54.09ms	-24.07 kB ± 0 kB	38
Query 1 big index with slowEqual (2nd run)	289.3ms ± 2.74ms	4.19 kB ± 0 kB	42
Query 1 big index with slowEqual (3rd run)	0.31ms ± 0.05ms	3.07 kB ± 18.14 kB	48

LRU cache with `bipf.createSeekPath`

Part	Speed	Heap Change	Samples
Query 1 big index with slowEqual (1st run)	915.94ms ± 6.73ms	8.63 kB ± 0 kB	59
Query 1 big index with slowEqual (2nd run)	319.64ms ± 3.36ms	68.74 kB ± 38.55 kB	42
Query 1 big index with slowEqual (3rd run)	0.5ms ± 0.18ms	10.55 kB ± 24.38 kB	44

It looks like bipf.createSeekPath makes the first query faster, but not later queries. Note though that these numbers completely exclude the compilation time, which would be a serious consideration for very dynamic paths in the queries.

Barbarrosa · 2021-04-19T02:30:31Z

I just noticed that the original and createSeekPath share a problem - they repeatedly generate the comparison buffers. I created another branch to test a safe solution that beats both in performance.

LRU cache with pregenerated comparison buffer

Part	Speed	Heap Change	Samples
Query 1 big index with slowEqual (1st run)	864.28ms ± 18.35ms	13.64 kB ± 0 kB	62
Query 1 big index with slowEqual (2nd run)	321.48ms ± 2.07ms	16.15 kB ± 34.04 kB	41
Query 1 big index with slowEqual (3rd run)	0.36ms ± 0.09ms	5.09 kB ± 0 kB	49

@arj03 @staltz What are your thoughts on this solution?

dominictarr · 2021-04-19T02:41:11Z

@Barbarrosa great job!

dominictarr · 2021-04-19T02:48:54Z

which benchmark is equals? is slowEquals as fast as that now?

Barbarrosa · 2021-04-19T03:13:03Z

@dominictarr slowEquals seems much more comparable in speed using the LRU and pregenerated buffer. Here are the benchmarks that use equals:

Part	Speed	Heap Change	Samples
Query 1 big index (1st run)	912.47ms ± 24.95ms	21.11 kB ± 0 kB	59
Query 1 big index (2nd run)	303.86ms ± 2.07ms	10.73 kB ± 0 kB	44
Count 1 big index (3rd run)	0.28ms ± 0.02ms	23.67 kB ± 0 kB	45

Barbarrosa · 2021-04-19T03:18:10Z

Thinking back, Query 1 big index with slowEqual (3rd run) should really be Count 1 big index with slowEqual (3rd run).

Barbarrosa · 2021-04-19T03:47:46Z

I think two more benchmarks may be worth looking at here.

Original `slowEquals`

Part	Speed	Heap Change	Samples
Query 1 big index with slowEqual (1st run)	1227.72ms ± 30.83ms	-43.01 kB ± 18.86 kB	45
Query 1 big index with slowEqual (2nd run)	302.25ms ± 1.66ms	77.83 kB ± 0 kB	40
Query 1 big index with slowEqual (3rd run)	0.29ms ± 0.02ms	21.15 kB ± 0 kB	44

Pregenerated buffer w/o LRU cache

Part	Speed	Heap Change	Samples
Query 1 big index with slowEqual (1st run)	1253.71ms ± 24.5ms	32.54 kB ± 45.04 kB	43
Query 1 big index with slowEqual (2nd run)	307.64ms ± 2.64ms	-15.19 kB ± 12.49 kB	43
Query 1 big index with slowEqual (3rd run)	0.62ms ± 0.32ms	7.5 kB ± 43.83 kB	42

Barbarrosa · 2021-04-19T04:21:51Z

I was curious about the impact if the LRU cache was set per JITDB instance. Here's that result.

Per-instance LRU cache w/ pregenerated buffer

Part	Speed	Heap Change	Samples
Query 1 big index with slowEqual (1st run)	1443.42ms ± 7.46ms	-18.06 kB ± 30.34 kB	37
Query 1 big index with slowEqual (2nd run)	286.96ms ± 2.86ms	74.54 kB ± 13.3 kB	45
Query 1 big index with slowEqual (3rd run)	0.48ms ± 0.21ms	14.76 kB ± 32.07 kB	43

staltz · 2021-04-19T07:27:16Z

how big is the log? are there records that match and also do not match?

@dominictarr It's querying for msgs of type post, there are 23310 of those and the log in total has 100000 msgs and 2000 feeds. This log is deterministically generated with ssb-fixtures which uses statistical sampling based on real-world SSB statistics and the Pareto distribution.

dominictarr · 2021-04-19T08:42:35Z

This log is deterministically generated with ssb-fixtures which uses statistical sampling based on real-world SSB statistics and the Pareto distribution.

perfect!

Barbarrosa · 2021-04-25T01:28:04Z

@dominictarr #154 (now merged) implements something similar to the change I suggested, i.e. it preprocesses the queried path into an array of Buffers and caches the resulting function with a Map. This means related performance issues w/ slowEqual will solely affect the first query using a particular path. Any following queries will see performance much more like that of equal.

Barbarrosa mentioned this issue Apr 10, 2021

Implement slowEqual enhancement from #128 #146

Closed

Barbarrosa added a commit to Barbarrosa/jitdb that referenced this issue Apr 10, 2021

Implement slowEqual speed enhancement from ssbc#128

873940d

Barbarrosa added a commit to Barbarrosa/jitdb that referenced this issue Apr 11, 2021

Revert "Implement slowEqual speed enhancement from ssbc#128"

a8961d2

This reverts commit 873940d.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

make slowEqual not slow #128

make slowEqual not slow #128

dominictarr commented Feb 25, 2021

Barbarrosa commented Apr 11, 2021

staltz commented Apr 12, 2021

Barbarrosa commented Apr 13, 2021

dominictarr commented Apr 15, 2021

Barbarrosa commented Apr 15, 2021 •

edited

Loading

dominictarr commented Apr 15, 2021

dominictarr commented Apr 15, 2021

Barbarrosa commented Apr 15, 2021

dominictarr commented Apr 19, 2021

Barbarrosa commented Apr 19, 2021

Barbarrosa commented Apr 19, 2021 •

edited

Loading

Barbarrosa commented Apr 19, 2021

dominictarr commented Apr 19, 2021

dominictarr commented Apr 19, 2021

Barbarrosa commented Apr 19, 2021

Barbarrosa commented Apr 19, 2021

Barbarrosa commented Apr 19, 2021 •

edited

Loading

Barbarrosa commented Apr 19, 2021 •

edited

Loading

staltz commented Apr 19, 2021

dominictarr commented Apr 19, 2021

Barbarrosa commented Apr 25, 2021

make slowEqual not slow #128

make slowEqual not slow #128

Comments

dominictarr commented Feb 25, 2021

Barbarrosa commented Apr 11, 2021

staltz commented Apr 12, 2021

Barbarrosa commented Apr 13, 2021

dominictarr commented Apr 15, 2021

Barbarrosa commented Apr 15, 2021 • edited Loading

dominictarr commented Apr 15, 2021

dominictarr commented Apr 15, 2021

Barbarrosa commented Apr 15, 2021

dominictarr commented Apr 19, 2021

Barbarrosa commented Apr 19, 2021

Barbarrosa commented Apr 19, 2021 • edited Loading

LRU cache

LRU cache with bipf.createSeekPath

Barbarrosa commented Apr 19, 2021

LRU cache with pregenerated comparison buffer

dominictarr commented Apr 19, 2021

dominictarr commented Apr 19, 2021

Barbarrosa commented Apr 19, 2021

Barbarrosa commented Apr 19, 2021

Barbarrosa commented Apr 19, 2021 • edited Loading

Original slowEquals

Pregenerated buffer w/o LRU cache

Barbarrosa commented Apr 19, 2021 • edited Loading

Per-instance LRU cache w/ pregenerated buffer

staltz commented Apr 19, 2021

dominictarr commented Apr 19, 2021

Barbarrosa commented Apr 25, 2021

Barbarrosa commented Apr 15, 2021 •

edited

Loading

Barbarrosa commented Apr 19, 2021 •

edited

Loading

LRU cache with `bipf.createSeekPath`

Barbarrosa commented Apr 19, 2021 •

edited

Loading

Original `slowEquals`

Barbarrosa commented Apr 19, 2021 •

edited

Loading