Allow clients to implement large hash arrays without a collection-rehash feedback loop in the public MPS #230

thejayps · 2023-06-02T14:01:27Z

Provides a solution to issue #215

Progress towards resolving #110

This isn't the nicest way to solve the problem, but so far it works well for Confifura who have encountered the issue.

Part of the plan to meet Configura's requirements by separating and reviewing each implementation.

This branch cherry-picks and edits #76 to only include the "hash array" feature.

Documentation for the original requirement is here: https://www.ravenbrook.com/project/mps/issue/job003435/

rptb1 · 2023-06-05T09:27:10Z

I think it would be worth writing a test case for this. It would need to create hash tables using LDs and other objects, then provoke and detect the case where the rehashing goes into a loop. It's quite a subtle test, and I don't think we should do it for version 1.118. Perhaps raise an issue. Also, check the MMQA suite and the custom/cet branch for relevant stuff. We can and should reproduce it in CET. It might be an example of a test that could be implemented neatly in the Scheme interpreter.

Note: #247

thejayps · 2023-06-05T18:58:40Z

Executing proc.review.entry

Start time 1953
proc.review.entry.criteria:
applying entry.universal, entry.design, entry.impl.
entry.universal.author: This work was done by manually isolating specific changes that form the implementation of the "hash arrays" feature from Configura's custom production MPS source. I am not the original author, but I performed the work to isolate this feature, and I request and permit review. Configura have given us permission to do this work: https://info.ravenbrook.com/mail/2022/01/18/12-53-38/0/

thejayps · 2023-06-05T19:13:43Z

paused 20:10

thejayps · 2023-06-05T21:05:20Z

resuming 2124
4. entry.universal.source-available: 2 mail threads: The story behind "Ravenbrook MPS sources are out of sync with
Configura's sources" and Gathering information on Configura specific functionality describe the approach to this work. (Links to the base email only, p4 grep can find all the subsequent responses). In addition, the issue #215 links to a description of the problem this functionality was designed to solve. There aren't any branches documented in https://info.ravenbrook.com/project/mps/branch/ that describe the original genesis of the requirement.
5. source-approved - there is no available evidence that any source document has exited review. All source documents should be considered "unreviewed"
6. entry.design.rfc - there isn't much evidence available for why the design approach for this solution was originally chosen, however the working of the design itself seems to be described adequately
7. Entry finished at 2205

thejayps · 2023-06-05T21:08:14Z

Entry took ~1h (but included writing up issues that arose with the entry criteria)

thejayps · 2023-06-05T21:50:27Z

Executing proc.review.plan

Started at 2221
@thejayps, @rptb1, @UNAA008 will be available
The change itself consists of 3 lines of code changed, and 70 lines of documentation. The code changes expose pre-existing implementation to the MPS API so the "product document" is not limited to the 3 lines of code changed. I estimate less than 100 lines may need to be checked for correctness by the correctness reviewer.
Doc 70 lines
Code ~100 lines but those lines need tracing and finding so I'll apply a multiplier ->(10 loc per min) = 10 mins *1.5 = 15 mins checking code.
~30 mins checking documentation for fresh eyes (i.e unfamiliar with code or requirement) in clarity role.
Role ideas:
@rptb1 checking correctness ~15/20 mins, additional time to be spent specifically identifying defects that might have been detected by a testbench (since one does not exist)
@thejayps checking .sources and .consitency (and leading)
@UNAA008 on .clarity: "The clarity checking role focuses on whether the product document is clear and obvious. This is a good role to give to someone who has never seen the product document before, but who is in the intended readership. Anything that is unclear to them is a defect."
Time: checking will take 30 mins, but could be extended to 45 while calibration of checking rates is ongoing.
Sources and instructions are available - TODO: send to checkers
Plan finished at 2249
Plan took ~30 mins

thejayps · 2023-06-05T22:05:52Z

The source documents for this review are:
Issues:
#110
#215
Mails:
https://info.ravenbrook.com/mail/2023/01/13/14-45-37/0/
(note that this is the first mail in the thread - other mails in the thread might provide more context but won't add much else)

What is to be reviewed:

The documentation contained in this change is largely the "product document" in terms of documentation, but consistency with other documentation will need to be reviewed by the .consistency checker.
The code change is only 3 lines: The code product document mostly consists of those lines of code that would not have been executed if the three changed lines had not existed. The .correctness role has time factored in to identify the product document. My estimation is that it's less than 100 lines. I won't provide further guidance on what I think the product document is as it's more likely that defects will be found if the correctness checker discovers the product document for themselves.

Rules
https://github.com/Ravenbrook/mps/blob/586336a7021c2276a3f798ad10e6e4ab5f438226/procedure/rule.code.rst
https://github.com/Ravenbrook/mps/blob/586336a7021c2276a3f798ad10e6e4ab5f438226/procedure/rule.code.style.rst
https://github.com/Ravenbrook/mps/blob/586336a7021c2276a3f798ad10e6e4ab5f438226/procedure/rule.generic.rst
https://github.com/Ravenbrook/mps/blob/586336a7021c2276a3f798ad10e6e4ab5f438226/procedure/rule.style.rst
apply.

rptb1 · 2023-06-06T11:52:36Z

Executing proc.review.plan

There is nothing noted for proc.review.plan.schedule. I realise we arranged this in advance, but you should record it.

From the invitation:

Review will take place at 1300 on 2023-06-06.

thejayps · 2023-06-06T11:58:25Z

Here is proc.review.check for checkers

thejayps · 2023-06-06T12:24:05Z

Beginning proc.review.ko at 1304
@rptb1 on .correctness and identifying defects that could be hidden because of lack of testbench
@thejayps on .sources and .consistency
@UNAA008 on .clarity
Checking started at 1320
Logging will take place at 1350 (optionally checking might be extended by 15 mins if checkers would like more time).

rptb1

Executing proc.review.check

Start time 13:20.
Checked out branch and ran make test on my machine (lii6gc) without issue.
Also built the manual with make -C manual clean html without issue.
Checked the manual and the header changes.
Checking existing hash arrays code in poolamc.c.
m. https://github.com/Ravenbrook/mps/blob/branch/2023-05-31/hash-arrays/code/poolamc.c#L516 pointless comment.
M. https://github.com/Ravenbrook/mps/blob/branch/2023-05-31/hash-arrays/code/poolamc.c#L961-L962 "defer the size accounting" with what intention? (Could refer to design.mps.strategy section.) rule.generic.clear [Please see https://github.com/Hash arrays mechanism is not general #233#issuecomment-1580131391 when editing this. RB 2023-05-07]
Finished checking for logging meeting 13:50.
Found 1M 6m 1q.

manual/source/pool/amc.rst

thejayps · 2023-06-06T13:05:15Z

M1 this issue does not link to documentation for the original requirement which seems to be here: https://www.ravenbrook.com/project/mps/issue/job003435/

Fix: added link in #230 description above

M2 This feature is not tested (stress, correctness or regression). This job describes the lack of stress testing: https://www.ravenbrook.com/project/mps/issue/job003963/

Raise: #247

comment1: nice improvements to the documentation noted in the past could be relevant https://www.ravenbrook.com/project/mps/issue/job004109/

Raise: #249

UNAA008

My only comment is that the mechanism that setting
MPS_KEY_AP_HASH_ARRAYS to true uses to not contribute to nursery space
is not described. But perhaps it is correct that implementation
details (which could change) are not discussed.

Raise: #249

rptb1 · 2023-06-06T13:17:20Z

M1 this issue does not link to documentation for the original requirement which seems to be here: https://www.ravenbrook.com/project/mps/issue/job003435/

We can update the Perforce fixes to link this to the hash arrays implementation as well.

rptb1 · 2023-06-06T13:32:53Z

NQ. Is ramping required? If not we could deprecate or remove it.
NQ. Is ramping documented? If so, we could mention in the manual that hash arrays are ramping.

rptb1 · 2023-06-06T13:39:01Z

Editing may involve updating and tagging design.mps.strategy so I suggest the editor also take note of

mps/design/strategy.txt

Lines 565 to 567 in 8be637f

    
           - 2013-06-04 NB_ Checked this in although it's far from complete. 
        
             Pasted in my 'ramping notes' from email, which mention some bugs 
        
             which I may have fixed (TODO: check this).

thejayps · 2023-06-06T13:40:40Z

Begin logging at 1407: (no extra time checking but took a break)
@rptb1 found 1M 6m 1question
@UNAA008 1 comment
@thejayps found 2M 1m 1comment
Logging finished at 1440. Brainstorming to start at 1450

thejayps · 2023-06-06T14:42:48Z

Brainstorming started at 1505
Considering @thejayps Ms:
M1 #230 (comment):
@thejayps + @rptb1 together suggest a "proc.excavate" procedure which suggests ways to help look for symbols/requirements etc in as many sources as possible. @thejayps original idea was an easy or documented way to search the mail archive. And beyond into the Harlequin era - something @rptb1 is capable of doing.
M2: @rptb1 maybe invest more in our test infrastructure to make tests easier to write. Eg make it easy to test in scheme.
@thejayps brainstorm ways to have more resource available to the MPS team.
Note (@rptb1) that testing has never been a major way of ensuring quality in the MPS.
@rptb1 M #230 (review)
@thejayps suggests a heat map or statistical analysis of tags. @rptb1 notes it's a graph tracing problem.

PI: @UNAA008 suggests that 5 mins brainstorming each defect isn't always enough.
Logging finished at 1540

rptb1 · 2023-06-09T10:26:17Z

NQ. Is ramping required? If not we could deprecate or remove it.

Answer: @thejayps please check this for Configura, but I think this is a good requirement to keep, on reflection.

NQ. Is ramping documented? If so, we could mention in the manual that hash arrays are ramping.

Answer: Ramping is documented in the Allocation Patterns topic of the manual.

thejayps · 2023-06-15T16:33:36Z

Begin proc.review.edit

Start time 1733

thejayps · 2023-06-15T16:36:21Z

6. m. https://github.com/Ravenbrook/mps/blob/branch/2023-05-31/hash-arrays/code/poolamc.c#L516 pointless comment.

from #230 (review)

Fix: deleted comment in commit

thejayps · 2023-06-15T17:15:59Z

editing paused at 1815

thejayps

documenting proc.review.edit

manual/source/release.rst

manual/source/pool/amc.rst

thejayps · 2023-06-16T09:39:50Z

NQ. Is ramping required? If not we could deprecate or remove it.

Answer: @thejayps please check this for Configura, but I think this is a good requirement to keep, on reflection.
from #230 (comment)

Answer to @rptb1: see https://info.ravenbrook.com/mail/2023/06/15/20-56-30/0/

thejayps · 2023-06-16T12:50:56Z

7. M. https://github.com/Ravenbrook/mps/blob/branch/2023-05-31/hash-arrays/code/poolamc.c#L961-L962 "defer the size accounting" with what intention?  (Could refer to design.mps.strategy section.)  rule.generic.clear  [Please see https://github.com/[Hash arrays mechanism is not general #233](https://github.com/Ravenbrook/mps/issues/233)#issuecomment-1580131391 when editing this. RB 2023-05-07] from https://github.com/Ravenbrook/mps/pull/230#pullrequestreview-1465077305

Raise: #249

thejayps · 2023-06-16T16:25:01Z

M1 this issue does not link to documentation for the original requirement which seems to be here: https://www.ravenbrook.com/project/mps/issue/job003435/

We can update the Perforce fixes to link this to the hash arrays implementation as well.

from #230 (comment)

pass: @rptb1

Assigning PR to @rptb1 and PR should remain "pending"

thejayps · 2023-06-16T16:40:16Z

executing proc.review.exit

Start time 1730
Transgression - Allow clients to implement large hash arrays without a collection-rehash feedback loop in the public MPS #230 (comment) has no response. Raise Design for "hash arrays" needs to clearer #249 is the appropriate action
Transgression - estimation of remaining defects has not taken place: 3 M defects found: using 75% rule 1 likely remains
exit.pass
review.exit.calc:
- hours used: ~12
- hours saved: 30-12 = 18
- major defects remaining: ~1
exit finished at 1740 - took 10 mins

thejayps · 2023-06-16T17:34:41Z

executing proc.merge.pull-request (currently in #228)

Start time 1813
Checkist "no" to item 3: There is no testcase, we have decided to proceed and write one later: Allow clients to implement large hash arrays without a collection-rehash feedback loop in the public MPS #230 (comment)
Merge conflicts found. Pausing at 1825

thejayps · 2023-06-16T18:35:23Z

Merge resumed and completed at 1935

>.

rptb1 · 2023-06-17T10:47:23Z

Fixing up merge mistakes by catching up from master, re-merging, and force-pushing to master. See #253 for prevention.

manually cherry pick changes from b928fa2 relating to hash-arrays

8be637f

thejayps requested review from rptb1 and UNAA008 June 2, 2023 14:01

thejayps added this to the version 1.118 milestone Jun 2, 2023

rptb1 assigned thejayps Jun 5, 2023

rptb1 added the essential Will cause failure to meet customer requirements. Assign resources. label Jun 5, 2023

rptb1 linked an issue Jun 6, 2023 that may be closed by this pull request

Large hash tables can result in a feedback loop of collections and reallocations of the hash table #215

Closed

rptb1 requested changes Jun 6, 2023

View reviewed changes

manual/source/pool/amc.rst Outdated Show resolved Hide resolved

manual/source/pool/amc.rst Outdated Show resolved Hide resolved

manual/source/pool/amc.rst Outdated Show resolved Hide resolved

manual/source/pool/amc.rst Outdated Show resolved Hide resolved

UNAA008 reviewed Jun 6, 2023

View reviewed changes

thejayps added the pending Something needs doing, even if closed. label Jun 6, 2023

rptb1 mentioned this pull request Jun 7, 2023

Hash arrays mechanism is not general #233

Open

thejayps mentioned this pull request Jun 12, 2023

The review procedure document does not prompt leaders to do phase.estimation at the appropriate time clearly enough #237

Open

thejayps commented Jun 15, 2023

View reviewed changes

manual/source/release.rst Outdated Show resolved Hide resolved

manual/source/pool/amc.rst Outdated Show resolved Hide resolved

manual/source/pool/amc.rst Outdated Show resolved Hide resolved

performing proc.review.edit

d08a198

This was referenced Jun 16, 2023

"Hash arrays" work lacks a specific testbench #247

Open

Design for "hash arrays" needs to clearer #249

Open

thejayps assigned rptb1 Jun 16, 2023

rptb1 approved these changes Jun 16, 2023

View reviewed changes

thejayps merged commit de19bf4 into master Jun 16, 2023

rptb1 added a commit that referenced this pull request Jun 17, 2023

Merging branch/2023-05-31/hash-arrays for GitHub pull request #230 <#230

91ca65c

>.

rptb1 mentioned this pull request Nov 6, 2023

Apply all changes from Configura to the public MPS. #76

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Allow clients to implement large hash arrays without a collection-rehash feedback loop in the public MPS #230

Allow clients to implement large hash arrays without a collection-rehash feedback loop in the public MPS #230

thejayps commented Jun 2, 2023 •

edited

Loading

rptb1 commented Jun 5, 2023 •

edited by thejayps

Loading

thejayps commented Jun 5, 2023

thejayps commented Jun 5, 2023

thejayps commented Jun 5, 2023

thejayps commented Jun 5, 2023 •

edited

Loading

thejayps commented Jun 5, 2023 •

edited

Loading

thejayps commented Jun 5, 2023 •

edited

Loading

rptb1 commented Jun 6, 2023

thejayps commented Jun 6, 2023

thejayps commented Jun 6, 2023

rptb1 left a comment •

edited

Loading

thejayps commented Jun 6, 2023 •

edited

Loading

UNAA008 left a comment •

edited by thejayps

Loading

rptb1 commented Jun 6, 2023

rptb1 commented Jun 6, 2023 •

edited

Loading

rptb1 commented Jun 6, 2023 •

edited

Loading

thejayps commented Jun 6, 2023

thejayps commented Jun 6, 2023

rptb1 commented Jun 9, 2023

thejayps commented Jun 15, 2023

thejayps commented Jun 15, 2023

thejayps commented Jun 15, 2023

thejayps left a comment

thejayps commented Jun 16, 2023

thejayps commented Jun 16, 2023

thejayps commented Jun 16, 2023

thejayps commented Jun 16, 2023

thejayps commented Jun 16, 2023

thejayps commented Jun 16, 2023

rptb1 commented Jun 17, 2023

Allow clients to implement large hash arrays without a collection-rehash feedback loop in the public MPS #230

Allow clients to implement large hash arrays without a collection-rehash feedback loop in the public MPS #230

Conversation

thejayps commented Jun 2, 2023 • edited Loading

rptb1 commented Jun 5, 2023 • edited by thejayps Loading

thejayps commented Jun 5, 2023

thejayps commented Jun 5, 2023

thejayps commented Jun 5, 2023

thejayps commented Jun 5, 2023 • edited Loading

thejayps commented Jun 5, 2023 • edited Loading

thejayps commented Jun 5, 2023 • edited Loading

rptb1 commented Jun 6, 2023

thejayps commented Jun 6, 2023

thejayps commented Jun 6, 2023

rptb1 left a comment • edited Loading

Choose a reason for hiding this comment

thejayps commented Jun 6, 2023 • edited Loading

UNAA008 left a comment • edited by thejayps Loading

Choose a reason for hiding this comment

rptb1 commented Jun 6, 2023

rptb1 commented Jun 6, 2023 • edited Loading

rptb1 commented Jun 6, 2023 • edited Loading

thejayps commented Jun 6, 2023

thejayps commented Jun 6, 2023

rptb1 commented Jun 9, 2023

thejayps commented Jun 15, 2023

thejayps commented Jun 15, 2023

thejayps commented Jun 15, 2023

thejayps left a comment

Choose a reason for hiding this comment

thejayps commented Jun 16, 2023

thejayps commented Jun 16, 2023

thejayps commented Jun 16, 2023

thejayps commented Jun 16, 2023

thejayps commented Jun 16, 2023

thejayps commented Jun 16, 2023

rptb1 commented Jun 17, 2023

thejayps commented Jun 2, 2023 •

edited

Loading

rptb1 commented Jun 5, 2023 •

edited by thejayps

Loading

thejayps commented Jun 5, 2023 •

edited

Loading

thejayps commented Jun 5, 2023 •

edited

Loading

thejayps commented Jun 5, 2023 •

edited

Loading

rptb1 left a comment •

edited

Loading

thejayps commented Jun 6, 2023 •

edited

Loading

UNAA008 left a comment •

edited by thejayps

Loading

rptb1 commented Jun 6, 2023 •

edited

Loading

rptb1 commented Jun 6, 2023 •

edited

Loading