
Use server for concurrency of 1 when writing analysis output & readability factors #54

Merged: 11 commits, merged into main on Mar 24, 2024

Conversation

@EllAchE (Collaborator) commented on Mar 10, 2024

No description provided.

@EllAchE changed the title from "rename and create some functions" to "Use server for concurrency of 1 when writing analysis output & readability factors" on Mar 11, 2024
src/metrics/promotions.ts: 3 resolved review threads
Review thread on `launchQueueServer`:

```typescript
// will just write to wherever the process is running, but the server needs to
// be launched from the same directory, so we use an abs path
export const RESULTS_PATH = `${__dirname}/results.json`;

function launchQueueServer() {
```
@bennyrubanov (Owner):
I like this. Curious if it works; have you tested the decompression script with this?
As discussed on our call, if this doesn't work, I think you could make a bunch of unique results.json files (one for each analysis) and then append all of them at the end.
Alternatively, you could make one results.json file for each item of the batch being run. I.e., if the code is run in batches of 13, make 13 results files (results1.json, results2.json, etc.), make sure each item writes to one of them, and then append the 13 files at the end. I think this might actually be the easiest thing to implement (if the server doesn't work; I don't know much about the createServer function).
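For reference, a minimal sketch of what a `launchQueueServer` along these lines could look like, assuming Node's built-in `http` module; the port, endpoint shape, and merge logic are illustrative, not taken from the PR:

```typescript
import * as http from 'http';
import * as fs from 'fs';

const RESULTS_PATH = `${__dirname}/results.json`;
const PORT = 8585; // assumed port, not from the PR

function launchQueueServer(): void {
  // A single process owns results.json. Node runs one request callback at a
  // time, and the read-merge-write below is synchronous, so writes from many
  // concurrent analysis workers are effectively serialized (concurrency of 1).
  const server = http.createServer((req, res) => {
    let body = '';
    req.on('data', (chunk) => (body += chunk));
    req.on('end', () => {
      const incoming = JSON.parse(body);
      const existing = fs.existsSync(RESULTS_PATH)
        ? JSON.parse(fs.readFileSync(RESULTS_PATH, 'utf8'))
        : {};
      // Shallow merge; combining weighted averages would go here.
      fs.writeFileSync(
        RESULTS_PATH,
        JSON.stringify({ ...existing, ...incoming }, null, 2)
      );
      res.end('ok');
    });
  });
  server.listen(PORT);
}
```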

@EllAchE (Collaborator, Author):

What is actually in results.json? I assumed separate files would be a problem because we'd have to track weighted averages (which we don't right now), which is why I went with the queue process.

@bennyrubanov (Owner):

Objects that look like this (after running the script):

```json
{
  "Number of games analyzed": 23,
  "maxKDRatio": 14,
  "pieceWithHighestKDRatio": [
    "K"
  ],
  "KDRatios": {
    "RA": 1.4,
    "NB": 0.8235294117647058,
    "BC": 0.7777777777777778,
    "Q": 1.8571428571428572,
    "K": 14,
    "BF": 0.8333333333333334,
    "NG": 1.4615384615384615,
    "RH": 1.0833333333333333,
    "PA": 0.25,
    "PB": 0.8333333333333334,
    "PC": 0.7142857142857143,
    "PD": 0.5714285714285714,
    "PE": 0.9285714285714286,
    "PF": 1.2,
    "PG": 0.42857142857142855,
    "PH": 0.2857142857142857,
    "ra": 1.4545454545454546,
    "nb": 0.8235294117647058,
    "bc": 1.2941176470588236,
    "q": 3.1,
    "k": 7.5,
    "bf": 0.6470588235294118,
    "ng": 1.4,
    "rh": 1,
    "pa": 0.8571428571428571,
    "pb": 0.6,
    "pc": 0.5,
    "pd": 0.5,
    "pe": 0.35714285714285715,
    "pf": 0.2222222222222222,
    "pg": 0.6666666666666666,
    "ph": 0
  },
```

src/results.json: 1 resolved review thread (outdated)
src/run_metrics_on_file.ts: 2 resolved review threads
Review thread on the new file path logic:

```typescript
const base_path = `${__dirname}/../../data/${file.replace('.zst', '')}`;

// Create a new file path
const newFilePath = `${base_path}_${randomUUID()}`;
```
@bennyrubanov (Owner):

Does randomUUID create a random file path number? I prefer counting them; it also corresponds better with the log here: ``console.log(`Creating file #${file_counter} at ${newFilePath}`);``

@EllAchE (Collaborator, Author):

The issue with counting is that under high concurrency the numbers will not match up. We could do it by game identifier or introduce locking if necessary, but for uniqueness I use a UUID.

@bennyrubanov (Owner):

Let's do both? One (the counter) for logs, one (the UUID) for avoiding concurrency/overwrite issues?
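A sketch of how both could combine; `makeFilePath` is a hypothetical helper, not code from the PR:

```typescript
import { randomUUID } from 'crypto';

let file_counter = 0; // readable for logs; only meaningful within one process

function makeFilePath(base_path: string): string {
  file_counter += 1;
  // The UUID, not the counter, guarantees uniqueness across concurrent
  // processes; the counter exists purely for log readability.
  const newFilePath = `${base_path}_${randomUUID()}`;
  console.log(`Creating file #${file_counter} at ${newFilePath}`);
  return newFilePath;
}
```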

@bennyrubanov (Owner):

Since the file was renamed, it's hard for me to see what specific changes you've made here. Please test this script before merging into main to make sure it's working properly. I'd prefer above all if you could Loom/screen-record running the script on a smaller dataset, and I can check whether it works properly. Or I can run it myself; let me know when it's ready to be run (now?).

@EllAchE (Collaborator, Author):

I'll see if I can run this.

Comment on lines 85 to 86:

```typescript
const base_path = `${__dirname}/../../data/${file.replace('.zst', '')}`;
```
@bennyrubanov (Owner):

not accurately tracking

@bennyrubanov (Owner):

```
Creating file #1 at /Users/bennyrubanov/Coding_Projects/chessanalysis/src/../../data/lichess_db_standard_rated_2013-01.pgn_e261b15e-1f2b-4e3b-b3c9-d4c352af17c1
```

This should be chessanalysis/data/lichess...

@EllAchE (Collaborator, Author):

If you run this with ts-node, the paths will be different than when running the transpiled code. That's probably why you saw all the differences.

@bennyrubanov (Owner):

I think my changes addressed this properly?

@bennyrubanov (Owner) commented:

Tried running the decompression script using `ts-node src/zst_decompressor.ts` and ran into lots of issues. I fixed the file path ones, but then it started creating way too many decompressed files (about 7.5k of them, each 8kb in size, instead of files close to SIZE_LIMIT).

It's a pain to figure out what you changed because the whole file got renamed, so I can't see specific line changes. Somewhere in the handling of the file writes and the SIZE_LIMIT check, something got moved into the wrong order. Can you review this and figure out what got broken?

@EllAchE (Collaborator, Author) commented on Mar 20, 2024:

> Tried running the decompression script using `ts-node src/zst_decompressor.ts` and ran into lots of issues. […]

Note that using ts-node vs. transpiled code will cause issues with paths.

```diff
@@ -108,9 +114,10 @@ const decompressAndAnalyze = async (file, start = 0) => {

   // https://www.npmjs.com/package/node-zstandard#decompressionstreamfromfile-inputfile-callback
   zstd.decompressionStreamFromFile(
-    `${__dirname}/../../data/${file}`,
+    `${compressedFilePath}/${file}`,
```
@EllAchE (Collaborator, Author) commented on Mar 20, 2024:

@bennyrubanov FYI, these aren't bugs. They're differences between your workflow, which uses ts-node, and mine, which transpiles to JS first.

@bennyrubanov (Owner):

Understood. Can we write the code to work with both?
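One possible approach (a sketch, not code from the PR): resolve data paths from the project root instead of `__dirname`, so ts-node (running from `src/`) and transpiled output (running from a build directory) agree, assuming scripts are always launched from the repo root:

```typescript
import * as path from 'path';

// process.cwd() is where the script was launched from. If scripts are always
// run from the repo root, this resolves identically under ts-node and under
// transpiled JS, unlike __dirname, which points at different directories.
const DATA_DIR = path.join(process.cwd(), 'data');

function dataPath(file: string): string {
  return path.join(DATA_DIR, file);
}
```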

@EllAchE (Collaborator, Author) commented on Mar 20, 2024:

> Tried running the decompression script using `ts-node src/zst_decompressor.ts` and ran into lots of issues. […]

The 8kb files are probably because I'm reassigning a new file path on every write; it should only happen when the size exceeds the limit.
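In other words, the size check needs to gate the reassignment. A sketch of the intended order, with an assumed SIZE_LIMIT value and illustrative names:

```typescript
import * as fs from 'fs';
import { randomUUID } from 'crypto';

const SIZE_LIMIT = 500 * 1024 * 1024; // assumed value, not from the PR
const basePath = `${__dirname}/../../data/decompressed`; // illustrative

let currentPath = `${basePath}_${randomUUID()}`;

function writeChunk(chunk: Buffer): void {
  fs.appendFileSync(currentPath, chunk);
  // Rotate to a fresh file only after the current one exceeds the limit;
  // reassigning the path on every write is what produced the ~8kb files.
  if (fs.statSync(currentPath).size >= SIZE_LIMIT) {
    currentPath = `${basePath}_${randomUUID()}`;
  }
}
```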

@EllAchE (Collaborator, Author) commented on Mar 20, 2024:

> Tried running the decompression script using `ts-node src/zst_decompressor.ts` and ran into lots of issues. […]

The specific issue of the small files is fixed.

@EllAchE (Collaborator, Author) commented on Mar 20, 2024:

@bennyrubanov I've made some updates to this script. There is remaining work before this runs out of the box, but I'd prefer to merge and then come back with follow-ups. Specifically:

  • results.json should be read by the queue process and combined there to ensure no race conditions
  • orphaned processes should be killed
  • max concurrency should be set and enforced (see the sketch after this list)
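A sketch of what the max-concurrency follow-up could look like (a hypothetical helper, not part of this PR); the limit is enforced by a fixed pool of workers pulling tasks:

```typescript
async function runWithLimit<T>(
  tasks: Array<() => Promise<T>>,
  limit: number
): Promise<T[]> {
  const results: T[] = [];
  let next = 0;
  // Spawn at most `limit` workers; each pulls the next task index until none
  // remain. The next++ increment is safe because the event loop runs this
  // code single-threaded between awaits.
  const workers = Array.from({ length: Math.min(limit, tasks.length) }, async () => {
    while (next < tasks.length) {
      const i = next++;
      results[i] = await tasks[i]();
    }
  });
  await Promise.all(workers);
  return results;
}
```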

…size and file size limit, as well as error catch for missing files
@bennyrubanov merged commit 4294a7d into main on Mar 24, 2024. 1 check passed.