-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Data generation logic #1
base: master
Are you sure you want to change the base?
Conversation
This data all fits into memory in one run and takes less than a minute to load into snowflake so it will completely reset nightly. Overview of (max) granularity of each table - One time: -- Databases, Tables - Daily: -- DatabaseStorageUsageHistory, StorageUsage - Hourly: -- LoadHistory, LoginHistory - Minute: -- QueryHistory
@brln-looker - ready for you to review. I didn't end up making any comments on this PR, hopefully it's all reasonably self explanatory |
} | ||
]; | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
} | ||
]; | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
this.ERROR_COUNT = Math.random() < 0.1 ? Math.round(Math.random() * 10) : 0; | ||
} | ||
|
||
static oddsNew() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
personally I would define this as a const at the top of the module, const ODDS_NEW 0.2
but I don't really have any good reason to do it one way or the other.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Early on I thought I'd have a lot more "dice rolling" to create an object than I ended up with. So I made this DataRow.rollDice()
/ SubClass.oddsNew()
construct that ended up not being used very much (and actually DataRow.oddsNew()
should be static
). What I'd like is class level properties but doesn't seem like those exist in JS?
I could set const ODDS_NEW = 0.2
at the module level and then reference it in the static oddsNew()
method in each class to help with the "magic number" issue, is that what you're referring to? or would you do the check differently?
} | ||
]; | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
data_rows/QueryHistory.js
Outdated
constructor(databaseName, date) { | ||
super(); | ||
this.QUERY_ID = uuid4(); | ||
this.setQueryText(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in DatabaseStorageUsageHistory you use the _convention
to annotate private methods. that's not super standard for node, but I like it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oops, sorry for lack of consistency. I'll switch everything applicable to "private"
data_rows/QueryHistory.js
Outdated
import uuid4 from "uuid/v4"; | ||
|
||
import DataRow from "./DataRow"; | ||
import { LoginHistory } from "./LoginHistory"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if I'm only exporting one thing from a module, I like to use export default, then you don't have to put braces around it like this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
when I switched to using the "package" (./index.js) I got errors and thought I had to remove the default. But that's not the case, not sure what I was doing wrong but I'll go back to defaults
[this.WAREHOUSE_NAME, this.WAREHOUSE_SIZE] = helpers.randomFromArray([ | ||
["BIG_WH", "X-Large"], | ||
["SMALL_WH", "Small"] | ||
]); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fancy
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
are you referring to my super original names/sizes or the "de-structuring" feature of JS (python's unpacking, php's list()
)? I'm happy to see JS has it too :-)
} | ||
]; | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
} | ||
]; | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
if (this.WAREHOUSE_NAME == "SMALL_WH") { | ||
this.CREDITS_USED *= 0.3; | ||
} | ||
this.CREDITS_USED.toFixed(2); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
heads up that .toFixed outputs a string, but the column type here is marked as a float. it probably gets cast somewhere in the data upload, but just making sure you're aware.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't know that. yeah, it just "sort of works" and makes the data be 2 points of precision instead of a butt load when it finally gets into the db. I think native type gets kind of lost in writing the csv file and the Types
stuff we have is just for creating the table columns in snowflake
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a/ there's a bug I fixed here (needed to assign the result of toFixed
, it doesn't mutate in place)
b/ now I see the string problem:
> const n1 = 0.3252.toFixed(2);
undefined
> const n2 = 0.3652.toFixed(2);
undefined
> n1 + n2
'0.330.37'
oops...
main.js
Outdated
} | ||
} | ||
|
||
backfill().catch(e => console.log(e)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
lgtm. i should have marked this as a "review" but started doing individual comments and didn't want to switch halfway. oh well next time. |
88ed58c
to
579f7bb
Compare
I added a few feedback commits. I also still have a couple tweaks to make after showing the data to Jeffty but I'll put those up tomorrow. |
and fix toFixed bug
Also add weight to different queries
3bb4de4
to
f238772
Compare
@brln-looker - data tweak fixes are up, I think this is ready to go |
This data all fits into memory in one run and takes less than a minute
to load into snowflake so it will completely reset nightly.
Overview of (max) granularity of each table
-- Databases, Tables
-- DatabaseStorageUsageHistory, StorageUsage
-- LoadHistory, LoginHistory
-- QueryHistory