All Redshift and Snowflake models: Fix varchar length for pseudonymized fields #121

Open
agnessnowplow opened this issue Feb 16, 2022 · 0 comments

If the PII pseudonymization enrichment is enabled and run, the length of the target fields may change depending on the hashing algorithm used. The complete list of fields that may be hashed can be found here. This can prevent the commit steps from running when the character length defined in the models' table definitions does not match the incoming data, in particular when the source data is longer than the target column.
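To illustrate how the hashed length depends on the algorithm, here is a minimal sketch using Python's standard `hashlib`. It assumes the enrichment emits hex-encoded digests (consistent with the 128-character figure for SHA-512); the algorithm list and the sample input are illustrative assumptions, not taken from the enrichment's configuration.

```python
import hashlib

# Illustrative algorithm list (an assumption); the enrichment's own
# list of supported hash functions may differ.
ALGORITHMS = ["md5", "sha1", "sha256", "sha384", "sha512"]


def hex_digest_length(algorithm: str) -> int:
    """Return the number of hex characters a digest of this algorithm occupies."""
    # The input value is arbitrary: digest length does not depend on it.
    return len(hashlib.new(algorithm, b"example-domain-userid").hexdigest())


DIGEST_LENGTHS = {name: hex_digest_length(name) for name in ALGORITHMS}

if __name__ == "__main__":
    for name, length in DIGEST_LENGTHS.items():
        print(f"{name}: {length} characters")
```

Any column narrower than the digest length of the configured algorithm would not be able to hold the pseudonymized value.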

One possible solution is to increase the length of every potentially impacted column to 128 characters, the longest value that can occur (which could result from SHA-512 being used), wherever the length currently defined in the model is less than that. Based on this criterion, domain_userid and session_id seem to be the fields that are impacted (Redshift and Snowflake, web and mobile models).
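The selection criterion above could be sketched roughly as follows. The column widths in `CURRENT_WIDTHS` are hypothetical placeholders, not the actual definitions in the models, and `columns_to_widen` is a helper name introduced only for this illustration.

```python
# Longest hashed value mentioned in the issue: hex-encoded SHA-512.
TARGET_LENGTH = 128

# Hypothetical current column widths (assumed values for illustration only;
# check the real table definitions in each model).
CURRENT_WIDTHS = {
    "domain_userid": 36,
    "session_id": 36,
    "network_userid": 128,
}


def columns_to_widen(widths: dict[str, int], target: int = TARGET_LENGTH) -> list[str]:
    """Return the columns whose defined length is shorter than the target."""
    return sorted(col for col, width in widths.items() if width < target)


if __name__ == "__main__":
    for column in columns_to_widen(CURRENT_WIDTHS):
        print(f"{column}: widen to varchar({TARGET_LENGTH})")
```

Columns that are already at least 128 characters wide are left untouched, which matches the "only where it is currently less" condition.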
