Implement a ReIdentify Pipeline #36

Open
anantdamle opened this issue Oct 25, 2021 · 0 comments
Labels
enhancement New feature or request help wanted Extra attention is needed

Comments

@anantdamle
Collaborator

Currently, the project supports inspecting and deidentifying sensitive data in several data sources.
Some business use cases also need to reidentify that data.

Implement a reidentify pipeline that:

  1. Uses the same encryption configuration as the deidentify pipeline
  2. Supports the following sources:
    a. BigQuery table or query
    b. AVRO file
  3. Supports writing output as a BigQuery table or as an AVRO or CSV file
  4. Supports emitting only a subset of columns (all columns if no subset is provided)
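
To make the requested behavior concrete, here is a minimal sketch of the options such a pipeline might accept and of the column-subset rule in item 4. Everything in it is hypothetical: `ReidentifyOptions`, `SourceType`, `SinkType`, and `select_columns` are illustrative names, not part of the project, and treating CSV as a third output format is an assumption about item 3. It does no decryption and only shows the shape of the configuration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any, Dict, Optional, Sequence


class SourceType(Enum):
    """Input kinds listed in item 2 (hypothetical names)."""
    BIGQUERY_TABLE = "bigquery_table"
    BIGQUERY_QUERY = "bigquery_query"
    AVRO = "avro"


class SinkType(Enum):
    """Output kinds listed in item 3 (CSV support is an assumption)."""
    BIGQUERY_TABLE = "bigquery_table"
    AVRO = "avro"
    CSV = "csv"


@dataclass(frozen=True)
class ReidentifyOptions:
    """Hypothetical option bundle for a reidentify run."""
    source_type: SourceType
    source: str                  # table name, SQL query, or file pattern
    sink_type: SinkType
    sink: str                    # table name or output path
    encryption_config_path: str  # same config the deidentify run used (item 1)
    columns: Optional[Sequence[str]] = None  # None means "emit all columns"


def select_columns(
    record: Dict[str, Any], columns: Optional[Sequence[str]] = None
) -> Dict[str, Any]:
    """Return only the requested columns, or every column if none are requested."""
    if not columns:
        return dict(record)
    missing = [name for name in columns if name not in record]
    if missing:
        raise KeyError(f"Requested columns not present in record: {missing}")
    return {name: record[name] for name in columns}
```

For example, `select_columns({"name": "x", "email": "y"}, ["email"])` keeps only `email`, while omitting the second argument returns the whole record. Failing on unknown column names is a design choice made for this sketch; a real pipeline might prefer to validate the subset once against the source schema.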