Data Transformation Exercise

This exercise is based on a real problem that we had to solve at EFL. It is designed to show you an example of the kinds of challenges you would be working on at EFL, as well as to help us understand how you approach these kinds of challenges! Good luck, and above all, have fun!

Data Transformation Exercise

The company where you work has created an online survey. Whenever someone completes this survey, their responses and some metadata are collected and sent to the backend as a list of JSON objects.

After validation, this data must be transformed so it can be correctly stored in the database.

Your task is to design and build a process to transform the incoming data from payload.json (included in this GitHub repo) so that it can be imported into the assessmentResponse table described below.

You must use Python (any version) to complete this exercise.
You may use any frameworks/ORMs/libraries/etc. that you are comfortable with.
Your solution does not actually have to import anything into a SQL database, but it must produce something that would be trivially easy to import.

Model

Here's the model that you have to work with. Unfortunately, it's part of the legacy application and doesn't quite match the structure of the data from the new survey application. But, lots of existing systems rely on it, so you'll just have to find a way to make the data fit.

CREATE TABLE assessmentResponse (
  id integer PRIMARY KEY,
  questionCode varchar(104),
  questionResponse varchar(104),
  section varchar(104),
  timeElapsed smallint,
  viewsCount smallint,
  responseSequence varchar(104)
);

Example Data

To give yourself an idea of how you want it to look once saved in the database, you manually transformed one of the questions/steps.

Input JSON

{
  "uid":      "b1435d5771c14c2498642f07ce04b087",
  "step":     "likert",
  "sequence": 1,

  "observations": {
    "cheat_customer": {
      "responseValue":    "5",
      "responseSequence": ["5"],
      "timeElapsed":      8,
      "viewCount":        1
    },

    "work_integrity": {
      "responseValue":    "3",
      "responseSequence": ["2", "3", "4", "3"],
      "timeElapsed":      48,
      "viewCount":        5
    },

    "personal_honesty": {
      "responseValue":    "1",
      "responseSequence": ["5", "1"],
      "timeElapsed":      17,
      "viewCount":        2
    }
  },

  "metas": {
    "timeElapsed":    486,
    "helpBarClicks":  3,
    "viewCount":      1
  }
}

Transformed Values

+-----------------------+------------------+-------------+-------------+------------+----------------------+
| questionCode          | questionResponse | section     | timeElapsed | viewsCount | responseSequence     |
+-----------------------+------------------+-------------+-------------+------------+----------------------+
| cheat_customer        | 5                | likert      |           8 |          1 | ["5"]                |
| work_integrity        | 3                | likert      |          48 |          5 | ["2", "3", "4", "3"] |
| personal_honesty      | 1                | likert      |          17 |          2 | ["5", "1"]           |
| _likert               | NULL             | likert      |         486 |          1 | NULL                 |
| _likert_helpBarClicks | 3                | likert      |        NULL |       NULL | NULL                 |
+-----------------------+------------------+-------------+-------------+------------+----------------------+

Hints

Some of the observations in payload.json have attributes that don't map nicely to assessmentResponse. Your code should still try to import these values... somehow.
You may need to make assumptions about the meaning/purpose of some of the data. As long as you document it somewhere (and it's... you know, plausible 🙂), we'll play along.
This is a timed exercise, so focus your attention on the areas that will best showcase your skills and talents, while still completing as many of the requirements as possible. An amazing solution that's 75% complete will be way better received than an okay solution that's 100% complete.

Name		Name	Last commit message	Last commit date
Latest commit History 18 Commits
README.md		README.md
payload.json		payload.json

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Data Transformation Exercise

Model

Example Data

Input JSON

Transformed Values

Hints

About

Releases

Packages

Contributors 3

lenddoefl/exercise-data-transformation

Folders and files

Latest commit

History

Repository files navigation

Data Transformation Exercise

Model

Example Data

Input JSON

Transformed Values

Hints

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Contributors 3

Packages