Response Time Decomposition (RTD)

Exploring the response time decomposition of student action logs in ASSISTments.
Collaborators:
       Anthony F. Botelho, Ph.D.
       Ashish Gurung, Ph.D.

Link to the data [NOTE: While the code is available under the MIT license, the dataset is provided under a different license, which can be found here.]



Analysis Replication Guide

If you wish to replicate the analysis without running the preprocessing, download the following three CSV files from the drive:

  1. RTD_data_randomsample_20K_new.csv
  2. hint_infos.csv
  3. assignment_problem_npc_infos_with_priors.csv

Once you have saved the CSV files in the data folder in your workspace, run
../analysis/paper_results_replication_file.py
and the results in the paper should be replicated.

[NOTE: As our analysis was exploratory in nature, paper_results_replication_file.py only replicates what we reported in the paper. The other files provide insight into the other aspects of user behavior that we explored.]
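Before running the replication script, it can help to confirm that the three CSV files are in place. The following is a minimal sketch, not part of the repository: the helper name missing_inputs is hypothetical, and it assumes the data folder is named data and sits in the current working directory.

```python
from pathlib import Path
from typing import List

# File names as listed in the replication guide above.
REQUIRED_FILES = [
    "RTD_data_randomsample_20K_new.csv",
    "hint_infos.csv",
    "assignment_problem_npc_infos_with_priors.csv",
]


def missing_inputs(data_dir: Path) -> List[str]:
    """Return the required CSV names that are not present in data_dir."""
    return [name for name in REQUIRED_FILES if not (data_dir / name).is_file()]


if __name__ == "__main__":
    # Assumption: the data folder is ./data relative to where this is run.
    missing = missing_inputs(Path("data"))
    if missing:
        print("Missing input files:", ", ".join(missing))
    else:
        print("All input files found.")
```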


The following is the order of execution of the files in the project for preprocessing:

  1. libreoffice_prep.py
    This is the first script to run. It sorts the data, ensures that everything is in order, and automates the generation of the additional features. It takes the ...random_sample_20K.csv data and outputs RTD_data_randomsample_20K.csv.
    Once RTD_data_randomsample_20K.csv is generated, using LibreOffice to generate the feature values is the quicker option. The preprocessing in Python took too long, so we explored doing this step in LibreOffice instead.


    1. Make sure to run libreoffice_prep.py beforehand
      Run the preprocessing to clean the pr and ps columns along with the pair features.
    2. Generate action_action_pairs
      This pairs each relevant action a user makes on a problem with the action that follows it, generating the action pairs the user produced while solving the problem.
      Formula:
      =IF(M2 = -1, K2, CONCAT(K2, "_", K3))
      column M : pr
      column K : action_type
    3. Generate action_action_pairs_time_taken
      This calculates the time taken by a user for each action pair while solving the problem.
      Formula:
      =ROUND(IF(OR(L2 <> L3, C2<>C3), 0, G3 - G2), 4)
      column L : ps
      column G : action_unix_time [Unix time in seconds]
      column C : user_xid
    4. Generate pr_answered_correctly_pair
      This checks whether the action pair led to a correct answer to the pr.
      Formula:
      =IF(AND(K3="StudentResponseAction", M4=-1), 1, IF(AND(M3=-1, M2 <> -1),P1,0))
      column N: action_action_pairs
    5. Generate attempts made per Problem:
      This counts the attempts a student made in order to answer the pr.
      Formula:
      =IF(M2 <> -1, IF(K3="StudentResponseAction", Q1+1, Q1), 0)
      column M: pr
      column K: action_type
      column Q: number_of_attempts made in the problem
    6. Generate hints requested per Problem:
      This counts the hints a student requested while answering the pr.
      Formula:
      =IF(M2 <> -1, IF(K3="HintRequestedAction", R1+1, R1), 0)
      column M: pr
      column K: action_type
      column R: number_of_hints accessed in the problem
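For readers who want to see what the spreadsheet formulas compute, steps 2, 3, 5, and 6 can be sketched in plain Python. This is a minimal sketch under stated assumptions, not code from the repository: the function name add_pair_features, the output column names number_of_attempts and number_of_hints, the ProblemStartedAction value, and the sample rows are all illustrative, and it assumes that pr = -1 marks the last action of a problem, as the formulas suggest. Step 4 is left out.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]

LAST_ACTION = -1  # assumption: pr == -1 flags the final action of a problem


def add_pair_features(rows: List[Row]) -> List[Row]:
    """Return copies of rows with the columns from steps 2, 3, 5 and 6 added.

    The rows must already be sorted as in the spreadsheet, because every
    formula compares a row with the row directly below it.
    """
    out: List[Row] = []
    attempts = 0  # running value of column Q
    hints = 0     # running value of column R
    for i, row in enumerate(rows):
        nxt = rows[i + 1] if i + 1 < len(rows) else None
        next_action = nxt["action_type"] if nxt is not None else ""
        new = dict(row)

        # Step 2: =IF(M2 = -1, K2, CONCAT(K2, "_", K3))
        if row["pr"] == LAST_ACTION:
            new["action_action_pairs"] = row["action_type"]
        else:
            new["action_action_pairs"] = f'{row["action_type"]}_{next_action}'

        # Step 3: =ROUND(IF(OR(L2 <> L3, C2 <> C3), 0, G3 - G2), 4)
        same_problem = (
            nxt is not None
            and nxt["ps"] == row["ps"]
            and nxt["user_xid"] == row["user_xid"]
        )
        if same_problem:
            delta = nxt["action_unix_time"] - row["action_unix_time"]
            new["action_action_pairs_time_taken"] = round(delta, 4)
        else:
            new["action_action_pairs_time_taken"] = 0

        # Steps 5 and 6: running counts that look at the next row's action
        # and reset to 0 on the row where pr == -1.
        if row["pr"] == LAST_ACTION:
            attempts = 0
            hints = 0
        else:
            attempts += int(next_action == "StudentResponseAction")
            hints += int(next_action == "HintRequestedAction")
        new["number_of_attempts"] = attempts
        new["number_of_hints"] = hints
        out.append(new)
    return out


# Hypothetical sample: one user works on one problem, then another user starts.
sample = [
    {"user_xid": "u1", "ps": 10, "pr": 0, "action_type": "ProblemStartedAction", "action_unix_time": 100.0},
    {"user_xid": "u1", "ps": 10, "pr": 1, "action_type": "HintRequestedAction", "action_unix_time": 112.5},
    {"user_xid": "u1", "ps": 10, "pr": -1, "action_type": "StudentResponseAction", "action_unix_time": 130.25},
    {"user_xid": "u2", "ps": 11, "pr": -1, "action_type": "ProblemStartedAction", "action_unix_time": 200.0},
]
result = add_pair_features(sample)
```

Because the counters follow the formulas literally, the count on a given row already includes the action on the next row, and the final row of each problem shows 0.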