gstudio Data Almanac

SDQT edited this page Feb 19, 2018 · 1 revision

Last updated on: Feb 19, 2018

This document explains how the gstudio platform logs data and what data it makes available.


Part 1: Platform data - overview

  1. The CLIx platform (https://staging-clix.tiss.edu) is built on the gstudio software architecture.

  2. gstudio (https://github.com/gnowledge/gstudio) is a node description framework (NDF).

  3. gstudio curriculum/course hierarchy:

     [Level1] Module
         → [Level2] **Unit**  ⇒ treated as a Group; Enrollment, Progress Report and Assessment are mapped at
                            this level. It is the most important learning entity/node.
             → [Level3] Lesson
                 → [Level4] Activity Page
                     → text
                     → audio
                     → videos
                     → assessments ⇒ offered by the Open Assessment Tool, external to gstudio
                     → quiz items
                     → interactive tools ⇒ external to gstudio
    
  4. Platform affordances: gstudio facilitates constructionist and collaborative learning and, by design, offers affordances that can be used to collect explicit user-generated artefacts:

    4.1. Gallery: learners can

    • Upload files
    • Rate files uploaded by other learners
    • Provide feedback/comments on files uploaded by other learners
    • Upload alternate files, transcripts for audio/video, and subtitles for videos

    4.2. Notebook:

    • Write a note
    • Rating
    • Provide feedback/comment on notes written by other learners

    4.3. Activity page level interactions:

    • Feedback/Discuss/Comments
    • Rating

All these affordances are organised/collated at the Unit level.

Platform Data: gstudio platform data can be grouped into five categories:

  1. Quantitative data: Exported as user progress report CSVs
  2. Qualitative data
  3. Collected data details
  4. Quiz data
  5. Benchmark data

Part 2: Platform data - details

Quantitative data

  • At the unit level, CSV files are generated [Link]
  • In schools, CSVs are generated every hour so that data is preserved despite electricity failures or other unanticipated technical glitches.
  • The CSV contains the following fields:
  1. server_id
  2. school_name
  3. school_code
  4. unit_name
  5. username
  6. user_id
  7. total_lessons
  8. lessons_completed
  9. percentage_lessons_completed
  10. total_activities
  11. activities_completed
  12. percentage_activities_completed
  13. total_quizitems
  14. visited_quizitems
  15. attempted_quizitems
  16. unattempted_quizitems
  17. correct_attempted_quizitems
  18. notapplicable_quizitems
  19. incorrect_attempted_quizitems
  20. user_files
  21. total_files_viewed_by_user
  22. other_viewing_my_files
  23. unique_users_commented_on_user_files
  24. total_rating_rcvd_on_files
  25. commented_on_others_files
  26. cmts_on_user_files
  27. total_cmnts_by_user
  28. user_notes
  29. others_reading_my_notes
  30. cmts_on_user_notes
  31. cmnts_rcvd_by_user
  32. total_notes_read_by_user
  33. commented_on_others_notes
  34. total_rating_rcvd_on_notes
  35. correct_attempted_assessments
  36. unattempted_assessments
  37. visited_assessments
  38. notapplicable_assessments
  39. incorrect_attempted_assessments
  40. attempted_assessments
  41. Total_assessment_items
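
As a rough illustration of how an exported progress CSV might be consumed, the sketch below reads rows with Python's `csv` module and recomputes the lesson completion percentage from `lessons_completed` and `total_lessons`. It assumes the first row of the file is a header that uses the field names listed above and that the count columns hold integers. The sample data and the helper name `lesson_completion` are hypothetical.

```python
import csv
import io


def lesson_completion(csv_file):
    """Return {username: percentage of lessons completed} from a progress CSV.

    Assumes a header row containing username, total_lessons and
    lessons_completed (an assumption about the export layout).
    """
    result = {}
    for row in csv.DictReader(csv_file):
        total = int(row["total_lessons"])
        done = int(row["lessons_completed"])
        # Guard against units that have no lessons yet.
        result[row["username"]] = round(100.0 * done / total, 2) if total else 0.0
    return result


# Hypothetical sample containing only the columns used here.
sample = io.StringIO(
    "username,unit_name,total_lessons,lessons_completed\n"
    "learner_a,Unit 1,8,2\n"
    "learner_b,Unit 1,8,8\n"
)
print(lesson_completion(sample))
```

Recomputing the percentage rather than reading `percentage_lessons_completed` directly is only a consistency check; either column could be used.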

Quiz data

  • Gstudio has three types of quiz items:
  1. Multiple Choice Question - single select
  2. Multiple Choice Question - multiple select
  3. Short answer/descriptive answer
  • Depending on the settings, user input is logged with a timestamp.
  • There is no timer or clickstream data collection.
  • Course administrators can download quiz responses in real time as CSV or PDF. Ex: Pre-CLIx Survey and Post-CLIx survey.

Qualitative data

All user-generated data from the platform affordances described in item 4 of Part 1 is available with buddy details and timestamps.

Benchmark data: gstudio collects a large amount of data at various node levels using benchmarks. Mining this data opens up various possibilities.

At present, the following can be reliably extracted by mining the benchmark data:

  • Activity page
    • Number of visits
    • Timestamp of visits
    • Activity Type (AssessmentID, Tool-name, mime-type)
  • Number of visits to the Group analytics page
  • Number of visits to the Progress Report (disabled in 2017-18)
  • Downloads from gallery/resources
    • File name
    • User id (with buddies)
    • Timestamp

Further deriving/mining of this data may make it possible to infer:

  • Time spent by a user on Activity page
  • Sequence of Activity page visit by a user
  • Language tag of Activity page visited
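
One way the first two inferences could be derived is sketched below: each user's visits are sorted by timestamp, the ordered page list gives the visit sequence, and the gap until the same user's next visit serves as an estimate of time spent. The record layout (`user_id`, `page`, `timestamp`) and the function name `visit_sequences` are assumptions for illustration, not the actual benchmark schema. The last visit of each user has no following visit, so no duration is estimated for it.

```python
from collections import defaultdict
from datetime import datetime


def visit_sequences(visits):
    """Group visit records by user and estimate time spent per page visit.

    Each record is a dict with hypothetical keys: user_id, page and
    timestamp (ISO 8601 string).
    Returns {user_id: [(page, seconds_or_None), ...]} in chronological order.
    """
    by_user = defaultdict(list)
    for v in visits:
        by_user[v["user_id"]].append(
            (datetime.fromisoformat(v["timestamp"]), v["page"])
        )

    result = {}
    for user, items in by_user.items():
        items.sort(key=lambda item: item[0])  # chronological order
        sequence = []
        for i, (ts, page) in enumerate(items):
            if i + 1 < len(items):
                # Time until the next visit approximates time on this page.
                seconds = (items[i + 1][0] - ts).total_seconds()
            else:
                seconds = None  # no later visit to measure against
            sequence.append((page, seconds))
        result[user] = sequence
    return result
```

Because there is no clickstream data, a gap between two visits also includes any idle time, so long gaps should be treated as an upper bound rather than actual engagement.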

Data that is NOT being logged by gstudio platform (because gstudio is not designed to collect it):

  • Clickstream data
  • Actions in the audio/video player (because an external player is used)

Cross-reference to external/integrated components

  • A Unit holds references to all the AssessmentIDs embedded in its child Activities.
  • An Activity page holds details about its type of content (AssessmentID, Tool-name, mime-type).
  • The Open Assessment Tool/OEA player logs data independently of the gstudio platform, as explained here [Link]
  • External tools and interactives log data (usually as json files) according to their own design and save it at the gstudio-configurable endpoint /data/gstudio_tools_logs/<tool-name> as explained here.
  • During 2017-18, PoliceQuad was the only tool configured to log data.
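
A minimal sketch of collecting such tool logs follows. It assumes each tool writes `.json` files somewhere under its own directory below the log root. The file layout inside that directory and the function name `load_tool_logs` are assumptions, and the root is a parameter because the endpoint is configurable.

```python
import json
from pathlib import Path


def load_tool_logs(tool_name, root="/data/gstudio_tools_logs"):
    """Return the parsed content of every readable JSON file logged by a tool.

    The directory layout below <root>/<tool_name> is an assumption;
    files are searched recursively.
    """
    logs = []
    tool_dir = Path(root) / tool_name
    if not tool_dir.is_dir():
        return logs  # tool not configured to log, or wrong root
    for path in sorted(tool_dir.rglob("*.json")):
        try:
            logs.append(json.loads(path.read_text(encoding="utf-8")))
        except (OSError, ValueError):
            # Skip files that are unreadable or were cut off mid-write.
            continue
    return logs
```

For example, `load_tool_logs("PoliceQuad")` would return a list of parsed log objects, or an empty list if nothing has been logged for that tool.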

Buddy reference

  • All collaborative file uploads in the Gallery, comments (Discuss, Feedback) and Notebook notes capture buddy details with a timestamp.
  • Tools can fetch the buddy details from cookies and generate a json file for every buddy.

Part 3: gstudio API

Part 4: server logs

Further details