Support multiple submissions per student #1584

Open
rien opened this issue Sep 17, 2024 · 0 comments
This issue is primarily meant as a brain dump/brainstorm to collect ideas and remarks. Feel free to add your own!

Problem

Most programming platforms allow students to submit more than once for a given exercise. When that is the case, students who plagiarize will sometimes first submit another student's solution as-is to confirm it is correct, and afterwards submit an altered version to hide the plagiarism.

To discover this kind of plagiarism, you can give Dolos all submissions for a single exercise. However, Dolos currently considers each file its own submission and will also match files from the same student with each other. Since these submissions are often very similar, they create high-similarity pairs that drown out the pairs between different students, reducing the effectiveness of the report.
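To get a feeling for the scale of the problem, here is a small counting sketch. The helper `countPairs` is hypothetical and not part of Dolos; it only assumes that every file is paired with every other file.

```typescript
// Hypothetical helper, not part of Dolos.
// Given the number of submissions per student, counts how many unordered
// file pairs are same-student vs. cross-student when every file is
// treated as its own submission.
function countPairs(submissionsPerStudent: number[]): { sameStudent: number; crossStudent: number } {
  const choose2 = (n: number): number => (n * (n - 1)) / 2;
  const total = submissionsPerStudent.reduce((sum, n) => sum + n, 0);
  const sameStudent = submissionsPerStudent.reduce((sum, n) => sum + choose2(n), 0);
  return { sameStudent, crossStudent: choose2(total) - sameStudent };
}

// Three students with five submissions each: 15 files, 105 pairs in total.
const counts = countPairs([5, 5, 5]);
console.log(counts); // { sameStudent: 30, crossStudent: 75 }
```

In this made-up example, 30 of the 105 pairs are between submissions of the same student, and those are exactly the pairs that tend to score highest.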

Solution

  • First, there needs to be a way to tell Dolos which submissions belong to the same student. I see two options:
    • If the paths given as arguments to the CLI are directories, all submissions within the same directory could be treated as coming from the same student.
    • When receiving input from an input.csv, a field student_id could tell Dolos which submissions belong together. Since this file is what Dodona's export produces, Dodona does not need to change anything to support this.
  • Second, the Dolos algorithm should probably ignore matches between submissions of the same student and not generate Pairs between them.
    • However, it could be interesting to be able to view the changes a student made between subsequent submissions as well.
  • Finally, careful consideration is needed on how to implement this in the UI / CSV generation:
    • A Pair could become the "most similar pair" between two students' submissions. However, which submissions form that pair could differ for each pair of students.
    • There should be a way to go through the individual submissions of each student and compare them separately. Often, earlier or later submissions contain useful information that the most similar one does not.
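
The ideas above could be sketched roughly as follows. This is only an illustration under assumptions: the types `Submission` and `ScoredPair`, the function names, and the file paths are all hypothetical and do not reflect Dolos's actual data model, and the similarity scores are supplied as input rather than computed.

```typescript
// Hypothetical types for illustration; these are not Dolos's actual data model.
interface Submission {
  path: string;      // e.g. "alice/attempt1.py"
  studentId: string; // from a CSV field, or derived from the directory
}

interface ScoredPair {
  left: Submission;
  right: Submission;
  similarity: number; // assumed to be in [0, 1]
}

// Directory option: treat the parent directory as the student.
// Falls back to the path itself when there is no parent directory.
function studentFromPath(path: string): string {
  const parts = path.split("/").filter((p) => p.length > 0);
  return parts.length >= 2 ? parts[parts.length - 2] : path;
}

// Drop pairs between submissions of the same student.
function crossStudentPairs(pairs: ScoredPair[]): ScoredPair[] {
  return pairs.filter((p) => p.left.studentId !== p.right.studentId);
}

// Keep only the most similar pair for every unordered pair of students.
function mostSimilarPerStudentPair(pairs: ScoredPair[]): Map<string, ScoredPair> {
  const best = new Map<string, ScoredPair>();
  for (const pair of crossStudentPairs(pairs)) {
    const key = JSON.stringify([pair.left.studentId, pair.right.studentId].sort());
    const current = best.get(key);
    if (current === undefined || pair.similarity > current.similarity) {
      best.set(key, pair);
    }
  }
  return best;
}

// Usage with made-up paths and scores.
const toSubmission = (path: string): Submission => ({ path, studentId: studentFromPath(path) });
const alice1 = toSubmission("alice/attempt1.py");
const alice2 = toSubmission("alice/attempt2.py");
const bob1 = toSubmission("bob/attempt1.py");

const bestPairs = mostSimilarPerStudentPair([
  { left: alice1, right: alice2, similarity: 0.95 }, // same student, ignored
  { left: alice1, right: bob1, similarity: 0.9 },
  { left: alice2, right: bob1, similarity: 0.4 },
]);
console.log(bestPairs.size); // 1
```

Keeping only the best pair per pair of students keeps the report compact, but it hides the other submissions, so a real implementation would probably still need a way to drill down into them.
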
Sidenote

Currently, at Aalto they are using labels to group students together. While this works surprisingly well, it does cause some issues in the UI (e.g. a plagiarism graph that is very long).
