Export results to a self-contained directory with HTML/JS/CSS #1583

Closed
Kobzol opened this issue Sep 13, 2024 · 10 comments
Labels
enhancement New feature or request

Comments


Kobzol commented Sep 13, 2024

Hi, thanks for this great tool! I hope that it will be able to replace MOSS in our code submission tool. We have been using MOSS for a few years, but often we have issues with it (primarily because of the fact that it is only available as a remote API that is often slow or outright offline).

Is your feature request related to a problem? Please describe.
We run code plagiarism checks in an automated manner on code submissions from many (several hundreds of) students, so we want to have plagiarism checking integrated directly within our submission website, to avoid the need for teachers to manually use a CLI tool (or an external website) to see the plagiarism check results. With MOSS, we use their API, which generates a set of self-contained HTML files, which we then serve and show to teachers directly through our website.

However, I haven't found a similar feature in Dolos. It can visualize the results in a website (which looks totally awesome, and I would love to show it to teachers using our tool!), however, it needs the dolos serve command that actually serves the website. This is difficult to combine with our existing (Django) web app, as we would somehow need to keep a separate persistent Dolos server running per plagiarism checking result, which is not really feasible. Even if it was possible to use just a single Dolos server for this, it would be a bit complex, and we would need to have a way to send a specific result to the Dolos server through the URL, to have the ability to display different results in our website.

Describe the solution you'd like
Ideally, I would like to have a way to export the Dolos website to a self-contained directory that could just be opened in a browser (without any active web server) and it would "just work" :) Since the web is mostly a SPA, this probably shouldn't be that difficult, I hope.

So ideally, I'd like to have a command like dolos export, which would act in the same way as dolos serve, but it would just generate a directory with HTML/JS/CSS files, rather than starting a web server. Or, alternatively, there could be some output format like dolos -f web-dir, that would do the same thing.

Describe alternatives you've considered
I could create my own web visualization (integrated within our website) out of the generated CSV files, but this is obviously a lot of work and I would be duplicating what the Dolos web already does. Alternatively, I could open the Dolos website programmatically and then somehow "snapshot" it (using a headless browser?) to generate the self-contained directory, but that would be a very complex process.

Let me know what you think about this idea, and how complex you think it would be. I can try to send a PR if you think that it's feasible and if you can point me to where I should start looking.

@Kobzol Kobzol added the enhancement New feature or request label Sep 13, 2024
Member

rien commented Sep 13, 2024

Hi @Kobzol, what you request is definitely possible, but I suggest a better alternative:

The dolos-web module containing the code for the Web UI actually has the HTML/CSS/JS files needed to visualize every report based on the CSV files that are generated for each analysis. So where MOSS creates separate HTML-files for each report, the HTML/CSS/JS files that Dolos uses stay the same.

The command dolos serve hence does nothing more than statically host the contents of the report together with the static files of the dolos-web package. This is currently handled by a class of fewer than 100 lines of code in server.ts, but the important part is this:

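let filePath;
if (reqPath.startsWith("/data")) {
  // Report data: strip the "/data" prefix and look in the report directory.
  filePath = path.join(reportDir, reqPath.slice(5));
} else if (reqPath.endsWith("/")) {
  // Directory-style request: serve the web UI's index.html.
  filePath = path.join(webDir, reqPath, "index.html");
} else {
  // Any other request is a static web UI file.
  filePath = path.join(webDir, reqPath);
}
// Only serve files whose extension has a known MIME type.
const type = MIME[path.extname(filePath).slice(1)];
if (!type) {
  return notFound(response);
}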

  • If the request path starts with /data it is a report CSV-file, so look in the report directory
  • If not, the request is for an HTML/CSS/JS file, so look in the web UI directory.

This should be easy to achieve with Django as well and would save you from making a copy of the identical ~10MB web UI files for each report. It also has the added benefit that if we add new features to the Web UI, you could just update to the latest version and all previously generated reports can use these new features.

Let me know if this approach would work for you.

Member

rien commented Sep 13, 2024

You don't need to build the dolos-web UI yourself BTW, we publish the prebuilt HTML/CSS/JS files on NPM: https://www.npmjs.com/package/@dodona/dolos-web?activeTab=code

So you would only need to download the package, extract it and statically host the files in the dist/ folder.

Author

Kobzol commented Sep 13, 2024

That sounds great, thank you! So if I understand it correctly, when the SPA is started, it always tries to load /data/[metadata|files|kgrams|pairs].csv, and then displays whatever is returned to it by the backend?

If I wanted to keep your original source code without doing any modifications to it, I'd need to somehow disambiguate different checking results (e.g. using /data/...csv?id=XYZ or something like that). Can that be configured somehow using the current code, or do you think that I'd need to fork and modify the frontend to change this?

Member

rien commented Sep 13, 2024

> That sounds great, thank you! So if I understand it correctly, when the SPA is started, it always tries to load /data/[metadata|files|kgrams|pairs].csv, and then displays whatever is returned to it by the backend?
>
> If I wanted to keep your original source code without doing any modifications to it, I'd need to somehow disambiguate different checking results (e.g. using /data/...csv?id=XYZ or something like that). Can that be configured somehow using the current code, or do you think that I'd need to fork and modify the frontend to change this?

It actually loads /data/*.csv relative to the index.html file. So you could serve the same Dolos-web files at /reports/*/index.html and let /reports/{id}/data/*.csv return the CSV files belonging to that report ID. There is no need to change anything.
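
The layout described above could be sketched as a small path resolver. This is only an illustration under assumptions, not the Dolos server's actual code: the function name `resolveReportRequest`, the `WEB_DIR` and `REPORTS_DIR` locations, and the restriction on report IDs are all hypothetical. It maps `/reports/{id}/data/*` to that report's own directory and every other path under `/reports/{id}/` to the single shared copy of the web UI files.

```typescript
// Hypothetical sketch: all names and directories below are assumptions.
type Resolved =
  | { kind: "data"; filePath: string }
  | { kind: "web"; filePath: string }
  | { kind: "not-found" };

const WEB_DIR = "/srv/dolos-web/dist"; // assumed location of the extracted dist/ folder
const REPORTS_DIR = "/srv/reports"; // assumed root holding one CSV directory per report

function resolveReportRequest(reqPath: string): Resolved {
  // Expect /reports/{id}/<rest>; the id is limited to a conservative character set.
  const match = /^\/reports\/([A-Za-z0-9_-]+)\/(.*)$/.exec(reqPath);
  if (match === null) {
    return { kind: "not-found" };
  }
  const id = match[1];
  const rest = match[2];

  // Refuse any ".." segment so a request cannot escape the served directories.
  if (rest.split("/").indexOf("..") !== -1) {
    return { kind: "not-found" };
  }

  if (/^data\//.test(rest)) {
    // Report-specific CSV file: look inside this report's own directory.
    const csvName = rest.slice("data/".length);
    return { kind: "data", filePath: `${REPORTS_DIR}/${id}/${csvName}` };
  }

  // Everything else is a shared web UI file; directory-style paths get index.html.
  const webPath = rest === "" || /\/$/.test(rest) ? `${rest}index.html` : rest;
  return { kind: "web", filePath: `${WEB_DIR}/${webPath}` };
}
```

With these assumed directories, `/reports/42/data/pairs.csv` resolves to a file in report 42's directory, while `/reports/42/` and `/reports/7/` both resolve to the same shared `index.html`. The same mapping could be expressed as two URL patterns in a Django app.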

For the Dolos server we actually have a special mode to build the frontend to upload, list and go to the reports. But we host the reports in the way described above.

Author

Kobzol commented Sep 13, 2024

Ah, cool, I thought that it was hardcoded to load from the root /, but if it's relative, then that should indeed be ideal for our use case. Thanks a lot for explaining this to me! :) I will try to integrate it within our system and let you know how it went.

Member

rien commented Sep 13, 2024

We're trying to make Dolos as flexible as possible to enable integrations like this, so definitely get in touch to let us know how it goes. We're looking forward to seeing how you use Dolos.

@rien rien closed this as completed Sep 13, 2024
Author

Kobzol commented Sep 19, 2024

Thanks to your hints, I was able to integrate Dolos into our system quite easily, thank you!

Ideally, we'd need multi-file submissions per student, but for now I simulate that by concatenating all of a student's files together. We'll see how that works.
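
A minimal sketch of such a concatenation step follows, under stated assumptions: the `SourceFile` shape, the function name `concatenateSubmission`, and the separator comment format are all hypothetical and not part of Dolos. Sorting by file name keeps the output deterministic, so two submissions containing the same files are combined in the same order.

```typescript
// Hypothetical helper: names and separator format are illustrative only.
interface SourceFile {
  name: string;
  content: string;
}

function concatenateSubmission(files: SourceFile[], commentPrefix = "//"): string {
  return files
    .slice() // copy so the caller's array is not reordered
    .sort((a, b) => (a.name < b.name ? -1 : a.name > b.name ? 1 : 0))
    .map((file) => {
      // Strip trailing whitespace so every file ends with exactly one newline.
      const body = file.content.replace(/\s+$/, "");
      return `${commentPrefix} ---- ${file.name} ----\n${body}\n`;
    })
    .join("\n"); // one blank line between consecutive files
}
```

The separator line is written as a comment in the submission's language (hence the `commentPrefix` parameter), so the combined file remains parseable. Whether such separator lines affect the similarity results is not something this sketch addresses.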

We're still keeping MOSS for now, as it returns different results and has a bit more intuitive pair visualization (with the differently colored section per plagiarized block), but it's very nice to have an alternative (that can't timeout because the MOSS server is down, lol).

Member

rien commented Sep 19, 2024

Glad to hear the integration worked out!

Multiple files per submission are indeed not supported at the moment. We work around it by concatenating the files as well. We do have an open issue (#1121), but it is currently not our priority.

As for the pair visualization: we deliberately used only one color to keep the visual complexity in check. You used to be able to click on a fragment to highlight the matching parts, but this broke at some point. There are some other possible improvements to the comparison view as well.

We welcome contributions if you want to help us out with this part. Let me know if that is the case and I can write down the changes required for the different features.

In any case, good luck with using Dolos and we welcome any additional feedback that you have 😊

Author

Kobzol commented Sep 19, 2024

> You used to be able to click on a fragment such that it would highlight the matching parts, but this has broken at one point.

This seems to work for me, so maybe it got unbroken in the meantime 😆

I can help by contributing some changes. It might take me some time, but if you can write down some hints, that would help a lot, of course. The feature that I would appreciate the most is probably #1121.

Anyway, thanks a lot for a great tool!

Member

rien commented Sep 19, 2024

The best bugs are the ones that fix themselves 😅

I have extended #1121 with some initial pointers, but if you want I could set up a video call to walk you through the project.

I will also make a separate issue for clarifying the matched fragments with some ideas I have.
