Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Converting Stryker4S into Stryker4JVM supporting Scala and Kotlin #1522

Open
YVbakker opened this issue Mar 15, 2024 · 5 comments
Open

Converting Stryker4S into Stryker4JVM supporting Scala and Kotlin #1522

YVbakker opened this issue Mar 15, 2024 · 5 comments

Comments

@YVbakker
Copy link
Contributor

This was attempted in PR #1291 , but failed due to a complicated git history and lack of momentum.

The goal of this issue is to track the progress on this issue, that is blocking for almost all feature development of stryker4s and the future stryker4jvm kotlin & java projects.

In order to attract more developers and / or momentum, the following should be considered:

  • Build tooling uses Maven, unless otherwise necessary. (e.g. for the build tool implementations)
    • Because Maven is the most compatible with everything, and there is currently a lot of Maven knowledge around (@infosupport HQ)
  • Programming language used is Java, unless otherwise necessary. (e.g. for the existing core, and the language interface implementations)

The basic premise of this refactor is the following:

  • Add a language interface
    • The language interfaces can be written in their own respective language (e.g. Kotlin in Kotlin)
  • Add a build tool interface
    • Build each build tool specific implementation with the respective build tool. (e.g. Gradle w/ Gradle)
  • The common interfaces will be Java for compatibility / interop.

It has been proposed to use ServiceLoaders to load the language interface implementations dynamically into the core at runtime, so that the order of compilation and / or build orchestration is greatly simplified.

StrykerJVM proposal

This was referenced Mar 15, 2024
@hugo-vrijswijk
Copy link
Member

Some things to consider:

Because Maven is the most compatible with everything, and there is currently a lot of Maven knowledge around (https://github.com/infosupport HQ)

I don't think this is necessarily a good reason to use Maven as the main build tool. In the proposed architecture, a lot of code is still Scala. Which means to get up and running with most of the code you'd have to do a lot of switching between build tools. IMO, it makes more sense to use the build tool that's used for most code wherever possible, and only step outside of it when there is no other option (like for build tool plugins).

Programming language used is Java, unless otherwise necessary. (e.g. for the existing core, and the language interface implementations)

I think you underestimate how much code the existing core is 😅. The idea for Stryker4JVM came from the observation that most of the code in Stryker4s (and other Stryker flavours) is not language-specific, but deals with build tools, finding files and code to mutate, managing processes, reporting, etc. This code already exists, and IMO is a very good fit for Scala. For example, the reporter api is expressed as a fs2 Pipe, files are mutated through a fs2.Stream, processes are cats-effect Resources. These are all concepts that don't translate very well to other languages.

I think the least amount of work would be to create a separate language-interface module. That module would have the minimal interfaces needed for language-specific work. I.e. going through text and returning a list of mutants, instrumenting mutants into a new file, etc. It shouldn't have to know anything about reading files from disk, logging, reporting, etc.

The "build tool interface" already exists, in stryker4s-core. The proposed image doesn't fit, as the build tool is the "application" using the core libraries. Since the build tools always have a dependency on core, they always have a dependency on Scala. So, IMO, there's not much benefit to extracting this out to a separate module. There could be some interop advantages to having these in Java, but currently these also rely on very Scala-specific types and libraries. For example, the main entrypoint for a plugin is implementing the abstract class stryker4s.run.Stryker4sRunner which, simplified, has this API:

abstract class Stryker4sRunner {
  def resolveTestRunners(tmpDir: Path): Either[NonEmptyList[CompilerErrMsg], NonEmptyList[Resource[IO, stryker4s.run.TestRunner]]]
}

Which can return a (non-empty) list of compile errors, or Resource's of test-runners, which handle creating and finalization very nicely. This pattern works quite nicely as is, would be difficult to translate to pure Java, and IMO not worth changing anything about.

My suggestion would be something like this:

image

This is closer to the current codebase, only the language interface would have to be extracted (though not to be underestimated!). Not shown are other current modules like api (only contains Logger) and testkit (shared testing code).

Lastly, I'm very excited to get Stryker4JVM running! It will be a big effort either way, but greatly expands the Stryker umbrella ☂️. I do think we have to be careful to also think about the current user base. Stryker4s has around 25.000-35.000 downloads per month. Ideally, the migration to Stryker4jvm should be as smooth as possible. I'm hesitant to put a lot of Java/Maven in Stryker4jvm since I wouldn't call myself a Java developer nowadays. There are no maintainers, contributors or users of Stryker4s at Info Support, and I think it's important to keep the current user base in mind.

@jjkester
Copy link

jjkester commented May 8, 2024

If you'd ask me, for this change we should change the least amount possible. But also focus on maintainability. If we can get some build orchestration working that manages everything without relying too much on assumptions (like having built some subproject first) that would be amazing. Everything else that may be an improvement can be discussed later.

Personally I think this change will aid in getting people enthousiastic about contributing since it can open up a lot of possiblities (many different things people can work on) without the perceived hurdle of having to get familiar with the complete code base and the Scala language. For example, a Java implementation can be built with just Java and a basic understanding of the process.

I think the diagram you suggested is more or less what @YVbakker meant (actually, I created that diagram, hoping to recreate our earlier diagram from memory and apparently failing at that).

After having a chat about this issue with @YVbakker and @Giovds we think the best approach might be to set a goal and split up the work into manageable chunks that can be merged independently. Not necessarily to allow for parallelising the work, but more to avoid the issue we had last time (long running feature branches that become impossible to merge) and avoiding the hurdle of such a big task (personally really feel that one which does not motivate to get started on it).

Maybe it is best to have a (virtual) sit-down and prepare the work as GitHub issues, maybe even in a Project? Don't think we need to know everything upfront, but if we have some idea more issues can be added as we come to them.

@jessemaas
Copy link

Another point to consider is that we might want to stay close to what the group of students in PR #1291 did. Staying as close as possible to their structure should allow us to reuse some of their code. I'm not sure how much code we will actually be able to reuse directly, because many changes are spread across files and they can't be merged automatically, but having their code as an example can still speed up re-implementation, and it will probably sometimes be possible to copy entire pieces of code.

I've created an overview of what they did from reading their code and the report one of the students (Mart de Roos, who now works at Info Support) sent me. Their version of Stryker4JVM has the following structure / modules:

  • Stryker4JVM-Core (written in Java) defines interfaces to be implemented by language-specific modules. Most importantly:
    • AST (the abstract syntax tree)
    • Parser
    • Collector (to gather all mutations)
    • Instrumenter (to apply mutations)
    • LanguageMutatorProvider/LanguageMutator (class LanguageMutator contains Parser, Collector and Instrumenter instances)
  • Stryker4JVM-Core also defines / implements:
    • Some shared exceptions
    • Logging
  • Stryker4JVM (written in Scala) contains most of the original code that is also in stryker4s/modules/core.
  • (Discussion: This structure is closest to what @hugo-vrijswijk suggested, where most of the core remains in Scala with a set of interfaces defined in Java. One disadvantage is that this requires some conversion between Java and Scala collection types, but scala.jdk.CollectionConverters does this quite efficiently)
  • Stryker4JVM-mutator-Scala and Stryker4JVM-mutator-Kotlin implement the abstract classes and interfaces from Stryker4JVM-Core for their respective language. They also implement mutators. The language that each module implements is also the language that module is written in.
  • There is no Stryker4JVM-Java. This should eventually exist in Stryker4JVM.
  • Stryker4JVM-plugin-Maven and Stryker4JVM-plugin-SBT (both written in Scala) implement build tool plugins for Maven and SBT.
  • There is not plugin for Gradle yet
  • Currently, the general Stryker4JVM module directly uses on ScalaMutatorProvider and KotlinMutatorProvider and therefore depends on Stryker4JVM-mutator-Scala and Stryker4JVM-mutator-Kotlin. In the future work section of their report, the students suggest introducing a @LanguageMutatorProvider annotation and querying for this annotation, removing the direct dependency. An alternative would be using a ServiceLoader as suggested above.
  • Lastly, Stryker4JVM-api is mostly just the same as the module api in Stryker4S

Besides the suggestions already mentioned, there are some more possible improvements and changes:

  • Stryker4JVM-mutator-Kotlin copies a lot of code from Stryker4K. However, many JVM versions aren't supported by this code because it causes illegal reflection exceptions. Stryker4JVM-mutator-Kotlin should probably be rewritten. It is currently very incomplete to begin with.
  • Three different build tools are used for different modules. It would be nice to use a single build tool for every module.
  • The students also do some more suggestions in the Future Work section of their report (ask me or Mart de Roos to get access).

What do you think of this structure? Do you have ideas on what should be done differently or do you think we should try to reuse as much as possible from PR #1291?

@jessemaas jessemaas reopened this Jun 5, 2024
@rouke-broersma
Copy link
Member

Fwiw I am also interested in implementing a similar model in stryker.net (but taking it one step further and also splitting out 'core' into submodules) so this sound like a good plan to me :)

@YVbakker
Copy link
Contributor Author

As discussed on 19-6-24:

  • Using AST implementation (library) per language e.g. Scalameta for Scala
  • Interfaces in Java, interop is easier this way
  • Mutator API should not have any dependencies (on Scala or Kotlin)
  • Mutator API is focused on 'purely data'
  • During the development of the mutator API we will use sbt to build the java files, this could be moved to a separate repository in a later stage and here we could use Maven for example.
  • Mutators should be pure functions and not do any logging
  • Exceptions should be specific
  • Willing to increase the support Java version from 8 to 11

Mutator API steps to success:

  1. What interfaces do we need?
  2. Create first version of said interfaces
  3. Implement in existing scala4s code
  4. See where to go from here

@hugo-vrijswijk will create a branch off of stryker4s where we can 'poc' the interface definitions and tweak until everyone's happy. We'll create separate issues from there that will each focus on implementing one interface in the existing scala4s code.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants