Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Dependency management #54

Open
Xazax-hun opened this issue Jul 18, 2020 · 15 comments
Open

Dependency management #54

Xazax-hun opened this issue Jul 18, 2020 · 15 comments

Comments

@Xazax-hun
Copy link
Contributor

I think this project would benefit a lot from unit tests. Using a framework like GTest or Catch2 could help with this task, but one crucial question before writing any of them: what is the strategy for dependencies?

Some options came to my mind:

  • Use a fully automatic solution like CPM where the CMake scrip will download and build the dependencies automatically without any action from the user.
  • Include the dependencies in the source tree as git submodules or just copy them over. LLVM takes this approach with GTest.
  • Use a package manager like vcpkg and rely on the user installing the packages before building the project.
  • Use a package manager like vcpkg but make the CMake scripts install packages automatically.
  • Don't use any dependencies, have our own test framework.

What are the preferences of this project going forward?

@GabrielDosReis
Copy link
Owner

GabrielDosReis commented Jul 21, 2020

This is a very good question.

Yes, it will be good to have official tests in this repot -- and such tests would exercise both in-memory representation and externalization roundtrip. One crucial requirement is that this library should easily integrate in other larger products without requiring or imposing its own testing framework or bigger dependencies. For example, MSVC is using its own testing framework. When I originally wrote this library (circa 2004), I was very well familiar with DejaGNU (I contributed to GCC's libstdc++-v3 testing framework half a decade earlier), and was familiar with then CodeSourcery's QMTest. Each would have worked really well in the environment I was developing the library (GNU/Linux and Unix), but would have fared poorly in new environments that it is being used in today.

Responding to your bulleted list, I think it would be a good idea not to impose a hard dependency on the project in general, so the library can continue to be used in wider environment as possible. We should definitely have test suites, even when projects using this library have their own.

@Xazax-hun
Copy link
Contributor Author

I see. It is definitely possible to hide the tests and the dependency behind a flag which can be off by default (similar to building the documentation is behind a flag).

I think nowadays GTest is one of the most popular testing frameworks, and I believe it is supported by most IDEs including Clion, Visual Studio, and Visual Studio Code.

There are also some header-only libraries like catch2 that could be very easily included with the project.

@GabrielDosReis
Copy link
Owner

Yes, if we ended up with GTest as used in the Microsoft GSL for example. I think I am still at the stage of what I would like to require of the testing framework - whatever it is.

@d-winsor
Copy link
Contributor

If dependencies are used I think a good fit for them would be the use of External_project (link) in CMake. Then you can isolate it to the build process as part of a opt-in configuration as @Xazax-hun mentioned.

@Xazax-hun
Copy link
Contributor Author

I decided to make a prototype as a baseline so we can discuss what do we want to change.

I opened a sample pull request here: #58

The important bit is how the CI was modified. It will first configure and build the project without any frameworks installed to ensure that the project continues to build without any dependencies. After that, it will reconfigure the project with tests enabled to do the actual testing.

We can replace anything we want including the method to get the test framework or the test framework.

@BjarneStroustrup
Copy link
Collaborator

BjarneStroustrup commented Jul 22, 2020 via email

@Xazax-hun
Copy link
Contributor Author

I think testing has multiple purposes. It can also be viewed as documentation that is always up to date. In a separate Latex user guide, it is easy to break the code examples. With tests, the code examples are always verified to be up to date by the CI. Even if we do not aim for a large coverage or exhaustive testing, some complete and self-contained code examples could be helpful for people planning to use IPR.

@BjarneStroustrup
Copy link
Collaborator

BjarneStroustrup commented Jul 22, 2020 via email

@Xazax-hun
Copy link
Contributor Author

Sorry about my misunderstanding.

As a potential future user, I'd love to see tests on the abstraction level of the user interface with practical value, such as how to create a class with members and methods, how to serialize it, and so on. Doing so would exercise some internal template classes.

How to test the internal building blocks of IPR is a good question and it indeed poses some questions and challenges. I think I'd defer dealing with that part of the problem and concentrate on tests that help document how to use IPR idiomatically.

@Xazax-hun
Copy link
Contributor Author

I updated the prototype PR #58 to only build gtest when an installed version was not found. Feel free to take a look if that kind of dependency management is ok. Note that, gtest is optional and will only searched for when tests are explicitly enabled.

@GabrielDosReis
Copy link
Owner

So, just to reiterate and set expectations. There is no debate that we need a set of testsuite in this repo. The question that we are grappling with is what should we expect the test to cover? Clearly, we should have tests to cover/prevent regressions. That is the easiest for some definition of "easy".

Ideally, the test framework should cover and stress test all aspects of the IPR. Not just unit tests, but whole in-memory representation of various semantics graph objects. That includes both round tripping, not necessarily via XPR since that is a difference language syntax with its own type checking rules (subset of C++). One would output s-expression, YAML (more readable than JSON), or in Bemol syntax (easy to parse, and embeddable in a programming language). Right now, we don't have any of those -- we need them. We need tests output of which can be scanned for specific behavior. Of course, we also need a dedicated separate directory for educational purposes.

I will not go as far as turning to literate programming -- my experience with it on a few projects over nearly two decades has not been conclusive, and in the end people just opt for comment in the source code instead of the actual literate approach. I am not mentioning the additional dependencies involved (yes, I also wrote a weaver in C++ to help reduce the dependencies). For now, I am inclined to view that route as distraction.

So, as a proposal, under a tests/ directory, I would see:

  • YAML/ -- tests for YAML external representation
  • XPR/ -- tests for XPR external representation
  • Bemol/ -- tests for Bemol external representation
  • s-expr/ -- tests for s-expression external representation
  • GCC/ -- tests for GCC integration
  • MSVC/ -- tests for MSVC integration
  • Clang/ -- tests for Clang integration
  • unit-tests/ -- unit tests
  • examples/ -- tutorial or educational tests

I suspect you're focusing on unit-tests/. A good patch for enabling that shouldn't be a hard dependency, and the testing framework for it shouldn't introduce more complexity than needed. It is amusing that GitHub believes we currently have 80% C++ and 20% CMake; I would like to increase the ration of C++ :-)

Comments?

@Xazax-hun
Copy link
Contributor Author

I agree with almost everything you said, but I want to separate two questions:

  • What infra should we have to test stuff?
  • What kind of tests should we actually have?

I think this PR is more about the first than the second. I'd prefer to focus here on the first question and defer the rest of the discussion to a follow-up PR. My proposal would be to not to discuss the actual tests here, and the single test in this PR is very likely to be replaced in the future.

After we have the infra fundamentals, I'll open a new PR with some basic tests that reflect the directory structure you described.

What is your definition of hard dependency? If an optional dependency is not a hard dependency, this PR should already pass this requirement.

There is a trade-off in the complexity of CMake and user-friendliness. Currently, this PR prioritize user-friendliness as it has the option of automatically downloading and building the dependency (in case the user opt into testing and does not have the dependency installed on their machine.) Removing this convenience feature and require users (that opt into testing) to manually install the dependency would make the cmake simpler.

@GabrielDosReis
Copy link
Owner

I understand that the PR is about your first point rather than the second. In general, we proceed from a draft spec, i.e. general directions, to an implementation instead of mutating an implementation into a spec -- if we don't have an idea of where we are going, how do we know we are making progress towards that goal? I would like to set a general direction for what kind of tests we should have in this repo, and then proceed from there in terms of PR. Without those general ground rules, I don't know we even have something that we would call infra fundamentals.

What is your definition of hard dependency?

Examples:

  • The build system is a hard dependency. Before the IPR was on GitHub, it used Autotools, which were hard dependencies; so it could not build directly and simply with MSVC in its native environment (some early adopters on Windows made up their own MSBuild solution files). It transitioned to an embryonic CMake-based build setup, and now it can build everywhere CMake exists without imposing too much of a requirement. You got rid of the vestigial Autotools, and that was most welcome. That is an example of how a choice of hard dependency can be limiting in what can be simply expressed and where things can be deployed.

  • If for example, we start using GSL in the IPR interface or implementation (which I suspect will be a thing sooner or later), then it becomes a hard dependency.

  • Similarly, if we make the headers generated from some literate files, then the supporting literate tools become hard dependencies.

I am not suggesting that we will never have hard dependencies -- we will have some of them -- but I would like to keep the hard dependencies to the minimum.

If an optional dependency is not a hard dependency, this PR should already pass this requirement.

I don't view an optional dependency as a hard dependency, and that also depends on how it is set up.

We are having this conversation here (and not as part of the review of the PR) because I would like for us to have a shared understanding of a direction of testing framework before we embark on a one specific implementation, and also so that we have a clearer idea of how said implementation fits in the bigger picture. It isn't wasted effort.

There is a trade-off in the complexity of CMake and user-friendliness.

I am not looking at the testing framework requirements and intregration in the build setup as 'user-friendliness'. User-friendliness would be about the IPR interface and tradeoff between various takes on ergonomics. Rather, this discussion about optionality of and "integrability" in larger projects.

Removing this convenience feature and require users (that opt into testing) to manually install the dependency would make the cmake simpler.

I am not suggesting to remove the optionality of downloading Gtest; rather I was jokingly making a cautionary comment about the build setup proportion (however it is measured) from competing with that of the actual C++ part. Build systems come and go, and with fashion -- after all, this library is already on its second or third build system.

I hope this message clarifies things.

@Xazax-hun
Copy link
Contributor Author

I see. In your proposal, you suggest having a wide variety of tests. I think most test frameworks are not limited to unit tests, most integration tests (like XPR round trips) could be implemented in GTest or any equivalent framework without any issues.

I do see some tests that do not fit into this idiom, Clang integration would be one example. While it is possible to execute a Clang frontend action from C++, the tests probably want to exercise the exact same scenario that the users will face. So those tests would involve invoking binaries and comparing the results of the output. The lit tests are responsible for this in LLVM.

I think these two categories of tests could be implemented independently infrastructure wise, and this is the approach that LLVM is following. It has both lit tests and gtest-based tests and they are largely independent (lit tests can execute the gtest-based tests, but this is the only dependence between the two).

I think unless we find a solution that is a one size fits all for both kinds of tests I discussed above, we can independently move forward with the two categories of tests.

@GabrielDosReis
Copy link
Owner

GabrielDosReis commented Aug 26, 2020

I see. In your proposal, you suggest having a wide variety of tests.

Correct, I cannot imagine "testsuite for IPR" is just limited to "XPR roundtrip" -- as useful as it is, that covers a fairly limited fraction of the surface areas of IPR capability. The most impactful places where we want to full cover are in the places of integration with compilers and other tools.

I do see some tests that do not fit into this idiom, Clang integration would be one example. While it is possible to execute a Clang frontend action from C++, the tests probably want to exercise the exact same scenario that the users will face. So those tests would involve invoking binaries and comparing the results of the output.

Correct -- and those places are most impactful, as they exercise not just the compiler integration itself, but also the internal consistency of IPR with respect to expected output with respect to the input.

I think these two categories of tests could be implemented independently infrastructure wise

Maybe. I mentioned earlier:

I was very well familiar with DejaGNU (I contributed to GCC's libstdc++-v3 testing framework half a decade earlier), and was familiar with then CodeSourcery's QMTest. Each would have worked really well in the environment I was developing the library (GNU/Linux and Unix), but would have fared poorly in new environments that it is being used in today.

The libstdc++-v3 test framework, which is based on DejaGNU, can handle both, and more, but I find the DejaGNU and its dependencies not suitable for the sort of environments of deployment envisioned for IPR. QMTest, based on Python hence available in more environments, does not appear to be actively maintained.

I think unless we find a solution that is a one size fits all for both kinds of tests I discussed above, we can independently move forward with the two categories of tests.

The purpose of the conversation is not necessarily to find a one-size-fits-all (a couple do actually exist), but to expose a shared understanding of the requirements, directions, and design space before implementing one specific view so that as we progress we get a sense of where each piece fits.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants