This repository has been archived by the owner on May 6, 2020. It is now read-only.

Enable logs to be stored for successful CI builds #944

Open
amshinde opened this issue Mar 7, 2018 · 16 comments

Comments

@amshinde
Contributor

amshinde commented Mar 7, 2018

Currently we can only retrieve the logs when a build has failed with Jenkins. We should be able to retrieve them for successful builds as well, to be able to inspect whether we are running with the correct environment.

@amshinde
Contributor Author

amshinde commented Mar 7, 2018

@chavafg Can you take a look at this?

@jodh-intel
Contributor

I'm guessing this should really have been raised on https://github.com/clearcontainers/jenkins.

/cc @grahamwhaley as this might have implications for the metrics system storage requirements.

@grahamwhaley
Contributor

We should probably discuss and define which logs, and how much debug they have in them.
If we take all the system logs and have all the CC debug enabled in the toml, for instance, then the logs come out pretty big (hundreds of KB, IIRC), which we may not want to gather and store for every run.
If we know what info we want in advance, then we could run some commands at startup, such as cc-runtime cc-env, docker info and @jodh-intel's magic system info collection script. We could even run all of those, gather their output into a file, and add that file to the stored 'results archive' in Jenkins, which would help reduce pollution in the console output screen/log.
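For illustration, a minimal sketch of such a startup collection step, assuming an output file name of env-info.txt and that all three commands are available on the CI node (none of this reflects the current job configuration):

    #!/bin/bash
    # Sketch only: gather environment info into one file that the Jenkins job
    # could archive as a build artifact. The file name and command set are
    # illustrative, not what the CI currently does.
    out="env-info.txt"
    {
        echo "=== cc-runtime cc-env ==="
        cc-runtime cc-env
        echo "=== docker info ==="
        sudo docker info
        echo "=== collect-data script ==="
        sudo cc-collect-data.sh
    } > "$out" 2>&1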

@chavafg I think it was recently pointed out that the metrics CI logs were already pretty big, and I should check that, as that is not intentional.

@jodh-intel
Contributor

For reference, that magic script is https://github.com/clearcontainers/runtime/blob/master/data/collect-data.sh.in.

@amshinde - can you give a concrete example where retaining logs would have helped? I'm not disagreeing that it's a good idea, but it would be good to explore if there are other ways to give you what you want.

How long do we think we'll need to store logs? "Forever" probably won't cut it, so would a month (4 releases) be sufficient, do you think?

But as @grahamwhaley is suggesting, I'm not sure we need to keep the logs as long as we can capture the environment the tests ran in, to allow a test run to be recreated, namely:

  • [x] the commit version of every component.
  • [x] the runtime config.
  • [x] the version of the container manager being used.
  • [x] the container manager config.
  • [x] the version of the distro.
  • [ ] the package set being used (rpm -qa / dpkg -l).

As denoted by the checkboxes, the collect-data.sh script captures almost all we need here. The package set is the only missing item (although the script does capture the versions of any CC packages installed on the system already).
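For the missing piece, something along these lines could capture the package set (a sketch only; the output file name is arbitrary and the distro detection is deliberately simplistic):

    # Sketch: capture the full installed package set, the one item the
    # collect script does not gather yet. The output file name is arbitrary.
    if command -v dpkg >/dev/null 2>&1; then
        dpkg -l > packages.txt
    elif command -v rpm >/dev/null 2>&1; then
        rpm -qa | sort > packages.txt
    else
        echo "no rpm or dpkg found" > packages.txt
    fi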

For reference, the output of the collect script when gzip -9'd is ~6k (for a system without any CC errors in the journal).

If we decide to store full logs for all PRs, we'll need something in place to warn about the ENOSPC that is almost guaranteed to happen one day... 😄
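As a rough idea, the teardown could include a simple free-space check so we hear about it before ENOSPC actually hits (the threshold and path below are arbitrary guesses, not anything configured today):

    # Sketch: warn when the artifact partition is filling up instead of
    # waiting for ENOSPC. The 90% threshold and the path are arbitrary.
    artifact_dir="/var/lib/jenkins"
    used=$(df --output=pcent "$artifact_dir" | tail -n1 | tr -dc '0-9')
    if [ "${used:-0}" -ge 90 ]; then
        echo "WARNING: ${artifact_dir} is ${used}% full" >&2
    fi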

@jodh-intel
Contributor

Oh - we might also want to include procenv output (see clearcontainers/jenkins#5) for things like system limits, etc.

@grahamwhaley
Contributor

Agree on logs and longevity. I'm going to presume Jenkins has some plugin or setting that can manage and expire the gathered results files, and we should indeed look at that (we do collect the .csv results files for the metrics at present, for instance, but do not expire them).

@grahamwhaley
Contributor

procenv was the magic I was thinking of :-)

@jodh-intel
Contributor

Ah - soz - so much magic about! ;)

@chavafg
Contributor

chavafg commented Mar 7, 2018

I think @amshinde's concern is knowing the agent version, which at some point last week was wrong while testing the latest PRs.
As for keeping the logs, I can add a rule to gather them in the Azure Jenkins configuration, so the metrics Jenkins will not be impacted. But the Azure Jenkins server may also run into storage issues in the future if the logs we keep on every run continue to grow.
As @jodh-intel and @grahamwhaley said, it would be better to gather just the information we require instead of keeping all the logs from the execution.

@jodh-intel
Contributor

@chavafg - we could just run cc-collect-data.sh in the teardown script, couldn't we? That way we get the info we want and also ensure that the script is being run regularly. If we need the complete list of packages, it would be easy to add an extra --all-packages option or similar.
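For example, the teardown might end up doing something like this (a sketch only; --all-packages does not exist yet and is exactly the extra option proposed above, and the artifacts/ directory is made up):

    # Sketch: run the collect script at teardown and keep its (compressed)
    # output as a Jenkins artifact. --all-packages is the proposed, not yet
    # existing, option; the artifacts/ path is illustrative.
    mkdir -p "${WORKSPACE:-.}/artifacts"
    sudo cc-collect-data.sh --all-packages > collect-data.log 2>&1 || true
    gzip -9 collect-data.log
    mv collect-data.log.gz "${WORKSPACE:-.}/artifacts/"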

@chavafg
Contributor

chavafg commented Mar 7, 2018

@jodh-intel yes, I think that would be best. Does cc-collect-data.sh collect the agent version? Because I have seen that it appears as unknown.

[Agent]
  Type = "hyperstart"
  Version = "<<unknown>>"

@jodh-intel
Contributor

@chavafg - good point! No, it doesn't.

I've had a think about this and I can think of two ways we could do this:

The gross hack

We could capture the agent version by adding something like a "--full" option to the cc-collect-data.sh script. With that option the script would run as normal, but would then:

  • enable full debug
  • change cc-collect-data.sh to run:
    sudo docker run --runtime cc-runtime busybox true

  • look at the proxy messages in the system journal, because the first message from the agent will contain its version string.

But it's a hack ;)
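For concreteness, the hack might look roughly like this (the journal identifier and grep pattern are guesses at the proxy's log format, not verified):

    # Sketch of the hack: start a throwaway container so the agent announces
    # itself, then fish its version out of the proxy's journal messages.
    # The 'cc-proxy' identifier and the grep pattern are guesses.
    sudo docker run --runtime cc-runtime busybox true
    sudo journalctl -t cc-proxy --no-pager | grep -i version | tail -n1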

The slightly-less gross option

Change the runtime so that it loop-mounts the currently configured container image read-only (with mount -o ro,noatime,noload (thanks @grahamwhaley)), then runs cc-agent --version and grabs the output.
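Outside the runtime, the same idea could be prototyped in the collect script along these lines (the image path and the agent binary location inside the image are assumptions, as is whether the agent binary will even run on the host):

    # Sketch: loop-mount the configured container image read-only and ask the
    # agent binary inside it for its version. The image path and the agent's
    # location within the image are assumptions.
    img="/usr/share/clear-containers/clear-containers.img"
    mnt=$(mktemp -d)
    sudo mount -o loop,ro,noatime,noload "$img" "$mnt"
    "$mnt/usr/bin/cc-agent" --version || true
    sudo umount "$mnt"
    rmdir "$mnt"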

That seems like the best option, but wdyt @grahamwhaley, @sboeuf, @sameo?

@grahamwhaley
Contributor

Very recently I had also considered that we could loop-mount the .img file and run the agent on the host with --version to extract that info. We could either do that in the collect script or have the runtime do it. Doing it in the runtime feels a little skanky, but I guess then we could in theory add the info into cc-env.

@jodh-intel
Contributor

jodh-intel commented Mar 7, 2018

I was having similar feelings about having that sort of code in the runtime too. That said, we do sort of have precedent if you look at cc-check.go which calls modinfo(8).

I'm happy for us to have this purely in the collect script, but yes, if it doesn't go in the runtime, we need to remove the Agent.Version field that @chavafg highlighted, as currently it's static.

@amshinde
Contributor Author

amshinde commented Mar 7, 2018

@chavafg @jodh-intel @grahamwhaley Gathering the agent version was one of the requirements I had in mind, as we were running with the wrong agent last week. What I really wanted to look at were the CRI-O logs, to check the lifecycle events and verify that the container storage driver we pass is actually the one being used by CRI-O.
I would say that for successful builds one is typically interested in the logs just after the build, so I am ok with keeping them around for a week, or even just a couple of days.

@grahamwhaley
Contributor

It looks like the Jenkins 'discard old builds' option may also give us the ability to specify how long to keep artifacts, btw.

mcastelino pushed a commit to mcastelino/tests that referenced this issue on Jan 23, 2019: "ci: cleanup: add timeouts to docker on cleanups" (branch …eout_docker)