Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Geoip processor update #11854

Merged
merged 35 commits into from
Jan 26, 2022
Merged

Geoip processor update #11854

merged 35 commits into from
Jan 26, 2022

Conversation

roberto-graylog
Copy link
Contributor

@roberto-graylog roberto-graylog commented Dec 17, 2021

Description

This PR addresses issue GeoIP Processor Update

The code changes include:

  1. Being able to switch vendors for Geo IP data (e.g. MAXMIND, IPINFO)
  2. Added all available fields provided by each vendor
  3. Limited search fields to a predefined set (to avoid checking whether any field has a valid IP address)
  4. Tagging of reserved/local IP addresses
  5. Added ASN information
  6. Removes the field db_type in the collection cluster_config
  7. Adds the field database_vendor_type in the collection cluster_config

Motivation and Context

The ability to switch vendors at runtime gives users flexibility, depending on which data subscription they have.

How Has This Been Tested?

Testing includes a unit test to confirm the factory can successfully query the current configuration and create the appropriate location and ASN resolver to get the appropriate location and ASN data. To avoid embedding vendor data, all further testing was performed manually on a local installation.

Screenshots (if appropriate):

N/A

Testing Steps During Review:

  1. The DB migration should the cluster_config collection for type 'org.graylog.plusins.map.config.GeoIpResovlerConfig' to { "enabled": true, "enforce_graylog_schema": false, "db_vendor_type": "MAXMIND", "city_db_path": "/etc/graylog/server/GeoLite2-City.mmdb", "asn_db_path": "/" }

  2. Test the Geo-Location Processor updates by navigating to System/Configuration. Test that all the fields are updated correctly AND that when enabled=true the City database is validated (file location as well as database type) and that when an ASN database path is provided, it is also validated. Also test that the vendor type is switched--the server logs indicate that it switched successfully as well as which new type is configured.

  3. With the configuration enabled and enforce_graylog_schema=true , submit a message with fields source_ip,host_ip, and destination_ip, and confirm that the IP addresses in these fields (and only these fields) is handled correctly. Include reserved, public IPv4 addresses as well as an IPv6 address to confirm it is handled correctly. public addresses should produce additional fields with the geo data. Reserved addresses should produce a new field called _reserved_ip with value true. IPv6 addresses will also produced a log entry in the server log. Include IP addresses in any other than the aforementioned fields to confirm only the preset fields were queried.

  4. Update the configuration to enforce_graylog_schema=false and submit a message with IP addresses in fields other than the preset fields--confirm these addresses were handled.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Refactoring (non-breaking change)
  • Breaking change (fix or feature that would cause existing functionality to change)

Checklist:

  • My code follows the code style of this project.
  • My change requires a change to the documentation.
  • I have updated the documentation accordingly.
  • I have read the CONTRIBUTING document.
  • I have added tests to cover my changes.

@roberto-graylog
Copy link
Contributor Author

The back-end support is largely complete, though needs some polishing up.

I discussed the GUI changes with @mikedklein, and we were thinking of something as in the following illustration:
proposed-config-dialog

The Database Providers would be MaxMind and IPInfo, which correspond with the prefixes in the DatabaseType enum set.

The caveat with this approach, which makes the workflow a bit unnatural (at the code level) is that, the city_db_type and asn_db_type fields in the config object would have to be updated dynamically when a database provider selection is made (e.g. When MaxMind is selected the city_db_type and asn_db_type will get values MAXMIND_CITY and MAXMIND_ASN).

When updating the model, current selection for the Database Provider based on the *_db_type fields in the config object.

Alternatively, we can remove the 'db_type' concept and instead create a database_provider field. However, this will require a substantially disruptive change set, including database migrations (which may be just fine if you think it is worth it).

As is, I can switch from MaxMind to IPInfo directly via the endpoint, and the process works as expected (the processor uses the respective City/ASN databases).

Copy link
Member

@bernd bernd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a few questions and suggestions below.

In your comment you mentioned the management of the database reader.

How is lifecycle management handled in general for the objects that read from a database file?

I don't see any code that "restarts" the objects once a user changes the configuration in the web interface. So it looks like any change to the resolver config currently requires a restart of the server.
Everything that is configured in the web interface shouldn't require a restart to get applied.

Lookup tables, for example, are services with lifecycle management that will be restarted once their config gets updated. (that includes closing file handles to databases, etc. - see: MaxmindDataAdapter)

To make the runtime changes work, we would need to implement something similar for the geo IP resolver engine.

And this raises the question of what use case or problem we need to solve triggered these changes in the first place? 😅
If we add lifecycle handling to the geo IP resolver engine, we basically reimplement parts of lookup tables.

So I think we should take a step back and talk about the problem we want to solve. /cc @waab76 @bud1979

…set default vendor type after renaming old column
@danotorrey danotorrey self-assigned this Jan 13, 2022
Copy link
Contributor

@kingzacko1 kingzacko1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Functionality looks good and previously enabled geolocation processing continues to work after upgrading.

@danotorrey
Copy link
Contributor

I am reviewing this now, and should be done today.

Copy link
Contributor

@danotorrey danotorrey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@roberto-graylog I am still reviewing, but I noticed one issue I wanted to share:

It looks like there is an order-of-operation issue that only shows itself on the first startup. The org.graylog.plugins.map.geoip.processor.GeoIpProcessor starts up before the migration runs to migrate the Cluster config data.

The result is the following error only on the first startup (not sure if this is a race condition that only I am seeing) and the GeoProcessor won't run if it was previously enabled halting GeoProcessing. Subsequent restarts of the GeoIpProcessor are successful because the migration has run by then. I think we will need to find a way for the processor to wait on startup until after the migrations have run. I believe @mpfz0r recently added some preflight checks to Graylog when it starts, I am not sure if something there could help. Any suggestions @mpfz0r?

#11889
#11526

org.graylog2.cluster.ClusterConfigServiceImpl - Couldn't extract payload from cluster config (type: org.graylog.plugins.map.config.GeoIpResolverConfig)
2022-01-21 08:47:11,606 ERROR: org.graylog2.cluster.ClusterConfigServiceImpl - Error while deserializing payload java.lang.IllegalArgumentException: Cannot construct instance of `org.graylog.plugins.map.config.GeoIpResolverConfig`, problem: Null cityDbPath at [Source: UNKNOWN; line: -1, column: -1] at com.fasterxml.jackson.databind.ObjectMapper._convert(ObjectMapper.java:3751) ~[jackson-databind-2.9.10.4.jar:2.9.10.4] at com.fasterxml.jackson.databind.ObjectMapper.convertValue(ObjectMapper.java:3669) ~[jackson-databind-2.9.10.4.jar:2.9.10.4] at org.graylog2.cluster.ClusterConfigServiceImpl.extractPayload(ClusterConfigServiceImpl.java:93) [classes/:?] at org.graylog2.cluster.ClusterConfigServiceImpl.get(ClusterConfigServiceImpl.java:109) [classes/:?] at org.graylog2.cluster.ClusterConfigServiceImpl.get(ClusterConfigServiceImpl.java:119) [classes/:?] at org.graylog2.cluster.ClusterConfigServiceImpl.getOrDefault(ClusterConfigServiceImpl.java:124) [classes/:?] at org.graylog.plugins.map.geoip.GeoIpResolver.(GeoIpResolver.java:39) [classes/:?] at org.graylog.plugins.map.geoip.MaxMindIpResolver.(MaxMindIpResolver.java:34) [classes/:?] at org.graylog.plugins.map.geoip.MaxMindIpLocationResolver.(MaxMindIpLocationResolver.java:43) [classes/:?] at org.graylog.plugins.map.geoip.MaxMindIpLocationResolver$$FastClassByGuice$$1047832451.GUICE$TRAMPOLINE() [?:?] at org.graylog.plugins.map.geoip.MaxMindIpLocationResolver$$FastClassByGuice$$1047832451.apply() [?:?] at com.google.inject.internal.DefaultConstructionProxyFactory$FastClassProxy.newInstance(DefaultConstructionProxyFactory.java:82) [guice-5.0.1.jar:?] at com.google.inject.internal.ConstructorInjector.provision(ConstructorInjector.java:114) [guice-5.0.1.jar:?] at com.google.inject.internal.ConstructorInjector.construct(ConstructorInjector.java:91) [guice-5.0.1.jar:?] at com.google.inject.internal.ConstructorBindingImpl$Factory.get(ConstructorBindingImpl.java:296) [guice-5.0.1.jar:?] at com.google.inject.internal.InjectorImpl$1.get(InjectorImpl.java:1100) [guice-5.0.1.jar:?] at com.google.inject.assistedinject.FactoryProvider2.invoke(FactoryProvider2.java:878) [guice-assistedinject-5.0.1.jar:?] at com.sun.proxy.$Proxy159.createMaxMindCityResolver(Unknown Source) [?:?] at org.graylog.plugins.map.geoip.GeoIpVendorResolverService.createCityResolver(GeoIpVendorResolverService.java:43) [classes/:?] at org.graylog.plugins.map.geoip.GeoIpResolverEngine.(GeoIpResolverEngine.java:70) [classes/:?] at org.graylog.plugins.map.geoip.processor.GeoIpProcessor.(GeoIpProcessor.java:77) [classes/:?] at org.graylog.plugins.map.geoip.processor.GeoIpProcessor$$FastClassByGuice$$565401393.GUICE$TRAMPOLINE() [?:?] at org.graylog.plugins.map.geoip.processor.GeoIpProcessor$$FastClassByGuice$$565401393.apply() [?:?] at com.google.inject.internal.DefaultConstructionProxyFactory$FastClassProxy.newInstance(DefaultConstructionProxyFactory.java:82) [guice-5.0.1.jar:?] at com.google.inject.internal.ConstructorInjector.provision(ConstructorInjector.java:114) [guice-5.0.1.jar:?] at com.google.inject.internal.ConstructorInjector.construct(ConstructorInjector.java:91) [guice-5.0.1.jar:?] at com.google.inject.internal.ConstructorBindingImpl$Factory.get(ConstructorBindingImpl.java:296) [guice-5.0.1.jar:?] at com.google.inject.internal.FactoryProxy.get(FactoryProxy.java:60) [guice-5.0.1.jar:?] at com.google.inject.internal.SingleParameterInjector.inject(SingleParameterInjector.java:40) [guice-5.0.1.jar:?] at com.google.inject.internal.RealMultibinder$RealMultibinderProvider.doProvision(RealMultibinder.java:206) [guice-5.0.1.jar:?] at com.google.inject.internal.RealMultibinder$RealMultibinderProvider.doProvision(RealMultibinder.java:159) [guice-5.0.1.jar:?] at com.google.inject.internal.InternalProviderInstanceBindingImpl$Factory.get(InternalProviderInstanceBindingImpl.java:113) [guice-5.0.1.jar:?] at com.google.inject.internal.SingleParameterInjector.inject(SingleParameterInjector.java:40) [guice-5.0.1.jar:?] at com.google.inject.internal.SingleParameterInjector.getAll(SingleParameterInjector.java:60) [guice-5.0.1.jar:?] at com.google.inject.internal.ConstructorInjector.provision(ConstructorInjector.java:113) [guice-5.0.1.jar:?] at com.google.inject.internal.ConstructorInjector.construct(ConstructorInjector.java:91) [guice-5.0.1.jar:?] at com.google.inject.internal.ConstructorBindingImpl$Factory.get(ConstructorBindingImpl.java:296) [guice-5.0.1.jar:?] at com.google.inject.internal.SingleParameterInjector.inject(SingleParameterInjector.java:40) [guice-5.0.1.jar:?] at com.google.inject.internal.SingleParameterInjector.getAll(SingleParameterInjector.java:60) [guice-5.0.1.jar:?] at com.google.inject.internal.ConstructorInjector.provision(ConstructorInjector.java:113) [guice-5.0.1.jar:?] at com.google.inject.internal.ConstructorInjector.construct(ConstructorInjector.java:91) [guice-5.0.1.jar:?] at com.google.inject.internal.ConstructorBindingImpl$Factory.get(ConstructorBindingImpl.java:296) [guice-5.0.1.jar:?] at com.google.inject.internal.InjectorImpl$1.get(InjectorImpl.java:1100) [guice-5.0.1.jar:?] at com.google.inject.assistedinject.FactoryProvider2.invoke(FactoryProvider2.java:878) [guice-assistedinject-5.0.1.jar:?] at com.sun.proxy.$Proxy71.create(Unknown Source) [?:?] at org.graylog2.shared.buffers.ProcessBuffer.(ProcessBuffer.java:91) [classes/:?] at org.graylog2.shared.buffers.ProcessBuffer$$FastClassByGuice$$23047344.GUICE$TRAMPOLINE() [?:?] at org.graylog2.shared.buffers.ProcessBuffer$$FastClassByGuice$$23047344.apply() [?:?] at com.google.inject.internal.DefaultConstructionProxyFactory$FastClassProxy.newInstance(DefaultConstructionProxyFactory.java:82) [guice-5.0.1.jar:?] at com.google.inject.internal.ConstructorInjector.provision(ConstructorInjector.java:114) [guice-5.0.1.jar:?] at com.google.inject.internal.ConstructorInjector.construct(ConstructorInjector.java:91) [guice-5.0.1.jar:?] at com.google.inject.internal.ConstructorBindingImpl$Factory.get(ConstructorBindingImpl.java:296) [guice-5.0.1.jar:?] at com.google.inject.internal.ProviderToInternalFactoryAdapter.get(ProviderToInternalFactoryAdapter.java:40) [guice-5.0.1.jar:?] at com.google.inject.internal.SingletonScope$1.get(SingletonScope.java:169) [guice-5.0.1.jar:?] at com.google.inject.internal.InternalFactoryToProviderAdapter.get(InternalFactoryToProviderAdapter.java:45) [guice-5.0.1.jar:?] at com.google.inject.internal.SingleParameterInjector.inject(SingleParameterInjector.java:40) [guice-5.0.1.jar:?] at com.google.inject.internal.SingleParameterInjector.getAll(SingleParameterInjector.java:60) [guice-5.0.1.jar:?] at com.google.inject.internal.ConstructorInjector.provision(ConstructorInjector.java:113) [guice-5.0.1.jar:?] at com.google.inject.internal.ConstructorInjector.construct(ConstructorInjector.java:91) [guice-5.0.1.jar:?] at com.google.inject.internal.ConstructorBindingImpl$Factory.get(ConstructorBindingImpl.java:296) [guice-5.0.1.jar:?] at com.google.inject.internal.FactoryProxy.get(FactoryProxy.java:60) [guice-5.0.1.jar:?] at com.google.inject.internal.SingleParameterInjector.inject(SingleParameterInjector.java:40) [guice-5.0.1.jar:?] at com.google.inject.internal.RealMultibinder$RealMultibinderProvider.doProvision(RealMultibinder.java:206) [guice-5.0.1.jar:?] at com.google.inject.internal.RealMultibinder$RealMultibinderProvider.doProvision(RealMultibinder.java:159) [guice-5.0.1.jar:?] at com.google.inject.internal.InternalProviderInstanceBindingImpl$Factory.get(InternalProviderInstanceBindingImpl.java:113) [guice-5.0.1.jar:?] at com.google.inject.internal.SingleParameterInjector.inject(SingleParameterInjector.java:40) [guice-5.0.1.jar:?] at com.google.inject.internal.SingleParameterInjector.getAll(SingleParameterInjector.java:60) [guice-5.0.1.jar:?] at com.google.inject.internal.ConstructorInjector.provision(ConstructorInjector.java:113) [guice-5.0.1.jar:?] at com.google.inject.internal.ConstructorInjector.construct(ConstructorInjector.java:91) [guice-5.0.1.jar:?] at com.google.inject.internal.ConstructorBindingImpl$Factory.get(ConstructorBindingImpl.java:296) [guice-5.0.1.jar:?] at com.google.inject.internal.ProviderToInternalFactoryAdapter.get(ProviderToInternalFactoryAdapter.java:40) [guice-5.0.1.jar:?] at com.google.inject.internal.SingletonScope$1.get(SingletonScope.java:169) [guice-5.0.1.jar:?] at com.google.inject.internal.InternalFactoryToProviderAdapter.get(InternalFactoryToProviderAdapter.java:45) [guice-5.0.1.jar:?] at com.google.inject.internal.FactoryProxy.get(FactoryProxy.java:60) [guice-5.0.1.jar:?] at com.google.inject.internal.SingleParameterInjector.inject(SingleParameterInjector.java:40) [guice-5.0.1.jar:?] at com.google.inject.internal.RealMultibinder$RealMultibinderProvider.doProvision(RealMultibinder.java:206) [guice-5.0.1.jar:?] at com.google.inject.internal.RealMultibinder$RealMultibinderProvider.doProvision(RealMultibinder.java:159) [guice-5.0.1.jar:?] at com.google.inject.internal.InternalProviderInstanceBindingImpl$Factory.get(InternalProviderInstanceBindingImpl.java:113) [guice-5.0.1.jar:?] at com.google.inject.internal.SingleFieldInjector.inject(SingleFieldInjector.java:50) [guice-5.0.1.jar:?] at com.google.inject.internal.MembersInjectorImpl.injectMembers(MembersInjectorImpl.java:146) [guice-5.0.1.jar:?] at com.google.inject.internal.ConstructorInjector.provision(ConstructorInjector.java:124) [guice-5.0.1.jar:?] at com.google.inject.internal.ConstructorInjector.construct(ConstructorInjector.java:91) [guice-5.0.1.jar:?] at com.google.inject.internal.ConstructorBindingImpl$Factory.get(ConstructorBindingImpl.java:296) [guice-5.0.1.jar:?] at com.google.inject.internal.BoundProviderFactory.get(BoundProviderFactory.java:58) [guice-5.0.1.jar:?] at com.google.inject.internal.ProviderToInternalFactoryAdapter.get(ProviderToInternalFactoryAdapter.java:40) [guice-5.0.1.jar:?] at com.google.inject.internal.SingletonScope$1.get(SingletonScope.java:169) [guice-5.0.1.jar:?] at com.google.inject.internal.InternalFactoryToProviderAdapter.get(InternalFactoryToProviderAdapter.java:45) [guice-5.0.1.jar:?] at com.google.inject.internal.InternalInjectorCreator.loadEagerSingletons(InternalInjectorCreator.java:213) [guice-5.0.1.jar:?] at com.google.inject.internal.InternalInjectorCreator.injectDynamically(InternalInjectorCreator.java:186) [guice-5.0.1.jar:?] at com.google.inject.internal.InternalInjectorCreator.build(InternalInjectorCreator.java:113) [guice-5.0.1.jar:?] at com.google.inject.Guice.createInjector(Guice.java:87) [guice-5.0.1.jar:?] at org.graylog2.shared.bindings.GuiceInjectorHolder.createInjector(GuiceInjectorHolder.java:34) [classes/:?] at org.graylog2.bootstrap.CmdLineTool.setupInjector(CmdLineTool.java:489) [classes/:?] at org.graylog2.bootstrap.CmdLineTool.run(CmdLineTool.java:294) [classes/:?] at org.graylog2.bootstrap.Main.main(Main.java:45) [classes/:?] Caused by: com.fasterxml.jackson.databind.exc.InvalidDefinitionException: Cannot construct instance of `org.graylog.plugins.map.config.GeoIpResolverConfig`, problem: Null cityDbPath at [Source: UNKNOWN; line: -1, column: -1] at com.fasterxml.jackson.databind.exc.InvalidDefinitionException.from(InvalidDefinitionException.java:67) ~[jackson-databind-2.9.10.4.jar:2.9.10.4] at com.fasterxml.jackson.databind.DeserializationContext.instantiationException(DeserializationContext.java:1608) ~[jackson-databind-2.9.10.4.jar:2.9.10.4] at com.fasterxml.jackson.databind.deser.std.StdValueInstantiator.wrapAsJsonMappingException(StdValueInstantiator.java:484) ~[jackson-databind-2.9.10.4.jar:2.9.10.4] at com.fasterxml.jackson.databind.deser.std.StdValueInstantiator.rewrapCtorProblem(StdValueInstantiator.java:503) ~[jackson-databind-2.9.10.4.jar:2.9.10.4] at com.fasterxml.jackson.databind.deser.std.StdValueInstantiator.createFromObjectWith(StdValueInstantiator.java:285) ~[jackson-databind-2.9.10.4.jar:2.9.10.4] at com.fasterxml.jackson.databind.deser.ValueInstantiator.createFromObjectWith(ValueInstantiator.java:229) ~[jackson-databind-2.9.10.4.jar:2.9.10.4] at com.fasterxml.jackson.databind.deser.impl.PropertyBasedCreator.build(PropertyBasedCreator.java:195) ~[jackson-databind-2.9.10.4.jar:2.9.10.4] at com.fasterxml.jackson.databind.deser.BeanDeserializer._deserializeUsingPropertyBased(BeanDeserializer.java:488) ~[jackson-databind-2.9.10.4.jar:2.9.10.4] at com.fasterxml.jackson.databind.deser.BeanDeserializerBase.deserializeFromObjectUsingNonDefault(BeanDeserializerBase.java:1287) ~[jackson-databind-2.9.10.4.jar:2.9.10.4] at com.fasterxml.jackson.databind.deser.BeanDeserializer.deserializeFromObject(BeanDeserializer.java:326) ~[jackson-databind-2.9.10.4.jar:2.9.10.4] at com.fasterxml.jackson.databind.deser.BeanDeserializer.deserialize(BeanDeserializer.java:159) ~[jackson-databind-2.9.10.4.jar:2.9.10.4] at com.fasterxml.jackson.databind.ObjectMapper._convert(ObjectMapper.java:3746) ~[jackson-databind-2.9.10.4.jar:2.9.10.4] ... 94 more Caused by: java.lang.NullPointerException: Null cityDbPath at org.graylog.plugins.map.config.AutoValue_GeoIpResolverConfig$Builder.cityDbPath(AutoValue_GeoIpResolverConfig.java:146) ~[classes/:?] at org.graylog.plugins.map.config.GeoIpResolverConfig.create(GeoIpResolverConfig.java:55) ~[classes/:?] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_181] at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_181] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_181] at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_181] at com.fasterxml.jackson.databind.introspect.AnnotatedMethod.call(AnnotatedMethod.java:109) ~[jackson-databind-2.9.10.4.jar:2.9.10.4] at com.fasterxml.jackson.databind.deser.std.StdValueInstantiator.createFromObjectWith(StdValueInstantiator.java:283) ~[jackson-databind-2.9.10.4.jar:2.9.10.4] at com.fasterxml.jackson.databind.deser.ValueInstantiator.createFromObjectWith(ValueInstantiator.java:229) ~[jackson-databind-2.9.10.4.jar:2.9.10.4] at com.fasterxml.jackson.databind.deser.impl.PropertyBasedCreator.build(PropertyBasedCreator.java:195) ~[jackson-databind-2.9.10.4.jar:2.9.10.4] at com.fasterxml.jackson.databind.deser.BeanDeserializer._deserializeUsingPropertyBased(BeanDeserializer.java:488) ~[jackson-databind-2.9.10.4.jar:2.9.10.4] at com.fasterxml.jackson.databind.deser.BeanDeserializerBase.deserializeFromObjectUsingNonDefault(BeanDeserializerBase.java:1287) ~[jackson-databind-2.9.10.4.jar:2.9.10.4] at com.fasterxml.jackson.databind.deser.BeanDeserializer.deserializeFromObject(BeanDeserializer.java:326) ~[jackson-databind-2.9.10.4.jar:2.9.10.4] at com.fasterxml.jackson.databind.deser.BeanDeserializer.deserialize(BeanDeserializer.java:159) ~[jackson-databind-2.9.10.4.jar:2.9.10.4] at com.fasterxml.jackson.databind.ObjectMapper._convert(ObjectMapper.java:3746) ~[jackson-databind-2.9.10.4.jar:2.9.10.4] ... 94 more 2022-01-21 08:47:11,607 ERROR: org.graylog2.cluster.ClusterConfigServiceImpl - Couldn't extract payload from cluster config (type: org.graylog.plugins.map.config.GeoIpResolverConfig)

image

image

Old DB config that is present when the GeoIpProcessor starts up the first time:
image

Final and correct config after the migration runs:

{
        "_id" : ObjectId("61eaae5b871ccb599da8d859"),
        "type" : "org.graylog.plugins.map.config.GeoIpResolverConfig",
        "payload" : {
                "enabled" : true,
                "asn_db_path" : "",
                "city_db_path" : "/Users/danieltorrey/maxmind/GeoLite2-City.mmdb",
                "db_vendor_type" : "MAXMIND",
                "enforce_graylog_schema" : false
        },
        "last_updated" : ISODate("2022-01-21T13:00:11.966Z"),
        "last_updated_by" : "e065896b-8a9a-4f45-83f2-e740525ed035"
}

@danotorrey
Copy link
Contributor

A bit more info: The GeolocationProcessor starts up when the ProcessorBuffer is injected in CmdLineTool here:

And later, migrations are run during the startCommand() here:


The root of the problem is that message processing objects are initialized at the time when injection is set up. There appears to be no state control that waits for Graylog startup operations to be complete first. Other than a larger reworking, can anyone think of any solutions or workarounds for this PR? @mpfz0r @bernd @kingzacko1?

Copy link
Contributor

@danotorrey danotorrey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@roberto-graylog I've finished reviewing the rest of the code. Excellent work! I really like the implementation, structure, and all of the error handling. Seems really solid to me.

I only left a few very small comments here. My only remaining concern with this PR is the initial start error and failure of the GeoLocation engine to start. Once a solution is found for that, I think this is ready for merging.

@roberto-graylog
Copy link
Contributor Author

roberto-graylog commented Jan 21, 2022

@roberto-graylog I've finished reviewing the rest of the code. Excellent work! I really like the implementation, structure, and all of the error handling. Seems really solid to me.

I only left a few very small comments here. My only remaining concern with this PR is the initial start error and failure of the GeoLocation engine to start. Once a solution is found for that, I think this is ready for merging.

Thanks @danotorrey , this was a great catch--I completely missed it.

The migration runner requires the injector, not I cannot see at the moment how we an run it before the processor gets injected. I will continue to look and let you know if I find something.

I attempted to run migrations on beforeInjectorCreation with the pre-flight injector, but that just seems to have angered it--perhaps i am just missing module configurations.

@danotorrey
Copy link
Contributor

danotorrey commented Jan 21, 2022

The migration runner requires the injector, not I cannot see at the moment how we an run it before the processor gets injected. I will continue to look and let you know if I find something.

@roberto-graylog I just thought of one possible solution that might be simple. While the GeoIpProcessor constructor is called when injection is setup (migrations have not run at this point), the process method should only be called once the server reaches a RUNNING status. So, if the configuration is lazily retrieved when that method is called, then it should be available then because migrations will have finished by that time. Messages should only arrive in the processor once the server reaches the RUNNING state, I think.

One lazy init option is to use the Apache Commons Lazy Initializer.

Moving the clusterConfigService.getOrDefault(GeoIpResolverConfig.class, GeoIpResolverConfig.defaultConfig()); retrieval to the lazy initializer should do it.

UPDATE
After looking at this a bit more, using a lazy initializer gets a little tricky since both the filterEngine and refresh operation need the config. Perhaps the initializer can return the filterEngine instance and refresh could reinitialize it.

@bernd
Copy link
Member

bernd commented Jan 24, 2022

The root of the problem is that message processing objects are initialized at the time when injection is set up. There appears to be no state control that waits for Graylog startup operations to be complete first. Other than a larger reworking, can anyone think of any solutions or workarounds for this PR? @mpfz0r @bernd @kingzacko1?

@danotorrey @roberto-graylog How about doing something like the following? We can wait with the engine's initialization until the server is in the RUNNING state. It also removes IO from the constructor, which is always preferable. What do you think?

diff --git graylog2-server/src/main/java/org/graylog/plugins/map/geoip/processor/GeoIpProcessor.java graylog2-server/src/main/java/org/graylog/plugins/map/geoip/processor/GeoIpProcessor.java
index cacad4bf6d..72e9109cd3 100644
--- graylog2-server/src/main/java/org/graylog/plugins/map/geoip/processor/GeoIpProcessor.java
+++ graylog2-server/src/main/java/org/graylog/plugins/map/geoip/processor/GeoIpProcessor.java
@@ -25,7 +25,9 @@ import org.graylog.plugins.map.geoip.GeoIpVendorResolverService;
 import org.graylog2.cluster.ClusterConfigChangedEvent;
 import org.graylog2.plugin.Message;
 import org.graylog2.plugin.Messages;
+import org.graylog2.plugin.ServerStatus;
 import org.graylog2.plugin.cluster.ClusterConfigService;
+import org.graylog2.plugin.lifecycles.Lifecycle;
 import org.graylog2.plugin.messageprocessors.MessageProcessor;
 import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;
@@ -55,32 +57,38 @@ public class GeoIpProcessor implements MessageProcessor {
     private final ScheduledExecutorService scheduler;
     private final MetricRegistry metricRegistry;
     private final GeoIpVendorResolverService geoIpVendorResolverService;
+    private final ServerStatus serverStatus;
 
-    private final AtomicReference<GeoIpResolverConfig> config;
-    private final AtomicReference<GeoIpResolverEngine> filterEngine;
+    private final AtomicReference<GeoIpResolverEngine> filterEngine = new AtomicReference<>(null);
 
     @Inject
     public GeoIpProcessor(ClusterConfigService clusterConfigService,
                           @Named("daemonScheduler") ScheduledExecutorService scheduler,
                           EventBus eventBus,
                           MetricRegistry metricRegistry,
-                          GeoIpVendorResolverService geoIpVendorResolverService) {
+                          GeoIpVendorResolverService geoIpVendorResolverService,
+                          ServerStatus serverStatus) {
         this.clusterConfigService = clusterConfigService;
         this.scheduler = scheduler;
         this.metricRegistry = metricRegistry;
         this.geoIpVendorResolverService = geoIpVendorResolverService;
-
-        final GeoIpResolverConfig config = clusterConfigService.getOrDefault(GeoIpResolverConfig.class,
-                GeoIpResolverConfig.defaultConfig());
-
-        this.config = new AtomicReference<>(config);
-        this.filterEngine = new AtomicReference<>(new GeoIpResolverEngine(geoIpVendorResolverService, config, metricRegistry));
+        this.serverStatus = serverStatus;
 
         eventBus.register(this);
     }
 
     @Override
     public Messages process(Messages messages) {
+        // Lazily initialize the filter engine because we might not get the correct resolver config until the
+        // server is running. (until migrations are done)
+        if (filterEngine.get() == null) {
+            if (serverStatus.getLifecycle() == Lifecycle.RUNNING) {
+                reload();
+            } else {
+                return messages;
+            }
+        }
+
         for (Message message : messages) {
             filterEngine.get().filter(message);
         }
@@ -103,7 +111,6 @@ public class GeoIpProcessor implements MessageProcessor {
                 GeoIpResolverConfig.defaultConfig());
 
         LOG.info("Updating GeoIP resolver engine - {}", newConfig);
-        config.set(newConfig);
         filterEngine.set(new GeoIpResolverEngine(geoIpVendorResolverService, newConfig, metricRegistry));
     }
 }

@danotorrey
Copy link
Contributor

@bernd I like the thinking here. Great idea to use the server status to identify the running server state. How about using serverStatus.awaitRunning() with a similar approach (just in case the process method is called before startup)? It should not happen, but this adds some protection against returning unprocessed messages in that case.

@Override
public Messages process(Messages messages) {
    if (filterEngine.get() == null) {
        try {
            serverStatus.awaitRunning();
        } catch (InterruptedException e) {
            LOG.error("The GeoIpProcessor was interrupted while waiting for the Server to start up.");
            // TODO: Handle error condition? Return? 
            //  Probably the server is shutting down if this is thrown, so returning is probably ok?
            return messages;
        }
        reload();
    }
    for (Message message : messages) {
        filterEngine.get().filter(message);
    }

    return messages;
}

@bernd
Copy link
Member

bernd commented Jan 24, 2022

@bernd I like the thinking here. Great idea to use the server status to identify the running server state. How about using serverStatus.awaitRunning() with a similar approach (just in case the process method is called before startup)? It should not happen, but this adds some protection against returning unprocessed messages in that case.

@danotorrey Good point! 👍 I was a bit reluctant to use #awaitRunning() because it blocks the processing thread. But I think you are right, waiting here is the right thing to do.

Copy link
Contributor

@danotorrey danotorrey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @roberto-graylog. The latest update fixed the error on first start up 🎉. Well done!

Do you think the info log messages when the processor engine starts up and restarts can be reduced to debug level? currently, the following messages are logging for every processor (4 by default) and could be even more for customer configurations. Perhaps adding started/starting info level messages to GeoIpProcessor instead would be better, so that they only display once time.

2022-01-26 09:23:12,104 INFO : org.graylog.plugins.map.geoip.processor.GeoIpProcessor - Updating GeoIP resolver engine - GeoIpResolverConfig{enabled=true, enforceGraylogSchema=false, databaseVendorType=MAXMIND, cityDbPath=/Users/danieltorrey/maxmind/GeoLite2-City.mmdb, asnDbPath=/Users/danieltorrey/maxmind/GeoLite2-ASN.mmdb}
2022-01-26 09:23:12,104 INFO : org.graylog.plugins.map.geoip.processor.GeoIpProcessor - Updating GeoIP resolver engine - GeoIpResolverConfig{enabled=true, enforceGraylogSchema=false, databaseVendorType=MAXMIND, cityDbPath=/Users/danieltorrey/maxmind/GeoLite2-City.mmdb, asnDbPath=/Users/danieltorrey/maxmind/GeoLite2-ASN.mmdb}
2022-01-26 09:23:12,104 INFO : org.graylog.plugins.map.geoip.processor.GeoIpProcessor - Updating GeoIP resolver engine - GeoIpResolverConfig{enabled=true, enforceGraylogSchema=false, databaseVendorType=MAXMIND, cityDbPath=/Users/danieltorrey/maxmind/GeoLite2-City.mmdb, asnDbPath=/Users/danieltorrey/maxmind/GeoLite2-ASN.mmdb}
2022-01-26 09:23:12,104 INFO : org.graylog.plugins.map.geoip.processor.GeoIpProcessor - Updating GeoIP resolver engine - GeoIpResolverConfig{enabled=true, enforceGraylogSchema=false, databaseVendorType=MAXMIND, cityDbPath=/Users/danieltorrey/maxmind/GeoLite2-City.mmdb, asnDbPath=/Users/danieltorrey/maxmind/GeoLite2-ASN.mmdb}
2022-01-26 09:23:12,104 INFO : org.graylog.plugins.map.geoip.processor.GeoIpProcessor - Updating GeoIP resolver engine - GeoIpResolverConfig{enabled=true, enforceGraylogSchema=false, databaseVendorType=MAXMIND, cityDbPath=/Users/danieltorrey/maxmind/GeoLite2-City.mmdb, asnDbPath=/Users/danieltorrey/maxmind/GeoLite2-ASN.mmdb}
2022-01-26 09:23:12,105 INFO : org.graylog.plugins.map.geoip.GeoIpResolverEngine - Created Geo IP Resolvers for 'MAXMIND'
2022-01-26 09:23:12,105 INFO : org.graylog.plugins.map.geoip.GeoIpResolverEngine - 'MaxMindIpLocationResolver' Status Enabled: true
2022-01-26 09:23:12,106 INFO : org.graylog.plugins.map.geoip.GeoIpResolverEngine - Created Geo IP Resolvers for 'MAXMIND'
2022-01-26 09:23:12,106 INFO : org.graylog.plugins.map.geoip.GeoIpResolverEngine - 'MaxMindIpAsnResolver' Status Enabled: true
2022-01-26 09:23:12,106 INFO : org.graylog.plugins.map.geoip.GeoIpResolverEngine - Created Geo IP Resolvers for 'MAXMIND'
2022-01-26 09:23:12,106 INFO : org.graylog.plugins.map.geoip.GeoIpResolverEngine - Created Geo IP Resolvers for 'MAXMIND'
2022-01-26 09:23:12,106 INFO : org.graylog.plugins.map.geoip.GeoIpResolverEngine - Created Geo IP Resolvers for 'MAXMIND'
2022-01-26 09:23:12,106 INFO : org.graylog.plugins.map.geoip.GeoIpResolverEngine - 'MaxMindIpLocationResolver' Status Enabled: true
2022-01-26 09:23:12,106 INFO : org.graylog.plugins.map.geoip.GeoIpResolverEngine - 'MaxMindIpLocationResolver' Status Enabled: true
2022-01-26 09:23:12,106 INFO : org.graylog.plugins.map.geoip.GeoIpResolverEngine - 'MaxMindIpLocationResolver' Status Enabled: true
2022-01-26 09:23:12,106 INFO : org.graylog.plugins.map.geoip.GeoIpResolverEngine - 'MaxMindIpLocationResolver' Status Enabled: true
2022-01-26 09:23:12,106 INFO : org.graylog.plugins.map.geoip.GeoIpResolverEngine - 'MaxMindIpAsnResolver' Status Enabled: true
2022-01-26 09:23:12,106 INFO : org.graylog.plugins.map.geoip.GeoIpResolverEngine - 'MaxMindIpAsnResolver' Status Enabled: true
2022-01-26 09:23:12,106 INFO : org.graylog.plugins.map.geoip.GeoIpResolverEngine - 'MaxMindIpAsnResolver' Status Enabled: true
2022-01-26 09:23:12,106 INFO : org.graylog.plugins.map.geoip.GeoIpResolverEngine - 'MaxMindIpAsnResolver' Status Enabled: true

Otherwise, this PR LGTM!

@roberto-graylog
Copy link
Contributor Author

Thanks @roberto-graylog. The latest update fixed the error on first start up tada. Well done!

Do you think the info log messages when the processor engine starts up and restarts can be reduced to debug level? currently, the following messages are logging for every processor (4 by default) and could be even more for customer configurations. Perhaps adding started/starting info level messages to GeoIpProcessor instead would be better, so that they only display once time.

Otherwise, this PR LGTM!

Thanks @danotorrey . I have made the updates from INFO to DEBUG for those entries.

@roberto-graylog roberto-graylog merged commit 96b2aa3 into master Jan 26, 2022
@roberto-graylog roberto-graylog deleted the geoip-processor-update branch January 26, 2022 21:04
@roberto-graylog roberto-graylog restored the geoip-processor-update branch January 28, 2022 22:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants