
Cache Mongo-DB calls (in memory only) #998

Open
wants to merge 6 commits into base: master

Conversation

jason-fox
Contributor

jason-fox commented Mar 2, 2021

Mongo-DB access is slow. This PR minimizes the need to make database calls by caching the last 1000 device calls and 100 group calls in a refreshable cache. This in turn reduces network traffic and increases maximum throughput.

  • Add cache-manager
  • Wrap mongo-db GET calls
  • Bust cache on any provisioning updates/deletes
  • Update unit tests
  • Add documentation

All parameters are settable as config or Docker ENV variables.

Adding cache=true as part of any Device or Group configuration means that the data can potentially be served from the cache rather than always being retrieved from the MongoDB database.
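
For illustration, a minimal sketch of what an opt-in config group provisioning body might look like (the cache flag follows the description above; the payload shape and the other field values are generic placeholders, not taken from this PR's diff):

// Hypothetical example only: a config group provisioned with caching enabled.
// The `cache: true` flag follows the PR description; the other fields are
// generic placeholder values.
const provisionGroupBody = {
    services: [
        {
            apikey: 'myapikey',
            entity_type: 'Thing',
            resource: '/iot/d',
            cache: true // opt this config group into the in-memory cache
        }
    ]
};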

@jason-fox
Contributor Author

Duplicate of #926, but without the Redis cache.

@mapedraza
Collaborator

Thanks for your contribution, Jason.

Regarding the config parameters, the option dontCache unfortunately does not provide backwards compatibility.

A good example could be something like the JEXL configuration approach:

  • A global parameter that configures the default mode (cache enabled or disabled for all config groups, by default)
  • A parameter on the config group that specifies the cache mode and overrides the default configuration

With these two parameters, an existing deployment that updates the IoT Agent can use the cache if needed, offering backwards compatibility with config groups already provisioned. The expected behaviour is described below.

Regarding the cache distribution, it should allow segmentation (as mentioned in #926 (comment)). As far as I saw in the code, the cache is shared across all tenants. In multi-tenant environments there is a risk that one tenant could consume all of the resources. With this in mind, we can differentiate between two types of cache:

  • Group cache: it is associated with a tenant (aka FIWARE-Service). Since neither the IoT Agent’s data model nor the API currently models any “entity” or operation that references a Service (tenant), a first approach to configuring the group cache should be:
    • A general on/off switch for the group cache.
    • Group cache size (all tenants would have the same group cache size).
  • Device cache: each device group has to have its own independent device cache.

The architecture discussed above is illustrated in the following diagram:

Diagram
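
A rough sketch of that segmentation (all names and structure here are illustrative, not code from the PR), assuming the cache-manager memory store mentioned in the PR description:

// Illustrative sketch only: one group cache per tenant (FIWARE-Service) and one
// device cache per config group, so a single tenant cannot evict other tenants'
// entries. Sizes/TTLs would come from the IOTA_GROUP_CACHE_* / IOTA_DEVICE_CACHE_* settings.
const cacheManager = require('cache-manager');

const groupCaches = new Map();  // tenant (FIWARE-Service) -> group cache
const deviceCaches = new Map(); // config group key -> device cache

function getGroupCache(service) {
    if (!groupCaches.has(service)) {
        groupCaches.set(service, cacheManager.caching({ store: 'memory', max: 100, ttl: 60 }));
    }
    return groupCaches.get(service);
}

function getDeviceCache(service, subservice, apikey) {
    const key = service + ':' + subservice + ':' + apikey;
    if (!deviceCaches.has(key)) {
        deviceCaches.set(key, cacheManager.caching({ store: 'memory', max: 100, ttl: 60 }));
    }
    return deviceCaches.get(key);
}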

To summarise, we would need the following env vars, as well as the corresponding config.js parameters:

  • Group Cache
    • IOTA_GROUP_CACHE_MODE. Possible values: none, inMemory
    • IOTA_GROUP_CACHE_SIZE
    • IOTA_GROUP_CACHE_TTL
  • Device cache. These configurations can be overridden through device group provisioning.
    • IOTA_DEVICE_CACHE_DEFAULT_MODE
    • IOTA_DEVICE_CACHE_DEFAULT_SIZE
    • IOTA_DEVICE_CACHE_DEFAULT_TTL
    • IOTA_DEVICE_CACHE_MAX_SIZE

The device group provisioning JSON should also include the following parameters. These parameters should override the env var or config.js configuration.

  • cacheTTL
  • cacheMode: may have one of the following values: none, inMemory
  • cacheSize: if the value provided is greater than IOTA_DEVICE_CACHE_MAX_SIZE, it will be clamped to IOTA_DEVICE_CACHE_MAX_SIZE (see the sketch after this list).
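
A minimal sketch of the intended clamping (assuming the provisioned value arrives as cacheSize and the configured limit as deviceCacheMaxSize; the helper name is illustrative):

// Hypothetical helper: clamp a provisioned cache size to the configured maximum.
function effectiveDeviceCacheSize(group, config) {
    const requested = group.cacheSize || config.deviceCacheDefaultSize;
    return Math.min(requested, config.deviceCacheMaxSize);
}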

Example of cache config file section:

{
    "groupCacheMode":"inMemory",
    "groupCacheSize":100,
    "groupCacheTTL":100000,
    "deviceCacheDefaultMode":"inMemory",
    "deviceCacheDefaultSize":100,
    "deviceCacheDefaultTTL":100000,
    "deviceCacheMaxSize":1000
}

Equivalence with env vars:

Environment variable Configuration attribute
IOTA_GROUP_CACHE_MODE cache.groupCacheMode
IOTA_GROUP_CACHE_SIZE cache.groupCacheSize
IOTA_GROUP_CACHE_TTL cache.groupCacheTTL
IOTA_DEVICE_CACHE_DEFAULT_MODE cache.deviceCacheDefaultMode
IOTA_DEVICE_CACHE_DEFAULT_SIZE cache.deviceCacheDefaultSize
IOTA_DEVICE_CACHE_DEFAULT_TTL cache.deviceCacheDefaultTTL
IOTA_DEVICE_CACHE_MAX_SIZE cache.deviceCacheMaxSize

Another point to consider is the cache replacement mode. Depending on the scenario, it may be more interesting to have an LRU, MRU or random replacement policy. Which mode is used right now? It would be interesting to be able to configure it.
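
For reference, if the PR relies on cache-manager's default memory store (an assumption from the commit list; as far as I understand, that store is backed by lru-cache), the replacement policy would be LRU, roughly behaving like this sketch:

// Sketch of LRU behaviour, assuming cache-manager's in-memory (lru-cache based) store.
const cacheManager = require('cache-manager');
const memoryCache = cacheManager.caching({ store: 'memory', max: 2, ttl: 60 });

memoryCache.set('a', 1);
memoryCache.set('b', 2);
memoryCache.get('a', function () {}); // touching 'a' marks it as recently used
memoryCache.set('c', 3);              // 'b' (the least recently used entry) is evicted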

@SBlechmann

Thanks for this interesting PR! I really hope this can in fact improve read queries from the agents!
Since I'm not a coding expert (I'm more like a simple user) and haven't reviewed the code in full, please allow me these questions:

  • If I understood correctly, the group cache is supposed to be the maximum granted cache per tenant (fiware-service) and the device cache is the maximum granted cache per group (endpoint /iot/services)? If I am right, I find this naming pretty confusing. How about renaming "group cache" to "tenant cache" or "db cache" and "device cache" to "group cache"?
  • Is there a check that the "device cache", or rather the sum of "device caches" per tenant, doesn't exceed the "group cache"?

Thanks for your time, really appreciate it!

@mapedraza
Collaborator

First, I want to clarify that my previous comment, with the description and diagrams, shows the desired behaviour expected from a cache system.

  • If I understood correctly, the group cache is supposed to be the maximum granted cache per tenant (fiware-service) and the device cache is the maximum granted cache per group (endpoint /iot/services)? If I am right, I find this naming pretty confusing. How about renaming "group cache" to "tenant cache" or "db cache" and "device cache" to "group cache"?

I know it is a bit confusing, but the reason for naming it "group cache" is that it is a cache that stores groups (and it is related to each tenant). The reason the device cache is named that way is that devices are stored in that cache (and it is also linked to a group). Depending on whether the name refers to what data is being stored or to what the cache belongs to, one naming or the other may make more sense.

  • Is there a check that the "device cache", or rather the sum of "device caches" per tenant, doesn't exceed the "group cache"?

They are different types of caches. The group cache only stores config groups (also named provision groups) and does not store devices. You just have two different limits (IOTA_GROUP_CACHE_SIZE and IOTA_DEVICE_CACHE_MAX_SIZE).

@SBlechmann

Thanks for the explanation, I think I got your naming now. So device and group cache are independent from each other.

Allow me one last question:

  • Is the group cache at a fixed level for each tenant/database?

I assume there are checks in the background ensuring that the sum of the group and device caches doesn't exceed the memory of MongoDB?

@jason-fox
Contributor Author

Regarding the cache distribution, it should allow segmentation (as mentioned in #926 (comment)). As far as I saw in the code, the cache is shared across all tenants. In multi-tenant environments there is a risk that one tenant could consume all of the resources. With this in mind, we can differentiate between two types of cache:

I think the architecture you described won't work with a pure in-memory cache (which is what this PR now is), but it is something to be achieved in PR #926. All that this PR does for now is substitute an in-memory record of the last n hits; it is very, very simple, but very fast to access. I would assume the per-tenant config would be on Redis as a series of Redis caches.

The same goes for replacement mode - not this PR but the other one.

@jason-fox
Contributor Author

jason-fox commented Apr 8, 2021

Regarding the config parameters, the option dontCache unfortunately does not provide backwards compatibility.

Getting this right is something for the first PR, laying the groundwork so to speak. How is dontCache not backwards compatible?
The default behaviour is "Don't use any caches", the same as before. If and only if caching is deliberately enabled, it is used.
dontCache is a flag on provisioning to bypass any caching for important devices that must remain consistent.

Can you clarify what changes to the current behaviour you are looking for here? I assume something needs fixing; I'm just not sure what.

A good example could be something like JEXL configuration approach:

A global parameter that configures the default mode (cache enabled or disabled for all config groups, by default)

This is currently IOTA_MEMCACHE_ENABLED or memCache.enabled in the config.

A parameter on the config group that specifies the cache mode and overrides the default configuration

The current architecture assumes each cache can be enabled separately, e.g.:

  • IOTA_MEMCACHE_ENABLED
  • IOTA_REDISCACHE_ENABLED
  • etc.

The local in-memory cache is the fastest and smallest. If IOTA_MEMCACHE_ENABLED=false it is not used; if both are true, then Redis will run if the in-memory cache fails.
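
A sketch of that cascade (the lookup order and the helper names here are assumptions about the intended behaviour, not code from the PR):

// Hypothetical lookup cascade: in-memory cache first, then Redis, then MongoDB.
// `memCache`, `redisCache` and `db` are illustrative stand-ins.
async function getDevice(id, config, memCache, redisCache, db) {
    if (config.memCache && config.memCache.enabled) {
        const hit = await memCache.get(id);
        if (hit) {
            return hit;
        }
    }
    if (config.redisCache && config.redisCache.enabled) {
        const hit = await redisCache.get(id);
        if (hit) {
            return hit;
        }
    }
    return db.findDevice(id); // finally fall back to MongoDB
}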

With these two parameters, an existing deployment that updates the IoT Agent can use the cache if needed, offering backwards compatibility with config groups already provisioned.

If we ignore Redis for now, does the quick-and-dirty in-memory cache do enough or not?

@jason-fox
Contributor Author

jason-fox commented Apr 13, 2021

@mapedraza - The in-provisioning flag has been switched from dontCache to cache. The default is therefore not to cache unless explicitly provisioned to do so at the device or device group level. Does this change address the comment below?

Regarding the config parameters, the option dontCache unfortunately does not provide backwards compatibility.

@Blobonat

Is there still active work on this topic? For a horizontally scalable deployment of IOTAs with one common MongoDB cluster this would be a gigantic performance boost latency-wise.

@jason-fox
Contributor Author

Rebased as requested. This part is actually a much smaller change than it appears, since it also corrects the location of the mongoDB test tool, and runs cache flushing when necessary.

const mongoUtils = require('../../tools/mongoDBUtils');

@Blobonat

Will this feature introduce breaking changes for the IOTA implementations like IOTA-JSON and IOTA-UL, or are the changes transparent so that a simple version bump will enable the use of this functionality?

@jason-fox
Contributor Author

jason-fox commented Aug 23, 2022

It is opt-in, so it is only enabled if you set the configuration to do so. Even if you are using an in-memory cache, you could provision individual devices not to use it - it just depends on whether you want lower latency or whether you are worried that IoT Agent A might use older cached in-memory info when a provisioning update has occurred through IoT Agent B.

@jason-fox
Contributor Author

@mapedraza - is this PR still in the queue to be reviewed? It is opt-in, so without setting the parameters, the PR itself is harmless. It is use-case dependent as to whether you want full consistency across multiple IoT Agent instances or lower latency and fewer database look-ups. The text is quite clear about this:

The memCache data is not shared across instances and therefore should be reserved for short-term data storage. Multiple
IoT Agents would potentially hold inconsistent provisioning data until the cache has expired.
