
Adding mechanisms to gracefully handle "map is full" error #34

Open
ncloudioj opened this issue Jun 7, 2018 · 12 comments

@ncloudioj
Member

Each LMDB store has a predefined size (10MB by default). If a store runs out of free space because:

  • the store was filled with data, or
  • some orphaned transactions prevented LMDB from reclaiming unused pages,

then all subsequent inserts will be rejected by LMDB with a MDB_MAP_FULL error.

We will have to provide some bailout mechanisms for this particular issue to avoid write downtime.

  • Resize the store. This requires that all users terminate their transactions (read/write) first; also, once the size is increased, there is no way to shrink it short of creating a new environment and copying all the data over.
  • Let LMDB reclaim the unused pages. We need to ensure that no orphan is holding locks in LMDB's reader table. LMDB has an API (mdb_reader_check) to clean up those zombie transactions; it looks like this API is not exposed by lmdb-rs, so we may have to add it upstream first.
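For concreteness, a rough sketch of what those two bailout paths might look like at the lmdb layer. This assumes the lmdb crate's `Environment::set_map_size` and raw-handle accessor plus the `mdb_reader_check` binding from lmdb-sys; it is illustrative, not a proposed implementation.

```rust
// Sketch only: clearing stale readers and growing the map at the lmdb layer.
use std::os::raw::c_int;

use lmdb::Environment;

fn recover_space(env: &Environment, new_size: usize) -> lmdb::Result<()> {
    // Ask LMDB to clear zombie entries from its reader lock table so the pages
    // they pin can be reclaimed. lmdb-rs doesn't wrap this call, so it goes
    // through the -sys bindings directly.
    let mut dead: c_int = 0;
    unsafe {
        lmdb_sys::mdb_reader_check(env.env(), &mut dead);
    }

    // Grow the map. All read/write transactions must be finished before this
    // call, and the only way to shrink later is to copy into a fresh environment.
    env.set_map_size(new_size)
}
```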
@zen0wu

zen0wu commented Oct 13, 2018

On a related matter, it seems that mdb_env_set_mapsize should be exposed to change the default size. I went back to the lower-level library because of this limitation. Not sure if there's an existing issue for that.

@mykmelez
Contributor

@shivawu You can change the map size using rkv. See #82 for tests that demonstrate how to do that.
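For anyone landing here later, a minimal sketch of what that might look like; the `set_map_size` method name is inferred from #82 and may not match the current rkv API exactly.

```rust
// Sketch only: growing the map through rkv. The `set_map_size` name is an
// assumption based on the tests added in #82; the new size is illustrative.
use std::path::Path;

use rkv::{Rkv, StoreError};

fn grow_map(path: &Path) -> Result<(), StoreError> {
    let env = Rkv::new(path)?;
    // Resizing requires that no read or write transactions are open on `env`.
    env.set_map_size(100 * 1024 * 1024) // grow to 100 MiB
}
```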

@zen0wu

zen0wu commented Oct 21, 2018

Ah, I see, somehow I missed it. Thanks for pointing that out. Probably the local docs don't work that well for me. Also, by the way, the online docs are missing for the project (https://docs.rs/rkv/0.5.1/rkv/). Is that expected?

@mykmelez
Contributor

Ah, I see, somehow I missed it. Thanks for pointing that out. Probably the local docs don't work that well for me. Also, by the way, the online docs are missing for the project (https://docs.rs/rkv/0.5.1/rkv/). Is that expected?

Not expected, but known. It's #81, where you can read all the details (tl;dr: docs.rs is using an outdated rustc compiler).

@ncloudioj
Member Author

Another way for us to deal with MAP_FULL is to check the space usage of the environment via its stats API (which reports the max map size, total pages in use, total pages in the freelist, etc.), and then increase the map size if the disk usage goes over a predefined high watermark.

Specifically, we can do this either by a dedicated maintenance worker, or by the consumers themselves. Either way, some coordination is required since resizing can only be done on a free environment (only one writer without any readers).

This preventive resizing appears preferable to resize-when-you-hit-map-full in two respects:

  • Write transactions are less likely to be interrupted
  • It is easier to coordinate, since preparing the resize condition is hard when there are active readers/writers
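A rough sketch of that watermark check against the environment stats; the 80% threshold and the doubling policy are illustrative only, and `last_pgno` is used as a crude proxy for pages in use.

```rust
// Sketch only: preventive resize driven by the environment stats.
use lmdb::Environment;

fn maybe_grow(env: &Environment) -> lmdb::Result<()> {
    let info = env.info()?;
    let stat = env.stat()?;

    let page_size = stat.page_size() as usize;
    // Rough proxy for space in use; it ignores pages sitting in the freelist.
    let used = (info.last_pgno() + 1) * page_size;
    let capacity = info.map_size();

    // Grow once usage crosses the 80% high watermark. The caller must ensure
    // the environment is quiescent (one writer, no readers) at this point.
    if used * 10 >= capacity * 8 {
        env.set_map_size(capacity * 2)?;
    }
    Ok(())
}
```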

@mykmelez What do you think?

@mykmelez
Contributor

@ncloudioj It seems helpful to periodically check stats and preemptively increase the map size if the database is approaching its limit (as well as to periodically call mdb_reader_check to clear stale readers). And that'll make it less likely for a write to fail with MAP_FULL. But it seems like we'll still occasionally encounter that error, so we'll still need to handle it somehow.

Perhaps kvstore (or the consumer) can catch the MAP_FULL error and trigger a resize (and stale reader check) operation at that point, after which it can retry the write?

That operation could be identical to the one we run periodically. And perhaps it could be invoked with a flag to indicate that it's being invoked manually and should definitely increase the map size, even if the environment doesn't look too full, since a large-enough write could fail in an environment that is mostly empty.

As for coordination, I suppose that whatever is invoking the resize operation could await existing readers while clearing stale ones and blocking new ones, then resize once existing readers are cleared, and finally unblock the new readers (after which it can return control to whoever called it—kvstore or the consumer—to retry the write).
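Something like the following, perhaps, where `run_maintenance` is a hypothetical stand-in for the shared resize + stale-reader-check operation.

```rust
// Sketch only: catch MAP_FULL, run the maintenance pass with a forced size
// bump, then retry the write once.
use lmdb::Environment;

fn run_maintenance(env: &Environment, force_grow: bool) -> lmdb::Result<()> {
    // The periodic stale-reader check and watermark/forced resize would live here.
    let _ = (env, force_grow);
    Ok(())
}

fn put_with_retry<F>(env: &Environment, mut write: F) -> lmdb::Result<()>
where
    F: FnMut(&Environment) -> lmdb::Result<()>,
{
    match write(env) {
        // A large enough write can hit MAP_FULL even in a mostly empty map,
        // so force a size increase rather than relying on the watermark.
        Err(lmdb::Error::MapFull) => {
            run_maintenance(env, /* force_grow */ true)?;
            write(env)
        }
        other => other,
    }
}
```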

@ncloudioj
Member Author

Agreed. Looks like kvstore is the perfect candidate to do this bookkeeping. Once this particular error gets handled properly, it'll be a big step forward for us to roll out rkv in production at scale!

Speaking of maintenance, LMDB also has a set of APIs for cloning the environment (as mentioned in #12), which could be useful for avoiding the fragmentation problem (like VACUUM). A frequently updated store can consume many more pages than a fresh clone.

With a stale reader collector, a resize utility, and a live copy utility, I believe we'd have a solid maintenance lineup for kvstore.
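For the live-copy piece, a sketch of how the compacting copy might be driven through the raw bindings, assuming lmdb-sys exposes `mdb_env_copy2`/`MDB_CP_COMPACT` (which LMDB's C API provides).

```rust
// Sketch only: a VACUUM-style compacting copy via the raw LMDB bindings.
use std::ffi::CString;
use std::path::Path;

use lmdb::Environment;

fn compact_copy(env: &Environment, dest: &Path) -> Result<(), String> {
    // mdb_env_copy2 with MDB_CP_COMPACT writes a compacted copy (no freelist
    // pages) to `dest`; swapping the copy in afterwards is what actually
    // reclaims the fragmented space.
    let c_dest = CString::new(dest.to_str().ok_or("non-UTF-8 path")?)
        .map_err(|e| e.to_string())?;
    let rc = unsafe {
        lmdb_sys::mdb_env_copy2(env.env(), c_dest.as_ptr(), lmdb_sys::MDB_CP_COMPACT)
    };
    if rc == 0 {
        Ok(())
    } else {
        Err(format!("mdb_env_copy2 failed with code {}", rc))
    }
}
```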

@mykmelez
Contributor

This morning it occurred to me that we might implement this in rkv instead of kvstore, in which case rkv consumers more generally would gain the benefit of this automatic resizing. And doing that would be consistent with the goal of rkv to provide a "simple, humane" interface to LMDB. Unsure if there's any reason not to implement this in rkv, but I'll think about it some more.

@ncloudioj Do you have any thoughts about implementing this in kvstore vs. rkv?

@ncloudioj
Member Author

ncloudioj commented Feb 15, 2019

My reasoning was based on the assumption that the consumer knows better than rkv when and how to do the resize :)

Though I agree with you that letting rkv handle it for the consumers would be nice in most cases. If we were to implement it in rkv, I think we also need to consider the following cases:

  • rkv consumers might want to handle MAP_FULL themselves. For example, they might want a fixed-size store, or might not want it to grow without bound. Perhaps leave auto-resize as an option?
  • Off the top of my head, coordinating readers and writers for resizing would be hard because rkv doesn't know anything about the active readers, such as when they will end their underlying transactions. I just realized that kvstore actually faces the same challenge, though. The key question is how we can efficiently send this need-to-resize message to all of the rkv readers and writers.

@mykmelez
Contributor

It occurred to me recently that a reason to do this in rkv rather than kvstore is that even in Firefox there are consumers who are using rkv directly, such as the one in bug 1429796.

rkv consumers might want to handle MAP_FULL themselves. For example, they might want a fixed-size store, or might not want it to grow without bound. Perhaps leave auto-resize as an option?

As with some other issues, like #109, a consumer might want more control over their interaction with LMDB under certain circumstances while appreciating the value of rkv's higher-level abstractions in other cases. And there's a tension between supporting those use cases and keeping the API simple for the common case.

I do want to support those use cases, and I agree that it should be possible for an rkv consumer to "opt-out" of auto-resize. I would just want to make sure we do so in a way that imposes the least cognitive burden on consumers in the common case.

Off the top of my head, coordinating readers and writers for resizing would be hard because rkv doesn't know anything about the active readers, such as when they will end their underlying transactions. I just realized that kvstore actually faces the same challenge, though. The key question is how we can efficiently send this need-to-resize message to all of the rkv readers and writers.

I suspect that rkv will have to track and manage active readers somehow, such that it can await them, while blocking new ones, when a resize is pending.
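Purely as a conceptual sketch (not anything rkv does today), in a single process that tracking could be as simple as gating readers and resizes on a shared lock:

```rust
// Conceptual sketch only: a pending resize waits for active readers to finish
// and holds off new ones. Single-process only.
use std::sync::{Arc, RwLock};

use lmdb::Environment;

#[derive(Clone)]
struct TrackedEnv {
    inner: Arc<RwLock<Environment>>,
}

impl TrackedEnv {
    // Readers take the shared side; the guard's lifetime marks the reader as
    // "active" for as long as the caller holds it.
    fn with_reader<T>(&self, f: impl FnOnce(&Environment) -> T) -> T {
        let env = self.inner.read().unwrap();
        f(&env)
    }

    // A resize takes the exclusive side: it waits for existing readers to drop
    // their guards and blocks new ones until the resize completes.
    fn resize(&self, new_size: usize) -> lmdb::Result<()> {
        let env = self.inner.write().unwrap();
        env.set_map_size(new_size)
    }
}
```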

@ncloudioj
Member Author

Indeed, there are pros and cons to both mechanisms. It looks like equipping rkv with auto-resize works better in more cases at the moment. So yes, I am convinced; let's get started with that design.

To keep the initial work to a reasonable scope, I'd also like to withdraw my prior suggestion of making auto-resize optional. We can revisit that feature if we find it useful in the future.

I suspect that rkv will have to track and manage active readers somehow, such that it can await them, while blocking new ones, when a resize is pending.

Yes, it's definitely doable in a single-process scenario, but it will be more challenging when active readers reside in different processes. Perhaps we can have an internal db (e.g. __meta__) to store all the meta information in order to facilitate the coordination.

With this change, readers could also be blocked by the resize coordination, though this should be easier to handle since we can return an error (such as RESIZE_IN_PROGRESS) from the reader constructor.
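A tiny, hypothetical sketch of that reader-constructor behavior (none of these names exist in rkv):

```rust
// Hypothetical sketch only: surface a RESIZE_IN_PROGRESS error from the reader
// constructor instead of blocking; the flag and error variant are made up.
use std::sync::atomic::{AtomicBool, Ordering};

#[derive(Debug)]
enum StoreError {
    ResizeInProgress,
}

struct Store {
    resize_pending: AtomicBool,
}

struct Reader<'s> {
    _store: &'s Store,
}

impl Store {
    fn read(&self) -> Result<Reader<'_>, StoreError> {
        // Refuse to start a new read transaction while a resize is pending;
        // callers can back off and retry.
        if self.resize_pending.load(Ordering::Acquire) {
            return Err(StoreError::ResizeInProgress);
        }
        Ok(Reader { _store: self })
    }
}
```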

@mykmelez
Contributor

mykmelez commented Mar 25, 2019

Note https://www.openldap.org/its/index.cgi/Software%20Bugs?id=8975, which fixes a (rare) error/crash when calling mdb_env_set_mapsize() on Windows if MDB_WRITEMAP is set. It's already been landed on the mdb.RE/0.9 branch, but a new version of LMDB hasn't been released with the fix yet.
