Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix memory leak in lmdb.cc + improve performance while reading collections in resolveMultiMatches #2520

Open
wants to merge 4 commits into
base: v3/master
Choose a base branch
from

Conversation

ziollek
Copy link
Contributor

@ziollek ziollek commented Feb 24, 2021

What have been done ?

  1. Removed memory leak in lmdb.cc while passing data from MDB_val to VariableValue:
  • added new VariableValue constructors for rvalue strings
  • decreased memory footprint (just one data copying between lmdb and VariableValue thanks to above constructors)
  1. Optimized fetching collection in resolveMultiMatches:
  • at the beginning, the cursor is moved to a key equal to or greater than the input key (MDB_SET_RANGE)
  • values are fetched via cursor as long as the input key is equal to the key fetched from lmdb

…ableValue

- add new VariableValue constructors for rvalue strings
- decrease number of memory copies between lmdb and VariableValue
- move cursor to first key equal or greather than input key (MDB_SET_RANGE)
- break while if input key is not equal key fetched from lmdb
@zimmerle zimmerle self-requested a review February 26, 2021 14:41
@zimmerle
Copy link
Contributor

Hi @ziollek, thanks for the contribution.

I am wondering how to fit the rvalue change on VariableValue of the 3.1-experimental development branch -

https://github.com/SpiderLabs/ModSecurity/blob/9f47f1473c88a5b105ced2f11c87204a0abba424/headers/modsecurity/variable_value.h#L51-L59

Here is the key + value constructor -
https://github.com/SpiderLabs/ModSecurity/blob/9f47f1473c88a5b105ced2f11c87204a0abba424/headers/modsecurity/variable_value.h#L98-L108

Here is the collection + key + value -
https://github.com/SpiderLabs/ModSecurity/blob/9f47f1473c88a5b105ced2f11c87204a0abba424/headers/modsecurity/variable_value.h#L111-L184

Giving that changes on VariableValue, we already have the LMDB in this shape -
https://github.com/SpiderLabs/ModSecurity/blob/9f47f1473c88a5b105ced2f11c87204a0abba424/src/collection/backend/lmdb.cc#L498-L516

The benefit of MDB_SET_RANGE change is very clear. What do you think about the changes that we have on 3.1-experimental?

@zimmerle zimmerle self-assigned this Feb 26, 2021
@zimmerle zimmerle added the 3.x Related to ModSecurity version 3.x label Feb 26, 2021
@ziollek
Copy link
Contributor Author

ziollek commented Feb 26, 2021

Hi @zimmerle - thanks for your comments.

I wasn't aware of VariableValue changes in 3.1-experimental. Despite of internal changes in variable_value.h i'm afraid, there is still a memory leak in lmdb.cc - because of using new std::string. In effect, below constructor is called:

https://github.com/SpiderLabs/ModSecurity/blob/9f47f1473c88a5b105ced2f11c87204a0abba424/headers/modsecurity/variable_value.h#L142-L151

Above constructor (unlike constructors quoted in your comment) don't 'move' allocated in lmdb memory - it just copies it.

Luckily it could be easily fixed by replacing
new std::string(...) to std::make_unique<std::string>(...)

@zimmerle
Copy link
Contributor

Different methods from lmdb back-end are performing this string copy.

resolveFirst -

https://github.com/SpiderLabs/ModSecurity/blob/4127c1bf52d2b30a5c2c3e641b8085fd9a720f43/src/collection/backend/lmdb.cc#L192-L194

resolveSingleMatch -

https://github.com/SpiderLabs/ModSecurity/blob/4127c1bf52d2b30a5c2c3e641b8085fd9a720f43/src/collection/backend/lmdb.cc#L290-L293

resolveMultiMatches -

https://github.com/SpiderLabs/ModSecurity/blob/4127c1bf52d2b30a5c2c3e641b8085fd9a720f43/src/collection/backend/lmdb.cc#L499-L504

and

https://github.com/SpiderLabs/ModSecurity/blob/4127c1bf52d2b30a5c2c3e641b8085fd9a720f43/src/collection/backend/lmdb.cc#L510-L515

resolveRegularExpression -

https://github.com/SpiderLabs/ModSecurity/blob/4127c1bf52d2b30a5c2c3e641b8085fd9a720f43/src/collection/backend/lmdb.cc#L571-L575

They are using the constructors -

Method Constructor Leak on v3/master
resolveFirst https://github.com/SpiderLabs/ModSecurity/blob/4127c1bf52d2b30a5c2c3e641b8085fd9a720f43/headers/modsecurity/variable_value.h#L42-L48 Likely
resolveSingleMatch https://github.com/SpiderLabs/ModSecurity/blob/4127c1bf52d2b30a5c2c3e641b8085fd9a720f43/headers/modsecurity/variable_value.h#L42-L48 Likely
resolveMultiMatches (a) https://github.com/SpiderLabs/ModSecurity/blob/4127c1bf52d2b30a5c2c3e641b8085fd9a720f43/headers/modsecurity/variable_value.h#L50-L57 likely
resolveMultiMatches (b) https://github.com/SpiderLabs/ModSecurity/blob/4127c1bf52d2b30a5c2c3e641b8085fd9a720f43/headers/modsecurity/variable_value.h#L50-L57 Likely
resolveRegularExpression https://github.com/SpiderLabs/ModSecurity/blob/4127c1bf52d2b30a5c2c3e641b8085fd9a720f43/headers/modsecurity/variable_value.h#L42-L48 Likely

The class VariableValue has four std::string members: m_collection, m_key, m_keyWithCollection, and m_value the attribution of their values happens on the constructor for VariableValue. m_value could be also set via setValue. All attributions are based on the std::string copy constructor.

Later the VariableValue is used at -

Transaction

https://github.com/SpiderLabs/ModSecurity/blob/4127c1bf52d2b30a5c2c3e641b8085fd9a720f43/src/transaction.cc#L913-L918

https://github.com/SpiderLabs/ModSecurity/blob/4127c1bf52d2b30a5c2c3e641b8085fd9a720f43/src/transaction.cc#L1533-L1539

https://github.com/SpiderLabs/ModSecurity/blob/4127c1bf52d2b30a5c2c3e641b8085fd9a720f43/src/transaction.cc#L1571-L1576

https://github.com/SpiderLabs/ModSecurity/blob/4127c1bf52d2b30a5c2c3e641b8085fd9a720f43/src/transaction.cc#L1670-L1674

https://github.com/SpiderLabs/ModSecurity/blob/4127c1bf52d2b30a5c2c3e641b8085fd9a720f43/src/transaction.cc#L1700-L1704

RunTimeString

https://github.com/SpiderLabs/ModSecurity/blob/4127c1bf52d2b30a5c2c3e641b8085fd9a720f43/src/run_time_string.cc#L63-L66

RuleWithOperator

https://github.com/SpiderLabs/ModSecurity/blob/4127c1bf52d2b30a5c2c3e641b8085fd9a720f43/src/rule_with_operator.cc#L278-L338

As an alternative the inMemoryBackEnd back-end for collections does not duplicates/copy the strings.

Investigating the best approach to tackle this without too much modification on the v3/master code.

data.mv_size)));
}
string2val(var, &key);
rc = mdb_cursor_get(cursor, &key, &data, MDB_SET_RANGE);
Copy link

@sobczak-m sobczak-m Mar 24, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ziollek replace MDB_SET_RANGE to MDB_SET_KEY
MDB_SET_RANGE - getting first key greater than or equal to specified key http://www.lmdb.tech/doc/group__mdb.html#ga1206b2af8b95e7f6b0ef6b28708c9127
If you are looking for none exist samplekey you get key containing samplekey like samplekey2 (if such key exist)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

} else {
break;
}
} while ((rc = mdb_cursor_get(cursor, &key, &data, MDB_NEXT)) == 0);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

replace MDB_NEXT to MDB_NEXT_DUB to iterate only over current key

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@zimmerle
Copy link
Contributor

I see a benefit of having less copies around, studying the feasibility to have this pull request partially merged on the upcoming 3.0.5.

On QA - https://github.com/SpiderLabs/ModSecurity/actions/runs/883266568

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3.x Related to ModSecurity version 3.x
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants