Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support datastreams as an AuditLog Sink #4257

Merged
merged 6 commits into from
Sep 26, 2024

Conversation

tmanninger
Copy link
Contributor

@tmanninger tmanninger commented Apr 16, 2024

Description

This PR allows writing auditlogs to datastreams.

auditlog type "internal_opensearch" currently not supporting datatstreams.
java.lang.IllegalArgumentException: only write ops with an op_type of create are allowed in data streams

This PR changes the op_type to "create" for the "internal_opensearch" auditlogs, because audit logs should never be updated.
Datastreams only supporting "create" as op_type.

Issues Resolved

Testing

Tested manually: Security auditlog with and without datastreams

@tmanninger tmanninger changed the title Add op_type write to internal_opensearcn auditlogs Add op_type write to internal_opensearch auditlogs Apr 16, 2024
@tmanninger tmanninger changed the title Add op_type write to internal_opensearch auditlogs Add op_type "create" to internal_opensearch auditlogs Apr 16, 2024
@peternied
Copy link
Member

@tmanninger Thanks for the contribution, can you please since your commits with the DCO [1]?

Copy link
Member

@DarshitChanpura DarshitChanpura left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 to Peter's comments. @tmanninger Could you please +ve/-ve test cases for this change. We would like to really ensure that explicitly marking op_type as CREATE would resolve the question.

@willyborankin
Copy link
Collaborator

willyborankin commented Apr 16, 2024

I agree with @peternied and @DarshitChanpura . The data stream can be created via client.createDataStream. The setting OpType.CREATED for the document can help, but currently, an audit index is not a data stream. To create a data stream, first you need:

  • Create an index template since the data stream index is different compare to the regular index
  • Create a data stream
  • Create ISM polity to rollover and clean data stream index.

Under different index I mean that data stream is a system index from the sec plugin point of view:
E.g. you have a data stream: aaa

According to your rollover policy, the ISM plugin will create each time a new index

  • .aaa_001
  • .aaa_002
  • ...
    And delete them if it is needed.
    The template defines an alias that the end user should use to search for data in the data stream.

Here is an example: https://github.com/opensearch-project/opensearch-java/blob/main/samples/src/main/java/org/opensearch/client/samples/DataStreamBasics.java.

Furthermore, restoring from the snapshot is a trick for datastreams since you have to restore all datastreams first and only after all regular indices.

@willyborankin
Copy link
Collaborator

I think the right solution would be to extend existing InternalOpenSearchSink or crate a new Sync call it let's say internal_opensearch_datastream or something else. @peternied, @DarshitChanpura what do you thin about the idea?

@tmanninger
Copy link
Contributor Author

@willyborankin do you mean, the Sink should create the Template, Datastream and ISM policy?
If i am using the the current internal sink with an index, i also must create the index template with an alias manually.

@willyborankin
Copy link
Collaborator

@willyborankin do you mean, the Sink should create the Template, Datastream and ISM policy? If i am using the the current internal sink with an index, i also must create the index template with an alias manually.

@tmanninger yes I think so, the main reason is that if you already has an audit index in your cluster and the cluster was migrated on the new version ... you fix will start to create a datastream ... which is not backward compatible.

@tmanninger
Copy link
Contributor Author

@willyborankin do you mean, the Sink should create the Template, Datastream and ISM policy? If i am using the the current internal sink with an index, i also must create the index template with an alias manually.

@tmanninger yes I think so, the main reason is that if you already has an audit index in your cluster and the cluster was migrated on the new version ... you fix will start to create a datastream ... which is not backward compatible.

Really?
irb.setOpType(DocWriteRequest.OpType.CREATE);
Without an datastream template, the fix will not create a datastream. I tested it in my test environment.

@DarshitChanpura
Copy link
Member

@tmanninger to ensure backwards compatibility for your fix:

Really?
irb.setOpType(DocWriteRequest.OpType.CREATE);
Without an datastream template, the fix will not create a datastream. I tested it in my test environment.

Can we add backwards compatibility tests as well?

@tmanninger
Copy link
Contributor Author

I will add some test cases.

@tmanninger
Copy link
Contributor Author

tmanninger commented Apr 18, 2024

I added a test for datastream and for an index.
For both, i generate some audit logs and check the count of documents in the audit log index/datastream.

Thank's ok?

Copy link
Member

@peternied peternied left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the right solution would be to extend existing InternalOpenSearchSink or create a new Sync call it let's say internal_opensearch_datastream or something else. @peternied, @DarshitChanpura what do you thin about the idea?

@willyborankin This is a great idea, I strongly suggest we move in this direction

Add a backward combability test.

I don't think a BWC test is needed in this scenario so long as we are adding a new audit log type.

@tmanninger Let us know if you have questions, while the proposed fix could work. I'd rather there is consistency. Having to correctly configure the data stream before its used seems like a way to have a broken audit log and not be aware of it, whereas having a concrete definition ensures that the data stream is setup correctly without user intervention

@tmanninger
Copy link
Contributor Author

tmanninger commented Apr 18, 2024

I can create a new Sink named internal_opensearch_datastream

  • Create index template (name: auditlog_datastream)
  • Create datastream (name: auditlog , config option: plugins.security.audit.config.datastream)
  • Write audit logs to this datastream

is this way ok?

@peternied
Copy link
Member

@tmanninger Please look into how the other audit log storage systems [1] work and consider what should be have sensible defaults that is configurable vs required.

As far as sink names go I think internal_opensearch and external_opensearch would look silly with external_opensearch_data_stream side-by-side. You might need to be creative with how the class is instantiation or create 'inner classes' one for index and one for data_stream

As you mentioned plugins.security.audit.config.datastream would effectively communicate the name, and I would recommend that plugins.security.audit.config.index is mutually exclusive. I think the template values should be configurable via a similar mechanism for the index shard/replica count of an index

Looks forward to those changes - thank you.

@tmanninger
Copy link
Contributor Author

How can i create a datastream template?

When i am using this example:
https://github.com/opensearch-project/opensearch-java/blob/main/samples/src/main/java/org/opensearch/client/samples/DataStreamBasics.java.

error: package org.opensearch.client.opensearch.indices does not exist

I can use org.opensearch.client.indices.PutIndexTemplateRequest, but this class does not support DataStream config.

@peternied
Copy link
Member

How can i create a datastream template?

When i am using this example: https://github.com/opensearch-project/opensearch-java/blob/main/samples/src/main/java/org/opensearch/client/samples/DataStreamBasics.java.

error: package org.opensearch.client.opensearch.indices does not exist

I can use org.opensearch.client.indices.PutIndexTemplateRequest, but this class does not support DataStream config.

@tmanninger I'm not sure, maybe you could look at what options are available in OpenSearch's repository

@peternied peternied changed the title Add op_type "create" to internal_opensearch auditlogs Support datastreams as an AuditLog Sink Apr 19, 2024
@tmanninger
Copy link
Contributor Author

I rebased git in an incorrect way :-/ (and github was auto-closed this MR) ...
I readded all my changes in one commit

Signed-off-by: tmanninger <[email protected]>
@tmanninger
Copy link
Contributor Author

Added MR for documentation: opensearch-project/documentation-website#8356

Signed-off-by: tmanninger <[email protected]>
@tmanninger
Copy link
Contributor Author

tmanninger commented Sep 24, 2024

I fixed the spotless errors https://github.com/opensearch-project/security/actions/runs/11010456713 .
@willyborankin please rerun the tests.

Signed-off-by: tmanninger <[email protected]>
Signed-off-by: tmanninger <[email protected]>
@tmanninger
Copy link
Contributor Author

tmanninger commented Sep 24, 2024

Check the doc count after rollover failed.

    Expected: (a value greater than <0> and a value less than <5>)
         but: a value less than <5> <5> was equal to <5>
        at __randomizedtesting.SeedInfo.seed([BB9BC69071BEFE5B:AF3E69E787FB9BB0]:0)
        at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:18)
        at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:6)
        at org.opensearch.security.auditlog.sink.InternalOpensearchDataStreamSinkTest.testTemplate(InternalOpensearchDataStreamSinkTest.java:108)
        at org.opensearch.security.auditlog.sink.InternalOpensearchDataStreamSinkTest.testDefaultSettings(InternalOpensearchDataStreamSinkTest.java:123)

Sometimes, there is one more audit log in the new datastream backend, maybe the rollover is faster than writing all audit logs generated before the rollover.
I increased the number of maximal audit docs.

@willyborankin pls rerun (hope the last time).

Copy link

codecov bot commented Sep 24, 2024

Codecov Report

Attention: Patch coverage is 84.26966% with 14 lines in your changes missing coverage. Please review.

Project coverage is 65.65%. Comparing base (31cfd54) to head (4aa65c0).
Report is 14 commits behind head on main.

Files with missing lines Patch % Lines
...uditlog/sink/InternalOpenSearchDataStreamSink.java 83.67% 6 Missing and 2 partials ⚠️
.../auditlog/sink/AbstractInternalOpenSearchSink.java 76.00% 4 Missing and 2 partials ⚠️
Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #4257      +/-   ##
==========================================
+ Coverage   65.51%   65.65%   +0.14%     
==========================================
  Files         319      321       +2     
  Lines       22448    22538      +90     
  Branches     3602     3611       +9     
==========================================
+ Hits        14707    14798      +91     
+ Misses       5933     5928       -5     
- Partials     1808     1812       +4     
Files with missing lines Coverage Δ
.../opensearch/security/OpenSearchSecurityPlugin.java 84.82% <100.00%> (+0.34%) ⬆️
...security/auditlog/sink/InternalOpenSearchSink.java 100.00% <100.00%> (+16.66%) ⬆️
...pensearch/security/auditlog/sink/SinkProvider.java 80.89% <100.00%> (+0.43%) ⬆️
...g/opensearch/security/support/ConfigConstants.java 95.23% <ø> (ø)
.../auditlog/sink/AbstractInternalOpenSearchSink.java 76.00% <76.00%> (ø)
...uditlog/sink/InternalOpenSearchDataStreamSink.java 83.67% <83.67%> (ø)

... and 13 files with indirect coverage changes

cwperks
cwperks previously approved these changes Sep 24, 2024
Copy link
Member

@cwperks cwperks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added MR for documentation: opensearch-project/documentation-website#8356

Thank you for creating a documentation PR for this change!

The changes look good to me, I just had one remaining question on the default value for number of shards.

Signed-off-by: tmanninger <[email protected]>
@DarshitChanpura DarshitChanpura merged commit 9c5e32a into opensearch-project:main Sep 26, 2024
42 checks passed
@DarshitChanpura DarshitChanpura added the backport 2.x backport to 2.x branch label Sep 26, 2024
opensearch-trigger-bot bot pushed a commit that referenced this pull request Sep 26, 2024
Signed-off-by: tmanninger <[email protected]>
(cherry picked from commit 9c5e32a)
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport 2.x backport to 2.x branch
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants