Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RUM-6307 Make sure ConsentAwareFileOrchestrator is thread safe #2313

Conversation

mariusc83
Copy link
Collaborator

@mariusc83 mariusc83 commented Oct 8, 2024

What does this PR do?

  • Issue Identified:

    • Our storage logic has a race condition issue affecting event processing.
    • This problem is visible only in a specific use case involving concurrent operations and tracking consent management.
  • Specific Use Case:

    • The system is operating under "PENDING" tracking consent mode.
    • Events are being written via a featureScope by one thread.
    • At the same time, tracking consent is being changed by another thread.
  • Underlying Process:

    • For each event written:
      • The system first resolves the DatadogContext to determine the current environment and configuration.
      • It then uses the DatadogContext to resolve the FileOrchestrator, which decides where events should be stored.
      • Subsequently, a task is added to the Executor queue to handle the actual writing of the event.
override fun writeCurrentBatch(
       datadogContext: DatadogContext,
       forceNewBatch: Boolean,
       callback: (EventBatchWriter) -> Unit
   ) {
       val orchestrator = when (datadogContext.trackingConsent) {
           TrackingConsent.GRANTED -> grantedOrchestrator
           TrackingConsent.PENDING -> pendingOrchestrator
           TrackingConsent.NOT_GRANTED -> null
       }

       val metric = internalLogger.startPerformanceMeasure(
           callerClass = ConsentAwareStorage::class.java.name,
           metric = TelemetryMetricType.MethodCalled,
           samplingRate = MethodCallSamplingRate.RARE.rate,
           operationName = "writeCurrentBatch[${orchestrator?.getRootDirName()}]"
       )

       executorService.submitSafe("Data write", internalLogger) {
           synchronized(writeLock) {
               val batchFile = orchestrator?.getWritableFile(forceNewBatch)
               val metadataFile = if (batchFile != null) {
                   orchestrator.getMetadataFile(batchFile)
                   ...
  • Race Condition Trigger:
    • During the execution of setTrackingConsent(Granted):
      • A new task is created to migrate events from the "PENDING" to the "GRANTED" consent state.
      • This task is added to the Executor queue.
    • The race condition occurs when this migration task enters the queue between the resolution of the FileOrchestrator and the addition of the event-writing task.
journey
    title Storge Executor Queue
    section Write Tasks
        event1InPendign: Write Event 1
        event2InPending: Write Event 2
    section SwitchConsent Tasks
        PendingToGranted: Switch Consent
    section Write Tasks
        event3InPending: Write Event 3
        event4InGranted: Write Event 4
Loading
  • Resulting Issue:

    • The events, initially deemed "PENDING," get stored in the "PENDING" directory.
    • Due to the timing of the migration task:
      • These events may fail to migrate to the "GRANTED" state.
      • Consequently, they remain in the "PENDING" folder indefinitely.
    • This results in the loss of events, as they are not properly sent or processed.
  • Implications:

    • This issue could lead to incomplete data capture and analysis.
    • It may also affect downstream processes that rely on accurate and complete event data.

Motivation

What inspired you to submit this pull request?

Additional Notes

Anything else we should know when reviewing?

Review checklist (to be filled by reviewers)

  • Feature or bugfix MUST have appropriate tests (unit, integration, e2e)
  • Make sure you discussed the feature or bugfix with the maintaining team in an Issue
  • Make sure each commit and the PR mention the Issue number (cf the CONTRIBUTING doc)

@mariusc83 mariusc83 force-pushed the mconstantin/rum-6307/add-integration-test-for-async-consent-switching branch 3 times, most recently from 73bebc7 to f94a330 Compare October 11, 2024 11:26
@mariusc83 mariusc83 self-assigned this Oct 11, 2024
@codecov-commenter
Copy link

codecov-commenter commented Oct 11, 2024

Codecov Report

Attention: Patch coverage is 95.83333% with 1 line in your changes missing coverage. Please review.

Project coverage is 70.37%. Comparing base (ca952c5) to head (085f2e8).
Report is 4 commits behind head on develop.

Files with missing lines Patch % Lines
...d/core/internal/persistence/ConsentAwareStorage.kt 93.33% 0 Missing and 1 partial ⚠️
Additional details and impacted files
@@           Coverage Diff            @@
##           develop    #2313   +/-   ##
========================================
  Coverage    70.37%   70.37%           
========================================
  Files          738      738           
  Lines        27529    27541   +12     
  Branches      4616     4616           
========================================
+ Hits         19372    19381    +9     
- Misses        6864     6879   +15     
+ Partials      1293     1281   -12     
Files with missing lines Coverage Δ
...in/com/datadog/android/core/internal/SdkFeature.kt 91.10% <100.00%> (+0.09%) ⬆️
...droid/core/internal/persistence/AbstractStorage.kt 100.00% <100.00%> (ø)
...ence/file/advanced/ConsentAwareFileOrchestrator.kt 96.77% <ø> (ø)
...d/core/internal/persistence/ConsentAwareStorage.kt 96.70% <93.33%> (-0.92%) ⬇️

... and 36 files with indirect coverage changes

@mariusc83 mariusc83 force-pushed the mconstantin/rum-6307/add-integration-test-for-async-consent-switching branch 4 times, most recently from 428bee9 to 1d4a258 Compare October 11, 2024 13:56
@mariusc83 mariusc83 marked this pull request as ready for review October 11, 2024 15:06
@mariusc83 mariusc83 requested review from a team as code owners October 11, 2024 15:06
@mariusc83 mariusc83 force-pushed the mconstantin/rum-6307/add-integration-test-for-async-consent-switching branch from 1d4a258 to e8bcade Compare October 14, 2024 14:56
@mariusc83 mariusc83 force-pushed the mconstantin/rum-6307/add-integration-test-for-async-consent-switching branch from e8bcade to 085f2e8 Compare October 14, 2024 14:59
@mariusc83 mariusc83 merged commit 611b5c9 into develop Oct 15, 2024
24 checks passed
@mariusc83 mariusc83 deleted the mconstantin/rum-6307/add-integration-test-for-async-consent-switching branch October 15, 2024 07:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants