[FEATURE] - DBR 14.3 support - foreachbatch impacts #56

riccamini · 2024-08-01T14:21:51Z

Is your feature request related to a problem? Please describe.

This issue is related to #33. The goal is to investigate the impact on foreachBatch calls in the code.

riccamini · 2024-08-01T14:57:40Z

I found only one call to DataStreamWriter.foreachBatch, in the StreamWriter class (spark.writers.stream.py:292).

One additional consideration after reading the docstring:

    This function behaves differently in Spark Connect mode. See examples.
    In Connect, the provided function doesn't have access to variables defined outside of it.

    Examples
    --------
    >>> import time
    >>> df = spark.readStream.format("rate").load()
    >>> my_value = -1
    >>> def func(batch_df, batch_id):
    ...     global my_value
    ...     my_value = 100
    ...     batch_df.collect()
    ...
    >>> q = df.writeStream.foreachBatch(func).start()
    >>> time.sleep(3)
    >>> q.stop()
    >>> # if in Spark Connect, my_value = -1, else my_value = 100

Koheesio does not rely on this pattern (a batch function that mutates variables defined outside of it), but the limitation should probably be made explicit in the StreamWriter documentation for the batch_function field.

SynchronizeDeltaToSnowflakeTask has no additional foreachBatch calls, as it reuses StreamWriter.
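To illustrate what such a documentation note could recommend, here is a minimal sketch of a batch function that should behave the same with or without Spark Connect, because it reads and writes nothing outside its own body. The names process_batch and example_sink are hypothetical and used only for illustration; this is not Koheesio's actual implementation.

```python
def process_batch(batch_df, batch_id):
    """Self-contained foreachBatch callback (illustrative sketch only)."""
    # Hypothetical target table name. It is defined inside the function
    # so the callback does not depend on variables declared outside of it.
    target_table = "example_sink"

    # Persist the micro-batch through the DataFrame writer instead of
    # mutating a module-level variable, since such mutations are not
    # visible to the caller under Spark Connect.
    batch_df.write.mode("append").saveAsTable(target_table)
```

Assuming a streaming DataFrame df, it would be registered with df.writeStream.foreachBatch(process_batch).start(). Any result the caller needs afterwards would then be read back from the sink rather than from a global.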

dannymeijer · 2024-11-08T11:11:33Z

@riccamini - can you verify that this issue is resolved with the upcoming 0.9 release?

riccamini added the enhancement New feature or request label Aug 1, 2024

riccamini mentioned this issue Aug 1, 2024

[FEATURE] Ensure that we can support DBR 14.3LTS #33

Open

3 tasks

dannymeijer added this to the 0.9.0 milestone Nov 8, 2024

dannymeijer linked a pull request Nov 11, 2024 that will close this issue

Release/0.9 #97

Draft
