[MINOR][PYTHON][DOCS] Clarify verifySchema at createDataFrame not working with pandas DataFrame with Arrow optimization

### What changes were proposed in this pull request?

This PR proposes to clarify that `verifySchema` in `createDataFrame` does not work with a pandas DataFrame when Arrow optimization is enabled.

### Why are the changes needed?

To give correct information about how `verifySchema` interacts with Arrow optimization in `createDataFrame`.

### Does this PR introduce _any_ user-facing change?

Yes, it fixes the user-facing documentation.

### How was this patch tested?

I manually ran the linters.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #45333 from HyukjinKwon/improve-doc-verifySchema.

Authored-by: Hyukjin Kwon <[email protected]>
Signed-off-by: Hyukjin Kwon <[email protected]>
HyukjinKwon committed Feb 29, 2024
1 parent 11e8ae4 commit a8b6e3c
Showing 1 changed file with 3 additions and 0 deletions.
3 changes: 3 additions & 0 deletions python/pyspark/sql/session.py
@@ -1325,6 +1325,9 @@ def createDataFrame(  # type: ignore[misc]
             if ``samplingRatio`` is ``None``.
         verifySchema : bool, optional
             verify data types of every row against schema. Enabled by default.
+            When the input is :class:`pandas.DataFrame` and
+            `spark.sql.execution.arrow.pyspark.enabled` is enabled, this option is not
+            effective. It follows Arrow type coercion.
             .. versionadded:: 2.1.0
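
The documented behavior can be observed with a small comparison. The sketch below is not part of this commit: `try_create` and `demo` are hypothetical helper names, and the column name `a` and the float sample values are made up for illustration. It builds the same pandas DataFrame with `verifySchema=True` twice, once with `spark.sql.execution.arrow.pyspark.enabled` off and once with it on, and reports whether each call succeeded. No expected output is listed because the result depends on the Spark, pandas, and PyArrow versions in use.

```python
ARROW_KEY = "spark.sql.execution.arrow.pyspark.enabled"


def try_create(spark, pdf, arrow_enabled):
    """Build a DataFrame from `pdf` with verifySchema=True and report the outcome.

    Returns (True, rows) on success or (False, exception class name) on failure.
    """
    # Spark SQL configuration values are strings, so pass "true"/"false".
    spark.conf.set(ARROW_KEY, "true" if arrow_enabled else "false")
    try:
        df = spark.createDataFrame(pdf, schema="a int", verifySchema=True)
        return True, df.collect()
    except Exception as exc:
        return False, type(exc).__name__


def demo():
    """Hypothetical driver; needs pyspark, pandas and pyarrow installed."""
    # Imported here so try_create can be read and exercised without Spark.
    import pandas as pd
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.master("local[1]").getOrCreate()
    try:
        # Float values under a declared "a int" schema: a deliberate mismatch.
        pdf = pd.DataFrame({"a": [1.5, 2.5]})
        for arrow_enabled in (False, True):
            print(arrow_enabled, try_create(spark, pdf, arrow_enabled))
    finally:
        spark.stop()
```

Calling `demo()` in an environment with Spark available prints one result per setting. If the two results differ, that difference is the Arrow path following Arrow type coercion instead of the per-row check described in the docstring.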