Refactor rename delta table #125

Open · wants to merge 4 commits into base: main
1 change: 1 addition & 0 deletions .gitignore
@@ -5,6 +5,7 @@ chispa.egg-info/
tmp/
.idea/
.DS_Store
*.parquet

# Byte-compiled / optimized / DLL files
__pycache__/
15 changes: 4 additions & 11 deletions README.md
@@ -554,31 +554,24 @@ Notice that the records that violated either of the constraints are appended to

## Rename a Delta Table

This function is designed to rename a Delta table. It can operate either within a Databricks environment or with a standalone Spark session.

## Parameters:
Here are the parameters for the function:

- `delta_table` (`DeltaTable`): An object representing the Delta table to be renamed.
- `new_table_name` (`str`): The new name for the table.
- `table_location` (`str`, optional): The file path where the table is stored. If not provided, the function attempts to deduce the location from the `DeltaTable` object. Defaults to `None`.
- `databricks` (`bool`, optional): A flag indicating the function's operational environment. Set to `True` if running within Databricks, otherwise, `False`. Defaults to `False`.
- `spark_session` (`pyspark.sql.SparkSession`, optional): The Spark session. This is required when `databricks` is set to `True`. Defaults to `None`.

## Returns:
The function raises a `TypeError` if the provided `delta_table` is not a DeltaTable object, or if `databricks` is set to `True` and `spark_session` is `None`.

- `None`

## Raises:

- `TypeError`: If the provided `delta_table` is not a DeltaTable object, or if `databricks` is set to `True` and `spark_session` is `None`.

## Example Usage:
Here's how to use the function:

```python
rename_delta_table(existing_delta_table, "new_table_name")
```
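
A fuller call that exercises the optional parameters listed above might look like the following sketch. It assumes a Databricks notebook where `spark` is the active `SparkSession` and a table registered as `old_table_name` exists in the metastore; both names are purely illustrative:

```python
from delta.tables import DeltaTable
from pyspark.sql import SparkSession

# In a Databricks notebook, `spark` is already defined; getActiveSession()
# retrieves it explicitly so the sketch is self-contained.
spark = SparkSession.getActiveSession()

# Look up the existing table by its registered name (illustrative name).
existing_delta_table = DeltaTable.forName(spark, "old_table_name")

rename_delta_table(
    existing_delta_table,
    "new_table_name",
    databricks=True,      # take the Databricks code path
    spark_session=spark,  # required when databricks=True
)
```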


## Dictionary

We're leveraging the following terminology defined [here](https://www.databasestar.com/database-keys/#:~:text=Natural%20key%3A%20an%20attribute%20that,can%20uniquely%20identify%20a%20row).
37 changes: 8 additions & 29 deletions mack/__init__.py
@@ -6,6 +6,7 @@
from pyspark.sql.dataframe import DataFrame
from pyspark.sql.functions import col, concat_ws, count, md5, row_number, max
from pyspark.sql.window import Window
from pyspark.sql import SparkSession


def type_2_scd_upsert(
@@ -694,44 +695,22 @@ def constraint_append(


def rename_delta_table(
delta_table: DeltaTable,
old_table_name: str,
new_table_name: str,
table_location: str = None,
databricks: bool = False,
spark_session: pyspark.sql.SparkSession = None,
spark_session: SparkSession = SparkSession.getActiveSession(),
) -> None:
"""
Renames a Delta table to a new name. This function can be used in a Databricks environment or with a
standalone Spark session.
Renames a Delta table to a new name.

Parameters:
delta_table (DeltaTable): The DeltaTable object representing the table to be renamed.
old_table_name (str): The current name of the table to be renamed.
new_table_name (str): The new name for the table.
table_location (str, optional): The file path where the table is stored. Defaults to None.
If None, the function will attempt to determine the location from the DeltaTable object.
databricks (bool, optional): A flag indicating whether the function is being run in a Databricks
environment. Defaults to False. If True, a SparkSession must be provided.
spark_session (pyspark.sql.SparkSession, optional): The Spark session. Defaults to None.
Required if `databricks` is set to True.
spark_session (pyspark.sql.SparkSession, optional): The Spark session. Defaults to the active SparkSession.

Returns:
None

Raises:
TypeError: If the provided `delta_table` is not a DeltaTable object, or if `databricks` is True
and `spark_session` is None.

Example Usage:
>>> rename_delta_table(existing_delta_table, "new_table_name")
>>> rename_delta_table("old_table_name", "new_table_name")
"""
if not isinstance(delta_table, DeltaTable):
raise TypeError("An existing delta table must be specified for delta_table.")
if databricks and spark_session is None:
raise TypeError("A spark session must be specified for databricks.")

if databricks:
spark_session.sql(f"ALTER TABLE {delta_table.name} RENAME TO {new_table_name}")
else:
delta_table.toDF().write.format("delta").mode("overwrite").saveAsTable(
new_table_name
)
spark_session.sql(f"ALTER TABLE {old_table_name} RENAME TO {new_table_name}")