all: implement Spanner connector #59

Merged
merged 80 commits into from
Sep 23, 2023

Commits on Sep 23, 2023

  1. ScanBuilder: implement buildScan to construct SQL

    Allows us to build SQL from filters and required columns.
    
    Updates #58
    odeke-em committed Sep 23, 2023
    a363750
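
To make "build SQL from filters and required columns" concrete, here is a minimal Python sketch of the idea. It is not the connector's `buildScan` code, which is Java and not shown on this page: the function name `build_scan_sql`, the `(column, operator, value)` filter tuples, and the table name `rates` in the usage note are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

# Hypothetical filter shape: (column, operator, value). Not the connector's types.
Filter = Tuple[str, str, object]

_ALLOWED_OPS = {"=", "!=", "<", "<=", ">", ">="}


def _literal(value: object) -> str:
    """Render a Python value as a SQL literal (strings are single-quoted)."""
    if isinstance(value, bool):
        return "TRUE" if value else "FALSE"
    if isinstance(value, (int, float)):
        return str(value)
    # Simplistic escaping for the sketch; real code should prefer query parameters.
    escaped = str(value).replace("\\", "\\\\").replace("'", "\\'")
    return f"'{escaped}'"


def build_scan_sql(table: str, columns: Sequence[str], filters: Sequence[Filter]) -> str:
    """Build a SELECT that projects `columns` and ANDs together `filters`."""
    select_list = ", ".join(columns) if columns else "*"
    sql = f"SELECT {select_list} FROM {table}"
    clauses: List[str] = []
    for column, op, value in filters:
        if op not in _ALLOWED_OPS:
            raise ValueError(f"unsupported operator: {op}")
        clauses.append(f"{column} {op} {_literal(value)}")
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    return sql
```

For example, `build_scan_sql("rates", ["created_at", "value"], [("value", ">", 3720)])` yields a statement that selects only the two requested columns and carries the filter in its `WHERE` clause, so less data has to leave the database.
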
  2. 8489767
  3. Add RDD creators

    odeke-em committed Sep 23, 2023
    e68e4f8
  4. 197916c
  5. more unit tests per class

    odeke-em committed Sep 23, 2023
    7ec5d50
  6. 9baf076
  7. b50f924
  8. Remove dialect v1 RDD code

    odeke-em committed Sep 23, 2023
    da84649
  9. c529dc3
  10. 5a2c260
  11. 8556acf
  12. Add more diverse tests

    odeke-em committed Sep 23, 2023
    cfbb77e
  13. Add Map->JSON serializer and deserializer

    Given that we've hit crashes because Map<String, String>
    is non-serializable, this change adds a serializer and
    deserializer to and from a JSON string.
    odeke-em committed Sep 23, 2023
    bc8d205
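
The round trip described in this commit (string-to-string map to JSON string and back) can be sketched in a few lines. The actual change is in the connector's Java code, which this page does not show; the Python below is only an analogue, and the function names `serialize_options` and `deserialize_options` are hypothetical.

```python
import json
from typing import Dict


def serialize_options(options: Dict[str, str]) -> str:
    """Encode a string-to-string map as a JSON string (sorted keys for stable output)."""
    return json.dumps(options, sort_keys=True)


def deserialize_options(payload: str) -> Dict[str, str]:
    """Decode the JSON string back into a string-to-string map."""
    decoded = json.loads(payload)
    if not isinstance(decoded, dict):
        raise ValueError("expected a JSON object")
    # Coerce keys and values to str so the result is always a plain string map.
    return {str(key): str(value) for key, value in decoded.items()}
```

Shipping the map as a plain string sidesteps the serializability problem, because a string can always be sent to another process and decoded there.
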
  14. cc03ef4
  15. ac9aef3
  16. 0878137
  17. ccdd23c
  18. f6f17b4
  19. tests: add DATE as well

    odeke-em committed Sep 23, 2023
    68a7dc1
  20. Make SpannerInputPartitionContext work with the serializable requirement

    1. Move the creation of the spanner object inside createPartitionReaderContext so that it does not need to be serializable.
    2. Remove the Closeable interface, since the spanner variable is not a member of the class.
    3. Catch JsonProcessingException, since Exception is too broad.
    halio-g authored and odeke-em committed Sep 23, 2023
    2fd2591
  21. e2b0c59
  22. Removed unused datasource V1 files.

    halio-g authored and odeke-em committed Sep 23, 2023
    e80ccee
  23. Remove the unused opts.

    halio-g authored and odeke-em committed Sep 23, 2023
    6a13fbd
  24. Removed the unused opts.

    halio-g authored and odeke-em committed Sep 23, 2023
    e8b0599
  25. Change the logger class to SpannerTable itself; change the JSON type to String

    JSON inside Spanner is not easy to convert to an array.
    halio-g authored and odeke-em committed Sep 23, 2023
    cb5d733
  26. Make the following type conversions work end to end: JSON, Byte, Date, and Timestamp.
    halio-g authored and odeke-em committed Sep 23, 2023
    04f939e
  27. Make the following type conversions work end to end: JSON, Byte, Date, Timestamp, and Array<String>.
    halio-g authored and odeke-em committed Sep 23, 2023
    46f245c
  28. Fixed an initialization issue

    Removed the close() function in the ReaderContext since spanner is not defined as a private member of the class.
    halio-g authored and odeke-em committed Sep 23, 2023
    cb80f6c
  29. ea7fdc8
  30. ffcdcaa
  31. 6b7dd68
  32. c1f6e82
  33. 81a75de
  34. Use com.google.cloud.NoCredentials with SpannerEmulator+hermetic tests

    Allows tests to run with the Cloud Spanner emulator without
    having to have GCP credentials present.
    odeke-em committed Sep 23, 2023
    fd537f3
  35. 9417a17
  36. Try autoConfigEmulator=true;usePlainText with the right conditions

    This allows the Cloud Spanner Emulator to be used in tests.
    odeke-em committed Sep 23, 2023
    2dcdfe5
  37. d024244
  38. Added Dockerfile to download gcloud sdk.

    halio-g authored and odeke-em committed Sep 23, 2023
    04d5b3d
  39. Fixed the cloudbuild.yaml error "cannot unmarshal string into Go value of type map[string]json.RawMessage"
    halio-g authored and odeke-em committed Sep 23, 2023
    2d1a57b
  40. Added init step in the cloudbuild.yaml.

    halio-g authored and odeke-em committed Sep 23, 2023
    ac734b6
  41. Added the presubmit.sh

    halio-g authored and odeke-em committed Sep 23, 2023
    00ec28f
  42. Remove the not found files.

    halio-g authored and odeke-em committed Sep 23, 2023
    c5e890b
  43. e744e29
  44. Fixed the issue that SpannerScanBuilderTest.testReadSchema:75 expected:<6> but was:<0>
    halio-g authored and odeke-em committed Sep 23, 2023
    aafba1f
  45. 777bf67
  46. 841441e
  47. 753bbde
  48. 872dae0
  49. dc62d35
  50. Added -y when installing the emulator

    halio-g authored and odeke-em committed Sep 23, 2023
    9aeff03
  51. 6388369
  52. c713512
  53. Set up the env for the integration test.

    halio-g authored and odeke-em committed Sep 23, 2023
    9e0837b
  54. 9c4bc20
  55. 1dd99ef
  56. 806b9ee
  57. 1e41554
  58. a25bbd1
  59. bb540be
  60. 4b2233f
  61. fdb031f
  62. 7104076
  63. d59a56a
  64. ff8cc8c
  65. Remove unnecessary spark.Batch from SpannerScanBuilder

    Keep that functionality in SpannerScanner. While here also
    add filters to SpannerScanner to allow filtration.
    odeke-em committed Sep 23, 2023
    8241919
  66. SpannerScanBuilder: implement SupportsPushDownRequiredColumns

    This change implements a column selector to reduce the amount
    of data returned; the results were confirmed by a live query:
    
    ```python
    df.printSchema()
    # Parentheses let the method chain span several lines.
    (
        df.select("created_at", "value", "base_cur")
        .filter((df["value"] > 3720) & (df["base_cur"] == "USD"))
        .show()
    )
    ```
    
    which produced
    
    ```shell
    root
     |-- id: string (nullable = false)
     |-- base_cur: string (nullable = false)
     |-- end_cur: string (nullable = false)
     |-- value: double (nullable = false)
     |-- data_src: string (nullable = false)
     |-- created_at: timestamp (nullable = false)
     |-- published_at: timestamp (nullable = true)
    
    +--------------------+-----------+--------+
    |          created_at|      value|base_cur|
    +--------------------+-----------+--------+
    |2023-08-30 20:56:...|3724.365901|     USD|
    |2023-09-09 04:40:...|3738.384066|     USD|
    |2023-09-07 05:44:...|3735.182945|     USD|
    |2023-08-25 17:00:...|3730.606064|     USD|
    |2023-09-15 10:08:...|3724.849835|     USD|
    |2023-08-21 18:24:...|3727.353643|     USD|
    |2023-08-20 07:52:...|3724.922281|     USD|
    |2023-08-23 23:00:...|3738.342818|     USD|
    |2023-08-14 10:00:...|3749.635375|     USD|
    |2023-09-01 05:48:...|3730.472718|     USD|
    |2023-09-15 21:40:...|3730.465981|     USD|
    |2023-09-11 16:40:...|3723.400239|     USD|
    |2023-09-13 13:16:...|3723.537745|     USD|
    |2023-08-19 14:00:...|3724.922281|     USD|
    |2023-09-06 02:56:...|3735.927727|     USD|
    |2023-09-11 13:04:...|3723.400239|     USD|
    |2023-09-14 17:56:...|3724.849835|     USD|
    |2023-09-07 20:44:...|3739.570874|     USD|
    |2023-08-22 07:00:...|3729.171921|     USD|
    |2023-08-22 08:20:...|3729.171921|     USD|
    +--------------------+-----------+--------+
    ```
    odeke-em committed Sep 23, 2023
    ae94dcc
  67. SpannerUtils: add UserAgent when creating batchClient

    Helps identify the connector to Google Cloud so that
    usage metrics, health checks, quota updates, and optimizations
    can easily be added later.
    odeke-em committed Sep 23, 2023
    8fe090f
  68. README: improve example

    odeke-em committed Sep 23, 2023
    06ca6f8
  69. SpannerScanner: add option to disableDataboost

    Allows Databoost to be disabled; it is on by default, since
    Databoost is the point of this connector. For compatibility,
    though, there is a case for leaving it off by default so that
    users who haven't enabled Databoost can still use the
    connector; that is left for a later discussion.
    
    Fixes #68
    odeke-em committed Sep 23, 2023
    cd66665
  70. README: fix markdown

    odeke-em committed Sep 23, 2023
    dd19c67
  71. 6c241a3
  72. 10ec27c
  73. test: introduce SparkFilterUtilsTest and retrofit for that code

    While here also fixed up the package path to be fully:
    
        package com.google.cloud.spark.spanner;
    
    instead of erroneously:
    
        package com.google.cloud.spark;
    odeke-em committed Sep 23, 2023
    5e3fcba
  74. make enableDataboost=true explicit and not an inversion with disableDataboost

    Hao reasoned that enabling Databoost should be opt-in because
    it is expensive for customers, which is a good reason to
    require the customer to specify it explicitly.
    odeke-em committed Sep 23, 2023
    0a95456
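
The opt-in behaviour argued for here can be sketched as follows. Only the option name `enableDataboost` comes from the commit message; the helper name `databoost_enabled`, the string-valued options map, and the parsing rules are assumptions, not the connector's actual (Java) implementation.

```python
from typing import Mapping


def databoost_enabled(options: Mapping[str, str]) -> bool:
    """Return True only when the caller explicitly set enableDataboost=true."""
    # A missing option falls back to "false", so Databoost stays off by default.
    raw = options.get("enableDataboost", "false")
    return raw.strip().lower() == "true"
```

Under this sketch an empty options map leaves Databoost off, and only an explicit `{"enableDataboost": "true"}` turns it on, so the potentially expensive feature is never enabled by accident.
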
  75. SpannerScanner: correctly close BatchClient in .planInputPartitions

    This fixes a long-standing shutdown failure due to unclosed
    Spanner objects.
    odeke-em committed Sep 23, 2023
    a943057
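
The general pattern behind this fix, closing a client once partition planning is done even if planning fails, might look like the sketch below. `FakeBatchClient` and `plan_partitions` are hypothetical stand-ins written for this example; they do not reproduce the Spanner client API or the connector's `planInputPartitions`.

```python
from contextlib import closing
from typing import List


class FakeBatchClient:
    """Hypothetical stand-in for a client that must be closed after use."""

    def __init__(self) -> None:
        self.closed = False

    def partition(self, sql: str, count: int) -> List[str]:
        if count < 0:
            raise ValueError("count must be non-negative")
        return [f"{sql} /* partition {i} */" for i in range(count)]

    def close(self) -> None:
        self.closed = True


def plan_partitions(client: FakeBatchClient, sql: str, count: int) -> List[str]:
    """Plan partitions and always close the client, even if planning raises."""
    with closing(client):
        return client.partition(sql, count)
```

`contextlib.closing` calls `close()` when the `with` block exits, whether it exits normally or through an exception, which is what prevents leaked objects from blocking shutdown.
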
  76. e5758d2
  77. cb58880
  78. 47a95bd
  79. ec25270
  80. 493eea2