Develop #13

Merged: 68 commits merged on May 13, 2024

g-lorena
Owner

No description provided.

Tedson2019 and others added 30 commits April 23, 2024 19:41
…nciate the differents modules, added new line to gitignore
…n and venv_layer folder after creating the layer.zip
…nciate the differents modules, added new line to gitignore
…n and venv_layer folder after creating the layer.zip
g-lorena and others added 28 commits April 29, 2024 14:59
change source to . to resolve issue of activation the ven due to the …
added information into the README.md

Terraform Format and Style 🖌 success

Terraform Initialization ⚙️ success

Terraform Validation 🤖 success

Validation Output

Success! The configuration is valid.


Terraform Plan 📖 success

Show Plan

module.lambdaFunction.data.archive_file.lambda: Reading...
module.lambdaFunction.data.archive_file.lambda: Read complete after 0s [id=ebe3f57c2a0a271455e440e357787a8a28d3b47b]
module.glueIamRole.data.aws_iam_policy_document.glue_assume_role: Reading...
module.lambdaFunction.data.aws_iam_policy_document.lambda_policy: Reading...
module.glueIamRole.data.aws_iam_policy_document.glue_policy_document: Reading...
module.lambdaFunction.data.aws_iam_policy_document.lambda_assume_role: Reading...
module.lambdaFunction.data.aws_iam_policy_document.lambda_assume_role: Read complete after 0s [id=2690255455]
module.glueIamRole.data.aws_iam_policy_document.glue_policy_document: Read complete after 0s [id=3375310060]
module.glueIamRole.data.aws_iam_policy_document.glue_assume_role: Read complete after 0s [id=2681768870]
module.lambdaFunction.data.aws_iam_policy_document.lambda_policy: Read complete after 0s [id=3620005588]

Terraform used the selected providers to generate the following execution
plan. Resource actions are indicated with the following symbols:
  + create

Terraform will perform the following actions:

  # module.cloudwatch_schedule_module.aws_cloudwatch_event_rule.schedule will be created
  + resource "aws_cloudwatch_event_rule" "schedule" {
      + arn                 = (known after apply)
      + description         = "Schedule for Lambda Function"
      + event_bus_name      = "default"
      + force_destroy       = false
      + id                  = (known after apply)
      + name                = "schedule"
      + name_prefix         = (known after apply)
      + schedule_expression = "cron(0 8 ? * MON-FRI *)"
      + tags_all            = (known after apply)
    }

  # module.cloudwatch_schedule_module.aws_cloudwatch_event_target.schedule_lambda will be created
  + resource "aws_cloudwatch_event_target" "schedule_lambda" {
      + arn            = (known after apply)
      + event_bus_name = "default"
      + force_destroy  = false
      + id             = (known after apply)
      + rule           = "schedule"
      + target_id      = "processing_lambda"
    }

  # module.cloudwatch_schedule_module.aws_lambda_permission.allow_events_bridge_to_run_lambda will be created
  + resource "aws_lambda_permission" "allow_events_bridge_to_run_lambda" {
      + action              = "lambda:InvokeFunction"
      + function_name       = "lambda_extract_fromAPI"
      + id                  = (known after apply)
      + principal           = "events.amazonaws.com"
      + statement_id        = "AllowExecutionFromCloudWatch"
      + statement_id_prefix = (known after apply)
    }

  # module.glueCatalogDatabase.aws_glue_catalog_database.aws_glue_catalog_database will be created
  + resource "aws_glue_catalog_database" "aws_glue_catalog_database" {
      + arn          = (known after apply)
      + catalog_id   = (known after apply)
      + id           = (known after apply)
      + location_uri = (known after apply)
      + name         = "real-estate-database"
      + tags_all     = (known after apply)
    }

  # module.glueClassifier.aws_glue_classifier.crawler_classifier will be created
  + resource "aws_glue_classifier" "crawler_classifier" {
      + id   = (known after apply)
      + name = "real_estate_classifier"

      + json_classifier {
          + json_path = "$[*]"
        }
    }

  # module.glueCrawler.aws_glue_crawler.houston_crawler will be created
  + resource "aws_glue_crawler" "houston_crawler" {
      + arn           = (known after apply)
      + classifiers   = (known after apply)
      + database_name = "real-estate-database"
      + id            = (known after apply)
      + name          = "real_estate_houston_crawler"
      + role          = (known after apply)
      + table_prefix  = "immo_"
      + tags_all      = (known after apply)

      + s3_target {
          + path = "real-estate-etl-101/raw_data/houston"
        }
    }

  # module.glueCrawler.aws_glue_crawler.panamera_crawler will be created
  + resource "aws_glue_crawler" "panamera_crawler" {
      + arn           = (known after apply)
      + classifiers   = (known after apply)
      + database_name = "real-estate-database"
      + id            = (known after apply)
      + name          = "real_estate_panamera_crawler"
      + role          = (known after apply)
      + table_prefix  = "immo_"
      + tags_all      = (known after apply)

      + s3_target {
          + path = "real-estate-etl-101/raw_data/pasadena"
        }
    }

  # module.glueIamRole.aws_iam_policy.glue_policy will be created
  + resource "aws_iam_policy" "glue_policy" {
      + arn              = (known after apply)
      + attachment_count = (known after apply)
      + description      = "allow lambda to get and list object into the bucket"
      + id               = (known after apply)
      + name             = "glue-policy"
      + name_prefix      = (known after apply)
      + path             = "/"
      + policy           = jsonencode(
            {
              + Statement = [
                  + {
                      + Action   = [
                          + "s3:PutObject",
                          + "s3:ListBucket",
                          + "s3:GetObject",
                          + "s3:GetBucketLocation",
                          + "s3:GetBucketAcl",
                          + "s3:DeleteObject",
                        ]
                      + Effect   = "Allow"
                      + Resource = "*"
                    },
                  + {
                      + Action   = "glue:*"
                      + Effect   = "Allow"
                      + Resource = "*"
                    },
                  + {
                      + Action   = [
                          + "logs:PutLogEvents",
                          + "logs:CreateLogStream",
                          + "logs:CreateLogGroup",
                        ]
                      + Effect   = "Allow"
                      + Resource = "arn:aws:logs:*:*:*:/aws-glue/*"
                    },
                ]
              + Version   = "2012-10-17"
            }
        )
      + policy_id        = (known after apply)
      + tags_all         = (known after apply)
    }

  # module.glueIamRole.aws_iam_role.iam_for_glue will be created
  + resource "aws_iam_role" "iam_for_glue" {
      + arn                   = (known after apply)
      + assume_role_policy    = jsonencode(
            {
              + Statement = [
                  + {
                      + Action    = "sts:AssumeRole"
                      + Effect    = "Allow"
                      + Principal = {
                          + Service = "glue.amazonaws.com"
                        }
                    },
                ]
              + Version   = "2012-10-17"
            }
        )
      + create_date           = (known after apply)
      + force_detach_policies = false
      + id                    = (known after apply)
      + managed_policy_arns   = (known after apply)
      + max_session_duration  = 3600
      + name                  = "iam_for_glue"
      + name_prefix           = (known after apply)
      + path                  = "/"
      + tags_all              = (known after apply)
      + unique_id             = (known after apply)
    }

  # module.glueIamRole.aws_iam_role_policy_attachment.attach_getObject will be created
  + resource "aws_iam_role_policy_attachment" "attach_getObject" {
      + id         = (known after apply)
      + policy_arn = (known after apply)
      + role       = "iam_for_glue"
    }

  # module.glueJob.aws_glue_job.immo-glue-job will be created
  + resource "aws_glue_job" "immo-glue-job" {
      + arn               = (known after apply)
      + default_arguments = {
          + "--class"                   = "GlueApp"
          + "--conf"                    = "spark.sql.extensions=org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions  --conf spark.sql.catalog.glue_catalog=org.apache.iceberg.spark.SparkCatalog  --conf spark.sql.catalog.glue_catalog.warehouse=s3://tnt-erp-sql/ --conf spark.sql.catalog.glue_catalog.catalog-impl=org.apache.iceberg.aws.glue.GlueCatalog  --conf spark.sql.catalog.glue_catalog.io-impl=org.apache.iceberg.aws.s3.S3FileIO"
          + "--datalake-formats"        = "iceberg"
          + "--enable-auto-scaling"     = "false"
          + "--enable-glue-datacatalog" = "true"
          + "--enable-job-insights"     = "true"
          + "--job-bookmark-option"     = "job-bookmark-disable"
          + "--job-language"            = "python"
        }
      + glue_version      = "4.0"
      + id                = (known after apply)
      + max_capacity      = (known after apply)
      + name              = "real_estate_job"
      + number_of_workers = (known after apply)
      + role_arn          = (known after apply)
      + tags_all          = (known after apply)
      + timeout           = 2880
      + worker_type       = (known after apply)

      + command {
          + name            = "glueetl"
          + python_version  = (known after apply)
          + runtime         = (known after apply)
          + script_location = "s3://real-estate-etl-utils/script/glue_etl_script.py"
        }
    }

  # module.glueTrigger.aws_glue_trigger.gluejob-trigger will be created
  + resource "aws_glue_trigger" "gluejob-trigger" {
      + arn      = (known after apply)
      + enabled  = true
      + id       = (known after apply)
      + name     = "realestate-glue-job-trigger"
      + schedule = "cron(0 8 ? * MON-FRI *)"
      + state    = (known after apply)
      + tags_all = (known after apply)
      + type     = "SCHEDULED"

      + actions {
          + job_name = "real_estate_job"
        }
    }

  # module.lambdaFunction.aws_iam_policy.lambda_policy will be created
  + resource "aws_iam_policy" "lambda_policy" {
      + arn              = (known after apply)
      + attachment_count = (known after apply)
      + description      = "allow lambda to get and list object into the bucket"
      + id               = (known after apply)
      + name             = "lambda-policy"
      + name_prefix      = (known after apply)
      + path             = "/"
      + policy           = jsonencode(
            {
              + Statement = [
                  + {
                      + Action   = [
                          + "s3:PutObject",
                          + "s3:ListBucket",
                          + "s3:GetObject",
                        ]
                      + Effect   = "Allow"
                      + Resource = "*"
                    },
                ]
              + Version   = "2012-10-17"
            }
        )
      + policy_id        = (known after apply)
      + tags_all         = (known after apply)
    }

  # module.lambdaFunction.aws_iam_role.iam_for_lambda will be created
  + resource "aws_iam_role" "iam_for_lambda" {
      + arn                   = (known after apply)
      + assume_role_policy    = jsonencode(
            {
              + Statement = [
                  + {
                      + Action    = "sts:AssumeRole"
                      + Effect    = "Allow"
                      + Principal = {
                          + Service = "lambda.amazonaws.com"
                        }
                    },
                ]
              + Version   = "2012-10-17"
            }
        )
      + create_date           = (known after apply)
      + force_detach_policies = false
      + id                    = (known after apply)
      + managed_policy_arns   = (known after apply)
      + max_session_duration  = 3600
      + name                  = "iam_for_lambda"
      + name_prefix           = (known after apply)
      + path                  = "/"
      + tags_all              = (known after apply)
      + unique_id             = (known after apply)
    }

  # module.lambdaFunction.aws_iam_role_policy_attachment.attach_getObject will be created
  + resource "aws_iam_role_policy_attachment" "attach_getObject" {
      + id         = (known after apply)
      + policy_arn = (known after apply)
      + role       = "iam_for_lambda"
    }

  # module.lambdaFunction.aws_lambda_function.lambda will be created
  + resource "aws_lambda_function" "lambda" {
      + architectures                  = (known after apply)
      + arn                            = (known after apply)
      + filename                       = "lambda_function_extract_data.zip"
      + function_name                  = "lambda_extract_fromAPI"
      + handler                        = "extract_data.lambda_handler"
      + id                             = (known after apply)
      + invoke_arn                     = (known after apply)
      + last_modified                  = (known after apply)
      + layers                         = (known after apply)
      + memory_size                    = 512
      + package_type                   = "Zip"
      + publish                        = false
      + qualified_arn                  = (known after apply)
      + qualified_invoke_arn           = (known after apply)
      + reserved_concurrent_executions = -1
      + role                           = (known after apply)
      + runtime                        = "python3.10"
      + signing_job_arn                = (known after apply)
      + signing_profile_version_arn    = (known after apply)
      + skip_destroy                   = false
      + source_code_hash               = "ZcPY9r55Z7zsZFteLvThWeWrwCdvYtmeAIgxqExQvEE="
      + source_code_size               = (known after apply)
      + tags_all                       = (known after apply)
      + timeout                        = 300
      + version                        = (known after apply)

      + environment {
          + variables = {
              + "API_HOST"   = "zillow56.p.rapidapi.com"
              + "API_KEY"    = "XXXX"
              + "DST_BUCKET" = "real-estate-etl-101"
              + "RAW_FOLDER" = "raw_data"
              + "REGION"     = "eu-west-3"
            }
        }
    }

  # module.lambdaFunction.aws_lambda_permission.s3 will be created
  + resource "aws_lambda_permission" "s3" {
      + action              = "lambda:InvokeFunction"
      + function_name       = (known after apply)
      + id                  = (known after apply)
      + principal           = "s3.amazonaws.com"
      + source_arn          = (known after apply)
      + statement_id        = "AllowExecutionFromS3Bucket"
      + statement_id_prefix = (known after apply)
    }

  # module.lambdaLayer.aws_lambda_layer_version.requests_layer will be created
  + resource "aws_lambda_layer_version" "requests_layer" {
      + arn                         = (known after apply)
      + compatible_runtimes         = [
          + "python3.10",
        ]
      + created_date                = (known after apply)
      + id                          = (known after apply)
      + layer_arn                   = (known after apply)
      + layer_name                  = "my_lambda_requirements_layer"
      + s3_bucket                   = (known after apply)
      + s3_key                      = "lambda_layer/my_lambda_requirements_layer/python.zip"
      + signing_job_arn             = (known after apply)
      + signing_profile_version_arn = (known after apply)
      + skip_destroy                = false
      + source_code_hash            = (known after apply)
      + source_code_size            = (known after apply)
      + version                     = (known after apply)
    }

  # module.lambdaLayer.aws_s3_bucket.lambda_layer_bucket will be created
  + resource "aws_s3_bucket" "lambda_layer_bucket" {
      + acceleration_status         = (known after apply)
      + acl                         = (known after apply)
      + arn                         = (known after apply)
      + bucket                      = "my-lambda-layer-bucket-001"
      + bucket_domain_name          = (known after apply)
      + bucket_prefix               = (known after apply)
      + bucket_regional_domain_name = (known after apply)
      + force_destroy               = false
      + hosted_zone_id              = (known after apply)
      + id                          = (known after apply)
      + object_lock_enabled         = (known after apply)
      + policy                      = (known after apply)
      + region                      = (known after apply)
      + request_payer               = (known after apply)
      + tags_all                    = (known after apply)
      + website_domain              = (known after apply)
      + website_endpoint            = (known after apply)
    }

  # module.lambdaLayer.aws_s3_object.lambda_layer_zip will be created
  + resource "aws_s3_object" "lambda_layer_zip" {
      + acl                    = (known after apply)
      + arn                    = (known after apply)
      + bucket                 = (known after apply)
      + bucket_key_enabled     = (known after apply)
      + checksum_crc32         = (known after apply)
      + checksum_crc32c        = (known after apply)
      + checksum_sha1          = (known after apply)
      + checksum_sha256        = (known after apply)
      + content_type           = (known after apply)
      + etag                   = (known after apply)
      + force_destroy          = false
      + id                     = (known after apply)
      + key                    = "lambda_layer/my_lambda_requirements_layer/python.zip"
      + kms_key_id             = (known after apply)
      + server_side_encryption = (known after apply)
      + source                 = "python.zip"
      + storage_class          = (known after apply)
      + tags_all               = (known after apply)
      + version_id             = (known after apply)
    }

  # module.lambdaLayer.null_resource.lambda_layer will be created
  + resource "null_resource" "lambda_layer" {
      + id       = (known after apply)
      + triggers = {
          + "requirements" = "d720f37aa9be04689ae0463ff48a2a0d2bca11bb"
        }
    }

  # module.s3bucket.aws_s3_bucket.etl_bucket will be created
  + resource "aws_s3_bucket" "etl_bucket" {
      + acceleration_status         = (known after apply)
      + acl                         = (known after apply)
      + arn                         = (known after apply)
      + bucket                      = "real-estate-etl-101"
      + bucket_domain_name          = (known after apply)
      + bucket_prefix               = (known after apply)
      + bucket_regional_domain_name = (known after apply)
      + force_destroy               = true
      + hosted_zone_id              = (known after apply)
      + id                          = (known after apply)
      + object_lock_enabled         = (known after apply)
      + policy                      = (known after apply)
      + region                      = (known after apply)
      + request_payer               = (known after apply)
      + tags_all                    = (known after apply)
      + website_domain              = (known after apply)
      + website_endpoint            = (known after apply)
    }

  # module.s3bucket.aws_s3_bucket.utils_bucket will be created
  + resource "aws_s3_bucket" "utils_bucket" {
      + acceleration_status         = (known after apply)
      + acl                         = (known after apply)
      + arn                         = (known after apply)
      + bucket                      = "real-estate-etl-utils"
      + bucket_domain_name          = (known after apply)
      + bucket_prefix               = (known after apply)
      + bucket_regional_domain_name = (known after apply)
      + force_destroy               = true
      + hosted_zone_id              = (known after apply)
      + id                          = (known after apply)
      + object_lock_enabled         = (known after apply)
      + policy                      = (known after apply)
      + region                      = (known after apply)
      + request_payer               = (known after apply)
      + tags_all                    = (known after apply)
      + website_domain              = (known after apply)
      + website_endpoint            = (known after apply)
    }

  # module.s3bucket.aws_s3_object.glue_script will be created
  + resource "aws_s3_object" "glue_script" {
      + acl                    = (known after apply)
      + arn                    = (known after apply)
      + bucket                 = (known after apply)
      + bucket_key_enabled     = (known after apply)
      + checksum_crc32         = (known after apply)
      + checksum_crc32c        = (known after apply)
      + checksum_sha1          = (known after apply)
      + checksum_sha256        = (known after apply)
      + content_type           = (known after apply)
      + etag                   = "8509d29574902bdfde0d573600a32368"
      + force_destroy          = false
      + id                     = (known after apply)
      + key                    = "script/glue_etl_script.py"
      + kms_key_id             = (known after apply)
      + server_side_encryption = (known after apply)
      + source                 = "../etl/glue_etl_job/transform_data.py"
      + storage_class          = (known after apply)
      + tags_all               = (known after apply)
      + version_id             = (known after apply)
    }

  # module.s3bucket.aws_s3_object.raw_zone will be created
  + resource "aws_s3_object" "raw_zone" {
      + acl                    = "private"
      + arn                    = (known after apply)
      + bucket                 = (known after apply)
      + bucket_key_enabled     = (known after apply)
      + checksum_crc32         = (known after apply)
      + checksum_crc32c        = (known after apply)
      + checksum_sha1          = (known after apply)
      + checksum_sha256        = (known after apply)
      + content_type           = "application/x-directory"
      + etag                   = (known after apply)
      + force_destroy          = false
      + id                     = (known after apply)
      + key                    = "raw_data/"
      + kms_key_id             = (known after apply)
      + server_side_encryption = (known after apply)
      + storage_class          = (known after apply)
      + tags_all               = (known after apply)
      + version_id             = (known after apply)
    }

  # module.s3bucket.aws_s3_object.std_zone will be created
  + resource "aws_s3_object" "std_zone" {
      + acl                    = "private"
      + arn                    = (known after apply)
      + bucket                 = (known after apply)
      + bucket_key_enabled     = (known after apply)
      + checksum_crc32         = (known after apply)
      + checksum_crc32c        = (known after apply)
      + checksum_sha1          = (known after apply)
      + checksum_sha256        = (known after apply)
      + content_type           = "application/x-directory"
      + etag                   = (known after apply)
      + force_destroy          = false
      + id                     = (known after apply)
      + key                    = "std_data/"
      + kms_key_id             = (known after apply)
      + server_side_encryption = (known after apply)
      + storage_class          = (known after apply)
      + tags_all               = (known after apply)
      + version_id             = (known after apply)
    }

Plan: 26 to add, 0 to change, 0 to destroy.

─────────────────────────────────────────────────────────────────────────────

Note: You didn't use the -out option to save this plan, so Terraform can't
guarantee to take exactly these actions if you run "terraform apply" now.
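
The note above means that a later `terraform apply` computes a fresh plan, which may differ from the one reviewed here. A minimal sketch of one way to apply exactly the reviewed plan, assuming a plan file named `tfplan` and a hypothetical helper `plan_and_apply` (neither is part of this repository's workflow):

```shell
# Sketch only: save the plan to a file, then apply that saved plan.
# "tfplan" is an assumed default file name, not one used in this repository.
plan_and_apply() {
  plan_file="${1:-tfplan}"
  # -out writes the computed plan to disk; apply then runs that exact plan
  # instead of computing a new one.
  terraform plan -out="$plan_file" && terraform apply "$plan_file"
}
```

Calling `plan_and_apply` from the Terraform working directory would run both steps. Applying a saved plan file does not ask for interactive approval, so the plan should be reviewed between the two commands if they are run separately.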

Pushed by: @g-lorena, Action: pull_request

@g-lorena g-lorena merged commit 78b7fd8 into main May 13, 2024
1 check passed
2 participants