- The server was only able to partially fulfill your request
- Template file failed to load
- The referenced network resource cannot be found
- Unable to open the Dataflow staging file
- No matching distribution found for apache-beam==2.30.0
This error message is shown on the GCP Console when trying to view resources protected by the VPC-SC perimeters, such as Dataflow Jobs.
Error message:
Sorry, the server was only able to partially fulfill your request. Some data might not be rendered.
Cause:
You are not in the list of members of the access level associated with the perimeter.
Solution:
You need to be added to the input `perimeter_additional_members` of the Secured Data Warehouse Module. Members of this list are added to the access level.
See the inputs section in the README for more details.
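For example, a minimal Terraform sketch of the module call (the module source and member values are placeholders, not the exact configuration of your deployment):

```hcl
module "secured_data_warehouse" {
  source = "<PATH-TO-SECURED-DATA-WAREHOUSE-MODULE>" # placeholder

  # ... other required inputs of the module ...

  # Members listed here are added to the access level associated with the
  # VPC-SC perimeters, so they can view protected resources in the Console.
  perimeter_additional_members = [
    "user:<CONSOLE-USER-EMAIL>",
    "serviceAccount:<SERVICE-ACCOUNT-EMAIL>",
  ]
}
```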
This error message is shown on the GCP Console when you are creating a new Dataflow Job.
Error message:
The metadata file for this template could not be parsed.
VIEW DETAILS
In VIEW DETAILS:
Fail to process as Flex Template and Legacy Template. Flex Template Process result:(390ac373ef6bcb87):
Template file failed to load: gs://<BUCKET-NAME>/flex-template-samples/regional-python-dlp-flex.json.
Permissions denied. Request is prohibited by organization's policy. vpcServiceControlsUniqueIdentifier: <UNIQUE-IDENTIFIER>,
Legacy Template Process result:(390ac373ef6bc2a5): Template file failed to load: gs://<BUCKET-NAME>/flex-template-samples/regional-python-dlp-flex.json.
Permissions denied. Request is prohibited by organization's policy. vpcServiceControlsUniqueIdentifier: <UNIQUE-IDENTIFIER>
Cause:
The private Dataflow job template being used is outside of the VPC-SC perimeter. Use the VPC Service Controls troubleshooting page, with the vpcServiceControlsUniqueIdentifier from the error message, to debug the details of the violation.
Solution:
The identity deploying the Dataflow jobs must be added to the correct list indicated below. The list configures egress rules that allow the Dataflow templates to be fetched.
- For the confidential perimeter, the identity needs to be added to the input `confidential_data_dataflow_deployer_identities` of the Secured Data Warehouse Module.
- For the data ingestion perimeter, the identity needs to be added to the input `data_ingestion_dataflow_deployer_identities` of the Secured Data Warehouse Module.
See the inputs section in the README for more details.
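For example, a minimal Terraform sketch of the module call (the module source and identity values are placeholders):

```hcl
module "secured_data_warehouse" {
  source = "<PATH-TO-SECURED-DATA-WAREHOUSE-MODULE>" # placeholder

  # ... other required inputs of the module ...

  # Identities allowed to deploy Dataflow jobs in the data ingestion perimeter.
  data_ingestion_dataflow_deployer_identities = [
    "serviceAccount:<DEPLOYER-SERVICE-ACCOUNT-EMAIL>",
  ]

  # Identities allowed to deploy Dataflow jobs in the confidential perimeter.
  confidential_data_dataflow_deployer_identities = [
    "serviceAccount:<DEPLOYER-SERVICE-ACCOUNT-EMAIL>",
  ]
}
```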
This error message is shown on the job details page of the deployed Dataflow job, in the Job Logs section.
Error message:
Failed to start the VM, launcher-2021120604300713065380799072320283, used for launching because of status code: INVALID_ARGUMENT, reason:
Error: Message: Invalid value for field 'resource.networkInterfaces[0].network': 'global/networks/default'.
The referenced network resource cannot be found. HTTP Code: 400.
or
Failed to start the VM, launcher-2021120906191814752290113584255576, used for launching because of
status code: INVALID_ARGUMENT, reason:
Error: Message: Invalid value for field 'resource.networkInterfaces[0]': '{ "network": "global/networks/default", "accessConfig": [{ "type": "ONE_TO_ONE_NAT", "name":...'.
Subnetwork should be specified for custom subnetmode network HTTP Code: 400.
Cause:
If you do not specify a network or subnetwork in the job parameters, Dataflow will use the default VPC network to deploy the Job. If the default network does not exist, you will get the 400 error.
Solution:
A valid VPC subnetwork must be declared as a job parameter in the creation of the Dataflow Job, as you can see in the regional-dlp example that uses the dataflow-flex-job module.
- GCP Console: Use the optional parameter Subnetwork.
- Gcloud CLI: Use the optional flag `--subnetwork`.
- Terraform: Use the input `subnetwork_self_link` from the Dataflow Flex Job Module, as in the sketch below.
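A minimal Terraform sketch of the last option (the module source and resource names are placeholders):

```hcl
module "dataflow_flex_job" {
  source = "<PATH-TO-DATAFLOW-FLEX-JOB-MODULE>" # placeholder

  # ... other required inputs of the Dataflow Flex Job Module ...

  # Self link of a valid VPC subnetwork. Without it, Dataflow falls back to the
  # default network, which does not exist in this deployment.
  subnetwork_self_link = "https://www.googleapis.com/compute/v1/projects/<NETWORK-PROJECT-ID>/regions/<REGION>/subnetworks/<SUBNETWORK-NAME>"
}
```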
After deploying a new Dataflow job in the console, the job creation fails. Looking at the Job Logs section, in the bottom part of the job detail page, there is an error with the message:
Error message:
Failed to read the result file : gs://<BUCKET-NAME>/staging/template_launches/2021-12-06_04_37_18-105494327517795773/
operation_result with error message: (59b58cff2e1b7caf): Unable to open template file:
gs://<BUCKET-NAME>/staging/template_launches/2021-12-06_04_37_18-105494327517795773/operation_result..
Cause:
If you do not specify the appropriate Service Account created by the main module as the Dataflow Worker Service Account in the job parameters, the Dataflow job will use the Compute Engine default service account as the Dataflow Worker Service Account.
Solution:
You must use the appropriate Service Account created by the main module.
- Data ingestion:
  - Module output: `dataflow_controller_service_account_email`
  - Email format: `sa-dataflow-controller@<DATA-INGESTION-PROJECT-ID>.iam.gserviceaccount.com`
- Confidential Data:
  - Module output: `confidential_dataflow_controller_service_account_email`
  - Email format: `sa-dataflow-controller-reid@<CONFIDENTIAL-DATA-PROJECT-ID>.iam.gserviceaccount.com`
The Service Account must be declared as a job parameter.
- GCP Console: Use the optional parameter Service account email.
- Gcloud CLI: Use the optional flag `--service-account-email`.
- Terraform: Use the input `service_account_email` from the Dataflow Flex Job Module, as in the sketch below.
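A minimal Terraform sketch of the last option, using the data ingestion controller service account (the module source is a placeholder):

```hcl
module "dataflow_flex_job" {
  source = "<PATH-TO-DATAFLOW-FLEX-JOB-MODULE>" # placeholder

  # ... other required inputs of the Dataflow Flex Job Module ...

  # Dataflow Worker Service Account created by the main module
  # (module output dataflow_controller_service_account_email).
  service_account_email = "sa-dataflow-controller@<DATA-INGESTION-PROJECT-ID>.iam.gserviceaccount.com"
}
```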
For more details about Dataflow staging files, see the Resource usage and management documentation.
After deploying a new Dataflow job in the console, the job creation fails. Looking at the Job Logs section, in the bottom part of the job detail page, there is an error with the message:
Error message:
https://LOCATION-python.pkg.dev/ARTIFACT-REGISTRY-PROJECT-ID/python-modules/simple/
ERROR: Could not find a version that satisfies the requirement apache-beam==2.30.0 (from versions: none)
ERROR: No matching distribution found for apache-beam==2.30.0
Cause:
The Dataflow Worker Service Account is trying to access Artifact Registry to download the Apache Beam module, but it does not have the right permissions to access the repository.
Solution:
You must grant the Artifact Registry Reader role (`roles/artifactregistry.reader`) to the Dataflow Worker Service Account on the Artifact Registry repository that hosts the Python modules.
You must use the appropriate Service Account created by the main module.
- Data ingestion:
  - Module output: `dataflow_controller_service_account_email`
  - Email format: `sa-dataflow-controller@<DATA-INGESTION-PROJECT-ID>.iam.gserviceaccount.com`
- Confidential Data:
  - Module output: `confidential_dataflow_controller_service_account_email`
  - Email format: `sa-dataflow-controller-reid@<CONFIDENTIAL-DATA-PROJECT-ID>.iam.gserviceaccount.com`
Using the gcloud command:

```sh
export project_id=<ARTIFACT-REGISTRY-PROJECT-ID>
export location=<ARTIFACT-REGISTRY-REPOSITORY-LOCATION>
export dataflow_worker_service_account=<DATAFLOW-WORKER-SERVICE-ACCOUNT>

gcloud artifacts repositories add-iam-policy-binding python-modules \
  --member="serviceAccount:${dataflow_worker_service_account}" \
  --role='roles/artifactregistry.reader' \
  --project=${project_id} \
  --location=${location}
```
Using Terraform:

```hcl
resource "google_artifact_registry_repository_iam_member" "python_reader" {
  provider   = google-beta
  project    = "ARTIFACT-REGISTRY-PROJECT-ID"
  location   = "ARTIFACT-REGISTRY-REPOSITORY-LOCATION"
  repository = "python-modules"
  role       = "roles/artifactregistry.reader"
  member     = "serviceAccount:sa-dataflow-controller-reid@CONFIDENTIAL-DATA-PROJECT-ID.iam.gserviceaccount.com"
}
```