diff --git a/guides/linked/Earth_Engine_REST_API_compute_image.ipynb b/guides/linked/Earth_Engine_REST_API_compute_image.ipynb new file mode 100644 index 000000000..0ae073618 --- /dev/null +++ b/guides/linked/Earth_Engine_REST_API_compute_image.ipynb @@ -0,0 +1 @@ +{"nbformat":4,"nbformat_minor":0,"metadata":{"colab":{"name":"Earth_Engine_REST_API_compute_image.ipynb","private_outputs":true,"provenance":[{"file_id":"https://github.com/google/earthengine-community/blob/master/guides/linked/Earth_Engine_REST_API_compute_image.ipynb","timestamp":1629810798805},{"file_id":"1cLECy_jcK8DxQVd1eR2CHJFf5RSgrwBo","timestamp":1605024879401},{"file_id":"https://github.com/google/earthengine-community/blob/master/guides/linked/Earth_Engine_REST_API_computation.ipynb","timestamp":1605024830392},{"file_id":"1roGESkJ-6YGl3Xod7WJ1q1DhofbDo_Jh","timestamp":1591133797094}],"collapsed_sections":[],"toc_visible":true,"authorship_tag":"ABX9TyMD8VjPQHA3Tl11t+PcAIM8"},"kernelspec":{"name":"python3","display_name":"Python 3"}},"cells":[{"cell_type":"code","metadata":{"id":"fSIfBsgi8dNK"},"source":["#@title Copyright 2021 Google LLC. { display-mode: \"form\" }\n","# Licensed under the Apache License, Version 2.0 (the \"License\");\n","# you may not use this file except in compliance with the License.\n","# You may obtain a copy of the License at\n","#\n","# https://www.apache.org/licenses/LICENSE-2.0\n","#\n","# Unless required by applicable law or agreed to in writing, software\n","# distributed under the License is distributed on an \"AS IS\" BASIS,\n","# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n","# See the License for the specific language governing permissions and\n","# limitations under the License."],"execution_count":null,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"aV1xZ1CPi3Nw"},"source":["
\n","\n"," Run in Google Colab\n","\n"," View source on GitHub
"]},{"cell_type":"markdown","metadata":{"id":"CrEM35gqHouU"},"source":["# Image computations with the Earth Engine REST API\n","\n","***Note:*** *The REST API contains new and advanced features that may not be suitable for all users. If you are new to Earth Engine, please get started with the [JavaScript guide](https://developers.google.com/earth-engine/guides/getstarted).*\n","\n","The [Earth Engine REST API quickstart](https://developers.google.com/earth-engine/reference/Quickstart) shows how to access blocks of pixels from an Earth Engine asset. Suppose you want to apply a computation to the pixels before obtaining the result. This guide shows how to prototype a computation with one of the client libraries, serialize the computation graph and use the REST API to obtain the computed result. Making compute requests through the REST API corresponds to a `POST` request to one of the compute endpoints, for example [`computePixels`](https://developers.google.com/earth-engine/reference/rest/v1beta/projects.image/computePixels), [`computeFeatures`](https://developers.google.com/earth-engine/reference/rest/v1beta/projects.table/computeFeatures), or the generic [`value.compute`](https://developers.google.com/earth-engine/reference/rest/v1beta/projects.value/compute). Specifically, this example demonstrates getting a median composite of Sentinel-2 imagery in a small region."]},{"cell_type":"markdown","metadata":{"id":"H2VOD2agf4Cm"},"source":["## Before you begin\n","\n","Follow [these instructions](https://developers.google.com/earth-engine/cloud/earthengine_cloud_project_setup) to:\n","\n","1. Apply for Earth Engine\n","2. Create a Google Cloud project\n","3. Enable the Earth Engine API on the project\n","4. Create a service account\n","5. Give the service account project level permission to perform Earth Engine computations\n","\n","**Note**: To complete this tutorial, you will need a service account that is registered for Earth Engine access. See [these instructions](https://developers.google.com/earth-engine/guides/service_account#register-the-service-account-to-use-earth-engine) to register a service account before proceeding."]},{"cell_type":"markdown","metadata":{"id":"OfMAA6YhPuFl"},"source":["## Authenticate to Google Cloud\n","\n","The first thing to do is login so that you can make authenticated requests to Google Cloud. You will set the project at the same time. Follow the instructions in the output to complete the sign in."]},{"cell_type":"code","metadata":{"id":"FRm2HczTIlKe"},"source":["# INSERT YOUR PROJECT HERE\n","PROJECT = 'your-project'\n","\n","!gcloud auth login --project {PROJECT}"],"execution_count":null,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"hnufOtSfP0jX"},"source":["## Obtain a private key file for your service account\n","\n","You should already have a service account registered to use Earth Engine. If you don't, follow [these instructions](https://developers.google.com/earth-engine/guides/service_account#create-a-service-account) to get one. Copy the email address of your service account into the following cell. (The service account must already be registered to use Earth Engine). In the following cell, the `gsutil` command line is used to generate a key file for the service account. 
The key file will be created on the notebook VM."]},{"cell_type":"code","metadata":{"id":"tLxOnKL2Nk5p"},"source":["# INSERT YOUR SERVICE ACCOUNT HERE\n","SERVICE_ACCOUNT='your-service-account@your-project.iam.gserviceaccount.com'\n","KEY = 'key.json'\n","\n","!gcloud iam service-accounts keys create {KEY} --iam-account {SERVICE_ACCOUNT}"],"execution_count":null,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"6QksNfvaY5em"},"source":["## Start an `AuthorizedSession` and test your credentials\n","\n","Test the private key by using it to get credentials. Use the credentials to create an authorized session to make HTTP requests. Make a `GET` request through the session to check that the credentials work."]},{"cell_type":"code","metadata":{"id":"h2MHcyeqLufx"},"source":["from google.auth.transport.requests import AuthorizedSession\n","from google.oauth2 import service_account\n","\n","credentials = service_account.Credentials.from_service_account_file(KEY)\n","scoped_credentials = credentials.with_scopes(\n"," ['https://www.googleapis.com/auth/cloud-platform'])\n","\n","session = AuthorizedSession(scoped_credentials)\n","\n","url = 'https://earthengine.googleapis.com/v1beta/projects/earthengine-public/assets/LANDSAT'\n","\n","response = session.get(url)\n","\n","from pprint import pprint\n","import json\n","pprint(json.loads(response.content))"],"execution_count":null,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"U7ZzzoW_HgZ5"},"source":["## Serialize a computation\n","\n","Before you can send a request to compute something, the computation needs to be put into the Earth Engine expression graph format. The following demonstrates how to obtain the expression graph.\n","\n","### Authenticate to Earth Engine\n","\n","Get Earth Engine scoped credentials from the service account. Use them to initialize Earth Engine."]},{"cell_type":"code","metadata":{"id":"LdTW8sPQIsFx"},"source":["import ee\n","\n","# Get some new credentials since the other ones are cloud scope.\n","ee_creds = ee.ServiceAccountCredentials(SERVICE_ACCOUNT, KEY)\n","ee.Initialize(ee_creds)"],"execution_count":null,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"Zb243CYTP48x"},"source":["### Define a computation\n","\n","Prototype a simple computation with the client API. Note that the result of the computation is an `Image`."]},{"cell_type":"code","metadata":{"id":"S9fsJ4RtPr12"},"source":["coords = [\n"," -121.58626826832939, \n"," 38.059141484827485,\n","]\n","region = ee.Geometry.Point(coords)\n","\n","collection = ee.ImageCollection('COPERNICUS/S2')\n","collection = collection.filterBounds(region)\n","collection = collection.filterDate('2020-04-01', '2020-09-01')\n","image = collection.median()"],"execution_count":null,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"HLk-i3htLEBf"},"source":["### Serialize the expression graph\n","\n","This will create an object that represents the Earth Engine expression graph (specifically, an [`Expression`](https://developers.google.com/earth-engine/reference/rest/v1beta/Expression)). In general, you should build these with one of the client APIs."]},{"cell_type":"code","metadata":{"id":"Mvbi4LuhV9BR"},"source":["serialized = ee.serializer.encode(image)"],"execution_count":null,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"LufqOdLWKJ9l"},"source":["Create the desired projection (WGS84) at the desired scale (10 meters for Sentinel-2). This is just to discover the desired scale in degrees, the units of the projection. 
These scales will be used to specify the affine transform in the request."]},{"cell_type":"code","metadata":{"id":"ZJaOvVgUKKJK"},"source":["# Make a projection to discover the scale in degrees.\n","proj = ee.Projection('EPSG:4326').atScale(10).getInfo()\n","\n","# Get scales out of the transform.\n","scale_x = proj['transform'][0]\n","scale_y = -proj['transform'][4]"],"execution_count":null,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"Y_Lu1qq5GUTF"},"source":["## Send the request\n","\n","Make a `POST` request to the [`computePixels`](https://developers.google.com/earth-engine/reference/rest/v1beta/projects.image/computePixels) endpoint. Note that the request contains the [`Expression`](https://developers.google.com/earth-engine/reference/rest/v1beta/Expression), which is the serialized computation. It also contains a [`PixelGrid`](https://developers.google.com/earth-engine/reference/rest/v1beta/PixelGrid). The `PixelGrid` contains `dimensions` for the desired output and an `AffineTransform` in the units of the requested coordinate system. Here the coordinate system is geographic, so the transform is specified with scale in degrees and geographic coordinates of the upper left corner of the requested image patch."]},{"cell_type":"code","metadata":{"id":"_pbqvd48dT33"},"source":["import json\n","\n","url = 'https://earthengine.googleapis.com/v1beta/projects/{}/image:computePixels'\n","url = url.format(PROJECT)\n","\n","response = session.post(\n"," url=url,\n"," data=json.dumps({\n"," 'expression': serialized,\n"," 'fileFormat': 'PNG',\n"," 'bandIds': ['B4','B3','B2'],\n"," 'grid': {\n"," 'dimensions': {\n"," 'width': 640,\n"," 'height': 640\n"," },\n"," 'affineTransform': {\n"," 'scaleX': scale_x,\n"," 'shearX': 0,\n"," 'translateX': coords[0],\n"," 'shearY': 0,\n"," 'scaleY': scale_y,\n"," 'translateY': coords[1]\n"," },\n"," 'crsCode': 'EPSG:4326',\n"," },\n"," 'visualizationOptions': {'ranges': [{'min': 0, 'max': 3000}]},\n"," })\n",")\n","\n","image_content = response.content"],"execution_count":null,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"NZRt0HQU65Hy"},"source":["If you are running this in a notebook, you can display the results using the `IPython` image display widget."]},{"cell_type":"code","metadata":{"id":"4edL2ZLe7E2a"},"source":["# Import the Image function from the IPython.display module. \n","from IPython.display import Image\n","Image(image_content)"],"execution_count":null,"outputs":[]}]} \ No newline at end of file diff --git a/guides/linked/Earth_Engine_REST_API_compute_table.ipynb b/guides/linked/Earth_Engine_REST_API_compute_table.ipynb new file mode 100644 index 000000000..a9da71a75 --- /dev/null +++ b/guides/linked/Earth_Engine_REST_API_compute_table.ipynb @@ -0,0 +1 @@ +{"nbformat":4,"nbformat_minor":0,"metadata":{"colab":{"name":"Earth_Engine_REST_API_compute_table.ipynb","private_outputs":true,"provenance":[{"file_id":"https://github.com/google/earthengine-community/blob/master/guides/linked/Earth_Engine_REST_API_computation.ipynb","timestamp":1605024830392},{"file_id":"1roGESkJ-6YGl3Xod7WJ1q1DhofbDo_Jh","timestamp":1591133797094}],"collapsed_sections":[],"toc_visible":true,"authorship_tag":"ABX9TyN5qB96LKgraIuC+Vj2rORP"},"kernelspec":{"name":"python3","display_name":"Python 3"}},"cells":[{"cell_type":"code","metadata":{"id":"fSIfBsgi8dNK"},"source":["#@title Copyright 2020 Google LLC. 
{ display-mode: \"form\" }\n","# Licensed under the Apache License, Version 2.0 (the \"License\");\n","# you may not use this file except in compliance with the License.\n","# You may obtain a copy of the License at\n","#\n","# https://www.apache.org/licenses/LICENSE-2.0\n","#\n","# Unless required by applicable law or agreed to in writing, software\n","# distributed under the License is distributed on an \"AS IS\" BASIS,\n","# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n","# See the License for the specific language governing permissions and\n","# limitations under the License."],"execution_count":null,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"aV1xZ1CPi3Nw"},"source":["
\n","\n"," Run in Google Colab\n","\n"," View source on GitHub
"]},{"cell_type":"markdown","metadata":{"id":"CrEM35gqHouU"},"source":["# Table computations with the Earth Engine REST API\n","\n","***Note:*** *The REST API contains new and advanced features that may not be suitable for all users. If you are new to Earth Engine, please get started with the [JavaScript guide](https://developers.google.com/earth-engine/guides/getstarted).*\n","\n","The [Earth Engine REST API quickstart](https://developers.google.com/earth-engine/reference/Quickstart) shows how to access blocks of pixels from an Earth Engine asset. The [compute pixels example](https://developers.google.com/earth-engine/Earth_Engine_REST_API_compute_image) demonstrates how to apply a computation to the pixels before obtaining the result. This example demonstrates getting the mean of pixels in each image of an `ImageCollection` in each feature of a `FeatureCollection`. Specifically, this is a `POST` request to the [`computeFeatures`](https://developers.google.com/earth-engine/reference/rest/v1beta/projects.table/computeFeatures) endpoint."]},{"cell_type":"markdown","metadata":{"id":"H2VOD2agf4Cm"},"source":["## Before you begin\n","\n","Follow [these instructions](https://developers.google.com/earth-engine/earthengine_cloud_project_setup#apply-to-use-earth-engine) to:\n","\n","1. Apply for Earth Engine\n","2. Create a Google Cloud project\n","3. Enable the Earth Engine API on the project\n","4. Create a service account\n","5. Give the service account project level permission to perform Earth Engine computations\n","\n","**Note**: To complete this tutorial, you will need a service account that is registered for Earth Engine access. See [these instructions](https://developers.google.com/earth-engine/guides/service_account#register-the-service-account-to-use-earth-engine) to register a service account before proceeding."]},{"cell_type":"markdown","metadata":{"id":"OfMAA6YhPuFl"},"source":["## Authenticate to Google Cloud\n","\n","The first thing to do is login so that you can make authenticated requests to Google Cloud. You will set the project at the same time. Follow the instructions in the output to complete the sign in."]},{"cell_type":"code","metadata":{"id":"FRm2HczTIlKe"},"source":["# INSERT YOUR PROJECT HERE\n","PROJECT = 'your-project'\n","\n","!gcloud auth login --project {PROJECT}"],"execution_count":null,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"hnufOtSfP0jX"},"source":["## Obtain a private key file for your service account\n","\n","You should already have a service account registered to use Earth Engine. If you don't, follow [these instructions](https://developers.google.com/earth-engine/service_account#create-a-service-account) to get one. Copy the email address of your service account into the following cell. (The service account must already be registered to use Earth Engine). In the following cell, the `gsutil` command line is used to generate a key file for the service account. The key file will be created on the notebook VM."]},{"cell_type":"code","metadata":{"id":"tLxOnKL2Nk5p"},"source":["# INSERT YOUR SERVICE ACCOUNT HERE\n","SERVICE_ACCOUNT='your-service-account@your-project.iam.gserviceaccount.com'\n","KEY = 'key.json'\n","\n","!gcloud iam service-accounts keys create {KEY} --iam-account {SERVICE_ACCOUNT}"],"execution_count":null,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"6QksNfvaY5em"},"source":["## Start an `AuthorizedSession` and test your credentials\n","\n","Test the private key by using it to get credentials. 
Use the credentials to create an authorized session to make HTTP requests. Make a `GET` request through the session to check that the credentials work."]},{"cell_type":"code","metadata":{"id":"h2MHcyeqLufx"},"source":["from google.auth.transport.requests import AuthorizedSession\n","from google.oauth2 import service_account\n","\n","credentials = service_account.Credentials.from_service_account_file(KEY)\n","scoped_credentials = credentials.with_scopes(\n"," ['https://www.googleapis.com/auth/cloud-platform'])\n","\n","session = AuthorizedSession(scoped_credentials)\n","\n","url = 'https://earthengine.googleapis.com/v1beta/projects/earthengine-public/assets/LANDSAT'\n","\n","response = session.get(url)\n","\n","from pprint import pprint\n","import json\n","pprint(json.loads(response.content))"],"execution_count":null,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"U7ZzzoW_HgZ5"},"source":["## Serialize a computation\n","\n","Before you can send a request to compute something, the computation needs to be put into the Earth Engine expression graph format. The following demonstrates how to obtain the expression graph.\n","\n","### Authenticate to Earth Engine\n","\n","Get Earth Engine scoped credentials from the service account. Use them to initialize Earth Engine."]},{"cell_type":"code","metadata":{"id":"LdTW8sPQIsFx"},"source":["import ee\n","\n","# Get some new credentials since the other ones are cloud scope.\n","ee_creds = ee.ServiceAccountCredentials(SERVICE_ACCOUNT, KEY)\n","ee.Initialize(ee_creds)"],"execution_count":null,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"Zb243CYTP48x"},"source":["### Define a computation\n","\n","Prototype a simple computation with the client API. Note that the result of the computation is a `FeatureCollection`.\n","To check that the computation can succeed without errors, get a value from the first `Feature` (the mean NDVI in the polygon)."]},{"cell_type":"code","metadata":{"id":"S9fsJ4RtPr12"},"source":["# A collection of polygons.\n","states = ee.FeatureCollection('TIGER/2018/States')\n","maine = states.filter(ee.Filter.eq('NAME', 'Maine'))\n","\n","# Imagery: NDVI vegetation index from MODIS.\n","band = 'NDVI'\n","images = ee.ImageCollection('MODIS/006/MOD13Q1').select(band)\n","image = images.first()\n","\n","computation = image.reduceRegions(\n"," collection=maine, \n"," reducer=ee.Reducer.mean().setOutputs([band]), \n"," scale=image.projection().nominalScale()\n",")\n","\n","# Print the value to test.\n","print(computation.first().get(band).getInfo())"],"execution_count":null,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"HLk-i3htLEBf"},"source":["### Serialize the expression graph\n","\n","This will create an object that represents the Earth Engine expression graph (specifically, an [`Expression`](https://developers.google.com/earth-engine/reference/rest/v1beta/Expression)). In general, you should build these with one of the client APIs."]},{"cell_type":"code","metadata":{"id":"Mvbi4LuhV9BR"},"source":["# Serialize the computation.\n","serialized = ee.serializer.encode(computation)"],"execution_count":null,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"Y_Lu1qq5GUTF"},"source":["## Send the request\n","\n","Make a `POST` request to the [`computeFeatures`](https://developers.google.com/earth-engine/reference/rest/v1beta/projects.table/computeFeatures) endpoint. 
Note that the request contains the [`Expression`](https://developers.google.com/earth-engine/reference/rest/v1beta/Expression), which is the serialized computation."]},{"cell_type":"code","metadata":{"id":"_pbqvd48dT33"},"source":["import json\n","\n","url = 'https://earthengine.googleapis.com/v1beta/projects/{}/table:computeFeatures'\n","\n","response = session.post(\n"," url = url.format(PROJECT),\n"," data = json.dumps({'expression': serialized})\n",")\n","\n","import json\n","pprint(json.loads(response.content))"],"execution_count":null,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"MPO7-O4qB16V"},"source":["The response contains the resultant `FeatureCollection` as GeoJSON, which can be consumed by other apps or processes."]}]} \ No newline at end of file diff --git a/guides/linked/Earth_Engine_TensorFlow_AI_Platform.ipynb b/guides/linked/Earth_Engine_TensorFlow_AI_Platform.ipynb new file mode 100644 index 000000000..e89ec3e78 --- /dev/null +++ b/guides/linked/Earth_Engine_TensorFlow_AI_Platform.ipynb @@ -0,0 +1,692 @@ +{ + "nbformat": 4, + "nbformat_minor": 0, + "metadata": { + "colab": { + "name": "Earth_Engine_TensorFlow_AI_Platform.ipynb", + "private_outputs": true, + "provenance": [], + "collapsed_sections": [], + "toc_visible": true + }, + "kernelspec": { + "name": "python3", + "display_name": "Python 3" + }, + "accelerator": "GPU" + }, + "cells": [ + { + "cell_type": "code", + "metadata": { + "id": "fSIfBsgi8dNK" + }, + "source": [ + "#@title Copyright 2021 Google LLC. { display-mode: \"form\" }\n", + "# Licensed under the Apache License, Version 2.0 (the \"License\");\n", + "# you may not use this file except in compliance with the License.\n", + "# You may obtain a copy of the License at\n", + "#\n", + "# https://www.apache.org/licenses/LICENSE-2.0\n", + "#\n", + "# Unless required by applicable law or agreed to in writing, software\n", + "# distributed under the License is distributed on an \"AS IS\" BASIS,\n", + "# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n", + "# See the License for the specific language governing permissions and\n", + "# limitations under the License." + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "aV1xZ1CPi3Nw" + }, + "source": [ + "
\n", + "\n", + " Run in Google Colab\n", + "\n", + " View source on GitHub
" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "AC8adBmw-5m3" + }, + "source": [ + "# Introduction\n", + "\n", + "This is an Earth Engine <> TensorFlow demonstration notebook. This demonstrates a per-pixel neural network implemented in a way that allows the trained model to be hosted on [Google AI Platform](https://cloud.google.com/ai-platform) and used in Earth Engine for interactive prediction from an `ee.Model.fromAIPlatformPredictor`. See [this example notebook](http://colab.research.google.com/github/google/earthengine-community/blob/master/guides/linked/TF_demo1_keras.ipynb) for background on the dense model.\n", + "\n", + "**Running this demo may incur charges to your Google Cloud Account!**" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "KiTyR3FNlv-O" + }, + "source": [ + "# Setup software libraries\n", + "\n", + "Import software libraries and/or authenticate as necessary." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "HsyDopq-yy2b" + }, + "source": [ + "## Authenticate to Colab and Cloud\n", + "\n", + "To read/write from a Google Cloud Storage bucket to which you have access, it's necessary to authenticate (as yourself). *This should be the same account you use to login to Earth Engine*. When you run the code below, it will display a link in the output to an authentication page in your browser. Follow the link to a page that will let you grant permission to the Cloud SDK to access your resources. Copy the code from the permissions page back into this notebook and press return to complete the process.\n", + "\n", + "(You may need to run this again if you get a credentials error later.)" + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "sYyTIPLsvMWl", + "cellView": "code" + }, + "source": [ + "from google.colab import auth\n", + "auth.authenticate_user()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "Ejxa1MQjEGv9" + }, + "source": [ + "## Upgrade Earth Engine and Authenticate\n", + "\n", + "Update Earth Engine to ensure you have the latest version. Authenticate to Earth Engine the same way you did to the Colab notebook. Specifically, run the code to display a link to a permissions page. This gives you access to your Earth Engine account. *This should be the same account you used to login to Cloud previously*. Copy the code from the Earth Engine permissions page back into the notebook and press return to complete the process." + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "UBRbA9I06LIM" + }, + "source": [ + "!pip install -U earthengine-api --no-deps" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "HzwiVqbcmJIX", + "cellView": "code" + }, + "source": [ + "import ee\n", + "ee.Authenticate()\n", + "ee.Initialize()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "xhZiXmnSyy2l" + }, + "source": [ + "## Test the TensorFlow installation\n", + "\n", + "Import TensorFlow and check the version." + ] + }, + { + "cell_type": "code", + "metadata": { + "cellView": "code", + "id": "WjOh_CJeyy2m" + }, + "source": [ + "import tensorflow as tf\n", + "print(tf.__version__)" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "w6hPSVmYyy2p" + }, + "source": [ + "## Test the Folium installation\n", + "\n", + "We will use the Folium library for visualization. 
Import the library and check the version." + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "frodQp2syy2q" + }, + "source": [ + "import folium\n", + "print(folium.__version__)" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "DrXLkJC2QJdP" + }, + "source": [ + "# Define variables\n", + "\n", + "The training data are land cover labels with a single vector of Landsat 8 pixel values (`BANDS`) as predictors. See [this example notebook](http://colab.research.google.com/github/google/earthengine-community/blob/master/guides/linked/TF_demo1_keras.ipynb) for details on how to generate these training data." + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "GHTOc5YLQZ5B" + }, + "source": [ + "# REPLACE WITH YOUR CLOUD PROJECT!\n", + "PROJECT = 'your-project'\n", + "\n", + "# Cloud Storage bucket with training and testing datasets.\n", + "DATA_BUCKET = 'ee-docs-demos'\n", + "# Output bucket for trained models. You must be able to write into this bucket.\n", + "OUTPUT_BUCKET = 'your-bucket'\n", + "\n", + "# This is a good region for hosting AI models.\n", + "REGION = 'us-central1'\n", + "\n", + "# Training and testing dataset file names in the Cloud Storage bucket.\n", + "TRAIN_FILE_PREFIX = 'Training_demo'\n", + "TEST_FILE_PREFIX = 'Testing_demo'\n", + "file_extension = '.tfrecord.gz'\n", + "TRAIN_FILE_PATH = 'gs://' + DATA_BUCKET + '/' + TRAIN_FILE_PREFIX + file_extension\n", + "TEST_FILE_PATH = 'gs://' + DATA_BUCKET + '/' + TEST_FILE_PREFIX + file_extension\n", + "\n", + "# The labels, consecutive integer indices starting from zero, are stored in\n", + "# this property, set on each point.\n", + "LABEL = 'landcover'\n", + "# Number of label values, i.e. number of classes in the classification.\n", + "N_CLASSES = 3\n", + "\n", + "# Use Landsat 8 surface reflectance data for predictors.\n", + "L8SR = ee.ImageCollection('LANDSAT/LC08/C01/T1_SR')\n", + "# Use these bands for prediction.\n", + "BANDS = ['B2', 'B3', 'B4', 'B5', 'B6', 'B7']\n", + "\n", + "# These names are used to specify properties in the export of \n", + "# training/testing data and to define the mapping between names and data\n", + "# when reading into TensorFlow datasets.\n", + "FEATURE_NAMES = list(BANDS)\n", + "FEATURE_NAMES.append(LABEL)\n", + "\n", + "# List of fixed-length features, all of which are float32.\n", + "columns = [\n", + " tf.io.FixedLenFeature(shape=[1], dtype=tf.float32) for k in FEATURE_NAMES\n", + "]\n", + "\n", + "# Dictionary with feature names as keys, fixed-length features as values.\n", + "FEATURES_DICT = dict(zip(FEATURE_NAMES, columns))" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "8sosRFEDdOMA" + }, + "source": [ + "# Read data" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "43-c0JNFI_m6" + }, + "source": [ + "### Check existence of the data files\n", + "\n", + "Check that you have permission to read the files in the output Cloud Storage bucket." + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "YDZfNl6yc0Kj" + }, + "source": [ + "print('Found training file.' if tf.io.gfile.exists(TRAIN_FILE_PATH) \n", + " else 'No training file found.')\n", + "print('Found testing file.' 
if tf.io.gfile.exists(TEST_FILE_PATH) \n", + " else 'No testing file found.')" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "LS4jGTrEfz-1" + }, + "source": [ + "## Read into a `tf.data.Dataset`\n", + "\n", + "Here we are going to read a file in Cloud Storage into a `tf.data.Dataset`. ([these TensorFlow docs](https://www.tensorflow.org/guide/data) explain more about reading data into a `tf.data.Dataset`). Check that you can read examples from the file. The purpose here is to ensure that we can read from the file without an error. The actual content is not necessarily human readable. Note that we will use all data for training.\n" + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "T3PKyDQW8Vpx", + "cellView": "code" + }, + "source": [ + "# Create a dataset from the TFRecord file in Cloud Storage.\n", + "train_dataset = tf.data.TFRecordDataset([TRAIN_FILE_PATH, TEST_FILE_PATH],\n", + " compression_type='GZIP')\n", + "\n", + "# Print the first record to check.\n", + "print(iter(train_dataset).next())" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "QNfaUPbcjuCO" + }, + "source": [ + "## Parse the dataset\n", + "\n", + "Now we need to make a parsing function for the data in the TFRecord files. The data comes in flattened 2D arrays per record and we want to use the first part of the array for input to the model and the last element of the array as the class label. The parsing function reads data from a serialized `Example` proto (i.e. [`example.proto`](https://github.com/tensorflow/tensorflow/blob/r1.12/tensorflow/core/example/example.proto)) into a dictionary in which the keys are the feature names and the values are the tensors storing the value of the features for that example. ([Learn more about parsing `Example` protocol buffer messages](https://www.tensorflow.org/programmers_guide/datasets#parsing_tfexample_protocol_buffer_messages))." + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "x2Q0g3fBj2kD", + "cellView": "code" + }, + "source": [ + "def parse_tfrecord(example_proto):\n", + " \"\"\"The parsing function.\n", + "\n", + " Read a serialized example into the structure defined by FEATURES_DICT.\n", + "\n", + " Args:\n", + " example_proto: a serialized Example.\n", + "\n", + " Returns:\n", + " A tuple of the predictors dictionary and the LABEL, cast to an `int32`.\n", + " \"\"\"\n", + " parsed_features = tf.io.parse_single_example(example_proto, FEATURES_DICT)\n", + " labels = parsed_features.pop(LABEL)\n", + " return parsed_features, tf.cast(labels, tf.int32)\n", + "\n", + "# Map the function over the dataset.\n", + "parsed_dataset = train_dataset.map(parse_tfrecord, num_parallel_calls=4)\n", + "\n", + "from pprint import pprint\n", + "\n", + "# Print the first parsed record to check.\n", + "pprint(iter(parsed_dataset).next())" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "Nb8EyNT4Xnhb" + }, + "source": [ + "Note that each record of the parsed dataset contains a tuple. The first element of the tuple is a dictionary with bands names for keys and tensors storing the pixel data for values. The second element of the tuple is tensor storing the class label." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "yZwSBGX27Bfy" + }, + "source": [ + "## Adjust dimension and shape\n", + "\n", + "Turn the dictionary of *{name: tensor,...}* into a 1x1xP array of values, where P is the number of predictors. Turn the label into a 1x1x`N_CLASSES` array of indicators (i.e. one-hot vector), in order to use a categorical crossentropy-loss function. Return a tuple of (predictors, indicators where each is a three dimensional array; the first two dimensions are spatial x, y (i.e. 1x1 kernel)." + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "ABZvVGZw7BsS" + }, + "source": [ + "# Inputs as a tuple. Make predictors 1x1xP and labels 1x1xN_CLASSES.\n", + "def to_tuple(inputs, label):\n", + " return (tf.expand_dims(tf.transpose(list(inputs.values())), 1),\n", + " tf.expand_dims(tf.one_hot(indices=label, depth=N_CLASSES), 1))\n", + "\n", + "input_dataset = parsed_dataset.map(to_tuple)\n", + "# Check the first one.\n", + "pprint(iter(input_dataset).next())\n", + "\n", + "input_dataset = input_dataset.shuffle(128).batch(8)" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "nEx1RAXOZQkS" + }, + "source": [ + "# Model setup\n", + "\n", + "Make a densely-connected convolutional model, where the convolution occurs in a 1x1 kernel. This is exactly analogous to the model generated in [this example notebook](http://colab.research.google.com/github/google/earthengine-community/blob/master/guides/linked/TF_demo1_keras.ipynb), but operates in a convolutional manner in a 1x1 kernel. This allows Earth Engine to apply the model spatially, as demonstrated below.\n", + "\n", + "Note that the model used here is purely for demonstration purposes and hasn't gone through any performance tuning." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "t9pWa54oG-xl" + }, + "source": [ + "## Create the Keras model\n", + "\n", + "Before we create the model, there's still a wee bit of pre-processing to get the data into the right input shape and a format that can be used with cross-entropy loss. Specifically, Keras expects a list of inputs and a one-hot vector for the class. (See [the Keras loss function docs](https://keras.io/losses/), [the TensorFlow categorical identity docs](https://www.tensorflow.org/guide/feature_columns#categorical_identity_column) and [the `tf.one_hot` docs](https://www.tensorflow.org/api_docs/python/tf/one_hot) for details).\n", + "\n", + "Here we will use a simple neural network model with a 64 node hidden layer. Once the dataset has been prepared, define the model, compile it, fit it to the training data. See [the Keras `Sequential` model guide](https://keras.io/getting-started/sequential-model-guide/) for more details." + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "OCZq3VNpG--G", + "cellView": "code" + }, + "source": [ + "from tensorflow import keras\n", + "\n", + "# Define the layers in the model. 
Note the 1x1 kernels.\n", + "model = tf.keras.models.Sequential([\n", + " tf.keras.layers.Input((None, None, len(BANDS),)),\n", + " tf.keras.layers.Conv2D(64, (1,1), activation=tf.nn.relu),\n", + " tf.keras.layers.Dropout(0.1),\n", + " tf.keras.layers.Conv2D(N_CLASSES, (1,1), activation=tf.nn.softmax)\n", + "])\n", + "\n", + "# Compile the model with the specified loss and optimizer functions.\n", + "model.compile(optimizer=tf.keras.optimizers.Adam(),\n", + " loss='categorical_crossentropy',\n", + " metrics=['accuracy'])\n", + "\n", + "# Fit the model to the training data. Lucky number 7.\n", + "model.fit(x=input_dataset, epochs=7)\n" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "shbr6cSXShRg" + }, + "source": [ + "## Save the trained model\n", + "\n", + "Export the trained model to TensorFlow `SavedModel` format in your cloud storage bucket. The [Cloud Platform storage browser](https://console.cloud.google.com/storage/browser) is useful for checking on these saved models." + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "Sgg7MTXfS1PK" + }, + "source": [ + "MODEL_DIR = 'gs://' + OUTPUT_BUCKET + '/demo_pixel_model'\n", + "model.save(MODEL_DIR, save_format='tf')" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "keijPyVQTIAq" + }, + "source": [ + "# EEification\n", + "\n", + "EEIfication prepares the model for hosting on [Google AI Platform](https://cloud.google.com/ai-platform). Learn more about EEification from [this doc](https://developers.google.com/earth-engine/tensorflow#interacting-with-models-hosted-on-ai-platform). First, get (and SET) input and output names of the nodes. **CHANGE THE OUTPUT NAME TO SOMETHING THAT MAKES SENSE FOR YOUR MODEL!** Keep the input name of 'array', which is how you'll pass data into the model (as an array image)." + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "w49O7n5oTS4w" + }, + "source": [ + "from tensorflow.python.tools import saved_model_utils\n", + "\n", + "meta_graph_def = saved_model_utils.get_meta_graph_def(MODEL_DIR, 'serve')\n", + "inputs = meta_graph_def.signature_def['serving_default'].inputs\n", + "outputs = meta_graph_def.signature_def['serving_default'].outputs\n", + "\n", + "# Just get the first thing(s) from the serving signature def. i.e. this\n", + "# model only has a single input and a single output.\n", + "input_name = None\n", + "for k,v in inputs.items():\n", + " input_name = v.name\n", + " break\n", + "\n", + "output_name = None\n", + "for k,v in outputs.items():\n", + " output_name = v.name\n", + " break\n", + "\n", + "# Make a dictionary that maps Earth Engine outputs and inputs to\n", + "# AI Platform inputs and outputs, respectively.\n", + "import json\n", + "input_dict = \"'\" + json.dumps({input_name: \"array\"}) + \"'\"\n", + "output_dict = \"'\" + json.dumps({output_name: \"output\"}) + \"'\"\n", + "print(input_dict)\n", + "print(output_dict)" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "AX2icXa1UdFF" + }, + "source": [ + "## Run the EEifier\n", + "\n", + "The actual EEification is handled by the `earthengine model prepare` command. Note that you will need to set your Cloud Project prior to running the command." 
+ ] + }, + { + "cell_type": "code", + "metadata": { + "id": "IYmH_wCOUhIv" + }, + "source": [ + "# Put the EEified model next to the trained model directory.\n", + "EEIFIED_DIR = 'gs://' + OUTPUT_BUCKET + '/eeified_pixel_model'\n", + "\n", + "# You need to set the project before using the model prepare command.\n", + "!earthengine set_project {PROJECT}\n", + "!earthengine model prepare --source_dir {MODEL_DIR} --dest_dir {EEIFIED_DIR} --input {input_dict} --output {output_dict}" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "0_uTqQAaVTIK" + }, + "source": [ + "# Deploy and host the EEified model on AI Platform\n", + "\n", + "Now there is another TensorFlow `SavedModel` stored in `EEIFIED_DIR` ready for hosting by AI Platform. Do that from the `gcloud` command line tool, installed in the Colab runtime by default. Be sure to specify a regional model with the `REGION` parameter. Note that the `MODEL_NAME` must be unique. If you already have a model by that name, either name a new model or a new version of the old model. The [Cloud Console AI Platform models page](https://console.cloud.google.com/ai-platform/models) is useful for monitoring your models.\n", + "\n", + "**If you change anything about the trained model, you'll need to re-EEify it and create a new version!**" + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "8RZRRzcfVu5T" + }, + "source": [ + "MODEL_NAME = 'pixel_demo_model'\n", + "VERSION_NAME = 'v0'\n", + "\n", + "!gcloud ai-platform models create {MODEL_NAME} \\\n", + " --project {PROJECT} \\\n", + " --region {REGION}\n", + "\n", + "!gcloud ai-platform versions create {VERSION_NAME} \\\n", + " --project {PROJECT} \\\n", + " --region {REGION} \\\n", + " --model {MODEL_NAME} \\\n", + " --origin {EEIFIED_DIR} \\\n", + " --framework \"TENSORFLOW\" \\\n", + " --runtime-version=2.3 \\\n", + " --python-version=3.7" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "5aTGza-rWIjp" + }, + "source": [ + "# Connect to the hosted model from Earth Engine\n", + "\n", + "1. Generate the input imagery. This should be done in exactly the same way as the training data were generated. See [this example notebook](http://colab.research.google.com/github/google/earthengine-community/blob/master/guides/linked/TF_demo1_keras.ipynb) for details.\n", + "2. Connect to the hosted model.\n", + "3. Use the model to make predictions.\n", + "4. Display the results.\n", + "\n", + "Note that it takes the model a couple minutes to spin up and make predictions." 
+ ] + }, + { + "cell_type": "code", + "metadata": { + "id": "P2OsyrJ7HAhE" + }, + "source": [ + "# Cloud masking function.\n", + "def maskL8sr(image):\n", + " cloudShadowBitMask = ee.Number(2).pow(3).int()\n", + " cloudsBitMask = ee.Number(2).pow(5).int()\n", + " qa = image.select('pixel_qa')\n", + " mask = qa.bitwiseAnd(cloudShadowBitMask).eq(0).And(\n", + " qa.bitwiseAnd(cloudsBitMask).eq(0))\n", + " return image.updateMask(mask).select(BANDS).divide(10000)\n", + "\n", + "# The image input data is a 2018 cloud-masked median composite.\n", + "image = L8SR.filterDate('2018-01-01', '2018-12-31').map(maskL8sr).median()\n", + "\n", + "# Get a map ID for display in folium.\n", + "rgb_vis = {'bands': ['B4', 'B3', 'B2'], 'min': 0, 'max': 0.3, 'format': 'png'}\n", + "mapid = image.getMapId(rgb_vis)\n", + "\n", + "# Turn into an array image for input to the model.\n", + "array_image = image.float().toArray()\n", + "\n", + "# Point to the model hosted on AI Platform. If you specified a region other\n", + "# than the default (us-central1) at model creation, specify it here.\n", + "model = ee.Model.fromAiPlatformPredictor(\n", + " projectName=PROJECT,\n", + " modelName=MODEL_NAME,\n", + " version=VERSION_NAME,\n", + " # Can be anything, but don't make it too big.\n", + " inputTileSize=[8, 8],\n", + " # Keep this the same as your training data.\n", + " proj=ee.Projection('EPSG:4326').atScale(30),\n", + " fixInputProj=True,\n", + " # Note the names here need to match what you specified in the\n", + " # output dictionary you passed to the EEifier.\n", + " outputBands={'output': {\n", + " 'type': ee.PixelType.float(),\n", + " 'dimensions': 1\n", + " }\n", + " },\n", + ")\n", + "\n", + "# model.predictImage outputs a one dimensional array image that\n", + "# packs the output nodes of your model into an array. These\n", + "# are class probabilities that you need to unpack into a \n", + "# multiband image with arrayFlatten(). 
If you want class\n", + "# labels, use arrayArgmax() as follows.\n", + "predictions = model.predictImage(array_image)\n", + "probabilities = predictions.arrayFlatten([['bare', 'veg', 'water']])\n", + "label = predictions.arrayArgmax().arrayGet([0]).rename('label')\n", + "\n", + "# Get map IDs for display in folium.\n", + "probability_vis = {\n", + " 'bands': ['bare', 'veg', 'water'], 'max': 0.5, 'format': 'png'\n", + "}\n", + "label_vis = {\n", + " 'palette': ['red', 'green', 'blue'], 'min': 0, 'max': 2, 'format': 'png'\n", + "}\n", + "probability_mapid = probabilities.getMapId(probability_vis)\n", + "label_mapid = label.getMapId(label_vis)\n", + "\n", + "# Visualize the input imagery and the predictions.\n", + "map = folium.Map(location=[37.6413, -122.2582], zoom_start=11)\n", + "\n", + "folium.TileLayer(\n", + " tiles=mapid['tile_fetcher'].url_format,\n", + " attr='Map Data © Google Earth Engine',\n", + " overlay=True,\n", + " name='median composite',\n", + " ).add_to(map)\n", + "folium.TileLayer(\n", + " tiles=label_mapid['tile_fetcher'].url_format,\n", + " attr='Map Data © Google Earth Engine',\n", + " overlay=True,\n", + " name='predicted label',\n", + ").add_to(map)\n", + "folium.TileLayer(\n", + " tiles=probability_mapid['tile_fetcher'].url_format,\n", + " attr='Map Data © Google Earth Engine',\n", + " overlay=True,\n", + " name='probability',\n", + ").add_to(map)\n", + "map.add_child(folium.LayerControl())\n", + "map" + ], + "execution_count": null, + "outputs": [] + } + ] +} \ No newline at end of file diff --git a/guides/linked/Earth_Engine_TensorFlow_logistic_regression.ipynb b/guides/linked/Earth_Engine_TensorFlow_logistic_regression.ipynb new file mode 100644 index 000000000..db247f623 --- /dev/null +++ b/guides/linked/Earth_Engine_TensorFlow_logistic_regression.ipynb @@ -0,0 +1,756 @@ +{ + "nbformat": 4, + "nbformat_minor": 0, + "metadata": { + "colab": { + "name": "Earth_Engine_TensorFlow_logistic_regression.ipynb", + "private_outputs": true, + "provenance": [], + "collapsed_sections": [], + "toc_visible": true + }, + "kernelspec": { + "name": "python3", + "display_name": "Python 3" + } + }, + "cells": [ + { + "cell_type": "code", + "metadata": { + "id": "fSIfBsgi8dNK" + }, + "source": [ + "#@title Copyright 2021 Google LLC. { display-mode: \"form\" }\n", + "# Licensed under the Apache License, Version 2.0 (the \"License\");\n", + "# you may not use this file except in compliance with the License.\n", + "# You may obtain a copy of the License at\n", + "#\n", + "# https://www.apache.org/licenses/LICENSE-2.0\n", + "#\n", + "# Unless required by applicable law or agreed to in writing, software\n", + "# distributed under the License is distributed on an \"AS IS\" BASIS,\n", + "# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n", + "# See the License for the specific language governing permissions and\n", + "# limitations under the License." + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "aV1xZ1CPi3Nw" + }, + "source": [ + "
\n", + "\n", + " Run in Google Colab\n", + "\n", + " View source on GitHub
" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "A5_vH_K4PM6X" + }, + "source": [ + "# Introduction\n", + "\n", + "## Logistic regression\n", + "Logistic regression is a classical machine learning method to estimate the probability of an event occurring (sometimes called the \"risk\"). Specifically, the probability is modeled as a sigmoid function of a linear combination of inputs. This can be implemented as a very simple neural network with a single trainable layer.\n", + "\n", + "Here, the event being modeled is deforestation in 2016. If a pixel is labeled as deforesetation in 2016 according to the [Hansen Global Forest Change dataset](https://developers.google.com/earth-engine/datasets/catalog/UMD_hansen_global_forest_change_2018_v1_6), the event occurred with probability 1. The probability is zero otherwise. The input variables (i.e. the predictors of this event) are the pixel values of two Landsat 8 surface reflectance median composites, from 2015 and 2017, assumed to represent before and after conditions.\n", + "\n", + "The model will be hosted on [Google AI Platform](https://cloud.google.com/ai-platform) and used in Earth Engine for interactive prediction from an `ee.Model.fromAIPlatformPredictor`. See [this example notebook](http://colab.research.google.com/github/google/earthengine-community/blob/master/guides/linked/Earth_Engine_TensorFlow_AI_Platform.ipynb) for background on hosted models.\n", + "\n", + "**Running this demo may incur charges to your Google Cloud Account!**" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "KiTyR3FNlv-O" + }, + "source": [ + "# Setup software libraries\n", + "\n", + "Import software libraries and/or authenticate as necessary." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "HsyDopq-yy2b" + }, + "source": [ + "## Authenticate to Colab and Cloud\n", + "\n", + "*This should be the same account you use to login to Earth Engine*." + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "sYyTIPLsvMWl", + "cellView": "code" + }, + "source": [ + "from google.colab import auth\n", + "auth.authenticate_user()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "Ejxa1MQjEGv9" + }, + "source": [ + "## Authenticate to Earth Engine\n", + "\n", + "*This should be the same account you used to login to Cloud previously*." 
+ ] + }, + { + "cell_type": "code", + "metadata": { + "id": "HzwiVqbcmJIX", + "cellView": "code" + }, + "source": [ + "import ee\n", + "ee.Authenticate()\n", + "ee.Initialize()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "xhZiXmnSyy2l" + }, + "source": [ + "## Test the TensorFlow installation" + ] + }, + { + "cell_type": "code", + "metadata": { + "cellView": "code", + "id": "WjOh_CJeyy2m" + }, + "source": [ + "import tensorflow as tf\n", + "print(tf.__version__)" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "w6hPSVmYyy2p" + }, + "source": [ + "## Test the Folium installation" + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "frodQp2syy2q" + }, + "source": [ + "import folium\n", + "print(folium.__version__)" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "DrXLkJC2QJdP" + }, + "source": [ + "# Define variables" + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "GHTOc5YLQZ5B" + }, + "source": [ + "# REPLACE WITH YOUR CLOUD PROJECT!\n", + "PROJECT = 'your-project'\n", + "\n", + "# Output bucket for trained models. You must be able to write into this bucket.\n", + "OUTPUT_BUCKET = 'your-bucket'\n", + "\n", + "# Cloud Storage bucket with training and testing datasets.\n", + "DATA_BUCKET = 'ee-docs-demos'\n", + "\n", + "# This is a good region for hosting AI models.\n", + "REGION = 'us-central1'\n", + "\n", + "# Training and testing dataset file names in the Cloud Storage bucket.\n", + "TRAIN_FILE_PREFIX = 'logistic_demo_training'\n", + "TEST_FILE_PREFIX = 'logistic_demo_testing'\n", + "file_extension = '.tfrecord.gz'\n", + "TRAIN_FILE_PATH = 'gs://' + DATA_BUCKET + '/' + TRAIN_FILE_PREFIX + file_extension\n", + "TEST_FILE_PATH = 'gs://' + DATA_BUCKET + '/' + TEST_FILE_PREFIX + file_extension\n", + "\n", + "# The labels, consecutive integer indices starting from zero, are stored in\n", + "# this property, set on each point.\n", + "LABEL = 'loss16'\n", + "# Number of label values, i.e. number of classes in the classification.\n", + "N_CLASSES = 3\n", + "\n", + "# Use Landsat 8 surface reflectance data for predictors.\n", + "L8SR = ee.ImageCollection('LANDSAT/LC08/C01/T1_SR')\n", + "# Use these bands for prediction.\n", + "OPTICAL_BANDS = ['B1', 'B2', 'B3', 'B4', 'B5', 'B6', 'B7']\n", + "THERMAL_BANDS = ['B10', 'B11']\n", + "BEFORE_BANDS = OPTICAL_BANDS + THERMAL_BANDS\n", + "AFTER_BANDS = [str(s) + '_1' for s in BEFORE_BANDS]\n", + "BANDS = BEFORE_BANDS + AFTER_BANDS\n", + "\n", + "# Forest loss in 2016 is what we want to predict.\n", + "IMAGE = ee.Image('UMD/hansen/global_forest_change_2018_v1_6')\n", + "LOSS16 = IMAGE.select('lossyear').eq(16).rename(LABEL)\n", + "\n", + "# Study area. 
Mostly Brazil.\n", + "GEOMETRY = ee.Geometry.Polygon(\n", + " [[[-71.96531166607349, 0.24565390557980268],\n", + " [-71.96531166607349, -17.07400853625319],\n", + " [-40.32468666607349, -17.07400853625319],\n", + " [-40.32468666607349, 0.24565390557980268]]], None, False)\n", + "\n", + "# These names are used to specify properties in the export of training/testing\n", + "# data and to define the mapping between names and data when reading from\n", + "# the TFRecord file into a tf.data.Dataset.\n", + "FEATURE_NAMES = list(BANDS)\n", + "FEATURE_NAMES.append(LABEL)\n", + "\n", + "# List of fixed-length features, all of which are float32.\n", + "columns = [\n", + " tf.io.FixedLenFeature(shape=[1], dtype=tf.float32) for k in FEATURE_NAMES\n", + "]\n", + "\n", + "# Dictionary with feature names as keys, fixed-length features as values.\n", + "FEATURES_DICT = dict(zip(FEATURE_NAMES, columns))\n", + "\n", + "# Where to save the trained model.\n", + "MODEL_DIR = 'gs://' + OUTPUT_BUCKET + '/logistic_demo_model'\n", + "# Where to save the EEified model.\n", + "EEIFIED_DIR = 'gs://' + OUTPUT_BUCKET + '/logistic_demo_eeified'\n", + "\n", + "# Name of the AI Platform model to be hosted.\n", + "MODEL_NAME = 'logistic_demo_model'\n", + "# Version of the AI Platform model to be hosted.\n", + "VERSION_NAME = 'v0'" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "YmBGDcKYO7S2" + }, + "source": [ + "# Generate training data\n", + "\n", + "This is a multi-step process. First, export the image that contains the prediction bands. When that export completes (several hours in this example), it can be reloaded and sampled to generate training and testing datasets. The second step is to export the traning and testing tables to TFRecord files in Cloud Storage (also several hours)." + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "Lt9OMetJkDe6" + }, + "source": [ + "# Cloud masking function.\n", + "def maskL8sr(image):\n", + " cloudShadowBitMask = ee.Number(2).pow(3).int()\n", + " cloudsBitMask = ee.Number(2).pow(5).int()\n", + " qa = image.select('pixel_qa')\n", + " mask1 = qa.bitwiseAnd(cloudShadowBitMask).eq(0).And(\n", + " qa.bitwiseAnd(cloudsBitMask).eq(0))\n", + " mask2 = image.mask().reduce('min')\n", + " mask3 = image.select(OPTICAL_BANDS).gt(0).And(\n", + " image.select(OPTICAL_BANDS).lt(10000)).reduce('min')\n", + " mask = mask1.And(mask2).And(mask3)\n", + " return image.select(OPTICAL_BANDS).divide(10000).addBands(\n", + " image.select(THERMAL_BANDS).divide(10).clamp(273.15, 373.15)\n", + " .subtract(273.15).divide(100)).updateMask(mask)\n", + "\n", + "# Make \"before\" and \"after\" composites.\n", + "composite1 = L8SR.filterDate(\n", + " '2015-01-01', '2016-01-01').map(maskL8sr).median()\n", + "composite2 = L8SR.filterDate(\n", + " '2016-12-31', '2017-12-31').map(maskL8sr).median()\n", + "\n", + "stack = composite1.addBands(composite2).float()\n", + "\n", + "export_image = 'projects/google/logistic_demo_image'\n", + "\n", + "image_task = ee.batch.Export.image.toAsset(\n", + " image = stack, \n", + " description = 'logistic_demo_image', \n", + " assetId = export_image, \n", + " region = GEOMETRY,\n", + " scale = 30,\n", + " maxPixels = 1e10\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "3NYcEjex2Yjw" + }, + "source": [ + "First, export the image stack that contains the predictors." 
+ ] + }, + { + "cell_type": "code", + "metadata": { + "id": "FTNhKtTGv5Jn" + }, + "source": [ + "image_task.start()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "jHH59m6q-OeZ" + }, + "source": [ + "Wait until the image export is completed, then sample the exported image." + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "5eKAILt-cC3i" + }, + "source": [ + "sample = ee.Image(export_image).addBands(LOSS16).stratifiedSample(\n", + " numPoints = 10000,\n", + " classBand = LABEL,\n", + " region = GEOMETRY,\n", + " scale = 30,\n", + " tileScale = 8\n", + ")\n", + "\n", + "randomized = sample.randomColumn()\n", + "training = randomized.filter(ee.Filter.lt('random', 0.7))\n", + "testing = randomized.filter(ee.Filter.gte('random', 0.7))\n", + "\n", + "train_task = ee.batch.Export.table.toCloudStorage(\n", + " collection = training,\n", + " description = TRAIN_FILE_PREFIX,\n", + " bucket = OUTPUT_BUCKET,\n", + " fileFormat = 'TFRecord'\n", + ")\n", + "\n", + "test_task = ee.batch.Export.table.toCloudStorage(\n", + " collection = testing,\n", + " description = TEST_FILE_PREFIX,\n", + " bucket = OUTPUT_BUCKET,\n", + " fileFormat = 'TFRecord'\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "sFQ7vwpU2gER" + }, + "source": [ + "Export the training and testing tables. This also takes a few hours." + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "EjAyMkEFt1W8" + }, + "source": [ + "train_task.start()\n", + "test_task.start()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "QNfaUPbcjuCO" + }, + "source": [ + "# Parse the exported datasets\n", + "\n", + "Now we need to make a parsing function for the data in the TFRecord files. The data comes in flattened 2D arrays per record and we want to use the first part of the array for input to the model and the last element of the array as the class label. The parsing function reads data from a serialized `Example` proto (i.e. [`example.proto`](https://github.com/tensorflow/tensorflow/blob/r1.12/tensorflow/core/example/example.proto)) into a dictionary in which the keys are the feature names and the values are the tensors storing the value of the features for that example. ([Learn more about parsing `Example` protocol buffer messages](https://www.tensorflow.org/programmers_guide/datasets#parsing_tfexample_protocol_buffer_messages))." 
+ ] + }, + { + "cell_type": "code", + "metadata": { + "id": "x2Q0g3fBj2kD", + "cellView": "code" + }, + "source": [ + "def parse_tfrecord(example_proto):\n", + " \"\"\"The parsing function.\n", + "\n", + " Read a serialized example into the structure defined by FEATURES_DICT.\n", + "\n", + " Args:\n", + " example_proto: a serialized Example.\n", + "\n", + " Returns:\n", + " A tuple of the predictors dictionary and the label, cast to an `int32`.\n", + " \"\"\"\n", + " parsed_features = tf.io.parse_single_example(example_proto, FEATURES_DICT)\n", + " labels = parsed_features.pop(LABEL)\n", + " return parsed_features, tf.cast(labels, tf.int32)\n", + "\n", + "\n", + "def to_tuple(inputs, label):\n", + " \"\"\" Convert inputs to a tuple.\n", + "\n", + " Note that the inputs must be a tuple of tensors in the right shape.\n", + "\n", + " Args:\n", + " dict: a dictionary of tensors keyed by input name.\n", + " label: a tensor storing the response variable.\n", + "\n", + " Returns:\n", + " A tuple of tensors: (predictors, label).\n", + " \"\"\"\n", + " # Values in the tensor are ordered by the list of predictors.\n", + " predictors = [inputs.get(k) for k in BANDS]\n", + " return (tf.expand_dims(tf.transpose(predictors), 1),\n", + " tf.expand_dims(tf.expand_dims(label, 1), 1)) \n" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "T3PKyDQW8Vpx", + "cellView": "code" + }, + "source": [ + "# Load datasets from the files.\n", + "train_dataset = tf.data.TFRecordDataset(TRAIN_FILE_PATH, compression_type='GZIP')\n", + "test_dataset = tf.data.TFRecordDataset(TEST_FILE_PATH, compression_type='GZIP')\n", + "\n", + "# Compute the size of the shuffle buffer. We can get away with this\n", + "# because it's a small dataset, but watch out with larger datasets.\n", + "train_size = 0\n", + "for _ in iter(train_dataset):\n", + " train_size+=1\n", + "\n", + "batch_size = 8\n", + "\n", + "# Map the functions over the datasets to parse and convert to tuples.\n", + "train_dataset = train_dataset.map(parse_tfrecord, num_parallel_calls=4)\n", + "train_dataset = train_dataset.map(to_tuple, num_parallel_calls=4)\n", + "train_dataset = train_dataset.shuffle(train_size).batch(batch_size)\n", + "\n", + "test_dataset = test_dataset.map(parse_tfrecord, num_parallel_calls=4)\n", + "test_dataset = test_dataset.map(to_tuple, num_parallel_calls=4)\n", + "test_dataset = test_dataset.batch(batch_size)\n", + "\n", + "# Print the first parsed record to check.\n", + "from pprint import pprint\n", + "pprint(iter(train_dataset).next())" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "Nb8EyNT4Xnhb" + }, + "source": [ + "Note that each record of the parsed dataset contains a tuple. The first element of the tuple is a dictionary with bands for keys and the numeric value of the bands for values. The second element of the tuple is the class label, which in this case is an indicator variable that is one if deforestation happened, zero otherwise." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "t9pWa54oG-xl" + }, + "source": [ + "# Create the Keras model\n", + "\n", + "This model is intended to represent traditional logistic regression, the parameters of which are estimated through maximum likelihood. Specifically, the probability of an event is represented as the sigmoid of a linear function of the predictors. 
Training or fitting the model consists of finding the parameters of the linear function that maximize the likelihood function. This is implemented in Keras by defining a model with a single trainable layer, a sigmoid activation on the output, and a crossentropy loss function. Note that the only trainable layer is convolutional, with a 1x1 kernel, so that Earth Engine can apply the model in each pixel. To fit the model, a Stochastic Gradient Descent (SGD) optimizer is used. This differs somewhat from traditional fitting of logistic regression models in that stochasticity is introduced by using mini-batches to estimate the gradient." + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "OCZq3VNpG--G", + "cellView": "code" + }, + "source": [ + "from tensorflow import keras\n", + "\n", + "# Define the layers in the model.\n", + "model = tf.keras.models.Sequential([\n", + " tf.keras.layers.Input((1, 1, len(BANDS))),\n", + " tf.keras.layers.Conv2D(1, (1,1), activation='sigmoid')\n", + "])\n", + "\n", + "# Compile the model with the specified loss function.\n", + "model.compile(optimizer=tf.keras.optimizers.SGD(momentum=0.9),\n", + " loss='binary_crossentropy',\n", + " metrics=['accuracy'])\n", + "\n", + "# Fit the model to the training data.\n", + "model.fit(x=train_dataset, \n", + " epochs=20,\n", + " validation_data=test_dataset)" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "shbr6cSXShRg" + }, + "source": [ + "## Save the trained model\n", + "\n", + "Save the trained model to `tf.saved_model` format in your Cloud Storage bucket." + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "Sgg7MTXfS1PK" + }, + "source": [ + "model.save(MODEL_DIR, save_format='tf')" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "keijPyVQTIAq" + }, + "source": [ + "# EEification\n", + "\n", + "The first part of the code is just to get (and set) the input and output names. Keep the input name of 'array', which is how you'll pass data into the model (as an array image)." + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "w49O7n5oTS4w" + }, + "source": [ + "from tensorflow.python.tools import saved_model_utils\n", + "\n", + "meta_graph_def = saved_model_utils.get_meta_graph_def(MODEL_DIR, 'serve')\n", + "inputs = meta_graph_def.signature_def['serving_default'].inputs\n", + "outputs = meta_graph_def.signature_def['serving_default'].outputs\n", + "\n", + "# Just get the first thing(s) from the serving signature def. i.e. this\n", + "# model only has a single input and a single output.\n", + "input_name = None\n", + "for k,v in inputs.items():\n", + " input_name = v.name\n", + " break\n", + "\n", + "output_name = None\n", + "for k,v in outputs.items():\n", + " output_name = v.name\n", + " break\n", + "\n", + "# Make a dictionary that maps Earth Engine outputs and inputs to \n", + "# AI Platform inputs and outputs, respectively.\n", + "import json\n", + "input_dict = \"'\" + json.dumps({input_name: \"array\"}) + \"'\"\n", + "output_dict = \"'\" + json.dumps({output_name: \"output\"}) + \"'\"\n", + "print(input_dict)\n", + "print(output_dict)" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "AX2icXa1UdFF" + }, + "source": [ + "## Run the EEifier\n", + "\n", + "Use the command line to set your Cloud project and then run the EEifier."
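Before preparing the model for Earth Engine, an optional sanity check can confirm that the fitted 1x1 convolution really is a logistic regression: its kernel and bias are the regression coefficients, and the sigmoid of the linear combination of a pixel's band values reproduces the model output. This is a minimal sketch, assuming `model` and `test_dataset` from the cells above.

```python
import numpy as np

# The only trainable layer is the 1x1 convolution; its kernel has shape
# (1, 1, len(BANDS), 1) and there is a single bias term.
kernel, bias = model.layers[0].get_weights()
coefficients = kernel.reshape(-1)

# Compare a manual sigmoid computation against model.predict() on one batch.
inputs, _ = next(iter(test_dataset))
x = inputs.numpy().reshape(-1, len(BANDS))
manual = 1.0 / (1.0 + np.exp(-(x @ coefficients + bias)))
print(np.allclose(manual, model.predict(inputs).reshape(-1), atol=1e-5))
```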
+ ] + }, + { + "cell_type": "code", + "metadata": { + "id": "IYmH_wCOUhIv" + }, + "source": [ + "!earthengine set_project {PROJECT}\n", + "!earthengine model prepare --source_dir {MODEL_DIR} --dest_dir {EEIFIED_DIR} --input {input_dict} --output {output_dict}" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "0_uTqQAaVTIK" + }, + "source": [ + "# Deploy and host the EEified model on AI Platform\n", + "\n", + "**If you change anything about the model, you'll need to re-EEify it and create a new version!**" + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "8RZRRzcfVu5T" + }, + "source": [ + "!gcloud ai-platform models create {MODEL_NAME} \\\n", + " --project {PROJECT} \\\n", + " --region {REGION}\n", + "\n", + "!gcloud ai-platform versions create {VERSION_NAME} \\\n", + " --project {PROJECT} \\\n", + " --region {REGION} \\\n", + " --model {MODEL_NAME} \\\n", + " --origin {EEIFIED_DIR} \\\n", + " --framework \"TENSORFLOW\" \\\n", + " --runtime-version=2.3 \\\n", + " --python-version=3.7" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "5aTGza-rWIjp" + }, + "source": [ + "# Connect to the hosted model from Earth Engine\n", + "\n", + "Now that the model is hosted on AI Platform, point Earth Engine to it and make predictions. These predictions can be thresholded for a rudimentary deforestation detector. Visualize the after imagery, the reference data and the predictions." + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "WtWJMvzAo279" + }, + "source": [ + "# Turn into an array image for input to the model.\n", + "array_image = stack.select(BANDS).float().toArray()\n", + "\n", + "# Point to the model hosted on AI Platform. 
If you specified a region other\n", + "# than the default (us-central1) at model creation, specify it here.\n", + "model = ee.Model.fromAiPlatformPredictor(\n", + " projectName=PROJECT,\n", + " modelName=MODEL_NAME,\n", + " version=VERSION_NAME,\n", + " # Can be anything, but don't make it too big.\n", + " inputTileSize=[8, 8],\n", + " # Keep this the same as your training data.\n", + " proj=ee.Projection('EPSG:4326').atScale(30),\n", + " fixInputProj=True,\n", + " # Note the names here need to match what you specified in the\n", + " # output dictionary you passed to the EEifier.\n", + " outputBands={'output': {\n", + " 'type': ee.PixelType.float(),\n", + " 'dimensions': 1\n", + " }\n", + " },\n", + ")\n", + "\n", + "# Output probability.\n", + "predictions = model.predictImage(array_image).arrayGet([0])\n", + "\n", + "# Back-of-the-envelope decision rule.\n", + "predicted = predictions.gt(0.7).selfMask()\n", + "\n", + "# Training data for comparison.\n", + "reference = LOSS16.selfMask()\n", + "\n", + "# Get map IDs for display in folium.\n", + "probability_vis = {'min': 0, 'max': 1}\n", + "probability_mapid = predictions.getMapId(probability_vis)\n", + "\n", + "predicted_vis = {'palette': 'red'}\n", + "predicted_mapid = predicted.getMapId(predicted_vis)\n", + "\n", + "reference_vis = {'palette': 'orange'}\n", + "reference_mapid = reference.getMapId(reference_vis)\n", + "\n", + "image_vis = {'bands': ['B4', 'B3', 'B2'], 'min': 0, 'max': 0.3}\n", + "image_mapid = composite2.getMapId(image_vis)\n", + "\n", + "# Visualize the input imagery and the predictions.\n", + "map = folium.Map(location=[-9.1, -62.3], zoom_start=11)\n", + "folium.TileLayer(\n", + " tiles=image_mapid['tile_fetcher'].url_format,\n", + " attr='Map Data © Google Earth Engine',\n", + " overlay=True,\n", + " name='image',\n", + ").add_to(map)\n", + "folium.TileLayer(\n", + " tiles=probability_mapid['tile_fetcher'].url_format,\n", + " attr='Map Data © Google Earth Engine',\n", + " overlay=True,\n", + " name='probability',\n", + ").add_to(map)\n", + "folium.TileLayer(\n", + " tiles=predicted_mapid['tile_fetcher'].url_format,\n", + " attr='Map Data © Google Earth Engine',\n", + " overlay=True,\n", + " name='predicted',\n", + ").add_to(map)\n", + "folium.TileLayer(\n", + " tiles=reference_mapid['tile_fetcher'].url_format,\n", + " attr='Map Data © Google Earth Engine',\n", + " overlay=True,\n", + " name='reference',\n", + ").add_to(map)\n", + "map.add_child(folium.LayerControl())\n", + "map" + ], + "execution_count": null, + "outputs": [] + } + ] +} \ No newline at end of file diff --git a/guides/linked/Earth_Engine_asset_from_cloud_geotiff.ipynb b/guides/linked/Earth_Engine_asset_from_cloud_geotiff.ipynb new file mode 100644 index 000000000..8d6325746 --- /dev/null +++ b/guides/linked/Earth_Engine_asset_from_cloud_geotiff.ipynb @@ -0,0 +1 @@ +{"nbformat":4,"nbformat_minor":0,"metadata":{"colab":{"name":"Earth_Engine_asset_from_cloud_geotiff.ipynb","provenance":[{"file_id":"https://github.com/google/earthengine-community/blob/master/guides/linked/Earth_Engine_asset_from_cloud_geotiff.ipynb","timestamp":1655816119626},{"file_id":"1f_rRBTQVKbPVhaoRRsSWUlBtBgnZkiTz","timestamp":1590793341638}],"private_outputs":true,"collapsed_sections":[],"toc_visible":true},"kernelspec":{"name":"python3","display_name":"Python 3"}},"cells":[{"cell_type":"code","metadata":{"id":"fSIfBsgi8dNK"},"source":["#@title Copyright 2022 Google LLC. 
{ display-mode: \"form\" }\n","# Licensed under the Apache License, Version 2.0 (the \"License\");\n","# you may not use this file except in compliance with the License.\n","# You may obtain a copy of the License at\n","#\n","# https://www.apache.org/licenses/LICENSE-2.0\n","#\n","# Unless required by applicable law or agreed to in writing, software\n","# distributed under the License is distributed on an \"AS IS\" BASIS,\n","# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n","# See the License for the specific language governing permissions and\n","# limitations under the License."],"execution_count":null,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"aV1xZ1CPi3Nw"},"source":["
\n","\n"," Run in Google Colab\n","\n"," View source on GitHub
"]},{"cell_type":"markdown","metadata":{"id":"CrEM35gqHouU"},"source":["# Cloud GeoTiff Backed Earth Engine Assets\n","\n","***Note:*** *The REST API contains new and advanced features that may not be suitable for all users. If you are new to Earth Engine, please get started with the [JavaScript guide](https://developers.google.com/earth-engine/guides/getstarted).*\n","\n","Earth Engine can load images from Cloud Optimized GeoTiffs (COGs) in Google Cloud Storage ([learn more](https://developers.google.com/earth-engine/guides/image_overview#images-from-cloud-geotiffs)). This notebook demonstrates how to create Earth Engine assets backed by COGs. An advantage of COG-backed assets is that the spatial and metadata fields of the image will be indexed at asset creation time, making the image more performant in collections. (In contrast, an image created through `ee.Image.loadGeoTIFF` and put into a collection will require a read of the GeoTiff for filtering operations on the collection.) A disadvantage of COG-backed assets is that they may be several times slower than standard assets when used in computations.\n","\n","To create a COG-backed asset, make a `POST` request to the Earth Engine [`CreateAsset` endpoint](https://developers.google.com/earth-engine/reference/rest/v1alpha/projects.assets/create). As shown in the following, this request must be authorized to create an asset in your user folder."]},{"cell_type":"markdown","metadata":{"id":"fmxat3ujhwGx"},"source":["## Start an authorized session\n","\n","To be able to make an Earth Engine asset in your user folder, you need to be able to authenticate as yourself when you make the request. You can use credentials from the Earth Engine authenticator to start an [`AuthorizedSession`](https://google-auth.readthedocs.io/en/master/reference/google.auth.transport.requests.html#google.auth.transport.requests.AuthorizedSession). You can then use the `AuthorizedSession` to send requests to Earth Engine."]},{"cell_type":"code","metadata":{"id":"qVu8GhINwYfO"},"source":["import ee\n","from google.auth.transport.requests import AuthorizedSession\n","\n","ee.Authenticate() # or !earthengine authenticate --auth_mode=gcloud\n","session = AuthorizedSession(ee.data.get_persistent_credentials())"],"execution_count":null,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"jz8e263wvbTN"},"source":["## Request body\n","\n","The request body is an instance of an [EarthEngineAsset](https://developers.google.com/earth-engine/reference/rest/v1alpha/projects.assets#EarthEngineAsset). This is where the path to the COG is specified, along with other useful properties. Note that the image is a small area exported from the composite made in [this example script](https://code.earthengine.google.com/?scriptPath=Examples%3ACloud%20Masking%2FSentinel2). See [this doc](https://developers.google.com/earth-engine/exporting#configuration-parameters) for details on exporting a COG.\n","\n","Earth Engine will determine the bands, geometry, and other relevant information from the metadata of the TIFF. 
The only other fields that are accepted when creating a COG-backed asset are `properties`, `start_time`, and `end_time`."]},{"cell_type":"code","metadata":{"id":"OGESPnfEvqVq"},"source":["import json\n","from pprint import pprint\n","\n","# Request body as a dictionary.\n","request = {\n"," 'type': 'IMAGE',\n"," 'gcs_location': {\n"," 'uris': ['gs://ee-docs-demos/COG_demo.tif']\n"," },\n"," 'properties': {\n"," 'source': 'https://code.earthengine.google.com/d541cf8b268b2f9d8f834c255698201d'\n"," },\n"," 'startTime': '2016-01-01T00:00:00.000000000Z',\n"," 'endTime': '2016-12-31T15:01:23.000000000Z',\n","}\n","\n","pprint(json.dumps(request))"],"execution_count":null,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"9_MfryWIpyhS"},"source":["## Send the request\n","\n","Make the POST request to the Earth Engine [`projects.assets.create`](https://developers.google.com/earth-engine/reference/rest/v1alpha/projects.assets/create) endpoint."]},{"cell_type":"code","metadata":{"id":"NhmNrvS2p4qQ"},"source":["# Earth Engine enabled Cloud Project.\n","project_folder = 'your-project'\n","# A folder (or ImageCollection) name and the new asset name.\n","asset_id = 'cog-collection/your-cog-asset'\n","\n","url = 'https://earthengine.googleapis.com/v1alpha/projects/{}/assets?assetId={}'\n","\n","response = session.post(\n"," url = url.format(project_folder, asset_id),\n"," data = json.dumps(request)\n",")\n","\n","pprint(json.loads(response.content))"],"execution_count":null,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"mK5lCJY0CDfK"},"source":["## Details on COG-backed assets\n","\n","### Permissions\n","The ACLs of COG-backed Earth Engine assets and the underlying data are managed separately. If a COG-backed asset is shared in Earth Engine, it is the owner's responsibility to ensure that the data in GCS is shared with the same parties. If the data is not visible, Earth Engine will return an error of the form \"Failed to load the GeoTIFF at `gs://my-bucket/my-object#123456`\" (123456 is the generation of the object).\n","\n","### Generations\n","When a COG-backed asset is created, Earth Engine reads the metadata of the TIFF in Cloud Storage and creates asset store entry. The URI associated with that entry can have a generation. See the [object versioning docs](https://cloud.google.com/storage/docs/object-versioning) for details on generations. If a generation is specified (e.g., `gs://foo/bar#123`), Earth Engine will use it. If a generation is not specified, Earth Engine will use the latest generation of the object. \n","\n","That means that if the object in GCS is updated, Earth Engine will return a \"Failed to load the GeoTIFF at `gs://my-bucket/my-object#123456`\" error because the expected object no longer exists (unless the bucket enables multiple object versions). This policy is designed to keep metadata of the asset in sync with the metadata of the object. 
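If you want the asset pinned to a specific version of the object rather than whatever is current, you can look up the generation with the Cloud Storage client library and append it to the URI. This is a short sketch, assuming the `google-cloud-storage` package is available; it reuses the demo bucket and object from the request above (substitute your own).

```python
from google.cloud import storage

# Look up the current generation of the GeoTIFF backing the asset.
client = storage.Client(project=project_folder)
blob = client.bucket('ee-docs-demos').get_blob('COG_demo.tif')

# A URI pinned to this exact version of the object.
pinned_uri = 'gs://{}/{}#{}'.format(blob.bucket.name, blob.name, blob.generation)
print(pinned_uri)
```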
\n","\n","### Configuration\n","In terms of how a COG should be configured, the TIFF MUST be:\n","\n","- Tiled, where the tile dimensions are either:\n"," - 16x16\n"," - 32x32\n"," - 64x64\n"," - 128x128\n"," - 256x256\n"," - 512x512\n"," - 1024x1024\n","\n","- Arranged so that all IFDs are at the beginning.\n","\n","For best performance:\n","\n","- Use tile dimensions of 128x128 or 256x256.\n","- Include power of 2 overviews.\n","\n","See [this page](https://cogeotiff.github.io/rio-cogeo/Advanced/#web-optimized-cog) for more details on an optimized configuration."]}]} \ No newline at end of file diff --git a/guides/linked/Earth_Engine_training_patches_computePixels.ipynb b/guides/linked/Earth_Engine_training_patches_computePixels.ipynb new file mode 100644 index 000000000..8305e73e3 --- /dev/null +++ b/guides/linked/Earth_Engine_training_patches_computePixels.ipynb @@ -0,0 +1,495 @@ +{ + "nbformat": 4, + "nbformat_minor": 0, + "metadata": { + "colab": { + "provenance": [] + }, + "kernelspec": { + "name": "python3", + "display_name": "Python 3" + }, + "language_info": { + "name": "python" + } + }, + "cells": [ + { + "cell_type": "code", + "metadata": { + "id": "fSIfBsgi8dNK" + }, + "source": [ + "#@title Copyright 2023 Google LLC. { display-mode: \"form\" }\n", + "# Licensed under the Apache License, Version 2.0 (the \"License\");\n", + "# you may not use this file except in compliance with the License.\n", + "# You may obtain a copy of the License at\n", + "#\n", + "# https://www.apache.org/licenses/LICENSE-2.0\n", + "#\n", + "# Unless required by applicable law or agreed to in writing, software\n", + "# distributed under the License is distributed on an \"AS IS\" BASIS,\n", + "# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n", + "# See the License for the specific language governing permissions and\n", + "# limitations under the License." + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "aV1xZ1CPi3Nw" + }, + "source": [ + "
\n", + "\n", + " Run in Google Colab\n", + "\n", + " View source on GitHub
" + ] + }, + { + "cell_type": "markdown", + "source": [ + "# Download training patches from Earth Engine\n", + "\n", + "This demonstration shows how to get patches of imagery from Earth Engine for training ML models. Specifically, use `computePixels` calls in parallel to quickly and efficiently write a TFRecord file." + ], + "metadata": { + "id": "9SV-E0p6PpGr" + } + }, + { + "cell_type": "markdown", + "source": [ + "## Imports" + ], + "metadata": { + "id": "uvlyhBESPQKW" + } + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "rppQiHjZPX_y" + }, + "outputs": [], + "source": [ + "from google.colab import auth\n", + "from google.api_core import retry\n", + "from IPython.display import Image\n", + "from matplotlib import pyplot as plt\n", + "from numpy.lib import recfunctions as rfn\n", + "\n", + "import concurrent\n", + "import ee\n", + "import google\n", + "import io\n", + "import multiprocessing\n", + "import numpy as np\n", + "import requests\n", + "import tensorflow as tf" + ] + }, + { + "cell_type": "markdown", + "source": [ + "## Authentication and initialization\n", + "\n", + "Use the Colab auth widget to get credentials, then use them to initialize Earth Engine. During initialization, be sure to specify a project and Earth Engine's [high-volume endpoint](https://developers.google.com/earth-engine/cloud/highvolume), in order to make automated requests." + ], + "metadata": { + "id": "pbLzoz4klKwH" + } + }, + { + "cell_type": "code", + "source": [ + "# REPLACE WITH YOUR PROJECT!\n", + "PROJECT = 'your-project'" + ], + "metadata": { + "id": "HN5H25U_JBdp" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "source": [ + "auth.authenticate_user()" + ], + "metadata": { + "id": "TLmI05-wT_GD" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "source": [ + "credentials, _ = google.auth.default()\n", + "ee.Initialize(credentials, project=PROJECT, opt_url='https://earthengine-highvolume.googleapis.com')" + ], + "metadata": { + "id": "c5bEkwQHUDPS" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "source": [ + "## Define variables" + ], + "metadata": { + "id": "q7rHLQsPuwyb" + } + }, + { + "cell_type": "code", + "source": [ + "# REPLACE WITH YOUR BUCKET!\n", + "OUTPUT_FILE = 'gs://your-bucket/your-file.tfrecord.gz'\n", + "\n", + "# Output resolution in meters.\n", + "SCALE = 10\n", + "\n", + "# Pre-compute a geographic coordinate system.\n", + "proj = ee.Projection('EPSG:4326').atScale(SCALE).getInfo()\n", + "\n", + "# Get scales in degrees out of the transform.\n", + "SCALE_X = proj['transform'][0]\n", + "SCALE_Y = -proj['transform'][4]\n", + "\n", + "# Patch size in pixels.\n", + "PATCH_SIZE = 128\n", + "\n", + "# Offset to the upper left corner.\n", + "OFFSET_X = -SCALE_X * PATCH_SIZE / 2\n", + "OFFSET_Y = -SCALE_Y * PATCH_SIZE / 2\n", + "\n", + "# Request template.\n", + "REQUEST = {\n", + " 'fileFormat': 'NPY',\n", + " 'grid': {\n", + " 'dimensions': {\n", + " 'width': PATCH_SIZE,\n", + " 'height': PATCH_SIZE\n", + " },\n", + " 'affineTransform': {\n", + " 'scaleX': SCALE_X,\n", + " 'shearX': 0,\n", + " 'shearY': 0,\n", + " 'scaleY': SCALE_Y,\n", + " },\n", + " 'crsCode': proj['crs']\n", + " }\n", + " }\n", + "\n", + "# Blue, green, red, NIR, AOT.\n", + "FEATURES = ['B2_median', 'B3_median', 'B4_median', 'B8_median', 'AOT_median']\n", + "\n", + "# Bay area.\n", + "TEST_ROI = ee.Geometry.Rectangle(\n", + " [-123.05832753906247, 37.03109527141115,\n", + " 
-121.14121328124997, 38.24468432993584])\n", + "# San Francisco.\n", + "TEST_COORDS = [-122.43519674072265, 37.78010979412811]\n", + "\n", + "TEST_DATE = ee.Date('2021-06-01')\n", + "\n", + "# Number of samples per ROI, and per TFRecord file.\n", + "N = 64\n", + "\n", + "# Specify the size and shape of patches expected by the model.\n", + "KERNEL_SHAPE = [PATCH_SIZE, PATCH_SIZE]\n", + "COLUMNS = [\n", + " tf.io.FixedLenFeature(shape=KERNEL_SHAPE, dtype=tf.float32) for k in FEATURES\n", + "]\n", + "FEATURES_DICT = dict(zip(FEATURES, COLUMNS))" + ], + "metadata": { + "id": "hj_ZujvvFlGR" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "source": [ + "## Image retrieval functions\n", + "\n", + "This section includes functions to compute a Sentinel-2 median composite and get a patch of pixels from the composite, centered on the provided coordinates, as either a numpy array or a JPEG thumbnail (for visualization). The functions that request patches can be retried automatically by decorating them with [Retry](https://googleapis.dev/python/google-api-core/latest/retry.html)." + ], + "metadata": { + "id": "vbEM4nlUOmQn" + } + }, + { + "cell_type": "code", + "source": [ + "def get_s2_composite(roi, date):\n", + " \"\"\"Get a two-month Sentinel-2 median composite in the ROI.\"\"\"\n", + " start = date.advance(-1, 'month')\n", + " end = date.advance(1, 'month')\n", + "\n", + " s2 = ee.ImageCollection('COPERNICUS/S2_HARMONIZED')\n", + " s2c = ee.ImageCollection('COPERNICUS/S2_CLOUD_PROBABILITY')\n", + " s2Sr = ee.ImageCollection('COPERNICUS/S2_SR_HARMONIZED')\n", + "\n", + " s2c = s2c.filterBounds(roi).filterDate(start, end)\n", + " s2Sr = s2Sr.filterDate(start, end).filterBounds(roi)\n", + "\n", + " def indexJoin(collectionA, collectionB, propertyName):\n", + " joined = ee.ImageCollection(ee.Join.saveFirst(propertyName).apply(\n", + " primary=collectionA,\n", + " secondary=collectionB,\n", + " condition=ee.Filter.equals(\n", + " leftField='system:index',\n", + " rightField='system:index'\n", + " ))\n", + " )\n", + " return joined.map(lambda image : image.addBands(ee.Image(image.get(propertyName))))\n", + "\n", + " def maskImage(image):\n", + " s2c = image.select('probability')\n", + " return image.updateMask(s2c.lt(50))\n", + "\n", + " withCloudProbability = indexJoin(s2Sr, s2c, 'cloud_probability')\n", + " masked = ee.ImageCollection(withCloudProbability.map(maskImage))\n", + " return masked.reduce(ee.Reducer.median(), 8)\n", + "\n", + "\n", + "@retry.Retry()\n", + "def get_patch(coords, image):\n", + " \"\"\"Get a patch centered on the coordinates, as a numpy array.\"\"\"\n", + " request = dict(REQUEST)\n", + " request['expression'] = image\n", + " request['grid']['affineTransform']['translateX'] = coords[0] + OFFSET_X\n", + " request['grid']['affineTransform']['translateY'] = coords[1] + OFFSET_Y\n", + " return np.load(io.BytesIO(ee.data.computePixels(request)))\n", + "\n", + "\n", + "@retry.Retry()\n", + "def get_display_image(coords, image):\n", + " \"\"\"Helper to display a patch using notebook widgets.\"\"\"\n", + " point = ee.Geometry.Point(coords)\n", + " region = point.buffer(64 * 10).bounds()\n", + " url = image.getThumbURL({\n", + " 'region': region,\n", + " 'dimensions': '128x128',\n", + " 'format': 'jpg',\n", + " 'min': 0, 'max': 5000,\n", + " 'bands': ['B4_median', 'B3_median', 'B2_median']\n", + " })\n", + "\n", + " r = requests.get(url, stream=True)\n", + " if r.status_code != 200:\n", + " raise 
google.api_core.exceptions.from_http_response(r)\n", + "\n", + " return r.content" + ], + "metadata": { + "id": "VMBgRRUARTH1" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "source": [ + "TEST_IMAGE = get_s2_composite(TEST_ROI, TEST_DATE)\n", + "image = get_display_image(TEST_COORDS, TEST_IMAGE)\n", + "Image(image)" + ], + "metadata": { + "id": "o6FH8sIlHElY" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "source": [ + "np_array = get_patch(TEST_COORDS, TEST_IMAGE)" + ], + "metadata": { + "id": "nQ60n8pMaRur" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "source": [ + "# This is a structured array.\n", + "print(np_array['B4_median'])" + ], + "metadata": { + "id": "QZFZud6Ia_n7" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "source": [ + "display_array = rfn.structured_to_unstructured(np_array[['B4_median', 'B3_median', 'B2_median']])/5000\n", + "plt.imshow(display_array)\n", + "plt.show()" + ], + "metadata": { + "id": "eG9aK0dh-IaK" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "source": [ + "## Sampling functions\n", + "\n", + "These are helper functions to get a random sample as a list of coordinates, sample the composite (using `computePixels`) at each coordinate, serialize numpy arrays to `tf.Example` protos and write them into a file. The sampling is handled in multiple threads using a `ThreadPoolExecutor`." + ], + "metadata": { + "id": "c7fC63m4Ow8e" + } + }, + { + "cell_type": "code", + "source": [ + "def get_sample_coords(roi, n):\n", + " \"\"\"\"Get a random sample of N points in the ROI.\"\"\"\n", + " points = ee.FeatureCollection.randomPoints(region=roi, points=n, maxError=1)\n", + " return points.aggregate_array('.geo').getInfo()\n", + "\n", + "\n", + "def array_to_example(structured_array):\n", + " \"\"\"\"Serialize a structured numpy array into a tf.Example proto.\"\"\"\n", + " feature = {}\n", + " for f in FEATURES:\n", + " feature[f] = tf.train.Feature(\n", + " float_list = tf.train.FloatList(\n", + " value = structured_array[f].flatten()))\n", + " return tf.train.Example(\n", + " features = tf.train.Features(feature = feature))\n", + "\n", + "\n", + "def write_dataset(image, sample_points, file_name):\n", + " \"\"\"\"Write patches at the sample points into a TFRecord file.\"\"\"\n", + " future_to_point = {\n", + " EXECUTOR.submit(get_patch, point['coordinates'], image): point for point in sample_points\n", + " }\n", + "\n", + " # Optionally compress files.\n", + " writer = tf.io.TFRecordWriter(file_name)\n", + "\n", + " for future in concurrent.futures.as_completed(future_to_point):\n", + " point = future_to_point[future]\n", + " try:\n", + " np_array = future.result()\n", + " example_proto = array_to_example(np_array)\n", + " writer.write(example_proto.SerializeToString())\n", + " writer.flush()\n", + " except Exception as e:\n", + " print(e)\n", + " pass\n", + "\n", + " writer.close()" + ], + "metadata": { + "id": "NeKS5M-kRT4r" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "source": [ + "EXECUTOR = concurrent.futures.ThreadPoolExecutor(max_workers=N)" + ], + "metadata": { + "id": "Hs_FozNIQFXI" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "source": [ + "# These could come from anywhere. 
Here is just a random sample.\n", + "sample_points = get_sample_coords(TEST_ROI, N)\n", + "\n", + "# Sample patches from the image at each point. Each sample is\n", + "# fetched in parallel using the ThreadPoolExecutor.\n", + "write_dataset(TEST_IMAGE, sample_points, OUTPUT_FILE)" + ], + "metadata": { + "id": "1dDAqyH5dZBK" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "source": [ + "## Check the written file\n", + "\n", + "Load and inspect the written file by visualizing a few patches." + ], + "metadata": { + "id": "AyoEjI31O67O" + } + }, + { + "cell_type": "code", + "source": [ + "def parse_tfrecord(example_proto):\n", + " \"\"\"Parse a serialized example.\"\"\"\n", + " return tf.io.parse_single_example(example_proto, FEATURES_DICT)\n", + "\n", + "dataset = tf.data.TFRecordDataset(OUTPUT_FILE)\n", + "dataset = dataset.map(parse_tfrecord, num_parallel_calls=5)" + ], + "metadata": { + "id": "H3_SQsQCu9bh" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "source": [ + "take_20 = dataset.take(20)\n", + "\n", + "for data in take_20:\n", + " rgb = np.stack([\n", + " data['B4_median'].numpy(),\n", + " data['B3_median'].numpy(),\n", + " data['B2_median'].numpy()], 2) / 5000\n", + " plt.imshow(rgb)\n", + " plt.show()\n" + ], + "metadata": { + "id": "MBlWwC0_SycO" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "source": [ + "## Where to go next\n", + "\n", + " - Learn about how to scale training data generation pipelines with Apache Beam in [this demo](https://github.com/GoogleCloudPlatform/python-docs-samples/tree/main/people-and-planet-ai/land-cover-classification).\n", + " - Learn about training models on Vertex AI in [this doc](/earth-engine/guides/tf_examples#semantic-segmentation-with-an-fcnn-trained-and-hosted-on-vertex-ai)." + ], + "metadata": { + "id": "uwcryQrV5E8m" + } + } + ] +} \ No newline at end of file diff --git a/guides/linked/Earth_Engine_training_patches_getPixels.ipynb b/guides/linked/Earth_Engine_training_patches_getPixels.ipynb new file mode 100644 index 000000000..1fd697612 --- /dev/null +++ b/guides/linked/Earth_Engine_training_patches_getPixels.ipynb @@ -0,0 +1,381 @@ +{ + "nbformat": 4, + "nbformat_minor": 0, + "metadata": { + "colab": { + "provenance": [], + "toc_visible": true + }, + "kernelspec": { + "name": "python3", + "display_name": "Python 3" + }, + "language_info": { + "name": "python" + } + }, + "cells": [ + { + "cell_type": "code", + "metadata": { + "id": "fSIfBsgi8dNK" + }, + "source": [ + "#@title Copyright 2023 Google LLC. { display-mode: \"form\" }\n", + "# Licensed under the Apache License, Version 2.0 (the \"License\");\n", + "# you may not use this file except in compliance with the License.\n", + "# You may obtain a copy of the License at\n", + "#\n", + "# https://www.apache.org/licenses/LICENSE-2.0\n", + "#\n", + "# Unless required by applicable law or agreed to in writing, software\n", + "# distributed under the License is distributed on an \"AS IS\" BASIS,\n", + "# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n", + "# See the License for the specific language governing permissions and\n", + "# limitations under the License." + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "aV1xZ1CPi3Nw" + }, + "source": [ + "
\n", + "\n", + " Run in Google Colab\n", + "\n", + " View source on GitHub
" + ] + }, + { + "cell_type": "markdown", + "source": [ + "# Download training patches from Earth Engine\n", + "\n", + "This demonstration shows how to get patches of imagery from Earth Engine assets. Specifically, use `getPixels` calls in parallel to write a TFRecord file." + ], + "metadata": { + "id": "9SV-E0p6PpGr" + } + }, + { + "cell_type": "markdown", + "source": [ + "## Imports" + ], + "metadata": { + "id": "uvlyhBESPQKW" + } + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "rppQiHjZPX_y" + }, + "outputs": [], + "source": [ + "import concurrent\n", + "import ee\n", + "import google\n", + "import io\n", + "import json\n", + "import matplotlib.pyplot as plt\n", + "import matplotlib.animation as animation\n", + "import multiprocessing\n", + "import numpy as np\n", + "import requests\n", + "import tensorflow as tf\n", + "\n", + "from google.api_core import retry\n", + "from google.colab import auth\n", + "from google.protobuf import json_format\n", + "from IPython.display import Image\n", + "from matplotlib import rc\n", + "from tqdm.notebook import tqdm\n", + "\n", + "rc('animation', html='html5')" + ] + }, + { + "cell_type": "markdown", + "source": [ + "## Authentication and initialization\n", + "\n", + "Use the Colab auth widget to get credentials, then use them to initialize Earth Engine. During initialization, be sure to specify a project and Earth Engine's [high-volume endpoint](https://developers.google.com/earth-engine/cloud/highvolume), in order to make automated requests." + ], + "metadata": { + "id": "pbLzoz4klKwH" + } + }, + { + "cell_type": "code", + "source": [ + "# REPLACE WITH YOUR PROJECT!\n", + "PROJECT = 'your-project'" + ], + "metadata": { + "id": "HN5H25U_JBdp" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "source": [ + "auth.authenticate_user()" + ], + "metadata": { + "id": "TLmI05-wT_GD" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "source": [ + "credentials, _ = google.auth.default()\n", + "ee.Initialize(credentials, project=PROJECT, opt_url='https://earthengine-highvolume.googleapis.com')" + ], + "metadata": { + "id": "c5bEkwQHUDPS" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "source": [ + "## Define variables" + ], + "metadata": { + "id": "q7rHLQsPuwyb" + } + }, + { + "cell_type": "code", + "source": [ + "# REPLACE WITH YOUR BUCKET!\n", + "OUTPUT_FILE = 'gs://your-bucket/your-file.tfrecord.gz'\n", + "\n", + "# MODIS vegetation indices, 16-day.\n", + "MOD13Q1 = ee.ImageCollection('MODIS/061/MOD13Q1').select('NDVI')\n", + "\n", + "# Output resolution in meters.\n", + "SCALE = 250\n", + "\n", + "# Bay area.\n", + "ROI = ee.Geometry.Rectangle(\n", + " [-123.05832753906247, 37.03109527141115,\n", + " -121.14121328124997, 38.24468432993584])\n", + "\n", + "# Number of samples per ROI, per year, and per TFRecord file.\n", + "N = 64\n", + "\n", + "# A random sample of N locations in the ROI as a list of GeoJSON points.\n", + "SAMPLE = ee.FeatureCollection.randomPoints(\n", + " region=ROI, points=N, maxError=1).aggregate_array('.geo').getInfo()\n", + "\n", + "# The years from which to sample every 16-day composite.\n", + "YEARS = np.arange(2010, 2023)" + ], + "metadata": { + "id": "hj_ZujvvFlGR" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "source": [ + "## Image retrieval functions\n", + "\n", + "This section has a function to get a 1000x1000 meter patch of pixels from an asset, 
centered on the provided coordinates, as a numpy array. The function can be retried automatically by using the [Retry](https://googleapis.dev/python/google-api-core/latest/retry.html) decorator. There is also a function to serialize a structured array to a `tf.Example` proto." + ], + "metadata": { + "id": "vbEM4nlUOmQn" + } + }, + { + "cell_type": "code", + "source": [ + "@retry.Retry()\n", + "def get_patch(coords, asset_id, band):\n", + " \"\"\"Get a patch of pixels from an asset, centered on the coords.\"\"\"\n", + " point = ee.Geometry.Point(coords)\n", + " request = {\n", + " 'fileFormat': 'NPY',\n", + " 'bandIds': [band],\n", + " 'region': point.buffer(1000).bounds().getInfo(),\n", + " 'assetId': asset_id\n", + " }\n", + " return np.load(io.BytesIO(ee.data.getPixels(request)))[band]\n", + "\n", + "\n", + "def _float_feature(floats):\n", + " \"\"\"Returns a float_list from a float list.\"\"\"\n", + " return tf.train.Feature(float_list=tf.train.FloatList(value=floats))\n", + "\n", + "\n", + "def array_to_example(struct_array):\n", + " \"\"\"\"Serialize a structured numpy array into a tf.Example proto.\"\"\"\n", + " struct_names = struct_array.dtype.names\n", + " feature = {}\n", + " shape = np.shape(struct_array[struct_names[0]])\n", + " feature['h'] = _float_feature([shape[1]])\n", + " feature['w'] = _float_feature([shape[2]])\n", + " for f in struct_names:\n", + " feature[f] = _float_feature(struct_array[f].flatten())\n", + " return tf.train.Example(\n", + " features = tf.train.Features(feature = feature))" + ], + "metadata": { + "id": "NeKS5M-kRT4r" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "source": [ + "# Get patches from the images\n", + "\n", + "In the variable declarations, there's a random sample in an arbitrary region of interest and a year range. At each point in the sample, in each year, in each 16-day composite, get a patch. The patch extraction is handled in multiple threads using a `ThreadPoolExecutor`. Write into TFRecords where each record stores all patches for a (point, year) combination." 
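Before launching the full loop in the next cell, it can be worth smoke-testing `get_patch` on a single composite and sample point. This is a minimal sketch that reuses the variables defined above and mirrors how the loop looks up image IDs.

```python
# Mirror the loop below: list one year's composites and take the first ID.
images_2021 = MOD13Q1.filter(
    ee.Filter.calendarRange(2021, 2021, 'year')).getInfo()['features']
test_id = images_2021[0]['id']
test_point = SAMPLE[0]['coordinates']

# A single getPixels request, returned as a numpy array of NDVI values.
test_patch = get_patch(test_point, test_id, 'NDVI')
print(test_id, test_patch.shape)
```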
+ ], + "metadata": { + "id": "H-3tJLl4WRkS" + } + }, + { + "cell_type": "code", + "source": [ + "executor = concurrent.futures.ThreadPoolExecutor(max_workers=200)\n", + "\n", + "writer = tf.io.TFRecordWriter(OUTPUT_FILE, 'GZIP')\n", + "\n", + "for point in tqdm(SAMPLE):\n", + " for year in tqdm(YEARS):\n", + " year = int(year)\n", + " images = MOD13Q1.filter(\n", + " ee.Filter.calendarRange(year, year, 'year')).getInfo()['features']\n", + "\n", + " future_to_image = {\n", + " executor.submit(get_patch, point['coordinates'], image['id'], 'NDVI'):\n", + " image['id'] for image in images\n", + " }\n", + "\n", + " arrays = ()\n", + " types = []\n", + " for future in concurrent.futures.as_completed(future_to_image):\n", + " image_id = future_to_image[future]\n", + " image_name = image_id.split('/')[-1]\n", + " try:\n", + " np_array = future.result()\n", + " arrays += (np_array,)\n", + " types.append((image_name, np.int_, np_array.shape))\n", + " except Exception as e:\n", + " print(e)\n", + " pass\n", + " array = np.array([arrays], types)\n", + " example_proto = array_to_example(array)\n", + " writer.write(example_proto.SerializeToString())\n", + " writer.flush()\n", + "\n", + "writer.close()" + ], + "metadata": { + "id": "Hs_FozNIQFXI" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "source": [ + "## Inspect the written files\n", + "\n", + "The parsing function dynamically determines the shape and keys of each record, which may vary by point and year. Once the data are parsed, they can be displayed as an animation: one year's worth of NDVI change in a patch centered on the point." + ], + "metadata": { + "id": "-ecw45dSPF-J" + } + }, + { + "cell_type": "code", + "source": [ + "h_col = tf.io.FixedLenFeature(shape=(1), dtype=tf.float32)\n", + "w_col = tf.io.FixedLenFeature(shape=(1), dtype=tf.float32)\n", + "hw_dict = {'h': h_col, 'w': w_col}\n", + "\n", + "def parse_tfrecord(example_proto):\n", + " \"\"\"Parse a serialized example, dynamic determination of shape and keys.\"\"\"\n", + " hw = tf.io.parse_single_example(example_proto, hw_dict)\n", + " h = int(hw['h'].numpy())\n", + " w = int(hw['w'].numpy())\n", + "\n", + " example = tf.train.Example()\n", + " example.ParseFromString(example_proto.numpy())\n", + " f_list = list(example.features.feature.keys())\n", + " f_dict = {e: tf.io.FixedLenFeature(shape=(h,w), dtype=tf.float32) for e in f_list if e not in ('h', 'w')}\n", + " return tf.io.parse_single_example(example_proto, f_dict)" + ], + "metadata": { + "id": "wxcbEKQxaua9" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "source": [ + "dataset = tf.data.TFRecordDataset(OUTPUT_FILE, compression_type='GZIP')\n", + "parsed_data = [parse_tfrecord(rec) for rec in dataset]" + ], + "metadata": { + "id": "v0eA8-ikbJUP" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "source": [ + "### Get an animation of the data in a record\n", + "\n", + "See [this reference](https://matplotlib.org/stable/gallery/animation/dynamic_image.html) for details, including options to save the animation." 
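For example, once `ani` has been built by the next cell, it can be written out to disk; this one-line sketch assumes the Pillow GIF writer is available in the runtime.

```python
# Save the NDVI time-series animation created below to a GIF file.
ani.save('ndvi_patch.gif', writer='pillow', fps=10)
```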
+ ], + "metadata": { + "id": "w6JK6FUXMSkE" + } + }, + { + "cell_type": "code", + "source": [ + "array_dict = parsed_data[400]\n", + "\n", + "fig, ax = plt.subplots()\n", + "\n", + "# This order.\n", + "images_names = np.sort(list(array_dict.keys()))\n", + "first_image = images_names[0]\n", + "\n", + "ax.imshow(np.squeeze(array_dict[first_image])) # show an initial one first\n", + "ims = []\n", + "for image in images_names[1:]:\n", + " im = ax.imshow(np.squeeze(array_dict[image]), animated=True)\n", + " ims.append([im])\n", + "\n", + "ani = animation.ArtistAnimation(fig, ims, interval=100, blit=True, repeat_delay=1000)\n", + "ani" + ], + "metadata": { + "id": "YoaiLbl9GCfz" + }, + "execution_count": null, + "outputs": [] + } + ] +} \ No newline at end of file diff --git a/guides/linked/TF_demo1_keras.ipynb b/guides/linked/TF_demo1_keras.ipynb new file mode 100644 index 000000000..52adba86c --- /dev/null +++ b/guides/linked/TF_demo1_keras.ipynb @@ -0,0 +1 @@ +{"nbformat":4,"nbformat_minor":0,"metadata":{"colab":{"name":"TF_demo1_keras.ipynb","provenance":[],"private_outputs":true,"collapsed_sections":[],"toc_visible":true},"kernelspec":{"name":"python3","display_name":"Python 3"},"accelerator":"GPU"},"cells":[{"cell_type":"code","metadata":{"id":"fSIfBsgi8dNK","colab_type":"code","colab":{}},"source":["#@title Copyright 2020 Google LLC. { display-mode: \"form\" }\n","# Licensed under the Apache License, Version 2.0 (the \"License\");\n","# you may not use this file except in compliance with the License.\n","# You may obtain a copy of the License at\n","#\n","# https://www.apache.org/licenses/LICENSE-2.0\n","#\n","# Unless required by applicable law or agreed to in writing, software\n","# distributed under the License is distributed on an \"AS IS\" BASIS,\n","# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n","# See the License for the specific language governing permissions and\n","# limitations under the License."],"execution_count":0,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"aV1xZ1CPi3Nw","colab_type":"text"},"source":["
\n","\n"," Run in Google Colab\n","\n"," View source on GitHub
"]},{"cell_type":"markdown","metadata":{"id":"AC8adBmw-5m3","colab_type":"text"},"source":["# Introduction\n","\n","This is an Earth Engine <> TensorFlow demonstration notebook. Specifically, this notebook shows:\n","\n","1. Exporting training/testing data from Earth Engine in TFRecord format.\n","2. Preparing the data for use in a TensorFlow model.\n","2. Training and validating a simple model (Keras `Sequential` neural network) in TensorFlow.\n","3. Making predictions on image data exported from Earth Engine in TFRecord format.\n","4. Ingesting classified image data to Earth Engine in TFRecord format.\n","\n","This is intended to demonstrate a complete i/o pipeline. For a workflow that uses a [Google AI Platform](https://cloud.google.com/ai-platform) hosted model making predictions interactively, see [this example notebook](http://colab.research.google.com/github/google/earthengine-community/blob/master/guides/linked/Earth_Engine_TensorFlow_AI_Platform.ipynb)."]},{"cell_type":"markdown","metadata":{"id":"KiTyR3FNlv-O","colab_type":"text"},"source":["# Setup software libraries\n","\n","Import software libraries and/or authenticate as necessary."]},{"cell_type":"markdown","metadata":{"id":"dEM3FP4YakJg","colab_type":"text"},"source":["## Authenticate to Colab and Cloud\n","\n","To read/write from a Google Cloud Storage bucket to which you have access, it's necessary to authenticate (as yourself). *This should be the same account you use to login to Earth Engine*. When you run the code below, it will display a link in the output to an authentication page in your browser. Follow the link to a page that will let you grant permission to the Cloud SDK to access your resources. Copy the code from the permissions page back into this notebook and press return to complete the process.\n","\n","(You may need to run this again if you get a credentials error later.)"]},{"cell_type":"code","metadata":{"id":"sYyTIPLsvMWl","colab_type":"code","cellView":"code","colab":{}},"source":["from google.colab import auth\n","auth.authenticate_user()"],"execution_count":0,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"Ejxa1MQjEGv9","colab_type":"text"},"source":["## Authenticate to Earth Engine\n","\n","Authenticate to Earth Engine the same way you did to the Colab notebook. Specifically, run the code to display a link to a permissions page. This gives you access to your Earth Engine account. *This should be the same account you used to login to Cloud previously*. Copy the code from the Earth Engine permissions page back into the notebook and press return to complete the process."]},{"cell_type":"code","metadata":{"id":"HzwiVqbcmJIX","colab_type":"code","cellView":"code","colab":{}},"source":["import ee\n","ee.Authenticate()\n","ee.Initialize()"],"execution_count":0,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"iJ70EsoWND_0","colab_type":"text"},"source":["## Test the TensorFlow installation\n","\n","Import the TensorFlow library and check the version."]},{"cell_type":"code","metadata":{"id":"i1PrYRLaVw_g","colab_type":"code","cellView":"code","colab":{}},"source":["import tensorflow as tf\n","print(tf.__version__)"],"execution_count":0,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"b8Xcvjp6cLOL","colab_type":"text"},"source":["## Test the Folium installation\n","\n","We will use the Folium library for visualization. 
Import the library and check the version."]},{"cell_type":"code","metadata":{"id":"YiVgOXzBZJSn","colab_type":"code","colab":{}},"source":["import folium\n","print(folium.__version__)"],"execution_count":0,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"DrXLkJC2QJdP","colab_type":"text"},"source":["# Define variables\n","\n","This set of global variables will be used throughout. For this demo, you must have a Cloud Storage bucket into which you can write files. ([learn more about creating Cloud Storage buckets](https://cloud.google.com/storage/docs/creating-buckets)). You'll also need to specify your Earth Engine username, i.e. `users/USER_NAME` on the [Code Editor](https://code.earthengine.google.com/) Assets tab."]},{"cell_type":"code","metadata":{"id":"GHTOc5YLQZ5B","colab_type":"code","colab":{}},"source":["# Your Earth Engine username. This is used to import a classified image\n","# into your Earth Engine assets folder.\n","USER_NAME = 'username'\n","\n","# Cloud Storage bucket into which training, testing and prediction \n","# datasets will be written. You must be able to write into this bucket.\n","OUTPUT_BUCKET = 'your-bucket'\n","\n","# Use Landsat 8 surface reflectance data for predictors.\n","L8SR = ee.ImageCollection('LANDSAT/LC08/C01/T1_SR')\n","# Use these bands for prediction.\n","BANDS = ['B2', 'B3', 'B4', 'B5', 'B6', 'B7']\n","\n","# This is a training/testing dataset of points with known land cover labels.\n","LABEL_DATA = ee.FeatureCollection('projects/google/demo_landcover_labels')\n","# The labels, consecutive integer indices starting from zero, are stored in\n","# this property, set on each point.\n","LABEL = 'landcover'\n","# Number of label values, i.e. number of classes in the classification.\n","N_CLASSES = 3\n","\n","# These names are used to specify properties in the export of\n","# training/testing data and to define the mapping between names and data\n","# when reading into TensorFlow datasets.\n","FEATURE_NAMES = list(BANDS)\n","FEATURE_NAMES.append(LABEL)\n","\n","# File names for the training and testing datasets. These TFRecord files\n","# will be exported from Earth Engine into the Cloud Storage bucket.\n","TRAIN_FILE_PREFIX = 'Training_demo'\n","TEST_FILE_PREFIX = 'Testing_demo'\n","file_extension = '.tfrecord.gz'\n","TRAIN_FILE_PATH = 'gs://' + OUTPUT_BUCKET + '/' + TRAIN_FILE_PREFIX + file_extension\n","TEST_FILE_PATH = 'gs://' + OUTPUT_BUCKET + '/' + TEST_FILE_PREFIX + file_extension\n","\n","# File name for the prediction (image) dataset. The trained model will read\n","# this dataset and make predictions in each pixel.\n","IMAGE_FILE_PREFIX = 'Image_pixel_demo_'\n","\n","# The output path for the classified image (i.e. predictions) TFRecord file.\n","OUTPUT_IMAGE_FILE = 'gs://' + OUTPUT_BUCKET + '/Classified_pixel_demo.TFRecord'\n","# Export imagery in this region.\n","EXPORT_REGION = ee.Geometry.Rectangle([-122.7, 37.3, -121.8, 38.00])\n","# The name of the Earth Engine asset to be created by importing\n","# the classified image from the TFRecord file in Cloud Storage.\n","OUTPUT_ASSET_ID = 'users/' + USER_NAME + '/Classified_pixel_demo'"],"execution_count":0,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"ZcjQnHH8zT4q","colab_type":"text"},"source":["# Get Training and Testing data from Earth Engine\n","\n","To get data for a classification model of three classes (bare, vegetation, water), we need labels and the value of predictor variables for each labeled example. We've already generated some labels in Earth Engine. 
Specifically, these are visually interpreted points labeled \"bare,\" \"vegetation,\" or \"water\" for a very simple classification demo ([example script](https://code.earthengine.google.com/?scriptPath=Examples%3ADemos%2FClassification)). For predictor variables, we'll use [Landsat 8 surface reflectance imagery](https://developers.google.com/earth-engine/datasets/catalog/LANDSAT_LC08_C01_T1_SR), bands 2-7."]},{"cell_type":"markdown","metadata":{"id":"0EJfjgelSOpN","colab_type":"text"},"source":["## Prepare Landsat 8 imagery\n","\n","First, make a cloud-masked median composite of Landsat 8 surface reflectance imagery from 2018. Check the composite by visualizing with folium."]},{"cell_type":"code","metadata":{"id":"DJYucYe3SPPr","colab_type":"code","colab":{}},"source":["# Cloud masking function.\n","def maskL8sr(image):\n"," cloudShadowBitMask = ee.Number(2).pow(3).int()\n"," cloudsBitMask = ee.Number(2).pow(5).int()\n"," qa = image.select('pixel_qa')\n"," mask = qa.bitwiseAnd(cloudShadowBitMask).eq(0).And(\n"," qa.bitwiseAnd(cloudsBitMask).eq(0))\n"," return image.updateMask(mask).select(BANDS).divide(10000)\n","\n","# The image input data is a 2018 cloud-masked median composite.\n","image = L8SR.filterDate('2018-01-01', '2018-12-31').map(maskL8sr).median()\n","\n","# Use folium to visualize the imagery.\n","mapid = image.getMapId({'bands': ['B4', 'B3', 'B2'], 'min': 0, 'max': 0.3})\n","map = folium.Map(location=[38., -122.5])\n","\n","folium.TileLayer(\n"," tiles=mapid['tile_fetcher'].url_format,\n"," attr='Map Data © Google Earth Engine',\n"," overlay=True,\n"," name='median composite',\n"," ).add_to(map)\n","map.add_child(folium.LayerControl())\n","map"],"execution_count":0,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"UEeyPf3zSPct","colab_type":"text"},"source":["## Add pixel values of the composite to labeled points\n","\n","Some training labels have already been collected for you. Load the labeled points from an existing Earth Engine asset. Each point in this table has a property called `landcover` that stores the label, encoded as an integer. Here we overlay the points on imagery to get predictor variables along with labels."]},{"cell_type":"code","metadata":{"id":"iOedOKyRExHE","colab_type":"code","colab":{}},"source":["# Sample the image at the points and add a random column.\n","sample = image.sampleRegions(\n"," collection=LABEL_DATA, properties=[LABEL], scale=30).randomColumn()\n","\n","# Partition the sample approximately 70-30.\n","training = sample.filter(ee.Filter.lt('random', 0.7))\n","testing = sample.filter(ee.Filter.gte('random', 0.7))\n","\n","from pprint import pprint\n","\n","# Print the first couple points to verify.\n","pprint({'training': training.first().getInfo()})\n","pprint({'testing': testing.first().getInfo()})"],"execution_count":0,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"uNc7a2nRR4MI","colab_type":"text"},"source":["## Export the training and testing data\n","\n","Now that there's training and testing data in Earth Engine and you've inspected a couple examples to ensure that the information you need is present, it's time to materialize the datasets in a place where the TensorFlow model has access to them. 
You can do that by exporting the training and testing datasets to tables in TFRecord format ([learn more about TFRecord format](https://www.tensorflow.org/tutorials/load_data/tf-records)) in your Cloud Storage bucket."]},{"cell_type":"code","metadata":{"id":"Pb-aPvQc0Xvp","colab_type":"code","colab":{}},"source":["# Make sure you can see the output bucket. You must have write access.\n","print('Found Cloud Storage bucket.' if tf.io.gfile.exists('gs://' + OUTPUT_BUCKET) \n"," else 'Can not find output Cloud Storage bucket.')"],"execution_count":0,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"Wtoqj0Db1TmJ","colab_type":"text"},"source":["Once you've verified the existence of the intended output bucket, run the exports."]},{"cell_type":"code","metadata":{"id":"TfVNQzg8R6Wy","colab_type":"code","colab":{}},"source":["# Create the tasks.\n","training_task = ee.batch.Export.table.toCloudStorage(\n"," collection=training,\n"," description='Training Export',\n"," fileNamePrefix=TRAIN_FILE_PREFIX,\n"," bucket=OUTPUT_BUCKET,\n"," fileFormat='TFRecord',\n"," selectors=FEATURE_NAMES)\n","\n","testing_task = ee.batch.Export.table.toCloudStorage(\n"," collection=testing,\n"," description='Testing Export',\n"," fileNamePrefix=TEST_FILE_PREFIX,\n"," bucket=OUTPUT_BUCKET,\n"," fileFormat='TFRecord',\n"," selectors=FEATURE_NAMES)"],"execution_count":0,"outputs":[]},{"cell_type":"code","metadata":{"id":"QF4WGIekaS2s","colab_type":"code","colab":{}},"source":["# Start the tasks.\n","training_task.start()\n","testing_task.start()"],"execution_count":0,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"q7nFLuySISeC","colab_type":"text"},"source":["### Monitor task progress\n","\n","You can see all your Earth Engine tasks by listing them. Make sure the training and testing tasks are completed before continuing."]},{"cell_type":"code","metadata":{"id":"oEWvS5ekcEq0","colab_type":"code","colab":{}},"source":["# Print all tasks.\n","pprint(ee.batch.Task.list())"],"execution_count":0,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"43-c0JNFI_m6","colab_type":"text"},"source":["### Check existence of the exported files\n","\n","If you've seen the status of the export tasks change to `COMPLETED`, then check for the existence of the files in the output Cloud Storage bucket."]},{"cell_type":"code","metadata":{"id":"YDZfNl6yc0Kj","colab_type":"code","colab":{}},"source":["print('Found training file.' if tf.io.gfile.exists(TRAIN_FILE_PATH) \n"," else 'No training file found.')\n","print('Found testing file.' if tf.io.gfile.exists(TEST_FILE_PATH) \n"," else 'No testing file found.')"],"execution_count":0,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"NA8QA8oQVo8V","colab_type":"text"},"source":["## Export the imagery\n","\n","You can also export imagery using TFRecord format. 
Specifically, export whatever imagery you want to be classified by the trained model into the output Cloud Storage bucket."]},{"cell_type":"code","metadata":{"id":"tVNhJYacVpEw","colab_type":"code","colab":{}},"source":["# Specify patch and file dimensions.\n","image_export_options = {\n"," 'patchDimensions': [256, 256],\n"," 'maxFileSize': 104857600,\n"," 'compressed': True\n","}\n","\n","# Setup the task.\n","image_task = ee.batch.Export.image.toCloudStorage(\n"," image=image,\n"," description='Image Export',\n"," fileNamePrefix=IMAGE_FILE_PREFIX,\n"," bucket=OUTPUT_BUCKET,\n"," scale=30,\n"," fileFormat='TFRecord',\n"," region=EXPORT_REGION.toGeoJSON()['coordinates'],\n"," formatOptions=image_export_options,\n",")"],"execution_count":0,"outputs":[]},{"cell_type":"code","metadata":{"id":"6SweCkHDaNE3","colab_type":"code","colab":{}},"source":["# Start the task.\n","image_task.start()"],"execution_count":0,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"JC8C53MRTG_E","colab_type":"text"},"source":["### Monitor task progress"]},{"cell_type":"code","metadata":{"id":"BmPHb779KOXm","colab_type":"code","colab":{}},"source":["# Print all tasks.\n","pprint(ee.batch.Task.list())"],"execution_count":0,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"SrUhA1JKLONj","colab_type":"text"},"source":["It's also possible to monitor an individual task. Here we poll the task until it's done. If you do this, please put a `sleep()` in the loop to avoid making too many requests. Note that this will block until complete (you can always halt the execution of this cell)."]},{"cell_type":"code","metadata":{"id":"rKZeZswloP11","colab_type":"code","colab":{}},"source":["import time\n","\n","while image_task.active():\n"," print('Polling for task (id: {}).'.format(image_task.id))\n"," time.sleep(30)\n","print('Done with image export.')"],"execution_count":0,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"9vWdH_wlZCEk","colab_type":"text"},"source":["# Data preparation and pre-processing\n","\n","Read data from the TFRecord file into a `tf.data.Dataset`. Pre-process the dataset to get it into a suitable format for input to the model."]},{"cell_type":"markdown","metadata":{"id":"LS4jGTrEfz-1","colab_type":"text"},"source":["## Read into a `tf.data.Dataset`\n","\n","Here we are going to read a file in Cloud Storage into a `tf.data.Dataset`. (See [these TensorFlow docs](https://www.tensorflow.org/guide/data) for more about reading data into a `Dataset`.) Check that you can read examples from the file; the purpose is simply to verify that the file can be read without an error. The actual content is not necessarily human readable.\n","\n"]},{"cell_type":"code","metadata":{"id":"T3PKyDQW8Vpx","colab_type":"code","cellView":"code","colab":{}},"source":["# Create a dataset from the TFRecord file in Cloud Storage.\n","train_dataset = tf.data.TFRecordDataset(TRAIN_FILE_PATH, compression_type='GZIP')\n","# Print the first record to check.\n","print(iter(train_dataset).next())"],"execution_count":0,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"BrDYm-ibKR6t","colab_type":"text"},"source":["## Define the structure of your data\n","\n","For parsing the exported TFRecord files, `features_dict` is a mapping between feature names (recall that `FEATURE_NAMES` contains the band and label names) and `float32` [`tf.io.FixedLenFeature`](https://www.tensorflow.org/api_docs/python/tf/io/FixedLenFeature) objects. This mapping is necessary for telling TensorFlow how to read data in a TFRecord file into tensors. 
Specifically, **all numeric data exported from Earth Engine is exported as `float32`**.\n","\n","(Note: *features* in the TensorFlow context (i.e. [`tf.train.Feature`](https://www.tensorflow.org/api_docs/python/tf/train/Feature)) are not to be confused with Earth Engine features (i.e. [`ee.Feature`](https://developers.google.com/earth-engine/api_docs#eefeature)), where the former is a protocol message type for serialized data input to the model and the latter is a geometry-based geographic data structure.)"]},{"cell_type":"code","metadata":{"id":"-6JVQV5HKHMZ","colab_type":"code","cellView":"code","colab":{}},"source":["# List of fixed-length features, all of which are float32.\n","columns = [\n"," tf.io.FixedLenFeature(shape=[1], dtype=tf.float32) for k in FEATURE_NAMES\n","]\n","\n","# Dictionary with names as keys, features as values.\n","features_dict = dict(zip(FEATURE_NAMES, columns))\n","\n","pprint(features_dict)"],"execution_count":0,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"QNfaUPbcjuCO","colab_type":"text"},"source":["## Parse the dataset\n","\n","Now we need to make a parsing function for the data in the TFRecord files. Each record contains the band values and the class label for a single sampled point; we want the band values as inputs to the model and the label as the target. The parsing function reads data from a serialized [`Example` proto](https://www.tensorflow.org/api_docs/python/tf/train/Example) into a dictionary in which the keys are the feature names and the values are the tensors storing the value of the features for that example. ([These TensorFlow docs](https://www.tensorflow.org/tutorials/load_data/tfrecord) explain more about reading `Example` protos from TFRecord files)."]},{"cell_type":"code","metadata":{"id":"x2Q0g3fBj2kD","colab_type":"code","cellView":"code","colab":{}},"source":["def parse_tfrecord(example_proto):\n"," \"\"\"The parsing function.\n","\n"," Read a serialized example into the structure defined by features_dict.\n","\n"," Args:\n"," example_proto: a serialized Example.\n","\n"," Returns:\n"," A tuple of the predictors dictionary and the label, cast to an `int32`.\n"," \"\"\"\n"," parsed_features = tf.io.parse_single_example(example_proto, features_dict)\n"," labels = parsed_features.pop(LABEL)\n"," return parsed_features, tf.cast(labels, tf.int32)\n","\n","# Map the function over the dataset.\n","parsed_dataset = train_dataset.map(parse_tfrecord, num_parallel_calls=5)\n","\n","# Print the first parsed record to check.\n","pprint(iter(parsed_dataset).next())"],"execution_count":0,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"Nb8EyNT4Xnhb","colab_type":"text"},"source":["Note that each record of the parsed dataset contains a tuple. The first element of the tuple is a dictionary with bands for keys and the numeric value of the bands for values. The second element of the tuple is a class label."]},{"cell_type":"markdown","metadata":{"id":"xLCsxWOuEBmE","colab_type":"text"},"source":["## Create additional features\n","\n","Another thing we might want to do as part of the input process is to create new features, for example NDVI, a vegetation index computed from reflectance in two spectral bands. Here are some helper functions for that."]},{"cell_type":"code","metadata":{"id":"lT6v2RM_EB1E","colab_type":"code","cellView":"code","colab":{}},"source":["def normalized_difference(a, b):\n"," \"\"\"Compute normalized difference of two inputs.\n","\n"," Compute (a - b) / (a + b). 
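For example, NDVI = (NIR - Red) / (NIR + Red).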
If the denominator is zero, add a small delta.\n","\n"," Args:\n"," a: an input tensor with shape=[1]\n"," b: an input tensor with shape=[1]\n","\n"," Returns:\n"," The normalized difference as a tensor.\n"," \"\"\"\n"," nd = (a - b) / (a + b)\n"," nd_inf = (a - b) / (a + b + 0.000001)\n"," return tf.where(tf.math.is_finite(nd), nd, nd_inf)\n","\n","def add_NDVI(features, label):\n"," \"\"\"Add NDVI to the dataset.\n"," Args:\n"," features: a dictionary of input tensors keyed by feature name.\n"," label: the target label\n","\n"," Returns:\n"," A tuple of the input dictionary with an NDVI tensor added and the label.\n"," \"\"\"\n"," features['NDVI'] = normalized_difference(features['B5'], features['B4'])\n"," return features, label"],"execution_count":0,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"nEx1RAXOZQkS","colab_type":"text"},"source":["# Model setup\n","\n","The basic workflow for classification in TensorFlow is:\n","\n","1. Create the model.\n","2. Train the model (i.e. `fit()`).\n","3. Use the trained model for inference (i.e. `predict()`).\n","\n","Here we'll create a `Sequential` neural network model using Keras. This simple model is inspired by examples in:\n","\n","* [The TensorFlow Get Started tutorial](https://www.tensorflow.org/tutorials/)\n","* [The TensorFlow Keras guide](https://www.tensorflow.org/guide/keras#build_a_simple_model)\n","* [The Keras `Sequential` model examples](https://keras.io/getting-started/sequential-model-guide/#multilayer-perceptron-mlp-for-multi-class-softmax-classification)\n","\n","Note that the model used here is purely for demonstration purposes and hasn't gone through any performance tuning."]},{"cell_type":"markdown","metadata":{"id":"t9pWa54oG-xl","colab_type":"text"},"source":["## Create the Keras model\n","\n","Before we create the model, there's still a wee bit of pre-processing to get the data into the right input shape and a format that can be used with cross-entropy loss. Specifically, Keras expects a list of inputs and a one-hot vector for the class. (See [the Keras loss function docs](https://keras.io/losses/), [the TensorFlow categorical identity docs](https://www.tensorflow.org/guide/feature_columns#categorical_identity_column) and [the `tf.one_hot` docs](https://www.tensorflow.org/api_docs/python/tf/one_hot) for details). \n","\n","Here we will use a simple neural network model with a 64-node hidden layer, a dropout layer and an output layer. Once the dataset has been prepared, define the model, compile it, and fit it to the training data. See [the Keras `Sequential` model guide](https://keras.io/getting-started/sequential-model-guide/) for more details."]},{"cell_type":"code","metadata":{"id":"OCZq3VNpG--G","colab_type":"code","cellView":"code","colab":{}},"source":["from tensorflow import keras\n","\n","# Add NDVI.\n","input_dataset = parsed_dataset.map(add_NDVI)\n","\n","# Keras requires inputs as a tuple. Note that the inputs must be in the\n","# right shape. 
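Here, tf.transpose turns the dictionary of shape-[1]\n","# feature tensors into a single row of shape [1, number of features]. 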
Also note that to use the categorical_crossentropy loss,\n","# the label needs to be turned into a one-hot vector.\n","def to_tuple(inputs, label):\n"," return (tf.transpose(list(inputs.values())),\n"," tf.one_hot(indices=label, depth=N_CLASSES))\n","\n","# Map the to_tuple function over the dataset and batch it.\n","input_dataset = input_dataset.map(to_tuple).batch(8)\n","\n","# Define the layers in the model.\n","model = tf.keras.models.Sequential([\n"," tf.keras.layers.Dense(64, activation=tf.nn.relu),\n"," tf.keras.layers.Dropout(0.2),\n"," tf.keras.layers.Dense(N_CLASSES, activation=tf.nn.softmax)\n","])\n","\n","# Compile the model with the specified loss function.\n","model.compile(optimizer=tf.keras.optimizers.Adam(),\n"," loss='categorical_crossentropy',\n"," metrics=['accuracy'])\n","\n","# Fit the model to the training data.\n","model.fit(x=input_dataset, epochs=10)"],"execution_count":0,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"Pa4ex_4eKiyb","colab_type":"text"},"source":["## Check model accuracy on the test set\n","\n","Now that we have a trained model, we can evaluate it using the test dataset. To do that, read and prepare the test dataset in the same way as the training dataset. Here we specify a batch size of 1 so that each example in the test set is used exactly once to compute model accuracy. Because the test dataset is a finite `tf.data.Dataset`, `evaluate()` will run over it exactly once; there is no need to specify a number of steps."]},{"cell_type":"code","metadata":{"id":"tE6d7FsrMa1p","colab_type":"code","cellView":"code","colab":{}},"source":["test_dataset = (\n"," tf.data.TFRecordDataset(TEST_FILE_PATH, compression_type='GZIP')\n"," .map(parse_tfrecord, num_parallel_calls=5)\n"," .map(add_NDVI)\n"," .map(to_tuple)\n"," .batch(1))\n","\n","model.evaluate(test_dataset)"],"execution_count":0,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"nhHrnv3VR0DU","colab_type":"text"},"source":["# Use the trained model to classify an image from Earth Engine\n","\n","Now it's time to classify the image that was exported from Earth Engine. If the exported image is large, it will be split into multiple TFRecord files in its destination folder. There will also be a JSON sidecar file called \"the mixer\" that describes the format and georeferencing of the image. Here we will find the image files and the mixer file, getting some info out of the mixer that will be useful during model inference."]},{"cell_type":"markdown","metadata":{"id":"nmTayDitZgQ5","colab_type":"text"},"source":["## Find the image files and JSON mixer file in Cloud Storage\n","\n","Use `gsutil` to locate the files of interest in the output Cloud Storage bucket. 
Check to make sure your image export task finished before running the following."]},{"cell_type":"code","metadata":{"id":"oUv9WMpcVp8E","colab_type":"code","colab":{}},"source":["# Get a list of all the files in the output bucket.\n","files_list = !gsutil ls 'gs://'{OUTPUT_BUCKET}\n","# Get only the files generated by the image export.\n","exported_files_list = [s for s in files_list if IMAGE_FILE_PREFIX in s]\n","\n","# Get the list of image files and the JSON mixer file.\n","image_files_list = []\n","json_file = None\n","for f in exported_files_list:\n"," if f.endswith('.tfrecord.gz'):\n"," image_files_list.append(f)\n"," elif f.endswith('.json'):\n"," json_file = f\n","\n","# Make sure the files are in the right order.\n","image_files_list.sort()\n","\n","pprint(image_files_list)\n","print(json_file)"],"execution_count":0,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"RcjYG9fk53xL","colab_type":"text"},"source":["## Read the JSON mixer file\n","\n","The mixer contains metadata and georeferencing information for the exported patches, each of which is in a different file. Read the mixer to get some information needed for prediction."]},{"cell_type":"code","metadata":{"id":"Gn7Dr0AAd93_","colab_type":"code","colab":{}},"source":["import json\n","\n","# Load the contents of the mixer file to a JSON object.\n","json_text = !gsutil cat {json_file}\n","# Get a single string with newlines from the IPython.utils.text.SList.\n","mixer = json.loads(json_text.nlstr)\n","pprint(mixer)"],"execution_count":0,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"6xyzyPPJwpVI","colab_type":"text"},"source":["## Read the image files into a dataset\n","\n","You can feed the list of files (`image_files_list`) directly to the `TFRecordDataset` constructor to make a combined dataset on which to perform inference. The input needs to be preprocessed differently than the training and testing data. 
Mainly, this is because the pixels are written into records as patches; we need to read each patch in as one big tensor (one patch for each band) and then flatten the patches into many small per-pixel tensors."]},{"cell_type":"code","metadata":{"id":"tn8Kj3VfwpiJ","colab_type":"code","cellView":"code","colab":{}},"source":["# Get relevant info from the JSON mixer file.\n","patch_width = mixer['patchDimensions'][0]\n","patch_height = mixer['patchDimensions'][1]\n","patches = mixer['totalPatches']\n","patch_dimensions_flat = [patch_width * patch_height, 1]\n","\n","# Note that the tensors are in the shape of a patch, one patch for each band.\n","image_columns = [\n"," tf.io.FixedLenFeature(shape=patch_dimensions_flat, dtype=tf.float32) \n"," for k in BANDS\n","]\n","\n","# Parsing dictionary.\n","image_features_dict = dict(zip(BANDS, image_columns))\n","\n","# Note that you can make one dataset from many files by specifying a list.\n","image_dataset = tf.data.TFRecordDataset(image_files_list, compression_type='GZIP')\n","\n","# Parsing function.\n","def parse_image(example_proto):\n"," return tf.io.parse_single_example(example_proto, image_features_dict)\n","\n","# Parse the data into tensors, one long tensor per patch.\n","image_dataset = image_dataset.map(parse_image, num_parallel_calls=5)\n","\n","# Break our long tensors into many little ones.\n","image_dataset = image_dataset.flat_map(\n"," lambda features: tf.data.Dataset.from_tensor_slices(features)\n",")\n","\n","# Add additional features (NDVI).\n","image_dataset = image_dataset.map(\n"," # Add NDVI to a feature that doesn't have a label.\n"," lambda features: add_NDVI(features, None)[0]\n",")\n","\n","# Turn the dictionary in each record into a tuple without a label.\n","image_dataset = image_dataset.map(\n"," lambda data_dict: (tf.transpose(list(data_dict.values())), )\n",")\n","\n","# Turn each patch into a batch.\n","image_dataset = image_dataset.batch(patch_width * patch_height)"],"execution_count":0,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"_2sfRemRRDkV","colab_type":"text"},"source":["## Generate predictions for the image pixels\n","\n","To get predictions for each pixel, run the image dataset through the trained model using `model.predict()`. Print the first prediction to see that the output is a list of the three class probabilities for each pixel. Running all predictions might take a while."]},{"cell_type":"code","metadata":{"id":"8VGhmiP_REBP","colab_type":"code","colab":{}},"source":["# Run prediction in batches, with as many steps as there are patches.\n","predictions = model.predict(image_dataset, steps=patches, verbose=1)\n","\n","# Note that the predictions come as a numpy array. Check the first one.\n","print(predictions[0])"],"execution_count":0,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"bPU2VlPOikAy","colab_type":"text"},"source":["## Write the predictions to a TFRecord file\n","\n","Now that there's a list of class probabilities in `predictions`, it's time to write them back into a file, optionally including a class label, which is simply the index of the maximum probability. We'll write directly from TensorFlow to a file in the output Cloud Storage bucket.\n","\n","Iterate over the list, compute the class label, and write the class and the probabilities in patches. Specifically, we need to write the pixels into the file as patches in the same order they came out. The records are written as serialized `tf.train.Example` protos. 
This might take a while."]},{"cell_type":"code","metadata":{"id":"AkorbsEHepzJ","colab_type":"code","colab":{}},"source":["print('Writing to file ' + OUTPUT_IMAGE_FILE)"],"execution_count":0,"outputs":[]},{"cell_type":"code","metadata":{"id":"kATMknHc0qeR","colab_type":"code","cellView":"code","colab":{}},"source":["# Instantiate the writer.\n","writer = tf.io.TFRecordWriter(OUTPUT_IMAGE_FILE)\n","\n","# Every patch-worth of predictions we'll dump an example into the output\n","# file with a single feature that holds our predictions. Since our predictions\n","# are already in the order of the exported data, the patches we create here\n","# will also be in the right order.\n","patch = [[], [], [], []]\n","cur_patch = 1\n","for prediction in predictions:\n"," patch[0].append(tf.argmax(prediction, 1))\n"," patch[1].append(prediction[0][0])\n"," patch[2].append(prediction[0][1])\n"," patch[3].append(prediction[0][2])\n"," # Once we've seen a patches-worth of class_ids...\n"," if (len(patch[0]) == patch_width * patch_height):\n"," print('Done with patch ' + str(cur_patch) + ' of ' + str(patches) + '...')\n"," # Create an example\n"," example = tf.train.Example(\n"," features=tf.train.Features(\n"," feature={\n"," 'prediction': tf.train.Feature(\n"," int64_list=tf.train.Int64List(\n"," value=patch[0])),\n"," 'bareProb': tf.train.Feature(\n"," float_list=tf.train.FloatList(\n"," value=patch[1])),\n"," 'vegProb': tf.train.Feature(\n"," float_list=tf.train.FloatList(\n"," value=patch[2])),\n"," 'waterProb': tf.train.Feature(\n"," float_list=tf.train.FloatList(\n"," value=patch[3])),\n"," }\n"," )\n"," )\n"," # Write the example to the file and clear our patch array so it's ready for\n"," # another batch of class ids\n"," writer.write(example.SerializeToString())\n"," patch = [[], [], [], []]\n"," cur_patch += 1\n","\n","writer.close()"],"execution_count":0,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"1K_1hKs0aBdA","colab_type":"text"},"source":["# Upload the classifications to an Earth Engine asset"]},{"cell_type":"markdown","metadata":{"id":"M6sNZXWOSa82","colab_type":"text"},"source":["## Verify the existence of the predictions file\n","\n","At this stage, there should be a predictions TFRecord file sitting in the output Cloud Storage bucket. Use the `gsutil` command to verify that the predictions image (and associated mixer JSON) exist and have non-zero size."]},{"cell_type":"code","metadata":{"id":"6ZVWDPefUCgA","colab_type":"code","colab":{}},"source":["!gsutil ls -l {OUTPUT_IMAGE_FILE}"],"execution_count":0,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"2ZyCo297Clcx","colab_type":"text"},"source":["## Upload the classified image to Earth Engine\n","\n","Upload the image to Earth Engine directly from the Cloud Storage bucket with the [`earthengine` command](https://developers.google.com/earth-engine/command_line#upload). 
Provide both the image TFRecord file and the JSON file as arguments to `earthengine upload`."]},{"cell_type":"code","metadata":{"id":"NXulMNl9lTDv","colab_type":"code","cellView":"code","colab":{}},"source":["print('Uploading to ' + OUTPUT_ASSET_ID)"],"execution_count":0,"outputs":[]},{"cell_type":"code","metadata":{"id":"V64tcVxsO5h6","colab_type":"code","colab":{}},"source":["# Start the upload.\n","!earthengine upload image --asset_id={OUTPUT_ASSET_ID} --pyramiding_policy=mode {OUTPUT_IMAGE_FILE} {json_file}"],"execution_count":0,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"Yt4HyhUU_Bal","colab_type":"text"},"source":["## Check the status of the asset ingestion\n","\n","You can also use the Earth Engine API to check the status of your asset upload. It might take a while. The upload of the image is an asset ingestion task."]},{"cell_type":"code","metadata":{"id":"_vB-gwGhl_3C","colab_type":"code","cellView":"code","colab":{}},"source":["ee.batch.Task.list()"],"execution_count":0,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"vvXvy9GDhM-p","colab_type":"text"},"source":["## View the ingested asset\n","\n","Display the vector of class probabilities as an RGB image with colors corresponding to the probability of bare, vegetation, water in a pixel. Also display the winning class using the same color palette."]},{"cell_type":"code","metadata":{"id":"kEkVxIyJiFd4","colab_type":"code","colab":{}},"source":["predictions_image = ee.Image(OUTPUT_ASSET_ID)\n","\n","prediction_vis = {\n"," 'bands': 'prediction',\n"," 'min': 0,\n"," 'max': 2,\n"," 'palette': ['red', 'green', 'blue']\n","}\n","probability_vis = {'bands': ['bareProb', 'vegProb', 'waterProb'], 'max': 0.5}\n","\n","prediction_map_id = predictions_image.getMapId(prediction_vis)\n","probability_map_id = predictions_image.getMapId(probability_vis)\n","\n","map = folium.Map(location=[37.6413, -122.2582])\n","folium.TileLayer(\n"," tiles=prediction_map_id['tile_fetcher'].url_format,\n"," attr='Map Data © Google Earth Engine',\n"," overlay=True,\n"," name='prediction',\n",").add_to(map)\n","folium.TileLayer(\n"," tiles=probability_map_id['tile_fetcher'].url_format,\n"," attr='Map Data © Google Earth Engine',\n"," overlay=True,\n"," name='probability',\n",").add_to(map)\n","map.add_child(folium.LayerControl())\n","map"],"execution_count":0,"outputs":[]}]} \ No newline at end of file diff --git a/guides/linked/Uploading_image_tiles_as_a_single_asset_using_a_manifest.ipynb b/guides/linked/Uploading_image_tiles_as_a_single_asset_using_a_manifest.ipynb new file mode 100644 index 000000000..e6fdf7868 --- /dev/null +++ b/guides/linked/Uploading_image_tiles_as_a_single_asset_using_a_manifest.ipynb @@ -0,0 +1 @@ +{"nbformat":4,"nbformat_minor":0,"metadata":{"colab":{"name":"Uploading_image_tiles_as_a_single_asset_using_a_manifest.ipynb","provenance":[{"file_id":"1nblLe678Tucbe0Iatdfo0fuztBDxedfp","timestamp":1588787451968}],"private_outputs":true,"collapsed_sections":[],"authorship_tag":"ABX9TyOq42l9DdvNKTF0Ej9L/AS6"},"kernelspec":{"name":"python3","display_name":"Python 3"}},"cells":[{"cell_type":"code","metadata":{"id":"fSIfBsgi8dNK","colab_type":"code","colab":{}},"source":["#@title Copyright 2020 Google LLC. 
{ display-mode: \"form\" }\n","# Licensed under the Apache License, Version 2.0 (the \"License\");\n","# you may not use this file except in compliance with the License.\n","# You may obtain a copy of the License at\n","#\n","# https://www.apache.org/licenses/LICENSE-2.0\n","#\n","# Unless required by applicable law or agreed to in writing, software\n","# distributed under the License is distributed on an \"AS IS\" BASIS,\n","# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n","# See the License for the specific language governing permissions and\n","# limitations under the License."],"execution_count":0,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"aV1xZ1CPi3Nw","colab_type":"text"},"source":["
\n","\n"," Run in Google Colab\n","\n"," View source on GitHub
"]},{"cell_type":"markdown","metadata":{"id":"RPBL-XjRFNop","colab_type":"text"},"source":["# Uploading an image from tiles using a manifest\n","\n","This notebook demonstrates uploading a set of image tiles into a single asset using a manifest file. See [this doc](https://developers.google.com/earth-engine/image_manifest) for more details about manifest upload using the Earth Engine command line tool.\n","\n","10-meter land cover images derived from Sentinel-2 ([reference](https://doi.org/10.1016/j.scib.2019.03.002)) from the [Finer Resolution Global Land Cover Mapping (FROM-GLC) website](http://data.ess.tsinghua.edu.cn/) are downloaded directly to a Cloud Storage bucket and uploaded to a single Earth Engine asset from there. A manifest file, described below, is used to configure the upload."]},{"cell_type":"markdown","metadata":{"id":"K57gwmayH24H","colab_type":"text"},"source":["First, authenticate with Google Cloud, so you can access Cloud Storage buckets."]},{"cell_type":"code","metadata":{"id":"a0WqP4vKIM5v","colab_type":"code","colab":{}},"source":["from google.colab import auth\n","auth.authenticate_user()"],"execution_count":0,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"1tPSX8ABIB36","colab_type":"text"},"source":["## Download to Cloud Storage\n","\n","Paths from [the provider website](http://data.ess.tsinghua.edu.cn/fromglc10_2017v01.html) are manually copied to a list object as demonstrated below. Download directly to a Cloud Storage bucket to which you can write."]},{"cell_type":"code","metadata":{"id":"TQGLIdH6IQmn","colab_type":"code","colab":{}},"source":["# URLs of a few tiles.\n","urls = [\n"," 'http://data.ess.tsinghua.edu.cn/data/fromglc10_2017v01/fromglc10v01_36_-120.tif',\n"," 'http://data.ess.tsinghua.edu.cn/data/fromglc10_2017v01/fromglc10v01_36_-122.tif',\n"," 'http://data.ess.tsinghua.edu.cn/data/fromglc10_2017v01/fromglc10v01_36_-124.tif',\n"," 'http://data.ess.tsinghua.edu.cn/data/fromglc10_2017v01/fromglc10v01_38_-120.tif',\n"," 'http://data.ess.tsinghua.edu.cn/data/fromglc10_2017v01/fromglc10v01_38_-122.tif',\n"," 'http://data.ess.tsinghua.edu.cn/data/fromglc10_2017v01/fromglc10v01_38_-124.tif'\n","]\n","\n","# You need to have write access to this bucket.\n","bucket = 'your-bucket-folder'\n","\n","# Pipe curl output to gsutil.\n","for f in urls:\n"," filepath = bucket + '/' + f.split('/')[-1]\n"," !curl {f} | gsutil cp - {filepath}"],"execution_count":0,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"nIOsWbLf66F-","colab_type":"text"},"source":["## Build the manifest file\n","\n","Build the manifest file from a dictionary. Turn the dictionary into JSON. Note the use of the `gsutil` tool to get a listing of files in a Cloud Storage bucket ([learn more about `gsutil`](https://cloud.google.com/storage/docs/gsutil)). Also note that the structure of the manifest is described in detail [here](https://developers.google.com/earth-engine/image_manifest#manifest-structure-reference). Because the data are categorical, a `MODE` pyramiding policy is specified. 
Learn more about how Earth Engine builds image pyramids [here](https://developers.google.com/earth-engine/scale)."]},{"cell_type":"code","metadata":{"id":"DPddpXYrJlap","colab_type":"code","colab":{}},"source":["# List the contents of the cloud folder.\n","cloud_files = !gsutil ls {bucket + '/*.tif'}\n","\n","# Get the list of source URIs from the gsutil output.\n","sources_uris = [{'uris': [f]} for f in cloud_files]\n","\n","asset_name = 'path/to/your/asset'\n","\n","# The enclosing object for the asset.\n","asset = {\n"," 'name': asset_name,\n"," 'tilesets': [\n"," {\n"," 'sources': sources_uris\n"," }\n"," ],\n"," 'bands': [\n"," {\n"," 'id': 'cover_code',\n"," 'pyramiding_policy': 'MODE',\n"," 'missing_data': {\n"," 'values': [0]\n"," }\n"," }\n"," ]\n","}\n","\n","import json\n","print(json.dumps(asset, indent=2))"],"execution_count":0,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"D2j6_TbCUiwZ","colab_type":"text"},"source":["Inspect the printed JSON for errors. If the JSON is acceptable, write it to a file and ensure that the file matches the printed JSON."]},{"cell_type":"code","metadata":{"id":"frZyXUDnFHVv","colab_type":"code","colab":{}},"source":["file_name = 'gaia_manifest.json'\n","\n","with open(file_name, 'w') as f:\n"," json.dump(asset, f, indent=2)"],"execution_count":0,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"k9WBqTW6XAwn","colab_type":"text"},"source":["Inspect the written file for errors."]},{"cell_type":"code","metadata":{"id":"wjunR9SLWn2A","colab_type":"code","colab":{}},"source":["!cat {file_name}"],"execution_count":0,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"4MWm6WWbXG9G","colab_type":"text"},"source":["## Upload to Earth Engine\n","\n","If you are able to `cat` the written file, run the upload to Earth Engine. 
First, import the Earth Engine library, authenticate and initialize."]},{"cell_type":"code","metadata":{"id":"hLFVQeDPXPE0","colab_type":"code","colab":{}},"source":["import ee\n","ee.Authenticate()\n","ee.Initialize()"],"execution_count":0,"outputs":[]},{"cell_type":"code","metadata":{"id":"A3ztutjFYqmt","colab_type":"code","colab":{}},"source":["# Do the upload.\n","!earthengine upload image --manifest {file_name}"],"execution_count":0,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"vELn42MrZxwY","colab_type":"text"},"source":["## Visualize the uploaded image with folium\n","\n","This is what [FROM-GLC](http://data.ess.tsinghua.edu.cn/) says about the classification system:\n","\n","| Class | Code |\n","| ------------- | ------------- |\n","| Cropland | 10 |\n","| Forest | 20 |\n","| Grassland | 30 |\n","| Shrubland | 40 |\n","| Wetland | 50 |\n","| Water | 60 |\n","| Tundra | 70 |\n","| Impervious | 80 |\n","| Bareland | 90 |\n","| Snow/Ice | 100 |\n","\n","Use a modified FROM-GLC palette to visualize the results."]},{"cell_type":"code","metadata":{"id":"mKQOEbkvPAS0","colab_type":"code","colab":{}},"source":["palette = [\n"," 'a3ff73', # farmland\n"," '267300', # forest\n"," 'ffff00', # grassland\n"," '70a800', # shrub\n"," '00ffff', # wetland\n"," '005cff', # water\n"," '004600', # tundra\n"," 'c500ff', # impervious\n"," 'ffaa00', # bare\n"," 'd1d1d1', # snow, ice\n","]\n","vis = {'min': 10, 'max': 100, 'palette': palette}\n","\n","ingested_image = ee.Image('projects/ee-nclinton/assets/fromglc10_demo')\n","map_id = ingested_image.getMapId(vis)\n","\n","import folium\n","\n","map = folium.Map(location=[37.6413, -122.2582])\n","folium.TileLayer(\n"," tiles=map_id['tile_fetcher'].url_format,\n"," attr='Map Data © Google Earth Engine',\n"," overlay=True,\n"," name='fromglc10_demo',\n",").add_to(map)\n","map.add_child(folium.LayerControl())\n","map"],"execution_count":0,"outputs":[]}]} \ No newline at end of file