Elasticsearch Study

Preparing the study environment

This study covers Elasticsearch and Logstash, so both tools need to be installed first.

Installing Elasticsearch on Windows, using the zip package

  • Download the Elasticsearch zip package for Windows.
  • Extract the contents of the zip file to a directory (for example: C:\elasticsearch).
  • Open CMD or PowerShell, run the following command and wait:
C:\elasticsearch\bin\elasticsearch.bat

See also: Checking Elasticsearch status

Running Elasticsearch with Docker

Before proceeding, make sure you have Docker installed and running on your system. More information at: https://www.docker.com/get-started/

Basic configuration

docker run -d -p 9200:9200 --name my_container elasticsearch:8.13.4

Where:

  • 9200: Elasticsearch TCP/IP port.
  • my_container: Arbitrary name for the container.
  • elasticsearch:8.13.4: Docker image name and version for Elasticsearch.

With custom network

docker network create my_network
docker run -d -p 9200:9200 --name my_container --net my_network elasticsearch:8.13.4

Where my_network is an arbitrary name for the Docker network.

With basic authentication enabled

docker run -d \
  -e discovery.type=single-node \
  -e xpack.security.enabled=true \
  -e ELASTIC_PASSWORD=my_password \
  -p 9200:9200 \
  --name my_container \
  elasticsearch:8.13.4

Where my_password is the password of the built-in elastic user. It must be supplied with every request to Elasticsearch, as shown in the following example:

curl -u elastic:my_password "http://localhost:9200?pretty"

With custom network and basic authentication enabled

docker network create my_network

docker run -d \
  -e discovery.type=single-node \
  -e xpack.security.enabled=true \
  -e ELASTIC_PASSWORD=my_password \
  -p 9200:9200 \
  --name my_container \
  --net my_network \
  elasticsearch:8.13.4

Checking Elasticsearch status

In the terminal (CMD or PowerShell on Windows), execute this command:

curl "http://localhost:9200?pretty"

If basic authentication is enabled in Elasticsearch, include the credentials:

curl -u elastic:my_password "http://localhost:9200?pretty"

Example of the expected output:

{
  "name" : "115dcea1258b",
  "cluster_name" : "docker-cluster",
  "cluster_uuid" : "PJB8bJTcQouZybMFHH2-xg",
  "version" : {
    "number" : "8.13.0",
    "build_flavor" : "default",
    "build_type" : "docker",
    "build_hash" : "09df99393193b2c53d92899662a8b8b3c55b45cd",
    "build_date" : "2024-03-22T03:35:46.757803203Z",
    "build_snapshot" : false,
    "lucene_version" : "9.10.0",
    "minimum_wire_compatibility_version" : "7.17.0",
    "minimum_index_compatibility_version" : "7.0.0"
  },
  "tagline" : "You Know, for Search"
}
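
Elasticsearch can take a while to start, so the status check above may fail on the first try. The following is a minimal sketch of a retry loop around the same curl call. The function name wait_for_es, its argument order, and the default values are assumptions made for illustration.

```shell
# Poll Elasticsearch until it answers or the attempts run out.
# wait_for_es and its arguments are illustrative, not part of any tool.
# Usage: wait_for_es [url] [attempts] [delay_seconds] [user:password]
wait_for_es() {
  es_url="${1:-http://localhost:9200}"
  es_attempts="${2:-30}"
  es_delay="${3:-2}"
  es_auth="${4:-}"
  es_try=1
  while [ "$es_try" -le "$es_attempts" ]; do
    # -s hides progress output; -f makes curl fail on HTTP errors such as 401.
    if [ -n "$es_auth" ]; then
      curl -sf -o /dev/null -u "$es_auth" "$es_url" && break
    else
      curl -sf -o /dev/null "$es_url" && break
    fi
    es_try=$((es_try + 1))
    sleep "$es_delay"
  done
  if [ "$es_try" -le "$es_attempts" ]; then
    echo "Elasticsearch is up after $es_try attempt(s)"
  else
    echo "Elasticsearch did not respond after $es_attempts attempt(s)" >&2
    return 1
  fi
}
```

For example, wait_for_es http://localhost:9200 30 2 elastic:my_password waits up to about a minute for a container started with basic authentication.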

Installing Logstash on Windows, using zip package

  • Download the Logstash zip package for Windows.
  • Extract the contents of the zip file to a directory (for example: C:\logstash).

See also: Import countries data from CSV file, using Logstash

Importing data

Import movies data from JSON file

Resource:

Command line:

curl -s -H "Content-Type: application/json" -XPUT localhost:9200/_bulk?pretty --data-binary @movies.json
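
The _bulk endpoint expects newline-delimited JSON: each document takes one action line followed by one source line, and the file must end with a newline. The sketch below writes a tiny file in that shape and checks the line count. The file name, IDs, and documents are made-up placeholders, not the contents of the real movies.json.

```shell
# Write a two-document bulk file. The file name, IDs and documents are
# illustrative placeholders, not the contents of the real movies.json.
bulk_file="${TMPDIR:-/tmp}/sample-bulk.json"
cat > "$bulk_file" <<'EOF'
{ "create": { "_index": "movies", "_id": "1" } }
{ "title": "Sample Movie One", "year": 2001, "genre": ["Drama"] }
{ "create": { "_index": "movies", "_id": "2" } }
{ "title": "Sample Movie Two", "year": 2002, "genre": ["Comedy"] }
EOF

# Each document takes two lines, so the line count must be even.
line_count=$(wc -l < "$bulk_file")
if [ $((line_count % 2)) -ne 0 ]; then
  echo "bulk file looks malformed: $line_count lines" >&2
fi
echo "documents in file: $((line_count / 2))"
```

A file like this can be sent with the same curl command as above by pointing --data-binary at it.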

Create series index and import data

Create series index (master/detail)

curl -s -H "Content-Type: application/json" -XPUT localhost:9200/series -d '
{
  "mappings": {
    "properties": {
      "film_to_franchise": {
        "type":"join",
        "relations":{
          "franchise":"film"
        }
      }
    }
  }
}'

Import series from JSON file

Resource:

Command line:

curl -s -H "Content-Type: application/json" -XPUT localhost:9200/_bulk?pretty --data-binary @series.json
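
With the join mapping above, every document states its side of the relation in the film_to_franchise field, and a film must be routed to the same shard as its franchise. The following sketch prints bulk lines for one franchise and one film to show that shape. The IDs and the helper name film_bulk_lines are assumptions, and the real series.json may contain more fields.

```shell
# Print the two bulk lines (action + source) for one film that belongs to a
# franchise. The function name and argument order are illustrative, and the
# title is not escaped, so it must not contain double quotes.
film_bulk_lines() {
  film_id="$1"; parent_id="$2"; title="$3"
  # Children must be routed to the parent's shard, hence "routing".
  printf '{ "create": { "_index": "series", "_id": "%s", "routing": "%s" } }\n' "$film_id" "$parent_id"
  printf '{ "film_to_franchise": { "name": "film", "parent": "%s" }, "title": "%s" }\n' "$parent_id" "$title"
}

# Parent document: a franchise has no "parent" key, only its relation name.
printf '{ "create": { "_index": "series", "_id": "1", "routing": "1" } }\n'
printf '{ "film_to_franchise": { "name": "franchise" }, "title": "Star Wars" }\n'
# Child document pointing at franchise 1.
film_bulk_lines 2 1 "The Force Awakens"
```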

Create product index and import data

Create products index

curl -H "Content-Type: application/json" -XPUT "http://localhost:9200/products" -d '
{
  "mappings": {
    "properties": {
      "id": { "type": "integer" },
      "name": { 
        "type": "text",
        "fields": {
          "raw": {
            "type": "keyword"
          }
        }
      },
      "group": { "type": "keyword" },
      "stock": { "type": "integer" },
      "price": { "type": "float" }
    }
  }
}'

Import product data from JSON file

Resource:

Command line:

curl -H "Content-Type: application/json" -XPOST "http://localhost:9200/products/_bulk?pretty" --data-binary "@products.json"

Import countries data from CSV file, using Logstash

Resources:

Command line:

c:\logstash\bin\logstash -f C:\data\csv-countries.conf

See also: Installing Logstash on Windows, using zip package

Index operations

List all indices

curl -X GET http://localhost:9200/_cat/indices?v

Delete movies index

curl -s -XDELETE localhost:9200/movies

Create movies index

curl -s -H "Content-Type: application/json" -XPUT localhost:9200/movies -d '
{
  "mappings": {
    "properties": {
      "year": {
        "type": "date"
      }
    }
  }
}'

Get movies mapping

curl -s -XGET localhost:9200/movies/_mapping

Inserting, getting, updating and deleting documents

Insert a movie

curl -s -H "Content-Type: application/json" -XPOST localhost:9200/movies/_doc/109487 -d '
{
  "genre":["IMAX","Sci-Fi"],
  "title":"Interstellar",
  "year":2014
}'

Get all movies

curl -s -XGET localhost:9200/movies/_search?pretty

Get a movie

curl -s -XGET localhost:9200/movies/_doc/109487?pretty

Update a movie

curl -s -H "Content-Type: application/json" -XPOST localhost:9200/movies/_update/109487 -d '
{
  "doc": {
    "title":"Interstellar UPDATED"
  }
}'

Conditional update movie

curl -s -H "Content-Type: application/json" -XPUT "localhost:9200/movies/_doc/109487?if_seq_no=10&if_primary_term=1" -d '
{
  "genre":["IMAX","Sci-Fi"],
  "title":"Interstellar 2",
  "year":2014
}'
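
The if_seq_no and if_primary_term values in the command above must match the document's current _seq_no and _primary_term, which Elasticsearch returns when the document is read. The sketch below extracts both from a GET response and builds the conditional URL. The sample response and the helper name json_number are illustrative assumptions; a JSON-aware tool would be more robust than sed.

```shell
# Extract a numeric field such as _seq_no from a document GET response.
# json_number is a hypothetical helper based on a simple pattern match.
json_number() {
  sed -n 's/.*"'"$1"'" *: *\([0-9][0-9]*\).*/\1/p'
}

# Illustrative response; in practice use:
#   response=$(curl -s localhost:9200/movies/_doc/109487)
response='{"_index":"movies","_id":"109487","_version":3,"_seq_no":10,"_primary_term":1,"found":true,"_source":{"title":"Interstellar","year":2014}}'

seq_no=$(printf '%s\n' "$response" | json_number _seq_no)
primary_term=$(printf '%s\n' "$response" | json_number _primary_term)

# URL for the conditional write; the write is rejected with a version
# conflict if the document changed after the read.
update_url="localhost:9200/movies/_doc/109487?if_seq_no=${seq_no}&if_primary_term=${primary_term}"
echo "$update_url"
```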

Update movie, retry on conflict

curl -s -H "Content-Type: application/json" -XPOST "localhost:9200/movies/_update/109487?retry_on_conflict=5" -d '
{
  "doc":{
    "title":"Interstellar 3"
  }
}'

Delete a movie

curl -s -XDELETE localhost:9200/movies/_doc/109487?pretty

Master/details queries

Get all films (details) from a series (master)

curl -s -H "Content-Type: application/json" -XGET localhost:9200/series/_search?pretty -d '
{
  "query":{
    "has_parent":{
      "parent_type":"franchise",
      "query":{
        "match":{
          "title":"Star Wars"
        }
      }
    }
  }
}'

Get the series (master) from a film (detail)

curl -s -H "Content-Type: application/json" -XGET localhost:9200/series/_search?pretty -d '
{
  "query":{
    "has_child":{
      "type":"film",
      "query":{
        "match":{
          "title":"The Force Awakens"
        }
      }
    }
  }
}'

Finding data with simple query

Find movies by title

curl -s -XGET "http://127.0.0.1:9200/movies/_search?q=title:Wars&pretty"

Find movies where year > 2015

curl -s -XGET "http://127.0.0.1:9200/movies/_search?q=year:>2015&pretty"

Find movies where year > 2010 AND year < 2016

curl -s -XGET "http://127.0.0.1:9200/movies/_search?q=year:>2010+AND+year:<2016&pretty"

Find movies where year > 2010 AND title contains "force"

curl -s -XGET "http://127.0.0.1:9200/movies/_search?q=year:>2010+AND+title:force&pretty"

Finding data using JSON query

Find movies by title

curl -s -H "Content-Type:application/json" -XGET http://127.0.0.1:9200/movies/_search?pretty -d'
{
  "query":{
    "match":{
      "title":"force"
    }
  }
}'

Find movies where year > 2015

curl -s -H "Content-Type:application/json" -XGET http://127.0.0.1:9200/movies/_search?pretty -d'
{
  "query":{
    "range":{
      "year":{
        "gt":2015
      }
    }
  }
}'

Find movies where year > 2010 AND year < 2016

curl -s -H "Content-Type:application/json" -XGET http://127.0.0.1:9200/movies/_search?pretty -d'
{
  "query":{
    "range":{
      "year":{
        "gt":2010,
        "lt":2016
      }
    }
  }
}'

Find movies where year > 2010 AND title contains "force"

In a bool query, the must clause works like the AND operator: a document is returned only if it matches every condition in the list.

curl -s -H "Content-Type: application/json" -XGET "http://127.0.0.1:9200/movies/_search?pretty" -d '{
  "query": {
    "bool": {
      "must": [
        { "range": { "year": { "gt": 2010 } } },
        { "match": { "title": "force" } }
      ]
    }
  }
}'
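
To reuse this AND-style query with other values, the request body can be generated from parameters. This is only a sketch: bool_and_query is a hypothetical helper, and it does not escape special characters in the title.

```shell
# Build a bool query whose "must" list ANDs a year lower bound with a title match.
# bool_and_query is a hypothetical helper; it does not escape quotes in the title.
bool_and_query() {
  min_year="$1"; title="$2"
  printf '{"query":{"bool":{"must":[{"range":{"year":{"gt":%s}}},{"match":{"title":"%s"}}]}}}' "$min_year" "$title"
}

body=$(bool_and_query 2010 force)
echo "$body"
# Send it with, for example:
#   curl -s -H "Content-Type: application/json" -XGET "http://127.0.0.1:9200/movies/_search?pretty" -d "$body"
```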

Find movies where year < 2010 OR year > 2015

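In a bool query, the should clause works like the OR operator: when it is the only clause, a document is returned if it matches at least one condition in the list.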

curl -s -H "Content-Type: application/json" -XGET "http://127.0.0.1:9200/movies/_search?pretty" -d '{
  "query": {
    "bool": {
      "should": [
        { "range": { "year": { "lt": 2010 } } },
        { "range": { "year": { "gt": 2015 } } }
      ]
    }
  }
}'

Find movies where title contains a phrase

curl -s -H "Content-Type:application/json" -XGET http://127.0.0.1:9200/movies/_search?pretty -d'
{
  "query":{
    "match_phrase":{
      "title":"star wars"
    }
  }
}'

Find movies where title contains a few words

curl -s -H "Content-Type:application/json" -XGET http://127.0.0.1:9200/movies/_search?pretty -d'
{
  "query":{
    "match_phrase":{
      "title":{
        "query":"episode star",
        "slop":100
      }
    }
  }
}'

Pagination

To use query result pagination, set the from and size parameters.

Pagination using simple query

curl -s -XGET "localhost:9200/movies/_search?pretty&from=0&size=2"

Pagination using JSON query

curl -s -H "Content-Type: application/json" -XGET "localhost:9200/movies/_search?pretty" -d '{
  "from": 0,
  "size": 2,
  "query": {
    "match_all": {}
  }
}'
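
The from parameter is a zero-based offset (the number of hits to skip), not a page number. A small sketch of the conversion from a 1-based page number follows; page_query is a hypothetical helper name. Note that, by default, Elasticsearch rejects requests where from plus size exceeds 10,000.

```shell
# Convert a 1-based page number into the from/size pair used above.
# page_query is a hypothetical helper name.
page_query() {
  page="$1"; size="$2"
  from=$(( (page - 1) * size ))   # number of hits to skip
  printf 'from=%s&size=%s' "$from" "$size"
}

echo "localhost:9200/movies/_search?pretty&$(page_query 1 2)"   # first page
echo "localhost:9200/movies/_search?pretty&$(page_query 3 2)"   # third page
```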

Ordering the results

To order query results, set the sort parameter.

Order by year, ascending

Simple query:

curl -s -H "Content-Type: application/json" -XGET "localhost:9200/movies/_search?sort=year&pretty"

JSON query:

curl -s -H "Content-Type: application/json" -XGET "localhost:9200/movies/_search?pretty" -d '{
  "sort": [
    { "year": "asc" }
  ]
}'

Order by year, descending

curl -s -H "Content-Type: application/json" -XGET "localhost:9200/movies/_search?pretty" -d '{
  "sort": [
    { "year": "desc" }
  ]
}'

Query three products with the highest prices

curl -H 'Content-Type: application/json' -XGET "localhost:9200/products/_search?pretty" -d '
{
  "query": {
    "match_all": {}
  },
  "sort": [
    {
      "price": {
        "order": "desc"
      }
    }
  ],
  "size": 3
}'

Query products by name and order by name.raw

curl -H 'Content-Type: application/json' -XGET "localhost:9200/products/_search?pretty" -d '
{
  "query": {
    "match": {
      "name" : "PVC"
    }
  },
  "sort": [
    { "name.raw": "asc" }
  ]
}'

Returning only some attributes

curl -s -H 'Content-Type: application/json' -XGET "localhost:9200/products/_search?pretty&filter_path=hits.hits._source" -d '
{
  "query": {
    "match": {
      "name" : "PVC"
    }
  },
  "sort": [
    { "name.raw": "asc" }
  ],
  "_source": ["id", "name", "price"]
}'

Changing the field type

Create a temporary index

curl -H 'Content-Type: application/json' -XPUT "http://localhost:9200/temp-index?pretty"  -d '
{
  "mappings": {
    "properties": {
      "id": {"type": "integer"},
      "code": {"type": "keyword"},
      "name": {"type": "text"},
      "currency": {"type": "keyword"},
      "latitude": {"type": "float"},
      "longitude": {"type": "float"}
    }
  }
}'

Copy data to temporary index

curl -H 'Content-Type: application/json' -XPOST "http://localhost:9200/_reindex?pretty" -d '
{
  "source": {
    "index": "countries"
  },
  "dest": {
    "index": "temp-index"
  }
}'

Delete old index

curl -XDELETE http://localhost:9200/countries

Recreate the old index

curl -H 'Content-Type: application/json' -XPUT "http://localhost:9200/countries?pretty"  -d '
{
  "mappings": {
    "properties": {
      "id": {"type": "integer"},
      "code": {"type": "keyword"},
      "name": {"type": "text"},
      "currency": {"type": "keyword"},
      "latitude": {"type": "float"},
      "longitude": {"type": "float"}
    }
  }
}'

Copy data to old recreated index

curl -H 'Content-Type: application/json' -XPOST "http://localhost:9200/_reindex?pretty" -d '
{
  "source": {
    "index": "temp-index"
  },
  "dest": {
    "index": "countries"
  }
}'

Delete temporary index

curl -XDELETE http://localhost:9200/temp-index
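
Because this procedure deletes indices, it can help to confirm that a copy is complete before each delete. The sketch below compares document counts using the _count endpoint. The helper names are assumptions, and counts may lag briefly until a freshly written index has been refreshed.

```shell
# Pull the number out of a _count response such as {"count":250,...}.
extract_count() {
  sed -n 's/.*"count" *: *\([0-9][0-9]*\).*/\1/p'
}

# Document count of one index. doc_count is a hypothetical helper; add
# -u user:password to the curl call if authentication is enabled.
doc_count() {
  curl -s "http://localhost:9200/$1/_count" | extract_count
}

# Succeeds only when both indices hold the same number of documents.
same_doc_count() {
  a=$(doc_count "$1"); b=$(doc_count "$2")
  [ -n "$a" ] && [ "$a" = "$b" ]
}

# Usage, guarding the first delete of the procedure:
#   same_doc_count countries temp-index && curl -XDELETE http://localhost:9200/countries
```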

Aggregating data

Count of products by group

curl -H 'Content-Type: application/json' -XPOST 'http://localhost:9200/products/_search?pretty'  -d '
{
  "size": 0,
  "aggs": {
    "group_aggs": {
      "terms": {
        "field": "group"
      }
    }
  }
}'

Quantity in stock per group

curl -H 'Content-Type: application/json' -XPOST 'http://localhost:9200/products/_search?pretty' -d '{
  "size": 0,
  "aggs": {
    "group_aggs": {
      "terms": {
        "field": "group"
      },
      "aggs": {
        "total_stock": {
          "sum": {
            "field": "stock"
          }
        }
      }
    }
  }
}'

Financial value of stock by group

curl -H 'Content-Type: application/json' -XGET "localhost:9200/products/_search?pretty" -d '
{
  "size": 0,
  "aggs": {
    "by_group": {
      "terms": {
        "field": "group"
      },
      "aggs": {
        "total": {
          "sum": {
            "script": {
              "source": "doc[\"stock\"].value * doc[\"price\"].value"
            }
          }
        }
      }
    }
  }
}'

Min, max and avg price by group

curl -H 'Content-Type: application/json' -XPOST 'http://localhost:9200/products/_search?pretty' -d '{
  "size": 0,
  "aggs": {
    "group_aggs": {
      "terms": {
        "field": "group"
      },
      "aggs": {
        "min_price": {
          "min": {
            "field": "price"
          }
        },
        "max_price": {
          "max": {
            "field": "price"
          }
        },
        "avg_price": {
          "avg": {
            "field": "price"
          }
        }
      }
    }
  }
}'

Calculated field

Query products with these fields:

  • id
  • name
  • stock
  • price
  • total (stock * price)

curl -H 'Content-Type: application/json' -XGET "localhost:9200/products/_search?pretty" -d '
{
  "query": {
    "match_all": {}
  },
  "script_fields": {
    "total": {
      "script": {
        "source": "doc[\"stock\"].value * doc[\"price\"].value"
      }
    }
  },
  "_source": ["id", "name", "stock", "price"],
  "size": 10
}'

Users and permissions

The next topics demonstrate how to create, modify, and delete a user. A user will be created who can only read data from a specific index.

Creating a new index for access testing

curl -s -u elastic:password \
     -H "Content-Type: application/json" \
     -X PUT "localhost:9200/countries" \
     -d '{
           "mappings": {
             "properties": {
               "iso_code": {
                 "type": "keyword"
               },
               "country_name": {
                 "type": "text"
               }
             }
           }
         }'

Expected output:

{"acknowledged":true,"shards_acknowledged":true,"index":"countries"}

Insert some countries for testing

curl -s -u elastic:password \
     -H "Content-Type: application/json" \
     -X POST "localhost:9200/countries/_doc?pretty" \
     -d '{
           "iso_code": "BR",
           "country_name": "Brazil"
         }'

curl -s -u elastic:password \
     -H "Content-Type: application/json" \
     -X POST "localhost:9200/countries/_doc?pretty" \
     -d '{
           "iso_code": "US",
           "country_name": "United States"
         }'

Create a new role for read-only access to the countries index

curl -s -u elastic:password \
     -H "Content-Type: application/json" \
     -X PUT "localhost:9200/_security/role/read_only_countries" \
     -d '{
           "indices": [
             {
               "names": ["countries"],
               "privileges": ["read"]
             }
           ]
         }'

Expected output:

{"role":{"created":true}}

Create a new user with read-only access to the countries index:

curl -s -u elastic:password \
  -H "Content-Type: application/json" \
  -X POST "localhost:9200/_security/user/daniel" \
  -d '
    {
      "password": "myPassword",
      "roles": ["read_only_countries"]
    }'

Expected output:

{"created" : true}

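Note: this endpoint accepts both POST and PUT. Either method creates the user if it does not exist and updates it otherwise. When updating, the password field may be omitted to keep the current password.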

Get user data

curl -s -u elastic:password -X GET "localhost:9200/_security/user/daniel?pretty"

Expected output:

{
  "daniel" : {
    "username" : "daniel",
    "roles" : [
      "read_only_countries"
    ],
    ...
  }
}

List all users

curl -s -u elastic:password -X GET "localhost:9200/_security/user?pretty"

Expected output:

JSON containing data of all users.

Listing countries with new user credentials

curl -s -u daniel:myPassword -X GET localhost:9200/countries/_search?pretty

Expected output:

JSON containing countries!

Trying to insert a new country with the new user's credentials:

curl -s -u daniel:myPassword \
     -H "Content-Type: application/json" \
     -X POST "localhost:9200/countries/_doc?pretty" \
     -d '{
           "iso_code": "PT",
           "country_name": "Portugal"
         }'

Expected output:

JSON with error information (security_exception)
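
The two permission checks above can also be verified by HTTP status code: an allowed request returns 200, and a request that the role does not permit returns 403. The following is a sketch under that assumption; http_status and expect_status are hypothetical helper names.

```shell
# Return only the HTTP status code of a request.
# http_status is a hypothetical helper; all arguments are passed through to curl.
http_status() {
  curl -s -o /dev/null -w '%{http_code}' "$@"
}

# Compare the observed status with the expected one and report the result.
expect_status() {
  expected="$1"; shift
  actual=$(http_status "$@")
  if [ "$actual" = "$expected" ]; then
    echo "ok: got $actual"
  else
    echo "unexpected: got $actual, wanted $expected" >&2
    return 1
  fi
}

# Usage against the user and role created above:
#   expect_status 200 -u daniel:myPassword "localhost:9200/countries/_search"
#   expect_status 403 -u daniel:myPassword -H "Content-Type: application/json" \
#     -X POST "localhost:9200/countries/_doc" -d '{"iso_code":"PT","country_name":"Portugal"}'
```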

Deleting a user

curl -s -u elastic:password -X DELETE "localhost:9200/_security/user/daniel"

Expected output:

{"found":true}

Deleting a role

curl -s -u elastic:password -X DELETE "localhost:9200/_security/role/read_only_countries"

Expected output:

{"found":true}

Resetting password via command line script

  • Access the bin subdirectory of your Elasticsearch installation.
  • Run elasticsearch-reset-password script:
    • elasticsearch-reset-password <options> --username <username>
  • Examples:
    • Reset the password for the elastic user to an automatically generated password:
      • elasticsearch-reset-password -a --username elastic
    • Reset the password for the elastic user to a password entered at the interactive prompt:
      • elasticsearch-reset-password -i --username elastic

More information about Elasticsearch security

For more information about Elasticsearch security, see the official Elasticsearch security documentation.

More queries

Filtering results with filter_path parameter and _source

Returns only _id and some _source fields

curl -X GET "localhost:9200/countries/_search?pretty&filter_path=hits.hits._id,hits.hits._source" \
     -H "Content-Type:application/json" \
     -d '{
        "_source": ["name", "currency"]
      }'

Searching with SQL in Elasticsearch

Single SQL query

curl -s \
     -X POST "localhost:9200/_sql?format=csv" \
     -H "Content-Type:application/json" \
     -d '{"query":"SELECT iso2, name, longitude FROM countries ORDER BY latitude LIMIT 5"}'

Expected output:

iso2,name,longitude
AQ,Antarctica,4.48
GS,South Georgia,-37.0
BV,Bouvet Island,3.4
HM,Heard Island and McDonald Islands,72.51666666
FK,Falkland Islands,-59.0

SQL query with parameters, JSON output format, no headers

curl -s -X POST "localhost:9200/_sql?format=json&filter_path=rows&pretty" \
     -H "Content-Type:application/json" \
     -d '{
        "query":"SELECT name FROM countries WHERE currency = ?",
        "params":["USD"]
      }'

Expected output:

{
  "rows" : [
    [
      "American Samoa"
    ],
    [
      "Bonaire, Sint Eustatius and Saba"
    ],
    ...
  ]
} 

Counting all countries

curl -s -X GET "localhost:9200/countries/_count"

Expected result:

{"count":250,"_shards":{"total":1,"successful":1,"skipped":0,"failed":0}}

Counting countries with USD currency

curl -s -X POST "localhost:9200/countries/_count" \
     -H "Content-Type: application/json" \
     -d '{
        "query": {
            "term": {
                "currency": "USD"
            }
        }
     }'

Expected result:

{"count":17,"_shards":{"total":1,"successful":1,"skipped":0,"failed":0}}

All statistics about a field

curl -s -X GET "localhost:9200/countries/_search?filter_path=aggregations.latitude_stats&pretty" \
     -H "Content-Type:application/json" \
     -d '{
        "aggs": {
          "latitude_stats": {
            "extended_stats": {
              "field": "latitude"
            }
          }
        },
        "size":0
      }'

Expected output:

{
  "aggregations" : {
    "latitude_stats" : {
      "count" : 250,
      "min" : -74.65,
      "max" : 78.0,
      "avg" : 16.40259736452,
      "sum" : 4100.64934113,
      "sum_of_squares" : 245532.3413868999,
      "variance" : 713.0841652450413,
      "variance_population" : 713.0841652450413,
      "variance_sampling" : 715.9479570733346,
      "std_deviation" : 26.70363580572955,
      "std_deviation_population" : 26.70363580572955,
      "std_deviation_sampling" : 26.75720383510457,
      "std_deviation_bounds" : {
        "upper" : 69.8098689759791,
        "lower" : -37.0046742469391,
        "upper_population" : 69.8098689759791,
        "lower_population" : -37.0046742469391,
        "upper_sampling" : 69.91700503472913,
        "lower_sampling" : -37.11181030568914
      }
    }
  }
}
