Elasticsearch Deep Dive
December 5th, 2018
BigData Training Architect II in Content
Follow right on the heels of the Elastic Stack Essentials course with the Elasticsearch Deep Dive. Get to understand and go hands-on with the core functionality of Elasticsearch (installing, indexing, querying). Next, learn how to configure it for production use with TLS encryption, user access control, monitoring, and alerting with X-Pack, plus automated management with Elasticsearch Curator. Get to understand best practices around heap and cluster sizing, hardware requirements, and performing live upgrades.
If you're wondering whether this course is for you and what you need to know before taking it, then look no further. This video describes the intended audience, the prerequisites, and a brief overview of the concepts covered throughout this course.
About the Author
Get to know a little bit about me, the author!
Nomenclature: ELK vs Elastic Stack
You may hear references to ELK and Elastic Stack and not know the difference. Well, let's clear that up with this short video describing the nomenclature of Elastic's product suite.
Elastic Stack Overview
Before getting deep into Elasticsearch, let's talk about how it fits into the overall Elastic Stack ecosystem, so we can see how important Elasticsearch is to the Elastic Stack and how it can be used on its own outside of the stack. This video provides a brief overview of the Elastic Stack components (Beats, Logstash, Elasticsearch, Kibana, and X-Pack) and details which of the Elastic Stack services we will use in this course and how.
Elasticsearch Installation: Part 1
In part one of this topic, we demonstrate how to get up and running with Elasticsearch very quickly with an installation via archive. This is a generally cross-platform installation method with minor differences between operating systems.
Elasticsearch Installation: Part 2
In part two of this topic, we demonstrate how to set up the Elastic Stack YUM repository to install and manage Elasticsearch via the YUM package manager. This is a more production-friendly way to install Elasticsearch and manage its version.
Using Kibana to Interface with Elasticsearch
This video demonstrates how to install Kibana in order to use it as an interface for Elasticsearch. Interacting with Elasticsearch purely through the command line can be tedious, especially when crafting long REST calls with JSON. Kibana makes this much easier with its console tool, which provides an easy-to-use REST console with auto-completion, syntax highlighting, JSON pretty-printing, and more.
Let's go over some basic terminology and concepts around how Elasticsearch is structured, stores data, and distributes that data in a fault-tolerant way. In this video, we will talk about some common use cases, basic terminology, cluster states, and node types.
Creating a Cluster
Whether you need a 3-node or 1000-node Elasticsearch cluster, the setup procedure is the same. Here we demonstrate how to set up a multi-node Elasticsearch cluster with dedicated master and data nodes. Feel free to follow along, as we will be using the Linux Academy cloud servers, which are available to all Linux Academy students.
Exploring Your Cluster
Working with Indexes
Before we start indexing and searching data with Elasticsearch, we need to know how to create and configure indexes. In this video, we cover how to create an index with aliases, explicit mappings, and specific settings like how many shards and replicas to allocate.
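As a rough illustration of the three pieces mentioned above (aliases, mappings, and shard/replica settings), here is a minimal Python sketch that only assembles the JSON body you would send with `PUT /<index>`. The function name, the index fields, and the alias name are hypothetical and not taken from the course. Elasticsearch 6.x nests mappings under a type name while 7.x and later do not, so the optional `doc_type` argument is an assumption you should adjust to your version.

```python
import json


def build_index_body(shards, replicas, alias, properties, doc_type=None):
    """Assemble a create-index request body (hypothetical helper).

    doc_type: pass a type name such as "_doc" for Elasticsearch 6.x,
    or leave it as None for typeless mappings (7.x and later).
    """
    mappings = {"properties": properties}
    if doc_type is not None:
        # 6.x style: mappings are nested under the type name.
        mappings = {doc_type: mappings}
    return {
        "aliases": {alias: {}},
        "settings": {
            "number_of_shards": shards,
            "number_of_replicas": replicas,
        },
        "mappings": mappings,
    }


# Example values are made up for illustration only.
body = build_index_body(
    shards=1,
    replicas=1,
    alias="accounts",
    properties={
        "firstname": {"type": "text"},
        "balance": {"type": "integer"},
    },
)
print(json.dumps(body, indent=2))
```

The printed JSON can be pasted after a `PUT` line in Kibana's console tool or sent with any HTTP client.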
Working with Documents
Now that we know how to create and configure indexes, let's demonstrate how to use them with documents. In this video, we will show how to index new documents with the index and bulk APIs, how to view those documents with the get API, how to update documents with the update API, and lastly how to use the delete API to delete documents from our indexes. Each of these CRUD operations is essential to administering any Elasticsearch cluster.
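The bulk API expects newline-delimited JSON: one action line followed by one source line per document, with a trailing newline at the end. The sketch below builds such a payload in Python without contacting a cluster. The helper name and the sample documents are hypothetical. It assumes the index name is supplied in the request path (for example `POST /<index>/_bulk`, or `POST /<index>/_doc/_bulk` on 6.x), so each action line only carries the document ID.

```python
import json


def build_bulk_payload(docs):
    """Build an NDJSON bulk body from (doc_id, source) pairs.

    Hypothetical helper: the index is assumed to be given in the URL path,
    so each action line contains only the document ID.
    """
    lines = []
    for doc_id, source in docs:
        lines.append(json.dumps({"index": {"_id": doc_id}}))  # action line
        lines.append(json.dumps(source))                      # source line
    # The bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"


# Sample documents are made up for illustration only.
payload = build_bulk_payload([
    ("1", {"firstname": "Amber", "balance": 39225}),
    ("2", {"firstname": "Hattie", "balance": 5686}),
])
print(payload, end="")
```

When sending this body, the request should use the `Content-Type: application/x-ndjson` header.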
Elasticsearch CAT APIs
It is important to know how to get basic information about your cluster. Which node is the elected master? How many shards do I have on each node? What state is my cluster in? How many documents do I have in each index? All of these questions and a whole lot more are answerable by the Compact and Aligned Text (CAT) APIs. In this video, we cover the common parameters for the CAT APIs, how to use them on the command line and in Kibana's console tool, and how to use the various parameters to find and format the exact data we want to see.
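To make the common CAT parameters concrete, here is a small Python sketch that builds a `_cat` URL using `v` (show column headers), `h` (pick columns), `s` (sort), and `format`. The helper name and the `http://localhost:9200` address are assumptions for illustration, and nothing is sent over the network.

```python
from urllib.parse import urlencode


def cat_url(base, endpoint, columns=None, sort=None, verbose=True, fmt=None):
    """Build a CAT API URL (hypothetical helper, no request is made)."""
    params = {}
    if verbose:
        params["v"] = "true"            # include column headers
    if columns:
        params["h"] = ",".join(columns)  # choose which columns to show
    if sort:
        params["s"] = sort               # e.g. "docs.count:desc"
    if fmt:
        params["format"] = fmt           # e.g. "json" or "yaml"
    # Keep commas and colons readable instead of percent-encoding them.
    query = urlencode(params, safe=",:")
    url = f"{base}/_cat/{endpoint}"
    return f"{url}?{query}" if query else url


# Assumed local address; replace with one of your own nodes.
url = cat_url(
    "http://localhost:9200",
    "indices",
    columns=["index", "docs.count"],
    sort="docs.count:desc",
)
print(url)
```

The resulting URL can be passed to `curl`, or the part starting at `_cat/` can be used after `GET` in Kibana's console tool.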
Exploring Your Data
Searching and Filtering
Now that we can create and configure indexes and perform CRUD operations on our documents, we can start to explore the primary use case for Elasticsearch: search! This video will explain how the search API in Elasticsearch is structured, how relevancy scoring is calculated, and how the various types of search in Elasticsearch differ from each other.
Searching documents and calculating relevancy scores is an important part of the Elasticsearch search API. Fortunately, it doesn't stop there. By utilizing aggregations in the search API, we can ask our data complex questions to group our data and extract statistics. In this lesson, we will dive into the world of aggregations in Elasticsearch. We will cover the 3 main types of aggregations and demonstrate how to use them by performing several aggregations on the bank account data we bulk ingested earlier in the course.
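One way to see the difference between scored and unscored search is a `bool` query: clauses under `must` contribute to the relevancy score, while clauses under `filter` only include or exclude documents. The Python sketch below builds such a request body for `GET /<index>/_search`. The helper name and the field names (`address`, `balance`) are assumptions for illustration and are not confirmed by the course text.

```python
import json


def build_search_body(text_field, text, range_field, gte, lte, size=10):
    """Build a bool query with one scored and one unscored clause.

    Hypothetical helper: "must" clauses affect the relevancy score,
    "filter" clauses only narrow the result set.
    """
    return {
        "size": size,
        "query": {
            "bool": {
                "must": [{"match": {text_field: text}}],
                "filter": [{"range": {range_field: {"gte": gte, "lte": lte}}}],
            }
        },
    }


# Field names and values are made up for illustration only.
search_body = build_search_body("address", "mill lane", "balance", 20000, 30000)
print(json.dumps(search_body, indent=2))
```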
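As a hedged example of grouping data and extracting statistics, the sketch below builds a search body that buckets documents with a `terms` aggregation and computes an `avg` metric inside each bucket. The helper name, the aggregation names, and the field names (`state.keyword`, `balance`) are assumptions about what bank account data might contain. Setting `size` to 0 skips the search hits so only the aggregation results come back.

```python
import json


def build_avg_by_bucket(bucket_field, metric_field, buckets=5):
    """Build a bucket aggregation with a nested metric aggregation.

    Hypothetical helper: "by_bucket" and "avg_value" are arbitrary names
    that will label the results in the response.
    """
    return {
        "size": 0,  # return aggregation results only, no search hits
        "aggs": {
            "by_bucket": {
                "terms": {"field": bucket_field, "size": buckets},
                "aggs": {
                    "avg_value": {"avg": {"field": metric_field}},
                },
            }
        },
    }


# Field names are assumptions for illustration only.
agg_body = build_avg_by_bucket("state.keyword", "balance")
print(json.dumps(agg_body, indent=2))
```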
Elasticsearch in Production
Efficiently restarting an Elasticsearch cluster is not quite as simple as restarting the service on each node. In order to avoid unnecessary recovery and re-balance operations, we need to follow a certain procedure to restart a cluster while also maintaining cluster uptime. This video lesson will cover this procedure and demonstrate it on the cloud cluster we've created throughout this course.
Just like doing a rolling restart of an Elasticsearch cluster, performing a rolling upgrade also has a specific procedure in order to reduce the number of recovery operations and the time it takes to upgrade the whole cluster. There are also situations in which a rolling cluster upgrade is not possible, in which case you would need to perform a full cluster restart upgrade with downtime. This video describes the procedure for both types of upgrades and demonstrates how to do a rolling upgrade with our cloud cluster.
The only free plugin in the X-Pack plugin pack is Monitoring. With the basic license included with every installation of Elasticsearch, you can enable the Monitoring plugin to gather useful performance, health, and usage information about your Elasticsearch cluster. This video will walk you through the various ways of enabling this feature and showcase the Monitoring interface in Kibana.
Once you're up and running with Elasticsearch, you quickly discover numerous maintenance tasks that have to be done regularly. Elasticsearch Curator is a maintenance automation tool designed around automating these periodic tasks for you. There are hundreds of ways you can use Curator with your Elasticsearch cluster, so this video focuses on how to get Curator installed and configured, and on how actions are created, so that you know how to automate your own unique maintenance tasks.
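The course text does not spell out the exact steps, so the following is only a sketch of the general pattern described in Elastic's documentation: limit shard allocation, restart one node, wait for it to rejoin, restore allocation, and wait for the cluster to recover before moving on. The helper name is hypothetical, and the `"primaries"` value is an assumption (`"none"` is another valid value of `cluster.routing.allocation.enable`). The function only lists the requests in order and does not contact a cluster.

```python
import json


def rolling_restart_steps(allocation_while_down="primaries"):
    """List the per-node steps of a rolling restart (hypothetical helper).

    Returns (method, target, body) tuples; "MANUAL" marks a step done
    on the node itself rather than through the REST API.
    """
    setting = "cluster.routing.allocation.enable"
    disable = {"persistent": {setting: allocation_while_down}}
    # A null value resets the setting to its default (all allocation allowed).
    enable = {"persistent": {setting: None}}
    return [
        ("PUT", "/_cluster/settings", disable),
        ("MANUAL", "restart the Elasticsearch service on one node", None),
        ("GET", "/_cat/nodes?v", None),       # confirm the node rejoined
        ("PUT", "/_cluster/settings", enable),
        ("GET", "/_cat/health?v", None),      # wait for the cluster to recover
    ]


steps = rolling_restart_steps()
for method, target, body in steps:
    print(method, target, json.dumps(body) if body is not None else "")
```

The same sequence is then repeated for each remaining node, one node at a time.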
By default, Elasticsearch is wide open. In order to configure user access control, cluster network encryption, and detailed audit logging, you must utilize X-Pack Security with a paid license. The basic license that Elasticsearch ships with will not grant you access to the X-Pack Security plugin. In this video lesson, we will use a 30-day trial license to show how to enable transport network encryption, user access control, and detailed audit logging on the cloud cluster we've been using throughout this course.
In a production environment, X-Pack Security is essential to locking down your Elasticsearch cluster and encrypting cluster communications. In this lesson, we show how to enable X-Pack Security with a 30-day trial license, generate node certificates for our cloud cluster, encrypt the transport network, and set the built-in user account passwords for Elasticsearch.
User Access Control
Once you enable X-Pack Security, you can no longer access Elasticsearch endpoints or Kibana without user authentication. In this lesson, we go over how to use the default mechanism in Elasticsearch for creating roles and users. We will explore how to create roles and users with both the Kibana UI and the Elasticsearch APIs. Furthermore, we will enable X-Pack Security's audit logging feature to demonstrate how every action can be logged to meet your data's regulatory requirements.
There are a few pitfalls to avoid and best practices to follow when configuring the JVM heap size for your Elasticsearch nodes. This lesson will explain each and demonstrate how to prepare your Elasticsearch nodes to avoid memory swapping on Linux systems.
What should my cluster heap-to-storage ratio be? How many shards should I have per node? How big should my shards be? How many data nodes do I need? All these questions and more are very common for anyone designing their first Elasticsearch cluster. Unfortunately, there are no clear-cut answers to any of these questions, but there are some general guidelines you can follow to make sure you're on the right track. In this lesson, we go over some of these best practices for designing and sizing an Elasticsearch cluster.
Whether you're deploying your Elasticsearch cluster to bare metal or in the cloud, there are a few general guidelines to follow when picking your hardware. In this lesson, we will discuss these guidelines so that you can provision the optimum hardware configuration for your use case.
Congratulations! If you've made it this far, then you have successfully completed the Elasticsearch Deep Dive course. In this video, I want to thank you for spending your precious time taking my course and tell you how you can connect with me with any questions you may have. I also want to recommend other courses here at Linux Academy that synergize with what you learned throughout this course.
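The lesson's specific recommendations are not listed here, so the sketch below only encodes two widely cited rules of thumb from Elastic's guidance: give the heap no more than about half of the machine's RAM, and keep it below the roughly 32 GB threshold at which the JVM stops using compressed object pointers. The function names and the conservative 31 GB ceiling are assumptions. The minimum and maximum heap are set to the same value so the heap is not resized at runtime.

```python
def recommended_heap_gb(ram_gb, ceiling_gb=31):
    """Suggest a heap size in whole GB (hypothetical rule-of-thumb helper).

    ceiling_gb is an assumed conservative cutoff below the ~32 GB
    compressed object pointer threshold.
    """
    if ram_gb < 2:
        raise ValueError("this sketch assumes at least 2 GB of RAM")
    # Leave the other half of RAM for the operating system's file cache.
    return min(ram_gb // 2, ceiling_gb)


def jvm_options(heap_gb):
    """Return matching -Xms/-Xmx lines for the jvm.options file."""
    return [f"-Xms{heap_gb}g", f"-Xmx{heap_gb}g"]


for ram in (16, 64, 128):
    heap = recommended_heap_gb(ram)
    print(ram, "GB RAM ->", " ".join(jvm_options(heap)))
```

For swapping, one commonly used option is `bootstrap.memory_lock: true` in `elasticsearch.yml`; whether the lesson uses that setting or disables swap at the operating system level is not stated in this description.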
Learn how to showcase your success in completing this course with this video.