How to Set Up the ELK Stack

The ELK Stack—comprising Elasticsearch, Logstash, and Kibana—is one of the most powerful and widely adopted open-source platforms for log management, real-time analytics, and observability. Originally developed by Elastic, the stack has become the de facto standard for organizations seeking to centralize, visualize, and analyze massive volumes of structured and unstructured data. Whether you're monitoring application logs, securing infrastructure, or optimizing user behavior, the ELK Stack provides the tools needed to transform raw data into actionable insights.

Setting up the ELK Stack correctly is critical to ensuring performance, scalability, and reliability. A poorly configured stack can lead to data loss, indexing bottlenecks, or degraded search performance. This guide walks you through every step required to deploy a production-ready ELK Stack, from initial installation to advanced configuration and optimization. By the end of this tutorial, you will have a fully functional, secure, and scalable ELK Stack environment ready to ingest, process, and visualize data from multiple sources.

Step-by-Step Guide

Prerequisites

Before beginning the setup process, ensure your environment meets the following requirements:

  • A Linux-based server (Ubuntu 22.04 LTS or CentOS 8/9 recommended)
  • At least 4 GB of RAM (8 GB or more recommended for production)
  • Minimum 2 CPU cores
  • At least 20 GB of free disk space (SSD strongly recommended)
  • Root or sudo access
  • Java 11 or Java 17 installed (Elasticsearch requires a JVM)
  • Network connectivity for package downloads and external data sources

For production deployments, consider deploying each component on separate servers to isolate workloads and improve fault tolerance. For learning or development purposes, a single-node setup is acceptable.

Step 1: Install Java

Elasticsearch is built on Java and requires a compatible Java Virtual Machine (JVM) to run. Recent Elasticsearch and Logstash packages bundle their own JDK, so a system-wide Java is strictly optional on 8.x, but installing one is still useful for surrounding tooling. Oracle JDK licensing is restrictive for production use, so we recommend OpenJDK.

On Ubuntu:

sudo apt update

sudo apt install openjdk-17-jdk -y

On CentOS/RHEL:

sudo dnf install java-17-openjdk-devel -y

Verify the installation:

java -version

You should see output indicating OpenJDK 17 is installed. If multiple Java versions exist, set the default using:

sudo update-alternatives --config java

Step 2: Install Elasticsearch

Elasticsearch is the distributed search and analytics engine at the core of the ELK Stack. It stores, indexes, and enables fast retrieval of data.

Add the Elastic GPG key and repository:

wget -qO - https://artifacts.elastic.co/GPG-KEY-elasticsearch | sudo gpg --dearmor -o /usr/share/keyrings/elastic-keyring.gpg

echo "deb [signed-by=/usr/share/keyrings/elastic-keyring.gpg] https://artifacts.elastic.co/packages/8.x/apt stable main" | sudo tee /etc/apt/sources.list.d/elastic-8.x.list

Update the package list and install Elasticsearch:

sudo apt update

sudo apt install elasticsearch -y

Configure Elasticsearch by editing its main configuration file:

sudo nano /etc/elasticsearch/elasticsearch.yml

Update the following key settings:

cluster.name: my-elk-cluster
node.name: node-1
network.host: 0.0.0.0
http.port: 9200
discovery.type: single-node

Note: In a multi-node cluster, remove discovery.type: single-node and instead set discovery.seed_hosts and cluster.initial_master_nodes with the addresses and names of all master-eligible nodes. Do not combine cluster.initial_master_nodes with single-node discovery; Elasticsearch will refuse to start if both are present.
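For reference, a minimal multi-node sketch might look like this (the node names and addresses are placeholders):

discovery.seed_hosts: ["10.0.0.1", "10.0.0.2", "10.0.0.3"]
cluster.initial_master_nodes: ["node-1", "node-2", "node-3"]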

Enable and start Elasticsearch:

sudo systemctl enable elasticsearch

sudo systemctl start elasticsearch

Verify Elasticsearch is running:

curl -X GET "localhost:9200"

You should receive a JSON response containing cluster details, including version and name. Note that Elasticsearch 8.x packages enable TLS and authentication automatically at install time; if the plain-HTTP request is rejected, either use curl -k -u elastic https://localhost:9200 with the password printed during installation, or set xpack.security.enabled: false for a throwaway test environment.
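The exact fields vary by version, but an abridged response looks something like this (the values here are illustrative):

{
  "name" : "node-1",
  "cluster_name" : "my-elk-cluster",
  "version" : {
    "number" : "8.12.0"
  },
  "tagline" : "You Know, for Search"
}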

Step 3: Install Kibana

Kibana is the visualization layer of the ELK Stack. It provides a web interface to explore data, build dashboards, and monitor system health.

Install Kibana using the same repository:

sudo apt install kibana -y

Edit the Kibana configuration file:

sudo nano /etc/kibana/kibana.yml

Set the following values:

server.port: 5601

server.host: "0.0.0.0"

elasticsearch.hosts: ["http://localhost:9200"]

i18n.locale: "en"

Enable and start Kibana:

sudo systemctl enable kibana

sudo systemctl start kibana

Verify Kibana is accessible by visiting http://your-server-ip:5601 in your browser. You should see the Kibana welcome screen.
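You can also verify from the command line via Kibana's status API:

curl http://localhost:5601/api/status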

Step 4: Install Logstash

Logstash is the data processing pipeline that ingests data from multiple sources, transforms it, and sends it to Elasticsearch. It supports a wide range of inputs, filters, and outputs.

Install Logstash:

sudo apt install logstash -y

Logstash configurations are stored in /etc/logstash/conf.d/. Create a basic configuration file:

sudo nano /etc/logstash/conf.d/01-input.conf

Add the following input configuration to accept data via Beats (Filebeat) or TCP:

input {
  beats {
    port => 5044
  }
}

Create a filter configuration to parse logs (optional):

sudo nano /etc/logstash/conf.d/02-filter.conf

Add a simple Grok filter for Apache logs:

filter {
  if [type] == "apache-access" {
    grok {
      match => { "message" => "%{COMBINEDAPACHELOG}" }
    }
  }
}

Create an output configuration to send data to Elasticsearch:

sudo nano /etc/logstash/conf.d/03-output.conf

output {
  elasticsearch {
    hosts => ["http://localhost:9200"]
    index => "%{[@metadata][beat]}-%{+YYYY.MM.dd}"
  }
}

(The document_type option was removed along with mapping types in Elasticsearch 7, so it should not be used with modern versions.)

Test your configuration for syntax errors:

sudo /usr/share/logstash/bin/logstash --path.settings /etc/logstash -t

If the test passes, start Logstash:

sudo systemctl enable logstash

sudo systemctl start logstash
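To confirm the Beats input is listening (assuming the port 5044 configuration above), check from the shell:

sudo ss -tlnp | grep 5044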

Step 5: Install Filebeat (Optional but Recommended)

While Logstash can ingest data directly, Filebeat is a lightweight, resource-efficient shipper designed specifically for forwarding log files to Logstash or Elasticsearch. It is ideal for server-side log collection.

Install Filebeat:

sudo apt install filebeat -y

Configure Filebeat to send logs to Logstash:

sudo nano /etc/filebeat/filebeat.yml

Update the following sections:

filebeat.inputs:
- type: log
  enabled: true
  paths:
    - /var/log/*.log
    - /var/log/apache2/*.log

output.logstash:
  hosts: ["localhost:5044"]

(Filebeat 8.x deprecates the log input type in favor of filestream; log still works, but plan to migrate.)

Enable the Apache module (if applicable; the module is named apache in Filebeat 7.x and later):

sudo filebeat modules enable apache

sudo filebeat setup

Note that filebeat setup loads index templates and dashboards into Elasticsearch and Kibana, so it needs connectivity to both; run it before pointing the output at Logstash, or pass temporary -E output settings.

Start Filebeat:

sudo systemctl enable filebeat

sudo systemctl start filebeat
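Once Filebeat is running, indices should start appearing in Elasticsearch within a minute or so. You can confirm from the command line (the filebeat-* pattern assumes the default index naming used above):

curl -X GET "localhost:9200/_cat/indices/filebeat-*?v"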

Step 6: Configure Kibana Index Patterns

Once data begins flowing into Elasticsearch, you need to define index patterns in Kibana to make the data searchable and visualizable.

Open Kibana in your browser at http://your-server-ip:5601.

Navigate to Stack Management → Index Patterns → Create index pattern. (Newer 8.x releases call these data views; the workflow is the same.)

Enter the index pattern name. For Filebeat, use filebeat-*. For Logstash, use logstash-*.

Select @timestamp as the time field and click Create index pattern.

Once created, go to Discover to explore raw log entries. You should now see data appearing in real time.
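For example, entering a KQL query such as the following in the Discover search bar narrows the view to server errors (the exact field name depends on your parsers and modules):

http.response.status_code >= 500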

Step 7: Create Your First Dashboard

With data indexed, create visualizations and dashboards to monitor system health.

Go to Dashboard → Create dashboard.

Click Add from library and select a pre-built template like “System” or “Apache” if you’re using Filebeat modules.

Alternatively, create custom visualizations:

  • Go to Visualize Library → Create visualization
  • Select a “Line” or “Bar” chart
  • Choose your index pattern
  • Set the X-axis to a “Date Histogram” on @timestamp
  • Set the Y-axis to “Count” or another metric, such as an average over a numeric field

Save each visualization and add it to your dashboard. Name your dashboard “Server Monitoring” or similar.

Best Practices

1. Use Separate Nodes for Production Deployments

In production environments, avoid running Elasticsearch, Logstash, and Kibana on the same server. Distribute them across dedicated nodes to prevent resource contention. Elasticsearch requires significant memory and CPU for indexing and search operations. Logstash can be memory-intensive during transformation pipelines. Kibana, while lighter, benefits from low-latency network access to Elasticsearch.

2. Secure Your Stack with TLS and Authentication

Older releases of the stack ran without authentication by default, and it is still common to disable security for quick tests (as in the single-node walkthrough above). In any environment exposed to external networks, make sure security features are enabled:

  • Enable Elasticsearch’s built-in security: Set xpack.security.enabled: true in elasticsearch.yml
  • Generate certificates using elasticsearch-certutil for encrypted communication
  • Configure Kibana to use HTTPS and authenticate against Elasticsearch
  • Use role-based access control (RBAC) to restrict user permissions

Run the following to generate a certificate authority and node certificates (the paths are illustrative):

cd /usr/share/elasticsearch

sudo mkdir -p /etc/elasticsearch/certs

sudo bin/elasticsearch-certutil ca --out /etc/elasticsearch/certs/elastic-stack-ca.p12

sudo bin/elasticsearch-certutil cert --ca /etc/elasticsearch/certs/elastic-stack-ca.p12 --out /etc/elasticsearch/certs/elastic-certificates.p12

Update elasticsearch.yml:

xpack.security.enabled: true
xpack.security.transport.ssl.enabled: true
xpack.security.transport.ssl.verification_mode: certificate
xpack.security.transport.ssl.keystore.path: certs/elastic-certificates.p12
xpack.security.transport.ssl.truststore.path: certs/elastic-certificates.p12

Update kibana.yml:

elasticsearch.hosts: ["https://localhost:9200"]
elasticsearch.ssl.certificateAuthorities: ["/etc/kibana/certs/ca.crt"]
elasticsearch.username: "kibana_system"
elasticsearch.password: "your-strong-password"
server.ssl.enabled: true
server.ssl.certificate: /etc/kibana/certs/kibana.crt
server.ssl.key: /etc/kibana/certs/kibana.key

Note: Kibana expects PEM-encoded certificates here, not the PKCS#12 keystore, so export the CA certificate to PEM first (for example with openssl pkcs12 -in elastic-certificates.p12 -cacerts -nokeys -out ca.crt).
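If you do not know the kibana_system password, Elasticsearch 8.x ships a reset tool (the path assumes the deb/rpm package layout):

sudo /usr/share/elasticsearch/bin/elasticsearch-reset-password -u kibana_system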

3. Optimize Elasticsearch Indexing and Sharding

Index design directly impacts performance. Follow these guidelines:

  • Use time-based indices (e.g., logs-2024.05.01) for log data to enable efficient retention policies
  • Keep shard counts in check (Elastic's guidance is fewer than 20 shards per GB of JVM heap on each node)
  • Set number_of_shards to match the number of data nodes (e.g., 3 shards for 3 nodes)
  • Set number_of_replicas to 1 in production for high availability
  • Use index lifecycle management (ILM) to automate rollover and deletion

Example ILM policy via Kibana Dev Tools:

PUT _ilm/policy/logs_policy
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": {
            "max_size": "50GB",
            "max_age": "30d"
          }
        }
      },
      "delete": {
        "min_age": "90d",
        "actions": {
          "delete": {}
        }
      }
    }
  }
}
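To have new indices pick up this policy together with your shard settings, attach it via an index template. A minimal sketch in Dev Tools (the template name, index pattern, and rollover alias are illustrative):

PUT _index_template/logs_template
{
  "index_patterns": ["logs-*"],
  "template": {
    "settings": {
      "number_of_shards": 3,
      "number_of_replicas": 1,
      "index.lifecycle.name": "logs_policy",
      "index.lifecycle.rollover_alias": "logs"
    }
  }
}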

4. Monitor Resource Usage and Set JVM Heap Limits

Elasticsearch is memory-sensitive. Never allocate more than 50% of your system RAM to the JVM heap, and keep the heap below roughly 30 GB so the JVM can still use compressed object pointers.

Edit /etc/elasticsearch/jvm.options (or, preferably, create an override file in /etc/elasticsearch/jvm.options.d/):

-Xms4g
-Xmx4g

Monitor heap usage using Kibana’s Monitoring tab or external tools like Prometheus and Grafana.
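A quick command-line spot check of heap pressure using the _cat API:

curl -X GET "localhost:9200/_cat/nodes?v&h=name,heap.percent,heap.max,ram.percent"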

5. Use Filebeat Modules for Standardized Parsing

Filebeat comes with pre-built modules for common services like Apache, Nginx, MySQL, and System logs. These modules include optimized parsers, dashboards, and index templates.

Enable a module:

sudo filebeat modules enable apache mysql system

Reload the configuration:

sudo filebeat setup

sudo systemctl restart filebeat

This reduces the need for custom Grok patterns and ensures consistency across environments.

6. Implement Log Retention and Cleanup

Logs can consume massive disk space. Automate cleanup using Elasticsearch’s Index Lifecycle Management (ILM) or Curator (deprecated in favor of ILM).

Use ILM policies to automatically delete indices older than 90 days, reducing storage costs and maintaining performance.

7. Back Up Critical Data Regularly

Use Elasticsearch snapshots to back up indices to shared storage (NFS, S3, HDFS):

PUT _snapshot/my_backup
{
  "type": "fs",
  "settings": {
    "location": "/mnt/backups/elasticsearch"
  }
}
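Note that an fs repository only works if its location is whitelisted in elasticsearch.yml on every node; add the following and restart Elasticsearch before registering the repository:

path.repo: ["/mnt/backups/elasticsearch"]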

Take a snapshot:

PUT _snapshot/my_backup/snapshot_1

Restore when needed:

POST _snapshot/my_backup/snapshot_1/_restore
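You can check snapshot progress or verify a completed snapshot at any time:

GET _snapshot/my_backup/snapshot_1/_status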

Tools and Resources

Official Documentation

Always refer to the official Elastic documentation for version-specific details:

  • Elastic Stack documentation – https://www.elastic.co/guide

Monitoring and Alerting Tools

Enhance your ELK Stack with external monitoring tools:

  • Prometheus + Grafana – Monitor system metrics (CPU, memory, disk I/O) and Elasticsearch cluster health
  • Alertmanager – Route, group, and deduplicate notifications for Prometheus-based alerts
  • Netdata – Real-time system monitoring with built-in Elasticsearch integration

Community and Support

Engage with the active ELK Stack community for troubleshooting and best practices:

  • Elastic discussion forums – https://discuss.elastic.co

Sample Data Generators

For testing and development, generate realistic log data:

  • GoAccess – Analyze Apache/Nginx logs in real time (handy for sanity-checking generated sample traffic)
  • loggen – syslog-ng's utility for simulating high-volume syslog streams
  • Mockaroo – Generate custom JSON/CSV datasets for testing

Containerized Deployments (Docker & Kubernetes)

For scalable, portable deployments, use Docker Compose:

version: '3.8'

services:
  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch:8.12.0
    environment:
      - discovery.type=single-node
      - xpack.security.enabled=false
    ports:
      - "9200:9200"
    volumes:
      - esdata:/usr/share/elasticsearch/data

  kibana:
    image: docker.elastic.co/kibana/kibana:8.12.0
    ports:
      - "5601:5601"
    depends_on:
      - elasticsearch

  logstash:
    image: docker.elastic.co/logstash/logstash:8.12.0
    ports:
      - "5044:5044"
    volumes:
      - ./logstash/pipeline:/usr/share/logstash/pipeline
    depends_on:
      - elasticsearch

volumes:
  esdata:

Run with:

docker-compose up -d

Real Examples

Example 1: Monitoring Web Server Logs

A mid-sized e-commerce company uses the ELK Stack to monitor Apache web server logs across 12 frontend servers. Each server runs Filebeat to ship access and error logs to a central Logstash instance.

Logstash applies filters to extract:

  • Client IP addresses
  • HTTP status codes
  • Request duration
  • User agent strings

These fields are indexed into Elasticsearch. Kibana dashboards display:

  • Real-time traffic spikes
  • Top 10 most visited pages
  • 4xx/5xx error trends
  • Geographic distribution of visitors

Alerts are configured to notify the DevOps team when error rates exceed 5% for 5 minutes. This proactive monitoring reduced incident response time by 70%.

Example 2: Security Incident Detection

A financial services firm uses the ELK Stack to detect anomalous SSH login attempts. Filebeat collects system logs from 50+ Linux servers. Logstash parses auth.log and tags failed login attempts.

A Kibana machine learning job analyzes login frequency by user and IP. It flags:

  • Multiple failed logins from the same IP within 60 seconds
  • Logins from unusual geographic locations
  • Attempts using known compromised usernames

When anomalies are detected, an alert triggers a Slack notification and automatically blocks the IP via firewall rules. This system has prevented 12 brute-force attacks in the last quarter.

Example 3: Application Performance Monitoring

A SaaS provider instruments its Node.js application to emit structured JSON logs to stdout. These logs are captured by Filebeat and sent to Logstash.

Logstash enriches logs with:

  • Environment (production/staging)
  • Service name
  • Request ID for distributed tracing

Kibana visualizations track:

  • Latency percentiles (p95, p99)
  • Throughput per endpoint
  • Database query durations

Engineers use these dashboards to identify slow API endpoints and optimize database queries, resulting in a 40% reduction in average response time.

FAQs

What is the difference between the ELK Stack and the EFK Stack?

The ELK Stack pairs Elasticsearch and Kibana with Logstash (often fed by Beats such as Filebeat) for log ingestion, while the EFK Stack (Elasticsearch, Fluentd, Kibana) replaces Logstash with Fluentd. Fluentd is often preferred in Kubernetes environments due to its native container support and lightweight architecture. However, Logstash offers richer filtering capabilities and a larger plugin ecosystem.

Can I use the ELK Stack without Kibana?

Yes. Elasticsearch can be queried directly via its REST API using tools like cURL, Postman, or Python scripts. However, Kibana provides a user-friendly interface for visualization, dashboards, and monitoring that is essential for most teams.
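For instance, a quick search over Filebeat indices straight from the shell (the index pattern and field name are illustrative):

curl -X GET "localhost:9200/filebeat-*/_search?q=http.response.status_code:500&size=5&pretty"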

How much disk space does the ELK Stack require?

Storage needs depend on data volume, retention period, and replication. As a rule of thumb, 10 GB per day of uncompressed logs is a reasonable estimate for medium traffic. Always provision additional space for replication, snapshots, and temporary indexing buffers.

Is the ELK Stack free to use?

Elasticsearch, Logstash, and Kibana are free to use but are distributed under the SSPL (Server Side Public License) and the Elastic License, which are source-available rather than OSI-approved open-source licenses. Core features, including basic security, are free. Advanced features such as machine learning and some alerting integrations require a paid Elastic subscription. For many use cases, the free tier is sufficient.

Why is my Kibana dashboard blank even though Elasticsearch has data?

Common causes include:

  • Incorrect index pattern (e.g., typing logstash instead of logstash-*)
  • Time filter set to a range with no data
  • Index not yet created (wait for data to be ingested)
  • Permissions issue preventing Kibana from reading indices

Check the Discover tab first to confirm data exists. Then verify your time filter and index pattern.

How do I upgrade the ELK Stack to a newer version?

Always follow Elastic’s upgrade guide. Never skip major versions. Steps include:

  1. Take a snapshot of all indices
  2. Stop all services (Kibana → Logstash → Elasticsearch)
  3. Upgrade Elasticsearch first
  4. Upgrade Logstash
  5. Upgrade Kibana
  6. Restart services in reverse order
  7. Verify data integrity and functionality

Can I run the ELK Stack on Windows?

Yes. Elastic provides Windows installers for Elasticsearch, Kibana, and Filebeat. However, Linux is strongly recommended for production due to better performance, stability, and community support.

What should I do if Elasticsearch fails to start?

Check the logs:

sudo journalctl -u elasticsearch -n 50 --no-pager

Common issues:

  • Insufficient memory (adjust JVM heap)
  • Port conflict (9200 or 9300 already in use)
  • File permissions on data directory
  • Invalid configuration syntax
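For example, to check whether another process already holds the HTTP (9200) or transport (9300) port:

sudo ss -tlnp | grep -E ':9200|:9300'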

Conclusion

Setting up the ELK Stack is a foundational skill for modern DevOps, SRE, and security teams. From centralized logging to real-time monitoring and anomaly detection, the stack empowers organizations to gain deep visibility into their systems and applications. This guide has walked you through the complete process—from installing Java and configuring Elasticsearch, Logstash, and Kibana, to implementing security, optimization, and real-world use cases.

Remember: a well-configured ELK Stack is not a one-time setup. It requires ongoing maintenance, monitoring, and refinement. Regularly review your index patterns, update your filters, and expand your dashboards as your data needs evolve. Use automation tools like ILM, Docker, and configuration management systems (Ansible, Terraform) to scale your deployment reliably.

As data volumes continue to grow and system complexity increases, the ELK Stack remains one of the most robust, flexible, and community-supported solutions available. Whether you're managing a single server or a global infrastructure, investing time in mastering the ELK Stack will pay dividends in operational efficiency, faster troubleshooting, and proactive system health management.