How to Forward Logs to Elasticsearch
Log data is the silent heartbeat of modern IT infrastructure. Every server request, application error, security event, and system metric generates a stream of logs that, when properly collected and analyzed, reveal critical insights into performance, reliability, and security. However, raw log files scattered across dozens of machines are nearly impossible to interpret at scale. This is where Elasticsearch comes in — a powerful, distributed search and analytics engine designed to ingest, store, and query massive volumes of structured and unstructured data in near real time.
Forwarding logs to Elasticsearch transforms chaotic log files into actionable intelligence. Whether you're managing microservices on Kubernetes, monitoring cloud-native applications, or securing enterprise networks, centralizing logs in Elasticsearch enables powerful visualizations, alerting, and root-cause analysis. This tutorial provides a comprehensive, step-by-step guide to forwarding logs to Elasticsearch, covering tools, configurations, best practices, real-world examples, and common pitfalls to avoid.
By the end of this guide, you’ll understand how to securely and efficiently transport logs from diverse sources — including Linux systems, Docker containers, cloud platforms, and custom applications — into Elasticsearch for centralized monitoring and analysis.
Step-by-Step Guide
1. Understand Your Log Sources and Requirements
Before configuring log forwarding, identify where your logs originate and what format they use. Common sources include:
- System logs on Linux/Unix servers (e.g., /var/log/syslog, /var/log/auth.log)
- Application logs (e.g., Node.js, Python, Java applications writing to files)
- Docker and containerized environments (stdout/stderr streams)
- Cloud services (AWS CloudWatch, Azure Monitor, Google Cloud Logging)
- Network devices and firewalls (via syslog or API)
Determine the volume of logs per second, retention policies, and required fields (e.g., timestamp, hostname, level, message, service name). This informs your choice of forwarding agent and Elasticsearch mapping strategy.
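As a concrete target, a well-structured log event in Elasticsearch typically carries fields along these lines (names loosely follow Elastic Common Schema conventions; the values are illustrative):
{
  "@timestamp": "2024-06-15T10:23:45.123Z",
  "host": { "name": "web-01" },
  "service": { "name": "checkout-api" },
  "log": { "level": "error" },
  "message": "Payment gateway timed out after 30s"
}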
2. Set Up an Elasticsearch Cluster
Elasticsearch can be deployed on-premises, in the cloud, or as a managed service. For production use, a cluster of at least three nodes is recommended for high availability.
Install Elasticsearch using one of the following methods:
- Managed Service: Elastic Cloud (Elastic's hosted offering, available on AWS, Azure, and GCP) or Amazon OpenSearch Service (a fork of Elasticsearch; recent Beats versions may not be fully compatible with it).
- Self-Hosted: Download from elastic.co and follow installation instructions for your OS.
After installation, verify Elasticsearch is running:
curl -X GET "localhost:9200"
You should receive a JSON response containing cluster name, version, and node details. Ensure the HTTP port (default 9200) is accessible from your log sources.
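The exact contents vary by version and configuration, but the response looks roughly like this (trimmed for brevity):
{
  "name" : "node-1",
  "cluster_name" : "elasticsearch",
  "version" : {
    "number" : "8.12.0",
    "build_flavor" : "default"
  },
  "tagline" : "You Know, for Search"
}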
3. Secure Your Elasticsearch Instance
Never expose Elasticsearch to the public internet without authentication. Enable security features:
- Enable Elasticsearch security features (free under the basic license since 7.1 and enabled by default since 8.0)
- Generate certificates for TLS encryption
- Create users and roles with minimal privileges
Example: Create a log-forwarder user with read/write access to log indices:
POST /_security/user/log_forwarder
{
"password" : "your_strong_password_123!",
"roles" : [ "logstash_writer" ],
"full_name" : "Log Forwarder Service"
}
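Note that logstash_writer is not a built-in role, so create it first with only the privileges the forwarder needs. A minimal sketch (adjust the index patterns to your own naming):
POST /_security/role/logstash_writer
{
  "cluster": ["monitor", "manage_index_templates", "manage_ilm"],
  "indices": [
    {
      "names": ["filebeat-*", "docker-logs-*"],
      "privileges": ["create_index", "create", "write", "manage"]
    }
  ]
}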
Configure your forwarding agent to use HTTPS and authenticate with these credentials.
4. Choose a Log Forwarding Agent
Log forwarding agents collect logs from sources and ship them to Elasticsearch. The three most widely used agents are:
- Filebeat – Lightweight, optimized for file-based logs (ideal for servers and containers)
- Fluentd – Highly configurable, Ruby-based, supports many input/output plugins
- Logstash – Feature-rich, supports complex filtering and transformation (heavier resource usage)
For most use cases, Filebeat is the recommended choice due to its low overhead and tight integration with the Elastic Stack.
5. Install and Configure Filebeat
Install Filebeat on each host that generates logs:
Ubuntu/Debian
curl -L -O https://artifacts.elastic.co/downloads/beats/filebeat/filebeat-8.12.0-amd64.deb
sudo dpkg -i filebeat-8.12.0-amd64.deb
CentOS/RHEL
curl -L -O https://artifacts.elastic.co/downloads/beats/filebeat/filebeat-8.12.0-x86_64.rpm
sudo rpm -vi filebeat-8.12.0-x86_64.rpm
Configure Filebeat by editing /etc/filebeat/filebeat.yml:
filebeat.inputs:
- type: filestream
  enabled: true
  paths:
    - /var/log/*.log
    - /var/log/syslog
    - /var/log/auth.log

processors:
  - add_host_metadata:
      when.not.contains.tags: forwarded
  - add_cloud_metadata: ~

output.elasticsearch:
  hosts: ["https://your-elasticsearch-host:9200"]
  username: "log_forwarder"
  password: "your_strong_password_123!"
  ssl.certificate_authorities: ["/etc/filebeat/ca.crt"]
  index: "filebeat-%{[agent.version]}-%{+yyyy.MM.dd}"
Key configuration notes:
- paths: Specify exact log file locations. Use wildcards cautiously to avoid performance issues.
- processors: Add metadata like hostname, cloud provider, and container labels automatically.
- output.elasticsearch: Use HTTPS with TLS certificate verification. Never disable SSL in production.
- index: Use date-based indexing for easier management and retention policies (see the note on template settings below).
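One caveat, based on current Filebeat behavior: when ILM is enabled on the cluster, Filebeat ignores a custom index setting and uses ILM-managed index names instead, and whenever you do override index you must also supply matching template settings. A sketch of the extra lines in filebeat.yml if you want the custom names to take effect:
setup.ilm.enabled: false             # otherwise the custom index name is ignored
setup.template.name: "filebeat"
setup.template.pattern: "filebeat-*"
Alternatively, leave ILM in control and drop the custom index line.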
6. Enable and Start Filebeat
Test your configuration before starting the service:
sudo filebeat test config
sudo filebeat test output
If both commands return “OK,” start and enable Filebeat:
sudo systemctl enable filebeat
sudo systemctl start filebeat
Monitor logs to ensure Filebeat is running without errors:
sudo journalctl -u filebeat -f
7. Verify Logs Are Reaching Elasticsearch
After Filebeat starts, check if logs are indexed in Elasticsearch:
curl -X GET "localhost:9200/_cat/indices?v"
You should see indices like filebeat-8.12.0-2024.06.15. Query a sample document:
curl -X GET "localhost:9200/filebeat-*/_search?size=1"
Ensure the response includes fields like @timestamp, host.name, log.file.path, and message. If fields are missing, revisit your Filebeat input configuration or consider using a log parser (see Step 8).
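To confirm that fresh events are still arriving (and not just an old backlog), a quick count over the last 15 minutes helps; this assumes the default @timestamp field:
curl -X GET "localhost:9200/filebeat-*/_count" -H 'Content-Type: application/json' -d'
{
  "query": { "range": { "@timestamp": { "gte": "now-15m" } } }
}'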
8. Parse and Structure Unstructured Logs
Most application logs are plain text. Elasticsearch performs best with structured JSON. Use Filebeat’s built-in processors or Logstash filters to parse logs into structured fields.
Example: Parsing Apache access logs using Filebeat’s dissect processor:
processors:
  - dissect:
      tokenizer: "%{client_ip} - - [%{timestamp}] \"%{request_method} %{request_path} %{protocol}\" %{status_code} %{bytes_sent}"
      field: "message"
      target_prefix: "apache"
This transforms:
192.168.1.10 - - [15/Jun/2024:10:23:45 +0000] "GET /index.html HTTP/1.1" 200 1234
into structured fields:
- apache.client_ip → 192.168.1.10
- apache.timestamp → 15/Jun/2024:10:23:45 +0000
- apache.request_method → GET
- apache.status_code → 200
For complex parsing (e.g., Java stack traces, multi-line logs), use the multiline processor:
- type: filestream
  enabled: true
  paths:
    - /var/log/myapp/*.log
  parsers:
    - multiline:
        type: pattern
        pattern: '^[[:space:]]+at |^[[:space:]]+Caused by:'
        match: after
This combines multi-line Java exceptions into a single log event.
9. Forward Docker Container Logs
Containerized applications write logs to stdout/stderr, which Docker stores as JSON files on the host. Filebeat reads them with the container input (the older docker input type is deprecated):
filebeat.inputs:
- type: container
  paths:
    - /var/lib/docker/containers/*/*.log

processors:
  - add_docker_metadata: ~

output.elasticsearch:
  hosts: ["https://your-elasticsearch-host:9200"]
  username: "log_forwarder"
  password: "your_strong_password_123!"
  ssl.certificate_authorities: ["/etc/filebeat/ca.crt"]
  index: "docker-logs-%{[agent.version]}-%{+yyyy.MM.dd}"
Ensure Filebeat can read /var/lib/docker/containers and the Docker socket (required by add_docker_metadata). If you run Filebeat as a dedicated non-root user, add it to the docker group:
sudo usermod -aG docker filebeat
sudo systemctl restart filebeat
Docker metadata (container name, image, labels) will be automatically enriched in each log event.
10. Forward Logs from Cloud Platforms
For AWS, use Fluent Bit on your instances to forward logs directly to Elasticsearch (or ship them to CloudWatch Logs first and forward from there via a subscription filter):
- Install Fluent Bit on EC2 instances
- Configure output plugin to send to Elasticsearch endpoint
- Use IAM roles for authentication instead of credentials
Example Fluent Bit config (/etc/fluent-bit/fluent-bit.conf):
[INPUT]
    Name   tail
    Path   /var/log/awslogs.log
    Tag    awslogs

[OUTPUT]
    Name        es
    Match       *
    Host        your-es-domain.region.es.amazonaws.com
    Port        443
    tls         On
    AWS_Auth    On
    AWS_Region  us-east-1
    Index       aws-logs
    Type        _doc
For Azure, use the Azure Monitor Agent (the successor to the retired Log Analytics agent) to route logs to a Log Analytics workspace or an Event Hub, then forward them to Elasticsearch — for example with Filebeat's azure-eventhub input or Logstash's azure_event_hubs plugin.
11. Set Up Index Lifecycle Management (ILM)
Without ILM, your Elasticsearch cluster will eventually run out of disk space. Define an ILM policy to automatically roll over, shrink, and delete old indices.
Create an ILM policy:
PUT _ilm/policy/log_policy
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": {
            "max_size": "50GB",
            "max_age": "7d"
          }
        }
      },
      "warm": {
        "min_age": "7d",
        "actions": {
          "allocate": {
            "number_of_replicas": 1
          }
        }
      },
      "delete": {
        "min_age": "30d",
        "actions": {
          "delete": {}
        }
      }
    }
  }
}
Apply the policy to your index template:
PUT _index_template/log_template
{
  "index_patterns": ["filebeat-*", "docker-logs-*"],
  "template": {
    "settings": {
      "number_of_shards": 3,
      "number_of_replicas": 1,
      "index.lifecycle.name": "log_policy",
      "index.lifecycle.rollover_alias": "filebeat"
    }
  }
}
Now Filebeat will automatically create new indices and manage lifecycle based on size and age.
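You can verify which lifecycle phase each index is in with the ILM explain API:
curl -X GET "localhost:9200/filebeat-*/_ilm/explain?pretty"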
Best Practices
1. Use Index Naming Conventions
Adopt a consistent naming scheme: appname-environment-timestamp (e.g., web-prod-2024.06.15). This improves readability, filtering, and automation.
2. Avoid Over-Indexing
Do not send every minor log event. Filter out debug-level logs in non-production environments. Use Filebeat's drop_event processor or the include_lines/exclude_lines input options to reduce noise (drop_fields can also trim unneeded fields from each event), as in the sketch below.
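A minimal sketch of noise filtering in filebeat.yml (paths and patterns are illustrative — adjust them to your own log format):
filebeat.inputs:
- type: filestream
  paths:
    - /var/log/myapp/*.log
  exclude_lines: ['^DEBUG', '^TRACE']      # drop debug/trace lines before shipping

processors:
  - drop_fields:
      fields: ["agent.ephemeral_id", "ecs.version"]   # trim fields you never query
      ignore_missing: true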
3. Optimize Field Types
Use Elasticsearch’s dynamic mapping cautiously. Define explicit mappings for critical fields like status_code (integer), timestamp (date), and message (text with keyword subfield). This improves search performance and prevents mapping conflicts.
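For example, extending the log_template from Step 11 with explicit mappings for a few critical fields might look like this (field names are illustrative):
PUT _index_template/log_template
{
  "index_patterns": ["filebeat-*", "docker-logs-*"],
  "template": {
    "settings": {
      "number_of_shards": 3,
      "number_of_replicas": 1,
      "index.lifecycle.name": "log_policy",
      "index.lifecycle.rollover_alias": "filebeat"
    },
    "mappings": {
      "properties": {
        "status_code": { "type": "integer" },
        "@timestamp":  { "type": "date" },
        "message": {
          "type": "text",
          "fields": { "keyword": { "type": "keyword", "ignore_above": 1024 } }
        }
      }
    }
  }
}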
4. Enable Compression
Enable gzip compression in Filebeat to reduce network bandwidth:
output.elasticsearch:
  hosts: ["https://your-elasticsearch-host:9200"]
  compression_level: 6
5. Monitor Forwarding Health
Use Filebeat's built-in HTTP stats endpoint (disabled by default; see the snippet below to enable it):
curl "http://localhost:5066/stats?pretty"
Monitor metrics such as libbeat.output.events.acked and libbeat.output.events.failed to detect delivery issues.
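Enabling the endpoint is a one-time change in filebeat.yml (port 5066 is the conventional default):
http.enabled: true
http.host: localhost
http.port: 5066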
6. Implement Redundancy and Retry Logic
Configure Filebeat to retry failed deliveries:
output.elasticsearch:
  max_retries: 5
  bulk_max_size: 50
  timeout: 90s
Buffer events locally so they survive short network outages. Filebeat uses an in-memory queue by default; for durability across restarts and longer outages, switch to the disk queue (only one queue type can be active at a time):
queue.disk:
  max_size: 10GB
7. Separate Logs by Source and Sensitivity
Use different indices for different log types (e.g., security-logs, application-logs, network-logs). Apply different retention and access controls per index.
8. Avoid Large Log Events
Logs exceeding 100KB can cause performance degradation. Use log rotation tools like logrotate to limit file sizes and avoid sending massive stack traces as single events.
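A typical logrotate policy for an application log directory might look like this (thresholds are illustrative; place it in /etc/logrotate.d/):
/var/log/myapp/*.log {
    daily
    size 200M
    rotate 14
    compress
    delaycompress
    missingok
    notifempty
    create
}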
9. Use TLS Everywhere
Never send logs over plain HTTP. Always use TLS 1.2+ with certificate pinning or CA validation. Self-signed certificates are acceptable if properly distributed and trusted.
10. Document Your Architecture
Create a diagram showing log sources → agents → Elasticsearch → Kibana. Document configuration paths, credentials, and escalation procedures. This is critical for onboarding and incident response.
Tools and Resources
Core Tools
- Filebeat – Lightweight log shipper from Elastic
- Fluent Bit – Fast, low-memory alternative to Fluentd
- Logstash – For complex transformations and enrichment
- Elasticsearch – Search and analytics engine
- Kibana – Visualization and dashboarding
Helper Tools
- Logrotate – Automate log file rotation and compression on Linux.
- jq – Command-line JSON processor for debugging log formats.
- curl – Test Elasticsearch APIs and verify connectivity.
- Telegraf – For metric and log collection (especially in IoT and edge environments).
- OpenSearch Dashboards – Open-source alternative to Kibana for Amazon OpenSearch.
Learning Resources
- Elastic Stack Documentation – Comprehensive guides for all components
- Fluentd Documentation – Plugin reference and configuration examples
- GitHub Repositories – Search for “filebeat elasticsearch example” for community configs.
- YouTube Tutorials – Search “Elastic Stack log forwarding tutorial” for visual walkthroughs.
- Reddit r/elastic – Active community for troubleshooting and advice.
Cloud Provider Integrations
- AWS: Use Fluent Bit with IAM roles or AWS FireLens.
- Azure: Use Azure Monitor Agent (AMA) or Log Analytics Agent.
- Google Cloud: Use Cloud Logging with custom sinks to Elasticsearch via Pub/Sub.
- Kubernetes: Deploy Filebeat or Fluent Bit as a DaemonSet (or as sidecar containers for per-pod collection).
Real Examples
Example 1: Forwarding Nginx Logs from 50 Servers
Scenario: A company runs 50 web servers with Nginx. Each server generates 500 log entries per minute.
Implementation:
- Install Filebeat on all 50 servers.
- Configure the input to read /var/log/nginx/access.log and /var/log/nginx/error.log.
- Use the dissect processor to parse the Nginx log format into structured fields: client_ip, method, path, status, bytes, user_agent.
- Apply an ILM policy: roll over at 20GB, delete after 60 days.
- Use Kibana to create a dashboard showing the top 10 error codes, request rates by endpoint, and geographic distribution of clients.
Result: Engineers reduced mean time to detect (MTTD) HTTP 500 errors from 4 hours to under 5 minutes.
Example 2: Containerized Microservices on Kubernetes
Scenario: A team deploys 30 microservices on Kubernetes. Each service logs to stdout.
Implementation:
- Deploy Fluent Bit as a DaemonSet on all worker nodes.
- Configure the input to read /var/log/containers/*.log.
- Use the Kubernetes filter plugin to extract pod name, namespace, container name, and labels (see the sketch after this list).
- Enrich logs with service version from Kubernetes annotations.
- Send to Elasticsearch using TLS with certificate from Kubernetes secrets.
- Create Kibana dashboard per service: error rate, latency percentiles, log volume trends.
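A sketch of the tail input and Kubernetes filter stages described above (values are illustrative; the official Fluent Bit Helm chart typically wires this up for you):
[INPUT]
    Name              tail
    Path              /var/log/containers/*.log
    Tag               kube.*
    Parser            cri               # assumes the stock cri parser shipped with the image
    Mem_Buf_Limit     5MB

[FILTER]
    Name                kubernetes
    Match               kube.*
    Merge_Log           On
    Keep_Log            Off
    K8S-Logging.Parser  On
    K8S-Logging.Exclude On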
Result: The team achieved full observability across services without modifying application code. Debugging cross-service issues became 70% faster.
Example 3: Security Event Forwarding from Firewalls
Scenario: A financial institution needs to centralize firewall and IDS logs for compliance.
Implementation:
- Configure Palo Alto firewalls to send syslog over TLS to a central syslog server.
- Run Filebeat on the syslog server, reading /var/log/fortinet/ and /var/log/paloalto/.
- Use grok filters (via Logstash) to parse firewall rule IDs, threat types, and source/destination IPs.
- Apply strict access control: only the SOC team can query security-logs-* indices.
- Set up alerts for repeated failed SSH attempts or outbound connections to known malicious IPs.
Result: The organization passed a SOC 2 audit and reduced incident response time by 65%.
FAQs
Can I forward logs to Elasticsearch without installing an agent on every server?
Yes, but with limitations. You can configure network devices or applications to send logs directly to Elasticsearch via syslog over TCP/UDP or HTTP POST. However, this bypasses buffering, retry logic, and metadata enrichment. For reliability, always use a lightweight agent like Filebeat or Fluent Bit as an intermediary.
How do I handle high-volume logs without overwhelming Elasticsearch?
Use batching, compression, and horizontal scaling. Increase the number of Elasticsearch data nodes. Tune bulk request size (5–20 MB). Use ILM to move older data to cold storage. Consider using Kafka or Redis as a buffer between agents and Elasticsearch for peak loads.
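In filebeat.yml, the knobs that matter most for throughput are the worker count and the bulk batch size — a sketch (numbers are starting points, not recommendations):
output.elasticsearch:
  hosts: ["https://es-node-1:9200", "https://es-node-2:9200"]
  worker: 2              # parallel bulk clients per configured host
  bulk_max_size: 1600    # events per bulk request
  compression_level: 5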
What’s the difference between Filebeat and Logstash?
Filebeat is a lightweight shipper designed to collect and forward logs with minimal resource usage. Logstash is a full ETL tool that can parse, transform, filter, and enrich logs — but requires more memory and CPU. Use Filebeat for simple forwarding; use Logstash when you need complex processing (e.g., geo-IP enrichment, conditional routing, field renaming).
Do I need Kibana to use Elasticsearch for logs?
No. Elasticsearch can store and query logs without Kibana. However, Kibana provides essential visualization, dashboards, alerting, and discovery tools. Without it, you’re limited to raw API queries, making log analysis impractical at scale.
How do I secure log data in transit and at rest?
Use TLS for all connections between agents and Elasticsearch. Enable encryption at rest using Elasticsearch’s built-in disk encryption (AWS EBS, Azure Disk Encryption, etc.). Restrict access via role-based permissions. Never store credentials in plain text — use secrets management tools like HashiCorp Vault or Kubernetes Secrets.
Can I forward logs from Windows machines?
Yes. Install Filebeat on Windows and configure inputs to read text logs from paths such as C:\ProgramData\MyApp\logs\*.log. For Windows Event Logs, use Winlogbeat (or Filebeat's winlog input, where available) and enrich events with Windows host metadata.
What happens if Elasticsearch goes down?
Filebeat and Fluent Bit buffer events locally and retry sending until delivery succeeds — in memory by default, or on disk if you configure a disk queue or filesystem buffer. Ensure sufficient disk space on agents to handle outages. Monitor queue depth and set alerts if it exceeds 80% of capacity.
Is it better to use Elasticsearch or a dedicated log management tool like Splunk?
Elasticsearch is more cost-effective and flexible, especially for teams with engineering resources. Splunk offers more out-of-the-box features and support but is significantly more expensive. For most modern DevOps teams, the Elastic Stack provides superior value and scalability.
How often should I rotate log files on the source?
Rotate daily or when files reach 100–500 MB. Use logrotate on Linux with compression and deletion policies. Large files slow down Filebeat’s tailing process and increase memory usage.
Can I forward logs to multiple Elasticsearch clusters?
Not directly from a single instance — Filebeat allows only one output type at a time, although you can list multiple hosts for that output and load-balance across them. To ship to independent primary and backup clusters, run a second Filebeat instance or route events through Logstash or Kafka, which can fan out to multiple clusters for disaster recovery.
Conclusion
Forwarding logs to Elasticsearch is not merely a technical task — it’s a foundational practice for modern observability, security, and operational excellence. By centralizing logs from servers, containers, cloud services, and applications into a single, searchable repository, you unlock the ability to detect anomalies, troubleshoot failures, and optimize performance at scale.
This guide has walked you through the complete process: from selecting the right tools and securing your Elasticsearch cluster, to configuring agents, parsing unstructured data, managing indices, and applying real-world best practices. Whether you’re managing a handful of servers or thousands of microservices, the principles remain the same: automate, structure, secure, and monitor.
Remember: logs are only as valuable as the insights they enable. Invest time in building robust, maintainable log pipelines. Document your configurations. Monitor your forwarders. Continuously refine your dashboards and alerts. The goal is not just to collect logs — it’s to turn them into actionable intelligence that drives reliability, security, and innovation.
Start small. Test thoroughly. Scale deliberately. And let your logs guide you — not overwhelm you.