How to Forward Logs to Elasticsearch
Log data is the silent heartbeat of modern IT infrastructure. Every server request, application error, security event, and system metric generates a stream of logs that, when properly collected and analyzed, reveal critical insights into performance, reliability, and security. However, raw log files scattered across dozens of machines are nearly impossible to interpret at scale. This is where Elasticsearch comes in — a powerful, distributed search and analytics engine designed to ingest, store, and query massive volumes of structured and unstructured data in near real time.
Forwarding logs to Elasticsearch transforms chaotic log files into actionable intelligence. Whether you're managing microservices on Kubernetes, monitoring cloud-native applications, or securing enterprise networks, centralizing logs in Elasticsearch enables powerful visualizations, alerting, and root-cause analysis. This tutorial provides a comprehensive, step-by-step guide to forwarding logs to Elasticsearch, covering tools, configurations, best practices, real-world examples, and common pitfalls to avoid.
By the end of this guide, you’ll understand how to securely and efficiently transport logs from diverse sources — including Linux systems, Docker containers, cloud platforms, and custom applications — into Elasticsearch for centralized monitoring and analysis.
Step-by-Step Guide
1. Understand Your Log Sources and Requirements
Before configuring log forwarding, identify where your logs originate and what format they use. Common sources include:
- System logs on Linux/Unix servers (e.g., /var/log/syslog, /var/log/auth.log)
- Application logs (e.g., Node.js, Python, Java applications writing to files)
- Docker and containerized environments (stdout/stderr streams)
- Cloud services (AWS CloudWatch, Azure Monitor, Google Cloud Logging)
- Network devices and firewalls (via syslog or API)
Determine the volume of logs per second, retention policies, and required fields (e.g., timestamp, hostname, level, message, service name). This informs your choice of forwarding agent and Elasticsearch mapping strategy.
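As a concrete target, a well-structured log event in Elasticsearch typically carries fields along these lines (names loosely follow Elastic Common Schema conventions; the values are illustrative):
{
  "@timestamp": "2024-06-15T10:23:45.123Z",
  "host": { "name": "web-01" },
  "service": { "name": "checkout-api" },
  "log": { "level": "error" },
  "message": "Payment gateway timed out after 30s"
}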
2. Set Up an Elasticsearch Cluster
Elasticsearch can be deployed on-premises, in the cloud, or as a managed service. For production use, a cluster of at least three nodes is recommended for high availability.
Install Elasticsearch using one of the following methods:
- Managed Service: Elastic Cloud (Elastic's hosted offering, available on AWS, Azure, and GCP) or Amazon OpenSearch Service (a fork of Elasticsearch; recent Beats versions may not be fully compatible with it).
- Self-Hosted: Download from elastic.co and follow installation instructions for your OS.
After installation, verify Elasticsearch is running:
curl -X GET "localhost:9200"
You should receive a JSON response containing cluster name, version, and node details. Ensure the HTTP port (default 9200) is accessible from your log sources.
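The exact contents vary by version and configuration, but the response looks roughly like this (trimmed for brevity):
{
  "name" : "node-1",
  "cluster_name" : "elasticsearch",
  "version" : {
    "number" : "8.12.0",
    "build_flavor" : "default"
  },
  "tagline" : "You Know, for Search"
}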
3. Secure Your Elasticsearch Instance
Never expose Elasticsearch to the public internet without authentication. Enable security features:
- Enable Elasticsearch security features (free under the basic license since 7.1 and enabled by default since 8.0)
- Generate certificates for TLS encryption
- Create users and roles with minimal privileges
Example: Create a log-forwarder user with read/write access to log indices:
POST /_security/user/log_forwarder
{
"password" : "your_strong_password_123!",
"roles" : [ "logstash_writer" ],
"full_name" : "Log Forwarder Service"
}
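Note that logstash_writer is not a built-in role, so create it first with only the privileges the forwarder needs. A minimal sketch (adjust the index patterns to your own naming):
POST /_security/role/logstash_writer
{
  "cluster": ["monitor", "manage_index_templates", "manage_ilm"],
  "indices": [
    {
      "names": ["filebeat-*", "docker-logs-*"],
      "privileges": ["create_index", "create", "write", "manage"]
    }
  ]
}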
Configure your forwarding agent to use HTTPS and authenticate with these credentials.
4. Choose a Log Forwarding Agent
Log forwarding agents collect logs from sources and ship them to Elasticsearch. The three most widely used agents are:
- Filebeat – Lightweight, optimized for file-based logs (ideal for servers and containers)
- Fluentd – Highly configurable, Ruby-based, supports many input/output plugins
- Logstash – Feature-rich, supports complex filtering and transformation (heavier resource usage)
For most use cases, Filebeat is the recommended choice due to its low overhead and tight integration with the Elastic Stack.
5. Install and Configure Filebeat
Install Filebeat on each host that generates logs:
Ubuntu/Debian
curl -L -O https://artifacts.elastic.co/downloads/beats/filebeat/filebeat-8.12.0-amd64.deb
sudo dpkg -i filebeat-8.12.0-amd64.deb
CentOS/RHEL
curl -L -O https://artifacts.elastic.co/downloads/beats/filebeat/filebeat-8.12.0-x86_64.rpm
sudo rpm -vi filebeat-8.12.0-x86_64.rpm
Configure Filebeat by editing /etc/filebeat/filebeat.yml:
filebeat.inputs:
- type: filestream
  enabled: true
  paths:
    - /var/log/*.log
    - /var/log/syslog
    - /var/log/auth.log

processors:
  - add_host_metadata:
      when.not.contains.tags: forwarded
  - add_cloud_metadata: ~

output.elasticsearch:
  hosts: ["https://your-elasticsearch-host:9200"]
  username: "log_forwarder"
  password: "your_strong_password_123!"
  ssl.certificate_authorities: ["/etc/filebeat/ca.crt"]
  index: "filebeat-%{[agent.version]}-%{+yyyy.MM.dd}"
Key configuration notes:
- paths: Specify exact log file locations. Use wildcards cautiously to avoid performance issues.
- processors: Add metadata like hostname, cloud provider, and container labels automatically.
- output.elasticsearch: Use HTTPS with TLS certificate verification. Never disable SSL in production.
- index: Use date-based indexing for easier management and retention policies (see the note on template settings below).
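One caveat, based on current Filebeat behavior: when ILM is enabled on the cluster, Filebeat ignores a custom index setting and uses ILM-managed index names instead, and whenever you do override index you must also supply matching template settings. A sketch of the extra lines in filebeat.yml if you want the custom names to take effect:
setup.ilm.enabled: false             # otherwise the custom index name is ignored
setup.template.name: "filebeat"
setup.template.pattern: "filebeat-*"
Alternatively, leave ILM in control and drop the custom index line.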
6. Enable and Start Filebeat
Test your configuration before starting the service:
sudo filebeat test config
sudo filebeat test output
If both commands return “OK,” start and enable Filebeat:
sudo systemctl enable filebeat
sudo systemctl start filebeat
Monitor logs to ensure Filebeat is running without errors:
sudo journalctl -u filebeat -f
7. Verify Logs Are Reaching Elasticsearch
After Filebeat starts, check if logs are indexed in Elasticsearch:
curl -X GET "localhost:9200/_cat/indices?v"
You should see indices like filebeat-8.12.0-2024.06.15. Query a sample document:
curl -X GET "localhost:9200/filebeat-*/_search?size=1"
Ensure the response includes fields like @timestamp, host.name, log.file.path, and message. If fields are missing, revisit your Filebeat input configuration or consider using a log parser (see Step 8).
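To confirm that fresh events are still arriving (and not just an old backlog), a quick count over the last 15 minutes helps; this assumes the default @timestamp field:
curl -X GET "localhost:9200/filebeat-*/_count" -H 'Content-Type: application/json' -d'
{
  "query": { "range": { "@timestamp": { "gte": "now-15m" } } }
}'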
8. Parse and Structure Unstructured Logs
Most application logs are plain text. Elasticsearch performs best with structured JSON. Use Filebeat’s built-in processors or Logstash filters to parse logs into structured fields.
Example: Parsing Apache access logs using Filebeat’s dissect processor:
processors:
  - dissect:
      tokenizer: "%{client_ip} - - [%{timestamp}] \"%{request_method} %{request_path} %{protocol}\" %{status_code} %{bytes_sent}"
      field: "message"
      target_prefix: "apache"
This transforms:
192.168.1.10 - - [15/Jun/2024:10:23:45 +0000] "GET /index.html HTTP/1.1" 200 1234
into structured fields:
- apache.client_ip → 192.168.1.10
- apache.timestamp → 15/Jun/2024:10:23:45 +0000
- apache.request_method → GET
- apache.status_code → 200
For complex parsing (e.g., Java stack traces, multi-line logs), use the multiline processor:
- type: filestream
  enabled: true
  paths:
    - /var/log/myapp/*.log
  parsers:
    - multiline:
        type: pattern
        pattern: '^[[:space:]]+at |^[[:space:]]+Caused by:'
        match: after
This combines multi-line Java exceptions into a single log event.
9. Forward Docker Container Logs
Containerized applications write logs to stdout/stderr, which Docker stores as JSON files on the host. Filebeat reads them with the container input (the older docker input type is deprecated):
filebeat.inputs:
- type: container
  paths:
    - /var/lib/docker/containers/*/*.log

processors:
  - add_docker_metadata: ~

output.elasticsearch:
  hosts: ["https://your-elasticsearch-host:9200"]
  username: "log_forwarder"
  password: "your_strong_password_123!"
  ssl.certificate_authorities: ["/etc/filebeat/ca.crt"]
  index: "docker-logs-%{[agent.version]}-%{+yyyy.MM.dd}"
Ensure Filebeat can read /var/lib/docker/containers and the Docker socket (required by add_docker_metadata). If you run Filebeat as a dedicated non-root user, add it to the docker group:
sudo usermod -aG docker filebeat
sudo systemctl restart filebeat
Docker metadata (container name, image, labels) will be automatically enriched in each log event.
10. Forward Logs from Cloud Platforms
For AWS, use Fluent Bit on your instances to forward logs directly to Elasticsearch (or ship them to CloudWatch Logs first and forward from there via a subscription filter):
- Install Fluent Bit on EC2 instances
- Configure output plugin to send to Elasticsearch endpoint
- Use IAM roles for authentication instead of credentials
Example Fluent Bit config (/etc/fluent-bit/fluent-bit.conf):
[INPUT]
    Name   tail
    Path   /var/log/awslogs.log
    Tag    awslogs

[OUTPUT]
    Name        es
    Match       *
    Host        your-es-domain.region.es.amazonaws.com
    Port        443
    tls         On
    AWS_Auth    On
    AWS_Region  us-east-1
    Index       aws-logs
    Type        _doc
For Azure, use the Azure Monitor Agent (the successor to the retired Log Analytics agent) to route logs to a Log Analytics workspace or an Event Hub, then forward them to Elasticsearch — for example with Filebeat's azure-eventhub input or Logstash's azure_event_hubs plugin.
11. Set Up Index Lifecycle Management (ILM)
Without ILM, your Elasticsearch cluster will eventually run out of disk space. Define an ILM policy to automatically roll over, shrink, and delete old indices.
Create an ILM policy:
PUT _ilm/policy/log_policy
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": {
            "max_size": "50GB",
            "max_age": "7d"
          }
        }
      },
      "warm": {
        "min_age": "7d",
        "actions": {
          "allocate": {
            "number_of_replicas": 1
          }
        }
      },
      "delete": {
        "min_age": "30d",
        "actions": {
          "delete": {}
        }
      }
    }
  }
}
Apply the policy to your index template:
PUT _index_template/log_template
{
  "index_patterns": ["filebeat-*", "docker-logs-*"],
  "template": {
    "settings": {
      "number_of_shards": 3,
      "number_of_replicas": 1,
      "index.lifecycle.name": "log_policy",
      "index.lifecycle.rollover_alias": "filebeat"
    }
  }
}
Now Filebeat will automatically create new indices and manage lifecycle based on size and age.
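You can verify which lifecycle phase each index is in with the ILM explain API:
curl -X GET "localhost:9200/filebeat-*/_ilm/explain?pretty"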
Best Practices
1. Use Index Naming Conventions
Adopt a consistent naming scheme: appname-environment-timestamp (e.g., web-prod-2024.06.15). This improves readability, filtering, and automation.
2. Avoid Over-Indexing
Do not send every minor log event. Filter out debug-level logs in non-production environments. Use Filebeat's drop_event processor or the include_lines/exclude_lines input options to reduce noise (drop_fields can also trim unneeded fields from each event), as in the sketch below.
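A minimal sketch of noise filtering in filebeat.yml (paths and patterns are illustrative — adjust them to your own log format):
filebeat.inputs:
- type: filestream
  paths:
    - /var/log/myapp/*.log
  exclude_lines: ['^DEBUG', '^TRACE']      # drop debug/trace lines before shipping

processors:
  - drop_fields:
      fields: ["agent.ephemeral_id", "ecs.version"]   # trim fields you never query
      ignore_missing: true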
3. Optimize Field Types
Use Elasticsearch’s dynamic mapping cautiously. Define explicit mappings for critical fields like status_code (integer), timestamp (date), and message (text with keyword subfield). This improves search performance and prevents mapping conflicts.
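For example, extending the log_template from Step 11 with explicit mappings for a few critical fields might look like this (field names are illustrative):
PUT _index_template/log_template
{
  "index_patterns": ["filebeat-*", "docker-logs-*"],
  "template": {
    "settings": {
      "number_of_shards": 3,
      "number_of_replicas": 1,
      "index.lifecycle.name": "log_policy",
      "index.lifecycle.rollover_alias": "filebeat"
    },
    "mappings": {
      "properties": {
        "status_code": { "type": "integer" },
        "@timestamp":  { "type": "date" },
        "message": {
          "type": "text",
          "fields": { "keyword": { "type": "keyword", "ignore_above": 1024 } }
        }
      }
    }
  }
}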
4. Enable Compression
Enable gzip compression in Filebeat to reduce network bandwidth:
output.elasticsearch:
  hosts: ["https://your-elasticsearch-host:9200"]
  compression_level: 6
5. Monitor Forwarding Health
Use Filebeat's built-in HTTP stats endpoint (disabled by default; see the snippet below to enable it):
curl "http://localhost:5066/stats?pretty"
Monitor metrics such as libbeat.output.events.acked and libbeat.output.events.failed to detect delivery issues.
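Enabling the endpoint is a one-time change in filebeat.yml (port 5066 is the conventional default):
http.enabled: true
http.host: localhost
http.port: 5066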
6. Implement Redundancy and Retry Logic
Configure Filebeat to retry failed deliveries:
output.elasticsearch:
  max_retries: 5
  bulk_max_size: 50
  timeout: 90s
Buffer events locally so they survive short network outages. Filebeat uses an in-memory queue by default; for durability across restarts and longer outages, switch to the disk queue (only one queue type can be active at a time):
queue.disk:
  max_size: 10GB
7. Separate Logs by Source and Sensitivity
Use different indices for different log types (e.g., security-logs, application-logs, network-logs). Apply different retention and access controls per index.
8. Avoid Large Log Events
Logs exceeding 100KB can cause performance degradation. Use log rotation tools like logrotate to limit file sizes and avoid sending massive stack traces as single events.
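A typical logrotate policy for an application log directory might look like this (thresholds are illustrative; place it in /etc/logrotate.d/):
/var/log/myapp/*.log {
    daily
    size 200M
    rotate 14
    compress
    delaycompress
    missingok
    notifempty
    create
}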
9. Use TLS Everywhere
Never send logs over plain HTTP. Always use TLS 1.2+ with certificate pinning or CA validation. Self-signed certificates are acceptable if properly distributed and trusted.
10. Document Your Architecture
Create a diagram showing log sources → agents → Elasticsearch → Kibana. Document configuration paths, credentials, and escalation procedures. This is critical for onboarding and incident response.
Tools and Resources
Core Tools
- Filebeat – Lightweight log shipper from Elastic
- Fluent Bit – Fast, low-memory alternative to Fluentd
- Logstash – For complex transformations and enrichment
- Elasticsearch – Search and analytics engine
- Kibana – Visualization and dashboarding
Helper Tools
- Logrotate – Automate log file rotation and compression on Linux.
- jq – Command-line JSON processor for debugging log formats.
- curl – Test Elasticsearch APIs and verify connectivity.
- Telegraf – For metric and log collection (especially in IoT and edge environments).
- OpenSearch Dashboards – Open-source alternative to Kibana for Amazon OpenSearch.
Learning Resources
- Elastic Stack Documentation – Comprehensive guides for all components
- Fluentd Documentation – Plugin reference and configuration examples
- GitHub Repositories – Search for “filebeat elasticsearch example” for community configs.
- YouTube Tutorials – Search “Elastic Stack log forwarding tutorial” for visual walkthroughs.
- Reddit r/elastic – Active community for troubleshooting and advice.
Cloud Provider Integrations
- AWS: Use Fluent Bit with IAM roles or AWS FireLens.
- Azure: Use Azure Monitor Agent (AMA) or Log Analytics Agent.
- Google Cloud: Use Cloud Logging with custom sinks to Elasticsearch via Pub/Sub.
- Kubernetes: Deploy Filebeat or Fluent Bit as a DaemonSet (or as sidecar containers for per-pod collection).
Real Examples
Example 1: Forwarding Nginx Logs from 50 Servers
Scenario: A company runs 50 web servers with Nginx. Each server generates 500 log entries per minute.
Implementation:
- Install Filebeat on all 50 servers.
- Configure the input to read /var/log/nginx/access.log and /var/log/nginx/error.log.
- Use the dissect processor to parse the Nginx log format into structured fields: client_ip, method, path, status, bytes, user_agent.
- Apply an ILM policy: roll over at 20GB, delete after 60 days.
- Use Kibana to create a dashboard showing the top 10 error codes, request rates by endpoint, and geographic distribution of clients.
Result: Engineers reduced mean time to detect (MTTD) HTTP 500 errors from 4 hours to under 5 minutes.
Example 2: Containerized Microservices on Kubernetes
Scenario: A team deploys 30 microservices on Kubernetes. Each service logs to stdout.
Implementation:
- Deploy Fluent Bit as a DaemonSet on all worker nodes.
- Configure the input to read /var/log/containers/*.log.
- Use the Kubernetes filter plugin to extract pod name, namespace, container name, and labels (see the sketch after this list).
- Enrich logs with service version from Kubernetes annotations.
- Send to Elasticsearch using TLS with certificate from Kubernetes secrets.
- Create Kibana dashboard per service: error rate, latency percentiles, log volume trends.
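A sketch of the tail input and Kubernetes filter stages described above (values are illustrative; the official Fluent Bit Helm chart typically wires this up for you):
[INPUT]
    Name              tail
    Path              /var/log/containers/*.log
    Tag               kube.*
    Parser            cri               # assumes the stock cri parser shipped with the image
    Mem_Buf_Limit     5MB

[FILTER]
    Name                kubernetes
    Match               kube.*
    Merge_Log           On
    Keep_Log            Off
    K8S-Logging.Parser  On
    K8S-Logging.Exclude On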
Result: The team achieved full observability across services without modifying application code. Debugging cross-service issues became 70% faster.
Example 3: Security Event Forwarding from Firewalls
Scenario: A financial institution needs to centralize firewall and IDS logs for compliance.
Implementation:
- Configure Palo Alto firewalls to send syslog over TLS to a central syslog server.
- Run Filebeat on the syslog server, reading /var/log/fortinet/ and /var/log/paloalto/.
- Use grok filters (via Logstash) to parse firewall rule IDs, threat types, and source/destination IPs.
- Apply strict access control: only the SOC team can query security-logs-* indices.
- Set up alerts for repeated failed SSH attempts or outbound connections to known malicious IPs.
Result: The organization passed a SOC 2 audit and reduced incident response time by 65%.
FAQs
Can I forward logs to Elasticsearch without installing an agent on every server?
Yes, but with limitations. You can configure network devices or applications to send logs directly to Elasticsearch via syslog over TCP/UDP or HTTP POST. However, this bypasses buffering, retry logic, and metadata enrichment. For reliability, always use a lightweight agent like Filebeat or Fluent Bit as an intermediary.
How do I handle high-volume logs without overwhelming Elasticsearch?
Use batching, compression, and horizontal scaling. Increase the number of Elasticsearch data nodes. Tune bulk request size (5–20 MB). Use ILM to move older data to cold storage. Consider using Kafka or Redis as a buffer between agents and Elasticsearch for peak loads.
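In filebeat.yml, the knobs that matter most for throughput are the worker count and the bulk batch size — a sketch (numbers are starting points, not recommendations):
output.elasticsearch:
  hosts: ["https://es-node-1:9200", "https://es-node-2:9200"]
  worker: 2              # parallel bulk clients per configured host
  bulk_max_size: 1600    # events per bulk request
  compression_level: 5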
What’s the difference between Filebeat and Logstash?
Filebeat is a lightweight shipper designed to collect and forward logs with minimal resource usage. Logstash is a full ETL tool that can parse, transform, filter, and enrich logs — but requires more memory and CPU. Use Filebeat for simple forwarding; use Logstash when you need complex processing (e.g., geo-IP enrichment, conditional routing, field renaming).
Do I need Kibana to use Elasticsearch for logs?
No. Elasticsearch can store and query logs without Kibana. However, Kibana provides essential visualization, dashboards, alerting, and discovery tools. Without it, you’re limited to raw API queries, making log analysis impractical at scale.
How do I secure log data in transit and at rest?
Use TLS for all connections between agents and Elasticsearch. Enable encryption at rest using Elasticsearch’s built-in disk encryption (AWS EBS, Azure Disk Encryption, etc.). Restrict access via role-based permissions. Never store credentials in plain text — use secrets management tools like HashiCorp Vault or Kubernetes Secrets.
Can I forward logs from Windows machines?
Yes. Install Filebeat on Windows and configure inputs to read text logs from paths such as C:\ProgramData\MyApp\logs\*.log. For Windows Event Logs, use Winlogbeat (or Filebeat's winlog input, where available) and enrich events with Windows host metadata.
What happens if Elasticsearch goes down?
Filebeat and Fluent Bit buffer events locally and retry sending until delivery succeeds — in memory by default, or on disk if you configure a disk queue or filesystem buffer. Ensure sufficient disk space on agents to handle outages. Monitor queue depth and set alerts if it exceeds 80% of capacity.
Is it better to use Elasticsearch or a dedicated log management tool like Splunk?
Elasticsearch is more cost-effective and flexible, especially for teams with engineering resources. Splunk offers more out-of-the-box features and support but is significantly more expensive. For most modern DevOps teams, the Elastic Stack provides superior value and scalability.
How often should I rotate log files on the source?
Rotate daily or when files reach 100–500 MB. Use logrotate on Linux with compression and deletion policies. Large files slow down Filebeat’s tailing process and increase memory usage.
Can I forward logs to multiple Elasticsearch clusters?
Not directly from a single instance — Filebeat allows only one output type at a time, although you can list multiple hosts for that output and load-balance across them. To ship to independent primary and backup clusters, run a second Filebeat instance or route events through Logstash or Kafka, which can fan out to multiple clusters for disaster recovery.
Conclusion
Forwarding logs to Elasticsearch is not merely a technical task — it’s a foundational practice for modern observability, security, and operational excellence. By centralizing logs from servers, containers, cloud services, and applications into a single, searchable repository, you unlock the ability to detect anomalies, troubleshoot failures, and optimize performance at scale.
This guide has walked you through the complete process: from selecting the right tools and securing your Elasticsearch cluster, to configuring agents, parsing unstructured data, managing indices, and applying real-world best practices. Whether you’re managing a handful of servers or thousands of microservices, the principles remain the same: automate, structure, secure, and monitor.
Remember: logs are only as valuable as the insights they enable. Invest time in building robust, maintainable log pipelines. Document your configurations. Monitor your forwarders. Continuously refine your dashboards and alerts. The goal is not just to collect logs — it’s to turn them into actionable intelligence that drives reliability, security, and innovation.
Start small. Test thoroughly. Scale deliberately. And let your logs guide you — not overwhelm you.