Shipping logs to Logstash with Filebeat

I've been spending some time looking at how to get data into my ELK stack, and one of the least disruptive options is Elastic's own Filebeat log shipper. It's one of the easiest ways to upgrade applications to centralised logging, as it doesn't require any code or configuration changes: as long as an application is already logging to a file, Filebeat can plug straight into that ecosystem and push log events across to Logstash. As a bonus, it can send over an SSL transport, so log data can be kept secure.

Unfortunately it's a lot more of a pain to set up than you'd expect, especially if there's a Windows host in the mix. In particular, SSL turned out to be a bit of a rabbit hole, and I've decided to leave it for another article: between silent log files, generating certificates of the right format on Windows and generally getting things to work, there's so much messing around that it would get in the way if it wasn't an article in itself.

In the meantime, though, here's how to get the basics set up:

Installing

Grab Filebeat from https://www.elastic.co/downloads/beats/filebeat and extract it, or install it via your package manager if it's in your repositories. (On Windows, just extract the files into a directory.)
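
For example, on a Debian-based system the packages from that download page can be installed along these lines (the exact version number will have moved on since I wrote this):

# Download and install the Filebeat deb package - adjust the version as needed
curl -L -O https://download.elastic.co/beats/filebeat/filebeat_1.0.1_amd64.deb
sudo dpkg -i filebeat_1.0.1_amd64.deb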

Configure Filebeat

Filebeat is configured using the filebeat.yml file. On Windows this will be in the directory you extracted Filebeat into; on Linux it should end up at /etc/filebeat/filebeat.yml. There are a lot of options in this file, but you don't need anything too complex to get up and running. For a simple setup, you'll need:

  • At least one prospector to gather data
  • At least one output (we'll be using Logstash, but you can output directly to Elasticsearch if you don't need Logstash in the mix)

Here's my example configuration file for a Windows application server:

filebeat:
  prospectors:
    -
      paths:
        - C:/Sandbox/myapp/App_Data/*.txt
      document_type: myapp_log
    -
      paths:
        - C:/Sandbox/myotherapp/App_Data/*.txt
      document_type: myotherapp_log
  registry_file: "C:/ProgramData/filebeat/registry"

output:
  logstash:
    hosts: ["123.45.67.89:5000"]
    worker: 1

Here I tell Filebeat to look at log files from a couple of development app folders, using paths. I also set document_type for each, which I can use in my Logstash configuration to choose appropriate Grok filters and other processing for each log. This is how Filebeat can ship logs from applications with many different log formats.

I also set a registry file for Filebeat to keep track of what it's sent.

Finally, I set up an output section for Logstash. 123.45.67.89:5000 is the IP address and port of my Logstash server, and because this is only a simple example I've chosen a single worker thread for sending data.

Note there's no TLS section in this configuration - everything will be sent unencrypted. (Filebeat does compress the stream, but anyone who can intercept and decompress it will be able to read it)
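
When I do get to TLS, the change will be a tls block under the logstash output. As a rough sketch of the shape it takes in Filebeat 1.x (the certificate path here is hypothetical - the details are what the follow-up article is for):

output:
  logstash:
    hosts: ["123.45.67.89:5000"]
    tls:
      # Hypothetical path to the certificate used to verify the Logstash server
      certificate_authorities: ["C:/ProgramData/filebeat/logstash.crt"]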

Timestamp errors

Now for the first problem I encountered. If you're using the default logstash container, you'll get the following error when you configure the beats input:

"Beats input: unhandled exception", :exception=>#<TypeError: The field '@timestamp' must be a (LogStash::Timestamp, not a String (2015-11-19T11:58:10.424Z)>

This is because the beats input plugin shipped with the logstash container isn't up to date.

I've created a container with an up-to-date version, available on Docker Hub - or adapt the Dockerfile for your own needs from my GitHub repository. If you're using my ELK stack tutorial as a basis, change the following lines in your docker-compose.yml file to pull the correct container:

logstash:
  image: mattkimber/logstash_beats:2.0.0-1
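
If you're running Logstash outside Docker, updating the plugin in place should also work; with Logstash 2.x the bundled plugin script does this:

bin/plugin update logstash-input-beats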

Set up Logstash to filter different document types

To take advantage of the document types I set up in the Filebeat configuration, I need to update the filters section of the logstash.conf file on the logging server, adding conditionals to choose between the different types:

filter {
        if [type] == "myapp_log" {
                multiline {
                        ...
                }
                grok {
                        ...
                }
                date {
                        ...
                }
        }
        if [type] == "myotherapp_log" {
                multiline {
                        ...
                }
                grok {
                        ...
                }
                date {
                        ...
                }
        }
}

(Obviously, you'll have real configuration values where I put ...)

This allows you to use different multiline, grok and date filters for each of your different log formats, meaning you can share logs from quite disparate sources on the same ELK stack server.
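
As a purely hypothetical illustration: if myapp wrote log lines like 2015-11-19 11:58:10,424 [ERROR] Something broke, its branch of the filter might end up looking like this (the patterns and field names are mine, not from any real application):

filter {
        if [type] == "myapp_log" {
                # Join continuation lines (e.g. stack traces) onto the
                # preceding line that starts with a timestamp
                multiline {
                        pattern => "^%{TIMESTAMP_ISO8601}"
                        negate => true
                        what => "previous"
                }
                # Split the line into timestamp, level and message fields
                grok {
                        match => { "message" => "%{TIMESTAMP_ISO8601:logtime} \[%{LOGLEVEL:level}\] %{GREEDYDATA:logmessage}" }
                }
                # Use the application's timestamp rather than the ingestion time
                date {
                        match => [ "logtime", "yyyy-MM-dd HH:mm:ss,SSS" ]
                }
        }
}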

Everything ready: installing the Filebeat service

This will vary depending on platform. My application servers are running Windows, so I run the following from my Filebeat directory in Powershell:

.\install-service-filebeat.ps1

This will install Filebeat as a service on your host. On Linux it's a bit easier as most of the packages install the service for you, so you just need a quick sudo /etc/init.d/filebeat start or system-specific equivalent.

On Windows you can run Filebeat from a console to test your settings, simply by executing filebeat -c /path/to/config. This is useful while you're still debugging your configuration, or if you're only setting things up on a development server.
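
For example, from the Filebeat directory:

# -e sends Filebeat's own logs to the console rather than a log file,
# and -v makes them verbose enough to spot configuration problems
.\filebeat.exe -c filebeat.yml -e -v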

You should now see events from your application server appearing in your ELK stack - without needing to make any changes to the applications themselves.

Image by Cralize CC-SA 3.0