Sending Logs to Scalyr Using Fluentd

Fluentd is open source software that unifies log data collection and is designed to scale and simplify log management. You can stream logs to Scalyr with fluent-plugin-scalyr, so you can search logs, set up alerts, and build dashboards from a centralized log repository.


Step 1: Install Fluentd

Follow the instructions to install Fluentd on your machine.

Choose the installation instructions for your operating system. I did all of my testing using Fluentd 0.16 and CentOS 7. In general, Fluentd 0.14 and above should all be fine.

Step 2: Install fluent-plugin-scalyr

Run the following command to install fluent-plugin-scalyr:

td-agent-gem install fluent-plugin-scalyr

You can check whether this is the latest version, and review its dependencies, on this page.

You may need to install ruby-dev (and possibly make/gcc) as well, depending on your current environment.

Step 3: Set Up the Fluentd Configuration File

You can find the Fluentd configuration file at /etc/td-agent/td-agent.conf. I am going to use a sample configuration to demonstrate the Fluentd to Scalyr ingestion workflow.

The configuration file consists of a series of directives, and you need to include at least “source”, “filter”, and “match” directives in order to send logs to Scalyr.

Source directives control the input sources. In this example, Fluentd accepts events from three different sources:

  • HTTP messages on port 8888
  • TCP packets on port 24224
  • Events read from the tail of the access log file

The scalyr.apache.access tag in the access log source directive is what the “filter” and “match” directives later in the configuration match against.
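As a rough illustration, the three sources described above could be declared as in the sketch below. It uses Fluentd's standard http, forward, and tail input plugins. The pos_file path and the choice of the "none" parser (which stores each raw line in a "message" field) are assumptions made for this sketch, not values taken from the sample.

```
# HTTP input: accepts POST requests; the URL path becomes the event tag
<source>
  @type http
  port 8888
</source>

# Forward input: accepts events over TCP (used by Docker's Fluentd log driver)
<source>
  @type forward
  port 24224
</source>

# Tail input: follows the Apache access log and tags each new line
<source>
  @type tail
  path /var/log/httpd/access_log
  # Assumed location; Fluentd records its read position in this file
  pos_file /var/log/td-agent/httpd-access.pos
  tag scalyr.apache.access
  <parse>
    # Assumed parser: keep the raw line as-is in the "message" field
    @type none
  </parse>
</source>
```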

Filter directives define the event processing pipeline. For log ingestion to Scalyr, filter directives are used to specify the parser (e.g. myapp, accessLog) and to append additional fields (e.g. cluster, fluentd_parser_time) to the log event.
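A filter section along these lines would attach the parser name and an extra field to each event. This is only a sketch: it assumes the record_modifier filter plugin is available, the cluster value "my_cluster" is a placeholder, and fluentd_parser_time is left out because how its value is computed is not shown here.

```
# Events tagged scalyr.myapp.access get the "myapp" parser and a cluster field
<filter scalyr.myapp.access>
  @type record_modifier
  <record>
    parser myapp
    # Placeholder value; use your own cluster name
    cluster my_cluster
  </record>
</filter>

# Events from the tailed access log get the "accessLog" parser
<filter scalyr.apache.access>
  @type record_modifier
  <record>
    parser accessLog
  </record>
</filter>
```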

Match directives determine the output destinations. Copy the Scalyr write logs key from Manage API Keys and paste it into “api_write_token”. You can also specify serverAttributes by adding additional fields, such as “serverHost” and “parser”.
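A match section for the Scalyr output might look like the sketch below. The option names api_write_token, server_attributes, and message_field are written as I understand them from the plugin's README, so check them against the version you installed. The token is a placeholder, and the pattern scalyr.** is an assumption chosen because a single * in Fluentd matches only one tag part and would miss three-part tags such as scalyr.myapp.access.

```
# Send every event whose tag starts with "scalyr." to Scalyr
<match scalyr.**>
  @type scalyr
  # Placeholder; paste your own write logs key here
  api_write_token YOUR_WRITE_LOGS_API_KEY
  server_attributes {
    "serverHost": "fluentd_host"
  }
  # Field that holds the log text (see the Docker example in Step 5)
  message_field log
</match>
```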

We are adding one extra field, “message_field”, to the match directive. It specifies the field that contains the actual log message you want to send to Scalyr. You can use either message or log, with message being the default. Fluentd checks whether the field named by “message_field” exists in the event. If it does, that field is used; otherwise, message is used.
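The fallback rule can be pictured with a few lines of Python. This is an illustration of the behaviour as described above, not the plugin's actual code; the function name pick_message and the two sample records are made up for the example.

```python
def pick_message(record, message_field="message"):
    """Return the log text using the fallback rule described above."""
    # Prefer the configured field when the record actually contains it.
    if message_field in record:
        return record[message_field]
    # Otherwise fall back to the default "message" field (None if absent).
    return record.get("message")


# A record shaped like the Docker example: the text sits in "log".
docker_record = {"log": '{"msg": "hello scalyr from tcp"}'}
# A record shaped like the HTTP example: the text sits in "message".
http_record = {"message": '{"msg": "hello scalyr from http"}'}

# With message_field set to "log", both records yield their log text.
print(pick_message(docker_record, message_field="log"))
print(pick_message(http_record, message_field="log"))
```

This is why a single "message_field log" setting can serve both kinds of events: records that carry a log field use it, and the rest fall back to message.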

Step 4: Start Fluentd

Use /etc/init.d/td-agent to start, stop, or restart the Fluentd agent.

$ sudo /etc/init.d/td-agent start
Starting td-agent (via systemctl):                         [OK]

Check the td-agent log (e.g. /var/log/td-agent/td-agent.log) if you encounter any issues launching td-agent.

Step 5: Send Logs

Earlier, I mentioned that we would use three methods to send logs to Scalyr:

  • Sending logs with HTTP

In this example, we’re sending an HTTP POST to port 8888 whose record carries the log message {"msg": "hello scalyr from http"} in its “message” field. Append the tag “scalyr.myapp.access” to the URL path so that the event is routed through the scalyr.myapp.access filter:

curl -X POST \
     -d 'json={"message":"{\"msg\":\"hello scalyr from http\"}"}' \
     http://127.0.0.1:8888/scalyr.myapp.access
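The nested quoting in that request body is easy to get wrong by hand, so here is a hedged Python sketch that builds the same form body with the standard library. The helper names build_body and send are invented for this example, and send is defined but not called, since it needs a running Fluentd HTTP input at the address used above.

```python
import json
from urllib.parse import parse_qs, urlencode
from urllib.request import Request, urlopen


def build_body(inner):
    """Build the form body json=<record> for Fluentd's HTTP input."""
    # Serialize the inner JSON first so it travels as a plain string in
    # the "message" field; the outer json.dumps escapes its quotes.
    record = {"message": json.dumps(inner)}
    return urlencode({"json": json.dumps(record)})


def send(body, url="http://127.0.0.1:8888/scalyr.myapp.access"):
    """POST the form body to Fluentd and return the HTTP status code."""
    req = Request(url, data=body.encode("utf-8"), method="POST")
    with urlopen(req) as resp:
        return resp.status


body = build_body({"msg": "hello scalyr from http"})
# Decode the body again to confirm the record is valid JSON.
round_trip = json.loads(parse_qs(body)["json"][0])
print(round_trip["message"])
```

Once Fluentd is running, calling send(body) should have the same effect as the curl command.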
  • Sending logs with TCP

Here, we’re using Docker’s Fluentd log driver to send a message from stdout to Scalyr. The actual log message {"msg": "hello scalyr from tcp"} is placed in the “log” field of the log event, so it is important to include “message_field log” in your match directive; otherwise, the “myapp” parser will not be applied to the log message.

sudo docker run \
            --log-driver=fluentd \
            --log-opt tag=scalyr.myapp.access \
            --log-opt fluentd-address=127.0.0.1:24224 \
            ubuntu echo '{"msg": "hello scalyr from tcp"}'
  • Sending logs from an access log file

In this example, I am sending my Apache httpd access log to Scalyr. I’ve specified the access log file path (/var/log/httpd/access_log) in the source directive, so Fluentd tails the access log and forwards new lines to Scalyr. To write a new entry to your access log, open a URL from your web server, refresh an already open page, or use curl to make a request against your server.

Step 6: View Logs at Scalyr.com

Go to scalyr.com and search for $serverHost == "fluentd_host" to find the logs you ingested in the above examples.

The parser “myapp” has one simple format, “${parse=json}$”, to parse JSON logs, and it is applied to the logs ingested using HTTP and TCP. Click “INSPECT FIELDS” on the log event to verify the parsing; for the HTTP example, you should find a msg field with the value “hello scalyr from http”. In addition, you should see the fluentd_parser_time and cluster attributes if you used the sample Fluentd configuration file from step 3. Those fields were added because the filter directive uses “record_modifier”.

That’s it! Hopefully, this process is fairly straightforward, and we’d love to hear about your experience using it. Leave us your thoughts in the comments.