Logs Collector/Forwarder

Overview

This plugin is built on top of a modified Remote Syslog 2 engine which, as its official documentation says, tails one or more log files and sends syslog messages to a remote central syslog server. It generates packets itself, bypassing the system syslog daemon, so its configuration doesn’t affect system-wide logging. It covers a broad range of use cases, such as:

  • Collecting logs from servers & daemons which don’t natively support syslog
  • Handling cases where reconfiguring the system logger is less convenient than running a purpose-built daemon (e.g., automated app deployments)
  • Aggregating files not generated by daemons (e.g., package manager logs)

It is a special plugin that not only collects and submits metrics to SolarWinds Snap Agent but also forwards logs to a remote log analysis system. The officially supported systems for sending logs to are:

  • Loggly
  • Papertrail

Both are also part of the SolarWinds DevOps Cloud Product Family Suite.

Setup

The logs plugin is included with the SolarWinds Snap Agent by default. Follow the directions below to enable it for a Snap Agent instance. It can also be enabled at installation time by passing the optional --loggly-token parameter, set to a valid Loggly Customer Token, to the official installer script.

Prerequisites

This plugin requires an active Loggly or Papertrail account.

Configuration

The Snap Agent provides an example configuration file to help you get started quickly. It defines the plugin and task file to be loaded by the agent, but requires you to provide the correct settings for your Logs server. To enable the plugin:

1. Make a copy of the logs example configuration file /opt/SolarWinds/Snap/etc/plugins.d/logs.yaml.example, renaming it to /opt/SolarWinds/Snap/etc/plugins.d/logs.yaml:

$ sudo cp /opt/SolarWinds/Snap/etc/plugins.d/logs.yaml.example /opt/SolarWinds/Snap/etc/plugins.d/logs.yaml
2. Update the /opt/SolarWinds/Snap/etc/plugins.d/logs.yaml configuration file with settings specific to your Logs server, for example:
collector:
  logs:
    all:
      loggly_token: "LOGGLY_TOKEN"
      api_host: "logs-01.loggly.com"
      api_port: 6514

      ## Papertrail host and port details: change this to YOUR papertrail host.
      # papertrail_token: "PAPERTRAIL_TOKEN"
      # api_host: HOST.papertrailapp.com
      # api_port: 12345

      files: |
        /var/log/SolarWinds/Snap/swisnapd.log
        /var/log/syslog

      # exclude_files: |
      #   \.\d$
      #   .bz2
      #   .gz

      exclude_patterns: |
        .*self-skip-logs-collector.*

      hostname: "myhost"

      windows_events: |
        enable: Yes
        filters:
        - channel: System
          Level: Error
        - channel: Application
          Level:
          - Error
          - Warning
          EventId:
          - 50
          - range:
              min: 60
              max: 63
          Source:
          - AppOptics
          - Snapteld
          Message:
            matches: "event[0-9]{2,3}"

load:
  plugin: snap-plugin-collector-aologs
  task: task-aologs.yaml

You can only configure one of the log analysis systems (either Loggly or Papertrail).
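
One way to check this rule before restarting the agent is to count the uncommented token settings in the configuration file. The sketch below is only an illustration: it runs against a small stand-in file created on the spot, and the variable names (conf, active) are hypothetical. Point conf at your own logs.yaml to check a real configuration.

```shell
#!/bin/sh
# Sketch: verify that exactly one log destination token is active.
# The sample content is a stand-in for illustration; replace it by setting
# conf=/opt/SolarWinds/Snap/etc/plugins.d/logs.yaml on a real system.
conf=$(mktemp)
cat > "$conf" <<'EOF'
      loggly_token: "LOGGLY_TOKEN"
      # papertrail_token: "PAPERTRAIL_TOKEN"
EOF

# Count lines that start (after optional indentation) with either token key.
# Commented-out lines begin with '#', so they do not match.
active=$(grep -cE '^[[:space:]]*(loggly_token|papertrail_token):' "$conf")

if [ "$active" -eq 1 ]; then
  echo "ok: exactly one log destination configured"
else
  echo "error: expected 1 active token, found $active"
fi
rm -f "$conf"
```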

  • For Loggly:

    • The setting loggly_token is required and should be set to a valid Loggly Customer Token.
    • The setting api_host is optional and defaults to "logs-01.loggly.com".
    • The setting api_port is optional and defaults to 6514. Use 6514 with TLS or 514 with TCP (the protocol is selected via the api_protocol setting).
  • For Papertrail:

    • The setting papertrail_token is required, but at present can be set to any non-empty string.
    • The settings api_host and api_port are required and should be changed to your Papertrail host and port values.
  • The setting api_protocol is optional and defaults to "tls", which means that this plugin uses TLS-encrypted syslog by default. This is the protocol used when connecting. The other possible values, udp and tcp, select unencrypted syslog.

  • The setting connect_timeout is optional and defaults to "30s". This is the timeout for connecting to the log-accepting API.

  • The setting write_timeout is optional and defaults to "30s". This is the timeout for writing to the log-accepting API.

  • The setting max_line_length is optional and defaults to 1024. This is the maximum line length to be written at once (in UTF-8 characters). The special value 0 means that no limit is enforced.

    NOTE: This setting applies to the length of a line in the syslog format, not to the raw log line in the file.

  • The setting new_file_check_interval is optional and defaults to "30s". This is the interval at which the plugin looks for new files matching the given pattern(s).

  • The setting files is optional and defaults to

    |
      /var/log/SolarWinds/Snap/swisnapd.log
      /var/log/syslog
    

    This is an array of files or filename patterns to watch. Wildcards, as in /home/**/*.log, are allowed.

    NOTE: Be careful when collecting swisnapd logs, as they may also contain log entries produced by the logs collector itself. To avoid an infinite feedback loop, add the ".*self-skip-logs-collector.*" pattern to exclude_patterns (described below).

    NOTE: In YAML, string values can span multiple lines using | or >. Here we use the “Literal Block Scalar” syntax (i.e., |), which preserves the newlines and any trailing spaces.

  • The setting exclude_files is optional and defaults to an empty array. You can use it to provide one or more regular expressions to prevent certain files from being matched.

  • The setting exclude_patterns is optional and defaults to

    |
      .*self-skip-logs-collector.*
    

    There may be certain log messages that you do not want to be sent, such as repetitive “noise” lines that you cannot easily filter out in the application that produces them. To filter these lines, use exclude_patterns with an array of regular expressions.

    NOTE: In YAML, string values can span multiple lines using | or >. Here we use the “Literal Block Scalar” syntax (i.e., |), which preserves the newlines and any trailing spaces.

  • The setting hostname is optional and defaults to the OS-provided hostname. You can override it here to alter the hostname tag used for logs reported by this agent.

    NOTE: This does not affect the AppOptics metrics tagging.

  • The setting windows_events is optional and lets you configure which Windows events are traced and logged to the external logging service (e.g., Loggly). Be aware that the windows_events field is a string containing a YAML structure (use windows_events: | instead of windows_events:).

    NOTE: It’s available only on Windows systems.

    • The enable field turns the gathering of Windows events on or off.

    • The filters field lists the channels you want to observe.

      Each channel can define independent filters describing which messages should be passed to Loggly/Papertrail.

      The following fields are currently supported:
      • Level - level of the event (i.e., Error, Warning, Information, Success Audit, Failure Audit)
      • EventId - event identifier
      • Source - application that triggered the event (e.g., VSS, Winlogon)
      • Computer - computer on which the event was triggered
      • User - user who triggered the event
      • Message - message associated with the event

      For each field you can provide either a single value or a list of possible values (refer to the “Level” fields below). Field names and values are case sensitive.

      windows_events: |
        enable: Yes
        filters:
        - channel: System
          Level: Error
        - channel: Application
          Level:
          - Error
          - Warning
      
    There are also special matchers:
    • range - specifies a range of numbers instead of listing them one by one
    • contains - checks whether a string contains a specific word
    • matches - checks whether a string matches the provided regular expression

    range should be used only with the EventId field. contains and matches are most useful with the Message field, although they can be used with any other field that takes a string argument. Example:

    windows_events: |
      enable: Yes
      filters:
      - channel: Application
        Level: Error
        EventId:
          - range:
              min: 50
              max: 53
          - 60
        Message:
        - matches: "event[0-9]{2,3}"
        - contains: message
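
The settings described above can be combined into a Papertrail variant of the configuration. The sketch below writes such a file to a scratch location so it can be reviewed before being copied to /opt/SolarWinds/Snap/etc/plugins.d/logs.yaml. The scratch path and the out variable are assumptions made for illustration, HOST.papertrailapp.com and 12345 are the placeholders from the example configuration, and all optional settings are shown with their documented defaults.

```shell
#!/bin/sh
# Sketch: write a Papertrail variant of logs.yaml to a scratch file for review.
# The output path is an arbitrary choice for this illustration.
out="${TMPDIR:-/tmp}/logs.yaml.papertrail"

cat > "$out" <<'EOF'
collector:
  logs:
    all:
      # Placeholders: replace with your own Papertrail host and port.
      papertrail_token: "PAPERTRAIL_TOKEN"
      api_host: "HOST.papertrailapp.com"
      api_port: 12345

      # Optional settings, shown here with their documented defaults.
      api_protocol: "tls"
      connect_timeout: "30s"
      write_timeout: "30s"
      max_line_length: 1024
      new_file_check_interval: "30s"

      files: |
        /var/log/SolarWinds/Snap/swisnapd.log
        /var/log/syslog

      exclude_patterns: |
        .*self-skip-logs-collector.*

load:
  plugin: snap-plugin-collector-aologs
  task: task-aologs.yaml
EOF

echo "wrote $out"
```

Because only one log analysis system can be configured at a time, this variant omits loggly_token entirely.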
    
3. Restart SolarWinds Snap Agent:
$ sudo service swisnapd restart
4. Enable the Logs plugin in the AppOptics UI

On the Integrations Page you will see the Logs plugin available if the previous steps were successful. If you do not see the plugin, see Troubleshooting.

Select the Logs plugin to open the configuration menu in the UI, and enable the plugin.

You should soon see the logs metrics reported to your dashboard.

Metrics and Tags

The table below lists each of the metrics gathered from the status endpoint.

Default Metrics

Namespace                   Description
logs.lines_total            The total number of lines that were detected.
logs.lines_forwarded        The total number of lines that passed all rules and were applied for submission.
logs.bytes_forwarded        The total amount of bytes that were applied for submission.
logs.lines_skipped          The total number of lines that were filtered out by some rules and thus were not applied for submission.
logs.lines_failed           The total number of lines for which submission has failed.
logs.bytes_failed           The total amount of bytes for which submission has failed.
logs.lines_succeeded        The total number of lines for which submission has succeeded.
logs.bytes_succeeded        The total amount of bytes for which submission has succeeded.
logs.attempts_total         The total number of logs submission attempts.
logs.failed_attempts_total  The total number of failed logs submission attempts.

The general rule for lines counters can be described using the following equations:

lines_total = lines_forwarded + lines_skipped
lines_forwarded = lines_succeeded + lines_failed

NOTE: Occasionally the above equations may not hold. This typically happens when the values are sampled after a submission attempt has started but before it has completed (i.e., succeeded or failed).
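
As a worked example of the two equations, the sketch below plugs in hypothetical counter values (they are not real plugin output) and checks whether they are consistent. The "in-flight" label is only an illustrative name for the transient case described in the note.

```shell
#!/bin/sh
# Hypothetical counter values, for illustration only.
lines_total=1000
lines_skipped=150
lines_succeeded=840
lines_failed=10

# lines_forwarded = lines_succeeded + lines_failed
lines_forwarded=$((lines_succeeded + lines_failed))

# lines_total = lines_forwarded + lines_skipped
if [ "$lines_total" -eq $((lines_forwarded + lines_skipped)) ]; then
  status="consistent"
else
  # May happen transiently while a submission attempt is still in progress.
  status="in-flight"
fi

echo "lines_forwarded=$lines_forwarded status=$status"
```

With these values, 840 + 10 gives 850 forwarded lines, and 850 + 150 matches the total of 1000.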

Default Metric Tags

All Logs metrics are tagged with hostname. Instead of using this tag, we recommend using the @host alias. If dynamic metrics are used (see the example task file provided with the plugin), then the server name or hostname and port from the logs will be added as a tag.

Troubleshooting

Too many open files

If the collector stops forwarding lines and you notice errors like this in the SolarWinds Snap Agent logs (typically located in /var/log/SolarWinds/Snap/swisnapd.log):

ERRO[2019-04-18T13:35:30-04:00] time="2019-04-18T13:35:30-04:00" level=error msg="follower error" error="too many open files" self-skip-logs-collector= submodule=remote_syslog _module=plugin-exec io=stderr plugin=logs

it means your OS limits have to be raised to handle monitoring of all the requested files. You can determine the maximum number of inotify (the underlying file watcher) instances that can be created using:

cat /proc/sys/fs/inotify/max_user_instances

and then increase this limit using:

echo VALUE | sudo tee /proc/sys/fs/inotify/max_user_instances

where VALUE is greater than the present setting.

Once you confirm that the new limit is sufficient and the logs collector/forwarder works as expected, apply the new value permanently by adding the following to /etc/sysctl.conf:

fs.inotify.max_user_instances = VALUE
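
The steps above can be sketched as a small script that reads the current limit and prints the commands needed to raise it. Doubling the limit is an arbitrary illustrative choice, not a recommendation, and the fallback of 128 is an assumed value used only when the /proc file is not readable. The script changes nothing on the system.

```shell
#!/bin/sh
# Sketch: read the current inotify instance limit and print the commands
# that would raise it. Nothing is modified by this script.
limit_file=/proc/sys/fs/inotify/max_user_instances

if [ -r "$limit_file" ]; then
  current=$(cat "$limit_file")
else
  current=128   # assumed fallback for systems without this file
fi

# Doubling is an arbitrary choice for illustration.
new=$((current * 2))

echo "current limit: $current"
echo "apply now:     echo $new | sudo tee $limit_file"
echo "persist:       add 'fs.inotify.max_user_instances = $new' to /etc/sysctl.conf"
```

After editing /etc/sysctl.conf, running sudo sysctl -p reloads the settings from that file without a reboot.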