An Introduction to Logstash Grok

At Canadian Web Hosting we are continually looking for new ways to look at our data, and one of the tools we use is Logstash Grok. To effectively analyze and query data sent to the ELK Stack, the information must be readable. When unstructured data enters the system, it must first be translated into structured message lines. Typically this task is handled by Logstash or one of the other log shippers available, though Logstash remains one of the most feature-rich and widely used. Whichever log shipper you choose, the logs must be parsed and enriched so they can be analyzed correctly before being delivered to Elasticsearch. Within Logstash, data manipulation is performed by filter plug-ins, and perhaps the most useful and popular of these is Grok, which parses unstructured data and transforms it into structured data.
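To put that in context, here is a minimal Logstash pipeline sketch showing where filter plug-ins sit between the input and the output. The file path and Elasticsearch host below are illustrative assumptions, not a real configuration:

input {
  file {
    path => "/var/log/myapp/app.log"   # hypothetical log file path
  }
}

filter {
  # filter plug-ins such as grok run here, between input and output
}

output {
  elasticsearch {
    hosts => ["localhost:9200"]        # assumed local Elasticsearch instance
  }
}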

How Does Logstash Grok Operate?

Put simply, Grok matches a line against a regular expression and maps selected segments of that line into named fields. Logstash ships with many built-in patterns, making it easy to match common elements such as dates, words, and numbers. If you cannot find the pattern you need, you can write your own. The basic syntax of a Logstash Grok filter is as follows:

%{Pattern:FieldName} 
This matches the predefined pattern and maps the matched text to the named field. Because Grok is built on top of regular expressions, you can also write your own regex-based patterns.

For example: 
(?&lt;field_name&gt;\d\d-\d\d-\d\d)
This will match a string such as 22-22-22 (or any other pairs of digits separated by hyphens) and store the matched text in the field_name field.
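To illustrate how such a custom capture can be used alongside a built-in pattern, here is a minimal sketch; the field names short_date and log-level are placeholders, not taken from a real configuration:

filter {
  grok {
    # %{LOGLEVEL:log-level} uses a built-in pattern; the named capture
    # group (?<short_date>...) creates a custom field on the fly.
    match => { "message" => "%{LOGLEVEL:log-level} (?<short_date>\d\d-\d\d-\d\d)" }
  }
}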

Getting Started with Logstash Grok

When you use a Grok filter, the main goal is to break the log line down into specific segments: for example, a timestamp, a log level, a class, and then the rest of the message. The Grok pattern below accomplishes this task.

grok {
  match => { "message" => "%{TIMESTAMP_ISO8601:timestamp} %{LOGLEVEL:log-level} \[%{DATA:class}\]:%{GREEDYDATA:message}" }
}
This attempts to match the incoming log against the provided pattern. If a match is found, the log is broken down into the fields designated in the filter. If there is no match, Logstash adds a tag called _grokparsefailure.
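One common (though entirely optional) way to handle unmatched events is to check for that tag in a conditional. A sketch of that policy, dropping anything the pattern could not parse, might look like this:

filter {
  grok {
    match => { "message" => "%{TIMESTAMP_ISO8601:timestamp} %{LOGLEVEL:log-level} \[%{DATA:class}\]:%{GREEDYDATA:message}" }
  }
  # Drop events the pattern could not parse (one possible policy, not a requirement)
  if "_grokparsefailure" in [tags] {
    drop { }
  }
}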

For all intents and purposes our filter will indeed match, resulting in the following output:

{
  "message" => "Starting transaction for session -464410bf-37bf-475a-afc0498e0199f008",
  "timestamp" => "2016-07-11T23:56:42.000+00:00",
  "log-level" => "INFO",
  "class" => "MySecretApp.com.Transaction.Manager"
}
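For reference, a raw log line consistent with that output would look roughly like the following; it is reconstructed from the fields above rather than taken from a real application:

2016-07-11T23:56:42.000+00:00 INFO [MySecretApp.com.Transaction.Manager]:Starting transaction for session -464410bf-37bf-475a-afc0498e0199f008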

The Grok Debugger

There is a Grok debugger tool available at http://grokdebug.herokuapp.com/. This handy tool lets you paste in a log message and build up your Grok pattern while testing it against that message as you go.

The Bottom Line?

Logstash Grok is one of many filters that you can apply to your logs before directing them to Elasticsearch. It plays a crucial role in the logging pipeline, and it is perhaps the most popular and commonly used filter plug-in. At Canadian Web Hosting we are embracing Elasticsearch and Logstash, and we are always looking for new methods to help improve our customers' experience.
