Thursday, July 31, 2025

Creating and Testing Ingest Pipelines with Grok for NGINX Logs in Elasticsearch

The Elastic Stack is widely used for log management, monitoring, analytics, and observability. It includes several tools such as Beats, which collect logs from various systems. However, not all data in these logs is necessary for indexing. To filter, transform, and enrich documents before they are indexed, we can use ingest pipelines. In this blog, we will discuss how to create a simple ingest pipeline and test it.

Prerequisites:

ELK Stack with Beats installed and configured.

  • Open Kibana and go to Stack Management --> Ingest Pipelines.
  • Click Create pipeline --> New pipeline.
  • Provide a name and description for the pipeline. We can then add processors that modify each document, such as parsing fields, adding or removing fields, and converting data types.

  • Let's add a Grok processor with the values below.
            1. Enter message as the Field value and apply the Grok pattern below. The leading %{DATA} matches the beginning of the log line without extracting anything from it; only the fields at the end of the line are captured.

%{DATA} "(-|%{IPORHOST:backend.ip}:%{NUMBER:backend.port})" "(-|%{NUMBER:http.response.time:float})"

            2. Enable the Ignore missing toggle so the processor doesn't fail when the message field is absent.
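
For reference, the same pipeline can also be created with the ingest pipeline API from Kibana Dev Tools or any HTTP client. Below is a minimal sketch; the pipeline name nginx-backend-grok is a placeholder for whatever name you gave the pipeline in the UI.

# "nginx-backend-grok" is a placeholder; substitute your pipeline's name
PUT _ingest/pipeline/nginx-backend-grok
{
  "description": "Extract backend address and response time from NGINX access logs",
  "processors": [
    {
      "grok": {
        "field": "message",
        "patterns": [
          "%{DATA} \"(-|%{IPORHOST:backend.ip}:%{NUMBER:backend.port})\" \"(-|%{NUMBER:http.response.time:float})\""
        ],
        "ignore_missing": true
      }
    }
  ]
}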

  • Now, let's test the pipeline using the sample NGINX access log line below.

"10.70.90.122:443 11.90.1.132 - - [26/May/2024:13:08:37 +0000] \"GET https://myapp.abc.com/student/registry/login HTTP/1.1\" 200 38 \"-\" \"Elastic-Heartbeat/8.15.2 (linux; amd64; 202341567932255345; 2024-01-19 09:21:13 +0000 UTC)\" \"-\" \"11.190.4.1:443\" \"0.039\""

  • Before using it as a test document, wrap the line in a _source object and escape the inner double quotes, as shown below.

[
  {
    "_source": {
      "message": "10.70.90.122:443 11.90.1.132 - - [26/May/2024:13:08:37 +0000] \"GET https://myapp.abc.com/student/registry/login HTTP/1.1\" 200 38 \"-\" \"Elastic-Heartbeat/8.15.2 (linux; amd64; 202341567932255345; 2024-01-19 09:21:13 +0000 UTC)\" \"-\" \"11.190.4.1:443\" \"0.039\""
    }
  }
]

Click Run the pipeline. The output shows the modified document, which should now include the new fields extracted by the Grok pattern: http.response.time, backend.port, and backend.ip.
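
The same test can be run outside the UI with the simulate API. A sketch, again assuming the placeholder pipeline name nginx-backend-grok:

# "nginx-backend-grok" is a placeholder; substitute your pipeline's name
POST _ingest/pipeline/nginx-backend-grok/_simulate
{
  "docs": [
    {
      "_source": {
        "message": "10.70.90.122:443 11.90.1.132 - - [26/May/2024:13:08:37 +0000] \"GET https://myapp.abc.com/student/registry/login HTTP/1.1\" 200 38 \"-\" \"Elastic-Heartbeat/8.15.2 (linux; amd64; 202341567932255345; 2024-01-19 09:21:13 +0000 UTC)\" \"-\" \"11.190.4.1:443\" \"0.039\""
      }
    }
  ]
}

In the response, docs[0].doc._source should contain backend.ip (11.190.4.1), backend.port (443), and http.response.time (0.039) alongside the original message field.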

Similarly, we can add more processors to the ingest pipeline to further modify the document and test the outcome.
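
For example, since Grok captures backend.port as a string, a convert processor could turn it into an integer. A sketch that extends the same placeholder pipeline:

# "nginx-backend-grok" is a placeholder; substitute your pipeline's name
PUT _ingest/pipeline/nginx-backend-grok
{
  "description": "Extract backend fields from NGINX access logs and convert the port to an integer",
  "processors": [
    {
      "grok": {
        "field": "message",
        "patterns": [
          "%{DATA} \"(-|%{IPORHOST:backend.ip}:%{NUMBER:backend.port})\" \"(-|%{NUMBER:http.response.time:float})\""
        ],
        "ignore_missing": true
      }
    },
    {
      "convert": {
        "field": "backend.port",
        "type": "integer",
        "ignore_missing": true
      }
    }
  ]
}

Once the pipeline behaves as expected, it can be applied at ingest time, for example through the pipeline setting of Filebeat's Elasticsearch output or the ?pipeline query parameter on index requests.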

