Simple Event Tracking with Nginx

Over a year ago, I wrote about a way to use OpenResty (Nginx and Lua) to log events coming from a web application. A few days later, I changed my approach. I didn't like the reliance on OpenResty's ugly Lua logs, so I put a few ideas together for a simpler solution. I didn't get around to writing about it until now, so here it is.

Sending Events

While tracking with images and GET requests is the most compatible approach, it's a limited and ugly hack. If your application already uses JavaScript, it's easier to use navigator.sendBeacon, which is designed for tracking.

// Make this as elaborate as you want.
const track = (event) => navigator.sendBeacon("/event", JSON.stringify(event));
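
For example, assuming your events carry level and type fields (the same shape used in the log samples later on):

// Hypothetical event shape; match whatever your application sends.
track({ level: "info", type: "page-view" });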

Logging Events

sendBeacon uses POST requests. Since Nginx doesn't read $request_body by default, it's not straightforward to log it. You first need something to consume the body, which triggers Nginx to process it. This is easily done using proxy_pass to a local sink endpoint. Then you can log it as JSON.

http {
  log_format event escape=json
    '{'
      # Add as many variables as you want.
      '"request_body":"$request_body"'
    '}';

  server {
    # This is your local sink server.
    server_name 127.0.0.1;
    listen      80;
    access_log  off;

    location = /_request-body-sink-204 {
      return 204;
    }
  }

  server {
    # This is your main server, so add your usual listen,
    # server_name and other directives here.
    location = /event {
      access_log /var/log/nginx/event.log event;
      proxy_pass http://127.0.0.1/_request-body-sink-204;
    }
  }
}
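
To check the wiring end-to-end, you can send a test beacon from the browser console and watch the access log:

// Run in the browser console on a page served by the main server.
navigator.sendBeacon("/event", JSON.stringify({ level: "info", type: "test" }));
// /var/log/nginx/event.log should then gain a line like:
// { "request_body": "{\"level\":\"info\",\"type\":\"test\"}" }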

Setting Rate Limits

To avoid flooding your server with events, you can set a rate limit.

# In the http context:
limit_req_zone $remote_addr zone=events:2m rate=1r/s;
# ...
location = /event {
  limit_req zone=events burst=10 nodelay;
  # ...
}
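
The server-side limit is the safety net, but you can also throttle on the client so well-behaved pages rarely hit it. Here's a minimal sketch that reuses the track function from earlier and mirrors the rate=1r/s and burst=10 settings above:

// Client-side token bucket: refill one token per second, hold at most 10.
let tokens = 10;
setInterval(() => { tokens = Math.min(tokens + 1, 10); }, 1000);

const throttledTrack = (event) => {
  if (tokens <= 0) return false; // Over budget; drop the event.
  tokens -= 1;
  return track(event);
};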

Processing Logs

With escape=json, the request_body is logged as an escaped JSON string. This makes it safe to log, even when the request body itself is invalid or malicious.

{ "request_body": "{\"level\":\"info\"}" }
{ "request_body": "{\"level\":\"error\"}" }
{ "request_body": "bad\ninput" }

When you're processing these logs, make sure to parse the nested JSON string correctly. For example, in jq you can use fromjson, or fromjson? to skip lines that aren't valid JSON, like the bad input above.

cat event.log | jq '.request_body | fromjson? | .type'

Or with Elasticsearch, you can use the JSON processor in an ingest pipeline.

{
  "processors": [
    {
      "json": {
        "field": "request_body",
        "target_field": "event"
      }
    }
  ]
}
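
If you're not running Elasticsearch, a few lines of Node can do the same two-step parse. A minimal sketch, assuming the log path from earlier and the example event shape:

// parse-events.js
const fs = require("fs");
const readline = require("readline");

const rl = readline.createInterface({
  input: fs.createReadStream("/var/log/nginx/event.log"),
});

rl.on("line", (line) => {
  const entry = JSON.parse(line); // The access log line is always valid JSON.
  let event;
  try {
    event = JSON.parse(entry.request_body); // The nested event payload.
  } catch {
    return; // Skip invalid bodies, like the "bad\ninput" example above.
  }
  console.log(event.level, event.type);
});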

For more on this approach, see the follow-up post: On-Demand Dashboards with Elasticsearch and Kibana.

Conclusion

That's about everything. For small projects where you want to avoid third-party dependencies, this is a simple way to get some diagnostics without hoarding too much data.

Thanks for reading.