

GoAccess Web Log Analyzer


GoAccess

Yesterday I described my tech stack for this website. Today I got curious about how to get an overview of the access logs. After some research I came across GoAccess (GitHub), which looked pretty amazing. This is how they describe themselves:

GoAccess is a real-time web log analyzer and interactive viewer that runs in a terminal in *nix systems or through your browser.

Install

The good thing: I didn’t even have to install anything. This is how it works:

  1. Make sure the nginx config creates an access log file. Have a look here.
  2. On your local machine connect to your VM via ssh and pipe the log to the Docker instance of GoAccess:
   ssh -n sebastian@server 'zcat /var/log/nginx/sebastian-reck.de.access.log.*.gz | cat - /var/log/nginx/sebastian-reck.de.access.log /var/log/nginx/sebastian-reck.de.access.log.1' | docker run -p 7890:7890 --rm -i -e LANG=$LANG allinurl/goaccess -a -o html --log-format COMBINED --real-time-html - > report.html

Here is what each part does:

  • ssh -n prevents reading from STDIN. Run man ssh or check here.
  • The part in single quotes is an argument of ssh and contains the command being executed remotely:
    • It handles the default log rotation, where logs get compressed every 24 hours.
    • zcat is provided by gzip. It decompresses a gzip-compressed file and prints the result to STDOUT like cat does, hence the name.
    • The result gets piped to cat, which reads from STDIN (hence the -) and then appends the two uncompressed logs: today's (access.log) and yesterday's (access.log.1).
    • You might wonder if the order is messed up. Yes, it is, but GoAccess reads and processes every row individually, so everything is shown correctly in the end.
  • Whatever comes after the pipe | receives the output (STDOUT) of ssh as its STDIN.
  • docker run spins up a container:
    • -p 7890:7890 publishes TCP port 7890 of the container on the same port of your local machine. We’ll cover why further below.
    • --rm removes the container after it exits. Without it the stopped container would stick around, which is not necessary here. Fire and forget is what we want.
    • -i stands for “interactive” and keeps STDIN open, which is needed as we pipe the log into the container.
    • -e LANG=$LANG sets the environment variable LANG inside the container to whatever you have configured as your locale.
    • allinurl/goaccess names the Docker Hub user and image that will be pulled. With no tag given, Docker pulls the latest tag.
  • Everything after the image name is, strictly speaking, still part of the docker command, but gets passed on to GoAccess as its options:
    • -a enables a list of user agents by host. See here for further instructions.
    • -o defines the output format. Here it is html as we want the shiny web app.
    • The self-explanatory --log-format option is set to COMBINED, which should work fine with the default, unaltered nginx logs.
    • --real-time-html is necessary as we want to watch changes while they appear.
    • The trailing dash - tells GoAccess to read the log from STDIN (we are piping it in from ssh).
  • The greater-than token > redirects STDOUT of docker to the specified file, here report.html.
  3. Open report.html in your browser. On Linux, run xdg-open report.html, which opens a file with the default application for its type, Firefox for example.
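
If you want to try the zcat | cat - part in isolation, here is a minimal local sketch that needs neither a server nor Docker. The file names and log lines are made up for illustration, and it assumes a Linux system where zcat comes from gzip:

```shell
#!/bin/sh
# Hypothetical stand-ins for rotated nginx logs, created in a temp directory.
tmp=$(mktemp -d)
printf 'old line\n' > "$tmp/access.log.2"
gzip "$tmp/access.log.2"                      # becomes access.log.2.gz
printf 'yesterday line\n' > "$tmp/access.log.1"
printf 'today line\n' > "$tmp/access.log"

# Same shape as the remote command: decompress the archives, then let cat
# read that stream from STDIN (-) followed by the two plain files.
combined=$(zcat "$tmp"/access.log.*.gz | cat - "$tmp/access.log" "$tmp/access.log.1")
printf '%s\n' "$combined"
# prints:
#   old line
#   today line
#   yesterday line

rm -rf "$tmp"
```

The output shows the same "messed up" order as described above: archived lines first, then today's log, then yesterday's.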

Do you remember that the docker command published TCP port 7890? Inspect the Network tab of your browser's developer tools and you will discover that the HTML file opens a WebSocket to that port. Pretty clever, if you ask me. You can also verify that report.html itself does not change when new log messages arrive: check with stat report.html, and you will see that inode and size stay the same.
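
The stat check can be wrapped in two small helper functions. This is only a sketch: the function names are my own, the stat -c syntax assumes GNU coreutils (Linux), and the default wait of 2 seconds is an arbitrary choice.

```shell
#!/bin/sh
# Print inode number and size in bytes of a file (GNU coreutils stat syntax).
snapshot() {
    stat -c '%i %s' "$1"
}

# Succeed if inode and size of file $1 are the same before and after
# waiting $2 seconds (defaults to 2).
unchanged_after() {
    before=$(snapshot "$1")
    sleep "${2:-2}"
    after=$(snapshot "$1")
    [ "$before" = "$after" ]
}
```

With the report open in the browser, something like unchanged_after report.html 10 && echo "report.html was not rewritten" should confirm that updates arrive over the WebSocket and not through the file.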

Enjoy

This is what you get. Impressive, huh?

My GoAccess report

Don’t forget to delete the logs after some time to comply with privacy regulations. But how that works might be a story for another time. 😉

Best
Sebastian
