How Do Digital Analytics Trackers Collect Data?

In this part, we explain the main components of Digital Analytics solutions such as Matomo Analytics. You don't have to reproduce everything explained here; the idea is for you to understand the big picture. Don't panic, everything is going to be alright. If you already have some experience in Digital Analytics, this part will be very valuable to you.

Understanding the Database Concept

Digital Analytics is about analyzing data. Those data have to be somewhere and this place is called a database. There are different ways you can visualize databases:

In general it takes the form of this icon within documentation.

The visual and sexy part of it looks like this:

A database example

The CLI (Command Line Interface) part looks like this:

A MySQL server with all its databases

There are different types of database management software, one of the most popular being MySQL. The illustrations in this lesson all show MySQL databases.

The illustration below shows a piece of software called phpMyAdmin, which is dedicated to managing MySQL databases. On the left side of the screen, you can see the list of databases and their tables (here, for a WordPress installation). One of the tables (wp_posts) is selected, and its properties are displayed in the main part of the screen. You can thus browse the tables and get to their content easily. The content of a database is stored in tables, which is what you can see below.

Here are the data

Everything is in here. Every time you get a report in Matomo Analytics, for instance, Matomo "asks" its database to show results based on the conditions you have set (for example, a particular timeframe or a given geographic area). In the illustration below, you can see a specific report for which Matomo "asked" its database to show the requested results.

A report example

In order to "ask" its database, Matomo uses a standardized database access language called SQL (Structured Query Language). Other languages exist, but SQL is the most common. In technical terms, each time Matomo "asks" its database for information, we say that it makes a request, more commonly called a query.

Here is an SQL request example:

SELECT * FROM `wp_posts` WHERE 1
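To make the idea of a request with conditions more concrete, the sketch below mimics in JavaScript what a database does when it receives such a request. The visits rows, their column names and the selectWhere helper are made up for illustration; a real database engine is far more sophisticated.

```javascript
// Hypothetical rows, standing in for a table in the database.
const visits = [
  { id: 1, country: "FR", day: "2020-07-01" },
  { id: 2, country: "NZ", day: "2020-07-02" },
  { id: 3, country: "FR", day: "2020-07-15" },
];

// Rough equivalent of: SELECT * FROM visits WHERE <condition>
function selectWhere(rows, condition) {
  return rows.filter(condition);
}

// "Show me the visits from France during the first week of July 2020."
// Dates written as YYYY-MM-DD can be compared as plain strings.
const report = selectWhere(
  visits,
  (row) => row.country === "FR" && row.day >= "2020-07-01" && row.day <= "2020-07-07"
);

console.log(report);
```

Only the first row matches both conditions, which is the kind of filtering a WHERE clause performs on the database side.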

Creating a database is really easy: you just need to install a MySQL server and you are good to go. You had better install phpMyAdmin too (see above), as it gives you a graphical user interface (GUI), which is more comfortable for marketers.

Alright. Now you know the bare minimum about databases. Let's now see how to send some data into them.

Collecting Data for Your Database

Here, we are interested in building a tracker, so we need to know what we would like it to collect. We could collect a lot, but let's stick to a few data points to start with:

  • Time of the connection
  • Page URL
  • User agent + any additional data

Our tracker will fill a database table with 4 columns: one column for each type of data mentioned just above, and one additional (but mandatory) column to store an ID number. The latter serves as what is called the table's primary key, which uniquely identifies each row and is needed to connect tables with each other. But I digress...
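As a rough illustration of this four-column layout, here is a small JavaScript sketch that keeps the rows in memory instead of in MySQL. The column names connectiontime, pageurl and useragent follow the PHP script shown later in this lesson; the id column name, the insertHit helper and the sample values are assumptions.

```javascript
// In-memory stand-in for the "analytics" table: each row has four columns.
const analytics = [];
let nextId = 1; // plays the role of an auto-incremented primary key

function insertHit(connectiontime, pageurl, useragent) {
  const row = { id: nextId, connectiontime, pageurl, useragent };
  nextId += 1;
  analytics.push(row);
  return row;
}

insertHit("2020-07-29 16:30:00", "http://localhost/index.php", "ExampleBrowser/1.0");
insertHit("2020-07-29 16:31:12", "http://localhost/contact.php", "ExampleBrowser/1.0");

// The primary key lets us point at exactly one row, even if
// every other column holds the same values as another row.
const second = analytics.find((row) => row.id === 2);
console.log(second.pageurl);
```

In a real MySQL table, the database itself would assign the ID, typically through an auto-incremented column.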

The Data Collection

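In order to write inside the database, we need to send instructions to our database server. Several languages can be used to trigger that, either directly on the server (PHP) or from the browser through a call to a server-side script (HTML, JavaScript):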

  • PHP
  • HTML
  • JavaScript

Let's see each one of them in detail.

The PHP Tracking Code

You would rarely use a PHP tracker, as it requires a high level of expertise and is slower than other means. Moreover, PHP is a server-side programming language, so a single mistake in your code can take your website down. Most of the time, you use it for very specific analytics features, such as e-commerce tracking, where the information you are looking for is not on the page and has to be fetched from a database.

Here is an example of a script written in PHP. Disclaimer: the code below is not secure, so never use it for a real project, as it is vulnerable to SQL injection.

<?php
$servername = "localhost";
$username = "root";
$password = "root";
$dbname = "picsou";

// Create connection
$conn = new mysqli($servername, $username, $password, $dbname);
// Check connection
if ($conn->connect_error) {
    die("Connection failed: " . $conn->connect_error);
}

// Get the time (24-hour clock, so that 3 PM is not stored as 03)
date_default_timezone_set('Europe/Paris');
$date = date('Y-m-d H:i:s');
// Get the URL
$url = 'http://'.$_SERVER['HTTP_HOST'].$_SERVER['PHP_SELF'];
// Get the user agent
$user_agent = $_SERVER['HTTP_USER_AGENT'];

// Values are listed in the same order as the columns.
// Insecure on purpose: the values are not escaped (see the disclaimer above).
$sql = "INSERT INTO analytics (connectiontime, useragent, pageurl)
VALUES ('$date', '$user_agent', '$url')";

if ($conn->query($sql) === TRUE) {
    echo "New record created successfully";
} else {
    echo "Error: " . $sql . "<br>" . $conn->error;
}

$conn->close();
?>

To integrate this code into your page, save it as tracker.php and add the following line to the page:

<?php include 'tracker.php';?>

One important thing to note is that your webpage must have a .php extension (pretty useful to know when you are working on a local server). Servers in production/live environments often use rewrite rules that hide those extensions, which is why most of the time you don't see them.
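The disclaimer above mentions SQL injection. The following JavaScript sketch illustrates the underlying problem and the usual remedy, which is to keep the SQL text fixed and pass the values separately. It does not talk to a real database: buildInsert is a hypothetical helper, and the malicious user agent is a made-up example.

```javascript
// A user agent is sent by the visitor's browser, so it cannot be trusted.
const userAgent = "x'); DROP TABLE analytics; --";

// Unsafe: the value is pasted into the SQL text, so it can change the statement.
const unsafeSql =
  "INSERT INTO analytics (useragent) VALUES ('" + userAgent + "')";

// Safer shape: the SQL text is fixed, and the values travel separately.
// (buildInsert is a hypothetical helper; a database driver would bind the values.)
function buildInsert(connectiontime, useragent, pageurl) {
  return {
    text: "INSERT INTO analytics (connectiontime, useragent, pageurl) VALUES (?, ?, ?)",
    values: [connectiontime, useragent, pageurl],
  };
}

const safe = buildInsert("2020-07-29 16:30:00", userAgent, "http://localhost/index.php");

console.log(unsafeSql.includes("DROP TABLE")); // true: the statement was altered
console.log(safe.text.includes("DROP TABLE")); // false: the SQL text is unchanged
```

In PHP, mysqli offers the same separation through prepared statements (prepare and bind_param).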

The HTML Tracking Code

The HTML tracking code, more commonly known as an image tag, image tracker or even "pixel", is useful to set up tracking on pages where you cannot use PHP or JavaScript. That's why it is often used in guest blog posts or in emails, for instance.

It consists of a simple HTML image code, like this:

<img src="http://localhost/yourscript.php" style="border:0" alt="" />

As you can see here, the whole magic relies on the image source attribute (src=). It does not point to an actual image URL but to a webpage, where the tracking code (here written in PHP) is executed. It is this code that determines which information is stored in our database.

Here is an example of a webpage in HTML that displays the text "Hello World" while collecting data thanks to the PHP code that we saw previously:

<html>

<h1>Hello World</h1>

<img src="http://localhost/tracker.php" style="border:0" alt="" />

</html>

As a result, the data are now within my database.
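An image tag can also carry additional data in its URL as query parameters. The JavaScript sketch below generates such a tag, for example when preparing an email template. The buildPixelTag helper and the parameter names (campaign, page) are assumptions for illustration, and the PHP script would have to be extended to read them (for instance from $_GET) before they could be stored.

```javascript
// Builds an <img> tracking tag whose URL carries extra data as query parameters.
// The tracker URL and the parameter names are assumptions for illustration.
function buildPixelTag(trackerUrl, params) {
  const url = new URL(trackerUrl);
  for (const [name, value] of Object.entries(params)) {
    url.searchParams.set(name, value); // values are URL-encoded automatically
  }
  // Escape & so that the URL is valid inside an HTML attribute.
  const src = url.toString().replace(/&/g, "&amp;");
  return '<img src="' + src + '" style="border:0" alt="" />';
}

const tag = buildPixelTag("http://localhost/tracker.php", {
  campaign: "newsletter july",
  page: "/hello-world",
});
console.log(tag);
```

The resulting tag can be pasted into any HTML document, exactly like the hand-written one above.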

The JavaScript Tracking Code

JavaScript tracking can work in several ways. The drawback of plain JavaScript is that it can require many lines of code and sometimes gets complex, so we will use jQuery here. jQuery is a popular JavaScript library. Like many libraries, it provides ready-made functions that spare you many lines of code and make things easier than plain JavaScript.

Here is an example of a JavaScript tracker:

<script>
// Requires jQuery to be loaded on the page before this script runs.
// Size of the browser viewport, in pixels
var w = window.innerWidth;
var h = window.innerHeight;
// Send the value to the server-side tracker as a POST request
$.post(
  "tracker.php",
  {resolution: w + 'x' + h}
);
</script>

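This tracking code simply picks up the width and height of the browser window and sends them as a POST request to the tracker.php file. Note that the PHP script shown earlier would have to be extended to read this value (from $_POST['resolution']) and store it.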

The global window object gives you access to much of the data you need in order to build your tracker.
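For readers who would rather not depend on a library, here is a minimal sketch of the same idea using only what the browser provides. The collectPageData and sendHit names are hypothetical, and the field names pageurl and useragent are assumptions chosen to match the database columns used earlier; the server-side script would need to read them.

```javascript
// Gathers a few values from a window-like object.
// Passing the object in makes the function usable outside a browser too.
function collectPageData(win) {
  return {
    resolution: win.innerWidth + "x" + win.innerHeight,
    pageurl: win.location.href,
    useragent: win.navigator.userAgent,
  };
}

// Sends the data as a POST request, using the browser's built-in fetch.
// "tracker.php" is the endpoint used earlier in this lesson.
function sendHit(win) {
  const body = new URLSearchParams(collectPageData(win));
  return win.fetch("tracker.php", { method: "POST", body });
}

// In a browser you would call: sendHit(window);
// Here a fake window shows what would be collected.
const fakeWindow = {
  innerWidth: 1280,
  innerHeight: 720,
  location: { href: "http://localhost/index.php" },
  navigator: { userAgent: "ExampleBrowser/1.0" },
};
console.log(collectPageData(fakeWindow));
```

Because the body is sent as form-encoded data, a PHP script can read each field from $_POST in the same way as with the jQuery version.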

Last modified: Wednesday, 29 July 2020, 4:30 PM