Regular expressions (REGEX)
In some cases, Matomo Analytics or other solutions may ask you to enter a regular expression in order to match certain values. Regular expressions, also called regex, are patterns built from special characters that describe a set of values. They are very useful when you want to save time and automate some processes. Let's see how to use them for analytics.
- ^ : this special character matches only the values that start with the character(s) you write after it. For example, if you filter the country report with ^F, you will get Fiji, France... but you won't get Ireland, United States... because they do not start with the letter F.
- $ : this character matches any value ending with the character(s) written just before the dollar sign. So if you write a$, you will get Albania, Canada... but you won't get France.
- . : the dot matches any single character. Regex characters can all be combined, so if you type ^......$ you will get France, as it is composed of 6 characters, but you won't get Congo, as it is composed of 5 characters.
- * : this character is often called a wildcard. More precisely, it is a quantifier: it means that the preceding character can be present 0 times or any number of times. So if you write ^F.*e$, it will match all values that start with an uppercase F and end with an e, whatever the number of characters in between.
- + : this character is a cousin of *. The difference is that the preceding character has to be present at least 1 time, with no upper limit. For example, go+gle matches gogle and google, but not ggle.
- \ : this character is a bit special: it cancels the special meaning of the regex character that follows it. For example, you may want to match a value such as 192.168.12.1, but as we saw previously, . is a regex character, so each dot would match any character. To cancel this, put a \ before each dot and write 192\.168\.12\.1. In regex, we say that we are escaping the character.
- [] : square brackets define a set or range of characters and match one single character from it. So when you write 19[2-4]\.168\.12\.1, it matches the following values: 192.168.12.1, 193.168.12.1 and 194.168.12.1.
- ? : it means that the preceding character is optional: it can be present once or not at all. For example, colou?r matches both color and colour.
- {} : it indicates the number of times the preceding character has to be present. For example, in ^.{5}$ the preceding character is the dot, which has to be present 5 times, so the expression matches any value composed of exactly 5 characters.
- () : parentheses define groups. They are very useful in combination with the or condition, for example ^(France|Congo)$.
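- | : the or condition. On some keyboard layouts, such as the French AZERTY layout, you type this character with Alt Gr + 6. Be careful when combining it with ^ and $: ^France|Congo$ matches any value that starts with France or ends with Congo. To match exactly the value France or the value Congo, group the alternatives and write ^(France|Congo)$.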
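The patterns from the list above can also be tried outside of Matomo. The following is a minimal sketch using Python's standard re module. The sample values and the matching helper are illustrative assumptions, not something provided by Matomo, and Python's regex engine may differ in small details (for example case sensitivity) from the one used by your analytics tool.

```python
import re

# Illustrative sample values (assumptions, not real report data).
countries = ["Albania", "Canada", "Congo", "Fiji", "France", "Ireland", "United States"]
ips = ["192.168.12.1", "193.168.12.1", "195.168.12.1", "192a168b12c1"]
words = ["ggle", "gogle", "google", "gooogle"]
labels = ["France", "Congo", "France 24", "Republic of the Congo"]


def matching(pattern, values=countries):
    """Hypothetical helper: return the values in which the pattern is found."""
    regex = re.compile(pattern)
    return [value for value in values if regex.search(value)]


# Anchors, dot and quantifiers on country names
print(matching(r"^F"))        # ['Fiji', 'France']
print(matching(r"a$"))        # ['Albania', 'Canada']
print(matching(r"^......$"))  # ['Canada', 'France']
print(matching(r"^F.*e$"))    # ['France']
print(matching(r"^.{5}$"))    # ['Congo']

# *, + and ? applied to the letter "o"
print(matching(r"^go*gle$", words))   # ['ggle', 'gogle', 'google', 'gooogle']
print(matching(r"^go+gle$", words))   # ['gogle', 'google', 'gooogle']
print(matching(r"^goo?gle$", words))  # ['gogle', 'google']

# Escaped dots and a character range; an unescaped dot matches any character
print(matching(r"^19[2-4]\.168\.12\.1$", ips))  # ['192.168.12.1', '193.168.12.1']
print(matching(r"^192.168.12.1$", ips))         # ['192.168.12.1', '192a168b12c1']

# Alternation with and without a group
print(matching(r"^France|Congo$", labels))    # all four labels
print(matching(r"^(France|Congo)$", labels))  # ['France', 'Congo']
```

The last two lines show why the group matters: without parentheses, ^ only applies to France and $ only applies to Congo.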
Regex make more sense when you use them directly within a software, so please give them a try within the Matomo UI: you can use them in the search filter of any report.
The list introduced above is far from exhaustive, so do not hesitate to look for keywords such as "Regex cheat sheet" in search engines to find valuable content.
Last modified: Sunday, 8 September 2019, 4:32 PM