How to Use Google Analytics Filters to Clean Up Shopify Pages Data


Shopify is, in my opinion, a well-built e-commerce platform, and I would undoubtedly support any online business owner's decision to build a new website on it or to move an existing one to it. That is primarily because Shopify has done an excellent job of making e-shop management easy, but also because it offers a secure environment and reasonably fast servers.

In the past, though, I honestly did vote against their Google Analytics integration, which was one of the reasons Shopify was not my platform of choice. That has since changed, and nowadays the integration with GA needs just a little outside help and setup to collect robust, accurate and useful data.

This article covers the Google Analytics Filters that I personally find useful for cleaning up Shopify data in GA’s Pages report. If you would like an introduction to the Pages report in general, start with another article, which focuses on why duplicated URLs in Google Analytics should be resolved.

How to make Shopify numbers in the Google Analytics Pages report more accurate

Problem

If you have gone to the All Pages report in GA (Behavior –> Site Content –> All Pages), as you would to see how a single page performs (Bounce Rate, Avg. Time on Page, and the undervalued Page Value metric), you will have noticed that one page is repeated in numerous variations:

  • /example-products
  • /example-products?sort_by=price-ascending
  • /example-products?page=3

As the variations above show, the appearance of these parameters (the letters and numbers after the question mark) means that one page’s metrics have been split across multiple rows instead of showing only one row in the report for each Page path. Consequently, the number of Pageviews for a single page is not just 6,587, or whatever the main row shows, but considerably more, because there might be 10, 20, 50 or more instances of this page with a parameter in the Page path, each with 50, 60 or only 1 or 2 Pageviews. Added together, this single page could have had 7,659 Pageviews (as an example) instead.

The same issue affects all the other essential Page metrics: Bounce Rate, Entrances, % Exit, etc.
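To make the arithmetic concrete, here is a minimal sketch of how the split rows add up. The rows and their Pageview counts are hypothetical, made up for illustration, and not taken from a real report.

```python
from collections import defaultdict
from urllib.parse import urlsplit

# Hypothetical (Page path, Pageviews) rows, as they might appear in All Pages.
rows = [
    ("/example-products", 6587),
    ("/example-products?sort_by=price-ascending", 60),
    ("/example-products?page=3", 50),
    ("/example-products?page=2&sort_by=price-ascending", 2),
]

def pageviews_by_path(report_rows):
    """Sum Pageviews per Page path, ignoring everything after the '?'."""
    totals = defaultdict(int)
    for page, views in report_rows:
        totals[urlsplit(page).path] += views
    return dict(totals)

totals = pageviews_by_path(rows)
print(totals)  # {'/example-products': 6699}
```

In this made-up sample the main row alone under-reports the page by 112 Pageviews; with dozens of parameter variations the gap grows accordingly.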

Let’s now explain what the two Shopify parameters shown in this case mean:

  • “sort_by” (see the Shopify documentation) in the Page path means that someone sorted a Collection page by some criterion: price, date, featured, etc.
  • “page” in the Page path means that someone went beyond the first page of a Collection; its value is the page number visited: 2nd, 3rd, etc.

Now, simply removing these would not be advisable, because they are useful. You do want to know whether visitors viewed products beyond the first page, or by which criteria they sorted. Both can tell you which products you could add to the featured ones, for example. So, we will first store them in GA Custom Dimensions and then remove them to clean up the Pages report, as we initially intended to do.

Solution

Solving this issue requires adding new Filters to GA: one to remove “sort_by”, one to remove “page” and one to remove the remaining characters (like ? and &). Before doing that, we will create new GA Custom Dimensions to store those values, and then create Filters that “pull” those values from the URL before they are removed.

This is all done from the GA Admin panel. The steps are as follows:

  1. Creating new Custom Dimensions

You create one for each value that needs to be stored, so we need two new Custom Dimensions. They are created in the GA Admin panel, under Property –> Custom Definitions –> Custom Dimensions.

You can name them however you want, but I would suggest “Shopify Collections Pagination” and “Shopify Collections Sorting”. Both should have the “Hit” Scope.

  2. Creating new Filters

First, you should build Filters to pull the values out of the URL. This is done by making a new Filter, choosing Custom as the Filter Type, and then selecting the Advanced radio button.

In Field A, choose Request URI from the list and put (sort_by=[^&]*&?) in the empty value box. Field B should be left empty and unchanged. Output To should be the Custom Dimension for sorting that you created earlier, with $A1 in its value box.

Do the same for the other Custom Dimension, but with (page=[^&]*&?) in the Field A box and the pagination Custom Dimension chosen in the Output To list.
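To see what these two Advanced Filters would store, here is a rough Python sketch. It assumes that $A1 refers to the first parenthesized group matched in Field A, and that Python's re module behaves like GA's regex engine for patterns this simple. The helper name extract_a1 and the sample URL are made up for illustration.

```python
import re

# The two Field A expressions from the Advanced Filters.
SORT_PATTERN = re.compile(r"(sort_by=[^&]*&?)")
PAGE_PATTERN = re.compile(r"(page=[^&]*&?)")

def extract_a1(pattern, request_uri):
    """Return what $A1 would hold: the first group of the first match, or None."""
    match = pattern.search(request_uri)
    return match.group(1) if match else None

uri = "/example-products?page=3&sort_by=price-ascending"
sorting = extract_a1(SORT_PATTERN, uri)     # "sort_by=price-ascending"
pagination = extract_a1(PAGE_PATTERN, uri)  # "page=3&"
print(sorting, pagination)
```

Note that, with these expressions, the stored value includes the parameter name and, when another parameter follows, the trailing &. If you would rather store only the value, moving the parentheses, as in page=([^&]*), is one option to test.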

After that is done, the next step is creating the Filters that remove these parameters from the URL.

Create one for “sort_by” and one for “page”. Notice that these Filters use Search and Replace instead of the Advanced Filter type. The search values are (sort_by=[^&]*&?) and (page=[^&]*&?), and the replacement is left empty so that the matched text is removed.

With that sorted out, what is left is to clean up the remaining characters. That requires only one additional Filter.

The value of this one is ([?&]$), which matches a ? or & left over at the end of the URL.
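The effect of the three removal Filters can be followed step by step in a short sketch. It uses Python's re.sub with an empty replacement as a stand-in for GA's Search and Replace, which is an assumption about equivalent behavior for these patterns; the function names and the sample URL are hypothetical.

```python
import re

def strip_param(request_uri, pattern):
    """Remove whatever the pattern matches, like an empty Replace String."""
    return re.sub(pattern, "", request_uri)

def strip_trailing_separator(request_uri):
    """Remove a single '?' or '&' left at the very end of the URL."""
    return re.sub(r"([?&]$)", "", request_uri)

uri = "/example-products?page=3&sort_by=price-ascending"
step1 = strip_param(uri, r"(sort_by=[^&]*&?)")  # "/example-products?page=3&"
step2 = strip_param(step1, r"(page=[^&]*&?)")   # "/example-products?"
clean = strip_trailing_separator(step2)         # "/example-products"
print(step1, step2, clean)
```

The intermediate results show why the last Filter is needed: removing the parameters alone leaves a dangling ? or & in the Page path.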

IMPORTANT: Removing query strings from Google Analytics should not be done in the most obvious place, the Exclude URL Query Parameters box in the Settings of the GA View.

The reason is that this setting is processed before Filters, so the query parameters (like “sort_by”) would be stripped from the URL before the Filters could pick them up. This means that if you put the parameters in that box, storing the values in Custom Dimensions won’t work, even though you would still get clean Page data as a result.

  3. Assigning Filter order

Google Analytics applies Filters in the order in which they appear in the list: the first one is processed before the second, and so on. This is why we need to make sure that we don’t remove these parameters before we have stored them in the Custom Dimensions, and that the final “clean-up” Filter comes last.

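Filter order is assigned from the Filters list of the View (Admin –> View –> Filters), using the Assign Filter Order option.

Choose a Filter and move it up or down with the buttons available. In our case, the final order should be: the two Filters that store the values first, then the two that remove “sort_by” and “page”, and the clean-up Filter last.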
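Why the order matters can be shown with a small simulation. This is only a sketch of the idea, not how GA works internally: each Filter is modeled as a function applied to a hit in list order, and the dictionary keys ("uri", "sorting", "pagination") are hypothetical names.

```python
import re

SORT = r"(sort_by=[^&]*&?)"
PAGE = r"(page=[^&]*&?)"
TRAILING = r"([?&]$)"

def extract(pattern, dimension):
    """Model of an Advanced Filter: copy the first group into a dimension."""
    def apply(hit):
        match = re.search(pattern, hit["uri"])
        if match:
            hit[dimension] = match.group(1)
        return hit
    return apply

def remove(pattern):
    """Model of a Search and Replace Filter with an empty replacement."""
    def apply(hit):
        hit["uri"] = re.sub(pattern, "", hit["uri"])
        return hit
    return apply

def run(filters, uri):
    """Apply the filters one after another, in list order."""
    hit = {"uri": uri}
    for apply_filter in filters:
        hit = apply_filter(hit)
    return hit

correct_order = [
    extract(SORT, "sorting"),
    extract(PAGE, "pagination"),
    remove(SORT),
    remove(PAGE),
    remove(TRAILING),
]
# Same filters, but the removals run before the extractions.
wrong_order = correct_order[2:] + correct_order[:2]

uri = "/example-products?sort_by=price-ascending&page=2"
good = run(correct_order, uri)
bad = run(wrong_order, uri)
print(good)
print(bad)
```

With the correct order the hit ends up with a clean URL and both stored values. With the removals first, the URL is just as clean but nothing is stored, which is the same outcome as excluding the parameters in the View settings.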

With this, we have completed the setup for cleaning up parameters from the Page path and getting valid data in your GA reports.

You should also know that Shopify can generate other parameters that appear in the Page path, like “limit” (which shows how many products the visitor chose to see at once on the Collections page). In these cases you can apply the same method: create a Custom Dimension to store the value, create one Filter that pulls it from the URL and stores it, and then create one that removes the parameter. All you need to change is the parameter name in the expression, for example (limit=[^&]*&?) instead of (page=[^&]*&?). Don’t forget to assign the Filter order, and always keep the Filter that removes the remaining & and ? characters as the last one.
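Since only the parameter name changes, the expression can be generated rather than retyped. A minimal sketch, where param_pattern is a hypothetical helper and the sample URL is made up:

```python
import re

def param_pattern(name):
    """Build the filter expression for a given query parameter name."""
    return "(" + re.escape(name) + "=[^&]*&?)"

limit_pattern = param_pattern("limit")
stripped = re.sub(limit_pattern, "", "/example-products?limit=48&page=2")
print(limit_pattern)  # (limit=[^&]*&?)
print(stripped)       # /example-products?page=2
```

The generated string can be pasted into both the Advanced Filter and the Search and Replace Filter for the new parameter.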

Posted on: July 24, 2018 by igorkolosov
