Using Regular Expressions in Google Search Console

Written by Tiago Silva. Updated on 30 May 2022

Using RegEx in Google Search Console makes it easier for marketers and SEOs to build advanced filters across the query and URL data available in the tool.

Previously, you could only do basic filtering on Google Search Console data, such as equals, contains, and not-contains. These filter types have their limitations.

RegEx allows you to do complex pattern matching on your site’s query and URL data to filter on multiple values at the same time. This makes it easier and quicker to uncover valuable queries that you can use to optimize existing content and create new pages on your site.

That’s why RegEx is a big deal. In this article, you will learn what RegEx is and how it can help you find post-purchase searches from customers or get more content ideas.

Don’t sleep on RegEx. It has a fairly steep learning curve, but it can save you huge amounts of time when analyzing your website query data. There are also plenty of examples to get you started, which we cover below.

This article is part of our Google Search Console tutorials and training section, so make sure to check out the others.

What is RegEx (regular expression)?

RegEx is short for regular expression, a concept introduced by the mathematician Stephen Cole Kleene in the 1950s.

A RegEx is a sequence of characters used for searching or manipulating text. These expressions use metacharacters and literal characters.

Metacharacters are characters, or short character sequences, that carry a special meaning in a search. For example, \s (lowercase S) matches a whitespace character, and \S (uppercase S) matches any character that isn't whitespace.

A literal character is an ordinary character with no special meaning: it simply matches itself. For example, in the expression “.*Tesla.*”, the letters in ‘Tesla’ are literals, so we are searching for strings that contain ‘Tesla’, while the dots and asterisks are metacharacters.
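To make the difference between metacharacters and literals concrete, here is a minimal sketch. It uses Python's re module as a stand-in, which is an assumption for illustration only: Search Console uses Google's RE2 syntax, which behaves the same way for these simple patterns. The sample text is made up.

```python
import re

# Made-up sample text for illustration.
text = "Tesla Model 3"

# \s matches one whitespace character; \S matches one non-whitespace character.
whitespace = re.findall(r"\s", text)
non_whitespace = re.findall(r"\S", text)

# "Tesla" is made of literal characters. The ".*" on each side is built from
# metacharacters and means "any characters, any number of times".
contains_tesla = re.search(r".*Tesla.*", text) is not None

print(len(whitespace), len(non_whitespace), contains_tesla)
```

Swapping \s for \S flips the result from the two spaces to every other character, which is why the case of a metacharacter matters.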

Note the difference between lowercase \s and uppercase \S in the metacharacter example above: changing the case changes the meaning.

Interestingly, all the query data stored in Google Search Console is lowercase. RegEx matching in Search Console is case-sensitive by default, so write the query terms in your RegEx in lowercase, or start the expression with (?i) to ignore case.

These expressions are used by search engines (like Google), text editors (like VS Code or Notepad++), and word processors (like Microsoft Word).

Some practical examples of RegEx are:

  • Find and replace text;
  • Use a list of words or expressions and find them in a document;
  • Find URLs that contain a specific word (for example, product names).

A RegEx can look like this: (?i)^(Google Search Console|Search Console|GSC)

This expression is a case-insensitive search for text that starts with "Google Search Console", "Search Console", or "GSC". This means text starting with "search console" or "gsc" also matches.

Note: The (?i) prefix turns off case sensitivity, the caret ("^") anchors the match to the start of the text, and the pipe character ("|") means or.
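As a quick check of how this expression behaves, the sketch below runs it against a few made-up queries. Python's re module is used as a stand-in for Search Console's RE2 engine (an assumption for illustration), and the expression is written with no space before GSC.

```python
import re

# (?i) = ignore case, ^ = start of text, | = or.
pattern = re.compile(r"(?i)^(Google Search Console|Search Console|GSC)")

# Made-up sample queries.
samples = ["search console regex", "gsc filters", "how to use gsc"]

# Map each query to True/False depending on whether the pattern matches.
results = {s: bool(pattern.search(s)) for s in samples}

for query, matched in results.items():
    print(f"{query!r}: {matched}")
```

The last query contains "gsc" but does not start with it, so the caret keeps it out of the results.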

Why is RegEx in Google Search Console a big deal?

As we have seen, RegEx is used for searching data stored as text, and as part of our job as SEOs, we deal with lots of query and URL data.

Until now, any kind of advanced filtering on Search Console data meant exporting the data to a spreadsheet or getting a programmer to pull it through the Search Console API. Exports were also limited, as Search Console only exported the first 1,000 rows of data.

With RegEx support now built into the Search Console user interface, we can run advanced filters and queries directly within the tool. Because the filters run inside Search Console, they apply to the entire dataset of queries or URLs available, which can often run into tens of thousands of rows.

This lets you see a much wider variety of the queries your site appears for in Google, and gives you an opportunity to target those queries with specific content upgrades or new pages.

How to use RegEx in Google Search Console

GSC supports both positive matching (show the rows that match the RegEx) and negative matching (show the rows that don't match). Negative matching is useful for excluding search queries from the stats, like removing brand searches to better understand which pages people find through non-brand queries.
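The two matching modes can be sketched as a small helper. This is an illustration only: apply_filter is a hypothetical function, "acme" is a made-up brand name, and Python's re module stands in for Search Console's RE2 engine.

```python
import re

def apply_filter(queries, regex, negate=False):
    """Keep queries that match the regex, or those that don't if negate=True."""
    pattern = re.compile(regex)
    return [q for q in queries if bool(pattern.search(q)) != negate]

# Made-up sample queries; "acme" is a hypothetical brand.
queries = [
    "acme running shoes",
    "best running shoes",
    "acme store hours",
    "trail shoes review",
]

matching = apply_filter(queries, r"acme")                    # positive match
not_matching = apply_filter(queries, r"acme", negate=True)   # negative match

print(matching)
print(not_matching)
```

Every query lands in exactly one of the two lists, so the negative match is simply the complement of the positive one.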

How to apply RegEx filtering in GSC?

To use RegEx in Search Console, go to "Performance" (or "Search Results" if you have that option).

To apply a filter on GSC go to Performance and then click new to apply a RegEx filter.

At the top, click on "New", then select "Query" or "Page", and pick "Custom (regex)".

Pop up window with the 4 types of filtering allowed in Google Search Console

Now, it's time to write or paste your RegEx into Google Search Console.

A RegEx expression filtering 3 specific words.

Attention: The only filters that support RegEx are "Query" and "Page".

Why use RegEx instead of other filters?

Now you might be asking yourself when to use the default filters in GSC and when to use RegEx. Well, that's a great question.

First, the default Google Search Console filtering options are pretty basic compared to RegEx. For example, you can only filter by 1 URL, 1 keyword, or 1 country at a time with the default filters. This is a severe limitation on the searches you can make.

That's where RegEx comes to the rescue, especially as you can get creative and write expressions for your own use case.
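The difference between a one-value filter and a RegEx filter can be sketched as follows. The URLs are made up, and Python's re module is used as a stand-in for Search Console's RE2 engine (an assumption for illustration).

```python
import re

# Hypothetical URLs, standing in for the "Page" data in Search Console.
pages = [
    "https://example.com/blog/regex-guide",
    "https://example.com/guides/search-console",
    "https://example.com/pricing",
    "https://example.com/blog/seo-tips",
]

# A "contains" style filter checks one value at a time.
contains_blog = [p for p in pages if "/blog/" in p]

# A single RegEx can check several values at once with the pipe ("or").
blog_or_guides = [p for p in pages if re.search(r"/blog/|/guides/", p)]

print(len(contains_blog), len(blog_or_guides))
```

Getting the same result with default filters would take one filter per section and a manual merge of the exports.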

Sidenote: It's also worth mentioning that searching for special characters like foreign characters in URLs isn't supported with RegEx in GSC.

Helpful RegEx filters in Google Search Console

The use cases for RegExes inside Search Console are ever-growing, as SEOs keep finding more valuable expressions. You'll find a list of helpful filters below.

Find longtail keyword questions with RegEx

There are several ways to find longtail keywords with RegEx. Here you’ll read about 3 different examples. 

The first one was shared in the SEO Notebook newsletter by Steve Toth.

The RegEx is: ([^" "]*\s){7,}?

This expression will show you all the queries with 8 or more words, because it requires at least 7 repetitions of a word followed by a whitespace character.

To include shorter queries, change the number "7". For example, to get queries with 5 or more words, change "7" into a "4". Basically, put the minimum number of words you want to find, minus 1, into the expression.

Using Regex to find longtail queries.
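The "number of words minus 1" rule can be sketched as a small helper. min_words_pattern is a hypothetical function written for this illustration, the queries are made up, and Python's re module stands in for Search Console's RE2 engine.

```python
import re

def min_words_pattern(min_words: int) -> re.Pattern:
    """Build the long-tail expression for queries with at least min_words words."""
    # The number inside the braces is the minimum word count minus 1.
    return re.compile(r'([^" "]*\s){%d,}?' % (min_words - 1))

eight_plus = min_words_pattern(8)

# Made-up sample queries: the first has 8 words, the second has 7.
long_query = "how do i fix a broken phone screen"
short_query = "how to fix a broken phone screen"

print(eight_plus.pattern)
print(bool(eight_plus.search(long_query)))
print(bool(eight_plus.search(short_query)))
```

Calling min_words_pattern(5) produces the version with a "4" in it, as described above.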

Find pages ending in a specific slug

Hannes-Jeremia shared how to find URLs with a similar ending on Twitter. This simple tip is helpful for large websites. 

The RegEx you should use is: word$

In this case, replace “word” with the keyword you are looking for in the URLs. Then you only need to finish the expression with a dollar sign ($), which anchors the match to the end of the URL.

Let's see an example for the word holidays.

I'll use holidays$ as a RegEx in GSC. The results are the following:

This image shows the results of using Regex in GSC to find URLs that end with the same slug.
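Here is a minimal sketch of what the dollar sign does, using made-up URLs and Python's re module as a stand-in for Search Console's RE2 engine (an assumption for illustration).

```python
import re

# Hypothetical URLs for illustration.
urls = [
    "https://example.com/summer-holidays",
    "https://example.com/holidays/spain",
    "https://example.com/blog/cheap-holidays",
]

# "$" anchors the match to the end of the URL.
pattern = re.compile(r"holidays$")
ending_in_holidays = [u for u in urls if pattern.search(u)]

print(ending_in_holidays)
```

The second URL contains "holidays" but does not end with it, so it is left out. If your URLs end with a trailing slash, a variation such as holidays/?$ may be needed.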

Brand vs Non-Brand traffic

Jean-Christophe Chouinard has an extensive list of RegEx use cases for GSC.

One of the most significant examples in that list is comparing brand versus non-brand organic traffic.

This filter gives you an overview of visitors who already know your company versus potential first-time visitors.

To do it, use the comparison tab and an expression such as .*domainName.* as you can see on the screenshot below. Replace domainName with your own brand name, written in lowercase.

Regex filter to compare brand versus non-branded searches in Search Console.
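The comparison itself can be sketched in a few lines. The rows, click counts, and the brand name "acme" are all made up for illustration, and Python's re module stands in for Search Console's RE2 engine.

```python
import re

# Made-up (query, clicks) rows; "acme" is a hypothetical brand name.
rows = [
    ("acme shoes", 120),
    ("running shoes", 300),
    ("acme store near me", 45),
    ("best trail shoes", 80),
]

# Brand pattern of the form .*brand.* with the brand written in lowercase.
brand_pattern = re.compile(r".*acme.*")

brand_clicks = sum(clicks for query, clicks in rows if brand_pattern.search(query))
non_brand_clicks = sum(clicks for query, clicks in rows if not brand_pattern.search(query))

print("brand:", brand_clicks, "non-brand:", non_brand_clicks)
```

This mirrors what the comparison tab does: one side matches the RegEx, the other side doesn't.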

Find post-purchase queries

RegEx can help you find which post-purchase queries your site is currently ranking for. This is useful for spotting potential problems with your products and creating content around these searches.

The tip was shared by Christopher on Twitter.

Based on the comments on the tweet, I've adjusted his expression to include Timothée's suggestion, which returns even more queries from Google Search Console.

The RegEx is: \b(clean|broken|wash off|shattered|polish|problem|treat|doesn't work|replace|doesn't start|scratch|repair|manual|fix|protect|renew|coverage|warranty)[" "]

Using RegEx to find customers' questions after a purchase.
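To see which queries this expression catches, here is a minimal sketch with made-up queries. It writes the final character class with straight quotes and uses Python's re module as a stand-in for Search Console's RE2 engine; both are assumptions for illustration.

```python
import re

# Terms taken from the expression above.
terms = [
    "clean", "broken", "wash off", "shattered", "polish", "problem", "treat",
    "doesn't work", "replace", "doesn't start", "scratch", "repair", "manual",
    "fix", "protect", "renew", "coverage", "warranty",
]

# \b = word boundary before the term; [" "] = a quote or a space right after it.
pattern = re.compile(r"\b(" + "|".join(terms) + r')[" "]')

def is_post_purchase(query: str) -> bool:
    return pattern.search(query) is not None

# Made-up sample queries.
for q in ["how to clean white sneakers", "sneakers repair near me",
          "sneakers warranty", "buy white sneakers"]:
    print(f"{q!r}: {is_post_purchase(q)}")
```

One thing this sketch shows: because the expression requires a space or quote after the term, a query that ends with the term (like "sneakers warranty") is not matched. If you want those too, dropping the final character class is one option to test.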

Understand User Intent

Another use for regular expressions is understanding user intent.

Search intent is usually divided into four types:

  • Informational;
  • Navigational;
  • Commercial;
  • Transactional.

This tip combines ideas from Jean-Christophe Chouinard, Steve Toth, and Michael Martinez.

For quite some time, Google has been able to understand the intent behind a search. Filtering your queries by the words people use for each intent therefore helps you produce or improve content that matches those searches.

You can paste the following expressions into GSC to find queries based on user intent.

Informational

RegEx to use: who|what|where|when|why|how|was|did|do|is|are|aren't|won't|does|if|can|could|should|would|were|weren't|shouldn't|couldn't|cannot|can't|didn't|did not|doesn't|wouldn't

Navigational

Example of Regex to use: .*brand.*

Note: If your company is called Tesla, replace "brand" with "tesla" (in lowercase, since query data is stored in lowercase) to perform the search.

Another potential use for this expression is to check whether you are ranking for queries that include a competitor's name. It’s popular to build pages targeting “versus” and “alternative to” searches, so this is an excellent expression for checking those rankings.

Commercial

RegEx to use: .*(best|top|vs|review*).*

Transactional

RegEx to use: .*(buy|cheap|price|purchase|order).*
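The four expressions can be combined into a rough intent tagger, sketched below. The intents function, the sample queries, and the brand name "acme" are hypothetical; the informational list is written with straight apostrophes and without repeated terms; and Python's re module stands in for Search Console's RE2 engine.

```python
import re

# One expression per intent; "acme" is a hypothetical brand name.
INTENT_PATTERNS = {
    "informational": (
        r"who|what|where|when|why|how|was|did|do|is|are|aren't|won't|does|if|"
        r"can|could|should|would|were|weren't|shouldn't|couldn't|cannot|can't|"
        r"didn't|did not|doesn't|wouldn't"
    ),
    "navigational": r".*acme.*",
    "commercial": r".*(best|top|vs|review*).*",
    "transactional": r".*(buy|cheap|price|purchase|order).*",
}

def intents(query: str) -> list:
    """Return every intent whose expression matches the query."""
    return [name for name, rx in INTENT_PATTERNS.items() if re.search(rx, query)]

# Made-up sample queries.
for q in ["how to descale a kettle", "best electric kettle",
          "buy acme kettle", "window cleaner"]:
    print(f"{q!r}: {intents(q)}")
```

A query can carry more than one intent, so the function returns a list. Note that short terms such as "do" also match inside longer words: "window cleaner" is tagged informational because "window" contains "do". Wrapping the terms in word boundaries, for example \b(who|what|...)\b, is one way to reduce these false positives.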

Final Thoughts

RegEx is a powerful tool for any marketer who needs advanced filters when working with data, including in Google Search Console.

It's one of the best ways to filter data and surface non-obvious information, so you can build performance reports quickly.

As you saw, many generous people share their RegExes and use cases online, which can help you with your job and help you learn RegEx faster.

And speaking of learning RegEx, Regex101.com is one of the most frequently mentioned resources.

Also, JC Chouinard keeps adding more use cases to his list, so it's worth bookmarking the page and visiting it regularly.