Prioritizing Broken Pages by Top Metrics

In a perfect world our websites would have no broken pages. But if that were a reality, than the unicorn frappuccino from Starbucks would actually be good.

In my work on the Visual Studio team at Microsoft, we have thousands of pages across subdomains. Finding all the 404s and redirecting all of them to proper pages would take a lifetime. So, I used a technique I first learned from Annie Cushing, on how to find the most important 404s to fix.

Data Collection

Step 1: Crawl for 404s

The first step here is to find the broken pages. The tool I used for this is Screaming Frog. You can crawl up to 500 URLs for free. We use the paid version at Microsoft, but even then it’s affordable. There are lots of amazing things you can do with Screaming Frog, but for our purposes, we just have to put in the site URL and let it crawl.

Browse to the Response Codes tab and filter by Client Error (4xx) to find your 404 pages.

screamingfrog

Bonus: Inlink data is provided by Screaming Frog.

Step 2: Page Performance Data

After collecting my 4xx response pages, I head over to Google Analytics to grab some session/pageview data.

Option 1 (Sessions): Acquisition > Channel > Organic > Primary Dimension: Landing Page > Export

google-analytics

Pro tip: to export more then 10 rows, scroll to the bottom of the screen and select Show rows: 100.

Option 2 (Pageviews): Behavior > Site Content > All Pages > Export

google-analytics-allpages

You’ll want to select a healthy amount of time too, I suggest a month or two for most sites.

Step 3: Backlink Link Data

Lastly, I’ll use Moz, Majestic, SEMRush, or Ahrefs to grab some backlink data.

Free: Open Site Explorer > URL > Target: this root domain > Link Source: only external > Request CSV

(You will have to log into a free account to be able to download.)

open-site-explorer

This is harder to use with small site, because Mozscape mostly crawls domains with a certain authority.

Paid: SEMRush > Domain Analytics > Backlinks > Indexed Pages > Export

(Other options are Majestic, Ahrefs.)

semrush-indexedpages

Pro tip: You can also grab social share data from tools like Buzzsumo to add more insight to the analysis.

Married That Data

Using Excel, or Google Sheets, collect all the data in three sheets of one workbook. This is where the real analysis comes in.

Step 4: Simplify URLs

Find and replace in Excel is a dream send. Use it. To find, use the keyboard shortcut Command/Ctrl + F, then there is an option to replace.

Find all the http:// and https:// and replace with nothing. This trick is great for setting up our urls to be the same format to that we can to that VLOOKUP.

Step 5: VLOOKUP TIME!

Watch this video from Annie Cushing that breaks down how to do a VLOOKUP. Then add some columns to your 404 page data tab for sessions and backlinks. These columns are where you’ll put those VLOOKUPs. I also added some table formatting for easy filtering later.

Next, you’ll want to follow the steps in the video tutorial to match the 404 pages to pages that have received backlinks or sessions. Your final result will look something like this.

I wrapped my VLOOKUP function in an IFERROR function to show “”, or blank values if there is no url that was found.

Step 6: Glitter

Conditional Formatting is your friend. It’s hard when you see a bunch of numbers to analyze which are the most important.

Excel > Home > Conditional Formatting > Color Scales

Now you’re able to prioritize which 404 pages need to be fixed and with what level of urgency.

Improve UX and SEO

Step 7: Get to Fixing

I would recommend setting up some 301 redirects to that pages that best fit the visitor’s intent. It might be easy to redirect to the homepage. But take the time to find what the best alternative to that broken page is, then link to it.

When setting up these redirects, I recommend doing a quick check in Screaming Frog to see if you have any daisy-chain redirects going on.

More Random Advertising


2 thoughts on “Prioritizing Broken Pages by Top Metrics

Leave a Reply

Your email address will not be published. Required fields are marked *

You may use these HTML tags and attributes:

<a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <s> <strike> <strong>