How to Make a 404 Page for SEO, Usability (404 FAQs & More)

February 17, 2021
by
Tory Gray

Originally published in 2011. Updated May 2021.

404 Error Pages play a very important role in search engine optimization and website usability. Here's a Q&A of the most commonly asked questions we get from clients about 404 Not Found pages, things that commonly go wrong (so you know how to fix them), plus advice for how to set up & monitor 404 errors properly.

404 PAGE NOT FOUND ERRORS FAQ:

What Is A 404 Error?

A 404 error (HTTP 404) is one kind of “header response code” or “HTTP status code” (SEO tools often group it under "crawl errors"). It is the computer equivalent of saying “Not Found” or “Page Not Found.”

Here’s the “tech speak” definition:

“The server has not found anything matching the Request-URI. No indication is given of whether the condition is temporary or permanent. The 410 (Gone) status code SHOULD be used if the server knows, through some internally configurable mechanism, that an old resource is permanently unavailable and has no forwarding address. This status code is commonly used when the server does not wish to reveal exactly why the request has been refused, or when no other response is applicable.”
- via w3.org

In other (less technical) words, this response code essentially tells search engines - and users - that the resource (or requested URL) being referenced does not exist or literally can’t be found.

It’s a robot’s version of a shrug and blank look.

There are many other response error types, and all URLs return response codes of some sort. A correctly functioning page, for example, should return a “200” status code, which means “OK.” The other major error type is called "server errors", represented by HTTP status codes 500-599. Different error types help webmasters diagnose the source of the errors so they can fix them appropriately.
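To make the status code families concrete, here is a minimal Python sketch that labels a code by its numeric range. The function name `describe_status` is our own (hypothetical); the reason phrases come from Python's standard `http.HTTPStatus` enum.

```python
from http import HTTPStatus


def describe_status(code: int) -> str:
    """Return a label such as '404 Not Found (client error)'."""
    if 200 <= code <= 299:
        family = "success"
    elif 300 <= code <= 399:
        family = "redirect"
    elif 400 <= code <= 499:
        family = "client error"
    elif 500 <= code <= 599:
        family = "server error"
    else:
        family = "informational or unknown"
    try:
        phrase = HTTPStatus(code).phrase
    except ValueError:
        # Not every number in a range is a registered status code.
        phrase = "Unregistered"
    return f"{code} {phrase} ({family})"


for code in (200, 301, 404, 410, 500):
    print(describe_status(code))
# 200 prints as "200 OK (success)", 404 as "404 Not Found (client error)"
```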

404 errors are generally the most common error type. They are also often handled incorrectly by well-meaning people - hence the purpose of writing this article! 404 HTTP errors are sometimes called "Client Errors", where the "client" typically refers to the user's web browser (Google Chrome, IE, Firefox, etc.)

What Is A Soft 404?

A soft 404 is essentially a page that doesn’t return a 404 response code, but Google(bot) nevertheless believes the page is experiencing an error.

What are 410 Errors? 403 Errors? And the other 4{xx} errors?

Errors 400 to 499 are all various types of Client Errors. Some other common 4xx errors include:

  • 400 (Bad Request)
  • 401 (Unauthorized)
  • 403 (Forbidden) - this one commonly occurs when you crawl a site too quickly and the platform/server (e.g. Shopify!) will deny you access to reduce their web server load.
  • 410 (Gone) - whatever this resource was, it's now gone - and that's permanent & intentional.
  • Find a full list of all the 4{xx} status errors.

What's The Difference Between 404s And Soft 404s?

We can keep this one short and sweet - essentially, a soft 404 does not issue a 404 header status code; it (perhaps incorrectly) issues a 200 (OK) status code instead.

A proper 404 page does correctly issue a 404 (Page Not Found) HTTP status code.
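As a rough illustration of the difference, the sketch below labels a response as a proper 404 or a possible soft 404. How Google actually detects soft 404s is not public, so the phrase list and the `classify_response` function are assumptions made for this example, not a reproduction of Googlebot's logic.

```python
# Illustrative phrases only; a real audit would tune this list per site.
ERROR_PHRASES = ("page not found", "404", "no longer available")


def classify_response(status: int, body: str) -> str:
    """Label a (status code, page body) pair."""
    text = body.lower()
    looks_like_error = any(phrase in text for phrase in ERROR_PHRASES)
    if status == 404:
        return "proper 404"
    if status == 200 and (looks_like_error or not text.strip()):
        # The page says "OK" in the header but reads like an error (or is empty).
        return "possible soft 404"
    return "other"


print(classify_response(404, "<h1>Page Not Found</h1>"))
print(classify_response(200, "<h1>Page Not Found</h1>"))
print(classify_response(200, "<h1>Our Services</h1>"))
```

A text heuristic like this can misfire on valid pages that merely talk about error codes, so treat its output as a list of URLs to inspect, not a verdict.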

How And When Do 404 Errors Occur?

  • When pages have been moved (e.g. a missing page, perhaps due to it getting moved to a different section of your website) or removed (e.g. a non-existent or deleted page was forgotten and never redirected)
  • When a webmaster, CMS user or software engineer mistypes a URL on a page or "page template" or has a copy-and-paste mistake (e.g. the wrong url was linked to)
  • When broken links or accidentally truncated links occur on web pages, social media posts, or in an email message
  • Real soft 404s occur when a page issues a 200 (OK) status when it should have issued some other error, because something went wrong. Most commonly the website should have issued a 404 error, but didn’t. Usually this is an indication that 404s aren’t functioning properly. Sometimes it works in some sections of your site, but not others.

How Are 404 Pages Useful?

A 404 error tells search engines, website visitors and webmasters when page URLs are broken, or never existed in the first place.

When we can see where these error codes occur, we can fix the problem for future website visitors (generally via a 301 redirect to the new location of that content), thereby preserving the power of the page that once existed (... or the incorrect link to a page that never did. That’s what we call "broken backlink building".)

If you don’t issue a 404 response code, you won’t inherently know that the page - and therefore the user - is experiencing an error.

So YES. A 404 page is useful. You can’t fix what you don’t know about!

Are 404 Errors Bad for Search Engine Optimization (SEO)? Are 404 Errors Good for SEO?

There is some disagreement online about this issue.

  • In theory, at least, 404 errors are "bad" because they represent errors on your website (or on the web, but relating to your website). But mistakes happen, and they are easily forgiven on a small scale.
  • That said, a user that comes across a 404 page is less likely to return to that site later. (We’ll cover how to make this less likely, below.)
  • Too many 404 header response errors - or 403s, 500s, or really any other type of 4{xx} or 5{xx} error - across a site can create an overall high error rate vs. success rate.
  • This results in trust issues: if Google (or Bing, etc.) sees too many errors vs. functional pages, they won’t want to send users to your site. Why would they? That user will get lost and mad, and have a bad experience. It would mean that Google failed at their job - helping the user quickly & easily find the answer to their question.
  • If your website is actively linking to a bunch of non-functional pages, that means you are passing "page rank" (e.g. SEO equity) … to nothing. Picture a scenario wherein your website is like a purse with holes in it, with money dripping out. No bueno!
  • That said - 404 errors can and will happen no matter what. A properly functioning 404 page will notify the webmaster that the error occurred, and where it occurred. Then we can go about fixing it via a 301 (permanent) redirect. The user, and search engines, never need to come across this particular issue again.
Key Point: Some teams see 404s as “bad” and so avoid this inherent "badness" by simply choosing not to issue a 404 status code on a 404 page. The trouble is - the errors real users are experiencing can and will occur no matter what. Let’s not “shoot the messenger” - mistakes happen, and error codes help us identify those! Issuing a correct 404 status code means that you can more easily find issues in order to fix them.

So… to state the obvious… don’t forget to fix them! Too many outstanding, unfixed 404s can be actively bad, or can be big missed opportunities. Plus it’s just a bad user experience.

How Should A 404 Page Function? (User Experience Matters!)

  • Step 1: Remain at the URL that was called, (for example: https://www.thegray.company/i-made-this-up/) - e.g. DON’T redirect this to some other page like /404.html. The error should load on the URL with the error, so we can see the error - where it occurred - and therefore fix it properly.
  • Step 2: A 404 "http status error" should be issued from the server. Use a tool like httpstatus.io to double check that it’s working.
  • Step 3: A 404 message - visible to the website visitor, on the page - should clearly explain what happened, and include resources, links, and ideally the ability to search your site to find what they were looking for. More on how to make this UX friendly below.
Example results from httpstatus.io
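The three steps above can be sketched as a tiny WSGI app using Python's standard-library `wsgiref`. This is only an illustration under stated assumptions: the `PAGES` table, the HTML, and the `/contact/` link are hypothetical, and a real site would do this in its own framework or server config.

```python
from wsgiref.simple_server import make_server

# Hypothetical site content, keyed by URL path.
PAGES = {"/": "<h1>Home</h1>", "/services/": "<h1>Services</h1>"}

NOT_FOUND_HTML = (
    "<title>404 - Page Not Found</title>"
    "<h1>Sorry, we couldn't find that page.</h1>"
    '<p>Try the <a href="/">homepage</a> or <a href="/contact/">contact us</a>.</p>'
)


def app(environ, start_response):
    path = environ.get("PATH_INFO", "/")
    if path in PAGES:
        status, html = "200 OK", PAGES[path]
    else:
        # Steps 1 and 2: no redirect; answer at the requested URL with a real 404.
        # Step 3: the body still explains what happened and offers links.
        status, html = "404 Not Found", NOT_FOUND_HTML
    body = html.encode("utf-8")
    start_response(status, [
        ("Content-Type", "text/html; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


def serve(port: int = 8000) -> None:
    """Run the app locally for manual testing."""
    with make_server("127.0.0.1", port, app) as server:
        server.serve_forever()
```

Calling `serve()` and requesting a made-up path should then show a 404 status on the original URL, with no `Location` header.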

What Are the Most Common Improper Setups For 404 Pages? What Are The Most Common 404 Setup Mistakes/Errors?

  • Redirecting to a 404 page. This hurts everyone. Users are lost, search engines think everything is hunky-dory when it's not, and since you don't know when it's happening - you can't fix it.
  • Automatically redirecting to the page you assume search engines and users want. A risky solution, that can easily go wrong (typically because they are sent to irrelevant content - like your homepage.) It's best not to assume. Find the issue and fix it.
  • Serving a 404 message on the page visually, but not delivering the corresponding 404 http response code. This hurts everyone (for the same reason as the "redirect to a 404 page" item above.)
  • Serving a different status code to different user agents. For example, are you sending Bingbot a 404 (not found), and users in a browser a 200 (okay) message? Sometimes this occurs across different devices, e.g. there's a 404 on mobile (and for Googlebot-Mobile) but not on desktop browsers.
  • Not finding & resolving critical errors - on a regular basis. Engineering teams are busy - and they don't want more work, so it's not uncommon for them to refuse to do so. Especially when they don't fully understand the value of the work to the business.
  • SPAs (Single Page Applications) can't create a proper 404 "out of the box". Here’s the correct way to fix that.
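The "different status code for different user agents" mistake is easy to spot once you have the status each agent received for the same URL. A minimal sketch, assuming you have already collected those results (for example from a status checker run with several user agents); the data and the `find_status_mismatches` name are hypothetical.

```python
def find_status_mismatches(results):
    """results maps url -> {user_agent: status}; return only URLs where agents disagree."""
    return {
        url: by_agent
        for url, by_agent in results.items()
        if len(set(by_agent.values())) > 1
    }


# Hypothetical results for two made-up URLs.
observed = {
    "/i-made-this-up/": {"Bingbot": 404, "Desktop browser": 200},
    "/also-made-up/": {"Bingbot": 404, "Desktop browser": 404},
}
print(find_status_mismatches(observed))
```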

What Are The SEO Repercussions of 404 Setup Mistakes?

  • If there are too many 404 errors occurring on the website, the 404 URL itself can start ranking (an issue that occurs when you combine a redirect to a 404 page with a URL that doesn't issue a proper 404 http header response).
  • Pages on the website that should be benefiting from the link are not, lowering the overall search engine ranking potential. This results in less traffic to the website overall and you are left to figure out what to do next to recover your traffic.
  • Users see a 404 page - instead of the page they should be seeing - often causing them to leave.

Unfortunately, these users are not likely to return:

  • 404 pages have a high bounce rate - i.e., the percentage of visitors who visit this page first (often via a search engine) and immediately leave the website.
  • 404 pages have a high exit rate - i.e., the percentage of visitors that find this page from clicking a link somewhere on your website, whereupon they immediately leave.

How Can I Make My 404 Page More User-Friendly? (E.g. What Are 404 Page Best Practices)?

There are several things you can do to improve your website's 404 page - and the likelihood that website visitors will stick around and check out the rest of your site - including:

  • Explain, in plain English (or whatever language your site targets), exactly what happened. This is for the user's benefit…. because that matters!
  • Include a link to your contact page so users can attempt to solve the issue with your help.
  • Include a simplified HTML sitemap (embedded in the body of the page) so the user can find their own way. Track what's being searched most often and make those resources easier to find.
  • Consider adding a search bar to the 404 page template - so the user can look for the resource if they don't see it listed already.
  • Make sure the 404 template’s title includes the text “404” or “Page Not Found” - this way you can figure out which pages your site's visitors are hitting, and the frequency this occurs at, in Google Analytics (or your analytics tracking tool of choice.)
  • Have fun with it! Extend your brand's personality & connect with your audience. Examples of great implementations of this can be found here.

How Can I Find 404 Errors On My Website?

There are several good ways to go about this. I’d honestly recommend doing them all - since some tools can find issues that other tools don’t see.

  • Crawl your website. There are a host of really great crawlers, my favorite being Screaming Frog for small sites, and DeepCrawl for very large/enterprise sites (both often work in all situations, it’s often a personal preference or budget consideration. Sitebulb is another fantastic tool.) Each will tell you what status code errors are linked on your site.
  • Check the Google Search Console (GSC - formerly known as Google Webmaster Tools) Coverage Report, in the Excluded section:
The Google Search Console (GSC) Coverage report. Click the “Excluded” box to see what URLs on your site are excluded.
This version helps you see URLs that Google believes are Soft 404s.
If you have 404 errors that Google has discovered, you will find them by clicking on this header.
  • Check Google Analytics (GA). Search for page titles that contain “not found” or “404” - this gives you a list showing how many times a particular page has been hit, which helps you prioritize which errors need fixing first.
How to find 404s in Google Analytics
  • Check Your Log Files. This is my favorite way to check for 404 errors (and other issues too!), since you can see exactly who - aka which bot - is hitting what errors, on what specific pages, how frequently, and your site's total "error rate". Unfortunately, not everyone can get access to log files, depending on your platform and host. Talk to your IT and/or development team to see if this is an option for you. You’ll need a log analyzing tool to do this; my favorite is from Screaming Frog.
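If you do get log file access but have no dedicated analyzer handy, a short script can answer the same questions. The sketch below assumes the Apache/Nginx "combined" log format (your server may be configured differently) and uses made-up sample lines; `summarize_errors` is a hypothetical helper name.

```python
import re
from collections import Counter

# Assumes the "combined" access log format.
LOG_LINE = re.compile(
    r'^\S+ \S+ \S+ \[[^\]]+\] "(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def summarize_errors(lines, bot_token="Googlebot"):
    """Return (404 paths by hit count, error rate) for requests from one bot.

    Matches on the user-agent string, which can be spoofed.
    """
    total = errors = 0
    not_found = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if not match or bot_token not in match["agent"]:
            continue
        total += 1
        status = int(match["status"])
        if status >= 400:
            errors += 1
        if status == 404:
            not_found[match["path"]] += 1
    rate = errors / total if total else 0.0
    return not_found.most_common(), rate


GOOGLEBOT = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
sample = [
    f'66.249.66.1 - - [10/May/2021:06:25:24 +0000] "GET / HTTP/1.1" 200 5120 "-" "{GOOGLEBOT}"',
    f'66.249.66.1 - - [10/May/2021:06:25:30 +0000] "GET /old-page/ HTTP/1.1" 404 512 "-" "{GOOGLEBOT}"',
    f'66.249.66.1 - - [11/May/2021:08:01:02 +0000] "GET /old-page/ HTTP/1.1" 404 512 "-" "{GOOGLEBOT}"',
    '203.0.113.9 - - [11/May/2021:09:00:00 +0000] "GET /typo/ HTTP/1.1" 404 512 "-" "Mozilla/5.0 Firefox/88.0"',
]
top_404s, error_rate = summarize_errors(sample)
print(top_404s, round(error_rate, 2))
```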

How Can I Find 404 Errors On The Pages They Are (Allegedly) Linked From?

If you are having issues finding the 4{xx} error on the page it’s linked from (e.g. from internal links), here are some things you can test. If you can’t find the URL using any of these methods, it’s likely that the issue used to be present but has been fixed.

1) Periodically run a new crawl of the site to find and fix errors old and new.

Screaming Frog report on 404 error types
  • First check if the link is from an inlink or a redirect. Screaming Frog’s 404 inlink report will tell you which is which; other crawlers like DeepCrawl and Sitebulb can give you similar data.
  • If the answer is “AHREF”, it's findable via a direct hyperlink and you can proceed to the steps below.
  • If the answer is “HTTP Redirect”, the source of the link is another linked URL, which then redirects to your broken page. Check your redirect file for the original URL (e.g. in Apache, or an Htaccess file) or your website's admin center/CMS (for example, in a Wordpress redirect plugin.) Update the redirect location so it’s no longer pointing to a broken page.
Google Analytics Navigation Summary Report can help you find the source of linked 404 errors

2) View Source on the URL in question and Control+F (C+F) for the broken URL’s path.

3) View Rendered Source (3rd party browser plugins like this View Rendered Source plugin for Chrome) will show you source code AFTER the browser has rendered it (e.g. after JavaScript and CSS have run.) Again, C+F to find the broken URL path.

4) Use Google Analytics and search for "404" or "Page Not Found" in the Title. Once you narrow down into the offending URL, click into the Navigation panel to find the previous pages in that path.

5) Run the URL through the Mobile-Friendly Tool from GSC. Once it’s run, click the HTML tab, and Copy + Paste the contents of the results into a Word doc, text editor, etc. Again, C+F to find the broken URL path.

Pro Tip: Using the Mobile-Friendly Tool gives you actual Googlebot-rendered code - an EXCELLENT resource for QAing issues like this.
Find Googlebot rendered HTML in the Google Mobile Friendly Tool
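When the pasted source is long, the "C+F" step can be scripted. A minimal sketch using Python's standard `html.parser`: it lists every `<a href>` in a chunk of HTML (raw or rendered) that contains the broken path, along with the line it starts on. The sample HTML and the `find_links_to` helper are hypothetical.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect (line number, href) for every <a> tag."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                # getpos() returns (line, column) of the tag being handled.
                self.links.append((self.getpos()[0], href))


def find_links_to(html: str, broken_path: str):
    collector = LinkCollector()
    collector.feed(html)
    return [(line, href) for line, href in collector.links if broken_path in href]


page_source = """<nav>
<a href="/services/">Services</a>
<a href="/i-made-this-up/">Old page</a>
</nav>"""
print(find_links_to(page_source, "/i-made-this-up"))
```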

How Can I Find 404 Errors That Aren't Linked on My Website?

It’s not uncommon for 404 errors to happen on URLs... that never existed on your website. This is generally due to either:

  • Issues with Googlebot crawling JS, HTML or CSS incorrectly. Find these errors via GSC > in the Excluded report. Fortunately, as Google has improved their ability to crawl JS effectively, this happens less and less commonly.
  • External links to your site that have broken paths (e.g. another website is actively linking to a non-functional page on your website.) You can discover these with 3rd party tools. My favorite is the Broken Backlink report in Ahrefs (a paid SEO tool.) You can export this data in a CSV file, then crawl each URL in List Mode to see what’s actively 404ing today.

Pro Tips: 
1) Sometimes you'll see 404 errors in GSC without a "source", and which you can't find any links to - internally or externally. Depending on the volume, you might ignore these, or just redirect them anyway.

2) Web caching can be a blocker to the QA process - it's possible that a particular URL is not currently 404ing, but was - or that it wasn't, but it is now. So remember to clear your website's cache, and your browser's cache too, if you experience any weirdness.

3) Beware creating "redirect hops" (a redirect from page A to page B and then to page C, for example) or "redirect loops" (e.g. a redirect from page D to page E and then back to page D, so the web visitor can't access any functional URL!) Most good crawlers can help you identify these in action.

4) Keep in mind that not every external link is worth redirecting. If the broken link is from a spammy or really low value website, you might be better off ignoring it. You can determine this answer for yourself, again via paid 3rd party tools OR via a manual visual inspection. Examples of ways to determine quality quickly include: Ahrefs’ DR (Domain Rating) and UR (URL Rating) metrics, or Moz’s DA (Domain Authority) or Spam scores. Learn more about vetting errors to determine what you fix.
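Redirect hops and loops (tip 3) can also be checked offline if you can export your redirect rules as a simple source-to-target mapping. A minimal sketch under that assumption; the paths and the `trace_redirects` function are hypothetical.

```python
def trace_redirects(start, redirects):
    """Follow `start` through a {source: target} map.

    Returns (path_taken, verdict), where verdict is one of
    "no redirect", "single redirect", "chain", or "loop".
    """
    path = [start]
    seen = {start}
    current = start
    while current in redirects:
        current = redirects[current]
        path.append(current)
        if current in seen:
            return path, "loop"
        seen.add(current)
    hops = len(path) - 1
    if hops == 0:
        return path, "no redirect"
    return path, "single redirect" if hops == 1 else "chain"


# Hypothetical redirect rules.
rules = {"/a/": "/b/", "/b/": "/c/", "/d/": "/e/", "/e/": "/d/"}
print(trace_redirects("/a/", rules))  # two hops: worth collapsing to /a/ -> /c/
print(trace_redirects("/d/", rules))  # never reaches a working page
```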

How Do I Fix 404 Errors? Can I Prevent 404 Errors?

There are a variety of ways to fix 404 errors, depending on your site’s setup/platform and your software development team’s capabilities/priorities. But the simplest answer is implementing a 301 redirect.

  • You might be able to implement this for yourself in your website’s admin center. If so, this is the easiest way to do so! Just pick out the best new URL this page should be redirected to (ideally something highly relevant.)
  • Sometimes you’ll need to go through your software or IT team. It’s not uncommon for them to push back on doing this work - full stop, or due to how many there are to implement (sometimes there are a lot). They don’t always see the value, and they are spread thin working on other business priorities. It’s generally something I do recommend fighting for, within limits. See more on this below.
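Whatever platform applies it, a set of 301 redirects boils down to a lookup from old path to new path. The sketch below shows that logic plus a sanity check that no redirect points at a page that doesn't exist. `LIVE_PAGES`, `REDIRECTS`, and both function names are hypothetical; your CMS or server config will express the same idea in its own syntax.

```python
# Hypothetical site data.
LIVE_PAGES = {"/", "/services/", "/clients/"}
REDIRECTS = {
    "/old-services/": "/services/",
    "/case-studies/": "/clients/",
}


def respond(path):
    """Return (status, headers) for a requested path."""
    if path in LIVE_PAGES:
        return 200, {}
    if path in REDIRECTS:
        # 301 = permanent redirect to the most relevant live page.
        return 301, {"Location": REDIRECTS[path]}
    return 404, {}


def redirects_to_missing_pages(redirects, live_pages):
    """List redirect rules whose target is not a live page."""
    return [(src, dst) for src, dst in redirects.items() if dst not in live_pages]


print(respond("/old-services/"))
print(redirects_to_missing_pages(REDIRECTS, LIVE_PAGES))
```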

Can you prevent them? Nope. They are just a reality of the world, and ignoring them won’t help. Consider 404s "technical debt" that you need to find/vet/resolve periodically.

How Do I Fix Soft 404 Errors?

Part 1: Vet your GSC “Soft 404” list to ensure that any issues found are actually issues to fix... because they aren’t always.

To do so, export the Soft 404 error URLs from GSC, and inspect them 1) manually (visually - do they look like they have content?), 2) via a crawler or status code checker (for the actual status code they issue), and potentially 3) via a JavaScript audit.

One fascinating use case about Soft 404s shared by the amazing Paige Ford: "We discovered multiple Soft 404s on valid Help Center articles, and theorized that Google wouldn’t index them because the help articles were explaining error codes - where those messages could be misinterpreted as 404 messages. We kept error codes (that weren’t 404 or 500) but removed the error message, and Google ended up indexing it."
  • For items that were errors and you actively want those URLs indexed - and you believe they are okay now - submit them for indexing via GSC.
  • For items that aren’t a priority to index, you may be able to ignore them. It really depends on how many there are - don't let there be so many you don't see actual, real issues!

Part 2: Share your vetted list of issues with your software development team or fix them directly. In most cases it’s one of the following:

  • It’s not a valid page that search engines should be accessing or indexing anyway. In this case, you can either block the page / page path via the robots.txt file, or use the Remove URL tool in GSC if you need to get it deindexed.
  • It’s a page that should have issued a 404 error, but didn’t. Your software team will need to dig in to understand why that’s happening & resolve it. Once it’s correctly issuing a 404 again, you can proceed to part 3.
  • There's a JavaScript crawling or rendering issue resulting in visible content that users can see - but search engine bots can't. To troubleshoot if this is happening, you can use the Inspect URL tool in GSC, or use rendering tools like View Rendered Source, your browser's Inspect Element tool, or the HTML that's rendered from the Mobile-Friendly Tool.
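If you go the robots.txt route for the first case, it's worth confirming the rule blocks what you intend and nothing more. A minimal sketch using Python's standard `urllib.robotparser`; the robots.txt content and URLs are hypothetical, and this only tests crawl rules, not whether a URL is indexed.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content.
ROBOTS_TXT = """\
User-agent: *
Disallow: /internal-search/
"""


def is_blocked(robots_txt: str, url: str, user_agent: str = "Googlebot") -> bool:
    """True if the given robots.txt rules disallow `user_agent` from fetching `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


print(is_blocked(ROBOTS_TXT, "https://example.com/internal-search/results"))
print(is_blocked(ROBOTS_TXT, "https://example.com/services/"))
```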

Part 3: Follow the steps outlined to fix regular 404s, above.

Can't I Just Let Pages Keep 404ing? Google Says It's Okay!

Yes they do say this! Unfortunately, Google sometimes says things that aren’t - strictly speaking - accurate. See: Marketers Say Most of Google’s Public Statements Are False or Misleading. Here are some clarifying points: 

  • Actively linking to 404s is just bad - for everyone. Lost users, and therefore lost revenue opportunities.
  • Having too many 404s, or other site errors, can contribute to an overall high error rate on your site, which can cause Google to distrust your site over time. Plus, you really don’t want Googlebot spending time crawling your 404s instead of your functional, high-quality content - right?
  • Error URLs can have SEO equity - that you can't/don't receive if you don't fix them.
  • When there is a business case for not redirecting some URLs - consider issuing a 410 “Gone” message instead. (Use cases for not redirecting things: your website got hacked and the URLs in question were spammy / malware issues! Or you purchased your domain from a 3rd party, and there used to be pages for a different business - and you actively don’t want them associated with your new business.) Googlebot tends to respect 410s much more quickly than 404s.

The issue with 404s, as we see it - is that Googlebot appears to treat them as "temporary". If there are active links to 404 pages - either on your site, or elsewhere on the web - Google can and will keep checking them to see when they will get fixed. They will continue to do this for months - and sometimes years - after the URLs stop working (even when you fix these), if you don’t resolve them.

You can validate this yourself by viewing your log files, and seeing the quantity/frequency Googlebot hits 404ing pages.

If/when you need to "vet" what gets redirected, here's our recommended process.

How To Vet & Prioritize 404s & Other Errors for Fixing

Though we generally recommend fixing 404s and other errors via 301 redirects, there are times when fixing them all is not possible, or not feasible - due to technical constraints, competing business priorities, or internal politics. (Sometimes it's just not a battle worth fighting!)

Here's how we approach this "error vetting & prioritization" process:

  • Identify error pages that website visitors experience. Use Google Analytics and/or your log files to identify these. Prioritize them by frequency, and potentially by historically-driven revenue (e.g. redirect pages that made you a lot of money in the past!)
  • Identify error pages that search engine bots experience. Use your log files to identify these. Prioritize them by frequency & longevity (in other words - is Googlebot still hitting it months and years after it was retired? If so, consider redirecting it. Googlebot is checking it repeatedly for a reason!)
  • Identify error pages that have live/functional backlinks to them. URLs with more/better backlinks are more important to redirect, so you can benefit from the power of these existing links; this is called "broken backlink building."

You might also consider using GSC's Prioritization Insights in the error report - allegedly, these are in rank priority order.

If the URLs you see in GSC don't meet any of the above qualifications, you may be safe ignoring them (and letting them keep 404ing.) Alternatively, consider utilizing a 410 (Gone) status code instead of a 404. Google tends to respect 410 errors more quickly (e.g. by deindexing them and stopping crawling them.)
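The vetting process above can be sketched as a small triage script once the three signals are gathered per URL. The sort order used here (backlinks first, then visitor hits, then bot hits) is our assumption for illustration, not a fixed rule; the `ErrorUrl` fields and sample numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ErrorUrl:
    path: str
    visitor_hits: int  # from analytics or log files
    bot_hits: int      # from log files
    backlinks: int     # from a backlink tool


def triage(errors):
    """Split error URLs into (paths to redirect, in priority order) and (paths safe to ignore or 410)."""
    def has_signal(e):
        return bool(e.visitor_hits or e.bot_hits or e.backlinks)

    to_fix = sorted(
        (e for e in errors if has_signal(e)),
        key=lambda e: (e.backlinks, e.visitor_hits, e.bot_hits),
        reverse=True,
    )
    ignorable = [e.path for e in errors if not has_signal(e)]
    return [e.path for e in to_fix], ignorable


found = [
    ErrorUrl("/never-linked/", visitor_hits=0, bot_hits=0, backlinks=0),
    ErrorUrl("/old-pricing/", visitor_hits=40, bot_hits=12, backlinks=0),
    ErrorUrl("/old-guide/", visitor_hits=5, bot_hits=3, backlinks=7),
]
print(triage(found))
```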

How Do I QA My 404 Page to Ensure It's Working As Expected?

Here’s our step-by-step process to confirm if your 404 page is working as expected:

1) Find a 404 URL & audit it manually.

  • This can be as simple as sticking a random string of letters after the domain / homepage and pressing Enter, e.g. at https://thegray.company/i-made-this-up
  • DO NOTE, however, that larger, more complex sites might have functionality differences in different site sections - in other words, the 404 page might work in some places on your site, but not in others.
  • If you have a small site, don’t worry about this. If you have a large site, at minimum, try these QA steps within each major site section, e.g. https://thegray.company/services/i-made-this-up AND https://thegray.company/clients/i-made-this-up. Monitor GSC more carefully for 404 errors and soft 404 errors, just in case.

2) Confirm that the page is issuing the correct 404 response. I recommend https://httpstatus.io/. Copy + paste the fake URL into the field and click the Check Status button (you can check multiple URLs at one time, and also test it via different user agents):

This is where you enter the URL (or list of URLs) to test them in httpstatus.io
  • Confirm that it’s issuing a 404 code. (Rinse and repeat as needed for 404s in other site sections)
  • It should not: redirect (301 or 302) to a 404, OR issue a 200 (ok) status.
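The pass/fail rules in step 2 can be written down as a small check over the hops a status checker reports for one made-up URL. A minimal sketch: the hop list is typed in by hand here (no network call is made), and `audit_404_check` is a hypothetical helper.

```python
REDIRECT_CODES = {301, 302, 307, 308}


def audit_404_check(hops):
    """hops: list of (status, url) in the order a status checker reports them."""
    first_status = hops[0][0]
    final_status = hops[-1][0]
    if first_status == 404 and len(hops) == 1:
        return "pass: 404 issued at the requested URL"
    if first_status in REDIRECT_CODES:
        if final_status == 404:
            return "fail: redirects to a 404 page"
        return "fail: redirects away instead of issuing a 404"
    if first_status == 200:
        return "fail: 200 (OK) on a missing URL (soft 404)"
    return f"check manually: unexpected status {first_status}"


fake_url = "https://thegray.company/i-made-this-up"
print(audit_404_check([(404, fake_url)]))
print(audit_404_check([(302, fake_url), (404, "https://thegray.company/404.html")]))
print(audit_404_check([(200, fake_url)]))
```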

3) Outside of this, the primary items to pay attention to are around usability: if/when the user hits the page, do you successfully help them find their way again? How can you do that better? Read more about this and best practices for 404 pages. (You can also learn more about QA for SEO.)

Should 404s be indexed?

Definitely not. Don't index error pages! That's like asking for a less ideal user experience!

(This question was contributed by Sarah McDowell, who's needed to answer this question for her clients. Sarah co-hosts WTSEO's Community Podcast - be sure to check it out!)

Contact us if you need help fixing or improving the 404 page functionality on your website - we’d love to help!
