Orphan Pages & SEO: What are Orphan Pages & How to Find them

Most web admins handle major SEO aspects like publishing quality content, building strong internal and external links, ensuring fast load times and a seamless user experience.

But one aspect of SEO that can bring in more organic traffic when handled properly is orphan pages. Many web admins still ignore them, because finding and fixing orphan pages is a tedious, time-consuming process.

Today, we will discuss everything about orphan pages: what they are, how they affect SEO, how to find orphaned pages, and how to fix them.

What is an orphan page?

A web page with no links pointing to it from any other section or page of the same website is called an orphan page. With no internal links leading to them, website visitors and search engine crawlers have little chance of finding these pages. Often, the only way to reach an orphan page is to type its exact URL into the address bar.

Orphaned Pages & SEO: How do orphan pages affect SEO?

Search engine crawlers usually find new pages in two ways:

● Through internal or inbound links pointing to them
● Through the URLs listed in the XML sitemap

Since orphan pages don’t have any internal links, search engines are unlikely to know that these pages exist unless they are listed in the XML sitemap. As a result, they often do not get crawled or indexed, and rarely get traffic, even if they have top-quality content and perfect on-page SEO. Once you know which pages on your website are orphans, fixing them is easy. The actual challenge lies in finding them first.
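To make the second discovery route concrete, here is a minimal sketch that reads the URLs out of an XML sitemap using Python's standard library. The sitemap content and the function name `sitemap_urls` are made up for illustration; the namespace is the one defined by the sitemaps.org protocol.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(xml_text: str) -> list[str]:
    """Return every <loc> value found in a sitemap document."""
    root = ET.fromstring(xml_text)
    return [
        loc.text.strip()
        for loc in root.findall(".//sm:loc", SITEMAP_NS)
        if loc.text
    ]


# Hypothetical sitemap content, for illustration only.
EXAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/old-landing-page</loc></url>
</urlset>"""

for url in sitemap_urls(EXAMPLE_SITEMAP):
    print(url)
```

A page that shows up in this list but is never reached by following links is an orphan that search engines may still be able to discover through the sitemap.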

How to find orphan pages on your website? How to fix them?

Before we go into the details, here’s the truth: many of the orphan page checker tools available today cannot guarantee that they will find all the orphan pages on your website.

Why? The real challenge is to find hidden web pages.

Since orphan pages don’t have internal links, they are hidden from crawlers that discover pages by following links. A tool can only find them reliably if it has access to your website backend: it can then get a list of all your website URLs and conduct a gap analysis between the crawler data and the full URL list. The result will reveal the orphan pages.

In simple terms, the proper way to find orphan pages is to conduct a gap analysis between all your website URLs and the URLs that crawlers can access. Let’s see how you can do it manually, step-by-step.
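As a rough illustration of the idea, the gap analysis is a set difference: everything in the full URL list that the crawler did not reach. The sketch below uses made-up URLs and a hypothetical helper name, `find_orphans`.

```python
def find_orphans(all_urls, crawled_urls):
    """Return URLs that exist on the site but were not reached by the crawler."""
    return sorted(set(all_urls) - set(crawled_urls))


# Hypothetical data, for illustration only.
all_urls = [
    "https://example.com/",
    "https://example.com/blog",
    "https://example.com/old-landing-page",
]
crawled_urls = [
    "https://example.com/",
    "https://example.com/blog",
]

print(find_orphans(all_urls, crawled_urls))
```

The manual steps that follow do the same thing in a spreadsheet.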

Step 1: Identify all crawlable pages on your website using a crawler

The first step is to create a list of all the pages on your website that crawlers can access. You will need a crawler to do this.

You can use ScreamingFrog’s SEO spider, or any other SEO spider, to crawl and find all crawlable pages on your website. The goal is simply to create a list and export it into a spreadsheet, so any crawler will do.

Here’s how you can do it with ScreamingFrog:

On the ScreamingFrog desktop app, enter your website homepage URL and click on ‘Start’ to crawl your entire website. Make sure it is set to crawl only indexable pages. You can check it in configuration→robots.txt→settings.

You may have to wait for a few minutes to complete the crawl, depending on how large your website is. Once the spider has finished crawling, it will display all the crawlable links on your website.

Now, the ‘Internal’ tab may have image URLs and other resource URLs that we don’t need in this case. You can click on ‘Page Titles’ to see only a list of indexable pages.

Once you have the list, export them into a spreadsheet by clicking on the ‘Export’ button.

Again, we only need the URLs for this, so you can delete all the other columns in the spreadsheet. The final result is a single column of URLs.

You may name this column ‘Crawlable pages’ for easy identification.
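If you would rather trim the export with a script than by hand, a sketch like the following keeps only the URL column. It assumes the crawler export is a CSV whose URL column is called `Address`; check the header in your own file and pass a different `url_column` if it differs. The sample rows are invented.

```python
import csv
import io


def crawlable_pages(csv_file, url_column="Address"):
    """Read a crawler export and return only the non-empty values of the URL column."""
    reader = csv.DictReader(csv_file)
    return [row[url_column].strip() for row in reader if row.get(url_column)]


# Hypothetical export content; a real export has many more columns and rows.
EXAMPLE_EXPORT = """Address,Status Code,Title 1
https://example.com/,200,Home
https://example.com/blog,200,Blog
"""

pages = crawlable_pages(io.StringIO(EXAMPLE_EXPORT))
print(pages)
```

With a real file, you would pass `open("export.csv", newline="", encoding="utf-8")` instead of the `io.StringIO` object, where `export.csv` stands for whatever name you gave the export.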

Step 2: List all pages on the website

Now, you can find all pages on a website through different methods. The ideal and fail-proof method is to ask your website development team to export a list of all URLs from the server-side.

Since server configurations can vary from website to website, we are going to explain an alternative method—using Google Analytics.

If the orphan pages have ever been visited at least once, their records are there in Google Analytics or the web analytics tool you use. Here’s how to find all pages on a website using Google Analytics:

In the left sidebar, go to Behavior → Site Content → All Pages.

Since orphan pages aren’t crawlable and have no inbound links, it is likely that they’ve been visited only a few times.

You can click on ‘Unique Pageviews’ to see the least visited pages first. It’ll rearrange the list in ascending order, from the least visited pages to the most visited pages.

To leave no margin for errors, you can choose a long date range at the top right corner, preferably from a date before you installed Google Analytics.

Google Analytics only lists ten pages by default, so you need to change it to the highest number to see all pages of the website.

In the bottom right corner, you’ll see a ‘Show rows’ drop-down menu. Click on it and select the highest possible number—in this case, 5000.

We’re going to export this list and add it to the spreadsheet now. But remember, if you have more than 5000 URLs on your website, you’ll have to repeat the next step multiple times to view all pages of the website.

Go to the top right corner and click on the ‘Export’ button. You can export it either to Google sheets, Excel, or a CSV file.

Now, copy all the URLs from the exported file to the crawlable page spreadsheet you created earlier.

Analytics lists only the page path (for example, /blog/my-post) and leaves out the root domain, so you have to add it manually.

To do this, add a new column to the left of ‘Analytics URLs’ and paste your homepage URL, without a trailing slash, in all rows.

Next, combine these two columns into a new column named ‘Final Analytics URL’ using the concat formula (=concat(B2,C2)), and drag it down to do the same for all rows.
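The same join can be sketched in Python. The helper below is hypothetical and simply makes sure exactly one slash sits between the homepage URL and the page path, which avoids results such as `https://example.com//blog` when the homepage URL was pasted with a trailing slash.

```python
def full_url(homepage: str, page_path: str) -> str:
    """Join a homepage URL and an analytics page path with exactly one slash between them."""
    return homepage.rstrip("/") + "/" + page_path.lstrip("/")


# Hypothetical values, for illustration only.
print(full_url("https://example.com/", "/blog/my-post"))
print(full_url("https://example.com", "/"))
```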

Step 3: Identify orphan URLs through gap analysis

Now that you have the list of crawlable pages and the list of all pages on your website, the only thing left to do is find the pages that appear in the full list but not in the crawlable list.

In other words, conduct a gap analysis of the columns ‘Crawlable pages’ and ‘Final Analytics URL.’ Doing this manually will take a significant amount of time, so we will use a formula to automate it.

To do this, you can use the ‘match’ function, which checks whether each URL in the ‘Final Analytics URL’ list is also found in the ‘Crawlable pages’ list.

In our example spreadsheet, the formula is ‘=match(D2,$A$2:$A$11,0)’, where column D holds the final analytics URLs and cells A2 to A11 hold the crawlable pages. Compare the cell addresses in your spreadsheet to our example and change the formula accordingly.

The match function returns #N/A whenever it cannot find a URL in the crawlable list, so the analytics URLs corresponding to the cells marked #N/A are orphan URLs.

To see only orphan URLs, you can create a filtered view by selecting column E and going to Data→ Create a filter. Click on the filter icon in the first cell and deselect all except #N/A. Click OK.

Now this will show you only orphan URLs and filter out all other URLs.
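For larger sites, the same lookup can be scripted. The sketch below mirrors the match-and-filter approach, with one addition that is our own assumption rather than part of the spreadsheet method: it normalizes URLs before comparing them (lowercase host, no trailing slash, no query string or fragment), so that `/blog` and `/blog/` are not reported as different pages. Drop or adjust `normalize` if your site treats such variants as distinct pages. All URLs are made up.

```python
from urllib.parse import urlsplit


def normalize(url):
    """Reduce a URL to scheme, lowercase host, and path without a trailing slash."""
    parts = urlsplit(url.strip())
    path = parts.path.rstrip("/") or "/"
    return f"{parts.scheme}://{parts.netloc.lower()}{path}"


def match_report(analytics_urls, crawlable_urls):
    """Pair each analytics URL with 'found' or '#N/A', like the MATCH column."""
    crawlable = {normalize(u) for u in crawlable_urls}
    return [
        (u, "found" if normalize(u) in crawlable else "#N/A")
        for u in analytics_urls
    ]


# Hypothetical data, for illustration only.
crawlable_urls = ["https://example.com/", "https://example.com/blog/"]
analytics_urls = [
    "https://example.com/blog",
    "https://example.com/hidden-offer?utm_source=mail",
]

report = match_report(analytics_urls, crawlable_urls)
orphans = [url for url, status in report if status == "#N/A"]
print(orphans)
```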

Step 4: Fix orphan URLs

The next step is to fix the orphan URLs on your website. To do this, log in to your website admin panel, identify the pages corresponding to the orphan URLs, and review what each page is about.

Next, look through the non-orphan pages on your website to identify the pages that have related content. You can edit those pages and insert links to the orphan pages to fix this issue.

Alternatively, if you are using WordPress CMS, you can use the Link Whisper plugin to fix all orphan URLs in a few clicks.

Frequently asked questions


What are orphan URLs?

Orphan URLs are web pages that do not have any links pointing to them from other pages on your website. Users and search engines can’t access these pages without knowing the exact URL of the web page.

Why are orphaned pages bad?

Since they have no links pointing to them, they rarely get any visits, and they rarely get indexed by search engines. As a result, orphan pages deliver little SEO value, even if the content is of high quality.

Where are orphan pages in Google Search Console?

Google Search Console does not provide a list of orphan pages. It only reports on URLs Google already knows about, which you can see by going to ‘Coverage’ or ‘Links’ in the left sidebar.

Orphan pages that are listed in your sitemap may appear there, but those that are not will be missing, so it is not ideal to rely on Google Search Console to find orphan pages.

Can Google find orphan pages?

If the orphan pages are included in the XML sitemap, Google can find those pages. If they’re not included in the sitemap and nothing links to them, Google has no way of knowing that those pages exist.

What is orphaned content in WordPress?

In WordPress, content that doesn’t have any links pointing to it from other pages or posts on the same website is called orphaned content.

How to fix orphan pages in WordPress?

To fix WordPress orphaned content, you can manually add links to orphan pages or use a plugin like Link Whisper.

What is the difference between an orphan page and a dead-end?

If a web page doesn’t have any internal links from other pages on the website, it is called an orphan page. When a web page doesn’t link to any internal or external web page, it is called a dead-end.
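The difference can be sketched with a small link map, where each page is listed together with the pages it links to. An orphan page is one that no other page links to; a dead-end is one whose own list of links is empty. The pages and the `classify` helper are invented for illustration.

```python
def classify(links):
    """Given {page: [pages it links to]}, return (orphans, dead_ends)."""
    linked_to = {target for targets in links.values() for target in targets}
    orphans = sorted(page for page in links if page not in linked_to)
    dead_ends = sorted(page for page, targets in links.items() if not targets)
    return orphans, dead_ends


# Hypothetical site structure, for illustration only.
links = {
    "/": ["/blog", "/pricing"],
    "/blog": ["/", "/pricing"],
    "/pricing": [],          # links to nothing: a dead-end
    "/old-offer": ["/"],     # nothing links to it: an orphan
}

orphans, dead_ends = classify(links)
print(orphans, dead_ends)
```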