How to remove URLs from Google Search

There are several ways to remove URLs from Google Search, but unfortunately, there is no one-size-fits-all solution. The right approach depends on the specific case, and each situation has to be assessed on its own. In this article, we will present several ways to remove URLs from Google Search.

For starters, it’s important to know that removing URLs from Google Search must be done correctly – otherwise, you will not achieve the desired result, and your URLs will not be removed from the Google index. What’s worse, improper removal can have a negative effect on your SEO.

 

 

How to check if a URL is indexed

A common first step in SEO analysis is to check whether the site is indexed. For this, we often use the site: search operator. Although it is useful for identifying potentially problematic parts of a website, keep in mind that it is not reliable enough to identify indexed pages. It can show pages that Google knows about, but those pages do not necessarily appear in the regular SERP, that is, when you search without the site: operator.
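For example, a site: query looks like this (example.com is just a placeholder for your own domain):

site:example.com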
 
It is also important to keep in mind that if you search one domain with the site: operator, it can show you content that is related to that domain but actually lives on other domains. This can be the case with, say, your old domain. For example, if you have changed domains, a regular search will take you to your new domain, but if you use the site: operator with your old domain name, it can display the content, title, and other elements from your new domain alongside the old domain name. This can be confusing and make you think the migration to the new domain did not go well, and tempt you to start removing or blocking URLs from the old domain. That is by no means a good solution, and it can only harm your new domain.
 
To avoid possible confusion, a better solution is to use Google Search Console and its Index Coverage report, or, for individual URLs, the URL Inspection Tool.
 
These two tools tell you whether a page or site has been indexed and provide additional information about how Google treats it.
 

Options for removing URLs from Google Search

Delete content
If you remove a page and have it return a 404 (not found) or 410 (gone) status code, that page will be removed from the index shortly after Google crawls it again. Until it is removed, it may still appear in search results, and even when the page itself is no longer available, a cached version may remain temporarily accessible.
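For illustration, this is roughly what the response for a removed page might look like (a minimal sketch; the exact headers depend on your server):

HTTP/1.1 410 Gone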
 
Noindex
A noindex meta robots tag or an x-robots-tag header response tells the search engine to remove the page from the index. The meta robots tag works for HTML pages, while the x-robots-tag header works for pages as well as other file types, such as PDFs. For these tags to be seen, the search engine must be able to crawl those pages, so it is important that they are not blocked in the robots.txt file. In addition, keep in mind that removing pages from the index can prevent the consolidation of links and other signals.
 
Example of a meta robots noindex tag
 
<meta name="robots" content="noindex">
 
Example of an x-robots noindex tag in a header response
 
HTTP/1.1 200 OK
X-Robots-Tag: noindex

Disabling access

If you want a page to be available to some users but not to the search engine, the setup will typically involve one of the following:
 
Some kind of login system
HTTP authentication (where a password is required for access)
IP whitelisting (which only allows certain IP addresses to access the page)
This type of setup is best for an internal network, members-only access, or a site under development. It allows certain groups of users to access the page while keeping it unavailable to the search engine, so it will not be indexed.
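As an illustration of the HTTP authentication option, this is roughly what the server's response to an unauthenticated visitor might look like (a minimal sketch; the realm name is just an example):

HTTP/1.1 401 Unauthorized
WWW-Authenticate: Basic realm="Restricted area"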

URL Removal Tool

At first glance, it appears to be a tool that removes URLs, when in fact it only temporarily hides content. Google will still crawl and see your content, but the pages will not appear in the search results. This temporary effect lasts six months at Google. Bing has a similar tool, where the hiding effect lasts three months.
 
We recommend that you use these tools only in some extreme cases, such as data leaks or other security issues.
 

Canonicalization

When you have multiple versions of a page and want to consolidate signals (such as links) into one version, what you need is some form of canonicalization. This is primarily done to prevent duplicate content.
 
You have several canonicalization options available:
 
Canonical tag. The canonical tag defines which URL is the canonical version of the page, i.e. the one you want to appear in search. If the pages are duplicates or have very similar content, this is the recommended setup. If they are too different, the canonical link can be ignored, because it is a hint and not a directive.
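For example, a canonical tag that points a duplicate to its preferred version might look like this (the URL is a placeholder):

<link rel="canonical" href="https://www.example.com/page/">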
 
Redirections. A redirect guides both the user and the search bot from one page to another. The most commonly used is certainly the 301 (permanent) redirect. It tells the search engine that you want the final (destination) URL to appear in the search results and that signals should be consolidated there. In contrast, a 302 (temporary) redirect tells the search engine that you want the original URL to stay in the index and signals to be consolidated in it.
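For illustration, the response headers of a 301 redirect might look roughly like this (the destination URL is a placeholder):

HTTP/1.1 301 Moved Permanently
Location: https://www.example.com/new-page/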
 
URL parameter management. A parameter is appended to the end of the URL, after a question mark, for example plus.rs?this=parameter. Google's URL parameter tool allows us to tell it how to treat a URL with a specific parameter. For example, you can define whether the parameter changes the content of the page or is only used to track usage.
 

What are the priorities when removing a URL

If you have several pages (URLs) to remove from the Google index, it is important that you understand in what order it should be done, i.e. what the order of priority is:
 
Top priority: These pages are most often linked to potential security issues or data confidentiality. This includes pages that contain personal information, customer information, and the like.
 
Medium priority: This usually includes content intended for a specific group of users: a company internal network (intranet), an internal portal, content intended only for employees, as well as test or development environments.
 
Low priority: These pages usually include content that is duplicated in some form, for example pages served at multiple different URLs, URLs with parameters, as well as pages that come from a test or development environment.
 

The most common mistakes to avoid

Here are some of the most common mistakes made when removing URLs, which should be avoided because they either do not help or can even harm the SEO of your site.
 
Noindex in the robots.txt file
Although Google previously unofficially supported noindex in the robots.txt file, it was never an official standard, and that support has now been formally dropped. In other words, sites that used noindex in a robots.txt file were doing so incorrectly.
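For reference, this is roughly what the formerly unofficially supported (and now ignored) directive looked like in robots.txt; the path is just an example:

User-agent: *
Noindex: /private-page/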
 
Blocking crawling in the robots.txt file
Crawling is not the same as indexing. Even if Google is blocked from crawling a page, if there are internal or external links to that page, Google can still index it. Google will not know what is on that page, because it cannot crawl it, but it will know that the page exists and may even display its title in the SERP, based on signals such as the anchor text of the links that lead to it.
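For clarity, this is what such a crawl block looks like in robots.txt (the path is just an example); it prevents crawling, but it does not guarantee removal from the index:

User-agent: *
Disallow: /private-page/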
 
Nofollow
This is often confused with the noindex directive and is even mistakenly used at the page level in an attempt to keep the page out of the index. Nofollow is a recommendation (hint), so even though it originally stopped Google from crawling the pages and individual links it was applied to, that is no longer the case. Google can now crawl those links if it wants to. Nofollow was also used on individual links to try to stop Google from crawling the specific pages they point to. As with the above, this is no longer the case, as nofollow is just a recommendation.
 
Our recommendation is to check whether nofollow is present on your site and, if so, whether it was accidentally added instead of noindex.
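For reference, these are the page-level and link-level forms of nofollow (the URL is a placeholder); note that neither of them removes a page from the index:

<meta name="robots" content="nofollow">
<a href="https://www.example.com/page/" rel="nofollow">Link</a>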
 
Noindex and canonical together
These signals work in opposite ways. While noindex tells the search engine to remove a page from the index, canonical tells it to index the page, only under another version of it. In most cases this will result in consolidation, i.e. work in favor of the canonical, because Google will usually ignore the noindex and use the canonical as the main signal. However, keep in mind that this may not always be the case: since Google uses its own algorithms, the noindex tag can still be counted as a signal, and in that case pages that carry both tags will not consolidate properly.
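For illustration, this is the conflicting combination described above, which is best avoided on a single page (the URL is a placeholder):

<meta name="robots" content="noindex">
<link rel="canonical" href="https://www.example.com/preferred-page/">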
 

Image removal

The easiest way to remove images from Google Search is to use the robots.txt file. Although, as already mentioned, Google has dropped its unofficial support for removing pages via noindex in robots.txt, for images it is enough to disallow crawling of the ones you want removed.
 
For a single image:
User-agent: Googlebot-Image
Disallow: /images/dogs.jpg
 
For all images:
User-agent: Googlebot-Image
Disallow: /
 
