How to hide a site from indexing with robots.txt

Post by **Mimaktsa10** » Thu Jan 30, 2025 5:16 am

There are three ways to prohibit indexing of sections or pages of a website, from the most targeted to the most general:

The noindex tag and the rel="nofollow" attribute are two entirely different code elements with different purposes, but both are equally useful to SEO specialists. How search engines process them has become an almost philosophical question, but the facts are these: noindex hides part of a page's text from robots (it is not part of the HTML standard, but Yandex does honor it), while nofollow tells robots not to follow a link or pass its weight (it is a standard link type recognized by all major search engines).

The robots meta tag, placed on a specific page, affects that page only. Below we look in more detail at how to prohibit indexing of a document and following of the links in it. The meta tag is fully valid, and search engines take the specified values into account (or try to). Moreover, when the robots.txt file in the site's root directory and a page's meta tag disagree, Google gives priority to the meta tag, provided the page is open to crawling so that the robot can read the tag at all.

robots.txt: this method is fully valid and is supported by all search engines and by other bots on the Internet. However, its directives are not always treated as binding (Google's preference for the meta tag is mentioned above). The indexing rules in the file can cover the site as a whole or individual pages, directories, and sections.
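
For reference, the robots meta tag described above is usually written in the page's head as <meta name="robots" content="noindex, nofollow">. The following is a minimal sketch, using only Python's standard library, of how one might check which directives a page declares. The sample HTML and the names RobotsMetaParser and robots_directives are made up for illustration and are not part of any search engine's tooling.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None (e.g. <meta name>), so default to "".
        attr_map = {name.lower(): (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() != "robots":
            return
        for part in attr_map.get("content", "").split(","):
            part = part.strip().lower()
            if part:
                self.directives.add(part)


def robots_directives(html):
    """Return the set of robots meta directives found in an HTML string."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


# Hypothetical page used only to demonstrate the check.
SAMPLE_PAGE = """
<html><head>
<title>Order form</title>
<meta name="robots" content="noindex, nofollow">
</head><body>...</body></html>
"""

directives = robots_directives(SAMPLE_PAGE)
print("noindex" in directives)   # True
print("nofollow" in directives)  # True
```

A page without the tag yields an empty set, which means no page-level restriction is declared.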

Let's look at examples of blocking the indexing of a whole site and of its parts.

Blocking indexing of the entire site with robots.txt
There are many reasons to keep spiders from indexing a website: it may still be under development, be in the middle of a redesign or upgrade, or be an experimental platform not intended for users.

With robots.txt you can close a site to all search engines, to a single robot, or to all robots except one.

Do not allow indexing by any bot

User-agent: *
Disallow: /

Do not allow indexing by a single bot

User-agent: YandexImages
Disallow: /

Allow indexing by only one bot

User-agent: *
Disallow: /

User-agent: Yandex
Allow: /
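
Before uploading rules like the three sets above, it can help to test them locally. The sketch below does this with urllib.robotparser from Python's standard library. The helper name can_fetch and the example.com address are placeholders, and Python's parser is only an approximation: it does not necessarily interpret every file exactly as a given search engine would.

```python
from urllib.robotparser import RobotFileParser


def can_fetch(robots_txt, user_agent, url):
    """Parse robots.txt text and report whether user_agent may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


BLOCK_ALL = """\
User-agent: *
Disallow: /
"""

BLOCK_ONE_BOT = """\
User-agent: YandexImages
Disallow: /
"""

ALLOW_ONLY_YANDEX = """\
User-agent: *
Disallow: /

User-agent: Yandex
Allow: /
"""

URL = "https://example.com/catalog/"  # placeholder address

print(can_fetch(BLOCK_ALL, "Googlebot", URL))          # False
print(can_fetch(BLOCK_ONE_BOT, "YandexImages", URL))   # False
print(can_fetch(BLOCK_ONE_BOT, "Googlebot", URL))      # True
print(can_fetch(ALLOW_ONLY_YANDEX, "Yandex", URL))     # True
print(can_fetch(ALLOW_ONLY_YANDEX, "Googlebot", URL))  # False
```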

How to block indexing of specific pages with robots.txt
If the site is small, you are unlikely to need to hide any pages (there is little to hide on a business-card site), but large portals that contain a substantial amount of service information cannot do without restrictions. The following should be closed to robots:

admin panel;

service directories;

search on the site;

personal account;

registration forms;

order forms;

product comparison;

favorites;

basket;

captcha;

pop-ups and banners;

session identifiers.
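
A list like this usually ends up as a series of Disallow lines. Below is a minimal sketch of generating them in Python. The function name build_robots_txt and every path in SERVICE_PATHS are hypothetical, since the real paths depend on the CMS and the site structure.

```python
def build_robots_txt(disallowed_paths, user_agent="*"):
    """Return robots.txt text with one Disallow line per path."""
    lines = [f"User-agent: {user_agent}"]
    for path in disallowed_paths:
        # Rule paths are matched from the site root, so they start with "/".
        if not path.startswith("/"):
            path = "/" + path
        lines.append(f"Disallow: {path}")
    return "\n".join(lines) + "\n"


# Hypothetical paths; replace them with the ones the site really uses.
SERVICE_PATHS = [
    "/admin",
    "/search",
    "/account",
    "/register",
    "/order",
    "/compare",
    "/favorites",
    "/cart",
]

print(build_robots_txt(SERVICE_PATHS))
```

The printed text can be saved as robots.txt in the root directory of the site.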

Outdated news, past calendar events, expired promotions and special offers are so-called junk pages that are best hidden. On informational sites, outdated content is also best closed off so that search engines do not rate the site down. Try to update content regularly, and you won't have to play hide-and-seek with search engines.

Examples of blocking robots from indexing:

Specific page

User-agent: *
Disallow: /contact.html

Specific section

User-agent: *
Disallow: /catalog/

The entire web resource, except for one section

User-agent: *
Disallow: /
Allow: /catalog

The entire section, except for one subsection

User-agent: *
Disallow: /product
Allow: /product/auto

Site search

User-agent: *
Disallow: /search

Admin panel

User-agent: *
Disallow: /admin
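
The examples above that combine Disallow with Allow depend on how a crawler resolves conflicting rules. RFC 9309 and Google's documentation describe precedence by the most specific (longest) matching path, with Allow winning a tie. The following is a simplified sketch of that rule in Python. The function is_allowed is illustrative only, it handles plain path prefixes without the * and $ wildcards, and parsers that apply the first matching rule in file order may read the same files differently.

```python
def is_allowed(rules, path):
    """Decide access by the longest matching rule; Allow wins a tie.

    rules is a list of (directive, prefix) pairs for one user-agent group.
    Wildcards (* and $) are deliberately not handled in this sketch.
    """
    best_length = -1
    allowed = True  # no matching rule means the path may be crawled
    for directive, prefix in rules:
        if not prefix or not path.startswith(prefix):
            continue
        is_allow = directive.lower() == "allow"
        longer = len(prefix) > best_length
        tie_for_allow = len(prefix) == best_length and is_allow
        if longer or tie_for_allow:
            best_length = len(prefix)
            allowed = is_allow
    return allowed


# Rule sets taken from the examples above.
SITE_EXCEPT_CATALOG = [("Disallow", "/"), ("Allow", "/catalog")]
SECTION_EXCEPT_AUTO = [("Disallow", "/product"), ("Allow", "/product/auto")]

print(is_allowed(SITE_EXCEPT_CATALOG, "/catalog/chairs"))  # True
print(is_allowed(SITE_EXCEPT_CATALOG, "/contact.html"))    # False
print(is_allowed(SECTION_EXCEPT_AUTO, "/product/auto/1"))  # True
print(is_allowed(SECTION_EXCEPT_AUTO, "/product/boats"))   # False
print(is_allowed(SECTION_EXCEPT_AUTO, "/about"))           # True
```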