SEO Checklist - Part 5 - On Page Optimization - Robots
Meta tag analysis for your website also includes analyzing the meta robots tag and the robots.txt file. Today I will cover what meta robots and robots.txt are, what they do, how the two differ, and the syntax each one uses.
On-page Optimization: Robots Meta Directives & Robots.txt
What are Robots Meta Directives?
Robots meta directives are pieces of code that give crawlers instructions on how to crawl or index a particular web page's content. The tag is placed in the <head> section of the web page. For example: <meta name="robots" content="noindex, nofollow">
Indexation-controlling parameters for the meta robots tag:
- Noindex: Tells a search engine not to index a page.
<meta name="robots" content="noindex">
- Index: Tells a search engine to index a page. Note that you don't need to add this meta tag; it's the default.
<meta name="robots" content="index">
- Follow: Even if the page isn't indexed, the crawler should follow all the links on a page and pass equity to the linked pages.
<meta name="robots" content="follow">
- Nofollow: Tells a crawler not to follow any links on a page or pass along any link equity.
<meta name="robots" content="nofollow">
- Noimageindex: Tells a crawler not to index any images on a page.
<meta name="robots" content="noimageindex">
- None: Equivalent to using both the noindex and nofollow tags simultaneously.
<meta name="robots" content="none"> or <meta name="robots" content="noindex, nofollow">
- Noarchive: Search engines should not show a cached link to this page on a SERP.
<meta name="robots" content="noarchive">
- Nocache: Same as noarchive, but only used by Internet Explorer and Firefox.
<meta name="robots" content="nocache">
- Nosnippet: Tells a search engine not to show a snippet (i.e. meta description) of this page on a SERP.
<meta name="robots" content="nosnippet">
- Noodp/noydir [OBSOLETE]: Noodp prevented search engines from using a page's DMOZ description as the SERP snippet for this page; noydir did the same for the Yahoo! Directory description. DMOZ was retired in early 2017, making these tags obsolete.
<meta name="robots" content="noodp"> or <meta name="robots" content="noydir">
- Unavailable_after: Search engines should no longer index this page after a particular date.
<meta name="robots" content="unavailable_after: 23-Jul-2007 18:00:00 EST">
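To see how these parameters combine in practice, here is a minimal Python sketch that reads the meta robots tags from a page and reports whether indexing and link following are allowed. The class name RobotsMetaParser and the function summarize are made up for this illustration, and the logic only looks at noindex, nofollow and none; real search engines apply their own, more detailed rules.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Hypothetical helper: collects directives from <meta name="robots"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None (e.g. <meta name>), so default to "".
        attributes = {key.lower(): (value or "") for key, value in attrs}
        if attributes.get("name", "").lower() != "robots":
            return
        # Directives are comma-separated inside the content attribute.
        for item in attributes.get("content", "").split(","):
            item = item.strip().lower()
            if item:
                self.directives.add(item)


def summarize(html):
    """Return index/follow flags, assuming index and follow are the defaults."""
    parser = RobotsMetaParser()
    parser.feed(html)
    found = parser.directives
    blocked_index = "noindex" in found or "none" in found
    blocked_follow = "nofollow" in found or "none" in found
    return {"index": not blocked_index, "follow": not blocked_follow}


page = '<head><meta name="robots" content="noindex, follow"></head>'
print(summarize(page))  # {'index': False, 'follow': True}
```

A page with no meta robots tag at all comes back as index and follow both allowed, which matches the note above that index is the default.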
The general structure of the tag is:
<meta name="robots" content="[PARAMETER]">
This is the standard form. You can also give directives to a specific crawler by replacing "robots" with the name of a specific user-agent. The structure then looks like this:
<meta name="googlebot" content="[DIRECTIVE]">
For example: <meta name="googlebot" content="nofollow">
Want to use more than one directive on a page? Separate them with commas inside a single content attribute. For example: <meta name="robots" content="noimageindex, nofollow, nosnippet">
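If you generate these tags from a template or script, a small helper keeps the syntax consistent. The sketch below is only an illustration: the function name robots_meta_tag and its crawler parameter are assumptions of this example, not part of any library.

```python
from html import escape


def robots_meta_tag(directives, crawler="robots"):
    """Build one meta tag; all directives share a single content attribute."""
    content = ", ".join(directives)
    # escape() guards against stray quotes breaking the attribute values.
    return f'<meta name="{escape(crawler)}" content="{escape(content)}">'


print(robots_meta_tag(["noimageindex", "nofollow", "nosnippet"]))
# <meta name="robots" content="noimageindex, nofollow, nosnippet">
print(robots_meta_tag(["nofollow"], crawler="googlebot"))
# <meta name="googlebot" content="nofollow">
```

Passing a list and joining it in one place avoids the common mistake of giving each directive its own pair of quotes.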
What is a Robots.txt file?
A robots.txt file gives bots suggestions for how to crawl a website's pages. Robots meta directives, by comparison, provide firmer instructions on how to crawl and index a page's content. Robots.txt is a separate file that is placed in the site's root directory, not inside any sub-folder.
The syntax used in a robots.txt file
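Because the file always sits at the root, its address can be worked out from any page URL on the same site. Here is a small Python sketch of that idea; the function name robots_txt_url and the example.com address are placeholders chosen for this illustration.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url):
    """Keep the scheme and host, replace the rest with /robots.txt."""
    parts = urlsplit(page_url)
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


print(robots_txt_url("https://www.example.com/blog/seo/page.html?x=1"))
# https://www.example.com/robots.txt
```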
SYNTAX | WHY USED |
---|---|
User-agent: * | Applies the rules that follow to all web crawlers |
User-agent: Googlebot | Applies the rules that follow to one specific web crawler (here, Googlebot) |
Disallow: / | Blocking the matched crawlers from the entire site |
Disallow: /cgi-bin/ | Blocking a particular folder named cgi-bin |
Disallow: /tmp/ | Blocking a particular folder named tmp |
Disallow: /~joe/ | Blocking a particular folder named ~joe |
Disallow: /example-subfolder/blocked-page.html | Blocking the matched crawlers from a specific web page |
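To check how a set of rules like these would be read, you can use Python's standard urllib.robotparser module. The sketch below combines the rows from the table into one sample file; the crawler name ExampleBot and the example.com URLs are made up for illustration, and real crawlers may interpret edge cases differently from this parser.

```python
from urllib.robotparser import RobotFileParser

# Sample robots.txt built from the table rows; ExampleBot is hypothetical.
ROBOTS_TXT = """\
User-agent: ExampleBot
Disallow: /

User-agent: *
Disallow: /cgi-bin/
Disallow: /tmp/
Disallow: /~joe/
Disallow: /example-subfolder/blocked-page.html
"""


def build_parser(text):
    """Parse robots.txt text directly, without fetching anything."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


rules = build_parser(ROBOTS_TXT)
site = "https://www.example.com"

print(rules.can_fetch("Googlebot", site + "/blog/post.html"))   # True
print(rules.can_fetch("Googlebot", site + "/tmp/report.html"))  # False
print(rules.can_fetch("ExampleBot", site + "/blog/post.html"))  # False
```

Googlebot falls under the User-agent: * group here because no group names it, while ExampleBot is blocked everywhere by its own Disallow: / line.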