The WordPress robots.txt file is a text file located in the root of your web directory. Setting it up is one of the most confusing parts of SEO, and a lot of webmasters never make use of it. Search engine bots read the robots.txt file and follow its guidelines to understand which parts of your blog to crawl and which parts to leave alone. In short, it helps you keep search engine bots from crawling admin pages and other sensitive parts of your blog. A wrongly configured robots.txt file can completely remove your blog from search engines, which is why it is such a confusing part of SEO. Robots.txt does not only work in WordPress; it is available on every platform, such as Drupal, Joomla, ZenCart etc. It resides at the root of the domain, for example www.howupdates.com/robots.txt
What Is the WordPress Robots.txt File?
The robots.txt file is located in the root of your blog. When a search engine bot or spider comes to your site for indexing, it reads the robots.txt file first. This file helps search engine bots understand which parts to crawl and which to avoid. If you can’t find a robots.txt file in your host root, you can simply create a new file named robots.txt and edit it. Or, if you can’t do that, open Notepad on your PC and save the text file as rob
Robots.txt Tags/Commands Details
Once you have found your robots.txt file, you have to write the commands (tags) that search engine bots will follow. If you want quality traffic on your blog, you should allow the bots of every search engine to crawl it. Each group of rules starts with a User-agent line that names the bot the rules are meant for: type User-agent: * to address every search engine bot, or User-agent: googlebot to address only Google's crawler.
User-agent: *
User-agent: googlebot
User-agent: bingbot
User-agent: * The rules written below this tag apply to the bots of every search engine.
User-agent: googlebot The rules written below this tag apply only to Google's bot.
User-agent: bingbot The rules written below this tag apply only to Bing's bot.
After choosing a bot, put an Allow command below its User-agent line to let it crawl your blog, or a Disallow command to keep it out. For example, suppose I want to allow Googlebot to crawl my blog but I don't want Bing to crawl it. The tags I would use are below.
User-agent: googlebot
Allow: /
User-agent: bingbot
Disallow: /
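If you would like to check rules like these before uploading them, one option is Python's built-in urllib.robotparser module. The snippet below is only a minimal sketch: it feeds the example rules to that parser and asks whether each bot may fetch a made-up path (/any-post/ is an assumption for illustration). Real search engines use their own parsers, so treat the result as a sanity check rather than a guarantee.

```python
from urllib.robotparser import RobotFileParser

# The example rules from this article: Google allowed, Bing blocked.
RULES = """\
User-agent: googlebot
Allow: /
User-agent: bingbot
Disallow: /
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())  # parse from text, no network access needed

# "/any-post/" is a hypothetical path used only for this check.
google_ok = parser.can_fetch("googlebot", "/any-post/")
bing_ok = parser.can_fetch("bingbot", "/any-post/")

print("googlebot allowed:", google_ok)
print("bingbot allowed:", bing_ok)
```

With this parser, a bot that is not named in any User-agent group is treated as allowed, because no rule applies to it.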
So, from the googlebot and bingbot example, it is easy to see that to allow a crawler you type
Allow: /
and to disallow a crawler you type
Disallow: /
and that's it. You should also put the location of your blog's sitemap in the robots.txt file. Simply type Sitemap: http://www.yourdomain.com/sitemap.xml or the location where your sitemap exists. If you have more than one sitemap, follow the example below.
Sitemap: http://www.yourdomain.com/sitemap1.xml
Sitemap: http://www.yourdomain.com/sitemap2.xml
Sitemap: http://www.yourdomain.com/sitemap3.xml
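To confirm that every Sitemap line is picked up, here is a small sketch using Python's urllib.robotparser. It assumes Python 3.8 or newer, where the parser has a site_maps() method, and it reuses the placeholder yourdomain.com URLs from the example above.

```python
from urllib.robotparser import RobotFileParser

# Placeholder sitemap entries copied from the example in this article.
SITEMAP_LINES = """\
Sitemap: http://www.yourdomain.com/sitemap1.xml
Sitemap: http://www.yourdomain.com/sitemap2.xml
Sitemap: http://www.yourdomain.com/sitemap3.xml
"""

parser = RobotFileParser()
parser.parse(SITEMAP_LINES.splitlines())

# site_maps() returns a list of sitemap URLs, or None if there are none.
sitemaps = parser.site_maps()

for url in sitemaps or []:
    print(url)
```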
Simply put each sitemap on its own line, so the crawler knows where your sitemaps are and can easily find your links. Next, you have to stop search crawlers from crawling the admin area and other sensitive directories of your blog. For this purpose, use the tag
Disallow: /directory name/
Disallow: /cgi-bin/
Disallow: /wp-admin/
Disallow: /wp-content/
Disallow: /cgi-bin/ This tag stops crawlers from crawling the cgi-bin directory.
Disallow: /wp-admin/ This tag stops crawlers from crawling the wp-admin directory.
Disallow: /wp-content/ This tag stops crawlers from crawling the wp-content directory.
So if you want to disallow any directory, simply follow the guidelines above and type the directory name. You can also disallow a single file with the same rule.
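The directory rules above can be tried out with a short sketch like the one below, again using Python's urllib.robotparser. The bot name examplebot and the sample paths are hypothetical and chosen only for illustration; the parser matches each path against the Disallow prefixes.

```python
from urllib.robotparser import RobotFileParser

# The three directory rules discussed above, applied to every bot.
RULES = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /wp-admin/
Disallow: /wp-content/
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())


def is_allowed(path, bot="examplebot"):
    """Return True if the (hypothetical) bot may crawl the given path."""
    return parser.can_fetch(bot, path)


# Hypothetical paths: two inside blocked directories, one ordinary post.
checks = {
    "/wp-admin/options.php": is_allowed("/wp-admin/options.php"),
    "/wp-content/uploads/photo.jpg": is_allowed("/wp-content/uploads/photo.jpg"),
    "/my-first-post/": is_allowed("/my-first-post/"),
}

for path, allowed in checks.items():
    print(path, "allowed" if allowed else "blocked")
```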
Things to avoid while creating a Robots.txt File
Creating a robots.txt file is not very hard, but a single mistake can be very dangerous for your blog's traffic: it can completely remove your blog from the search engines. Here are the common mistakes you should avoid.
- Putting spaces inside commands/tags. Example: Dis allow
- Putting spaces at the start of a command. Example: " Disallow: /wp-admin/" (note the space before Disallow)
- Using extra capital letters. Example: DisAllow
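The three mistakes listed above are easy to catch with a small script before you upload the file. The checker below is a rough sketch, not an official validator: the function name lint_robots, the list of known tags, and the sample text are all assumptions made for this example, and "extra capital letter" is taken to mean any capital after the first character of a tag.

```python
# Tags this sketch knows about (an assumption; real files may use others).
KNOWN_DIRECTIVES = {"user-agent", "allow", "disallow", "sitemap"}


def lint_robots(text):
    """Return (line_number, message) pairs for the three mistakes above."""
    problems = []
    for number, line in enumerate(text.splitlines(), start=1):
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # skip blank lines and comments
        if line[0].isspace():
            problems.append((number, "space at the start of the line"))
        # Everything before the first colon is the tag name.
        name = stripped.split(":", 1)[0].rstrip()
        if " " in name and name.replace(" ", "").lower() in KNOWN_DIRECTIVES:
            problems.append((number, "space inside the tag name"))
        elif name.lower() in KNOWN_DIRECTIVES and any(c.isupper() for c in name[1:]):
            problems.append((number, "extra capital letter in the tag name"))
    return problems


# Hypothetical file containing one of each mistake.
SAMPLE = (
    "User-agent: *\n"
    "Dis allow: /wp-admin/\n"
    " Disallow: /cgi-bin/\n"
    "DisAllow: /wp-includes/\n"
)

problems = lint_robots(SAMPLE)
for number, message in problems:
    print(f"line {number}: {message}")
```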
HowUpdates Robots.txt File
Here is the robots.txt file I wrote for HowUpdates. You can copy it, change the sitemap location to your own, and leave the rest as it is. If you want to add more restrictions, you can edit it according to your needs. But remember, a single wrong Disallow command can completely remove your blog from search engines.
sitemap: http://www.howupdates.com/sitemap.xml
User-agent: *
Disallow: /wp-admin/
Disallow: /wp-includes/
User-agent: NinjaBot
Allow: /
User-agent: Mediapartners-Google*
Allow: /
User-agent: Googlebot-Image
Allow: /wp-content/uploads/
User-agent: Adsbot-Google
Allow: /
User-agent: Googlebot-Mobile
Allow: /
Robots.txt Files of Famous Websites
Below you can find links to the robots.txt files of some famous websites from around the globe.
- Amazon Robots File: http://www.amazon.com/robots.txt
- eBay Robots File: http://ebay.com/robots.txt
- Pinterest Robots File: http://www.pinterest.com/robots.txt
Start using a robots.txt file and start monetizing your blog. You need a robots.txt file in order to do quality SEO for your blog. If you need any help with your robots.txt file, let me know in the comments below. Share this post with your friends, and don’t forget to subscribe to the blog for daily newsletters.