How to create a robots.txt file?


What is a robots.txt file? It is an extremely important, though very often underestimated, part of a WordPress site. It affects which parts of our site search engines will crawl. The idea is to expose not the entire contents of the site but only the content we want. The rules entered in a robots.txt file tell robots which pages, directories, or parts of the site they should not crawl. What should be done to create such rules?

How to create a robots.txt file structure?
By default, the root directory of a WordPress installation does not include a robots.txt file. This is because WordPress has a function that generates a “virtual” robots.txt file, which blocks crawling of the wp-admin directory. We can check the settings and the status of the addresses we want to block in Google Webmaster Tools > Site configuration > Crawler access.
So what structure should the robots.txt file take for a site that is based on WordPress?

User-agent: *
Disallow: */wp-admin/
Disallow: */wp-includes/
Disallow: */wp-content/plugins/
Disallow: */wp-content/cache/
Disallow: */trackback/
Disallow: */feed/
Disallow: */page/
Disallow: */comments/

These settings (Disallow means “do not permit”) aim to prevent crawling of the administration folder, comments, paginated archives (such as previous and next pages), and feed subscriptions. What else does this structure bring? First of all, robots will not see the same content on multiple pages, which otherwise happens a lot.
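To see which URLs the Disallow patterns listed above would cover, here is a minimal Python sketch. It assumes the wildcard semantics used by the major search engines (`*` matches any run of characters, a trailing `$` anchors the end of the path) and deliberately ignores Allow rules and percent-encoding. The helper names `pattern_to_regex` and `is_disallowed` and the sample paths are made up for illustration.

```python
import re

# Disallow patterns copied from the article.
DISALLOW_PATTERNS = [
    "*/wp-admin/",
    "*/wp-includes/",
    "*/wp-content/plugins/",
    "*/wp-content/cache/",
    "*/trackback/",
    "*/feed/",
    "*/page/",
    "*/comments/",
]


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex.

    Assumed semantics: '*' matches any run of characters, a trailing '$'
    anchors the end of the path, everything else is literal.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_disallowed(path, patterns=DISALLOW_PATTERNS):
    """Return True if any pattern matches the path from its beginning."""
    return any(pattern_to_regex(p).match(path) for p in patterns)


# Hypothetical paths, only to exercise the rules.
for sample in ("/wp-admin/index.php", "/blog/page/2/", "/2020/05/hello-world/"):
    print(sample, "blocked" if is_disallowed(sample) else "crawlable")
```

A quick check like this helps confirm that a pattern such as */page/ blocks paginated archives without also catching ordinary posts.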

“User-agent” and “Allow”
The robots.txt file also contains two other directives: User-agent, which names the client the rules apply to, and Allow, which grants access. User-agent is nothing more than the name of the search engine robot or indexing software. “Allow” lets a robot such as Google's reach a URL in a subdirectory that is located in a blocked parent directory. A User-agent line together with the Allow and Disallow lines that follow it forms one entry in the file. A single entry can contain multiple Disallow lines and can apply to multiple user agents.
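The way entries combine User-agent, Disallow and Allow can be tried out with Python's standard urllib.robotparser module. The sketch below is illustrative only: the robots.txt text, the bot names and the admin-ajax.php path are assumptions chosen for the example, not rules taken from this article. Note that this parser does plain prefix matching and does not interpret `*` inside paths, so the example uses plain paths and puts the Allow line before the Disallow line.

```python
from urllib.robotparser import RobotFileParser

# Illustrative file: one entry shared by two named robots, one for everyone else.
EXAMPLE_ROBOTS_TXT = """\
User-agent: Googlebot
User-agent: Bingbot
Disallow: /feed/

User-agent: *
Allow: /wp-admin/admin-ajax.php
Disallow: /wp-admin/
"""


def build_parser(text):
    """Parse robots.txt text from a string instead of fetching it over HTTP."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(EXAMPLE_ROBOTS_TXT)

# Both named robots share the first entry, so /feed/ is blocked for them.
print(parser.can_fetch("Googlebot", "/feed/"))
print(parser.can_fetch("Bingbot", "/feed/"))

# Any other robot falls back to the "*" entry.
print(parser.can_fetch("SomeOtherBot", "/wp-admin/options.php"))
print(parser.can_fetch("SomeOtherBot", "/wp-admin/admin-ajax.php"))
```

The last two calls show Allow opening up a single URL inside a blocked parent directory.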

Top tips for saving robots.txt
It is very important to follow a few simple but essential rules so that robots can find and identify the file. Firstly, the robots.txt file MUST be saved as a plain text file. The file must also be placed in the root directory of the domain. Finally, the file must be named exactly robots.txt.
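Because robots only look for the file at the root of the domain, the address they request can be derived from any page URL. A small Python sketch of that rule follows; the function name `robots_txt_url` and the example.com addresses are placeholders.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url):
    """Return the root-level robots.txt address for the site serving page_url."""
    parts = urlsplit(page_url)
    # Keep scheme and host, replace the path, drop query string and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


print(robots_txt_url("https://example.com/blog/2020/post/?p=1"))
```

A file saved anywhere else, for example under /blog/robots.txt, would not be found at that address.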

With the file saved this way, problems with duplicated content should diminish, which in turn can help our site reach a good position in search results. Another benefit of our robots.txt is lower bandwidth usage on the server, because we exclude a lot of redundant and unnecessary URLs from crawling by search engines.