The robots.txt file

Standard

The Robots Exclusion Protocol, known simply as robots.txt, is the standard websites use to communicate with web crawlers and other web robots.

We use this file to control which parts of a site crawler bots may visit; that is its main purpose.

A few important things about the robots.txt file:

-> Robots can ignore your robots.txt, especially malware robots that scan the web for security vulnerabilities.

-> Anyone can see the robots.txt file of a server; it is publicly available, so don't use it to hide information.

-> We need to put this file in the site's root directory.
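To illustrate the points above, here is a sketch of a slightly richer robots.txt than the one below (the directory names are hypothetical examples, not part of any real site):

```
User-agent: *
Disallow: /admin/
Disallow: /tmp/

User-agent: BadBot
Disallow: /
```

This allows all well-behaved robots everywhere except the /admin/ and /tmp/ directories, while telling a robot identifying itself as "BadBot" to stay out entirely. Note the caveat above: compliance is voluntary, so this is a request, not access control.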

It works like this: a robot wants to visit a website URL,
say http://www.domain.com/. Before it does so, it first checks http://www.domain.com/robots.txt, and finds:

An example of robots.txt file content:

User-agent: *
Disallow: /

The “User-agent: *” means this section applies to all robots.
The “Disallow: /” tells the robot that it should not visit any pages on the site.
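A crawler can check these rules programmatically. As a minimal sketch using Python's standard-library robots.txt parser (the bot name and URL are made-up examples), a polite crawler would test a URL before fetching it:

```python
from urllib.robotparser import RobotFileParser

# The rules from the example above: all robots, all pages disallowed.
robots_txt = """User-agent: *
Disallow: /
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# A well-behaved crawler checks before requesting a page.
print(parser.can_fetch("ExampleBot", "http://www.domain.com/index.html"))  # → False
```

Because "Disallow: /" applies to every user agent, can_fetch() returns False for any page, and a compliant crawler would skip the site.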
