WordPress Robots.txt Sample


WordPress Robots.txt

User-agent: *
Disallow: /wp-admin/
Disallow: /wp-includes/

Adding Sitemaps to WordPress Robots.txt

User-agent: *
Disallow: /wp-admin/
Disallow: /wp-includes/

Sitemap: http://www.example.com/post-sitemap.xml
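
These rules can be sanity-checked with Python's standard-library `urllib.robotparser`. A minimal sketch (the file content is fed straight to `parse()` rather than fetched over HTTP; `site_maps()` requires Python 3.8+):

```python
from urllib import robotparser

ROBOTS_TXT = """\
User-agent: *
Disallow: /wp-admin/
Disallow: /wp-includes/

Sitemap: http://www.example.com/post-sitemap.xml
"""

rp = robotparser.RobotFileParser()
rp.parse(ROBOTS_TXT.splitlines())

# WordPress core directories are blocked for every crawler,
# while ordinary content stays crawlable.
print(rp.can_fetch("*", "/wp-admin/"))                  # False
print(rp.can_fetch("Googlebot", "/2024/hello-world/"))  # True

# All Sitemap: lines are collected (Python 3.8+).
print(rp.site_maps())  # ['http://www.example.com/post-sitemap.xml']
```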

Explanation

Allow All Bots

  • Allows every bot to crawl the entire site
User-agent: *
Disallow:

Block All Bots

  • Prevents every bot from crawling any page
User-agent: *
Disallow: /
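
The difference between the two policies is easy to verify with `urllib.robotparser` from the Python standard library (a quick sketch, parsing the rules inline):

```python
from urllib import robotparser

def parser_for(rules: str) -> robotparser.RobotFileParser:
    rp = robotparser.RobotFileParser()
    rp.parse(rules.splitlines())
    return rp

# An empty Disallow: means "nothing is off limits".
allow_all = parser_for("User-agent: *\nDisallow:")
print(allow_all.can_fetch("*", "/any/page.html"))  # True

# Disallow: / matches every path, so nothing may be crawled.
block_all = parser_for("User-agent: *\nDisallow: /")
print(block_all.can_fetch("*", "/any/page.html"))  # False
```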

Block a Folder

User-agent: *
Disallow: /Folder/

Block a File

User-agent: *
Disallow: /file.html

Block a Page or a Directory Named "private"

User-agent: *
Disallow: /private
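
`Disallow: /private` is a plain prefix match, so it covers the bare page, the directory, and any other path that begins with /private. A small check with Python's `urllib.robotparser`:

```python
from urllib import robotparser

rp = robotparser.RobotFileParser()
rp.parse("User-agent: *\nDisallow: /private".splitlines())

# Prefix matching: all of these paths start with /private.
for path in ("/private", "/private/", "/private/file.html", "/private-notes.html"):
    print(path, rp.can_fetch("*", path))  # all False

# Paths that merely contain "private" elsewhere are unaffected.
print(rp.can_fetch("*", "/blog/private-post/"))  # True
```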

Block All Subfolders Starting with "private"

User-agent: *
Disallow: /private*/

Block URLs Ending with ".asp"

User-agent: *
Disallow: /*.asp$

Block URLs That Include a Question Mark (?)

User-agent: *
Disallow: /*?*

Block a File Type

User-agent: *
Disallow: /*.jpeg$

Block All Paginated Pages That Don't End with "?"

  • http://www.example.com/blog/? ( allowed )
  • http://www.example.com/blog/?page=2 ( blocked )

Together, these rules block paginated query URLs from crawling while still allowing URLs that end in "?":

User-agent: *
Disallow: /*? # block URL that includes ?
Allow: /*?$ # allow URL that ends in ?
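
Note that wildcard rules like these are a search-engine extension to the original robots.txt protocol, and Python's `urllib.robotparser` does not interpret `*` or `$`. Below is a small hand-rolled matcher, a sketch following Google's documented longest-match precedence (`_matches` and `is_allowed` are names invented here):

```python
import re

def _matches(rule: str, path: str) -> bool:
    # Translate a robots.txt rule into a regex: '*' matches any run
    # of characters, a trailing '$' anchors the end of the path, and
    # everything else matches literally against a prefix of the path.
    parts = []
    for i, ch in enumerate(rule):
        if ch == "*":
            parts.append(".*")
        elif ch == "$" and i == len(rule) - 1:
            parts.append("$")
        else:
            parts.append(re.escape(ch))
    return re.match("".join(parts), path) is not None

def is_allowed(path: str, disallows, allows=()):
    # Google's precedence rule: the longest matching rule wins;
    # on a tie, the least restrictive (Allow) rule applies.
    verdict, best = True, -1
    for rule in disallows:
        if len(rule) > best and _matches(rule, path):
            verdict, best = False, len(rule)
    for rule in allows:
        if len(rule) >= best and _matches(rule, path):
            verdict, best = True, len(rule)
    return verdict

# The pagination rules above: Disallow: /*?  plus  Allow: /*?$
print(is_allowed("/blog/?", ["/*?"], ["/*?$"]))        # True  (ends in ?)
print(is_allowed("/blog/?page=2", ["/*?"], ["/*?$"]))  # False (paginated)
```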

Using the Hash (#) for Comments

# Everything after a hash (#) is a comment and is ignored by crawlers

Bots / User Agents

Top 10 Bots

  • bingbot
  • Googlebot
  • Googlebot-Mobile
  • AhrefsBot
  • Baiduspider
  • MJ12bot
  • proximic
  • A6
  • ADmantX
  • msnbot/2.0b

Individual Crawl Rules for Each Bot

User-Agent: Googlebot
Allow: /

User-Agent: Googlebot-Mobile
Allow: /

User-Agent: msnbot
Allow: /

User-Agent: bingbot
Allow: /

# Adsense
User-Agent: Mediapartners-Google
Disallow: / 

# Blekko
User-Agent: ScoutJet
Allow: / 

User-Agent: Yandex
Allow: / 

# CommonCrawl
User-agent: ccbot
Allow: / 

User-agent: baiduspider
Allow: / 

User-agent: DuckDuckBot
Allow: / 

User-Agent: *
Disallow: /
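
With a list like this, each bot obeys the most specific User-Agent group that matches it, and the final `User-Agent: *` group catches everything else. This can be verified with Python's `urllib.robotparser` (a sketch using a trimmed-down version of the list above):

```python
from urllib import robotparser

RULES = """\
User-Agent: Googlebot
Allow: /

User-Agent: bingbot
Allow: /

User-Agent: *
Disallow: /
"""

rp = robotparser.RobotFileParser()
rp.parse(RULES.splitlines())

# Named bots match their own group; everyone else falls through
# to the catch-all group and is blocked.
print(rp.can_fetch("Googlebot", "/post/"))     # True
print(rp.can_fetch("bingbot", "/post/"))       # True
print(rp.can_fetch("SomeOtherBot", "/post/"))  # False
```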
