WordPress Robots.txt Sample

WordPress Robots.txt

User-agent: *
Disallow: /wp-admin/
Disallow: /wp-includes/

Adding Sitemaps to WordPress Robots.txt

User-agent: *
Disallow: /wp-admin/
Disallow: /wp-includes/

Sitemap: http://www.example.com/post-sitemap.xml
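
The rules above can be sanity-checked before they go live. The following is a minimal sketch using Python's standard urllib.robotparser module; the helper name is_allowed and the sample paths are assumptions made up for this illustration.

```python
from urllib.robotparser import RobotFileParser

# The WordPress rules from above, including the Sitemap line.
ROBOTS_TXT = """\
User-agent: *
Disallow: /wp-admin/
Disallow: /wp-includes/

Sitemap: http://www.example.com/post-sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def is_allowed(path, agent="*"):
    """Return True if `agent` may crawl `path` on www.example.com."""
    return parser.can_fetch(agent, "http://www.example.com" + path)


print(is_allowed("/wp-admin/options.php"))  # False
print(is_allowed("/2020/01/hello-world/"))  # True
print(parser.site_maps())  # ['http://www.example.com/post-sitemap.xml']
```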

Explanation

Allowing all Bots

  • An empty Disallow value allows any Bot to Crawl the whole site
User-agent: *
Disallow:

Not Allowing any Bots

  • Disallow: / stops every Bot from Crawling any page
User-agent: *
Disallow: /

Block a Folder

User-agent: *
Disallow: /Folder/

Block a File

User-agent: *
Disallow: /file.html

Block any URL whose path starts with /private (a page, a file and/or a directory)

User-agent: *
Disallow: /private

Block All Sub Folders starting with private

User-agent: *
Disallow: /private*/

Block URLs that end with .asp

User-agent: *
Disallow: /*.asp$

Block URLs which include a Question Mark (?)

User-agent: *
Disallow: /*?*

Block a File Type

User-agent: *
Disallow: /*.jpeg$

Block all URLs that include "?" unless the "?" is at the very end

  • http://www.example.com/blog/? ( Allowed )
  • http://www.example.com/blog/?page=2 ( Blocked )

This helps us to block Paginated pages from being crawled

User-agent: *
Disallow: /*? # block URL that includes ?
Allow: /*?$ # allow URL that ends in ?
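
The wildcard rules above can be tried out with a small matcher. This is a minimal sketch, assuming the common convention that * matches any run of characters, a trailing $ anchors the end of the URL, the longest matching pattern wins and Allow wins a tie. The function names pattern_to_regex and is_allowed are made up for this illustration and are not part of any library.

```python
import re


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex.

    '*' matches any run of characters and a trailing '$' anchors the end
    of the URL; everything else is matched literally from the start.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(path, rules):
    """rules: list of ("allow" | "disallow", pattern) for one user agent.

    The longest matching pattern wins and "allow" wins a tie. An empty
    pattern (a bare "Disallow:") matches nothing.
    """
    best_kind, best_pattern = "allow", ""  # default: crawlable
    for kind, pattern in rules:
        if not pattern or not pattern_to_regex(pattern).match(path):
            continue
        longer = len(pattern) > len(best_pattern)
        tie = len(pattern) == len(best_pattern) and kind == "allow"
        if longer or tie:
            best_kind, best_pattern = kind, pattern
    return best_kind == "allow"


rules = [("disallow", "/*?"), ("allow", "/*?$")]
print(is_allowed("/blog/?", rules))        # True
print(is_allowed("/blog/?page=2", rules))  # False
```

Real crawlers may differ in details such as case handling and percent-encoding, so treat this as a way to reason about the patterns rather than as a reference implementation.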

Using Hash

# A hash starts a comment; bots ignore the rest of the line

Bots / User Agents

Top 10 Bots

Robot
bingbot
Googlebot
Googlebot Mobile
AhrefsBot
Baidu
MJ12bot
proximic
A6
ADmantX
msnbot/2.0b

Individual Crawl rules for each Bot

User-Agent: Googlebot
Allow: /

User-Agent: Googlebot-Mobile
Allow: /

User-Agent: msnbot
Allow: /

User-Agent: bingbot
Allow: /

# Adsense
User-Agent: Mediapartners-Google
Disallow: / 

# Blekko
User-Agent: ScoutJet
Allow: / 

User-Agent: Yandex
Allow: / 

# CommonCrawl
User-agent: ccbot
Allow: / 

User-agent: baiduspider
Allow: / 

User-agent: DuckDuckBot
Allow: / 

User-Agent: *
Disallow: /
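
To see how a per-bot file like this is read, here is a minimal sketch using Python's standard urllib.robotparser on a shortened copy of the rules above. The test URL is made up, and real crawlers may match user agent names slightly differently from this parser.

```python
from urllib.robotparser import RobotFileParser

# A shortened copy of the per-bot rules: one allowed bot, one blocked bot,
# and a catch-all group that blocks everyone else.
ROBOTS_TXT = """\
User-Agent: Googlebot
Allow: /

# Adsense
User-Agent: Mediapartners-Google
Disallow: /

User-Agent: *
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

URL = "http://www.example.com/page.html"  # made-up URL for the check

for agent in ("Googlebot", "Mediapartners-Google", "AhrefsBot"):
    print(agent, parser.can_fetch(agent, URL))
# Googlebot True
# Mediapartners-Google False
# AhrefsBot False
```

A bot that has its own group follows that group; every other bot falls through to the final User-Agent: * group and is blocked.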

SEO Cannibalisation

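SEO Cannibalisation, as the term is used here, is the process of consolidating many similar pages, which dilute each other's SEO Value, into a single page. (More commonly, "keyword cannibalisation" names the problem itself: several pages on one site competing for the same keyword.)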

Similar Pages

  • Many Pages with identical or nearly identical Page Titles
  • Many Pages with content targeting the same keyword
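
Pages that share a title can be found mechanically. The sketch below is one possible approach, assuming the page titles have already been collected into a dictionary of URL to title; the function name group_by_title and the sample data are made up for illustration.

```python
from collections import defaultdict


def group_by_title(pages):
    """pages: dict mapping URL -> page title.

    Returns {normalised title: [urls]} for every title that is used by
    more than one URL. Titles are compared case-insensitively and with
    runs of whitespace collapsed.
    """
    groups = defaultdict(list)
    for url, title in pages.items():
        key = " ".join(title.lower().split())
        groups[key].append(url)
    return {title: urls for title, urls in groups.items() if len(urls) > 1}


# Made-up sample data.
pages = {
    "/red-shoes": "Buy Red Shoes",
    "/shoes-red": "Buy  red shoes",
    "/contact": "Contact Us",
}
print(group_by_title(pages))
# {'buy red shoes': ['/red-shoes', '/shoes-red']}
```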

Reason for SEO Cannibalisation

Have too many similar pages?

  • It could confuse Google
  • It could dilute the SEO Value
  • It could decrease the overall Conversion Rate
  • It could confuse Website visitors
  • It lacks focus

So we need to consolidate these pages, which is what Keyword Cannibalisation means here

SEO Dilution

Issues with having too many similar Pages:

  • Internal Linking - Linking to various pages with the same Anchor Text dilutes SEO value
  • Backlinks - Backlinks are spread across various similar pages, so they are worth less overall than if they all pointed to a single page
  • Content Quality - Ideas and research about a single topic, divided across many different pages, dilute content quality
  • Conversion Rate - Similar pages convert at different rates, so the overall conversion rate is lower than that of the best page

Consolidation

Consider each group of similar pages together.

Choosing THE Best Page to Cannibalise

Choose the Page with

  • The Highest Conversion Rate
  • High Quality Content
  • High Quality Backlinks

Consolidate SEO Value

  • Consolidate Content from other similar pages to The Single Page
  • Update Internal Links & its Anchor Text to The Single Page
  • Update Backlinks from other sites to The Single Page if possible
  • Use 301 Redirect from other pages to The Single Page
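
The choice of The Single Page and the 301 redirects that follow from it can be planned in a few lines. This is only a sketch under stated assumptions: it ranks pages by conversion rate and breaks ties by backlink count, which is one possible reading of the criteria above, and the function name plan_consolidation and all figures are made up.

```python
def plan_consolidation(pages):
    """pages: list of dicts with 'url', 'conversion_rate' and 'backlinks'.

    Returns (best_url, redirects), where redirects maps every other URL
    to best_url. Each entry is meant to become a 301 redirect.
    """
    # Assumption: highest conversion rate wins, backlinks break ties.
    best = max(pages, key=lambda p: (p["conversion_rate"], p["backlinks"]))
    redirects = {
        p["url"]: best["url"] for p in pages if p["url"] != best["url"]
    }
    return best["url"], redirects


# Made-up figures for three similar pages.
similar_pages = [
    {"url": "/red-shoes", "conversion_rate": 0.04, "backlinks": 12},
    {"url": "/shoes-red", "conversion_rate": 0.01, "backlinks": 3},
    {"url": "/buy-red-shoes", "conversion_rate": 0.02, "backlinks": 30},
]
best_url, redirects = plan_consolidation(similar_pages)
print(best_url)   # /red-shoes
print(redirects)  # {'/shoes-red': '/red-shoes', '/buy-red-shoes': '/red-shoes'}
```

Content quality is left out because it cannot be read from a number; in practice that judgement would be made by hand before the redirects are set up.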

Advantages of SEO Cannibalisation

  • Google is not confused now
  • Solves the issue of SEO Dilution
  • Website Visitors are not confused now
  • Improves Overall Conversion Rate

It's Good to FOCUS!!!

SEO for a Historical Website

This covers SEO (Search Engine Optimisation) for a website which has a big history behind it: a website which has been active for a long period of time with regular activity.

Refining the Website Structure

  • Proper Internal Linking

Improving the User Interface of the Website

  • Making it possible for the user to navigate across the historical website and be able to get relevant information
  • Allowing Social Sharing
  • Making it more interactive

Fixing the 404 Error Backlinks

  • 404 is an HTTP status code which tells us that the requested page was not found. A backlink that points to such a page is broken, hence wasting the Link Juice
  • Using a redirect from the broken URL to an appropriate page would pass that value on to the page and the website

Off site backlink cleanup

  • Research the website's old backlinks
  • Remove unwanted backlinks

Improving the Title and Meta Description

  • The Title contributes the most towards increasing Click Through Rate
  • The Meta Description is the second most important contributor to Click Through Rate

Implementing Rich Snippets

This is also called "adding semantics to the website"

  • Making Google understand each part of the website
  • Highlighting to Google the meaning of each section of the website
    • Use Webmaster Tools (now Google Search Console) to highlight this information
    • Or it can also be implemented in the backend code
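
As an illustration of the backend route, the sketch below builds a JSON-LD block using schema.org's Article type. JSON-LD is only one of several markup formats and is an assumption here, since the text does not name one; the function name article_json_ld and the sample values are also made up.

```python
import json


def article_json_ld(headline, author_name, date_published):
    """Return a <script> element describing an article with schema.org."""
    data = {
        "@context": "https://schema.org",
        "@type": "Article",
        "headline": headline,
        "author": {"@type": "Person", "name": author_name},
        "datePublished": date_published,
    }
    # Escape "</" so a value can never close the <script> element early.
    payload = json.dumps(data, indent=2).replace("</", "<\\/")
    return '<script type="application/ld+json">\n' + payload + "\n</script>"


# Made-up sample values.
print(article_json_ld("SEO for a Historical Website", "Jane Doe", "2015-01-01"))
```

The resulting element would be placed in the page's HTML by whatever template the backend uses.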

Other ideas for the Historical Website

  • Setting up an XML sitemap for Google
    • for Normal Pages
    • for Images
    • for News
    • and setting the Priority and change frequency of each URL
    • Finally, submit the XML sitemap to Google via Webmaster Tools
  • Check for duplicate Titles, duplicate content and short meta description issues, and fix them.
  • Check if Canonicalisation is necessary
  • Check if Cannibalisation is necessary
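
The XML sitemap idea from the list above can be sketched with Python's standard library. This minimal example covers normal pages only, using the loc, changefreq and priority elements of the sitemap protocol; the function name build_sitemap and the sample entries are made up, and image or news sitemaps need extra markup that is not shown.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries):
    """entries: iterable of (loc, changefreq, priority) tuples.

    Returns the sitemap as a string, XML declaration included.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, changefreq, priority in entries:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        ET.SubElement(url, "changefreq").text = changefreq
        ET.SubElement(url, "priority").text = f"{priority:.1f}"
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Made-up sample entries.
sitemap = build_sitemap([
    ("http://www.example.com/", "daily", 1.0),
    ("http://www.example.com/about/", "monthly", 0.5),
])
print(sitemap)
```

The output would be saved as a file such as the post-sitemap.xml referenced in robots.txt and then submitted to Google.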

SEO Canonicalisation

Canonicalisation in terms of SEO

Canonicalisation shows Google which one of several URLs for the same content is the Preferred URL or Main URL

Canonicalisation Code Implementation

<link rel="canonical" href="http://example.com" />

Similar Pages in the eyes of Google

All of the URLs below usually serve the same page, so Google can see them as duplicates:

General

  • http://www.example.com/
  • http://example.com/

Apache Web Server

  • http://example.com/index.html
  • http://www.example.com/index.html
  • http://example.com/index.php
  • http://www.example.com/index.php

Microsoft Internet Information Services (IIS)

  • http://www.example.com/default.asp (or .aspx )
  • http://example.com/default.asp (or .aspx)

Use Redirection to the Preferred URL

This has to be considered if exactly the same page is reachable at multiple URLs, as listed above.

  • 301 HTTP status code : Moved Permanently (the one to use for redirecting to a Preferred URL)
  • 302 HTTP status code : Found (Moved Temporarily)
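
The redirect decision for the URL variants listed earlier can be sketched as follows. This assumes the www host is the preferred one and that every input URL belongs to the same site; the names PREFERRED_HOST, INDEX_FILES, preferred_url and redirect_for are made up for illustration.

```python
from urllib.parse import urlsplit, urlunsplit

PREFERRED_HOST = "www.example.com"  # assumption: the www form is preferred
INDEX_FILES = {"index.html", "index.php", "default.asp", "default.aspx"}


def preferred_url(url):
    """Map a URL variant of the site to its preferred form."""
    parts = urlsplit(url)
    head, _, last = parts.path.rpartition("/")
    # Drop a trailing index file so /index.html becomes /.
    path = head + "/" if last.lower() in INDEX_FILES else parts.path
    return urlunsplit(
        (parts.scheme, PREFERRED_HOST, path or "/", parts.query, parts.fragment)
    )


def redirect_for(url):
    """Return (301, target) if url should redirect, else None."""
    target = preferred_url(url)
    return None if target == url else (301, target)


print(redirect_for("http://example.com/index.php"))
# (301, 'http://www.example.com/')
print(redirect_for("http://www.example.com/"))
# None
```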

Canonicalisation for Various Parameters in the URL

  • http://example.com/
  • http://example.com/?track=123
  • http://example.com/?track=431

In the example above, the parameter "track" is used only for tracking, so the content of the page remains the same.

This has to be explained to Google in order to prevent duplicate content issues
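
One way to do this is to compute the canonical URL by dropping the tracking parameter and emit the link tag shown earlier. The sketch below assumes "track" is the only parameter that does not change the content; the names TRACKING_PARAMS, canonical_url and canonical_link_tag are made up for illustration.

```python
from html import escape
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

TRACKING_PARAMS = {"track"}  # assumption: parameters that never change content


def canonical_url(url):
    """Return url without tracking parameters and without a fragment."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in TRACKING_PARAMS
    ]
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path or "/", urlencode(kept), "")
    )


def canonical_link_tag(url):
    """Build the <link rel="canonical"> element for a requested URL."""
    href = escape(canonical_url(url), quote=True)
    return '<link rel="canonical" href="%s" />' % href


print(canonical_link_tag("http://example.com/?track=123"))
# <link rel="canonical" href="http://example.com/" />
print(canonical_url("http://example.com/?page=2&track=431"))
# http://example.com/?page=2
```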

Hence canonicalisation is important in terms of On Page SEO