Leveraging non text/html content with rel=canonical HTTP Headers

Major search engines such as Google, Bing, and Yahoo let webmasters use the link rel="canonical" tag in a text/html page to specify the preferred URL for indexing. More recently, Google has also allowed webmasters to send rel="canonical" in an HTTP header.

This move lets SEOs make sure that non-text/HTML content, such as the PDF brochures and product manuals found on many ecommerce sites, points to a specified canonical URL. It also helps reduce duplicate content, which can hurt a site's standing under revised search algorithms.

Canonical HTTP headers are still underused because they are harder to create and implement than the link HTML tag. They also need to be tested rigorously in a test environment before being used on live servers.

Here are some tips on how to use the rel="canonical" HTTP header with text/HTML and non-text/HTML content.

If your text/HTML content is served through PHP, you can easily add the rel="canonical" HTTP header with PHP's header() function, which adds the link to the response headers and must be called before any output is sent. The result is equivalent to the rel="canonical" link tag, except that the canonical URL travels in the HTTP header instead of in the page markup.
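As a minimal sketch of what that header looks like on the wire, the snippet below builds the header name and value for a given URL. It is written in Python purely for illustration (the article's own approach is PHP, where the same string would be passed to header()), and the function name canonical_link_header is hypothetical.

```python
from typing import Tuple


def canonical_link_header(url: str) -> Tuple[str, str]:
    """Return the (name, value) pair for a rel="canonical" HTTP header.

    Hypothetical helper for illustration; not part of any library.
    """
    # The URL sits between angle brackets, so brackets or whitespace
    # inside it would produce a malformed header value.
    if "<" in url or ">" in url or any(ch.isspace() for ch in url):
        raise ValueError("URL must not contain '<', '>' or whitespace")
    return ("Link", f'<{url}>; rel="canonical"')


name, value = canonical_link_header("http://www.example.com/page.html")
print(f"{name}: {value}")
```

The printed line is the header exactly as it should appear in the response, ahead of the page body.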

For non-text/HTML content such as PDF files, you can set the HTTP header from .htaccess instead. A Header directive placed inside a Files block appends the Link header to the response for that PDF, pointing to the specified HTML page by its URL. The filename argument of the Files block takes either a file name or a wildcard string using ?.
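If a site has many PDFs, the .htaccess stanzas can be generated rather than typed by hand. The sketch below assumes Apache with mod_headers enabled and a Files block wrapping a Header add directive; the function name htaccess_canonical_block and the example file name and URL are hypothetical.

```python
def htaccess_canonical_block(filename: str, canonical_url: str) -> str:
    """Render a <Files> block that adds a canonical Link header.

    Assumes Apache with mod_headers; hypothetical helper for illustration.
    """
    # Inner double quotes are backslash-escaped because the whole
    # header value is itself a double-quoted Apache argument.
    return (
        f'<Files "{filename}">\n'
        f'  Header add Link "<{canonical_url}>; rel=\\"canonical\\""\n'
        f'</Files>\n'
    )


print(htaccess_canonical_block(
    "white-paper.pdf",
    "http://www.example.com/white-paper.html",
))
```

The printed block can be pasted into .htaccess and then checked on a test server before it goes live.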

If you are familiar with both .htaccess and PHP, you can also set the HTTP header dynamically. Begin by using the RewriteRule command to rewrite requests for PDF URLs to a PHP file, pdf.php, so that every request for a PDF, including those from search engines, is handled by that script. In pdf.php, add conditional logic to confirm that the requested PDF file actually exists, and only then send the canonical HTTP header.

You can apply the same logic to other file formats, such as txt or csv files that hold a listing of PDF files or tables of data. When serving a PDF this way, it is also very important to set the content type to application/pdf so that the file is not mistaken for a text/HTML document.
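The dynamic approach can be sketched as a single function that decides the status, headers, and body for a requested PDF. Python stands in here for the pdf.php script described above. The names pdf_response, pdf_dir, and site are hypothetical, and so is the rule that brochure.pdf maps to brochure.html on the same site.

```python
import os
from typing import List, Tuple

Response = Tuple[str, List[Tuple[str, str]], bytes]


def pdf_response(pdf_dir: str, name: str, site: str) -> Response:
    """Build (status, headers, body) for a requested PDF file name."""
    # Keep only the base name so a request cannot escape pdf_dir.
    safe_name = os.path.basename(name)
    path = os.path.join(pdf_dir, safe_name)

    # Conditional check: only PDFs that actually exist are served.
    if not safe_name.lower().endswith(".pdf") or not os.path.isfile(path):
        return "404 Not Found", [("Content-Type", "text/plain")], b"Not found"

    with open(path, "rb") as fh:
        body = fh.read()

    # Assumed mapping: brochure.pdf -> <site>/brochure.html
    canonical = f"{site}/{safe_name[:-4]}.html"
    headers = [
        # Declare the real type so the file is not treated as text/HTML.
        ("Content-Type", "application/pdf"),
        ("Content-Length", str(len(body))),
        ("Link", f'<{canonical}>; rel="canonical"'),
    ]
    return "200 OK", headers, body
```

Keeping the existence check ahead of the header means a missing file returns a plain 404 and never advertises a canonical URL.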

Several browser-based tools can confirm that your HTTP headers are actually being sent. Try the Web Developer Toolbar for Firefox or Live HTTP Headers to inspect the response headers for each file.
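The same check can be scripted when there are many files to verify. The sketch below uses only the Python standard library. The function names canonical_from_link and fetch_canonical are hypothetical, and the parser is deliberately simple: it handles the common Link header forms, not every case the specification permits.

```python
import re
import urllib.request
from typing import Optional


def canonical_from_link(link_value: str) -> Optional[str]:
    """Return the URL marked rel="canonical" in a Link header value."""
    # Each entry looks like <url>; param=value, with entries comma-separated.
    for match in re.finditer(r'<([^>]*)>([^<]*)', link_value):
        url, params = match.group(1), match.group(2)
        rel = re.search(r'rel\s*=\s*"?([^";,]+)"?', params, re.IGNORECASE)
        if rel and "canonical" in rel.group(1).lower().split():
            return url
    return None


def fetch_canonical(url: str) -> Optional[str]:
    """Send a HEAD request and return the canonical URL, if one is sent."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=10) as response:
        # A response may carry several Link headers; join them for parsing.
        values = response.headers.get_all("Link") or []
    return canonical_from_link(", ".join(values))
```

Calling fetch_canonical with the URL of a PDF on your test server returns the canonical URL the server sends, or None if the header is missing.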

The tips above will help you make use of non-text/HTML content with rel="canonical" HTTP headers. Adapt the technique to your own site, and remember to test the setup before going live.
