
Welcome to the DNN Community Forums, your preferred source of online community support for all things related to DNN.

Home > Using DNN Platform > Administration … > "Googlebot can't access your file" - how to add robots.txt?
New Post
1/2/2013 7:13 AM
 

It seems like the Google crawler expects to find a robots.txt file in order to crawl the DotNetNuke site. I've gotten several messages from Google Webmaster Tools lately looking something like this:

"Over the last 24 hours, Googlebot encountered 175 errors while attempting to access your robots.txt. To ensure that we didn't crawl any pages listed in that file, we postponed our crawl. Your site's overall robots.txt error rate is 100.0%."

I really don't like the part saying "we postponed our crawl". Does this mean we must set up a robots.txt file in DNN? I know we have the sitemap file available, but it seems like that's not enough anymore.

So: how can we publish robots.txt files to several portals in one installation?
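[Editor's note: the "postponed our crawl" wording follows documented crawler behaviour: a robots.txt that returns 404 is treated as "no restrictions", while a server error or unreachable file leaves the crawler unsure what is disallowed, so it defers crawling. A minimal sketch of that decision logic (a hypothetical helper, not a Google API):]

```python
# Hedged sketch of how a crawler typically reacts to the result of
# fetching robots.txt (based on Google's documented behaviour).

def crawl_decision(status_code):
    """Map the robots.txt HTTP status to a crawler action."""
    if 200 <= status_code < 300:
        return "parse robots.txt and obey its rules"
    if 400 <= status_code < 500:
        # A missing robots.txt (404) is treated as "no restrictions".
        return "crawl everything"
    # Server errors or timeouts: the crawler cannot tell what is
    # disallowed, so it postpones the crawl -- the message the
    # original poster received.
    return "postpone crawl"

print(crawl_decision(404))  # a missing file is fine
print(crawl_decision(503))  # this is what triggers the warning
```

So the warning points at the site returning errors for /robots.txt, not merely at the file being absent.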

 
New Post
1/2/2013 7:17 AM
 
The quick and dirty way is to just add a robots.txt file that allows all bots to crawl everything. If you want to go portal-specific, you would need to create a robots.txt handler... but quite frankly, I don't see much benefit in that.

Erik van Ballegoij, Former DNN Corp. Employee and DNN Expert

DNN Blog | Twitter: @erikvb | LinkedIn: Erik van Ballegoij on LinkedIn
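[Editor's note: the portal-specific "robots.txt handler" mentioned above would, in essence, pick a response body based on which portal alias the request came in on. An illustrative sketch in plain Python (the portal aliases and rules are made up; DNN's actual handler API is ASP.NET and is not shown here):]

```python
# Sketch of the idea behind a per-portal robots.txt handler: choose
# the robots.txt body based on the Host header of the request.

ALLOW_ALL = "User-agent: *\nDisallow:\n"

# Hypothetical per-portal overrides, keyed by portal alias.
PORTAL_ROBOTS = {
    "shop.example.com": "User-agent: *\nDisallow: /checkout/\n",
    "intranet.example.com": "User-agent: *\nDisallow: /\n",
}

def robots_for_host(host):
    """Return the robots.txt body for the portal serving `host`."""
    return PORTAL_ROBOTS.get(host.lower(), ALLOW_ALL)

print(robots_for_host("shop.example.com"))
print(robots_for_host("www.example.com"))  # falls back to allow-all
```

A single static robots.txt at the installation root, as suggested above, sidesteps all of this at the cost of every portal sharing the same rules.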

 
New Post
1/2/2013 7:18 AM
 
Thanks, so I just add a common robots.txt file for all portals at the installation root?
 
New Post
1/2/2013 7:25 AM
 
Yes.

So a robots.txt file that allows all bots to crawl everything would look like this:

User-agent: *
Disallow:

If you look at the robots.txt file of DotNetNuke.com, you could also do something like this:

User-agent: *
Disallow: /admin/
Disallow: /App_Browsers/
Disallow: /App_Code/
Disallow: /App_Data/
Disallow: /App_GlobalResources/
Disallow: /bin/
Disallow: /Components/
Disallow: /Config/
Disallow: /contest/
Disallow: /controls/
Disallow: /DesktopModules/
Disallow: /Documentation/
Disallow: /HttpModules/
Disallow: /images/
Disallow: /Install/
Disallow: /js/
Disallow: /Portals/
Disallow: /Providers/

This would prevent bots from indexing any files in those application directories. Adding the Portals directory also prevents indexing of that directory, which may be something you do not want, depending on the type of content you've got there.

Erik van Ballegoij, Former DNN Corp. Employee and DNN Expert

DNN Blog | Twitter: @erikvb | LinkedIn: Erik van Ballegoij on LinkedIn
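[Editor's note: before deploying a robots.txt like the one above, you can sanity-check it with Python's standard-library `urllib.robotparser` by feeding it the rules and probing a few URLs. The rules below are a subset of the example above; the probed paths are made up:]

```python
# Sanity-check robots.txt rules locally with the standard library.
import urllib.robotparser

rules = """\
User-agent: *
Disallow: /admin/
Disallow: /Portals/
Disallow: /DesktopModules/
"""

rp = urllib.robotparser.RobotFileParser()
rp.parse(rules.splitlines())

print(rp.can_fetch("Googlebot", "/Portals/0/docs/file.pdf"))  # False
print(rp.can_fetch("Googlebot", "/Home.aspx"))                # True
```

This confirms, for instance, that blocking /Portals/ also hides any documents stored under it, which is the trade-off described above.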

 
New Post
2/12/2013 10:13 AM
 

Thanks for the examples shown from the DNN website. They helped me make changes to my robots.txt that should now stop indexing of lots of extraneous files. I do have a couple of questions however and would appreciate some help with them.

1. Is there any function within DNN to help exclude files from indexing other than the robots.txt file?

2. How do you make sure certain "private" pages in a portal are not indexed? Would you add a line for the page URL as in: "Disallow: /pagename.aspx/"?

Thanks for your help!
Frank
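[Editor's note on question 2: robots.txt rules are simple path-prefix matches, so the trailing slash matters. "Disallow: /pagename.aspx/" only blocks paths that start with "/pagename.aspx/" and misses the page itself; to block the page, omit the slash. This can be verified with `urllib.robotparser` (the page name is hypothetical):]

```python
# Trailing slashes change what a robots.txt rule blocks, because
# rules are prefix matches on the URL path.
import urllib.robotparser

def blocks_page(disallow_path, page="/pagename.aspx"):
    """Return True if the given Disallow rule blocks `page`."""
    rp = urllib.robotparser.RobotFileParser()
    rp.parse(["User-agent: *", f"Disallow: {disallow_path}"])
    return not rp.can_fetch("Googlebot", page)

print(blocks_page("/pagename.aspx/"))  # False: the slash rule misses the page
print(blocks_page("/pagename.aspx"))   # True: no slash blocks the page
```

Note also that robots.txt only asks well-behaved crawlers not to index a page; it is not an access control, so truly private pages should be protected by DNN page permissions as well.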

 

