
ASP.Net errors when Google & Yahoo crawlers hit my site
New Post
7/17/2006 1:03 PM
 

For some reason I don't have access to that post.

Does anyone have a current .browser file for Googlebot? Apparently, I don't have the parameters set correctly in my googlebot.browser file, since I'm still seeing the "Cannot use a leading .." error in my IIS logs when googlebot crawls. If anyone has a good googlebot .browser file, I'd appreciate it!

 
New Post
7/17/2006 3:20 PM
 

Sure enough, that post is in the Benefactors forum.  There really were two salient points.  I devised a solution to this problem which I posted as:

Please check to see whether or not you've been receiving any errors in the log like this:

Message: DotNetNuke.Services.Exceptions.PageLoadException: Cannot use a leading .. to exit above the top directory. -

I recently devised a fix to this error, by modifying the line in web.config to the following:

<!-- Forms or Windows authentication -->
<authentication mode="Forms">
  <forms name=".DOTNETNUKE" protection="All" timeout="60" cookieless="UseCookies" />
</authentication>

My fix has since been added to the default web.config for 4.3.2, but if you are running an earlier version of DNN, this could explain your problem: without the cookieless setting, the bug prevents some spiders, such as Google, from spidering your site.

You can view more details in this thread:

http://www.dotnetnuke.com/tabid/795/forumid/108/threadid/46036/threadpage/1/scope/posts/Default.aspx

 

In addition, Charles Nurse provided a post with more details:

There is an issue with ASP.NET 2 sites and google/yahoo that is "fixed" by the useCookies attribute mentioned above.

This fix has been added to the next stabilization release.

Also, you may see exceptions in your logs from some spiders.  It appears that some spiders (especially msnbot) do not work very well with Friendly URLs.

For example, if the base page is http://www.mysite.com/MyPage/tabid/45/Default.aspx.

When a spider sees a link on the page like href="/LinkClick.aspx?link=xx&tabid=yy&mid=zz", it seems to combine the paths like this:

http://www.mysite.com/MyPage/tabid/45/LinkClick.aspx?link=xx&tabid=yy&mid=zz

Now this is ok if yy = 45, which it is on the page "MyPage", but as the spider follows links off that page, it seems to keep using the original base page as its starting point, rather than using the new page as the base page.

This is only an explanation of what appears to be happening, based on exceptions being logged on dnn.com.  We are not sure that this is the case.

The upshot of this is that yy <> 45 and the FriendlyUrlProvider has trouble parsing the tabid and causes an exception.

While we cannot "fix" how the spiders generate their links, we have added better parsing to the FriendlyUrlProvider to allow it to get a single tabid from the path.
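The parsing improvement described above can be illustrated with a rough, language-neutral sketch (Python here purely for illustration; this is not DNN's actual FriendlyUrlProvider code, and the function name is hypothetical). The idea is to resolve a single tabid even when a spider-mangled URL carries a tabid both in the friendly-URL path and in the query string:

```python
from urllib.parse import urlsplit, parse_qs

def extract_tabid(url):
    """Illustrative sketch: prefer the tabid segment embedded in the
    friendly-URL path (".../tabid/<number>/..."); fall back to the
    query string, and always return a single value."""
    parts = urlsplit(url)
    segments = parts.path.strip("/").split("/")
    # Friendly URLs encode the tab as ".../tabid/<number>/..."
    for i, seg in enumerate(segments):
        if seg.lower() == "tabid" and i + 1 < len(segments):
            if segments[i + 1].isdigit():
                return int(segments[i + 1])
    # No tabid in the path: fall back to the query string
    qs = parse_qs(parts.query)
    if "tabid" in qs and qs["tabid"][0].isdigit():
        return int(qs["tabid"][0])
    return None
```

For the mangled URL from the example above, this sketch would pick the path's tabid (45) and ignore the conflicting query-string value, which is the kind of disambiguation the improved parsing needs to make.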



Shane Miller
Call Centers 24x7
 
New Post
7/17/2006 3:32 PM
 

Thanks very much! I made the change to my DNN 4.02 web.config file, so hopefully it works. Interesting that there are apparently two fixes for this:

 

1) Using .browser files

2) Adding  cookieless="UseCookies"   in the web.config

 

 
New Post
7/17/2006 5:07 PM
 
Please report back. I made this change to my web.config file, and Google has attempted to crawl my site since then, but I see only very slightly improved results. I still get the leading ".." exception and am still desperate for a fix.

Thanks
Ken
 
New Post
7/17/2006 6:10 PM
 

The fix did solve my problem with googlebot, anyway (Slurp has yet to crawl since my change). Googlebot has been crawling my site all afternoon, error free.  Hell yes! The one thing, though, is that I'm not sure what did it: the web.config change, adding the browser files, or both. I changed my googlebot.browser file and the web.config at about the same time, so I don't know which one did it.

There is a way to find out, but I'm not changing anything at this point.

I would recommend doing both. Modify the web.config and also implement the App_Browser folder containing a .browser file for each bot.
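No one in the thread ever posts an actual .browser file. As a rough sketch only (the exact capabilities DNN expects may differ, and you should verify the schema against your ASP.NET version), a Googlebot definition dropped into the App_Browsers folder might look something like this:

```xml
<!-- App_Browsers/googlebot.browser — illustrative sketch, not a verified DNN file -->
<browsers>
  <browser id="Googlebot" parentID="Mozilla">
    <identification>
      <!-- Match on the Googlebot user-agent string -->
      <userAgent match="Googlebot" />
    </identification>
    <capabilities>
      <capability name="browser" value="Googlebot" />
      <capability name="crawler" value="true" />
      <!-- Telling ASP.NET the client accepts cookies avoids the
           cookieless-URL rewriting that triggers the ".." error -->
      <capability name="cookies" value="true" />
    </capabilities>
  </browser>
</browsers>
```

The cookies capability is the relevant piece here: it steers ASP.NET 2.0 away from cookieless session URLs for the bot, which is the same failure mode the web.config cookieless="UseCookies" change addresses globally.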

 

