Sure enough, that post is in the Benefactors forum. There really were two salient points. The first was a solution I had devised for this problem, which I posted as:
Please check whether you've been receiving any errors like this in the log:
Message: DotNetNuke.Services.Exceptions.PageLoadException: Cannot use a leading .. to exit above the top directory. -
I recently devised a fix for this error by modifying the forms authentication element in web.config to the following:
<!-- Forms or Windows authentication -->
<authentication mode="Forms">
    <forms name=".DOTNETNUKE" protection="All" timeout="60" cookieless="UseCookies" />
</authentication>
My fix has since been added to the default web.config for 4.3.2, but if you are running an earlier version of DNN, this could explain your problem. The error prevents some spiders, such as Google, from spidering your site, presumably because, without an explicit cookieless="UseCookies" setting, ASP.NET 2.0 may decide a spider cannot accept cookies and embed the forms authentication ticket in the URL instead, producing paths the site cannot resolve.
You can view more details in this thread:
http://www.dotnetnuke.com/tabid/795/forumid/108/threadid/46036/threadpage/1/scope/posts/Default.aspx
The second salient point was a post from Charles Nurse with more details:
There is an issue with ASP.NET 2.0 sites and Google/Yahoo that is "fixed" by the cookieless="UseCookies" attribute mentioned above.
This fix has been added to the next stabilization release.
Also, you may see exceptions in your logs from some spiders. It appears that some spiders (especially msnbot) do not work very well with friendly URLs.
For example, suppose the base page is http://www.mysite.com/MyPage/tabid/45/Default.aspx. When a spider sees a link on the page like href="/LinkClick.aspx?link=xx&tabid=yy&mid=zz", it seems to combine the paths like this:
http://www.mysite.com/MyPage/tabid/45/LinkClick.aspx?link=xx&tabid=yy&mid=zz
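To make the difference concrete, here is a minimal C# sketch (my own illustration, not DNN or spider code) contrasting how a root-relative href should resolve against that base page with the folder-style concatenation the spiders appear to be doing:

using System;

class SpiderLinkDemo
{
    static void Main()
    {
        // The friendly-URL page the spider starts from.
        var basePage = new Uri("http://www.mysite.com/MyPage/tabid/45/Default.aspx");
        const string link = "/LinkClick.aspx?link=xx&tabid=yy&mid=zz";

        // Correct resolution: a root-relative href replaces the whole path.
        var correct = new Uri(basePage, link);
        Console.WriteLine(correct);
        // -> http://www.mysite.com/LinkClick.aspx?link=xx&tabid=yy&mid=zz

        // What the spiders appear to do: append the link to the page's folder,
        // so the friendly-URL tabid (45) and the query-string tabid (yy) collide.
        string spiderStyle = basePage.GetLeftPart(UriPartial.Path)
            .Replace("Default.aspx", string.Empty).TrimEnd('/') + link;
        Console.WriteLine(spiderStyle);
        // -> http://www.mysite.com/MyPage/tabid/45/LinkClick.aspx?link=xx&tabid=yy&mid=zz
    }
}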
Now this is fine if yy = 45, as it is on the page "MyPage", but as the spider follows the links off that page, it seems to keep using the original base page as its starting point rather than the new page.
This is only an explanation of what appears to be happening, based on exceptions being logged on dnn.com. We are not sure that this is the case.
The upshot is that yy <> 45, the FriendlyUrlProvider has trouble parsing the tabid, and an exception is thrown.
While we cannot "fix" how the spiders generate their links, we have added better parsing to the FriendlyUrlProvider so that it can extract a single tabid from the path.
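As a rough illustration of the kind of defensive parsing involved (a hypothetical sketch, not the actual FriendlyUrlProvider code), the idea is to pull one tabid out of the path segment and ignore any conflicting tabid that leaked in via the query string:

using System;
using System.Text.RegularExpressions;

class TabIdParserSketch
{
    // Extracts the tabid from the friendly-URL segment ("/tabid/45/") and
    // ignores any additional tabid carried in the query string.
    static int? ParseTabId(string url)
    {
        var uri = new Uri(url);
        // Only look at the path, so "?tabid=yy" in the query cannot interfere.
        Match match = Regex.Match(uri.AbsolutePath, @"/tabid/(\d+)", RegexOptions.IgnoreCase);
        return match.Success ? int.Parse(match.Groups[1].Value) : (int?)null;
    }

    static void Main()
    {
        string mangled = "http://www.mysite.com/MyPage/tabid/45/LinkClick.aspx?link=xx&tabid=99&mid=zz";
        Console.WriteLine(ParseTabId(mangled)); // -> 45: the tabid in the path wins
    }
}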