
Creating a web bot/crawler/spider for multiple websites
10/21/2008 12:02 AM
 

Hello

I need to create a web bot/crawler/spider that visits different websites, collects data for us, and stores it in a database. The crawler needs to read the options on a website (from drop-downs, radio buttons, or check boxes) and either create some input itself or use generic predefined words that we provide.

For example, a webpage might be structured with a text field and some drop-downs. Typically, if the user enters the case number of a court case, the website displays its status, and there might also be different legal documents that can be retrieved through drop-down options such as 'Industry Permits', 'Civil Cases', 'Criminal Cases', etc. So the crawler should be able to read the page, generate a list of suitable options itself, and use them to get the data. We want to create a bot/crawler/spider that automatically enters the information about multiple cases, i.e. case numbers (text field) and case type (from drop-downs), and retrieves the data about the relevant cases available on the website.
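To make the "read the options" step concrete, here is a minimal sketch of how a crawler could discover a form's fields using only Python's standard library (Python is used for brevity; the same idea carries over to .NET). The sample page and the field names `case_number`, `case_type`, and `status` are made up for illustration, and real sites may build their forms with JavaScript, which this does not handle.

```python
from html.parser import HTMLParser


class FormFieldCollector(HTMLParser):
    """Collect free-text field names and the choices offered by
    <select>, radio, and checkbox inputs."""

    def __init__(self):
        super().__init__()
        self.text_fields = []        # names of free-text inputs
        self.choices = {}            # field name -> list of allowed values
        self._current_select = None  # name of the <select> being parsed

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        name = attrs.get("name")
        if tag == "select" and name:
            self._current_select = name
            self.choices.setdefault(name, [])
        elif tag == "option" and self._current_select:
            value = attrs.get("value")
            if value:  # this sketch skips placeholders and value-less options
                self.choices[self._current_select].append(value)
        elif tag == "input" and name:
            kind = (attrs.get("type") or "text").lower()
            if kind in ("radio", "checkbox"):
                self.choices.setdefault(name, []).append(attrs.get("value", "on"))
            elif kind == "text":
                self.text_fields.append(name)

    def handle_endtag(self, tag):
        if tag == "select":
            self._current_select = None


# Hypothetical page, standing in for a real court-records search form.
SAMPLE_PAGE = """
<form action="/search" method="get">
  <input type="text" name="case_number">
  <select name="case_type">
    <option value="">-- choose --</option>
    <option value="permits">Industry Permits</option>
    <option value="civil">Civil Cases</option>
    <option value="criminal">Criminal Cases</option>
  </select>
  <input type="radio" name="status" value="open">
  <input type="radio" name="status" value="closed">
</form>
"""

collector = FormFieldCollector()
collector.feed(SAMPLE_PAGE)
print(collector.text_fields)
print(collector.choices)
```

The collector only records what the page offers; deciding which values are worth submitting is a separate step.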

What is the best approach to achieve this? We can write individual bots for each website, but we are trying to come up with a more intelligent bot or crawler that can be used to crawl multiple websites. Please advise on how we can achieve this.

We are not doing anything illegal; everything is perfectly legal.
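One possible way to avoid a hand-written bot per site is to describe each site as data and let one generic routine enumerate the requests. The sketch below assumes a form that submits via GET; the URL, field names, and case numbers are hypothetical, and sites that use POST, session cookies, or ASP.NET view state would need more than this.

```python
from itertools import product
from urllib.parse import urlencode


def build_queries(base_url, text_inputs, choices):
    """Yield one GET URL per combination of text input and form choice."""
    names = list(text_inputs) + list(choices)
    value_lists = list(text_inputs.values()) + list(choices.values())
    for combo in product(*value_lists):
        yield base_url + "?" + urlencode(dict(zip(names, combo)))


# Hypothetical per-site description; a crawler would keep one per website.
site = {
    "base_url": "https://example.invalid/search",
    "text_inputs": {"case_number": ["2008-001", "2008-002"]},
    "choices": {"case_type": ["civil", "criminal"]},
}

urls = list(build_queries(**site))
for url in urls:
    print(url)
```

With this layout, adding a website means adding another description rather than another program, and the `choices` entry can be filled in from whatever the crawler reads off the form.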

Regards
Kishore

 
10/21/2008 8:35 AM
 

This is a DotNetNuke forum. I would suggest you have a better chance of getting an answer at the forums at asp.net or forums.msdn.microsoft.com.

