DNN SearchProvider - Lucene.net - General Discussion


1/12/2007 7:32 AM

cent_usr

Joined: 3/11/2004

Posts: 103

DNN SearchProvider - Lucene.net

Where did the core SearchProvider forum, the one owned by Shawn, go? I guess I have been away a little too long.

Anyhow, I'm hoping some of you will back this suggestion: adapt a SearchProvider to Lucene.net (http://www.dotlucene.net/).
I have worked with it in several CMS tools and it works splendidly. I think most of the work is already done in the current provider, and my guess is that not much more is needed to get this into place. (If someone has worked with Lucene.net, please have a look at the SearchProvider to see how it could be hooked up.)

1/12/2007 7:47 AM

cent_usr

Joined: 3/11/2004

Posts: 103

Re: DNN SearchProvider - Lucene.net Modified By cent_usr on 1/12/2007 7:49:21 AM

Maybe I should explain the benefits this would provide, in case they are not clear. You index the information in your database outside of SQL Server, much as Google does, and you search the extracted index files rather than the database. When you get a hit, the indexed entry deep-links back into your system. Simple and really, really quick. The forum here on DNN would truly benefit from this. I think the SearchProvider is quite easy to hook up to the Lucene search API, and thereby index all DNN content externally. So quick, and so powerful. It would also make it possible to search through Word, PPT, and PDF files uploaded through, for example, the document module. Not to forget, it would relieve the load on SQL Server.
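To make the idea concrete, here is a minimal sketch in Python of an index kept outside the database that maps search terms to deep links. It is not Lucene.net and not DNN code; the `TinyIndex` class and the example links are hypothetical and only illustrate the principle.

```python
import re
from collections import defaultdict


class TinyIndex:
    """Toy inverted index held outside the database (illustration only)."""

    def __init__(self):
        # term -> set of deep links pointing back into the site
        self._postings = defaultdict(set)

    def add(self, deep_link, text):
        """Index the words of `text` under the given deep link."""
        for term in re.findall(r"\w+", text.lower()):
            self._postings[term].add(deep_link)

    def search(self, query):
        """Return the deep links whose text contains every query term."""
        terms = re.findall(r"\w+", query.lower())
        if not terms:
            return set()
        hits = set(self._postings.get(terms[0], set()))
        for term in terms[1:]:
            hits &= self._postings.get(term, set())
        return hits


index = TinyIndex()
# Hypothetical deep links; a real site would store its own URLs here.
index.add("/forum/post/10", "Lucene search provider for DNN")
index.add("/docs/7", "Quarterly report as PDF")
print(index.search("lucene provider"))
```

A search never touches the database: it only reads the index and hands back links, which is where the speed and the reduced SQL Server load come from.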

1/12/2007 10:07 AM

Joe Brinkman

blog.theaccidentalgeek.com
Joined: 5/18/2003

Posts: 2356

Re: DNN SearchProvider - Lucene.net

The DNN search engine is made up of three parts: the DataStoreProvider, the Search Indexer, and the core search engine, which is a thin layer of glue that coordinates the two. In a previous MSDN webcast, I showed how to create a new DataStoreProvider using Lucene.net. I have not tackled the indexer yet because there are some issues involved that make it a much more difficult task.

1. Lucene does not have any concept of DotNetNuke roles or permissions. A standard spider approach does not work for sites where much of the content is hidden depending on the role of the user. What we tried to do in the standard indexer was create a mechanism for tagging content so that only people with the appropriate permissions can view it. This allows us to display search results only to people who have the correct permissions for the associated content. Also, it is nearly impossible for an app to index content that relies on "context" identifiers like querystrings, HttpContext, viewstate, etc. Our solution in the first iteration was not to guess at how to solve this, but instead to let module developers provide the content in a format that could be stored and searched. There are some problems with this approach, but it can provide much better results than trying to spider partial content or showing search results that are inaccessible to the user.

2. Right now there are some issues in the glue code and the data formats used for indexing and storing the search content. Before we can fix the indexing issues, we have to first identify a better core API to allow modules to index content on the fly rather than relying on a batch process. We also have to improve the method we use for gathering the search content so that we have a much richer set of metadata to work with.
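The tagging idea from point 1 can be sketched roughly as follows. This is a Python illustration, not the actual DNN indexer: the `SearchItem` type, the `view_roles` field, and the role names are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SearchItem:
    """Content a module hands to the indexer, tagged with allowed roles."""
    title: str
    body: str
    view_roles: frozenset  # roles allowed to view this item (hypothetical field)


def search(items, query, user_roles):
    """Return matching items that at least one of the user's roles may view."""
    q = query.lower()
    user_roles = set(user_roles)
    return [
        item for item in items
        if q in item.body.lower() and item.view_roles & user_roles
    ]


# Role names are illustrative only.
items = [
    SearchItem("Public FAQ", "How to install a module", frozenset({"All Users"})),
    SearchItem("Admin notes", "How to install a hotfix", frozenset({"Administrators"})),
]

for item in search(items, "install", {"All Users"}):
    print(item.title)
```

Because the roles travel with the indexed item, results can be trimmed at query time without asking the module again, which a plain spider cannot do.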

In my opinion, relying on an engine like Lucene.net will only replace the existing issues with new problems. Instead, I think we need to work on correcting the existing infrastructure so that we can better integrate any indexing or storage solution. We also need to make it easier for module developers to correctly index their content. One of our biggest problems is that the current API is too difficult for module developers to implement correctly, and that the core engine is too flaky and non-performant to make it worth the module developers' time investment.

Joe Brinkman
DNN Corp.

1/12/2007 10:26 AM

cent_usr

Joined: 3/11/2004

Posts: 103

Re: DNN SearchProvider - Lucene.net Modified By cent_usr on 1/27/2007 3:07:35 PM

Thanks Joe, that was a clear answer on the current situation. I mainly use Lucene search providers in CMS tools. We build SSO integrations with various web tools, in some cases with CommunityServer, which also uses Lucene in its Enterprise Search (ES) package, and that package respects role-dependent content. ES triggers indexing through a listener concept: as content is added, the listening ES fires up and indexes the newly added object. Index optimization runs in cycles.
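Roughly, the listener concept looks like the sketch below. It is written in Python for illustration and is not how CommunityServer or ES is actually implemented; `ContentStore`, `Indexer`, and the `optimize_every` counter (a stand-in for the optimization cycle) are hypothetical names.

```python
class ContentStore:
    """Holds content and notifies listeners whenever something is added."""

    def __init__(self):
        self._listeners = []
        self.items = {}

    def subscribe(self, callback):
        self._listeners.append(callback)

    def add(self, item_id, text):
        self.items[item_id] = text
        for callback in self._listeners:
            callback(item_id, text)


class Indexer:
    """Indexes each new object as it arrives; optimizes every N additions."""

    def __init__(self, optimize_every=3):
        self.terms = {}          # term -> set of item ids
        self.optimize_every = optimize_every
        self.pending = 0         # additions since the last optimization
        self.optimize_runs = 0

    def on_added(self, item_id, text):
        for term in text.lower().split():
            self.terms.setdefault(term, set()).add(item_id)
        self.pending += 1
        if self.pending >= self.optimize_every:
            self.optimize()

    def optimize(self):
        # Placeholder for a real engine's index optimization pass.
        self.pending = 0
        self.optimize_runs += 1


store = ContentStore()
indexer = Indexer(optimize_every=2)
store.subscribe(indexer.on_added)
store.add(1, "Lucene provider")
store.add(2, "Search provider")
print(sorted(indexer.terms["provider"]))
```

The point is that new content becomes searchable as soon as it is added, with no batch job in between, while the heavier optimization step stays on its own schedule.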

Your explanation of the provider's maturity makes things easy to understand. Thank you for taking the time to explain this.

1/31/2007 2:11 AM

Dave Woestenborghs

Joined: 4/7/2004

Posts: 53

Re: DNN SearchProvider - Lucene.net

jbrinkman wrote

Where can we find this webcast?

Page 1 of 2
