February 18, 2009
I had a conversation today with one of our new employees about the power of the Lucene search engine in MindTouch Deki, and I thought I would share some of the high points here.
Like many of the features of MindTouch Deki, the power of the search engine is only revealed once you start digging deeper into the functionality. On its face, our search engine is like many others, with the most common interface being a plain text box on each of the Deki pages. The real power lies in the complex expressions that the search engine can interpret. For those of you who have used university library search engines like ABI/INFORM or LexisNexis, some of this will be familiar.
First, it is important to note that our search engine indexes not just pages, but also the contents of many common file types, such as Word documents and PDFs. So, a simple search for hello world will return pages that contain the terms hello or world (or both), along with files that have hello or world in the filename, and even documents that contain hello or world in their text. This search will probably return a lot of results; what if you are only looking for results that have both terms in the title?
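To make the default "match any term" behavior concrete, here is a minimal Python sketch. It is not MindTouch Deki or Lucene code: the `tokenize` and `matches_any` helpers and the sample items are hypothetical, and real Lucene analysis and scoring are considerably more involved.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into a set of word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def matches_any(query, item):
    """Return True if any query term appears in the item's title or body."""
    terms = tokenize(query)
    words = tokenize(item["title"]) | tokenize(item["body"])
    return bool(terms & words)

# Hypothetical sample content: a page, a file, and an unrelated page.
items = [
    {"title": "Greetings", "body": "Say hello to the new team."},
    {"title": "world-map.pdf", "body": "Atlas of major cities."},
    {"title": "Release notes", "body": "Bug fixes only."},
]

# A bare two-word search matches items containing either word.
hits = [item["title"] for item in items if matches_any("hello world", item)]
print(hits)  # ['Greetings', 'world-map.pdf']
```

The first item matches on its body and the second on its filename, which is why a plain search tends to cast such a wide net.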
Simple. MindTouch Deki provides many fields that can be used to refine your search. By adding the Title: field to your search and surrounding the terms in quotes, like this: Title:"Hello World", your search will only return results with hello world in the title or filename.
Of course, you are not limited to a single expression in your search queries; the Boolean operators AND, OR, and NOT let you combine search terms for even more precision.
If you wanted to refine the above search to return only documents, ignoring images and pages, you would add the Type: field together with AND, like this: Title:"Hello World" AND Type:"Document". Your search results would then consist only of documents with the words hello and world in the title.
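If you build queries like this from a script rather than typing them into the search box, a small helper keeps the quoting consistent. The following is a minimal sketch, assuming you only need to compose the query string; `field_clause` and `all_of` are hypothetical names, not part of any MindTouch Deki API, and the backslash escaping follows general Lucene query syntax.

```python
def field_clause(field, value):
    """Build a quoted field clause, e.g. Title:"Hello World".

    Backslashes and double quotes inside the value are escaped so the
    phrase stays a single quoted unit.
    """
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'{field}:"{escaped}"'

def all_of(*clauses):
    """Join clauses with AND so that every clause must match."""
    return " AND ".join(clauses)

query = all_of(
    field_clause("Title", "Hello World"),
    field_clause("Type", "Document"),
)
print(query)  # Title:"Hello World" AND Type:"Document"
```

The resulting string can be pasted into the search box as-is.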
The NOT operator can help you eliminate results that you know are not relevant to your search. As an example, suppose you are looking for any written information about MindTouch Deki. This information could certainly be in PDFs, Word docs, and of course Deki pages. It is unlikely to be in images, however, so to eliminate them from your search, put the NOT operator in front of the term that you are removing, as in this example:
MindTouch AND Deki NOT Type:"image"
This query returns all pages and documents containing the terms MindTouch and Deki, but no images.
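The effect of combining AND with NOT can be pictured as a filter over candidate results. Here is a toy Python sketch under stated assumptions: the `search` function and the four sample items are hypothetical, matching is a plain substring test, and none of this reflects how Lucene actually evaluates queries internally.

```python
def search(items, required_terms, excluded_type):
    """Keep items that contain every required term and are not of the excluded type."""
    results = []
    for item in items:
        text = item["text"].lower()
        has_all = all(term.lower() in text for term in required_terms)
        if has_all and item["type"] != excluded_type:
            results.append(item["name"])
    return results

# Hypothetical index contents.
items = [
    {"name": "Install guide", "type": "wiki", "text": "Installing MindTouch Deki"},
    {"name": "overview.pdf", "type": "document", "text": "MindTouch Deki overview"},
    {"name": "logo.png", "type": "image", "text": "MindTouch Deki logo"},
    {"name": "Lunch menu", "type": "wiki", "text": "Sandwiches on Friday"},
]

results = search(items, ["MindTouch", "Deki"], excluded_type="image")
print(results)  # ['Install guide', 'overview.pdf']
```

The image matches both terms but is dropped by the exclusion, while the lunch menu never matched in the first place.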
One last trick for finding relevant information is to return only results that have been viewed a certain number of times. For example, if you are looking for troubleshooting documentation for your 1969 Chevy Nova, you may want to limit your search to wiki pages that have been viewed more than 100 times. Simply use the Viewcount operator to refine your search:
1969 AND Chevy AND Nova AND Type:"wiki" AND Viewcount > 100
Of course, powerful enterprise search would be useless without appropriate permissions. The MindTouch Deki search engine only returns results that the user has permission to view, maintaining a search environment that is fast, powerful, and secure.
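This post does not describe how MindTouch Deki enforces permissions internally, so the following is only an illustration of the general idea: one common approach is to filter raw hits against an access list before showing them. The `visible_results` function, the `acl` mapping, and the user names are all hypothetical.

```python
def visible_results(results, user, acl):
    """Drop any result the user is not allowed to view.

    acl maps a result name to the set of users permitted to see it.
    Results with no acl entry are hidden (deny by default).
    """
    return [name for name in results if user in acl.get(name, set())]

# Hypothetical access list and raw search hits.
acl = {
    "Install guide": {"alice", "bob"},
    "salaries.pdf": {"alice"},
}
raw_hits = ["Install guide", "salaries.pdf", "Unlisted page"]

bob_hits = visible_results(raw_hits, "bob", acl)
print(bob_hits)  # ['Install guide']
```

Denying by default means a result with no recorded permissions is hidden rather than leaked.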
That’s it for our brief overview. For those of you who would like more in-depth information, head over to our developer pages for a detailed article on Lucene, the search engine that powers MindTouch Deki.