Tuesday, July 28, 2009

Enterprise Search Architecture & Configuration - Part 4

WSS Search Scenarios
WSS uses the same search architecture, with some limitations. The following list describes WSS search:
  • A single server performs both the indexing and query roles; this is called a “search server”.
  • Multiple content databases crawled by a single search server
  • Ability to index only local content. It does not allow additional data sources to be added.
  • Content is automatically indexed – minimal search administration
  • Most of the capabilities for Search are configured automatically during installation.
  • The STSADM command-line tool exposes some administrative operations for WSS.
  • Ability to query a site and the subsites below it, a list or library, or a folder.
  • Only SharePoint content in the site collection can be crawled.
  • There is no aggregation of search results across site collections.
  • Full crawls occur as specified in the Administrator-controlled crawl schedule.

Indexing Service in WSS
  • Uses the Windows SharePoint Services 3.0 protocol handler and appropriate IFilters to extract and filter individual items from the site.
  • Appropriate IFilters for each document are applied, and the Filter Daemon passes the extracted text and metadata to the index engine.
  • The index engine saves document properties to a property store that is separate from the content index.
  • The property store also maintains and enforces document-level security.
  • The actual text of a content item is stored in the content index, so it can be used for content queries.
  • The index engine uses word breakers to further process the text and properties picked up during the crawl. The word breaker component is used to break the text into words and phrases.
  • The index engine creates an inverted index for full-text searching.
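The indexing steps above can be sketched in a few lines of Python. This is only an illustrative model, not the WSS implementation: the `break_words` and `build_index` functions, the sample documents, and the `readers` property are all hypothetical, and the word breaker here is a simple regular expression rather than a language-specific component. It shows the text going into an inverted content index while the properties are kept in a separate property store.

```python
import re
from collections import defaultdict


def break_words(text):
    """Toy word breaker (assumption): lowercase, split on non-alphanumerics."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents):
    """Build an inverted content index and a separate property store."""
    content_index = defaultdict(set)  # word -> ids of documents containing it
    property_store = {}               # document id -> document properties
    for doc_id, doc in documents.items():
        # Properties are stored apart from the full-text index.
        property_store[doc_id] = dict(doc["properties"])
        # Each word produced by the word breaker points back to the document.
        for word in break_words(doc["text"]):
            content_index[word].add(doc_id)
    return content_index, property_store


# Hypothetical sample content standing in for crawled items.
documents = {
    1: {"text": "Quarterly sales report",
        "properties": {"author": "alice", "readers": {"alice", "bob"}}},
    2: {"text": "Sales forecast, draft",
        "properties": {"author": "bob", "readers": {"bob"}}},
}

content_index, property_store = build_index(documents)
print(sorted(content_index["sales"]))
```

Looking up a word in `content_index` returns the set of documents that contain it, which is what makes full-text queries fast: the engine never scans the documents themselves at query time.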

Query Service in WSS
  • The query engine passes the query through a language-specific word breaker and stemmer.
  • When the query engine executes a property value query, the index is checked first to get a list of possible matches.
  • The properties for the matching documents are loaded from the property store, and the properties in the query are checked again to ensure that there was a match.
  • The result of the query is a list of all matching results, ordered according to their relevance to the query words.
  • If the user does not have permission to view a matching document, the query engine filters that document out of the list that is returned.
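The query steps above can be sketched the same way. Again, this is a simplified model under stated assumptions, not the WSS query engine: `CONTENT_INDEX`, `PROPERTY_STORE`, `stem`, and `search` are hypothetical names, the stemmer only strips a trailing "s", and relevance is approximated by counting matching query terms, which is not how WSS ranks results.

```python
import re

# Hypothetical pre-built structures: an inverted index and a property store.
CONTENT_INDEX = {"sale": {1, 2}, "report": {1}, "forecast": {2}}
PROPERTY_STORE = {
    1: {"author": "alice", "readers": {"alice", "bob"}},
    2: {"author": "bob", "readers": {"bob"}},
}


def break_words(text):
    """Toy word breaker (assumption): lowercase, split on non-alphanumerics."""
    return re.findall(r"[a-z0-9]+", text.lower())


def stem(word):
    """Very naive stemmer (assumption): strip a trailing 's' from longer words."""
    return word[:-1] if len(word) > 3 and word.endswith("s") else word


def search(query, user, required_properties=None):
    """Return matching document ids, most relevant first, trimmed by permission."""
    terms = [stem(word) for word in break_words(query)]

    # 1. Check the index first to get candidate matches and a crude score.
    scores = {}
    for term in terms:
        for doc_id in CONTENT_INDEX.get(term, ()):
            scores[doc_id] = scores.get(doc_id, 0) + 1

    results = []
    for doc_id, score in scores.items():
        props = PROPERTY_STORE[doc_id]
        # 2. Load the properties and re-check any property values in the query.
        if required_properties and any(
            props.get(key) != value for key, value in required_properties.items()
        ):
            continue
        # 3. Filter out documents the user is not allowed to view.
        if user not in props["readers"]:
            continue
        results.append((score, doc_id))

    # 4. Order by score (highest first), breaking ties by document id.
    results.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in results]


print(search("sales reports", "bob"))
print(search("sales reports", "alice"))
```

In this sketch the user `alice` never sees document 2, even though it matches the query, because the permission check runs before the result list is returned.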

<<Part3 - Configuration & Administration
