Search
|
One of the most important pieces of Reptile is its Search framework.
This provides Reptile with a plugin infrastructure for tying in 3rd
party search infrastructures such as JXTA, Lucene and even the
internal Torque DB index that Reptile uses.
Here is a UML sequence diagram which demonstrates how everything
is put together.
|
|
SearchProviders
|
Essentially everything is based around a SearchProvider:
All searches are abstracted with a SearchRequest:
After running a request, each SearchProvider is held by a
SearchProviderManager.
SearchProviderManagers are also used for obtaining references to
SearchProviders and for garbage collection activities.
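The relationship between these three pieces can be sketched roughly as follows. This is only an illustration in Python, not Reptile's actual (Java) API: the class bodies, the method names `register`, `get` and `collect`, and the in-memory article list are all assumptions made for the example.

```python
import itertools
from dataclasses import dataclass, field


@dataclass
class SearchRequest:
    """Hypothetical stand-in for a Reptile SearchRequest."""
    criteria: list
    search_fields: list = field(default_factory=lambda: ["TITLE", "DESCRIPTION"])
    sort_order: str = "DATE_FOUND"


class SearchProvider:
    """Hypothetical provider that searches an in-memory list of articles."""

    def __init__(self, articles):
        self.articles = articles
        self.state = "search-started"
        self.results = []

    def search(self, request):
        self.state = "search-in-progress"
        terms = [c.lower() for c in request.criteria]
        # Keep an article if any term occurs in any of the requested fields.
        self.results = [
            a for a in self.articles
            if any(t in a.get(f.lower(), "").lower()
                   for t in terms for f in request.search_fields)
        ]
        self.state = "search-complete"
        return self.results


class SearchProviderManager:
    """Holds providers after a search so later requests can find them by handle."""

    def __init__(self):
        self._providers = {}
        self._next_handle = itertools.count(1)

    def register(self, provider):
        handle = str(next(self._next_handle))
        self._providers[handle] = provider
        return handle

    def get(self, handle):
        return self._providers[handle]

    def collect(self, handle):
        # Drop a provider that is no longer needed (the "garbage collection" role).
        return self._providers.pop(handle, None)


articles = [
    {"title": "Linux on the desktop", "description": "An opinion piece."},
    {"title": "Cooking pasta", "description": "Nothing about kernels."},
]
manager = SearchProviderManager()
provider = SearchProvider(articles)
provider.search(SearchRequest(criteria=["Linux"]))
handle = manager.register(provider)
```

The handle returned by the manager plays the role of a token that later requests can use to get back to the same provider.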
|
|
Results
|
All Reptile search results are then serialized into an XML
result set. This content is used by the Reptile sequence system
to provide a UI for the user so that they can navigate the
search results.
Since this is just XML, Reptile can provide an additional XSL
stylesheet for each type of provider. For example, a
ChannelSearchProvider can provide a stylesheet for navigating
and subscribing to channels.
Example:
<!--
Search document declaration. Includes all namespaces and:
- provider-handle: used for future requests from this provider.
- provider-state: ( search-started | search-in-progress | search-complete )
Basically the state the provider is in.
- provider-name: The short classname of this SearchProvider.
- request-name: The short classname of this SearchRequest (or
AdvancedSearchRequest)
- search-started: the (UNIX) time this search started
- search-completed: the (UNIX) time this search completed (optional)
- search-time: the total number of milliseconds this request took to execute. (optional)
-->
<search:search xmlns:search="http://schemas.openprivacy.org/reptile/search"
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:xsd="http://www.w3.org/2001/XMLSchema"
xmlns="http://www.w3.org/1999/xhtml"
provider-handle="1012887691"
provider-state="search-complete"
provider-name="ArticleSearchProvider"
request-name="SearchRequest"
search-started="1013209070341"
search-completed="1013297300042"
search-time="100">
<!-- SearchProvider/SearchRequest specific information can be added here.
We can also add other Reptile-specific information here including OCS
feeds, subscriptions, monitors, etc. -->
<!-- A serialization of the SearchRequest for 'Linux' -->
<search:request>
<search:criteria>
<!-- List of strings given as criteria -->
<xsd:string>Linux</xsd:string>
</search:criteria>
<search:search-fields>
<xsd:string>TITLE</xsd:string>
<xsd:string>DESCRIPTION</xsd:string>
</search:search-fields>
<search:sort-order>
<xsd:string>DATE_FOUND</xsd:string>
</search:sort-order>
</search:request>
<!--
The results element provides a cross-SearchProvider mechanism for navigating
through results.
Attributes:
- start: The index number for the first entry in this result set.
- end: The index number for the last entry in this result set.
- found: The total number of results this search has found.
- total: The total number of entries this SearchProvider is exposed to.
For in-memory databases this is provided. For distributed
systems such as JXTA, this can't be determined (because P2P
systems by definition are non-deterministic) and could
potentially be the same as 'found'.
- page: the page number that this request falls on (uses a 0 based
index).
- total-pages: The total number of pages this SearchProvider contains.
This can be used by a stylesheet for providing a UI so
that the user can navigate to the next page or an
arbitrary page in the index.
-->
<search:results start="0" end="9" found="25" total="398" page="0" total-pages="40">
<search:entry>
<!-- title for this entry -->
<dc:title>Linux on the desktop is alive and kicking!</dc:title>
<!-- A description, this is optional as in RSS -->
<dc:description>
This article over at LinuxPlanet tries to convince
the reader that because of recent events, Linux on the desktop is
never going to happen. The author couldn't be more wrong. His
logic doesn't hold up when compared to historical evidence. He
sites the death of Eazel as one example. This is at the very
minimum irrelevant. The Linux movement has nothing to do with
companies (even though it is nice to have their involvement). The
entire KDE project was created with little involvement from
companies.
</dc:description>
<!--
link information including the date information.
In distributed systems that don't use the DB index, date-found is
the current time in milliseconds that we found the entry.
When using the index, this is the last time the metaupdate system
updated this entry.
- date-found: contains the date this link was found (in
milliseconds since Jan 1 1970) (required)
- last-updated: same as date-found but the time this URL was
last updated. (required)
- channel: the channel (RSS, etc) that this URL is held
in. (optional)
- location: the URL of this entry.
-->
<search:link date-found="1009940886"
last-updated="1009940889"
channel="http://www.slashdot.org/slashdot.rdf"
location="http://www.linuxplanet.com/linuxplanet/opinions/3387/1/"/>
</search:entry>
</search:results>
</search:search>
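As a rough illustration of how a client might consume such a result set, the following Python sketch parses a trimmed-down copy of the document above with the standard library. Reptile itself renders results through XSL stylesheets; the variable names and the choice of which attributes to extract are assumptions made for this example only.

```python
import xml.etree.ElementTree as ET

# Prefix-to-namespace map used for the lookups below.
NS = {
    "search": "http://schemas.openprivacy.org/reptile/search",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Trimmed-down result set in the shape shown above.
DOCUMENT = """\
<search:search xmlns:search="http://schemas.openprivacy.org/reptile/search"
               xmlns:dc="http://purl.org/dc/elements/1.1/"
               provider-handle="1012887691"
               provider-state="search-complete">
  <search:results start="0" end="9" found="25" total="398" page="0" total-pages="40">
    <search:entry>
      <dc:title>Linux on the desktop is alive and kicking!</dc:title>
      <search:link date-found="1009940886"
                   last-updated="1009940889"
                   channel="http://www.slashdot.org/slashdot.rdf"
                   location="http://www.linuxplanet.com/linuxplanet/opinions/3387/1/"/>
    </search:entry>
  </search:results>
</search:search>
"""

root = ET.fromstring(DOCUMENT)
results = root.find("search:results", NS)

# Unprefixed attributes carry no namespace, so plain names work here.
summary = {
    "handle": root.get("provider-handle"),
    "state": root.get("provider-state"),
    "page": int(results.get("page")),
    "found": int(results.get("found")),
}

entries = [
    {
        "title": entry.findtext("dc:title", namespaces=NS),
        "location": entry.find("search:link", NS).get("location"),
    }
    for entry in results.findall("search:entry", NS)
]
```

The provider-handle read here is the value a later request would send back in order to continue working with the same provider.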
|
|
Extensible search parameters
|
Certain types of SearchProviders support different types of search
parameters that can't really be abstracted into a generic
SearchRequest. A decent example would be Mojonation:
"File retrieval on Mojo Nation begins with a content search. At the
search page, the user can select from a growing number of content
types, and each of the content types presents its own array of type
fields to delineate the user's search (that is, the user could
search for a certain 'bitrate' among the 'audio' content types, but
not others). After the user provides his search criteria and clicks
'search,' the Broker goes back to work"
If we wanted to map a Reptile search provider on top of Mojonation,
and at the same time provide the type of query framework a
Mojonation user might expect, we would need a way of
providing Mojonation-style content types within a SearchRequest.
Fortunately such an extension mechanism exists. The SearchRequest
object supports an ExtendedProperties object which accepts typed and
multivalued properties: basically name|value|type triples that a
developer can set in order to tweak search parameters.
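To make the name|value|type idea concrete, here is a minimal Python sketch of a multivalued property store. It is not the real ExtendedProperties class: the method names `add` and `get_all`, the type labels, and the property names "content-type" and "bitrate" are all hypothetical and chosen only to mirror the Mojonation example.

```python
from collections import defaultdict


class ExtendedProperties:
    """Hypothetical store of multivalued name|value|type properties."""

    def __init__(self):
        self._values = defaultdict(list)

    def add(self, name, value, type_name):
        # A name may be added repeatedly; each value keeps its own type label.
        self._values[name].append((value, type_name))

    def get_all(self, name):
        # Return a copy so callers cannot mutate the internal list.
        return list(self._values.get(name, []))


props = ExtendedProperties()
props.add("content-type", "audio", "string")
props.add("bitrate", 128, "integer")
props.add("bitrate", 192, "integer")
```

A provider that understands these properties could read them from the request, while other providers would simply ignore them.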
|
|
Invoking searches through Actions
|
All searches through Reptile (when done through a browser) are
executed by Actions. The default
SearchAction accepts the following parameters:
- reptile.search.order: channel | title | date_found
- reptile.search.fields: comma-separated list of fields to match.
Example: title, description, location, etc.
- reptile.search.maxcount: maximum number of items to return (100, 200, 300, etc.)
- reptile.search.provider: The short classname of the provider to use to execute the query.
- reptile.search.request: The short classname of the request to use to execute the query.
- reptile.search.request-name: A named request, e.g. 'NewestArticlesSearchRequest'.
- reptile.search.provider-name: Execute a request on a specific provider.
All pages that need to invoke searches should use the Search
action. This will handle executing the search with the correct
provider and search request and redirect to
the urn:search sequence with the correct params.
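Assuming these parameters are passed as ordinary URL query parameters, a request to the Search action could be assembled along the following lines. The host and path are hypothetical, and the parameter carrying the search terms is left out because it is not listed above.

```python
from urllib.parse import urlencode, parse_qs

# Parameter names come from the list above; the values are illustrative.
params = {
    "reptile.search.order": "date_found",
    "reptile.search.fields": "title,description",
    "reptile.search.maxcount": 100,
    "reptile.search.provider": "ArticleSearchProvider",
    "reptile.search.request": "SearchRequest",
}

query = urlencode(params)
url = "http://localhost:8080/reptile/search?" + query  # hypothetical host and path

# Decoding the query string recovers the original values.
decoded = parse_qs(query)
```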
|
|
Serializing requests and page navigation
|
The results of every search provider are broken down into individual
atomic pages and presented to the user. This is similar to the usual
page navigation used in any popular
search engine. Each page displays around 10 results, and
you can navigate forward and backward through the result set.
The SearchSerializer handles breaking down a SearchProvider's results
into XML documents, each of which represents a page.
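The page arithmetic behind this can be sketched as below. The function name `page_bounds` and the fixed page size of 10 are assumptions; the 0-based page index and the inclusive start/end indices follow the attributes described for the results element in the example result set.

```python
import math


def page_bounds(page, total, page_size=10):
    """Return (start, end, total_pages) for a 0-based page number."""
    total_pages = math.ceil(total / page_size)
    if not 0 <= page < total_pages:
        raise IndexError(f"page {page} out of range (0..{total_pages - 1})")
    start = page * page_size
    end = min(start + page_size, total) - 1  # inclusive index of the last entry
    return start, end, total_pages


first_page = page_bounds(0, 398)
last_page = page_bounds(39, 398)
```

With 398 entries and 10 entries per page this gives 40 pages, the first covering indices 0 to 9, which agrees with the start, end and total-pages values in the example result set.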
All required XML is given to Reptile from the SearchExtension:
This extension is invoked as an Xalan extension element in the
following manner:
<serializer:serialize page="2" provider-handle="1012870437"/>
When invoked within a sequence we use the reptile.search.page
parameter to determine which page number to serialize.
The provider-handle is used to pop this provider from the
ProviderManager.
|
|
Search sequences
|
The following stylesheet sequences are used
by Reptile search:
- urn:search-serialize: Serializes individual pages and provides XML output.
- urn:search-serialize/control: Serializes individual pages and provides XHTML output within a
Reptile control.
- urn:search-serialize/page: Serializes individual pages and provides XHTML output within a
Reptile page.
- urn:search-request: Executes a search request and then serializes the results to XML.
- urn:search-request/control: Executes a search request and then serializes the results within
a Reptile control.
- urn:search-request/page: Executes a search request and then serializes the results within
a Reptile page.
|
|
Advanced search requests
|
Reptile also supports the concept of an AdvancedSearchRequest. This
is a compiled search request that contains complex queries that may
be specific to a Provider.
For example the UnreadArticlesSearchRequest will find all
articles that have not been marked read.
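One way to picture such a request is as an object that carries its own matching logic. The Python sketch below is purely illustrative: the `matches` method, the `Article` type, and the `run` helper are hypothetical and do not reflect how Reptile compiles or executes these requests.

```python
from dataclasses import dataclass


@dataclass
class Article:
    title: str
    read: bool = False


class AdvancedSearchRequest:
    """Hypothetical base: a request whose query logic is built in."""

    def matches(self, article):
        raise NotImplementedError


class UnreadArticlesSearchRequest(AdvancedSearchRequest):
    def matches(self, article):
        # Select only articles not yet marked as read.
        return not article.read


def run(request, articles):
    # Stand-in for a provider executing the request over its own data.
    return [a for a in articles if request.matches(a)]


articles = [Article("Old news", read=True), Article("Fresh story")]
unread = run(UnreadArticlesSearchRequest(), articles)
```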
|
|
Remote search providers
|
Reptile supports the concept of a RemoteSearchProvider. This allows
us to invoke requests on behalf of a specific network binding. The
search() request takes any action necessary to execute
|
|
|