
Relevance Ranking Context Set version 1.1

Version 1.1, 2nd September 2009
see also version 1.0

The default ordering of a result set is left up to the server, including a lack of any explicit ordering. This is addressed in SRU for the most part through the use of the 'sort' / 'sortKeys' parameter in SRU v1.1 and by the 'sortBy' keyword in SRU v1.2 queries. However, for sophisticated relevance-based ranking, different algorithms are available, and specific methods might be requested to combine the results of evaluating each operand or clause. This context set attempts to address this issue by defining relation and boolean modifiers for the various known algorithms, and for combinations of their results. Several known algorithms have their documentation linked in the table in Appendix A below.

If the 'relevant' relation modifier from the CQL context set is given, but no named algorithm, then the server should continue to use the basic semantics -- the server may decide which algorithm to use. It is also legal to include cql.relevant together with an algorithm from this set, in which case that algorithm should be used. Hence there is no need to include an 'any algorithm' relation modifier in this set.

Also, please note that, as with all context sets, these modifiers are case insensitive. "rel.algorithm=CORI" and "rel.algorithm=cori" are to be treated the same. This matters in particular because most of the modifiers are acronyms and so may be entered in upper case in queries, even though they are listed in lower case below.
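
The case-insensitivity rule above can be illustrated with a minimal Python sketch. It is not part of the context set: the function name parse_relation is hypothetical, and the sketch assumes a relation written as a base relation followed by '/'-separated modifiers that use only '=' as the comparison, with no quoted values.

```python
from typing import Dict, Optional, Tuple


def parse_relation(relation: str) -> Tuple[str, Dict[str, Optional[str]]]:
    """Split a relation such as 'any/rel.algorithm=CORI' into its parts.

    Hypothetical helper: returns the lower-cased base relation and a dict
    mapping lower-cased modifier names to lower-cased values (None when the
    modifier carries no value).
    """
    base, *modifiers = relation.split("/")
    parsed: Dict[str, Optional[str]] = {}
    for modifier in modifiers:
        name, separator, value = modifier.partition("=")
        # Fold case so that 'CORI' and 'cori' are treated the same.
        parsed[name.strip().lower()] = value.strip().lower() if separator else None
    return base.strip().lower(), parsed
```

With this sketch, 'any/rel.algorithm=CORI' and 'ANY/REL.ALGORITHM=cori' yield the same result, so a server comparing the parsed values would treat both spellings identically.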

To return relevancy information attached to a record, please see the record metadata extension. (To be written up, à la the 'rec' context set.)

The identifier for the context set is: info:srw/cql-context-set/2/relevance-1.1
The recommended short name is: rel
The maintainer of the context set is: john.harrison@liv.ac.uk

Indexes

There are no indexes defined in this context set.

Relations

There are no relations defined in this context set.

Relation Modifiers

Modifier Name	Description
algorithm	The algorithm to be used to assign relevance scores to results (see table in Appendix A for examples).
combine	The method to be used to combine scores generated for individual operands (see table in Appendix B for examples).
feedback	Apply blind relevance feedback to increase recall.
minRaw	The minimum raw score that must be achieved (after scores from individual operands have been combined) to be included in results.
minScaled	The minimum scaled score that must be achieved (after scores from individual operands have been combined) to be included in results. Scaled scores are proportionate to the highest score: 0 <= scaledScore <= 1.
const_*	A named constant relevant to the algorithm, e.g. const_k=0.7. This allows constants to be overridden for specific queries or indexes, either to ensure consistency across servers or to fine-tune the results.
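
As an illustration of how the minRaw and minScaled thresholds might be applied, here is a minimal Python sketch. It is not a normative implementation: the function name filter_by_score is hypothetical, and the sketch assumes that combined scores are non-negative and that a scaled score is the raw score divided by the highest raw score in the result set.

```python
from typing import Dict, Optional


def filter_by_score(
    scores: Dict[str, float],
    min_raw: Optional[float] = None,
    min_scaled: Optional[float] = None,
) -> Dict[str, float]:
    """Keep only records whose combined score meets the given thresholds.

    Assumption: scaled score = raw score / highest raw score, which keeps
    0 <= scaled score <= 1 when all raw scores are non-negative.
    """
    if not scores:
        return {}
    top = max(scores.values())
    kept: Dict[str, float] = {}
    for record_id, raw in scores.items():
        scaled = raw / top if top > 0 else 0.0
        if min_raw is not None and raw < min_raw:
            continue
        if min_scaled is not None and scaled < min_scaled:
            continue
        kept[record_id] = raw
    return kept
```

For example, with raw scores of 4.0, 2.0 and 1.0, a minScaled of 0.5 would keep the first two records under this assumption, since their scaled scores are 1.0 and 0.5.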

Booleans

There are no booleans defined in this context set.

Boolean Modifiers

Modifier Name	Description
combine	Method to be used to combine scores generated for individual clauses.
minRaw	The minimum raw score that must be achieved (after scores from individual clauses have been combined) to be included in results.
minScaled	The minimum scaled score that must be achieved (after scores from individual clauses have been combined) to be included in results. Scaled scores are proportionate to the highest score: 0 <= scaledScore <= 1.
const_*	A named constant relevant to the algorithm, as in Relation Modifiers.

Examples

Some examples of how the context set might be used.

    dc.title any/rel.algorithm=lr "fish squid burger cheese"
    cql.anywhere all/rel.algorithm=cori "sanderson denenberg" or/rel.combine=mean dc.description any/rel.algorithm=cori "information retrieval"
    dc.title any/rel.algorithm=lr/rel.const_c0=-0.705 "logistic regression relevance ranking techniques"
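
Queries such as these contain spaces, quotes, slashes and equals signs, so they need to be URL-encoded when sent in an SRU searchRetrieve request. The following Python sketch shows one way a client might do this. The function name build_search_url and the base URL are hypothetical, and the choice of version 1.2 and of the maximumRecords value is only an assumption for the example.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_search_url(base_url: str, cql_query: str,
                     version: str = "1.2", maximum_records: int = 10) -> str:
    """Build a searchRetrieve URL carrying a CQL query with rel modifiers.

    Hypothetical helper: urlencode() percent-encodes the query so that
    characters such as '/', '=' and '"' survive transport.
    """
    params = {
        "operation": "searchRetrieve",
        "version": version,
        "query": cql_query,
        "maximumRecords": str(maximum_records),
    }
    return base_url + "?" + urlencode(params)


# Usage with a placeholder server address (not a real endpoint).
url = build_search_url(
    "http://sru.example.org/db",
    'dc.title any/rel.algorithm=lr "fish squid burger cheese"',
)
# Decoding the URL again recovers the original CQL query unchanged.
decoded_query = parse_qs(urlsplit(url).query)["query"][0]
```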

Appendix A - Relevance Score Assignment Algorithms

Modifier Value	Description
lr	Logistic Regression algorithm from UC Berkeley
cori	CORI algorithm of Callan et al. (Carnegie Mellon)
okapi	OKAPI BM-25 of Robertson et al. (City University, London)
gloss	Glossary of Servers of Gravano et al. (Stanford)
ggloss	Generalised Glossary of Servers
dtf-cori	Decision-Theoretic Framework extension to CORI of Fuhr, Nottelmann (University of Duisburg-Essen)
redde	Relevant Document Distribution Estimation of Callan et al. (Carnegie Mellon)
cdr	Cover Density Ranking
pagerank	Google's PageRank algorithm of Brin, Page (ex Stanford)
hilltop	The Hilltop algorithm of Bharat, Mihaila (Google, University of Toronto)

Appendix B - Relevance Score Combination Methods

Modifier Value	Description
sum	Add the values
mean	Average the values
nsum	Normalise the summed values
cmbz	Normalise and rescale values
max	Select maximum value
min	Select minimum value
nprv	Normalise values and privilege high ranked documents
pivot	Normalise sub-record retrieval scores based on document scores
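
The four simplest methods in the table (sum, mean, max and min) can be sketched in Python as follows. This is an illustration only: the names COMBINERS and combine_scores are hypothetical, and the normalising methods (nsum, cmbz, nprv, pivot) are deliberately left out because the table does not define their normalisation precisely enough to implement here.

```python
from statistics import mean
from typing import Callable, Dict, Sequence

# Hypothetical mapping from rel.combine values to Python functions.
COMBINERS: Dict[str, Callable[[Sequence[float]], float]] = {
    "sum": sum,    # Add the values
    "mean": mean,  # Average the values
    "max": max,    # Select maximum value
    "min": min,    # Select minimum value
}


def combine_scores(method: str, scores: Sequence[float]) -> float:
    """Combine per-operand or per-clause scores for a single record.

    The method name is folded to lower case, in line with the
    case-insensitivity of modifiers. Unsupported methods raise ValueError.
    """
    try:
        combiner = COMBINERS[method.lower()]
    except KeyError:
        raise ValueError("unsupported combination method: " + method) from None
    return combiner(scores)
```

Under these assumptions, a record scoring 1.0 on one clause and 3.0 on another would receive 4.0 with sum, 2.0 with mean, 3.0 with max and 1.0 with min. The sketch expects at least one score per record; mean, max and min raise an error on an empty sequence.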
