Function Queries
 Using Function Query
 Available Functions
 abs Function
 childfield(field) Function
 concat Function
 "constant" Function
 def Function
 div Function
 dist Function
 docfreq(field,val) Function
 field Function
 hsin Function
 idf Function
 if Function
 linear Function
 log Function
 map Function
 max Function
 maxdoc Function
 min Function
 ms Function
 norm(field) Function
 numdocs Function
 ord Function
 payload Function
 pow Function
 product Function
 query Function
 recip Function
 rord Function
 scale Function
 sqedist Function
 sqrt Function
 strdist Function
 sub Function
 sum Function
 sumtotaltermfreq Function
 termfreq Function
 tf Function
 top Function
 totaltermfreq Function
 Boolean Functions
 Example Function Queries
 Sort By Function
Function queries enable you to generate a relevancy score using the actual value of one or more numeric fields.
Function queries are supported by the DisMax, Extended DisMax, and standard query parsers.
Function queries are built from functions. Each function can be a constant (numeric or string literal), a field, another function, or a parameter substitution argument. You can use these functions to modify the ranking of results, for example based on a user’s location or some other calculation.
Using Function Query
Functions must be expressed as function calls (for example, sum(a,b) instead of simply a+b).
There are several ways of using function queries in a Solr query:

Via an explicit query parser that expects function arguments, such as func or frange. For example:
q={!func}div(popularity,price)&fq={!frange l=1000}customer_ratings

In a Sort expression. For example:
sort=div(popularity,price) desc, score desc

Add the results of functions as pseudo-fields to documents in query results. For instance, for:
&fl=sum(x, y),id,a,b,c,score&wt=xml
the output would be:
... <str name="id">foo</str> <float name="sum(x,y)">40</float> <float name="score">0.343</float> ...

Use in a parameter that is explicitly for specifying functions, such as the eDisMax query parser’s boost parameter, or the DisMax query parser’s bf (boost function) parameter. (Note that the bf parameter actually takes a list of function queries separated by white space, each with an optional boost. Make sure you eliminate any internal white space in single function queries when using bf.) For example:
q=dismax&bf="ord(popularity)^0.5 recip(rord(price),1,1000,1000)^0.3"

Introduce a function query inline in the Lucene query parser with the _val_ keyword. For example:
q=_val_:mynumericfield _val_:"recip(rord(myfield),1,2,3)"
Only functions with fast random access are recommended.
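As a rough illustration of the forms above, the following Python sketch assembles a request URL from function-query parameters. The host, port, and collection name are placeholders, and the field names are simply the ones used in the examples. Encoding matters here because function queries contain braces, spaces, commas, and parentheses.

```python
from urllib.parse import urlencode, parse_qs

# Placeholder endpoint: host, port, and collection name are assumptions.
BASE_URL = "http://localhost:8983/solr/collection_name/select"

# Parameters taken from the examples above; field names are illustrative.
params = {
    "q": "{!func}div(popularity,price)",
    "fq": "{!frange l=1000}customer_ratings",
    "sort": "div(popularity,price) desc, score desc",
    "fl": "id,sum(x,y),score",
}

# urlencode percent-encodes braces, commas, and parentheses, and turns
# spaces into '+', so the function syntax survives transport intact.
query_string = urlencode(params)
url = f"{BASE_URL}?{query_string}"

# Decoding the query string gives back the original parameter values.
decoded = {key: values[0] for key, values in parse_qs(query_string).items()}

print(url)
```

Building the query string from a dictionary keeps each function expression readable in the source while leaving the escaping to the library.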
Available Functions
The sections below describe the functions available for function queries.
abs Function
Returns the absolute value of the specified value or function.
Syntax Examples

abs(x)

abs(5)
childfield(field) Function
Returns the value of the given field for one of the matched child docs when searching by {!parent}. It can be used only in the sort parameter.
Syntax Examples

sort=childfield(name) asc
implies $q as a second argument and therefore assumes q={!parent ..}..;
sort=childfield(field,$bjq) asc
refers to a separate parameter bjq={!parent ..}..;
sort=childfield(field,{!parent of=…}…) desc
inlines the block join parent query.
concat Function
Concatenates the given string fields, literals and other functions.
Syntax Example

concat(name," ",$param,def(opt,""))
def Function
def is short for default. Returns the value of field "field", or if the field does not exist, returns the default value specified. Yields the first value where exists()==true.
Syntax Examples

def(rating,5): This def() function returns the rating, or if no rating is specified in the doc, returns 5.
def(myfield, 1.0): equivalent to if(exists(myfield),myfield,1.0)
div Function
Divides one value or function by another. div(x,y) divides x by y.
Syntax Examples

div(1,y)

div(sum(x,100),max(y,1))
dist Function
Returns the distance between two vectors (points) in an n-dimensional space. Takes in the power, plus two or more ValueSource instances, and calculates the distance between the two vectors. Each ValueSource must be a number.
There must be an even number of ValueSource instances passed in and the method assumes that the first half represent the first vector and the second half represent the second vector.
Syntax Examples

dist(2, x, y, 0, 0)
: calculates the Euclidean distance between (0,0) and (x,y) for each document. 
dist(1, x, y, 0, 0)
: calculates the Manhattan (taxicab) distance between (0,0) and (x,y) for each document. 
dist(2, x,y,z,0,0,0):
Euclidean distance between (0,0,0) and (x,y,z) for each document. 
dist(1,x,y,z,e,f,g)
: Manhattan distance between (x,y,z) and (e,f,g) where each letter is a field name.
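The arithmetic behind these examples can be sketched in Python as a p-norm over the coordinate differences. This is only a model of the description above, not Solr's implementation: the function name mirrors dist for readability, and the sketch handles only finite power values greater than zero.

```python
def dist(power, *coords):
    """p-norm distance between two points passed as one flat argument list.

    The first half of coords is the first point and the second half is the
    second point, mirroring the argument layout described above.
    """
    if power <= 0:
        raise ValueError("this sketch only handles power > 0")
    if len(coords) < 2 or len(coords) % 2 != 0:
        raise ValueError("an even, non-zero number of coordinates is required")
    half = len(coords) // 2
    first, second = coords[:half], coords[half:]
    # Sum |a - b|^power over each dimension, then take the power-th root.
    total = sum(abs(a - b) ** power for a, b in zip(first, second))
    return total ** (1.0 / power)


euclidean_2d = dist(2, 3, 4, 0, 0)        # (3,4) to (0,0), power 2
manhattan_2d = dist(1, 3, 4, 0, 0)        # same points, power 1
euclidean_3d = dist(2, 1, 2, 2, 0, 0, 0)  # (1,2,2) to (0,0,0), power 2
print(euclidean_2d, manhattan_2d, euclidean_3d)
```

In a real query the coordinates would be field names evaluated per document; plain numbers stand in for those values here.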
docfreq(field,val) Function
Returns the number of documents that contain the term in the field. This is a constant (the same value for all documents in the index).
You can quote the term if it’s more complex, or do parameter substitution for the term value.
Syntax Examples

docfreq(text,'solr')

…&defType=func&q=docfreq(text,$myterm)&myterm=solr
field Function
Returns the numeric docValues or indexed value of the field with the specified name. In its simplest (single argument) form, this function can only be used on single valued fields, and can be called using the name of the field as a string, or for most conventional field names simply use the field name by itself without using the field(…)
syntax.
When using docValues, an optional 2nd argument can be specified to select the min
or max
value of multivalued fields.
0 is returned for documents without a value in the field.
Syntax Examples These 3 examples are all equivalent:

myFloatFieldName

field(myFloatFieldName)

field("myFloatFieldName")
The last form is convenient when your field name is atypical:

field("my complex float fieldName")
For multivalued docValues fields:

field(myMultiValuedFloatField,min)

field(myMultiValuedFloatField,max)
hsin Function
The Haversine distance calculates the distance between two points on a sphere when traveling along the sphere. The values must be in radians. hsin also takes a Boolean argument to specify whether the function should convert its output to radians.
Syntax Example

hsin(2, true, x, y, 0, 0)
idf Function
Inverse document frequency; a measure of whether the term is common or rare across all documents. Obtained by dividing the total number of documents by the number of documents containing the term, and then taking the logarithm of that quotient. See also tf.
Syntax Example

idf(fieldName,'solr'): measures the inverse of the frequency of the occurrence of the term 'solr' in fieldName.
if Function
Enables conditional function queries. In if(test,value1,value2):

test is or refers to a logical value or expression that returns a logical value (TRUE or FALSE).
value1 is the value that is returned by the function if test yields TRUE.
value2 is the value that is returned by the function if test yields FALSE.
An expression can be any function which outputs boolean values, or even functions returning numeric values, in which case value 0 will be interpreted as false, or strings, in which case empty string is interpreted as false.
Syntax Example

if(termfreq(cat,'electronics'),popularity,42): This function checks each document to see if it contains the term "electronics" in the cat field. If it does, then the value of the popularity field is returned; otherwise the value 42 is returned.
linear Function
Implements m*x+c
where m
and c
are constants and x
is an arbitrary function. This is equivalent to sum(product(m,x),c)
, but slightly more efficient as it is implemented as a single function.
Syntax Examples

linear(x,m,c)

linear(x,2,4): returns 2*x+4
log Function
Returns the log base 10 of the specified function.
Syntax Examples

log(x)

log(sum(x,100))
map Function
Maps any values of an input function x
that fall within min
and max
inclusive to the specified target
. The arguments min
and max
must be constants. The arguments target
and default
can be constants or functions.
If the value of x
does not fall between min
and max
, then either the value of x
is returned, or a default value is returned if specified as a 5th argument.
Syntax Examples

map(x,min,max,target)

map(x,0,0,1)
: Changes any values of 0 to 1. This can be useful in handling default 0 values.


map(x,min,max,target,default)

map(x,0,100,1,-1): Changes any values between 0 and 100 to 1, and all other values to -1.
map(x,0,100,sum(x,599),docfreq(text,solr)): Changes any values between 0 and 100 to x+599, and all other values to the frequency of the term 'solr' in the field text.
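The selection rule described for map can be modeled in a few lines of Python. This is a sketch of the documented behavior only: solr_map is a hypothetical name, and target and default are passed as already-evaluated numbers, whereas in a query they may themselves be functions.

```python
def solr_map(x, lo, hi, target, default=None):
    """Return target when lo <= x <= hi (inclusive), else default or x."""
    if lo <= x <= hi:
        return target
    # With no 5th argument, values outside the range pass through unchanged.
    return x if default is None else default


zero_replaced = solr_map(0, 0, 0, 1)            # 0 falls in [0, 0]
passed_through = solr_map(7, 0, 0, 1)           # outside range, no default
in_range = solr_map(50, 0, 100, 1, -1)          # inside [0, 100]
out_of_range = solr_map(150, 0, 100, 1, -1)     # outside, default applies
print(zero_replaced, passed_through, in_range, out_of_range)
```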

max Function
Returns the maximum numeric value of multiple nested functions or constants, which are specified as arguments: max(x,y,…)
. The max
function can also be useful for "bottoming out" another function or field at some specified constant.
Use the field(myfield,max)
syntax for selecting the maximum value of a single multivalued field.
Syntax Example

max(myfield,myotherfield,0)
maxdoc Function
Returns the number of documents in the index, including those that are marked as deleted but have not yet been purged. This is a constant (the same value for all documents in the index).
Syntax Example

maxdoc()
min Function
Returns the minimum numeric value of multiple nested functions or constants, which are specified as arguments: min(x,y,…)
. The min
function can also be useful for providing an "upper bound" on a function using a constant.
Use the field(myfield,min)
syntax for selecting the minimum value of a single multivalued field.
Syntax Example

min(myfield,myotherfield,0)
ms Function
Returns milliseconds of difference between its arguments. Dates are relative to the Unix or POSIX time epoch, midnight, January 1, 1970 UTC.
Arguments may be the name of a DatePointField
, TrieDateField
, or date math based on a constant date or NOW
.

ms(): Equivalent to ms(NOW), number of milliseconds since the epoch.
ms(a): Returns the number of milliseconds since the epoch that the argument represents.
ms(a,b): Returns the number of milliseconds that b occurs before a (that is, a - b).
Syntax Examples

ms(NOW/DAY)

ms(2000-01-01T00:00:00Z)

ms(mydatefield)

ms(NOW,mydatefield)

ms(mydatefield, 2000-01-01T00:00:00Z)

ms(datefield1, datefield2)
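To make the one- and two-argument forms concrete, the Python sketch below reproduces the arithmetic with the standard library. The function names mirror ms for readability but are otherwise hypothetical, and date math such as NOW/DAY is not modeled.

```python
from datetime import datetime, timedelta, timezone

# The Unix epoch: midnight, January 1, 1970 UTC.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def epoch_ms(moment):
    """Whole milliseconds between the epoch and a timezone-aware datetime."""
    return (moment - EPOCH) // timedelta(milliseconds=1)


def ms(a, b=None):
    """One argument: milliseconds since the epoch. Two arguments: a - b."""
    if b is None:
        return epoch_ms(a)
    return epoch_ms(a) - epoch_ms(b)


y2k = datetime(2000, 1, 1, tzinfo=timezone.utc)
next_day = y2k + timedelta(days=1)

since_epoch = ms(y2k)          # milliseconds from 1970-01-01 to 2000-01-01
difference = ms(next_day, y2k) # positive because y2k occurs before next_day
print(since_epoch, difference)
```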
norm(field) Function
Returns the "norm" stored in the index for the specified field. This is the product of the index time boost and the length normalization factor, according to the Similarity for the field.
Syntax Example

norm(fieldName)
numdocs Function
Returns the number of documents in the index, not including those that are marked as deleted but have not yet been purged. This is a constant (the same value for all documents in the index).
Syntax Example

numdocs()
ord Function
Returns the ordinal of the indexed field value within the indexed list of terms for that field in Lucene index order (lexicographically ordered by unicode value), starting at 1.
In other words, for a given field, all values are ordered lexicographically; this function then returns the offset of a particular value in that ordering. The field must have a maximum of one value per document (not multivalued). 0
is returned for documents without a value in the field.
ord() depends on the position in an index and can change when other documents are inserted or deleted.

See also rord
below.
Syntax Example

ord(myIndexedField)

If there were only three values ("apple","banana","pear") for a particular field X, then ord(X) would be 1 for documents containing "apple", 2 for documents containing "banana", etc.
payload Function
Returns the float value computed from the decoded payloads of the term specified.
The return value is computed using the min, max, or average of the decoded payloads. A special first function can be used instead of the others, to short-circuit term enumeration and return only the decoded payload of the first term.
The field specified must have float or integer payload encoding capability (via DelimitedPayloadTokenFilter or NumericPayloadTokenFilter). If no payload is found for the term, the default value is returned.

payload(field_name,term): default value is 0.0, average function is used.
payload(field_name,term,default_value): default value can be a constant, field name, or another float-returning function. average function is used.
payload(field_name,term,default_value,function): function values can be min, max, average, or first.
Syntax Example

payload(payloaded_field_dpf,term,0.0,first)
pow Function
Raises the specified base to the specified power. pow(x,y)
raises x
to the power of y
.
Syntax Examples

pow(x,y)

pow(x,log(y))

pow(x,0.5): the same as sqrt
product Function
Returns the product of multiple values or functions, which are specified in a comma-separated list. mul(…) may also be used as an alias for this function.
Syntax Examples

product(x,y,…)

product(x,2)

mul(x,y)
query Function
Returns the score for the given subquery, or the default value for documents not matching the query. Any type of subquery is supported through either parameter dereferencing $otherparam
or direct specification of the query string in the Local Parameters through the v
key.
Syntax Examples

query(subquery, default)

q=product(popularity,query({!dismax v='solr rocks'}))
: returns the product of the popularity and the score of the DisMax query.
q=product(popularity,query($qq))&qq={!dismax}solr rocks
: equivalent to the previous query, using parameter dereferencing.
q=product(popularity,query($qq,0.1))&qq={!dismax}solr rocks
: specifies a default score of 0.1 for documents that don’t match the DisMax query.
recip Function
Performs a reciprocal function with recip(x,m,a,b)
implementing a/(m*x+b)
where m,a,b
are constants, and x
is any arbitrarily complex function.
When a
and b
are equal, and x>=0
, this function has a maximum value of 1
that drops as x
increases. Increasing the value of a
and b
together results in a movement of the entire function to a flatter part of the curve. These properties can make this an ideal function for boosting more recent documents when x is rord(datefield)
.
Syntax Examples

recip(myfield,m,a,b)

recip(rord(creationDate),1,1000,1000)
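The shape of the curve is easier to see with numbers. The Python sketch below evaluates a/(m*x+b) directly; the sample x values are arbitrary and stand in for whatever the inner function (for instance an ordinal) would return per document.

```python
def recip(x, m, a, b):
    """Evaluate a / (m*x + b), the formula described above."""
    return a / (m * x + b)


# With a == b and x >= 0 the value starts at 1 and falls as x grows.
at_zero = recip(0, 1, 1000, 1000)
at_thousand = recip(1000, 1, 1000, 1000)
at_three_thousand = recip(3000, 1, 1000, 1000)
print(at_zero, at_thousand, at_three_thousand)
```

With m=1 and a=b=1000, a document whose x is 1000 receives half the boost of a document whose x is 0.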
rord Function
Returns the reverse ordering of that returned by ord
.
Syntax Example

rord(myDateField)
scale Function
Scales values of the function x
such that they fall between the specified minTarget
and maxTarget
inclusive. The current implementation traverses all of the function values to obtain the min and max, so it can pick the correct scale.
The current implementation cannot detect documents that have been deleted or that have no value; it uses 0.0 for these cases. This means that if values are normally all greater than 0.0, one can still end up with 0.0 as the min value to map from. In these cases, an appropriate map() function could be used as a workaround to change 0.0 to a value in the real range, as shown here: scale(map(x,0,0,5),1,2)
Syntax Examples

scale(x, minTarget, maxTarget)

scale(x,1,2)
: scales the values of x such that all values will be between 1 and 2 inclusive.
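A minimal Python sketch of the idea, assuming simple linear min-max rescaling over a list of per-document values, is shown here. It also illustrates why a stray 0.0 from a document without a value distorts the result, and how remapping 0 first avoids that. The handling of the case where all values are equal is an assumption of this sketch, not documented Solr behavior.

```python
def scale(values, min_target, max_target):
    """Linearly rescale values so they span [min_target, max_target]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption for this sketch: identical inputs all map to min_target.
        return [float(min_target) for _ in values]
    span = max_target - min_target
    return [min_target + (v - lo) * span / (hi - lo) for v in values]


# 0.0 here stands for a document with no value in the field.
raw = [0.0, 5.0, 10.0]
naive = scale(raw, 1, 2)

# Equivalent of map(x,0,0,5): move the placeholder 0 into the real range.
remapped = [5.0 if v == 0 else v for v in raw]
adjusted = scale(remapped, 1, 2)
print(naive, adjusted)
```

In the naive case the value 5.0 lands in the middle of the target range even though it is the smallest real value; after remapping it lands at the bottom.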
sqedist Function
The Square Euclidean distance calculates the 2-norm (Euclidean distance) but does not take the square root, thus saving a fairly expensive operation. It is often the case that applications that care about Euclidean distance do not need the actual distance, but instead can use the square of the distance. There must be an even number of ValueSource instances passed in and the method assumes that the first half represent the first vector and the second half represent the second vector.
Syntax Example

sqedist(x_td, y_td, 0, 0)
sqrt Function
Returns the square root of the specified value or function.
Syntax Examples

sqrt(x)

sqrt(100)

sqrt(sum(x,100))
strdist Function
Calculates the distance between two strings. Uses the Lucene spell checker StringDistance interface and supports all of the implementations available in that package, plus allows applications to plug in their own via Solr’s resource loading capabilities. strdist takes (string1, string2, distance measure).
Possible values for distance measure are:

jw: Jaro-Winkler

edit: Levenshtein or Edit distance

ngram: The NGramDistance. If specified, you can optionally pass in the ngram size too. Default is 2.

FQN: Fully Qualified class Name for an implementation of the StringDistance interface. Must have a no-arg constructor.
Syntax Example

strdist("SOLR",id,edit)
sub Function
Returns x-y from sub(x,y).
Syntax Examples

sub(myfield,myfield2)

sub(100, sqrt(myfield))
sum Function
Returns the sum of multiple values or functions, which are specified in a comma-separated list. add(…) may be used as an alias for this function.
Syntax Examples

sum(x,y,…)

sum(x,1)

sum(sqrt(x),log(y),z,0.5)

add(x,y)
sumtotaltermfreq Function
Returns the sum of totaltermfreq values for all terms in the field in the entire index (i.e., the number of indexed tokens for that field). (Aliases sumtotaltermfreq to sttf.)
Syntax Example If doc1:(fieldX:A B C) and doc2:(fieldX:A A A A):

docFreq(fieldX:A) = 2 (A appears in 2 docs)
freq(doc2, fieldX:A) = 4 (A appears 4 times in doc2)
totalTermFreq(fieldX:A) = 5 (A appears 5 times across all docs)
sumTotalTermFreq(fieldX) = 7 (in fieldX, there are 5 As, 1 B, 1 C)
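The four statistics can be checked by hand with a small Python model of the two-document example. The dictionary below is a stand-in for the tokens indexed in fieldX, and the helper names are hypothetical; nothing here calls Solr.

```python
# Tokens indexed in fieldX for each document in the example.
docs = {
    "doc1": ["A", "B", "C"],
    "doc2": ["A", "A", "A", "A"],
}


def doc_freq(term):
    """Number of documents whose field contains the term at least once."""
    return sum(1 for tokens in docs.values() if term in tokens)


def freq(doc_id, term):
    """Number of times the term occurs in one document's field."""
    return docs[doc_id].count(term)


def total_term_freq(term):
    """Occurrences of the term in the field across all documents."""
    return sum(tokens.count(term) for tokens in docs.values())


def sum_total_term_freq():
    """Total number of tokens in the field across all documents."""
    return sum(len(tokens) for tokens in docs.values())


print(doc_freq("A"), freq("doc2", "A"), total_term_freq("A"), sum_total_term_freq())
```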
termfreq Function
Returns the number of times the term appears in the field for that document.
Syntax Example

termfreq(text,'memory')
tf Function
Term frequency; returns the term frequency factor for the given term, using the Similarity for the field. The tf-idf value increases proportionally to the number of times a word appears in the document, but is offset by the frequency of the word across the whole index, which helps to control for the fact that some words are generally more common than others. See also idf.
Syntax Examples

tf(text,'solr')
top Function
Causes the function query argument to derive its values from the top-level IndexReader containing all parts of an index. For example, the ordinal of a value in a single segment will be different from the ordinal of that same value in the complete index.
The ord() and rord() functions implicitly use top(), and hence ord(foo) is equivalent to top(ord(foo)).
totaltermfreq Function
Returns the number of times the term appears in the field in the entire index. (Aliases totaltermfreq
to ttf
.)
Syntax Example

ttf(text,'memory')
Boolean Functions
The following functions are boolean: they return true or false. They are mostly useful as the first argument of the if function, and some of these can be combined. If used somewhere else, they yield a '1' or '0'.
and Function
Returns a value of true if and only if all of its operands evaluate to true.
Syntax Example

and(not(exists(popularity)),exists(price)): returns true for any document which has a value in the price field, but does not have a value in the popularity field.
or Function
A logical disjunction.
Syntax Example

or(value1,value2): true if either value1 or value2 is true.
xor Function
Logical exclusive disjunction, or one or the other but not both.
Syntax Example

xor(field1,field2) returns true if either field1 or field2 is true; false if both are true.
not Function
The logically negated value of the wrapped function.
Syntax Example

not(exists(author)): true only when exists(author) is false.
exists Function
Returns true if any member of the field exists.
Syntax Example

exists(author): returns true for any document that has a value in the "author" field.
exists(query(price:5.00)): returns true if "price" matches "5.00".
Comparison Functions
gt, gte, lt, lte, eq
5 comparison functions: Greater Than, Greater Than or Equal, Less Than, Less Than or Equal, Equal. eq works on not just numbers but essentially any value like a string field.
Syntax Example

if(lt(ms(mydatefield),315569259747),0.8,1) translates to this pseudocode: if mydatefield < 315569259747 then 0.8 else 1
Example Function Queries
To give you a better understanding of how function queries can be used in Solr, suppose an index stores the dimensions in meters x,y,z of some hypothetical boxes with arbitrary names stored in field boxname. Suppose we want to search for boxes matching the name findbox, ranked according to their volumes. The query parameters would be:
q=boxname:findbox _val_:"product(x,y,z)"
This query will rank the results based on volumes. In order to get the computed volume, you will need to request the score, which will contain the resultant volume:
&fl=*, score
Suppose that you also have a field storing the weight of the box as weight. To sort by the density of the box and return the value of the density in score, you would submit the following query:
http://localhost:8983/solr/collection_name/select?q=boxname:findbox _val_:"div(weight,product(x,y,z))"&fl=boxname x y z weight score
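The URL above contains spaces and double quotes, which generally need to be encoded before the request is sent. One way to do that from a client is sketched below in Python; density_query_url is a hypothetical helper, the endpoint is the same placeholder used above, and the box name is assumed to be a simple token that needs no query-syntax escaping.

```python
from urllib.parse import urlencode, parse_qs

# Placeholder endpoint: host, port, and collection name are assumptions.
BASE_URL = "http://localhost:8983/solr/collection_name/select"


def density_query_url(name):
    """Build the density query for boxes named `name` with encoded parameters."""
    params = {
        # Match on the name and add the density function via _val_.
        "q": f'boxname:{name} _val_:"div(weight,product(x,y,z))"',
        "fl": "boxname x y z weight score",
    }
    return f"{BASE_URL}?{urlencode(params)}"


url = density_query_url("findbox")

# Decode again to confirm the server would see the intended expression.
sent = parse_qs(url.split("?", 1)[1])
print(url)
```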
Sort By Function
You can sort your query results by the output of a function. For example, to sort results by distance, you could enter:
http://localhost:8983/solr/collection_name/select?q=*:*&sort=dist(2, point1, point2) desc
Sort by function also supports pseudo-fields: fields can be generated dynamically and return results as though they were normal fields in the index. For example,
&fl=id,sum(x, y),score&wt=xml
would return:
<str name="id">foo</str>
<float name="sum(x,y)">40</float>
<float name="score">0.343</float>