The search mechanism supports a large variety of patterns, including simple strings, strings with classes of characters, sets of strings, wild cards, and regular expressions.
| Rule | Explanation | To search for... | Enter... |
| Boolean AND | To search for all of several terms, separate them with semicolons | larry AND moe AND curly | larry;moe;curly |
| Boolean OR | To search for any of several terms, separate them with commas | larry OR moe OR curly | larry,moe,curly |
Strings are any sequence of characters, including the special symbols `^' for beginning of line and `$' for end of line. The special characters `$', `^', `*', `[', `|', `(', `)', `!', and `\', as well as the meta characters specific to the search (`;', `,', `#', `<', `>', `-', and `.'), should be preceded by `\' if they are to be matched as regular characters. For example, \^abc\\ corresponds to the string ^abc\, whereas ^abc corresponds to the string abc at the beginning of a line.
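The escaping rule above can be sketched as a small helper. This is only an illustration: the function name `escape_literal` and the constant `SPECIAL` are hypothetical and not part of the search tool; the character list is taken from the paragraph above.

```python
# Characters that, per the text above, need a leading backslash
# to be matched as regular characters.
SPECIAL = set("$^*[|()!\\") | set(";,#<>-.")


def escape_literal(text):
    """Prefix every special or meta character in text with a backslash."""
    return "".join("\\" + ch if ch in SPECIAL else ch for ch in text)


print(escape_literal("^abc"))   # prints \^abc
print(escape_literal("3.5-4"))  # prints 3\.5\-4
```

A query built this way is intended to be matched character for character, with none of the symbols interpreted as operators.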
Classes of characters
A list of characters inside [] (in order) corresponds to any character from the list. For example, [a-ho-z] is any character between a and h or between o and z. The symbol `^' inside [] complements the list. For example, [^i-n] denotes any character in the character set except the characters `i' to `n'. The symbol `^' thus has two meanings, but this is consistent with egrep. The symbol `.' stands for any character (except for the newline character).
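Because these simple classes follow egrep conventions, their behavior can be tried out with Python's `re` module, which treats them the same way. The snippet below is a sketch for experimentation, not the search engine itself; the helper `chars_matching` is a hypothetical name.

```python
import re


def chars_matching(pattern, candidates):
    """Return the characters from candidates that the one-character pattern accepts."""
    compiled = re.compile(pattern)
    return [ch for ch in candidates if compiled.fullmatch(ch)]


# [a-ho-z]: any character between a and h, or between o and z.
print(chars_matching(r"[a-ho-z]", "ahimoz"))  # prints ['a', 'h', 'o', 'z']

# [^i-n]: any character except i through n.
print(chars_matching(r"[^i-n]", "hin5"))      # prints ['h', '5']

# `.' accepts any character except the newline.
print(chars_matching(r".", "a\n"))            # prints ['a']
```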
The search supports an `AND' operation denoted by the symbol `;', an `OR' operation denoted by the symbol `,', and any combination of the two. For example, `pizza;cheeseburger' will output all lines containing both patterns.
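A minimal sketch of how such a query could be evaluated against a single line follows. It assumes plain substring terms, ignores backslash escapes, and deliberately rejects queries that mix `;' and `,', since the text does not state how combinations are grouped. The function name `line_matches` is hypothetical.

```python
def line_matches(query, line):
    """Evaluate a pure-AND (`;`) or pure-OR (`,`) query against one line."""
    if ";" in query and "," in query:
        # Grouping rules for mixed queries are not covered by this sketch.
        raise ValueError("mixed AND/OR queries are outside this sketch")
    if ";" in query:
        # AND: every term must occur in the line.
        return all(term in line for term in query.split(";"))
    # OR (or a single term): at least one term must occur in the line.
    return any(term in line for term in query.split(","))


print(line_matches("pizza;cheeseburger", "pizza with a cheeseburger"))  # prints True
print(line_matches("pizza;cheeseburger", "pizza only"))                 # prints False
print(line_matches("larry,moe,curly", "moe was here"))                  # prints True
```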
The symbol `#' denotes a sequence of any number (including 0) of arbitrary characters. It is equivalent to .* in egrep. In fact, .* will work too, because it is a valid regular expression (see below), but unless it is part of an actual regular expression, # will work faster.
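The equivalence between `#' and .* can be illustrated by translating a wild-card pattern into a Python regular expression. This is a sketch under simplifying assumptions: the pattern contains no escaped `#' and no other operators, and the helper name `wildcard_to_regex` is hypothetical.

```python
import re


def wildcard_to_regex(pattern):
    """Translate `#` into `.*` and treat every other character literally."""
    literal_parts = pattern.split("#")
    return ".*".join(re.escape(part) for part in literal_parts)


regex = wildcard_to_regex("inter#al")
print(bool(re.search(regex, "international")))  # prints True
print(bool(re.search(regex, "interval")))       # prints True
print(bool(re.search(regex, "intro")))          # prints False
```

As in the search syntax, the translated `.*` does not cross a newline, because Python's `.` excludes the newline by default.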
Combination of exact and approximate matching
Any pattern inside angle brackets <> must match the text exactly, even when errors are allowed in the rest of the match. For example, <mathemat>ics matches mathematical with one error (replacing the last s with an a), but mathe<matics> does not match mathematical no matter how many errors are allowed.
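The interplay of exact and approximate parts can be sketched as follows. This is not the search engine's algorithm, only an illustration under stated assumptions: the pattern has exactly one bracketed part, errors are counted as insertions, deletions, and substitutions, and the pattern may match anywhere inside the text. The names `best_prefix_distance` and `matches` are hypothetical.

```python
def best_prefix_distance(pattern, text):
    """Smallest edit distance between pattern and any prefix of text."""
    # prev[j] holds the distance between pattern[:i] and text[:j].
    prev = list(range(len(text) + 1))
    for i, pc in enumerate(pattern, start=1):
        cur = [i]
        for j, tc in enumerate(text, start=1):
            cur.append(min(prev[j] + 1,                # drop a pattern character
                           cur[j - 1] + 1,             # skip a text character
                           prev[j - 1] + (pc != tc)))  # match or substitute
        prev = cur
    return min(prev)


def matches(before, exact, after, text, max_errors):
    """Pattern is before<exact>after; only `before` and `after` may contain errors."""
    start = text.find(exact)
    while start != -1:
        # Match `before` backwards against the text that precedes the exact part.
        left = best_prefix_distance(before[::-1], text[:start][::-1])
        # Match `after` forwards against the text that follows the exact part.
        right = best_prefix_distance(after, text[start + len(exact):])
        if left + right <= max_errors:
            return True
        start = text.find(exact, start + 1)
    return False


print(matches("", "mathemat", "ics", "mathematical", 1))  # prints True
print(matches("mathe", "matics", "", "mathematical", 8))  # prints False
```

The second call fails regardless of the error budget because the exact part `matics` never occurs in the text, which mirrors the example above.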
Since the index is word based, a regular expression must match words that appear in the index for the search to find it. The search first strips all non-alphabetic characters from the regular expression and searches the index for all remaining words. It then applies the regular expression matching algorithm to the files found in the index. For example, `abc.*xyz' will search the index for all files that contain both `abc' and `xyz', and then search directly for `abc.*xyz' in those files. The union operation `|', Kleene closure `*', and parentheses () are all supported. Currently `+' is not supported. Regular expressions are currently limited to approximately 30 characters (generally excluding meta characters). The maximal number of errors for regular expressions that use `*' or `|' is 4.
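The two-stage lookup described above can be sketched with a toy in-memory index. Everything here is an assumption made for illustration: the index is a plain dictionary from word to file names, files are strings held in memory, candidate files must contain every extracted word (as in the `abc.*xyz' example), and Python's `re` stands in for the actual matching algorithm. The names `build_index` and `search` are hypothetical.

```python
import re


def build_index(files):
    """Map each alphabetic word to the set of file names that contain it."""
    index = {}
    for name, text in files.items():
        for word in re.findall(r"[A-Za-z]+", text):
            index.setdefault(word, set()).add(name)
    return index


def search(pattern, index, files):
    """Stage 1: narrow down by indexed words. Stage 2: run the regex line by line."""
    words = re.findall(r"[A-Za-z]+", pattern)  # strip non-alphabetic characters
    candidates = set(files)
    for word in words:
        candidates &= index.get(word, set())
    regex = re.compile(pattern)
    return sorted(name for name in candidates
                  if any(regex.search(line) for line in files[name].splitlines()))


files = {
    "a.txt": "abc then xyz",
    "b.txt": "abc only",
    "c.txt": "xyz\nabc",
}
index = build_index(files)
print(search("abc.*xyz", index, files))  # prints ['a.txt']
print(search("ab.*yz", index, files))    # prints []
```

The second query returns nothing even though `a.txt` contains a matching line, because `ab` and `yz` are not whole words in the index; this is the limitation the first sentence of the paragraph above describes.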