NAME
Regexp::Common::Apache2 - Apache2 Expressions

SYNOPSIS
use Regexp::Common qw( Apache2 );
use Regexp::Common::Apache2 qw( $ap_true $ap_false );

while( <> )
{
my $pos = pos( $_ );
/\G$RE{Apache2}{Word}/gmc and print "Found a word expression at pos $pos\n";
/\G$RE{Apache2}{Variable}/gmc and print "Found a variable $+{varname} at pos $pos\n";
}

# Override Apache2 expressions with the legacy ones
$RE{Apache2}{-legacy => 1}
# or use it with the Legacy prefix:
if( $str =~ /^$RE{Apache2}{LegacyVariable}$/ )
{
print( "Found variable $+{variable} with name $+{varname}\n" );
}

VERSION
v0.1.0

DESCRIPTION
This is a Perl port of Apache2 expressions:
<https://httpd.apache.org/docs/trunk/en/expr.html>

The regular expressions are based on the Apache2 Backus-Naur Form (BNF)
definition, as described below in "APACHE2 EXPRESSION".

You can also use the extended pattern by calling Regexp::Common::Apache2
like:

$RE{Apache2}{-legacy => 1}

All of the regular expressions use named capture. See "%+" in perlvar
for more information on named capture.
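As a minimal sketch of how named captures are read back through "%+", the snippet below uses a hand-written stand-in pattern rather than the module itself. The pattern $variable and the hash %captured are illustrative names only; they cover just the simplest form of a variable and are not the regular expression shipped by this module.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the '"%{" varname "}"' form of a variable.
# It is not the pattern shipped by Regexp::Common::Apache2.
my $variable = qr/(?<variable>%\{(?<varname>[A-Za-z_]\w*)\})/;

my $str = 'URI accessed is: %{REQUEST_URI}';
my %captured;
if( $str =~ $variable )
{
    # %+ only reflects the last successful match, so copy it before matching again
    %captured = %+;
    print( "variable=$captured{variable} varname=$captured{varname}\n" );
}
```

Copying "%+" into a regular hash right after the match keeps the values available even if another regular expression is run afterwards.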

APACHE2 EXPRESSION
comp
BNF:

stringcomp
| integercomp
| unaryop word
| word binaryop word
| word "in" listfunc
| word "=~" regex
| word "!~" regex
| word "in" "{" list "}"

$RE{Apache2}{Comp}

For example:

"Jack" != "John"
123 -ne 456
# etc

This uses other expressions, namely "stringcomp", "integercomp", "word",
"listfunc", "regex" and "list".

The capture names are:

*comp*
Contains the entire capture block

*comp_binary*
Matches the expression that uses a binary operator, such as:

==, =, !=, <, <=, >, >=, -ipmatch, -strmatch, -strcmatch, -fnmatch

*comp_binaryop*
The binary op used if the expression is a binary comparison. Binary
operator is:

==, =, !=, <, <=, >, >=, -ipmatch, -strmatch, -strcmatch, -fnmatch

*comp_integercomp*
When the comparison is for an integer comparison as opposed to a
string comparison.

*comp_list*
Contains the list used to check a word against, such as:

"Jack" in {"John", "Peter", "Jack"}

*comp_listfunc*
This contains the *listfunc* when the expression contains a word
checked against a list function, such as:

"Jack" in listMe("some arguments")

*comp_regexp*
The regular expression used when a word is compared to a regular
expression, such as:

"Jack" =~ /\w+/

Here, *comp_regexp* would contain "/\w+/"

*comp_regexp_op*
The regular expression operator used when a word is compared to a
regular expression, such as:

"Jack" =~ /\w+/

Here, *comp_regexp_op* would contain "=~"

*comp_stringcomp*
When the comparison is for a string comparison as opposed to an
integer comparison.

*comp_unary*
Matches the expression that uses unary operator, such as:

-d, -e, -f, -s, -L, -h, -F, -U, -A, -n, -z, -T, -R

*comp_word*
Contains the word that is the object of the comparison.

*comp_word_in_list*
Contains the expression of a word checked against a list, such as:

"Jack" in {"John", "Peter", "Jack"}

*comp_word_in_listfunc*
Contains the word when it is being compared to a listfunc, such as:

"Jack" in listMe("some arguments")

*comp_word_in_regexp*
Contains the expression of a word checked against a regular
expression, such as:

"Jack" =~ /\w+/

Here the word "Jack" (without the parenthesis) would be captured in
*comp_word*

*comp_worda*
Contains the first word in the comparison expression

*comp_wordb*
Contains the second word in the comparison expression

cond
BNF:

"true"
| "false"
| "!" cond
| cond "&&" cond
| cond "||" cond
| comp
| "(" cond ")"

$RE{Apache2}{Cond}

For example:

use Regexp::Common::Apache2 qw( $ap_true $ap_false );
($ap_false && $ap_true)

The capture names are:

*cond*
Contains the entire capture block

*cond_and*
Contains the expression like:

($ap_true && $ap_true)

*cond_false*
Contains the false expression like:

($ap_false)

*cond_neg*
Contains the expression if it is preceded by an exclamation mark,
such as:

!$ap_true

*cond_or*
Contains the expression like:

($ap_true || $ap_true)

*cond_true*
Contains the true expression like:

($ap_true)
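
The BNF for "cond" is recursive, which Perl can express with named subpatterns. The following is only a sketch under simplifying assumptions: it is not the pattern behind $RE{Apache2}{Cond}, it leaves out the "comp" alternative, and it rewrites the left-recursive "&&" and "||" rules as a loop over terms. The names $cond, "term" and %result are illustrative.

```perl
use strict;
use warnings;

# Stand-in grammar, not the module's pattern. The left-recursive rules
# (cond "&&" cond, cond "||" cond) become a loop over terms, and the
# comp alternative is omitted to keep the sketch short.
my $cond = qr{
    \A \s* (?&cond) \s* \z
    (?(DEFINE)
        (?<cond> (?&term) (?: \s* (?: && | \|\| ) \s* (?&term) )* )
        (?<term> true | false | ! \s* (?&term) | \( \s* (?&cond) \s* \) )
    )
}x;

my %result;
for my $expr ( 'true', '(false && true)', '!true || (false && !false)', 'true &&', '(true' )
{
    # Record whether the whole string is a well-formed condition
    $result{ $expr } = ( $expr =~ $cond ) ? 1 : 0;
    print( "$expr => ", ( $result{ $expr } ? 'valid' : 'invalid' ), "\n" );
}
```

The first three strings are accepted; the last two are rejected because of the dangling operator and the unbalanced parenthesis.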

expr
BNF: cond | string

$RE{Apache2}{Expr}

The capture names are:

*expr*
Contains the entire capture block

*expr_cond*
Contains the expression of the condition

*expr_string*
Contains the expression of a string

function
BNF: funcname "(" words ")"

$RE{Apache2}{Function}

For example:

base64("Some string")

The capture names are:

*function*
Contains the entire capture block

*function_args*
Contains the list of arguments. In the example above, this would be
"Some string"

*function_name*
The name of the function. In the example above, this would be
"base64"
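
To illustrate how the "function" captures relate to the BNF, here is a simplified stand-in pattern. It assumes a plain identifier as the function name and arguments without nested parentheses, which is narrower than what $RE{Apache2}{Function} is documented to handle; $function and %c are illustrative names.

```perl
use strict;
use warnings;

# Simplified stand-in for: funcname "(" words ")"
# Assumption: the arguments contain no nested parentheses.
my $function = qr/(?<function>(?<function_name>[A-Za-z_]\w*)\(\s*(?<function_args>[^()]*?)\s*\))/;

my %c;
if( 'base64("Some string")' =~ $function )
{
    # Copy the named captures before any other match overwrites them
    %c = %+;
    print( "name=$c{function_name} args=$c{function_args}\n" );
}
```

In this sketch the arguments capture keeps the surrounding double quotes; whether the module's own pattern does the same is not stated here.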

integercomp
BNF:

word "-eq" word | word "eq" word
| word "-ne" word | word "ne" word
| word "-lt" word | word "lt" word
| word "-le" word | word "le" word
| word "-gt" word | word "gt" word
| word "-ge" word | word "ge" word

$RE{Apache2}{IntegerComp}

For example:

123 -ne 456
789 gt 234
# etc

The hyphen before the operator is optional, so you can say "eq" instead
of "-eq"
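
The optional hyphen can be sketched with a stand-in pattern. This is not the module's regular expression: it only accepts plain digits on both sides, and it assumes the operator capture keeps the hyphen when one is present. The names $integercomp and @ops are illustrative.

```perl
use strict;
use warnings;

# Simplified stand-in for the integercomp rule, limited to plain digits.
# Assumption: the operator capture keeps the hyphen when present.
my $integercomp = qr/
    (?<integercomp>
        (?<integercomp_worda> \d+ )
        \s+
        (?<integercomp_op> -? (?: eq | ne | lt | le | gt | ge ) )
        \s+
        (?<integercomp_wordb> \d+ )
    )
/x;

my @ops;
for my $expr ( '123 -ne 456', '789 gt 234' )
{
    if( $expr =~ $integercomp )
    {
        # Both spellings of the operator are accepted by the same pattern
        push( @ops, $+{integercomp_op} );
        print( "$+{integercomp_worda} op=$+{integercomp_op} $+{integercomp_wordb}\n" );
    }
}
```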

The capture names are:

*integercomp*
Contains the entire capture block

*integercomp_op*
Contains the comparison operator

*integercomp_worda*
Contains the first word in the integer comparison

*integercomp_wordb*
Contains the second word in the integer comparison

join
BNF:

"join" ["("] list [")"]
| "join" ["("] list "," word [")"]

$RE{Apache2}{Join}

For example:

join({"word1" "word2"})
# or
join({"word1" "word2"}, ', ')

This uses "list" and "word"

The capture names are:

*join*
Contains the entire capture block

*join_list*
Contains the value of the list

*join_word*
Contains the value for word used to join the list

list
BNF:

split
| listfunc
| "{" words "}"
| "(" list ")"

$RE{Apache2}{List}

For example:

split( /\w+/, "Some string" )
# or
{"some", "words"}
# or
(split( /\w+/, "Some string" ))
# or
( {"some", "words"} )

This uses "split", "listfunc", "words" and "list"

The capture names are:

*list*
Contains the entire capture block

*list_func*
Contains the value if a "listfunc" is used

*list_list*
Contains the value if this is a list embedded within parentheses

*list_split*
Contains the value if the list is based on a split

*list_words*
Contains the value for a list of words.
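
As a rough illustration of the '"{" words "}"' form, the sketch below captures the content between the braces and then pulls out each double-quoted item. It is a stand-in, not the pattern behind $RE{Apache2}{List}, and it assumes the list holds only double-quoted strings; $list and @items are illustrative names.

```perl
use strict;
use warnings;

# Stand-in for the '"{" words "}"' branch of the list rule.
# Assumption: the list holds only double-quoted strings, with no nesting.
my $list = qr/(?<list>\{\s*(?<list_words>[^{}]*?)\s*\})/;

my @items;
if( '{"some", "words"}' =~ $list )
{
    my $words = $+{list_words};
    # In list context, a global match returns every captured group
    @items = $words =~ /"([^"]*)"/g;
    print( "words=$words items=@items\n" );
}
```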

listfunc
BNF: listfuncname "(" words ")"

$RE{Apache2}{ListFunc}

For example:

base64("Some string")

This is quite similar to the "function" regular expression

The capture names are:

*listfunc*
Contains the entire capture block

*listfunc_args*
Contains the list of arguments. In the example above, this would be
"Some string"

*listfunc_name*
The name of the function. In the example above, this would be
"base64"

regany
BNF: regex | regsub

$RE{Apache2}{Regany}

For example:

/\w+/i
# or
m,\w+,i

This regular expression includes "regex" and "regsub"

The capture names are:

*regany*
Contains the entire capture block

*regany_regex*
Contains the regular expression. See "regex"

*regany_regsub*
Contains the substitution regular expression. See "regsub"

regex
BNF:

"/" regpattern "/" [regflags]
| "m" regsep regpattern regsep [regflags]

$RE{Apache2}{Regex}

For example:

/\w+/i
# or
m,\w+,i

The capture names are:

*regex*
Contains the entire capture block

*regflags*
The regular expression modifiers. See perlre

This can be any combination of:

i, s, m, g

*regpattern*
Contains the regular expression. See perlre for examples and an
explanation of how to use regular expressions. Apache2 uses PCRE,
i.e. Perl Compatible Regular Expressions.

*regsep*
Contains the regular expression separator, which can be any of:

/, #, $, %, ^, |, ?, !, ', ", ",", ;, :, ".", _, -
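
The two forms of the "regex" rule can be sketched with one stand-in pattern that reuses the opening separator as the closing one. This is not the pattern behind $RE{Apache2}{Regex}; the names $seps, $regex and %seen are illustrative, and the separator list is copied from the one given above.

```perl
use strict;
use warnings;

# Separators listed above; quotemeta makes each one a literal character.
my $seps = quotemeta( q{/#$%^|?!'",;:._-} );

# Simplified stand-in for the regex rule, not the module's own pattern.
my $regex = qr{
    \A
    (?<regex>
        (?: m (?<regsep> [$seps] ) | (?<regsep> / ) )    # "m" plus separator, or a bare slash
        (?<regpattern> (?: \\. | (?! \k<regsep> ) . )+ )  # anything up to the closing separator
        \k<regsep>
        (?<regflags> [ismg]* )
    )
    \z
}xs;

my %seen;
for my $expr ( '/\w+/i', 'm,\w+,i' )
{
    if( $expr =~ $regex )
    {
        # Keep a private copy of the named captures for each expression
        my %c = %+;
        $seen{ $expr } = \%c;
        print( "sep=$c{regsep} pattern=$c{regpattern} flags=$c{regflags}\n" );
    }
}
```

Using the same group name in both alternatives lets "%+" report whichever separator actually matched.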

regsub
BNF: "s" regsep regpattern regsep string regsep [regflags]

$RE{Apache2}{Regsub}

For example:

s/\w+/John/gi

The capture names are:

*regflags*
The modifiers used which can be any combination of:

i, s, m, g

See perlre for an explanation of their usage and meaning

*regstring*
The string replacing the text found by the regular expression

*regsub*
Contains the entire capture block

*regpattern*
Contains the regular expression, which is Perl-compatible since
Apache2 uses PCRE.

*regsep*
Contains the regular expression separator, which can be any of:

/, #, $, %, ^, |, ?, !, ', ", ",", ;, :, ".", _, -

split
BNF:

"split" ["("] regany "," list [")"]
| "split" ["("] regany "," word [")"]

$RE{Apache2}{Split}

For example:

split( /\w+/, "Some string" )

This uses "regany", "list" and "word"

The capture names are:

*split*
Contains the entire capture block

*split_regex*
Contains the regular expression used for the split

*split_list*
The list being split. It can also be a word. See below

*split_word*
The word being split. It can also be a list. See above

string
BNF: substring | string substring

$RE{Apache2}{String}

For example:

URI accessed is: %{REQUEST_URI}

The capture names are:

*string*
Contains the entire capture block

stringcomp
BNF:

word "==" word
| word "!=" word
| word "<" word
| word "<=" word
| word ">" word
| word ">=" word

$RE{Apache2}{StringComp}

For example:

"John" == "Jack"
sub(s/\w+/Jack/i, "John") != "Jack"
# etc

The capture names are:

*stringcomp*
Contains the entire capture block

*stringcomp_op*
Contains the comparison operator

*stringcomp_worda*
Contains the first word in the string comparison

*stringcomp_wordb*
Contains the second word in the string comparison

sub
BNF: "sub" ["("] regsub "," word [")"]

$RE{Apache2}{Sub}

For example:

sub(s/\w/John/gi,"Peter")

The capture names are:

*sub*
Contains the entire capture block

*sub_regsub*
Contains the substitution expression, i.e. in the example above,
this would be:

s/\w/John/gi

*sub_word*
The target for the substitution. In the example above, this would be
"Peter"

substring
BNF: cstring | variable

$RE{Apache2}{Substring}

For example:

Jack
# or
%{REQUEST_URI}
# or
%{:sub(s/\b\w+\b/Peter/, "John"):}

See the "variable" and "word" regular expressions for more on those.

The capture names are:

*substring*
Contains the entire capture block

variable
BNF:

"%{" varname "}"
| "%{" funcname ":" funcargs "}"
| "%{:" word ":}"
| "%{:" cond ":}"
| rebackref

$RE{Apache2}{Variable}
# or
$RE{Apache2}{LegacyVariable}

For example:

%{REQUEST_URI}
# or
%{md5:"some string"}
# or
%{:sub(s/\b\w+\b/Peter/, "John"):}
# or a reference to previous regular expression capture groups
$1, $2, etc..

See the "word" and "cond" regular expressions for more on those.

The capture names are:

*variable*
Contains the entire capture block

*var_cond*
If this is a condition inside a variable, such as:

%{:$ap_true == $ap_false:}

*var_func_args*
Contains the function arguments.

*var_func_name*
Contains the function name.

*var_word*
A variable containing a word. See "word" for more information about
word expressions.

*varname*
Contains the variable name without the percent sign or dollar sign
(if legacy regular expression is enabled) or the possible
surrounding curly braces

word
BNF:

digits
| "'" string "'"
| '"' string '"'
| word "." word
| variable
| sub
| join
| function
| "(" word ")"

$RE{Apache2}{Word}

This is the most complex regular expression, since it uses all the
others and can recurse deeply.

For example:

12
# or
"John"
# or
'Jack'
# or
%{REQUEST_URI}
# or
%{HTTP_HOST}.%{HTTP_PORT}
# or
%{:sub(s/\b\w+\b/Peter/, "John"):}
# or
sub(s,\w+,Paul,gi, "John")
# or
join({"Paul", "Peter"}, ', ')
# or
md5("some string")
# or any word surrounded by parentheses, such as:
("John")

See the "string", "word", "variable", "sub", "join" and "function"
regular expressions for more on those.

The capture names are:

*word*
Contains the entire capture block

*word_digits*
If the word is actually digits, this contains those digits.

*word_dot_word*
This contains the text when two words are separated by a dot.

*word_enclosed*
Contains the value of the word enclosed by single or double quotes
or by surrounding parentheses.

*word_function*
Contains the word containing a "function"

*word_join*
Contains the word containing a "join"

*word_quote*
If the word is enclosed by single or double quotes, this contains
the single or double quote character

*word_sub*
If the word is a substitution, this contains the substitution

*word_variable*
Contains the word containing a "variable"

words
BNF:

word
| word "," list

$RE{Apache2}{Words}

For example:

"Jack"
# or
"John", {"Peter", "Paul"}
# or
sub(s/\b\w+\b/Peter/, "John"), {"Peter", "Paul"}

See the "word" and "list" regular expressions for more on those.

The capture names are:

*words*
Contains the entire capture block

*words_word*
Contains the word

*words_list*
Contains the list

LEGACY
There are 2 expressions that can be used as legacy:

*comp*
See "comp"

*variable*
See "variable"

CHANGES & CONTRIBUTIONS
Feel free to reach out to the author for possible corrections,
improvements, or suggestions.

AUTHOR
Jacques Deguest <[email protected]>

SEE ALSO
<https://httpd.apache.org/docs/trunk/en/expr.html>

COPYRIGHT & LICENSE
Copyright (c) 2020 DEGUEST Pte. Ltd.

You can use, copy, modify and redistribute this package and associated
files under the same terms as Perl itself.