www.oradev.com
  Database   Sql   Functions   Packages   Performance   Books   Oracle   Other   About   XML   ORA-messages
  Regular Expressions in Oracle (pl/)SQL

Database/Sql

Oracle Regular Expressions
Timestamp
SQL Date format
String concatenation
Loop in pl/sql
SQL IN-clause
Regular Expressions Examples
Flashback query
Grant/revoke privileges
Sequence
Rename tables, columns
Insert into Oracle
Database name
Table with sequenced numbers
Oracle connect by
Add columns to table


  OraDev.com

Regular Expressions in Oracle SQL

Since Oracle 10g you can use regular expressions in the database. Regular expressions enable you to search for patterns in string data by using standardized syntax conventions. You specify a regular expression through the following types of characters:
-Metacharacters, which are operators that specify search algorithms
-Literals, which are the characters for which you are searching

Regular Expressions Functions

: Function
DescriptionSyntax
REGEXP_LIKE This is like the LIKE operator, but much more powerfull. REGEXP_LIKE(search_string, pattern [,match_option])
REGEXP_INSTR This functions searches for a regular expression pattern in a string and returns the position of the first character of the string that matches the expression. REGEXP_INSTR(search_string, pattern [, position [, occurrence [, return_option [, match_option]]]])
REGEXP_REPLACE This functions searches for a regular expression pattern in a string and replaces this with a replacement string. REGEXP_REPLACE(search_string, pattern [,replacestring [, position [, occurrence [, match_option]]]])
REGEXP_SUBSTR This function you can use to search for a regular expression pattern and return the matched substring. REGEXP_SUBSTR(search_string, pattern [, position [, occurrence [, match_option]]])
REGEXP_COUNT counts the number of times a pattern appears in a string. REGEXP_COUNT(search_string, pattern [,position] [,match_option])
with:
search_string: the string you are searching in.
pattern: A regular expression. (max 512 bytes).
position: The character position from which you want the search to begin.
occurrence: The occurence you are searching for. If you want to return the second match for the regular expression, this is a 2.
return_option: 0 means return the pattern's beginning position. 1 means return the ending character position. match_option can have one of these values:
'c': case-sensitive (default) ;
'i': case-insensitive ;
'n': Allow match-any-character operator (.) ;
'm': Treat source string as multiple line. For every line ^ and $ will be used for the beginning and ending of the line.

Example: select * from emp where REGEXP_LIKE(name,'A.{2,5))
means select rows from emp table where name starts with an A followed by 2 to 5 chararcters.

Metacharacters


POSIX (Standard) Metacharacters

The POSIX standard for matching ASCII data is used in other languages also. Here is an overview of the POSIX metacharacters used in Oracle:
Operator Description
* Matches zero or more occurrences
+ Matches one or more occurrences
? Matches zero or one occurrence
. Matches any character in the supported character set except NULL
| Alternation operator (OR) for specifying alternative matches
^ Matches the beginning of a string by default. In multiline mode, it matches the beginning of any line anywhere within the source string.
$ Matches the end of a string by default. In multiline mode, it matches the end of any line anywhere within the source string.
\ Escape character. Use a backslash to search for a character that is normally threated as a metacharacter.
[ ] Bracket expression for specifying a matching list that should match any one of the expressions represented in the list.
[^ ] Non matching character list. Matches none of the expressions represented in the list.
( ) Grouping expression, treated as a single subexpression
{m} Matches exactly m times
{m,} Matches at least m times
{m,n} Matches at least m times but no more than n times
\n The backreference expression (n is a digit between 1 and 9) matches the nth subexpression enclosed between '(' and ')' preceding the \n
[..] Specifies one collation element, and can be a multicharacter element (for example, [.ch.] in Spanish)
[: :] Specifies POSIX character classes. It matches any character within the character class. This can be:
[:alnum:] Alphanumeric characters
[:alpha:] Alphabetic characters
[:ascii:] ASCII characters
[:blank:] Space and tab
[:cntrl:] Control characters
[:digit:] Digits
[:graph:] Visible characters
[:lower:] Lowercase letters [a-z]
[:print:] Visible characters and spaces
[:punct:] Punctuation and symbols.
[:space:] All whitespace characters, including line breaks
[:upper:] Uppercase letters
[:word:] Word characters
[:xdigit:] Hexadecimal digits
Character classes are supported only within bracket expressions. So [:blank]{10} does not work, but [[:blank:]]{10} is ok.
[==] Specifies equivalence classes. For example, [=a=] matches all characters having base letter 'a'.

Perl has expressions that are not in the POSIX standard. Because these expressions are widly used, they are also implemented in Oracle.
Operator Description
\d A digit character. It is equivalent to the POSIX class [[:digit:]].
\D A nondigit character. It is equivalent to the POSIX class [^[:digit:]].
\w A word character, which is defined as an alphanumeric or underscore (_) character. It is equivalent to the POSIX class [[:alnum:]_].
\W A nonword character. It is equivalent to the POSIX class [^[:alnum:]_].
\pP Names properties. This is no POSIX equivalent class.
\s A whitespace character. It is equivalent to the POSIX class [[:space:]] except that [:space:] includes vertical tab and \s does not.
\S A nonwhitespace character. It is equivalent to the POSIX class [^[:space:]].
\A Only at the beginning of a string. In multi-line mode, that is, when embedded newline characters in a string are considered the termination of a line, \A does not match the beginning of each line.
\Z Only at the end of string or before a newline ending a string. In multi-line mode, that is, when embedded newline characters in a string are considered the termination of a line, \Z does not match the end of each line.
z Only at the end of a string.
*? The preceding pattern element 0 or more times ("nongreedy"). This quantifier matches the empty string whenever possible.
+? The preceding pattern element 1 or more times ("nongreedy").
?? The preceding pattern element 0 or 1 time ("nongreedy"). This quantifier matches the empty string whenever possible.
{n}? The preceding pattern element exactly n times ("nongreedy"). In this case {n}? is equivalent to {n}.
{n,}? The preceding pattern element at least n times ("nongreedy").
{n,m}? At least n but not more than m times ("nongreedy"). {0,m}? matches the empty string whenever possible.

Regular expressions in constraints

Regular expressions can be used to enforce constraints on a database table.

For example:
alter table employee
add CONSTRAINT emp_ck1
CHECK (REGEXP_LIKE (telephone_nr, '\d{0,4}[-]?\d{6,103}'));