


Regular Expressions:
PCRE and POSIX RegEx Solutions

Choose Your Weapon:

A Note on PCRE Implementations in Different Text Editors
(and other tools with Regular Expression search and search/replace capabilities)

It’s important to recognize that, even though many of them are based on Perl’s regular expression syntax (hence the term Perl Compatible Regular Expressions), not all RegEx-powered search tools recognize precisely the same expression patterns. As a result, a RegEx-enhanced search-and-replace tool will not always respond as expected to the search string you enter. Don’t lose hope, however: the differences tend to be subtle compared to the similarities. For example, one of the RegEx patterns listed below illustrates how to use a particular tool for extracting separate databases from a complete MySQL server database backup. The syntax, as shown in that example, uses a backslash character to escape the parentheses that define separate RegEx atoms. That syntax dates back to the UNIX search tools sed and grep, whose default (POSIX basic) syntax treats \( and \) as grouping and a bare parenthesis as a literal character. PCRE does the opposite: ( and ) group, and \( matches a literal parenthesis. For the record, most tools I’ve used do not require escaping the atom-encapsulating parentheses in that manner, but I’ve listed it here as a reminder that sometimes a failure to match is not due to the pattern itself, but to the variations in regular expression syntax among different search tools. Should you encounter a peculiarity such as this, it’s best to consult the manual for the software first, before spending any time debugging the pattern.
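To make the difference concrete, here is a small sketch in Python, whose re module follows the Perl-style convention. The sample line and the variable names are made up for illustration. The first pattern is the sed/grep-style one used later on this page, and the second is the same pattern with the backslashes removed from the grouping parentheses.

```python
import re

# Hypothetical sample line, in the style of a SQL dump comment.
line = "-- Database: my_blog"

# sed/grep-style (POSIX basic) grouping: \( and \) mark the atoms.
# In a Perl-style engine such as Python's re, \( means a LITERAL parenthesis.
bre_style = r"^\(\-\-\sDatabase:\s\)\([\w]*\)"

# Perl-style grouping: plain ( and ) mark the atoms.
perl_style = r"^(\-\-\sDatabase:\s)([\w]*)"

bre_result = re.search(bre_style, line)
perl_result = re.search(perl_style, line)

print(bre_result)            # None: the line contains no literal parentheses
print(perl_result.group(2))  # my_blog
```

Neither pattern is wrong. Each one is simply written for a different flavor, which is why the tool's manual is the first place to look.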

RegEx is Cool!

Like most people who are unfamiliar with Regular Expressions, I too found RegEx intimidating at first. The primary reason for its mystical difficulty, I believe, is that the syntax is unlike anything familiar to most people. The special characters and patterns seem completely arbitrary, and therefore the RegEx newbie has no familiar ground on which to stand. With every step into RegEx, the newcomer is very aware of his footing, and must be very careful not to fall into the illusion of an abyss looming below! The key to survival in Regular Expressions, in the beginning, is to accept that learning the basics will require more memorization, and less apparent logic. Don’t let that scare you. But if you try to understand regex before memorizing the basics, like metacharacter, atom, and class syntax, you may be doomed to frustration.

Making Comparisons

Let’s consider the familiar quality inherent in some other scripting languages, like PHP, and recall how we were able to start building our own functional PHP apps after only a few tutorials.
For example, think back to your first introduction to a flow control construct (in PHP, or otherwise), where the script reaches a decision point, such as [ if :: elseif ]. The [ if :: elseif ] syntax is not unlike the reasoning we use in our daily lives.

If they have fries, I’ll take fries and a Coke. If there are no fries, then I want an orange juice.

The order is a bit unusual, but the person receiving it is nevertheless able to understand precisely the logic he or she must employ to get the order correct. When we encounter an [ if :: elseif ] construct in PHP, for example, we can reason with it just as well. Regular Expressions, on the other hand, offer us no logic that is easily comparable to such a real-world situation. There is very little about regular expressions that seems familiar upon first encounter, but so what! Think of it as a breath of fresh air, and take comfort in the thought that it won’t be long before you master the basics and are able to perform some pretty powerful search and replace functions!

When we break it down into the number of elements which must be memorized (in PCRE RegEx at least) it’s really not that complicated. It’s true, there is a specific syntax, and as one gets deeper and deeper into the more powerful expression building, the syntax one must commit to memory will grow a bit more complex, but all of the expressions are testing for the existence of a match. The RegEx syntax is designed to test for true or false; on or off; 1 or 0; it fits or it doesn’t fit. It does so in any number of ways, building pattern upon pattern, until each permutation has been satisfied, as we check the RegEx for a match against a target string. That’s not to say that I don’t make errors in my own RegEx, but the point is that the logic is VERY cut and dried. So, one must learn the syntax, practice the expressions and understand a few useful patterns. From there, it’s a matter of growing confidence with each new use of RegEx, kind of like learning to spell more complex words: the more you write those words, the more familiar and less difficult they become.
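As a rough illustration of that true-or-false nature, the sketch below checks three made-up patterns against three made-up strings, one for each of the basic building blocks mentioned earlier (metacharacter, atom, class). It is written in Python only because it is easy to run; the patterns themselves are ordinary Perl-style syntax.

```python
import re

# Hypothetical examples; each pattern uses one basic building block.
checks = [
    (r"colou?r", "color"),   # metacharacter: ? makes the "u" optional
    (r"(ab)+c", "ababc"),    # atom: the group (ab) is repeated as one unit
    (r"[aeiou]", "rhythm"),  # class: any single one of the listed vowels
]

# Every check boils down to the same question: is there a match or not?
results = [bool(re.search(pattern, text)) for pattern, text in checks]

for (pattern, text), matched in zip(checks, results):
    print(f"{pattern!r} against {text!r}: {matched}")
```

The first two print True and the last prints False, because "rhythm" contains none of the five listed vowels.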

My Regular Expression Solutions Library

I came across a few folders on an old hard drive where I had stored some Regular Expressions I found online. I realized that I had not yet begun a RegEx section here at NoviceNotes.Net, so I thought this would be the perfect opportunity to do so. I didn’t want to lose these notes, but I do want to wipe this old hard drive. So, I welcome you to share in the notes on Regular Expressions I intend to post here.

It’s all just syntax. To me it’s like spelling. Different words. Take what you need– it’s all just a collection of letters and words, so to speak. Good luck to you in your own RegEx Adventures!


Validate a Telephone Number – PHP preg_match() String:
This expression, written in PHP using the built-in Perl Compatible RegEx (a.k.a. PCRE) function preg_match(), can be used to check that a user who encounters a telephone number submission form submits a number in the expected 000-000-0000 format. It checks the format only, not whether the number actually exists. I have not put this expression into production, but I did try it. As far as I can tell, it does what it’s supposed to do! Enjoy:


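// $phone is assumed to already hold the raw value submitted by the form.
$phone = trim($phone);

// An empty value is allowed here; anything else must look like 000-000-0000.
if (!empty($phone) && !preg_match('/^\d{3}-\d{3}-\d{4}$/', $phone))
{
    $errorCheck = TRUE;
    $errorText = "Please use the 000-000-0000 format for a phone number.";
    $valid = FALSE;
}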
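To see which inputs the pattern accepts and rejects, here is a minimal sketch of the same check in Python. The function name and the sample numbers are my own, chosen for illustration. Unlike the PHP snippet, which lets an empty field through, this sketch treats an empty string as invalid.

```python
import re

# Same pattern as the PHP example. fullmatch() plays the role of ^ and $,
# and re.ASCII restricts \d to the digits 0-9.
PHONE_PATTERN = re.compile(r"\d{3}-\d{3}-\d{4}", re.ASCII)


def is_valid_phone(phone: str) -> bool:
    """Return True when phone is exactly in the 000-000-0000 format."""
    return PHONE_PATTERN.fullmatch(phone.strip()) is not None


# Hypothetical sample inputs.
for sample in ["555-123-4567", " 555-123-4567 ", "5551234567", "555-123-45678", ""]:
    print(repr(sample), is_valid_phone(sample))
```

Only the first two samples pass: surrounding whitespace is trimmed before the test, while missing hyphens or an extra digit cause the match to fail.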

Extract Database Names from a MySQL Server Backup
If you ever find that you are unable to use phpMyAdmin to create your MySQL backup, I discovered a convenient, and apparently more failsafe, method: backing up MySQL as one complete .sql file, like the one cPanel creates using the Backup Database option, or like the one phpMyAdmin creates on Export.
If it is practical in your case to use the free-to-download, free-to-use (OEM) MySQL GUI Tool, MySQL Administrator (e.g. if you have a working copy of MySQL Administrator running on your system), instead of phpMyAdmin for the database backup process, I suggest you try it next time. The resulting SQL file is nearly identical to the phpMyAdmin output we’re used to seeing, but for a few slight differences.

The following is the Regular Expression I used to search for the desired data within the MySQL Administrator SQL output: the complete localhost database backup created by the MySQL GUI Tool. The regular expression is shown here enclosed in quotation marks; however, the encapsulating quotation marks are NOT part of the actual Regular Expression itself:

 "^\(\-\-\sDatabase:\s\)\([\w]*\)"

Note: phpMyAdmin does not use precisely the same syntax as MySQL Administrator to describe the database creation query to follow, but instead uses the following: “-- Database: `database_name`”. Observe that phpMyAdmin encapsulates the database name in a delimiter character, the back-tick. Since that character cannot be used in the filename, and the purpose of this RegEx search and replace operation is to make it more efficient to create a new .SQL file for each database extracted from the LOCALHOST .sql dump, the regular expression for a phpMyAdmin .SQL export requires the following modification, in the SEARCH string only:

"^\(\-\-\sDatabase:\s\)`\([\w]*\)`"

The “back-tick” characters have been placed outside of the second atom, so they are not included in the backreference.

Using the .SQL file from MySQL Administrator, loaded in Notepad2 (a Scintilla-based editor), the replace syntax is as follows, again enclosed in quotes only here. Do NOT use the quotes if you wish to try this yourself, or it won’t work properly:

"\1\2\n-- \2.sql"

The same REPLACE string should work without any changes on the “find” results of the modified RegEx for phpMyAdmin exports, provided above.
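The search and replace strings above can be tried outside the editor as well. The sketch below uses Python's re.sub and is an approximation, not a reproduction of Notepad2's behavior: the two sample lines and the variable names are hypothetical, and the search patterns are rewritten with unescaped parentheses because Python uses Perl-style grouping. The replace string is the one from this page, unchanged.

```python
import re

# Hypothetical header lines, one in each export style.
mysql_admin_line = "-- Database: shop"
phpmyadmin_line = "-- Database: `blog`"

# Perl-style spellings of the two search patterns (unescaped parentheses).
search_plain = r"^(\-\-\sDatabase:\s)([\w]*)"
search_backtick = r"^(\-\-\sDatabase:\s)`([\w]*)`"

# The same replace string serves both patterns; \n becomes a line break.
replace = r"\1\2\n-- \2.sql"

out_plain = re.sub(search_plain, replace, mysql_admin_line)
out_backtick = re.sub(search_backtick, replace, phpmyadmin_line)

print(out_plain)
print(out_backtick)
```

In both cases the header line is kept and a second comment line carrying the future filename (shop.sql, blog.sql) is added beneath it. In the phpMyAdmin case the back-ticks are dropped, since they sit outside the second atom.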

Note: I used Notepad2 because of a unique “Select-To” feature in its Find and Replace dialogue. Instead of simply finding a phrase with my regular expression, Notepad2 can FIND one instance of the phrase, THEN “Select-To” the following instance of that phrase, thereby selecting everything in between. In my case, Notepad2 searched for “Create Schema”, a bit of text inserted by MySQL Administrator before each “CREATE DATABASE” expression, which marks the beginning of each new database. It was the regular expression finding this phrase accurately, and the very existence of the phrase, that enabled me to make more efficient use of my time. I hope you find this useful! For more information on Notepad2, see http://www.flos-freeware.ch/ . I recommend the supplementary file-browser software mentioned in the “Notepad2” documentation. “metapath”, the file-browser, is simple to add to Notepad2.
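For readers without Notepad2, the find-then-select-to-next idea can be approximated in a script. The following is only a sketch under stated assumptions: the dump text is a made-up excerpt, the function name split_by_database is my own, and the marker used is the “-- Database:” comment line from the patterns above rather than the “Create Schema” text. Each section runs from one marker to the start of the next, which is what the selection in the editor captures.

```python
import re

# Hypothetical dump excerpt; a real server backup is far longer.
dump = (
    "-- Database: shop\n"
    "CREATE DATABASE shop;\n"
    "CREATE TABLE items (id INT);\n"
    "-- Database: blog\n"
    "CREATE DATABASE blog;\n"
    "CREATE TABLE posts (id INT);\n"
)

# MULTILINE lets ^ match at the start of every line, not just the first.
marker = re.compile(r"^-- Database:\s(\w*)", re.MULTILINE)


def split_by_database(text: str) -> dict:
    """Map '<name>.sql' to the text from its marker up to the next marker."""
    matches = list(marker.finditer(text))
    sections = {}
    for i, m in enumerate(matches):
        # The section ends where the next marker begins, or at end of text.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        sections[f"{m.group(1)}.sql"] = text[m.start():end]
    return sections


sections = split_by_database(dump)
for filename, body in sections.items():
    print(filename, len(body.splitlines()), "lines")
```

The sketch keeps the sections in a dictionary instead of writing files, so nothing on disk is touched. Writing each value out under its key would be the scripted counterpart of saving each selection as its own .SQL file.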

Please send suggestions or corrections to learn at novice notes ·com. Thank you for your participation!

