FBB::Pattern(3bobcat)

Pattern matcher
(libbobcat-dev_6.04.00)

2005-2023

NAME

FBB::Pattern - Performs RE pattern matching

SYNOPSIS

#include <bobcat/pattern>
Linking option: -lbobcat

DESCRIPTION

Pattern objects may be used for Regular Expression (RE) pattern matching. The class is a wrapper around the regcomp(3) family of functions. By default it uses `extended regular expressions', so multipliers and bounding characters must be escaped when they should be interpreted as ordinary characters: *, +, ?, ^, $, |, (, ), [, ], {, and } must each be preceded by a backslash when used literally.
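The escaping requirement can be tried directly against regcomp(3), without Bobcat. The following is a minimal sketch: the helper name ereMatches is hypothetical and not part of Bobcat, and the flags shown (REG_EXTENDED | REG_NOSUB) are an assumption chosen for the illustration, not necessarily the options Pattern passes.

```cpp
#include <regex.h>
#include <stdexcept>
#include <string>

// Hypothetical helper (not a Bobcat function): returns true if `text`
// contains a match of the POSIX extended regular expression `pattern`.
bool ereMatches(std::string const &pattern, std::string const &text)
{
    regex_t re;

    // REG_EXTENDED selects extended REs; REG_NOSUB: only a yes/no answer
    if (regcomp(&re, pattern.c_str(), REG_EXTENDED | REG_NOSUB) != 0)
        throw std::invalid_argument("cannot compile: " + pattern);

    bool found = regexec(&re, text.c_str(), 0, nullptr, 0) == 0;
    regfree(&re);                       // release the compiled expression
    return found;
}
```

With this helper, ereMatches("a+", "caaat") is true because + acts as a multiplier, whereas ereMatches(R"(a\+)", "caaat") is false: the escaped + is an ordinary character, so the text must contain a literal a+.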

The Pattern class supports the use of the following (Perl-like) special escape sequences:
\b - indicating a word-boundary
\d - indicating a digit ([[:digit:]]) character
\s - indicating a white-space ([[:space:]]) character
\w - indicating a word ([[:alnum:]]) character

The corresponding capitals (e.g., \W) define the complementary character sets. The capitalized character set shorthands are not expanded inside explicit character-classes (i.e., [ ... ] constructions). So [\W] represents a set of two characters: \ and W.
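The shorthand handling can be illustrated with a small stand-alone rewriter. This is a sketch under stated assumptions, not Bobcat's implementation: the function name expandShorthands is hypothetical, the complementary sets are assumed to be written as negated bracket expressions such as [^[:alnum:]], and two corner cases (a ] directly following [, and named classes already present inside a set) are deliberately not handled.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper, NOT Bobcat's implementation: rewrites \d \s \w (and,
// outside [ ... ], \D \S \W) to POSIX character classes.
// Simplifications: a ']' directly after '[' and named classes already
// present inside a set are not handled.
std::string expandShorthands(std::string const &pattern)
{
    std::string out;
    bool inSet = false;                     // true while inside [ ... ]

    for (std::size_t idx = 0; idx != pattern.size(); ++idx)
    {
        char ch = pattern[idx];

        if (ch == '\\' && idx + 1 != pattern.size())
        {
            char next = pattern[idx + 1];
            bool upper = next == 'D' || next == 'S' || next == 'W';
            char const *name =
                next == 'd' || next == 'D' ? "[:digit:]" :
                next == 's' || next == 'S' ? "[:space:]" :
                next == 'w' || next == 'W' ? "[:alnum:]" : nullptr;

            if (name != nullptr && !(inSet && upper))
            {
                if (inSet)                  // [a\d] becomes [a[:digit:]]
                    out += name;
                else
                    out += std::string(upper ? "[^" : "[") + name + "]";
                ++idx;                      // skip the shorthand letter
                continue;
            }

            out += ch;                      // any other escape: keep as-is
            if (!inSet)                     // outside a set keep the pair
                out += pattern[++idx];      // together, so \[ opens no set
            continue;
        }

        if (ch == '[' && !inSet)
            inSet = true;
        else if (ch == ']' && inSet)
            inSet = false;

        out += ch;
    }
    return out;
}
```

Under these assumptions \d+ is rewritten to [[:digit:]]+ and \W to [^[:alnum:]], while [\W] is left untouched and therefore remains the two-character set containing \ and W. Sequences such as \b are passed through unchanged.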

As the backslash (\) is treated as a special character it should be handled carefully. Pattern converts the escape sequences \d \s \w (and outside of explicit character classes the sequences \D \S \W) to their respective character classes. All other escape sequences are kept as-is, and the resulting regular expression is offered to the pattern matching compilation function regcomp(3). This function interprets escape sequences. Consequently some care should be exercised when defining patterns containing escape sequences. Here are the rules:

NAMESPACE

FBB
All constructors, members, operators and manipulators, mentioned in this man-page, are defined in the namespace FBB.

INHERITS FROM

-

TYPEDEF

CONSTRUCTORS

Copy and move constructors (and assignment operators) are available.

MEMBER FUNCTIONS

Pattern does not inherit from other classes (see INHERITS FROM), so only its own members are available. The program in the EXAMPLE section uses match, before, matched, beyond, end, position, and pattern.

OVERLOADED OPERATORS

EXAMPLE

#include <exception>
#include <iostream>
#include <string>

#include <bobcat/pattern>

using namespace std;
using namespace FBB;

void showSubstr(string const &str)
{
    static int count = 0;

    cout << "String " << ++count << " is '" << str << "'\n";
}

void match(Pattern const &patt, string const &text)
try
{
    Pattern pattern{ patt };

    pattern.match(text);

    Pattern p3(pattern);

    cout << "before:  " << p3.before() << "\n"
            "matched: " << p3.matched() << "\n"
            "beyond:  " << pattern.beyond() << "\n"
            "end() = " << pattern.end() << '\n';

    for (size_t idx = 0; idx != pattern.end(); ++idx)
    {
        string str = pattern[idx];

        if (str.empty())
            cout << "part " << idx << " not present\n";
        else
        {
            Pattern::Position pos = pattern.position(idx);

            cout << "part " << idx << ": '" << str << "' (" <<
                        pos.first << "-" << pos.second << ")\n";
        }
    }
}
catch (exception const &exc)
{
    cout << exc.what() << '\n';
}

int main(int argc, char **argv)
{
    string patStr = R"(\d+)";

    do
    {
        cout << "Pattern: '" << patStr << "'\n";
        try
        {
                // case sensitive when run without arguments;
                // pass any argument for case-insensitive matching
            Pattern patt(patStr, argc == 1);

            cout << "Compiled pattern: " << patt.pattern() << '\n';

            while (true)
            {
                cout << "string to match : ";

                string text;
                getline(cin, text);
                if (text.empty())
                    break;
                cout << "String: '" << text << "'\n";
                match(patt, text);
            }
        }
        catch (exception const &exc)
        {
            cout << exc.what() << ": compilation failed\n";
        }

        cout << "New pattern: ";
    }
    while (getline(cin, patStr) and not patStr.empty());
}
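
The before, matched and beyond parts printed by this program can be related to the offsets that regexec(3) reports for the complete match. The sketch below shows one way a wrapper might derive them. It is not Bobcat's source: the names Parts and splitOnMatch are hypothetical, and throwing an exception when nothing matches is an assumption modeled on the try block used in the example.

```cpp
#include <cstddef>
#include <regex.h>
#include <stdexcept>
#include <string>

// Hypothetical result type: the text before, inside and beyond the match.
struct Parts
{
    std::string before;
    std::string matched;
    std::string beyond;
};

// Hypothetical helper (not a Bobcat function): splits `text` around the
// first match of the extended regular expression `pattern`.
Parts splitOnMatch(std::string const &pattern, std::string const &text)
{
    regex_t re;
    if (regcomp(&re, pattern.c_str(), REG_EXTENDED) != 0)
        throw std::invalid_argument("cannot compile: " + pattern);

    regmatch_t whole[1];                // element 0: the complete match
    int rc = regexec(&re, text.c_str(), 1, whole, 0);
    regfree(&re);

    if (rc != 0)                        // assumption: report 'no match'
        throw std::runtime_error("no match");   // via an exception

    std::size_t begin = static_cast<std::size_t>(whole[0].rm_so);
    std::size_t end = static_cast<std::size_t>(whole[0].rm_eo);

    return Parts{ text.substr(0, begin),
                  text.substr(begin, end - begin),
                  text.substr(end) };
}
```

For the pattern [[:digit:]]+ and the text abc123def this yields abc, 123 and def, respectively.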

FILES

bobcat/pattern - defines the class interface

SEE ALSO

bobcat(7), regcomp(3), regex(3), regex(7)

BUGS

Using Pattern objects as static data members of classes (or as global objects) is potentially dangerous. If the object files defining these static data members are stored in a dynamic library they may not be initialized properly or in time, and their eventual destruction may result in a segmentation fault. This is a well-known problem with static data; see, e.g., http://www.parashift.com/c++-faq-lite/ctors.html#faq-10.15. In such situations prefer a (shared or unique) pointer to a Pattern, initializing the pointer when, e.g., it is first used.
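
The pointer-based workaround might look as follows. This is only a sketch: Matcher is a hypothetical stand-in for FBB::Pattern so that the fragment builds without Bobcat, Parser is a hypothetical user class, and the code is not thread-safe as written.

```cpp
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for FBB::Pattern (assumption: constructible from a
// regular expression string), so the sketch builds without Bobcat.
class Matcher
{
    std::string d_regex;

    public:
        explicit Matcher(std::string regex)
        :
            d_regex(std::move(regex))
        {}

        std::string const &pattern() const
        {
            return d_regex;
        }
};

class Parser                                    // hypothetical user class
{
    static std::unique_ptr<Matcher> s_digits;   // a pointer, not an object

    public:
        static Matcher &digits();
};

// The pointer starts out empty: no Matcher is constructed at load time.
std::unique_ptr<Matcher> Parser::s_digits;

Matcher &Parser::digits()
{
    if (!s_digits)                              // construct at first use
        s_digits = std::make_unique<Matcher>(R"(\d+)");

    return *s_digits;
}
```

Callers use Parser::digits() wherever the static object would otherwise have been used; repeated calls return the same object.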

BOBCAT PROJECT FILES

BOBCAT

Bobcat is an acronym of `Brokken's Own Base Classes And Templates'.

COPYRIGHT

This is free software, distributed under the terms of the GNU General Public License (GPL).

AUTHOR

Frank B. Brokken (f.b.brokken@rug.nl).