FBB::String(3bobcat)
Operations on std::string objects
(libbobcat-dev_6.04.00)
2005-2023
NAME
FBB::String - Several operations on std::string objects
SYNOPSIS
#include <bobcat/string>
Linking option: -lbobcat
DESCRIPTION
This class offers facilities for often used transformations on
std::string objects, which are not supported by the std::string
class itself. All members of FBB::String are static.
Initially this class was derived from std::string. Deriving from
std::string, however, is considered bad design as std::string was
not designed as a base-class.
FBB::String offers a series of static member functions
providing the facilities originally implemented as non-static members. One of
these members is the (overloaded) split member, splitting a string into
elements separated by one or more configurable characters. These elements may
contain or consist of double- or single-quoted (sub) strings and escape
characters. Escape characters are converted to their implied byte-values
(e.g., \n is converted to byte value 10) unless they are embedded in
single-quoted (sub) strings. Quotes surrounding double- and single-quoted
(sub) strings are removed from the elements returned by the split
members.
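The quote and escape handling described above can be illustrated with a small stand-alone sketch. It is not the Bobcat implementation: the names splitQuotedSketch and simpleEscape are hypothetical, only the \n and \t escapes are translated, and the sketch assumes that escapes are left untouched only inside single-quoted (sub) strings.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: translates only \n and \t; any other escaped
// character is returned as-is.
char simpleEscape(char ch)
{
    return ch == 'n' ? '\n' : ch == 't' ? '\t' : ch;
}

// Hypothetical sketch of quote-aware splitting (no empty elements are
// returned for consecutive separators, comparable to the TOK mode).
std::vector<std::string> splitQuotedSketch(std::string const &str,
                                           char const *sep = " \t")
{
    std::string const seps{ sep };
    std::vector<std::string> out;
    std::string field;
    bool inField = false;       // true once an element has been started
    char quote = 0;             // the active quote character, or 0

    for (std::size_t idx = 0; idx != str.size(); ++idx)
    {
        char ch = str[idx];
        if (quote != 0)                         // inside a quoted substring
        {
            if (ch == quote)
                quote = 0;                      // closing quote: dropped
            else if (ch == '\\' && quote == '"' && idx + 1 != str.size())
                field += simpleEscape(str[++idx]);
            else
                field += ch;                    // single quotes: kept verbatim
        }
        else if (ch == '\'' || ch == '"')
        {
            quote = ch;                         // opening quote: dropped
            inField = true;
        }
        else if (ch == '\\' && idx + 1 != str.size())
        {
            field += simpleEscape(str[++idx]);
            inField = true;
        }
        else if (seps.find(ch) != std::string::npos)
        {
            if (inField)                        // separator ends the element
            {
                out.push_back(field);
                field.clear();
                inField = false;
            }
        }
        else
        {
            field += ch;
            inField = true;
        }
    }
    if (inField)
        out.push_back(field);
    return out;
}
```

In this sketch an unterminated quote simply ends at the end of the string; the split members report that situation through the Type enumeration instead.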
NAMESPACE
FBB
All constructors, members, operators and manipulators, mentioned in this
man-page, are defined in the namespace FBB.
INHERITS FROM
--
ENUMERATIONS
- Type:
This enumeration indicates the nature of the content of an element in
the array returned by the overloaded split members (see below).
DQUOTE, a subset of the characters in the matching string
element was delimited by double quotes in the string that was parsed by
the split members.
DQUOTE_UNTERMINATED, the content of the string that was
parsed by the split members started at some point with a double quote, but
the matching ending double quote was lacking.
ESCAPED_END, the content of the string that was
parsed by the split members ended in a mere backslash.
NORMAL, a normal string;
SEPARATOR, a separator;
SQUOTE, a subset of the characters in the matching string
element was delimited by single quotes in the string that was parsed by
the split members.
SQUOTE_UNTERMINATED, the content of the string that was
parsed by the split members started at some point with a single quote, but
the matching ending single quote was lacking.
- SplitType:
This enumeration is used to specify how split members should
split the information in the string objects that are passed to these
members:
TOK: the split member acts like the standard C function
strtok(3). The essence here is that no empty elements are
returned. E.g., a string containing "a,," which is processed using
the TOK mode returns a NORMAL element containing "a".
TOKSEP: the split member acts like the standard C function
strtok(3), also returning information about encountered
separators. Since strtok doesn't return empty elements, TOKSEP
uses empty elements to indicate the occurrence of separators. E.g., a
string containing "a,," which is processed using the TOKSEP
mode returns a NORMAL element containing "a", followed by two
empty SEPARATOR elements.
STR: the split member acts like the standard C function
strstr(3). The essence here is that empty elements are also
returned. E.g., a string containing "a,," which is processed using
the STR mode returns a NORMAL element containing "a", followed by
two empty NORMAL elements.
STRSEP: the split member acts like the standard C function
strstr(3), also returning information about encountered
separators. E.g., a string containing "a,," which is processed
using the STRSEP mode returns a NORMAL element containing
"a", followed by a SEPARATOR element containing ",",
followed by a NORMAL empty element, followed by a SEPARATOR
element containing ",", and finally followed by a NORMAL empty
element.
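The four SplitType modes can be compared with a stand-alone sketch that reproduces the "a,," examples given above. It is not the Bobcat implementation: splitSketch, Mode, Kind and Element are hypothetical names, and quotes and escape characters are deliberately ignored.

```cpp
#include <string>
#include <utility>
#include <vector>

enum class Mode { TOK, TOKSEP, STR, STRSEP };   // mirrors the SplitType names
enum class Kind { NORMAL, SEPARATOR };          // subset of the Type names

using Element = std::pair<std::string, Kind>;   // comparable to SplitPair

// Hypothetical sketch: splits str at any character found in sep.
std::vector<Element> splitSketch(std::string const &str, Mode mode,
                                 char const *sep = " \t")
{
    std::string const seps{ sep };
    // The STR modes also return empty elements, the TOK modes don't.
    bool const keepEmpty = mode == Mode::STR || mode == Mode::STRSEP;
    std::vector<Element> out;
    std::string field;

    for (char ch: str)
    {
        if (seps.find(ch) == std::string::npos)
        {
            field += ch;                        // ordinary character
            continue;
        }
        if (keepEmpty || !field.empty())        // a separator ends an element
        {
            out.emplace_back(field, Kind::NORMAL);
            field.clear();
        }
        if (mode == Mode::TOKSEP)               // empty SEPARATOR element
            out.emplace_back(std::string{}, Kind::SEPARATOR);
        else if (mode == Mode::STRSEP)          // SEPARATOR holding the char
            out.emplace_back(std::string(1, ch), Kind::SEPARATOR);
    }
    if (keepEmpty || !field.empty())
        out.emplace_back(field, Kind::NORMAL);
    return out;
}
```

Splitting "a,," at "," with this sketch yields one element for TOK, three for TOKSEP and STR, and five for STRSEP, matching the descriptions of the modes.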
NESTED TYPE
The struct SplitPair defines a std::pair<std::string,
String::Type> and is used by some overloaded split members (see
below).
STATIC MEMBER FUNCTIONS
- char const **argv(std::vector<std::string> const &words):
Returns a pointer to an allocated series of pointers to the C
strings stored in the vector words. The caller is responsible for
returning the array of pointers to the common pool, but should not delete
the C-strings to which the pointers point. The last element of the
returned array is guaranteed to be a 0-pointer.
- int casecmp(std::string const &lhs, std::string const &rhs):
Performs a case-insensitive comparison of the content of two
std::string objects. A negative value is returned if lhs should be
ordered before rhs; 0 is returned if the two strings have identical
content; a positive value is returned if lhs should be ordered
after rhs.
- std::string escape(std::string const &str,
char const *series = "'\"\\"):
Returns a copy of str in which all characters in series are
prefixed by a backslash character.
- std::string join(std::vector<std::string> const &words, char sep):
The elements of the words vector are returned as one string,
separated from each other by the sep character.
- std::string join(std::vector<SplitPair> const &entries, char sep,
bool all = true):
The first fields of the elements in entries are returned as one
string, separated from each other by the sep character. If the
parameter all is specified as false then elements whose
second fields are equal to String::SEPARATOR are ignored.
- std::string lc(std::string const &str):
Returns a copy of str in which all letters were transformed to
lower case letters.
- std::vector<String::SplitPair> split(std::string const &str, SplitType
mode, char const *sep = " \t"):
The string str is split into substrings, separated by any of the
characters in sep. The substrings are returned in a vector of
SplitPair elements, using the specified SplitType mode
(cf. the description of the various SplitType values and their
effects in the ENUMERATIONS section).
- std::vector<String::SplitPair> split(std::string const &str, char
const *separators = " \t", bool addEmpty = false):
This member acts like the previous one, using addEmpty == false
to select mode TOK and addEmpty == true to select mode
TOKSEP.
- size_t split(std::vector<String::SplitPair> *entries, std::string
const &str, SplitType mode, char const *sep = " \t"):
Same functionality as the first split member, but this member
stores the SplitPair elements in the vector pointed at by the
entries parameter, first clearing the vector. This member returns
the new value of entries->size().
- size_t split(std::vector<String::SplitPair> *entries, std::string
const &str, char const *sep = " \t", bool addEmpty = false):
This member acts like the previous one, using addEmpty == false
to select mode TOK and addEmpty == true to select mode
TOKSEP.
- std::vector<std::string> split(Type *type, std::string const &str,
SplitType stype, char const *sep = " \t"):
Same functionality as the first split member, but this member
merely stores the first fields of the SplitPair elements in
the returned vector. The String::Type variable whose address is
passed to the type parameter is set to NORMAL if the final
entry was successfully determined; to DQUOTE_UNTERMINATED if a
final closing double quote could not be found; to
SQUOTE_UNTERMINATED if a final closing single quote could not be
found; and to ESCAPED_END if the final character in str is a
backslash character.
- std::vector<std::string> split(Type *type, std::string
const &str, char const *sep = " \t", bool addEmpty = false):
This member acts like the previous one, using addEmpty == false
to select mode TOK and addEmpty == true to select mode
TOKSEP.
- size_t split(std::vector<std::string> *words, std::string const &str,
SplitType stype, char const *sep = " \t"):
Same functionality as the first split member, but this member
merely stores the first fields of the encountered SplitPair
elements in the vector pointed at by words, first clearing the
vector. This member returns the new value of words->size().
- size_t split(std::vector<std::string> *words, std::string const &str,
char const *sep = " \t", bool addEmpty = false):
This member acts like the previous one, using addEmpty == false
to select mode TOK and addEmpty == true to select mode
TOKSEP.
- std::string trim(std::string const &str):
Returns a copy of str from which leading and trailing blank
characters were removed.
- std::string uc(std::string const &str):
Returns a copy of str in which all letters were capitalized.
- std::string unescape(std::string const &str):
Returns a copy of str in which the escaped (i.e., prefixed by a
backslash) characters were interpreted. All standard escape characters
(\a, \b, \f, \n, \r, \t, \v) are
recognized. If an escape character is followed by x, at most the
next two characters are interpreted as a hexadecimal number. If an
escape character is followed by an octal digit, then at most the next
three characters following the backslash are interpreted as an
octal number. In all other cases, the backslash is removed and the
character following the backslash is kept.
- std::string urlDecode(std::string const &str):
URL specifications use %xx encoding to encode characters, except
for alpha-numeric characters and the characters - _ . and ~,
which are kept as-is. Other characters are encoded by a %
character, followed by two hexadecimal characters representing those
characters' byte value. E.g., a blank space is encoded as %20, a
plus character is encoded as %2B. The member urlDecode returns
a std::string containing the decoded characters of the url-encoded
string that is passed as argument to this member.
- std::string urlEncode(std::string const &str):
See the member urlDecode: urlEncode returns a std::string
containing the url-encoded characters of the characters in the string
that is passed as argument to this member.
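The %xx encoding described for urlDecode and urlEncode can be sketched in a few lines of stand-alone code. This is an illustration only, not the Bobcat implementation: urlEncodeSketch and urlDecodeSketch are hypothetical names, upper case hexadecimal digits are an assumption (suggested by the %2B example), and std::isalnum is assumed to run in the default "C" locale.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical sketch: keeps alpha-numeric characters and - _ . ~ as-is,
// all other bytes are written as % followed by two hexadecimal digits.
std::string urlEncodeSketch(std::string const &str)
{
    static char const hex[] = "0123456789ABCDEF";
    std::string ret;
    for (unsigned char ch: str)
    {
        if (std::isalnum(ch) || ch == '-' || ch == '_' || ch == '.' ||
            ch == '~')
            ret += static_cast<char>(ch);
        else
        {
            ret += '%';
            ret += hex[ch >> 4];        // high nibble
            ret += hex[ch & 0x0f];      // low nibble
        }
    }
    return ret;
}

// Hypothetical sketch: converts each %xx sequence back to its byte value;
// a % that is not followed by two hexadecimal digits is copied unchanged.
std::string urlDecodeSketch(std::string const &str)
{
    std::string ret;
    for (std::size_t idx = 0; idx != str.size(); ++idx)
    {
        if (str[idx] == '%' && idx + 2 < str.size() &&
            std::isxdigit(static_cast<unsigned char>(str[idx + 1])) &&
            std::isxdigit(static_cast<unsigned char>(str[idx + 2])))
        {
            ret += static_cast<char>(
                        std::stoi(str.substr(idx + 1, 2), nullptr, 16));
            idx += 2;                   // skip the two hexadecimal digits
        }
        else
            ret += str[idx];
    }
    return ret;
}
```

With these definitions urlEncodeSketch("a b+c") produces "a%20b%2Bc", and urlDecodeSketch applied to that result returns the original text.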
EXAMPLE
#include <iostream>
#include <vector>
#include <bobcat/string>
using namespace std;
using namespace FBB;
static char const *type[] =
{
    "DQUOTE_UNTERMINATED",
    "SQUOTE_UNTERMINATED",
    "ESCAPED_END",
    "SEPARATOR",
    "NORMAL",
    "DQUOTE",
    "SQUOTE",
};
int main(int argc, char **argv)
{
    cout << "Program's name in uppercase: " << String::uc(argv[0]) << "\n\n";
    vector<String::SplitPair> splitpair;
    string text{ "one, two, 'thr\\x65\\145'" };
    string encoded{ String::urlEncode(text) };
    cout << "The string `" << text << "'\n"
            " as url-encoded string: `" << encoded << "'\n"
            " and the latter string url-decoded: " <<
            String::urlDecode(encoded) << "\n"
            "\n"
            "Splitting `" << text << "' into " <<
            String::split(&splitpair, text, String::STRSEP, ", ") <<
            " fields\n";
    for (auto it = splitpair.begin(); it != splitpair.end(); ++it)
        cout << (it - splitpair.begin() + 1) << ": " <<
                type[it->second] << ": `" << it->first <<
                "', unescaped: `" << String::unescape(it->first) <<
                "'\n";
    cout << '\n' <<
            text << ":\n"
            " upper case: " << String::uc(text) << ",\n"
            " lower case: " << String::lc(text) << '\n';
}
/*
Calling the program as
driver
results in the following output:
Program's name in uppercase: DRIVER
Splitting `one, two, 'thr\x65\145'' into 9 fields
1: NORMAL: `one', unescaped: `one'
2: SEPARATOR: `,', unescaped: `,'
3: NORMAL: `', unescaped: `'
4: SEPARATOR: ` ', unescaped: ` '
5: NORMAL: `two', unescaped: `two'
6: SEPARATOR: `,', unescaped: `,'
7: NORMAL: `', unescaped: `'
8: SEPARATOR: ` ', unescaped: ` '
9: SQUOTE: `thr\x65\145', unescaped: `three'
one, two, 'thr\x65\145':
upper case: ONE, TWO, 'THR\X65\145',
lower case: one, two, 'thr\x65\145'
*/
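The last field of the example output, in which thr\x65\145 is unescaped to three, follows from the rules given for the unescape member: \x65 is hexadecimal 65 and \145 is octal 145, both byte value 101, the letter e. The stand-alone sketch below walks through those rules. It is not the Bobcat implementation: unescapeSketch and readNumber are hypothetical names, and keeping a trailing backslash, or an x that is not followed by hexadecimal digits, is an assumption.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: reads at most maxDigits digits of the given base,
// starting at str[idx], and advances idx past the digits it consumed.
int readNumber(std::string const &str, std::size_t &idx, int base,
               int maxDigits)
{
    int value = 0;
    for (int count = 0; count != maxDigits && idx != str.size();
         ++count, ++idx)
    {
        int ch = static_cast<unsigned char>(str[idx]);
        int digit = std::isdigit(ch)  ? ch - '0' :
                    std::isxdigit(ch) ? std::tolower(ch) - 'a' + 10 :
                                        base;       // not a digit at all
        if (digit >= base)
            break;
        value = value * base + digit;
    }
    return value;
}

// Hypothetical sketch of the unescape rules.
std::string unescapeSketch(std::string const &str)
{
    static std::string const names{ "abfnrtv" };
    static std::string const values{ "\a\b\f\n\r\t\v" };

    std::string ret;
    std::size_t idx = 0;
    while (idx != str.size())
    {
        char ch = str[idx++];
        if (ch != '\\' || idx == str.size())    // plain char, or a trailing
        {                                       // backslash (kept: assumption)
            ret += ch;
            continue;
        }
        char next = str[idx];
        std::size_t pos = names.find(next);
        if (pos != std::string::npos)           // \a \b \f \n \r \t \v
        {
            ret += values[pos];
            ++idx;
        }
        else if (next == 'x')                   // at most two hex digits
        {
            std::size_t start = ++idx;
            int value = readNumber(str, idx, 16, 2);
            ret += idx == start ? 'x' : static_cast<char>(value);
        }
        else if (next >= '0' && next <= '7')    // at most three octal digits
            ret += static_cast<char>(readNumber(str, idx, 8, 3));
        else                                    // backslash is removed
        {
            ret += next;
            ++idx;
        }
    }
    return ret;
}
```

Applied to the text thr\x65\145 this sketch returns three, the same result as shown for field 9 in the example output.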
FILES
bobcat/string - defines the class interface
SEE ALSO
bobcat(7)
BUGS
None Reported.
BOBCAT PROJECT FILES
- https://fbb-git.gitlab.io/bobcat/: gitlab project page;
- bobcat_6.04.00-x.dsc: detached signature;
- bobcat_6.04.00-x.tar.gz: source archive;
- bobcat_6.04.00-x_i386.changes: change log;
- libbobcat1_6.04.00-x_*.deb: debian package containing the
libraries;
- libbobcat1-dev_6.04.00-x_*.deb: debian package containing the
libraries, headers and manual pages.
BOBCAT
Bobcat is an acronym of `Brokken's Own Base Classes And Templates'.
COPYRIGHT
This is free software, distributed under the terms of the
GNU General Public License (GPL).
AUTHOR
Frank B. Brokken (f.b.brokken@rug.nl).