String Searching 

www.madshi.net

When searching for substrings in a big string you quickly run into the limits of the standard "Pos" function. It only searches case sensitively, it can't begin searching at a specified position, and it can't search backwards either.

I've addressed all those issues in the functions "PosStr" and "PosText". The first one searches case sensitively, the second one case insensitively. Both functions can search a specified subrange of the string or the whole string, and both return the 1-based position of the match, or 0 if nothing was found. If "fromPos" is greater than "toPos", both functions search backwards.

function PosStr  (const subStr : string;
                  const str    : string;
                  fromPos      : cardinal = 1;
                  toPos        : cardinal = maxCard) : integer;
function PosText (const subStr : string;
                  const str    : string;
                  fromPos      : cardinal = 1;
                  toPos        : cardinal = maxCard) : integer;

// Examples:
PosStr ('test', 'This test is a little test.'            )  ->  6
PosStr ('test', 'This test is a little test.', 7         )  ->  23
PosStr ('TEST', 'This test is a little test.'            )  ->  0
PosText('TEST', 'This test is a little test.'            )  ->  6
PosText('TEST', 'This test is a little test.', maxCard, 1)  ->  23
PosText('TEST', 'This test is a little test.', 22,      1)  ->  6
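
To make the range and direction rules concrete, here is a minimal sketch of how such a search could behave. It is not the actual implementation of "PosStr"/"PosText". The name "SketchPos" is made up for this illustration, "MaxInt" stands in for "maxCard", the case insensitive mode simply upper-cases both strings with "UpperCase" (ASCII only), and it assumes that "fromPos" and "toPos" restrict the positions at which a match may begin.

```pascal
program PosSketch;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils;

// Hypothetical illustration only, not the library's PosStr/PosText.
// Returns the 1-based position of the first match found, or 0.
function SketchPos(const subStr, str: string; fromPos, toPos: Integer;
                   ignoreCase: Boolean): Integer;
var
  sub, s    : string;
  lastStart : Integer;  // last index at which subStr still fits into str
  i         : Integer;
begin
  Result := 0;
  if (subStr = '') or (Length(subStr) > Length(str)) then
    Exit;
  if ignoreCase then begin
    sub := UpperCase(subStr);  // ASCII-only case folding (assumption)
    s   := UpperCase(str);
  end else begin
    sub := subStr;
    s   := str;
  end;
  lastStart := Length(s) - Length(sub) + 1;
  if fromPos < 1 then fromPos := 1;
  if toPos   < 1 then toPos   := 1;
  if fromPos <= toPos then begin
    // forward search: try the start positions fromPos..toPos
    if toPos > lastStart then toPos := lastStart;
    for i := fromPos to toPos do
      if Copy(s, i, Length(sub)) = sub then begin
        Result := i;
        Exit;
      end;
  end else begin
    // backward search: try the start positions fromPos downto toPos
    if fromPos > lastStart then fromPos := lastStart;
    for i := fromPos downto toPos do
      if Copy(s, i, Length(sub)) = sub then begin
        Result := i;
        Exit;
      end;
  end;
end;

const
  S = 'This test is a little test.';
begin
  Writeln(SketchPos('test', S, 1,      MaxInt, False));  // 6
  Writeln(SketchPos('test', S, 7,      MaxInt, False));  // 23
  Writeln(SketchPos('TEST', S, 1,      MaxInt, False));  // 0
  Writeln(SketchPos('TEST', S, 1,      MaxInt, True ));  // 6
  Writeln(SketchPos('TEST', S, MaxInt, 1,      True ));  // 23
  Writeln(SketchPos('TEST', S, 22,     1,      True ));  // 6
end.
```

The sketch uses "Copy" for the comparison to stay short. A real implementation would compare in place to avoid allocating a temporary string for every candidate position.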

In some situations (e.g. when dealing with memory mapped files) you want the functionality of the above functions, but for a buffer/pchar instead of a string. For this purpose I've added the following function, which basically works exactly like "PosStr/Text", except that it takes pchar parameters and separate length cardinals instead of just strings. As the examples show, positions are zero-based here and -1 means "not found". If you leave the length cardinals at "0", the length of the pchars is determined internally by calling "StrLen". If you're dealing with binary files, which also contain #0 characters, you should pass the correct buffer size, so that the search does not stop at the first #0 character.

function PosPChar (subStr       : pchar;
                   str          : pchar;
                   subStrLen    : cardinal = 0;   // 0 -> StrLen is called internally
                   strLen       : cardinal = 0;
                   ignoreCase   : boolean  = false;
                   fromPos      : cardinal = 0;
                   toPos        : cardinal = maxCard) : integer;

// Examples:
PosPChar('test', 'This test is a little test.'                         )  ->  5
PosPChar('test', 'This test is a little test.', 0, 0, false, 6         )  ->  22
PosPChar('TEST', 'This test is a little test.', 0, 0, false            )  ->  -1
PosPChar('TEST', 'This test is a little test.', 0, 0, true             )  ->  5
PosPChar('TEST', 'This test is a little test.', 0, 0, true,  maxCard, 0)  ->  22
PosPChar('TEST', 'This test is a little test.', 0, 0, true,  21,      0)  ->  5
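
The following sketch illustrates why the explicit length matters for binary data. It does not call "PosPChar" itself. "SketchPosBuf" is a made-up, forward-only, case sensitive stand-in that mimics the "0 means call StrLen" convention, so the example compiles with nothing but "SysUtils".

```pascal
program BufSketch;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils;

// Hypothetical illustration only, not the library's PosPChar.
// Returns the 0-based offset of the first match, or -1.
function SketchPosBuf(sub, buf: PChar; subLen, bufLen: Integer): Integer;
var
  i: Integer;
begin
  Result := -1;
  // A length of 0 means "measure it with StrLen", which stops at the first #0.
  if subLen = 0 then subLen := Integer(StrLen(sub));
  if bufLen = 0 then bufLen := Integer(StrLen(buf));
  if (subLen = 0) or (subLen > bufLen) then
    Exit;
  for i := 0 to bufLen - subLen do
    if CompareMem(buf + i, sub, subLen * SizeOf(Char)) then begin
      Result := i;
      Exit;
    end;
end;

const
  // 8 characters of "binary" data with a #0 in the middle
  Data: array[0..7] of Char = ('a', 'b', #0, 't', 'e', 's', 't', #0);
begin
  // StrLen sees only 'ab', so 'test' is never reached
  Writeln(SketchPosBuf('test', @Data[0], 0, 0));             // -1
  // with the real buffer size the search continues past the #0
  Writeln(SketchPosBuf('test', @Data[0], 0, Length(Data)));  // 3
end.
```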

Sometimes you just want to know whether a string begins with a given substring, i.e. you're asking "if PosStr/Text(...) = 1". That works, but it's not very fast. The following functions do the same, but are easier to call and faster:

function PosStrIs1  (const subStr : string;
                     const str    : string) : boolean;
function PosTextIs1 (const subStr : string;
                     const str    : string) : boolean;

// Examples:
PosStrIs1 ('test', 'This is a test.')  ->  false
PosStrIs1 ('This', 'This is a test.')  ->  true
PosStrIs1 ('this', 'This is a test.')  ->  false
PosTextIs1('this', 'This is a test.')  ->  true
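
The speed difference comes from the fact that a prefix test only has to look at the first Length(subStr) characters, while a full search that fails at position 1 keeps scanning the rest of the string. A minimal sketch of such a prefix test, assuming the standard "StrLComp" and "StrLIComp" routines from "SysUtils". "SketchStartsWith" is a made-up name, not the implementation of "PosStrIs1"/"PosTextIs1", and treating an empty "subStr" as "no match" is an assumption.

```pascal
program PrefixSketch;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils;

// Hypothetical illustration only, not the library's PosStrIs1/PosTextIs1.
function SketchStartsWith(const subStr, str: string;
                          ignoreCase: Boolean): Boolean;
begin
  if (subStr = '') or (Length(subStr) > Length(str)) then
    Result := False  // assumption: an empty subStr never matches
  else if ignoreCase then
    // compare only the first Length(subStr) characters, ignoring case
    Result := StrLIComp(PChar(str), PChar(subStr), Length(subStr)) = 0
  else
    // compare only the first Length(subStr) characters, case sensitively
    Result := StrLComp(PChar(str), PChar(subStr), Length(subStr)) = 0;
end;

begin
  Writeln(SketchStartsWith('test', 'This is a test.', False));  // FALSE
  Writeln(SketchStartsWith('This', 'This is a test.', False));  // TRUE
  Writeln(SketchStartsWith('this', 'This is a test.', False));  // FALSE
  Writeln(SketchStartsWith('this', 'This is a test.', True ));  // TRUE
end.
```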

The following function searches for several characters at the same time and returns the first position where one of those characters occurs, or 0 if none of them occurs. You can search a specified subrange of the string or the whole string. If "fromPos" is greater than "toPos", the function searches backwards.

function PosChars (const ch  : TSChar;
                   const str : string;
                   fromPos   : cardinal = 1;
                   toPos     : cardinal = maxCard) : integer;

// Examples:
PosChars(['a', 'e', '1'], 'This is a little test 123.'            )  ->  9
PosChars(['a', 'e', '1'], 'This is a little test 123.', 10        )  ->  16
PosChars(['a', 'e', '1'], 'This is a little test 123.', maxCard, 1)  ->  23
PosChars(['a', 'e', '1'], 'This is a little test 123.', 22,      1)  ->  19
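
As a rough sketch of the idea: a Pascal set makes the membership test a single "in" check per character, so one pass over the range is enough. "SketchPosChars" and "TCharSet" are made-up names ("TCharSet" stands in for "TSChar"), "MaxInt" stands in for "maxCard", and the sketch uses AnsiString so that "set of AnsiChar" works the same in old and new compilers. It is not the actual implementation of "PosChars".

```pascal
program CharsSketch;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

type
  TCharSet = set of AnsiChar;  // hypothetical stand-in for TSChar

// Hypothetical illustration only, not the library's PosChars.
// Returns the 1-based position of the first hit found, or 0.
function SketchPosChars(const chars: TCharSet; const str: AnsiString;
                        fromPos, toPos: Integer): Integer;
var
  i: Integer;
begin
  Result := 0;
  if fromPos < 1 then fromPos := 1;
  if toPos   < 1 then toPos   := 1;
  if fromPos <= toPos then begin
    // forward search
    if toPos > Length(str) then toPos := Length(str);
    for i := fromPos to toPos do
      if str[i] in chars then begin
        Result := i;
        Exit;
      end;
  end else begin
    // backward search
    if fromPos > Length(str) then fromPos := Length(str);
    for i := fromPos downto toPos do
      if str[i] in chars then begin
        Result := i;
        Exit;
      end;
  end;
end;

const
  S = 'This is a little test 123.';
begin
  Writeln(SketchPosChars(['a', 'e', '1'], S, 1,      MaxInt));  // 9
  Writeln(SketchPosChars(['a', 'e', '1'], S, 10,     MaxInt));  // 16
  Writeln(SketchPosChars(['a', 'e', '1'], S, MaxInt, 1     ));  // 23
  Writeln(SketchPosChars(['a', 'e', '1'], S, 22,     1     ));  // 19
end.
```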