
Basic String Manipulation

Perl is the absolute king when it comes to parsing, dicing, chopping, and manipulating strings of characters. Most of this power comes from its regular expressions, but there are a few simpler functions that you should know about.

How Long?


$s = "Now is the time to party.";

$l = length($s);            # gets 25
The 'length' function returns the number of characters in the string. Spaces and punctuation count as characters too!
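
A few more cases may help make the counting rule concrete. This is only a small sketch; the variable names are made up for illustration, and 'use strict', 'use warnings' and 'my' are added as a habit that the examples on this page do not rely on.

```perl
use strict;
use warnings;

my $s = "Now is the time to party.";

my $with_spaces = length($s);          # 25: spaces and the period are counted
my $empty       = length("");          # 0: an empty string has no characters
my $with_nl     = length("party\n");   # 6: the newline counts as one character

print "$with_spaces $empty $with_nl\n";   # prints "25 0 6"
```

The newline case matters when you read lines from a file: each line normally still carries its trailing newline until you remove it with 'chomp'.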

Where Is?


$s = "Now is the time to party.";

$i = index $s, "ti";        # searches $s for the string "ti"
The 'index' function looks for the first occurrence of one string in another and returns the position where it found it. Either string may be a variable or a string constant.
Indices start at 0:

N o w   i s   t h e   t i m e   t o   p a r t y .
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4
                    1                   2
So "ti" starts at index 11.

If the string is not found, index returns -1.
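
Here is a short sketch of checking what index gives back. It also uses index's optional third parameter (the position to start searching from), which is standard Perl but is not covered above; the variable names are just for illustration.

```perl
use strict;
use warnings;

my $s = "Now is the time to party.";

my $first  = index($s, "t");               # 7: the "t" in "the"
my $second = index($s, "t", $first + 1);   # 11: the search resumes at index 8
my $absent = index($s, "xyz");             # -1: "xyz" is not in $s

# Compare against -1 rather than testing for truth:
# a match at index 0 is a false value in Perl.
if (index($s, "Now") != -1) {
    print "found 'Now' at index ", index($s, "Now"), "\n";   # index 0
}
print "$first $second $absent\n";          # prints "7 11 -1"
```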

It helps to draw pictures of strings with each character in a little box and a number on each box giving its index. This makes counting positions and lengths much easier.

Take This Part

Taking portions of a string with the function 'substr':

$s = "Now is the time to party.";

$t = substr($s, 11, 7);      # gets "time to"
The second parameter of substr tells which index to start at and the third tells how many characters to take. More examples:

print substr $s, 7, 3;       # prints "the"
print substr $s, 19, 1;      # prints "p"
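
index and substr work well together: index finds where something is, and substr cuts it out. The sketch below pulls the word beginning with "time" out of the sentence. It assumes there is a space somewhere after the word, and it uses index's optional third parameter (where to start searching), which this page has not introduced.

```perl
use strict;
use warnings;

my $s = "Now is the time to party.";

my $start = index($s, "time");                   # 11: where the word begins
my $end   = index($s, " ", $start);              # 15: first space at or after $start
my $word  = substr($s, $start, $end - $start);   # take 15 - 11 = 4 characters

print "$word\n";                                 # prints "time"
```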


Longer Example

Let's see how we can use length, index and substr to count how many t's there are in a string:

$s = "Now is the time to party.";

$count = 0;
if (index($s, 't') >= 0) {
    for ($i = 0; $i < length($s); ++$i) {
        $c = substr($s, $i, 1);     # gets the i'th character of $s
        if ($c eq "t") {            # eq not ==!
            ++$count;
        }
    }
    print "There are $count t's in '$s'\n";
}
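
For comparison, here is one possible alternative that lets index do the walking instead of looking at every character. It is only a sketch, and it leans on index's optional third parameter (the position to start searching from), which is not explained above.

```perl
use strict;
use warnings;

my $s = "Now is the time to party.";

my $count = 0;
my $pos   = index($s, "t");            # first "t", or -1 if there is none
while ($pos != -1) {
    ++$count;
    $pos = index($s, "t", $pos + 1);   # look again, just past the last hit
}
print "There are $count t's in '$s'\n";
# prints "There are 4 t's in 'Now is the time to party.'"
```

Both versions give the same count; this one simply skips straight from one match to the next.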

Substr Shortcuts

substr has a couple of shortcuts. If the second parameter (the index of where to start) is negative, the position is counted from the end of the string: the last character is at index -1.

If the third parameter of substr (the number of characters to take) is omitted, substr takes everything to the end of the string.


$s = "hello there sweety";
print substr $s, -3;    # prints "ety"
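
A few more combinations of the two shortcuts, as a quick sketch (the variable names are only for illustration):

```perl
use strict;
use warnings;

my $s = "hello there sweety";

my $last = substr($s, -1);       # "y": the last character
my $tail = substr($s, 6);        # "there sweety": from index 6 to the end
my $near = substr($s, -6, 3);    # "swe": start 6 from the end, take 3

print "$last|$tail|$near\n";     # prints "y|there sweety|swe"
```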

Exercises

  1. Read lines from a file. Print only the ones that satisfy all of these criteria: longer than 20 characters, beginning with an 'S', and ending with a period.

  2. Read lines from a file. They will be of two types:

    • Filenames (lines with a dot '.' in them somewhere) like this:

      
      nums.txt
      names.doc
      calculation.xls
      
      For these split the filenames into the prefix and the suffix and print each out:
      
      prefix: nums
      suffix: txt
      
      Once this is working, try verifying whether or not the filename is in standard DOS format - 8 chars max for the prefix and 3 chars max for the suffix.
    • Lines that do not contain a dot '.' at all. For each of these, count how many vowels (aeiou) the line has and print the tally.

      This counting of vowels is hard but not impossible!! As my first piano teacher once said,

        It is good medicine to always have something you can't quite do.
  3. Using the 'localtime' function (see the exercise in the previous section), print only the time in 12 hour format.
