Chapter 3 The First Python Program

3.3 String Methods

String methods can be classified into two basic categories. The first category returns information about a string, and the second category formats a string. Because string objects are immutable, string formatting methods return a copy of the string object, rather than modifying the object in place.

A set of commonly used string information methods includes:
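As a minimal sketch, the following uses a few information methods of the standard str type. Whether these are exactly the methods this section has in mind is an assumption; the sample strings are chosen only for illustration.

```python
# Illustrative strings; the choice of methods is an assumption.
s = 'Python3'

all_letters = s.isalpha()          # False, because of the digit '3'
all_alnum = s.isalnum()            # True: only letters and digits
all_digits = '2024'.isdigit()      # True
is_lower = 'abc'.islower()         # True
is_upper = s.isupper()             # False: 'ython' is lower case
count_t = 'attention'.count('t')   # 3 occurrences of 't'

print(all_letters, all_alnum, all_digits, is_lower, is_upper, count_t)
```

Each of these methods returns a value describing the string and leaves the string itself unchanged.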

A set of string information methods that search for a substring includes index, rindex, find and rfind.

The index, rindex, find and rfind methods accept two optional arguments that restrict the search to a slice of the string: a start index and an end index. These indices are interpreted in the same way as in getslice (slicing) operations. If only the start index is given, the slice extends to the end of the string.

>>> s = 'abcabcabcdefdefabc'
>>> s.index('abc', 3) # search s[3:]
3
>>> s.find('abc', 5, -2) # search s[5:-2]
6

A set of convenient methods for checking the beginning and end of a string includes startswith and endswith.

The methods startswith and endswith can also take a tuple of strings as the argument, in which case the return value is True if the string starts (for startswith) or ends (for endswith) with any string in the tuple, and False otherwise.

>>> s = 'abcdefghi'
>>> s.startswith(('abc', 'def'))
True
>>> s.startswith(('a', 'b', 'c'))
True
>>> s.endswith(('def', 'abc'))
False
>>> s.endswith(('ghi', 'hi', 'i'))
True

A set of string formatting methods that modify the cases of cased characters includes:
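As a minimal sketch, assuming the methods meant here include upper, lower, capitalize, title and swapcase from the standard str type, each call below returns a new string and leaves the original unchanged.

```python
s = 'hello, World'

upper = s.upper()            # 'HELLO, WORLD'
lower = s.lower()            # 'hello, world'
capitalized = s.capitalize() # 'Hello, world': first character upper, rest lower
titled = s.title()           # 'Hello, World': each word capitalized
swapped = s.swapcase()       # 'HELLO, wORLD': case of every letter inverted

print(upper, lower, capitalized, titled, swapped, sep='\n')
print(s)                     # the original string is unchanged
```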

A set of string formatting methods that add or remove whitespace at the ends of a string includes ljust, rjust, lstrip, rstrip and strip.
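The following sketch shows the default, whitespace-only behaviour of ljust, rjust, lstrip, rstrip and strip. The sample strings are chosen only for illustration.

```python
s = 'abc'

left = s.ljust(6)     # 'abc   ': padded with spaces on the right to width 6
right = s.rjust(6)    # '   abc': padded with spaces on the left to width 6

padded = '  abc \n'
no_left = padded.lstrip()   # 'abc \n': leading whitespace removed
no_right = padded.rstrip()  # '  abc': trailing whitespace (including '\n') removed
no_both = padded.strip()    # 'abc': whitespace removed from both ends

print(repr(left), repr(right), repr(no_left), repr(no_right), repr(no_both))
```

If the requested width is not larger than the length of the string, ljust and rjust return the string without padding.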

The methods above can be generalized to insert or strip arbitrary characters on the ends of a string. In particular, ljust and rjust can take an additional argument that specifies a padding character, while lstrip, rstrip and strip can take an additional argument that specifies a set of characters to be stripped.

>>> s = '123'
>>> s.ljust(5, '0') # pad '0'
'12300'
>>> s.rjust(6, '-') # pad '-'
'---123'
>>> s = 'aabcdceft'
>>> s.lstrip('a') # strip 'a'
'bcdceft'
>>> s.lstrip('bac') # the set {'b', 'a', 'c'}
'dceft'
>>> s.rstrip('ag') # the set {'a', 'g'}
'aabcdceft'
>>> s.strip('gabf') # {'g', 'a', 'b', 'f'}
'cdceft'

Python provides a method, replace, that performs substring replacement directly. It takes two string arguments, specifying the substring to be replaced and the replacement string, respectively, and returns a copy of the string in which every occurrence of the substring has been replaced.

replace also accepts an optional third argument that specifies a count, so that only the first count occurrences of the substring are replaced.
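A short sketch of replace, with and without the count argument, may help here. The strings are chosen only for illustration.

```python
s = 'abcabcabc'

all_replaced = s.replace('abc', 'x')       # 'xxx': every occurrence replaced
first_two = s.replace('abc', 'x', 2)       # 'xxabc': only the first two replaced
not_found = s.replace('zzz', 'x')          # 'abcabcabc': nothing to replace

print(all_replaced, first_two, not_found)
print(s)                                   # the original string is unchanged
```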

A final string formatting method is format, which is called on a pattern string and takes arguments that fill the pattern fields in that string. A pattern field is formed by a pair of curly brackets, and contains either a number specifying the index of the corresponding argument, or a keyword to be filled by a keyword argument. For example,

>>> '{0} + {1} = {2}'.format(1, 2, 1+2)
'1 + 2 = 3'
>>> 'Hello, {0}'.format('Python')
'Hello, Python'
>>> '{2}-{0}-{1}'.format('abc', 'def', 'ghi')
'ghi-abc-def'
>>> '{x}-{y}-{0}'.format('ghi', x='abc', y='def')
'abc-def-ghi'

In the example above, the last method call contains two keyword arguments, x and y, which fill the keyword fields {x} and {y}, respectively. Note that, as in other function calls, keyword arguments must be placed after non-keyword arguments.

If the arguments are filled in sequentially, the indices in the pattern fields can be omitted.

>>> '{} + {} = {}'.format(1, 2, 3)
'1 + 2 = 3'

Formatting specifications can be given to each pattern field by using ‘:<pattern>’, where <pattern> follows the pattern syntax in string formatting expressions. For example,

>>> '{0:d} + {1:.2f} = {2}'.format(1, 2.0, 1+2.0)
'1 + 2.00 = 3.0'
>>> s = '{0:d} + {1:.2f} = {x:s}'
>>> s.format(1, 2, x=str(1+2))
'1 + 2.00 = 3'

In the example above, ‘:d’, ‘:.2f’ and ‘:s’ are used to specify the format of an integer, a floating point number with 2 digits after the decimal point and a string, respectively. Formatting specifications can be used for keyword and non-keyword arguments.

If there are too few arguments, an error will be raised. If there are too many arguments, the first ones will be used to fill the pattern string.

>>> s = '{} + {} = {}'
>>> s.format(1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: tuple index out of range
>>> s.format(1, 2, 3, 4)
'1 + 2 = 3'

There is a built-in function, format(v, s), which takes a value v and a format specification s as its input arguments, and returns v formatted according to s. The result is the same as that of ('{:' + s + '}').format(v).

>>> format(3.1,'.2f')
'3.10'
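The sketch below compares the built-in function with the string method. It relies on the fact that format(v, s) formats v according to the specification s, so the specification has to be wrapped in a pattern field before the format method gives the same result. The variable names are chosen only for illustration.

```python
value = 3.1
spec = '.2f'

via_builtin = format(value, spec)                # '3.10'
via_method = ('{:' + spec + '}').format(value)   # '3.10': spec inside a field

# Calling format on the bare specification is not equivalent: the string
# '.2f' contains no pattern field, so it is returned as it is.
bare = spec.format(value)                        # '.2f'

print(via_builtin, via_method, bare)
```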

The methods and functions above are frequently used with strings, and there are more methods for strings and other sequential types. The Python documentation is a useful reference to consult for a pre-defined function before writing a custom one.

Strings are Immutable. One final thing that makes strings different from some other Python collection types is that you are not allowed to modify the individual characters in the collection. It is tempting to use the [] operator on the left side of an assignment, with the intention of changing a character in a string. For example, in the following code, what happens when the first letter of greeting is changed?
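A minimal sketch of such code is given here. The value of greeting is an assumption inferred from the expected output, and the try/except is added only so that the sketch runs to completion and displays the error instead of stopping.

```python
greeting = "Hello, world!"   # assumed initial value

try:
    greeting[0] = 'J'        # attempt to change the first character in place
except TypeError as err:
    error_message = str(err)
    print('TypeError:', error_message)

print(greeting)              # still the original string
```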

Instead of producing the output Jello, world!, this code produces the runtime error TypeError: 'str' object does not support item assignment.

Strings are immutable, which means you cannot change an existing string. The best you can do is create a new string that is a variation on the original.

The solution here is to concatenate a new first letter onto a slice of greeting. This operation has no effect on the original string.
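This can be sketched as follows, assuming greeting holds "Hello, world!"; the name new_greeting is chosen only for illustration.

```python
greeting = "Hello, world!"           # assumed initial value

# Build a new string from a new first letter and a slice of the old string.
new_greeting = 'J' + greeting[1:]

print(new_greeting)   # Jello, world!
print(greeting)       # Hello, world! (the original is unchanged)
```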


© Copyright 2024 GS Ng.
