Python string Formatter

Python string : Formatter class

This article demonstrates how to use the Formatter class in the string module.
Its format() method behaves the same way as the built-in str.format() method. The class becomes useful when you want to subclass it and define your own format string syntax.
The Formatter class lets you create and customize your own string formatting behaviors using the same implementation as the built-in format() method.


Sample Code :

1.
# Import the string module
import string

# Create a Formatter object
formatter = string.Formatter()
print(formatter.format("{0}, {1}, {2}, {3}, {4}, {5}, {6}, {7}, {8}, {9}", "c", "p", "p", "s", "e", "c", "r", "e", "t", "s"))
Output :
c, p, p, s, e, c, r, e, t, s


2.

from string import Formatter

formatter = Formatter()

print(formatter.format('{} {website}', 'Welcome to', website='cppsecrets'))

# str.format() behaves in the same manner
print('{} {website}'.format('Welcome to', website='cppsecrets'))
Output :
Welcome to cppsecrets
Welcome to cppsecrets


Formatter Methods :

The Formatter class takes no initialization arguments:

fmt = Formatter()


The public API methods of class Formatter are as follows:

-- format(format_string, *args, **kwargs)
-- vformat(format_string, args, kwargs)


'format' is the primary API method. It takes a format template, and an arbitrary set of positional and keyword arguments. 'format' is just a wrapper that calls 'vformat'.

'vformat' is the function that does the actual work of formatting. It is exposed as a separate function for cases where you want to pass in a predefined dictionary of arguments, rather than unpacking and repacking the dictionary as individual arguments using the *args and **kwargs syntax. 'vformat' does the work of breaking up the format template string into character data and replacement fields. It calls the 'get_value', 'format_field' and 'check_unused_args' methods as appropriate (described below).
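As a small illustration of the difference between the two entry points, the sketch below passes the same arguments once through 'vformat' and once through 'format'. The variable names and values are made up for the example.

```python
from string import Formatter

formatter = Formatter()

# Arguments collected ahead of time (illustrative values)
positional = ("Welcome to",)
named = {"website": "cppsecrets"}

# vformat takes the tuple and the dict directly, with no * or ** unpacking
via_vformat = formatter.vformat("{0} {website}", positional, named)

# format is the convenience wrapper around the same machinery
via_format = formatter.format("{0} {website}", *positional, **named)

print(via_vformat)  # Welcome to cppsecrets
print(via_format)   # Welcome to cppsecrets
```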

Formatter defines the following overridable methods:

-- get_value(key, args, kwargs)
-- check_unused_args(used_args, args, kwargs)
-- format_field(value, format_spec)


'get_value' is used to retrieve a given field value. The 'key' argument will be either an integer or a string. If it is an integer, it represents the index of the positional argument in 'args'; if it is a string, then it represents a named argument in 'kwargs'.

The 'args' parameter is set to the list of positional arguments to 'vformat', and the 'kwargs' parameter is set to the dictionary of keyword arguments.

For compound field names, these functions are only called for the first component of the field name; subsequent components are handled through normal attribute and indexing operations.

So for example, the field expression '0.name' would cause 'get_value' to be called with a 'key' argument of 0. The 'name' attribute will be looked up after 'get_value' returns by calling the built-in 'getattr' function.

If the index or keyword refers to an item that does not exist, then an IndexError/KeyError should be raised.
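The lookup rules above can be observed with a subclass that records every key it is asked for. This is only a sketch: 'TracingFormatter' and 'seen_keys' are names invented for the example.

```python
from string import Formatter
from types import SimpleNamespace


class TracingFormatter(Formatter):
    """Hypothetical subclass that records each key passed to get_value."""

    def __init__(self):
        super().__init__()
        self.seen_keys = []

    def get_value(self, key, args, kwargs):
        # Only the first component of a compound field name arrives here
        self.seen_keys.append(key)
        return super().get_value(key, args, kwargs)


tracer = TracingFormatter()
user = SimpleNamespace(name="Fred")
text = tracer.format("{0.name} has {items[0]}", user, items=["a pen"])

print(text)              # Fred has a pen
print(tracer.seen_keys)  # [0, 'items']
```

The '.name' and '[0]' parts never reach 'get_value'; they are resolved afterwards on the returned object.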

'check_unused_args' is used to implement checking for unused arguments if desired. The arguments to this function are the set of all argument keys that were actually referred to in the format string (integers for positional arguments, and strings for named arguments), and a reference to the args and kwargs that were passed to vformat. The set of unused args can be calculated from these parameters. 'check_unused_args' is assumed to raise an exception if the check fails.

'format_field' simply calls the global 'format' built-in. The method is provided so that subclasses can override it.
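One possible way to override these two hooks is sketched below. 'StrictFormatter' is a hypothetical class: it rejects arguments that the format string never uses, and upper-cases string values after the default formatting has run.

```python
from string import Formatter


class StrictFormatter(Formatter):
    """Hypothetical subclass: rejects unused arguments, upper-cases strings."""

    def check_unused_args(self, used_args, args, kwargs):
        # Every positional index and keyword name that was supplied
        supplied = set(range(len(args))) | set(kwargs)
        unused = supplied - set(used_args)
        if unused:
            raise ValueError(f"unused arguments: {sorted(unused, key=str)}")

    def format_field(self, value, format_spec):
        # Delegate to the default behaviour, then post-process strings only
        formatted = super().format_field(value, format_spec)
        return formatted.upper() if isinstance(value, str) else formatted


strict = StrictFormatter()
shout = strict.format("{0} scored {score:.1f}", "fred", score=9.5)
print(shout)  # FRED scored 9.5
```

Calling strict.format("{0}", "a", "b") would raise ValueError here, because the second positional argument is never referenced.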


To get a better understanding of how these functions relate to each other, here is pseudocode that explains the general operation of vformat:

def vformat(self, format_string, args, kwargs):

  # Output buffer and set of used args
  buffer = io.StringIO()  # from the standard 'io' module
  used_args = set()

  # Tokens are either format fields or literal strings
  for token in self.parse(format_string):
    if is_format_field(token):
      # Split the token into field value and format spec
      field_spec, _, format_spec = token.partition(":")

      # Check for explicit type conversion (the flag follows the '!')
      field_spec, _, explicit = field_spec.partition("!")

      # 'first_part' is the part before the first '.' or '['
      # Assume that 'get_first_part' returns either an int or
      # a string, depending on the syntax.
      first_part = get_first_part(field_spec)
      value = self.get_value(first_part, args, kwargs)

      # Record the fact that we used this arg
      used_args.add(first_part)

      # Handle [subfield] or .subfield. Assume that 'components'
      # returns an iterator of the various subfields, not including
      # the first part.
      for comp in components(field_spec):
        value = resolve_subfield(value, comp)

      # Handle explicit type conversion
      if explicit == 'r':
        value = repr(value)
      elif explicit == 's':
        value = str(value)

      # Call the global 'format' function and write out the converted
      # value.
      buffer.write(self.format_field(value, format_spec))

    else:
      buffer.write(token)

  self.check_unused_args(used_args, args, kwargs)
  return buffer.getvalue()
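The pseudocode relies on helper functions that are left undefined. A runnable approximation can be built from methods that string.Formatter already provides ('parse', 'get_field', 'convert_field', 'format_field' and 'check_unused_args'). The sketch below is a simplification and not the library's own code: 'SketchFormatter' is a made-up name, and automatic numbering ('{}') and nested replacement fields inside the format spec are deliberately not handled.

```python
from string import Formatter


class SketchFormatter(Formatter):
    """Simplified re-implementation of vformat, for illustration only."""

    def vformat(self, format_string, args, kwargs):
        pieces = []
        used_args = set()

        # parse() yields (literal_text, field_name, format_spec, conversion)
        for literal, field_name, format_spec, conversion in self.parse(format_string):
            if literal:
                pieces.append(literal)
            if field_name is None:
                continue  # trailing literal text with no field after it

            # get_field() calls get_value() for the first component, then
            # resolves any '.attr' or '[index]' parts
            value, first_part = self.get_field(field_name, args, kwargs)
            used_args.add(first_part)

            # Apply an explicit conversion flag, if any
            value = self.convert_field(value, conversion)

            pieces.append(self.format_field(value, format_spec))

        self.check_unused_args(used_args, args, kwargs)
        return "".join(pieces)


result = SketchFormatter().format("{0!r:>8}|{name}", "ab", name="x")
print(result)  #     'ab'|x
```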



Customizing Formatters :


This section describes some typical ways that Formatter objects can be customized.

To support alternative format-string syntax, the 'vformat' method can be overridden to alter the way format strings are parsed. In the standard library implementation, the 'parse' method can be overridden for the same purpose.

One common desire is to support a 'default' namespace, so that you don't need to pass in keyword arguments to the format() method, but can instead use values in a pre-existing namespace. This can easily be done by overriding get_value() as follows:


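from string import Formatter

class NamespaceFormatter(Formatter):
   def __init__(self, namespace=None):
       Formatter.__init__(self)
       # Avoid sharing one mutable default dictionary between instances
       self.namespace = {} if namespace is None else namespace

   def get_value(self, key, args, kwds):
       if isinstance(key, str):
           try:
               # Check explicitly passed arguments first
               return kwds[key]
           except KeyError:
               # Fall back to the default namespace
               return self.namespace[key]
       else:
           # Positional keys use the default lookup
           return Formatter.get_value(self, key, args, kwds)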

One can use this to easily create a formatting function that allows access to global variables, for example:
fmt = NamespaceFormatter(globals())

greeting = "hello"
print(fmt.format("{greeting}, world!"))


Format Specifiers :

Each field can also specify an optional set of 'format specifiers' which can be used to adjust the format of that field. Format specifiers follow the field name, with a colon (':') character separating the two:

"My name is {0:8}".format('Fred')


The meaning and syntax of the format specifiers depends on the type of object that is being formatted, but there is a standard set of format specifiers used for any object that does not override them.

Format specifiers can themselves contain replacement fields. For example, a field whose field width is itself a parameter could be specified via:

"{0:{1}}".format(a, b)


These 'internal' replacement fields can only occur in the format specifier part of the replacement field. Internal replacement fields cannot themselves have format specifiers. This implies also that replacement fields cannot be nested to arbitrary levels.

Note that the doubled '}' at the end, which would normally be treated as an escape, is not treated as one in this case. This is because the '{{' and '}}' escape syntax is only applied outside of a format field. Within a format field, the brace characters always have their normal meaning.
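A short sketch of nested replacement fields and of brace escaping follows. The values are illustrative only.

```python
# Width supplied as a second argument: 'Fred' padded to 8 characters
padded_name = "{0:{1}}".format("Fred", 8)

# Width and precision can both come from arguments
padded_number = "{0:{1}.{2}f}".format(3.14159, 10, 2)

# Outside a field, doubled braces are escapes for literal braces
literal_braces = "{{{0}}}".format("x")

print(repr(padded_name))    # 'Fred    '
print(repr(padded_number))  # '      3.14'
print(literal_braces)       # {x}
```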

The syntax for format specifiers is open-ended, since a class can override the standard format specifiers. In such cases, the str.format() method merely passes all of the characters between the first colon and the matching brace to the relevant underlying formatting method.


If an object does not define its own format specifiers, a standard set of format specifiers is used. These are similar in concept to the format specifiers used by the existing '%' operator; however, there are also a number of differences.

The general form of a standard format specifier is:

[[fill]align][sign][#][0][minimumwidth][.precision][type]


The brackets ([]) indicate an optional element.

Then the optional align flag can be one of the following:

'<' - Forces the field to be left-aligned within the available
      space (This is the default for most objects; numbers are
      right-aligned by default.)
'>' - Forces the field to be right-aligned within the
      available space.
'=' - Forces the padding to be placed after the sign (if any)
      but before the digits.  This is used for printing fields
      in the form '+000000120'. This alignment option is only
      valid for numeric types.
'^' - Forces the field to be centered within the available
      space.

Note that unless a minimum field width is defined, the field width will always be the same size as the data to fill it, so that the alignment option has no meaning in this case.

The optional 'fill' character defines the character to be used to pad the field to the minimum width. The fill character, if present, must be followed by an alignment flag.
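The alignment and fill options can be tried out as follows. This is a minimal sketch with made-up values.

```python
left = "{0:<8}".format("ab")        # pad on the right
right = "{0:>8}".format("ab")       # pad on the left
center = "{0:*^8}".format("ab")     # '*' as fill character, centered
after_sign = "{0:=+8}".format(120)  # padding between sign and digits
zeros = "{0:0=+10}".format(120)     # '0' as fill character with '='

print(repr(left))        # 'ab      '
print(repr(right))       # '      ab'
print(repr(center))      # '***ab***'
print(repr(after_sign))  # '+    120'
print(repr(zeros))       # '+000000120'
```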

The 'sign' option is only valid for numeric types, and can be one of the following:

'+'  - indicates that a sign should be used for both
       positive as well as negative numbers
'-'  - indicates that a sign should be used only for negative
       numbers (this is the default behavior)
' '  - indicates that a leading space should be used on
       positive numbers

If the '#' character is present, integers use the 'alternate form' for formatting. This means that binary, octal, and hexadecimal output will be prefixed with '0b', '0o', and '0x', respectively.

'minimumwidth' is a decimal integer defining the minimum field width. If not specified, then the field width will be determined by the content.

If the width field is preceded by a zero ('0') character, this enables zero-padding. This is equivalent to an alignment type of '=' and a fill character of '0'.
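The sign, '#' and zero-padding options described above behave as in this short sketch (values chosen only for illustration):

```python
plus = "{0:+d} {1:+d}".format(5, -5)     # sign on both numbers
minus = "{0:-d} {1:-d}".format(5, -5)    # sign only on the negative number
space = "{0: d} {1: d}".format(5, -5)    # leading space on the positive number
alternate = "{0:#b} {0:#o} {0:#x}".format(10)  # base prefixes
zero_padded = "{0:08.3f}".format(-3.5)   # zeros go after the sign

print(plus)               # +5 -5
print(minus)              # 5 -5
print(repr(space))        # ' 5 -5'
print(alternate)          # 0b1010 0o12 0xa
print(zero_padded)        # -003.500
```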

The 'precision' is a decimal number indicating how many digits should be displayed after the decimal point in a floating point conversion. For non-numeric types the field indicates the maximum field size - in other words, how many characters will be used from the field content. The precision is not allowed for integer presentation types; specifying one raises a ValueError.
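A brief sketch of how precision acts on different types is given below. The last case shows what current CPython does when a precision is combined with an integer presentation type: it raises ValueError.

```python
fixed = "{0:.2f}".format(3.14159)          # digits after the decimal point
truncated = "{0:.3}".format("cppsecrets")  # maximum number of characters

# Precision together with an integer type is rejected
try:
    "{0:.2d}".format(42)
    integer_result = "accepted"
except ValueError:
    integer_result = "ValueError"

print(fixed)           # 3.14
print(truncated)       # cpp
print(integer_result)  # ValueError
```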

Finally, the 'type' determines how the data should be presented.

The available integer presentation types are:

'b' - Binary. Outputs the number in base 2.
'c' - Character. Converts the integer to the corresponding
      Unicode character before printing.
'd' - Decimal Integer. Outputs the number in base 10.
'o' - Octal format. Outputs the number in base 8.
'x' - Hex format. Outputs the number in base 16, using lower-
      case letters for the digits above 9.
'X' - Hex format. Outputs the number in base 16, using upper-
      case letters for the digits above 9.
'n' - Number. This is the same as 'd', except that it uses the
      current locale setting to insert the appropriate
      number separator characters.
'' (None) - the same as 'd'
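The integer presentation types can be compared side by side. In this sketch 'n' is left out because its output depends on the locale.

```python
number = 255

integer_forms = {
    "b": "{0:b}".format(number),  # binary
    "c": "{0:c}".format(65),      # Unicode character for code point 65
    "d": "{0:d}".format(number),  # decimal
    "o": "{0:o}".format(number),  # octal
    "x": "{0:x}".format(number),  # hex, lower-case
    "X": "{0:X}".format(number),  # hex, upper-case
    "":  "{0}".format(number),    # no type given, same as 'd'
}

for type_code, text in integer_forms.items():
    print(repr(type_code), text)
```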


The available floating point presentation types are:

'e' - Exponent notation. Prints the number in scientific
      notation using the letter 'e' to indicate the exponent.
'E' - Exponent notation. Same as 'e' except it uses an
      upper-case 'E' as the separator character.
'f' - Fixed point. Displays the number as a fixed-point
      number.
'F' - Fixed point. Same as 'f' except that 'nan' and 'inf'
      are displayed as 'NAN' and 'INF'.
'g' - General format. This prints the number as a fixed-point
      number, unless the number is too large or too small, in
      which case it switches to 'e' exponent notation.
'G' - General format. Same as 'g' except switches to 'E'
      if the number gets too large.
'n' - Number. This is the same as 'g', except that it uses the
      current locale setting to insert the appropriate
      number separator characters.
'%' - Percentage. Multiplies the number by 100 and displays
      in fixed ('f') format, followed by a percent sign.
'' (None) - similar to 'g', except that it prints at least one
      digit after the decimal point.
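The floating point presentation types can be compared in the same way. The values are illustrative, and 'n' is again omitted because it is locale-dependent.

```python
value = 1234.5

float_forms = {
    "e": "{0:e}".format(value),            # exponent notation
    "E": "{0:E}".format(value),            # upper-case exponent marker
    "f": "{0:f}".format(value),            # fixed point, 6 digits by default
    "F inf": "{0:F}".format(float("inf")), # upper-case 'INF'
    "g": "{0:g}".format(value),            # general format, fixed-point here
    "g large": "{0:g}".format(1234567.0),  # switches to exponent notation
    "%": "{0:.1%}".format(0.256),          # percentage
    "none": "{0}".format(2.0),             # keeps one digit after the point
    "g whole": "{0:g}".format(2.0),        # 'g' drops the trailing '.0'
}

for label, text in float_forms.items():
    print(label, text)
```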


Objects are able to define their own format specifiers to replace the standard ones. An example is the 'datetime' class, whose format specifiers might look something like the arguments to the strftime() function:

"Today is: {0:%a %b %d %H:%M:%S %Y}".format(datetime.now())


For all built-in types, an empty format specification will produce the equivalent of str(value). It is recommended that objects defining their own format specifiers follow this convention as well.
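An object defines its own format specifiers by implementing the '__format__' method. The sketch below uses a hypothetical 'Temperature' class whose specifier selects the unit, and follows the convention that an empty specifier gives the same result as str(). The datetime line uses a fixed date so that the output is reproducible.

```python
from datetime import datetime


class Temperature:
    """Hypothetical class whose format specifier selects the unit."""

    def __init__(self, celsius):
        self.celsius = celsius

    def __str__(self):
        return f"{self.celsius:.1f}C"

    def __format__(self, format_spec):
        if format_spec in ("", "C"):
            # Empty specifier: same as str(value)
            return str(self)
        if format_spec == "F":
            return f"{self.celsius * 9 / 5 + 32:.1f}F"
        raise ValueError(f"unknown format specifier: {format_spec!r}")


boiling = Temperature(100)
default_text = "{0}".format(boiling)
fahrenheit_text = "{0:F}".format(boiling)

# datetime interprets its specifier like strftime() arguments
date_text = "{0:%Y-%m-%d}".format(datetime(2020, 1, 2))

print(default_text)     # 100.0C
print(fahrenheit_text)  # 212.0F
print(date_text)        # 2020-01-02
```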


Explicit conversion flag :

The explicit conversion flag is used to transform the format field value before it is formatted. This can be used to override the type-specific formatting behavior, and format the value as if it were a more generic type. Currently, three explicit conversion flags are recognized:

!r - convert the value to a string using repr().
!s - convert the value to a string using str().
!a - convert the value to a string using ascii().

These flags are placed before the format specifier:

"{0!r:20}".format("cppsecrets")


In the preceding example, the string "cppsecrets" will be printed, with quotes, in a field at least 20 characters wide.

A custom Formatter class can define additional conversion flags. The built-in formatter will raise a ValueError if an invalid conversion flag is specified.
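In string.Formatter, conversion flags are handled by the 'convert_field' method, so a subclass can add a flag by overriding it. The '!u' flag and the 'UpperFormatter' class below are invented for this sketch; they are not part of Python.

```python
from string import Formatter


class UpperFormatter(Formatter):
    """Hypothetical subclass that adds a '!u' (upper-case) conversion."""

    def convert_field(self, value, conversion):
        if conversion == "u":
            return str(value).upper()
        # Fall back to the standard flags (and the standard ValueError)
        return super().convert_field(value, conversion)


upper = UpperFormatter()
converted = upper.format("{0!u:>12}|{0!r}", "cppsecrets")
print(converted)  #   CPPSECRETS|'cppsecrets'

# The plain Formatter rejects the unknown flag
try:
    Formatter().format("{0!u}", "cppsecrets")
    plain_result = "accepted"
except ValueError:
    plain_result = "ValueError"
print(plain_result)  # ValueError
```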


Error Handling :

There are two classes of exceptions which can occur during formatting: exceptions generated by the formatter code itself, and exceptions generated by user code (such as a field object's '__getattr__' method).

In general, exceptions generated by the formatter code itself are of the "ValueError" variety -- there is an error in the actual "value" of the format string. (This is not always true; for example, the format() method might be passed a non-string as its format string, which would result in a TypeError.)
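The sketch below collects the exception types raised by string.Formatter for a few malformed calls, plus one exception coming from user code. 'error_name' and 'Explosive' are helpers invented for the example.

```python
from string import Formatter

formatter = Formatter()


def error_name(*args, **kwargs):
    """Hypothetical helper: return the name of the exception raised, if any."""
    try:
        formatter.format(*args, **kwargs)
    except Exception as exc:
        return type(exc).__name__
    return None


class Explosive:
    """User code that fails during attribute lookup."""

    def __getattr__(self, name):
        raise RuntimeError(f"cannot read {name}")


outcomes = {
    "unbalanced brace": error_name("{0", 1),
    "unknown conversion": error_name("{0!x}", 1),
    "non-string template": error_name(123),
    "missing index": error_name("{1}", "a"),
    "missing key": error_name("{name}"),
    "user code": error_name("{0.boom}", Explosive()),
}

for case, name in outcomes.items():
    print(case, "->", name)
```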

The text associated with these internally generated ValueError exceptions describes the nature of the problem in the format string.

For exceptions generated by user code, the original design proposed that a trace record and dummy frame be added to the traceback stack to help in determining the location in the string where the exception occurred. The inserted traceback would indicate that the error occurred at:

File "<format_string>", line XX, in column_YY

where XX and YY represent the line and character position information in the string, respectively. Current CPython releases do not appear to add such a frame; the exception simply propagates with its ordinary traceback.




So, as we have seen above, the built-in string class provides the ability to do complex variable substitutions and value formatting via the format() method. The Formatter class in the string module allows you to create and customize your own string formatting behaviors using the same implementation as the built-in format() method.


Article by Vishal Lodhi