Python String isidentifier()


The Python str.isidentifier() method returns True if the string is a valid identifier according to the Python language definition, and False otherwise.

Python String isidentifier()

A valid identifier can be of any length. Prior to Python 3.0, a valid identifier could contain only the uppercase and lowercase letters A through Z, the underscore _ and, except for the first character, the digits 0 through 9.

However, Python 3.0 allowed additional characters from outside the ASCII range to be used in identifiers. This change was specified in PEP 3131.
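The ASCII-only rule can be approximated with a regular expression, which makes the difference from the Unicode-aware isidentifier() easy to see. This is only a minimal sketch: the helper name is_ascii_identifier is hypothetical and is not part of the standard library.

```python
import re

# Hypothetical helper: the ASCII-only identifier rule described above.
# First character: a letter or underscore; the rest: letters, digits, underscores.
_ASCII_IDENTIFIER = re.compile(r'[A-Za-z_][A-Za-z0-9_]*')


def is_ascii_identifier(s):
    """Return True if s follows the ASCII-only identifier rule."""
    return _ASCII_IDENTIFIER.fullmatch(s) is not None


for name in ['xyzABC', '_xyz', '0xyz', '', 'ñandú']:
    print(f'{name!r}: ASCII rule = {is_ascii_identifier(name)}, '
          f'isidentifier() = {name.isidentifier()}')
```

The two checks should agree on the ASCII names and differ only on the last one, which contains non-ASCII letters.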

Let’s look at some examples of the isidentifier() method.


s = 'xyzABC'
print(f'{s} is a valid identifier = {s.isidentifier()}')

Output: xyzABC is a valid identifier = True


s = '0xyz'
print(f'{s} is a valid identifier = {s.isidentifier()}')

Output: 0xyz is a valid identifier = False

The result is False because an identifier can’t start with a digit.


s = ''
print(f'{s} is a valid identifier = {s.isidentifier()}')

Output:  is a valid identifier = False

The result is False because an identifier can’t be an empty string.


s = '_xyz'
print(f'{s} is a valid identifier = {s.isidentifier()}')

Output: _xyz is a valid identifier = True

The result is True because an underscore is allowed as the first character of an identifier.


s = 'ꝗꞨꫳ'
print(f'{s} is a valid identifier = {s.isidentifier()}')

Output: ꝗꞨꫳ is a valid identifier = True

It’s a valid identifier because PEP 3131 added these non-ASCII characters to the set of valid identifier characters. Python 2.x does not allow these characters in identifiers, and its string types have no isidentifier() method at all, so this example only runs on Python 3.
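One case the examples above do not cover is reserved words. isidentifier() only checks the characters in the string, so it also returns True for keywords such as class or def, which cannot actually be used as names. A possible way to combine it with keyword.iskeyword() from the standard library is sketched below; the helper name is_usable_name is an assumption made for illustration.

```python
import keyword


def is_usable_name(s):
    """Hypothetical helper: valid identifier that is not a reserved keyword."""
    return s.isidentifier() and not keyword.iskeyword(s)


for name in ['class', 'def', 'klass', 'None']:
    print(f'{name}: isidentifier() = {name.isidentifier()}, '
          f'usable as a name = {is_usable_name(name)}')
```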

Print the list of all valid identifier characters

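The following program prints every character with a code point below 2 ** 16 (the Basic Multilingual Plane) that is a valid identifier on its own. It uses isidentifier() for the check and unicodedata only to look up each character’s name. Because it tests one-character strings, characters that are allowed only after the first position, such as the digits 0 through 9, are not counted.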


import unicodedata

count = 0
# Scan the Basic Multilingual Plane (U+0000 to U+FFFF) only.
for codepoint in range(2 ** 16):
    ch = chr(codepoint)
    # A one-character string is an identifier only if ch can start one.
    if ch.isidentifier():
        name = unicodedata.name(ch, 'UNNAMED')
        print(f'{codepoint:04x}: {ch} ({name})')
        count += 1
print(f'Total Number of Identifier Unicode Characters = {count}')

Output:


...
ffd7: ᅲ (HALFWIDTH HANGUL LETTER YU)
ffda: ᅳ (HALFWIDTH HANGUL LETTER EU)
ffdb: ᅴ (HALFWIDTH HANGUL LETTER YI)
ffdc: ᅵ (HALFWIDTH HANGUL LETTER I)
Total Number of Identifier Unicode Characters = 48880

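Note that I am showing only the last few lines of the output because the number of valid identifier characters is huge. The exact total depends on the version of the Unicode database bundled with your Python interpreter.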
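Because the program above tests one-character strings, it only finds characters that can begin an identifier. A rough way to also count the characters that are valid only in later positions is to prefix each one with a known-good start character, as in the sketch below. The variable names and the choice of 'a' as the prefix are assumptions made for illustration, and this version scans the full Unicode range instead of stopping at 2 ** 16.

```python
import sys

start_chars = 0     # valid as the first character of an identifier
continue_only = 0   # valid only after the first character

for codepoint in range(sys.maxunicode + 1):
    ch = chr(codepoint)
    if ch.isidentifier():
        start_chars += 1
    elif ('a' + ch).isidentifier():
        # Not valid alone, but accepted after a valid start character.
        continue_only += 1

print(f'Characters that can start an identifier = {start_chars}')
print(f'Characters valid only after the first position = {continue_only}')
```

The digits 0 through 9 fall in the second group, which matches the rule that an identifier can contain digits but cannot start with one.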

You can check out more Python examples in our GitHub Repository.

Reference: Official Documentation, PEP-3131
