Python String isidentifier() function returns True
if the string is a valid identifier according to the Python language definition.
Python String isidentifier()
A valid identifier can be of any length. Before Python 3.0, an identifier could contain only the uppercase and lowercase letters A through Z, the underscore _ and, except for the first character, the digits 0 through 9.
Python 3.0 additionally allows characters from outside the ASCII range in identifiers. This change was introduced by PEP 3131.
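As a rough illustration of the difference, the sketch below contrasts the old ASCII-only rule with isidentifier(). The helper is_ascii_identifier and its regular expression are my own assumptions for illustration and are not part of the standard library.

```python
import re

# Hypothetical helper: approximates the pre-3.0 ASCII-only identifier rule.
ASCII_IDENTIFIER = re.compile(r'[A-Za-z_][A-Za-z0-9_]*')

def is_ascii_identifier(s):
    # fullmatch() requires the whole string to fit the pattern.
    return ASCII_IDENTIFIER.fullmatch(s) is not None

for s in ['xyzABC', 'ñandú', '0xyz']:
    print(f'{s}: ascii rule = {is_ascii_identifier(s)}, isidentifier() = {s.isidentifier()}')
```

The two checks should disagree only for strings that contain non-ASCII characters, such as the second entry.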
Let’s look at some examples of the Python String isidentifier() function.
s = 'xyzABC'
print(f'{s} is a valid identifier = {s.isidentifier()}')
Output: xyzABC is a valid identifier = True
s = '0xyz'
print(f'{s} is a valid identifier = {s.isidentifier()}')
Output: 0xyz is a valid identifier = False
because an identifier can’t start with a digit (0-9).
s = ''
print(f'{s} is a valid identifier = {s.isidentifier()}')
Output: is a valid identifier = False
because an identifier can’t be an empty string.
s = '_xyz'
print(f'{s} is a valid identifier = {s.isidentifier()}')
Output: _xyz is a valid identifier = True
because an underscore is allowed as the first character of an identifier.
s = 'ꝗꞨꫳ'
print(f'{s} is a valid identifier = {s.isidentifier()}')
Output: ꝗꞨꫳ is a valid identifier = True
It’s a valid identifier because PEP 3131 added these non-ASCII characters to the set of characters allowed in identifiers. Python 2.x accepts only ASCII identifiers, and its string types have no isidentifier() method, so this example applies to Python 3 only.
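One caveat worth keeping in mind: isidentifier() checks only the spelling of the string, so reserved keywords such as def and class also return True. Here is a minimal sketch of a stricter check, assuming the standard keyword module; the helper name is_usable_name is hypothetical.

```python
import keyword

def is_usable_name(s):
    # Hypothetical helper: a valid identifier that is not a reserved keyword.
    return s.isidentifier() and not keyword.iskeyword(s)

for s in ['def', 'class', '_xyz']:
    print(f'{s}: isidentifier() = {s.isidentifier()}, usable name = {is_usable_name(s)}')
```

This kind of combined check can be useful when validating names supplied by a user before generating code from them.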
Print all valid identifier characters list
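We can combine isidentifier() with the unicodedata module to list the characters that are valid identifiers on their own; unicodedata is used only to look up each character's name. The program below checks every code point below 2 ** 16 (the Basic Multilingual Plane) and prints each single character for which isidentifier() returns True. Characters that are allowed only after the first position, such as the digits 0 through 9, are therefore not included.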
import unicodedata

count = 0
# Check every code point in the Basic Multilingual Plane (U+0000 to U+FFFF).
for codepoint in range(2 ** 16):
    ch = chr(codepoint)
    # A single character is an identifier only if it may start one.
    if ch.isidentifier():
        print('{:04x}: {} ({})'.format(codepoint, ch, unicodedata.name(ch, 'UNNAMED')))
        count += 1
print(f'Total Number of Identifier Unicode Characters = {count}')
Output:
...
ffd7: ᅲ (HALFWIDTH HANGUL LETTER YU)
ffda: ᅳ (HALFWIDTH HANGUL LETTER EU)
ffdb: ᅴ (HALFWIDTH HANGUL LETTER YI)
ffdc: ᅵ (HALFWIDTH HANGUL LETTER I)
Total Number of Identifier Unicode Characters = 48880
Note that I am showing only the last few lines of the output because the number of valid identifier characters is huge. The exact total depends on the Unicode version bundled with your Python release.
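The program above tests each character on its own, so it counts only characters that may begin an identifier. Characters such as digits are valid only in later positions. The sketch below is one way to separate the two cases; the helper names can_start and can_continue are hypothetical, and prefixing the letter 'a' is just a convenient trick, not an official API.

```python
def can_start(ch):
    # A single character is an identifier only if it may begin one.
    return ch.isidentifier()

def can_continue(ch):
    # Prefix a known-valid start character and test the two-character string.
    return ('a' + ch).isidentifier()

# Count both kinds of characters in the Basic Multilingual Plane.
start_count = sum(can_start(chr(cp)) for cp in range(2 ** 16))
continue_count = sum(can_continue(chr(cp)) for cp in range(2 ** 16))
print(f'Can start an identifier: {start_count}')
print(f'Can appear after the first character: {continue_count}')
```

The second count should come out larger than the first, since it also includes digits and other characters that are allowed only after the first position.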
Reference: Official Documentation, PEP-3131