Python regex findall alternation behavior


I'm using Python 2.7.6. I can't understand the following result from re.findall:

>>> re.findall('\d|\(\d,\d\)', '(6,7)')
['(6,7)']

I expected the above to return ['6', '7'], because according to the documentation:

'|'

A|B, where A and B can be arbitrary REs, creates a regular expression that will match either A or B. An arbitrary number of REs can be separated by the '|' in this way. This can be used inside groups (see below) as well. As the target string is scanned, REs separated by '|' are tried from left to right. When one pattern completely matches, that branch is accepted. This means that once A matches, B will not be tested further, even if it would produce a longer overall match. In other words, the '|' operator is never greedy. To match a literal '|', use \|, or enclose it inside a character class, as in [|].

Thanks for your help.

As mentioned in the documentation:

This means that once A matches, B will not be tested further, even if it would produce a longer overall match.

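So in your case the regex engine can't match \d at the start, because the string starts with ( rather than a digit, so it tries the second alternative, \(\d,\d\), which matches the entire string. Since findall returns non-overlapping matches and that match consumes every character, nothing is left for \d to match. If the string started with a digit, \d would match: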

>>> re.findall('\d|\d,\d\)', '6,7)')
['6', '7']
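To see that the position in the string matters as well as the order of the alternatives, here is a small sketch that runs a few variations side by side. The variable names are only illustrative, and the patterns are written as raw strings, which is a stylistic choice not used in the question:

```python
import re

text = '(6,7)'

# Alternatives are tried left to right at each position, scanning from the start.
# At index 0 the character is '(', so \d fails there and \(\d,\d\) matches
# the whole string, leaving nothing for later matches.
whole = re.findall(r'\d|\(\d,\d\)', text)

# With \d alone, the scan skips '(' ',' ')' and returns each digit.
digits = re.findall(r'\d', text)

# The order of alternatives only decides the outcome when both
# can match at the same position.
first_wins = re.findall(r'\d|\d,\d', '6,7')    # \d is tried first at index 0
longer_first = re.findall(r'\d,\d|\d', '6,7')  # \d,\d is tried first at index 0

print(whole)         # ['(6,7)']
print(digits)        # ['6', '7']
print(first_wins)    # ['6', '7']
print(longer_first)  # ['6,7']
```

If the goal is to get the individual digits out of '(6,7)', dropping the second alternative and using \d alone is probably the simplest option.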
