Python re to separate some data values
Joshua Judson Rosen
rozzin at hackerposse.com
Wed Apr 28 19:35:19 EDT 2021
On 4/28/21 7:01 PM, Bruce Labitt wrote:
> On 4/28/21 6:28 PM, Joshua Judson Rosen wrote:
>>> re.search('(\.)\d{3,3}', r1[1]) returns
>>> <re.Match object; span=(3, 7), match='.980'> so it found the first instance.
>>>
>>> But, re.sub('(\.)\d{3,3}', '(\.)\d{3,3}, ', r1[1]) yields a KeyError:
>>> '\\d' (Python3.8). Get bad escape \d at position 4.
>> The second argument [the replacement string] to re.sub(pattern, repl, string) is not supposed to
>> just be a variation of the pattern-matching string that you passed as the first argument.
>>
>> I think the best illustration that I can give here is to just fix this up for you:
>>
>> re.sub(r'(\.)(\d{3,3})', r'\1\2, ', r1[1])
>>
> Thanks for the embarrassingly concise answer. It is greatly
> appreciated. Can you explain the syntax of the 2nd argument? I haven't
> seen that before. Where can I find further examples?
>
> What astounds me is re.search allowed my 1st argument, but re.sub barfed
> all over the same 1st argument.
Actually re.sub also accepted your first argument just fine.
It was your _second_ argument that it barfed all over,
because your match didn't produce a "matched character group #d",
it only produced a "matched character group #1"
(IIRC Python's RE documentation generally just calls them "groups").
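
A quick way to check this, as a rough sketch: the string below is a made-up stand-in for r1[1] (the real data isn't shown in this thread), picked so that the first match lands at span (3, 7) like in the report above. The pattern is accepted by re.search, and the failure only shows up once the pattern text is reused as the replacement.

```python
import re

# Hypothetical stand-in for r1[1]; the real data is not shown in the thread.
sample = "100.980200.123300.456"
pattern = r'(\.)\d{3,3}'

# The pattern itself is fine: searching with it finds the first '.ddd'.
m = re.search(pattern, sample)
print(m.span(), m.group(0))

# Reusing the pattern text as the replacement is what fails: in a
# replacement string, "\d" is not a valid escape.
try:
    re.sub(pattern, r'(\.)\d{3,3}, ', sample)
    failure = None
except re.error as err:
    failure = str(err)
print(failure)
```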
Note that I added a second set of parentheses to your _pattern_
so that you now also have a group #2.
I was trying to make the smallest change possible to your pattern,
but this also would work fine:
re.sub(r'(\.\d{3,3})', r'\1, ', r1[1])
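
To compare the two variants side by side, here is a small sketch. The sample string is an assumption (a hypothetical stand-in for r1[1], with three-decimal values run together); only the two re.sub calls come from this thread.

```python
import re

# Hypothetical stand-in for r1[1]: values with three decimals, run together.
sample = "100.980200.123300.456"

# Two groups: \1 is the dot, \2 is the three digits after it.
two_groups = re.sub(r'(\.)(\d{3,3})', r'\1\2, ', sample)

# One group covering the dot and the digits together.
one_group = re.sub(r'(\.\d{3,3})', r'\1, ', sample)

print(two_groups)  # 100.980, 200.123, 300.456, 
print(one_group)   # same result
```

Both calls put ", " after every match, including the last one, so the result ends with a trailing ", " that you may want to strip.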
The "\1" (and "\2", in the previous example) are "backreferences",
and are actually explained in an OK-ish way in the online Python library manual's
section for re:
https://docs.python.org/3/library/re.html
(there are also a few other backreference syntaxes that you can use in Python,
so that you can give non-numeric names to them or just avoid ambiguities like
whether "\20" means `group #2 and then a literal "0"' or `group #20'...).
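
A short sketch of those other syntaxes, using the \g<...> forms described in the re documentation. The sample strings and the group name "frac" are made up for illustration.

```python
import re

# Hypothetical sample data, not from the original thread.
sample = "100.980200.123300.456"

# \g<1> is the explicit spelling of \1.
numbered = re.sub(r'(\.\d{3})', r'\g<1>, ', sample)

# (?P<name>...) names a group; \g<name> refers back to it.
named = re.sub(r'(?P<frac>\.\d{3})', r'\g<frac>, ', sample)

# \g<1>0 means "group 1, then a literal 0". Written as r'\10' it would be
# read as a reference to group 10, which does not exist in this pattern.
padded = re.sub(r'(\.\d{3})', r'\g<1>0', "1.234")

print(numbered)
print(named)
print(padded)  # 1.2340
```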
--
Connect with me on the GNU social network! <https://status.hackerposse.com/rozzin>
Not on the network? Ask me for more info!