Error when a comment contains \N or \U

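While rewatching a recorded lecture today, I got to the part on match grouping, which uses \NUM.
I then wrote that into a Python file inside a triple-quoted block ('''...''') used as a comment.
But whenever the text starts with \N or \U, Python raises an error.
The error messages are: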
SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 12-13: malformed \N character escape
SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 12-13: truncated \UXXXXXXXX escape

小龙兔

Upvoted by: 64616dh

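I tried it, and it really is strange. It looks like when the interpreter sees \U it still tries to parse it, expecting something like \U00014321 to follow. Logically, I would expect a \U written inside a comment to be exempt from Python's syntax checks. When I wrote it as shown here, the error went away. I'm not sure whether this is an interpreter bug. ''' \\U '''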
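For what it's worth, text between ''' and ''' is a string literal rather than a true comment, so escape sequences such as \N and \U are still processed, which is why doubling the backslash helps. Below is a minimal sketch of that workaround plus a raw-string alternative. The helper name `compiles` and the sample text are made up for illustration.

```python
def compiles(source: str) -> bool:
    """Return True if `source` compiles as Python code, False on SyntaxError."""
    try:
        compile(source, "<demo>", "exec")
        return True
    except SyntaxError:
        return False


# The outer r"..." literals keep the backslashes intact so that compile()
# sees exactly what would be typed into a .py file.
broken = r"'''\NUM matches digits'''"    # \N is read as a named-character escape
escaped = r"'''\\NUM matches digits'''"  # doubled backslash, as in the reply above
raw = r"r'''\NUM matches digits'''"      # raw triple-quoted string

for label, source in [("broken", broken), ("escaped", escaped), ("raw", raw)]:
    print(label, compiles(source))
```

Prefixing the block with r is usually the least intrusive fix, since the text inside stays exactly as written.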

小龙兔

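I came across an article describing exactly the same problem as yours: https://blog.csdn.net/wlsyn/article/details/49613867   Someone ran into a similar error inside a comment in Python, and the official answer was that this is by design. Details are at: https://bugs.python.org/issue13185   The relevant passage, pasted from that link: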
> Clearly, the stripping of comments and the source decoding should both be done in
> a single pass, and the source decoding should not be applied to the
> comments.

That's not clear at all. In general (i.e. for arbitrary encodings), it
is not possible to determine where the hash ("#") signs are in the input
without decoding. So you have to decode first.

In addition, it was a deliberate choice that the source encoding must be
consistent (i.e. all characters in the source must decode correctly),
even if that is not needed for parsing. This is like requiring colons
at the end of statements: they are not needed for parsing, but requiring
them improves the language.
