Python decode error - UnicodeDecodeError: 'utf-8' codec can't decode

If decode() raises an error, pass the errors parameter along with the encoding.

>>> html = html.decode('utf-8')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xb8 in position 38395: invalid start byte

The errors argument specifies the response when the input string can't be converted according to the encoding's rules. Legal values for this argument are 'strict' (raise a UnicodeDecodeError exception), 'replace' (use U+FFFD, REPLACEMENT CHARACTER), 'ignore' (just leave the character out of the Unicode result), or 'backslashreplace' (inserts a \xNN escape sequence). The following examples show the differences:

https://docs.python.org/3/howto/unicode.html#the-unicode-type

>>> b'\x80abc'.decode("utf-8", "strict")  
Traceback (most recent call last):
    ...
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 0:
  invalid start byte
>>> b'\x80abc'.decode("utf-8", "replace")
'\ufffdabc'
>>> b'\x80abc'.decode("utf-8", "backslashreplace")
'\\x80abc'
>>> b'\x80abc'.decode("utf-8", "ignore")
'abc'
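
As a rough sketch of how this could be applied to the html case above: try a strict decode first, and only fall back to 'replace' when it fails, so that you know whether any bytes were lost. The helper name decode_with_fallback and the sample bytes are made up for illustration; only bytes.decode() and the attributes of UnicodeDecodeError are standard Python.

```python
from typing import Tuple


def decode_with_fallback(data: bytes, encoding: str = "utf-8") -> Tuple[str, bool]:
    """Hypothetical helper: return (text, was_lossy)."""
    try:
        # Strict mode raises UnicodeDecodeError on the first bad byte.
        return data.decode(encoding, errors="strict"), False
    except UnicodeDecodeError as exc:
        # exc.start is the offset of the first byte that could not be decoded.
        print(f"bad byte {data[exc.start]:#04x} at position {exc.start}, replacing")
        # Each undecodable byte sequence becomes U+FFFD instead of raising.
        return data.decode(encoding, errors="replace"), True


# Sample input (an assumption): valid UTF-8 with one stray 0xb8 byte in it.
html = b"<p>caf\xc3\xa9</p>\xb8<p>ok</p>"
text, lossy = decode_with_fallback(html)
print(text, lossy)
```

If many bytes end up replaced, the data is probably not UTF-8 at all, and decoding with the page's real encoding is a better fix than hiding the errors.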

License: Attribution, NonCommercial, NoDerivatives
