tools/asn2wrs.py: handle windows-1252 encoding

The RRC ASN.1 definitions caused a decode error in Python because the
file is encoded as windows-1252 rather than UTF-8. This patch makes the
tool try UTF-8 first and fall back to windows-1252, emitting a warning
for each encoding that fails.
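
A minimal sketch of the failure mode, for illustration only. The sample
bytes are hypothetical and not taken from the RRC file; they assume the
file contains a windows-1252 curly apostrophe (byte 0x92), which is not
a valid sequence on its own in UTF-8.

```python
# Hypothetical sample: "UE's" with a curly apostrophe, encoded as windows-1252.
sample = b"UE\x92s capability"

# Strict UTF-8 decoding rejects the lone 0x92 byte.
try:
    sample.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

# The same bytes decode cleanly as windows-1252 (0x92 maps to U+2019).
text = sample.decode("windows-1252")
```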

Tested with Python 2.6.9, 2.7.10, 3.4.3.

Change-Id: I9c9269e1065c98b8bcfb57ab4bfd21d5e183a656
Reviewed-on: https://code.wireshark.org/review/9133
Reviewed-by: Pascal Quantin <pascal.quantin@gmail.com>
Reviewed-by: Peter Wu <peter@lekensteyn.nl>
Peter Wu 2015-06-24 20:03:51 +02:00
parent 1141033884
commit 149d0b7e91

@@ -7977,9 +7977,18 @@ def eth_main():
         input_file = fn
         lexer.lineno = 1
         if (ectx.srcdir): fn = ectx.srcdir + '/' + fn
-        f = open (fn, "r")
-        ast.extend(yacc.parse(f.read(), lexer=lexer, debug=pd))
-        f.close ()
+        # Read ASN.1 definition, trying one of the common encodings.
+        data = open(fn, "rb").read()
+        for encoding in ('utf-8', 'windows-1252'):
+            try:
+                data = data.decode(encoding)
+                break
+            except:
+                warnings.warn_explicit("Decoding %s as %s failed, trying next." % (fn, encoding), UserWarning, '', 0)
+        # Py2 compat, name.translate in eth_output_hf_arr fails with unicode
+        if not isinstance(data, str):
+            data = data.encode('utf-8')
+        ast.extend(yacc.parse(data, lexer=lexer, debug=pd))
     ectx.eth_clean()
     if (ectx.merge_modules):  # common output for all module
         ectx.eth_clean()
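
The fallback loop in the hunk above can be sketched as a standalone
helper. This is not part of asn2wrs.py: the name decode_with_fallback
and its return shape are assumptions made for illustration. Unlike the
patch, the sketch catches only UnicodeDecodeError and raises ValueError
when no encoding fits, instead of leaving the data as undecoded bytes.

```python
def decode_with_fallback(raw, encodings=("utf-8", "windows-1252")):
    """Return (text, encoding) for the first encoding that decodes raw.

    Hypothetical helper; the encoding order mirrors the patch.
    """
    for encoding in encodings:
        try:
            return raw.decode(encoding), encoding
        except UnicodeDecodeError:
            # This encoding does not fit; try the next candidate.
            continue
    raise ValueError("none of %r could decode the input" % (encodings,))
```

Trying UTF-8 first matters because strict UTF-8 decoding rejects most
non-UTF-8 input, while windows-1252 accepts nearly every byte and would
silently mis-decode a valid UTF-8 file if it were tried first. Python's
windows-1252 codec still leaves a few bytes undefined (0x81, for
example), so the final ValueError is reachable.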