practracker.py codec exception in some locales
practracker.py, implemented in legacy/trac#29221, seems to have a locale dependency when run under python3. If the locale isn't a UTF-8 locale, non-ASCII (UTF-8 encoded) characters in the sources can result in an exception:
    $ LANG=en_US.US-ASCII make check-best-practices PYTHON=python
    python ../scripts/maint/practracker/practracker.py ..
    mirkwood:build-norust tlyu$ LANG=en_US.US-ASCII make check-best-practices
    python3 ../scripts/maint/practracker/practracker.py ..
    Traceback (most recent call last):
      File "../scripts/maint/practracker/practracker.py", line 151, in <module>
        main()
      File "../scripts/maint/practracker/practracker.py", line 134, in main
        found_new_issues = consider_all_metrics(files_list)
      File "../scripts/maint/practracker/practracker.py", line 89, in consider_all_metrics
        found_new_issues |= consider_metrics_for_file(fname, f)
      File "../scripts/maint/practracker/practracker.py", line 104, in consider_metrics_for_file
        found_new_issues |= consider_file_size(fname, f)
      File "../scripts/maint/practracker/practracker.py", line 51, in consider_file_size
        file_size = metrics.get_file_len(f)
      File "/Users/tlyu/src/tor/scripts/maint/practracker/metrics.py", line 11, in get_file_len
        for i, l in enumerate(f):
      File "/Users/tlyu/src/brew/Cellar/python/3.7.2_2/Frameworks/Python.framework/Versions/3.7/lib/python3.7/encodings/ascii.py", line 26, in decode
        return codecs.ascii_decode(input, self.errors)
    UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 14: ordinal not in range(128)
    make: *** [check-best-practices] Error 1
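As far as I understand it, python3's open() in text mode takes its default encoding from the locale, which would explain why the ascii codec shows up in the traceback. Here is a minimal sketch that reproduces the same kind of failure without changing LANG, by requesting ASCII decoding explicitly. The helper name decode_fails_as_ascii and the sample bytes are made up for illustration and are not part of practracker.

```python
import io
import os
import tempfile


def decode_fails_as_ascii(data):
    """Hypothetical helper: write `data` to a temp file, then iterate over it
    as ASCII text. Returns True if that raises UnicodeDecodeError."""
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
        try:
            # encoding="ascii" stands in for what a non-UTF-8 locale
            # is assumed to select by default under python3.
            with io.open(path, "r", encoding="ascii") as f:
                for _ in f:
                    pass
        except UnicodeDecodeError:
            return True
        return False
    finally:
        os.remove(path)
```

A source line containing the UTF-8 bytes 0xc3 0xa9 (an "é") should trip the decoder, while a pure-ASCII line should not.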
I'm also seeing this on gitlab.com CI, but I don't know offhand what its locale environment variables are.
We might want to use the encoding= keyword parameter to open(), but I think that would no longer be python2 compatible.
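One possible way around the python2 concern: io.open() accepts encoding= on both python2 (2.6 and later) and python3, and on python3 it is the same function as the builtin open(). The sketch below is only an illustration under that assumption. count_lines_utf8 and demo are hypothetical names, and the helper takes a path, whereas the real metrics.get_file_len() takes an already opened file object.

```python
import io
import os
import tempfile


def count_lines_utf8(path):
    """Hypothetical helper: count lines in `path`, always decoding as UTF-8
    so the result does not depend on the locale."""
    with io.open(path, "r", encoding="utf-8") as f:
        return sum(1 for _ in f)


def demo():
    """Write a two-line sample containing a UTF-8 encoded character and
    return its line count."""
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(b"/* caf\xc3\xa9 */\nint x;\n")
        return count_lines_utf8(path)
    finally:
        os.remove(path)
```

With this approach the place where practracker opens each source file would need to change, not the metrics functions themselves. I have not checked how that interacts with the rest of the script.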