Compass' command-line script can't encode unicode characters
Today I found that tail and less are unhappy about the task #6329 script printing out Unicode characters. When piping its output into tail or less, the script exits with a traceback. When writing to stdout directly, Python is happy.
Here's how to reproduce the problem:
Clone the metrics-tasks repository.
Navigate to the #6329 script and make it download the required data:
cd task-6329/; ./tor-relays-stats.py -d
Find a unicode character in an AS name:
grep -B1 "as_name.*\\\\u" details.json
Display relays in that AS, e.g. AS28548:
./tor-relays-stats.py -i -a 28548 | tail
Python should print out the following traceback:
Traceback (most recent call last):
  File "./tor-relays-stats.py", line 197, in <module>
    short=70 if options.short else None)
  File "./tor-relays-stats.py", line 110, in print_groups
    print formatted_group[:short]
UnicodeEncodeError: 'ascii' codec can't encode character u'\xf3' in position 144: ordinal not in range(128)
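The failure can be reproduced in isolation, without the script. This is a minimal sketch, assuming that Python 2 falls back to the ASCII codec when stdout is a pipe (sys.stdout.encoding is None in that case). The AS name and the try_encode helper are made up for illustration and are not taken from details.json or the script:

```python
# Hypothetical AS name containing U+00F3, the character named in the traceback.
as_name = u"Telef\xf3nica Example"


def try_encode(text, codec):
    """Return the encoded bytes, or the exception raised while encoding."""
    try:
        return text.encode(codec)
    except UnicodeEncodeError as error:
        return error


# ASCII is what Python 2 uses when stdout reports no encoding (assumption above).
ascii_result = try_encode(as_name, "ascii")
# UTF-8 stands in for a terminal whose locale is UTF-8 (also an assumption).
utf8_result = try_encode(as_name, "utf-8")

print(repr(ascii_result))
print(repr(utf8_result))
```

The ASCII attempt fails on the same character as in the traceback, while the UTF-8 attempt succeeds, which matches the difference between piping and writing to the terminal.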
One possible solution is to replace all non-ASCII characters in the AS name with '?', but that doesn't seem very elegant:
- exit, guard, country, as_number, as_name)
+ exit, guard, country, as_number, as_name.encode('ascii', 'replace'))
Are there better solutions?
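One candidate, as a hedged sketch rather than a tested patch: encode with the stream's own encoding when it reports one, and fall back to UTF-8 otherwise. encode_for_output and PipeLike are hypothetical names, not part of tor-relays-stats.py, and the UTF-8 fallback is an assumption about what the consumer of the piped output expects:

```python
import sys


def encode_for_output(text, stream=None, fallback="utf-8"):
    """Encode text for a stream, using `fallback` if the stream has no encoding.

    Hypothetical helper. Under Python 2, sys.stdout.encoding is None when
    stdout is a pipe, which is the case that triggers the traceback.
    """
    stream = sys.stdout if stream is None else stream
    encoding = getattr(stream, "encoding", None) or fallback
    # 'replace' only applies when the chosen encoding cannot represent a character.
    return text.encode(encoding, "replace")


class PipeLike(object):
    """Stand-in for a piped stdout that reports no encoding."""
    encoding = None


encoded = encode_for_output(u"Telef\xf3nica Example", PipeLike())
print(repr(encoded))
```

With this approach the AS name survives intact in piped output, and '?' substitution happens only when the target encoding really cannot represent a character. Setting PYTHONIOENCODING=utf-8 in the environment before running the script might be another option that needs no code change.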