Problem parsing .z-compressed descriptors fetched via DirPort
I'm having trouble parsing compressed descriptors fetched via tor's DirPort. I don't know enough about .z compression to track down the problem though. Anyway, here's what I found:
Download consensus and all server descriptors from turtles:
curl http://76.73.17.194:9030/tor/status-vote/current/consensus.z > turtles-consensus.z
curl http://76.73.17.194:9030/tor/server/all.z > turtles-server-all.z
Attempt to parse compressed consensus using Python's zlib:
Python 2.7.5 (default, Aug 25 2013, 00:04:04)
[GCC 4.2.1 Compatible Apple LLVM 5.0 (clang-500.0.68)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import zlib
>>> print len(zlib.decompress(open('turtles-consensus.z', 'rb').read()))
1085611
>>> print len(zlib.decompress(open('turtles-server-all.z', 'rb').read()))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
zlib.error: Error -5 while decompressing data: incomplete or truncated stream
>>> print len(zlib.decompressobj().decompress(open('turtles-server-all.z', 'rb').read()))
8932838
Decompressing the consensus works fine using zlib.decompress(), but decompressing the server descriptors does not. The only workaround I found was to explicitly create a decompressor using zlib.decompressobj(). AIUI, the difference between the two approaches is that the latter can handle partial content (cf. http://stackoverflow.com/questions/20620374/how-to-inflate-a-partial-zlib-file/20625078#20625078).
Does that mean tor sends partial content?
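To make the difference between the two calls easier to reproduce without fetching anything from a DirPort, here is a minimal sketch. It assumes Python 3.3 or later (the session above is Python 2.7, where the `eof` attribute is not available). The input data and the helper name `inflate_lenient` are made up for illustration, and the truncation is synthetic: it only shows how an incomplete zlib stream behaves, not what tor actually sends.

```python
import zlib


def inflate_lenient(data):
    """Inflate as much of `data` as possible.

    Returns (output, ended_cleanly). `ended_cleanly` is True only if the
    decompressor saw the end-of-stream marker, i.e. the stream was complete.
    """
    decompressor = zlib.decompressobj()
    output = decompressor.decompress(data)
    output += decompressor.flush()
    return output, decompressor.eof


# Synthetic stand-in for a descriptor file (not real tor data).
original = b"router example 127.0.0.1 9001 0 9030\n" * 1000
complete = zlib.compress(original)

# Simulate an incomplete stream by dropping the 4-byte Adler-32 trailer.
truncated = complete[:-4]

full_out, full_eof = inflate_lenient(complete)
part_out, part_eof = inflate_lenient(truncated)

# The one-shot API refuses a stream that does not end cleanly.
try:
    zlib.decompress(truncated)
    strict_failed = False
except zlib.error:
    strict_failed = True

print("complete stream ended cleanly:", full_eof)
print("truncated stream ended cleanly:", part_eof)
print("zlib.decompress() rejected truncated stream:", strict_failed)
```

Running the same `inflate_lenient` check on turtles-server-all.z should show whether that file ends cleanly: if `eof` comes back False, the stream is incomplete, which would be consistent with the Error -5 above.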
Cc'ing atagar and wfn, because we discussed this problem a year ago: https://lists.torproject.org/pipermail/tor-dev/2013-May/004924.html