Avoid HttpURLConnection and use a more robust approach
While testing sync, the following happened: while downloading recent from the main collector using the old method, i.e. DescriptorCollectorImpl, the download hung at some point. I killed it after 30 minutes and restarted; the collection then worked.
When the hang-up happened again, I stopped the process immediately and restarted it using the new method, i.e. DescriptorIndexCollector. This time FileNotFound warnings were logged and the download finished without problems. The files that were no longer available had just expired, i.e. CollecTor had just cleaned them out of its recent folder, which would also explain the hang-up of the old method.
=== Why can one approach cope with disappearing files?
The two approaches differ in the part of the trace beneath FilterInputStream.read (marked in both dumps; needs a wide window to be visible).
==== thread dump old method
"CollecTor-Scheduled-Thread-1" #9 daemon prio=5 os_prio=0 tid=0x00007f49d43c0800 nid=0x25fd runnable [0x00007f49b0f06000]
java.lang.Thread.State: RUNNABLE
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
at java.net.SocketInputStream.read(SocketInputStream.java:170)
at java.net.SocketInputStream.read(SocketInputStream.java:141)
at sun.security.ssl.InputRecord.readFully(InputRecord.java:465)
at sun.security.ssl.InputRecord.readV3Record(InputRecord.java:593)
at sun.security.ssl.InputRecord.read(InputRecord.java:532)
at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:973)
- locked <0x000000071dc8b160> (a java.lang.Object)
at sun.security.ssl.SSLSocketImpl.readDataRecord(SSLSocketImpl.java:930)
at sun.security.ssl.AppInputStream.read(AppInputStream.java:105)
- locked <0x000000071dcbe658> (a sun.security.ssl.AppInputStream)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
at java.io.BufferedInputStream.read1(BufferedInputStream.java:286)
at java.io.BufferedInputStream.read(BufferedInputStream.java:345) <------------------------------------+
- locked <0x000000071dd26f98> (a java.io.BufferedInputStream) |
at sun.net.www.http.ChunkedInputStream.readAheadBlocking(ChunkedInputStream.java:552) |
at sun.net.www.http.ChunkedInputStream.readAhead(ChunkedInputStream.java:609) | Difference
at sun.net.www.http.ChunkedInputStream.read(ChunkedInputStream.java:696) |
- locked <0x000000071dd2ae48> (a sun.net.www.http.ChunkedInputStream) |
at java.io.FilterInputStream.read(FilterInputStream.java:133) <-------------------------------------+
at sun.net.www.protocol.http.HttpURLConnection$HttpInputStream.read(HttpURLConnection.java:3336)
at java.util.zip.InflaterInputStream.fill(InflaterInputStream.java:238)
at java.util.zip.InflaterInputStream.read(InflaterInputStream.java:158)
at java.util.zip.GZIPInputStream.read(GZIPInputStream.java:117)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
at java.io.BufferedInputStream.read1(BufferedInputStream.java:286)
at java.io.BufferedInputStream.read(BufferedInputStream.java:345)
- locked <0x000000071dd2d138> (a java.io.BufferedInputStream)
at org.torproject.descriptor.impl.DescriptorCollectorImpl.fetchRemoteFile(DescriptorCollectorImpl.java:225)
...
==== thread dump new method
"CollecTor-Scheduled-Thread-1" #9 daemon prio=5 os_prio=0 tid=0x00007f19fc3a9800 nid=0x4060 runnable [0x00007f19e4a3f000]
java.lang.Thread.State: RUNNABLE
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
at java.net.SocketInputStream.read(SocketInputStream.java:170)
at java.net.SocketInputStream.read(SocketInputStream.java:141)
at sun.security.ssl.InputRecord.readFully(InputRecord.java:465)
at sun.security.ssl.InputRecord.readV3Record(InputRecord.java:593)
at sun.security.ssl.InputRecord.read(InputRecord.java:532)
at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:973)
- locked <0x0000000734394300> (a java.lang.Object)
at sun.security.ssl.SSLSocketImpl.readDataRecord(SSLSocketImpl.java:930)
at sun.security.ssl.AppInputStream.read(AppInputStream.java:105)
- locked <0x00000007343963d0> (a sun.security.ssl.AppInputStream)
at java.io.BufferedInputStream.read1(BufferedInputStream.java:284)
at java.io.BufferedInputStream.read(BufferedInputStream.java:345) <-------------------------------------+
- locked <0x000000071df8b3b8> (a java.io.BufferedInputStream) |
at sun.net.www.MeteredStream.read(MeteredStream.java:134) | Difference
- locked <0x000000071df8f618> (a sun.net.www.http.KeepAliveStream) <------- (*) |
at java.io.FilterInputStream.read(FilterInputStream.java:133) <--------------------------------------+
at sun.net.www.protocol.http.HttpURLConnection$HttpInputStream.read(HttpURLConnection.java:3336)
at sun.net.www.protocol.http.HttpURLConnection$HttpInputStream.read(HttpURLConnection.java:3329)
at java.nio.file.Files.copy(Files.java:2908)
at java.nio.file.Files.copy(Files.java:3027)
at org.torproject.descriptor.index.DescriptorIndexCollector.fetchRemoteFiles(DescriptorIndexCollector.java:88)
at org.torproject.descriptor.index.DescriptorIndexCollector.collectDescriptors(DescriptorIndexCollector.java:61)
...
(*) see KeepAliveStream.java
Example code from DescriptorIndexCollector (cf. here):
try (InputStream is = new URL(baseUrl + "/" + filepathname)
    .openStream()) {
  Files.copy(is, tempDestinationFile.toPath());
  ...
This is also a little shorter than the older approach:
try {
  URL url = new URL(urlString);
  huc = (HttpURLConnection) url.openConnection();
  huc.setRequestMethod("GET");
  huc.connect();
  int responseCode = huc.getResponseCode();
  if (responseCode == 200) {
    BufferedReader br = new BufferedReader(new InputStreamReader(
        huc.getInputStream()));
    String line;
    while ((line = br.readLine()) != null) {
      sb.append(line).append("\n");
    }
    br.close();
  }
  ...
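For comparison, here is a minimal, self-contained sketch of the shorter Files.copy approach. It is not the actual DescriptorIndexCollector code: the class name RemoteFileFetchSketch, the method fetchToFile and its timeoutMillis parameter are made up for illustration. The connect and read timeouts are an assumption of mine, added because both thread dumps sit in a socket read; whether they would have prevented the observed hang is untested. A file that has disappeared on the server shows up as a FileNotFoundException, which the sketch turns into a false return value instead of a failure.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.net.URLConnection;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Hypothetical helper, not part of metrics-lib. */
final class RemoteFileFetchSketch {

  /**
   * Copies the content behind source to destination.
   * Returns false if the source no longer exists (e.g. it expired),
   * true if the file was written. Other I/O problems, including a
   * read timeout, are passed on as IOException.
   */
  static boolean fetchToFile(URL source, Path destination,
      int timeoutMillis) throws IOException {
    URLConnection connection = source.openConnection();
    // Assumption: bound both phases so a stalled server cannot block forever.
    connection.setConnectTimeout(timeoutMillis);
    connection.setReadTimeout(timeoutMillis);
    // Download to a temporary file first so that a failed transfer
    // never leaves a partial file at the destination.
    Path temp = Files.createTempFile(
        destination.toAbsolutePath().getParent(), "fetch", ".tmp");
    try (InputStream is = connection.getInputStream()) {
      Files.copy(is, temp, StandardCopyOption.REPLACE_EXISTING);
      Files.move(temp, destination, StandardCopyOption.REPLACE_EXISTING);
      return true;
    } catch (FileNotFoundException e) {
      // Thrown for missing local files and for HTTP 404/410 responses.
      return false;
    } finally {
      Files.deleteIfExists(temp);
    }
  }
}
```

A caller could log a warning and continue with the next file whenever fetchToFile returns false, which matches the behaviour seen with the new method above.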
=== next steps
- Try to devise a test that triggers the above problem.
- And/or analyze the underlying sources for the reason.
- Replace the HttpURLConnection code with the shorter Files.copy of an InputStream.
This would probably be important for the following tickets, too: legacy/trac#8799 (moved), legacy/trac#16151 (moved)
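As a possible starting point for the test mentioned in the first step, the following sketch runs a local server that answers 404 for an "expired" path and fetches from it with the openStream/Files.copy pattern. Everything here is hypothetical scaffolding (the class ExpiredFileFetchSketch, the /present and /expired paths, the returned strings), and it uses the JDK's built-in com.sun.net.httpserver.HttpServer. It only covers the disappearing-file case; it does not reproduce the hang itself, whose cause is still open.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.FileNotFoundException;
import java.io.InputStream;
import java.net.InetSocketAddress;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical test scaffolding, not part of CollecTor or metrics-lib. */
final class ExpiredFileFetchSketch {

  /** Starts a local server: /present answers 200, /expired answers 404. */
  static HttpServer startServer() throws Exception {
    HttpServer server = HttpServer.create(
        new InetSocketAddress("127.0.0.1", 0), 0); // port 0: pick a free port
    server.createContext("/present", exchange -> {
      byte[] body = "descriptor\n".getBytes(StandardCharsets.US_ASCII);
      exchange.sendResponseHeaders(200, body.length);
      exchange.getResponseBody().write(body);
      exchange.close();
    });
    server.createContext("/expired", exchange -> {
      exchange.sendResponseHeaders(404, -1); // -1: no response body
      exchange.close();
    });
    server.start();
    return server;
  }

  /** Fetches url to destination; returns "ok" or "missing". */
  static String fetch(String url, Path destination) throws Exception {
    try (InputStream is = new URL(url).openStream()) {
      Files.copy(is, destination);
      return "ok";
    } catch (FileNotFoundException e) {
      // openStream reports an HTTP 404 as FileNotFoundException.
      return "missing";
    }
  }

  public static void main(String[] args) throws Exception {
    HttpServer server = startServer();
    try {
      String base = "http://127.0.0.1:" + server.getAddress().getPort();
      Path dir = Files.createTempDirectory("fetch-test");
      System.out.println(fetch(base + "/present", dir.resolve("present")));
      System.out.println(fetch(base + "/expired", dir.resolve("expired")));
    } finally {
      server.stop(0);
    }
  }
}
```

To get closer to the old method's trace, the /present handler could be changed to send a chunked, gzip-compressed response and stall partway through, but that is speculation about the trigger rather than something the dumps prove.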