Store raw descriptors as byte[], offset, and length.
Prior to this commit we read raw descriptor bytes from disk, split them into serveral byte[] for each contained descriptor, and stored those copies together with descriptors. We further copied descriptor parts, like signatures or status entries, and stored those copies as well. Overall, we temporarily required up to 3 times the size of descriptor files just to store raw descriptor contents: 1) the entire descriptor file read to memory, 2) copies of all contained descriptors, and 3) copies of contained descriptor parts. After moving on to the next descriptor file, 1) was freed, but 2) and 3) remained in memory. This was rather wasteful. With this commit we store raw descriptors as reference to the byte[] containing the entire descriptor file plus offset and length of the part containing one descriptor. Similarly we store raw descriptor parts as a reference to the full descriptor plus offset and length of the descriptor part. This saves a lot of memory, and it avoids unnecessary array copying. This change is also a step towards not storing raw descriptor contents in memory at all, but instead leaving contents on disk and accessing parts as needed. However, this commit does not take that step yet. The original purpose of this commit was to prepare switching from the platform's default charset to UTF-8 for #21932. The idea was to reduce access to DescriptorImpl#rawDescriptorBytes and add all methods working on those bytes, including converting them to a String, to DescriptorImpl. This commit achieves this purpose by preparing that switch, yet it does not take that step, either. Switching to UTF-8 is midly backward-incompatible, so it'll have to wait until 2.0.0. However, switching will be much easier based on the changes in this commit. Many of these changes in this commit are interdependent which makes it difficult to split up this commit with reasonable effort. Still, in order to facilitate reviews, here is an explanation of changes made in this commit from top to bottom: Move all code for processing raw descriptor bytes from a) detecting the descriptor type, b) finding descriptor starts and ends, up to c) invoking the right DescriptorImpl subclass constructors from DescriptorImpl and its subclasses over to DescriptorParserImpl. Include offset and limit in the constructors of DescriptorImpl and most of its subclasses. Refer to directory and network status parts in RelayDirectoryImpl and NetworkStatusImpl and its subclasses by offset and length rather than passing copies of raw descriptors. Provide two overloaded methods DescriptorImpl#newScanner() that internally handle the byte[]-to-String conversion rather than leaving this task to all DescriptorImpl subclasses. In DescriptorImpl, rather than storing a copy of raw descriptor bytes per descriptor, store a reference to a potentially larger byte[], containing all descriptors read from a given file, together with offset and length. Provide various methods in DescriptorImpl that provide access to raw descriptor bytes and that internally handle issues like unified character encoding. Include an XXX21932 tag in all places where byte[] is currently converted to String using the platform's default charset. Update existing methods in DescriptorImpl to only access rawDescriptorBytes within offset and offset + length. In classes referenced from DescriptorImpl subclasses, like DirSourceEntryImpl and NetworkStatusEntryImpl, rather than storing a copy of raw descriptor bytes, store a reference to the parent DescriptorImpl instance together with offset and length. Change raw descriptor bytes in ExitListEntryImpl into a String, because the byte[] we stored there was never read from disk but generated by ourselves using String#getBytes() using the platform's default charset. We also never used raw bytes in ExitListEntryImpl anyway. Admittedly, we could use offset and length there, too, but the amount of saved memory is likely not worth the necessary code changes. Remove redundant zero-length checks from DescriptorImpl subclasses including ExitListImpl, NetworkStatusImpl, and RelayDirectoryImpl. These checks are redundant, because we already performed the same checks in DescriptorImpl#countKeys(). Move commonly used helper methods for finding the first index of a keyword or splitting descriptory by keyword from DescriptorImpl subclasses, like NetworkStatusImpl and RelayDirectoryImpl, to DescriptorImpl. In test classes, replace the numerous invocations of DescriptorImpl subclass constructors with local buildSomething() methods, so that future changes to constructor signatures won't produce a diff as long as this one.