1. 20 Jun, 2017 2 commits
    • Karsten Loesing's avatar
      Add new method to retrieve raw descriptor length. · a113feb2
      Karsten Loesing authored
      When we switched from storing copies of byte[] to storing byte[]
      references, offset, and length, getRawDescriptor() suddenly became
      more expensive than before.  (Before that, copying bytes in the first
      place was always expensive.)  If an application only calls
      Descriptor#getRawDescriptorBytes() to learn the array length, there's
      now a more efficient way to provide that information.
      a113feb2
    • Karsten Loesing's avatar
      Make DescriptorFile obsolete. · 1062da63
      Karsten Loesing authored
      This change simplifies the DescriptorReader interface and allows for
      shorter and more concise application code.  The result is that
      DescriptorReader returns Descriptor instances rather than
      DescriptorFile instances containing Descriptors and accepts a maximum
      queue size in Descriptors, DescriptorFile is deprecated, and
      Descriptor contains a File reference to the descriptor file.
      Implements #22141.
      
      Related to this change, this commit introduces a new
      UnparseableDescriptor to be returned by DescriptorParser and
      DescriptorReader if a descriptor cannot be parsed, as opposed to
      throwing a DescriptorParseException or skipping the entire descriptor
      file (fixes #22139), respectively.
      
      Also related to this change, DescriptorParser now returns an Iterable
      instead of a List, which prepares parsing large descriptor files
      descriptor by descriptor (will be tackled in #20395).
      1062da63
  2. 19 Jun, 2017 1 commit
  3. 17 Jun, 2017 2 commits
  4. 16 Jun, 2017 6 commits
  5. 12 Jun, 2017 3 commits
  6. 08 Jun, 2017 2 commits
  7. 07 Jun, 2017 3 commits
  8. 06 Jun, 2017 8 commits
    • Karsten Loesing's avatar
      Prepare for 1.8.0 release. · 086b0460
      Karsten Loesing authored
      086b0460
    • iwakeh's avatar
      Added simple tests for DescriptorImpl. · a3f41684
      iwakeh authored
      a3f41684
    • Karsten Loesing's avatar
      Don't compute bridge descriptor digests. · 6052e22b
      Karsten Loesing authored
      These calls will fail for computing bridge descriptor digests, because
      there are no `-----END SIGNATURE-----` lines in sanitized bridge
      descriptors.
      6052e22b
    • Karsten Loesing's avatar
      Store raw descriptors as byte[], offset, and length. · a4d184bf
      Karsten Loesing authored
      Prior to this commit we read raw descriptor bytes from disk, split
      them into serveral byte[] for each contained descriptor, and stored
      those copies together with descriptors.  We further copied descriptor
      parts, like signatures or status entries, and stored those copies as
      well.
      
      Overall, we temporarily required up to 3 times the size of descriptor
      files just to store raw descriptor contents: 1) the entire descriptor
      file read to memory, 2) copies of all contained descriptors, and 3)
      copies of contained descriptor parts.  After moving on to the next
      descriptor file, 1) was freed, but 2) and 3) remained in memory.  This
      was rather wasteful.
      
      With this commit we store raw descriptors as reference to the byte[]
      containing the entire descriptor file plus offset and length of the
      part containing one descriptor.  Similarly we store raw descriptor
      parts as a reference to the full descriptor plus offset and length of
      the descriptor part.  This saves a lot of memory, and it avoids
      unnecessary array copying.
      
      This change is also a step towards not storing raw descriptor contents
      in memory at all, but instead leaving contents on disk and accessing
      parts as needed.  However, this commit does not take that step yet.
      
      The original purpose of this commit was to prepare switching from the
      platform's default charset to UTF-8 for #21932.  The idea was to
      reduce access to DescriptorImpl#rawDescriptorBytes and add all methods
      working on those bytes, including converting them to a String, to
      DescriptorImpl.  This commit achieves this purpose by preparing that
      switch, yet it does not take that step, either.  Switching to UTF-8 is
      midly backward-incompatible, so it'll have to wait until 2.0.0.
      However, switching will be much easier based on the changes in this
      commit.
      
      Many of these changes in this commit are interdependent which makes it
      difficult to split up this commit with reasonable effort.  Still, in
      order to facilitate reviews, here is an explanation of changes made in
      this commit from top to bottom:
      
      Move all code for processing raw descriptor bytes from a) detecting
      the descriptor type, b) finding descriptor starts and ends, up to c)
      invoking the right DescriptorImpl subclass constructors from
      DescriptorImpl and its subclasses over to DescriptorParserImpl.
      
      Include offset and limit in the constructors of DescriptorImpl and
      most of its subclasses.
      
      Refer to directory and network status parts in RelayDirectoryImpl and
      NetworkStatusImpl and its subclasses by offset and length rather than
      passing copies of raw descriptors.
      
      Provide two overloaded methods DescriptorImpl#newScanner() that
      internally handle the byte[]-to-String conversion rather than leaving
      this task to all DescriptorImpl subclasses.
      
      In DescriptorImpl, rather than storing a copy of raw descriptor bytes
      per descriptor, store a reference to a potentially larger byte[],
      containing all descriptors read from a given file, together with
      offset and length.
      
      Provide various methods in DescriptorImpl that provide access to raw
      descriptor bytes and that internally handle issues like unified
      character encoding.
      
      Include an XXX21932 tag in all places where byte[] is currently
      converted to String using the platform's default charset.
      
      Update existing methods in DescriptorImpl to only access
      rawDescriptorBytes within offset and offset + length.
      
      In classes referenced from DescriptorImpl subclasses, like
      DirSourceEntryImpl and NetworkStatusEntryImpl, rather than storing a
      copy of raw descriptor bytes, store a reference to the parent
      DescriptorImpl instance together with offset and length.
      
      Change raw descriptor bytes in ExitListEntryImpl into a String,
      because the byte[] we stored there was never read from disk but
      generated by ourselves using String#getBytes() using the platform's
      default charset.  We also never used raw bytes in ExitListEntryImpl
      anyway.  Admittedly, we could use offset and length there, too, but
      the amount of saved memory is likely not worth the necessary code
      changes.
      
      Remove redundant zero-length checks from DescriptorImpl subclasses
      including ExitListImpl, NetworkStatusImpl, and RelayDirectoryImpl.
      These checks are redundant, because we already performed the same
      checks in DescriptorImpl#countKeys().
      
      Move commonly used helper methods for finding the first index of a
      keyword or splitting descriptory by keyword from DescriptorImpl
      subclasses, like NetworkStatusImpl and RelayDirectoryImpl, to
      DescriptorImpl.
      
      In test classes, replace the numerous invocations of DescriptorImpl
      subclass constructors with local buildSomething() methods, so that
      future changes to constructor signatures won't produce a diff as long
      as this one.
      a4d184bf
    • Karsten Loesing's avatar
    • Karsten Loesing's avatar
      Fix bug in digest computation. · 232ea426
      Karsten Loesing authored
      232ea426
    • Karsten Loesing's avatar
      Move descriptor digest computation to DescriptorImpl. · 74de0a5d
      Karsten Loesing authored
      The main intention behind this change is to reduce the number of
      places in the code where byte[] is converted to String.  But another
      reason is to reduce code duplication, which would have been sufficient
      to make this change.
      74de0a5d
    • Karsten Loesing's avatar
  9. 04 Jun, 2017 1 commit
  10. 01 Jun, 2017 1 commit
  11. 26 May, 2017 2 commits
  12. 17 May, 2017 1 commit
  13. 16 May, 2017 7 commits
  14. 12 May, 2017 1 commit