Implement lazy parsing for stem
Damian and I briefly discussed lazy parsing (see the log below) and how it could speed up handling descriptor data. It might not be much work for zoossh, so it may be worth implementing.
18:28 <atagar> phw: Side note concerning zoossh, another option could be lazy parsing for descriptors. If I was to do stem's parsers again that's what I'd opt for to make them more performant. That would be a fair bit of work, but would both benefit all stem users and have performance just as fast as any Go solution (time would all be IO).
18:29 <atagar> That said though, Zoossh seems like a great way of learning the language so if that's the goal have fun. :)
18:36 <phw> atagar: that's actually a good idea, thanks
18:39 <atagar> phw: Oh! If you're interested then please open a ticket under the Stem component. This is something I've idly given some thought to for over a year but never bothered to actually jot down the idea. ;P
18:39 <atagar> Didn't expect you to actually think about opting for this route.
18:41 <atagar> Thought was that reading a descriptor dumps to a simple object that's a {keyword: [lines...]} dictionary. The getter methods then parse the actual content and cache the results. Upside: far, far faster since you only parse the fields you care about, downside: no upfront validation is done so malformed content would be acceptable.
18:42 <atagar> That said, validation is a far, far smaller concern for our users than performance in practice so this is a tradeoff I'd be fine with.
18:42 <atagar> We could then have a validate() method that simply calls all the getters to achieve the same thing we do now.
18:45 <atagar> Previously I thought that doing this would break backward compatibility which made me a little less keen on it (since we'd then need 'descriptor v2' objects) but on reflection it doesn't. We could slip this in transparently. The only difference users would see would be a tremendous speedup if they opt to not have validation.
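
To make the idea from the log concrete, here is a minimal sketch of the scheme Damian describes: reading a descriptor only splits it into a {keyword: [lines...]} dictionary, getters parse on first access and cache the result, and validate() simply calls every getter. The class name LazyDescriptor, the _lazy helper, and the two example fields are hypothetical and chosen for illustration. This is not stem's actual API, and the field handling is a simplified take on the "router" and "bandwidth" lines of a server descriptor.

```python
from typing import Callable, Dict, List, Tuple


class LazyDescriptor:
    """Hypothetical lazily parsed descriptor (illustration only, not stem's API)."""

    def __init__(self, raw: str) -> None:
        # Cheap upfront pass: only split each line into keyword and remainder.
        # No field is interpreted or validated here.
        self._entries: Dict[str, List[str]] = {}
        self._cache: Dict[str, object] = {}
        for line in raw.splitlines():
            if not line.strip():
                continue
            keyword, _, value = line.partition(" ")
            self._entries.setdefault(keyword, []).append(value)

    def _lazy(self, name: str, parse: Callable[[], object]) -> object:
        # Parse a field the first time it is requested, then reuse the result.
        if name not in self._cache:
            self._cache[name] = parse()
        return self._cache[name]

    @property
    def nickname(self) -> str:
        # Assumes a line of the form "router <nickname> <address> ...".
        return self._lazy("nickname", lambda: self._entries["router"][0].split()[0])

    @property
    def bandwidth(self) -> Tuple[int, ...]:
        # Assumes a line of the form "bandwidth <avg> <burst> <observed>".
        return self._lazy(
            "bandwidth",
            lambda: tuple(int(v) for v in self._entries["bandwidth"][0].split()),
        )

    def validate(self) -> None:
        # Touch every getter so malformed content raises, as eager parsing would.
        for name in ("nickname", "bandwidth"):
            getattr(self, name)
```

With this shape, a caller that only needs nicknames never pays for parsing the bandwidth line, and a malformed bandwidth line goes unnoticed until either that getter or validate() is called. That is the tradeoff between speed and upfront validation discussed above.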