Relevant to legacy/trac#13792 (moved), we (nickm and I) discussed the best way to collect and gather statistics and measurements in a private network for performance analysis and debugging.
This would be completely deactivated at compile time, meaning the tracepoints would be NOPs; you would have to explicitly enable the feature during the configure process.
We think the best way to go, in terms of code, is a header with something that could look like this:
The really good things about this approach are that 1) everything is centralized in one place, 2) the call becomes a NOP if not configured, so there is no performance issue, and 3) we can support lots of different tracers as well as Shadow. It provides a simple way to hook any other tool into the trace event facility of the code.
I've talked to David a bit about this, and I do think it's a good way to run performance testing on private networks. It could also IMO make Shadow output easier to analyze, if the two facilities can be made to cooperate.
Preliminaries:
- We should do a little research to make sure that there isn't some other important userspace instrumentation API that we should keep in mind for maintaining compatibility with. So far, it doesn't look like there is.
- This should be, as noted above, controlled by a compile-time and a runtime option; both should be off by default.
- We should make sure to document that this is for researchers and developers, not something that you should be running on a live relay.
As of now, I have 17 very useful tracepoints, mostly in the hidden service subsystem. I don't think they are ready to be merged right now, since we have yet to confirm that all of them are useful for our performance measurements and that the data they record is actually what we need (or whether anything is missing).
I suggest we move this to 0.2.7, because I would like to see it upstream soonish, but with more eyes on the usefulness of the tracepoints. At the next Tor dev meeting, we should take a couple of minutes to discuss this in person, with a real use case I can demonstrate.
Trac: Summary changed from "Add LTTng instrumentation to tor" to "Add instrumentation to tor". Description changed to drop the LTTng-specific wording: the old text proposed using LTTng (https://lttng.org) to gather events from the different subsystems of tor (in userspace, of course) and said we could support other tools like DTrace or SystemTap and possibly Windows tools; the new text instead says we can support lots of different tracers as well as Shadow. Cc: rob.g.jansen@nrl.navy.mil added as robgjansen.
The crypto-instrumentation part of this is in for Nov if we can do it.
ACK. I have a branch that needs cleanup which adds the basics (build system integration and templates) for any kind of low-level instrumentation, onto which we can then hook a tracer, Shadow, or anything else we'd like to collect the data with. I'll make sure to have it available soon so the crypto work can start.
For the crypto instrumentation, I'll need Yawning or nickm. Basically, if you can give me the call sites you want to instrument, that is, the precise locations in the code and the payload to record, I can start adding them to the branch so you can see the mechanism and design involved.
Let me know if you would like to proceed differently. FYI, I plan to work on this with Rob in mid-November to instrument the connection buffers for his experiment, and to get that upstream as well.
This would be a step that would be impossible to undo, and would be potentially very negative.
I see LTTng has pretty documentation, though. Why don't we use their documentation formatter?
> This would be a step that would be impossible to undo, and would be potentially very negative.
Low-level instrumentation is indeed something that, once added, has to be considered long-lived and thus quite stable. I don't see that as negative, because it's useful, and as long as we think carefully about which events we add, it should be fine. The pros outweigh the cons, IMO. It will become an external "ABI/API", like our control port.
> I see LTTng has pretty documentation, though. Why don't we use their documentation formatter?
Not sure I understand here. LTTng does indeed have pretty documentation on their website, but it's first and foremost a tracer that can be hooked into the instrumentation events we'll add, not something baked into tor.
I'm working with Rob (through legacy/trac#17598 (moved)) on adding a bunch of instrumentation points and a Shadow component to hook onto them, so Rob can conduct the KIST experiment (with and without Shadow).
The goal is to submit the instrumentation and a tracing framework with some meat on the bone, so we can start using it directly once merged, and so other developers have a concrete example of how to add more in the future.
To look at the work or follow it, see branch ticket13802_028_01, which is still in heavy development (needs cleanup) but can at least give an idea of the direction and structure of things.
Depending on Rob's time to test and review, an optimistic ETA is the end of the month.
It is branched off of a slightly outdated release-0.2.8.
Differences from dgoulet's work
- remove all/most of the included shadow traces
- add a shadow trace on inbuf
- add a shadow trace when a cell is written to outbuf
- add a shadow trace when the outbuf is partially flushed
The shadow traces have been extensively tested as I'm using them. dgoulet's generalized code works wonderfully, and I don't think I really changed any of it.
IMO the code can be merged (after review) as my testing shows it's working. It's possible, though unlikely, that I'll need to change the shadow-specific traces.
OK, so I've cleaned up the branch a bit: squashed all the commits into one and rebased it onto master. I've also improved the comments and made minor fixes.
I think this is ready for merge; it is working fine. Note that this is just the "tracing infrastructure": it is not used anywhere in the code base, and you need to specifically enable the "tracepoint to log_debug()" framework at compile time (--enable-tracing-debug).
The point of having this skeleton is that once it's in, we can start adding tracepoints to the subsystems we want to trace, using a specific framework (here we only have the log_debug() framework, and even that is very basic). For instance, the KIST work at NRL uses this infrastructure to route their tracepoints to log_debug(), but with some extra code around it. See doc/HACKING/Tracing.md for more information.
@nickm, if you prefer having it used in the codebase before merging, I'm fine with that; it means we defer this until we actually have tracepoints upstream. I think our friends at NRL can survive without it upstream for now, because most of their stuff is a "new framework" plugged into this infrastructure.
Please rename 'tracing' to 'event tracing' everywhere. Otherwise, somebody will see the word "tracing" and freak out.
(I'm sure somebody will see the word "tracing" and freak out anyway, but let's at least give them a better chance to realize that it's silly to complain about debugging code.)