I don't understand where/how it gets used or what its purpose is (disclosure: I am not a browser engineer), but "enables 3rd party data-loss prevention software to serve as a gate-keeper" doesn't sound like something we want.
DLP agents are background processes on managed computers that allow enterprises to monitor locally running applications for data exfiltration events. They can allow/block these activities based on customer defined DLP policies.
I looked into this a bit and there's not a minimal/clean way to not include this at compile time unfortunately. A patch could be developed with a bit of effort but more than I'm willing to expend at the moment.
Ok for now this is disabled by pref and it is opt-in via cmd-line to even enable the system. Long-term, we should add a configure option to completely opt-out of including this thing, and attempt to uplift it.
morgan changed title from Disable Content Analysis SDK for DLP to Add configure option to disable Content Analysis SDK
About the flag: I think the code has been updated later, as the variable is called gAllowContentAnalysis in our source tree.
Also, the flag is necessary, in addition to the preference, unless the preference itself was set via enterprise policies (which we disable).
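To make the gating described above concrete, here is a minimal sketch of how I read it. This is not the actual Firefox code: the struct and function names are hypothetical, and only gAllowContentAnalysis is a name from our source tree (represented here by the allowedByCmdLine field).

```cpp
// Hypothetical model of the gating; not taken from the Firefox source.
struct ContentAnalysisGate {
  bool prefEnabled;         // the content analysis preference is turned on
  bool prefSetByPolicy;     // the preference was set via enterprise policies
  bool allowedByCmdLine;    // stands in for gAllowContentAnalysis
};

// The feature is active only if the pref is on AND either the command-line
// flag was given or the pref came from an enterprise policy.
constexpr bool IsContentAnalysisActive(const ContentAnalysisGate& g) {
  return g.prefEnabled && (g.prefSetByPolicy || g.allowedByCmdLine);
}
```

With enterprise policies disabled, prefSetByPolicy can never be true, so under this reading both the pref and the command-line flag are required.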
As far as I can tell from a first analysis, the main directories we'd have to nuke are third_party/content_analysis_sdk and toolkit/components/contentanalysis.
And if I'm not wrong, files of the former are built from the moz.build in the latter.
So, we could simply remove contentanalysis from toolkit/components/moz.build.
There are also other references, such as in toolkit/components/protobuf/regenerate_cpp_files.sh, but this one simply generates some Protobuf files, so it shouldn't be a big deal.
We could try to run a build and see if there's breakage at compile time.
I don't know for runtime.
Also, I don't know if upstream would take such a simple patch, but we can try.
There are several other occurrences of contentanalysis in the codebase, including clipboard, file dialog, printing, and IPC files.
There aren't many in each file, but after excluding third_party/content_an* and toolkit/components/contentan*, I still get 358 results in 40 files.
Some are CSS, some are JS (hopefully lazy loading and gated on the pref), and some are localization files, but there are also some C++ files, which means we'd have to use some #ifdefs, or provide a stubbed API.
I don't think it's worth making that many changes, as the pref will keep this feature off, and we disabled enterprise policies, which would have priority over the preference.
At most, we could change ContentAnalysis::GetIsActive to always return false, but the pref should be enough already (and if you are whistleblowing against your enterprise you should use Tails anyway and exfiltrate files in another way).
Also, I guess somebody having a setup with the required agent would have something to block Tor Browser as well.
I originally tried prototyping conditionally including the ContentAnalysis code with a build flag, and mocking out the content analysis WebIDL implementation with no-op stubs. Unfortunately, a number of non-trivial native functions hang off these implementations as well, and they would need to be either mocked or have their callers ifdef'd out. So the first half was relatively easy, but the second half was more effort than I was willing to expend at the time.
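For illustration, the stubbing approach looks roughly like the sketch below. It is not the prototype itself: the macro SKETCH_ENABLE_CONTENT_ANALYSIS, the class name, and AnalyzeRequest are hypothetical, and only the GetIsActive name is borrowed from ContentAnalysis::GetIsActive mentioned earlier in this thread.

```cpp
#include <string>

// Hypothetical build flag; no such configure option exists today.
// When it is not defined, the class compiles down to no-op stubs.
class ContentAnalysisSketch {
 public:
#ifdef SKETCH_ENABLE_CONTENT_ANALYSIS
  // The real implementation would be declared here and defined elsewhere.
  bool GetIsActive() const;
  bool AnalyzeRequest(const std::string& aData) const;
#else
  // Stub: the feature is never active.
  bool GetIsActive() const { return false; }
  // Stub: every request is allowed without consulting any agent.
  bool AnalyzeRequest(const std::string&) const { return true; }
#endif
};
```

Callers that only use the class interface keep compiling unchanged. The hard part is the native functions that reach past such an interface, which this sketch does not address.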
I think longer-term it would be nice for these various systems to be opt-in at compile time, rather than opt-out at runtime, and for that to be uplifted to Mozilla (some things already are, which is great, e.g. WebRTC, parental controls, etc.). Especially given the ever-looming threat of APK size restrictions on Android.