This may be easy for someone who has a bunch of different OSes and writes some test JS to print out values from some audio manipulation tests using these buffer extraction functions. Well, easy to test for differences anyway.
Trac: Keywords: ff31-esr deleted, ff38-esr added Summary: Determine if AudioBuffers are a fingerprinting vector to Determine if AudioBuffers/OfflineAudioContext are a fingerprinting vector
"This page tests browser-fingerprinting using the AudioContext and Canvas API. Using the AudioContext API to fingerprint does not collect sound played or recorded by your machine - an AudioContext fingerprint is a property of your machine's audio stack itself. "
I have three different machines, one Windows and two Linux ones and I can verify that for each different machine using Tor Browser 5.5.5 the fingerprints are exactly the same for each machine.
The fingerprinting persists on OS reboots, Tor Browser restarts and using Tor Browser's "New identity".
I have tested using the website above (https://audiofingerprint.openwpm.com) in Tor Browser 5.5.5 and have done the test three times for each machine.
This is very problematic, hope this can be fixed soon! Thanks all!
Edit: Sorry I thought fingerprintjs2 used the same "Audio fingerprinting" as Princeton's test.
So for those who want to do the test themselves, use https://audiofingerprint.openwpm.com. You might want to first enable JavaScript, put yourself in Offline mode, and then do the test, so that no information about your machine/browser is sent to Princeton.
Trac: Severity: Normal to Critical Priority: Medium to Very High
Hm... if they are exactly the same for each machine, isn't that a good thing? It lets you hide in the crowd, which is our strategy to beat fingerprinters. That said, I tested it as well with two different Linux machines (and distributions) and on a Windows computer. I got the same fingerprint for the Linux machines but a different one with Windows (which is on one of the Linux boxes, too). Thus, this seems to support the theory that this is an OS-fingerprinting problem. Or did I miss anything?
Trac: Severity: Critical to Normal Priority: Very High to Medium Status: new to needs_information
The fingerprints are the same for Tor Browser 5.5.5 on each machine individually independent of browser, OS, or computer restarts.
So each Tor Browser can be uniquely identified. This is very problematic ...
Trac: Severity: Normal to Critical Priority: Medium to Very High
fingerprintjs2 does not (currently) contain tests for AudioContext fingerprinting: https://github.com/valve/fingerprintjs2/. The OpenWPM page has the tests in an embedded script tag (it has several implementations, apparently observed in the wild Internet).
In my case, I get the same fingerprint on 2 different computers with the same OS and same Tor Browser version. But changing Tor Browser version (I tried with 5.5.5 and 6.0a5) changed the fingerprint.
I've done some investigation of the fingerprinting via the Web Audio API. As far as I can tell, the source code for the Web Audio audio processing algorithms, in mozilla-central's dom/media/webaudio/ directory, is doing computations that run on the CPU/FPU only. That is, I don't see any evidence for acceleration of these algorithms on audio hardware, GPUs, or other special platform-specific tricks.
I also specifically examined the API calls used for fingerprinting in view-source:https://audiofingerprint.openwpm.com/, and tracked down their C/C++ implementations and the helper libraries they depend on (primarily Kiss FFT and libav/FFT) in the Firefox codebase. There's nothing I found that indicates OS- or hardware-specific algorithms.
So that suggests to me that we shouldn't expect radically more fingerprinting than is already observed via the JS Math API (as we discuss in legacy/trac#13018 (moved)). And, if we are able to find partial defenses for Math-based fingerprinting, such as bundling our own math libraries or setting certain compiler flags, then I would expect these would help to defend against Web Audio fingerprinting attacks as well.
It is possible, however, that the Web Audio API provides an efficient way to sample the space of floating point arithmetic operations to find differences between platforms that would be difficult to find manually. It's also possible that extensive use of somewhat complex numerical algorithms in the Web Audio source code and helper libraries provide more possibilities for floating point discrepancies than can be observed in the relatively simple JS Math interface. So in that sense this API might be a little extra dangerous.
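As a rough illustration of the kind of floating point discrepancy meant here, the sketch below adds the same three numbers in two different orders. It is not taken from the Web Audio source; it only shows that two implementations of the "same" computation can produce different bit patterns, which is all a fingerprinter needs. The helper name `bits` is made up for this example.

```python
import math
import struct


def bits(x):
    """Return the IEEE 754 double bit pattern of x as a hex string."""
    return struct.pack(">d", x).hex()


# The same three operands, grouped two different ways.
left_to_right = (0.1 + 0.2) + 0.3
right_to_left = 0.1 + (0.2 + 0.3)

if __name__ == "__main__":
    print(bits(left_to_right), left_to_right)
    print(bits(right_to_left), right_to_left)
    # Numerically close, but not bit-identical.
    print(math.isclose(left_to_right, right_to_left))
```

A math library or compiler that reorders or fuses operations could plausibly introduce differences of this size, and a long signal processing chain gives such differences many chances to show up in the output.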
The Web Audio API looks to me like something that would only have occasional legitimate uses. Most sites using audio do not need to do any sound processing on the fly. Many games need only to play sound samples, which can be done with <audio> elements and don't require Web Audio. Uses for Web Audio I can think of include 3D games or other immersive content, music sequencers or audio/video editing apps. So, because these are fairly unusual, I think one efficient defense would be to prompt the user before allowing content to instantiate an AudioContext object, very similar to how we prompt before HTML5 Canvas image extraction (legacy/trac#6253 (closed)).
I think the prompt is a good solution if indeed the Web Audio API reveals more about a browser/machine/OS than the JS Math interface. If not, fixing the JS Math interface should fix this problem? Not sure...
Can we find two Linux systems that have the same bit-width but different fingerprints here (e.g., Debian stable vs. Fedora, or any pair with a large time difference between releases and differences in the base system)?
If so, a useful test would be to copy the same libm.so (found in either /lib/x86_64-linux-gnu/libm.so.6 or /usr/lib/x86_64-linux-gnu/libm.so on my system) into the TorBrowser/Tor/ directory of both TBB installs and see whether the fingerprints become identical again. That Tor directory should be in LD_LIBRARY_PATH, overriding the /lib search. You can check /proc/pid/maps to see if it got loaded.
This simple test would help us determine if we're just looking at math routine differences. In which case, we could ship the equivalent of a uniform set of math routines for all platforms for TBB, and use those instead.
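The /proc/pid/maps check could be scripted roughly as follows. This is a minimal sketch, assuming a Linux system and that the library file name starts with `libm.so` or `libm-`; the function names `loaded_libm_paths` and `libm_for_pid` are made up for this example.

```python
from pathlib import Path
from typing import List


def loaded_libm_paths(maps_text: str) -> List[str]:
    """Return the distinct libm paths found in the text of /proc/<pid>/maps."""
    paths: List[str] = []
    for line in maps_text.splitlines():
        # Fields: address, perms, offset, dev, inode, pathname (optional).
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue  # anonymous mapping, no pathname
        pathname = fields[5].strip()
        name = pathname.rsplit("/", 1)[-1]
        if name.startswith("libm.so") or name.startswith("libm-"):
            if pathname not in paths:
                paths.append(pathname)
    return paths


def libm_for_pid(pid: int) -> List[str]:
    """Read /proc/<pid>/maps for a running process and list its libm mappings."""
    return loaded_libm_paths(Path(f"/proc/{pid}/maps").read_text())
```

Calling `libm_for_pid` with the browser's process ID should list a path under TorBrowser/Tor/ if the copied library was picked up, and a system path such as /lib/x86_64-linux-gnu/ otherwise.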
I have been running the https://audiofingerprint.openwpm.com/ test on one computer with 3 different linux distributions using docker (so the same kernel was used): Fedora 22, Debian Jessie, Debian Wheezy.
The Fingerprint using DynamicsCompressor (sum of buffer values) line was the same in all cases: 35.74996018782258
The Fingerprint using DynamicsCompressor (hash of full buffer) was the same on Fedora 22 and Debian Jessie: 158e8189a3551fe4f2e564ac377b0f1e588a1ab3
But it was different on Debian Wheezy: 205ae8bb7897e9c9faa399d83bbcdc704a9962a1
After putting a copy of a libm.so.6 from Fedora in the Browser/TorBrowser/Tor/ directory and running it again on Wheezy, the hash of full buffer value became the same as on the 2 other distributions.
So it looks like the libm.so.6 used affects the hash of full buffer.
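To illustrate why a hash of the full buffer is much more sensitive than a sum, here is a small sketch. It does not reproduce the test page's code; the synthetic signal, the choice of SHA-1 and the helper names are assumptions made for the example. It changes a single float32 sample by the smallest possible step and compares both measures.

```python
import hashlib
import math
import struct


def to_float32(value):
    """Round a Python float to the nearest float32, as an audio buffer stores it."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


def nudge_one_ulp(value):
    """Return the adjacent float32 of larger magnitude (value nonzero and finite)."""
    (raw,) = struct.unpack("<I", struct.pack("<f", value))
    return struct.unpack("<f", struct.pack("<I", raw + 1))[0]


def buffer_sum(samples):
    """Sum of absolute sample values."""
    return sum(abs(s) for s in samples)


def buffer_hash(samples):
    """SHA-1 over the little-endian float32 bytes of the whole buffer."""
    raw = struct.pack("<%df" % len(samples), *samples)
    return hashlib.sha1(raw).hexdigest()


# Synthetic stand-in for a rendered buffer; not real Web Audio output.
reference = [to_float32(0.5 * math.sin(0.01 * i)) for i in range(1, 1001)]
variant = list(reference)
variant[500] = nudge_one_ulp(variant[500])  # one sample, one bit of difference

if __name__ == "__main__":
    print(buffer_sum(reference), buffer_sum(variant))
    print(buffer_hash(reference), buffer_hash(variant))
```

The hash changes completely while the sum moves only far out in the decimal places. If the page sums only part of the buffer (I have not checked), a difference outside that range would leave the sum exactly unchanged, which would match the result above.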
What about Fingerprint using OscillatorNode and Fingerprint using hybrid of OscillatorNode/DynamicsCompressor method ?
When I tested it (see comment 22) those were all different from each other, haven't looked too closely if the values remained the same though.
I run browserprint.info which has these tests on it and can confirm that the tests appear to be stable among regular browsers.
We looked at sets of multiple fingerprints that were made by the same browser.
We found that "Fingerprint using DynamicsCompressor (sum of buffer values):", "Fingerprint using DynamicsCompressor (hash of full buffer):", and "AudioContext properties:" were very stable (between 1 and 10 instances out of ~500 of them changing between fingerprints).
The other two were less stable but the majority of the time they were consistent (56 and 87 instances of them changing between fingerprints out of ~500).
Hello! Developer of Fingerprint Central here!
The website is still in beta but thanks to several visitors, it seems that we can already have an early insight on some AudioContext attributes from the 40 TBB fingerprints that were collected. I added the tests found from the OpenWPM Study and you can see some results below that I found the most relevant.
| N° | Count | Percentage | User-Agent | pxi buffer hash | ac-sampleRate | ac-maxChannelCount |
|---|---|---|---|---|---|---|
| 1 | 21 | 60.00% | "Mozilla/5.0 (Windows NT 6.1; rv:45.0) Gecko/20100101 Firefox/45.0" | 158e8189... | 44100 | 2 |
| 2 | 4 | 11.43% | "Mozilla/5.0 (Windows NT 6.1; rv:45.0) Gecko/20100101 Firefox/45.0" | 89cad797... | 48000 | 2 |
| 3 | 3 | 8.57% | "Mozilla/5.0 (Windows NT 6.1; rv:45.0) Gecko/20100101 Firefox/45.0" | 4baefb24... | 44100 | 2 |
| 4 | 1 | 2.86% | "Mozilla/5.0 (Windows NT 6.1; rv:45.0) Gecko/20100101 Firefox/45.0" | 89cad797... | 96000 | 2 |
| 5 | 1 | 2.86% | "Mozilla/5.0 (Windows NT 6.1; rv:45.0) Gecko/20100101 Firefox/45.0" | 4baefb24... | 48000 | 2 |
| 6 | 1 | 2.86% | "Mozilla/5.0 (Windows NT 6.1; rv:38.0) Gecko/20100101 Firefox/38.0" | 158e8189... | 48000 | 2 |
| 7 | 1 | 2.86% | "Mozilla/5.0 (Windows NT 6.1; rv:45.0) Gecko/20100101 Firefox/45.0" | e8a01cca... | 44100 | 2 |
| 8 | 1 | 2.86% | "Mozilla/5.0 (Windows NT 6.1; rv:45.0) Gecko/20100101 Firefox/45.0" | 4baefb24... | 44100 | 10000 |
| 9 | 1 | 2.86% | "Mozilla/5.0 (Windows NT 6.1; rv:45.0) Gecko/20100101 Firefox/45.0" | 158e8189... | 44100 | 32 |
| 10 | 1 | 2.86% | "Mozilla/5.0 (Windows NT 6.1; rv:45.0) Gecko/20100101 Firefox/45.0" | 158e8189... | 44100 | 0 |
I don't know if it can be generalized to the majority of the TBB population but it seems that most users should have the same combination of Sample rate/Channel count/Buffer hash. However, differences can still be observed between sample rate (44100Hz/48000Hz/96000Hz) and max channel count (0/2/32/10000) and users without the most common values may be more prone to fingerprinting than others. I added the hash to see if there was a link between these attributes and the rendered audio but this needs more investigation as noted by #comment:26.
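To put a rough number on "more prone to fingerprinting", one could compute the surprisal of each combination from the counts listed above. The sketch below does that; it is an illustration on this small sample only, and the names `ROWS`, `surprisal_bits` and `marginal` are made up for the example. Note that the listed counts add up to 35, which is the total the percentages are based on.

```python
import math
from collections import Counter

# (count, buffer hash prefix, sample rate, max channel count), copied from the rows above.
ROWS = [
    (21, "158e8189", 44100, 2),
    (4, "89cad797", 48000, 2),
    (3, "4baefb24", 44100, 2),
    (1, "89cad797", 96000, 2),
    (1, "4baefb24", 48000, 2),
    (1, "158e8189", 48000, 2),
    (1, "e8a01cca", 44100, 2),
    (1, "4baefb24", 44100, 10000),
    (1, "158e8189", 44100, 32),
    (1, "158e8189", 44100, 0),
]

total = sum(row[0] for row in ROWS)


def surprisal_bits(count, total):
    """Bits of identifying information revealed by a value seen count times out of total."""
    return -math.log2(count / total)


def marginal(index):
    """Distribution of a single attribute (1=hash, 2=sample rate, 3=channel count)."""
    counts = Counter()
    for row in ROWS:
        counts[row[index]] += row[0]
    return counts


if __name__ == "__main__":
    for count, hash_prefix, rate, channels in ROWS:
        print(hash_prefix, rate, channels, round(surprisal_bits(count, total), 2))
    print(marginal(2), marginal(3))
```

With so few samples the figures are only indicative, but they make the gap between the most common combination and the singletons easy to compare.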
I like this approach. Here's a patch that sets the pref to false. And we could keep this ticket open to work on a patch to homogenize the fingerprint instead in the future.