At this point the branch only handles the Linux bundles. Linux is easier because we can control font behavior with a fonts.conf file without patching the browser source code. I added two fonts to the bundle, Droid and Lohit. Droid covers all the languages of TBB, and Lohit additionally covers various Indic scripts.
I started looking at Firefox changes to disable system fonts and use only bundled fonts. comment:1 shows that it's easy to do on Linux with fonts.conf. It remains to do it on other platforms.
I started by trying to disable system fonts for DirectWrite. DirectWrite is one of two font rendering APIs used on Windows (the other is GDI). It doesn't quite work yet. The system fonts are not loaded, and bundled fonts are loaded. But all text is rendered as squares, even in browser chrome—except, curiously, Georgian text. It seems that for whatever reason the only font displayed by default is Droid Sans Georgian, even though you can select others from the Content menu. More on that below.
This is the code I tried: my fonts branch of tor-browser-bundle.git, and a patch against tor-browser.git. The tor-browser.git patch is Mozilla #998844 (for --enable-bundled-fonts), plus a dummy loader for system fonts in the DirectWrite renderer.
I had to set
gfx.font_rendering.directwrite.enabled=true
and restart in order to enable DirectWrite. (Running in KVM, about:support says "Direct2D Enabled: Blocked for your graphics card because of unresolved driver issues." and "DirectWrite Enabled: false (6.2.9200.16581)".)
Here's what it looks like:
Almost everything is rendered as boxes, except for the Georgian text. I copy-pasted from the Fonts panel in the Inspector into a text editor, which shows that the "Droid Sans Georgian" font is being used. However, if I go into the Content menu, I can select Droid Sans (by looking for the right pattern of boxes, "▯▯▯▯▯ ▯▯▯▯"), and then the Latin text shows up properly (not on the Wikipedia page, but on other pages).
If I delete fonts/DroidSansGeorgian.ttf, then the font that gets loaded is instead Lohit Oriya, and is similarly broken. The boxes in Lohit Oriya have a noticeably different shape. Perhaps the font selection is governed by ordering in an internal hash table or something.
I turn on fontlist logging with
set NSPR_LOG_MODULES=fontlist:5
cd Browser
firefox.exe -console
and I see this on the console:
0[1197208]: (fontlist-postscript) name: Droid Sans Georgian Regular, psname: Droid SansGeorgian
0[1197208]: (fontlist-fullname) name: Droid Sans Georgian Regular, fullname: Droid Sans Georgian
0[1197208]: (fontlist) added (Droid Sans Georgian Regular) to family (Droid Sans Georgian) with style: normal weight: 400 stretch: 0 psname: DroidSansGeorgian fullname: Droid Sans Georgian
0[1197208]: (fontlist) added (Droid Sans Georgian Bold) to family (Droid Sans Georgian) with style: normal weight: 700 stretch: 0 psname: DroidSansGeorgian fullname: Droid Sans Georgian
0[1197208]: (fontlist-cmap) name: Droid Sans Georgian Bold, size: 304 hash: 29410df0 new
0[1197208]: (fontlist-cmap) name: Droid Sans Georgian Regular, size: 1880 hash:54f67428 new
A few seconds later, while it's sitting at the about:tor screen, I see it load the rest of the fonts. The full log is in fontlist.log. (However, note that only Droid Sans Georgian has fontlist-cmap lines.)
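To check the "only Droid Sans Georgian has fontlist-cmap lines" observation across the whole of fontlist.log, something like the following could be used. It is a rough sketch that assumes the log records look like the excerpt above; the regular expressions and function name are my own and not part of any Firefox tooling.

```python
import re
from collections import defaultdict

# Assumed record shapes, taken from the excerpt above:
#   (fontlist) added (<face>) to family (<family>) with style: ...
#   (fontlist-cmap) name: <face>, size: <n> hash: <h> new
ADDED_RE = re.compile(r"\(fontlist\) added \((?P<face>.+?)\) to family \((?P<family>.+?)\)")
CMAP_RE = re.compile(r"\(fontlist-cmap\) name: (?P<face>.+?), size: \d+")


def faces_without_cmap(log_text):
    """Return {family: [faces]} for faces that were added to the font
    list but never produced a fontlist-cmap record."""
    # Scan the whole text rather than line by line, so that records
    # fused onto one line are still found.
    added = {m.group("face"): m.group("family") for m in ADDED_RE.finditer(log_text)}
    with_cmap = {m.group("face") for m in CMAP_RE.finditer(log_text)}

    missing = defaultdict(list)
    for face, family in added.items():
        if face not in with_cmap:
            missing[family].append(face)
    return dict(missing)
```

Calling `faces_without_cmap(open("fontlist.log").read())` would list every family whose faces were registered but whose character maps were never read.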
textperf.log was empty.
The most interesting one appears to be cmapdata.log:
0[c07268]: (cmapdata) name: Droid Sans Georgian Bold u+000000 [80040000 80000000 00000000 00000000 00000000 00000000 00000000 00000000]
0[c07268]: (cmapdata) name: Droid Sans Georgian Bold u+001000 [00000000 00000000 00000000 00000000 00000000 ffffffff fc00ffff fffffff8]
0[c07268]: (cmapdata) name: Droid Sans Georgian Bold u+002d00 [ffffffff fc000000 00000000 00000000 00000000 00000000 00000000 00000000]
0[c07268]: (cmapdata) name: Droid Sans Georgian Regular u+000000 [00000000 ffffffff ffffffff fffffffe 00000000 ffffffff ffffffff ffffffff]
0[c07268]: (cmapdata) name: Droid Sans Georgian Regular u+000100 [ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff]
0[c07268]: (cmapdata) name: Droid Sans Georgian Regular u+000200 [ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff]
0[c07268]: (cmapdata) name: Droid Sans Georgian Regular u+000300 [ffffffff ffffffff ffffffff ffffff3e 0febffff dfffffff ffffffff ffffffff]
0[c07268]: (cmapdata) name: Droid Sans Georgian Regular u+000400 [ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff]
0[c07268]: (cmapdata) name: Droid Sans Georgian Regular u+000500 [ffffffff ff000000 00000000 00000000 00007fff ffffffff ff00ffff ffe0f800]
0[c07268]: (cmapdata) name: Droid Sans Georgian Regular u+000600 [fbfffff3 ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff]
0[c07268]: (cmapdata) name: Droid Sans Georgian Regular u+000700 [00000000 00000000 0000ffff ffffffff 00000000 00000000 00000000 00000000]
0[c07268]: (cmapdata) name: Droid Sans Georgian Regular u+000800 [00000000 00000000 00000000 00000000 00000000 bff80000 00000000 0ffffffe]
0[c07268]: (cmapdata) name: Droid Sans Georgian Regular u+001d00 [ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffe00000 00000003]
0[c07268]: (cmapdata) name: Droid Sans Georgian Regular u+001e00 [ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff]
0[c07268]: (cmapdata) name: Droid Sans Georgian Regular u+001f00 [fffffcfc ffffffff fcfcff55 fffffffc ffffffff fffffbff fbfff3f7 ffff3bfe]
0[c07268]: (cmapdata) name: Droid Sans Georgian Regular u+002000 [ffff3fff e23fb86e 08000002 003f8fc1 0000f800 ffffffe4 00000000 00008000]
0[c07268]: (cmapdata) name: Droid Sans Georgian Regular u+002100 [04001300 22020000 0006181e 00000000 0800fc00 00800000 00000000 00000000]
0[c07268]: (cmapdata) name: Droid Sans Georgian Regular u+002200 [22016463 00500000 00800000 cc000000 00000000 00000000 00000000 00000000]
0[c07268]: (cmapdata) name: Droid Sans Georgian Regular u+002300 [20008000 c0000000 00000000 00000000 00000000 00000000 00000000 00000000]
0[c07268]: (cmapdata) name: Droid Sans Georgian Regular u+002500 [a0088888 08080808 0000ffff fff80000 8888f000 c0382028 083900c0 02000000]
0[c07268]: (cmapdata) name: Droid Sans Georgian Regular u+002600 [00000000 00000038 a0000000 96310000 00000000 00000000 00000000 00000000]
0[c07268]: (cmapdata) name: Droid Sans Georgian Regular u+002c00 [00000000 00000000 00000000 ffffffff 00000000 00000000 00000000 00000000]
0[c07268]: (cmapdata) name: Droid Sans Georgian Regular u+002e00 [00000100 00000000 00000000 00000000 00000000 00000000 00000000 00000000]
0[c07268]: (cmapdata) name: Droid Sans Georgian Regular u+00a700 [000001ff c0000000 00000000 00000000 00f80000 00000000 00000000 00000000]
0[c07268]: (cmapdata) name: Droid Sans Georgian Regular u+00fb00 [60000007 fffffefa dbffffff ffffffff ffffffff ffffffff c0001fff ffffffff]
0[c07268]: (cmapdata) name: Droid Sans Georgian Regular u+00fc00 [00000000 00000000 00000003 f0000000 00000000 00000000 00000000 00003800]
0[c07268]: (cmapdata) name: Droid Sans Georgian Regular u+00fd00 [00000000 0000000f 00000000 00000000 00000000 00000000 00000000 0000283c]
0[c07268]: (cmapdata) name: Droid Sans Georgian Regular u+00fe00 [00000000 f0000000 00000000 0000fbff ffffffff ffffffff ffffffff fffffff8]
0[c07268]: (cmapdata) name: Droid Sans Georgian Regular u+00ff00 [00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000008]
If I guess at the meaning right, it looks like "Droid Sans Georgian Regular" is claiming support for lots of code points, including most of ASCII. Maybe it doesn't actually support those, but Firefox believes it does, so it doesn't try loading any other fonts.
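To make that guess checkable, here is a small decoder for one cmapdata record. It rests on an assumption I have not confirmed against the source: each record is a 256-code-point block starting at the u+ offset, written as eight 32-bit hex words with the most significant bit first, so the leftmost bit is the lowest code point. That reading is at least consistent with the Bold u+001000 block, which decodes to the Georgian range. The function names are mine.

```python
import re

# Assumed record shape: "(cmapdata) name: <face> u+<base> [<8 hex words>]"
LINE_RE = re.compile(
    r"\(cmapdata\) name: (?P<face>.+?) u\+(?P<base>[0-9a-f]{6}) \[(?P<words>[0-9a-f ]+)\]"
)


def decode_cmapdata(line):
    """Return (face, [covered code points]) for one cmapdata log record.

    Assumption: the bitmap is MSB-first, i.e. the leftmost bit of the
    first word corresponds to the block's base code point.
    """
    m = LINE_RE.search(line)
    if m is None:
        raise ValueError("not a cmapdata record")
    base = int(m.group("base"), 16)
    # Concatenate the words into one string of '0'/'1' characters.
    bits = "".join(f"{int(word, 16):032b}" for word in m.group("words").split())
    covered = [base + i for i, bit in enumerate(bits) if bit == "1"]
    return m.group("face"), covered


def to_ranges(code_points):
    """Collapse a sorted list of code points into inclusive (start, end) ranges."""
    ranges = []
    for cp in code_points:
        if ranges and cp == ranges[-1][1] + 1:
            ranges[-1] = (ranges[-1][0], cp)
        else:
            ranges.append((cp, cp))
    return ranges
```

Under that assumption, the Regular u+000000 record decodes to U+0020 through U+007E plus U+00A0 through U+00FF, which is printable ASCII and the upper half of Latin-1. The Bold u+000000 record decodes to only U+0000, U+000D and U+0020.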
If I set the prefs,
font.name.sans-serif.x-unicode="Droid Sans"
font.name.sans-serif.x-western="Droid Sans"
font.name.serif.x-unicode="Droid Serif"
font.name.serif.x-western="Droid Serif"
then it begins to work right, for Latin text. Presumably we could set the mappings for all our fonts in all.js. The font used for browser chrome is still messed up.
I got some advice from Firefox maintainers. One suggests that the best way to do the job is to put code in gfxPlatformFontList that filters the system font list (based on file name; we check that the names are within the whitelisted bundle directory). Another says that in order to disable system fallback, one should set the pref gfx.font_rendering.fallback.always_use_cmaps to true, which will cause the renderer to explicitly iterate through the list of known fonts.
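The first suggestion amounts to a path check on every font file the platform reports. The sketch below shows only that check, in Python rather than in gfxPlatformFontList's C++. The function names and the example paths are hypothetical and nothing here is taken from the Firefox source.

```python
from pathlib import Path


def is_bundled(font_path, bundle_dir):
    """True if font_path resolves to a file somewhere inside bundle_dir.

    Both paths are resolved first, so "fonts/../../elsewhere/x.ttf"
    does not count as being inside the bundle directory.
    """
    font = Path(font_path).resolve()
    root = Path(bundle_dir).resolve()
    return root in font.parents


def filter_font_list(font_paths, bundle_dir):
    """Keep only the fonts that live under the bundle directory."""
    return [p for p in font_paths if is_bundled(p, bundle_dir)]
```

For example, with a hypothetical bundle directory `/opt/tbb/Browser/fonts`, a font at `/opt/tbb/Browser/fonts/DroidSans.ttf` would be kept and one at `/usr/share/fonts/x.ttf` would be dropped.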
Yes, it's that branch plus this tor-browser.git patch:
4.5-alpha-2-fonts-1.patch
The patch only covers DirectWrite for Windows, and resulted in the weird graphical effects in comment:3. In order to use it, you may have to set the pref
gfx.font_rendering.directwrite.enabled=true
The whitelisted fonts are given in the font.system.whitelist pref in about:config. They are:
Cousine, Noto Kufi Arabic, Noto Naskh Arabic, Noto Sans, Noto Sans Armenian, Noto Sans Bengali, Noto Sans Buginese, Noto Sans CJK SC Regular, Noto Sans Canadian Aboriginal, Noto Sans Cherokee, Noto Sans Devanagari, Noto Sans Ethiopic, Noto Sans Georgian, Noto Sans Gujarati, Noto Sans Gurmukhi, Noto Sans Hebrew, Noto Sans Kannada, Noto Sans Khmer, Noto Sans Lao, Noto Sans Malayalam, Noto Sans Mongolian, Noto Sans Myanmar, Noto Sans Oriya, Noto Sans Sinhala, Noto Sans Tamil, Noto Sans Telugu, Noto Sans Thaana, Noto Sans Thai, Noto Sans Tibetan, Noto Sans Yi, Noto Serif, Noto Serif Armenian, Noto Serif Khmer, Noto Serif Lao, Noto Serif Thai
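As an illustration of how a comma-separated pref value like this can be turned into a lookup, here is a minimal sketch. It assumes family names are compared case-insensitively after trimming whitespace. That matches how CSS treats family names, but I have not checked that the patch compares the pref this way. The helper names are made up.

```python
def parse_whitelist(pref_value):
    """Split a comma-separated font.system.whitelist value into a set
    of normalized (trimmed, lowercased) family names."""
    return {name.strip().lower() for name in pref_value.split(",") if name.strip()}


def is_allowed(family, whitelist):
    """True if the family name is in the parsed whitelist.

    Assumption: comparison is case-insensitive and ignores
    surrounding whitespace.
    """
    return family.strip().lower() in whitelist
```

With the list above, `is_allowed("Noto Sans Thai", parse_whitelist(value))` would be true, while a system font such as Arial would be rejected.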
Note that the extra MB added to Tor Browser are mostly from the file NotoSansCJKsc-Regular.otf, which covers Chinese (simplified and traditional), Japanese, and Korean. Also, Cousine is included as a monospace font similar in size and shape to the Noto fonts.
Trac: Status: new to needs_review Keywords: TorBrowserTeam201507 deleted, TorBrowserTeam201507R added
(It might be useful to mention that Cousine, Noto Sans and Noto Serif cover the large majority of languages in the list from Wikipedia -- those that use Latin-, Greek-, and Cyrillic-derived alphabets.)
It builds and I get proper Tor Browser bundles which is good :). Please move --enable-bundled-fonts into the respective .mozconfig files. It is a normal config option which landed on mozilla trunk a while ago. Apart from that the tor-browser-bundle changes are good.
I tested the bundles a bit and was kind of surprised. My naive understanding is that tests like the ones on http://ip-check.info should show the same number of fonts, and the same fonts, regardless of the underlying OS or testing user. Or am I missing something here? Anyway, with 5.0a3 I get 54 fonts on a Linux machine and 250 fonts on a Windows machine. With the patches I get 21 fonts on the same Linux machine and 250 fonts on the same Windows machine. While I could understand the former, the latter sounds like a bug to me (provided there are no issues with the test itself).
Okay, this was a bug on my side: I forgot the commit that added the font whitelist. Sorry for the noise. Interestingly, the test on ip-check.info is falling back to the CSS test. I forget how the code works, but one guess would be that it does this because it could not find any known font at all with JS...
As I mentioned on IRC, the reason I put --enable-bundled-fonts in tor-browser-bundle.git instead of mozconfigs is that the fonts aren't available to tor-browser.git. If someone is just building tor-browser.git, I don't want to make them have to download the Noto fonts and put them in the right directory just to be able to read text.
Alternative approaches could be:
Find a preprocessor flag that is only active when we build tor-browser-bundle.git, and use that to disable the whitelisting pref for tor-browser.git alone.
Add the Noto fonts directly to the tor-browser.git repo, and add something in the Mozilla build scripts to install them in the directory where fonts are bundled. That would avoid modifying tor-browser-bundle.git altogether.
I think this makes sense. Another thing that bothers me about the currently proposed solution is that it makes bisecting quite error-prone. Although this is not documented yet, the fastest approach is to take an existing Tor Browser bundle and bisect only the tor-browser parts, copying the result over the respective bundle parts with each iteration. That no longer works when so many parts live in tor-browser-bundle.git. Having everything in tor-browser could also help us debug issues due to font updates more easily.
I just tested that on two 32-bit Linux systems (one Ubuntu 12.04 and one Debian testing) and even there, differences are visible with bundled fonts (the diff is attached). I guess this means shipping the alpha with it is fine (it can't get worse with respect to the status quo :) ), but we might want an estimate of how much the current solution really helps us before we ship it in the stable series.
Whoa, interesting result. I think, though, that it's a form of OS fingerprinting, similar to legacy/trac#13018 (moved), or am I missing something? Whereas this ticket attempts to solve an orthogonal problem, which is that it is possible to enumerate the system fonts installed on a user's machine.
Also, I think measuring glyph sizes is only possible with JS activated, whereas enumerating fonts is possible using CSS alone.
(I've opened legacy/trac#16672 (moved) regarding differences in text rendering between operating systems.)
Whitelisting font files is meant to solve both: enumeration of font names, and differences in glyph rendering. Differences in glyph rendering provide much more precision than just the OS--it can vary based on what fonts are installed, what antialiasing settings you use, and what graphics card you have, for example. Glyph rendering is in scope for this ticket--that's the idea behind enforcing a single list of exact font files, not just a single list of font names. By standardizing the list of font files and rendering settings you should be able to bring down the variability a lot. See figures 4 and 5 on page 13 of https://bamsoftware.com/papers/fontfp.pdf.
What I understand from those figures is that most of the entropy saved is in standardizing the exact font files (please correct me if I'm mistaken). In comment:19 we have patches that enforce a single list of fonts, and bundle exactly the same font files on all platforms. I think that moves us from the red line to the blue line. To get closer to the green line, we need to adjust rendering settings -- I'd suggest punting that work to legacy/trac#16672 (moved), because I think it's going to take substantial experimentation to optimize those settings across platforms. In the meantime I think it would be nice to get user feedback for the bundled fonts in the alpha if possible.
Thanks, this is going to ship in 5.0a4. I am closing this as we enable bundled fonts in this ticket. It seems to me the idea with the fonts.conf file belongs to legacy/trac#16672 (moved), too.
Trac: Status: needs_review to closed Resolution: N/A to fixed
I made a dedicated page that's more tailored to the kind of tests we're doing now. It displays a single checksum, lets you download a text file of all the dimensions, and includes a code point viewer.
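The idea of reducing all glyph dimensions to a single checksum, and then comparing two machines' dimension files, can be sketched like this. This is not the code behind that page. The data layout (a mapping from code point to a width and height pair), the line format fed to the hash, and the choice of SHA-256 are all assumptions made for illustration.

```python
import hashlib


def dimensions_checksum(dimensions):
    """Hash a {code_point: (width, height)} mapping into one hex digest.

    Code points are processed in sorted order so that the result does
    not depend on the order in which glyphs were measured.
    """
    h = hashlib.sha256()
    for cp in sorted(dimensions):
        width, height = dimensions[cp]
        h.update(f"U+{cp:04X} {width} {height}\n".encode("ascii"))
    return h.hexdigest()


def differing_code_points(a, b):
    """Return the sorted code points whose dimensions differ between
    two measurements, including code points present in only one."""
    return sorted(cp for cp in a.keys() | b.keys() if a.get(cp) != b.get(cp))
```

Two machines with identical checksums render every measured glyph at the same size. When the checksums differ, `differing_code_points` narrows the difference down to specific code points, which is roughly what a code point viewer helps with.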