Commits · 7e4fb31d581ec40804d5b8ff1fb2ebde29a66c67 · The Tor Project / Applications / Tor Browser

Jan 17, 2022

Bug 1749522 - When plain text encoding speculation fails, restart the... · 1366c0f2

Henri Sivonen authored 3 years ago

Bug 1749522 - When plain text encoding speculation fails, restart the plaintext mode of the tokenizer. r=smaug a=dmeehan

Differential Revision: https://phabricator.services.mozilla.com/D135830

1366c0f2

Jan 05, 2022
- Bug 1748482 - Check XML parser for brokenness in various methods. r=smaug · 09d38d11
  Henri Sivonen authored 3 years ago
  
  Differential Revision: https://phabricator.services.mozilla.com/D135096
  09d38d11
Jan 04, 2022
- Bug 1748234 - Sync HTML parser Java source comments with validator repo. NPOTB r=edgar DONTBUILD · 4c4a0740
  Henri Sivonen authored 3 years ago
  
  Differential Revision: https://phabricator.services.mozilla.com/D134951
  4c4a0740
Jan 01, 2022

Bug 1746996 - Ensure that storeRawNames is always called. r=peterv · 65ffce97

Bobby Holley authored 3 years ago

To avoid copying in the common case, expat points directly into the
(internal) input buffer for strings referenced from the tag stack. When
expat unwinds back to the caller, it attempts to call storeRawNames() to
walk the tag stack and copy any such strings into persistent memory
before the caller potentially invokes XML_Parse() again and shuffles the
input buffer, thereby invalidating these references.

Unfortunately, it doesn't do it in all the right places. Because
different parsing states set |processor| to different callbacks (so that
parsing can resume in the right context), there are a number of
non-obvious entry points. In this case, the input stream was chunked so
that parsing paused in the middle of processing an internal entity [1], so
|processor| was set to internalEntityProcessor(), which invokes
doContent() but does not call storeRawNames(). The doContent() call then
parsed some nested tags from the entity. Tags within an entity are
generally required to be balanced, but a host callback returned an error
code to interrupt parsing midway through. This caused Expat to return to
the caller with a tag stack still referencing the input buffer, which
got clobbered in the next call to XML_Parse, causing a tag mismatch on
the next close tag.

Conceptually, this optimization should be managed by doContent(), and I
believe the only reason that isn't the case is that doContent() has so
many return paths (and we don't have RAII in C). We can fix this by
wrapping doContent() in a helper.

[1] &certerror.expiredCert.whatCanYouDoAboutIt2;
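
The fix described above can be sketched as follows. This is a minimal illustration, not expat's actual code: `Tag`, `Parser`, `Status`, `doContentInner()` and the `hostCallbackFails` flag are hypothetical stand-ins, and only the names `doContent()` and `storeRawNames()` come from the commit message. The point is that a single wrapper runs the copy step no matter which return path the inner function takes.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified stand-ins for expat internals.
struct Tag {
  const char* raw;      // points into the input buffer until stored
  std::size_t rawLen;
  std::string stored;   // persistent copy, filled by storeRawNames()
  bool persistent;
  std::string name() const {
    return persistent ? stored : std::string(raw, rawLen);
  }
};

struct Parser {
  std::string inputBuffer;
  std::vector<Tag> tagStack;
};

enum class Status { Ok, Suspended, Error };

// Copy every tag name that still references the input buffer into memory
// owned by the tag itself.
void storeRawNames(Parser& p) {
  for (Tag& tag : p.tagStack) {
    if (!tag.persistent) {
      tag.stored.assign(tag.raw, tag.rawLen);
      tag.persistent = true;
    }
  }
}

// Stand-in for the real parsing routine: several return paths, some of
// which leave a tag on the stack that references the input buffer.
Status doContentInner(Parser& p, bool hostCallbackFails) {
  const std::string& in = p.inputBuffer;
  std::size_t open = in.find('<');
  if (open == std::string::npos) return Status::Ok;          // no tag at all
  std::size_t close = in.find('>', open);
  if (close == std::string::npos) return Status::Suspended;  // need more input
  Tag t = {in.data() + open + 1, close - open - 1, std::string(), false};
  p.tagStack.push_back(t);
  if (hostCallbackFails) return Status::Error;               // interrupted midway
  return Status::Ok;
}

// The wrapper: every caller goes through here, so the copy step cannot be
// skipped by a less obvious entry point.
Status doContent(Parser& p, bool hostCallbackFails) {
  Status status = doContentInner(p, hostCallbackFails);
  storeRawNames(p);
  return status;
}
```

With this shape, a caller that overwrites `inputBuffer` after an interrupted `doContent()` call still sees the original tag name through `name()`.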

Differential Revision: https://phabricator.services.mozilla.com/D134878

65ffce97

Dec 27, 2021
- Bug 1747264 - Avoid OOMing if the URL of an XML document is huge. r=smaug, a=dsmith · acf48e09
  Henri Sivonen authored 3 years ago
  
  Differential Revision: https://phabricator.services.mozilla.com/D134506
  acf48e09
- Bug 1747264 - Avoid OOMing if the URL of an XML document is huge. r=smaug · 2f2b2fb6
  Henri Sivonen authored 3 years ago
  
  Differential Revision: https://phabricator.services.mozilla.com/D134506
  2f2b2fb6
Dec 25, 2021
- Bug 1747514 - Ensure the expat sandbox is large enough to hold the base URI. r=shravanrn,deian · 42c90d14
  Bobby Holley authored 3 years ago
  
  Differential Revision: https://phabricator.services.mozilla.com/D134653
  42c90d14
- Bug 1747514 - Make RLBoxTransferBufferToSandbox properly fallible. r=shravanrn · 84ba1b89
  Bobby Holley authored 3 years ago
  
  Differential Revision: https://phabricator.services.mozilla.com/D134670
  84ba1b89
- Bug 1747514 - Ensure the expat sandbox is large enough to hold the base URI.... · e338e34c
  Bobby Holley authored 3 years ago
  
  Bug 1747514 - Ensure the expat sandbox is large enough to hold the base URI. r=shravanrn,deian, a=dsmith
  
  Differential Revision: https://phabricator.services.mozilla.com/D134653
  e338e34c
- Bug 1747514 - Make RLBoxTransferBufferToSandbox properly fallible. r=shravanrn, a=dsmith · e480eec6
  Bobby Holley authored 3 years ago
  
  Differential Revision: https://phabricator.services.mozilla.com/D134670
  e480eec6
Dec 23, 2021
- Bug 1539884 - Part 32: Mark nsHtml5SVGLoadDispatcher::Run as CAN_RUN_SCRIPT_BOUNDARY r=masayuki · dc3411ed
  Kagami Sascha Rosylight authored 3 years ago
  
  Differential Revision: https://phabricator.services.mozilla.com/D134415
  dc3411ed
Dec 22, 2021
- Bug 1745142 - Communicate encoding commitment via speculative load queue. r=smaug · 97da4f66
  Henri Sivonen authored 3 years ago
  
  Differential Revision: https://phabricator.services.mozilla.com/D133996
  97da4f66
Dec 20, 2021
- Bug 1745239 - Chunk XML parsing to 64k characters at a time. r=bholley · 56d7e1b1
  Peter Van der Beken authored 3 years ago
  
  Differential Revision: https://phabricator.services.mozilla.com/D134320
  56d7e1b1
Dec 19, 2021
- Bug 1746412: Parser cleanup r=hsivonen · 18dd8638
  Randell Jesup authored 3 years ago
  
  Differential Revision: https://phabricator.services.mozilla.com/D134021
  18dd8638
Dec 18, 2021
- Bug 1746603 - Make mSpeculationFailureCount atomic. r=jesup · 2e97d5cb
  Henri Sivonen authored 3 years ago
  
  Differential Revision: https://phabricator.services.mozilla.com/D134142
  2e97d5cb
- Bug 1746593 - Turn mTerminated and mInterrupted into atomics. r=jesup · fe10b6b5
  Henri Sivonen authored 3 years ago
  
  Differential Revision: https://phabricator.services.mozilla.com/D134135
  fe10b6b5
Dec 14, 2021

Bug 1736248 - Update the charset source if non-ASCII is seen after the first... · 527882ba

Henri Sivonen authored 3 years ago

Bug 1736248 - Update the charset source if non-ASCII is seen after the first detector guess but the encoding does not change. r=smaug

Differential Revision: https://phabricator.services.mozilla.com/D133731

527882ba

Dec 13, 2021

Bug 1741665 - Align nsCString's public size_type better with other C++ APIs,... · 7b2e6d49

Nika Layzell authored 3 years ago

Bug 1741665 - Align nsCString's public size_type better with other C++ APIs, r=mccr8,geckoview-reviewers,agi

Differential Revision: https://phabricator.services.mozilla.com/D131422

7b2e6d49

Dec 20, 2021
- Bug 1745239 - Chunk XML parsing to 64k characters at a time. r=bholley, a=dsmith · 50d5b0a9
  Peter Van der Beken authored 3 years ago
  
  Differential Revision: https://phabricator.services.mozilla.com/D134320
  50d5b0a9
Dec 09, 2021
- Bug 1745139 - Check for termination even after the first flush loop in... · cdcc0b21
  Henri Sivonen authored 3 years ago
  
  Bug 1745139 - Check for termination even after the first flush loop in CommitToInternalEncoding. r=smaug
  
  Differential Revision: https://phabricator.services.mozilla.com/D133342
  cdcc0b21
- Bug 1744460 part 2 - Update woff2 RLBoxSandboxPool to track minimum sandbox size r=bholley · 70793586
  shravanrn@gmail.com authored 3 years ago
  
  Depends on D133009
  
  Differential Revision: https://phabricator.services.mozilla.com/D133158
  70793586
Dec 08, 2021

Backed out 2 changesets (bug 1744460) for causing build bustages at... · 6ffe112d

Butkovits Atila authored 3 years ago

Backed out 2 changesets (bug 1744460) for causing build bustages at RLBoxSandboxPool.cpp. CLOSED TREE

Backed out changeset 582101d582a0 (bug 1744460)
Backed out changeset dba7b7c19b2f (bug 1744460)

6ffe112d

Bug 1744460 part 2 - Update woff2 RLBoxSandboxPool to track minimum sandbox size r=bholley · 8001ccc0

shravanrn@gmail.com authored 3 years ago

Differential Revision: https://phabricator.services.mozilla.com/D133158

8001ccc0

Dec 09, 2021
- Bug 1744460 part 2 - Update woff2 RLBoxSandboxPool to track minimum sandbox size r=bholley,a=dsmith · eeebb938
  shravanrn@gmail.com authored 3 years ago
  
  Depends on D133009
  
  Differential Revision: https://phabricator.services.mozilla.com/D133158
  eeebb938
Dec 08, 2021

Bug 1701828 - meta charset rewrite. r=smaug · 649a5b63

Henri Sivonen authored 3 years ago

Implements https://github.com/whatwg/html/issues/6962 . Improves performance
when <meta charset> occurs in head but after the first kilobyte and aligns
behavior better with WebKit and Blink.

The main change is to avoid reloads when meta appears within head but
after the first kilobyte. Prior to this change, Gecko reloaded in that
case (in compliance with the spec!) even though WebKit and Blink did not.

Differences from WebKit and Blink:

* WebKit and Blink honor <meta charset> in <noscript>. This implementation
  does not.
* WebKit and Blink look for meta as if the tree builder was unaware of
  foreign content. This implementation is foreign content-aware. This
  makes a difference for CDATA sections that contain a > before the meta
  as well as style and script elements within foreign content. This could
  happen if the CDATA section that has mysteriously been introduced around
  what looks like a meta tag also contains another prior tag-looking
  run of text.
* This implementation processes rel=preload and speculative loads that are
  seen before <meta charset> has been seen. WebKit and Blink instead first
  look for the meta and rewind before starting speculative parsing.
* Unlike WebKit, if there is neither an honored meta nor syntax resembling
  an XML declaration, detection from content takes place (as in Blink).
* Unlike Blink, if there is neither an honored meta nor syntax resembling
  an XML declaration, the detection from content is not dependent on network
  buffer boundaries.
* Unlike Blink, detection from content can trigger a reload at the end of
  the stream if the guess made at that point differs from the first guess.
  (See below for the definition of the input to the first guess.)

Differences from the old spec and Gecko previously:

* Meta inside script and RCDATA elements is no longer honored.
* Late meta is now ignored and no longer triggers a reload.
* Later meta counts as early enough meta: In addition to the previous
  meta within the first 1024 bytes, now a meta that started within the first
  1024 bytes counts as early enough. Additionally, if by then there hasn't
  been a template start tag and head hasn't ended, meta occurring before the
  earlier of the end of the head or a template start tag counts as early
  enough.
* Meta now counts as not-late even if the encoding label has numeric
  character reference escapes.
* Syntax resembling an XML declaration longer than a kilobyte is honored if
  there is no honored meta.
* If there is neither an honored meta nor syntax resembling an XML declaration,
  the initial chardetng scan is potentially longer than before: the first 1024
  bytes, the token spanning the 1024-byte boundary if there is such a token,
  and, if by then head hasn't ended and there hasn't been a template start tag,
  until the end of the template start tag or the end of the token that causes
  head to end, whichever comes first. However, if the token implying the end of
  the head is a text token, only bytes up to the end of the previous non-text
  token are considered. (This definition avoids depending on network buffer
  boundaries.)
* XML View Source now uses the code for syntax resembling an XML declaration
  instead of expat for extracting the internal encoding label.
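
As a rough illustration of the "early enough meta" rule in the list above, the decision can be sketched like this. It is one reading of the commit message, not Gecko's implementation: `MetaPosition`, `kEarlyWindow` and `IsEarlyEnoughMeta()` are hypothetical names, and the sketch assumes the meta is otherwise eligible (for example, not inside script, RCDATA or noscript).

```cpp
#include <cstddef>

// Hypothetical summary of where a <meta charset> tag was seen.
struct MetaPosition {
  std::size_t startOffset;     // byte offset at which the meta tag starts
  bool headEndedBefore;        // head had already ended before the meta
  bool templateStartedBefore;  // a template start tag preceded the meta
};

// The 1024-byte window named in the commit message.
constexpr std::size_t kEarlyWindow = 1024;

// A meta counts as early enough if it starts within the first 1024 bytes,
// or if it comes before both the end of head and any template start tag.
bool IsEarlyEnoughMeta(const MetaPosition& m) {
  if (m.startOffset < kEarlyWindow) {
    return true;
  }
  return !m.headEndedBefore && !m.templateStartedBefore;
}
```

Under this reading, a meta at byte 5000 that is still inside head and before any template is honored without a reload, while a meta that follows the end of head or a template start tag beyond the first kilobyte is treated as late and ignored.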

Reftests are added as both WPT and Gecko reftests in order to test both http:
and file: URL scenarios. The Gecko tests retain the WPT <link> tags in order
to use the exact same bytes.

An encoding declaration has been added to a number of old tests that didn't
intend to test the new speculation behavior especially in the context of
https://bugzilla.mozilla.org/show_bug.cgi?id=1727750 .

Differential Revision: https://phabricator.services.mozilla.com/D125808

649a5b63

Backed out changeset 1778ca2ab291 (bug 1744425) for bc failures on... · fdf40d5a

Cosmin Sabou authored 3 years ago

Backed out changeset 1778ca2ab291 (bug 1744425) for bc failures on browser_xpcom_graph_wait.js. CLOSED TREE

fdf40d5a

Bug 1744425 - Replace nsContentUtils::GenerateUUID() to nsID::GenerateUUID(). r=nika · aae95e46

Chris Peterson authored 3 years ago

Bug 1723674 added a new nsID::GenerateUUID() static factory function to generate UUIDs without the overhead of querying and instantiating an nsIUUIDGenerator object. nsContentUtils::GenerateUUID() is a utility function that amortizes that overhead by holding an nsIUUIDGenerator singleton. That's no longer necessary because code that calls nsContentUtils::GenerateUUID() can now just call nsID::GenerateUUID(). No nsIUUIDGenerator is needed.

Differential Revision: https://phabricator.services.mozilla.com/D132866

aae95e46

Dec 07, 2021

Backed out changeset 3dfd3c94a105 (bug 1701828) for causing mochitest failures... · 1d6984bc

Norisz Fay authored 3 years ago

Backed out changeset 3dfd3c94a105 (bug 1701828) for causing mochitest failures on browser_hsts_host.js CLOSED TREE

1d6984bc

Bug 1701828 - meta charset rewrite. r=smaug · 58476d7f

Henri Sivonen authored 3 years ago

Implements https://github.com/whatwg/html/issues/6962 . Improves performance
when <meta charset> occurs in head but after the first kilobyte and aligns
behavior better with WebKit and Blink.

The main change is to avoid reloads when meta appears within head but
after the first kilobyte. Prior to this change, Gecko reloaded in that
case (in compliance with the spec!) even though WebKit and Blink did not.

Differences from WebKit and Blink:

* WebKit and Blink honor <meta charset> in <noscript>. This implementation
  does not.
* WebKit and Blink look for meta as if the tree builder was unaware of
  foreign content. This implementation is foreign content-aware. This
  makes a difference for CDATA sections that contain a > before the meta
  as well as style and script elements within foreign content. This could
  happen if the CDATA section that has mysteriously been introduced around
  what looks like a meta tag also contains another prior tag-looking
  run of text.
* This implementation processes rel=preload and speculative loads that are
  seen before <meta charset> has been seen. WebKit and Blink instead first
  look for the meta and rewind before starting speculative parsing.
* Unlike WebKit, if there is neither an honored meta nor syntax resembling
  an XML declaration, detection from content takes place (as in Blink).
* Unlike Blink, if there is neither an honored meta nor syntax resembling
  an XML declaration, the detection from content is not dependent on network
  buffer boundaries.
* Unlike Blink, detection from content can trigger a reload at the end of
  the stream if the guess made at that point differs from the first guess.
  (See below for the definition of the input to the first guess.)

Differences from the old spec and Gecko previously:

* Meta inside script and RCDATA elements is no longer honored.
* Late meta is now ignored and no longer triggers a reload.
* Later meta counts as early enough meta: In addition to the previous
  meta within the first 1024 bytes, now a meta that started within the first
  1024 bytes counts as early enough. Additionally, if by then there hasn't
  been a template start tag and head hasn't ended, meta occurring before the
  earlier of the end of the head or a template start tag counts as early
  enough.
* Meta now counts as not-late even if the encoding label has numeric
  character reference escapes.
* Syntax resembling an XML declaration longer than a kilobyte is honored if
  there is no honored meta.
* If there is neither an honored meta nor syntax resembling an XML declaration,
  the initial chardetng scan is potentially longer than before: the first 1024
  bytes, the token spanning the 1024-byte boundary if there is such a token,
  and, if by then head hasn't ended and there hasn't been a template start tag,
  until the end of the template start tag or the end of the token that causes
  head to end, whichever comes first. However, if the token implying the end of
  the head is a text token, only bytes up to the end of the previous non-text
  token are considered. (This definition avoids depending on network buffer
  boundaries.)
* XML View Source now uses the code for syntax resembling an XML declaration
  instead of expat for extracting the internal encoding label.

Reftests are added as both WPT and Gecko reftests in order to test both http:
and file: URL scenarios. The Gecko tests retain the WPT <link> tags in order
to use the exact same bytes.

An encoding declaration has been added to a number of old tests that didn't
intend to test the new speculation behavior especially in the context of
https://bugzilla.mozilla.org/show_bug.cgi?id=1727750 .

Differential Revision: https://phabricator.services.mozilla.com/D125808

58476d7f

Dec 02, 2021

Bug 1238861 - Display a warning message when doctype is not standard. r=hsivonen. · b31e82ce

nchevobbe authored 3 years ago

This patch adds new error messages that extend the existing ones,
providing extra information to the user.
A webconsole mochitest is added in the following patch of this stack.

Differential Revision: https://phabricator.services.mozilla.com/D131889

b31e82ce

Dec 01, 2021

Bug 1738401 - Remove -Wno-shadow warning suppressions. r=firefox-build-system-reviewers,glandium · f6fdbf02

Chris Peterson authored 3 years ago

-Wshadow warnings are not enabled globally, so these -Wno-shadow suppressions have no effect. I had intended to enable -Wshadow globally along with these suppressions in some directories (in bug 1272513), but that was blocked by other issues.

There are too many -Wshadow warnings (now over 2000) to realistically fix them all. We should remove all these unnecessary -Wno-shadow flags cluttering many moz.build files.
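
For context, here is a small example of the kind of code that -Wshadow reports in GCC and Clang. It is illustrative only and not taken from the Mozilla tree; both function names are hypothetical.

```cpp
// With -Wshadow, the compiler warns that the inner `total` shadows the
// outer one. Without the flag, this compiles silently.
int SumPositiveShadowed(const int* values, int count) {
  int total = 0;
  for (int i = 0; i < count; ++i) {
    if (values[i] > 0) {
      int total = values[i];  // new variable; the outer `total` is untouched
      (void)total;            // silence the unused-variable warning
    }
  }
  return total;  // always 0, because only the shadowing copy was written
}

// The intended version: accumulate into the single outer variable.
int SumPositive(const int* values, int count) {
  int total = 0;
  for (int i = 0; i < count; ++i) {
    if (values[i] > 0) {
      total += values[i];
    }
  }
  return total;
}
```

Since the warning is not enabled, a -Wno-shadow suppression in a moz.build file has nothing to suppress, which is why the commit removes those flags.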

Differential Revision: https://phabricator.services.mozilla.com/D132289

f6fdbf02

Nov 27, 2021
- Bug 1732201 - Sandbox woff2 in OTS using RLBox r=bholley · 1ee9a841
  Deian Stefan authored 3 years ago
  
  Differential Revision: https://phabricator.services.mozilla.com/D126435
  1ee9a841
- Backed out changeset d486edc7499b (bug 1732201) for causing web-platform-tests... · f0ef0360
  Cristian Tuns authored 3 years ago
  
  Backed out changeset d486edc7499b (bug 1732201) for causing web-platform-tests failures on header-totalsfntsize-001.xht CLOSED TREE
  f0ef0360
- Bug 1732201 - Sandbox woff2 in OTS using RLBox r=bholley · 0a5e1f20
  Deian Stefan authored 3 years ago
  
  Differential Revision: https://phabricator.services.mozilla.com/D126435
  0a5e1f20
Nov 25, 2021
- Bug 1743007 - Convert expat XML_StopParser API to take an int param instead of u8 r=bholley · 1cc3caab
  Shravan Narayan authored 3 years ago
  
  Differential Revision: https://phabricator.services.mozilla.com/D132171
  1cc3caab
- Bug 1742914 - Add explicit casts for u8 and u16 parameters to RLBox sandbox_invoke r=bholley · 361d7f84
  Shravan Narayan authored 3 years ago
  
  Differential Revision: https://phabricator.services.mozilla.com/D132113
  361d7f84
- Backed out changeset 524df7136a1f (bug 1742914) for causing assertion failures... · a543cd6c
  Cosmin Sabou authored 3 years ago
  
  Backed out changeset 524df7136a1f (bug 1742914) for causing assertion failures on htmlparser/nsExpatDriver.cpp. CLOSED TREE
  a543cd6c
- Bug 1742914 - Add explicit casts for u8 and u16 parameters to RLBox sandbox_invoke r=bholley · 7fe9421a
  Shravan Narayan authored 3 years ago
  
  Differential Revision: https://phabricator.services.mozilla.com/D132113
  7fe9421a
Nov 22, 2021
- Bug 1688452 - Retrofit nsExpatDriver to use RLBoxed libexpat r=tjr,peterv,bholley,glandium · 86e82e10
  Deian Stefan authored 3 years ago
  
  Differential Revision: https://phabricator.services.mozilla.com/D104658
  86e82e10
Nov 20, 2021
- Backed out changeset 4294063f1606 (bug 1688452) for causing mochitest and wpt failures. CLOSED TREE · 51a05715
  Sandor Molnar authored 3 years ago
  
  51a05715