{"id":933,"date":"2017-08-18T14:07:43","date_gmt":"2017-08-18T18:07:43","guid":{"rendered":"https:\/\/www.visophyte.org\/blog\/?p=933"},"modified":"2017-08-18T14:07:43","modified_gmt":"2017-08-18T18:07:43","slug":"gecko-performing-service-worker-interceptions-in-the-parent-process","status":"publish","type":"post","link":"https:\/\/www.visophyte.org\/blog\/2017\/08\/18\/gecko-performing-service-worker-interceptions-in-the-parent-process\/","title":{"rendered":"Gecko: Performing Service Worker Interceptions in the Parent Process"},"content":{"rendered":"<h2>THIS IS AN OUTDATED DRAFT, IF YOU CAN READ THIS, I PUSHED SOME BUTTONS WRONG.\u00a0 OOPS.<\/h2>\n<p>Meta: <a href=\"https:\/\/bugzilla.mozilla.org\/show_bug.cgi?id=1231222\">Bug 1231222<\/a> overhauls how Service Worker (<a href=\"https:\/\/w3c.github.io\/ServiceWorker\/\">spec<\/a>, MDN: <a href=\"https:\/\/developer.mozilla.org\/en-US\/docs\/Web\/API\/Service_Worker_API\/Using_Service_Workers\">overview<\/a> <a href=\"https:\/\/developer.mozilla.org\/en-US\/docs\/Web\/API\/Service_Worker_API\">details<\/a>) network interception happens in Gecko.\u00a0 I&#8217;m writing this post in support of the review process and to help provide some point-in-time higher level documentation.<\/p>\n<p>Disclaimers: Examples involving Service Workers are intentionally simplified unless relevant to the parent intercept changes.\u00a0 All code links are made to <a href=\"https:\/\/dxr.mozilla.org\/\">DXR<\/a> at a file granularity because I expect those links to be long-term stable and less of a hassle than linking to a query page.\u00a0 You may need to control-f find the term I refer to, however.\u00a0 For code exploration and understanding, I recommend <a href=\"https:\/\/searchfox.org\/\">searchfox<\/a>.<\/p>\n<h1>Background (which you may already know)<\/h1>\n<h2>Service Workers<\/h2>\n<h3>Service Workers and Fetch<\/h3>\n<p>Service Workers that have registered a fetch (<a href=\"https:\/\/fetch.spec.whatwg.org\/\">spec<\/a>) 
event handler during initial script evaluation will receive fetch events.\u00a0 These fetch events fall into 2 groups:<\/p>\n<ol>\n<li>Navigation: A new document\/worker &#8220;<a href=\"https:\/\/w3c.github.io\/ServiceWorker\/#dfn-service-worker-client\">client<\/a>&#8221; is being loaded and the URL <a href=\"https:\/\/w3c.github.io\/ServiceWorker\/#match-service-worker-registration\">matches<\/a> the <a href=\"https:\/\/w3c.github.io\/ServiceWorker\/#dfn-service-worker-registration\">registration<\/a>&#8216;s scope URL.\u00a0 If the SW <a href=\"https:\/\/w3c.github.io\/ServiceWorker\/#fetch-event-respondwith\">responds<\/a>, the registration is said to control the client and will receive fetch events for all future requests.\u00a0 If the SW does not respond, then the request will go to the network\/HTTP cache as if the SW did not exist and the SW will not receive fetch events for sub-resources loaded by the resulting page.<\/li>\n<li>Sub-resources:\u00a0 A controlled document\/worker is making a network request which can be to the page&#8217;s own origin or any other origin.\u00a0 The SW has an opportunity to respond to the fetch request.\u00a0 If the SW does not respond, then the request will go to the network\/HTTP cache as if the SW did not exist.\u00a0 The SW will continue to receive fetch events for controlled pages even if it never responds to any of them.<\/li>\n<\/ol>\n<p>&#8220;fetch&#8221; event handlers must do one of the following things when they receive their event:<\/p>\n<ul>\n<li>Call respondWith() before returning, providing a promise that must resolve to a Response object.\u00a0 If the promise is rejected or is resolved with anything other than a Response object, the fetch&#8217;s result will be a network error.\u00a0 In other words, once respondWith() has been invoked, the Service Worker has committed to a response and will no longer automatically fall back to contacting the network.\u00a0 However, the SW can still call fetch(request) 
itself to approximate the same result.<\/li>\n<li>Call preventDefault() without calling respondWith(), resulting in a network error for the fetch.<\/li>\n<li>Don&#8217;t call respondWith() or preventDefault(), resulting in the fetch going to the network as if the Service Worker did not exist.<\/li>\n<\/ul>\n<h3>Service Workers and Multiple Processes: Now<\/h3>\n<p><a href=\"https:\/\/dxr.mozilla.org\/mozilla-central\/source\/dom\/workers\/ServiceWorkerManager.h\">ServiceWorkerManager<\/a> is the brains of Gecko&#8217;s Service Worker implementation.\u00a0 It knows if a given URI is covered by a Service Worker scope.\u00a0 It knows if a given document is controlled.\u00a0 It lets you dispatch fetch events, notification events, everything.\u00a0 Currently there&#8217;s one ServiceWorkerManager per process.\u00a0 The ServiceWorkerManagers share unified race-proof <a href=\"https:\/\/w3c.github.io\/ServiceWorker\/#cache-objects\">Cache API<\/a> storage in the parent process, and a race-prone-broadcast understanding of what registrations exist and of the chrome-namespaced cache name under which each SW script and its dependencies are stored.<\/p>\n<p><img decoding=\"async\" class=\"alignnone size-full wp-image-939\" src=\"https:\/\/www.visophyte.org\/blog\/wp-content\/uploads\/2017\/03\/service-worker-multi.svg\" alt=\"Diagram showing one ServiceWorkerManager per process.\" \/><\/p>\n<p>This will change in the near future, but the net result is that for the patch under discussion:<\/p>\n<ul>\n<li>Each content process has its own ServiceWorkerManager.<\/li>\n<li>When a ServiceWorker instance needs to be spun up, it will be spun up in the current process.\u00a0 It does not matter if there is already an equivalent instance alive in another content process.<\/li>\n<li>When fetch events get dispatched, they will be dispatched in the current process and the results produced in the current process.<\/li>\n<\/ul>\n<p>Terminology note: When I say &#8220;instance&#8221; I refer to a 
distinct version of the ServiceWorker as identified by its <a href=\"https:\/\/w3c.github.io\/ServiceWorker\/#dfn-service-worker-id\">id<\/a> <a href=\"https:\/\/w3c.github.io\/ServiceWorker\/#ref-for-dfn-service-worker-id-1\">issued during the update process<\/a> with its own specific Cache API storage. In other words, each of a <a href=\"https:\/\/w3c.github.io\/ServiceWorker\/#serviceworkerregistration-interface\">ServiceWorkerRegistration<\/a>&#8216;s installing, waiting, and active is a different instance.<\/p>\n<h3>Service Workers and Multiple Processes: The Future<\/h3>\n<p>Part of why we need the parent intercept patch is for that bright future where there is only one ServiceWorkerManager and it lives in the parent process.\u00a0 And there is only ever one ServiceWorker instance across all content processes in the browser.\u00a0 To start with, all of these instances will live in a single Service Worker process for main thread contention reasons.\u00a0 As various Project Quantum advancements are made that reduce main thread contention, we should be able to spawn ServiceWorkers in normal content processes.<\/p>\n<p><img decoding=\"async\" class=\"alignnone size-full wp-image-940\" src=\"https:\/\/www.visophyte.org\/blog\/wp-content\/uploads\/2017\/03\/service-worker-remoted.svg\" alt=\"Simplified diagram of the beautiful future where there's only one ServiceWorkerManager.\" \/><\/p>\n<p>The diagram above attempts to capture a very simplified version of the expected first steps.<\/p>\n<h2>Necko HTTP Channels<\/h2>\n<h3>Basics: URIs, Channels, Stream Listeners<\/h3>\n<p>In Gecko and its network layer, Necko, channels are the primary abstraction for network requests.\u00a0 If you have a string URL, you ask <a href=\"https:\/\/dxr.mozilla.org\/mozilla-central\/source\/netwerk\/base\/nsIIOService.idl\">nsIIOService<\/a> to create an <a href=\"https:\/\/dxr.mozilla.org\/mozilla-central\/source\/netwerk\/base\/nsIURI.idl\">nsIURI<\/a> and using that, create an <a 
href=\"https:\/\/dxr.mozilla.org\/mozilla-central\/source\/netwerk\/base\/nsIChannel.idl\">nsIChannel<\/a> (which is also an <a href=\"https:\/\/dxr.mozilla.org\/mozilla-central\/source\/netwerk\/base\/nsIRequest.idl\">nsIRequest<\/a>).\u00a0 nsIChannel&#8217;s <a href=\"https:\/\/dxr.mozilla.org\/mozilla-central\/search?q=function-decl%3AasyncOpen2\">asyncOpen2<\/a> method begins the asynchronous process of opening the channel, with notifications provided to the passed <a href=\"https:\/\/dxr.mozilla.org\/mozilla-central\/source\/netwerk\/base\/nsIStreamListener.idl\">nsIStreamListener<\/a>\/<a href=\"https:\/\/dxr.mozilla.org\/mozilla-central\/source\/netwerk\/base\/nsIRequestObserver.idl\">nsIRequestObserver<\/a> &#8220;listener&#8221; which exposes the <a href=\"https:\/\/dxr.mozilla.org\/mozilla-central\/source\/xpcom\/io\/nsIInputStream.idl\">nsIInputStream<\/a> which contains the actual data stream.<\/p>\n<p>Each channel corresponds to exactly one URI, exposed as &#8220;URI&#8221; on the nsIChannel.\u00a0 In the event of a redirection, a new channel is created with the new URI.\u00a0 The original URI is propagated to the new channel as well, and is exposed as &#8220;originalURI&#8221;.\u00a0 HTTP channels additionally may store the &#8220;documentURI&#8221; of the document originating the request which allegedly is for 3rd-party cookie blocking, but the <a href=\"https:\/\/dxr.mozilla.org\/mozilla-central\/source\/netwerk\/base\/mozIThirdPartyUtil.idl\">mozIThirdPartyUtil<\/a> <a href=\"https:\/\/dxr.mozilla.org\/mozilla-central\/source\/dom\/base\/ThirdPartyUtil.cpp\">implementation<\/a> and most other code now gets their information from the channel&#8217;s associated <a href=\"https:\/\/dxr.mozilla.org\/mozilla-central\/source\/netwerk\/base\/nsILoadInfo.idl\">nsILoadInfo<\/a>.\u00a0 nsILoadInfo provides important context about why a network load was started and what it will be used for.<\/p>\n<p>Each network protocol has its own implementation directory 
beneath <a href=\"https:\/\/dxr.mozilla.org\/mozilla-central\/source\/\">mozilla-central<\/a>\/<a href=\"https:\/\/dxr.mozilla.org\/mozilla-central\/source\/netwerk\">netwerk<\/a>\/<a href=\"https:\/\/dxr.mozilla.org\/mozilla-central\/source\/netwerk\/protocol\">protocol<\/a>.\u00a0 Service Workers only support the <a href=\"https:\/\/dxr.mozilla.org\/mozilla-central\/source\/netwerk\/protocol\/http\">HTTP protocol<\/a> (in <a href=\"https:\/\/w3c.github.io\/webappsec-secure-contexts\/#secure-context\">secure contexts<\/a>).<\/p>\n<h3>HTTP Channels under Multi-process Firefox (AKA Electrolysis AKA e10s)<\/h3>\n<p><img decoding=\"async\" class=\"alignnone size-full wp-image-937\" src=\"https:\/\/www.visophyte.org\/blog\/wp-content\/uploads\/2017\/03\/class-httpchannel-e10s.svg\" alt=\"Simplified HTTP Channel class diagram.\" \/><\/p>\n<p>In single process Firefox (or for HTTP connections initiated from the parent process), an instance of the &#8220;real&#8221; HTTP channel class <a href=\"https:\/\/dxr.mozilla.org\/mozilla-central\/source\/netwerk\/protocol\/http\/nsHttpChannel.h\">nsHttpChannel<\/a> is created.\u00a0 In multi-process Firefox when code in a child (AKA content) process wants to open an HTTP(S) URI, an <a href=\"https:\/\/dxr.mozilla.org\/mozilla-central\/source\/netwerk\/protocol\/http\/HttpChannelChild.h\">HttpChannelChild<\/a> instance is created instead.\u00a0 Each HttpChannelChild instance *may* cause a corresponding <a href=\"https:\/\/dxr.mozilla.org\/mozilla-central\/source\/netwerk\/protocol\/http\/HttpChannelParent.h\">HttpChannelParent<\/a> instance to be created in the parent process.\u00a0 The child and parent communicate with each other via the <a href=\"https:\/\/dxr.mozilla.org\/mozilla-central\/source\/netwerk\/protocol\/http\/PHttpChannel.ipdl\">PHttpChannel<\/a> <a href=\"https:\/\/developer.mozilla.org\/en-US\/docs\/Mozilla\/IPDL\">IPDL protocol<\/a>.\u00a0 I place an emphasis on *may* because that is something this patch 
changes.<\/p>\n<p>The HttpChannelParent instance creates and owns an nsHttpChannel instance to do the &#8220;real&#8221; work.\u00a0 nsHttpChannel deals with the <a href=\"https:\/\/dxr.mozilla.org\/mozilla-central\/source\/netwerk\/cache2\">HTTP cache<\/a> and performs the actual network communication.<\/p>\n<p>HttpChannelParent also creates an <a href=\"https:\/\/dxr.mozilla.org\/mozilla-central\/source\/netwerk\/protocol\/http\/HttpChannelParentListener.h\">HttpChannelParentListener<\/a> which is the <a href=\"https:\/\/dxr.mozilla.org\/mozilla-central\/source\/netwerk\/base\/nsIStreamListener.idl\">nsIStreamListener<\/a> it passes to the nsHttpChannel.\u00a0 The HttpChannelParentListener provides a layer of indirection so that the stream can be &#8220;diverted&#8221; from the child back to the parent and the HttpChannelParent can be removed from the picture.\u00a0 In the simplest case, the HttpChannelParentListener just forwards OnStartRequest, OnDataAvailable, and OnStopRequest callbacks to its mNextListener, the HttpChannelParent.<\/p>\n<p>HttpChannelChild and HttpChannelParent work together to proxy requests from the child to the parent and data from the parent to the child.\u00a0 Data travels as nsCStrings in <a href=\"https:\/\/dxr.mozilla.org\/mozilla-central\/source\/netwerk\/protocol\/http\/PHttpChannel.ipdl\">OnTransportAndData<\/a> IPC messages, not via file descriptors passed to the child process.<\/p>\n<p>Logic in the child process doesn&#8217;t notice the difference between HttpChannelChild and nsHttpChannel because it interacts via XPCOM interfaces like <a href=\"https:\/\/dxr.mozilla.org\/mozilla-central\/source\/netwerk\/protocol\/http\/nsIHttpChannel.idl\">nsIHttpChannel<\/a> rather than the concrete implementation types.\u00a0 This is helped by nsHttpChannel and HttpChannelChild sharing <a href=\"https:\/\/dxr.mozilla.org\/mozilla-central\/source\/netwerk\/protocol\/http\/HttpBaseChannel.h\">HttpBaseChannel<\/a> as a base class that contains 
common logic related to implementing the XPCOM interfaces.<\/p>\n<h3>Complication: Redirects<\/h3>\n<p>As noted above, each channel is associated with a single URI.\u00a0 Accordingly, a new URI means a new channel.\u00a0 For HTTP, redirects usually happen because an HTTP request returned a <a href=\"https:\/\/developer.mozilla.org\/en-US\/docs\/Web\/HTTP\/Redirections\">3xx response<\/a>.\u00a0 However, Service Workers introduce an additional complication because the fetch <a href=\"https:\/\/fetch.spec.whatwg.org\/#response-class\">Response<\/a> instance eventually provided to <a href=\"https:\/\/w3c.github.io\/ServiceWorker\/#fetch-event-respondwith\">respondWith<\/a> need not have the same URL as the request.\u00a0 Because of the 1 channel = 1 URI invariant, an internal redirect must be performed in that case.\u00a0 More on that later.<\/p>\n<p>At a high level, redirects are simple:<\/p>\n<ul>\n<li>Create a new channel, propagating state from the original channel to the new channel.<\/li>\n<li>Check that the redirect is okay.\u00a0 This is an opportunity to run all the checks that were run when the channel was originally created.\u00a0 Just because something is a redirect doesn&#8217;t mean it&#8217;s okay to violate the <a href=\"https:\/\/w3c.github.io\/webappsec-csp\/\">Content Security Policy<\/a> or <a href=\"https:\/\/w3c.github.io\/webappsec-mixed-content\/\">Mixed Content<\/a> policies.<\/li>\n<li>If all the checks passed, notify the original channel so it can open the new channel, passing the original channel&#8217;s listener so it now becomes the new channel&#8217;s listener.\u00a0 The listener will not have had its <a href=\"https:\/\/dxr.mozilla.org\/mozilla-central\/source\/netwerk\/base\/nsIRequestObserver.idl\">nsIRequestObserver<\/a>::OnStartRequest method called yet because when a redirect response is observed, the redirect is performed rather than advancing to nsHttpChannel::ProcessNormal().<\/li>\n<\/ul>\n<p>In practice, they&#8217;re a bit more 
complex.\u00a0 Here&#8217;s a high-level sequence diagram of the (in the parent) nsHttpChannel redirection life-cycle using the StartRedirectChannelToURI programmatic redirect API used by nsIHttpChannel::redirectTo and the InterceptedChannel mechanism (more on that later).\u00a0 For 3xx redirect responses, nsHttpChannel::AsyncProcessRedirection uses ContinueProcessRedirectionAfterFallback whose behavior is analogous to StartRedirectChannelToURI and whose ContinueProcessRedirection is analogous to ContinueAsyncRedirectChannelToURI\/OpenRedirectChannel.<\/p>\n<p><a href=\"https:\/\/www.visophyte.org\/blog\/wp-content\/uploads\/2017\/03\/redirect-nshttpchannel-api.svg\"><img decoding=\"async\" class=\"alignnone size-full wp-image-943\" src=\"https:\/\/www.visophyte.org\/blog\/wp-content\/uploads\/2017\/03\/redirect-nshttpchannel-api.svg\" alt=\"Sequence diagram of redirection without e10s details.\" \/><\/a><\/p>\n<p>Elaborations on the diagram:<\/p>\n<ul>\n<li>When the original nsHttpChannel &#8220;Pumps [are] suspended&#8221;, that&#8217;s its <a href=\"https:\/\/dxr.mozilla.org\/mozilla-central\/source\/netwerk\/base\/nsInputStreamPump.h\">nsInputStreamPump<\/a> that provides the channel with OnStartRequest\/OnDataAvailable\/OnStopRequest events.\u00a0 By suspending the pump, the channel doesn&#8217;t need to worry about those calls happening.<\/li>\n<li>When checking if redirects are okay, <a href=\"https:\/\/dxr.mozilla.org\/mozilla-central\/source\/netwerk\/base\/nsIChannelEventSink.idl\">nsIChannelEventSink<\/a>::AsyncOnChannelRedirect methods are invoked.\u00a0 This is an asynchronous check; the method needs to call the passed in callback to complete the check.\u00a0 The <a href=\"https:\/\/dxr.mozilla.org\/mozilla-central\/source\/netwerk\/base\/nsAsyncRedirectVerifyHelper.h\">nsAsyncRedirectVerifyHelper<\/a> provides a DelegateOnChannelRedirect helper that abstracts the book-keeping for callbacks in addition to being the home to logic like invoking the 
nsIOService checks.<\/li>\n<li>The &#8220;nsIOService checks&#8221; AsyncOnChannelRedirect handler:\n<ul>\n<li>Checks if this looks like a captive portal redirect to a local\/private IP sub-net and hints to the <a href=\"https:\/\/dxr.mozilla.org\/mozilla-central\/source\/netwerk\/base\/nsICaptivePortalService.idl\">nsICaptivePortalService<\/a> that it should run a check again.<\/li>\n<li>Directly asks <a href=\"https:\/\/dxr.mozilla.org\/mozilla-central\/source\/dom\/security\/nsContentSecurityManager.cpp\">nsContentSecurityManager<\/a> to vote on the redirect via its <a href=\"https:\/\/dxr.mozilla.org\/mozilla-central\/source\/netwerk\/base\/nsIChannelEventSink.idl\">nsIChannelEventSink<\/a>::AsyncOnChannelRedirect implementation.<\/li>\n<li>Asks every <a href=\"https:\/\/dxr.mozilla.org\/mozilla-central\/source\/netwerk\/base\/nsIChannelEventSink.idl\">nsIChannelEventSink<\/a> listed in the category manager under &#8220;<a href=\"https:\/\/dxr.mozilla.org\/mozilla-central\/search?q=net-channel-event-sinks&amp;redirect=true\">net-channel-event-sinks<\/a>&#8221; to vote on the redirect.\u00a0 Currently that&#8217;s the <a href=\"https:\/\/dxr.mozilla.org\/mozilla-central\/source\/dom\/security\/nsCSPService.cpp\">CSPService<\/a> and the <a href=\"https:\/\/dxr.mozilla.org\/mozilla-central\/source\/dom\/security\/nsMixedContentBlocker.h\">nsMixedContentBlocker<\/a>.<\/li>\n<\/ul>\n<\/li>\n<li>The listener, if it implements\u00a0<a href=\"https:\/\/dxr.mozilla.org\/mozilla-central\/source\/netwerk\/base\/nsIChannelEventSink.idl\">nsIChannelEventSink<\/a>, is made aware of the redirect via a call to its AsyncOnChannelRedirect method with the result of all the prior checks.\u00a0 It can veto the redirect or allow it to proceed.\u00a0 If it proceeds, it can expect to receive an <a href=\"https:\/\/dxr.mozilla.org\/mozilla-central\/source\/netwerk\/base\/nsIRedirectResultListener.idl\">nsIRedirectResultListener<\/a>::OnRedirectResult when the redirect has succeeded by 
AsyncOpen2-ing the redirected channel, or an error otherwise.<\/li>\n<\/ul>\n<h3>Sources of Redirects<\/h3>\n<p>When trying to understand code, I always find it useful to know the motivating use-cases.\u00a0 Especially because these use-cases may interact in ways that increase complexity.\u00a0 So, what causes redirects?<\/p>\n<ul>\n<li>Redirects initiated by nsHttpChannel as a result of content directives:\n<ul>\n<li>Explicit <a href=\"https:\/\/developer.mozilla.org\/en-US\/docs\/Web\/HTTP\/Redirections\">3xx responses<\/a> from the network or cache.<\/li>\n<li>HTTP-to-HTTPS upgrades triggered by &#8220;<a href=\"https:\/\/w3c.github.io\/webappsec-upgrade-insecure-requests\/\">upgrade-insecure-requests<\/a>&#8221; <a href=\"https:\/\/w3c.github.io\/webappsec-csp\/\">Content Security Policy<\/a> (CSP) directives or <a href=\"https:\/\/developer.mozilla.org\/en-US\/docs\/Web\/HTTP\/Headers\/Strict-Transport-Security\">HTTP Strict-Transport-Security<\/a> (HSTS).\u00a0 A &#8220;permanent&#8221; &#8220;STS&#8221; redirect is performed.<\/li>\n<\/ul>\n<\/li>\n<li>Redirects initiated by nsHttpChannel as a result of things going wrong:\n<ul>\n<li>Failure to load a URL from the network that&#8217;s covered by an <a href=\"https:\/\/developer.mozilla.org\/en-US\/docs\/Web\/HTML\/Using_the_application_cache\">Application Cache<\/a> cache manifest <a href=\"https:\/\/developer.mozilla.org\/en-US\/docs\/Web\/HTML\/Using_the_application_cache#Fallback_entries\">FALLBACK entry<\/a> that specifies a file to load instead.\u00a0 Note that Application Cache (AKA &#8220;AppCache&#8221;) has been deprecated in favor of Service Workers and is hopefully going away.<\/li>\n<li>Cache problems.\u00a0 If we think we have a valid cache entry but it turns out we couldn&#8217;t read the cache file from disk, then the cache entry is removed (&#8220;doomed&#8221;) and we generate an internal redirect to hit the network.\u00a0 This may also happen if certain edge-cases trigger when processing HTTP 304 
&#8220;Not Modified&#8221; responses to conditional GETs.<\/li>\n<li>Proxy-server problems.\u00a0 If the proxy server is unable to handle the request, an internal redirect will be generated to fail over to the alternate proxy returned by <a href=\"https:\/\/dxr.mozilla.org\/mozilla-central\/source\/netwerk\/base\/nsIProtocolProxyService.idl\">nsIProtocolProxyService<\/a>::getFailoverForProxy().<\/li>\n<\/ul>\n<\/li>\n<li>Non-nsHttpChannel code calling <a href=\"https:\/\/dxr.mozilla.org\/mozilla-central\/source\/netwerk\/protocol\/http\/nsIHttpChannel.idl\">nsIHttpChannel<\/a>::redirectTo():\n<ul>\n<li><a href=\"https:\/\/developer.mozilla.org\/en-US\/Add-ons\/WebExtensions\">WebExtensions<\/a> using the <a href=\"https:\/\/developer.mozilla.org\/en-US\/Add-ons\/WebExtensions\/API\/webRequest\">webRequest API<\/a> to inspect\/intercept HTTP requests and choosing to perform redirects in <a href=\"https:\/\/developer.mozilla.org\/en-US\/Add-ons\/WebExtensions\/API\/webRequest\/onBeforeRequest\">onBeforeRequest<\/a> or <a href=\"https:\/\/developer.mozilla.org\/en-US\/Add-ons\/WebExtensions\/API\/webRequest\/onHeadersReceived\">onHeadersReceived<\/a>.\u00a0 (If you are wondering if there&#8217;s a way you can create an extension that does the same thing as a ServiceWorker for a site you do not control, this is the API for you.)<\/li>\n<li>Legacy Firefox Extensions.<\/li>\n<\/ul>\n<\/li>\n<li>ADDRESS: UNDERSTAND https:\/\/bugzilla.mozilla.org\/show_bug.cgi?id=1222008. 
ServiceWorker fetch events resolved with a Response whose URI does not match that of the request.\u00a0 This is a special case that can&#8217;t happen for normal web content because HTTP does not provide this capability.\u00a0 The Location header only has meaning for redirect status codes and will be ignored otherwise.\n<ul>\n<li>Pre-patch this happens in both the parent and child.\u00a0 Post-patch this happens only in the parent.<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<h3>ChannelEventQueue, a Nested Event Loop e10s IPC bandage<\/h3>\n<p>The normal event loop assumption is that events will run in the order they were enqueued, each event running to completion before the next event starts running.\u00a0 Nested event loops like those used by synchronous XMLHttpRequest and <a href=\"https:\/\/developer.mozilla.org\/en-US\/docs\/Mozilla\/IPDL\/Tutorial#Synchronous_and_RPC_Messaging\">synchronous\/rpc Inter-Process Communication (IPC) calls<\/a> violate this assumption.<\/p>\n<p>The canonical Necko concern is that a call to an <a href=\"https:\/\/dxr.mozilla.org\/mozilla-central\/source\/netwerk\/base\/nsIRequestObserver.idl\">nsIRequestObserver<\/a>::OnStartRequest implementation will be on the stack spinning a nested event loop, and that a call to the same listener&#8217;s (nsIRequestObserver subclass) <a href=\"https:\/\/dxr.mozilla.org\/mozilla-central\/source\/netwerk\/base\/nsIStreamListener.idl\">nsIStreamListener<\/a>::OnDataAvailable implementation will be made.\u00a0 This is not a problem for non-e10s channels because Necko event delivery has built-in back-pressure.\u00a0 Core implementations like <a href=\"https:\/\/dxr.mozilla.org\/mozilla-central\/source\/netwerk\/base\/nsInputStreamPump.h\">nsInputStreamPump<\/a> and NS_AsyncCopy&#8217;s <a href=\"https:\/\/dxr.mozilla.org\/mozilla-central\/source\/xpcom\/io\/nsStreamUtils.cpp\">nsAStreamCopier<\/a> go out of their way to maintain thread-safe state machines that only allow events to be delivered after the previous 
event has run to completion.\u00a0 Other low-level interfaces encourage idioms like this, for example <a href=\"https:\/\/dxr.mozilla.org\/mozilla-central\/source\/xpcom\/io\/nsIAsyncInputStream.idl\">nsIAsyncInputStream<\/a>&#8216;s asyncWait requests a single callback notification rather than a subscription to all future notifications.<\/p>\n<p>Messages\/events relayed via IPC can&#8217;t have this free back-pressure.\u00a0 At least as long as they&#8217;re not &#8220;sync&#8221; IPC calls.\u00a0 But, for reasons of performance and sanity, it is desirable for all IPC messages to be &#8220;async&#8221;.\u00a0 Additionally, there are distributed computing issues to deal with where both the parent and child may have different opinions of the channel&#8217;s state and in-flight multi-step asynchronous operations.\u00a0 Thus we have the <a href=\"https:\/\/dxr.mozilla.org\/mozilla-central\/source\/netwerk\/ipc\/ChannelEventQueue.h\">ChannelEventQueue<\/a>, a suspendable queue of ChannelEvent instances (basically non-XPCOM <a href=\"https:\/\/dxr.mozilla.org\/mozilla-central\/source\/xpcom\/threads\/nsIRunnable.idl\">nsIRunnables<\/a>).\u00a0 When receiving IPC events, Necko *Parent and *Child classes will construct ChannelEvent subclasses and use ChannelEventQueue::RunOrEnqueue to ensure events are run sequentially.<\/p>\n<h3>E10s Complication: Diversions<\/h3>\n<p>Logic in the child content process may realize that the data it&#8217;s receiving should be consumed by the parent process instead.\u00a0 In that case the channel is diverted back to the parent process using <a href=\"https:\/\/dxr.mozilla.org\/mozilla-central\/source\/netwerk\/base\/nsIDivertableChannel.idl\">nsIDivertableChannel<\/a>.\u00a0 It&#8217;s currently used by:<\/p>\n<ul>\n<li><a href=\"https:\/\/dxr.mozilla.org\/mozilla-central\/source\/uriloader\/exthandler\/nsIExternalHelperAppService.idl\">nsIExternalHelperAppService<\/a> which handles downloads, asking you whether you want to &#8220;Open 
with&#8221; an external application or &#8220;Save File&#8221; as a download.<\/li>\n<li><a href=\"http:\/\/searchfox.org\/mozilla-central\/source\/security\/manager\/ssl\/PSMContentListener.h\">PSMContentListener<\/a> which handles importing certificates identified by the <a href=\"http:\/\/searchfox.org\/mozilla-central\/source\/security\/manager\/ssl\/nsNSSModule.cpp#265\">MIME types application\/x-x509-*-cert<\/a>.<\/li>\n<\/ul>\n<p>These are both <a href=\"http:\/\/searchfox.org\/mozilla-central\/source\/uriloader\/base\/nsIURIContentListener.idl\">nsIURIContentListener<\/a> implementations that are invoked by the <a href=\"https:\/\/dxr.mozilla.org\/mozilla-central\/source\/uriloader\/base\/nsIURILoader.idl\">nsIURILoader<\/a>.\u00a0 When a URL is opened (in a navigation context), the URI loader opens the channel to discover the content type.\u00a0 This is initiated by a docshell in the (child) content process.\u00a0 And since the content type comes from the &#8220;Content-Type&#8221; HTTP header, it means data has already started flowing to the child process.\u00a0 Normally this is what we&#8217;d want because for HTML documents we want to parse and render them in the child.\u00a0 But content processes are sandboxed and cannot directly write to disk, open applications, or manipulate the certificate store.<\/p>\n<p><a href=\"https:\/\/www.visophyte.org\/blog\/wp-content\/uploads\/2017\/03\/diversion-external-after.svg\"><img decoding=\"async\" class=\"alignnone size-full wp-image-946\" src=\"https:\/\/www.visophyte.org\/blog\/wp-content\/uploads\/2017\/03\/diversion-external-after.svg\" alt=\"Sequence diagram of channel diversion, an inherently e10s activity.\" \/><\/a><\/p>\n<p>The above sequence diagram expresses the high level parent-child diversion flow using the external helper app case as an example.\u00a0 In prose:<\/p>\n<ul>\n<li>Not in the diagram: The child channel generates an OnStartRequest event, processed by nsURILoader&#8217;s 
nsDocumentOpenInfo::OnStartRequest.\u00a0 It invokes nsExternalHelperAppService::DoContent which defers to the content-process specific DoContentContentProcessHelper which creates an ExternalHelperAppChild and returns it.\u00a0 nsDocumentOpenInfo::OnStartRequest invokes ExternalHelperAppChild::OnStartRequest.<\/li>\n<li>The child channel consumer, ExternalHelperAppChild in this example, decides that it wants to hand processing off to the parent.\u00a0 It invokes DivertToParent on the HttpChannelChild.<\/li>\n<li>HttpChannelChild suspends its ChannelEventQueue, pausing processing of data messages coming from the parent.\u00a0 Because all of this is happening during the &#8220;OnStartRequest&#8221; handler, none of the &#8220;OnDataAvailable&#8221; events with the HTTP response&#8217;s body have been processed.\u00a0 HttpChannelChild then tells the parent about the diversion by creating a PChannelDiverter actor instance that can be used to claim the diverted channel in the parent process.<\/li>\n<li>When the parent receives the constructor message, it stops the flow of data by suspending the <a href=\"https:\/\/dxr.mozilla.org\/mozilla-central\/source\/netwerk\/base\/nsInputStreamPump.h\">nsInputStreamPump<\/a> that is delivering the OnDataAvailable notifications to the nsHttpChannel.\u00a0 Prior to the diversion, these calls would be propagated to the HttpChannelParentListener and then on to the HttpChannelParent which would send the data to the child via <a href=\"https:\/\/dxr.mozilla.org\/mozilla-central\/source\/netwerk\/protocol\/http\/PHttpChannel.ipdl\">OnTransportAndData<\/a> messages.<\/li>\n<li>ExternalHelperAppChild conveys the PChannelDiverter actor to the parent in an IPC call to DivertToParentUsing(the diverter).<\/li>\n<li>ExternalHelperAppParent receives the IPC messages and invokes DivertTo(itself as stream listener).\u00a0 This results in the HttpChannelParentListener switching its mNextListener from the HttpChannelParent instance to the 
ExternalHelperAppParent so that when data starts flowing, it will be the ExternalHelperAppParent receiving the OnDataAvailable notifications.\u00a0 A synthetic OnStartRequest is also generated since the real OnStartRequest was already consumed by the child.<\/li>\n<li>HttpChannelParent&#8217;s StartDiversion method sends two IPC messages to HttpChannelChild.\u00a0 The first, FlushedForDiversion, gets wrapped into a ChannelEvent by the child and placed in the suspended ChannelEventQueue.\u00a0 It will serve as a notification to the parent that the child has finished processing its event queue.\u00a0 The second message, DivertMessages, resumes the processing of the ChannelEventQueue.\u00a0 This will result in the child processing all of the buffered data messages.\u00a0 Because diversion is enabled, they will be re-transmitted to the parent as DivertOnDataAvailable messages.\u00a0 Finally, the FlushedForDiversion event will be processed, sending a DivertComplete message to the parent.<\/li>\n<li>HttpChannelParent processes the messages, and resumes the underlying nsHttpChannel, which will then resume its nsInputStreamPump, causing data and events to flow again.\u00a0 The HttpChannelParentListener is still the nsHttpChannel&#8217;s listener, but it will now be calling the ExternalHelperAppParent&#8217;s OnDataAvailable.<\/li>\n<\/ul>\n<h3>E10s Complications: Redirects<\/h3>\n<p>Redirects get more complicated under e10s.\u00a0 The good news is that &#8220;normal&#8221; redirects all happen in the parent, based around the same nsHttpChannel control flow discussed above.\u00a0 The e10s logic builds on that.<\/p>\n<p>foo<span style=\"text-decoration: underline;\"><strong> DISCUSS<\/strong><\/span> foo.\u00a0 See http channel redirect crash notes which happily overlap with this.<strong><br \/>\n<\/strong><\/p>\n<p>&nbsp;<\/p>\n<p>&nbsp;<\/p>\n<h2>Service Worker and HTTP Channel Interactions<\/h2>\n<h3>nsINetworkInterceptController<\/h3>\n<p><a 
href=\"https:\/\/dxr.mozilla.org\/mozilla-central\/source\/netwerk\/base\/nsINetworkInterceptController.idl#126\">nsINetworkInterceptController<\/a> is the high level interaction point between Service Workers and Necko.\u00a0 Each &#8220;document&#8221; (really, <a href=\"https:\/\/dxr.mozilla.org\/mozilla-central\/source\/docshell\/base\/nsDocShell.h\">nsDocShell<\/a>) implements the interface.\u00a0 This means it has the context to know whether the document is controlled and by whom in addition to any arguments passed in method calls.\u00a0 The interface is exposed to the channel by being set as the <a href=\"http:\/\/searchfox.org\/mozilla-central\/search?q=notificationCallbacks\">notificationCallbacks<\/a> attribute on the <a href=\"http:\/\/searchfox.org\/mozilla-central\/source\/netwerk\/base\/nsIChannel.idl\">channel<\/a> or the <a href=\"https:\/\/dxr.mozilla.org\/mozilla-central\/source\/netwerk\/base\/nsILoadGroup.idl\">load group<\/a>.<\/p>\n<p>The interface defines 2 methods:<\/p>\n<ul>\n<li><span class=\"k\">bool<\/span> <a data-menu=\"[{&quot;href&quot;: &quot;\/mozilla-central\/search?q=function%3AshouldPrepareForIntercept&quot;, &quot;html&quot;: &quot;Find overrides&quot;, &quot;icon&quot;: &quot;method&quot;, &quot;title&quot;: &quot;Search for overrides of this method.&quot;}, {&quot;href&quot;: &quot;\/mozilla-central\/search?q=function-decl%3AshouldPrepareForIntercept&quot;, &quot;html&quot;: &quot;Find declaration&quot;, &quot;icon&quot;: &quot;method&quot;, &quot;title&quot;: &quot;Search for declarations.&quot;}]\">shouldPrepareForIntercept<\/a>(in nsIURI aURI, in <span class=\"k\">bool<\/span> aIsNonSubresourceRequest): A synchronous method to check whether the given URI should be intercepted and therefore should avoid touching the network.\u00a0 If true is returned and the channel doesn&#8217;t get canceled in the interim, a call to the next method should be expected at some point in the future.<\/li>\n<li>void channelIntercepted(in 
nsIInterceptedChannel aChannel): Here&#8217;s an <a href=\"https:\/\/dxr.mozilla.org\/mozilla-central\/source\/netwerk\/base\/nsINetworkInterceptController.idl\">intercepted channel<\/a> that you need to do 1 of 2 things to:\n<ul>\n<li>Call resetInterception(), causing an internal redirect so that we act like the SW never existed, checking the HTTP cache and going to the network as appropriate.<\/li>\n<li>Generate a synthesized response.\u00a0 Set the status via synthesizeStatus, set headers via synthesizeHeader, write to the responseBody stream,\u00a0 and finish by invoking finishSynthesizedResponse(finalURLSpec).<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<p>Under e10s, the nsDocShell instances live in the content process.\u00a0 So what happens in the parent process?\u00a0 The answer is that HttpChannelParentListener also implements the interface and handles the calls.\u00a0 nsHttpChannel always attempts to retrieve an nsINetworkInterceptController from its notificationCallbacks attribute and invoke its shouldPrepareForIntercept method to determine whether interception is appropriate.\u00a0 This happens regardless of whether it&#8217;s an e10s scenario where the nsHttpChannel was created by an HttpChannelParent or it&#8217;s non-e10s and the nsHttpChannel was directly created in the parent.\u00a0 nsHttpChannel doesn&#8217;t distinguish between the two, it just consumes the interface.<\/p>\n<h3>Intercepted Channels<\/h3>\n<p>As mentioned above, nsIInterceptedChannel is the interface that the Service Worker fetch event handler uses to provide its Response object.\u00a0 Currently this entails a number of separate method calls, but in the future it&#8217;s likely the Response will be directly provided instead.<\/p>\n<p><a href=\"https:\/\/www.visophyte.org\/blog\/wp-content\/uploads\/2017\/03\/intercepted-channels.svg\"><img decoding=\"async\" class=\"alignnone size-full wp-image-948\" 
src=\"https:\/\/www.visophyte.org\/blog\/wp-content\/uploads\/2017\/03\/intercepted-channels.svg\" alt=\"Confusing diagram of relationship of intercepted channels.\" \/><\/a><\/p>\n<p>There are two implementations of <a href=\"https:\/\/dxr.mozilla.org\/mozilla-central\/source\/netwerk\/base\/nsINetworkInterceptController.idl#126\">nsIInterceptedChannel<\/a>: InterceptedChannelChrome and InterceptedChannelContent, both of which subclass InterceptedChannelBase.\u00a0 Although the Chrome variant will only ever be created in the parent process and the Content variant in a child content process, they do not have a parent\/child relationship like HttpChannelParent and HttpChannelChild.\u00a0 We have two implementation types because they each hold mChannel references to the concrete channel types, nsHttpChannel in the parent, and HttpChannelChild in the child.\u00a0 They do this because of their differing means of injecting the synthesized content and resetting interception.<\/p>\n<p>In nsHttpChannel, Service Worker interception is performed during the OpenCacheEntry stage of processing.\u00a0 If we intercept and provide a synthesized result, we never hit the HTTP cache and a synthesized cache entry is generated.\u00a0 InterceptedChannelChrome has the specific logic to populate the synthetic cache entry.\u00a0 And it handles calls to resetInterception by manually triggering a programmatic redirect; nsHttpChannel itself has no resetInterception method.<\/p>\n<p>InterceptedChannelContent&#8217;s behavior changes with the patch, which we&#8217;ll get to in a later section.\u00a0 The key difference to be aware of is that HttpChannelChild is very aware of ServiceWorkers and interception and so InterceptedChannelContent is able to be a thinner layer.\u00a0 HttpChannelChild exposes a ResetInterception method it is able to invoke directly, and post-patch, content synthesis is simpler too.<\/p>\n<p>Differences in behavior between e10s and non-e10s stem from whether the 
nsINetworkInterceptController is an nsDocShell (non-e10s in parent, e10s in child) or an HttpChannelParentListener (e10s in parent only).<\/p>\n<h3>Complications: Redirects and Secure Upgrades<\/h3>\n<p>&nbsp;<\/p>\n<h1>The Old Way: Child Intercept<\/h1>\n<h2>Strategy: Don&#8217;t Get The Parents Involved<\/h2>\n<p>The current pre-patch implementation optimizes for the ability to process Service Worker interceptions in the child with as little parent involvement as possible.\u00a0 In a world where the decision to intercept and the processing of the intercept both happen entirely in the child process, this makes a lot of sense.\u00a0 But it also comes with a massive amount of complexity because it means the child channel needs to duplicate logic that would normally be handled in the parent, plus the additional permutations when the parent does need to get involved.<\/p>\n<p>The parent needs to be involved when any of the following things happen:<\/p>\n<ul>\n<li>The fetch needs to go to the network because neither respondWith() nor preventDefault() was invoked on the fetch event.<\/li>\n<li>respondWith() is resolved with a Response with a redirect status.<\/li>\n<li>Diversion: As previously covered, diversion is a mechanism by which the channel can be consumed in the parent process and by definition this involves the parent process.<\/li>\n<\/ul>\n<p>The child can handle things without involving the parent process when:<\/p>\n<ul>\n<li>respondWith is resolved with a Response whose URI matches that of the request.\u00a0 This is the simplest and most straightforward case.<\/li>\n<li>respondWith is resolved with a Response that does not have a redirect status but whose URI does not match that of the request.\u00a0 This results in HttpChannelChild&#8217;s specialized BeginNonIPCRedirect method being invoked to run the redirect entirely in the child.<\/li>\n<\/ul>\n<h2>Complexity: Redirects<\/h2>\n<p>Redirects happen in the parent.\u00a0 Why?\u00a0 Because that&#8217;s 
where they happened prior to the introduction of Service Workers.\u00a0 Why?\u00a0 Because nsHttpChannel already knew how to perform redirects and there&#8217;s only a performance hit to remote the decision-making process from the parent where the network I\/O is happening down to the child and then back up again.\u00a0 (At least that&#8217;s my guess on the simplest evolutionary explanation.)<\/p>\n<p>As a result, if a ServiceWorker responds to a fetch event with a redirection, an HttpChannelParent will need to be spun up.\u00a0 Things get more complicated because this transfers control to the parent and that redirected channel may also need to be intercepted.<\/p>\n<p>As mentioned previously, HttpChannelParentListener implements nsINetworkInterceptController.\u00a0 This is how the child regains control of the channel.\u00a0 When responding with a synthesized redirect, the child tells the parent that it should intercept the redirected channel.\u00a0 Its shouldPrepareForIntercept will then return true so that its channelIntercepted method will be called.\u00a0 When its channelIntercepted method is called it simply saves off the intercepted channel and does nothing.\u00a0 Instead, the next action will be taken by the HttpChannelChild whose CompleteRedirectSetup method will be sending a SendFinishInterceptedRedirect ping-pong.\u00a0 The parent will stop sending IPC messages on receipt and issue its own send.\u00a0 The child will then delete the parent and continue the redirect handling in the child by invoking AsyncOpen2.<br \/>\n**need to <strong>document<\/strong> relationship of the self-deading channel to the redirected channel**.\u00a0 It looks like the same child.\u00a0 How do replacement channels work for this?<\/p>\n<h3>InterceptStreamListener<\/h3>\n<p>The synthesized response data and its nsIStreamListener events are actually being generated by HttpChannelChild&#8217;s mSynthesizedResponsePump nsInputStreamPump.\u00a0 This means that 
in the OnDataAvailable and other listener notifications, the nsIRequest* aRequest is that of the nsInputStreamPump rather than the HttpChannelChild.\u00a0 This is potentially confusing to the listener.\u00a0 So the InterceptStreamListener is registered as the pump&#8217;s listener and it redirects each event to the child&#8217;s listener, passing the child as aRequest instead.<\/p>\n<h1>The New Way: Parent Intercept<\/h1>\n","protected":false},"excerpt":{"rendered":"<p>THIS IS AN OUTDATED DRAFT, IF YOU CAN READ THIS, I PUSHED SOME BUTTONS WRONG.\u00a0 OOPS. Meta: Bug 1231222 overhauls how Service Worker (spec, MDN: overview details) network interception happens in Gecko.\u00a0 I&#8217;m writing this post in support of the &hellip; <a href=\"https:\/\/www.visophyte.org\/blog\/2017\/08\/18\/gecko-performing-service-worker-interceptions-in-the-parent-process\/\">Continue reading <span class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"inline_featured_image":false,"footnotes":""},"categories":[134],"tags":[],"class_list":["post-933","post","type-post","status-publish","format-standard","hentry","category-drafty"],"_links":{"self":[{"href":"https:\/\/www.visophyte.org\/blog\/wp-json\/wp\/v2\/posts\/933","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.visophyte.org\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.visophyte.org\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.visophyte.org\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.visophyte.org\/blog\/wp-json\/wp\/v2\/comments?post=933"}],"version-history":[{"count":9,"href":"https:\/\/www.visophyte.org\/blog\/wp-json\/wp\/v2\/posts\/933\/revisions"}],"predecessor-version":[{"id":954,"href":"https:\/\/www.visophyte.org\/blog\/wp-json\/wp\/v2\/posts\/93
3\/revisions\/954"}],"wp:attachment":[{"href":"https:\/\/www.visophyte.org\/blog\/wp-json\/wp\/v2\/media?parent=933"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.visophyte.org\/blog\/wp-json\/wp\/v2\/categories?post=933"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.visophyte.org\/blog\/wp-json\/wp\/v2\/tags?post=933"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}