diff --git a/CHANGELOG.md b/CHANGELOG.md index eaf9ed638b..276851796a 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -4,6 +4,23 @@ *not released yet* + * Implement lazy loading and compact internal representation [#87](https://github.com/cliqz-oss/adblocker/pull/87) + * [BREAKING] serialization module has been removed, instead, each class now + provides a `serialize` method as well as a static method `deserialize`. + * [BREAKING] FiltersEngine now exposes different methods for update: + `update` which expects a diff of filters, `updateList` and + `updateResources`. This API should be a cleared and allows using the + adblocker without managing filters lists. + * [BREAKING] ReverseIndex' API dropped the use of a callback to specify + filters and instead expects a list of filters. + * [BREAKING] parsing and matching filters can now be done using methods of + the filters classes directly instead of free functions. For example + NetworkFilter has a `parse` and `match` method (with the same expected + arguments). + * ReverseIndex is now implemented using a very compact + representation (stored in a typed array). + * `toString` method of filters should now be more accurate. + * Addition of numerous unit tests (coverage is now >90%) * Implement support for :style cosmetic filters [#86](https://github.com/cliqz-oss/adblocker/pull/86) * [BREAKING] `getCosmeticsFilters` will now return CSS as a single string (stylesheet) instead of a list of selectors. This simplifies the usage and diff --git a/README.md b/README.md index 3535447843..c029a7dd5b 100644 --- a/README.md +++ b/README.md @@ -1,47 +1,152 @@ # Adblocker -A fast, pure-JavaScript content-blocking library made by Cliqz. +A fast and memory efficient, pure-JavaScript content-blocking library made by Cliqz. -This library is the building block technology used to power Cliqz and Ghostery's Adblocking. 
Being a pure JavaScript library, it can be used for various purposes such as: +This library is the building block technology used to power Ghostery and +Cliqz' Adblocking. Being a pure JavaScript library it is trivial to include in +any new project and can also be used as a building block for tooling. For +example this library can be used for: * Building a content-blocking extension (see [this example](./example) for a minimal content-blocking webextension) * Building tooling around popular block-lists such as [EasyList](https://github.com/easylist/easylist) -* Converting between various formats of filters (EasyList, Safari Block Lists, etc.) -* Detecting duplicates in lists + - validating filters + - normalizing filters + - detecting redundant filters * Detecting dead domains * etc. -The library provides the low-level implementation to fetch, parse and match filters; which makes it possible to manipulate the lists at a high level. +The library provides abstractions to manipulate filters at a low-level. -## Development +## Getting Started + +This package can be installed directly from `npm`: -Install dependencies: ```sh -$ npm install +$ npm install @cliqz/adblocker ``` -Build: +Or you can install it from sources directly: ```sh +$ npm ci $ npm pack +$ npm run test ``` -Test: -```sh -$ npm run test +Multiple bundles are provided in the `dist` folder. + +## Usage + + +### Network Filters + +Here is how one can parse and match individual *network* filters: + +```javascript +const { NetworkFilter, Request } = require('@cliqz/adblocker'); + +// 1. Parsing +const filter = NetworkFilter.parse('||domain.com/ads.js$script'); + +// 2. Matching +filter.match(new Request({ + type: 'script', + + domain: 'domain.com', + hostname: 'domain.com', + url: 'https://domain.com/ads.js?param=42', + + sourceUrl: 'https://domain.com/', + sourceHostname: 'domain.com', + sourceDomain: 'domain.com', +})); ``` -You can use the following bundle: `adblocker.umd.js`. 
+Matching requires you to provide an instance of `Request` which knows +about the type of the request (e.g.: `main_frame`, `script,` etc.) as +well as the URL, hostname and domain of the request and *source URL*. +To make things a bit easier, the library exposes a `makeRequest` helper +which can be used alongside a library like `tldts` (or another library +providing parsing of hostnames and domains) to provide the parsing: + +```javascript +const tldts = require('tldts'); +const { NetworkFilter, makeRequest } = require('@cliqz/adblocker'); + +// 1. Parsing +const filter = NetworkFilter.parse('||domain.com/ads.js$script'); + +// 2. Matching +filter.match(makeRequest({ + type: 'script', + url: 'https://domain.com/ads.js', +}, tldts)); // true +``` + +### Cosmetic Filters + +Here is how one can parse and match individual *cosmetic* filters: + +```javascript +const { CosmeticFilter } = require('@cliqz/adblocker'); + +// 1. Parsing +const filter = CosmeticFilter.parse('domain.*,domain2.com###selector'); + +// 2. The `match` method expects a hostname and domain as arguments +filter.match('sub.domain.com', 'domain.com'); // true +``` + +### Filters Engine + +Manipulating filters at a low level is useful to build tooling or debugging, but +to perform efficient matching we need to use `FiltersEngine` which can be seen +as a "container" for both network and cosmetic filters. The filters are +organized in a very compact way and allow fast matching against requests. + +```javascript +const { FiltersEngine } = require('@cliqz/adblocker'); + +const engine = FiltersEngine.parse(` +! 
This is a custom list +||domain.com/ads.js$script + +###selector +domain.com,entity.*##+js(script,args1) +`); + +// It is possible to serialize the full engine to a typed array for caching +const serialized = engine.serialize(); +const deserialized = FiltersEngine.deserialize(serialized); + +// Matching network filters +const { + match, // `true` if there is a match + redirect, // data url to redirect to if any + exception, // instance of NetworkFilter exception if any + filter, // instance of NetworkFilter which matched +} = engine.match(new Request(...)); + +// Matching CSP (content security policy) filters +const directives = engine.getCSPDirectives(new Request(...)); + +// Matching cosmetic filters +const { + styles, // stylesheet to inject in the page + scripts, // Array of scriptlets to inject in the page +} = engine.getCosmeticFilters('sub.domain.com', 'domain.com'); +``` -## Releasing Checklist +# Release Checklist To publish a new version: -1. Update `version` in [package.json](./package.json) -2. Update [CHANGELOG.md](./CHANGELOG.md) -3. New commit on local `master` branch (e.g.: `Release vx.y.z`) -5. Make release PR with your commit -6. Merge and create new Release on GitHub -6. Travis takes care of the rest! +1. Create a new branch (e.g.: `release-x.y.z`) +2. Update `version` in [package.json](./package.json) +3. Update [CHANGELOG.md](./CHANGELOG.md) +4. Create a release commit (e.g.: "Release vx.y.z") +5. Create a PR for the release +6. Merge and create a new Release on GitHub +7. Travis takes care of the rest! 
## License diff --git a/bench/micro.js b/bench/micro.js index 3dd17b7088..0e2fea5baa 100644 --- a/bench/micro.js +++ b/bench/micro.js @@ -10,16 +10,12 @@ function benchEngineCreation({ lists, resources }) { }); } -function benchEngineOptimization({ engine }) { - return engine.optimize(); -} - function benchEngineSerialization({ engine }) { return engine.serialize(); } function benchEngineDeserialization({ serialized }) { - return adblocker.deserializeEngine(serialized, 1); + return adblocker.FiltersEngine.deserialize(serialized); } function benchStringHashing({ filters }) { @@ -42,11 +38,10 @@ function benchParsingImpl(lists, { loadNetworkFilters, loadCosmeticFilters }) { let dummy = 0; for (let i = 0; i < lists.length; i += 1) { - dummy = (dummy + adblocker.parseList(lists[i], { + dummy = (dummy + adblocker.parseFilters(lists[i], { loadNetworkFilters, loadCosmeticFilters, - debug: false, - }).length) % 100000; + }).networkFilters.length) % 100000; } return dummy; @@ -70,7 +65,6 @@ function benchNetworkFiltersParsing({ lists }) { module.exports = { benchCosmeticsFiltersParsing, benchEngineCreation, - benchEngineOptimization, benchEngineSerialization, benchEngineDeserialization, benchNetworkFiltersParsing, diff --git a/bench/run_benchmark.js b/bench/run_benchmark.js index a6d3f85319..c535a0020a 100644 --- a/bench/run_benchmark.js +++ b/bench/run_benchmark.js @@ -29,7 +29,6 @@ const { const { benchEngineCreation, - benchEngineOptimization, benchEngineSerialization, benchEngineDeserialization, benchNetworkFiltersParsing, @@ -68,7 +67,7 @@ function triggerGC() { function getMemoryConsumption() { triggerGC(); - return process.memoryUsage().heapUsed / 1024 / 1024; + return process.memoryUsage().heapUsed; } @@ -136,7 +135,6 @@ function runMicroBenchmarks(lists, resources) { }; [ - benchEngineOptimization, benchStringHashing, benchCosmeticsFiltersParsing, benchStringTokenize, @@ -176,7 +174,6 @@ function runMemoryBench(lists, resources) { const { engine, serialized } = 
createEngine(lists, resources, { loadCosmeticFilters: true, loadNetworkFilters: true, - optimizeAOT: true, }, true /* Also serialize engine */); const engineMemory = getMemoryConsumption() - baseMemory; diff --git a/bench/utils.js b/bench/utils.js index d7cd33dbe8..65b5eadb9b 100644 --- a/bench/utils.js +++ b/bench/utils.js @@ -2,17 +2,12 @@ const fs = require('fs'); const adblocker = require('../dist/adblocker.cjs.js'); function createEngine(lists, resources, options = {}, serialize = false) { - const engine = new adblocker.FiltersEngine({ - ...options, - version: 1, - }); - - engine.onUpdateResource([{ filters: resources, checksum: '' }]); - engine.onUpdateFilters(lists.map((list, i) => ({ - asset: `${i}`, - checksum: '', - filters: lists[i], - })), new Set()); + const engine = adblocker.FiltersEngine.parse( + lists.join('\n'), + options, + ); + + engine.updateResources(resources, ''); return { engine, diff --git a/example/background.ts b/example/background.ts index a3f0021fb2..d592c7b91a 100644 --- a/example/background.ts +++ b/example/background.ts @@ -7,13 +7,6 @@ import * as adblocker from '../index'; * should be blocked or altered. 
*/ function loadAdblocker() { - const engine = new adblocker.FiltersEngine({ - enableOptimizations: true, - loadCosmeticFilters: true, - loadNetworkFilters: true, - optimizeAOT: true, - }); - console.log('Fetching resources...'); return Promise.all([adblocker.fetchLists(), adblocker.fetchResources()]).then( ([responses, resources]) => { @@ -26,16 +19,28 @@ function loadAdblocker() { } } - engine.onUpdateResource([{ filters: resources, checksum: '' }]); - engine.onUpdateFilters([ - { - asset: 'filters', - checksum: '', - filters: [...deduplicatedLines].join('\n'), - }, - ]); + let t0 = Date.now(); + const engine = adblocker.FiltersEngine.parse([...deduplicatedLines].join('\n')); + let total = Date.now() - t0; + console.log('parsing filters', total); + + t0 = Date.now(); + engine.updateResources(resources, '' + adblocker.fastHash(resources)); + total = Date.now() - t0; + console.log('parsing resources', total); + + t0 = Date.now(); + const serialized = engine.serialize(); + total = Date.now() - t0; + console.log('serialization', total); + console.log('size', serialized.byteLength); + + t0 = Date.now(); + const deserialized = adblocker.FiltersEngine.deserialize(serialized); + total = Date.now() - t0; + console.log('deserialization', total); - return adblocker.deserializeEngine(engine.serialize()); + return deserialized; }, ); } diff --git a/index.ts b/index.ts index 7925cc2267..f7a5bd1740 100644 --- a/index.ts +++ b/index.ts @@ -1,20 +1,16 @@ // Cosmetic injection export { default as injectCosmetics, IMessageFromBackground } from './src/cosmetics-injection'; -// Blocking export { default as FiltersEngine } from './src/engine/engine'; export { default as ReverseIndex } from './src/engine/reverse-index'; export { default as Request, makeRequest } from './src/request'; -export { deserializeEngine } from './src/serialization'; +export { default as CosmeticFilter } from './src/filters/cosmetic'; +export { default as NetworkFilter } from './src/filters/network'; -export { 
default as matchCosmeticFilter } from './src/matching/cosmetics'; -export { default as matchNetworkFilter } from './src/matching/network'; - -export { parseCosmeticFilter } from './src/parsing/cosmetic-filter'; -export { parseNetworkFilter } from './src/parsing/network-filter'; -export { f, parseList } from './src/parsing/list'; +export { f, List, default as Lists, parseFilters } from './src/lists'; export { compactTokens, hasEmptyIntersection, mergeCompactSets } from './src/compact-set'; export { fetchLists, fetchResources } from './src/fetch'; export { tokenize, fastHash, updateResponseHeadersWithCSP } from './src/utils'; +export { default as StaticDataView } from './src/data-view'; diff --git a/package-lock.json b/package-lock.json index 905d58f856..8a37772633 100644 --- a/package-lock.json +++ b/package-lock.json @@ -25,9 +25,9 @@ } }, "@types/chrome": { - "version": "0.0.77", - "resolved": "https://registry.npmjs.org/@types/chrome/-/chrome-0.0.77.tgz", - "integrity": "sha512-VPjm9KeAbwNM0gY8wFGCqO45N382xxiUhgGaiqQPai/NmLNOHBl6w6m71seq566Na0v/+skcWozzyalmX/FMIg==", + "version": "0.0.78", + "resolved": "https://registry.npmjs.org/@types/chrome/-/chrome-0.0.78.tgz", + "integrity": "sha512-0ibUi2LxeH96KIGRr/QZ6TlUm6DCBXztUn2pWK8vWMgasS2vkADoO5OROG0Fqz7YkKLa/DAVzQF5h/4acxaCgg==", "dev": true, "requires": { "@types/filesystem": "*" @@ -55,9 +55,9 @@ "dev": true }, "@types/jest": { - "version": "23.3.11", - "resolved": "https://registry.npmjs.org/@types/jest/-/jest-23.3.11.tgz", - "integrity": "sha512-eroF85PoG87XjCwzxey7yBsQNkIY/TV5myKKSG/022A0FW25afdu/uub6JDMS5eT68zBBt82S+w/MFOTjeLM3Q==", + "version": "23.3.13", + "resolved": "https://registry.npmjs.org/@types/jest/-/jest-23.3.13.tgz", + "integrity": "sha512-ePl4l+7dLLmCucIwgQHAgjiepY++qcI6nb8eAwGNkB6OxmTe3Z9rQU3rSpomqu42PCCnlThZbOoxsf+qylJsLA==", "dev": true }, "@types/jsdom": { @@ -78,9 +78,9 @@ "dev": true }, "@types/tough-cookie": { - "version": "2.3.4", - "resolved": 
"https://registry.npmjs.org/@types/tough-cookie/-/tough-cookie-2.3.4.tgz", - "integrity": "sha512-Set5ZdrAaKI/qHdFlVMgm/GsAv/wkXhSTuZFkJ+JI7HK+wIkIlOaUXSXieIvJ0+OvGIqtREFoE+NHJtEq0gtEw==", + "version": "2.3.5", + "resolved": "https://registry.npmjs.org/@types/tough-cookie/-/tough-cookie-2.3.5.tgz", + "integrity": "sha512-SCcK7mvGi3+ZNz833RRjFIxrn4gI1PPR3NtuIS+6vMkvmsGjosqTJwRt5bAEFLRz+wtJMWv8+uOnZf2hi2QXTg==", "dev": true }, "abab": { @@ -90,9 +90,9 @@ "dev": true }, "acorn": { - "version": "6.0.4", - "resolved": "https://registry.npmjs.org/acorn/-/acorn-6.0.4.tgz", - "integrity": "sha512-VY4i5EKSKkofY2I+6QLTbTTN/UvEQPCo6eiwzzSaSWfpaDhOmStMCMod6wmuPciNq+XS0faCglFu2lHZpdHUtg==", + "version": "6.0.5", + "resolved": "https://registry.npmjs.org/acorn/-/acorn-6.0.5.tgz", + "integrity": "sha512-i33Zgp3XWtmZBMNvCr4azvOFeWVw1Rk6p3hfi3LUDvIFraOMywb1kAtrbi+med14m4Xfpqm3zRZMT+c0FNE7kg==", "dev": true }, "acorn-globals": { @@ -118,9 +118,9 @@ "dev": true }, "ajv": { - "version": "6.6.2", - "resolved": "https://registry.npmjs.org/ajv/-/ajv-6.6.2.tgz", - "integrity": "sha512-FBHEW6Jf5TB9MGBgUUA9XHkTbjXYfAUjY43ACMfmdMRHniyoMHjHjzD50OK8LGDWQwp4rWEsIq5kEqq7rvIM1g==", + "version": "6.7.0", + "resolved": "https://registry.npmjs.org/ajv/-/ajv-6.7.0.tgz", + "integrity": "sha512-RZXPviBTtfmtka9n9sy1N5M5b82CbxWIR6HIis4s3WQTXDJamc/0gpCWNGz6EWdWp4DOfjzJfhz/AS9zVPjjWg==", "dev": true, "requires": { "fast-deep-equal": "^2.0.1", @@ -777,6 +777,40 @@ "requires": { "locate-path": "^2.0.0" } + }, + "locate-path": { + "version": "2.0.0", + "resolved": "https://registry.npmjs.org/locate-path/-/locate-path-2.0.0.tgz", + "integrity": "sha1-K1aLJl7slExtnA3pw9u7ygNUzY4=", + "dev": true, + "requires": { + "p-locate": "^2.0.0", + "path-exists": "^3.0.0" + } + }, + "p-limit": { + "version": "1.3.0", + "resolved": "https://registry.npmjs.org/p-limit/-/p-limit-1.3.0.tgz", + "integrity": "sha512-vvcXsLAJ9Dr5rQOPk7toZQZJApBl2K4J6dANSsEuh6QI41JYcsS/qhTGa9ErIUUgK3WNQoJYvylxvjqmiqEA9Q==", + "dev": true, + 
"requires": { + "p-try": "^1.0.0" + } + }, + "p-locate": { + "version": "2.0.0", + "resolved": "https://registry.npmjs.org/p-locate/-/p-locate-2.0.0.tgz", + "integrity": "sha1-IKAQOyIqcMj9OcwuWAaA893l7EM=", + "dev": true, + "requires": { + "p-limit": "^1.1.0" + } + }, + "p-try": { + "version": "1.0.0", + "resolved": "https://registry.npmjs.org/p-try/-/p-try-1.0.0.tgz", + "integrity": "sha1-y8ec26+P1CKOE/Yh8rGiN8GyB7M=", + "dev": true } } }, @@ -1089,19 +1123,10 @@ } } }, - "caller-path": { - "version": "0.1.0", - "resolved": "https://registry.npmjs.org/caller-path/-/caller-path-0.1.0.tgz", - "integrity": "sha1-lAhe9jWB7NPaqSREqP6U6CV3dR8=", - "dev": true, - "requires": { - "callsites": "^0.2.0" - } - }, "callsites": { - "version": "0.2.0", - "resolved": "https://registry.npmjs.org/callsites/-/callsites-0.2.0.tgz", - "integrity": "sha1-r6uWJikQp/M8GaV3WCXGnzTjUMo=", + "version": "3.0.0", + "resolved": "https://registry.npmjs.org/callsites/-/callsites-3.0.0.tgz", + "integrity": "sha512-tWnkwu9YEq2uzlBDI4RcLn8jrFvF9AOi8PxDNU3hZZjJcjkcRAq3vCI+vZcg1SuxISDYe86k9VZFwAxDiJGoAw==", "dev": true }, "camelcase": { @@ -1126,9 +1151,9 @@ "dev": true }, "chalk": { - "version": "2.4.1", - "resolved": "https://registry.npmjs.org/chalk/-/chalk-2.4.1.tgz", - "integrity": "sha512-ObN6h1v2fTJSmUXoS3nMQ92LbDK9be4TV+6G+omQlGJFdcUX5heKi1LZ1YnRMIgwTLEj3E24bT6tYni50rlCfQ==", + "version": "2.4.2", + "resolved": "https://registry.npmjs.org/chalk/-/chalk-2.4.2.tgz", + "integrity": "sha512-Mti+f9lpJNcwF4tWV8/OrTTtF1gZi+f8FqlyAdouralcFWFQWF2+NgCHShjkCb+IFBLq9buZwE1xckQU4peSuQ==", "dev": true, "requires": { "ansi-styles": "^3.2.1", @@ -1325,33 +1350,6 @@ "integrity": "sha1-6CB68cx7MNRGzHC3NLXovhj4jVE=", "dev": true }, - "parse-json": { - "version": "4.0.0", - "resolved": "https://registry.npmjs.org/parse-json/-/parse-json-4.0.0.tgz", - "integrity": "sha1-vjX1Qlvh9/bHRxhPmKeIy5lHfuA=", - "dev": true, - "requires": { - "error-ex": "^1.3.1", - "json-parse-better-errors": "^1.0.1" - } - }, - "pify": 
{ - "version": "3.0.0", - "resolved": "https://registry.npmjs.org/pify/-/pify-3.0.0.tgz", - "integrity": "sha1-5aSs0sEB/fPZpNB/DbxNtJ3SgXY=", - "dev": true - }, - "read-pkg": { - "version": "4.0.1", - "resolved": "https://registry.npmjs.org/read-pkg/-/read-pkg-4.0.1.tgz", - "integrity": "sha1-ljYlN48+HE1IyFhytabsfV0JMjc=", - "dev": true, - "requires": { - "normalize-package-data": "^2.3.2", - "parse-json": "^4.0.0", - "pify": "^3.0.0" - } - }, "supports-color": { "version": "4.5.0", "resolved": "https://registry.npmjs.org/supports-color/-/supports-color-4.5.0.tgz", @@ -1385,9 +1383,9 @@ "dev": true }, "core-js": { - "version": "2.6.1", - "resolved": "https://registry.npmjs.org/core-js/-/core-js-2.6.1.tgz", - "integrity": "sha512-L72mmmEayPJBejKIWe2pYtGis5r0tQ5NaJekdhyXgeMQTpJoBsH0NL4ElY2LfSoV15xeQWKQ+XTTOZdyero5Xg==", + "version": "2.6.3", + "resolved": "https://registry.npmjs.org/core-js/-/core-js-2.6.3.tgz", + "integrity": "sha512-l00tmFFZOBHtYhN4Cz7k32VM7vTn3rE2ANjQDxdEN6zmXZ/xq1jQuutnmHvMG1ZJ7xd72+TA5YpUK8wz3rWsfQ==", "dev": true }, "core-util-is": { @@ -1658,16 +1656,17 @@ } }, "es-abstract": { - "version": "1.12.0", - "resolved": "https://registry.npmjs.org/es-abstract/-/es-abstract-1.12.0.tgz", - "integrity": "sha512-C8Fx/0jFmV5IPoMOFPA9P9G5NtqW+4cOPit3MIuvR2t7Ag2K15EJTpxnHAYTzL+aYQJIESYeXZmDBfOBE1HcpA==", + "version": "1.13.0", + "resolved": "https://registry.npmjs.org/es-abstract/-/es-abstract-1.13.0.tgz", + "integrity": "sha512-vDZfg/ykNxQVwup/8E1BZhVzFfBxs9NqMzGcvIJrqg5k2/5Za2bWo40dK2J1pgLngZ7c+Shh8lwYtLGyrwPutg==", "dev": true, "requires": { - "es-to-primitive": "^1.1.1", + "es-to-primitive": "^1.2.0", "function-bind": "^1.1.1", - "has": "^1.0.1", - "is-callable": "^1.1.3", - "is-regex": "^1.0.4" + "has": "^1.0.3", + "is-callable": "^1.1.4", + "is-regex": "^1.0.4", + "object-keys": "^1.0.12" } }, "es-to-primitive": { @@ -1716,9 +1715,9 @@ } }, "eslint": { - "version": "5.11.1", - "resolved": "https://registry.npmjs.org/eslint/-/eslint-5.11.1.tgz", - 
"integrity": "sha512-gOKhM8JwlFOc2acbOrkYR05NW8M6DCMSvfcJiBB5NDxRE1gv8kbvxKaC9u69e6ZGEMWXcswA/7eKR229cEIpvg==", + "version": "5.12.1", + "resolved": "https://registry.npmjs.org/eslint/-/eslint-5.12.1.tgz", + "integrity": "sha512-54NV+JkTpTu0d8+UYSA8mMKAG4XAsaOrozA9rCW7tgneg1mevcL7wIotPC+fZ0SkWwdhNqoXoxnQCTBp7UvTsg==", "dev": true, "requires": { "@babel/code-frame": "^7.0.0", @@ -1738,6 +1737,7 @@ "glob": "^7.1.2", "globals": "^11.7.0", "ignore": "^4.0.6", + "import-fresh": "^3.0.0", "imurmurhash": "^0.1.4", "inquirer": "^6.1.0", "js-yaml": "^3.12.0", @@ -1752,7 +1752,6 @@ "pluralize": "^7.0.0", "progress": "^2.0.0", "regexpp": "^2.0.1", - "require-uncached": "^1.0.3", "semver": "^5.5.1", "strip-ansi": "^4.0.0", "strip-json-comments": "^2.0.1", @@ -1810,13 +1809,13 @@ } }, "eslint-module-utils": { - "version": "2.2.0", - "resolved": "https://registry.npmjs.org/eslint-module-utils/-/eslint-module-utils-2.2.0.tgz", - "integrity": "sha1-snA2LNiLGkitMIl2zn+lTphBF0Y=", + "version": "2.3.0", + "resolved": "https://registry.npmjs.org/eslint-module-utils/-/eslint-module-utils-2.3.0.tgz", + "integrity": "sha512-lmDJgeOOjk8hObTysjqH7wyMi+nsHwwvfBykwfhjR1LNdd7C2uFJBvx4OpWYpXOw4df1yE1cDEVd1yLHitk34w==", "dev": true, "requires": { "debug": "^2.6.8", - "pkg-dir": "^1.0.0" + "pkg-dir": "^2.0.0" }, "dependencies": { "debug": { @@ -1837,21 +1836,21 @@ } }, "eslint-plugin-import": { - "version": "2.14.0", - "resolved": "https://registry.npmjs.org/eslint-plugin-import/-/eslint-plugin-import-2.14.0.tgz", - "integrity": "sha512-FpuRtniD/AY6sXByma2Wr0TXvXJ4nA/2/04VPlfpmUDPOpOY264x+ILiwnrk/k4RINgDAyFZByxqPUbSQ5YE7g==", + "version": "2.15.0", + "resolved": "https://registry.npmjs.org/eslint-plugin-import/-/eslint-plugin-import-2.15.0.tgz", + "integrity": "sha512-LEHqgR+RcnpGqYW7h9WMkPb/tP+ekKxWdQDztfTtZeV43IHF+X8lXU+1HOCcR4oXD24qRgEwNSxIweD5uNKGVg==", "dev": true, "requires": { "contains-path": "^0.1.0", - "debug": "^2.6.8", + "debug": "^2.6.9", "doctrine": "1.5.0", - 
"eslint-import-resolver-node": "^0.3.1", - "eslint-module-utils": "^2.2.0", - "has": "^1.0.1", - "lodash": "^4.17.4", - "minimatch": "^3.0.3", + "eslint-import-resolver-node": "^0.3.2", + "eslint-module-utils": "^2.3.0", + "has": "^1.0.3", + "lodash": "^4.17.11", + "minimatch": "^3.0.4", "read-pkg-up": "^2.0.0", - "resolve": "^1.6.0" + "resolve": "^1.9.0" }, "dependencies": { "debug": { @@ -1898,9 +1897,9 @@ } }, "eslint-plugin-react": { - "version": "7.12.1", - "resolved": "https://registry.npmjs.org/eslint-plugin-react/-/eslint-plugin-react-7.12.1.tgz", - "integrity": "sha512-1YyXVhp6KSB+xRC1BWzmlA4BH9Wp9jMMBE6AJizxuk+bg/KUJpQGRwsU1/q1pV8rM6oEdLCxunXn7Nfh2BOWBg==", + "version": "7.12.4", + "resolved": "https://registry.npmjs.org/eslint-plugin-react/-/eslint-plugin-react-7.12.4.tgz", + "integrity": "sha512-1puHJkXJY+oS1t467MjbqjvX53uQ05HXwjqDgdbGBqf5j9eeydI54G3KwiJmWciQ0HTBacIKw2jgwSBSH3yfgQ==", "dev": true, "requires": { "array-includes": "^3.0.3", @@ -2184,13 +2183,12 @@ } }, "find-up": { - "version": "1.1.2", - "resolved": "https://registry.npmjs.org/find-up/-/find-up-1.1.2.tgz", - "integrity": "sha1-ay6YIrGizgpgq2TWEOzK1TyyTQ8=", + "version": "3.0.0", + "resolved": "https://registry.npmjs.org/find-up/-/find-up-3.0.0.tgz", + "integrity": "sha512-1yD6RmLI1XBfxugvORwlck6f75tYL+iR0jqwsOrOxMZyGYqUuDhJ0l4AXdO1iX/FTs9cBAMEk1gWSEx1kSbylg==", "dev": true, "requires": { - "path-exists": "^2.0.0", - "pinkie-promise": "^2.0.0" + "locate-path": "^3.0.0" } }, "flat-cache": { @@ -2253,9 +2251,9 @@ "dev": true }, "fsevents": { - "version": "1.2.4", - "resolved": "https://registry.npmjs.org/fsevents/-/fsevents-1.2.4.tgz", - "integrity": "sha512-z8H8/diyk76B7q5wg+Ud0+CqzcAF3mBBI/bA5ne5zrRUUIvNkJY//D3BqyH571KuAC4Nr7Rw7CjWX4r0y9DvNg==", + "version": "1.2.7", + "resolved": "https://registry.npmjs.org/fsevents/-/fsevents-1.2.7.tgz", + "integrity": "sha512-Pxm6sI2MeBD7RdD12RYsqaP0nMiwx8eZBXCa6z2L+mRHm2DYrOYwihmhjpkdjUHwQhslWQjRpEgNq4XvBmaAuw==", "dev": true, "optional": true, 
"requires": { @@ -2272,7 +2270,8 @@ "ansi-regex": { "version": "2.1.1", "bundled": true, - "dev": true + "dev": true, + "optional": true }, "aproba": { "version": "1.2.0", @@ -2281,7 +2280,7 @@ "optional": true }, "are-we-there-yet": { - "version": "1.1.4", + "version": "1.1.5", "bundled": true, "dev": true, "optional": true, @@ -2293,19 +2292,21 @@ "balanced-match": { "version": "1.0.0", "bundled": true, - "dev": true + "dev": true, + "optional": true }, "brace-expansion": { "version": "1.1.11", "bundled": true, "dev": true, + "optional": true, "requires": { "balanced-match": "^1.0.0", "concat-map": "0.0.1" } }, "chownr": { - "version": "1.0.1", + "version": "1.1.1", "bundled": true, "dev": true, "optional": true @@ -2313,17 +2314,20 @@ "code-point-at": { "version": "1.1.0", "bundled": true, - "dev": true + "dev": true, + "optional": true }, "concat-map": { "version": "0.0.1", "bundled": true, - "dev": true + "dev": true, + "optional": true }, "console-control-strings": { "version": "1.1.0", "bundled": true, - "dev": true + "dev": true, + "optional": true }, "core-util-is": { "version": "1.0.2", @@ -2341,7 +2345,7 @@ } }, "deep-extend": { - "version": "0.5.1", + "version": "0.6.0", "bundled": true, "dev": true, "optional": true @@ -2390,7 +2394,7 @@ } }, "glob": { - "version": "7.1.2", + "version": "7.1.3", "bundled": true, "dev": true, "optional": true, @@ -2410,12 +2414,12 @@ "optional": true }, "iconv-lite": { - "version": "0.4.21", + "version": "0.4.24", "bundled": true, "dev": true, "optional": true, "requires": { - "safer-buffer": "^2.1.0" + "safer-buffer": ">= 2.1.2 < 3" } }, "ignore-walk": { @@ -2440,7 +2444,8 @@ "inherits": { "version": "2.0.3", "bundled": true, - "dev": true + "dev": true, + "optional": true }, "ini": { "version": "1.3.5", @@ -2452,6 +2457,7 @@ "version": "1.0.0", "bundled": true, "dev": true, + "optional": true, "requires": { "number-is-nan": "^1.0.0" } @@ -2466,6 +2472,7 @@ "version": "3.0.4", "bundled": true, "dev": true, + 
"optional": true, "requires": { "brace-expansion": "^1.1.7" } @@ -2473,19 +2480,21 @@ "minimist": { "version": "0.0.8", "bundled": true, - "dev": true + "dev": true, + "optional": true }, "minipass": { - "version": "2.2.4", + "version": "2.3.5", "bundled": true, "dev": true, + "optional": true, "requires": { - "safe-buffer": "^5.1.1", + "safe-buffer": "^5.1.2", "yallist": "^3.0.0" } }, "minizlib": { - "version": "1.1.0", + "version": "1.2.1", "bundled": true, "dev": true, "optional": true, @@ -2497,6 +2506,7 @@ "version": "0.5.1", "bundled": true, "dev": true, + "optional": true, "requires": { "minimist": "0.0.8" } @@ -2508,7 +2518,7 @@ "optional": true }, "needle": { - "version": "2.2.0", + "version": "2.2.4", "bundled": true, "dev": true, "optional": true, @@ -2519,18 +2529,18 @@ } }, "node-pre-gyp": { - "version": "0.10.0", + "version": "0.10.3", "bundled": true, "dev": true, "optional": true, "requires": { "detect-libc": "^1.0.2", "mkdirp": "^0.5.1", - "needle": "^2.2.0", + "needle": "^2.2.1", "nopt": "^4.0.1", "npm-packlist": "^1.1.6", "npmlog": "^4.0.2", - "rc": "^1.1.7", + "rc": "^1.2.7", "rimraf": "^2.6.1", "semver": "^5.3.0", "tar": "^4" @@ -2547,13 +2557,13 @@ } }, "npm-bundled": { - "version": "1.0.3", + "version": "1.0.5", "bundled": true, "dev": true, "optional": true }, "npm-packlist": { - "version": "1.1.10", + "version": "1.2.0", "bundled": true, "dev": true, "optional": true, @@ -2577,7 +2587,8 @@ "number-is-nan": { "version": "1.0.1", "bundled": true, - "dev": true + "dev": true, + "optional": true }, "object-assign": { "version": "4.1.1", @@ -2589,6 +2600,7 @@ "version": "1.4.0", "bundled": true, "dev": true, + "optional": true, "requires": { "wrappy": "1" } @@ -2628,12 +2640,12 @@ "optional": true }, "rc": { - "version": "1.2.7", + "version": "1.2.8", "bundled": true, "dev": true, "optional": true, "requires": { - "deep-extend": "^0.5.1", + "deep-extend": "^0.6.0", "ini": "~1.3.0", "minimist": "^1.2.0", "strip-json-comments": "~2.0.1" @@ 
-2663,18 +2675,19 @@ } }, "rimraf": { - "version": "2.6.2", + "version": "2.6.3", "bundled": true, "dev": true, "optional": true, "requires": { - "glob": "^7.0.5" + "glob": "^7.1.3" } }, "safe-buffer": { - "version": "5.1.1", + "version": "5.1.2", "bundled": true, - "dev": true + "dev": true, + "optional": true }, "safer-buffer": { "version": "2.1.2", @@ -2689,7 +2702,7 @@ "optional": true }, "semver": { - "version": "5.5.0", + "version": "5.6.0", "bundled": true, "dev": true, "optional": true @@ -2710,6 +2723,7 @@ "version": "1.0.2", "bundled": true, "dev": true, + "optional": true, "requires": { "code-point-at": "^1.0.0", "is-fullwidth-code-point": "^1.0.0", @@ -2729,6 +2743,7 @@ "version": "3.0.1", "bundled": true, "dev": true, + "optional": true, "requires": { "ansi-regex": "^2.0.0" } @@ -2740,17 +2755,17 @@ "optional": true }, "tar": { - "version": "4.4.1", + "version": "4.4.8", "bundled": true, "dev": true, "optional": true, "requires": { - "chownr": "^1.0.1", + "chownr": "^1.1.1", "fs-minipass": "^1.2.5", - "minipass": "^2.2.4", - "minizlib": "^1.1.0", + "minipass": "^2.3.4", + "minizlib": "^1.1.1", "mkdirp": "^0.5.0", - "safe-buffer": "^5.1.1", + "safe-buffer": "^5.1.2", "yallist": "^3.0.2" } }, @@ -2761,23 +2776,25 @@ "optional": true }, "wide-align": { - "version": "1.1.2", + "version": "1.1.3", "bundled": true, "dev": true, "optional": true, "requires": { - "string-width": "^1.0.2" + "string-width": "^1.0.2 || 2" } }, "wrappy": { "version": "1.0.2", "bundled": true, - "dev": true + "dev": true, + "optional": true }, "yallist": { - "version": "3.0.2", + "version": "3.0.3", "bundled": true, - "dev": true + "dev": true, + "optional": true } } }, @@ -2857,22 +2874,22 @@ } }, "globals": { - "version": "11.9.0", - "resolved": "https://registry.npmjs.org/globals/-/globals-11.9.0.tgz", - "integrity": "sha512-5cJVtyXWH8PiJPVLZzzoIizXx944O4OmRro5MWKx5fT4MgcN7OfaMutPeaTdJCCURwbWdhhcCWcKIffPnmTzBg==", + "version": "11.10.0", + "resolved": 
"https://registry.npmjs.org/globals/-/globals-11.10.0.tgz", + "integrity": "sha512-0GZF1RiPKU97IHUO5TORo9w1PwrH/NBPl+fS7oMLdaTRiYmYbwK4NWoZWrAdd0/abG9R2BU+OiwyQpTpE6pdfQ==", "dev": true }, "google-closure-compiler": { - "version": "20181210.0.0", - "resolved": "https://registry.npmjs.org/google-closure-compiler/-/google-closure-compiler-20181210.0.0.tgz", - "integrity": "sha512-GCMLakdibnc+jpdNTvF3M/ET5i6I4zzxGKw67A4bQahxc0TPLXQdkVfhF3kwBSoPfK8xwgU5kA+KO0qvDZHKHw==", + "version": "20190121.0.0", + "resolved": "https://registry.npmjs.org/google-closure-compiler/-/google-closure-compiler-20190121.0.0.tgz", + "integrity": "sha512-FIp3+KxjtDwykDTr1WsFo0QexEopAC4bDXXZfnEdgHECF7hCeFAAsLUPxMmj9Wx+O39eFCXGAzY7w0k5aU9qjg==", "dev": true, "requires": { "chalk": "^1.0.0", - "google-closure-compiler-java": "^20181210.0.0", - "google-closure-compiler-js": "^20181210.0.0", - "google-closure-compiler-linux": "^20181210.0.0", - "google-closure-compiler-osx": "^20181210.0.0", + "google-closure-compiler-java": "^20190121.0.0", + "google-closure-compiler-js": "^20190121.0.0", + "google-closure-compiler-linux": "^20190121.0.0", + "google-closure-compiler-osx": "^20190121.0.0", "minimist": "^1.2.0", "vinyl": "^2.0.1", "vinyl-sourcemaps-apply": "^0.2.0" @@ -2927,28 +2944,28 @@ } }, "google-closure-compiler-java": { - "version": "20181210.0.0", - "resolved": "https://registry.npmjs.org/google-closure-compiler-java/-/google-closure-compiler-java-20181210.0.0.tgz", - "integrity": "sha512-FMGzY+vp25DePolYNyVcXz8UI2PV/I3AYU3nuFexmHcKn5XiBVy4CqK7em6NpVbZdDXJYUF3GUv5A0x0gLvbfw==", + "version": "20190121.0.0", + "resolved": "https://registry.npmjs.org/google-closure-compiler-java/-/google-closure-compiler-java-20190121.0.0.tgz", + "integrity": "sha512-UCQ7ZXOlk/g101DS4TqyW+SaoR+4GVq7NKrwebH4gnESY76Xuz7FRrKWwfAXwltmiYAUVZCVI4qpoEz48V+VjA==", "dev": true }, "google-closure-compiler-js": { - "version": "20181210.0.0", - "resolved": 
"https://registry.npmjs.org/google-closure-compiler-js/-/google-closure-compiler-js-20181210.0.0.tgz", - "integrity": "sha512-gn+2hT4uQtYKD/jXJqGIXzPMln3/JD7R4caAKDPJm7adqqDvrCAw7qxAiK4Vz1rNec7hJXPXh9TeKQjzz03ZaQ==", + "version": "20190121.0.0", + "resolved": "https://registry.npmjs.org/google-closure-compiler-js/-/google-closure-compiler-js-20190121.0.0.tgz", + "integrity": "sha512-PgY0Fy+fXZnjir6aPz/FVJPXuwZf5pKJ9n7Hf1HL4x1lhqVIf3i+u3Ed6ZWCXa+YiEhvwH5RTQr/iPP/D3gDRg==", "dev": true }, "google-closure-compiler-linux": { - "version": "20181210.0.0", - "resolved": "https://registry.npmjs.org/google-closure-compiler-linux/-/google-closure-compiler-linux-20181210.0.0.tgz", - "integrity": "sha512-Gp+yp+Vb6QWEhtYkePKxkspRlzX5dx6L46zUoHGWW7Henuk3ACYoUXuaHLQQ+tF0lmi2QAmFXEkvdnKVDIxR+Q==", + "version": "20190121.0.0", + "resolved": "https://registry.npmjs.org/google-closure-compiler-linux/-/google-closure-compiler-linux-20190121.0.0.tgz", + "integrity": "sha512-cw4qr9TuB2gB53l/oYadZLuw+zOi2yggYFtnNA5jvTLTqY8m2VZAL5DGL6gmCtZovbQ0bv9ANqjT8NxEtcSzfw==", "dev": true, "optional": true }, "google-closure-compiler-osx": { - "version": "20181210.0.0", - "resolved": "https://registry.npmjs.org/google-closure-compiler-osx/-/google-closure-compiler-osx-20181210.0.0.tgz", - "integrity": "sha512-SYUakmEpq8BorJU/O5CfrC+ABYjXR0rTvBd3Khwd1sml9B2aKEiHArdHC5SCmBRZd3ccUhp/XyrVO6PoxHKeZA==", + "version": "20190121.0.0", + "resolved": "https://registry.npmjs.org/google-closure-compiler-osx/-/google-closure-compiler-osx-20190121.0.0.tgz", + "integrity": "sha512-6OqyUcgojPCqCuzdyKLwmIkBhfoWF3cVzaX8vaJvQ3SYwlITBT3aepMEZiWFRVvvml+ojs1AJcZvQIqFke8X1w==", "dev": true, "optional": true }, @@ -3149,6 +3166,16 @@ "integrity": "sha512-cyFDKrqc/YdcWFniJhzI42+AzS+gNwmUzOSFcRCQYwySuBBBy/KjuxWLZ/FHEH6Moq1NizMOBWyTcv8O4OZIMg==", "dev": true }, + "import-fresh": { + "version": "3.0.0", + "resolved": "https://registry.npmjs.org/import-fresh/-/import-fresh-3.0.0.tgz", + "integrity": 
"sha512-pOnA9tfM3Uwics+SaBLCNyZZZbK+4PTu0OPZtLlMIrv17EdBoC15S9Kn8ckJ9TZTyKb3ywNE5y1yeDxxGA7nTQ==", + "dev": true, + "requires": { + "parent-module": "^1.0.0", + "resolve-from": "^4.0.0" + } + }, "import-local": { "version": "1.0.0", "resolved": "https://registry.npmjs.org/import-local/-/import-local-1.0.0.tgz", @@ -3157,26 +3184,6 @@ "requires": { "pkg-dir": "^2.0.0", "resolve-cwd": "^2.0.0" - }, - "dependencies": { - "find-up": { - "version": "2.1.0", - "resolved": "https://registry.npmjs.org/find-up/-/find-up-2.1.0.tgz", - "integrity": "sha1-RdG35QbHF93UgndaK3eSCjwMV6c=", - "dev": true, - "requires": { - "locate-path": "^2.0.0" - } - }, - "pkg-dir": { - "version": "2.0.0", - "resolved": "https://registry.npmjs.org/pkg-dir/-/pkg-dir-2.0.0.tgz", - "integrity": "sha1-9tXREJ4Z1j7fQo4L1X4Sd3YVM0s=", - "dev": true, - "requires": { - "find-up": "^2.1.0" - } - } } }, "imurmurhash": { @@ -3673,7 +3680,7 @@ }, "get-stream": { "version": "3.0.0", - "resolved": "http://registry.npmjs.org/get-stream/-/get-stream-3.0.0.tgz", + "resolved": "https://registry.npmjs.org/get-stream/-/get-stream-3.0.0.tgz", "integrity": "sha1-jpQ9E1jcN1VQVOy+LtsFqhdO3hQ=", "dev": true }, @@ -3736,6 +3743,16 @@ "invert-kv": "^1.0.0" } }, + "locate-path": { + "version": "2.0.0", + "resolved": "https://registry.npmjs.org/locate-path/-/locate-path-2.0.0.tgz", + "integrity": "sha1-K1aLJl7slExtnA3pw9u7ygNUzY4=", + "dev": true, + "requires": { + "p-locate": "^2.0.0", + "path-exists": "^3.0.0" + } + }, "mem": { "version": "1.1.0", "resolved": "https://registry.npmjs.org/mem/-/mem-1.1.0.tgz", @@ -3756,6 +3773,30 @@ "mem": "^1.1.0" } }, + "p-limit": { + "version": "1.3.0", + "resolved": "https://registry.npmjs.org/p-limit/-/p-limit-1.3.0.tgz", + "integrity": "sha512-vvcXsLAJ9Dr5rQOPk7toZQZJApBl2K4J6dANSsEuh6QI41JYcsS/qhTGa9ErIUUgK3WNQoJYvylxvjqmiqEA9Q==", + "dev": true, + "requires": { + "p-try": "^1.0.0" + } + }, + "p-locate": { + "version": "2.0.0", + "resolved": 
"https://registry.npmjs.org/p-locate/-/p-locate-2.0.0.tgz", + "integrity": "sha1-IKAQOyIqcMj9OcwuWAaA893l7EM=", + "dev": true, + "requires": { + "p-limit": "^1.1.0" + } + }, + "p-try": { + "version": "1.0.0", + "resolved": "https://registry.npmjs.org/p-try/-/p-try-1.0.0.tgz", + "integrity": "sha1-y8ec26+P1CKOE/Yh8rGiN8GyB7M=", + "dev": true + }, "y18n": { "version": "3.2.1", "resolved": "https://registry.npmjs.org/y18n/-/y18n-3.2.1.tgz", @@ -3764,7 +3805,7 @@ }, "yargs": { "version": "11.1.0", - "resolved": "http://registry.npmjs.org/yargs/-/yargs-11.1.0.tgz", + "resolved": "https://registry.npmjs.org/yargs/-/yargs-11.1.0.tgz", "integrity": "sha512-NwW69J42EsCSanF8kyn5upxvjp5ds+t3+udGBeTbFnERA+lF541DDpMawzo4z6W/QrzNM18D+BPMiOBibnFV5A==", "dev": true, "requires": { @@ -3920,7 +3961,7 @@ }, "jest-get-type": { "version": "22.4.3", - "resolved": "http://registry.npmjs.org/jest-get-type/-/jest-get-type-22.4.3.tgz", + "resolved": "https://registry.npmjs.org/jest-get-type/-/jest-get-type-22.4.3.tgz", "integrity": "sha512-/jsz0Y+V29w1chdXVygEKSz2nBoHoYqNShPe+QgxSNjAuP1i8+k4LbQNrfoliKej0P45sivkSCh7yiD6ubHS3w==", "dev": true }, @@ -4054,9 +4095,9 @@ "dev": true }, "source-map-support": { - "version": "0.5.9", - "resolved": "https://registry.npmjs.org/source-map-support/-/source-map-support-0.5.9.tgz", - "integrity": "sha512-gR6Rw4MvUlYy83vP0vxoVNzM6t8MUXqNuRsuBmBHQDu1Fh6X015FrLdgoDKcNdkwGubozq0P4N0Q37UyFVr1EA==", + "version": "0.5.10", + "resolved": "https://registry.npmjs.org/source-map-support/-/source-map-support-0.5.10.tgz", + "integrity": "sha512-YfQ3tQFTK/yzlGJuX8pTwa4tifQj4QS2Mj7UegOu8jAz59MqIiMGPXxQhVQiIMNzayuUSF/jEuVnfFF5JqybmQ==", "dev": true, "requires": { "buffer-from": "^1.0.0", @@ -4137,7 +4178,7 @@ }, "get-stream": { "version": "3.0.0", - "resolved": "http://registry.npmjs.org/get-stream/-/get-stream-3.0.0.tgz", + "resolved": "https://registry.npmjs.org/get-stream/-/get-stream-3.0.0.tgz", "integrity": "sha1-jpQ9E1jcN1VQVOy+LtsFqhdO3hQ=", "dev": true }, @@ 
-4156,6 +4197,16 @@ "invert-kv": "^1.0.0" } }, + "locate-path": { + "version": "2.0.0", + "resolved": "https://registry.npmjs.org/locate-path/-/locate-path-2.0.0.tgz", + "integrity": "sha1-K1aLJl7slExtnA3pw9u7ygNUzY4=", + "dev": true, + "requires": { + "p-locate": "^2.0.0", + "path-exists": "^3.0.0" + } + }, "mem": { "version": "1.1.0", "resolved": "https://registry.npmjs.org/mem/-/mem-1.1.0.tgz", @@ -4176,6 +4227,30 @@ "mem": "^1.1.0" } }, + "p-limit": { + "version": "1.3.0", + "resolved": "https://registry.npmjs.org/p-limit/-/p-limit-1.3.0.tgz", + "integrity": "sha512-vvcXsLAJ9Dr5rQOPk7toZQZJApBl2K4J6dANSsEuh6QI41JYcsS/qhTGa9ErIUUgK3WNQoJYvylxvjqmiqEA9Q==", + "dev": true, + "requires": { + "p-try": "^1.0.0" + } + }, + "p-locate": { + "version": "2.0.0", + "resolved": "https://registry.npmjs.org/p-locate/-/p-locate-2.0.0.tgz", + "integrity": "sha1-IKAQOyIqcMj9OcwuWAaA893l7EM=", + "dev": true, + "requires": { + "p-limit": "^1.1.0" + } + }, + "p-try": { + "version": "1.0.0", + "resolved": "https://registry.npmjs.org/p-try/-/p-try-1.0.0.tgz", + "integrity": "sha1-y8ec26+P1CKOE/Yh8rGiN8GyB7M=", + "dev": true + }, "y18n": { "version": "3.2.1", "resolved": "https://registry.npmjs.org/y18n/-/y18n-3.2.1.tgz", @@ -4184,7 +4259,7 @@ }, "yargs": { "version": "11.1.0", - "resolved": "http://registry.npmjs.org/yargs/-/yargs-11.1.0.tgz", + "resolved": "https://registry.npmjs.org/yargs/-/yargs-11.1.0.tgz", "integrity": "sha512-NwW69J42EsCSanF8kyn5upxvjp5ds+t3+udGBeTbFnERA+lF541DDpMawzo4z6W/QrzNM18D+BPMiOBibnFV5A==", "dev": true, "requires": { @@ -4306,9 +4381,9 @@ "dev": true }, "js-yaml": { - "version": "3.12.0", - "resolved": "https://registry.npmjs.org/js-yaml/-/js-yaml-3.12.0.tgz", - "integrity": "sha512-PIt2cnwmPfL4hKNwqeiuz4bKfnzHTBv6HyVgjahA6mPLwPDzjDWrplJBMjHUFxku/N3FlmrbyPclad+I+4mJ3A==", + "version": "3.12.1", + "resolved": "https://registry.npmjs.org/js-yaml/-/js-yaml-3.12.1.tgz", + "integrity": 
"sha512-um46hB9wNOKlwkHgiuyEVAybXBjwFUV0Z/RaHJblRd9DXltue9FTYvzCr9ErQrK9Adz5MU4gHWVaNUfdmrC8qA==", "dev": true, "requires": { "argparse": "^1.0.7", @@ -4322,9 +4397,9 @@ "dev": true }, "jsdom": { - "version": "13.1.0", - "resolved": "https://registry.npmjs.org/jsdom/-/jsdom-13.1.0.tgz", - "integrity": "sha512-C2Kp0qNuopw0smXFaHeayvharqF3kkcNqlcIlSX71+3XrsOFwkEPLt/9f5JksMmaul2JZYIQuY+WTpqHpQQcLg==", + "version": "13.2.0", + "resolved": "https://registry.npmjs.org/jsdom/-/jsdom-13.2.0.tgz", + "integrity": "sha512-cG1NtMWO9hWpqRNRR3dSvEQa8bFI6iLlqU2x4kwX51FQjp0qus8T9aBaAO6iGp3DeBrhdwuKxckknohkmfvsFw==", "dev": true, "requires": { "abab": "^2.0.0", @@ -4342,7 +4417,7 @@ "pn": "^1.1.0", "request": "^2.88.0", "request-promise-native": "^1.0.5", - "saxes": "^3.1.4", + "saxes": "^3.1.5", "symbol-tree": "^3.2.2", "tough-cookie": "^2.5.0", "w3c-hr-time": "^1.0.1", @@ -4373,9 +4448,9 @@ } }, "ws": { - "version": "6.1.2", - "resolved": "https://registry.npmjs.org/ws/-/ws-6.1.2.tgz", - "integrity": "sha512-rfUqzvz0WxmSXtJpPMX2EeASXabOrSMk1ruMOV3JBTBjo4ac2lDjGGsbQSyxj8Odhw5fBib8ZKEjDNvgouNKYw==", + "version": "6.1.3", + "resolved": "https://registry.npmjs.org/ws/-/ws-6.1.3.tgz", + "integrity": "sha512-tbSxiT+qJI223AP4iLfQbkbxkwdFcneYinM2+x46Gx2wgvbaOMO36czfdfVUBRTHvzAMRhDd98sA5d/BuWbQdg==", "dev": true, "requires": { "async-limiter": "~1.0.0" @@ -4385,7 +4460,7 @@ }, "jsesc": { "version": "1.3.0", - "resolved": "http://registry.npmjs.org/jsesc/-/jsesc-1.3.0.tgz", + "resolved": "https://registry.npmjs.org/jsesc/-/jsesc-1.3.0.tgz", "integrity": "sha1-RsP+yMGJKxKwgz25vHYiF226s0s=", "dev": true }, @@ -4421,7 +4496,7 @@ }, "json5": { "version": "0.5.1", - "resolved": "http://registry.npmjs.org/json5/-/json5-0.5.1.tgz", + "resolved": "https://registry.npmjs.org/json5/-/json5-0.5.1.tgz", "integrity": "sha1-Hq3nrMASA0rYTiOWdn6tn6VJWCE=", "dev": true }, @@ -4502,24 +4577,33 @@ "parse-json": "^2.2.0", "pify": "^2.0.0", "strip-bom": "^3.0.0" + }, + "dependencies": { + "parse-json": { + 
"version": "2.2.0", + "resolved": "https://registry.npmjs.org/parse-json/-/parse-json-2.2.0.tgz", + "integrity": "sha1-9ID0BDTvgHQfhGkJn43qGPVaTck=", + "dev": true, + "requires": { + "error-ex": "^1.2.0" + } + }, + "pify": { + "version": "2.3.0", + "resolved": "https://registry.npmjs.org/pify/-/pify-2.3.0.tgz", + "integrity": "sha1-7RQaasBDqEnqWISY59yosVMw6Qw=", + "dev": true + } } }, "locate-path": { - "version": "2.0.0", - "resolved": "https://registry.npmjs.org/locate-path/-/locate-path-2.0.0.tgz", - "integrity": "sha1-K1aLJl7slExtnA3pw9u7ygNUzY4=", + "version": "3.0.0", + "resolved": "https://registry.npmjs.org/locate-path/-/locate-path-3.0.0.tgz", + "integrity": "sha512-7AO748wWnIhNqAuaty2ZWHkQHRSNfPVIsPIfwEOWO22AmaoVrWavlOcMR5nzTLNYvp36X220/maaRsrec1G65A==", "dev": true, "requires": { - "p-locate": "^2.0.0", + "p-locate": "^3.0.0", "path-exists": "^3.0.0" - }, - "dependencies": { - "path-exists": { - "version": "3.0.0", - "resolved": "https://registry.npmjs.org/path-exists/-/path-exists-3.0.0.tgz", - "integrity": "sha1-zg6+ql94yxiSXqfYENe1mwEP1RU=", - "dev": true - } } }, "lodash": { @@ -4602,9 +4686,9 @@ } }, "math-random": { - "version": "1.0.1", - "resolved": "https://registry.npmjs.org/math-random/-/math-random-1.0.1.tgz", - "integrity": "sha1-izqsWIuKZuSXXjzepn97sylgH6w=", + "version": "1.0.4", + "resolved": "https://registry.npmjs.org/math-random/-/math-random-1.0.4.tgz", + "integrity": "sha512-rUxjysqif/BZQH2yhd5Aaq7vXMSx9NdEsQcyA07uEzIvxgI7zIr33gGsh+RU0/XjmQpCW7RsVof1vlkvQVCK5A==", "dev": true }, "mem": { @@ -5032,7 +5116,7 @@ }, "os-homedir": { "version": "1.0.2", - "resolved": "http://registry.npmjs.org/os-homedir/-/os-homedir-1.0.2.tgz", + "resolved": "https://registry.npmjs.org/os-homedir/-/os-homedir-1.0.2.tgz", "integrity": "sha1-/7xJiDNuDoM94MFox+8VISGqf7M=", "dev": true }, @@ -5072,29 +5156,38 @@ "dev": true }, "p-limit": { - "version": "1.3.0", - "resolved": "https://registry.npmjs.org/p-limit/-/p-limit-1.3.0.tgz", - "integrity": 
"sha512-vvcXsLAJ9Dr5rQOPk7toZQZJApBl2K4J6dANSsEuh6QI41JYcsS/qhTGa9ErIUUgK3WNQoJYvylxvjqmiqEA9Q==", + "version": "2.1.0", + "resolved": "https://registry.npmjs.org/p-limit/-/p-limit-2.1.0.tgz", + "integrity": "sha512-NhURkNcrVB+8hNfLuysU8enY5xn2KXphsHBaC2YmRNTZRc7RWusw6apSpdEj3jo4CMb6W9nrF6tTnsJsJeyu6g==", "dev": true, "requires": { - "p-try": "^1.0.0" + "p-try": "^2.0.0" } }, "p-locate": { - "version": "2.0.0", - "resolved": "https://registry.npmjs.org/p-locate/-/p-locate-2.0.0.tgz", - "integrity": "sha1-IKAQOyIqcMj9OcwuWAaA893l7EM=", + "version": "3.0.0", + "resolved": "https://registry.npmjs.org/p-locate/-/p-locate-3.0.0.tgz", + "integrity": "sha512-x+12w/To+4GFfgJhBEpiDcLozRJGegY+Ei7/z0tSLkMmxGZNybVMSfWj9aJn8Z5Fc7dBUNJOOVgPv2H7IwulSQ==", "dev": true, "requires": { - "p-limit": "^1.1.0" + "p-limit": "^2.0.0" } }, "p-try": { - "version": "1.0.0", - "resolved": "https://registry.npmjs.org/p-try/-/p-try-1.0.0.tgz", - "integrity": "sha1-y8ec26+P1CKOE/Yh8rGiN8GyB7M=", + "version": "2.0.0", + "resolved": "https://registry.npmjs.org/p-try/-/p-try-2.0.0.tgz", + "integrity": "sha512-hMp0onDKIajHfIkdRk3P4CdCmErkYAxxDtP3Wx/4nZ3aGlau2VKh3mZpcuFkH27WQkL/3WBCPOktzA9ZOAnMQQ==", "dev": true }, + "parent-module": { + "version": "1.0.0", + "resolved": "https://registry.npmjs.org/parent-module/-/parent-module-1.0.0.tgz", + "integrity": "sha512-8Mf5juOMmiE4FcmzYc4IaiS9L3+9paz2KOiXzkRviCP6aDmN49Hz6EMWz0lGNp9pX80GvvAuLADtyGfW/Em3TA==", + "dev": true, + "requires": { + "callsites": "^3.0.0" + } + }, "parse-glob": { "version": "3.0.4", "resolved": "https://registry.npmjs.org/parse-glob/-/parse-glob-3.0.4.tgz", @@ -5108,12 +5201,13 @@ } }, "parse-json": { - "version": "2.2.0", - "resolved": "https://registry.npmjs.org/parse-json/-/parse-json-2.2.0.tgz", - "integrity": "sha1-9ID0BDTvgHQfhGkJn43qGPVaTck=", + "version": "4.0.0", + "resolved": "https://registry.npmjs.org/parse-json/-/parse-json-4.0.0.tgz", + "integrity": "sha1-vjX1Qlvh9/bHRxhPmKeIy5lHfuA=", "dev": true, "requires": { - 
"error-ex": "^1.2.0" + "error-ex": "^1.3.1", + "json-parse-better-errors": "^1.0.1" } }, "parse5": { @@ -5129,13 +5223,10 @@ "dev": true }, "path-exists": { - "version": "2.1.0", - "resolved": "https://registry.npmjs.org/path-exists/-/path-exists-2.1.0.tgz", - "integrity": "sha1-D+tsZPD8UY2adU3V77YscCJ2H0s=", - "dev": true, - "requires": { - "pinkie-promise": "^2.0.0" - } + "version": "3.0.0", + "resolved": "https://registry.npmjs.org/path-exists/-/path-exists-3.0.0.tgz", + "integrity": "sha1-zg6+ql94yxiSXqfYENe1mwEP1RU=", + "dev": true }, "path-is-absolute": { "version": "1.0.1", @@ -5168,6 +5259,14 @@ "dev": true, "requires": { "pify": "^2.0.0" + }, + "dependencies": { + "pify": { + "version": "2.3.0", + "resolved": "https://registry.npmjs.org/pify/-/pify-2.3.0.tgz", + "integrity": "sha1-7RQaasBDqEnqWISY59yosVMw6Qw=", + "dev": true + } } }, "performance-now": { @@ -5177,9 +5276,9 @@ "dev": true }, "pify": { - "version": "2.3.0", - "resolved": "http://registry.npmjs.org/pify/-/pify-2.3.0.tgz", - "integrity": "sha1-7RQaasBDqEnqWISY59yosVMw6Qw=", + "version": "3.0.0", + "resolved": "https://registry.npmjs.org/pify/-/pify-3.0.0.tgz", + "integrity": "sha1-5aSs0sEB/fPZpNB/DbxNtJ3SgXY=", "dev": true }, "pinkie": { @@ -5198,12 +5297,57 @@ } }, "pkg-dir": { - "version": "1.0.0", - "resolved": "https://registry.npmjs.org/pkg-dir/-/pkg-dir-1.0.0.tgz", - "integrity": "sha1-ektQio1bstYp1EcFb/TpyTFM89Q=", + "version": "2.0.0", + "resolved": "https://registry.npmjs.org/pkg-dir/-/pkg-dir-2.0.0.tgz", + "integrity": "sha1-9tXREJ4Z1j7fQo4L1X4Sd3YVM0s=", "dev": true, "requires": { - "find-up": "^1.0.0" + "find-up": "^2.1.0" + }, + "dependencies": { + "find-up": { + "version": "2.1.0", + "resolved": "https://registry.npmjs.org/find-up/-/find-up-2.1.0.tgz", + "integrity": "sha1-RdG35QbHF93UgndaK3eSCjwMV6c=", + "dev": true, + "requires": { + "locate-path": "^2.0.0" + } + }, + "locate-path": { + "version": "2.0.0", + "resolved": 
"https://registry.npmjs.org/locate-path/-/locate-path-2.0.0.tgz", + "integrity": "sha1-K1aLJl7slExtnA3pw9u7ygNUzY4=", + "dev": true, + "requires": { + "p-locate": "^2.0.0", + "path-exists": "^3.0.0" + } + }, + "p-limit": { + "version": "1.3.0", + "resolved": "https://registry.npmjs.org/p-limit/-/p-limit-1.3.0.tgz", + "integrity": "sha512-vvcXsLAJ9Dr5rQOPk7toZQZJApBl2K4J6dANSsEuh6QI41JYcsS/qhTGa9ErIUUgK3WNQoJYvylxvjqmiqEA9Q==", + "dev": true, + "requires": { + "p-try": "^1.0.0" + } + }, + "p-locate": { + "version": "2.0.0", + "resolved": "https://registry.npmjs.org/p-locate/-/p-locate-2.0.0.tgz", + "integrity": "sha1-IKAQOyIqcMj9OcwuWAaA893l7EM=", + "dev": true, + "requires": { + "p-limit": "^1.1.0" + } + }, + "p-try": { + "version": "1.0.0", + "resolved": "https://registry.npmjs.org/p-try/-/p-try-1.0.0.tgz", + "integrity": "sha1-y8ec26+P1CKOE/Yh8rGiN8GyB7M=", + "dev": true + } } }, "platform": { @@ -5349,14 +5493,14 @@ } }, "read-pkg": { - "version": "2.0.0", - "resolved": "https://registry.npmjs.org/read-pkg/-/read-pkg-2.0.0.tgz", - "integrity": "sha1-jvHAYjxqbbDcZxPEv6xGMysjaPg=", + "version": "4.0.1", + "resolved": "https://registry.npmjs.org/read-pkg/-/read-pkg-4.0.1.tgz", + "integrity": "sha1-ljYlN48+HE1IyFhytabsfV0JMjc=", "dev": true, "requires": { - "load-json-file": "^2.0.0", "normalize-package-data": "^2.3.2", - "path-type": "^2.0.0" + "parse-json": "^4.0.0", + "pify": "^3.0.0" } }, "read-pkg-up": { @@ -5377,6 +5521,51 @@ "requires": { "locate-path": "^2.0.0" } + }, + "locate-path": { + "version": "2.0.0", + "resolved": "https://registry.npmjs.org/locate-path/-/locate-path-2.0.0.tgz", + "integrity": "sha1-K1aLJl7slExtnA3pw9u7ygNUzY4=", + "dev": true, + "requires": { + "p-locate": "^2.0.0", + "path-exists": "^3.0.0" + } + }, + "p-limit": { + "version": "1.3.0", + "resolved": "https://registry.npmjs.org/p-limit/-/p-limit-1.3.0.tgz", + "integrity": "sha512-vvcXsLAJ9Dr5rQOPk7toZQZJApBl2K4J6dANSsEuh6QI41JYcsS/qhTGa9ErIUUgK3WNQoJYvylxvjqmiqEA9Q==", + "dev": 
true, + "requires": { + "p-try": "^1.0.0" + } + }, + "p-locate": { + "version": "2.0.0", + "resolved": "https://registry.npmjs.org/p-locate/-/p-locate-2.0.0.tgz", + "integrity": "sha1-IKAQOyIqcMj9OcwuWAaA893l7EM=", + "dev": true, + "requires": { + "p-limit": "^1.1.0" + } + }, + "p-try": { + "version": "1.0.0", + "resolved": "https://registry.npmjs.org/p-try/-/p-try-1.0.0.tgz", + "integrity": "sha1-y8ec26+P1CKOE/Yh8rGiN8GyB7M=", + "dev": true + }, + "read-pkg": { + "version": "2.0.0", + "resolved": "https://registry.npmjs.org/read-pkg/-/read-pkg-2.0.0.tgz", + "integrity": "sha1-jvHAYjxqbbDcZxPEv6xGMysjaPg=", + "dev": true, + "requires": { + "load-json-file": "^2.0.0", + "normalize-package-data": "^2.3.2", + "path-type": "^2.0.0" + } } } }, @@ -5546,20 +5735,10 @@ "integrity": "sha1-l/cXtp1IeE9fUmpsWqj/3aBVpNE=", "dev": true }, - "require-uncached": { - "version": "1.0.3", - "resolved": "http://registry.npmjs.org/require-uncached/-/require-uncached-1.0.3.tgz", - "integrity": "sha1-Tg1W1slmL9MeQwEcS5WqSZVUIdM=", - "dev": true, - "requires": { - "caller-path": "^0.1.0", - "resolve-from": "^1.0.0" - } - }, "resolve": { - "version": "1.9.0", - "resolved": "https://registry.npmjs.org/resolve/-/resolve-1.9.0.tgz", - "integrity": "sha512-TZNye00tI67lwYvzxCxHGjwTNlUV70io54/Ed4j6PscB8xVfuBJpRenI/o6dVk0cY0PYTY27AgCoGGxRnYuItQ==", + "version": "1.10.0", + "resolved": "https://registry.npmjs.org/resolve/-/resolve-1.10.0.tgz", + "integrity": "sha512-3sUr9aq5OfSg2S9pNtPA9hL1FVEAjvfOC4leW0SNf/mpnaakz2a9femSd6LqAww2RaFctwyf1lCqnTHuF1rxDg==", "dev": true, "requires": { "path-parse": "^1.0.6" @@ -5583,9 +5762,9 @@ } }, "resolve-from": { - "version": "1.0.1", - "resolved": "https://registry.npmjs.org/resolve-from/-/resolve-from-1.0.1.tgz", - "integrity": "sha1-Jsv+k10a7uq7Kbw/5a6wHpPUQiY=", + "version": "4.0.0", + "resolved": "https://registry.npmjs.org/resolve-from/-/resolve-from-4.0.0.tgz", + "integrity": 
"sha512-pb/MYmXstAkysRFx8piNI1tGFNQIFA3vkE3Gq4EuA1dF6gHp/+vgZqsCGJapvy8N3Q+4o7FwvquPJcnZ7RYy4g==", "dev": true }, "resolve-url": { @@ -5611,23 +5790,23 @@ "dev": true }, "rimraf": { - "version": "2.6.2", - "resolved": "https://registry.npmjs.org/rimraf/-/rimraf-2.6.2.tgz", - "integrity": "sha512-lreewLK/BlghmxtfH36YYVg1i8IAce4TI7oao75I1g245+6BctqTVQiBP3YUJ9C6DQOXJmkYR9X9fCLtCOJc5w==", + "version": "2.6.3", + "resolved": "https://registry.npmjs.org/rimraf/-/rimraf-2.6.3.tgz", + "integrity": "sha512-mwqeW5XsA2qAejG46gYdENaxXjx9onRNCfn7L0duuP4hCuTIi/QO7PDK07KJfp1d+izWPrzEJDcSqBa0OZQriA==", "dev": true, "requires": { - "glob": "^7.0.5" + "glob": "^7.1.3" } }, "rollup": { - "version": "1.0.1", - "resolved": "https://registry.npmjs.org/rollup/-/rollup-1.0.1.tgz", - "integrity": "sha512-jf1EA9xJMx4hgEVdJQd8lVo2a0gbzY7fKM9kHZwQzcafYDapwLijd9G56Kxm2/RdEnQUEw9mSv8PyRWhsV0x2A==", + "version": "1.1.2", + "resolved": "https://registry.npmjs.org/rollup/-/rollup-1.1.2.tgz", + "integrity": "sha512-OkdMxqMl8pWoQc5D8y1cIinYQPPLV8ZkfLgCzL6SytXeNA2P7UHynEQXI9tYxuAjAMsSyvRaWnyJDLHMxq0XAg==", "dev": true, "requires": { "@types/estree": "0.0.39", "@types/node": "*", - "acorn": "^6.0.4" + "acorn": "^6.0.5" } }, "rollup-plugin-commonjs": { @@ -6037,9 +6216,9 @@ "dev": true }, "saxes": { - "version": "3.1.4", - "resolved": "https://registry.npmjs.org/saxes/-/saxes-3.1.4.tgz", - "integrity": "sha512-GVZmLJnkS4Vl8Pe9o4nc5ALZ615VOVxCmea8Cs0l+8GZw3RQ5XGOSUomIUfuZuk4Todo44v4y+HY1EATkDDiZg==", + "version": "3.1.6", + "resolved": "https://registry.npmjs.org/saxes/-/saxes-3.1.6.tgz", + "integrity": "sha512-LAYs+lChg1v5uKNzPtsgTxSS5hLo8aIhSMCJt1WMpefAxm3D1RTpMwSpb6ebdL31cubiLTnhokVktBW+cv9Y9w==", "dev": true, "requires": { "xmlchars": "^1.3.1" @@ -6347,9 +6526,9 @@ "dev": true }, "sshpk": { - "version": "1.16.0", - "resolved": "https://registry.npmjs.org/sshpk/-/sshpk-1.16.0.tgz", - "integrity": "sha512-Zhev35/y7hRMcID/upReIvRse+I9SVhyVre/KTJSJQWMz3C3+G+HpO7m1wK/yckEtujKZ7dS4hkVxAnmHaIGVQ==", + 
"version": "1.16.1", + "resolved": "https://registry.npmjs.org/sshpk/-/sshpk-1.16.1.tgz", + "integrity": "sha512-HXXqVUq7+pcKeLqqZj6mHFUMvXtOJt1uoUx09pFW6011inTMxqI8BA8PM95myrIyyKwdnzjdFjLiE6KBPVtJIg==", "dev": true, "requires": { "asn1": "~0.2.3", @@ -6468,9 +6647,9 @@ "dev": true }, "table": { - "version": "5.1.1", - "resolved": "https://registry.npmjs.org/table/-/table-5.1.1.tgz", - "integrity": "sha512-NUjapYb/qd4PeFW03HnAuOJ7OMcBkJlqeClWxeNlQ0lXGSb52oZXGzkO0/I0ARegQ2eUT1g2VDJH0eUxDRcHmw==", + "version": "5.2.1", + "resolved": "https://registry.npmjs.org/table/-/table-5.2.1.tgz", + "integrity": "sha512-qmhNs2GEHNqY5fd2Mo+8N1r2sw/rvTAAvBZTaTx+Y7PHLypqyrxr1MdIu0pLw6Xvl/Gi4ONu/sdceP8vvUjkyA==", "dev": true, "requires": { "ajv": "^6.6.1", @@ -6492,9 +6671,19 @@ "require-main-filename": "^1.0.1" }, "dependencies": { + "find-up": { + "version": "1.1.2", + "resolved": "https://registry.npmjs.org/find-up/-/find-up-1.1.2.tgz", + "integrity": "sha1-ay6YIrGizgpgq2TWEOzK1TyyTQ8=", + "dev": true, + "requires": { + "path-exists": "^2.0.0", + "pinkie-promise": "^2.0.0" + } + }, "load-json-file": { "version": "1.1.0", - "resolved": "http://registry.npmjs.org/load-json-file/-/load-json-file-1.1.0.tgz", + "resolved": "https://registry.npmjs.org/load-json-file/-/load-json-file-1.1.0.tgz", "integrity": "sha1-lWkFcI1YtLq0wiYbBPWfMcmTdMA=", "dev": true, "requires": { @@ -6505,6 +6694,24 @@ "strip-bom": "^2.0.0" } }, + "parse-json": { + "version": "2.2.0", + "resolved": "https://registry.npmjs.org/parse-json/-/parse-json-2.2.0.tgz", + "integrity": "sha1-9ID0BDTvgHQfhGkJn43qGPVaTck=", + "dev": true, + "requires": { + "error-ex": "^1.2.0" + } + }, + "path-exists": { + "version": "2.1.0", + "resolved": "https://registry.npmjs.org/path-exists/-/path-exists-2.1.0.tgz", + "integrity": "sha1-D+tsZPD8UY2adU3V77YscCJ2H0s=", + "dev": true, + "requires": { + "pinkie-promise": "^2.0.0" + } + }, "path-type": { "version": "1.1.0", "resolved": 
"https://registry.npmjs.org/path-type/-/path-type-1.1.0.tgz", @@ -6516,6 +6723,12 @@ "pinkie-promise": "^2.0.0" } }, + "pify": { + "version": "2.3.0", + "resolved": "https://registry.npmjs.org/pify/-/pify-2.3.0.tgz", + "integrity": "sha1-7RQaasBDqEnqWISY59yosVMw6Qw=", + "dev": true + }, "read-pkg": { "version": "1.1.0", "resolved": "https://registry.npmjs.org/read-pkg/-/read-pkg-1.1.0.tgz", @@ -6721,9 +6934,9 @@ "integrity": "sha512-4krF8scpejhaOgqzBEcGM7yDIEfi0/8+8zDRZhNZZ2kjmHJ4hv3zCbQWxoJGz1iw5U0Jl0nma13xzHXcncMavQ==" }, "tslint": { - "version": "5.12.0", - "resolved": "https://registry.npmjs.org/tslint/-/tslint-5.12.0.tgz", - "integrity": "sha512-CKEcH1MHUBhoV43SA/Jmy1l24HJJgI0eyLbBNSRyFlsQvb9v6Zdq+Nz2vEOH00nC5SUx4SneJ59PZUS/ARcokQ==", + "version": "5.12.1", + "resolved": "https://registry.npmjs.org/tslint/-/tslint-5.12.1.tgz", + "integrity": "sha512-sfodBHOucFg6egff8d1BvuofoOQ/nOeYNfbp7LDlKBcLNrL3lmS5zoiDGyOMdT7YsEXAwWpTdAHwOGOc8eRZAw==", "dev": true, "requires": { "babel-code-frame": "^6.22.0", @@ -6774,9 +6987,9 @@ } }, "typescript": { - "version": "3.2.2", - "resolved": "https://registry.npmjs.org/typescript/-/typescript-3.2.2.tgz", - "integrity": "sha512-VCj5UiSyHBjwfYacmDuc/NOk4QQixbE+Wn7MFJuS0nRuPQbof132Pw4u53dm264O8LPc2MVsc7RJNml5szurkg==", + "version": "3.2.4", + "resolved": "https://registry.npmjs.org/typescript/-/typescript-3.2.4.tgz", + "integrity": "sha512-0RNDbSdEokBeEAkgNbxJ+BLwSManFy9TeXz8uW+48j/xhEXv1ePME60olyzw2XzUqUBNAYFeJadIqAgNqIACwg==", "dev": true }, "uglify-js": { @@ -7137,9 +7350,9 @@ } }, "write-file-atomic": { - "version": "2.3.0", - "resolved": "https://registry.npmjs.org/write-file-atomic/-/write-file-atomic-2.3.0.tgz", - "integrity": "sha512-xuPeK4OdjWqtfi59ylvVL0Yn35SF3zgcAcv7rBPFHVaEapaDr4GdGgm3j7ckTwH9wHL7fGmgfAnb0+THrHb8tA==", + "version": "2.4.2", + "resolved": "https://registry.npmjs.org/write-file-atomic/-/write-file-atomic-2.4.2.tgz", + "integrity": 
"sha512-s0b6vB3xIVRLWywa6X9TOMA7k9zio0TMOsl9ZnDkliA/cfJlpHXAscj0gbHVJiTdIuAYpIyqS5GW91fqm6gG5g==", "dev": true, "requires": { "graceful-fs": "^4.1.11", @@ -7198,57 +7411,6 @@ "which-module": "^2.0.0", "y18n": "^3.2.1 || ^4.0.0", "yargs-parser": "^11.1.1" - }, - "dependencies": { - "find-up": { - "version": "3.0.0", - "resolved": "https://registry.npmjs.org/find-up/-/find-up-3.0.0.tgz", - "integrity": "sha512-1yD6RmLI1XBfxugvORwlck6f75tYL+iR0jqwsOrOxMZyGYqUuDhJ0l4AXdO1iX/FTs9cBAMEk1gWSEx1kSbylg==", - "dev": true, - "requires": { - "locate-path": "^3.0.0" - } - }, - "locate-path": { - "version": "3.0.0", - "resolved": "https://registry.npmjs.org/locate-path/-/locate-path-3.0.0.tgz", - "integrity": "sha512-7AO748wWnIhNqAuaty2ZWHkQHRSNfPVIsPIfwEOWO22AmaoVrWavlOcMR5nzTLNYvp36X220/maaRsrec1G65A==", - "dev": true, - "requires": { - "p-locate": "^3.0.0", - "path-exists": "^3.0.0" - } - }, - "p-limit": { - "version": "2.1.0", - "resolved": "https://registry.npmjs.org/p-limit/-/p-limit-2.1.0.tgz", - "integrity": "sha512-NhURkNcrVB+8hNfLuysU8enY5xn2KXphsHBaC2YmRNTZRc7RWusw6apSpdEj3jo4CMb6W9nrF6tTnsJsJeyu6g==", - "dev": true, - "requires": { - "p-try": "^2.0.0" - } - }, - "p-locate": { - "version": "3.0.0", - "resolved": "https://registry.npmjs.org/p-locate/-/p-locate-3.0.0.tgz", - "integrity": "sha512-x+12w/To+4GFfgJhBEpiDcLozRJGegY+Ei7/z0tSLkMmxGZNybVMSfWj9aJn8Z5Fc7dBUNJOOVgPv2H7IwulSQ==", - "dev": true, - "requires": { - "p-limit": "^2.0.0" - } - }, - "p-try": { - "version": "2.0.0", - "resolved": "https://registry.npmjs.org/p-try/-/p-try-2.0.0.tgz", - "integrity": "sha512-hMp0onDKIajHfIkdRk3P4CdCmErkYAxxDtP3Wx/4nZ3aGlau2VKh3mZpcuFkH27WQkL/3WBCPOktzA9ZOAnMQQ==", - "dev": true - }, - "path-exists": { - "version": "3.0.0", - "resolved": "https://registry.npmjs.org/path-exists/-/path-exists-3.0.0.tgz", - "integrity": "sha1-zg6+ql94yxiSXqfYENe1mwEP1RU=", - "dev": true - } } }, "yargs-parser": { diff --git a/package.json b/package.json index 0971740923..348458dbaf 100644 --- 
a/package.json +++ b/package.json @@ -29,31 +29,31 @@ "prepack": "npm run minify", "pretest": "npm run lint", "test": "jest --coverage --no-cache ./test/", - "dev": "jest --watch ./test/" + "dev": "jest --watch --coverage --bail --no-cache ./test/" }, "devDependencies": { - "@types/chrome": "0.0.77", - "@types/jest": "^23.3.11", + "@types/chrome": "0.0.78", + "@types/jest": "^23.3.13", "@types/jsdom": "^12.2.1", "@types/node": "^10.12.18", "benchmark": "^2.1.4", - "chalk": "^2.4.1", + "chalk": "^2.4.2", "concurrently": "^4.1.0", - "eslint": "^5.11.1", + "eslint": "^5.12.1", "eslint-config-airbnb": "^17.1.0", - "eslint-plugin-import": "^2.14.0", + "eslint-plugin-import": "^2.15.0", "eslint-plugin-jsx-a11y": "^6.1.2", - "eslint-plugin-react": "^7.12.1", - "google-closure-compiler": "^20181210.0.0", + "eslint-plugin-react": "^7.12.4", + "google-closure-compiler": "^20190121.0.0", "jest": "^23.6.0", - "jsdom": "^13.1.0", - "rollup": "^1.0.1", + "jsdom": "^13.2.0", + "rollup": "^1.1.2", "rollup-plugin-commonjs": "^9.2.0", "rollup-plugin-node-resolve": "^4.0.0", "tldts": "^4.0.0", "ts-jest": "^23.10.5", - "tslint": "^5.12.0", - "typescript": "^3.2.2" + "tslint": "^5.12.1", + "typescript": "^3.2.4" }, "dependencies": { "punycode": "^2.1.1", diff --git a/src/data-view.ts b/src/data-view.ts index e3f4b3ce91..841b14cbbb 100644 --- a/src/data-view.ts +++ b/src/data-view.ts @@ -2,8 +2,6 @@ import * as punycode from 'punycode'; import { hasUnicode } from './utils'; /** - * @class StaticDataView - * * This abstraction allows to serialize efficiently low-level values of types: * String, uint8, uint16, uint32 while hiding the complexity of managing the * current offset and growing. It should always be instantiated with a @@ -17,14 +15,30 @@ import { hasUnicode } from './utils'; * deserializer you use `getX` functions to get back the values. 
*/ export default class StaticDataView { - protected buffer: Uint8Array; - protected pos: number; + public static fromUint32Array(array: Uint32Array): StaticDataView { + return new StaticDataView(0, new Uint8Array(array.buffer)); + } + + public static fromUint8Array(array: Uint8Array): StaticDataView { + return new StaticDataView(0, array); + } + + public pos: number; + public buffer: Uint8Array; constructor(length: number, buffer?: Uint8Array) { this.buffer = buffer !== undefined ? buffer : new Uint8Array(length); this.pos = 0; } + public dataAvailable(): boolean { + return this.pos < this.buffer.byteLength; + } + + public setPos(pos: number): void { + this.pos = pos; + } + public getPos(): number { return this.pos; } @@ -33,12 +47,12 @@ export default class StaticDataView { this.pos = 0; } + public slice(): Uint8Array { + this.checkSize(); + return this.buffer.slice(0, this.pos); + } public crop(): Uint8Array { - if (this.pos >= this.buffer.byteLength) { - throw new Error( - `StaticDataView too small: ${this.buffer.byteLength}, but required ${this.pos - 1} bytes`, - ); - } + this.checkSize(); return this.buffer.subarray(0, this.pos); } @@ -47,6 +61,14 @@ export default class StaticDataView { this.seekZero(); } + public pushBool(bool: boolean): void { + this.pushByte(Number(bool)); + } + + public getBool(): boolean { + return Boolean(this.getByte()); + } + public setByte(pos: number, byte: number): void { this.buffer[pos] = byte; } @@ -59,121 +81,109 @@ export default class StaticDataView { return this.getUint8(); } + public pushBytes(bytes: Uint8Array): void { + this.pushUint32(bytes.byteLength); + this.buffer.set(bytes, this.pos); + this.pos += bytes.byteLength; + } + + public getBytes(): Uint8Array { + const numberOfBytes = this.getUint32(); + const bytes = this.buffer.subarray(this.pos, this.pos + numberOfBytes); + this.pos += numberOfBytes; + return bytes; + } + public pushUint8(uint8: number): void { - this.buffer[this.pos] = uint8; - this.pos += 1; + 
this.buffer[this.pos++] = uint8; } public getUint8(): number { - const uint8 = this.buffer[this.pos]; - this.pos += 1; - return uint8; + return this.buffer[this.pos++]; } public pushUint16(uint16: number): void { - this.buffer[this.pos] = uint16 >>> 8; - this.buffer[this.pos + 1] = uint16; - this.pos += 2; + this.buffer[this.pos++] = uint16 >>> 8; + this.buffer[this.pos++] = uint16; } public getUint16(): number { - const uint16 = ((this.buffer[this.pos] << 8) | this.buffer[this.pos + 1]) >>> 0; - this.pos += 2; - return uint16; + return ((this.buffer[this.pos++] << 8) | this.buffer[this.pos++]) >>> 0; } public pushUint32(uint32: number): void { - this.buffer[this.pos] = uint32 >>> 24; - this.buffer[this.pos + 1] = uint32 >>> 16; - this.buffer[this.pos + 2] = uint32 >>> 8; - this.buffer[this.pos + 3] = uint32; - this.pos += 4; + this.buffer[this.pos++] = uint32 >>> 24; + this.buffer[this.pos++] = uint32 >>> 16; + this.buffer[this.pos++] = uint32 >>> 8; + this.buffer[this.pos++] = uint32; } - public pushUint32Array(arr: Uint32Array | undefined): void { - if (arr === undefined) { - this.pushUint16(0); - } else { - this.pushUint16(arr.length); - for (let i = 0; i < arr.length; i += 1) { - this.pushUint32(arr[i]); - } + public getUint32(): number { + return ( + (((this.buffer[this.pos++] << 24) >>> 0) + + ((this.buffer[this.pos++] << 16) | + (this.buffer[this.pos++] << 8) | + this.buffer[this.pos++])) >>> + 0 + ); + } + + public pushUint32Array(arr: Uint32Array): void { + this.pushUint32(arr.length); + // TODO - use `set` to push the full buffer at once? + for (let i = 0; i < arr.length; i += 1) { + this.pushUint32(arr[i]); } } - public getUint32Array(): Uint32Array | undefined { - const length = this.getUint16(); - if (length === 0) { - return undefined; - } + public getUint32Array(): Uint32Array { + const length = this.getUint32(); const arr = new Uint32Array(length); + // TODO - use `subarray`? 
for (let i = 0; i < length; i += 1) { arr[i] = this.getUint32(); } return arr; } - public getUint32(): number { - const uint32 = - (((this.buffer[this.pos] << 24) >>> 0) + - ((this.buffer[this.pos + 1] << 16) | - (this.buffer[this.pos + 2] << 8) | - this.buffer[this.pos + 3])) >>> - 0; - this.pos += 4; - return uint32; - } - - public pushUTF8(str: string | undefined): void { - if (str === undefined) { - this.pushUint16(0); + public pushUTF8(str: string): void { + this.pushUint16(str.length); + if (hasUnicode(str)) { + this.pushASCII(punycode.encode(str)); } else { - this.pushUint16(str.length); - if (hasUnicode(str)) { - this.pushASCII(punycode.encode(str)); - } else { - this.pushASCII(str); - } + this.pushASCII(str); } } - public getUTF8(): string | undefined { + public getUTF8(): string { const length = this.getUint16(); - if (length === 0) { - return undefined; - } - const str = this.getASCII(); - if (str === undefined || str.length === length) { + if (str.length === length) { return str; } return punycode.decode(str); } - public pushASCII(str: string | undefined): void { - if (str === undefined) { - this.pushUint16(0); - } else { - this.pushUint16(str.length); - const len = str.length; - const offset = this.pos; - for (let i = 0; i < len; i += 1) { - this.buffer[offset + i] = str.charCodeAt(i); - } - this.pos += len; + public pushASCII(str: string): void { + this.pushUint16(str.length); + for (let i = 0; i < str.length; i += 1) { + this.buffer[this.pos++] = str.charCodeAt(i); } } - public getASCII(): string | undefined { + public getASCII(): string { const byteLength = this.getUint16(); - - if (byteLength === 0) { - return undefined; - } - this.pos += byteLength; // @ts-ignore return String.fromCharCode.apply(null, this.buffer.subarray(this.pos - byteLength, this.pos)); } + + private checkSize() { + if (this.pos >= this.buffer.byteLength) { + throw new Error( + `StaticDataView too small: ${this.buffer.byteLength}, but required ${this.pos - 1} bytes`, + ); + } + 
} } diff --git a/src/engine/bucket/cosmetic.ts b/src/engine/bucket/cosmetic.ts new file mode 100644 index 0000000000..2c48493db8 --- /dev/null +++ b/src/engine/bucket/cosmetic.ts @@ -0,0 +1,144 @@ +import StaticDataView from '../../data-view'; +import CosmeticFilter, { + getEntityHashesFromLabelsBackward, + getHostnameHashesFromLabelsBackward, +} from '../../filters/cosmetic'; + +import ReverseIndex from '../reverse-index'; + +export default class CosmeticFilterBucket { + public static deserialize(buffer: StaticDataView): CosmeticFilterBucket { + const bucket = new CosmeticFilterBucket(); + + bucket.genericRules = buffer.getBytes(); + bucket.hostnameIndex = ReverseIndex.deserialize(buffer, CosmeticFilter.deserialize); + + return bucket; + } + + public hostnameIndex: ReverseIndex; + public genericRules: Uint8Array; + + private cache: CosmeticFilter[]; + + constructor({ filters = [] }: { filters?: CosmeticFilter[] } = {}) { + this.cache = []; + this.genericRules = new Uint8Array(0); + this.hostnameIndex = new ReverseIndex({ + deserialize: CosmeticFilter.deserialize, + }); + + if (filters.length !== 0) { + this.update(filters); + } + } + + public update(newFilters: CosmeticFilter[], removedFilters?: Set) { + // This will be used to keep in cache the generic CosmeticFilter instances. + // It will be populated the first time filters are required. TODO - maybe we + // do not need the full instance there? But instead the selectors only (we + // only need to know enough to inject and apply exceptions). 
+ const genericRules: CosmeticFilter[] = []; + const hostnameSpecificRules: CosmeticFilter[] = []; + + // Add existing rules (removing the ones with ids in `removedFilters`) + const currentGenericRules: CosmeticFilter[] = this.getGenericRules(); + for (let i = 0; i < currentGenericRules.length; i += 1) { + const filter = currentGenericRules[i]; + if (removedFilters === undefined || !removedFilters.has(filter.getId())) { + genericRules.push(filter); + } + } + + // Add new rules + for (let i = 0; i < newFilters.length; i += 1) { + const filter = newFilters[i]; + if (filter.hasHostnameConstraint()) { + hostnameSpecificRules.push(filter); + } else { + genericRules.push(filter); + } + } + + // This accelerating data structure is used to retrieve cosmetic filters for + // a given hostname. We only store filters having at least one hostname + // specified and we index each filter several times (once per hostname). + this.hostnameIndex.update(hostnameSpecificRules, removedFilters); + + // Store generic cosmetic filters in an array. It will be used whenever we + // need to inject cosmetics in a page, and is filtered according to + // domain-specific exceptions/unhide. 
+ const buffer = new StaticDataView(1500000); + buffer.pushUint32(genericRules.length); + for (let i = 0; i < genericRules.length; i += 1) { + genericRules[i].serialize(buffer); + } + + this.cache = []; + this.genericRules = buffer.slice(); + } + + public serialize(buffer: StaticDataView): void { + buffer.pushBytes(this.genericRules); + this.hostnameIndex.serialize(buffer); + } + + public getCosmeticsFilters(hostname: string, domain: string): CosmeticFilter[] { + const disabledRules = new Set(); + const rules: CosmeticFilter[] = []; + + // Collect rules specifying a domain + this.hostnameIndex.iterMatchingFilters( + new Uint32Array([ + ...getHostnameHashesFromLabelsBackward(hostname, domain), + ...getEntityHashesFromLabelsBackward(hostname, domain), + ]), + (rule: CosmeticFilter) => { + if (rule.match(hostname, domain)) { + if (rule.isUnhide()) { + disabledRules.add(rule.getSelector()); + } else { + rules.push(rule); + } + } + + return true; + }, + ); + + if (disabledRules.size === 0) { + // No exception/unhide found, so we return all the rules + return [...rules, ...this.getGenericRules()]; + } + + const rulesWithoutExceptions: CosmeticFilter[] = []; + for (let i = 0; i < rules.length; i += 1) { + const rule = rules[i]; + if (!disabledRules.has(rule.getSelector())) { + rulesWithoutExceptions.push(rule); + } + } + + const genericRules = this.getGenericRules(); + for (let i = 0; i < genericRules.length; i += 1) { + const rule = genericRules[i]; + if (!disabledRules.has(rule.getSelector())) { + rulesWithoutExceptions.push(rule); + } + } + + return rulesWithoutExceptions; + } + + private getGenericRules(): CosmeticFilter[] { + if (this.cache.length === 0) { + const buffer = StaticDataView.fromUint8Array(this.genericRules); + const numberOfFilters = buffer.getUint32(); + for (let i = 0; i < numberOfFilters; i += 1) { + this.cache.push(CosmeticFilter.deserialize(buffer)); + } + } + + return this.cache; + } +} diff --git a/src/engine/bucket/cosmetics.ts 
b/src/engine/bucket/cosmetics.ts deleted file mode 100644 index 326648a494..0000000000 --- a/src/engine/bucket/cosmetics.ts +++ /dev/null @@ -1,75 +0,0 @@ -import matchCosmeticFilter from '../../matching/cosmetics'; -import { CosmeticFilter } from '../../parsing/cosmetic-filter'; -import { tokenizeHostnames } from '../../utils'; - -import ReverseIndex from '../reverse-index'; - -export default class CosmeticFilterBucket { - public hostnameIndex: ReverseIndex; - public genericRules: CosmeticFilter[]; - public size: number; - - constructor(filters?: (cb: (f: CosmeticFilter) => void) => void) { - // Store generic cosmetic filters in an array. It will be used whenever we - // need to inject cosmetics in a paged and filtered according to - // domain-specific exceptions/unhide. - this.genericRules = []; - - // This accelerating data structure is used to retrieve cosmetic filters for - // a given hostname. We only store filters having at least one hostname - // specified and we index each filter several time (one time per hostname). 
- this.hostnameIndex = new ReverseIndex((cb: (f: CosmeticFilter) => void) => { - if (filters !== undefined) { - filters((f: CosmeticFilter) => { - if (f.hasHostnames()) { - cb(f); - } else { - this.genericRules.push(f); - } - }); - } - }); - - this.size = this.hostnameIndex.size + this.genericRules.length; - } - - public getCosmeticsFilters(hostname: string, domain: string): CosmeticFilter[] { - const disabledRules = new Set(); - const rules: CosmeticFilter[] = []; - - // Collect rules specifying a domain - this.hostnameIndex.iterMatchingFilters(tokenizeHostnames(hostname), (rule: CosmeticFilter) => { - if (matchCosmeticFilter(rule, hostname, domain)) { - if (rule.isUnhide()) { - disabledRules.add(rule.getSelector()); - } else { - rules.push(rule); - } - } - - return true; - }); - - if (disabledRules.size === 0) { - // No exception/unhide found, so we return all the rules - return [...rules, ...this.genericRules]; - } - - const rulesWithoutExceptions: CosmeticFilter[] = []; - for (let i = 0; i < rules.length; i += 1) { - const rule = rules[i]; - if (!disabledRules.has(rule.getSelector())) { - rulesWithoutExceptions.push(rule); - } - } - - for (let i = 0; i < this.genericRules.length; i += 1) { - const rule = this.genericRules[i]; - if (!disabledRules.has(rule.getSelector())) { - rulesWithoutExceptions.push(rule); - } - } - - return rulesWithoutExceptions; - } -} diff --git a/src/engine/bucket/network.ts b/src/engine/bucket/network.ts index 2c42b60585..83fd1a83a0 100644 --- a/src/engine/bucket/network.ts +++ b/src/engine/bucket/network.ts @@ -1,43 +1,56 @@ -import matchNetworkFilter from '../../matching/network'; -import { NetworkFilter } from '../../parsing/network-filter'; +import StaticDataView from '../../data-view'; +import NetworkFilter from '../../filters/network'; import Request from '../../request'; - import networkFiltersOptimizer from '../optimizer'; import ReverseIndex from '../reverse-index'; /** - * Accelerating data structure for network filters 
matching. Makes use of the - * reverse index structure defined above. + * Accelerating data structure for network filters matching. */ export default class NetworkFilterBucket { - public readonly name: string; + public static deserialize(buffer: StaticDataView): NetworkFilterBucket { + const enableOptimizations = buffer.getBool(); + const bucket = new NetworkFilterBucket({ enableOptimizations }); + bucket.index = ReverseIndex.deserialize( + buffer, + NetworkFilter.deserialize, + enableOptimizations ? networkFiltersOptimizer : undefined, + ); + return bucket; + } + public index: ReverseIndex; - public size: number; public enableOptimizations: boolean; - constructor( - name: string, - filters?: (cb: (f: NetworkFilter) => void) => void, + constructor({ + filters = [], enableOptimizations = true, - ) { - this.name = name; + }: { + filters?: NetworkFilter[]; + enableOptimizations?: boolean; + } = {}) { this.enableOptimizations = enableOptimizations; - this.index = new ReverseIndex( + this.index = new ReverseIndex({ + deserialize: NetworkFilter.deserialize, filters, - enableOptimizations ? networkFiltersOptimizer : undefined, - ); - this.size = this.index.size; + optimize: enableOptimizations ? 
networkFiltersOptimizer : undefined, + }); + } + + public update(newFilters: NetworkFilter[], removedFilters?: Set): void { + this.index.update(newFilters, removedFilters); } - public optimizeAheadOfTime() { - this.index.optimizeAheadOfTime(); + public serialize(buffer: StaticDataView): void { + buffer.pushBool(this.enableOptimizations); + this.index.serialize(buffer); } public matchAll(request: Request): NetworkFilter[] { const filters: NetworkFilter[] = []; this.index.iterMatchingFilters(request.getTokens(), (filter: NetworkFilter) => { - if (matchNetworkFilter(filter, request)) { + if (filter.match(request)) { filters.push(filter); } return true; @@ -50,7 +63,7 @@ export default class NetworkFilterBucket { let match: NetworkFilter | undefined; this.index.iterMatchingFilters(request.getTokens(), (filter: NetworkFilter) => { - if (matchNetworkFilter(filter, request)) { + if (filter.match(request)) { match = filter; return false; } diff --git a/src/engine/engine.ts b/src/engine/engine.ts index d00dcf97fa..4f98c9463f 100644 --- a/src/engine/engine.ts +++ b/src/engine/engine.ts @@ -1,16 +1,17 @@ -import { CosmeticFilter } from '../parsing/cosmetic-filter'; -import IFilter from '../parsing/interface'; -import { parseJSResource, parseList } from '../parsing/list'; -import { NetworkFilter } from '../parsing/network-filter'; +import StaticDataView from '../data-view'; +import CosmeticFilter from '../filters/cosmetic'; +import NetworkFilter from '../filters/network'; import Request, { RequestType } from '../request'; -import { serializeEngine } from '../serialization'; +import Resources from '../resources'; -import CosmeticFilterBucket from './bucket/cosmetics'; +import Lists, { IListDiff, parseFilters } from '../lists'; +import CosmeticFilterBucket from './bucket/cosmetic'; import NetworkFilterBucket from './bucket/network'; -import IList from './list'; import { createStylesheet } from '../content/injection'; +export const ENGINE_VERSION = 17; + // Polyfill for `btoa` 
function btoaPolyfill(buffer: string): string { if (typeof btoa !== 'undefined') { @@ -21,28 +22,60 @@ function btoaPolyfill(buffer: string): string { return buffer; } -function iterFilters( - lists: Map, - select: (l: IList) => F[], - cb: (f: F) => void, -): void { - lists.forEach((list: IList) => { - const filters: F[] = select(list); - for (let i = 0; i < filters.length; i += 1) { - cb(filters[i]); - } - }); -} - interface IOptions { + debug: boolean; + enableOptimizations: boolean; loadCosmeticFilters: boolean; loadNetworkFilters: boolean; - optimizeAOT: boolean; - enableOptimizations: boolean; + enableUpdates: boolean; } export default class FilterEngine { - public lists: Map; + public static parse(filters: string, options: Partial = {}): FilterEngine { + return new FilterEngine({ + ...parseFilters(filters, options), + ...options, + }); + } + + public static deserialize(serialized: Uint8Array): FilterEngine { + const buffer = StaticDataView.fromUint8Array(serialized); + + // Before starting deserialization, we make sure that the version of the + // serialized engine is the same as the current source code. If not, we start + // fresh and create a new engine from the lists. 
+ const serializedEngineVersion = buffer.getUint8(); + if (ENGINE_VERSION !== serializedEngineVersion) { + throw new Error('serialized engine version mismatch'); + } + + // Create a new engine with same options + const engine = new FilterEngine({ + debug: false, + enableOptimizations: buffer.getBool(), + enableUpdates: buffer.getBool(), + loadCosmeticFilters: buffer.getBool(), + loadNetworkFilters: buffer.getBool(), + }); + + // Deserialize resources + engine.resources = Resources.deserialize(buffer); + + // Deserialize lists + engine.lists = Lists.deserialize(buffer); + + // Deserialize buckets + engine.filters = NetworkFilterBucket.deserialize(buffer); + engine.exceptions = NetworkFilterBucket.deserialize(buffer); + engine.importants = NetworkFilterBucket.deserialize(buffer); + engine.redirects = NetworkFilterBucket.deserialize(buffer); + engine.csp = NetworkFilterBucket.deserialize(buffer); + engine.cosmetics = CosmeticFilterBucket.deserialize(buffer); + + return engine; + } + + public lists: Lists; public csp: NetworkFilterBucket; public exceptions: NetworkFilterBucket; @@ -51,123 +84,199 @@ export default class FilterEngine { public filters: NetworkFilterBucket; public cosmetics: CosmeticFilterBucket; - public size: number; - - public resourceChecksum: string; - public js: Map; - public resources: Map; + public resources: Resources; - public loadCosmeticFilters: boolean; - public loadNetworkFilters: boolean; - public optimizeAOT: boolean; - public enableOptimizations: boolean; + public readonly debug: boolean; + public readonly enableOptimizations: boolean; + public readonly enableUpdates: boolean; + public readonly loadCosmeticFilters: boolean; + public readonly loadNetworkFilters: boolean; constructor({ + // Optionally initialize the engine with filters + cosmeticFilters = [], + networkFilters = [], + + // Options + debug = false, enableOptimizations = true, + enableUpdates = true, loadCosmeticFilters = true, loadNetworkFilters = true, - optimizeAOT = true, 
- }: IOptions) { + }: { + cosmeticFilters?: CosmeticFilter[]; + networkFilters?: NetworkFilter[]; + } & Partial = {}) { // Options + this.debug = debug; + this.enableOptimizations = enableOptimizations; + this.enableUpdates = enableUpdates; this.loadCosmeticFilters = loadCosmeticFilters; this.loadNetworkFilters = loadNetworkFilters; - this.optimizeAOT = optimizeAOT; - this.enableOptimizations = enableOptimizations; - this.lists = new Map(); - this.size = 0; + // Subscription management: disabled by default + this.lists = new Lists({ + debug: this.debug, + loadCosmeticFilters: this.loadCosmeticFilters, + loadNetworkFilters: this.loadNetworkFilters, + }); // $csp= - this.csp = new NetworkFilterBucket('csp', undefined, false); + this.csp = new NetworkFilterBucket({ enableOptimizations: false }); // @@filter - this.exceptions = new NetworkFilterBucket('exceptions'); + this.exceptions = new NetworkFilterBucket(); // $important - this.importants = new NetworkFilterBucket('importants'); + this.importants = new NetworkFilterBucket(); // $redirect - this.redirects = new NetworkFilterBucket('redirects'); + this.redirects = new NetworkFilterBucket(); // All other filters - this.filters = new NetworkFilterBucket('filters'); + this.filters = new NetworkFilterBucket(); // Cosmetic filters this.cosmetics = new CosmeticFilterBucket(); // Injections - this.resourceChecksum = ''; - this.js = new Map(); - this.resources = new Map(); + this.resources = new Resources(); + + if (networkFilters.length !== 0 || cosmeticFilters.length !== 0) { + this.update({ + newCosmeticFilters: cosmeticFilters, + newNetworkFilters: networkFilters, + }); + } } - public serialize(): Uint8Array { - return serializeEngine(this); + /** + * Creates a binary representation of the full engine. It can be stored + * on-disk for faster loading of the adblocker. The `deserialize` static + * method of Engine can be used to restore the engine. 
+ */ + public serialize(array?: Uint8Array): Uint8Array { + // Create a big buffer! It should always be bigger than the serialized + // engine since `StaticDataView` will neither resize it nor detect overflows + // (for efficiency purposes). + const buffer = StaticDataView.fromUint8Array(array || new Uint8Array(9000000)); + + buffer.pushUint8(ENGINE_VERSION); + + buffer.pushBool(this.enableOptimizations); + buffer.pushBool(this.enableUpdates); + buffer.pushBool(this.loadCosmeticFilters); + buffer.pushBool(this.loadNetworkFilters); + + // Resources (js, resources) + this.resources.serialize(buffer); + + // Subscription management + this.lists.serialize(buffer); + + // Filters buckets + this.filters.serialize(buffer); + this.exceptions.serialize(buffer); + this.importants.serialize(buffer); + this.redirects.serialize(buffer); + this.csp.serialize(buffer); + this.cosmetics.serialize(buffer); + + return buffer.crop(); } - public hasList(asset: string, checksum: string): boolean { - const list = this.lists.get(asset); - if (list !== undefined) { - return list.checksum === checksum; - } - return false; + /** + * Update engine with new filters or resources. + */ + + public loadedLists(): string[] { + return this.lists.getLoaded(); } - public onUpdateResource(updates: Array<{ filters: string; checksum: string }>): void { - for (let i = 0; i < updates.length; i += 1) { - const { filters, checksum } = updates[i]; + public hasList(name: string, checksum: string): boolean { + return this.lists.has(name, checksum); + } - // NOTE: Here we can only handle one resource file at a time. 
- this.resourceChecksum = checksum; - const typeToResource = parseJSResource(filters); + public deleteLists(names: string[]): boolean { + return this.update(this.lists.delete(names)); + } - // the resource containing javascirpts to be injected - const js = typeToResource.get('application/javascript'); - if (js !== undefined) { - this.js = js; - } + public deleteList(name: string): boolean { + return this.update(this.lists.delete([name])); + } - // Create a mapping from resource name to { contentType, data } - // used for request redirection. - typeToResource.forEach((resources, contentType) => { - resources.forEach((data, name) => { - this.resources.set(name, { - contentType, - data, - }); - }); - }); + public updateLists(lists: Array<{ name: string; checksum: string; list: string }>): boolean { + if (this.enableUpdates === false) { + return false; } + + return this.update(this.lists.update(lists)); } - public onUpdateFilters( - lists: Array<{ filters: string; checksum: string; asset: string }>, - loadedAssets: Set = new Set(), - debug: boolean = false, - ): void { - // Remove assets if needed - this.lists.forEach((_, asset) => { - if (!loadedAssets.has(asset)) { - this.lists.delete(asset); - } - }); + public updateList({ + name, + checksum, + list, + }: { + name: string; + checksum: string; + list: string; + }): boolean { + return this.updateLists([{ name, checksum, list }]); + } - // Parse all filters and update `this.lists` - for (let i = 0; i < lists.length; i += 1) { - const { asset, filters, checksum } = lists[i]; + /** + * Update engine with `resources.txt` content. 
+ */ + public updateResources(data: string, checksum: string): boolean { + if (this.enableUpdates === false) { + return false; + } - // Parse and dispatch filters depending on type - const { cosmeticFilters, networkFilters } = parseList(filters, { - debug, - loadCosmeticFilters: this.loadCosmeticFilters, - loadNetworkFilters: this.loadNetworkFilters, - }); + if (this.resources.checksum === checksum) { + return false; + } - // Network filters - const miscFilters: NetworkFilter[] = []; - const exceptions: NetworkFilter[] = []; + this.resources = Resources.parse(data, { checksum }); + return true; + } + + /** + * Update engine with new filters as well as optionally removed filters. + */ + public update({ + newNetworkFilters = [], + newCosmeticFilters = [], + removedCosmeticFilters = [], + removedNetworkFilters = [], + }: Partial): boolean { + if (this.enableUpdates === false) { + return false; + } + + let updated: boolean = false; + + // Update cosmetic filters + if ( + this.loadCosmeticFilters && + (newCosmeticFilters.length !== 0 || removedCosmeticFilters.length !== 0) + ) { + updated = true; + this.cosmetics.update( + newCosmeticFilters, + removedCosmeticFilters.length === 0 ? 
undefined : new Set(removedCosmeticFilters), + ); + } + + // Update network filters + if ( + this.loadNetworkFilters && + (newNetworkFilters.length !== 0 || removedNetworkFilters.length !== 0) + ) { + updated = true; + const filters: NetworkFilter[] = []; const csp: NetworkFilter[] = []; + const exceptions: NetworkFilter[] = []; const importants: NetworkFilter[] = []; const redirects: NetworkFilter[] = []; - // Dispatch filters into their bucket - for (let j = 0; j < networkFilters.length; j += 1) { - const filter = networkFilters[j]; + for (let i = 0; i < newNetworkFilters.length; i += 1) { + const filter = newNetworkFilters[i]; if (filter.isCSP()) { csp.push(filter); } else if (filter.isException()) { @@ -177,77 +286,33 @@ export default class FilterEngine { } else if (filter.isRedirect()) { redirects.push(filter); } else { - miscFilters.push(filter); + filters.push(filter); } } - this.lists.set(asset, { - checksum, - cosmetics: cosmeticFilters, - csp, - exceptions, - filters: miscFilters, - importants, - redirects, - }); - } + const removedNetworkFiltersSet: Set | undefined = + removedNetworkFilters.length === 0 ? 
undefined : new Set(removedNetworkFilters); - // Re-create all buckets - this.filters = new NetworkFilterBucket( - 'filters', - (cb: (f: NetworkFilter) => void) => iterFilters(this.lists, (l) => l.filters, cb), - this.enableOptimizations, - ); - this.csp = new NetworkFilterBucket( - 'csp', - (cb: (f: NetworkFilter) => void) => iterFilters(this.lists, (l) => l.csp, cb), - false, // Disable optimizations - ); - this.exceptions = new NetworkFilterBucket( - 'exceptions', - (cb: (f: NetworkFilter) => void) => iterFilters(this.lists, (l) => l.exceptions, cb), - this.enableOptimizations, - ); - this.importants = new NetworkFilterBucket( - 'importants', - (cb: (f: NetworkFilter) => void) => iterFilters(this.lists, (l) => l.importants, cb), - this.enableOptimizations, - ); - this.redirects = new NetworkFilterBucket( - 'redirects', - (cb: (f: NetworkFilter) => void) => iterFilters(this.lists, (l) => l.redirects, cb), - this.enableOptimizations, - ); - - // Eagerly collect filters in this case only - this.cosmetics = new CosmeticFilterBucket((cb: (f: CosmeticFilter) => void) => - iterFilters(this.lists, (l) => l.cosmetics, cb), - ); - - // Update size - this.size = - this.cosmetics.size + - this.csp.size + - this.exceptions.size + - this.filters.size + - this.importants.size + - this.redirects.size; - - // Optimize ahead of time if asked for - if (this.optimizeAOT) { - this.optimize(); + // Update buckets in-place + this.filters.update(filters, removedNetworkFiltersSet); + this.csp.update(csp, removedNetworkFiltersSet); + this.exceptions.update(exceptions, removedNetworkFiltersSet); + this.importants.update(importants, removedNetworkFiltersSet); + this.redirects.update(redirects, removedNetworkFiltersSet); } - } - public optimize() { - this.filters.optimizeAheadOfTime(); - this.exceptions.optimizeAheadOfTime(); - this.importants.optimizeAheadOfTime(); - this.redirects.optimizeAheadOfTime(); - // Cosmetic bucket does not expose any optimization yet. 
- // this.cosmetics.optimizeAheadOfTime(); + return updated; } + /** + * Matching APIs. The following methods are used to retrieve matching filters + * either to apply cosmetics on a page or alter network requests. + */ + + /** + * Given `hostname` and `domain` of a page (or frame), return the list of + * styles and scripts to inject in the page. + */ public getCosmeticsFilters(hostname: string, domain: string | null | undefined) { const selectorsPerStyle: Map = new Map(); const scripts: string[] = []; @@ -261,7 +326,7 @@ export default class FilterEngine { if (rule.isScriptBlock()) { blockedScripts.push(rule.getSelector()); } else if (rule.isScriptInject()) { - const script = rule.getScript(this.js); + const script = rule.getScript(this.resources.js); if (script !== undefined) { scripts.push(script); } @@ -290,6 +355,9 @@ export default class FilterEngine { }; } + /** + * Given a `request`, return all matching network filters found in the engine. + */ public matchAll(request: Request): Set { const filters: NetworkFilter[] = []; if (request.isSupported) { @@ -303,6 +371,10 @@ export default class FilterEngine { return new Set(filters); } + /** + * Given a "main_frame" request, check if some content security policies + * should be injected in the page. + */ public getCSPDirectives(request: Request): string | undefined { if (!this.loadNetworkFilters) { return undefined; @@ -334,6 +406,10 @@ export default class FilterEngine { return [...enabledCsp].filter((csp) => !disabledCsp.has(csp)).join('; ') || undefined; } + /** + * Decide if a network request (usually from WebRequest API) should be + * blocked, redirected or allowed. 
+ */ public match( request: Request, ): { @@ -379,7 +455,7 @@ export default class FilterEngine { // If there is a match if (filter !== undefined) { if (filter.isRedirect()) { - const redirectResource = this.resources.get(filter.getRedirect()); + const redirectResource = this.resources.getResource(filter.getRedirect()); if (redirectResource !== undefined) { const { data, contentType } = redirectResource; let dataUrl; diff --git a/src/engine/list.ts b/src/engine/list.ts deleted file mode 100644 index 8890be531d..0000000000 --- a/src/engine/list.ts +++ /dev/null @@ -1,12 +0,0 @@ -import { CosmeticFilter } from '../parsing/cosmetic-filter'; -import { NetworkFilter } from '../parsing/network-filter'; - -export default interface IList { - checksum: string; - cosmetics: CosmeticFilter[]; - exceptions: NetworkFilter[]; - csp: NetworkFilter[]; - filters: NetworkFilter[]; - importants: NetworkFilter[]; - redirects: NetworkFilter[]; -} diff --git a/src/engine/optimizer.ts b/src/engine/optimizer.ts index a908120c5b..85ee5725a2 100644 --- a/src/engine/optimizer.ts +++ b/src/engine/optimizer.ts @@ -1,4 +1,4 @@ -import { NETWORK_FILTER_MASK, NetworkFilter } from '../parsing/network-filter'; +import NetworkFilter, { NETWORK_FILTER_MASK } from '../filters/network'; import { setBit } from '../utils'; function processRegex(r: RegExp): string { diff --git a/src/engine/reverse-index.ts b/src/engine/reverse-index.ts index 5cef4bb1da..5529cf730b 100644 --- a/src/engine/reverse-index.ts +++ b/src/engine/reverse-index.ts @@ -1,41 +1,54 @@ -import IFilter from '../parsing/interface'; +import StaticDataView from '../data-view'; +import IFilter from '../filters/interface'; import { fastHash } from '../utils'; -class DefaultMap { - private map: Map; - private ctr: () => V; +// https://graphics.stanford.edu/~seander/bithacks.html#RoundUpPowerOf2 +function nextPow2(v: number): number { + v--; + v |= v >> 1; + v |= v >> 2; + v |= v >> 4; + v |= v >> 8; + v |= v >> 16; + v++; + return v; +} + 
+/** + * Counter implemented on top of Map. + */ +class Counter { + private counter: Map; - constructor(ctr: () => V) { - this.map = new Map(); - this.ctr = ctr; + constructor() { + this.counter = new Map(); } - public getMap() { - return this.map; + public incr(key: K): void { + this.counter.set(key, (this.counter.get(key) || 0) + 1); } - public set(key: K, value: V) { - this.map.set(key, value); + public get(key: K): number { + return this.counter.get(key) || 0; } - public get(key: K): V { - let value = this.map.get(key); - if (value === undefined) { - value = this.ctr(); - this.map.set(key, value); - } - return value; + public set(key: K, value: number): void { + this.counter.set(key, value); } } -function noop(filters: T[]): T[] { +/** + * Optimizer which returns the list of original filters. + */ +function noopOptimize(filters: T[]): T[] { return filters; } -function noFilter(_: (f: T) => void): void { - /* do nothing */ -} - +/** + * Generate unique IDs for requests, which is used to avoid matching the same + * buckets multiple times on the same request (which can happen if a token + * appears more than once in a URL). + */ let UID = 1; function getNextId() { const id = UID; @@ -43,79 +56,231 @@ function getNextId() { return id; } -export interface IBucket { - filters: T[]; - magic: number; - optimized: boolean; - originals: T[] | undefined; -} +/** + * List of filters being indexed using the same token in the index. + */ +class Bucket { + public readonly filters: T[]; + public lastRequestSeen: number; -export function newBucket(filters: T[] = []): IBucket { - return { - filters, - magic: 0, - optimized: false, - originals: undefined, - }; + constructor(filters: T[] = []) { + this.filters = filters; + this.lastRequestSeen = 0; + } } /** - * Accelerating data structure based on a reverse token index. The creation of - * the index follows the following algorithm: - * 1. Tokenize each filter - * 2. Compute a histogram of frequency of each token (globally) - * 3. 
Select the best token for each filter (lowest frequency) + * The ReverseIndex is an accelerating data structure which allows finding a + * subset of the filters given a list of tokens seen in a URL. It is the core of + * the adblocker's matching capabilities. + * + * It has two main characteristics: + * 1. It should be very compact and be able to load fast. + * 2. It should be very fast. + * + * Conceptually, the reverse index dispatches filters in "buckets" (an array of + * one or more filters). Filters living in the same bucket are guaranteed to + * share at least one of their tokens (appearing in the pattern). For example: * - * By default, each filter is only indexed once, using its token having the - * lowest global frequency. This is to minimize the size of buckets. + * - Bucket 1 (ads): + * - /ads.js + * - /script/ads/tracking.js + * - /ads/ + * - Bucket 2 (tracking) + * - /tracking.js + * - ||tracking.com/cdn * - * The ReverseIndex can be extended in two ways to provide more advanced - * features: - * 1. It is possible to provide an `optimizer` function, which takes as input - * a list of filters (typically the content of a bucket) and returns another - * list of filters (new content of the bucket), more compact/efficient. This - * allows to dynamically optimize the filters and make matching time and memory - * consumption lower. This optimization can be done ahead of time on all - * buckets, or dynamically when a bucket is 'hot' (hit several times). + * We see that filters in "Bucket 1" are indexed using the token "ads" and + * "Bucket 2" using token "tracking". * - * Currently this is only available for network filters. + * This property allows us to quickly discard most of the filters when we match a + * URL. To achieve this, the URL is tokenized in the same way filters are + * tokenized and for each token, we check if there are some filters available. + * For example: * - * 2. Insert a filter multiple times (with multiple keys). 
It is sometimes - * needed to insert the same filter at different keys. For this purpose - * `getTokens` should return a list of list of tokens, so that it can be - * inserted several times. If you want it to be inserted only once, then - * returning a list of only one list of tokens will do the trick. + * URL "https://tracking.com/" has the following tokens: "https", "tracking" + * and "com". We immediately see that we only need to check the two filters in the + * "tracking" bucket since they are the only ones having a common token with + * the URL. * - * For each set of tokens returned by the `getTokens` function, the filter - * will be inserted once. This is currently used only for hostname dispatch of - * cosmetic filters. + * How do we pick the token for each filter? + * ========================================= + * + * Each filter is only indexed *once*, which means that we need to pick one of + * the tokens appearing in the pattern. We choose the token such that each filter + * is indexed using the token which was the *least seen* globally. In other + * words, we pick the most discriminative token for each filter. This is done + * using the following algorithm: + * 1. Tokenize all the filters which will be stored in the index + * 2. Compute a histogram of frequency of each token (globally) + * 3. 
Select the best token for each filter (lowest frequency) */ export default class ReverseIndex { - public size: number; - public index: Map>; - private optimizer: (filters: T[]) => T[]; - - constructor( - filters: (cb: (f: T) => void) => void = noFilter, - optimizer: (filters: T[]) => T[] = noop, - ) { - // Mapping from tokens to filters - this.index = new Map(); - this.size = 0; - - this.optimizer = optimizer; - this.addFilters(filters); + public static deserialize( + buffer: StaticDataView, + deserialize: (view: StaticDataView) => T, + optimize: (filters: T[]) => T[] = noopOptimize, + ): ReverseIndex { + const reverseIndex = new ReverseIndex({ + deserialize, + optimize, + }); + + reverseIndex.tokensLookupIndexSize = buffer.getUint32(); + reverseIndex.tokensLookupIndexStart = buffer.getUint32(); + + reverseIndex.view = StaticDataView.fromUint8Array(buffer.getBytes()); + + return reverseIndex; + } + + // Compact representation + private tokensLookupIndexStart: number; + private tokensLookupIndexSize: number; + private view: StaticDataView; + + private deserializeFilter: (view: StaticDataView) => T; + private readonly optimize: (filters: T[]) => T[]; + + // In-memory cache used to keep track of buckets which have been loaded from + // the compact representation (i.e.: this.view). It is not strictly necessary + // but will speed up retrieval of popular filters. + private cache: Map>; + + constructor({ + deserialize, + filters = [], + optimize = noopOptimize, + }: { + deserialize: (view: StaticDataView) => T; + filters?: T[]; + optimize?: (filters: T[]) => T[]; + }) { + // Function used to load a filter (e.g.: CosmeticFilter or NetworkFilter) + // from its compact representation. Each filter exposes a `serialize` method + // which is used to store it in `this.view`. While matching we need to + // retrieve the instance of the filter to perform matching and use + // `this.deserializeFilter` to do so. 
+ this.deserializeFilter = deserialize; + + // Optional function which will be used to optimize a list of filters + // in-memory. Typically this is used while matching when a list of filters + // are loaded in memory and stored in `this.cache`. Before using the bucket, + // we can `this.optimize` on the list of filters to allow some optimizations + // to be performed (e.g.: fusion of similar filters, etc.). Have a look into + // `./src/engine/optimizer.ts` for examples of such optimizations. + this.optimize = optimize; + + // Cache deserialized buckets in memory for faster retrieval. It is a + // mapping from token to `Bucket`. + this.cache = new Map(); + + // Compact representation of the reverse index (described at the top level + // comment of this class). It contains three distinct parts: + // + // 1. The list of all filters contained in this index, serialized + // contiguously (one after the other) in the typed array starting at index + // 0. This would look like: |f1|f2|f3|...|fn| Note that not all filters use + // the same amount of memory (or number of bytes) so the only way to + // navigate the compact representation at this point is to iterate through + // all of them from first to last. Which is why we need a small "index" to + // help us navigate this compact representation and leads us to the second + // section of `this.view`. + // + // 2. buckets index which conceptually can be understood as a way to group + // several buckets in the same neighborhood of the typed array. It could + // look something like: |bucket1|bucket2|bucket3|...| each bucket could + // contain multiple filters. In reality, for each section of this bucket + // index, we know how many filters there are and the filters for multiple + // buckets are interleaved. 
For example if the index starts with a section + // containing |bucket1|bucket2|bucket3| and these buckets have tokens `tok1`, + // `tok2` and `tok3` respectively, then the final representation in memory + // could be: |tok1|f1|tok2|f2|tok1|f3|tok3|f4|tok2|f5| (where `f1`, `f2`, + // etc. are indices to the serialized representation of each filter, in the + // same array, described in 1. above). + // + // 3. The last part is called "tokens lookup index" and allows locating the + // bucket given a suffix of the indexing token. If the binary representation + // of the token for bucket1 is 101010 and the suffix has size 3, then we would + // look up the "tokens lookup index" using the last 3 bits "010" which would + // give us the offset in our typed array where we can start reading the + // filters of buckets having a token ending with the same 3 bits. + this.view = new StaticDataView(0); + this.tokensLookupIndexSize = 0; + this.tokensLookupIndexStart = 0; + + // Optionally initialize the index with given filters. + if (filters.length !== 0) { + this.update(filters); + } + } + + /** + * Load all filters from this index in memory (i.e.: deserialize them from the + * byte array into NetworkFilter or CosmeticFilter instances). + */ + public getFilters(): T[] { + const view = this.view; + view.seekZero(); + const numberOfFilters = view.getUint32(); + const filters: T[] = []; + + for (let i = 0; i < numberOfFilters; i += 1) { + filters.push(this.deserializeFilter(view)); + } + + return filters; + } + + /** + * Return an array of all the tokens currently used as keys of the index. 
+ */ + public getTokens(): Uint32Array { + const tokens: Set = new Set(); + const view = this.view; + + for (let i = 0; i < this.tokensLookupIndexSize; i += 1) { + view.setPos(this.tokensLookupIndexStart + 4 * i); + const startOfBucket = view.getUint32(); + + // We do not have any filters for this token + if (startOfBucket !== Number.MAX_SAFE_INTEGER) { + view.setPos(startOfBucket); + + const numberOfFilters = view.getByte(); + for (let j = 0; j < numberOfFilters; j += 1) { + tokens.add(view.getUint32()); + view.pos += 4; // skip index of corresponding filter + } + } + } + + return new Uint32Array(tokens); + } + + /** + * Dump this index to `buffer`. + */ + public serialize(buffer: StaticDataView): void { + buffer.pushUint32(this.tokensLookupIndexSize); + buffer.pushUint32(this.tokensLookupIndexStart); + buffer.pushBytes(this.view.buffer); } /** * Iterate on all filters found in buckets associated with the given list of * tokens. The callback is called on each of them. Early termination can be * achieved if the callback returns `false`. + * + * This will not check if each filter returned would match a given request but + * is instead used as a list of potential candidates (much smaller than the + * total set of filters; typically between 5 and 10 filters will be checked). */ public iterMatchingFilters(tokens: Uint32Array, cb: (f: T) => boolean): void { // Each request is assigned an ID so that we can keep track of the last // request seen by each bucket in the reverse index. This provides a cheap - // way to prevent filters from being inspected more than once per request. + // way to prevent filters from being inspected more than once per request + // (which could happen if the same token appears more than once in the URL). const requestId = getNextId(); for (let i = 0; i < tokens.length; i += 1) { @@ -124,117 +289,153 @@ export default class ReverseIndex { } } - // Fallback to 0 bucket if nothing was found before. 
+ // Fallback to 0 (i.e.: wildcard bucket) bucket if nothing was found before. this.iterBucket(0, requestId, cb); } /** - * Force optimization of all buckets. + * Re-create the internal data-structure of the reverse index *in-place*. It + * needs to be called with a list of new filters and optionally a list of ids + * (as returned by either NetworkFilter.getId() or CosmeticFilter.getId()) + * which need to be removed from the index. */ - public optimizeAheadOfTime(): void { - if (this.optimizer !== undefined) { - this.index.forEach((bucket) => { - if (bucket.optimized === false) { - this.optimize(bucket); - } - }); - } - } - - private addFilters(iterFilters: (cb: (f: T) => void) => void): void { + public update(newFilters: T[], removedFilters?: Set): void { let totalNumberOfTokens = 0; + let totalNumberOfIndexedFilters = 0; + const filtersTokens: Array<{ filter: T; multiTokens: Uint32Array[] }> = []; + const histogram = new Counter(); - // Keep track of all filters with their tokens - const filters: Array<{ filter: T; multiTokens: Uint32Array[] }> = []; - - // Index will be used both as a histogram while constructing buckets and - // then as the final reverse index. We re-use the same Map to avoid having - // to construct two big ones. - const index = new DefaultMap>(newBucket); - - // The wildcard bucket will contains some filters for which we could not - // find any valid token. - const wildcardBucket = index.get(0); - - // Count number of occurrences of each token, globally - iterFilters((filter: T) => { - const multiTokens = filter.getTokens(); - filters.push({ - filter, - multiTokens, - }); - - for (let i = 0; i < multiTokens.length; i += 1) { - const tokens = multiTokens[i]; - for (let j = 0; j < tokens.length; j += 1) { - totalNumberOfTokens += 1; - index.get(tokens[j]).magic += 1; + // Compute tokens for all filters (the ones already contained in the index + // *plus* the new ones *minus* the ones removed ). 
+ [this.getFilters(), newFilters].forEach((filters) => { + for (let i = 0; i < filters.length; i += 1) { + const filter = filters[i]; + if (removedFilters === undefined || !removedFilters.has(filter.getId())) { + const multiTokens = filter.getTokens(); + filtersTokens.push({ + filter, + multiTokens, + }); + + for (let j = 0; j < multiTokens.length; j += 1) { + const tokens = multiTokens[j]; + totalNumberOfIndexedFilters += 1; + for (let k = 0; k < tokens.length; k += 1) { + totalNumberOfTokens += 1; + histogram.incr(tokens[k]); + } + } } } }); + // No filters given; reset to empty bucket + if (filtersTokens.length === 0) { + this.view = new StaticDataView(0); + this.tokensLookupIndexSize = 0; + this.tokensLookupIndexStart = 0; + this.cache = new Map(); + return; + } + // Add an heavy weight on these common patterns because they appear in // almost all URLs. If there is a choice, then filters should use other // tokens than those. ['http', 'https', 'www', 'com'].forEach((badToken) => { - index.get(fastHash(badToken)).magic = totalNumberOfTokens; + histogram.set(fastHash(badToken), totalNumberOfTokens); }); - // For each filter, take the best token (least seen) - for (let i = 0; i < filters.length; i += 1) { - const { filter, multiTokens } = filters[i]; - let wildCardInserted = false; + // Prepare tokensLookupIndex. This is an array where keys are suffixes of N + // bits from tokens (the ones used to index filters in the index) and values + // are indices to compact representation of buckets. Each bucket contains a + // list of filters associated with a token with identical N bits suffix. + // This allows to quickly identify the potential filters given a query + // token. + const tokensLookupIndexSize = Math.max(2, nextPow2(totalNumberOfIndexedFilters)); + const mask = tokensLookupIndexSize - 1; + const prefixes: Array> = []; + for (let i = 0; i <= mask; i += 1) { + prefixes.push([]); + } + // This byte array contains all the filters serialized consecutively. 
Having + // them separately from the reverse index structure allows filters to be + // indexed more than once while not paying extra storage cost. + // `buffer` is a contiguous chunk of memory which will be used to store 3 + // kinds of data: + // 1. The first section contains all the filters stored in the index + // 2. The second section contains the compact buckets where filter having + // their indexing token sharing the last N bits are grouped together. + const buffer = new StaticDataView(6000000); + buffer.pushUint32(filtersTokens.length); + + // For each filter, find the best token (least seen) + for (let i = 0; i < filtersTokens.length; i += 1) { + const { filter, multiTokens } = filtersTokens[i]; + + // Serialize this filter and keep track of its index in the byte array + const filterIndex = buffer.getPos(); + filter.serialize(buffer); + + // Index the filter once per "tokens" for (let j = 0; j < multiTokens.length; j += 1) { - const tokens = multiTokens[j]; + const tokens: Uint32Array = multiTokens[j]; - let bestBucket; - let count = totalNumberOfTokens + 1; + // Find best token (least seen) from `tokens` using `histogram`. + let bestToken: number = 0; + let minCount: number = totalNumberOfTokens + 1; for (let k = 0; k < tokens.length; k += 1) { - const bucket = index.get(tokens[k]); - if (bucket.magic <= count) { - count = bucket.magic; - bestBucket = bucket; - - if (count === 1) { + const tokenCount = histogram.get(tokens[k]); + if (tokenCount <= minCount) { + minCount = tokenCount; + bestToken = tokens[k]; + + // Fast path, if the current token has only been seen once, we can + // stop iterating since we will not find better! 
+ if (minCount === 1) { break; } } } - // Only allow each filter to be present one time in the wildcard - if (bestBucket === undefined) { - if (wildCardInserted === false) { - wildCardInserted = true; - wildcardBucket.filters.push(filter); - } - } else { - bestBucket.filters.push(filter); - } + // `bestToken & mask` represents the N last bits of `bestToken`. We + // group all filters indexed with a token sharing the same N bits. + prefixes[bestToken & mask].push({ + index: filterIndex, + token: bestToken, + }); } } - this.size = filters.length; - this.index = index.getMap(); - - // Clean-up empty buckets - this.index.forEach((bucket, key, map) => { - bucket.magic = 0; - if (bucket.filters.length === 0) { - map.delete(key); - } - }); - } - - private optimize(bucket: IBucket): void { - if (this.optimizer !== undefined && bucket.optimized === false) { - if (bucket.filters.length > 1) { - bucket.originals = bucket.filters; - bucket.filters = this.optimizer(bucket.filters); + // We finished dumping all the filters so now starts the buckets index section + const tokensLookupIndex = new Uint32Array(tokensLookupIndexSize); + const emptyBucket = Number.MAX_SAFE_INTEGER; + for (let i = 0; i < tokensLookupIndexSize; i += 1) { + const filtersForMask = prefixes[i]; + if (filtersForMask.length === 0) { + tokensLookupIndex[i] = emptyBucket; + } else { + tokensLookupIndex[i] = buffer.getPos(); + buffer.pushByte(filtersForMask.length); + for (let j = 0; j < filtersForMask.length; j += 1) { + const { token, index } = filtersForMask[j]; + buffer.pushUint32(token); + buffer.pushUint32(index); + } } + } - bucket.optimized = true; + // Write lookupIndex in buffer. It will be used to locate the corresponding + // bucket, in the same buffer. 
+ const tokensLookupIndexStart = buffer.getPos(); + for (let i = 0; i < tokensLookupIndexSize; i += 1) { + buffer.pushUint32(tokensLookupIndex[i]); } + + this.cache = new Map(); + this.tokensLookupIndexStart = tokensLookupIndexStart; + this.tokensLookupIndexSize = tokensLookupIndexSize; + this.view = StaticDataView.fromUint8Array(buffer.slice()); } /** @@ -243,14 +444,57 @@ export default class ReverseIndex { * as soon as `false` is returned from the callback. */ private iterBucket(token: number, requestId: number, cb: (f: T) => boolean): boolean { - const bucket = this.index.get(token); - if (bucket !== undefined && bucket.magic !== requestId) { - bucket.magic = requestId; + let bucket: Bucket | undefined = this.cache.get(token); + + // Lazily create bucket if it does not yet exist in memory. Lookup the + // compact bucket representation and find all filters being associated with + // `token`. Create a `Bucket` out of them and store them in cache. + if (bucket === undefined) { + const offset = token & (this.tokensLookupIndexSize - 1); + + const view = this.view; + view.setPos(this.tokensLookupIndexStart + 4 * offset); + const startOfBucket = view.getUint32(); + + // We do not have any filters for this token + if (startOfBucket === Number.MAX_SAFE_INTEGER) { + return true; + } + + // Get indices of filters indexed with `token`, if any. + view.setPos(startOfBucket); + + const numberOfFilters = view.getByte(); + const filtersIndices: number[] = []; + for (let i = 0; i < numberOfFilters; i += 1) { + const currentToken = view.getUint32(); + if (currentToken === token) { + filtersIndices.push(view.getUint32()); + } else { + view.pos += 4; // skip one 32bits number + } + } + + // No filter indexed with `token`. 
+ if (filtersIndices.length === 0) { + return true; // continue looking for a match + } - if (bucket.optimized === false) { - this.optimize(bucket); + // If we have filters for `token` then deserialize filters in memory and + // create a `Bucket` instance to hold them for future access. + const filters: T[] = []; + for (let i = 0; i < filtersIndices.length; i += 1) { + view.setPos(filtersIndices[i]); + filters.push(this.deserializeFilter(view)); } + bucket = new Bucket(filters.length > 1 ? this.optimize(filters) : filters); + this.cache.set(token, bucket); + } + + // Look for matching filter in this bucket + if (bucket !== undefined && bucket.lastRequestSeen !== requestId) { + bucket.lastRequestSeen = requestId; const filters = bucket.filters; for (let i = 0; i < filters.length; i += 1) { // Break the loop if the callback returns `false` diff --git a/src/filters/cosmetic.ts b/src/filters/cosmetic.ts new file mode 100644 index 0000000000..645f56caca --- /dev/null +++ b/src/filters/cosmetic.ts @@ -0,0 +1,703 @@ +import * as punycode from 'punycode'; +import StaticDataView from '../data-view'; +import { binLookup, fastStartsWithFrom, getBit, hasUnicode, setBit } from '../utils'; +import IFilter from './interface'; + +export const DEFAULT_HIDDING_STYLE: string = 'display: none !important;'; + +export function hashHostnameBackward(hostname: string): number { + let hash = 5381; + for (let j = hostname.length - 1; j >= 0; j -= 1) { + hash = (hash * 33) ^ hostname.charCodeAt(j); + } + return hash >>> 0; +} + +export function getHashesFromLabelsBackward( + hostname: string, + end: number, + startOfDomain: number, +): number[] { + const hashes: number[] = []; + let hash = 5381; + + // Compute hash backward, label per label + for (let i = end - 1; i >= 0; i -= 1) { + // Process label + if (hostname[i] === '.' 
&& i < startOfDomain) { + hashes.push(hash >>> 0); + } + + // Update hash + hash = (hash * 33) ^ hostname.charCodeAt(i); + } + + hashes.push(hash >>> 0); + return hashes; +} + +export function getEntityHashesFromLabelsBackward(hostname: string, domain: string): number[] { + const hostnameWithoutPublicSuffix = getHostnameWithoutPublicSuffix(hostname, domain); + if (hostnameWithoutPublicSuffix !== null) { + return getHashesFromLabelsBackward( + hostnameWithoutPublicSuffix, + hostnameWithoutPublicSuffix.length, + hostnameWithoutPublicSuffix.length, + ); + } + return []; +} + +export function getHostnameHashesFromLabelsBackward(hostname: string, domain: string): number[] { + return getHashesFromLabelsBackward(hostname, hostname.length, hostname.length - domain.length); +} + +/** + * Given a hostname and its domain, return the hostname without the public + * suffix. We know that the domain, with one less label on the left, will be + * the public suffix; and from there we know which trailing portion of + * `hostname` we should remove. + */ +export function getHostnameWithoutPublicSuffix(hostname: string, domain: string): string | null { + let hostnameWithoutPublicSuffix: string | null = null; + + const indexOfDot = domain.indexOf('.'); + if (indexOfDot !== -1) { + const publicSuffix = domain.slice(indexOfDot + 1); + hostnameWithoutPublicSuffix = hostname.slice(0, -publicSuffix.length - 1); + } + + return hostnameWithoutPublicSuffix; +} + +/** + * Validate CSS selector. There is a fast path for simple selectors (e.g.: #foo + * or .bar) which are the most common case. For complex ones, we rely on + * `Element.matches` (if available). + */ +const isValidCss = (() => { + const div = + typeof document !== 'undefined' + ? 
document.createElement('div') + : { + matches: () => { + /* noop */ + }, + }; + const matches = (selector: string): void | boolean => div.matches(selector); + const validSelectorRe = /^[#.]?[\w-.]+$/; + + return function isValidCssImpl(selector: string): boolean { + if (validSelectorRe.test(selector)) { + return true; + } + + try { + matches(selector); + } catch (ex) { + return false; + } + + return true; + }; +})(); + +/** + * Masks used to store options of cosmetic filters in a bitmask. + */ +const enum COSMETICS_MASK { + unhide = 1 << 0, + scriptInject = 1 << 1, + scriptBlock = 1 << 2, +} + +function computeFilterId( + mask: number, + selector: string | undefined, + hostnames: Uint32Array | undefined, + entities: Uint32Array | undefined, + notHostnames: Uint32Array | undefined, + notEntities: Uint32Array | undefined, +): number { + let hash = (5408 * 33) ^ mask; + + if (selector !== undefined) { + for (let i = 0; i < selector.length; i += 1) { + hash = (hash * 33) ^ selector.charCodeAt(i); + } + } + + if (hostnames !== undefined) { + for (let i = 0; i < hostnames.length; i += 1) { + hash = (hash * 33) ^ hostnames[i]; + } + } + + if (entities !== undefined) { + for (let i = 0; i < entities.length; i += 1) { + hash = (hash * 33) ^ entities[i]; + } + } + + if (notHostnames !== undefined) { + for (let i = 0; i < notHostnames.length; i += 1) { + hash = (hash * 33) ^ notHostnames[i]; + } + } + + if (notEntities !== undefined) { + for (let i = 0; i < notEntities.length; i += 1) { + hash = (hash * 33) ^ notEntities[i]; + } + } + + return hash >>> 0; +} + +/*************************************************************************** + * Cosmetic filters parsing + * ************************************************************************ */ + +/** + * TODO: Make sure these are implemented properly and write tests. 
+ * - -abp-contains + * - -abp-has + * - contains + * - has + * - has-text + * - if + * - if-not + * - matches-css + * - matches-css-after + * - matches-css-before + * - xpath + */ +export default class CosmeticFilter implements IFilter { + /** + * Given a line that we know contains a cosmetic filter, create a CosmeticFilter + * instance out of it. This function should be *very* efficient, as it will be + * used to parse tens of thousands of lines. + */ + public static parse(line: string, debug: boolean = false): CosmeticFilter | null { + // Mask to store attributes + // Each flag (unhide, scriptInject, etc.) takes only 1 bit + // at a specific offset defined in COSMETICS_MASK. + // cf: COSMETICS_MASK for the offset of each property + let mask = 0; + let selector: string | undefined; + let hostnames: Uint32Array | undefined; + let notHostnames: Uint32Array | undefined; + let entities: Uint32Array | undefined; + let notEntities: Uint32Array | undefined; + let style: string | undefined; + const sharpIndex = line.indexOf('#'); + + // Start parsing the line + const afterSharpIndex = sharpIndex + 1; + let suffixStartIndex = afterSharpIndex + 1; + + // hostname1,hostname2#@#.selector + // ^^ ^ + // || | + // || suffixStartIndex + // |afterSharpIndex + // sharpIndex + + // Check if unhide + if (line.length > afterSharpIndex && line[afterSharpIndex] === '@') { + mask = setBit(mask, COSMETICS_MASK.unhide); + suffixStartIndex += 1; + } + + // Parse hostnames and entities as well as their negations. + // + // - ~hostname##.selector + // - hostname##.selector + // - entity.*##.selector + // - ~entity.*##.selector + // + // Each kind will have its own Uint32Array containing hashes, sorted by + // number of labels considered. This allows a compact representation of + // hostnames and fast matching without any string copy. 
+ if (sharpIndex > 0) { + const entitiesArray: number[] = []; + const notEntitiesArray: number[] = []; + const hostnamesArray: number[] = []; + const notHostnamesArray: number[] = []; + + // TODO - this could be done without any string copy + line + .slice(0, sharpIndex) + .split(',') + .forEach((hostname) => { + if (hasUnicode(hostname)) { + hostname = punycode.encode(hostname); + } + + const negation: boolean = hostname[0] === '~'; + const entity: boolean = hostname.endsWith('.*'); + + const start: number = negation ? 1 : 0; + const end: number = entity ? hostname.length - 2 : hostname.length; + + const hash = hashHostnameBackward(hostname.slice(start, end)); + + if (negation) { + if (entity) { + notEntitiesArray.push(hash); + } else { + notHostnamesArray.push(hash); + } + } else { + if (entity) { + entitiesArray.push(hash); + } else { + hostnamesArray.push(hash); + } + } + }); + + if (entitiesArray.length !== 0) { + entities = new Uint32Array(entitiesArray).sort(); + } + + if (hostnamesArray.length !== 0) { + hostnames = new Uint32Array(hostnamesArray).sort(); + } + + if (notEntitiesArray.length !== 0) { + notEntities = new Uint32Array(notEntitiesArray).sort(); + } + + if (notHostnamesArray.length !== 0) { + notHostnames = new Uint32Array(notHostnamesArray).sort(); + } + } + + // We should not have unhide without any hostname + if (getBit(mask, COSMETICS_MASK.unhide) && hostnames === undefined && entities === undefined) { + return null; + } + + // Deal with script:inject and script:contains + if (fastStartsWithFrom(line, 'script:', suffixStartIndex)) { + // script:inject(.......) 
+ // ^ ^ + // script:contains(/......./) + // ^ ^ + // script:contains(selector[, args]) + // ^ ^ ^^ + // | | | || + // | | | |selector.length + // | | | scriptSelectorIndexEnd + // | | |scriptArguments + // | scriptSelectorIndexStart + // scriptMethodIndex + const scriptMethodIndex = suffixStartIndex + 7; + let scriptSelectorIndexStart = scriptMethodIndex; + let scriptSelectorIndexEnd = line.length - 1; + + if (fastStartsWithFrom(line, 'inject(', scriptMethodIndex)) { + mask = setBit(mask, COSMETICS_MASK.scriptInject); + scriptSelectorIndexStart += 7; + } else if (fastStartsWithFrom(line, 'contains(', scriptMethodIndex)) { + mask = setBit(mask, COSMETICS_MASK.scriptBlock); + scriptSelectorIndexStart += 9; + + // If it's a regex + if (line[scriptSelectorIndexStart] === '/' && line[scriptSelectorIndexEnd - 1] === '/') { + scriptSelectorIndexStart += 1; + scriptSelectorIndexEnd -= 1; + } + } + + selector = line.slice(scriptSelectorIndexStart, scriptSelectorIndexEnd); + } else if (fastStartsWithFrom(line, '+js(', suffixStartIndex)) { + mask = setBit(mask, COSMETICS_MASK.scriptInject); + selector = line.slice(suffixStartIndex + 4, line.length - 1); + } else { + // Detect special syntax + let indexOfColon = line.indexOf(':', suffixStartIndex); + while (indexOfColon !== -1) { + const indexAfterColon = indexOfColon + 1; + if (fastStartsWithFrom(line, 'style', indexAfterColon)) { + // ##selector :style(...) 
+ if (line[indexAfterColon + 5] === '(' && line[line.length - 1] === ')') { + selector = line.slice(suffixStartIndex, indexOfColon); + style = line.slice(indexAfterColon + 6, -1); + } else { + return null; + } + } else if ( + fastStartsWithFrom(line, '-abp-', indexAfterColon) || + fastStartsWithFrom(line, 'contains', indexAfterColon) || + fastStartsWithFrom(line, 'has', indexAfterColon) || + fastStartsWithFrom(line, 'if', indexAfterColon) || + fastStartsWithFrom(line, 'if-not', indexAfterColon) || + fastStartsWithFrom(line, 'matches-css', indexAfterColon) || + fastStartsWithFrom(line, 'matches-css-after', indexAfterColon) || + fastStartsWithFrom(line, 'matches-css-before', indexAfterColon) || + fastStartsWithFrom(line, 'not', indexAfterColon) || + fastStartsWithFrom(line, 'properties', indexAfterColon) || + fastStartsWithFrom(line, 'subject', indexAfterColon) || + fastStartsWithFrom(line, 'xpath', indexAfterColon) + ) { + return null; + } + indexOfColon = line.indexOf(':', indexAfterColon); + } + + // If we reach this point, filter is not extended syntax + if (selector === undefined && suffixStartIndex < line.length) { + selector = line.slice(suffixStartIndex); + } + + if (selector === undefined || !isValidCss(selector)) { + // Not a valid selector + return null; + } + } + + return new CosmeticFilter({ + entities, + hostnames, + mask, + notEntities, + notHostnames, + rawLine: debug === true ? line : undefined, + selector, + style, + }); + } + + /** + * Deserialize cosmetic filters. The code accessing the buffer should be + * symmetrical to the one in `serialize`. + */ + public static deserialize(buffer: StaticDataView): CosmeticFilter { + const mask = buffer.getUint8(); + const selector = buffer.getUTF8(); + const optionalParts = buffer.getUint8(); + + // The order of these fields should be the same as when we serialize them. 
+ return new CosmeticFilter({ + // Mandatory fields + mask, + selector, + + // Optional fields + entities: (optionalParts & 1) === 1 ? buffer.getUint32Array() : undefined, + hostnames: (optionalParts & 2) === 2 ? buffer.getUint32Array() : undefined, + notEntities: (optionalParts & 4) === 4 ? buffer.getUint32Array() : undefined, + notHostnames: (optionalParts & 8) === 8 ? buffer.getUint32Array() : undefined, + rawLine: (optionalParts & 16) === 16 ? buffer.getUTF8() : undefined, + style: (optionalParts & 32) === 32 ? buffer.getASCII() : undefined, + }); + } + + public readonly mask: number; + public readonly selector: string; + + // hostnames + public readonly hostnames?: Uint32Array; + public readonly entities?: Uint32Array; + + // Exceptions + public readonly notHostnames?: Uint32Array; + public readonly notEntities?: Uint32Array; + + public readonly style?: string; + + public id?: number; + public rawLine?: string; + + constructor({ + entities, + hostnames, + mask, + notEntities, + notHostnames, + rawLine, + selector, + style, + }: Partial & { mask: number; selector: string }) { + this.entities = entities; + this.hostnames = hostnames; + this.mask = mask; + this.notEntities = notEntities; + this.notHostnames = notHostnames; + this.rawLine = rawLine; + this.selector = selector; + this.style = style; + } + + public isCosmeticFilter(): boolean { + return true; + } + + public isNetworkFilter(): boolean { + return false; + } + + /** + * The format of a cosmetic filter is: + * + * | mask | selector length | selector... | hostnames length | hostnames... + * 32 16 16 + * + * The header (mask) is 32 bits, then we have a total of 32 bits to store the + * length of `selector` and `hostnames` (16 bits each). + * + * Improvements similar to the ones mentioned in `serializeNetworkFilters` + * could be applied here, to get a more compact representation. 
+ */ + public serialize(buffer: StaticDataView): void { + // Mandatory fields + buffer.pushUint8(this.mask); + buffer.pushUTF8(this.selector); + + const index = buffer.getPos(); + buffer.pushUint8(0); + + // This bit-mask indicates which optional parts of the filter were serialized. + let optionalParts = 0; + + if (this.entities !== undefined) { + optionalParts |= 1; + buffer.pushUint32Array(this.entities); + } + + if (this.hostnames !== undefined) { + optionalParts |= 2; + buffer.pushUint32Array(this.hostnames); + } + + if (this.notEntities !== undefined) { + optionalParts |= 4; + buffer.pushUint32Array(this.notEntities); + } + + if (this.notHostnames !== undefined) { + optionalParts |= 8; + buffer.pushUint32Array(this.notHostnames); + } + + if (this.rawLine !== undefined) { + optionalParts |= 16; + buffer.pushUTF8(this.rawLine); + } + + if (this.style !== undefined) { + optionalParts |= 32; + buffer.pushASCII(this.style); + } + + buffer.setByte(index, optionalParts); + } + + /** + * Create a more human-readable version of this filter. It is mainly used for + * debugging purpose, as it will expand the values stored in the bit mask. 
+ */ + public toString(): string { + if (this.rawLine !== undefined) { + return this.rawLine; + } + + let filter = ''; + + if ( + this.hostnames !== undefined || + this.entities !== undefined || + this.notHostnames !== undefined || + this.notEntities !== undefined + ) { + filter += ''; + } + + if (this.isUnhide()) { + filter += '#@#'; + } else { + filter += '##'; + } + + if (this.isScriptInject()) { + filter += '+js('; + filter += this.selector; + filter += ')'; + } else if (this.isScriptBlock()) { + filter += 'script:contains('; + filter += this.selector; + filter += ')'; + } else { + filter += this.selector; + } + + return filter; + } + + public hasHostnameConstraint(): boolean { + return ( + this.hostnames !== undefined || + this.entities !== undefined || + this.notEntities !== undefined || + this.notHostnames !== undefined + ); + } + + public match(hostname: string, domain: string): boolean { + // No `hostname` available but this filter has some constraints on hostname. + if ( + !hostname && + (this.hostnames !== undefined || + this.entities !== undefined || + this.notHostnames !== undefined || + this.notEntities !== undefined) + ) { + return false; + } + + const entitiesHashes: number[] = + this.entities !== undefined || this.notEntities !== undefined + ? getEntityHashesFromLabelsBackward(hostname, domain) + : []; + const hostnameHashes: number[] = + this.hostnames !== undefined || this.notHostnames !== undefined + ? 
getHostnameHashesFromLabelsBackward(hostname, domain) + : []; + + // Check if `hostname` is blacklisted + if (this.notHostnames !== undefined) { + for (let i = 0; i < hostnameHashes.length; i += 1) { + if (binLookup(this.notHostnames, hostnameHashes[i])) { + return false; + } + } + } + + // Check if `hostname` is blacklisted by *entity* + if (this.notEntities !== undefined) { + for (let i = 0; i < entitiesHashes.length; i += 1) { + if (binLookup(this.notEntities, entitiesHashes[i])) { + return false; + } + } + } + + // Check if `hostname` is allowed + if (this.hostnames !== undefined || this.entities !== undefined) { + if (this.hostnames !== undefined) { + for (let i = 0; i < hostnameHashes.length; i += 1) { + if (binLookup(this.hostnames, hostnameHashes[i])) { + return true; + } + } + } + + if (this.entities !== undefined) { + for (let i = 0; i < entitiesHashes.length; i += 1) { + if (binLookup(this.entities, entitiesHashes[i])) { + return true; + } + } + } + + return false; + } + + return true; + } + + /** + * Get tokens for this filter. It can be indexed multiple times if multiple + * hostnames are specified (e.g.: host1,host2##.selector). 
+ */ + public getTokens(): Uint32Array[] { + const tokens: Uint32Array[] = []; + + if (this.hostnames !== undefined) { + for (let i = 0; i < this.hostnames.length; i += 1) { + tokens.push(new Uint32Array([this.hostnames[i]])); + } + } + + if (this.entities !== undefined) { + for (let i = 0; i < this.entities.length; i += 1) { + tokens.push(new Uint32Array([this.entities[i]])); + } + } + + if (this.notEntities !== undefined) { + for (let i = 0; i < this.notEntities.length; i += 1) { + tokens.push(new Uint32Array([this.notEntities[i]])); + } + } + + if (this.notHostnames !== undefined) { + for (let i = 0; i < this.notHostnames.length; i += 1) { + tokens.push(new Uint32Array([this.notHostnames[i]])); + } + } + + return tokens; + } + + public getScript(js: Map): string | undefined { + let scriptName = this.getSelector(); + let scriptArguments: string[] = []; + if (scriptName.indexOf(',') !== -1) { + const parts = scriptName.split(','); + scriptName = parts[0]; + scriptArguments = parts.slice(1).map((s) => s.trim()); + } + + let script = js.get(scriptName); + if (script !== undefined) { + for (let i = 0; i < scriptArguments.length; i += 1) { + script = script.replace(`{{${i + 1}}}`, scriptArguments[i]); + } + + return script; + } // TODO - else throw an exception? 
+ + return undefined; + } + + public getId(): number { + if (this.id === undefined) { + this.id = computeFilterId( + this.mask, + this.selector, + this.hostnames, + this.entities, + this.notHostnames, + this.notEntities, + ); + } + return this.id; + } + + public getStyle(): string { + return this.style || DEFAULT_HIDDING_STYLE; + } + + public getSelector(): string { + return this.selector; + } + + public hasHostnames(): boolean { + return this.hostnames !== undefined; + } + + public isUnhide(): boolean { + return getBit(this.mask, COSMETICS_MASK.unhide); + } + + public isScriptInject(): boolean { + return getBit(this.mask, COSMETICS_MASK.scriptInject); + } + + public isScriptBlock(): boolean { + return getBit(this.mask, COSMETICS_MASK.scriptBlock); + } +} diff --git a/src/parsing/interface.ts b/src/filters/interface.ts similarity index 54% rename from src/parsing/interface.ts rename to src/filters/interface.ts index 04464646c6..e3ed634e16 100644 --- a/src/parsing/interface.ts +++ b/src/filters/interface.ts @@ -1,5 +1,8 @@ +import StaticDataView from '../data-view'; + export default interface IFilter { mask: number; getId: () => number; getTokens: () => Uint32Array[]; + serialize: (buffer: StaticDataView) => void; } diff --git a/src/filters/network.ts b/src/filters/network.ts new file mode 100644 index 0000000000..79b4862671 --- /dev/null +++ b/src/filters/network.ts @@ -0,0 +1,1464 @@ +import * as punycode from 'punycode'; +import StaticDataView from '../data-view'; +import { RequestType } from '../request'; +import Request from '../request'; +import { + binLookup, + bitCount, + clearBit, + createFuzzySignature, + fastHash, + fastStartsWith, + fastStartsWithFrom, + getBit, + hasUnicode, + setBit, + tokenize, + tokenizeFilter, +} from '../utils'; +import IFilter from './interface'; + +const TOKENS_BUFFER = new Uint32Array(200); + +/** + * Masks used to store options of network filters in a bitmask. 
+ */ +export const enum NETWORK_FILTER_MASK { + // Content Policy Type + fromImage = 1 << 0, + fromMedia = 1 << 1, + fromObject = 1 << 2, + fromOther = 1 << 3, + fromPing = 1 << 4, + fromScript = 1 << 5, + fromStylesheet = 1 << 6, + fromSubdocument = 1 << 7, + fromWebsocket = 1 << 8, // e.g.: ws, wss + fromXmlHttpRequest = 1 << 9, + fromFont = 1 << 10, + fromHttp = 1 << 11, + fromHttps = 1 << 12, + isImportant = 1 << 13, + matchCase = 1 << 14, + fuzzyMatch = 1 << 15, + + // Kind of patterns + thirdParty = 1 << 16, + firstParty = 1 << 17, + isRegex = 1 << 18, + isLeftAnchor = 1 << 19, + isRightAnchor = 1 << 20, + isHostnameAnchor = 1 << 21, + isException = 1 << 22, + isCSP = 1 << 23, +} + +/** + * Mask used when a network filter can be applied on any content type. + */ +const FROM_ANY: number = + NETWORK_FILTER_MASK.fromFont | + NETWORK_FILTER_MASK.fromImage | + NETWORK_FILTER_MASK.fromMedia | + NETWORK_FILTER_MASK.fromObject | + NETWORK_FILTER_MASK.fromOther | + NETWORK_FILTER_MASK.fromPing | + NETWORK_FILTER_MASK.fromScript | + NETWORK_FILTER_MASK.fromStylesheet | + NETWORK_FILTER_MASK.fromSubdocument | + NETWORK_FILTER_MASK.fromWebsocket | + NETWORK_FILTER_MASK.fromXmlHttpRequest; + +/** + * Map content type value to the corresponding mask. 
+ * ref: https://developer.mozilla.org/en-US/docs/Mozilla/Tech/XPCOM/Reference/Interface/nsIContentPolicy + */ +const CPT_TO_MASK: { + [s: number]: number; +} = { + [RequestType.other]: NETWORK_FILTER_MASK.fromOther, + [RequestType.script]: NETWORK_FILTER_MASK.fromScript, + [RequestType.image]: NETWORK_FILTER_MASK.fromImage, + [RequestType.stylesheet]: NETWORK_FILTER_MASK.fromStylesheet, + [RequestType.object]: NETWORK_FILTER_MASK.fromObject, + [RequestType.subdocument]: NETWORK_FILTER_MASK.fromSubdocument, + [RequestType.ping]: NETWORK_FILTER_MASK.fromPing, + [RequestType.beacon]: NETWORK_FILTER_MASK.fromPing, + [RequestType.xmlhttprequest]: NETWORK_FILTER_MASK.fromXmlHttpRequest, + [RequestType.font]: NETWORK_FILTER_MASK.fromFont, + [RequestType.media]: NETWORK_FILTER_MASK.fromMedia, + [RequestType.websocket]: NETWORK_FILTER_MASK.fromWebsocket, + [RequestType.dtd]: NETWORK_FILTER_MASK.fromOther, + [RequestType.fetch]: NETWORK_FILTER_MASK.fromOther, + [RequestType.xlst]: NETWORK_FILTER_MASK.fromOther, +}; + +function computeFilterId( + csp: string | undefined, + mask: number, + filter: string | undefined, + hostname: string | undefined, + optDomains: Uint32Array | undefined, + optNotDomains: Uint32Array | undefined, +): number { + let hash = (5408 * 33) ^ mask; + + if (csp !== undefined) { + for (let i = 0; i < csp.length; i += 1) { + hash = (hash * 33) ^ csp.charCodeAt(i); + } + } + + if (optDomains !== undefined) { + for (let i = 0; i < optDomains.length; i += 1) { + hash = (hash * 33) ^ optDomains[i]; + } + } + + if (optNotDomains !== undefined) { + for (let i = 0; i < optNotDomains.length; i += 1) { + hash = (hash * 33) ^ optNotDomains[i]; + } + } + + if (filter !== undefined) { + for (let i = 0; i < filter.length; i += 1) { + hash = (hash * 33) ^ filter.charCodeAt(i); + } + } + + if (hostname !== undefined) { + for (let i = 0; i < hostname.length; i += 1) { + hash = (hash * 33) ^ hostname.charCodeAt(i); + } + } + + return hash >>> 0; +} + +const SEPARATOR = 
/[/^*]/; + +/** + * Compiles a filter pattern to a regex. This is only performed *lazily* for + * filters containing at least a * or ^ symbol. Because Regexes are expensive, + * we try to convert some patterns to plain filters. + */ +function compileRegex(filterStr: string, isRightAnchor: boolean, isLeftAnchor: boolean): RegExp { + let filter = filterStr; + + // Escape special regex characters: |.$+?{}()[]\ + filter = filter.replace(/([|.$+?{}()[\]\\])/g, '\\$1'); + + // * can match anything + filter = filter.replace(/\*/g, '.*'); + // ^ can match any separator or the end of the pattern + filter = filter.replace(/\^/g, '(?:[^\\w\\d_.%-]|$)'); + + // Should match end of url + if (isRightAnchor) { + filter = `${filter}$`; + } + + if (isLeftAnchor) { + filter = `^${filter}`; + } + + return new RegExp(filter); +} + +const EMPTY_ARRAY = new Uint32Array([]); +const MATCH_ALL = new RegExp(''); + +// TODO: +// 1. Options not supported yet: +// - badfilter +// - inline-script +// - popup +// - popunder +// - generichide +// - genericblock +// 2. Replace `split` with `substr` +export default class NetworkFilter implements IFilter { + public static parse(line: string, debug: boolean = false): NetworkFilter | null { + // Represent options as a bitmask + let mask: number = + NETWORK_FILTER_MASK.thirdParty | + NETWORK_FILTER_MASK.firstParty | + NETWORK_FILTER_MASK.fromHttps | + NETWORK_FILTER_MASK.fromHttp; + + // Temporary masks for positive (e.g.: $script) and negative (e.g.: $~script) + // content type options. 
+ let cptMaskPositive: number = 0; + let cptMaskNegative: number = FROM_ANY; + + let hostname: string | undefined; + + let optDomains: Uint32Array | undefined; + let optNotDomains: Uint32Array | undefined; + let redirect: string | undefined; + let csp: string | undefined; + let bug: number | undefined; + + // Start parsing + let filterIndexStart: number = 0; + let filterIndexEnd: number = line.length; + + // @@filter == Exception + if (fastStartsWith(line, '@@')) { + filterIndexStart += 2; + mask = setBit(mask, NETWORK_FILTER_MASK.isException); + } + + // filter$options == Options + // ^ ^ + // | | + // | optionsIndex + // filterIndexStart + const optionsIndex: number = line.lastIndexOf('$'); + if (optionsIndex !== -1) { + // Parse options and set flags + filterIndexEnd = optionsIndex; + + // --------------------------------------------------------------------- // + // parseOptions + // TODO: This could be implemented without string copy, + // using indices, like in main parsing functions. 
+ const rawOptions = line.slice(optionsIndex + 1); + const options = rawOptions.split(','); + for (let i = 0; i < options.length; i += 1) { + const rawOption = options[i]; + let negation = false; + let option = rawOption; + + // Check for negation: ~option + if (fastStartsWith(option, '~')) { + negation = true; + option = option.slice(1); + } else { + negation = false; + } + + // Check for options: option=value1|value2 + let optionValue: string = ''; + if (option.indexOf('=') !== -1) { + const optionAndValues = option.split('=', 2); + option = optionAndValues[0]; + optionValue = optionAndValues[1]; + } + + switch (option) { + case 'domain': { + const optionValues: string[] = optionValue.split('|'); + const optDomainsArray: number[] = []; + const optNotDomainsArray: number[] = []; + + for (let j = 0; j < optionValues.length; j += 1) { + const value: string = optionValues[j]; + if (value) { + if (fastStartsWith(value, '~')) { + optNotDomainsArray.push(fastHash(value.slice(1))); + } else { + optDomainsArray.push(fastHash(value)); + } + } + } + + if (optDomainsArray.length > 0) { + optDomains = new Uint32Array(optDomainsArray); + } + + if (optNotDomainsArray.length > 0) { + optNotDomains = new Uint32Array(optNotDomainsArray); + } + + break; + } + case 'badfilter': + // TODO - how to handle those, if we start in mask, then the id will + // differ from the other filter. We could keep original line. How + // to eliminate those efficiently? They will probably end up in the same + // bucket, so maybe we could do that on a per-bucket basis? + return null; + case 'important': + // Note: `negation` should always be `false` here. + if (negation) { + return null; + } + + mask = setBit(mask, NETWORK_FILTER_MASK.isImportant); + break; + case 'match-case': + // Note: `negation` should always be `false` here. 
+ if (negation) { + return null; + } + + mask = setBit(mask, NETWORK_FILTER_MASK.matchCase); + break; + case 'third-party': + if (negation) { + // ~third-party means we should clear the flag + mask = clearBit(mask, NETWORK_FILTER_MASK.thirdParty); + } else { + // third-party means ~first-party + mask = clearBit(mask, NETWORK_FILTER_MASK.firstParty); + } + break; + case 'first-party': + if (negation) { + // ~first-party means we should clear the flag + mask = clearBit(mask, NETWORK_FILTER_MASK.firstParty); + } else { + // first-party means ~third-party + mask = clearBit(mask, NETWORK_FILTER_MASK.thirdParty); + } + break; + case 'fuzzy': + mask = setBit(mask, NETWORK_FILTER_MASK.fuzzyMatch); + break; + case 'collapse': + break; + case 'bug': + bug = parseInt(optionValue, 10); + break; + case 'redirect': + // Negation of redirection doesn't make sense + if (negation) { + return null; + } + + // Ignore this filter if no redirection resource is specified + if (optionValue.length === 0) { + return null; + } + + redirect = optionValue; + break; + case 'csp': + mask = setBit(mask, NETWORK_FILTER_MASK.isCSP); + if (optionValue.length > 0) { + csp = optionValue; + } + break; + default: { + // Handle content type options separately + let optionMask: number = 0; + switch (option) { + case 'image': + optionMask = NETWORK_FILTER_MASK.fromImage; + break; + case 'media': + optionMask = NETWORK_FILTER_MASK.fromMedia; + break; + case 'object': + optionMask = NETWORK_FILTER_MASK.fromObject; + break; + case 'object-subrequest': + optionMask = NETWORK_FILTER_MASK.fromObject; + break; + case 'other': + optionMask = NETWORK_FILTER_MASK.fromOther; + break; + case 'ping': + case 'beacon': + optionMask = NETWORK_FILTER_MASK.fromPing; + break; + case 'script': + optionMask = NETWORK_FILTER_MASK.fromScript; + break; + case 'stylesheet': + optionMask = NETWORK_FILTER_MASK.fromStylesheet; + break; + case 'subdocument': + optionMask = NETWORK_FILTER_MASK.fromSubdocument; + break; + case 
'xmlhttprequest': + case 'xhr': + optionMask = NETWORK_FILTER_MASK.fromXmlHttpRequest; + break; + case 'websocket': + optionMask = NETWORK_FILTER_MASK.fromWebsocket; + break; + case 'font': + optionMask = NETWORK_FILTER_MASK.fromFont; + break; + default: + // Disable this filter if we don't support all the options + return null; + } + + // We got a valid cpt option, update mask + if (negation) { + cptMaskNegative = clearBit(cptMaskNegative, optionMask); + } else { + cptMaskPositive = setBit(cptMaskPositive, optionMask); + } + break; + } + } + } + // End of option parsing + // --------------------------------------------------------------------- // + } + + if (cptMaskPositive === 0) { + mask |= cptMaskNegative; + } else if (cptMaskNegative === FROM_ANY) { + mask |= cptMaskPositive; + } else { + mask |= cptMaskPositive & cptMaskNegative; + } + + // Identify kind of pattern + + // Deal with hostname pattern + if (line[filterIndexEnd - 1] === '|') { + mask = setBit(mask, NETWORK_FILTER_MASK.isRightAnchor); + filterIndexEnd -= 1; + } + + if (fastStartsWithFrom(line, '||', filterIndexStart)) { + mask = setBit(mask, NETWORK_FILTER_MASK.isHostnameAnchor); + filterIndexStart += 2; + } else if (line[filterIndexStart] === '|') { + mask = setBit(mask, NETWORK_FILTER_MASK.isLeftAnchor); + filterIndexStart += 1; + } + + const isRegex = checkIsRegex(line, filterIndexStart, filterIndexEnd); + mask = setNetworkMask(mask, NETWORK_FILTER_MASK.isRegex, isRegex); + + if (getBit(mask, NETWORK_FILTER_MASK.isHostnameAnchor)) { + if (isRegex) { + // Split at the first '/', '*' or '^' character to get the hostname + // and then the pattern. + // TODO - this could be made more efficient if we could match between two + // indices. Once again, we have to do more work than is really needed. + const firstSeparator = line.search(SEPARATOR); + // NOTE: `firstSeparator` shall never be -1 here since `isRegex` is true. + // This means there must be at least an occurrence of `*` or `^` + // somewhere. 
+ + hostname = line.slice(filterIndexStart, firstSeparator); + filterIndexStart = firstSeparator; + + // If the only symbol remaining for the selector is '^' then ignore it + // but set the filter as right anchored since there should not be any + // other label on the right + if (filterIndexEnd - filterIndexStart === 1 && line[filterIndexStart] === '^') { + mask = clearBit(mask, NETWORK_FILTER_MASK.isRegex); + filterIndexStart = filterIndexEnd; + mask = setNetworkMask(mask, NETWORK_FILTER_MASK.isRightAnchor, true); + } else { + mask = setNetworkMask(mask, NETWORK_FILTER_MASK.isLeftAnchor, true); + mask = setNetworkMask( + mask, + NETWORK_FILTER_MASK.isRegex, + checkIsRegex(line, filterIndexStart, filterIndexEnd), + ); + } + } else { + // Look for next / + const slashIndex = line.indexOf('/', filterIndexStart); + if (slashIndex !== -1) { + hostname = line.slice(filterIndexStart, slashIndex); + filterIndexStart = slashIndex; + mask = setBit(mask, NETWORK_FILTER_MASK.isLeftAnchor); + } else { + hostname = line.slice(filterIndexStart, filterIndexEnd); + filterIndexStart = filterIndexEnd; + } + } + } + + // Remove trailing '*' + if (filterIndexEnd - filterIndexStart > 0 && line[filterIndexEnd - 1] === '*') { + filterIndexEnd -= 1; + } + + // Remove leading '*' if the filter is not hostname anchored. 
+ if (filterIndexEnd - filterIndexStart > 0 && line[filterIndexStart] === '*') { + mask = clearBit(mask, NETWORK_FILTER_MASK.isLeftAnchor); + filterIndexStart += 1; + } + + // Transform filters on protocol (http, https, ws) + if (getBit(mask, NETWORK_FILTER_MASK.isLeftAnchor)) { + if ( + filterIndexEnd - filterIndexStart === 5 && + fastStartsWithFrom(line, 'ws://', filterIndexStart) + ) { + mask = setBit(mask, NETWORK_FILTER_MASK.fromWebsocket); + mask = clearBit(mask, NETWORK_FILTER_MASK.isLeftAnchor); + filterIndexStart = filterIndexEnd; + } else if ( + filterIndexEnd - filterIndexStart === 7 && + fastStartsWithFrom(line, 'http://', filterIndexStart) + ) { + mask = setBit(mask, NETWORK_FILTER_MASK.fromHttp); + mask = clearBit(mask, NETWORK_FILTER_MASK.fromHttps); + mask = clearBit(mask, NETWORK_FILTER_MASK.isLeftAnchor); + filterIndexStart = filterIndexEnd; + } else if ( + filterIndexEnd - filterIndexStart === 8 && + fastStartsWithFrom(line, 'https://', filterIndexStart) + ) { + mask = setBit(mask, NETWORK_FILTER_MASK.fromHttps); + mask = clearBit(mask, NETWORK_FILTER_MASK.fromHttp); + mask = clearBit(mask, NETWORK_FILTER_MASK.isLeftAnchor); + filterIndexStart = filterIndexEnd; + } else if ( + filterIndexEnd - filterIndexStart === 8 && + fastStartsWithFrom(line, 'http*://', filterIndexStart) + ) { + mask = setBit(mask, NETWORK_FILTER_MASK.fromHttps); + mask = setBit(mask, NETWORK_FILTER_MASK.fromHttp); + mask = clearBit(mask, NETWORK_FILTER_MASK.isLeftAnchor); + filterIndexStart = filterIndexEnd; + } + } + + let filter: string | undefined; + if (filterIndexEnd - filterIndexStart > 0) { + filter = line.slice(filterIndexStart, filterIndexEnd).toLowerCase(); + mask = setNetworkMask( + mask, + NETWORK_FILTER_MASK.isRegex, + checkIsRegex(filter, 0, filter.length), + ); + } + + // TODO + // - ignore hostname anchor if no hostname is provided + + if (hostname !== undefined) { + if (getBit(mask, NETWORK_FILTER_MASK.isHostnameAnchor) && fastStartsWith(hostname, 'www.')) { + 
hostname = hostname.slice(4); + } + hostname = hostname.toLowerCase(); + if (hasUnicode(hostname)) { + hostname = punycode.toASCII(hostname); + } + } + + return new NetworkFilter({ + bug, + csp, + filter, + hostname, + mask, + optDomains, + optNotDomains, + rawLine: debug === true ? line : undefined, + redirect, + }); + } + + /** + * Deserialize network filters. The code accessing the buffer should be + * symmetrical to the one in `serialize`. + */ + public static deserialize(buffer: StaticDataView): NetworkFilter { + const mask = buffer.getUint32(); + const optionalParts = buffer.getUint8(); + + // The order of these statements is important. Since `buffer.getX()` will + // internally increment the position of next byte to read, they need to be + // retrieved in the exact same order they were serialized (check + // `serialize`). + return new NetworkFilter({ + // Mandatory field + mask, + + // Optional parts + bug: (optionalParts & 1) === 1 ? buffer.getUint16() : undefined, + csp: (optionalParts & 2) === 2 ? buffer.getASCII() : undefined, + filter: (optionalParts & 4) === 4 ? buffer.getASCII() : undefined, + hostname: (optionalParts & 8) === 8 ? buffer.getASCII() : undefined, + optDomains: (optionalParts & 16) === 16 ? buffer.getUint32Array() : undefined, + optNotDomains: (optionalParts & 32) === 32 ? buffer.getUint32Array() : undefined, + rawLine: (optionalParts & 64) === 64 ? buffer.getUTF8() : undefined, + redirect: (optionalParts & 128) === 128 ? 
buffer.getASCII() : undefined, + }); + } + + public readonly mask: number; + public readonly filter?: string; + public readonly optDomains?: Uint32Array; + public readonly optNotDomains?: Uint32Array; + public readonly redirect?: string; + public readonly hostname?: string; + public readonly csp?: string; + public readonly bug?: number; + + // Set only in debug mode + public rawLine?: string; + + public id?: number; + private fuzzySignature?: Uint32Array; + private regex?: RegExp; + private optimized: boolean = false; + + constructor({ + bug, + csp, + filter, + hostname, + mask, + optDomains, + optNotDomains, + rawLine, + redirect, + regex, + }: { mask: number; regex?: RegExp } & Partial) { + this.bug = bug; + this.csp = csp; + this.filter = filter; + this.hostname = hostname; + this.mask = mask; + this.optDomains = optDomains; + this.optNotDomains = optNotDomains; + this.rawLine = rawLine; + this.redirect = redirect; + this.regex = regex; + } + + public isCosmeticFilter() { + return false; + } + public isNetworkFilter() { + return true; + } + + public match(request: Request): boolean { + return checkOptions(this, request) && checkPattern(this, request); + } + + /** + * To allow for a more compact representation of network filters, the + * representation is composed of a mandatory header and some optional parts. + * + * Header: + * ======= + * + * | opt | mask + * 8 32 + * + * For an empty filter having no pattern or hostname, the minimum size is: 40 bits. + * + * Then for each optional part (filter, hostname optDomains, optNotDomains, + * redirect), it takes 16 bits for the length of the string + the length of the + * string in bytes. + * + * The optional parts are written in order of their number of occurrences in the + * filter list used by the adblocker. The most common being `hostname`, then + * `filter`, `optDomains`, `optNotDomains`, `redirect`. 
+ * + * Example: + * ======== + * + * @@||cliqz.com would result in a serialized version: + * + * | 1 | mask | 9 | c | l | i | q | z | . | c | o | m (16 bytes) + * + * In this case, the serialized version is actually bigger than the original + * filter, but faster to deserialize. In the future, we could optimize the + * representation to compact small filters better. + * + * Ideas: + * * variable length encoding for the mask (if no option, take max 1 byte). + * * first byte could contain the mask as well if small enough. + * * when packing ascii string, store several of them in each byte. + */ + public serialize(buffer: StaticDataView): void { + buffer.pushUint32(this.mask); + + const index = buffer.getPos(); + buffer.pushUint8(0); + + // This bit-mask indicates which optional parts of the filter were serialized. + let optionalParts = 0; + + if (this.bug !== undefined) { + optionalParts |= 1; + buffer.pushUint16(this.bug); + } + + if (this.csp !== undefined) { + optionalParts |= 2; + buffer.pushASCII(this.csp); + } + + if (this.filter !== undefined) { + optionalParts |= 4; + buffer.pushASCII(this.filter); + } + + if (this.hostname !== undefined) { + optionalParts |= 8; + buffer.pushASCII(this.hostname); + } + + if (this.optDomains) { + optionalParts |= 16; + buffer.pushUint32Array(this.optDomains); + } + + if (this.optNotDomains !== undefined) { + optionalParts |= 32; + buffer.pushUint32Array(this.optNotDomains); + } + + if (this.rawLine !== undefined) { + optionalParts |= 64; + buffer.pushUTF8(this.rawLine); + } + + if (this.redirect !== undefined) { + optionalParts |= 128; + buffer.pushASCII(this.redirect); + } + + buffer.setByte(index, optionalParts); + } + + /** + * Tries to recreate the original representation of the filter (adblock + * syntax) from the internal representation. If `rawLine` is set (when filters + * are parsed in `debug` mode for example), then it is returned directly. 
+ * Otherwise, we try to stick as closely as possible to the original form; + * there are things which cannot be recovered though, like domains options + * of which only hashes are stored. + */ + public toString() { + if (this.rawLine !== undefined) { + return this.rawLine; + } + + let filter = ''; + + if (this.isException()) { + filter += '@@'; + } + + if (this.isHostnameAnchor()) { + filter += '||'; + } + + if (this.isLeftAnchor()) { + filter += '|'; + } + + if (this.hasHostname()) { + filter += this.getHostname(); + filter += '^'; + } + + if (!this.isRegex()) { + filter += this.getFilter(); + } else { + // Visualize the compiled regex + filter += this.getRegex().source; + } + + // Options + const options: string[] = []; + + if (!this.fromAny()) { + const numberOfCptOptions = bitCount(this.getCptMask()); + const numberOfNegatedOptions = bitCount(FROM_ANY) - numberOfCptOptions; + + if (numberOfNegatedOptions < numberOfCptOptions) { + if (!this.fromImage()) { + options.push('~image'); + } + if (!this.fromMedia()) { + options.push('~media'); + } + if (!this.fromObject()) { + options.push('~object'); + } + if (!this.fromOther()) { + options.push('~other'); + } + if (!this.fromPing()) { + options.push('~ping'); + } + if (!this.fromScript()) { + options.push('~script'); + } + if (!this.fromStylesheet()) { + options.push('~stylesheet'); + } + if (!this.fromSubdocument()) { + options.push('~subdocument'); + } + if (!this.fromWebsocket()) { + options.push('~websocket'); + } + if (!this.fromXmlHttpRequest()) { + options.push('~xmlhttprequest'); + } + if (!this.fromFont()) { + options.push('~font'); + } + } else { + if (this.fromImage()) { + options.push('image'); + } + if (this.fromMedia()) { + options.push('media'); + } + if (this.fromObject()) { + options.push('object'); + } + if (this.fromOther()) { + options.push('other'); + } + if (this.fromPing()) { + options.push('ping'); + } + if (this.fromScript()) { + options.push('script'); + } + if (this.fromStylesheet()) { + 
options.push('stylesheet'); + } + if (this.fromSubdocument()) { + options.push('subdocument'); + } + if (this.fromWebsocket()) { + options.push('websocket'); + } + if (this.fromXmlHttpRequest()) { + options.push('xmlhttprequest'); + } + if (this.fromFont()) { + options.push('font'); + } + } + } + + if (this.isFuzzy()) { + options.push('fuzzy'); + } + + if (this.isImportant()) { + options.push('important'); + } + + if (this.isRedirect()) { + options.push(`redirect=${this.getRedirect()}`); + } + + if (this.isCSP()) { + options.push(`csp=${this.csp}`); + } + + if (this.hasBug()) { + options.push(`bug=${this.bug}`); + } + + if (this.firstParty() !== this.thirdParty()) { + if (this.firstParty()) { + options.push('first-party'); + } + if (this.thirdParty()) { + options.push('third-party'); + } + } + + if (this.hasOptDomains() || this.hasOptNotDomains()) { + options.push('domain='); + } + + if (options.length > 0) { + filter += `$${options.join(',')}`; + } + + if (this.isRightAnchor()) { + filter += '|'; + } + + return filter; + } + + // Public API (Read-Only) + public getId(): number { + if (this.id === undefined) { + this.id = computeFilterId( + this.csp, + this.mask, + this.filter, + this.hostname, + this.optDomains, + this.optNotDomains, + ); + } + return this.id; + } + + public hasFilter(): boolean { + return this.filter !== undefined; + } + + public hasOptNotDomains(): boolean { + return this.optNotDomains !== undefined; + } + + public getOptNotDomains(): Uint32Array { + this.optimize(); + return this.optNotDomains || EMPTY_ARRAY; + } + + public hasOptDomains(): boolean { + return this.optDomains !== undefined; + } + + public getOptDomains(): Uint32Array { + this.optimize(); + return this.optDomains || EMPTY_ARRAY; + } + + public getMask(): number { + return this.mask; + } + + public getCptMask(): number { + return this.getMask() & FROM_ANY; + } + + public isRedirect(): boolean { + return this.redirect !== undefined; + } + + public getRedirect(): string { + return 
this.redirect || ''; + } + + public hasHostname(): boolean { + return this.hostname !== undefined; + } + + public getHostname(): string { + return this.hostname || ''; + } + + public getFilter(): string { + return this.filter || ''; + } + + public getRegex(): RegExp { + this.optimize(); + return this.regex || MATCH_ALL; + } + + public getFuzzySignature(): Uint32Array { + this.optimize(); + return this.fuzzySignature || EMPTY_ARRAY; + } + + public getTokens(): Uint32Array[] { + let tokensBufferIndex = 0; + + // If there is only one domain and no domain negation, we also use this + // domain as a token. + if ( + this.optDomains !== undefined && + this.optNotDomains === undefined && + this.optDomains.length === 1 + ) { + TOKENS_BUFFER[tokensBufferIndex] = this.optDomains[0]; + tokensBufferIndex += 1; + } + + // Get tokens from filter + if (this.filter !== undefined) { + const skipLastToken = this.isPlain() && !this.isRightAnchor() && !this.isFuzzy(); + const skipFirstToken = this.isRightAnchor(); + const filterTokens = tokenizeFilter(this.filter, skipFirstToken, skipLastToken); + TOKENS_BUFFER.set(filterTokens, tokensBufferIndex); + tokensBufferIndex += filterTokens.length; + } + + // Append tokens from hostname, if any + if (this.hostname !== undefined) { + const hostnameTokens = tokenize(this.hostname); + TOKENS_BUFFER.set(hostnameTokens, tokensBufferIndex); + tokensBufferIndex += hostnameTokens.length; + } + + // If we got no tokens for the filter/hostname part, then we will dispatch + // this filter in multiple buckets based on the domains option. 
+ if ( + tokensBufferIndex === 0 && + this.optDomains !== undefined && + this.optNotDomains === undefined + ) { + return [...this.optDomains].map((d) => new Uint32Array([d])); + } + + // Add optional token for protocol + if (this.fromHttp() && !this.fromHttps()) { + TOKENS_BUFFER[tokensBufferIndex] = fastHash('http'); + tokensBufferIndex += 1; + } else if (this.fromHttps() && !this.fromHttp()) { + TOKENS_BUFFER[tokensBufferIndex] = fastHash('https'); + tokensBufferIndex += 1; + } + + return [TOKENS_BUFFER.slice(0, tokensBufferIndex)]; + } + + /** + * Check if this filter should apply to a request with this content type. + */ + public isCptAllowed(cpt: RequestType): boolean { + const mask = CPT_TO_MASK[cpt]; + if (mask !== undefined) { + return getBit(this.mask, mask); + } + + // If content type is not supported (or not specified), we return `true` + // only if the filter does not specify any resource type. + return this.fromAny(); + } + + public isFuzzy() { + return getBit(this.mask, NETWORK_FILTER_MASK.fuzzyMatch); + } + + public isException() { + return getBit(this.mask, NETWORK_FILTER_MASK.isException); + } + + public isHostnameAnchor() { + return getBit(this.mask, NETWORK_FILTER_MASK.isHostnameAnchor); + } + + public isRightAnchor() { + return getBit(this.mask, NETWORK_FILTER_MASK.isRightAnchor); + } + + public isLeftAnchor() { + return getBit(this.mask, NETWORK_FILTER_MASK.isLeftAnchor); + } + + public matchCase() { + return getBit(this.mask, NETWORK_FILTER_MASK.matchCase); + } + + public isImportant() { + return getBit(this.mask, NETWORK_FILTER_MASK.isImportant); + } + + public isRegex() { + return getBit(this.mask, NETWORK_FILTER_MASK.isRegex); + } + + public isPlain() { + return !getBit(this.mask, NETWORK_FILTER_MASK.isRegex); + } + + public isCSP() { + return getBit(this.mask, NETWORK_FILTER_MASK.isCSP); + } + + public hasBug() { + return this.bug !== undefined; + } + + public fromAny() { + return this.getCptMask() === FROM_ANY; + } + + public thirdParty() 
{ + return getBit(this.mask, NETWORK_FILTER_MASK.thirdParty); + } + + public firstParty() { + return getBit(this.mask, NETWORK_FILTER_MASK.firstParty); + } + + public fromImage() { + return getBit(this.mask, NETWORK_FILTER_MASK.fromImage); + } + + public fromMedia() { + return getBit(this.mask, NETWORK_FILTER_MASK.fromMedia); + } + + public fromObject() { + return getBit(this.mask, NETWORK_FILTER_MASK.fromObject); + } + + public fromOther() { + return getBit(this.mask, NETWORK_FILTER_MASK.fromOther); + } + + public fromPing() { + return getBit(this.mask, NETWORK_FILTER_MASK.fromPing); + } + + public fromScript() { + return getBit(this.mask, NETWORK_FILTER_MASK.fromScript); + } + + public fromStylesheet() { + return getBit(this.mask, NETWORK_FILTER_MASK.fromStylesheet); + } + + public fromSubdocument() { + return getBit(this.mask, NETWORK_FILTER_MASK.fromSubdocument); + } + + public fromWebsocket() { + return getBit(this.mask, NETWORK_FILTER_MASK.fromWebsocket); + } + + public fromHttp() { + return getBit(this.mask, NETWORK_FILTER_MASK.fromHttp); + } + + public fromHttps() { + return getBit(this.mask, NETWORK_FILTER_MASK.fromHttps); + } + + public fromXmlHttpRequest() { + return getBit(this.mask, NETWORK_FILTER_MASK.fromXmlHttpRequest); + } + + public fromFont() { + return getBit(this.mask, NETWORK_FILTER_MASK.fromFont); + } + + private optimize() { + if (this.optimized === false) { + this.optimized = true; + if (this.optNotDomains !== undefined) { + this.optNotDomains.sort(); + } + if (this.optDomains !== undefined) { + this.optDomains.sort(); + } + if (this.filter !== undefined && this.regex === undefined && this.isRegex()) { + this.regex = compileRegex(this.filter, this.isRightAnchor(), this.isLeftAnchor()); + } + if (this.filter !== undefined && this.isFuzzy()) { + this.fuzzySignature = createFuzzySignature(this.filter); + if (this.fuzzySignature.length === 0) { + this.fuzzySignature = undefined; + } + } + } + } +} + +// 
--------------------------------------------------------------------------- +// Filter parsing +// --------------------------------------------------------------------------- + +function setNetworkMask(mask: number, m: number, value: boolean): number { + if (value === true) { + return setBit(mask, m); + } + + return clearBit(mask, m); +} + +/** + * Check if the sub-string contained between the indices start and end is a + * regex filter (it contains a '*' or '^' char). Here we are limited by the + * capability of javascript to check the presence of a pattern between two + * indices (same for Regex...). + * // TODO - we could use sticky regex here + */ +function checkIsRegex(filter: string, start: number, end: number): boolean { + const starIndex = filter.indexOf('*', start); + const separatorIndex = filter.indexOf('^', start); + return (starIndex !== -1 && starIndex < end) || (separatorIndex !== -1 && separatorIndex < end); +} + +export function isAnchoredByHostname(filterHostname: string, hostname: string): boolean { + // Corner-case, if `filterHostname` is empty, then it's a match + if (filterHostname.length === 0) { + return true; + } + + // `filterHostname` cannot be longer than actual hostname + if (filterHostname.length > hostname.length) { + return false; + } + + // Check if `filterHostname` appears anywhere in `hostname` + const matchIndex = hostname.indexOf(filterHostname); + + // No match + if (matchIndex === -1) { + return false; + } + + // Either start at beginning of hostname or be preceded by a '.' + return ( + // Prefix match + (matchIndex === 0 && + // This means `filterHostname` is equal to `hostname` + (hostname.length === filterHostname.length || + // This means that `filterHostname` is a prefix of `hostname` (ends with a '.') + filterHostname[filterHostname.length - 1] === '.' || + hostname[filterHostname.length] === '.')) || + // Suffix or infix match + ((hostname[matchIndex - 1] === '.' 
|| filterHostname[0] === '.') && + // `filterHostname` is a full suffix of `hostname` + (hostname.length - matchIndex === filterHostname.length || + // This means that `filterHostname` is infix of `hostname` (ends with a '.') + filterHostname[filterHostname.length - 1] === '.' || + hostname[matchIndex + filterHostname.length] === '.')) + ); +} + +function getUrlAfterHostname(url: string, hostname: string): string { + return url.slice(url.indexOf(hostname) + hostname.length); +} + +// pattern$fuzzy +function checkPatternFuzzyFilter(filter: NetworkFilter, request: Request) { + const signature = filter.getFuzzySignature(); + const requestSignature = request.getFuzzySignature(); + + if (signature.length > requestSignature.length) { + return false; + } + + let lastIndex = 0; + for (let i = 0; i < signature.length; i += 1) { + const c = signature[i]; + // Find the occurrence of `c` in `requestSignature` + const j = requestSignature.indexOf(c, lastIndex); + if (j === -1) { + return false; + } + lastIndex = j + 1; + } + + return true; +} + +// pattern +function checkPatternPlainFilter(filter: NetworkFilter, request: Request): boolean { + if (filter.hasFilter() === false) { + return true; + } + + return request.url.indexOf(filter.getFilter()) !== -1; +} + +// pattern| +function checkPatternRightAnchorFilter(filter: NetworkFilter, request: Request): boolean { + return request.url.endsWith(filter.getFilter()); +} + +// |pattern +function checkPatternLeftAnchorFilter(filter: NetworkFilter, request: Request): boolean { + return fastStartsWith(request.url, filter.getFilter()); +} + +// |pattern| +function checkPatternLeftRightAnchorFilter(filter: NetworkFilter, request: Request): boolean { + return request.url === filter.getFilter(); +} + +// pattern*^ +function checkPatternRegexFilter( + filter: NetworkFilter, + request: Request, + startFrom: number = 0, +): boolean { + let url = request.url; + if (startFrom > 0) { + url = url.slice(startFrom); + } + return 
filter.getRegex().test(url); +} + +// ||pattern*^ +function checkPatternHostnameAnchorRegexFilter(filter: NetworkFilter, request: Request): boolean { + const url = request.url; + const hostname = request.hostname; + const filterHostname = filter.getHostname(); + if (isAnchoredByHostname(filterHostname, hostname)) { + return checkPatternRegexFilter( + filter, + request, + url.indexOf(filterHostname) + filterHostname.length, + ); + } + + return false; +} + +// ||pattern| +function checkPatternHostnameRightAnchorFilter(filter: NetworkFilter, request: Request): boolean { + const filterHostname = filter.getHostname(); + const requestHostname = request.hostname; + if (isAnchoredByHostname(filterHostname, requestHostname)) { + if (filter.hasFilter() === false) { + // In this specific case it means that the specified hostname should match + // at the end of the hostname of the request. This prevents false + // positives like ||foo.bar which would match https://foo.bar.baz where + // ||foo.bar^ would not. + return ( + filterHostname.length === requestHostname.length || + requestHostname.endsWith(filterHostname) + ); + } else { + return checkPatternRightAnchorFilter(filter, request); + } + } + + return false; +} + +// |||pattern| +function checkPatternHostnameLeftRightAnchorFilter( + filter: NetworkFilter, + request: Request, +): boolean { + if (isAnchoredByHostname(filter.getHostname(), request.hostname)) { + // Since this is not a regex, the filter pattern must follow the hostname + // with nothing in between. So we extract the part of the URL following + // after hostname and will perform the matching on it. + const urlAfterHostname = getUrlAfterHostname(request.url, filter.getHostname()); + + // Since it must follow immediately after the hostname and be a suffix of + // the URL, we conclude that filter must be equal to the part of the + // url following the hostname. 
+ return filter.getFilter() === urlAfterHostname; + } + + return false; +} + +// ||pattern + left-anchor => This means that a plain pattern needs to appear +// exactly after the hostname, with nothing in between. +function checkPatternHostnameLeftAnchorFilter(filter: NetworkFilter, request: Request): boolean { + if (isAnchoredByHostname(filter.getHostname(), request.hostname)) { + // Since this is not a regex, the filter pattern must follow the hostname + // with nothing in between. So we extract the part of the URL following + // after hostname and will perform the matching on it. + return fastStartsWithFrom( + request.url, + filter.getFilter(), + request.url.indexOf(filter.getHostname()) + filter.getHostname().length, + ); + } + + return false; +} + +// ||pattern +function checkPatternHostnameAnchorFilter(filter: NetworkFilter, request: Request): boolean { + const filterHostname = filter.getHostname(); + if (isAnchoredByHostname(filterHostname, request.hostname)) { + if (filter.hasFilter() === false) { + return true; + } + + // We consider this a match if the plain pattern (i.e.: filter) appears anywhere. + return ( + request.url.indexOf( + filter.getFilter(), + request.url.indexOf(filterHostname) + filterHostname.length, + ) !== -1 + ); + } + + return false; +} + +// ||pattern$fuzzy +function checkPatternHostnameAnchorFuzzyFilter(filter: NetworkFilter, request: Request) { + if (isAnchoredByHostname(filter.getHostname(), request.hostname)) { + return checkPatternFuzzyFilter(filter, request); + } + + return false; +} + +/** + * Specialize a network filter depending on its type. This allows for a more + * efficient matching function. 
+ */ +function checkPattern(filter: NetworkFilter, request: Request): boolean { + if (filter.isHostnameAnchor()) { + if (filter.isRegex()) { + return checkPatternHostnameAnchorRegexFilter(filter, request); + } else if (filter.isRightAnchor() && filter.isLeftAnchor()) { + return checkPatternHostnameLeftRightAnchorFilter(filter, request); + } else if (filter.isRightAnchor()) { + return checkPatternHostnameRightAnchorFilter(filter, request); + } else if (filter.isFuzzy()) { + return checkPatternHostnameAnchorFuzzyFilter(filter, request); + } else if (filter.isLeftAnchor()) { + return checkPatternHostnameLeftAnchorFilter(filter, request); + } + return checkPatternHostnameAnchorFilter(filter, request); + } else if (filter.isRegex()) { + return checkPatternRegexFilter(filter, request); + } else if (filter.isLeftAnchor() && filter.isRightAnchor()) { + return checkPatternLeftRightAnchorFilter(filter, request); + } else if (filter.isLeftAnchor()) { + return checkPatternLeftAnchorFilter(filter, request); + } else if (filter.isRightAnchor()) { + return checkPatternRightAnchorFilter(filter, request); + } else if (filter.isFuzzy()) { + return checkPatternFuzzyFilter(filter, request); + } + + return checkPatternPlainFilter(filter, request); +} + +function checkOptions(filter: NetworkFilter, request: Request): boolean { + // We first discard requests based on type, protocol and party. This is really + // cheap and should be done first. + if ( + filter.isCptAllowed(request.type) === false || + (request.isHttps === true && filter.fromHttps() === false) || + (request.isHttp === true && filter.fromHttp() === false) || + (!filter.firstParty() && request.isFirstParty === true) || + (!filter.thirdParty() && request.isThirdParty === true) + ) { + return false; + } + + // Make sure that an exception with a bug ID can only apply to a request being + // matched for a specific bug ID. 
+ if (filter.bug !== undefined && filter.isException() && filter.bug !== request.bug) { + return false; + } + + // Source URL must be among these domains to match + if (filter.hasOptDomains()) { + const optDomains = filter.getOptDomains(); + if ( + binLookup(optDomains, request.sourceHostnameHash) === false && + binLookup(optDomains, request.sourceDomainHash) === false + ) { + return false; + } + } + + // Source URL must not be among these domains to match + if (filter.hasOptNotDomains()) { + const optNotDomains = filter.getOptNotDomains(); + if ( + binLookup(optNotDomains, request.sourceHostnameHash) === true || + binLookup(optNotDomains, request.sourceDomainHash) === true + ) { + return false; + } + } + + return true; +} diff --git a/src/lists.ts b/src/lists.ts new file mode 100644 index 0000000000..5ae0f36ea5 --- /dev/null +++ b/src/lists.ts @@ -0,0 +1,389 @@ +import StaticDataView from './data-view'; +import CosmeticFilter from './filters/cosmetic'; +import NetworkFilter from './filters/network'; +import { fastStartsWith, fastStartsWithFrom } from './utils'; + +const SPACE = /\s/; + +const enum FilterType { + NOT_SUPPORTED, + NETWORK, + COSMETIC, +} + +/** + * Given a single line (string), checks if this would likely be a cosmetic + * filter, a network filter or something that is not supported. This check is + * performed before calling a more specific parser to create an instance of + * `NetworkFilter` or `CosmeticFilter`. + */ +function detectFilterType(line: string): FilterType { + // Ignore comments + if ( + line.length === 1 || + line.charAt(0) === '!' 
|| + (line.charAt(0) === '#' && SPACE.test(line.charAt(1))) || + fastStartsWith(line, '[Adblock') + ) { + return FilterType.NOT_SUPPORTED; + } + + if (fastStartsWith(line, '|') || fastStartsWith(line, '@@|')) { + return FilterType.NETWORK; + } + + // Ignore Adguard cosmetics + // `$$` + if (line.indexOf('$$') !== -1) { + return FilterType.NOT_SUPPORTED; + } + + // Check if filter is cosmetics + const sharpIndex = line.indexOf('#'); + if (sharpIndex !== -1) { + const afterSharpIndex = sharpIndex + 1; + + // Ignore Adguard cosmetics + // `#$#` `#@$#` + // `#%#` `#@%#` + // `#?#` + if ( + fastStartsWithFrom(line, /* #@$# */ '@$#', afterSharpIndex) || + fastStartsWithFrom(line, /* #@%# */ '@%#', afterSharpIndex) || + fastStartsWithFrom(line, /* #%# */ '%#', afterSharpIndex) || + fastStartsWithFrom(line, /* #$# */ '$#', afterSharpIndex) || + fastStartsWithFrom(line, /* #?# */ '?#', afterSharpIndex) + ) { + return FilterType.NOT_SUPPORTED; + } else if ( + fastStartsWithFrom(line, /* ## */ '#', afterSharpIndex) || + fastStartsWithFrom(line, /* #@# */ '@#', afterSharpIndex) + ) { + // Parse supported cosmetic filter + // `##` `#@#` + return FilterType.COSMETIC; + } + } + + // Everything else is a network filter + return FilterType.NETWORK; +} + +export function f(strings: TemplateStringsArray): NetworkFilter | CosmeticFilter | null { + const rawFilter = strings.raw[0]; + const filterType = detectFilterType(rawFilter); + + let filter: NetworkFilter | CosmeticFilter | null = null; + if (filterType === FilterType.NETWORK) { + filter = NetworkFilter.parse(rawFilter); + } else if (filterType === FilterType.COSMETIC) { + filter = CosmeticFilter.parse(rawFilter); + } + + if (filter !== null) { + filter.rawLine = rawFilter; + } + + return filter; +} + +export function parseFilters( + list: string, + { + loadNetworkFilters = true, + loadCosmeticFilters = true, + debug = false, + }: { + loadNetworkFilters?: boolean; + loadCosmeticFilters?: boolean; + debug?: boolean; + } = {}, +): { 
networkFilters: NetworkFilter[]; cosmeticFilters: CosmeticFilter[] } { + const networkFilters: NetworkFilter[] = []; + const cosmeticFilters: CosmeticFilter[] = []; + + const lines = list.split('\n'); + + for (let i = 0; i < lines.length; i += 1) { + const line = lines[i].trim(); + + if (line.length > 0) { + const filterType = detectFilterType(line); + + if (filterType === FilterType.NETWORK && loadNetworkFilters) { + const filter = NetworkFilter.parse(line, debug); + if (filter !== null) { + networkFilters.push(filter); + } + } else if (filterType === FilterType.COSMETIC && loadCosmeticFilters) { + const filter = CosmeticFilter.parse(line, debug); + if (filter !== null) { + cosmeticFilters.push(filter); + } + } + } + } + + return { networkFilters, cosmeticFilters }; +} + +export interface IListDiff { + newNetworkFilters: NetworkFilter[]; + newCosmeticFilters: CosmeticFilter[]; + removedCosmeticFilters: number[]; + removedNetworkFilters: number[]; +} + +export class List { + public static deserialize(buffer: StaticDataView): List { + const checksum: string = buffer.getASCII(); + + const debug = buffer.getBool(); + const loadCosmeticFilters = buffer.getBool(); + const loadNetworkFilters = buffer.getBool(); + + const list = new List({ + debug, + loadCosmeticFilters, + loadNetworkFilters, + }); + + list.checksum = checksum; + list.cosmeticFilterIds = new Set(buffer.getUint32Array()); + list.networkFilterIds = new Set(buffer.getUint32Array()); + + return list; + } + + public checksum: string; + public readonly loadCosmeticFilters: boolean; + public readonly loadNetworkFilters: boolean; + public readonly debug: boolean; + + public networkFilterIds: Set; + public cosmeticFilterIds: Set; + + constructor({ + debug = false, + loadCosmeticFilters = true, + loadNetworkFilters = true, + }: { + debug?: boolean; + loadCosmeticFilters?: boolean; + loadNetworkFilters?: boolean; + } = {}) { + this.debug = debug; + this.checksum = ''; + this.loadCosmeticFilters = 
loadCosmeticFilters; + this.loadNetworkFilters = loadNetworkFilters; + + // Keep track of currently loaded filters TODO - could make use of a typed + // array to allow fast serialization/deserialization as well as less memory + // usage (use compact set abstraction) + this.cosmeticFilterIds = new Set(); + this.networkFilterIds = new Set(); + } + + public getNetworkFiltersIds(): number[] { + return [...this.networkFilterIds]; + } + + public getCosmeticFiltersIds(): number[] { + return [...this.cosmeticFilterIds]; + } + + public update(list: string, checksum: string): IListDiff { + if (checksum === this.checksum) { + return { + newCosmeticFilters: [], + newNetworkFilters: [], + removedCosmeticFilters: [], + removedNetworkFilters: [], + }; + } + + this.checksum = checksum; + + const newCosmeticFilters: CosmeticFilter[] = []; + const newCosmeticFilterIds: Set = new Set(); + + const newNetworkFilters: NetworkFilter[] = []; + const newNetworkFilterIds: Set = new Set(); + + // Parse new filters + const { cosmeticFilters, networkFilters } = parseFilters(list, { + debug: this.debug, + loadCosmeticFilters: this.loadCosmeticFilters, + loadNetworkFilters: this.loadNetworkFilters, + }); + + for (let i = 0; i < cosmeticFilters.length; i += 1) { + const filter = cosmeticFilters[i]; + newCosmeticFilterIds.add(filter.getId()); + if (!this.cosmeticFilterIds.has(filter.getId())) { + newCosmeticFilters.push(filter); + } + } + + for (let i = 0; i < networkFilters.length; i += 1) { + const filter = networkFilters[i]; + newNetworkFilterIds.add(filter.getId()); + if (!this.networkFilterIds.has(filter.getId())) { + newNetworkFilters.push(filter); + } + } + + // Detect list of IDs which have been removed + const removedNetworkFilters: number[] = [...this.networkFilterIds].filter( + (id) => !newNetworkFilterIds.has(id), + ); + const removedCosmeticFilters: number[] = [...this.cosmeticFilterIds].filter( + (id) => !newCosmeticFilterIds.has(id), + ); + + // Update list of filter IDs + 
this.cosmeticFilterIds = newCosmeticFilterIds; + this.networkFilterIds = newNetworkFilterIds; + + return { + newCosmeticFilters, + newNetworkFilters, + removedCosmeticFilters, + removedNetworkFilters, + }; + } + + public serialize(buffer: StaticDataView): void { + buffer.pushASCII(this.checksum); + + buffer.pushBool(this.debug); + buffer.pushBool(this.loadCosmeticFilters); + buffer.pushBool(this.loadNetworkFilters); + + buffer.pushUint32Array(new Uint32Array([...this.cosmeticFilterIds])); + buffer.pushUint32Array(new Uint32Array([...this.networkFilterIds])); + } +} + +export interface IListsOptions { + loadCosmeticFilters?: boolean; + loadNetworkFilters?: boolean; + debug?: boolean; +} + +export default class Lists { + public static deserialize(buffer: StaticDataView): Lists { + const debug = buffer.getBool(); + const loadCosmeticFilters = buffer.getBool(); + const loadNetworkFilters = buffer.getBool(); + + const lists = new Lists({ + debug, + loadCosmeticFilters, + loadNetworkFilters, + }); + + const numberOfLists = buffer.getUint16(); + for (let i = 0; i < numberOfLists; i += 1) { + const name = buffer.getASCII(); + const list = List.deserialize(buffer); + lists.lists.set(name, list); + } + + return lists; + } + + public readonly lists: Map; + public readonly loadNetworkFilters: boolean; + public readonly loadCosmeticFilters: boolean; + public readonly debug: boolean; + + constructor({ + debug = false, + loadCosmeticFilters = true, + loadNetworkFilters = true, + }: IListsOptions = {}) { + this.lists = new Map(); + this.loadNetworkFilters = loadNetworkFilters; + this.loadCosmeticFilters = loadCosmeticFilters; + this.debug = debug; + } + + public serialize(buffer: StaticDataView): void { + buffer.pushBool(this.debug); + buffer.pushBool(this.loadCosmeticFilters); + buffer.pushBool(this.loadNetworkFilters); + + buffer.pushUint16(this.lists.size); + this.lists.forEach((list, name) => { + buffer.pushASCII(name); + list.serialize(buffer); + }); + } + + public 
getLoaded(): string[] { + return [...this.lists.keys()]; + } + + public has(name: string, checksum: string): boolean { + const list: List | undefined = this.lists.get(name); + if (list !== undefined && list.checksum === checksum) { + return true; + } + return false; + } + + public delete(names: string[]): IListDiff { + const removedNetworkFilters: number[] = []; + const removedCosmeticFilters: number[] = []; + + for (let i = 0; i < names.length; i += 1) { + const name = names[i]; + const list: List | undefined = this.lists.get(name); + if (list !== undefined) { + removedNetworkFilters.push(...list.getNetworkFiltersIds()); + removedCosmeticFilters.push(...list.getCosmeticFiltersIds()); + this.lists.delete(name); + } + } + + return { + newCosmeticFilters: [], + newNetworkFilters: [], + removedCosmeticFilters, + removedNetworkFilters, + }; + } + + public update(lists: Array<{ name: string; checksum: string; list: string }>): IListDiff { + const newNetworkFilters: NetworkFilter[] = []; + const removedNetworkFilters: number[] = []; + const newCosmeticFilters: CosmeticFilter[] = []; + const removedCosmeticFilters: number[] = []; + + for (let i = 0; i < lists.length; i += 1) { + const { name, list, checksum } = lists[i]; + const currentList = + this.lists.get(name) || + new List({ + debug: this.debug, + loadCosmeticFilters: this.loadCosmeticFilters, + loadNetworkFilters: this.loadNetworkFilters, + }); + this.lists.set(name, currentList); + + const diff = currentList.update(list, checksum); + newNetworkFilters.push(...diff.newNetworkFilters); + removedNetworkFilters.push(...diff.removedNetworkFilters); + newCosmeticFilters.push(...diff.newCosmeticFilters); + removedCosmeticFilters.push(...diff.removedCosmeticFilters); + } + + return { + newCosmeticFilters, + newNetworkFilters, + removedCosmeticFilters, + removedNetworkFilters, + }; + } +} diff --git a/src/matching/cosmetics.ts b/src/matching/cosmetics.ts deleted file mode 100644 index 4303215528..0000000000 --- 
a/src/matching/cosmetics.ts +++ /dev/null @@ -1,98 +0,0 @@ -import { CosmeticFilter } from '../parsing/cosmetic-filter'; - -/* Checks that hostnamePattern matches at the end of the hostname. - * Partial matches are allowed, but hostname should be a valid - * subdomain of hostnamePattern. - */ -function checkHostnamesPartialMatch(hostname: string, hostnamePattern: string): boolean { - if (hostname.endsWith(hostnamePattern)) { - const patternIndex = hostname.length - hostnamePattern.length; - if (patternIndex === 0 || hostname[patternIndex - 1] === '.') { - return true; - } - } - - return false; -} - -/* Checks if `hostname` matches `hostnamePattern`, which can appear as - * a domain selector in a cosmetic filter: hostnamePattern##selector - * - * It takes care of the concept of entities introduced by uBlock: google.* - * https://github.com/gorhill/uBlock/wiki/Static-filter-syntax#entity-based-cosmetic-filters - */ -function matchHostname( - hostname: string, - hostnameWithoutPublicSuffix: string | null, - hostnamePattern: string, -): boolean { - if (hostnamePattern.endsWith('.*')) { - // Check if we have an entity match - if (hostnameWithoutPublicSuffix !== null) { - return checkHostnamesPartialMatch(hostnameWithoutPublicSuffix, hostnamePattern.slice(0, -2)); - } - - return false; - } - - return checkHostnamesPartialMatch(hostname, hostnamePattern); -} - -/** - * Given a hostname and its domain, return the hostname without the public - * suffix. We know that the domain, with one less label on the left, will be a - * the public suffix; and from there we know which trailing portion of - * `hostname` we should remove. 
- */ -export function getHostnameWithoutPublicSuffix(hostname: string, domain: string): string | null { - let hostnameWithoutPublicSuffix: string | null = null; - - const indexOfDot = domain.indexOf('.'); - if (indexOfDot !== -1) { - const publicSuffix = domain.slice(indexOfDot + 1); - hostnameWithoutPublicSuffix = hostname.slice(0, -publicSuffix.length - 1); - } - - return hostnameWithoutPublicSuffix; -} - -export default function matchCosmeticFilter( - filter: CosmeticFilter, - hostname: string, - domain: string, -): boolean { - const hostnameWithoutPublicSuffix = getHostnameWithoutPublicSuffix(hostname, domain); - - // Check hostnames - if (filter.hasHostnames()) { - if (hostname) { - const hostnames = filter.getHostnames(); - - // Check for exceptions - for (let i = 0; i < hostnames.length; i += 1) { - const filterHostname = hostnames[i]; - if ( - filterHostname[0] === '~' && - matchHostname(hostname, hostnameWithoutPublicSuffix, filterHostname.slice(1)) - ) { - return false; - } - } - - // Check for positive matches - for (let i = 0; i < hostnames.length; i += 1) { - const filterHostname = hostnames[i]; - if ( - filterHostname[0] !== '~' && - matchHostname(hostname, hostnameWithoutPublicSuffix, filterHostname) - ) { - return true; - } - } - } - - return false; - } - - return true; -} diff --git a/src/matching/network.ts b/src/matching/network.ts deleted file mode 100644 index 3ef85e2df0..0000000000 --- a/src/matching/network.ts +++ /dev/null @@ -1,309 +0,0 @@ -import { NetworkFilter } from '../parsing/network-filter'; -import Request from '../request'; -import { binSearch, fastStartsWith, fastStartsWithFrom } from '../utils'; - -export function isAnchoredByHostname(filterHostname: string, hostname: string): boolean { - // Corner-case, if `filterHostname` is empty, then it's a match - if (filterHostname.length === 0) { - return true; - } - - // `filterHostname` cannot be longer than actual hostname - if (filterHostname.length > hostname.length) { - return 
false; - } - - // Check if `filterHostname` appears anywhere in `hostname` - const matchIndex = hostname.indexOf(filterHostname); - - // No match - if (matchIndex === -1) { - return false; - } - - // Either start at beginning of hostname or be preceded by a '.' - return ( - // Prefix match - (matchIndex === 0 && - // This means `filterHostname` is equal to `hostname` - (hostname.length === filterHostname.length || - // This means that `filterHostname` is a prefix of `hostname` (ends with a '.') - filterHostname[filterHostname.length - 1] === '.' || - hostname[filterHostname.length] === '.')) || - // Suffix or infix match - ((hostname[matchIndex - 1] === '.' || filterHostname[0] === '.') && - // `filterHostname` is a full suffix of `hostname` - (hostname.length - matchIndex === filterHostname.length || - // This means that `filterHostname` is infix of `hostname` (ends with a '.') - filterHostname[filterHostname.length - 1] === '.' || - hostname[matchIndex + filterHostname.length] === '.')) - ); -} - -function getUrlAfterHostname(url: string, hostname: string): string { - return url.slice(url.indexOf(hostname) + hostname.length); -} - -// pattern$fuzzy -function checkPatternFuzzyFilter(filter: NetworkFilter, request: Request) { - const signature = filter.getFuzzySignature(); - const requestSignature = request.getFuzzySignature(); - - if (signature.length > requestSignature.length) { - return false; - } - - let lastIndex = 0; - for (let i = 0; i < signature.length; i += 1) { - const c = signature[i]; - // Find the occurrence of `c` in `requestSignature` - const j = requestSignature.indexOf(c, lastIndex); - if (j === -1) { - return false; - } - lastIndex = j + 1; - } - - return true; -} - -// pattern -function checkPatternPlainFilter(filter: NetworkFilter, request: Request): boolean { - if (filter.hasFilter() === false) { - return true; - } - - return request.url.indexOf(filter.getFilter()) !== -1; -} - -// pattern| -function checkPatternRightAnchorFilter(filter: 
NetworkFilter, request: Request): boolean { - if (filter.hasFilter() === false) { - return true; - } - - return request.url.endsWith(filter.getFilter()); -} - -// |pattern -function checkPatternLeftAnchorFilter(filter: NetworkFilter, request: Request): boolean { - if (filter.hasFilter() === false) { - return true; - } - - return fastStartsWith(request.url, filter.getFilter()); -} - -// |pattern| -function checkPatternLeftRightAnchorFilter(filter: NetworkFilter, request: Request): boolean { - if (filter.hasFilter() === false) { - return true; - } - return request.url === filter.getFilter(); -} - -// pattern*^ -function checkPatternRegexFilter( - filter: NetworkFilter, - request: Request, - startFrom: number = 0, -): boolean { - let url = request.url; - if (startFrom > 0) { - url = url.slice(startFrom); - } - return filter.getRegex().test(url); -} - -// ||pattern*^ -function checkPatternHostnameAnchorRegexFilter(filter: NetworkFilter, request: Request): boolean { - const url = request.url; - const hostname = request.hostname; - const filterHostname = filter.getHostname(); - if (isAnchoredByHostname(filterHostname, hostname)) { - return checkPatternRegexFilter( - filter, - request, - url.indexOf(filterHostname) + filterHostname.length, - ); - } - - return false; -} - -// ||pattern| -function checkPatternHostnameRightAnchorFilter(filter: NetworkFilter, request: Request): boolean { - const filterHostname = filter.getHostname(); - const requestHostname = request.hostname; - if (isAnchoredByHostname(filterHostname, requestHostname)) { - if (filter.hasFilter() === false) { - // In this specific case it means that the specified hostname should match - // at the end of the hostname of the request. This allows to prevent false - // positive like ||foo.bar which would match https://foo.bar.baz where - // ||foo.bar^ would not. 
- return ( - filterHostname.length === requestHostname.length || - requestHostname.endsWith(filterHostname) - ); - } else { - return checkPatternRightAnchorFilter(filter, request); - } - } - - return false; -} - -// |||pattern| -function checkPatternHostnameLeftRightAnchorFilter( - filter: NetworkFilter, - request: Request, -): boolean { - if (isAnchoredByHostname(filter.getHostname(), request.hostname)) { - if (filter.hasFilter() === false) { - return true; - } - - // Since this is not a regex, the filter pattern must follow the hostname - // with nothing in between. So we extract the part of the URL following - // after hostname and will perform the matching on it. - const urlAfterHostname = getUrlAfterHostname(request.url, filter.getHostname()); - - // Since it must follow immediatly after the hostname and be a suffix of - // the URL, we conclude that filter must be equal to the part of the - // url following the hostname. - return filter.getFilter() === urlAfterHostname; - } - - return false; -} - -// ||pattern + left-anchor => This means that a plain pattern needs to appear -// exactly after the hostname, with nothing in between. -function checkPatternHostnameLeftAnchorFilter(filter: NetworkFilter, request: Request): boolean { - if (isAnchoredByHostname(filter.getHostname(), request.hostname)) { - if (filter.hasFilter() === false) { - return true; - } - - // Since this is not a regex, the filter pattern must follow the hostname - // with nothing in between. So we extract the part of the URL following - // after hostname and will perform the matching on it. 
- return fastStartsWithFrom( - request.url, - filter.getFilter(), - request.url.indexOf(filter.getHostname()) + filter.getHostname().length, - ); - } - - return false; -} - -// ||pattern -function checkPatternHostnameAnchorFilter(filter: NetworkFilter, request: Request): boolean { - const filterHostname = filter.getHostname(); - if (isAnchoredByHostname(filterHostname, request.hostname)) { - if (filter.hasFilter() === false) { - return true; - } - - // We consider this a match if the plain patter (i.e.: filter) appears anywhere. - return ( - request.url.indexOf( - filter.getFilter(), - request.url.indexOf(filterHostname) + filterHostname.length, - ) !== -1 - ); - } - - return false; -} - -// ||pattern$fuzzy -function checkPatternHostnameAnchorFuzzyFilter(filter: NetworkFilter, request: Request) { - if (isAnchoredByHostname(filter.getHostname(), request.hostname)) { - return checkPatternFuzzyFilter(filter, request); - } - - return false; -} - -/** - * Specialize a network filter depending on its type. It allows for more - * efficient matching function. 
- */ -function checkPattern(filter: NetworkFilter, request: Request): boolean { - if (filter.isHostnameAnchor()) { - if (filter.isRegex()) { - return checkPatternHostnameAnchorRegexFilter(filter, request); - } else if (filter.isRightAnchor() && filter.isLeftAnchor()) { - return checkPatternHostnameLeftRightAnchorFilter(filter, request); - } else if (filter.isRightAnchor()) { - return checkPatternHostnameRightAnchorFilter(filter, request); - } else if (filter.isFuzzy()) { - return checkPatternHostnameAnchorFuzzyFilter(filter, request); - } else if (filter.isLeftAnchor()) { - return checkPatternHostnameLeftAnchorFilter(filter, request); - } - return checkPatternHostnameAnchorFilter(filter, request); - } else if (filter.isRegex()) { - return checkPatternRegexFilter(filter, request); - } else if (filter.isLeftAnchor() && filter.isRightAnchor()) { - return checkPatternLeftRightAnchorFilter(filter, request); - } else if (filter.isLeftAnchor()) { - return checkPatternLeftAnchorFilter(filter, request); - } else if (filter.isRightAnchor()) { - return checkPatternRightAnchorFilter(filter, request); - } else if (filter.isFuzzy()) { - return checkPatternFuzzyFilter(filter, request); - } - - return checkPatternPlainFilter(filter, request); -} - -function checkOptions(filter: NetworkFilter, request: Request): boolean { - // We first discard requests based on type, protocol and party. This is really - // cheap and should be done first. - if ( - filter.isCptAllowed(request.type) === false || - (request.isHttps === true && filter.fromHttps() === false) || - (request.isHttp === true && filter.fromHttp() === false) || - (!filter.firstParty() && request.isFirstParty === true) || - (!filter.thirdParty() && request.isThirdParty === true) - ) { - return false; - } - - // Make sure that an exception with a bug ID can only apply to a request being - // matched for a specific bug ID. 
- if (filter.bug !== undefined && filter.isException() && filter.bug !== request.bug) { - return false; - } - - // Source URL must be among these domains to match - if (filter.hasOptDomains()) { - const optDomains = filter.getOptDomains(); - if ( - !binSearch(optDomains, request.sourceHostnameHash) && - !binSearch(optDomains, request.sourceDomainHash) - ) { - return false; - } - } - - // Source URL must not be among these domains to match - if (filter.hasOptNotDomains()) { - const optNotDomains = filter.getOptNotDomains(); - if ( - binSearch(optNotDomains, request.sourceHostnameHash) || - binSearch(optNotDomains, request.sourceDomainHash) - ) { - return false; - } - } - - return true; -} - -export default function matchNetworkFilter(filter: NetworkFilter, request: Request): boolean { - return checkOptions(filter, request) && checkPattern(filter, request); -} diff --git a/src/parsing/cosmetic-filter.ts b/src/parsing/cosmetic-filter.ts deleted file mode 100644 index 7f6aa7bc84..0000000000 --- a/src/parsing/cosmetic-filter.ts +++ /dev/null @@ -1,363 +0,0 @@ -import * as punycode from 'punycode'; -import { fastStartsWithFrom, getBit, hasUnicode, setBit, tokenizeHostnames } from '../utils'; -import IFilter from './interface'; - -/** - * Validate CSS selector. There is a fast path for simple selectors (e.g.: #foo - * or .bar) which are the most common case. For complex ones, we rely on - * `Element.matches` (if available). - */ -const isValidCss = (() => { - const div = - typeof document !== 'undefined' - ? 
document.createElement('div') - : { - matches: () => { - /* noop */ - }, - }; - const matches = (selector: string): void | boolean => div.matches(selector); - const validSelectorRe = /^[#.]?[\w-.]+$/; - - return function isValidCssImpl(selector: string): boolean { - if (validSelectorRe.test(selector)) { - return true; - } - - try { - matches(selector); - } catch (ex) { - return false; - } - - return true; - }; -})(); - -/** - * Masks used to store options of cosmetic filters in a bitmask. - */ -const enum COSMETICS_MASK { - unhide = 1 << 0, - scriptInject = 1 << 1, - scriptBlock = 1 << 2, -} - -function computeFilterId( - mask: number, - selector: string | undefined, - hostnames: string | undefined, -): number { - let hash = (5408 * 33) ^ mask; - - if (selector !== undefined) { - for (let j = 0; j < selector.length; j += 1) { - hash = (hash * 33) ^ selector.charCodeAt(j); - } - } - - if (hostnames !== undefined) { - for (let j = 0; j < hostnames.length; j += 1) { - hash = (hash * 33) ^ hostnames.charCodeAt(j); - } - } - - return hash >>> 0; -} - -/*************************************************************************** - * Cosmetic filters parsing - * ************************************************************************ */ - -/** - * TODO: Make sure these are implemented properly and write tests. 
- * - -abp-contains - * - -abp-has - * - contains - * - has - * - has-text - * - if - * - if-not - * - matches-css - * - matches-css-after - * - matches-css-before - * - xpath - */ -export class CosmeticFilter implements IFilter { - public readonly mask: number; - public readonly selector?: string; - public readonly hostnames?: string; - public readonly style?: string; - - public id?: number; - public rawLine?: string; - private hostnamesArray?: string[]; - - constructor({ - hostnames, - id, - mask, - selector, - style, - }: Partial<CosmeticFilter> & { mask: number }) { - this.id = id; - this.mask = mask; - this.selector = selector; - this.hostnames = hostnames; - this.style = style; - } - - public isCosmeticFilter(): boolean { - return true; - } - public isNetworkFilter(): boolean { - return false; - } - - /** - * Create a more human-readable version of this filter. It is mainly used for - * debugging purpose, as it will expand the values stored in the bit mask. - */ - public toString(): string { - let filter = ''; - - if (this.hasHostnames()) { - filter += this.hostnames; - } - - if (this.isUnhide()) { - filter += '#@#'; - } else { - filter += '##'; - } - - if (this.isScriptInject()) { - filter += 'script:inject('; - filter += this.selector; - filter += ')'; - } else if (this.isScriptBlock()) { - filter += 'script:contains('; - filter += this.selector; - filter += ')'; - } else { - filter += this.selector; - } - - return filter; - } - - public getTokens(): Uint32Array[] { - if (this.hostnames !== undefined) { - return this.hostnames.split(',').map(tokenizeHostnames); - } - return []; - } - - public getScript(js: Map<string, string>): string | undefined { - let scriptName = this.getSelector(); - let scriptArguments: string[] = []; - if (scriptName.indexOf(',') !== -1) { - const parts = scriptName.split(','); - scriptName = parts[0]; - scriptArguments = parts.slice(1).map((s) => s.trim()); - } - - let script = js.get(scriptName); - if (script !== undefined) { - for (let i = 0; i <
scriptArguments.length; i += 1) { - script = script.replace(`{{${i + 1}}}`, scriptArguments[i]); - } - - return script; - } // TODO - else throw an exception? - - return undefined; - } - - public getId(): number { - if (this.id === undefined) { - this.id = computeFilterId(this.mask, this.selector, this.hostnames); - } - return this.id; - } - - public getStyle(): string { - return this.style || 'display: none !important;'; - } - - public getSelector(): string { - return this.selector || ''; - } - - public hasHostnames(): boolean { - return this.hostnames !== undefined; - } - - public getHostnames(): string[] { - if (this.hostnamesArray === undefined) { - // Sort them from longer hostname to shorter. - // This is to make sure that we will always start by the most specific - // when matching. - this.hostnamesArray = - this.hostnames === undefined - ? [] - : this.hostnames.split(',').sort((h1, h2) => { - if (h1.length > h2.length) { - return -1; - } else if (h1.length < h2.length) { - return 1; - } - - return 0; - }); - } - - return this.hostnamesArray; - } - - public isUnhide(): boolean { - return getBit(this.mask, COSMETICS_MASK.unhide); - } - - public isScriptInject(): boolean { - return getBit(this.mask, COSMETICS_MASK.scriptInject); - } - - public isScriptBlock(): boolean { - return getBit(this.mask, COSMETICS_MASK.scriptBlock); - } -} - -/** - * Given a line that we know contains a cosmetic filter, create a CosmeticFiler - * instance out of it. This function should be *very* efficient, as it will be - * used to parse tens of thousands of lines. - */ -export function parseCosmeticFilter(line: string): CosmeticFilter | null { - // Mask to store attributes - // Each flag (unhide, scriptInject, etc.) takes only 1 bit - // at a specific offset defined in COSMETICS_MASK. 
- // cf: COSMETICS_MASK for the offset of each property - let mask = 0; - let selector: string | undefined; - let hostnames: string | undefined; - let style: string | undefined; - const sharpIndex = line.indexOf('#'); - - // Start parsing the line - const afterSharpIndex = sharpIndex + 1; - let suffixStartIndex = afterSharpIndex + 1; - - // hostname1,hostname2#@#.selector - // ^^ ^ - // || | - // || suffixStartIndex - // |afterSharpIndex - // sharpIndex - - // Check if unhide - if (line.length > afterSharpIndex && line[afterSharpIndex] === '@') { - mask = setBit(mask, COSMETICS_MASK.unhide); - suffixStartIndex += 1; - } - - // Parse hostnames - if (sharpIndex > 0) { - hostnames = line.slice(0, sharpIndex); - if (hasUnicode(hostnames)) { - hostnames = punycode.encode(hostnames); - } - } - - // We should not have unhide without any hostname - if (getBit(mask, COSMETICS_MASK.unhide) && hostnames === undefined) { - return null; - } - - // Deal with script:inject and script:contains - if (fastStartsWithFrom(line, 'script:', suffixStartIndex)) { - // script:inject(.......) 
- // ^ ^ - // script:contains(/......./) - // ^ ^ - // script:contains(selector[, args]) - // ^ ^ ^^ - // | | | || - // | | | |selector.length - // | | | scriptSelectorIndexEnd - // | | |scriptArguments - // | scriptSelectorIndexStart - // scriptMethodIndex - const scriptMethodIndex = suffixStartIndex + 7; - let scriptSelectorIndexStart = scriptMethodIndex; - let scriptSelectorIndexEnd = line.length - 1; - - if (fastStartsWithFrom(line, 'inject(', scriptMethodIndex)) { - mask = setBit(mask, COSMETICS_MASK.scriptInject); - scriptSelectorIndexStart += 7; - } else if (fastStartsWithFrom(line, 'contains(', scriptMethodIndex)) { - mask = setBit(mask, COSMETICS_MASK.scriptBlock); - scriptSelectorIndexStart += 9; - - // If it's a regex - if (line[scriptSelectorIndexStart] === '/' && line[scriptSelectorIndexEnd - 1] === '/') { - scriptSelectorIndexStart += 1; - scriptSelectorIndexEnd -= 1; - } - } - - selector = line.slice(scriptSelectorIndexStart, scriptSelectorIndexEnd); - } else if (fastStartsWithFrom(line, '+js(', suffixStartIndex)) { - mask = setBit(mask, COSMETICS_MASK.scriptInject); - selector = line.slice(suffixStartIndex + 4, line.length - 1); - } else { - // Detect special syntax - let indexOfColon = line.indexOf(':', suffixStartIndex); - while (indexOfColon !== -1) { - const indexAfterColon = indexOfColon + 1; - if (fastStartsWithFrom(line, 'style', indexAfterColon)) { - // ##selector :style(...) 
- if (line[indexAfterColon + 5] === '(' && line[line.length - 1] === ')') { - selector = line.slice(suffixStartIndex, indexOfColon); - style = line.slice(indexAfterColon + 6, -1); - } else { - console.error('?????', line, indexAfterColon); - return null; - } - } else if ( - fastStartsWithFrom(line, '-abp-', indexAfterColon) || - fastStartsWithFrom(line, 'contains', indexAfterColon) || - fastStartsWithFrom(line, 'has', indexAfterColon) || - fastStartsWithFrom(line, 'if', indexAfterColon) || - fastStartsWithFrom(line, 'if-not', indexAfterColon) || - fastStartsWithFrom(line, 'matches-css', indexAfterColon) || - fastStartsWithFrom(line, 'matches-css-after', indexAfterColon) || - fastStartsWithFrom(line, 'matches-css-before', indexAfterColon) || - fastStartsWithFrom(line, 'not', indexAfterColon) || - fastStartsWithFrom(line, 'properties', indexAfterColon) || - fastStartsWithFrom(line, 'subject', indexAfterColon) || - fastStartsWithFrom(line, 'xpath', indexAfterColon) - ) { - return null; - } - indexOfColon = line.indexOf(':', indexAfterColon); - } - - // If we reach this point, filter is not extended syntax - if (selector === undefined && suffixStartIndex < line.length) { - selector = line.slice(suffixStartIndex); - } - - if (selector === undefined || !isValidCss(selector)) { - // Not a valid selector - return null; - } - } - - return new CosmeticFilter({ - hostnames, - mask, - selector, - style, - }); -} diff --git a/src/parsing/list.ts b/src/parsing/list.ts deleted file mode 100644 index 3d98c7a18c..0000000000 --- a/src/parsing/list.ts +++ /dev/null @@ -1,148 +0,0 @@ -import { fastStartsWith, fastStartsWithFrom } from '../utils'; - -import { CosmeticFilter, parseCosmeticFilter } from './cosmetic-filter'; -import { NetworkFilter, parseNetworkFilter } from './network-filter'; - -const SPACE = /\s/; - -const enum FilterType { - NOT_SUPPORTED, - NETWORK, - COSMETIC, -} - -function detectFilterType(line: string): FilterType { - // Ignore comments - if ( - line.length === 1 
|| - line.charAt(0) === '!' || - (line.charAt(0) === '#' && SPACE.test(line.charAt(1))) || - fastStartsWith(line, '[Adblock') - ) { - return FilterType.NOT_SUPPORTED; - } - - if (fastStartsWith(line, '|') || fastStartsWith(line, '@@|')) { - return FilterType.NETWORK; - } - - // Ignore Adguard cosmetics - // `$$` - if (line.indexOf('$$') !== -1) { - return FilterType.NOT_SUPPORTED; - } - - // Check if filter is cosmetics - const sharpIndex = line.indexOf('#'); - if (sharpIndex !== -1) { - const afterSharpIndex = sharpIndex + 1; - - // Ignore Adguard cosmetics - // `#$#` `#@$#` - // `#%#` `#@%#` - // `#?#` - if ( - fastStartsWithFrom(line, /* #@$# */ '@$#', afterSharpIndex) || - fastStartsWithFrom(line, /* #@%# */ '@%#', afterSharpIndex) || - fastStartsWithFrom(line, /* #%# */ '%#', afterSharpIndex) || - fastStartsWithFrom(line, /* #$# */ '$#', afterSharpIndex) || - fastStartsWithFrom(line, /* #?# */ '?#', afterSharpIndex) - ) { - return FilterType.NOT_SUPPORTED; - } else if ( - fastStartsWithFrom(line, /* ## */ '#', afterSharpIndex) || - fastStartsWithFrom(line, /* #@# */ '@#', afterSharpIndex) - ) { - // Parse supported cosmetic filter - // `##` `#@#` - return FilterType.COSMETIC; - } - } - - // Everything else is a network filter - return FilterType.NETWORK; -} - -export function f(strings: TemplateStringsArray): NetworkFilter | CosmeticFilter | null { - const rawFilter = strings.raw[0]; - const filterType = detectFilterType(rawFilter); - - let filter: NetworkFilter | CosmeticFilter | null = null; - if (filterType === FilterType.NETWORK) { - filter = parseNetworkFilter(rawFilter); - } else if (filterType === FilterType.COSMETIC) { - filter = parseCosmeticFilter(rawFilter); - } - - if (filter !== null) { - filter.rawLine = rawFilter; - } - - return filter; -} - -export function parseList( - data: string, - { loadNetworkFilters = true, loadCosmeticFilters = true, debug = false } = {}, -): { networkFilters: NetworkFilter[]; cosmeticFilters: CosmeticFilter[] } { - 
const networkFilters: NetworkFilter[] = []; - const cosmeticFilters: CosmeticFilter[] = []; - const lines = data.split('\n'); - - for (let i = 0; i < lines.length; i += 1) { - const line = lines[i].trim(); - - if (line.length > 0) { - const filterType = detectFilterType(line); - - if (filterType === FilterType.NETWORK && loadNetworkFilters) { - const filter = parseNetworkFilter(line); - if (filter !== null) { - // In debug mode, keep the original line - if (debug === true) { - filter.rawLine = line; - } - networkFilters.push(filter); - } - } else if (filterType === FilterType.COSMETIC && loadCosmeticFilters) { - const filter = parseCosmeticFilter(line); - if (filter !== null) { - // In debug mode, keep the original line - if (debug === true) { - filter.rawLine = line; - } - cosmeticFilters.push(filter); - } - } - } - } - - return { - cosmeticFilters, - networkFilters, - }; -} - -export function parseJSResource(data: string): Map<string, Map<string, string>> { - const resources = new Map(); - const trimComments = (str: string) => str.replace(/^#.*$/gm, ''); - const chunks = data.split('\n\n'); - - for (let i = 1; i < chunks.length; i += 1) { - const resource = trimComments(chunks[i]).trim(); - const firstNewLine = resource.indexOf('\n'); - const [name, type] = resource.slice(0, firstNewLine).split(' '); - const body = resource.slice(firstNewLine + 1); - - if (name === undefined || type === undefined || body === undefined) { - continue; - } - - if (!resources.has(type)) { - resources.set(type, new Map()); - } - resources.get(type).set(name, body); - } - - return resources; -} diff --git a/src/parsing/network-filter.ts b/src/parsing/network-filter.ts deleted file mode 100644 index 85dbc03b53..0000000000 --- a/src/parsing/network-filter.ts +++ /dev/null @@ -1,1024 +0,0 @@ -import * as punycode from 'punycode'; -import { RequestType } from '../request'; -import { - clearBit, - createFuzzySignature, - fastHash, - fastStartsWith, - fastStartsWithFrom, - getBit, - hasUnicode, - setBit, - tokenize, -
tokenizeFilter, -} from '../utils'; -import IFilter from './interface'; - -const TOKENS_BUFFER = new Uint32Array(200); - -/** - * Masks used to store options of network filters in a bitmask. - */ -export const enum NETWORK_FILTER_MASK { - // Content Policy Type - fromImage = 1 << 0, - fromMedia = 1 << 1, - fromObject = 1 << 2, - fromOther = 1 << 3, - fromPing = 1 << 4, - fromScript = 1 << 5, - fromStylesheet = 1 << 6, - fromSubdocument = 1 << 7, - fromWebsocket = 1 << 8, // e.g.: ws, wss - fromXmlHttpRequest = 1 << 9, - fromFont = 1 << 10, - fromHttp = 1 << 11, - fromHttps = 1 << 12, - isImportant = 1 << 13, - matchCase = 1 << 14, - fuzzyMatch = 1 << 15, - - // Kind of patterns - thirdParty = 1 << 16, - firstParty = 1 << 17, - isRegex = 1 << 18, - isLeftAnchor = 1 << 19, - isRightAnchor = 1 << 20, - isHostnameAnchor = 1 << 21, - isException = 1 << 22, - isCSP = 1 << 23, -} - -/** - * Mask used when a network filter can be applied on any content type. - */ -const FROM_ANY: number = - NETWORK_FILTER_MASK.fromFont | - NETWORK_FILTER_MASK.fromImage | - NETWORK_FILTER_MASK.fromMedia | - NETWORK_FILTER_MASK.fromObject | - NETWORK_FILTER_MASK.fromOther | - NETWORK_FILTER_MASK.fromPing | - NETWORK_FILTER_MASK.fromScript | - NETWORK_FILTER_MASK.fromStylesheet | - NETWORK_FILTER_MASK.fromSubdocument | - NETWORK_FILTER_MASK.fromWebsocket | - NETWORK_FILTER_MASK.fromXmlHttpRequest; - -/** - * Map content type value to mask the corresponding mask. 
- * ref: https://developer.mozilla.org/en-US/docs/Mozilla/Tech/XPCOM/Reference/Interface/nsIContentPolicy - */ -const CPT_TO_MASK: { - [s: number]: number; -} = { - [RequestType.other]: NETWORK_FILTER_MASK.fromOther, - [RequestType.script]: NETWORK_FILTER_MASK.fromScript, - [RequestType.image]: NETWORK_FILTER_MASK.fromImage, - [RequestType.stylesheet]: NETWORK_FILTER_MASK.fromStylesheet, - [RequestType.object]: NETWORK_FILTER_MASK.fromObject, - [RequestType.subdocument]: NETWORK_FILTER_MASK.fromSubdocument, - [RequestType.ping]: NETWORK_FILTER_MASK.fromPing, - [RequestType.beacon]: NETWORK_FILTER_MASK.fromPing, - [RequestType.xmlhttprequest]: NETWORK_FILTER_MASK.fromXmlHttpRequest, - [RequestType.font]: NETWORK_FILTER_MASK.fromFont, - [RequestType.media]: NETWORK_FILTER_MASK.fromMedia, - [RequestType.websocket]: NETWORK_FILTER_MASK.fromWebsocket, - [RequestType.dtd]: NETWORK_FILTER_MASK.fromOther, - [RequestType.fetch]: NETWORK_FILTER_MASK.fromOther, - [RequestType.xlst]: NETWORK_FILTER_MASK.fromOther, -}; - -function computeFilterId( - csp: string | undefined, - mask: number, - filter: string | undefined, - hostname: string | undefined, - optDomains: Uint32Array | undefined, - optNotDomains: Uint32Array | undefined, -): number { - let hash = (5408 * 33) ^ mask; - - if (csp !== undefined) { - for (let i = 0; i < csp.length; i += 1) { - hash = (hash * 33) ^ csp.charCodeAt(i); - } - } - - if (optDomains !== undefined) { - for (let i = 0; i < optDomains.length; i += 1) { - hash = (hash * 33) ^ optDomains[i]; - } - } - - if (optNotDomains !== undefined) { - for (let i = 0; i < optNotDomains.length; i += 1) { - hash = (hash * 33) ^ optNotDomains[i]; - } - } - - if (filter !== undefined) { - for (let i = 0; i < filter.length; i += 1) { - hash = (hash * 33) ^ filter.charCodeAt(i); - } - } - - if (hostname !== undefined) { - for (let i = 0; i < hostname.length; i += 1) { - hash = (hash * 33) ^ hostname.charCodeAt(i); - } - } - - return hash >>> 0; -} - -const SEPARATOR = 
/[/^*]/; - -/** - * Compiles a filter pattern to a regex. This is only performed *lazily* for - * filters containing at least a * or ^ symbol. Because Regexes are expansive, - * we try to convert some patterns to plain filters. - */ -function compileRegex(filterStr: string, isRightAnchor: boolean, isLeftAnchor: boolean): RegExp { - let filter = filterStr; - - // Escape special regex characters: |.$+?{}()[]\ - filter = filter.replace(/([|.$+?{}()[\]\\])/g, '\\$1'); - - // * can match anything - filter = filter.replace(/\*/g, '.*'); - // ^ can match any separator or the end of the pattern - filter = filter.replace(/\^/g, '(?:[^\\w\\d_.%-]|$)'); - - // Should match end of url - if (isRightAnchor) { - filter = `${filter}$`; - } - - if (isLeftAnchor) { - filter = `^${filter}`; - } - - return new RegExp(filter); -} - -const EMPTY_ARRAY = new Uint32Array([]); -const MATCH_ALL = new RegExp(''); - -// TODO: -// 1. Options not supported yet: -// - badfilter -// - inline-script -// - popup -// - popunder -// - generichide -// - genericblock -// 2. 
Replace `split` with `substr` -export class NetworkFilter implements IFilter { - public readonly mask: number; - public readonly filter?: string; - public readonly optDomains?: Uint32Array; - public readonly optNotDomains?: Uint32Array; - public readonly redirect?: string; - public readonly hostname?: string; - public readonly csp?: string; - public readonly bug?: number; - - // Set only in debug mode - public rawLine?: string; - - public id?: number; - private fuzzySignature?: Uint32Array; - private regex?: RegExp; - private optimized: boolean = false; - - constructor({ - bug, - csp, - filter, - hostname, - id, - mask, - optDomains, - optNotDomains, - rawLine, - redirect, - regex, - }: { mask: number; regex?: RegExp } & Partial<NetworkFilter>) { - this.bug = bug; - this.csp = csp; - this.filter = filter; - this.hostname = hostname; - this.id = id; - this.mask = mask; - this.optDomains = optDomains; - this.optNotDomains = optNotDomains; - this.rawLine = rawLine; - this.redirect = redirect; - this.regex = regex; - } - - public isCosmeticFilter() { - return false; - } - public isNetworkFilter() { - return true; - } - - /** - * Tries to recreate the original representation of the filter (adblock - * syntax) from the internal representation.
- */ - public toString() { - if (this.rawLine !== undefined) { - return this.rawLine; - } - - let filter = ''; - - if (this.isException()) { - filter += '@@'; - } - if (this.isHostnameAnchor()) { - filter += '||'; - } - if (this.isLeftAnchor()) { - filter += '|'; - } - - if (this.hasHostname()) { - filter += this.getHostname(); - filter += '^'; - } - - if (!this.isRegex()) { - filter += this.getFilter(); - } else { - // Visualize the compiled regex - filter += this.getRegex().source; - } - - // Options - const options: string[] = []; - - if (!this.fromAny()) { - if (this.isFuzzy()) { - options.push('fuzzy'); - } - if (this.fromImage()) { - options.push('image'); - } - if (this.fromMedia()) { - options.push('media'); - } - if (this.fromObject()) { - options.push('object'); - } - if (this.fromOther()) { - options.push('other'); - } - if (this.fromPing()) { - options.push('ping'); - } - if (this.fromScript()) { - options.push('script'); - } - if (this.fromStylesheet()) { - options.push('stylesheet'); - } - if (this.fromSubdocument()) { - options.push('subdocument'); - } - if (this.fromWebsocket()) { - options.push('websocket'); - } - if (this.fromXmlHttpRequest()) { - options.push('xmlhttprequest'); - } - if (this.fromFont()) { - options.push('font'); - } - } - - if (this.isImportant()) { - options.push('important'); - } - if (this.isRedirect()) { - options.push(`redirect=${this.getRedirect()}`); - } - if (this.firstParty() !== this.thirdParty()) { - if (this.firstParty()) { - options.push('first-party'); - } - if (this.thirdParty()) { - options.push('third-party'); - } - } - - // if (this.hasOptDomains() || this.hasOptNotDomains()) { - // const domains = [...this.getOptDomains()]; - // this.getOptNotDomains().forEach((nd) => domains.push(`~${nd}`)); - // options.push(`domain=${domains.join('|')}`); - // } - - if (options.length > 0) { - filter += `$${options.join(',')}`; - } - - if (this.isRightAnchor()) { - filter += '|'; - } - - return filter; - } - - // Public API 
(Read-Only) - public getId(): number { - if (this.id === undefined) { - this.id = computeFilterId( - this.csp, - this.mask, - this.filter, - this.hostname, - this.optDomains, - this.optNotDomains, - ); - } - return this.id; - } - - public hasFilter(): boolean { - return this.filter !== undefined; - } - - public hasOptNotDomains(): boolean { - return this.optNotDomains !== undefined; - } - - public getNumberOfOptNotDomains(): number { - if (this.optNotDomains !== undefined) { - return this.optNotDomains.length; - } - return 0; - } - - public getOptNotDomains(): Uint32Array { - this.optimize(); - return this.optNotDomains || EMPTY_ARRAY; - } - - public getNumberOfOptDomains(): number { - if (this.optDomains !== undefined) { - return this.optDomains.length; - } - return 0; - } - - public hasOptDomains(): boolean { - return this.optDomains !== undefined; - } - - public getOptDomains(): Uint32Array { - this.optimize(); - return this.optDomains || EMPTY_ARRAY; - } - - public getMask(): number { - return this.mask; - } - - public getCptMask(): number { - return this.getMask() & FROM_ANY; - } - - public isRedirect(): boolean { - return this.redirect !== undefined; - } - - public getRedirect(): string { - return this.redirect || ''; - } - - public hasHostname(): boolean { - return this.hostname !== undefined; - } - - public getHostname(): string { - return this.hostname || ''; - } - - public getFilter(): string { - return this.filter || ''; - } - - public getRegex(): RegExp { - this.optimize(); - return this.regex || MATCH_ALL; - } - - public getFuzzySignature(): Uint32Array { - this.optimize(); - return this.fuzzySignature || EMPTY_ARRAY; - } - - public getTokens(): Uint32Array[] { - let tokensBufferIndex = 0; - - // If there is only one domain and no domain negation, we also use this - // domain as a token. 
- if ( - this.optDomains !== undefined && - this.optNotDomains === undefined && - this.optDomains.length === 1 - ) { - TOKENS_BUFFER[tokensBufferIndex] = this.optDomains[0]; - tokensBufferIndex += 1; - } - - // Get tokens from filter - if (this.filter !== undefined) { - const skipLastToken = this.isPlain() && !this.isRightAnchor() && !this.isFuzzy(); - const skipFirstToken = this.isRightAnchor(); - const filterTokens = tokenizeFilter(this.filter, skipFirstToken, skipLastToken); - TOKENS_BUFFER.set(filterTokens, tokensBufferIndex); - tokensBufferIndex += filterTokens.length; - } - - // Append tokens from hostname, if any - if (this.hostname !== undefined) { - const hostnameTokens = tokenize(this.hostname); - TOKENS_BUFFER.set(hostnameTokens, tokensBufferIndex); - tokensBufferIndex += hostnameTokens.length; - } - - // If we got no tokens for the filter/hostname part, then we will dispatch - // this filter in multiple buckets based on the domains option. - if ( - tokensBufferIndex === 0 && - this.optDomains !== undefined && - this.optNotDomains === undefined - ) { - return [...this.optDomains].map((d) => new Uint32Array([d])); - } - - // Add optional token for protocol - if (this.fromHttp() && !this.fromHttps()) { - TOKENS_BUFFER[tokensBufferIndex] = fastHash('http'); - tokensBufferIndex += 1; - } else if (this.fromHttps() && !this.fromHttp()) { - TOKENS_BUFFER[tokensBufferIndex] = fastHash('https'); - tokensBufferIndex += 1; - } - - return [TOKENS_BUFFER.slice(0, tokensBufferIndex)]; - } - - /** - * Check if this filter should apply to a request with this content type. - */ - public isCptAllowed(cpt: RequestType): boolean { - const mask = CPT_TO_MASK[cpt]; - if (mask !== undefined) { - return getBit(this.mask, mask); - } - - // If content type is not supported (or not specified), we return `true` - // only if the filter does not specify any resource type. 
- return this.fromAny(); - } - - public isFuzzy() { - return getBit(this.mask, NETWORK_FILTER_MASK.fuzzyMatch); - } - - public isException() { - return getBit(this.mask, NETWORK_FILTER_MASK.isException); - } - - public isHostnameAnchor() { - return getBit(this.mask, NETWORK_FILTER_MASK.isHostnameAnchor); - } - - public isRightAnchor() { - return getBit(this.mask, NETWORK_FILTER_MASK.isRightAnchor); - } - - public isLeftAnchor() { - return getBit(this.mask, NETWORK_FILTER_MASK.isLeftAnchor); - } - - public matchCase() { - return getBit(this.mask, NETWORK_FILTER_MASK.matchCase); - } - - public isImportant() { - return getBit(this.mask, NETWORK_FILTER_MASK.isImportant); - } - - public isRegex() { - return getBit(this.mask, NETWORK_FILTER_MASK.isRegex); - } - - public isPlain() { - return !getBit(this.mask, NETWORK_FILTER_MASK.isRegex); - } - - public isCSP() { - return getBit(this.mask, NETWORK_FILTER_MASK.isCSP); - } - - public hasBug() { - return this.bug !== undefined; - } - - public fromAny() { - return this.getCptMask() === FROM_ANY; - } - - public thirdParty() { - return getBit(this.mask, NETWORK_FILTER_MASK.thirdParty); - } - - public firstParty() { - return getBit(this.mask, NETWORK_FILTER_MASK.firstParty); - } - - public fromImage() { - return getBit(this.mask, NETWORK_FILTER_MASK.fromImage); - } - - public fromMedia() { - return getBit(this.mask, NETWORK_FILTER_MASK.fromMedia); - } - - public fromObject() { - return getBit(this.mask, NETWORK_FILTER_MASK.fromObject); - } - - public fromOther() { - return getBit(this.mask, NETWORK_FILTER_MASK.fromOther); - } - - public fromPing() { - return getBit(this.mask, NETWORK_FILTER_MASK.fromPing); - } - - public fromScript() { - return getBit(this.mask, NETWORK_FILTER_MASK.fromScript); - } - - public fromStylesheet() { - return getBit(this.mask, NETWORK_FILTER_MASK.fromStylesheet); - } - - public fromSubdocument() { - return getBit(this.mask, NETWORK_FILTER_MASK.fromSubdocument); - } - - public fromWebsocket() { - 
return getBit(this.mask, NETWORK_FILTER_MASK.fromWebsocket); - } - - public fromHttp() { - return getBit(this.mask, NETWORK_FILTER_MASK.fromHttp); - } - - public fromHttps() { - return getBit(this.mask, NETWORK_FILTER_MASK.fromHttps); - } - - public fromXmlHttpRequest() { - return getBit(this.mask, NETWORK_FILTER_MASK.fromXmlHttpRequest); - } - - public fromFont() { - return getBit(this.mask, NETWORK_FILTER_MASK.fromFont); - } - - private optimize() { - if (this.optimized === false) { - this.optimized = true; - if (this.optNotDomains !== undefined) { - this.optNotDomains.sort(); - } - if (this.optDomains !== undefined) { - this.optDomains.sort(); - } - if (this.filter !== undefined && this.regex === undefined && this.isRegex()) { - this.regex = compileRegex(this.filter, this.isRightAnchor(), this.isLeftAnchor()); - } - if (this.filter !== undefined && this.isFuzzy()) { - this.fuzzySignature = createFuzzySignature(this.filter); - } - } - } -} - -// --------------------------------------------------------------------------- -// Filter parsing -// --------------------------------------------------------------------------- - -function setNetworkMask(mask: number, m: number, value: boolean): number { - if (value === true) { - return setBit(mask, m); - } - - return clearBit(mask, m); -} - -/** - * Check if the sub-string contained between the indices start and end is a - * regex filter (it contains a '*' or '^' char). Here we are limited by the - * capability of javascript to check the presence of a pattern between two - * indices (same for Regex...). - * // TODO - we could use sticky regex here - */ -function checkIsRegex(filter: string, start: number, end: number): boolean { - const starIndex = filter.indexOf('*', start); - const separatorIndex = filter.indexOf('^', start); - return (starIndex !== -1 && starIndex < end) || (separatorIndex !== -1 && separatorIndex < end); -} - -/** - * Parse a line containing a network filter into a NetworkFilter object. 
- * This must be *very* efficient. - */ -export function parseNetworkFilter(rawLine: string): NetworkFilter | null { - const line: string = rawLine; - - // Represent options as a bitmask - let mask: number = - NETWORK_FILTER_MASK.thirdParty | - NETWORK_FILTER_MASK.firstParty | - NETWORK_FILTER_MASK.fromHttps | - NETWORK_FILTER_MASK.fromHttp; - - // Temporary masks for positive (e.g.: $script) and negative (e.g.: $~script) - // content type options. - let cptMaskPositive: number = 0; - let cptMaskNegative: number = FROM_ANY; - - let hostname: string | undefined; - - let optDomains: Uint32Array | undefined; - let optNotDomains: Uint32Array | undefined; - let redirect: string | undefined; - let csp: string | undefined; - let bug: number | undefined; - - // Start parsing - let filterIndexStart: number = 0; - let filterIndexEnd: number = line.length; - - // @@filter == Exception - if (fastStartsWith(line, '@@')) { - filterIndexStart += 2; - mask = setBit(mask, NETWORK_FILTER_MASK.isException); - } - - // filter$options == Options - // ^ ^ - // | | - // | optionsIndex - // filterIndexStart - const optionsIndex: number = line.lastIndexOf('$'); - if (optionsIndex !== -1) { - // Parse options and set flags - filterIndexEnd = optionsIndex; - - // --------------------------------------------------------------------- // - // parseOptions - // TODO: This could be implemented without string copy, - // using indices, like in main parsing functions. 
- const rawOptions = line.slice(optionsIndex + 1); - const options = rawOptions.split(','); - for (let i = 0; i < options.length; i += 1) { - const rawOption = options[i]; - let negation = false; - let option = rawOption; - - // Check for negation: ~option - if (fastStartsWith(option, '~')) { - negation = true; - option = option.slice(1); - } else { - negation = false; - } - - // Check for options: option=value1|value2 - let optionValue: string = ''; - if (option.indexOf('=') !== -1) { - const optionAndValues = option.split('=', 2); - option = optionAndValues[0]; - optionValue = optionAndValues[1]; - } - - switch (option) { - case 'domain': { - const optionValues: string[] = optionValue.split('|'); - const optDomainsArray: number[] = []; - const optNotDomainsArray: number[] = []; - - for (let j = 0; j < optionValues.length; j += 1) { - const value: string = optionValues[j]; - if (value) { - if (fastStartsWith(value, '~')) { - optNotDomainsArray.push(fastHash(value.slice(1))); - } else { - optDomainsArray.push(fastHash(value)); - } - } - } - - if (optDomainsArray.length > 0) { - optDomains = new Uint32Array(optDomainsArray); - } - - if (optNotDomainsArray.length > 0) { - optNotDomains = new Uint32Array(optNotDomainsArray); - } - - break; - } - case 'badfilter': - // TODO - how to handle those, if we start in mask, then the id will - // differ from the other filter. We could keep original line. How do - // to eliminate thos efficiently? They will probably endup in the same - // bucket, so maybe we could do that on a per-bucket basis? - return null; - case 'important': - // Note: `negation` should always be `false` here. - if (negation) { - return null; - } - - mask = setBit(mask, NETWORK_FILTER_MASK.isImportant); - break; - case 'match-case': - // Note: `negation` should always be `false` here. 
- if (negation) { - return null; - } - - mask = setBit(mask, NETWORK_FILTER_MASK.matchCase); - break; - case 'third-party': - if (negation) { - // ~third-party means we should clear the flag - mask = clearBit(mask, NETWORK_FILTER_MASK.thirdParty); - } else { - // third-party means ~first-party - mask = clearBit(mask, NETWORK_FILTER_MASK.firstParty); - } - break; - case 'first-party': - if (negation) { - // ~first-party means we should clear the flag - mask = clearBit(mask, NETWORK_FILTER_MASK.firstParty); - } else { - // first-party means ~third-party - mask = clearBit(mask, NETWORK_FILTER_MASK.thirdParty); - } - break; - case 'fuzzy': - mask = setBit(mask, NETWORK_FILTER_MASK.fuzzyMatch); - break; - case 'collapse': - break; - case 'bug': - bug = parseInt(optionValue, 10); - break; - case 'redirect': - // Negation of redirection doesn't make sense - if (negation) { - return null; - } - - // Ignore this filter if no redirection resource is specified - if (optionValue.length === 0) { - return null; - } - - redirect = optionValue; - break; - case 'csp': - mask = setBit(mask, NETWORK_FILTER_MASK.isCSP); - if (optionValue.length > 0) { - csp = optionValue; - } - break; - default: { - // Handle content type options separatly - let optionMask: number = 0; - switch (option) { - case 'image': - optionMask = NETWORK_FILTER_MASK.fromImage; - break; - case 'media': - optionMask = NETWORK_FILTER_MASK.fromMedia; - break; - case 'object': - optionMask = NETWORK_FILTER_MASK.fromObject; - break; - case 'object-subrequest': - optionMask = NETWORK_FILTER_MASK.fromObject; - break; - case 'other': - optionMask = NETWORK_FILTER_MASK.fromOther; - break; - case 'ping': - case 'beacon': - optionMask = NETWORK_FILTER_MASK.fromPing; - break; - case 'script': - optionMask = NETWORK_FILTER_MASK.fromScript; - break; - case 'stylesheet': - optionMask = NETWORK_FILTER_MASK.fromStylesheet; - break; - case 'subdocument': - optionMask = NETWORK_FILTER_MASK.fromSubdocument; - break; - case 
'xmlhttprequest': - case 'xhr': - optionMask = NETWORK_FILTER_MASK.fromXmlHttpRequest; - break; - case 'websocket': - optionMask = NETWORK_FILTER_MASK.fromWebsocket; - break; - case 'font': - optionMask = NETWORK_FILTER_MASK.fromFont; - break; - default: - return null; - } - - // Disable this filter if we don't support all the options - if (optionMask === 0) { - return null; - } - - // We got a valid cpt option, update mask - if (negation) { - cptMaskNegative = clearBit(cptMaskNegative, optionMask); - } else { - cptMaskPositive = setBit(cptMaskPositive, optionMask); - } - break; - } - } - } - // End of option parsing - // --------------------------------------------------------------------- // - } - - if (cptMaskPositive === 0) { - mask |= cptMaskNegative; - } else if (cptMaskNegative === FROM_ANY) { - mask |= cptMaskPositive; - } else { - mask |= cptMaskPositive & cptMaskNegative; - } - - // Identify kind of pattern - - // Deal with hostname pattern - if (line[filterIndexEnd - 1] === '|') { - mask = setBit(mask, NETWORK_FILTER_MASK.isRightAnchor); - filterIndexEnd -= 1; - } - - if (fastStartsWithFrom(line, '||', filterIndexStart)) { - mask = setBit(mask, NETWORK_FILTER_MASK.isHostnameAnchor); - filterIndexStart += 2; - } else if (line[filterIndexStart] === '|') { - mask = setBit(mask, NETWORK_FILTER_MASK.isLeftAnchor); - filterIndexStart += 1; - } - - const isRegex = checkIsRegex(line, filterIndexStart, filterIndexEnd); - mask = setNetworkMask(mask, NETWORK_FILTER_MASK.isRegex, isRegex); - - if (getBit(mask, NETWORK_FILTER_MASK.isHostnameAnchor)) { - if (isRegex) { - // Split at the first '/', '*' or '^' character to get the hostname - // and then the pattern. - // TODO - this could be made more efficient if we could match between two - // indices. Once again, we have to do more work than is really needed. 
- const firstSeparator = line.search(SEPARATOR); - - if (firstSeparator !== -1) { - hostname = line.slice(filterIndexStart, firstSeparator); - filterIndexStart = firstSeparator; - - // If the only symbol remaining for the selector is '^' then ignore it - // but set the filter as right anchored since there should not be any - // other label on the right - if (filterIndexEnd - filterIndexStart === 1 && line[filterIndexStart] === '^') { - mask = clearBit(mask, NETWORK_FILTER_MASK.isRegex); - filterIndexStart = filterIndexEnd; - mask = setNetworkMask(mask, NETWORK_FILTER_MASK.isRightAnchor, true); - } else { - mask = setNetworkMask(mask, NETWORK_FILTER_MASK.isLeftAnchor, true); - mask = setNetworkMask( - mask, - NETWORK_FILTER_MASK.isRegex, - checkIsRegex(line, filterIndexStart, filterIndexEnd), - ); - } - } - } else { - // Look for next / - const slashIndex = line.indexOf('/', filterIndexStart); - if (slashIndex !== -1) { - hostname = line.slice(filterIndexStart, slashIndex); - filterIndexStart = slashIndex; - mask = setBit(mask, NETWORK_FILTER_MASK.isLeftAnchor); - } else { - hostname = line.slice(filterIndexStart, filterIndexEnd); - filterIndexStart = filterIndexEnd; - } - } - } - - // Remove trailing '*' - if (filterIndexEnd - filterIndexStart > 0 && line[filterIndexEnd - 1] === '*') { - filterIndexEnd -= 1; - } - - // Remove leading '*' if the filter is not hostname anchored. 
- if (filterIndexEnd - filterIndexStart > 0 && line[filterIndexStart] === '*') { - mask = clearBit(mask, NETWORK_FILTER_MASK.isLeftAnchor); - filterIndexStart += 1; - } - - // Transform filters on protocol (http, https, ws) - if (getBit(mask, NETWORK_FILTER_MASK.isLeftAnchor)) { - if ( - filterIndexEnd - filterIndexStart === 5 && - fastStartsWithFrom(line, 'ws://', filterIndexStart) - ) { - mask = setBit(mask, NETWORK_FILTER_MASK.fromWebsocket); - mask = clearBit(mask, NETWORK_FILTER_MASK.isLeftAnchor); - filterIndexStart = filterIndexEnd; - } else if ( - filterIndexEnd - filterIndexStart === 7 && - fastStartsWithFrom(line, 'http://', filterIndexStart) - ) { - mask = setBit(mask, NETWORK_FILTER_MASK.fromHttp); - mask = clearBit(mask, NETWORK_FILTER_MASK.fromHttps); - mask = clearBit(mask, NETWORK_FILTER_MASK.isLeftAnchor); - filterIndexStart = filterIndexEnd; - } else if ( - filterIndexEnd - filterIndexStart === 8 && - fastStartsWithFrom(line, 'https://', filterIndexStart) - ) { - mask = setBit(mask, NETWORK_FILTER_MASK.fromHttps); - mask = clearBit(mask, NETWORK_FILTER_MASK.fromHttp); - mask = clearBit(mask, NETWORK_FILTER_MASK.isLeftAnchor); - filterIndexStart = filterIndexEnd; - } else if ( - filterIndexEnd - filterIndexStart === 8 && - fastStartsWithFrom(line, 'http*://', filterIndexStart) - ) { - mask = setBit(mask, NETWORK_FILTER_MASK.fromHttps); - mask = setBit(mask, NETWORK_FILTER_MASK.fromHttp); - mask = clearBit(mask, NETWORK_FILTER_MASK.isLeftAnchor); - filterIndexStart = filterIndexEnd; - } - } - - let filter: string | undefined; - if (filterIndexEnd - filterIndexStart > 0) { - filter = line.slice(filterIndexStart, filterIndexEnd).toLowerCase(); - mask = setNetworkMask( - mask, - NETWORK_FILTER_MASK.isRegex, - checkIsRegex(filter, 0, filter.length), - ); - } - - // TODO - // - ignore hostname anchor is not hostname provided - - if (hostname !== undefined) { - if (getBit(mask, NETWORK_FILTER_MASK.isHostnameAnchor) && fastStartsWith(hostname, 'www.')) { - 
hostname = hostname.slice(4); - } - hostname = hostname.toLowerCase(); - if (hasUnicode(hostname)) { - hostname = punycode.toASCII(hostname); - } - } - - return new NetworkFilter({ - bug, - csp, - filter, - hostname, - mask, - optDomains, - optNotDomains, - redirect, - }); -} diff --git a/src/resources.ts b/src/resources.ts new file mode 100644 index 0000000000..c8be0a4833 --- /dev/null +++ b/src/resources.ts @@ -0,0 +1,113 @@ +import StaticDataView from './data-view'; + +interface IResource { + contentType: string; + data: string; +} + +/** + * Abstraction on top of resources.txt used for redirections as well as script + * injections. It contains logic to parse, serialize and get resources by name + * for use in the engine. + */ +export default class Resources { + public static deserialize(buffer: StaticDataView): Resources { + const checksum = buffer.getASCII(); + + // Deserialize `resources` + const resources: Map = new Map(); + const numberOfResources = buffer.getUint8(); + for (let i = 0; i < numberOfResources; i += 1) { + resources.set(buffer.getASCII(), { + contentType: buffer.getASCII(), + data: buffer.getASCII(), + }); + } + + // Deserialize `js` + const js: Map = new Map(); + resources.forEach(({ contentType, data }, name) => { + if (contentType === 'application/javascript') { + js.set(name, data); + } + }); + + return new Resources({ + checksum, + js, + resources, + }); + } + + public static parse(data: string, { checksum }: { checksum: string }): Resources { + const typeToResource = new Map(); + const trimComments = (str: string) => str.replace(/^\s*#.*$/gm, ''); + const chunks = data.split('\n\n'); + + for (let i = 0; i < chunks.length; i += 1) { + const resource = trimComments(chunks[i]).trim(); + if (resource.length !== 0) { + const firstNewLine = resource.indexOf('\n'); + const [name, type] = resource.slice(0, firstNewLine).split(/\s+/); + const body = resource.slice(firstNewLine + 1); + + if (name === undefined || type === undefined || body === 
undefined) { + continue; + } + + if (!typeToResource.has(type)) { + typeToResource.set(type, new Map()); + } + typeToResource.get(type).set(name, body); + } + } + + // the resource containing javascripts to be injected + const js = typeToResource.get('application/javascript'); + + // Create a mapping from resource name to { contentType, data } + // used for request redirection. + const resourcesByName: Map = new Map(); + typeToResource.forEach((resources, contentType) => { + resources.forEach((resource: string, name: string) => { + resourcesByName.set(name, { + contentType, + data: resource, + }); + }); + }); + + return new Resources({ + checksum, + js, + resources: resourcesByName, + }); + } + + public readonly checksum: string; + public readonly js: Map; + public readonly resources: Map; + + constructor({ checksum = '', js = new Map(), resources = new Map() }: Partial = {}) { + this.checksum = checksum; + this.js = js; + this.resources = resources; + } + + public getResource(name: string): IResource | undefined { + return this.resources.get(name); + } + + public serialize(buffer: StaticDataView): void { + // Serialize `checksum` + buffer.pushASCII(this.checksum); + + // Serialize `resources` + buffer.pushUint8(this.resources.size); + this.resources.forEach(({ contentType, data }, name) => { + buffer.pushASCII(name); + buffer.pushASCII(contentType); + buffer.pushASCII(data); + }); + } +} diff --git a/src/serialization.ts b/src/serialization.ts deleted file mode 100644 index b59211f8a7..0000000000 --- a/src/serialization.ts +++ /dev/null @@ -1,517 +0,0 @@ -/** - * This modules contains all functions and utils to serialize the adblocker - * efficiently. The central part if `StaticDataView`. 
- */ - -import StaticDataView from './data-view'; -import CosmeticFilterBucket from './engine/bucket/cosmetics'; -import NetworkFilterBucket from './engine/bucket/network'; -import Engine from './engine/engine'; -import IList from './engine/list'; -import ReverseIndex, { IBucket, newBucket } from './engine/reverse-index'; -import { CosmeticFilter } from './parsing/cosmetic-filter'; -import IFilter from './parsing/interface'; -import { NetworkFilter } from './parsing/network-filter'; - -export const ENGINE_VERSION = 16; - -/** - * To allow for a more compact representation of network filters, the - * representation is composed of a mandatory header, and some optional - * - * Header: - * ======= - * - * | opt | mask - * 8 32 - * - * For an empty filter having no pattern, hostname, the minimum size is: 42 bits. - * - * Then for each optional part (filter, hostname optDomains, optNotDomains, - * redirect), it takes 16 bits for the length of the string + the length of the - * string in bytes. - * - * The optional parts are written in order of there number of occurrence in the - * filter list used by the adblocker. The most common being `hostname`, then - * `filter`, `optDomains`, `optNotDomains`, `redirect`. - * - * Example: - * ======== - * - * @@||cliqz.com would result in a serialized version: - * - * | 1 | mask | 9 | c | l | i | q | z | . | c | o | m (16 bytes) - * - * In this case, the serialized version is actually bigger than the original - * filter, but faster to deserialize. In the future, we could optimize the - * representation to compact small filters better. - * - * Ideas: - * * variable length encoding for the mask (if not option, take max 1 byte). - * * first byte could contain the mask as well if small enough. - * * when packing ascii string, store several of them in each byte. 
- */ -function serializeNetworkFilter(filter: NetworkFilter, buffer: StaticDataView): void { - buffer.pushUint32(filter.getId()); - buffer.pushUint32(filter.mask); - - const index = buffer.getPos(); - buffer.pushUint8(0); - - // This bit-mask indicates which optional parts of the filter were serialized. - let optionalParts = 0; - - if (filter.bug !== undefined) { - optionalParts |= 1; - buffer.pushUint16(filter.bug); - } - - if (filter.isCSP()) { - optionalParts |= 2; - buffer.pushASCII(filter.csp); - } - - if (filter.hasFilter()) { - optionalParts |= 4; - buffer.pushASCII(filter.filter); - } - - if (filter.hasHostname()) { - optionalParts |= 8; - buffer.pushASCII(filter.hostname); - } - - if (filter.hasOptDomains()) { - optionalParts |= 16; - buffer.pushUint32Array(filter.optDomains); - } - - if (filter.hasOptNotDomains()) { - optionalParts |= 32; - buffer.pushUint32Array(filter.optNotDomains); - } - - if (filter.isRedirect()) { - optionalParts |= 64; - buffer.pushASCII(filter.redirect); - } - - buffer.setByte(index, optionalParts); -} - -/** - * Deserialize network filters. The code accessing the buffer should be - * symetrical to the one in `serializeNetworkFilter`. - */ -function deserializeNetworkFilter(buffer: StaticDataView): NetworkFilter { - const id = buffer.getUint32(); - const mask = buffer.getUint32(); - const optionalParts = buffer.getUint8(); - - // The order of these statements is important. Since `buffer.getX()` will - // internally increment the position of next byte to read, they need to be - // retrieved in the exact same order they were serialized (check - // `serializeNetworkFilter`). - return new NetworkFilter({ - // Mandatory fields - id, - mask, - - // Optional parts - bug: (optionalParts & 1) === 1 ? buffer.getUint16() : undefined, - csp: (optionalParts & 2) === 2 ? buffer.getASCII() : undefined, - filter: (optionalParts & 4) === 4 ? buffer.getASCII() : undefined, - hostname: (optionalParts & 8) === 8 ? 
buffer.getASCII() : undefined, - optDomains: (optionalParts & 16) === 16 ? buffer.getUint32Array() : undefined, - optNotDomains: (optionalParts & 32) === 32 ? buffer.getUint32Array() : undefined, - redirect: (optionalParts & 64) === 64 ? buffer.getASCII() : undefined, - }); -} - -/** - * The format of a cosmetic filter is: - * - * | mask | selector length | selector... | hostnames length | hostnames... - * 32 16 16 - * - * The header (mask) is 32 bits, then we have a total of 32 bits to store the - * length of `selector` and `hostnames` (16 bits each). - * - * Improvements similar to the onces mentioned in `serializeNetworkFilters` - * could be applied here, to get a more compact representation. - */ -function serializeCosmeticFilter(filter: CosmeticFilter, buffer: StaticDataView): void { - buffer.pushASCII(filter.hostnames); - buffer.pushUint32(filter.getId()); - buffer.pushUint8(filter.mask); - buffer.pushUTF8(filter.selector); - buffer.pushASCII(filter.style); -} - -/** - * Deserialize cosmetic filters. The code accessing the buffer should be - * symetrical to the one in `serializeCosmeticFilter`. - */ -function deserializeCosmeticFilter(buffer: StaticDataView): CosmeticFilter { - // The order of these fields should be the same as when we serialize them. 
- return new CosmeticFilter({ - hostnames: buffer.getASCII(), - id: buffer.getUint32(), - mask: buffer.getUint8(), - selector: buffer.getUTF8(), - style: buffer.getASCII(), - }); -} - -function serializeNetworkFilters(filters: NetworkFilter[], buffer: StaticDataView): void { - buffer.pushUint32(filters.length); - for (let i = 0; i < filters.length; i += 1) { - serializeNetworkFilter(filters[i], buffer); - } -} - -function serializeCosmeticFilters(filters: CosmeticFilter[], buffer: StaticDataView): void { - buffer.pushUint32(filters.length); - for (let i = 0; i < filters.length; i += 1) { - serializeCosmeticFilter(filters[i], buffer); - } -} - -function deserializeNetworkFilters( - buffer: StaticDataView, - allFilters: Map, -): NetworkFilter[] { - const length = buffer.getUint32(); - const filters: NetworkFilter[] = []; - for (let i = 0; i < length; i += 1) { - const filter = deserializeNetworkFilter(buffer); - filters.push(filter); - allFilters.set(filter.getId(), filter); - } - - return filters; -} - -function deserializeCosmeticFilters( - buffer: StaticDataView, - allFilters: Map, -): CosmeticFilter[] { - const length = buffer.getUint32(); - const filters: CosmeticFilter[] = []; - for (let i = 0; i < length; i += 1) { - const filter = deserializeCosmeticFilter(buffer); - filters.push(filter); - allFilters.set(filter.getId(), filter); - } - - return filters; -} - -function serializeLists(buffer: StaticDataView, lists: Map): void { - // Serialize number of lists - buffer.pushUint8(lists.size); - - lists.forEach((list, asset) => { - buffer.pushASCII(asset); - buffer.pushASCII(list.checksum); - serializeCosmeticFilters(list.cosmetics, buffer); - serializeNetworkFilters(list.csp, buffer); - serializeNetworkFilters(list.exceptions, buffer); - serializeNetworkFilters(list.filters, buffer); - serializeNetworkFilters(list.importants, buffer); - serializeNetworkFilters(list.redirects, buffer); - }); -} - -function deserializeLists( - buffer: StaticDataView, -): { - 
cosmeticFilters: Map; - networkFilters: Map; - lists: Map; -} { - const lists = new Map(); - const networkFilters = new Map(); - const cosmeticFilters = new Map(); - - // Get number of assets - const size = buffer.getUint8(); - for (let i = 0; i < size; i += 1) { - lists.set(buffer.getASCII(), { - checksum: buffer.getASCII(), - cosmetics: deserializeCosmeticFilters(buffer, cosmeticFilters), - csp: deserializeNetworkFilters(buffer, networkFilters), - exceptions: deserializeNetworkFilters(buffer, networkFilters), - filters: deserializeNetworkFilters(buffer, networkFilters), - importants: deserializeNetworkFilters(buffer, networkFilters), - redirects: deserializeNetworkFilters(buffer, networkFilters), - }); - } - - return { - cosmeticFilters, - lists, - networkFilters, - }; -} - -function serializeListOfFilter(filters: T[], buffer: StaticDataView) { - buffer.pushUint16(filters.length); - for (let i = 0; i < filters.length; i += 1) { - buffer.pushUint32(filters[i].getId()); - } -} - -function serializeBucket(token: number, filters: T[], buffer: StaticDataView) { - buffer.pushUint32(token); - serializeListOfFilter(filters, buffer); -} - -function deserializeListOfFilters( - buffer: StaticDataView, - filters: Map, -): T[] { - const bucket: T[] = []; - const length = buffer.getUint16(); - - for (let i = 0; i < length; i += 1) { - const filter = filters.get(buffer.getUint32()); - if (filter !== undefined) { - bucket.push(filter); - } - } - - return bucket; -} - -function deserializeBucket( - buffer: StaticDataView, - filters: Map, -): { - token: number; - bucket: IBucket; -} { - const token = buffer.getUint32(); - - return { - bucket: newBucket(deserializeListOfFilters(buffer, filters)), - token, - }; -} - -function serializeReverseIndex( - reverseIndex: ReverseIndex, - buffer: StaticDataView, -): void { - const index = reverseIndex.index; - - buffer.pushUint32(reverseIndex.size); - buffer.pushUint32(index.size); - - index.forEach((bucket, token) => { - 
serializeBucket(token, bucket.originals || bucket.filters, buffer); - }); -} - -function deserializeReverseIndex( - buffer: StaticDataView, - index: ReverseIndex, - filters: Map, -): ReverseIndex { - const deserializedIndex = new Map(); - - const size = buffer.getUint32(); - const numberOfTokens = buffer.getUint32(); - - for (let i = 0; i < numberOfTokens; i += 1) { - const { token, bucket } = deserializeBucket(buffer, filters); - deserializedIndex.set(token, bucket); - } - - index.index = deserializedIndex; - index.size = size; - - return index; -} - -function serializeNetworkFilterBucket(bucket: NetworkFilterBucket, buffer: StaticDataView): void { - buffer.pushASCII(bucket.name); - buffer.pushUint8(Number(bucket.enableOptimizations)); - serializeReverseIndex(bucket.index, buffer); -} - -function deserializeNetworkFilterBucket( - buffer: StaticDataView, - filters: Map, -): NetworkFilterBucket { - const bucket = new NetworkFilterBucket( - buffer.getASCII() || '', - undefined, - Boolean(buffer.getUint8()), - ); - bucket.index = deserializeReverseIndex(buffer, bucket.index, filters); - bucket.size = bucket.index.size; - return bucket; -} - -function serializeCosmeticFilterBucket( - bucket: CosmeticFilterBucket, - buffer: StaticDataView, -): void { - serializeListOfFilter(bucket.genericRules, buffer); - serializeReverseIndex(bucket.hostnameIndex, buffer); -} - -function deserializeCosmeticFilterBucket( - buffer: StaticDataView, - filters: Map, -): CosmeticFilterBucket { - const bucket = new CosmeticFilterBucket(); - bucket.genericRules = deserializeListOfFilters(buffer, filters); - bucket.hostnameIndex = deserializeReverseIndex( - buffer, - bucket.hostnameIndex, - filters, - ); - bucket.size = bucket.hostnameIndex.size + bucket.genericRules.length; - return bucket; -} - -function serializeResources(engine: Engine, buffer: StaticDataView): void { - // Serialize `resourceChecksum` - buffer.pushASCII(engine.resourceChecksum); - - // Serialize `resources` - 
buffer.pushUint8(engine.resources.size); - engine.resources.forEach(({ contentType, data }, name) => { - buffer.pushASCII(name); - buffer.pushASCII(contentType); - buffer.pushASCII(data); - }); -} - -function deserializeResources( - buffer: StaticDataView, -): { - js: Map; - resources: Map; - resourceChecksum: string; -} { - const js = new Map(); - const resources = new Map(); - const resourceChecksum = buffer.getASCII() || ''; - - // Deserialize `resources` - const resourcesSize = buffer.getUint8(); - for (let i = 0; i < resourcesSize; i += 1) { - resources.set(buffer.getASCII(), { - contentType: buffer.getASCII(), - data: buffer.getASCII(), - }); - } - - // Deserialize `js` - resources.forEach(({ contentType, data }, name) => { - if (contentType === 'application/javascript') { - js.set(name, data); - } - }); - - return { - js, - resourceChecksum, - resources, - }; -} - -/** - * Creates a string representation of the full engine. It can be stored - * on-disk for faster loading of the adblocker. The `load` method of a - * `Engine` instance can be used to restore the engine *in-place*. - */ -function serializeEngine(engine: Engine): Uint8Array { - // Create a big buffer! It does not have to be the right size since - // `StaticDataView` is able to resize itself dynamically if needed. 
- const buffer = new StaticDataView(8000000); - - buffer.pushUint8(ENGINE_VERSION); - buffer.pushUint8(Number(engine.enableOptimizations)); - buffer.pushUint8(Number(engine.loadCosmeticFilters)); - buffer.pushUint8(Number(engine.loadNetworkFilters)); - buffer.pushUint8(Number(engine.optimizeAOT)); - - buffer.pushUint32(Number(engine.size)); - - // Resources (js, resources) - serializeResources(engine, buffer); - - // Lists - serializeLists(buffer, engine.lists); - - // Buckets - serializeNetworkFilterBucket(engine.filters, buffer); - serializeNetworkFilterBucket(engine.exceptions, buffer); - serializeNetworkFilterBucket(engine.importants, buffer); - serializeNetworkFilterBucket(engine.redirects, buffer); - serializeNetworkFilterBucket(engine.csp, buffer); - - serializeCosmeticFilterBucket(engine.cosmetics, buffer); - - return buffer.crop(); -} - -function deserializeEngine(serialized: Uint8Array): Engine { - const buffer = new StaticDataView(0, serialized); - - // Before starting deserialization, we make sure that the version of the - // serialized engine is the same as the current source code. If not, we start - // fresh and create a new engine from the lists. 
- const serializedEngineVersion = buffer.getUint8(); - if (ENGINE_VERSION !== serializedEngineVersion) { - throw new Error('serialized engine version mismatch'); - } - - // Create a new engine with same options - const options = { - enableOptimizations: Boolean(buffer.getUint8()), - loadCosmeticFilters: Boolean(buffer.getUint8()), - loadNetworkFilters: Boolean(buffer.getUint8()), - optimizeAOT: Boolean(buffer.getUint8()), - }; - const engine = new Engine(options); - engine.size = buffer.getUint32(); - - // Deserialize resources - const { js, resources, resourceChecksum } = deserializeResources(buffer); - - engine.js = js; - engine.resources = resources; - engine.resourceChecksum = resourceChecksum; - - // Deserialize lists + filters - const { lists, networkFilters, cosmeticFilters } = deserializeLists(buffer); - - engine.lists = lists; - - // Deserialize buckets - engine.filters = deserializeNetworkFilterBucket(buffer, networkFilters); - engine.exceptions = deserializeNetworkFilterBucket(buffer, networkFilters); - engine.importants = deserializeNetworkFilterBucket(buffer, networkFilters); - engine.redirects = deserializeNetworkFilterBucket(buffer, networkFilters); - engine.csp = deserializeNetworkFilterBucket(buffer, networkFilters); - engine.cosmetics = deserializeCosmeticFilterBucket(buffer, cosmeticFilters); - - return engine; -} - -export { - IFilter, - serializeNetworkFilter, - deserializeNetworkFilter, - serializeCosmeticFilter, - deserializeCosmeticFilter, - serializeReverseIndex, - deserializeReverseIndex, - serializeEngine, - deserializeEngine, -}; diff --git a/src/utils.ts b/src/utils.ts index 8666c7a013..0dae1d7f87 100644 --- a/src/utils.ts +++ b/src/utils.ts @@ -4,6 +4,13 @@ import { compactTokens } from './compact-set'; * Bitwise helpers * ************************************************************************* */ +// From: https://stackoverflow.com/a/43122214/1185079 +export function bitCount(n: number): number { + n = n - ((n >> 1) & 0x55555555); + 
n = (n & 0x33333333) + ((n >> 2) & 0x33333333); + return (((n + (n >> 4)) & 0xf0f0f0f) * 0x1010101) >> 24; +} + export function getBit(n: number, mask: number): boolean { return !!(n & mask); } @@ -203,18 +210,9 @@ export function createFuzzySignature(pattern: string): Uint32Array { return compactTokens(new Uint32Array(fastTokenizer(pattern, isAllowedFilter))); } -export function binSearch(arr: Uint32Array, elt: number): boolean { - // TODO - check most common case? +export function binSearch(arr: Uint32Array, elt: number): number { if (arr.length === 0) { - return false; - } - - if (arr.length === 1) { - return arr[0] === elt; - } - - if (arr.length === 2) { - return arr[0] === elt || arr[1] === elt; + return -1; } let low = 0; @@ -228,10 +226,15 @@ export function binSearch(arr: Uint32Array, elt: number): boolean { } else if (midVal > elt) { high = mid - 1; } else { - return true; + return mid; } } - return false; + + return -1; +} + +export function binLookup(arr: Uint32Array, elt: number): boolean { + return binSearch(arr, elt) !== -1; } export function updateResponseHeadersWithCSP( diff --git a/test/compact-set.test.ts b/test/compact-set.test.ts index 257495daf2..dd0b00eecb 100644 --- a/test/compact-set.test.ts +++ b/test/compact-set.test.ts @@ -18,11 +18,13 @@ it('#compactTokens', () => { it('#hasEmptyIntersection', () => { expect(hasEmptyIntersection(a`abcde`, a`efgh`)).toEqual(false); + expect(hasEmptyIntersection(a`efgh`, a`abcde`)).toEqual(false); expect(hasEmptyIntersection(a`bcde`, a`aefgh`)).toEqual(false); expect(hasEmptyIntersection(a`abcde`, a`fgh`)).toEqual(true); expect(hasEmptyIntersection(a``, a``)).toEqual(true); expect(hasEmptyIntersection(a`abc`, a``)).toEqual(true); expect(hasEmptyIntersection(a``, a`abc`)).toEqual(true); + expect(hasEmptyIntersection(a``, a`abc`)).toEqual(true); }); it('#mergeCompactSets', () => { diff --git a/test/data/requests.ts b/test/data/requests.ts index d253e780e6..ca470b224f 100644 --- a/test/data/requests.ts +++ 
b/test/data/requests.ts @@ -90,6 +90,7 @@ export default [ filters: [ '|https://$image,xmlhttprequest,domain=pornhub.com|redtube.com|redtube.com.br|tube8.com|tube8.es|tube8.fr|youporn.com|youporngay.com', '@@||phncdn.com^$image,object-subrequest,other,domain=pornhub.com|redtube.com|redtube.com.br|tube8.com|tube8.es|tube8.fr|youporn.com|youporngay.com', + '@@||phncdn.com^$image,media,object,stylesheet,domain=gaytube.com|pornhub.com|redtube.com|redtube.it|tube8.com|tube8.es|tube8.fr|xtube.com|youjizz.com|youporn.com|youporngay.com', ], sourceUrl: 'https://www.pornhub.com', type: 'image', diff --git a/test/engine.test.ts b/test/engine.test.ts index ebb26ad9a8..83ad2739b8 100644 --- a/test/engine.test.ts +++ b/test/engine.test.ts @@ -1,35 +1,189 @@ import * as tldts from 'tldts'; import Engine from '../src/engine/engine'; -import { CosmeticFilter } from '../src/parsing/cosmetic-filter'; -import { makeRequest } from '../src/request'; +import NetworkFilter from '../src/filters/network'; +import Request, { makeRequest } from '../src/request'; +import Resources from '../src/resources'; import requests from './data/requests'; -function createEngine(filters: string, enableOptimizations: boolean = true) { - const newEngine = new Engine({ - enableOptimizations, - loadCosmeticFilters: true, - loadNetworkFilters: true, - optimizeAOT: true, +/** + * Helper function used in the Engine tests. All the assertions are performed by + * this function. It will be called to test the different configurations of + * engines, for each of the requests and all of the filters. 
+ */ +function test({ + engine, + filter, + testFiltersInIsolation, + resources, + request, + importants, + redirects, + exceptions, + normalFilters, +}: { + engine: Engine; + filter: NetworkFilter; + testFiltersInIsolation: boolean; + resources: Resources; + request: Request; + importants: string[]; + redirects: string[]; + exceptions: string[]; + normalFilters: string[]; +}): void { + it(`[engine] isolation=${testFiltersInIsolation} optimized=${engine.enableOptimizations} ${ + filter.rawLine + }`, () => { + // Set correct resources in `engine` (`resources` is expected to have been + // created using the matching redirect filters for the current Request so + // that all redirect matches will have a corresponding resource in + // `resources`). + engine.resources = resources; + + // Collect all matching filters for this request. + const matchingFilters = new Set(); + [...engine.matchAll(request)].forEach((matchingFilter) => { + (matchingFilter.rawLine || '').split(' <+> ').forEach((f: string) => { + matchingFilters.add(f); + }); + }); + + // Check if one of the filters is a special case: important, + // exception or redirect; and perform extra checks then. 
+ if (filter.isImportant()) { + const result = engine.match(request); + expect(result.filter).not.toBeUndefined(); + if ( + result.filter !== undefined && + result.filter.rawLine !== undefined && + !result.filter.rawLine.includes('<+>') + ) { + expect(importants).toContainEqual(result.filter.rawLine); + + // Handle case where important filter is also a redirect + if (filter.isRedirect()) { + expect(redirects).toContainEqual(result.filter.rawLine); + } + } + + expect(result.exception).toBeUndefined(); + + if (!filter.isRedirect()) { + expect(result.redirect).toBeUndefined(); + } + + expect(result.match).toBeTruthy(); + } else if ( + filter.isException() && + normalFilters.length !== 0 && + !testFiltersInIsolation && + importants.length === 0 + ) { + const result = engine.match(request); + expect(result.exception).not.toBeUndefined(); + if ( + result.exception !== undefined && + result.exception.rawLine !== undefined && + !result.exception.rawLine.includes('<+>') + ) { + expect(exceptions).toContainEqual(result.exception.rawLine); + } + + expect(result.filter).not.toBeUndefined(); + expect(result.redirect).toBeUndefined(); + expect(result.match).toBeFalsy(); + } else if (filter.isRedirect() && exceptions.length === 0 && importants.length === 0) { + const result = engine.match(request); + expect(result.filter).not.toBeUndefined(); + if ( + result.filter !== undefined && + result.filter.rawLine !== undefined && + !result.filter.rawLine.includes('<+>') + ) { + expect(redirects).toContainEqual(result.filter.rawLine); + } + + expect(result.exception).toBeUndefined(); + expect(result.redirect).not.toBeUndefined(); + expect(result.match).toBeTruthy(); + } + + expect(matchingFilters).toContain(filter.rawLine); }); +} - newEngine.onUpdateFilters( - [ - { - asset: 'filters', - checksum: '', - filters, - }, - ], - new Set(), - true, - ); +function buildResourcesFromRequests(filters: NetworkFilter[]): Resources { + const resources: string[] = []; - return newEngine; + 
filters.forEach((filter) => { + if (filter.redirect !== undefined) { + const redirect = filter.redirect; + + // Guess resource type + let type = 'application/javascript'; + if (redirect.endsWith('.gif')) { + type = 'image/gif;base64'; + } + + resources.push(`${redirect} ${type}\n${redirect}`); + } + }); + + return Resources.parse(resources.join('\n\n'), { checksum: '' }); +} + +function createEngine(filters: string, enableOptimizations: boolean = true) { + return Engine.parse(filters, { + debug: true, + enableOptimizations, + }); } describe('#FiltersEngine', () => { + it('network filters are disabled', () => { + const request = makeRequest({ url: 'https://foo.com' }, tldts); + + // Enabled + expect( + Engine.parse('||foo.com', { loadNetworkFilters: true }).match(request).match, + ).toBeTruthy(); + + // Disabled + expect( + Engine.parse('||foo.com', { loadNetworkFilters: false }).match(request).match, + ).toBeFalsy(); + }); + + it('cosmetic filters are disabled', () => { + // Enabled + expect( + Engine.parse('##.selector', { loadCosmeticFilters: true }).getCosmeticsFilters( + 'foo.com', // hostname + 'foo.com', // domain + ), + ).toEqual({ + active: true, + blockedScripts: [], + scripts: [], + styles: '.selector { display: none !important; }', + }); + + // Disabled + expect( + Engine.parse('##.selector', { loadCosmeticFilters: false }).getCosmeticsFilters( + 'foo.com', // hostname + 'foo.com', // domain + ), + ).toEqual({ + active: false, + blockedScripts: [], + scripts: [], + styles: '', + }); + }); + describe('filters with bug id', () => { it('matches bug filter', () => { const filter = createEngine('||foo.com$bug=42').match( @@ -96,6 +250,46 @@ describe('#FiltersEngine', () => { ).toBeUndefined(); }); + it('network filters are disabled', () => { + expect( + Engine.parse('||foo.com$csp=bar', { loadNetworkFilters: false }).getCSPDirectives( + makeRequest( + { + url: 'https://foo.com', + }, + tldts, + ), + ), + ).toBeUndefined(); + }); + + it('request not 
supported', () => { + // Not supported protocol + expect( + Engine.parse('||foo.com$csp=bar').getCSPDirectives( + makeRequest( + { + url: 'ftp://foo.com', + }, + tldts, + ), + ), + ).toBeUndefined(); + + // Not document request + expect( + Engine.parse('||foo.com$csp=bar').getCSPDirectives( + makeRequest( + { + type: 'script', + url: 'ftp://foo.com', + }, + tldts, + ), + ), + ).toBeUndefined(); + }); + it('does not match request', () => { expect( createEngine('||bar.com$csp=bar').getCSPDirectives( @@ -181,205 +375,288 @@ $csp=baz,domain=bar.com }); describe('network filters', () => { + // Collect all filters from all requests in the dataset. Each test case + // contains one request as well as a list of filters matching this request + // (exceptions, normal filters, etc.). We create a big list of filters out + // of them. const allRequestFilters = requests.map(({ filters }) => filters.join('\n')).join('\n'); - [ - { enableOptimizations: true, allFilters: '' }, - { enableOptimizations: false, allFilters: '' }, - { enableOptimizations: true, allFilters: allRequestFilters }, - { enableOptimizations: false, allFilters: allRequestFilters }, - ].forEach((setup) => { - describe(`initialized with optimization: ${ - setup.enableOptimizations - } and filters: ${!!setup.allFilters}`, () => { - const engine = createEngine(setup.allFilters, setup.enableOptimizations); - - requests.forEach(({ filters, type, url, sourceUrl }) => { - filters.forEach((filter) => { - it(`${filter}, ${type} matches url=${url}, sourceUrl=${sourceUrl}`, () => { - // Update engine with this specific filter only if the engine is - // initially empty. 
- if (setup.allFilters.length === 0) { - engine.onUpdateFilters( - [ - { - asset: 'extraFilters', - checksum: '', - filters: filter, - }, - ], - new Set(['filters']), - true, - ); - } - - const matchingFilters = new Set(); - [ - ...engine.matchAll( - makeRequest( - { - sourceUrl, - type, - url, - }, - tldts, - ), - ), - ].forEach((optimizedFilter) => { - (optimizedFilter.rawLine || '').split(' <+> ').forEach((f: string) => { - matchingFilters.add(f); - }); - }); - - expect(matchingFilters).toContain(filter); - }); + // Create several base engines to be used in different scenarii: + // - Engine with *no filter* optimizations *enabled* + // - Engine with *no filter* optimizations *disabled* + // - Engine with *all filters* optimizations *enabled* + // - Engine with *all filters* optimizations *disabled* + // const engineEmptyOptimized = createEngine('', true); + // const engineEmpty = createEngine('', false); + const engineFullOptimized = createEngine(allRequestFilters, true); + const engineFull = createEngine(allRequestFilters, false); + + // For each request, make sure that we get the correct match in 4 different + // setups: + // - Engine with only the filter being tested + // - Engine with all the filters + // - Engine with optimizations enabled + // - Engine with optimizations disabled + for (let i = 0; i < requests.length; i += 1) { + const { filters, type, url, sourceUrl } = requests[i]; + + // Dispatch `filters` into the following categories: exception, important, + // redirects or normal filters. This will be used later to check the + // output of Engine.match. Additionally, we keep the list of NetworkFilter + // instances. 
+ const exceptions: string[] = []; + const importants: string[] = []; + const redirects: string[] = []; + const normalFilters: string[] = []; + const parsedFilters: NetworkFilter[] = []; + for (let j = 0; j < filters.length; j += 1) { + const filter = filters[j]; + const parsed = NetworkFilter.parse(filter, true); + expect(parsed).not.toBeNull(); + if (parsed !== null) { + parsedFilters.push(parsed); + + if (parsed.isException()) { + exceptions.push(filter); + } + + if (parsed.isImportant()) { + importants.push(filter); + } + + if (parsed.isRedirect()) { + redirects.push(filter); + } + + if (!parsed.isRedirect() && !parsed.isException() && !parsed.isImportant()) { + normalFilters.push(filter); + } + } + } + + // Prepare a fake `resources.txt` created from the list of filters of type + // `redirect` in `filters`. A resource of the right name will be created + // for each of them. + const resources = buildResourcesFromRequests(parsedFilters); + + // Create an instance of `Request` to be shared for all the calls to + // `Engine.match` or `Engine.matchAll`. 
+ const request = makeRequest( + { + sourceUrl, + type, + url, + }, + tldts, + ); + + describe(`[request] type=${type} url=${url}, sourceUrl=${sourceUrl}`, () => { + // Check each filter individually + for (let j = 0; j < parsedFilters.length; j += 1) { + const filter = parsedFilters[j]; + const baseConfig = { + exceptions, + filter, + importants, + normalFilters, + redirects, + request, + resources, + }; + + // Engine with only this filter + test({ + ...baseConfig, + engine: new Engine({ networkFilters: [filter] }), + testFiltersInIsolation: true, }); + + // All filters with optimizations enabled + test({ + ...baseConfig, + engine: engineFullOptimized, + testFiltersInIsolation: false, + }); + + // All filters with optimizations disabled + test({ + ...baseConfig, + engine: engineFull, + testFiltersInIsolation: false, + }); + } + }); + } + }); + + describe('#getCosmeticFilters', () => { + it('handles script blocking', () => { + expect( + Engine.parse('foo.*##script:contains(ads)').getCosmeticsFilters('foo.com', 'foo.com') + .blockedScripts, + ).toEqual(['ads']); + }); + + describe('script injections', () => { + it('injects script', () => { + const engine = Engine.parse('##+js(script.js,arg1)'); + engine.resources = Resources.parse('script.js application/javascript\n{{1}}', { + checksum: '', }); + expect(engine.getCosmeticsFilters('foo.com', 'foo.com').scripts).toEqual(['arg1']); + }); + + it('script missing', () => { + expect( + Engine.parse('##+js(foo,arg1)').getCosmeticsFilters('foo.com', 'foo.com').scripts, + ).toEqual([]); }); }); - }); - describe('cosmetic filters', () => { - describe('hostnames', () => { - [ - { - hostname: 'bild.de', - matches: [], - misMatches: [ - 'bild.de##script:contains(/^s*de.bild.cmsKonfig/)', - 'bild.de#@#script:contains(/^s*de.bild.cmsKonfig/)', - ], - }, - ].forEach((testCase: { hostname: string; matches: string[]; misMatches: string[] }) => { - it(testCase.hostname, () => { - const engine = createEngine( - [...testCase.matches, 
...testCase.misMatches].join('\n'), - true, - ); - - const shouldMatch: Set = new Set(testCase.matches); - const shouldNotMatch: Set = new Set(testCase.misMatches); - - const rules = engine.cosmetics.getCosmeticsFilters( - testCase.hostname, - tldts.getDomain(testCase.hostname) || '', - ); - expect(rules.length).toEqual(shouldMatch.size); - rules.forEach((rule: CosmeticFilter) => { - expect(rule.rawLine).not.toBeNull(); - if (rule.rawLine !== undefined && !shouldMatch.has(rule.rawLine)) { - throw new Error(`Expected node ${testCase.hostname} ` + ` to match ${rule.rawLine}`); + it('handles custom :styles', () => { + expect( + Engine.parse( + ` +##.selector :style(foo) +##.selector :style(bar) +##.selector1 :style(foo)`, + ).getCosmeticsFilters('foo.com', 'foo.com').styles, + ).toEqual('.selector ,.selector1 { foo }\n\n.selector { bar }'); + }); + + // TODO - add more coverage here! + [ + // Exception cancels generic rule + { + hostname: 'google.com', + matches: [], + misMatches: ['##.adwords', 'google.com#@#.adwords'], + }, + // Exception cancels entity rule + { + hostname: 'google.com', + matches: [], + misMatches: ['google.*##.adwords', 'google.com#@#.adwords'], + }, + // Exception cancels hostname rule + { + hostname: 'google.com', + matches: [], + misMatches: ['google.com##.adwords', 'google.com#@#.adwords'], + }, + // Entity exception cancels generic rule + { + hostname: 'google.com', + matches: [], + misMatches: ['##.adwords', 'google.*#@#.adwords'], + }, + // Entity exception cancels entity rule + { + hostname: 'google.com', + matches: [], + misMatches: ['google.*##.adwords', 'google.*#@#.adwords'], + }, + // Exception does not cancel if selector is different + { + hostname: 'google.de', + matches: ['##.adwords2'], + misMatches: ['google.de#@#.adwords'], + }, + { + hostname: 'google.de', + matches: ['google.de##.adwords2'], + misMatches: ['google.de#@#.adwords'], + }, + // Exception does not cancel if hostname is different + { + hostname: 'google.de', + 
matches: ['##.adwords'], + misMatches: ['google.com#@#.adwords'], + }, + { + hostname: 'google.com', + matches: ['##.adwords'], + misMatches: ['accounts.google.com#@#.adwords'], + }, + { + hostname: 'speedtest.net', + matches: ['##.ad-stack'], + misMatches: [], + }, + { + hostname: 'example.de', + matches: ['###AD300Right'], + misMatches: [], + }, + { + hostname: 'pokerupdate.com', + matches: [], + misMatches: [], + }, + { + hostname: 'pokerupdate.com', + matches: ['pokerupdate.com##.related-room', 'pokerupdate.com##.prev-article'], + misMatches: [], + }, + { + hostname: 'google.com', + matches: [ + 'google.com,~mail.google.com##.c[style="margin: 0pt;"]', + '###tads + div + .c', + '##.mw > #rcnt > #center_col > #taw > #tvcap > .c', + '##.mw > #rcnt > #center_col > #taw > .c', + ], + misMatches: [], + }, + { + hostname: 'mail.google.com', + matches: [ + '###tads + div + .c', + '##.mw > #rcnt > #center_col > #taw > #tvcap > .c', + '##.mw > #rcnt > #center_col > #taw > .c', + ], + misMatches: ['google.com,~mail.google.com##.c[style="margin: 0pt;"]'], + }, + { + hostname: 'bitbucket.org', + matches: [], + misMatches: [], + }, + { + hostname: 'bild.de', + matches: [], + misMatches: [ + 'bild.de##script:contains(/^s*de.bild.cmsKonfig/)', + 'bild.de#@#script:contains(/^s*de.bild.cmsKonfig/)', + ], + }, + ].forEach((testCase: { hostname: string; matches: string[]; misMatches: string[] }) => { + it(JSON.stringify(testCase), () => { + // Initialize engine with all rules from test case + const engine = createEngine([...testCase.matches, ...testCase.misMatches].join('\n')); + + const shouldMatch: Set = new Set(testCase.matches); + const shouldNotMatch: Set = new Set(testCase.misMatches); + + // #getCosmeticFilters + const rules = engine.cosmetics.getCosmeticsFilters( + testCase.hostname, + tldts.getDomain(testCase.hostname) || '', + ); + + expect(rules.length).toEqual(shouldMatch.size); + rules.forEach((rule) => { + expect(rule.rawLine).not.toBeUndefined(); + if (rule.rawLine 
!== undefined) { + if (!shouldMatch.has(rule.rawLine)) { + throw new Error(`Expected ${rule.rawLine} to match ${testCase.hostname}`); } - if (rule.rawLine !== undefined && shouldNotMatch.has(rule.rawLine)) { - throw new Error( - `Expected node ${testCase.hostname} ` + ` not to match ${rule.rawLine}`, - ); + if (shouldNotMatch.has(rule.rawLine)) { + throw new Error(`Expected ${rule.rawLine} not to match ${testCase.hostname}`); } - }); + } }); }); }); - - describe('nodes', () => { - [ - { - hostname: 'google.com', - matches: ['##.adwords'], - misMatches: ['accounts.google.com#@#.adwords'], - node: ['.adwords'], - }, - { - hostname: 'speedtest.net', - matches: ['##.ad-stack'], - misMatches: [], - node: ['.ad-stack'], - }, - { - hostname: 'example.de', - matches: ['###AD300Right'], - misMatches: [], - node: ['#AD300Right'], - }, - { - hostname: 'pokerupdate.com', - matches: [], - misMatches: [], - node: ['#not_an_ad'], - }, - { - hostname: 'pokerupdate.com', - matches: ['pokerupdate.com##.related-room', 'pokerupdate.com##.prev-article'], - misMatches: [], - node: ['.related-room', '.prev-article'], - }, - { - hostname: 'google.com', - matches: [ - 'google.com,~mail.google.com##.c[style="margin: 0pt;"]', - '###tads + div + .c', - '##.mw > #rcnt > #center_col > #taw > #tvcap > .c', - '##.mw > #rcnt > #center_col > #taw > .c', - ], - misMatches: [], - node: ['.c'], - }, - { - hostname: 'mail.google.com', - matches: [ - '###tads + div + .c', - '##.mw > #rcnt > #center_col > #taw > #tvcap > .c', - '##.mw > #rcnt > #center_col > #taw > .c', - ], - misMatches: ['google.com,~mail.google.com##.c[style="margin: 0pt;"]'], - node: ['.c'], - }, - { - hostname: 'bitbucket.org', - matches: [], - misMatches: [], - node: ['.p'], - }, - ].forEach( - (testCase: { - hostname: string; - matches: string[]; - misMatches: string[]; - node: string[]; - }) => { - it(`${testCase.hostname}: ${JSON.stringify(testCase.node)}`, () => { - const engine = createEngine( - [...testCase.matches, 
...testCase.misMatches].join('\n'), - true, - ); - - const shouldMatch: Set = new Set(testCase.matches); - const shouldNotMatch: Set = new Set(testCase.misMatches); - - const rules = engine.cosmetics.getCosmeticsFilters( - testCase.hostname, - tldts.getDomain(testCase.hostname) || '', - ); - expect(rules.length).toEqual(shouldMatch.size); - rules.forEach((rule) => { - expect(rule.rawLine).not.toBeNull(); - if (rule.rawLine !== undefined && !shouldMatch.has(rule.rawLine)) { - throw new Error( - `Expected node ${testCase.hostname} + ` + - `${JSON.stringify(testCase.node)}` + - ` to match ${rule.rawLine} ${JSON.stringify(rule)}`, - ); - } - if (rule.rawLine !== undefined && shouldNotMatch.has(rule.rawLine)) { - throw new Error( - `Expected node ${testCase.hostname} + ` + - `${JSON.stringify(testCase.node)}` + - ` not to match ${rule.rawLine} ${JSON.stringify(rule)}`, - ); - } - }); - }); - }, - ); - }); }); }); diff --git a/test/lists.test.ts b/test/lists.test.ts new file mode 100644 index 0000000000..56dbddb174 --- /dev/null +++ b/test/lists.test.ts @@ -0,0 +1,310 @@ +import { loadAllLists } from './utils'; + +import StaticDataView from '../src/data-view'; +import CosmeticFilter from '../src/filters/cosmetic'; +import NetworkFilter from '../src/filters/network'; +import Lists, { f, List, parseFilters } from '../src/lists'; + +const FILTERS = loadAllLists(); +const { cosmeticFilters, networkFilters } = parseFilters(FILTERS, { debug: true }); + +function expectElementsToBeTheSame(elements1: any[], elements2: any[]): void { + expect(new Set(elements1)).toEqual(new Set(elements2)); +} + +describe('#List', () => { + it('#serialize', () => { + const list = new List({ + debug: true, + loadCosmeticFilters: true, + loadNetworkFilters: true, + }); + + list.update(FILTERS, 'checksum'); + + const buffer = new StaticDataView(2000000); + list.serialize(buffer); + buffer.seekZero(); + expect(List.deserialize(buffer)).toEqual(list); + }); + + it('#getCosmeticFiltersIds', () => { + 
const list = new List(); + list.update(FILTERS, 'checksum'); + expectElementsToBeTheSame( + list.getCosmeticFiltersIds(), + cosmeticFilters.map((filter) => filter.getId()), + ); + }); + + it('#getNetworkFiltersIds', () => { + const list = new List(); + list.update(FILTERS, 'checksum'); + expectElementsToBeTheSame( + list.getNetworkFiltersIds(), + networkFilters.map((filter) => filter.getId()), + ); + }); + + describe('#update', () => { + it('returns all filters in initialize update', () => { + const list = new List({ debug: true }); + const diff = list.update(FILTERS, 'checksum'); + + expect(diff.removedNetworkFilters).toHaveLength(0); + expect(diff.removedCosmeticFilters).toHaveLength(0); + + expectElementsToBeTheSame( + diff.newNetworkFilters.map((filter) => filter.rawLine), + networkFilters.map((filter) => filter.rawLine), + ); + + expectElementsToBeTheSame( + diff.newCosmeticFilters.map((filter) => filter.rawLine), + cosmeticFilters.map((filter) => filter.rawLine), + ); + }); + + it('returns all filters as removed for empty update', () => { + // Initial list + const list = new List({ debug: true }); + list.update(FILTERS, 'checksum'); + + const diff = list.update('', 'checksum2'); + + // No new filters + expect(diff.newCosmeticFilters).toHaveLength(0); + expect(diff.newNetworkFilters).toHaveLength(0); + + // All filters removed + expectElementsToBeTheSame( + diff.removedNetworkFilters, + networkFilters.map((filter) => filter.getId()), + ); + + expectElementsToBeTheSame( + diff.removedCosmeticFilters, + cosmeticFilters.map((filter) => filter.getId()), + ); + }); + + it('returns empty diff for same list', () => { + const list = new List({ debug: true }); + list.update(FILTERS, 'checksum'); + + // Same list + expect(list.update(FILTERS, 'checksum2')).toEqual({ + newCosmeticFilters: [], + newNetworkFilters: [], + removedCosmeticFilters: [], + removedNetworkFilters: [], + }); + + // Same checksum + expect(list.update('', 'checksum2')).toEqual({ + newCosmeticFilters: 
[], + newNetworkFilters: [], + removedCosmeticFilters: [], + removedNetworkFilters: [], + }); + }); + + it('return correct diff', () => { + const list = new List({ debug: true }); + let diff = list.update( + ` +||foo.com +||bar.com +###.selector + `, + 'checksum1', + ); + + expect(diff.removedNetworkFilters).toHaveLength(0); + expect(diff.removedCosmeticFilters).toHaveLength(0); + + // One cosmetic filter + expectElementsToBeTheSame(diff.newCosmeticFilters.map((filter) => filter.rawLine), [ + '###.selector', + ]); + + // Two network filters + expectElementsToBeTheSame(diff.newNetworkFilters.map((filter) => filter.rawLine), [ + '||foo.com', + '||bar.com', + ]); + + // Update with one new cosmetic filter and one network filter removed + diff = list.update( + ` +||foo.com +###.selector +###.selector2 + `, + 'checksum2', + ); + + expect(diff.removedCosmeticFilters).toHaveLength(0); + expect(diff.newNetworkFilters).toHaveLength(0); + + expect(diff.removedNetworkFilters).toEqual([(f`||bar.com` as NetworkFilter).getId()]); + expectElementsToBeTheSame(diff.newCosmeticFilters.map((filter) => filter.rawLine), [ + '###.selector2', + ]); + + // Remove all cosmetics and add new network filters + diff = list.update( + ` +||foo.com +||bar.com +||baz.de + `, + 'checksum3', + ); + + expect(diff.removedNetworkFilters).toHaveLength(0); + expectElementsToBeTheSame(diff.removedCosmeticFilters, [ + (f`###.selector` as CosmeticFilter).getId(), + (f`###.selector2` as CosmeticFilter).getId(), + ]); + + expect(diff.newCosmeticFilters).toHaveLength(0); + expectElementsToBeTheSame(diff.newNetworkFilters.map((filter) => filter.rawLine), [ + '||bar.com', + '||baz.de', + ]); + }); + }); +}); + +describe('Lists', () => { + it('#deserialize', () => { + const lists = new Lists({ debug: true }); + lists.update([ + { name: 'list1', checksum: 'checksum1', list: '||foo.com' }, + { name: 'list2', checksum: 'checksum2', list: '||bar.com' }, + ]); + + const buffer = new StaticDataView(1000000); + 
lists.serialize(buffer); + buffer.seekZero(); + expect(Lists.deserialize(buffer)).toEqual(lists); + }); + + it('#getLoaded', () => { + const lists = new Lists(); + + // Initialize with two lists + lists.update([ + { name: 'list1', checksum: 'checksum1', list: '||foo.com' }, + { name: 'list2', checksum: 'checksum2', list: '||bar.com' }, + ]); + expectElementsToBeTheSame(lists.getLoaded(), ['list1', 'list2']); + + // Add a third list + lists.update([{ name: 'list3', checksum: 'checksum3', list: '||baz.com' }]); + expectElementsToBeTheSame(lists.getLoaded(), ['list1', 'list2', 'list3']); + }); + + it('#has', () => { + const lists = new Lists(); + + // Initialize with two lists + lists.update([ + { name: 'list1', checksum: 'checksum1', list: '||foo.com' }, + { name: 'list2', checksum: 'checksum2', list: '||bar.com' }, + ]); + + expect(lists.has('list1', 'checksum1')).toBeTruthy(); + expect(lists.has('list1', 'checksum')).toBeFalsy(); + + expect(lists.has('list2', 'checksum2')).toBeTruthy(); + expect(lists.has('list2', 'checksum')).toBeFalsy(); + + expect(lists.has('list3', 'checksum')).toBeFalsy(); + }); + + it('#delete', () => { + const lists = new Lists(); + + // Initialize with two lists + lists.update([ + { name: 'list1', checksum: 'checksum1', list: '||foo.com' }, + { name: 'list2', checksum: 'checksum2', list: '||bar.com' }, + { name: 'list3', checksum: 'checksum3', list: '||baz.com' }, + ]); + + const diff = lists.delete(['list1', 'list3']); + + // Check diff + expect(diff.newCosmeticFilters).toHaveLength(0); + expect(diff.newNetworkFilters).toHaveLength(0); + expect(diff.removedCosmeticFilters).toHaveLength(0); + expectElementsToBeTheSame(diff.removedNetworkFilters, [ + (f`||foo.com` as NetworkFilter).getId(), + (f`||baz.com` as NetworkFilter).getId(), + ]); + + // Check loaded lists + expect(lists.has('list1', 'checksum1')).toBeFalsy(); + expect(lists.has('list3', 'checksum3')).toBeFalsy(); + expect(lists.has('list2', 'checksum2')).toBeTruthy(); + }); + + 
it('#update', () => { + const lists = new Lists({ debug: true }); + let diff = lists.update([ + { name: 'list1', checksum: 'checksum1', list: '||foo.com' }, + { name: 'list2', checksum: 'checksum2', list: '##.selector' }, + ]); + + expect(diff.removedNetworkFilters).toHaveLength(0); + expect(diff.removedCosmeticFilters).toHaveLength(0); + + expectElementsToBeTheSame(diff.newCosmeticFilters.map((filter) => filter.rawLine), [ + '##.selector', + ]); + expectElementsToBeTheSame(diff.newNetworkFilters.map((filter) => filter.rawLine), [ + '||foo.com', + ]); + + // Update with new filters + diff = lists.update([ + { name: 'list1', checksum: 'checksum1_1', list: '||bar.com' }, + { name: 'list3', checksum: 'checksum3', list: '##.selector2' }, + ]); + + expectElementsToBeTheSame(diff.newNetworkFilters.map((filter) => filter.rawLine), [ + '||bar.com', + ]); + expectElementsToBeTheSame(diff.newCosmeticFilters.map((filter) => filter.rawLine), [ + '##.selector2', + ]); + expectElementsToBeTheSame(diff.removedNetworkFilters, [ + (f`||foo.com` as NetworkFilter).getId(), + ]); + expect(diff.removedCosmeticFilters).toHaveLength(0); + }); +}); + +describe('#f', () => { + it('handles CosmeticFilter', () => { + const filter = f`##.selector`; + expect(filter).not.toBeNull(); + if (filter !== null) { + expect(filter.isCosmeticFilter()).toBeTruthy(); + } + }); + + it('handles NetworkFilter', () => { + const filter = f`||foo.com`; + expect(filter).not.toBeNull(); + if (filter !== null) { + expect(filter.isNetworkFilter()).toBeTruthy(); + } + }); + + it('returns null for invalid filter', () => { + expect(f`#$#~~~`).toBeNull(); + }); +}); diff --git a/test/matching.test.ts b/test/matching.test.ts index 2f20797099..c4df26218b 100644 --- a/test/matching.test.ts +++ b/test/matching.test.ts @@ -1,10 +1,13 @@ import { getDomain, getHostname } from 'tldts'; -import matchCosmeticFilter, { getHostnameWithoutPublicSuffix } from '../src/matching/cosmetics'; -import matchNetworkFilter, {
isAnchoredByHostname } from '../src/matching/network'; - -import { f } from '../src/parsing/list'; -import { parseNetworkFilter } from '../src/parsing/network-filter'; +import CosmeticFilter, { + getHashesFromLabelsBackward, + getHostnameWithoutPublicSuffix, + hashHostnameBackward, +} from '../src/filters/cosmetic'; +import NetworkFilter, { isAnchoredByHostname } from '../src/filters/network'; + +import { f } from '../src/lists'; import { makeRequest } from '../src/request'; import requests from './data/requests'; @@ -25,7 +28,7 @@ expect.extend({ getDomain, getHostname, }); - const match = matchNetworkFilter(filter, processedRequest); + const match = filter.match(processedRequest); if (match) { return { message: () => @@ -40,7 +43,7 @@ expect.extend({ }; }, toMatchHostname(filter, hostname) { - const match = matchCosmeticFilter(filter, hostname, getDomain(hostname) || ''); + const match = filter.match(hostname, getDomain(hostname) || ''); if (match) { return { message: () => `expected ${filter.toString()} to not match ${hostname}`, @@ -145,11 +148,11 @@ describe('#isAnchoredByHostname', () => { }); }); -describe('#matchNetworkFilter', () => { +describe('#NetworkFilter.match', () => { requests.forEach(({ filters, type, sourceUrl, url }) => { filters.forEach((filter) => { it(`${filter} matches ${type}, url=${url}, source=${sourceUrl}`, () => { - const networkFilter = parseNetworkFilter(filter); + const networkFilter = NetworkFilter.parse(filter); if (networkFilter !== null) { networkFilter.rawLine = filter; } @@ -186,6 +189,12 @@ describe('#matchNetworkFilter', () => { expect(f`foo bar baz$fuzzy`).toMatchRequest({ url: 'http://bar.foo.baz' }); expect(f`foo bar baz 42$fuzzy`).not.toMatchRequest({ url: 'http://bar.foo.baz' }); + + // Fast-path for when pattern is longer than the URL + expect(f`foo bar baz 42 43$fuzzy`).not.toMatchRequest({ url: 'http://bar.foo.baz' }); + + // No fuzzy signature, matches every URL? 
+ expect(f`+$fuzzy`).toMatchRequest({ url: 'http://bar.foo.baz' }); }); it('||pattern', () => { @@ -246,6 +255,18 @@ describe('#matchNetworkFilter', () => { expect(f`|https://foo.com|`).toMatchRequest({ url: 'https://foo.com' }); }); + it('||pattern + left-anchor', () => { + expect(f`||foo.com^test`).toMatchRequest({ url: 'https://foo.com/test' }); + expect(f`||foo.com/test`).toMatchRequest({ url: 'https://foo.com/test' }); + expect(f`||foo.com^test`).not.toMatchRequest({ url: 'https://foo.com/tes' }); + expect(f`||foo.com/test`).not.toMatchRequest({ url: 'https://foo.com/tes' }); + + expect(f`||foo.com^`).toMatchRequest({ url: 'https://foo.com/test' }); + + expect(f`||foo.com/test*bar`).toMatchRequest({ url: 'https://foo.com/testbar' }); + expect(f`||foo.com^test*bar`).toMatchRequest({ url: 'https://foo.com/testbar' }); + }); + it('||hostname^*/pattern', () => { expect(f`||foo.com^*/bar`).not.toMatchRequest({ url: 'https://foo.com/bar' }); expect(f`||com^*/bar`).not.toMatchRequest({ url: 'https://foo.com/bar' }); @@ -328,31 +349,104 @@ describe('#matchNetworkFilter', () => { }); }); -describe('#matchCosmeticFilter', () => { +describe('#CosmeticFilter.match', () => { + it('generic filter', () => { + expect(f`##.selector`).toMatchHostname('foo.com'); + }); + it('single domain', () => { expect(f`foo.com##.selector`).toMatchHostname('foo.com'); + expect(f`foo.com##.selector`).not.toMatchHostname('bar.com'); }); it('multiple domains', () => { expect(f`foo.com,test.com##.selector`).toMatchHostname('foo.com'); expect(f`foo.com,test.com##.selector`).toMatchHostname('test.com'); + expect(f`foo.com,test.com##.selector`).not.toMatchHostname('baz.com'); }); it('subdomain', () => { expect(f`foo.com,test.com##.selector`).toMatchHostname('sub.test.com'); + expect(f`foo.com,test.com##.selector`).toMatchHostname('sub.foo.com'); + expect(f`foo.com,sub.test.com##.selector`).toMatchHostname('sub.test.com'); +
expect(f`foo.com,sub.test.com##.selector`).not.toMatchHostname('test.com'); + expect(f`foo.com,sub.test.com##.selector`).not.toMatchHostname('com'); }); it('entity', () => { + expect(f`foo.com,sub.test.*##.selector`).toMatchHostname('foo.com'); + expect(f`foo.com,sub.test.*##.selector`).toMatchHostname('bar.foo.com'); expect(f`foo.com,sub.test.*##.selector`).toMatchHostname('sub.test.com'); expect(f`foo.com,sub.test.*##.selector`).toMatchHostname('sub.test.fr'); + expect(f`foo.com,sub.test.*##.selector`).not.toMatchHostname('sub.test.evil.biz'); + expect(f`foo.*##.selector`).toMatchHostname('foo.co.uk'); + expect(f`foo.*##.selector`).toMatchHostname('bar.foo.co.uk'); + expect(f`foo.*##.selector`).toMatchHostname('baz.bar.foo.co.uk'); + expect(f`foo.*##.selector`).not.toMatchHostname('foo.evil.biz'); }); it('does not match', () => { expect(f`foo.*##.selector`).not.toMatchHostname('foo.bar.com'); expect(f`foo.*##.selector`).not.toMatchHostname('bar-foo.com'); }); + + describe('negations', () => { + it('entity', () => { + expect(f`~foo.*##.selector`).not.toMatchHostname('foo.com'); + expect(f`~foo.*##.selector`).toMatchHostname('foo.evil.biz'); + + expect(f`~foo.*,~bar.*##.selector`).toMatchHostname('baz.com'); + expect(f`~foo.*,~bar.*##.selector`).not.toMatchHostname('foo.com'); + expect(f`~foo.*,~bar.*##.selector`).not.toMatchHostname('sub.foo.com'); + expect(f`~foo.*,~bar.*##.selector`).not.toMatchHostname('bar.com'); + expect(f`~foo.*,~bar.*##.selector`).not.toMatchHostname('sub.bar.com'); + }); + + it('hostnames', () => { + expect(f`~foo.com##.selector`).not.toMatchHostname('foo.com'); + expect(f`~foo.com##.selector`).not.toMatchHostname('bar.foo.com'); + expect(f`~foo.com##.selector`).toMatchHostname('foo.com.bar'); + expect(f`~foo.com##.selector`).toMatchHostname('foo.co.uk'); + expect(f`~foo.com##.selector`).toMatchHostname('foo.co.uk'); + + expect(f`~foo.com,~foo.de,~bar.com##.selector`).not.toMatchHostname('foo.com'); + 
expect(f`~foo.com,~foo.de,~bar.com##.selector`).not.toMatchHostname('sub.foo.com'); + expect(f`~foo.com,~foo.de,~bar.com##.selector`).not.toMatchHostname('foo.de'); + expect(f`~foo.com,~foo.de,~bar.com##.selector`).not.toMatchHostname('sub.foo.de'); + expect(f`~foo.com,~foo.de,~bar.com##.selector`).not.toMatchHostname('bar.com'); + expect(f`~foo.com,~foo.de,~bar.com##.selector`).not.toMatchHostname('sub.bar.com'); + + expect(f`~foo.com,~foo.de,~bar.com##.selector`).toMatchHostname('bar.de'); + expect(f`~foo.com,~foo.de,~bar.com##.selector`).toMatchHostname('sub.bar.de'); + }); + }); + + describe('complex', () => { + it('handles entity with suffix exception', () => { + expect(f`foo.*,~foo.com##.selector`).not.toMatchHostname('foo.com'); + expect(f`foo.*,~foo.com##.selector`).not.toMatchHostname('sub.foo.com'); + expect(f`foo.*,~foo.com##.selector`).toMatchHostname('foo.de'); + expect(f`foo.*,~foo.com##.selector`).toMatchHostname('sub.foo.de'); + }); + + it('handles entity with subdomain exception', () => { + expect(f`foo.*,~sub.foo.*##.selector`).toMatchHostname('foo.com'); + expect(f`foo.*,~sub.foo.*##.selector`).toMatchHostname('foo.de'); + expect(f`foo.*,~sub.foo.*##.selector`).not.toMatchHostname('sub.foo.de'); + expect(f`foo.*,~sub.foo.*##.selector`).not.toMatchHostname('sub.foo.com'); + expect(f`foo.*,~sub.foo.*##.selector`).toMatchHostname('sub2.foo.com'); + }); + }); + + it('no domain provided', () => { + const parsed = CosmeticFilter.parse('foo.*##.selector'); + expect(parsed).not.toBeNull(); + if (parsed !== null) { + expect(parsed.match('foo.com', '')).toBeFalsy(); + } + }); }); describe('#getHostnameWithoutPublicSuffix', () => { @@ -376,3 +470,29 @@ describe('#getHostnameWithoutPublicSuffix', () => { expect(getHostnameWithoutPublicSuffix('foo.bar.com', 'bar.com')).toEqual('foo.bar'); }); }); + +describe('#getHashesFromLabelsBackward', () => { + it('hash all labels', () => { + expect(getHashesFromLabelsBackward('foo.bar.baz', 11, 11)).toEqual( + ['baz', 
'bar.baz', 'foo.bar.baz'].map(hashHostnameBackward), + ); + }); + + it('hash subdomains only', () => { + expect(getHashesFromLabelsBackward('foo.bar.baz.com', 15, 8 /* start of domain */)).toEqual( + ['baz.com', 'bar.baz.com', 'foo.bar.baz.com'].map(hashHostnameBackward), + ); + }); + + it('hash ignoring suffix', () => { + expect(getHashesFromLabelsBackward('foo.bar.baz.com', 11, 11)).toEqual( + ['baz', 'bar.baz', 'foo.bar.baz'].map(hashHostnameBackward), + ); + }); + + it('hash subdomains only, ignoring suffix', () => { + expect(getHashesFromLabelsBackward('foo.bar.baz.com', 11, 8)).toEqual( + ['baz', 'bar.baz', 'foo.bar.baz'].map(hashHostnameBackward), + ); + }); +}); diff --git a/test/parsing.test.ts b/test/parsing.test.ts index 064e534a49..72fd75ccad 100644 --- a/test/parsing.test.ts +++ b/test/parsing.test.ts @@ -1,12 +1,21 @@ -import { parseCosmeticFilter } from '../src/parsing/cosmetic-filter'; -import { parseList } from '../src/parsing/list'; -import { parseNetworkFilter } from '../src/parsing/network-filter'; +import CosmeticFilter, { + DEFAULT_HIDDING_STYLE, + hashHostnameBackward, +} from '../src/filters/cosmetic'; +import NetworkFilter from '../src/filters/network'; +import { parseFilters } from '../src/lists'; import { fastHash } from '../src/utils'; +function h(hostnames: string[]): Uint32Array { + return new Uint32Array(hostnames.map(hashHostnameBackward)); +} + // TODO: collaps, popup, popunder, generichide, genericblock function network(filter: string, expected: any) { - const parsed = parseNetworkFilter(filter); + const parsed = NetworkFilter.parse(filter); if (parsed !== null) { + expect(parsed.isNetworkFilter()).toBeTruthy(); + expect(parsed.isCosmeticFilter()).toBeFalsy(); const verbose = { // Attributes bug: parsed.bug, @@ -91,6 +100,74 @@ const DEFAULT_NETWORK_FILTER = { }; describe('Network filters', () => { + describe('toString', () => { + const checkToString = (line: string, expected: string, debug: boolean = false) => { + const parsed = 
NetworkFilter.parse(line); + expect(parsed).not.toBeNull(); + if (parsed !== null) { + if (debug) { + parsed.rawLine = line; + } + expect(parsed.toString()).toBe(expected); + } + }; + + [ + // Negations + 'ads$~image', + 'ads$~media', + 'ads$~object', + 'ads$~other', + 'ads$~ping', + 'ads$~script', + 'ads$~font', + 'ads$~stylesheet', + 'ads$~xmlhttprequest', + + // Options + 'ads$fuzzy', + 'ads$image', + 'ads$media', + 'ads$object', + 'ads$other', + 'ads$ping', + 'ads$script', + 'ads$font', + 'ads$third-party', + 'ads$first-party', + 'ads$stylesheet', + 'ads$xmlhttprequest', + + 'ads$important', + 'ads$fuzzy', + 'ads$redirect=noop', + 'ads$bug=42', + ].forEach((line) => { + it(`pprint ${line}`, () => { + checkToString(line, line); + }); + }); + + it('pprint anchored hostnames', () => { + checkToString('@@||foo.com', '@@||foo.com^'); + checkToString('@@||foo.com|', '@@||foo.com^|'); + checkToString('|foo.com|', '|foo.com|'); + checkToString('foo.com|', 'foo.com|'); + }); + + it('pprint domain', () => { + checkToString('ads$domain=foo.com|bar.co.uk|~baz.io', 'ads$domain='); + }); + + it('pprint with debug=true', () => { + checkToString( + 'ads$domain=foo.com|bar.co.uk|~baz.io', + 'ads$domain=foo.com|bar.co.uk|~baz.io', + true, + ); + }); + }); + it('parses pattern', () => { const base = { ...DEFAULT_NETWORK_FILTER, @@ -414,6 +491,11 @@ describe('Network filters', () => { network('||foo.com$important', { isImportant: true }); }); + it('parses ~important', () => { + // Not supported + network('||foo.com$~important', null); + }); + it('defaults to false', () => { network('||foo.com', { isImportant: false }); }); @@ -541,6 +623,12 @@ describe('Network filters', () => { network('||foo.com$~redirect', null); }); + it('parses redirect without a value', () => { + // Not valid + network('||foo.com$redirect', null); + network('||foo.com$redirect=', null); + }); + it('defaults to false', () => { network('||foo.com', { isRedirect: false, @@ -622,6 +710,22 @@ describe('Network 
filters', () => { }); }); + describe('un-supported options', () => { + [ + 'badfilter', + 'genericblock', + 'generichide', + 'inline-script', + 'popunder', + 'popup', + 'woot', + ].forEach((unsupportedOption) => { + it(unsupportedOption, () => { + network(`||foo.com$${unsupportedOption}`, null); + }); + }); + }); + const allOptions = (value: boolean) => ({ fromFont: value, fromImage: value, @@ -691,13 +795,19 @@ describe('Network filters', () => { }); function cosmetic(filter: string, expected: any) { - const parsed = parseCosmeticFilter(filter); + const parsed = CosmeticFilter.parse(filter); if (parsed !== null) { + expect(parsed.isNetworkFilter()).toBeFalsy(); + expect(parsed.isCosmeticFilter()).toBeTruthy(); const verbose = { // Attributes - hostnames: parsed.getHostnames(), + entities: parsed.entities, + hostnames: parsed.hostnames, + notEntities: parsed.notEntities, + notHostnames: parsed.notHostnames, + selector: parsed.getSelector(), - style: parsed.style, + style: parsed.getStyle(), // Options isScriptBlock: parsed.isScriptBlock(), @@ -712,9 +822,8 @@ function cosmetic(filter: string, expected: any) { const DEFAULT_COSMETIC_FILTER = { // Attributes - hostnames: [], selector: '', - style: undefined, + style: DEFAULT_HIDDING_STYLE, // Options isScriptBlock: false, @@ -723,6 +832,36 @@ const DEFAULT_COSMETIC_FILTER = { }; describe('Cosmetic filters', () => { + describe('toString', () => { + const checkToString = (line: string, expected: string, debug: boolean = false) => { + const parsed = CosmeticFilter.parse(line); + expect(parsed).not.toBeNull(); + if (parsed !== null) { + if (debug) { + parsed.rawLine = line; + } + expect(parsed.toString()).toBe(expected); + } + }; + + ['##.selector', '##+js(foo.js)', '##script:contains(ads)'].forEach((line) => { + it(`pprint ${line}`, () => { + checkToString(line, line); + }); + }); + + it('pprint with hostnames', () => { + checkToString('foo.com##.selector', '##.selector'); + checkToString('~foo.com##.selector', 
'##.selector'); + checkToString('~foo.*##.selector', '##.selector'); + checkToString('foo.*##.selector', '##.selector'); + }); + + it('pprint with debug=true', () => { + checkToString('foo.com##.selector', 'foo.com##.selector', true); + }); + }); + describe('parses selector', () => { cosmetic('##iframe[src]', { ...DEFAULT_COSMETIC_FILTER, @@ -733,17 +872,27 @@ describe('Cosmetic filters', () => { it('parses hostnames', () => { cosmetic('foo.com##.selector', { ...DEFAULT_COSMETIC_FILTER, - hostnames: ['foo.com'], + hostnames: h(['foo.com']), selector: '.selector', }); cosmetic('foo.com,bar.io##.selector', { ...DEFAULT_COSMETIC_FILTER, - hostnames: ['foo.com', 'bar.io'], + hostnames: h(['foo.com', 'bar.io']), selector: '.selector', }); cosmetic('foo.com,bar.io,baz.*##.selector', { ...DEFAULT_COSMETIC_FILTER, - hostnames: ['foo.com', 'bar.io', 'baz.*'], + entities: h(['baz']), + hostnames: h(['foo.com', 'bar.io']), + selector: '.selector', + }); + + cosmetic('~entity.*,foo.com,~bar.io,baz.*,~entity2.*##.selector', { + ...DEFAULT_COSMETIC_FILTER, + entities: h(['baz']), + hostnames: h(['foo.com']), + notEntities: h(['entity', 'entity2']), + notHostnames: h(['bar.io']), selector: '.selector', }); }); @@ -752,14 +901,14 @@ describe('Cosmetic filters', () => { cosmetic('#@#script:contains(foo)', null); // We need hostnames cosmetic('foo.com#@#script:contains(foo)', { ...DEFAULT_COSMETIC_FILTER, - hostnames: ['foo.com'], + hostnames: h(['foo.com']), isScriptBlock: true, isUnhide: true, selector: 'foo', }); cosmetic('foo.com#@#.selector', { ...DEFAULT_COSMETIC_FILTER, - hostnames: ['foo.com'], + hostnames: h(['foo.com']), isUnhide: true, selector: '.selector', }); @@ -789,6 +938,11 @@ describe('Cosmetic filters', () => { isScriptInject: true, selector: 'script.js, arg1, arg2, arg3', }); + cosmetic('##+js(script.js, arg1, arg2, arg3)', { + ...DEFAULT_COSMETIC_FILTER, + isScriptInject: true, + selector: 'script.js, arg1, arg2, arg3', + }); }); it('parses :style', () => { @@ 
-806,10 +960,46 @@ describe('Cosmetic filters', () => { cosmetic('foo.com,bar.de###foo > bar >baz:style(display: none)', { ...DEFAULT_COSMETIC_FILTER, - hostnames: ['foo.com', 'bar.de'], + hostnames: h(['foo.com', 'bar.de']), selector: '#foo > bar >baz', style: 'display: none', }); + + cosmetic('foo.com,bar.de###foo > bar >baz:styleTYPO(display: none)', null); + }); + + // TODO + // it('rejects invalid selectors', () => { + // // @ts-ignore + // global.document = { + // createElement: () => ({ matches: () => false }), + // }; + // cosmetic('###.selector /invalid/', null); + + // // @ts-ignore + // global.document = { + // createElement: () => ({ + // matches: () => { + // throw new Error('Invalid'); + // }, + // }), + // }; + // cosmetic('###.selector /invalid/', null); + + // // @ts-ignore + // global.document = undefined; + // }); + + it('#getScript', () => { + const parsed = CosmeticFilter.parse('##+js(script.js, arg1, arg2, arg3)'); + expect(parsed).not.toBeNull(); + if (parsed !== null) { + expect(parsed.getScript(new Map([['script.js', '{{1}},{{2}},{{3}}']]))).toEqual( + 'arg1,arg2,arg3', + ); + + expect(parsed.getScript(new Map())).toBeUndefined(); + } }); }); @@ -825,11 +1015,8 @@ describe('Filters list', () => { '! 
||foo.com', '[Adblock] ||foo.com', '[Adblock Plus 2.0] ||foo.com', - ].forEach((content) => { - const { cosmeticFilters, networkFilters } = parseList(content); - - expect(cosmeticFilters).toHaveLength(0); - expect(networkFilters).toHaveLength(0); + ].forEach((data) => { + expect(parseFilters(data)).toEqual(parseFilters('')); }); }); }); diff --git a/test/resources.test.ts b/test/resources.test.ts new file mode 100644 index 0000000000..0aef999779 --- /dev/null +++ b/test/resources.test.ts @@ -0,0 +1,72 @@ +import Resources from '../src/resources'; + +describe('#Resources', () => { + describe('#parse', () => { + it('parses empty resources', () => { + const resources = Resources.parse('', { checksum: 'checksum' }); + expect(resources.checksum).toEqual('checksum'); + expect(resources.js).toEqual(new Map()); + expect(resources.resources).toEqual(new Map()); + }); + + it('parses one resource', () => { + const resources = Resources.parse('foo application/javascript\ncontent', { + checksum: 'checksum', + }); + expect(resources.checksum).toEqual('checksum'); + expect(resources.js).toEqual(new Map([['foo', 'content']])); + expect(resources.resources).toEqual( + new Map([['foo', { contentType: 'application/javascript', data: 'content' }]]), + ); + }); + + it('parses two resources', () => { + const resources = Resources.parse( + ['foo application/javascript\ncontent1', 'pixel.png image/png;base64\ncontent2'].join( + '\n\n', + ), + { checksum: 'checksum' }, + ); + expect(resources.checksum).toEqual('checksum'); + expect(resources.js).toEqual(new Map([['foo', 'content1']])); + expect(resources.resources).toEqual( + new Map([ + ['foo', { contentType: 'application/javascript', data: 'content1' }], + ['pixel.png', { contentType: 'image/png;base64', data: 'content2' }], + ]), + ); + }); + + it('robust to weird format', () => { + const resources = Resources.parse( + ` +# Comment + # Comment 2 +foo application/javascript +content1 +# Comment 3 + +# Type missing +pixel.png +content + 
+# Content missing +pixel.png image/png;base64 + +# This one is good! +pixel.png image/png;base64 +content2 +`, + { checksum: 'checksum' }, + ); + expect(resources.checksum).toEqual('checksum'); + expect(resources.js).toEqual(new Map([['foo', 'content1']])); + expect(resources.resources).toEqual( + new Map([ + ['foo', { contentType: 'application/javascript', data: 'content1' }], + ['pixel.png', { contentType: 'image/png;base64', data: 'content2' }], + ]), + ); + }); + }); +}); diff --git a/test/reverse-index.test.ts b/test/reverse-index.test.ts new file mode 100644 index 0000000000..3ddcb0c237 --- /dev/null +++ b/test/reverse-index.test.ts @@ -0,0 +1,192 @@ +import StaticDataView from '../src/data-view'; +import ReverseIndex from '../src/engine/reverse-index'; +import CosmeticFilter from '../src/filters/cosmetic'; +import IFilter from '../src/filters/interface'; +import NetworkFilter from '../src/filters/network'; +import { parseFilters } from '../src/lists'; +import { tokenize } from '../src/utils'; +import { loadAllLists } from './utils'; + +describe('ReverseIndex', () => { + const { cosmeticFilters, networkFilters } = parseFilters(loadAllLists()); + + describe('#getFilters', () => { + function testGetFiltersImlp<T extends IFilter>( + filters: T[], + deserialize: (buffer: StaticDataView) => T, + ): void { + const reverseIndex = new ReverseIndex({ + deserialize, + filters, + }); + + expect(new Set(reverseIndex.getFilters().map((f) => f.toString()))).toEqual( + new Set(filters.map((f) => f.toString())), + ); + } + + it('network', () => { + testGetFiltersImlp(networkFilters, NetworkFilter.deserialize); + }); + + it('cosmetic', () => { + testGetFiltersImlp(cosmeticFilters, CosmeticFilter.deserialize); + }); + }); + + it('#update', () => { + const reverseIndex = new ReverseIndex({ + deserialize: NetworkFilter.deserialize, + filters: parseFilters('||foo.com', { loadCosmeticFilters: false, debug: true }) + .networkFilters, + }); + + // Expect our filter to be listed + let filters =
reverseIndex.getFilters(); + expect(filters.map((f) => f.rawLine)).toEqual(['||foo.com']); + + // Add one new filter + reverseIndex.update( + parseFilters('||bar.com', { loadCosmeticFilters: false, debug: true }).networkFilters, + ); + filters = reverseIndex.getFilters(); + expect(filters.map((f) => f.rawLine)).toEqual(['||foo.com', '||bar.com']); + + // Add a third filter and remove the two others + reverseIndex.update( + parseFilters('||baz.com', { loadCosmeticFilters: false, debug: true }).networkFilters, + new Set(filters.map((f) => f.getId())), + ); + filters = reverseIndex.getFilters(); + expect(filters.map((f) => f.rawLine)).toEqual(['||baz.com']); + }); + + describe('#iterMatchingFilters', () => { + const emptyIndex = new ReverseIndex({ deserialize: NetworkFilter.deserialize }); + const filters = ` +||foo.com +/ads/tracker.js$image +|woot|$redirect=noop.js + `; + const exampleIndex = new ReverseIndex({ + deserialize: NetworkFilter.deserialize, + filters: parseFilters(filters, { loadCosmeticFilters: false, debug: true }).networkFilters, + }); + + it('works on empty index', () => { + let matches = 0; + const cb = (_: NetworkFilter) => { + matches += 1; + return true; + }; + + // No tokens + emptyIndex.iterMatchingFilters(new Uint32Array(0), cb); + expect(matches).toBe(0); + + // Some tokens + emptyIndex.iterMatchingFilters(tokenize('foo bar baz'), cb); + expect(matches).toBe(0); + }); + + it('handle no match', () => { + let matches = 0; + const cb = (_: NetworkFilter) => { + matches += 1; + return true; + }; + + // No tokens + exampleIndex.iterMatchingFilters(tokenize('bar co.uk de image redirect'), cb); + expect(matches).toBe(0); + }); + + it('finds matches', () => { + const matches: Set = new Set(); + let ret: boolean = true; + const cb = (f: NetworkFilter) => { + matches.add(f.rawLine); + return ret; + }; + + [ + ['foo', ['||foo.com']], + ['com', []], // filter was indexed using 'foo' and not 'com' + ['ads', ['/ads/tracker.js$image']], + ['foo.ads', 
['||foo.com', '/ads/tracker.js$image']], + ['woot', ['|woot|$redirect=noop.js']], + ['https://bar.foo.com/ads/tracker.js', ['||foo.com', '/ads/tracker.js$image']], + ].forEach(([input, expected]) => { + // Get all matches + matches.clear(); + ret = true; // iterate on all filters + exampleIndex.iterMatchingFilters(tokenize(input as string), cb); + expect(matches).toEqual(new Set(expected)); + + // Check early termination + matches.clear(); + ret = false; // early termination on first filter + exampleIndex.iterMatchingFilters(tokenize(input as string), cb); + expect(matches.size).toEqual(expected.length === 0 ? 0 : 1); + }); + }); + + it('stores filters without tokens in wildcard bucket', () => { + const index = new ReverseIndex({ + deserialize: NetworkFilter.deserialize, + filters: parseFilters( + ` +wildcard +||foo.com + `, + { loadCosmeticFilters: false, debug: true }, + ).networkFilters, + }); + + const matches: Set = new Set(); + const cb = (f: NetworkFilter) => { + matches.add(f.rawLine); + return true; + }; + + // Wildcard filter is always returned + [ + ['foo', ['||foo.com', 'wildcard']], + ['com', ['wildcard']], // filter was indexed using 'foo' and not 'com' + ].forEach(([input, expected]) => { + // Get all matches + matches.clear(); + index.iterMatchingFilters(tokenize(input as string), cb); + expect(matches).toEqual(new Set(expected)); + }); + }); + }); + + describe('#serialize', () => { + function testSerializeIndexImpl<T extends IFilter>( + filters: T[], + deserialize: (buffer: StaticDataView) => T, + ): void { + const reverseIndex = new ReverseIndex({ + deserialize, + filters, + }); + + // Serialize index + const buffer = new StaticDataView(4000000); + reverseIndex.serialize(buffer); + + // Deserialize + buffer.seekZero(); + expect(ReverseIndex.deserialize(buffer, deserialize)).toEqual(reverseIndex); + } + + it('network', () => { + testSerializeIndexImpl(networkFilters, NetworkFilter.deserialize); + }); + + it('cosmetic', () => { +
testSerializeIndexImpl(cosmeticFilters, CosmeticFilter.deserialize); + }); + }); +}); diff --git a/test/serialization.test.ts b/test/serialization.test.ts index f5479b3787..3bdfbe3aa6 100644 --- a/test/serialization.test.ts +++ b/test/serialization.test.ts @@ -2,85 +2,76 @@ import { loadAllLists, loadResources } from './utils'; import StaticDataView from '../src/data-view'; import Engine from '../src/engine/engine'; -import ReverseIndex from '../src/engine/reverse-index'; -import { parseList } from '../src/parsing/list'; -import { NetworkFilter } from '../src/parsing/network-filter'; -import { - deserializeCosmeticFilter, - deserializeEngine, - deserializeNetworkFilter, - deserializeReverseIndex, - serializeCosmeticFilter, - serializeEngine, - serializeNetworkFilter, - serializeReverseIndex, -} from '../src/serialization'; +import CosmeticFilter from '../src/filters/cosmetic'; +import IFilter from '../src/filters/interface'; +import NetworkFilter from '../src/filters/network'; +import { parseFilters } from '../src/lists'; describe('Serialization', () => { - const { networkFilters, cosmeticFilters } = parseList(loadAllLists()); + const { cosmeticFilters, networkFilters } = parseFilters(loadAllLists(), { debug: true }); describe('filters', () => { const buffer = new StaticDataView(1000000); + const checkFilterSerialization = (Filter: any, filter: IFilter) => { + buffer.seekZero(); + filter.serialize(buffer); + buffer.seekZero(); + expect(Filter.deserialize(buffer)).toEqual(filter); + }; + it('cosmetic', () => { - cosmeticFilters.forEach((filter) => { - buffer.seekZero(); - serializeCosmeticFilter(filter, buffer); - buffer.seekZero(); - expect(deserializeCosmeticFilter(buffer)).toEqual(filter); - }); + for (let i = 0; i < cosmeticFilters.length; i += 1) { + checkFilterSerialization(CosmeticFilter, cosmeticFilters[i]); + } }); it('network', () => { - networkFilters.forEach((filter) => { - buffer.seekZero(); - serializeNetworkFilter(filter, buffer); - buffer.seekZero(); 
- expect(deserializeNetworkFilter(buffer)).toEqual(filter); - }); - }); - }); - - it('ReverseIndex', () => { - const filters = new Map(); - networkFilters.forEach((filter) => { - filters.set(filter.getId(), filter); + for (let i = 0; i < networkFilters.length; i += 1) { + checkFilterSerialization(NetworkFilter, networkFilters[i]); + } }); - // Initialize index - const reverseIndex = new ReverseIndex((cb) => { - networkFilters.forEach(cb); + it('with bug ID', () => { + checkFilterSerialization(NetworkFilter, NetworkFilter.parse('ads$bug=42') as NetworkFilter); }); - - // Serialize index - const buffer = new StaticDataView(4000000); - serializeReverseIndex(reverseIndex, buffer); - buffer.seekZero(); - - const deserialized = new ReverseIndex(); - deserializeReverseIndex(buffer, deserialized, filters); - expect(deserialized).toEqual(reverseIndex); }); - it('Engine', () => { - const engine = new Engine({ - enableOptimizations: true, - loadCosmeticFilters: true, - loadNetworkFilters: true, - optimizeAOT: false, + describe('Engine', () => { + const buffer = new Uint8Array(15000000); + it('fails with wrong version', () => { + const engine = new Engine(); + const serialized = engine.serialize(buffer); + const version = serialized[0]; + serialized[0] = 1; // override version + expect(() => { + Engine.deserialize(serialized); + }).toThrow('serialized engine version mismatch'); + serialized[0] = version; + expect(Engine.deserialize(serialized)).toEqual(engine); }); - engine.onUpdateFilters([{ filters: loadAllLists(), asset: 'list1', checksum: 'checksum' }]); - engine.onUpdateResource([{ checksum: 'resources1', filters: loadResources() }]); - - const serialized = serializeEngine(engine); + it('handles full engine', () => { + const engine = new Engine(); + engine.updateResources(loadResources(), 'resources1'); + engine.update({ + newCosmeticFilters: cosmeticFilters, + newNetworkFilters: networkFilters, + }); - const version = serialized[0]; - serialized[0] = 1; // override 
version - expect(() => { - deserializeEngine(serialized); - }).toThrow('serialized engine version mismatch'); - serialized[0] = version; + // Add one list + engine.updateList({ + checksum: 'checksum', + list: ` +||foo.com +domain.com##.selector +/ads/$script +###foo + `, + name: 'list', + }); - expect(deserializeEngine(serialized)).toEqual(engine); + const serialized = engine.serialize(buffer); + expect(Engine.deserialize(serialized)).toEqual(engine); + }); }); }); diff --git a/test/utils.test.ts b/test/utils.test.ts index 618db55525..159359fdb1 100644 --- a/test/utils.test.ts +++ b/test/utils.test.ts @@ -1,5 +1,11 @@ -import { parseList } from '../src/parsing/list'; -import { fastHash, tokenize } from '../src/utils'; +import { parseFilters } from '../src/lists'; +import { + binSearch, + fastHash, + hasUnicode, + tokenize, + updateResponseHeadersWithCSP, +} from '../src/utils'; import requests from './data/requests'; import { loadAllLists } from './utils'; @@ -36,23 +42,20 @@ function checkCollisions(filters: any[]) { } describe('Utils', () => { - describe('fastHash', () => { + describe('#fastHash', () => { it('does not produce collision on network filters', () => { - const { networkFilters } = parseList(loadAllLists()); - checkCollisions(networkFilters); + checkCollisions(parseFilters(loadAllLists()).networkFilters); }); it('does not produce collision on requests dataset', () => { // Collect all raw filters - const { networkFilters } = parseList( - requests.map(({ filters }) => filters.join('\n')).join('\n'), + checkCollisions( + parseFilters(requests.map(({ filters }) => filters.join('\n')).join('\n')).networkFilters, ); - checkCollisions(networkFilters); }); it('does not produce collision on cosmetic filters', () => { - const { cosmeticFilters } = parseList(loadAllLists()); - checkCollisions(cosmeticFilters); + checkCollisions(parseFilters(loadAllLists()).cosmeticFilters); }); }); @@ -62,5 +65,122 @@ describe('Utils', () => { 
expect(tokenize('foo/bar')).toEqual(t(['foo', 'bar'])); expect(tokenize('foo-bar')).toEqual(t(['foo', 'bar'])); expect(tokenize('foo.bar')).toEqual(t(['foo', 'bar'])); + expect(tokenize('foo.barƬ')).toEqual(t(['foo', 'barƬ'])); + }); + + it('#hasUnicode', () => { + for (let i = 0; i < 127; i += 1) { + expect(hasUnicode(`foo${String.fromCharCode(i)}`)).toBeFalsy(); + } + + expect(hasUnicode('。◕ ∀ ◕。)')).toBeTruthy(); + expect(hasUnicode('`ィ(´∀`∩')).toBeTruthy(); + expect(hasUnicode('__ロ(,_,*)')).toBeTruthy(); + expect(hasUnicode('・( ̄∀ ̄)・:*:')).toBeTruthy(); + expect(hasUnicode('゚・✿ヾ╲(。◕‿◕。)╱✿・゚')).toBeTruthy(); + expect(hasUnicode(',。・:*:・゜’( ☻ ω ☻ )。・:*:・゜’')).toBeTruthy(); + expect(hasUnicode('(╯°□°)╯︵ ┻━┻)')).toBeTruthy(); + expect(hasUnicode('(ノಥ益ಥ)ノ ┻━┻')).toBeTruthy(); + expect(hasUnicode('┬─┬ノ( º _ ºノ)')).toBeTruthy(); + expect(hasUnicode('( ͡° ͜ʖ ͡°)')).toBeTruthy(); + expect(hasUnicode('¯_(ツ)_/¯')).toBeTruthy(); + }); + + describe('#binSearch', () => { + it('returns -1 on empty array', () => { + expect(binSearch(new Uint32Array(0), 42)).toBe(-1); + }); + + it('handles array of one element', () => { + expect(binSearch(new Uint32Array([1]), 42)).toBe(-1); + expect(binSearch(new Uint32Array([42]), 42)).toBe(0); + }); + + it('handles array of two elements', () => { + expect(binSearch(new Uint32Array([0, 1]), 42)).toBe(-1); + expect(binSearch(new Uint32Array([1, 42]), 42)).toBe(1); + expect(binSearch(new Uint32Array([42, 1]), 42)).toBe(0); + expect(binSearch(new Uint32Array([42, 42]), 42)).not.toBe(-1); + }); + + it('handles no match', () => { + expect(binSearch(new Uint32Array(10000), 42)).toBe(-1); + }); + + it('handles match on first element', () => { + const array = new Uint32Array(10000); + for (let i = 1; i < array.length; i += 1) { + array[i] = 1; + } + expect(binSearch(array, 0)).toBe(0); + }); + + it('handles match on last element', () => { + const array = new Uint32Array(10000); + array[array.length - 1] = 42; + expect(binSearch(array, 
42)).toBe(array.length - 1); + }); + }); + + describe('#updateResponseHeadersWithCSP', () => { + const baseDetails: chrome.webRequest.WebResponseHeadersDetails = { + frameId: 42, + method: 'POST', + parentFrameId: 42, + requestId: '42', + statusCode: 200, + statusLine: '', + tabId: 42, + timeStamp: 0, + type: 'main_frame', + url: 'https://foo.com', + }; + + it('does not update if no policies', () => { + expect(updateResponseHeadersWithCSP(baseDetails, undefined)).toEqual({}); + }); + + it('creates headers if they do not exist', () => { + expect(updateResponseHeadersWithCSP(baseDetails, 'CSP')).toEqual({ + responseHeaders: [{ name: 'content-security-policy', value: 'CSP' }], + }); + }); + + it('create csp header if not exist', () => { + expect(updateResponseHeadersWithCSP({ ...baseDetails, responseHeaders: [] }, 'CSP')).toEqual( + { + responseHeaders: [{ name: 'content-security-policy', value: 'CSP' }], + }, + ); + }); + + it('leaves other headers unchanged', () => { + expect( + updateResponseHeadersWithCSP( + { ...baseDetails, responseHeaders: [{ name: 'header1', value: 'value1' }] }, + 'CSP', + ), + ).toEqual({ + responseHeaders: [ + { name: 'header1', value: 'value1' }, + { name: 'content-security-policy', value: 'CSP' }, + ], + }); + }); + + it('updates existing csp policies', () => { + // Lower-case header name + expect( + updateResponseHeadersWithCSP( + { + ...baseDetails, + responseHeaders: [{ name: 'cOnTeNt-Security-policy', value: 'CSP1' }], + }, + 'CSP', + ), + ).toEqual({ + responseHeaders: [{ name: 'content-security-policy', value: 'CSP; CSP1' }], + }); + }); }); });