This will save.. five comparisons.. occasionally? But it's a more
transparent design, and it seems more charitable to group interface
changes together.
Don't redundantly readIO/parse the rulesets; do this once, lazily
carrying out the involved operations. Rulesets are invariant over
executions.
This improves performance by a few orders of magnitude. Though at some
point we should substitute linear search for lookup on a generalised
suffix tree of rooted domains.
Breaking interface change, though I'll likely restore the old form
soon with IO TH.
I read in the bible (https://www.eff.org/https-everywhere/rulesets) that:
"""
To cover all of a domain's subdomains, you may want to specify a
wildcard target like *.twitter.com. Specifying this type of left-side
wildcard matches any host name with .twitter.com as a suffix, e.g.
www.twitter.com or urls.api.twitter.com. You can also specify a
right-side wildcard like www.google.*. Right-side wildcards, unlike
left-side wildcards, apply only one level deep. So if you want to
cover all countries you'll generally need to specify www.google.*,
www.google.co.*, and www.google.com.* to cover domains like
www.google.co.uk or www.google.com.au.
"""
The previous interpretation is both incorrect (because right wildcards
only apply one level deep) and potentially expensive (regular
expression matching is exponential in the worst-case.)
As 'a6f28e07a1edc8f62f3dfaf7965b3a818c2f4a7f' showed, there may be
breaking changes in the structure of rulesets between releases. I
don't intend to verify that every pair in the product works (is there
reason to be interested in any other than the latest?), so let's not
acommodate any more than one.