β

CSS selector simplifier regular expression in Java

Peterbe.com 22 阅读

The Problem

I'm working on a project where it needs to evaluate CSS as a string. Basically, it compares CSS selectors against a DOM to see if the CSS selector is used in the DOM.

But CSS has pseudo classes . A common one a lot of people are familiar with is: a:hover { text-decoration: crazy } . So that :hover part is not relevant when evaluating the CSS selector against the DOM. So you chop off the :hover bit and is left with a which you can then look for in the DOM.

But there are some tricks and make this less trivial. Consider, this from Bootstrap 3

a[href^="#"]:after,
a[href^="javascript:"]:after {
    content: "";
}

In this case we can't simply split on a : character.

Another non-trivial example comes from Semantic UI :

.ui[class*="4:3"].embed {
  padding-bottom: 75%;
}
.ui[class*="16:9"].embed {
  padding-bottom: 56.25%;
}
.ui[class*="21:9"].embed {
  padding-bottom: 42.85714286%;
}

Basically, if you just split the selectors (e.g. a:hover ) on the first : and keep everything to the left (e.g. a ), with these non-trivial CSS selectors you'd get this:


a[href^="javascript

and


.ui[class*="4

etc. These CSS selectors will fail. Both Firefox and Chrome seem to swallow any errors but cheerio will raise a SyntaxError and not just that but the problem is that the CSS selector is just the wrong one to look for.

The Solution

The solution has to be to split by the : character when it's not between two quotation marks.

This Stackoverflow post helped me with the regex. It was trivial to extend now my final solution looks like this:

/**
 * Reduce a CSS selector to be without any pseudo class parts.
 * For example, from 'a:hover' return 'a'. And from 'input::-moz-focus-inner'
 * to 'input'.
 * Also, more advanced ones like 'a[href^="javascript:"]:after' to
 * 'a[href^="javascript:"]'.
 * The last example works too if the input was 'a[href^='javascript:']:after'
 * instead (using ' instead of ").
 *
 * @param {string} selector
 * @return {string}
 */
const reduceCSSSelector = selector => {
  return selector.split(
    /:(?=([^"'\\]*(\\.|["']([^"'\\]*\\.)*[^"'\\]*['"]))*[^"']*$)/g
  )[0]
}

Extra; About regexes

I've been coding for about 20 years and would like to think I know my way around writing regular expressions in various languages. However, I'm also eager to admit that I often fumble and rely on googling/stackoverflow more than actually understanding what the heck I'm doing. That's why I found this comment so amusing:

Thank you! Didn't think it was possible. I understand 100% of the theory, about 60% of the regex, and I'm down to 0% when it comes to writing it on my own. Oh, well, maybe one of these days. – Azmisov

作者:Peterbe.com
Peter Bengtssons's personal homepage about little things that concern him.
原文地址:CSS selector simplifier regular expression in Java, 感谢原作者分享。

发表评论