kingfisher/data/rules/nytimes.yml
Mick Grove 0f953f59a5 pattern_requirements for rules — Post-regex character-class gating to cut false positives without lookarounds. Authors can now require minimum counts of digits, uppercase, lowercase, and special characters, with an optional custom special-char set.
Why: Hyperscan doesn’t support lookaheads/behinds, so many “must contain X and Y” checks had to be baked into the regex (hurting readability) or were impossible. pattern_requirements applies lightweight, in-memory checks after a match is found, keeping patterns fast and clean.
2025-11-04 13:55:31 -05:00

rules:
  - name: New York Times API Key
    id: kingfisher.nytimes.1
    pattern: |
      (?xi)
      (?:nytimes|new[- ]?york[- ]?times)
      (?:.|[\n\r]){0,32}?
      \b
      (
        [a-z0-9_\-=]{32}
      )
      \b
    pattern_requirements:
      min_digits: 2
    min_entropy: 3.2
    confidence: medium
    examples:
      - NYTIMES_API_KEY=abcd1234efgh5678ijkl9012mnop3456
      - '"new-york-times": "zyxw9876vuts5432rqpo1098nmlk7654"'
    references:
      - https://developer.nytimes.com/
    validation:
      type: Http
      content:
        request:
          method: GET
          url: https://api.nytimes.com/svc/topstories/v2/home.json?api-key={{ TOKEN }}
          headers:
            Accept: application/json
          response_matcher:
            - report_response: true
            - type: StatusMatch
              status:
                - 200
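
The rule above pairs a keyword-anchored regex with two post-match gates (`min_digits` and `min_entropy`). The following is a rough Python sketch of that flow, not Kingfisher's actual implementation: the function names `shannon_entropy` and `find_candidates` are hypothetical, and the entropy formula (Shannon entropy in bits per character) is an assumption about how `min_entropy` is evaluated. Python's `re` module stands in for Hyperscan here only to illustrate the matching logic.

```python
import math
import re
from collections import Counter

# The rule's pattern, copied verbatim. (?xi) enables verbose mode and
# case-insensitive matching; whitespace inside [- ] is still significant.
PATTERN = re.compile(
    r"""(?xi)
    (?:nytimes|new[- ]?york[- ]?times)
    (?:.|[\n\r]){0,32}?
    \b
    (
      [a-z0-9_\-=]{32}
    )
    \b
    """
)


def shannon_entropy(s: str) -> float:
    """Shannon entropy in bits per character (assumed metric for min_entropy)."""
    n = len(s)
    return -sum((c / n) * math.log2(c / n) for c in Counter(s).values())


def find_candidates(text: str, min_digits: int = 2, min_entropy: float = 3.2) -> list[str]:
    """Return captured tokens that pass both post-match gates (hypothetical helper)."""
    hits = []
    for m in PATTERN.finditer(text):
        token = m.group(1)
        # Gate 1: mirrors pattern_requirements.min_digits.
        if sum("0" <= ch <= "9" for ch in token) < min_digits:
            continue
        # Gate 2: mirrors min_entropy.
        if shannon_entropy(token) < min_entropy:
            continue
        hits.append(token)
    return hits


if __name__ == "__main__":
    print(find_candidates("NYTIMES_API_KEY=abcd1234efgh5678ijkl9012mnop3456"))
    print(find_candidates("nytimes token: " + "a" * 32))
```

In this sketch the first call returns the 32-character token from the rule's own example, while the second returns an empty list: the run of identical letters matches the regex but contains no digits, which is the kind of false positive the post-match gates are meant to drop.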