Scoring — Combiner
mercury.match.evaluate(link: BankLink, company: MercuryCompany) -> LinkResult
Weighted sum
Weights:
| Field | Weight | Constant |
|---|---|---|
| Name | 2.5 | NAME_WEIGHT |
| Email | 1.5 | EMAIL_WEIGHT |
| Phone | 1.5 | PHONE_WEIGHT |
Threshold:
| Constant | Value | Comparison |
|---|---|---|
| MATCH_THRESHOLD | 2.5 | total >= threshold → Match |
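The weighted sum and threshold comparison described by the two tables can be sketched as follows. This is a minimal illustration, not the actual body of `mercury.match.evaluate`: the `combine` function and the `Verdict` enum shown here are assumed names, and the per-field scores are taken as already computed values in [0, 1].

```python
from enum import Enum

# Constants copied from the weight and threshold tables.
NAME_WEIGHT = 2.5
EMAIL_WEIGHT = 1.5
PHONE_WEIGHT = 1.5
MATCH_THRESHOLD = 2.5


class Verdict(Enum):
    # Hypothetical stand-in for the project's own Verdict type.
    MATCH = "Match"
    MISMATCH = "Mismatch"


def combine(name_score: float, email_score: float, phone_score: float) -> tuple[float, Verdict]:
    """Return (weighted total, verdict) for three per-field scores in [0, 1]."""
    total = (
        NAME_WEIGHT * name_score
        + EMAIL_WEIGHT * email_score
        + PHONE_WEIGHT * phone_score
    )
    # The comparison is inclusive: a total exactly at the threshold is a Match.
    verdict = Verdict.MATCH if total >= MATCH_THRESHOLD else Verdict.MISMATCH
    return total, verdict
```

Because the comparison is `>=`, `combine(1.0, 0.0, 0.0)` yields a total of 2.5 and a Match, while `combine(0.0, 0.0, 1.0)` yields 1.5 and a Mismatch.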
Invariants
| Signal shape | Total | Verdict | Example |
|---|---|---|---|
| Three strong agreements | 5.5 | Match | Link 1 |
| Full name + phone | 4.0 | Match | Links 3, 7 |
| Full name + email | 4.0 | Match | Link 5 |
| Full name alone | 2.5 | Match | Link 6 (nickname-resolved) |
| Partial name + phone | 2.75 | Match | Link 8 |
| Phone alone | 1.5 | Mismatch | Link 2 |
| Email alone | 1.5 | Mismatch | — |
| Partial name alone | 1.25 | Mismatch | Link 9 |
| Nothing agrees | 0.0 | Mismatch | Links 4, 9 |
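The verdicts in this table can be re-derived mechanically from the weights, which makes them easy to pin down in a regression test. The sketch below is one way to do that under two assumptions: a "partial" name agreement scores 0.5 (inferred from the 1.25 total for partial name alone), and a full agreement on any field scores 1.0. The names `CASES`, `verdict_for`, and `check_invariants` are illustrative and not part of the codebase.

```python
WEIGHTS = {"name": 2.5, "email": 1.5, "phone": 1.5}
MATCH_THRESHOLD = 2.5

# (signal shape, per-field scores, expected verdict); absent fields score 0.
# Assumption: partial name agreement is 0.5, full agreement is 1.0.
CASES = [
    ("Three strong agreements", {"name": 1.0, "email": 1.0, "phone": 1.0}, "Match"),
    ("Full name + phone", {"name": 1.0, "phone": 1.0}, "Match"),
    ("Full name + email", {"name": 1.0, "email": 1.0}, "Match"),
    ("Full name alone", {"name": 1.0}, "Match"),
    ("Partial name + phone", {"name": 0.5, "phone": 1.0}, "Match"),
    ("Phone alone", {"phone": 1.0}, "Mismatch"),
    ("Email alone", {"email": 1.0}, "Mismatch"),
    ("Partial name alone", {"name": 0.5}, "Mismatch"),
    ("Nothing agrees", {}, "Mismatch"),
]


def verdict_for(scores: dict) -> str:
    """Apply the weighted sum and the inclusive threshold comparison."""
    total = sum(WEIGHTS[field] * score for field, score in scores.items())
    return "Match" if total >= MATCH_THRESHOLD else "Mismatch"


def check_invariants() -> list:
    """Return the signal shapes whose computed verdict differs from the table."""
    return [shape for shape, scores, expected in CASES
            if verdict_for(scores) != expected]
```

An empty list from `check_invariants()` means every row's verdict still follows from the current weights and threshold. Any future rebalancing that flips a row would show up by name.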
Rationale for the weights
The rebalanced weights reflect a specific judgment: a full first-and-last-name match is two tokens of bio evidence on a highly specific signal, so it clears the threshold by itself (its total of 2.5 exactly equals MATCH_THRESHOLD, and the comparison is >=). Every narrower signal (phone alone, email alone, first name only, surname only) needs a second source of agreement.
This is a simplified expression of the Fellegi–Sunter intuition: signals with high discriminating power (a large log(m/u) ratio) should dominate the decision statistic. See matching-approach for the F&S comparison.
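For readers unfamiliar with the m/u notation, here is a small sketch of the standard Fellegi-Sunter agreement weight. Here m is the probability that a field agrees when two records truly match, and u is the probability that it agrees when they do not. The function name `agreement_weight` is illustrative, and the hand-set weights above are not claimed to be derived this way.

```python
import math


def agreement_weight(m: float, u: float) -> float:
    """Fellegi-Sunter agreement weight, log2(m / u).

    m: P(field agrees | records are a true match)
    u: P(field agrees | records are not a match)
    """
    if not (0.0 < m <= 1.0 and 0.0 < u <= 1.0):
        raise ValueError("m and u must be probabilities in (0, 1]")
    return math.log2(m / u)
```

For a fixed m, a field that rarely agrees by chance (small u) gets a larger weight than one that often agrees by chance. A field with m equal to u carries no evidence and gets weight 0. This is the sense in which a highly specific signal should dominate the total.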
LinkResult

    from dataclasses import dataclass

    @dataclass(frozen=True)
    class LinkResult:
        linkId: int
        verdict: Verdict    # Match | Mismatch
        name_score: float   # [0, 1]
        email_score: float  # [0, 1]
        phone_score: float  # [0, 1]
        total: float        # weighted sum, [0, 5.5]

The CLI prints only linkId and verdict. The component scores are kept on the result so that a future caller can emit graded output or audit per-field contributions without rescoring.
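As an illustration of the audit use case, the sketch below recovers each field's weighted contribution from a stored result. It uses a stand-in `LinkResult` with `verdict` simplified to a string so that the block is self-contained. The `contributions` helper is hypothetical, and the sample scores are chosen to mirror the "Partial name + phone" row, not taken from real data.

```python
from dataclasses import dataclass

# Weights copied from the weight table.
NAME_WEIGHT, EMAIL_WEIGHT, PHONE_WEIGHT = 2.5, 1.5, 1.5


@dataclass(frozen=True)
class LinkResult:
    # Stand-in mirroring the documented fields; verdict is a plain string here.
    linkId: int
    verdict: str
    name_score: float
    email_score: float
    phone_score: float
    total: float


def contributions(result: LinkResult) -> dict[str, float]:
    """Return each field's weighted share of the total, without rescoring."""
    return {
        "name": NAME_WEIGHT * result.name_score,
        "email": EMAIL_WEIGHT * result.email_score,
        "phone": PHONE_WEIGHT * result.phone_score,
    }


# Illustrative values only: partial name (0.5) plus a full phone agreement.
sample = LinkResult(linkId=8, verdict="Match", name_score=0.5,
                    email_score=0.0, phone_score=1.0, total=2.75)
parts = contributions(sample)
```

Here `parts` splits the total into 1.25 from the name and 1.5 from the phone. Neither clears the threshold alone, which is the reason this link needs both signals.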