Most B2B engagement scoring is theatre. It counts opens. It counts clicks. It tags every contact who downloaded the gated PDF as "hot." Then VP Sales asks how the "hot" contact list converts to meetings, and nobody has a clean answer because the score isn't actually predicting anything.
I built engagement scoring twice at previous B2B scale-ups, and both times the same thing happened: the formula was a weighted sum of the wrong signals, the score correlated with nothing, and the sales team learned to ignore it. So when we built boxli's scoring, we worked backwards from a question I would actually ask of the data: which of these contacts is most likely to take a meeting in the next 14 days?
The formula we ship by default
The score is computed inline by fn_calc_signal_score() and surfaced in the contact_engagement_rollup view. Weights live in the signal_score_weights singleton table and are tunable per-deployment. The defaults:
signal_score =
times_engaged * 1.0
+ min(total_duration_sec / 60, 15) * 2.0
+ places_shared * 3.0
+ days_active * 0.5
+ (5.0 if days_since_last_scan <= 7 else 0)
- (3.0 if days_since_last_scan > 45 else 0)

That formula encodes five convictions about what predicts a meeting.
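Before walking through them, here is the same arithmetic on a hypothetical contact, written out in Python against the default weights. The numbers are illustrative, not from a real send; in production the computation lives in fn_calc_signal_score().

# Hypothetical contact: 4 engagements, 5 minutes of watch time,
# 1 pass-around location, active on 3 distinct days, last scan 2 days ago.
times_engaged = 4
total_duration_sec = 300
places_shared = 1
days_active = 3
days_since_last_scan = 2

signal_score = (
    times_engaged * 1.0                              # 4.0
    + min(total_duration_sec / 60, 15) * 2.0         # 10.0
    + places_shared * 3.0                            # 3.0
    + days_active * 0.5                              # 1.5
    + (5.0 if days_since_last_scan <= 7 else 0)      # 5.0
    - (3.0 if days_since_last_scan > 45 else 0)      # 0.0
)
# signal_score == 23.5, comfortably past the default 15-point alert threshold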
Conviction 1: pass-around is worth three raw engagements
Each distinct place_id that reports an event for the same send is worth three points. If a recipient carries the box from their home office to HQ, that one event is worth three regular scans. We considered higher weights (much higher, because pass-around detection is the cleanest leading indicator we model), but set the bar at the trigger threshold instead, so a single pass-around event still promotes the contact into the alerting band without distorting the rest of the score.
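To make that concrete with hypothetical numbers (default weights, default 15-point trigger), here is a contact sitting just under the alert line who gets pushed over it by one pass-around event:

# Hypothetical: 2 engagements, 3 minutes watched, active on 2 days, scanned yesterday.
base = 2 * 1.0 + min(180 / 60, 15) * 2.0 + 2 * 0.5 + 5.0   # 14.0, just under the 15-point trigger
with_pass_around = base + 1 * 3.0                           # 17.0, into the alerting band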
Conviction 2: duration matters more than frequency, capped
Three 30-second plays are worse than one 90-second play. The recipient who sat through 90 seconds actually watched; the triple-30 recipient paused and never came back. So duration enters the formula in minutes (total_duration_sec / 60), weighted at 2.0: one minute of watch time is worth two raw engagements. Crucially, duration is capped at 15 minutes via duration_cap_min so a single rewatched-five-times outlier doesn't drown out everything else on a contact.
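A quick illustration of the cap in isolation, with the default duration_cap_min of 15 (the helper name is mine, not the library's):

def duration_term(total_duration_sec, cap_min=15, weight=2.0):
    # Minutes watched, capped, then weighted.
    return min(total_duration_sec / 60, cap_min) * weight

duration_term(90)     # 3.0  points for a 90-second watch
duration_term(600)    # 20.0 points for a full 10-minute video
duration_term(3600)   # 30.0 points; an hour of replays scores the same as 15 minutes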
Conviction 3: recency is a step function, not a decay
Most engagement scoring uses an exponential time-decay. We tried it. It punishes early signals when they should be celebrated. A scan from yesterday and a scan from 30 days ago are fundamentally different events — not points on a smooth curve.
Our recency bonus is a single step: +5 points if the most recent scan is within the recency_window_days (default 7), zero otherwise. We tested a finer-grained 24-hour vs. 7-day split and it didn't add predictive lift; the simpler step function won.
Conviction 4: stale signals should actively decay
A contact whose last scan is 60 days old isn't neutral: they're a colder lead than a contact who never scanned at all, because their initial signal pulled them onto your list and the silence since then is informative. So we apply an explicit staleness_penalty of 3 points after staleness_window_days (default 45). It's the only negative term in the formula and it's deliberately small, nudging the score, not crashing it.
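Taken together, convictions 3 and 4 make the time dimension a three-step piecewise term rather than a curve. A minimal sketch with the default windows (recency_window_days = 7, staleness_window_days = 45); the function name is mine:

def recency_term(days_since_last_scan, recency_window=7, staleness_window=45):
    if days_since_last_scan <= recency_window:
        return 5.0    # fresh signal: flat bonus
    if days_since_last_scan > staleness_window:
        return -3.0   # stale signal: small explicit penalty
    return 0.0        # in between: neutral, no smooth decay

recency_term(2)    # +5.0
recency_term(30)   #  0.0
recency_term(60)   # -3.0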
Conviction 5: days-active is a low-weight signal that compounds
We add a tiny 0.5-per-day-active term. By itself it's noise. Across a 30-day pilot it tops out at 15 points, rewarding the contact who keeps coming back without overweighting any single replay.
What the score does not include
Three things we explicitly chose not to weight:
- QR redirect destination CTR. Every recipient lands on the personalized page; the click is a function of NFC tap, not intent. We track it for analytics but it doesn't score.
- Time-of-day. We thought recipients who engaged outside of business hours would convert better. They don't. Time-of-day stratifies in the analytics dashboard but doesn't score.
- Box design. We considered weighting recipients of higher-spec designs (premium gift, custom-printed brochure) more than recipients of a lighter design. Empirically the design doesn't affect close rate per signal — what matters is whether the signal fired, not which design carried it.
Tuning it for your funnel
The weights are stored in signal_score_weights as a singleton config row, currently editable through the service role only. (An admin UI is on the roadmap. For now, ask us on the demo and we'll tune yours by hand.)
Four knobs matter most for your funnel (a sketch of two tuned profiles follows the list):
- Pass-around weight (w_places_shared). Default 3. If you sell into individual contributors (not a buying committee), drop it to 1.5. Pass-around is less predictive when the buyer is the user.
- Duration cap (duration_cap_min). Default 15 minutes. If you ship 10–12 minute videos at the high end, raise it to 20. If your videos are 90 seconds, drop it to 5 so a single replay doesn't dominate the score.
- Staleness window. Default 45 days. Shorter sales cycles (event follow-up, inbound) want this at 21 days; longer enterprise deals want 90.
- Threshold for the signal_threshold trigger. Default fires at 15. Lower it to 10 if you want a noisier hot-list; raise it to 20 for ABM-only motions where you only want exceptional sends to alert.
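The sketch below shows two hypothetical tuning profiles that mirror those recommendations. The key names follow the knobs named in this post; they are not necessarily the actual columns of signal_score_weights.

# Hypothetical profiles; key names are guesses at the config shape, values mirror the list above.
EVENT_FOLLOW_UP = {
    "w_places_shared": 1.5,        # individual-contributor buyers: pass-around less predictive
    "duration_cap_min": 5,         # 90-second videos: don't let a single replay dominate
    "staleness_window_days": 21,   # short cycle: leads go cold faster
    "signal_threshold": 10,        # a noisier hot-list is acceptable here
}

ENTERPRISE_ABM = {
    "w_places_shared": 3.0,        # buying committees: keep the default pass-around weight
    "duration_cap_min": 20,        # 10-12 minute videos at the high end
    "staleness_window_days": 90,   # long enterprise cycle: stay warm longer
    "signal_threshold": 20,        # only exceptional sends should alert
}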
What we use it for
The score drives the default signal_threshold trigger (Slack alert at 15+ score) and surfaces in the HubSpot contact property boxli_signal_score. Build views off it. Sort sequences off it. Use it to break ties when AEs are deciding which contact to chase first.
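One way to build a view off it: a minimal sketch that pulls the hot-list from HubSpot through the CRM v3 search API, filtering on the boxli_signal_score property. It assumes a private-app token with contact read scope; the token value is a placeholder.

import requests

# Contacts at or above the default 15-point alert threshold, hottest first.
resp = requests.post(
    "https://api.hubapi.com/crm/v3/objects/contacts/search",
    headers={"Authorization": "Bearer YOUR_PRIVATE_APP_TOKEN"},
    json={
        "filterGroups": [{"filters": [
            {"propertyName": "boxli_signal_score", "operator": "GTE", "value": "15"}
        ]}],
        "sorts": [{"propertyName": "boxli_signal_score", "direction": "DESCENDING"}],
        "properties": ["firstname", "lastname", "email", "boxli_signal_score"],
        "limit": 25,
    },
    timeout=10,
)
for contact in resp.json().get("results", []):
    props = contact["properties"]
    print(props.get("email"), props.get("boxli_signal_score"))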
The score is a tool, not an answer. It tells you which contact to call first; it doesn't tell you what to say. That's still on your reps. But getting them to call the right contact first is most of the job.
If you want to see how this looks on your contacts, apply for a pilot — we'll wire it into your CRM on the kickoff call.