
ElasticSearch Scoring Function Recommendations

Date: 2026-01-18

Focus: Improving listing ranking based on engagement data analysis

Executive Summary

Analysis of the current ElasticSearch scoring function reveals significant opportunities for improvement. The current system uses only static signals (verified, video, location) and ignores engagement signals that predict listing quality:

  • Favorite rate has a 0.509 correlation with CVR (strong)
  • From Owner listings have roughly 3x higher CVR (13.95% vs 4.8%)
  • Freshness boost should be stepped, not continuous

Current Scoring Function Analysis

Architecture

The current system rotates through 4 scoring models, switching to the next one every hour:

Hour 1, 5, 9, 13, 17, 21 → VERIFIED model
Hour 2, 6, 10, 14, 18, 22 → REFRESH model
Hour 3, 7, 11, 15, 19, 23 → OWNERS model
Hour 4, 8, 12, 16, 20 → VERIFIED_OFFICES model
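
The schedule above is consistent with picking the model from the hour modulo 4. A minimal sketch, assuming that is how the rotation is implemented; `modelForHour` is a hypothetical helper, and hour 0 (not listed above) is assumed to fall in the VERIFIED_OFFICES slot:

```javascript
// Hypothetical helper: maps an hour of the day to the active scoring model.
// Assumption: the rotation is hour % 4, which reproduces the schedule listed above.
const ROTATION = ['VERIFIED_OFFICES', 'VERIFIED', 'REFRESH', 'OWNERS'];

function modelForHour(hour) {
  return ROTATION[hour % 4];
}

console.log(modelForHour(1));  // VERIFIED
console.log(modelForHour(14)); // REFRESH
console.log(modelForHour(23)); // OWNERS
console.log(modelForHour(20)); // VERIFIED_OFFICES
```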

Current Scoring Formula

score =
  (verified * [5..25]) +
  (pr / factor) +
  (has_video * 5) +
  (user_type * [0..25]) +
  (user_has_sub * [0 or -20]) +
  (user_fee * [0..20]) +
  (project_id > 0 ? 1000 : 0) +       // Mobile only
  (accurate_location ? 0 : -100) +
  (create_time_hour / factor) +
  (boosted ? [50..100] : 0) +
  (ldu / factor) +
  (cdt / factor) +
  (is_precise_location ? 100 : 0)

Bracketed values are per-model weight ranges; the exact weight used by each model is listed under Current Model Weights.

Current Model Weights

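| Factor       | VERIFIED | REFRESH | OWNERS | VERIFIED_OFFICES |
| ------------ | -------- | ------- | ------ | ---------------- |
| verified     | 25       | 5       | 15     | 5                |
| pr           | 10       | 25      | 5      | 25               |
| video        | 5        | 5       | 5      | 5                |
| user_type    | 15       | 15      | 0      | 25               |
| cdt          | 75       | 125     | 75     | 50               |
| user_has_sub | 0        | 0       | -20    | 0                |
| boosted      | 100      | 100     | 50     | 100              |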

Gaps Identified by Data Analysis

1. No Engagement Signals

| Signal                | Current Usage | Data Finding                      | Recommendation         |
| --------------------- | ------------- | --------------------------------- | ---------------------- |
| CTR (impression→view) | ❌ Not used   | Decays 4.6x with age              | Add as ranking signal  |
| CVR (view→contact)    | ❌ Not used   | Stable ~5-6%, varies by quality   | Primary quality signal |
| Favorite rate         | ❌ Not used   | 0.509 correlation with CVR        | High-weight signal     |
| Session rate          | ❌ Not used   | 2.5x higher for high-CVR listings | Medium-weight signal   |
| Share rate            | ❌ Not used   | 3.7x higher for high-CVR listings | Low-weight signal      |
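
The 0.509 figure for favorite rate is presumably a Pearson coefficient computed across listings. A minimal sketch of that computation, assuming per-listing arrays of favorite rate and CVR; the sample values are made up for illustration and are not taken from the analysis:

```javascript
// Pearson correlation between two equal-length numeric arrays.
function pearson(xs, ys) {
  if (xs.length !== ys.length || xs.length < 2) {
    throw new Error('pearson() needs two arrays of equal length >= 2');
  }
  const n = xs.length;
  const meanX = xs.reduce((a, b) => a + b, 0) / n;
  const meanY = ys.reduce((a, b) => a + b, 0) / n;
  let cov = 0;
  let varX = 0;
  let varY = 0;
  for (let i = 0; i < n; i++) {
    const dx = xs[i] - meanX;
    const dy = ys[i] - meanY;
    cov += dx * dy;
    varX += dx * dx;
    varY += dy * dy;
  }
  return cov / Math.sqrt(varX * varY);
}

// Illustrative data only (hypothetical listings).
const favoriteRate = [0.02, 0.05, 0.08, 0.03, 0.10];
const cvr = [0.03, 0.05, 0.09, 0.06, 0.07];
console.log(pearson(favoriteRate, cvr).toFixed(3));
```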

2. Indirect Owner Boost

Current approach: OWNERS model penalizes subscribers (user_has_sub: -20)

Problem: This is indirect and only active 25% of the time.

Data finding: Owner listings have 3x higher CVR (13.95% vs 4.8%)

Recommendation: Explicit from_owner boost in all models.

3. Continuous vs. Stepped Freshness

Current approach: cdt / (todayNumber / weight) - continuous decay

Data finding: New listings show distinct performance tiers:

| Days | Avg VPD | Boost Factor  |
| ---- | ------- | ------------- |
| 0-1  | 33      | 3.5x baseline |
| 2-3  | 23.6    | 2.5x baseline |
| 4-7  | 12.5    | 1.3x baseline |
| 8+   | 9.4     | 1.0x baseline |

Recommendation: Stepped freshness boost for first 7 days.
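
Each boost factor in the table is the tier's average VPD divided by the 8+ day baseline. A small sketch that reproduces the factors from the table's numbers; `TIERS` and `boostFactor` are illustrative names, not existing code:

```javascript
// Average VPD per age tier, copied from the table above.
const TIERS = [
  { maxDay: 1, avgVpd: 33 },
  { maxDay: 3, avgVpd: 23.6 },
  { maxDay: 7, avgVpd: 12.5 },
  { maxDay: Infinity, avgVpd: 9.4 }, // 8+ days: baseline
];
const BASELINE_VPD = TIERS[TIERS.length - 1].avgVpd;

// Returns the tier multiplier relative to the 8+ day baseline.
function boostFactor(daysSinceCreated) {
  const tier = TIERS.find((t) => daysSinceCreated <= t.maxDay);
  return tier.avgVpd / BASELINE_VPD;
}

for (const day of [0, 2, 5, 30]) {
  console.log(day, boostFactor(day).toFixed(1) + 'x');
}
// Output, one pair per line: 0 3.5x, 2 2.5x, 5 1.3x, 30 1.0x
```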

4. Video Weight Too Low

Current: 5 points

Data finding: Video listings get 49% more views

Recommendation: Increase to 15-20 points.


New Fields Required in ElasticSearch

// Add to listing document (computed daily from stats)
{
  "from_owner": true,         // Boolean - true when advertiser_type = 'owner'
  "ctr_7d": 0.25,             // Float 0-1 - CTR over last 7 days
  "cvr_7d": 0.06,             // Float 0-1 - CVR over last 7 days
  "favorite_rate_7d": 0.08,   // Float 0-1 - favorites / views over last 7 days
  "session_rate_7d": 0.22,    // Float 0-1 - 30-sec sessions / views
  "days_since_created": 15    // Integer - for stepped freshness
}

New Weights Model

const SCORES_MODELS_V2 = {
  ENGAGEMENT_BASED: {
    // === Static Signals (existing) ===
    verified: 20,
    pr: 10,
    video: 15, // Increased from 5
    is_precise_location: 100,
    not_accurate_location: -100,
    boosted: 100,

    // === Freshness (new stepped approach) ===
    freshness_0_3_days: 150, // NEW
    freshness_4_7_days: 75, // NEW
    cdt_continuous: 25, // Reduced from 75-125

    // === User Signals ===
    from_owner: 50, // NEW - explicit owner boost
    user_type: 10,
    user_has_sub: 0,
    user_has_fees: 10,

    // === Engagement Signals (NEW) ===
    ctr_7d: 30, // Normalized 0-1
    cvr_7d: 50, // Most important!
    favorite_rate_7d: 40, // Strong quality signal
    session_rate_7d: 20, // Engagement depth
  },
};

New Script Score

{
  query: oldQuery,
  functions: [
    {
      script_score: {
        script: {
          source: `
            double score = 0;

            // === STATIC SIGNALS ===
            score += doc['verified'].value * ${weights.verified};
            score += doc['pr'].value / ${10 / weights.pr};
            score += doc['has_video'].value * ${weights.video};
            // Use the configured weight instead of a hardcoded 100
            score += doc['is_precise_location'].value == true ? ${weights.is_precise_location} : 0;
            score += doc['accurate_location'].value > 0 ? 0 : ${weights.not_accurate_location};
            score += doc['boosted'].value == 1 ? ${weights.boosted} : 0;

            // === STEPPED FRESHNESS BOOST ===
            long daysSinceCreated = doc['days_since_created'].value;
            if (daysSinceCreated <= 3) {
              score += ${weights.freshness_0_3_days};
            } else if (daysSinceCreated <= 7) {
              score += ${weights.freshness_4_7_days};
            }
            // Continuous recency term at reduced weight (applies to all listings)
            score += doc['cdt'].value / ${todayNumber / weights.cdt_continuous};

            // === USER SIGNALS ===
            score += doc['from_owner'].value == true ? ${weights.from_owner} : 0;
            score += doc['user.type'].value * ${weights.user_type};
            score += doc['user.paid'].value > 0 ? ${weights.user_has_sub} : 0;
            score += doc['user.fee'].value * ${weights.user_has_fees};

            // === ENGAGEMENT SIGNALS ===
            // Only apply if listing has sufficient data (>7 days old, >100 impressions).
            // impressions_total is not in the new-field list above and is assumed to be indexed already.
            if (daysSinceCreated > 7 && doc['impressions_total'].value > 100) {
              score += doc['ctr_7d'].value * ${weights.ctr_7d};
              score += doc['cvr_7d'].value * ${weights.cvr_7d};
              score += doc['favorite_rate_7d'].value * ${weights.favorite_rate_7d};
              score += doc['session_rate_7d'].value * ${weights.session_rate_7d};
            }

            // === PROJECT BOOST (mobile only) ===
            score += doc['project_id'].value > 0 ? ${from_web ? 0 : 1000} : 0;

            return Math.round(score);
          `
        }
      },
      weight: 1
    }
  ],
  score_mode: 'sum'
}
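
Because the Painless source is assembled from a template string, it is awkward to unit test directly. One option is a plain JavaScript mirror of the same arithmetic for offline checks. The sketch below assumes a flat listing object whose property names follow the script (with `user.type`, `user.paid`, `user.fee` flattened to `user_type`, `user_paid`, `user_fee`); `scoreListing`, the sample listing, and the `todayNumber` value are hypothetical:

```javascript
// Weights copied from ENGAGEMENT_BASED above.
const weights = {
  verified: 20, pr: 10, video: 15,
  is_precise_location: 100, not_accurate_location: -100, boosted: 100,
  freshness_0_3_days: 150, freshness_4_7_days: 75, cdt_continuous: 25,
  from_owner: 50, user_type: 10, user_has_sub: 0, user_has_fees: 10,
  ctr_7d: 30, cvr_7d: 50, favorite_rate_7d: 40, session_rate_7d: 20,
};

// Hypothetical offline mirror of the script_score arithmetic.
function scoreListing(l, w, todayNumber, fromWeb) {
  let score = 0;
  // Static signals
  score += l.verified * w.verified;
  score += l.pr / (10 / w.pr);
  score += l.has_video * w.video;
  score += l.is_precise_location ? w.is_precise_location : 0;
  score += l.accurate_location > 0 ? 0 : w.not_accurate_location;
  score += l.boosted === 1 ? w.boosted : 0;
  // Stepped freshness plus the reduced continuous term
  if (l.days_since_created <= 3) score += w.freshness_0_3_days;
  else if (l.days_since_created <= 7) score += w.freshness_4_7_days;
  score += l.cdt / (todayNumber / w.cdt_continuous);
  // User signals
  score += l.from_owner ? w.from_owner : 0;
  score += l.user_type * w.user_type;
  score += l.user_paid > 0 ? w.user_has_sub : 0;
  score += l.user_fee * w.user_has_fees;
  // Engagement signals, gated on age and impression volume
  if (l.days_since_created > 7 && l.impressions_total > 100) {
    score += l.ctr_7d * w.ctr_7d;
    score += l.cvr_7d * w.cvr_7d;
    score += l.favorite_rate_7d * w.favorite_rate_7d;
    score += l.session_rate_7d * w.session_rate_7d;
  }
  // Project boost (mobile only)
  score += l.project_id > 0 ? (fromWeb ? 0 : 1000) : 0;
  return Math.round(score);
}

// Made-up listing: 15 days old, verified owner listing with video.
const sampleListing = {
  verified: 1, pr: 10, has_video: 1, is_precise_location: true,
  accurate_location: 1, boosted: 0, days_since_created: 15, cdt: 19985,
  from_owner: true, user_type: 1, user_paid: 0, user_fee: 0,
  impressions_total: 500, ctr_7d: 0.25, cvr_7d: 0.06,
  favorite_rate_7d: 0.08, session_rate_7d: 0.22, project_id: 0,
};
console.log(scoreListing(sampleListing, weights, 20000, false)); // 248
```

Running the mirror on a handful of real documents is a quick way to see how much each group of signals contributes before the script is deployed.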

Implementation Plan

Phase 1: Quick Wins (No New Fields)

Timeline: Can deploy immediately

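| Change                | Current         | New              | Impact                            |
| --------------------- | --------------- | ---------------- | --------------------------------- |
| Video weight          | 5               | 15               | Video listings get 49% more views |
| Add stepped freshness | Continuous only | Add tier bonuses | Better new listing visibility     |

// Add to existing script (no new fields needed)
// Assumes cdt and todayNumber are both day numbers, so their difference is the listing age in days
long daysOld = ${todayNumber} - doc['cdt'].value;
double freshnessBoost = daysOld <= 3 ? 150 : daysOld <= 7 ? 75 : 0;
score += freshnessBoost;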

Phase 2: Add Owner Signal

Timeline: 1-2 days

  1. Add from_owner boolean to ES mapping
  2. Populate from advertiser_type = 'owner' in listings
  3. Add to scoring: score += doc['from_owner'].value ? 50 : 0
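
Steps 1 and 2 could look roughly like the following. The mapping body uses Elasticsearch's standard `boolean` field type; `toIndexedListing`, the shape of the source row, and the `'agent'` value are assumptions for illustration, not the project's actual indexing code:

```javascript
// Body for an Elasticsearch put-mapping request that adds the new field.
const fromOwnerMapping = {
  properties: {
    from_owner: { type: 'boolean' },
  },
};

// Hypothetical indexing helper: derives from_owner from advertiser_type.
function toIndexedListing(row) {
  return { ...row, from_owner: row.advertiser_type === 'owner' };
}

console.log(toIndexedListing({ id: 1, advertiser_type: 'owner' }).from_owner); // true
console.log(toIndexedListing({ id: 2, advertiser_type: 'agent' }).from_owner); // false
```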

Phase 3: Add Engagement Signals

Timeline: 1-2 weeks

  1. Create daily aggregation job:

     SELECT
       id,
       views_7d / NULLIF(impressions_7d, 0) AS ctr_7d,
       contacts_7d / NULLIF(views_7d, 0) AS cvr_7d,
       favorites_7d / NULLIF(views_7d, 0) AS favorite_rate_7d,
       sessions_7d / NULLIF(views_7d, 0) AS session_rate_7d
     FROM listing_stats_7d

  2. Normalize values to 0-1 range (use percentiles within category/district)
  3. Update ES mapping and indexing pipeline
  4. Add engagement signals to scoring formula
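
The normalization in step 2 can be done with a percentile rank inside each group. A minimal sketch, assuming each row carries `category` and `district` keys and a raw metric such as `cvr_7d`; the function name, the row shape, and the tie handling (share of group values less than or equal to the row's value) are illustrative choices:

```javascript
// Percentile rank of each row's metric within its category/district group.
// The result is in (0, 1]: the share of group values <= the row's value.
// Rows with a null metric (from NULLIF) should be filtered out beforehand.
function normalizeByGroup(rows, metric) {
  const groups = new Map();
  for (const row of rows) {
    const key = `${row.category}:${row.district}`;
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(row[metric]);
  }
  return rows.map((row) => {
    const values = groups.get(`${row.category}:${row.district}`);
    const atOrBelow = values.filter((v) => v <= row[metric]).length;
    return { ...row, [metric]: atOrBelow / values.length };
  });
}

// Made-up rows for illustration.
const rows = [
  { id: 1, category: 1, district: 'A', cvr_7d: 0.02 },
  { id: 2, category: 1, district: 'A', cvr_7d: 0.06 },
  { id: 3, category: 1, district: 'A', cvr_7d: 0.14 },
  { id: 4, category: 1, district: 'A', cvr_7d: 0.05 },
];
console.log(normalizeByGroup(rows, 'cvr_7d').map((r) => r.cvr_7d));
// [ 0.25, 0.75, 1, 0.5 ]
```

The per-group filter is quadratic in group size, which is fine for a sketch; a production job would more likely sort each group once or compute the ranks in SQL.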
### Phase 4: Remove Hourly Model Rotation

**Timeline:** After Phase 3 validated

The hourly rotation (VERIFIED → REFRESH → OWNERS → VERIFIED_OFFICES) was likely designed to give different listing types fair exposure. With engagement-based scoring, this becomes unnecessary:

- Good owner listings will rank well due to the `from_owner` boost and high CVR
- Fresh listings will rank well due to the stepped freshness boost
- Verified/quality listings will rank well due to engagement signals

Consider A/B testing removal of the rotation in favor of a single engagement-based model.

---
## Expected Impact

| Metric                   | Current        | Expected After   | Improvement      |
| ------------------------ | -------------- | ---------------- | ---------------- |
| Avg CTR (search results) | ~25%           | ~35%             | +40%             |
| Avg CVR (overall)        | 5.5%           | 6.5%+            | +18%             |
| Owner listing visibility | Varies by hour | Consistent boost | More predictable |
| New listing success rate | Unknown        | Measurable       | Data-driven      |

---
## A/B Test Design

### Test Groups

- **Control:** Current scoring with hourly rotation
- **Treatment:** New scoring with engagement signals

### Success Metrics

1. **Primary:** Overall CVR (contacts / views)
2. **Secondary:**
   - CTR by position (are we showing more relevant listings?)
   - Time to first contact for new listings
   - Owner vs Agent performance gap

### Minimum Sample Size

- ~10,000 searches per group
- ~2 weeks of data for statistical significance
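
As a rough cross-check on the sample size, the standard two-proportion formula can be applied to the primary metric. The sketch below assumes a two-sided alpha of 0.05 and 80% power (neither is specified above) and uses the CVR figures from the Expected Impact table. Note that the unit here is views per group, since CVR is contacts / views, not searches:

```javascript
// Required sample size per group for comparing two proportions.
// Assumed: two-sided alpha = 0.05 (z = 1.96) and power = 0.80 (z = 0.84).
const Z_ALPHA = 1.959964;
const Z_BETA = 0.841621;

function sampleSizePerGroup(p1, p2) {
  const variance = p1 * (1 - p1) + p2 * (1 - p2);
  const effect = p2 - p1;
  return Math.ceil(((Z_ALPHA + Z_BETA) ** 2 * variance) / effect ** 2);
}

// Baseline CVR 5.5% vs expected 6.5%.
console.log(sampleSizePerGroup(0.055, 0.065)); // about 8,850 views per group
```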
---
## Monitoring Queries

### Track Engagement Signal Distribution

```sql
SELECT
  quantile(0.1)(ctr_7d) AS ctr_p10,
  quantile(0.5)(ctr_7d) AS ctr_p50,
  quantile(0.9)(ctr_7d) AS ctr_p90,
  quantile(0.1)(cvr_7d) AS cvr_p10,
  quantile(0.5)(cvr_7d) AS cvr_p50,
  quantile(0.9)(cvr_7d) AS cvr_p90
FROM listing_engagement_metrics
WHERE category = 1
```

### Validate Freshness Boost Effect

```sql
SELECT
  days_bucket,
  count() AS listings,
  avg(impressions_today) AS avg_impressions,
  avg(views_today) AS avg_views
FROM listings_with_new_scoring
GROUP BY days_bucket
ORDER BY days_bucket
```

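## Summary of Recommendations

| Priority | Change                                       | Effort | Impact     |
| -------- | -------------------------------------------- | ------ | ---------- |
| 1        | Add stepped freshness boost                  | Low    | Medium     |
| 2        | Increase video weight (5→15)                 | Low    | Low        |
| 3        | Add from_owner field and boost               | Low    | High       |
| 4        | Add engagement signals (ctr, cvr, favorites) | Medium | High       |
| 5        | Remove hourly model rotation                 | Low    | Medium     |
| 6        | A/B test new scoring                         | Medium | Validation |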

The biggest opportunity is adding engagement signals, particularly favorite_rate, which has a 0.509 correlation with CVR: listings that users favorite tend to convert to contacts more often.