How We Calculate
Complete transparency on our puzzle insights algorithms
Puzzle Difficulty
How hard is each puzzle? We compare your time to your personal baseline to find out.
Skill Profile
How skilled are you? We rank you against other solvers on each puzzle, weighted by difficulty and recency.
MSP Rating Ladder
Portfolio-based competitive rating. Your best results from the last 24 months, difficulty-weighted with time decay.
1 Player Baseline
Each player's baseline is the weighted median of their first-attempt solo times for a specific piece count. Recent solves count more thanks to exponential decay weighting.
Time decay with 3-month plateau
Solves from the last 3 months count at full weight. After that, older solves gradually lose influence so your baseline reflects your current speed.
effective_age = max(0, age_in_months - 3)
weight = exp(-effective_age / 13.5)
The baseline is a weighted median of your first-attempt solo times at each piece count. Recent solves push the median more than old ones.
Minimum: 5 distinct puzzles solved (first attempt, solo) per piece count.
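The baseline computation above can be sketched as follows; `decay_weight` and `weighted_median` are hypothetical names, and the solve data is invented for illustration.

```python
import math

def decay_weight(age_months: float, plateau: float = 3.0, tau: float = 13.5) -> float:
    """Exponential time decay with a flat 3-month plateau for recent solves."""
    effective_age = max(0.0, age_months - plateau)
    return math.exp(-effective_age / tau)

def weighted_median(values, weights):
    """Value at which the cumulative weight first reaches half the total."""
    pairs = sorted(zip(values, weights))
    half = sum(weights) / 2.0
    cumulative = 0.0
    for value, weight in pairs:
        cumulative += weight
        if cumulative >= half:
            return value
    return pairs[-1][0]

# Hypothetical first-attempt solo times (minutes) at one piece count,
# paired with each solve's age in months
solves = [(52, 1), (58, 2), (61, 8), (55, 4), (70, 20)]
baseline = weighted_median([t for t, _ in solves],
                           [decay_weight(age) for _, age in solves])
```

Because old solves carry fractional weight, a recent cluster of times pulls the median toward your current speed.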
Exotic piece counts
For puzzles with unusual piece counts (e.g. 631 pieces), most players won't have 5 solves at that exact count. We solve this by estimating their baseline from nearby piece counts they do have data for, using log-space interpolation. This means more puzzles get difficulty ratings.
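A minimal sketch of how such log-space interpolation could work, assuming linear interpolation of log(time) against log(piece count) between the two nearest counts with data; the function name and the numbers are illustrative, not the production code.

```python
import math

def interpolate_baseline(count: int, known: dict) -> float:
    """Estimate a baseline for an unusual piece count by linear
    interpolation in log-log space between the nearest known counts."""
    lower = max((c for c in known if c <= count), default=None)
    upper = min((c for c in known if c >= count), default=None)
    if lower is None or upper is None:
        raise ValueError("piece count outside known range")
    if lower == upper:
        return known[lower]
    # Interpolate log(time) linearly in log(piece count)
    t = (math.log(count) - math.log(lower)) / (math.log(upper) - math.log(lower))
    log_time = (1 - t) * math.log(known[lower]) + t * math.log(known[upper])
    return math.exp(log_time)

baselines = {500: 55.0, 1000: 130.0}  # minutes, hypothetical
estimate = interpolate_baseline(631, baselines)
```

Interpolating in log space reflects that solve time grows roughly multiplicatively, not linearly, with piece count.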
2 Puzzle Difficulty
For each qualifying player who solved a puzzle, we compute a difficulty index:
Your baseline: 55 min
You solved this puzzle in: 63 min
Difficulty index: 63 / 55 = 1.15
→ 15% harder than average for you
difficulty_index = solve_time / player_baseline
puzzle_difficulty = median(all player indices)
The puzzle's difficulty score is the median of all qualifying indices. Indices above 5.0 are excluded as outliers.
Minimum: 5 qualified players needed for a difficulty score.
For each player with a baseline at this piece count:
difficulty_index = seconds_to_solve / player_baseline
Qualification:
- First attempt only (one per player per puzzle)
- Solo, non-suspicious, completed
- Index must be <= 5.0 (outlier ceiling)
puzzle_difficulty = median(all qualifying indices)
Minimum: 5 indices from different players
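The qualification rules above can be collected into a short sketch; function and variable names are hypothetical.

```python
from statistics import median

def puzzle_difficulty(solve_seconds, player_baselines,
                      ceiling=5.0, min_indices=5):
    """Median difficulty index over qualifying players,
    or None when fewer than min_indices players qualify."""
    indices = []
    for player, seconds in solve_seconds.items():
        baseline = player_baselines.get(player)
        if baseline is None:
            continue  # no baseline at this piece count
        index = seconds / baseline
        if index <= ceiling:  # outlier ceiling
            indices.append(index)
    if len(indices) < min_indices:
        return None
    return median(indices)

# Six first-attempt times against a shared 100-second baseline;
# player f's index of 6.0 exceeds the ceiling and is discarded
times = {"a": 90, "b": 100, "c": 115, "d": 120, "e": 130, "f": 600}
score = puzzle_difficulty(times, {p: 100.0 for p in times})
```

Using the median rather than the mean keeps one unusually slow (or fast) qualifying solve from skewing the score.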
Difficulty Tiers
3 Player Skill
Player skill measures how you perform compared to other solvers on each puzzle, weighted by puzzle difficulty. Harder puzzles are worth more.
Example: How your skill score is calculated
You solved Puzzle X in 45 min, 3 months ago.
Out of 50 first-attempt solvers, 38 were slower — your percentile = 0.78
Puzzle X is Challenging (difficulty 1.18) — difficulty weight = 1.09
Weighted percentile: 0.78 × 1.09 = 0.85
Solve is 3 months old (within plateau) — age weight = 1.00
Your skill score = weighted median of all such entries, where recent solves have more influence.
Skill recency weighting
Your skill tier reflects your current ability. Recent puzzles have more influence than old ones via a gentle time decay with a 6-month plateau.
effective_age = max(0, age_months - 6)
age_weight = exp(-effective_age / 24)
Minimum: 10 qualifying puzzles, each with at least 20 first-attempt solvers.
For each puzzle with 20+ first-attempt solvers:
percentile = (slower_count + tied_count / 2) / (total_solvers - 1)
confidence = min(1.0, sample_size / 50)
blend = 0.5 × confidence
difficulty_weight = (1 - blend) + blend × puzzle_difficulty
weighted_percentile = percentile × difficulty_weight
effective_age = max(0, age_in_months - 6)
age_weight = exp(-effective_age / 24)
skill_score = weighted_median(weighted_percentiles, age_weights)
Tier = based on percentile rank among all players
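The per-puzzle steps above, applied to the worked example (38 of 50 solvers slower, difficulty 1.18, solved 3 months ago), might look like this sketch; names are illustrative.

```python
import math

def skill_entry(slower, tied, total, difficulty, age_months):
    """One (weighted_percentile, age_weight) pair for a single puzzle."""
    percentile = (slower + tied / 2) / (total - 1)
    confidence = min(1.0, total / 50)
    blend = 0.5 * confidence
    difficulty_weight = (1 - blend) + blend * difficulty
    age_weight = math.exp(-max(0.0, age_months - 6) / 24)
    return percentile * difficulty_weight, age_weight

def weighted_median(values, weights):
    """Value at which the cumulative weight first reaches half the total."""
    pairs = sorted(zip(values, weights))
    half = sum(weights) / 2.0
    cumulative = 0.0
    for value, weight in pairs:
        cumulative += weight
        if cumulative >= half:
            return value
    return pairs[-1][0]

# Worked example from the text
wp, aw = skill_entry(slower=38, tied=0, total=50, difficulty=1.18, age_months=3)
# skill_score = weighted_median(all weighted percentiles, all age weights)
```

The confidence blend pulls the difficulty weight toward 1.0 on small samples, so thinly rated puzzles cannot swing the score much.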
Skill Tiers
4 MSP Rating Ladder
A competitive portfolio-based rating. Your best puzzle results from the last 24 months are evaluated, combining first-attempt performance (75%) with best-time performance (25%). Harder puzzles are worth more. Recent results carry more weight through gentle time decay.
First attempts (75%)
Solving a puzzle cold carries the most weight — it's the purest measure of skill. Your top 100 first-attempt results are evaluated.
Best times (25%)
Your fastest time on each puzzle counts too — genuine improvement through practice is rewarded, but at lower weight than cold solving.
Rating recency weighting
Portfolio entries are gently decayed by age so recent results matter more, while older strong results still count. A 3-month plateau keeps very recent results at full value.
effective_age = max(0, age_months - 3)
decay = exp(-effective_age / 30)
decayed_points = points × decay
Hard cutoff at 24 months — solves older than that are excluded entirely.
Example: Cold Solver vs Grinder
Same Hard puzzle (difficulty 1.35, weight 1.175), solved last month (decay = 1.0):
Player A: Strong first attempt (92nd percentile), never re-solves
First-attempt points: 0.92 × 1.175 × 1.0 = 1.08
Player B: Weak first attempt (60th percentile), grinds to a 95th-percentile best time
Blended score: 0.75 × 0.70 + 0.25 × 1.12 = 0.81
Player A wins decisively — raw cold-solving talent leads, but B gets partial credit for improvement.
Anti-gaming properties
- Grinding same puzzle: Only affects one best-time entry (25% weight). First-attempt score is locked.
- Volume accumulation: Cap of 100. After 100 puzzles, more solves only help if better than your weakest entry.
- Easy puzzle farming: Difficulty weighting makes hard puzzles worth more at the same percentile.
- Inactivity: Time decay gradually reduces older results. Stay active to maintain your rating.
Minimum: 20 first attempts + 50 qualifying puzzles within a 24-month rolling window. Each puzzle must have 20+ solvers.
Rolling window: 24 months. Decay: 3-month plateau.
For each puzzle in window (20+ public solvers):
fa_percentile = rank among first-attempt solvers (0–1)
bt_percentile = rank among all players' best times (0–1)
confidence = min(1.0, sample_size / 50)
blend = 0.5 × confidence
difficulty_weight = (1 - blend) + blend × difficulty
effective_age = max(0, age_months - 3)
decay = exp(-effective_age / 30)
fa_points = fa_percentile × difficulty_weight × decay
bt_points = bt_percentile × difficulty_weight × decay
First-attempt portfolio = mean(top 100 fa_points)
Best-time portfolio = mean(top 100 bt_points)
Rating = 0.75 × FA_portfolio + 0.25 × BT_portfolio
Entry: 20 first attempts + 50 total solves within window
Display: rating × 1000 (e.g., 0.85 → 850)
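The pipeline above can be condensed into a sketch. How portfolios with fewer than 100 entries are averaged is an assumption here (mean over the entries present), and all names are illustrative.

```python
import math

def msp_rating(entries, top_n=100):
    """entries: (fa_percentile, bt_percentile, difficulty, sample_size,
    age_months) per puzzle. Returns the 0-1 rating; x1000 for display."""
    fa_points, bt_points = [], []
    for fa, bt, difficulty, n, age in entries:
        if age > 24:
            continue  # hard cutoff: outside the 24-month window
        blend = 0.5 * min(1.0, n / 50)
        weight = (1 - blend) + blend * difficulty
        decay = math.exp(-max(0.0, age - 3) / 30)
        fa_points.append(fa * weight * decay)
        bt_points.append(bt * weight * decay)
    # Assumption: fewer than top_n entries -> mean over what exists
    fa_top = sorted(fa_points, reverse=True)[:top_n]
    bt_top = sorted(bt_points, reverse=True)[:top_n]
    fa_portfolio = sum(fa_top) / len(fa_top)
    bt_portfolio = sum(bt_top) / len(bt_top)
    return 0.75 * fa_portfolio + 0.25 * bt_portfolio

# Single Hard puzzle (difficulty 1.35, 50 solvers), solved a month ago
rating = msp_rating([(0.92, 0.92, 1.35, 50, 1)])
```

Keeping only the top 100 points per portfolio is what makes volume accumulation unproductive: a new solve changes nothing unless it beats your weakest counted entry.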
5 Time Prediction
We use a layered prediction system. If you've solved the puzzle before, you get a personal prediction using your improvement history and data-driven improvement ratios. Otherwise, you get a statistical prediction based on your baseline and the puzzle's difficulty.
Improvement ratios
We compute improvement ratios — how much faster players typically get between consecutive attempts (1st→2nd, 2nd→3rd, etc.). These are computed from real data across all players and puzzles, stratified by the time gap between attempts.
ratio(N→N+1) = median(time_attempt_N+1 / time_attempt_N)
Sources (priority order):
1. Your personal ratio (from all your repeat solves)
2. Global ratio (from all players, per piece count)
3. Default: 0.90
Gap correction: your personal ratio is adjusted by global gap-bucket data to account for memory fading.
Time gap between attempts affects how much you remember. Ratios are bucketed by gap duration:
| Bucket | Range |
|---|---|
| lt30d | < 30 days |
| 1_3m | 30–89 days |
| 3_12m | 90–364 days |
| gt12m | 365+ days |
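The bucket boundaries translate directly into a small lookup, sketched here with a hypothetical function name.

```python
def gap_bucket(gap_days: int) -> str:
    """Map the time gap between consecutive attempts to a ratio bucket."""
    if gap_days < 30:
        return "lt30d"
    if gap_days < 90:
        return "1_3m"
    if gap_days < 365:
        return "3_12m"
    return "gt12m"
```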
Personal prediction (1+ prior solves)
When you've solved the puzzle before, we predict your next time using improvement ratios blended with Holt's damped trend smoothing. With fewer prior solves, the improvement ratio dominates. With more history, Holt's trend takes over.
ratio_prediction = last_time × improvement_ratio(N→N+1, gap)
Blending with Holt's damped trend (α=0.5, β=0.4, φ=0.8):
N = 1: 100% ratio prediction
N = 2-5: blend (Holt's weight = (N-1)/5)
N ≥ 6: 100% Holt's damped trend
Floor: max(predicted, best_time × 0.70)
The blending ensures smooth transitions: with few data points, cross-puzzle improvement patterns guide the prediction. As puzzle-specific history grows, the trend model takes over.
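One way the blend could be implemented, assuming a standard Holt's damped-trend recursion with the stated parameters (α=0.5, β=0.4, φ=0.8); the initialization (level = first time, trend = 0) and function names are assumptions, not the production code.

```python
def holt_damped_forecast(times, alpha=0.5, beta=0.4, phi=0.8):
    """One-step-ahead forecast from Holt's damped trend smoothing."""
    level, trend = times[0], 0.0  # assumed initialization
    for t in times[1:]:
        prev_level = level
        level = alpha * t + (1 - alpha) * (prev_level + phi * trend)
        trend = beta * (level - prev_level) + (1 - beta) * phi * trend
    return level + phi * trend

def blended_prediction(times, improvement_ratio):
    """Blend the ratio prediction with Holt's forecast by history length."""
    n = len(times)
    ratio_pred = times[-1] * improvement_ratio
    if n == 1:
        prediction = ratio_pred
    else:
        holt_weight = min(1.0, (n - 1) / 5)
        prediction = ((1 - holt_weight) * ratio_pred
                      + holt_weight * holt_damped_forecast(times))
    return max(prediction, min(times) * 0.70)  # floor at 70% of best time
```

With two prior solves of 60 and 54 minutes and a 0.9 improvement ratio, the ratio prediction (48.6) still carries 80% of the weight, so the blended estimate lands near 50 minutes.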
Statistical prediction (first-time puzzle)
When you haven't solved this puzzle before, we combine your baseline with the puzzle's difficulty:
predicted_time = player_baseline × puzzle_difficulty
Your 500pc baseline: 55 min
Puzzle difficulty: 1.15 (Challenging)
Predicted time: 55 × 1.15 = ~63 min
Range uses P25–P75 of the puzzle's difficulty indices: baseline × P25 to baseline × P75. Safety bounds keep the range between 30% and 300% of the predicted time.
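As a sketch, assuming the P25/P75 values are difficulty indices multiplied by the baseline (the example index values are hypothetical):

```python
def statistical_prediction(baseline, difficulty, p25_index, p75_index):
    """First-time prediction with a P25-P75 range and safety bounds."""
    predicted = baseline * difficulty
    low = max(baseline * p25_index, predicted * 0.30)   # floor: 30% of prediction
    high = min(baseline * p75_index, predicted * 3.00)  # ceiling: 300% of prediction
    return predicted, low, high

# Worked example: 55-minute baseline, difficulty 1.15,
# hypothetical P25/P75 indices of 0.95 and 1.40
predicted, low, high = statistical_prediction(55, 1.15, 0.95, 1.40)
```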
6 Puzzle Personality Metrics
Memorability
How much easier does this puzzle get once you've solved it before?
Measures how quickly players learn this puzzle across their first 3 attempts, normalized against the average puzzle learning rate. A value > 1.0 means this puzzle rewards familiarity more than average. Min: 8 players with 3+ attempts.
learning_rate = (attempt_1 - attempt_3) / attempt_1
memorability = puzzle_learning_rate / global_learning_rate
Skill Sensitivity
Does skill make a big difference on this puzzle?
Ratio of 75th to 25th percentile difficulty indices. Higher = bigger gap between fast and slow solvers. Min: 20 qualifying indices.
skill_sensitivity = P75(indices) / P25(indices)
Predictability
How predictable is this puzzle's difficulty?
Score from 0 to 1 (bounded). Closer to 1 = more consistent difficulty experience across different solvers. Min: 20 qualifying indices.
CV = std_dev(indices) / mean(indices)
predictability = 1 / (1 + CV)
Box Dependence
How much harder is this puzzle without seeing the box?
Ratio of unboxed to boxed difficulty. 1.0 = no difference, 2.0 = twice as hard without box. Min: 10 unboxed + 5 boxed solvers.
box_dependence = median(unboxed_indices) / median(boxed_indices)
Improvement Ceiling
How much can practice improve your time on this puzzle?
Ratio of median first-attempt time to 10th percentile of all attempts. Higher means practice dramatically improves times. Min: 20 solvers.
improvement_ceiling = P50(first_attempts) / P10(all_attempts)
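Two of these metrics, Skill Sensitivity and Predictability, follow directly from their formulas; a sketch with illustrative names, using population standard deviation as one reasonable reading of std_dev:

```python
from statistics import mean, pstdev, quantiles

def skill_sensitivity(indices):
    """P75 / P25 of difficulty indices: the gap between slow and fast solvers."""
    p25, _, p75 = quantiles(indices, n=4)
    return p75 / p25

def predictability(indices):
    """1 / (1 + CV), bounded in (0, 1]: closer to 1 = more consistent."""
    cv = pstdev(indices) / mean(indices)
    return 1 / (1 + cv)
```

A puzzle everyone experiences identically has CV = 0 and predictability 1.0; any spread in the indices pushes the score below 1.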
7 Confidence & Data Quality
Confidence Levels
Data Quality
- Solo only: Duo and team solves are excluded
- First attempts: Only first encounters with each puzzle count for difficulty and baselines
- Outlier filter: Difficulty indices above 5.0 are excluded (likely timer errors)
- Suspicious exclusion: Times flagged as suspicious are excluded
- Recency weighting: Recent solves count more — baselines, skill tiers, and Rating all use time decay to reflect your current ability
Minimum data thresholds
| Player Baseline | 5 distinct first-attempt solo solves per piece count |
| Puzzle Difficulty | 5 qualifying indices from different players |
| Player Skill | 10 qualifying puzzles with 20+ first-attempt solvers each |
| MSP Rating Ladder | 20 first attempts + 50 total solves within 24-month window |
| Memorability | 8 players with 3+ attempts |
| Skill Sensitivity / Predictability | 20 qualifying indices |
| Box Dependence | 10 unboxed + 5 boxed solvers with baselines |
| Improvement Ceiling | 20 solvers |
8 Opting Out
Players can opt out of ranking and skill tracking in their profile settings. Opted-out players are still included in all background calculations — their results contribute to puzzle difficulty scores and other players' rankings. They simply don't appear on the public MSP Rating ladder and don't see their own skill tier or Rating. Opting back in restores full visibility immediately.