This is a case study in building a local, single-image measurement pipeline with sub-millimeter calibration.
- GitHub: github.com/fengfengxie/ring-sizer
- Live Demo: huggingface.co/spaces/feng-x/ring-sizer

1. Problem Statement
Buying a ring online requires knowing your ring size, but most people don’t. Professional sizing tools (ring sizing kits, sizing strips) are not always accessible. The question: can a single phone photo, with a common reference object, give a reliable ring size recommendation?
The core challenge is deceptively simple: measure the width of a finger in an image. In practice, it involves:
- Scale calibration: converting pixels to real-world units
- Anatomical segmentation: isolating a specific finger from the hand
- Precise edge detection: sub-pixel localization of finger boundaries
- Systematic bias correction: computer-vision measurements consistently overestimate
- Size mapping: translating a physical diameter to a discrete ring size
This project tackles all five, resulting in a fully local pipeline that, under controlled conditions, measures finger diameter to roughly ±0.5 mm (0.60 mm MAE after calibration) and recommends a ring size within ±1 size.
2. System Overview
How It Works
- User places their hand next to a standard credit card on a flat surface
- A single top-down photo is taken (phone camera, with flash on)
- The pipeline detects the card for scale, segments the finger, measures its width, applies calibration, and recommends a ring size
Architecture
The system runs as a 9-phase sequential pipeline:
Image → Quality Check → Card Detection → Hand Segmentation → Contour Extraction
→ Axis Estimation → Zone Localization → Width Measurement → Confidence Score
→ [Calibration] → [Ring Size] → JSON + Overlay PNG
All processing is local (no cloud). The stack is Python 3 with OpenCV, NumPy, MediaPipe, and SciPy.
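The sequential structure can be sketched as a list of phases that share one state dictionary and stop at the first failure. This is a minimal illustration, not the project's actual code: `run_pipeline`, the `Phase` signature, the two stand-in phases, and the `sharpness` key are all hypothetical. Only the `fail_reason` and `scale_px_per_cm` field names come from the JSON output shown later in this article.

```python
from typing import Callable, Optional

# Hypothetical phase signature: takes the shared state, returns a failure
# reason string, or None when the phase succeeded.
Phase = Callable[[dict], Optional[str]]


def run_pipeline(state: dict, phases: list[tuple[str, Phase]]) -> dict:
    """Run phases in order and stop at the first one that reports a failure."""
    state["fail_reason"] = None
    for name, phase in phases:
        reason = phase(state)
        if reason is not None:
            state["fail_reason"] = f"{name}: {reason}"
            break
    return state


# Stand-in phases for illustration only; real phases would do image processing.
def quality_check(state: dict) -> Optional[str]:
    return None if state.get("sharpness", 0.0) > 0.5 else "image too blurry"


def card_detection(state: dict) -> Optional[str]:
    state["scale_px_per_cm"] = 128.0  # placeholder value, not a real detection
    return None


PHASES = [("quality_check", quality_check), ("card_detection", card_detection)]
print(run_pipeline({"sharpness": 0.9}, PHASES))
```

Because each later phase depends on the output of the earlier ones (no scale without a card, no width without a mask), stopping early keeps the failure reason specific.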
Key Design Decisions
| Decision | Choice | Rationale |
|---|---|---|
| Scale reference | Credit card (ISO 7810) | Universally available, known dimensions (85.60 × 53.98 mm) |
| Hand detection | MediaPipe Hands | Pretrained, 21 landmarks, no custom training needed |
| Measurement zone | 15–25% of finger length from palm | Standard ring-wearing position, avoids knuckle |
| Width aggregation | Median of 20 cross-sections | Robust to outlier intersections |
| Edge detection | Sobel gradients with sub-pixel refinement | <0.5px precision vs ~2px for contour-based |
| Bias correction | Linear regression on ground truth | Systematic error, not random; well suited to regression |
| Size mapping | Nearest-match lookup with 2-size range | More robust than regression for n=29 |
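Two of the choices in the table, the 15–25% measurement zone and the median of 20 cross-sections, are easy to illustrate. The sketch below assumes the finger axis is given by two pixel coordinates (palm-side base and fingertip); the function names and the example coordinates are hypothetical.

```python
from statistics import median


def zone_sample_points(base, tip, start=0.15, end=0.25, n=20):
    """Return n points evenly spaced along the base->tip axis.

    start and end are fractions of the finger length measured from the palm,
    matching the 15-25% zone in the table above.
    """
    if n < 2:
        raise ValueError("need at least two cross-sections")
    (bx, by), (tx, ty) = base, tip
    points = []
    for i in range(n):
        t = start + (end - start) * i / (n - 1)
        points.append((bx + t * (tx - bx), by + t * (ty - by)))
    return points


def aggregate_width(widths_px):
    """Median is robust to a few cross-sections that hit a bad edge."""
    return median(widths_px)


centers = zone_sample_points(base=(0.0, 0.0), tip=(0.0, 200.0))
print(len(centers), centers[0], centers[-1])
print(aggregate_width([50, 51, 49, 50, 90]))  # the 90 px outlier does not shift the result
```

A mean over the same five widths would be pulled to 58 px by the single bad intersection, which is the reason the table gives for preferring the median.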
3. Technical Deep Dive
3.1 Scale Calibration
The credit card serves as a known-size reference. Detection uses:
- Adaptive thresholding + contour detection to find rectangular shapes
- Aspect ratio validation (must be ≈1.586, the ISO standard)
- Perspective correction to handle angled views
- Scale factor computation:
px_per_cm = detected_long_edge_px / 8.56
Stability result: Across 20 images from a fixed tripod, the detected scale was 128.48 ± 0.57 px/cm (coefficient of variation 0.44%), which is extremely consistent.
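The arithmetic behind the aspect-ratio check, the scale factor, and the stability figure can be sketched in a few lines. The card dimensions and the `px_per_cm` formula come from the text above; the 5% tolerance, the function names, and the use of the sample standard deviation for the coefficient of variation are assumptions made for this example.

```python
from statistics import mean, stdev

CARD_LONG_CM = 8.56    # ISO 7810 long edge, 85.60 mm
CARD_SHORT_CM = 5.398  # ISO 7810 short edge, 53.98 mm
ISO_ASPECT = CARD_LONG_CM / CARD_SHORT_CM  # about 1.586


def aspect_ratio_ok(long_px, short_px, tolerance=0.05):
    """Accept a rectangle whose aspect ratio is within a relative tolerance.

    The 5% default is a hypothetical value, not taken from the project.
    """
    return abs(long_px / short_px - ISO_ASPECT) / ISO_ASPECT <= tolerance


def px_per_cm(long_edge_px):
    """Scale factor from the detected long edge of the card."""
    return long_edge_px / CARD_LONG_CM


def px_to_cm(width_px, scale_px_per_cm):
    """Convert a pixel width to centimeters using the card-derived scale."""
    return width_px / scale_px_per_cm


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean."""
    return stdev(values) / mean(values)


scale = px_per_cm(1096.0)
print(round(scale, 2))                   # scale for a 1096 px long edge
print(round(px_to_cm(246.0, scale), 3))  # a 246 px wide finger, in cm
```

Applying `coefficient_of_variation` to the per-image scale values from a fixed setup gives the kind of stability number reported above.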
3.2 Finger Segmentation
MediaPipe provides 21 hand landmarks per hand, including 4 per finger (MCP, PIP, DIP, TIP). The pipeline:
- Detects the hand and all landmarks
- Uses wrist → fingertip vector to determine orientation
- Rotates the image to canonical orientation (wrist at bottom)
- Generates a binary mask of the target finger using landmark-guided region growing
- Cleans the mask (morphological operations, area filtering)
The user can specify which finger to measure (--finger-index index|middle|ring), defaulting to the index finger.
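The orientation step reduces to one angle computation. The sketch below assumes image coordinates with y growing downward and takes the wrist and fingertip as plain (x, y) pixel tuples. It deliberately does not call MediaPipe or OpenCV, and both function names are hypothetical. Rotation libraries differ in their sign conventions, so the returned angle would need to be checked against whichever one is used.

```python
import math


def upright_rotation_deg(wrist, fingertip):
    """Angle (degrees) that rotates the wrist->fingertip vector to point up.

    "Up" means negative y, since image rows grow downward. The angle is meant
    for the standard 2D rotation matrix used in rotate() below.
    """
    dx = fingertip[0] - wrist[0]
    dy = fingertip[1] - wrist[1]
    current = math.atan2(dy, dx)
    target = -math.pi / 2
    theta = target - current
    # Wrap into [-180, 180) so the image is never rotated the long way round.
    theta = (theta + math.pi) % (2 * math.pi) - math.pi
    return math.degrees(theta)


def rotate(vec, angle_deg):
    """Apply the standard 2D rotation matrix to a vector."""
    a = math.radians(angle_deg)
    x, y = vec
    return (x * math.cos(a) - y * math.sin(a), x * math.sin(a) + y * math.cos(a))


angle = upright_rotation_deg(wrist=(100.0, 100.0), fingertip=(200.0, 100.0))
print(angle, rotate((100.0, 0.0), angle))  # a finger pointing right ends up pointing up
```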
3.3 Edge Detection: From Contour to Sobel
v0 (Contour-based): Extract the outer contour of the finger mask, then find intersections with perpendicular cross-section lines. Simple but limited to pixel-level precision (~2px resolution).
v1 (Sobel refinement): A significant accuracy improvement:
- ROI extraction around the ring-wearing zone with padding
- Bidirectional Sobel filtering: detects both dark→bright and bright→dark gradients, handling any background
- Mask-constrained edge search: starts from the finger mask boundary (anatomical prior), then searches ±10px for the strongest gradient (sub-pixel precision)
- Parabola fitting for sub-pixel localization: samples gradient at {x-1, x, x+1}, fits f(x) = ax² + bx + c, peak at x = -b/(2a). Achieves <0.5px precision (~0.03 mm at typical resolution)
- Outlier filtering using Median Absolute Deviation (MAD): removes measurements >3 MAD from median
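The last two steps have compact closed forms. For three samples centered on the integer peak, the vertex x = -b/(2a) of the fitted parabola becomes an offset of 0.5 · (g₋₁ − g₊₁) / (g₋₁ − 2g₀ + g₊₁) from the center sample. The sketch below shows that formula and a 3-MAD filter. The function names and the handling of the degenerate cases (zero curvature, zero MAD) are assumptions for this example.

```python
from statistics import median


def subpixel_peak_offset(g_prev, g_peak, g_next):
    """Offset of the parabola vertex from the center sample, in pixels.

    g_prev, g_peak, g_next are gradient magnitudes at x-1, x, x+1.
    """
    denom = g_prev - 2.0 * g_peak + g_next
    if denom == 0:  # no curvature: keep the integer position
        return 0.0
    return 0.5 * (g_prev - g_next) / denom


def mad_filter(values, k=3.0):
    """Keep values within k median absolute deviations of the median."""
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:  # all deviations collapse to zero: nothing to reject against
        return list(values)
    return [v for v in values if abs(v - med) <= k * mad]


# Gradient samples taken from a parabola whose true peak sits at x = 10.3.
x_peak = 10 + subpixel_peak_offset(3.31, 4.91, 4.51)
print(round(x_peak, 3))

# One cross-section hit a bad edge; the filter drops it before aggregation.
print(mad_filter([1.80, 1.82, 1.81, 1.79, 2.40]))
```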
The system uses auto mode by default: attempts Sobel, validates quality (gradient strength, consistency >50%, width reasonableness), and falls back to contour if quality is insufficient.
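The auto-mode decision can be sketched as a simple gate over the three quality checks. Only the 50% consistency threshold comes from the text. Treating "consistency" as the share of cross-sections with a usable edge pair is an interpretation, and the `SobelQuality` type, the gradient threshold, and the plausible-width range are hypothetical placeholders.

```python
from dataclasses import dataclass


@dataclass
class SobelQuality:
    mean_gradient: float   # average gradient magnitude at the detected edges
    valid_fraction: float  # share of cross-sections with a usable edge pair
    width_cm: float        # width measured from the Sobel edges


def choose_edge_method(q, min_gradient=20.0, min_valid_fraction=0.5,
                       width_range_cm=(1.0, 3.0)):
    """Return "sobel" when every check passes, otherwise fall back to "contour".

    min_gradient and width_range_cm are illustrative values, not project settings.
    """
    low, high = width_range_cm
    passed = (
        q.mean_gradient >= min_gradient
        and q.valid_fraction > min_valid_fraction
        and low <= q.width_cm <= high
    )
    return "sobel" if passed else "contour"


print(choose_edge_method(SobelQuality(45.0, 0.9, 1.9)))  # strong, consistent edges
print(choose_edge_method(SobelQuality(45.0, 0.4, 1.9)))  # too few valid cross-sections
```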
3.4 Calibration: Correcting Systematic Bias
Even with sub-pixel edge detection, the pipeline systematically overestimates finger width. This is expected: the image-based measurement captures the outermost visible edge (including some skin reflection/glow), while a caliper measures the physical boundary.
Data collection: 10 subjects × 3 fingers × 2 photos = 60 measurements, with caliper ground truth.
Key finding: The over-measurement is systematic (+8.8% mean), not random. Shot-to-shot noise (0.028 cm) is much smaller than the bias (0.158 cm), making linear regression ideal.
Calibration model:
actual_diameter = 0.7921 × measured_diameter + 0.2503
Derived via ordinary least squares on 60 paired measurements. Leave-one-person-out cross-validation confirms generalization:
| Metric | Before Calibration | After Calibration | Improvement |
|---|---|---|---|
| MAE | 0.158 cm (1.58 mm) | 0.060 cm (0.60 mm) | 62% |
| RMSE | 0.176 cm | 0.075 cm | 57% |
| Max error | 0.347 cm | 0.174 cm | 50% |
The calibration coefficients are stored in src/calibration.json and applied automatically as a post-processing step.
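The calibration and its validation can be sketched with the standard library alone. The coefficients are the ones quoted above, and diameters are assumed to be in centimeters, which is consistent with the units in the table. The function names, the sample layout, and the demo data are hypothetical. The demo data is synthetic and exactly linear, so it only exercises the code and says nothing about real accuracy.

```python
from statistics import mean

SLOPE, INTERCEPT = 0.7921, 0.2503  # coefficients quoted in the text


def apply_calibration(measured_cm, slope=SLOPE, intercept=INTERCEPT):
    """Map a raw measured diameter to a calibrated diameter."""
    return slope * measured_cm + intercept


def fit_ols(x, y):
    """Ordinary least squares for y = slope * x + intercept."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def leave_one_person_out_mae(samples):
    """samples: list of (person_id, measured_cm, caliper_cm) tuples.

    Each person is held out in turn, the line is refit on everyone else,
    and the held-out absolute errors are pooled into one MAE.
    """
    errors = []
    for held_out in sorted({p for p, _, _ in samples}):
        train = [(m, a) for p, m, a in samples if p != held_out]
        slope, intercept = fit_ols([m for m, _ in train], [a for _, a in train])
        errors += [abs(slope * m + intercept - a)
                   for p, m, a in samples if p == held_out]
    return mean(errors)


print(round(apply_calibration(1.92), 4))

# Synthetic, noise-free demo data (not the study's measurements).
demo = [(p, m, 0.8 * m + 0.25) for p, m in
        [("A", 1.7), ("A", 1.9), ("B", 1.8), ("B", 2.0), ("C", 2.1), ("C", 2.2)]]
print(leave_one_person_out_mae(demo) < 1e-9)
```

Holding out whole people rather than single photos matters here: two photos of the same finger are highly correlated, so a per-photo split would let the model see the test subject during fitting.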
3.5 Ring Size Mapping
The final step maps calibrated diameter to China standard ring sizes (sizes 6–13, inner diameters 16.9–22.7 mm).
The fundamental challenge: finger diameter at the wearing zone does not perfectly predict ring size. Ring sizing depends on the knuckle (which the ring slides over), soft tissue compressibility, and personal preference. Even ground-truth caliper measurements deviate from the size chart by ±0.73 mm.
Method: Nearest-match lookup with 2-size range.
calibrated_width = 18.81 mm
→ Nearest size: 8 (18.6 mm, Δ=0.21)
→ Second nearest: 9 (19.4 mm, Δ=0.59)
→ Output: "Best match 8, recommended 8-9"
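The lookup in the worked example above can be sketched as a sort by distance. The chart below is deliberately partial: it holds only the diameters this article states (sizes 8 and 9, plus the endpoints of the 6–13 range, assuming those endpoints correspond to 16.9 mm and 22.7 mm). A real chart would list every size. The function name is hypothetical. The output keys mirror the `ring_size` object in the JSON result.

```python
# Partial chart: only values stated in this article, not the full standard.
PARTIAL_CHART_MM = {6: 16.9, 8: 18.6, 9: 19.4, 13: 22.7}


def recommend_size(diameter_mm, chart=PARTIAL_CHART_MM):
    """Return the nearest size and the range spanned by the two nearest sizes."""
    ranked = sorted(chart.items(), key=lambda item: abs(item[1] - diameter_mm))
    best, second = ranked[0][0], ranked[1][0]
    return {
        "best_match": best,
        "range_min": min(best, second),
        "range_max": max(best, second),
    }


print(recommend_size(18.81))  # reproduces the worked example: best 8, range 8-9
```

With a complete chart the two nearest entries are always adjacent sizes, so the range never spans more than two sizes.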
Validation (29 person-finger pairs with known ring sizes):
| Finger | N | Pearson r | Exact Match | 2-Size Range Hit |
|---|---|---|---|---|
| Index | 10 | 0.919 | 60% | 90% |
| Middle | 10 | 0.955 | 50% | 70% |
| Ring | 9 | 0.787 | 44% | 78% |
| All | 29 | 0.863 | 52% | 79% |
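The index finger is the most reliable (90% range hit rate). The ring finger shows the weakest correlation (r = 0.787) and the lowest exact-match rate (44%), attributed to higher cross-section variability, although the middle finger has the lowest range hit rate (70%).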
4. Confidence Scoring
Every measurement includes a confidence score (0–1) computed from four weighted components:
| Component | Weight | What It Measures |
|---|---|---|
| Card detection | 25% | Aspect ratio match, corner quality, scale reasonableness |
| Finger segmentation | 25% | Mask area, contour quality, landmark presence |
| Edge quality | 20% | Gradient strength, consistency, smoothness, symmetry |
| Measurement | 30% | Width variance, sample count, outlier ratio |
Confidence levels: HIGH (>0.85), MEDIUM (0.6–0.85), LOW (<0.6). The system warns when confidence falls below a configurable threshold (default 0.7).
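A minimal sketch of how the score and levels fit together, assuming the four components are each scored in 0–1 and combined as a plain weighted sum. The weights, level boundaries, and default warning threshold are the ones listed above; the dictionary keys and function names are hypothetical.

```python
# Weights from the table above; the key names are illustrative.
WEIGHTS = {"card": 0.25, "finger": 0.25, "edge": 0.20, "measurement": 0.30}


def overall_confidence(components):
    """Weighted sum of per-component scores, each expected in [0, 1]."""
    return sum(WEIGHTS[name] * components[name] for name in WEIGHTS)


def confidence_level(score):
    if score > 0.85:
        return "HIGH"
    if score >= 0.6:
        return "MEDIUM"
    return "LOW"


def needs_warning(score, threshold=0.7):
    """The threshold is configurable; 0.7 is the default stated in the text."""
    return score < threshold


score = overall_confidence(
    {"card": 0.9, "finger": 0.95, "edge": 0.8, "measurement": 0.9}
)
print(round(score, 4), confidence_level(score), needs_warning(score))
```

Note that the warning threshold sits inside the MEDIUM band, so a MEDIUM result may or may not trigger a warning.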
5. Input & Output Details
JSON Result
{
  "finger_outer_diameter_cm": 1.78,
  "confidence": 0.91,
  "scale_px_per_cm": 128.03,
  "ring_size": {
    "best_match": 8,
    "range_min": 8,
    "range_max": 9,
    "diameter_mm": 17.80
  },
  "raw_diameter_cm": 1.92,
  "calibration_applied": true,
  "edge_method_used": "sobel",
  "fail_reason": null
}
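A caller might turn this result into a one-line message as follows. This is a sketch of a hypothetical consumer, not part of the project. It uses only the field names shown above, and it assumes that `fail_reason` holds a string when a measurement fails and that `ring_size` may be absent in that case.

```python
import json


def summarize(result_json, warn_below=0.7):
    """Turn a pipeline result into a short human-readable message."""
    result = json.loads(result_json)
    if result["fail_reason"] is not None:
        # Return early: other fields may be missing after a failure.
        return f"Measurement failed: {result['fail_reason']}"
    size = result["ring_size"]
    text = (f"Best match {size['best_match']}, "
            f"recommended {size['range_min']}-{size['range_max']}")
    if result["confidence"] < warn_below:
        text += " (low confidence, consider retaking the photo)"
    return text


example = """
{
  "confidence": 0.91,
  "ring_size": {"best_match": 8, "range_min": 8, "range_max": 9, "diameter_mm": 17.80},
  "fail_reason": null
}
"""
print(summarize(example))
```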
Visual Overlay
Every measurement produces an annotated PNG showing:
- Credit card contour and detected corners
- Finger contour, axis, and ring-wearing zone
- Individual edge detection points (left/right)
- Width measurement lines
- Calibrated diameter, ring size recommendation, and confidence
Input Requirements
- Resolution: 1080p or higher
- View angle: near top-down
- One hand with fingers extended
- Credit card visible on the same plane
- Good lighting (flash recommended to eliminate shadows)
6. Lessons Learned
1. Systematic bias dominates random noise. The pipeline’s ±1.6 mm error was 85% bias, 15% noise. A simple linear regression removed most of it. The takeaway: before adding complexity (better models, more features), check if the error is systematic.
2. Ground truth data is the highest-ROI investment. Collecting 10 people’s caliper measurements (a few hours of work) enabled both calibration and validation. Without it, we’d be guessing at accuracy.
3. Sub-pixel edge detection matters less than calibration. v1’s Sobel refinement improved theoretical precision from ~2px to <0.5px. But the calibration in v2 reduced actual error by 3× more. Precision without accuracy is misleading.
4. Ring size ≠ finger diameter. The wearing-zone diameter is a proxy, not a direct predictor. Knuckle size, tissue compressibility, and personal preference all influence the right size. Outputting a range rather than a single number is both more honest and more useful.
5. Controlled conditions matter enormously. Our ±0.5 mm accuracy is achieved with a tripod, flash, white background, and co-planar card. Real-world phone photos will perform worse. Being explicit about conditions prevents overpromising.
7. Limitations
- Sample size: n=30 (10 subjects × 3 fingers). Sufficient for detecting the systematic bias pattern but not for fine-grained per-finger models.
- Demographics: All subjects are from one geographic region. Finger morphology varies across populations.
- Controlled setup: Accuracy claims assume tripod, flash, white background, co-planar card. Hand-held photos at arbitrary angles will degrade performance.
- Single-view measurement: A top-down photo captures width but not depth. The finger cross-section is elliptical, and we only measure the major axis.
- Size chart: Currently China standard only (sizes 6–13). Other standards (US, EU, UK) would need additional lookup tables.
8. Accuracy Summary
| Metric | Value | Condition |
|---|---|---|
| Diameter MAE | 0.60 mm | Calibrated, controlled setup, n=30 |
| Diameter RMSE | 0.75 mm | Same |
| Diameter max error | 1.74 mm | Same |
| Ring size exact match | 52% | All fingers |
| Ring size 2-size range hit (±1) | 79% | All fingers |
| Ring size 2-size range hit (±1, index only) | 90% | Best-performing finger |
| Card detection CV | 0.44% | Scale stability across 20 images |
| Shot-to-shot repeatability | 0.28 mm | Same subject, two photos |
Bottom line: Under controlled conditions, the system measures finger diameter to ±0.5 mm and recommends ring size to ±1 size, with the index finger being the most reliable target.