This repository contains the source code for our Gold Medal submission to the Beginners' Hypothesis 2026 challenge. Our solution achieves a near-perfect score by decomposing the problem into three orthogonal factors: Semantic Object Classification, Global Matrix Energy (Holt Metric), and Geometric Dependency (Peralta Factor).
🏆 Final Score: 0.9756 (Private Leaderboard)
🚀 Runtime: ~12 Minutes (GPU T4 x2)
Unlike standard approaches that throw a single massive backbone (e.g., EfficientNet-B7) at the problem, we hypothesized that the three target variables were generated by fundamentally different processes. We built three specialized solvers:
| Factor | Target Variable | Type | The Solver (Model) | Key Insight |
|---|---|---|---|---|
| I | `label` | Classification | ConvNeXt Base | Semantic features (object recognition). |
| II | `variable` | Regression | SVD (Linear Algebra) | The variable is the Effective Rank Ratio ($\sigma_1 / \sum_i \sigma_i$). |
| III | `hidden_label` | Classification | ResNet-18 (from scratch) | Geometric dependency between red/blue dots. |
## Factor I: Semantic Object Classification

- Goal: Classify the background object (e.g., "Microscope", "Compass").
- Challenge: The objects are heavily obscured by darkness and noise.
- Solution: ConvNeXt Base (ImageNet weights).
- Why it worked: ConvNeXt's modernized architecture (large kernels, layer norms) excelled at extracting semantic features from low-light, noisy environments where older ResNets struggled.
## Factor II: Global Matrix Energy (Holt Metric)

- Goal: Predict the continuous `variable` ($R^2 = 1.0$).
- Failed Experiments:
  - Deep Learning (CNNs): High RMSE; failed to converge.
  - Matrix Hack (Pixel Ridge Regression): $R^2 \approx 0.66$. Proved the variable wasn't pixel-position dependent.
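For reference, the pixel ridge regression baseline amounts to flattening each image into a pixel vector and fitting a linear model. A minimal sketch with synthetic stand-in data (shapes and values are ours, not the competition's):

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.metrics import r2_score

# Synthetic stand-in data: 200 flattened 32x32 "images" and a random target.
rng = np.random.default_rng(0)
X = rng.random((200, 32 * 32))
y = rng.random(200)

X_train, X_test = X[:150], X[150:]
y_train, y_test = y[:150], y[150:]

# Ridge over raw pixel positions: on random data this explains nothing,
# mirroring why a pixel-position model could not reach R^2 = 1.0.
model = Ridge(alpha=1.0).fit(X_train, y_train)
r2 = r2_score(y_test, model.predict(X_test))
print(r2 <= 1.0)
```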
- The "Eureka" Moment: We analyzed the Singular Value Decomposition (SVD) of the image matrices.
- The Formula:
$$
\text{variable} = \frac{\sigma_1}{\sum_{i=1}^{N} \sigma_i}
$$
- This ratio represents the "Spectral Concentration" of the image matrix—a global property measuring how much information is contained in the first singular value.
- Result: Perfect linear correlation ($R^2 = 1.0000$).
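The Effective Rank Ratio above is a one-liner with NumPy; a minimal sketch (function name is ours):

```python
import numpy as np

def effective_rank_ratio(img: np.ndarray) -> float:
    """Spectral concentration: share of singular mass in sigma_1."""
    s = np.linalg.svd(img.astype(np.float64), compute_uv=False)
    return float(s[0] / s.sum())

# A rank-1 matrix puts all its energy in sigma_1, so the ratio is 1.0;
# the 4x4 identity spreads it evenly over 4 singular values, giving 0.25.
rank1 = np.outer(np.arange(1, 5), np.arange(1, 5)).astype(float)
print(effective_rank_ratio(rank1))        # → 1.0
print(effective_rank_ratio(np.eye(4)))    # → 0.25
```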
## Factor III: Geometric Dependency (Peralta Factor)

- Goal: Classify the hidden label (A, B, C, D) based on the red and blue dots.
- The "Texture Bias" Problem: Pre-trained models (ResNet-50) plateaued at ~96% accuracy because they prioritized texture over geometry.
- Failed Experiments:
  - Hough Circle Transform: Failed due to pixelated noise and non-circular artifacts.
  - YOLOv8: Overkill; detection boxes were unstable on 3x3-pixel dots.
- The Solution:
  - Architecture: ResNet-18 trained from scratch.
  - Geometric Cleaning: We implemented a "Quadrant Split" that zeroed out noise, enforcing the rule that blue dots lie in the top-left quadrant ($\text{Blue} \in$ Top-Left) and red dots in the bottom-right ($\text{Red} \in$ Bottom-Right).
  - Loss Function: We stuck with MSE (Mean Squared Error) over Quartic Error ($\mathrm{Error}^4$) to avoid gradient explosion from outliers.
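One way the Quadrant Split cleaning could be sketched, assuming HxWx3 RGB arrays (the exact masking in the notebook may differ):

```python
import numpy as np

def quadrant_split(img: np.ndarray) -> np.ndarray:
    """Zero out pixels that violate the dot-placement rule.

    Assumes an HxWx3 RGB array: blue evidence (channel 2) is kept only
    in the top-left quadrant, red evidence (channel 0) only in the
    bottom-right quadrant; everything else is treated as noise.
    """
    h, w = img.shape[:2]
    out = np.zeros_like(img)
    out[: h // 2, : w // 2, 2] = img[: h // 2, : w // 2, 2]  # blue: top-left
    out[h // 2 :, w // 2 :, 0] = img[h // 2 :, w // 2 :, 0]  # red: bottom-right
    return out
```

Zeroing out everything outside the legal quadrants removes distractor pixels before the ResNet-18 ever sees them, which is what forces the network to rely on geometry rather than texture.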
*Figure: Correlation plot demonstrating that the variable is a direct function of the matrix's singular values, not a visual feature.*

*Figure: Confusion matrix showing where pre-trained models failed (confusing classes B and C) compared to our geometric-focused model.*
## Requirements

- Python 3.8+
- PyTorch 2.0+
- OpenCV, Scikit-Learn, Pandas
## Quick Start

- Clone the repo:

```bash
git clone https://github.com/GabaSatvik/DSG-BEGINNERS-HYPOTHESIS-2026.git
```

- Download Data: Place the competition data in `./data/`.
- Run the Kernel:

```bash
jupyter notebook solution.ipynb
```
