When Scalability Breaks Reality: Lessons from Building the Odds Importer

Technology Stack
| Component | Technology | Purpose |
|---|---|---|
| Backend | Ruby on Rails | Core data processing framework |
| Message Queue | AWS SQS | Distributed task processing |
| Database | PostgreSQL | Primary data storage |
| Cache | Redis | Fast data access and temporary storage |
| Load Balancer | AWS ELB | Traffic distribution across servers |
| Monitoring | AWS CloudWatch | System metrics and alerting |
| Data Format | JSON/XML | Input feed processing |
| Infrastructure | AWS EC2 Auto Scaling | Dynamic server provisioning |
| API Layer | RESTful Rails | Frontend data delivery |
| Processing Pipeline | Custom ETL | Real-time odds transformation |
I used to think scalability was the ultimate goal: that if a system could scale horizontally, it could handle anything.
I was wrong.
When we built the Odds Importer for OddsTrader, scalability was the north star. We designed it so we could spin up 10, 20, even 30 servers to process odds feeds in parallel. It worked beautifully, in theory. But what I didn't anticipate was how data context could quietly destroy that perfect design.
The Architecture That Looked Perfect
We started with two core principles:
- Horizontal Scalability: Each server could take a chunk of data and process it independently.
- Sub-Second Updates: Once the feed was downloaded, updates needed to propagate to the frontend in under a second.
The problem? Some XML feeds were massive, 30 MB or more. So we decided to split them into small JSON "lines" that could be processed independently across multiple servers.
Example: The Input Feed
```json
{
  "sport": "NFL",
  "gameId": "NE-NYJ-2025-11-01",
  "matchup": {
    "home": "NE",
    "away": "NYJ",
    "startTime": "2025-11-01T18:00:00Z"
  },
  "markets": [
    {
      "market": "Point Spread",
      "odds": [
        { "team": "NE", "spread": -5, "price": -110 },
        { "team": "NYJ", "spread": 5, "price": -110 }
      ]
    },
    {
      "market": "Total Points",
      "odds": [
        { "type": "Over", "line": 42.5, "price": -110 },
        { "type": "Under", "line": 42.5, "price": -110 }
      ]
    }
  ]
}
```
We'd split this into multiple independent chunks:
```json
[
  {
    "gameId": "NE-NYJ-2025-11-01",
    "market": "Point Spread",
    "team": "NE",
    "spread": -5,
    "price": -110
  },
  {
    "gameId": "NE-NYJ-2025-11-01",
    "market": "Point Spread",
    "team": "NYJ",
    "spread": 5,
    "price": -110
  },
  {
    "gameId": "NE-NYJ-2025-11-01",
    "market": "Total Points",
    "type": "Over",
    "line": 42.5,
    "price": -110
  },
  {
    "gameId": "NE-NYJ-2025-11-01",
    "market": "Total Points",
    "type": "Under",
    "line": 42.5,
    "price": -110
  }
]
```
That denormalization made scaling simple. Each record was atomic, or so we thought.
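The split itself was trivial to express. A minimal Ruby sketch of the flattening step, assuming a feed hash shaped like the example above (the method name `flatten_feed` is illustrative, not our production code):

```ruby
# Flattens a nested feed into independent, per-outcome records that can
# be queued to any worker. Each record carries just enough identity
# (gameId, market) to be written on its own.
def flatten_feed(feed)
  feed["markets"].flat_map do |market|
    market["odds"].map do |odd|
      { "gameId" => feed["gameId"], "market" => market["market"] }.merge(odd)
    end
  end
end

feed = {
  "gameId" => "NE-NYJ-2025-11-01",
  "markets" => [
    { "market" => "Point Spread",
      "odds" => [
        { "team" => "NE",  "spread" => -5, "price" => -110 },
        { "team" => "NYJ", "spread" => 5,  "price" => -110 }
      ] }
  ]
}

chunks = flatten_feed(feed)
# Each chunk can now be processed on any server, in any order.
```

Note what the flattening throws away: nothing in a single chunk says how many siblings it has, which is exactly the context problem described next.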
💡 The Scalability Trap: Making data artificially atomic can destroy the very relationships that give it meaning.
The Real-World Failure: Futures and Missing Context
The design assumed every record could live on its own. But in sports data, context is everything.
For typical two-outcome markets, like point spreads, this was fine; both sides update nearly simultaneously. But in futures markets, the data behaves differently.
Here's an example:
```json
{
  "market": "Super Bowl Winner",
  "odds": [
    { "team": "NE", "price": 300 },
    { "team": "KC", "price": 500 },
    { "team": "NYJ", "price": 50000 }
  ]
}
```
When the Jets are eliminated, their line disappears. So the next update might look like this:
```json
{
  "market": "Super Bowl Winner",
  "odds": [
    { "team": "NE", "price": 250 },
    { "team": "KC", "price": 450 }
  ]
}
```
If you process this line-by-line across 20 servers, you'll never know the Jets were removed, only that they didn't update.
The system can't tell the difference between "missing data" and "removed market."
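Detecting the difference requires comparing whole snapshots, which is exactly what chunking discards. A sketch of that comparison in Ruby, assuming you still have the previous full market payload (in our chunked pipeline, no single worker did):

```ruby
# Detects removals by diffing the team sets of two consecutive
# snapshots of the same market. This only works if you keep the
# whole market together; atomic per-team records cannot answer it.
def removed_teams(previous, current)
  prev_teams = previous["odds"].map { |o| o["team"] }
  curr_teams = current["odds"].map { |o| o["team"] }
  prev_teams - curr_teams
end

previous = { "market" => "Super Bowl Winner",
             "odds" => [{ "team" => "NE",  "price" => 300 },
                        { "team" => "KC",  "price" => 500 },
                        { "team" => "NYJ", "price" => 50_000 }] }
current  = { "market" => "Super Bowl Winner",
             "odds" => [{ "team" => "NE", "price" => 250 },
                        { "team" => "KC", "price" => 450 }] }

removed_teams(previous, current)  # => ["NYJ"]
```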
"Horizontal scaling amplifies inconsistency. When your architecture assumes atomic updates, but your data is contextual, scaling only makes the wrong behavior happen faster." – Hard-learned lesson from the trenches
This wasn't a theory problem; it was a reality that hit us in production with millions of users watching.
We had feeds that handled this gracefully by marking removed teams explicitly:
```json
{ "team": "NYJ", "status": "off", "market": "Super Bowl Winner" }
```
But most didn't. Our "perfectly scalable" importer suddenly required manual intervention. We ended up building an internal admin tool so operators could mark eliminated teams manually, the opposite of scalable.
When Data Arrives Out of Sync
The second problem was even trickier: live statistics.
We wanted to show real-time play-by-play (score, time, down, distance) all updating automatically. But each stat came through a different feed, updated at different times.
Example Feed Sequence
```
// Update 1
{ "quarter": 1, "time": "10:25", "down": 1, "distance": 10, "yardLine": "NE 25" }
// Update 2 (time only)
{ "quarter": 1, "time": "10:15" }
// Update 3 (yards only)
{ "yardLine": "NE 30" }
// Update 4 (final state)
{ "quarter": 1, "time": "10:15", "down": 1, "distance": 5, "yardLine": "NE 30" }
```
Each update looked valid, but the frontend would render the in-between states: the clock changed but the yard line didn't; the yard line changed but the down didn't. It looked like the game was breaking physics.
We couldn't rewrite the importer without breaking everything else. So we grouped related stats and forced the frontend to wait until all fields in that group had been updated before showing changes.
```json
"updateGroups": {
  "playState": ["quarter", "time", "down", "distance", "yardLine"]
}
```
It was hacky, but it worked, mostly. Performance tanked, and we paid that cost for years.
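The gating logic amounted to buffering partial updates until every field in a group had arrived since the last render. A simplified Ruby sketch of that idea (the class and method names here are hypothetical, not the real frontend code):

```ruby
# Buffers partial stat updates and only releases a merged snapshot once
# every field in the group has been seen since the last release. Until
# then, the caller keeps showing the previous consistent state.
class GroupedUpdater
  def initialize(fields)
    @fields  = fields
    @pending = {}
    @seen    = []
  end

  # Returns the merged snapshot when the group is complete, else nil.
  def apply(update)
    @pending.merge!(update)
    @seen |= update.keys
    return nil unless (@fields - @seen).empty?
    @seen = []
    @pending.dup
  end
end

play_state = GroupedUpdater.new(%w[quarter time down distance yardLine])
play_state.apply("quarter" => 1, "time" => "10:25", "down" => 1,
                 "distance" => 10, "yardLine" => "NE 25")  # complete: snapshot
play_state.apply("time" => "10:15")      # partial: nil, keep showing old state
play_state.apply("yardLine" => "NE 30")  # still partial: nil
```

The latency cost is visible right in the design: nothing renders until the slowest feed in the group reports in, which is where the 300ms+ penalty came from.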
| Problem | Our "Solution" | Real Cost |
|---|---|---|
| Missing context in futures | Manual admin tool | Hours of daily operator work |
| Out-of-sync live stats | Grouped updates | 300ms+ latency penalty |
| Scale complexity | More monitoring | 4 AM debugging sessions |
⚠️ Technical Debt Reality: What works "for now" usually becomes tomorrow's bottleneck. These quick fixes compounded into architectural constraints that lasted years.
The Broader Lesson
On paper, horizontal scalability solves throughput. In reality, it amplifies inconsistency.
When your architecture assumes atomic updates, but your data is contextual, scaling only makes the wrong behavior happen faster.
If I Were Rebuilding It Today
If I rebuilt the importer now, I'd treat related odds and stats as transactional groups β processed together, versioned together, and retired together.
Something like:
```json
{
  "batchId": "sb2025-001",
  "market": "Super Bowl Winner",
  "timestamp": "2025-11-01T20:00:00Z",
  "records": [
    { "team": "NE", "price": 250 },
    { "team": "KC", "price": 450 },
    { "team": "NYJ", "status": "off" }
  ]
}
```
That single record can be diffed, versioned, and replayed without losing context or flooding the system.
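Because each batch carries the full market, two consecutive batches can be diffed directly. A Ruby sketch under the batch shape above (field names follow the example; `diff_batches` is illustrative):

```ruby
# Diffs two versioned batches of the same market: price moves, plus
# teams explicitly taken off the board. No guessing about absence.
def diff_batches(old_batch, new_batch)
  old_by_team = old_batch["records"].to_h { |r| [r["team"], r] }
  new_batch["records"].filter_map do |rec|
    prev = old_by_team[rec["team"]]
    if rec["status"] == "off"
      { "team" => rec["team"], "change" => "removed" }
    elsif prev && prev["price"] != rec["price"]
      { "team" => rec["team"], "change" => "price",
        "from" => prev["price"], "to" => rec["price"] }
    end
  end
end

old_batch = { "records" => [{ "team" => "NE",  "price" => 300 },
                            { "team" => "KC",  "price" => 500 },
                            { "team" => "NYJ", "price" => 50_000 }] }
new_batch = { "records" => [{ "team" => "NE",  "price" => 250 },
                            { "team" => "KC",  "price" => 450 },
                            { "team" => "NYJ", "status" => "off" }] }

changes = diff_batches(old_batch, new_batch)
# NE and KC price moves, plus an explicit NYJ removal.
```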
Modern Architecture Approach
Instead of splitting everything into atomic pieces, I'd use:
- Event Streaming with Kafka for ordered processing
- Batch Processing for contextual groups
- State Snapshots for consistency verification
- Explicit Deletes instead of implicit removals
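The last two points work together: instead of letting a record silently vanish, the importer would emit an explicit delete event into the ordered stream. A hypothetical sketch of a consumer folding such events into a market snapshot (event shape and names are my illustration, not a Kafka API):

```ruby
# Folds an ordered stream of events into a market snapshot.
# Explicit "delete" events replace the old implicit removals, so the
# consumer never has to guess why a record stopped updating.
def apply_event(snapshot, event)
  case event["type"]
  when "upsert"
    snapshot.merge(event["team"] => event["price"])
  when "delete"
    snapshot.reject { |team, _| team == event["team"] }
  else
    snapshot
  end
end

events = [
  { "type" => "upsert", "team" => "NE",  "price" => 300 },
  { "type" => "upsert", "team" => "NYJ", "price" => 50_000 },
  { "type" => "delete", "team" => "NYJ" }  # Jets eliminated: explicit, not implicit
]

snapshot = events.reduce({}) { |s, e| apply_event(s, e) }
# => { "NE" => 300 }
```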
✅ Context-Aware Scaling: Sometimes, the fastest way to get things right is to process related data together, not split it apart.
What This Taught Me
1. Scale exposes weak assumptions
If your system depends on context, scaling horizontally multiplies confusion, not throughput.
2. Real-time ≠ real order
Concurrency and truth aren't the same thing; you have to design for state consistency, not speed.
3. Hacks create invisible debts
What works "for now" usually becomes tomorrow's bottleneck.
4. Context beats parallelism
Sometimes, the fastest way to get things right is to process them together.
"A system can only move as fast as its context stays intact." – The most expensive lesson we learned
No amount of horizontal scaling can fix fundamentally broken data relationships. You have to solve for correctness first, then scale.
The Real Success Metrics
In the end, the odds importer worked: it powered millions of updates per day. But the metrics that mattered weren't just throughput:
- Accuracy: 99.9% data consistency after the fixes
- Operational overhead: Reduced from 4 hours/day to 30 minutes/day
- Developer sanity: No more 4 AM debugging sessions
- User experience: Sub-second updates with correct context
The system taught me something more valuable than scale: context is not optional. You can't scale away fundamental design problems; you can only make them happen faster and more expensively.
When building distributed systems, always ask: "What relationships am I breaking by splitting this data?" Because once you lose context, no amount of horizontal scaling will get it back.
Brian Wight
Technical leader and entrepreneur focused on building scalable systems and high-performing teams. Passionate about ownership culture, data-driven decision making, and turning complex problems into simple solutions.