At a workshop in Bangkok earlier this year, I was presenting on infrastructure planning for computer-assisted personal interviewing (CAPI) operations when someone from a Pacific Island statistical office asked a question that cut right to the heart of it: "What happens when all our enumerators try to sync at the same time?" The honest answer is that if you haven't planned for it, everything falls over.
I've seen this play out firsthand in Cambodia. When we scaled up the Cambodia Agricultural Survey to full national coverage, we had close to 4,000 enumerators working across 25 provinces, all collecting data on tablets using Survey Solutions. At the end of each fieldwork day, they'd connect to whatever WiFi or mobile data they could find and hit sync. The first time we ran this at scale, the server buckled. Not because the hardware was inadequate on paper, but because we'd sized it for average load rather than peak load. In survey operations, the peak is the only number that matters. Everyone syncs in the same two-hour window after dinner, and your infrastructure either handles that spike or it doesn't.
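To make the gap between average and peak concrete, here is a rough back-of-the-envelope sketch in Python. The 4,000 enumerators and the two-hour window come from the account above; the 90-second sync duration is purely an illustrative assumption, not a measured figure, and the model assumes syncs are spread evenly across the window, which real evening traffic is not.

```python
def mean_concurrent_syncs(enumerators: int, sync_seconds: float, window_hours: float) -> float:
    """Average number of syncs in flight if every enumerator syncs once,
    spread evenly across the window (arrival rate multiplied by duration)."""
    arrivals_per_second = enumerators / (window_hours * 3600)
    return arrivals_per_second * sync_seconds


ENUMERATORS = 4000   # from the account above
SYNC_SECONDS = 90    # assumption: illustrative time one sync session takes

# Sizing for the "average": the same syncs smeared over a full day.
daily_average = mean_concurrent_syncs(ENUMERATORS, SYNC_SECONDS, window_hours=24)

# Sizing for the peak: everyone syncs in the two-hour window after dinner.
evening_peak = mean_concurrent_syncs(ENUMERATORS, SYNC_SECONDS, window_hours=2)

print(f"Average over the day: about {daily_average:.1f} concurrent syncs")
print(f"Evening window:       about {evening_peak:.1f} concurrent syncs")
print(f"Peak-to-average ratio: {evening_peak / daily_average:.0f}x")
```

Under these assumptions the evening window carries roughly twelve times the concurrency of the daily average, and that is before accounting for the burst in the first few minutes after dinner. The true peak is something to measure in a pilot or load test rather than estimate.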
The cloud versus on-premise question is one I get asked constantly, and the answer depends on context. In Cambodia, the National Institute of Statistics (NIS) maintains its own server infrastructure, partly for data sovereignty reasons and partly because government IT procurement makes cloud subscriptions administratively painful. But on-premise means you own every problem: power failures, cooling, backups, security patches. For the Bangkok workshop, I framed the decision around three factors: data sensitivity requirements, in-house IT capacity, and total cost of ownership over the survey cycle. Most offices underestimate the third one dramatically.
Security is the area where I see the most dangerous gaps. It's not that people don't care about protecting respondent data; they do. It's that security gets treated as a checkbox rather than a design principle. Encrypting data at rest and in transit is table stakes. The harder questions are about access control, audit logging, and what happens to data on tablets when an enumerator loses a device in a remote province. We've had tablets stolen from motorbikes during fieldwork. Your security model needs to account for that reality, not just the clean version in the project document.
The checklist I shared in Bangkok wasn't glamorous: server sizing for concurrent connections, backup frequency and restore testing, network bandwidth at provincial training centres, device management policies, and incident response procedures. None of it is conceptually difficult. But the gap between knowing these things matter and actually having them documented and tested before fieldwork begins is where most operations get into trouble. Infrastructure planning is the work you do so that the interesting analytical work can actually happen.
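One way to keep "documented and tested" from sliding into "known to matter" is to track the checklist as data and refuse to call the operation ready until every item clears both bars. The sketch below is only an illustration: the item names come from the checklist above, but the `ChecklistItem` type, the `blocking_items` helper, and the status flags are all hypothetical placeholders, not the state of any real survey.

```python
from dataclasses import dataclass


@dataclass
class ChecklistItem:
    name: str
    documented: bool = False   # is there a written procedure?
    tested: bool = False       # has it been exercised before fieldwork?


def blocking_items(items: list[ChecklistItem]) -> list[str]:
    """Return the names of items that are not both documented and tested."""
    return [item.name for item in items if not (item.documented and item.tested)]


# Hypothetical status values, for illustration only.
checklist = [
    ChecklistItem("Server sizing for concurrent connections", documented=True, tested=True),
    ChecklistItem("Backup frequency and restore testing", documented=True),
    ChecklistItem("Network bandwidth at provincial training centres"),
    ChecklistItem("Device management policies", documented=True, tested=True),
    ChecklistItem("Incident response procedures", documented=True),
]

gaps = blocking_items(checklist)
ready_for_fieldwork = not gaps

for name in gaps:
    print(f"Not ready: {name}")
```

In this made-up state, a backup plan that exists on paper but has never been restored still blocks fieldwork, which is the distinction the checklist is meant to enforce.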