Building a Global 'Disaster AI' Data Platform

Oh Jeong-hun, CEO of Pentagate · Flood, Landslide, Wildfire Statistics Lacking Despite Climate Crisis · Developing Air-Gapped Disaster AI Foundation Model · "Build a Global Platform, but Each Country Decides and Bears Responsibility"

Opinion | By Oh Jeong-hun (Commentary)

In an era of climate crisis, epitomized by global warming, disasters always begin where connections are weakest. Wildfires spread along wind corridors in mountainous terrain. Floods engulf rivers and underground spaces in an instant. Landslides start from small, unnoticed anomalies on slopes. Yet at precisely those moments, systems dependent on the internet and the cloud become most vulnerable. When communications are severed or external access is restricted, decision-making and warnings are delayed. The question for future disaster-response technology, then, is not "How sophisticated is it?" but "Does it work immediately on-site even when connectivity is lost?"

What is needed is a disaster AI foundation model that operates independently on air-gapped networks. This does not simply mean a video analyzer that recognizes flames or puddles. It must be a multimodal, spatiotemporal base model that jointly interprets drone footage, CCTV feeds, thermal imagery, water-level gauges, rain gauges, anemometers, topographic and geological data, historical disaster records, and field reports. In other words, for wildfires it must read not only smoke and thermal anomalies but also vegetation dryness, wind direction, and spread probability. For floods it must simultaneously assess the rate and extent of water-level rise, drainage delays, and road inundation. For landslides it must synthesize soil saturation, slope cracks, and rockfall precursors to predict risk levels. The model must capture the conditions that make disasters possible and the earliest warning signals, not merely recognize a disaster after it occurs.
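As a rough illustration of what such a multimodal, spatiotemporal base model could look like, the sketch below fuses video frames and gauge readings into one shared representation. It is not the author's system; the module structure, tensor shapes, and hyperparameters are illustrative assumptions.

```python
# Minimal sketch (not the author's system): a multimodal spatiotemporal encoder
# that fuses imagery streams (CCTV/drone/thermal frames) with scalar sensor
# series (water level, rainfall, wind). Shapes and sizes are illustrative.
import torch
import torch.nn as nn

class MultimodalDisasterEncoder(nn.Module):
    def __init__(self, img_channels=3, sensor_dim=4, d_model=128):
        super().__init__()
        # Frame encoder: one embedding per video frame.
        self.frame_encoder = nn.Sequential(
            nn.Conv2d(img_channels, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, d_model),
        )
        # Sensor encoder: one embedding per time step of gauge readings.
        self.sensor_encoder = nn.Linear(sensor_dim, d_model)
        # Temporal fusion over the concatenated token sequence.
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.fusion = nn.TransformerEncoder(layer, num_layers=2)

    def forward(self, frames, sensors):
        # frames:  (batch, time, channels, height, width)
        # sensors: (batch, time, sensor_dim)
        b, t = frames.shape[:2]
        img_tokens = self.frame_encoder(frames.flatten(0, 1)).view(b, t, -1)
        sen_tokens = self.sensor_encoder(sensors)
        tokens = torch.cat([img_tokens, sen_tokens], dim=1)  # 2*t tokens
        return self.fusion(tokens)  # shared spatiotemporal representation

# Example: 8 frames of 64x64 video plus 8 steps of 4 sensor channels.
model = MultimodalDisasterEncoder()
out = model(torch.randn(2, 8, 3, 64, 64), torch.randn(2, 8, 4))
print(out.shape)  # torch.Size([2, 16, 128])
```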

Yet a formidable real-world barrier stands before this vision: an acute shortage of initial data. Floods, landslides, and wildfires are not everyday occurrences that generate routine data. Major disasters are rare. Even when they do occur, video quality is often poor, timestamps vary, and storage formats and standards differ from agency to agency. Some regions have abundant CCTV footage but no labels. Others have sensor data that is not linked to video. In air-gapped environments, freely aggregating data externally is even harder. Climate change is altering the very nature of disasters, which compounds the problem. Historical data increasingly fails to explain future risks. Ultimately, the hardest challenge for disaster AI lies not in algorithms but in data that is scarce and incomplete.

Paradoxically, that very challenge is why a foundation model is necessary. The solution is not simply "collect more disaster footage." Disasters may be rare, but precursor and condition data are far more abundant. Data on normal river flows, post-rainfall drainage changes, micro-cracks on slopes, forest dryness levels, nighttime thermal distributions, and fog and haze patterns accumulate every day. An effective approach at the initial stage is therefore to perform self-supervised learning on large volumes of unlabeled video and sensor data — teaching the model "how normal conditions change" first — and then fine-tune it with small amounts of actual disaster data. Even without having seen enough disasters, the model can learn to understand the anomalies that lead to them.
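The pretrain-then-finetune recipe described here can be made concrete with a small sketch: masked reconstruction on plentiful unlabeled "normal condition" sensor windows, followed by supervised fine-tuning of a risk head on the few labeled disaster windows. The architecture, masking scheme, and data shapes are illustrative assumptions.

```python
# Sketch of the recipe: self-supervised pretraining on unlabeled normal data,
# then fine-tuning on a small labeled disaster set. All tensors are dummies.
import torch
import torch.nn as nn

encoder = nn.GRU(input_size=4, hidden_size=64, batch_first=True)
decoder = nn.Linear(64, 4)        # reconstructs masked sensor values
risk_head = nn.Linear(64, 1)      # fine-tuned later on labeled windows

def pretrain_step(batch, mask_ratio=0.3):
    """Self-supervised step: hide random time steps, reconstruct them."""
    mask = (torch.rand(batch.shape[:2]) < mask_ratio).unsqueeze(-1).expand_as(batch)
    corrupted = batch.masked_fill(mask, 0.0)
    hidden, _ = encoder(corrupted)
    recon = decoder(hidden)
    return ((recon - batch) ** 2 * mask).sum() / mask.sum().clamp(min=1)

def finetune_step(batch, labels):
    """Supervised step on the few labeled disaster/non-disaster windows."""
    hidden, _ = encoder(batch)
    logits = risk_head(hidden[:, -1])          # use the last time step
    return nn.functional.binary_cross_entropy_with_logits(logits.squeeze(-1), labels)

# Unlabeled normal-condition data is plentiful; labeled disasters are rare.
unlabeled = torch.randn(32, 48, 4)             # 32 windows, 48 steps, 4 sensors
labeled, y = torch.randn(4, 48, 4), torch.tensor([1., 0., 0., 1.])

opt = torch.optim.Adam(list(encoder.parameters()) + list(decoder.parameters()), lr=1e-3)
pretrain_loss = pretrain_step(unlabeled)
pretrain_loss.backward()
opt.step()
opt.zero_grad()

opt_ft = torch.optim.Adam(list(encoder.parameters()) + list(risk_head.parameters()), lr=1e-4)
finetune_loss = finetune_step(labeled, y)
finetune_loss.backward()
opt_ft.step()
```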

Synthetic data and simulation data are also indispensable. Floods can be modeled through hydraulic and hydrological simulations. Landslides can be generated via slope-stability analysis and rainfall-infiltration models. Wildfires can be simulated with spread models incorporating wind direction, gradient, and fuel load. By varying conditions in a digital-twin environment — daytime versus nighttime, fog, heavy rain, smoke, camera shake, low resolution — training data can be produced even when real incident footage is scarce, boosting the model's initial generalization performance. Because disaster data is both rare and dangerous, synthetic data fills gaps that on-site filming alone can never cover. The disaster AI of the future must move beyond "models trained solely on real data" toward a hybrid learning framework combining measured data, simulation data, and synthetic imagery.
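One simple way to realize the digital-twin variation described above is to degrade clean simulator frames so they resemble fog, night, low-resolution CCTV, and noisy sensors, multiplying each rare incident into many training variants. The sketch below is a minimal illustration; the degradation operations and parameter values are assumptions, not a calibrated pipeline.

```python
# Sketch: clean frames from a flood/wildfire simulator are degraded to mimic
# fog, night, low-resolution cameras, and sensor noise. Parameters are illustrative.
import torch
import torch.nn.functional as F

def degrade(frame, fog=0.0, night=1.0, downscale=1, noise_std=0.0):
    """frame: (channels, height, width) tensor with values in [0, 1]."""
    out = frame * night                                      # darken for night scenes
    out = fog * torch.full_like(out, 0.7) + (1 - fog) * out  # blend toward haze gray
    if downscale > 1:                                        # simulate a low-res camera
        small = F.interpolate(out.unsqueeze(0), scale_factor=1 / downscale, mode="bilinear")
        out = F.interpolate(small, size=out.shape[-2:], mode="bilinear").squeeze(0)
    out = out + noise_std * torch.randn_like(out)            # sensor noise
    return out.clamp(0, 1)

# One clean simulated frame becomes several degraded training variants.
clean = torch.rand(3, 256, 256)
variants = [
    degrade(clean, fog=0.5),                     # daytime fog
    degrade(clean, night=0.3, noise_std=0.05),   # dark, noisy night shot
    degrade(clean, downscale=4),                 # low-resolution CCTV
]
print(len(variants), variants[0].shape)
```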

Another solution is the fusion of physics-based models with AI. The less data available, the more prone AI is to plausible but wrong predictions. Disasters, however, follow relatively clear physical laws. Beyond certain rainfall thresholds and soil saturation levels, slope risk increases. Knowing rainfall patterns, watershed characteristics, and hydrological structures can narrow down flood-prone segments. Temperature, humidity, and wind-speed conditions are closely tied to wildfire spread. A hybrid architecture is therefore needed in which AI reads anomalies from imagery while physics-based models verify their plausibility. This is not a mere auxiliary function but a core mechanism for raising model reliability in data-scarce environments. Disaster AI must ultimately be a technology that reasons from principles even with limited observations, not one that merely matches scenes it has seen many times before.
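A minimal version of this hybrid can be written as a gating rule: the learned anomaly score is escalated only when a crude physics-based criterion, here antecedent rainfall and soil saturation for slope risk, agrees that the situation is plausible. The thresholds and the combination rule below are illustrative assumptions, not validated engineering criteria.

```python
# Sketch: an AI anomaly score is checked against a simple physics-based
# landslide plausibility factor before any alert is escalated.

def physics_plausibility(rain_24h_mm: float, soil_saturation: float) -> float:
    """Crude slope-risk factor in [0, 1]: rises with rainfall and saturation."""
    rain_factor = min(rain_24h_mm / 150.0, 1.0)  # 150 mm/24h assumed as a high threshold
    return rain_factor * max(min(soil_saturation, 1.0), 0.0)

def combined_risk(ai_anomaly_score: float, rain_24h_mm: float, soil_saturation: float) -> float:
    """Geometric blend: both the model and the physics check must agree."""
    p = physics_plausibility(rain_24h_mm, soil_saturation)
    return (ai_anomaly_score * p) ** 0.5

# A visually alarming frame under dry, unsaturated conditions stays low risk;
# the same score under heavy rain on saturated soil triggers escalation.
print(round(combined_risk(0.9, rain_24h_mm=10, soil_saturation=0.2), 3))   # ~0.11
print(round(combined_risk(0.9, rain_24h_mm=180, soil_saturation=0.9), 3))  # ~0.9
```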

A global shared platform should be designed from this same perspective. The answer is not a single centralized system but a common core paired with regional adapters. The common core provides a disaster-type classification taxonomy, data standards, base model weights, a risk-scoring engine, explainable alert interfaces, and audit-log and security frameworks. Regional adapters, meanwhile, are tailored to each country's terrain, climate, language, administrative system, and alert criteria. A country where avalanches are the primary threat differs fundamentally from one where wildfires dominate. Even within floods, urban stormwater inundation and major-river overflow demand entirely different responses. The platform must therefore be globally shared, but judgment and accountability must remain strictly under local sovereignty.
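The "common core plus regional adapter" split can be pictured as two layers of configuration: one shared everywhere, one owned locally. The sketch below is only a schematic; the field names, disaster taxonomy, and threshold values are assumptions rather than a proposed standard.

```python
# Sketch of the two-layer design: a shared core and country-owned adapters.
from dataclasses import dataclass, field

@dataclass
class CommonCore:
    """Shared across all deployments: taxonomy, data standards, base weights."""
    taxonomy: tuple = ("wildfire", "flood", "landslide", "avalanche")
    data_standard: str = "disaster-ai-core/v1"      # hypothetical identifier
    base_weights: str = "foundation-model-v1.bin"   # hypothetical file name

@dataclass
class RegionalAdapter:
    """Owned by each country/agency: local priorities, thresholds, language."""
    country: str
    language: str
    priority_disasters: list
    alert_thresholds: dict = field(default_factory=dict)

core = CommonCore()
korea = RegionalAdapter(
    country="KR", language="ko",
    priority_disasters=["wildfire", "flood", "landslide"],
    alert_thresholds={"flood_water_level_m": 2.5, "wildfire_spread_index": 0.7},
)
alpine = RegionalAdapter(
    country="CH", language="de",
    priority_disasters=["avalanche", "landslide"],
    alert_thresholds={"avalanche_snowpack_index": 0.6},
)
# The core is identical everywhere; judgment (thresholds, priorities) stays local.
assert set(korea.priority_disasters) <= set(core.taxonomy)
```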

The air-gapped constraint, in fact, demands a new mode of collaboration. Rather than sending all raw data externally, each agency and country should train models within its own air-gapped network and exchange the results as verified model packages, parameters, and metadata in standardized formats. Performance-improvement experience can be shared without centralizing data. Adding active learning — continuously incorporating field experts' interpretations — and retraining systems that learn from false alarms and missed detections will gradually alleviate the initial data shortage over time. The crucial shift is viewing disaster AI not as a product delivered once and done, but as an operational platform that continuously evolves through field experience.
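The exchange pattern described here resembles federated learning: each air-gapped site trains locally and exports only parameters and metadata, which a coordinator merges without ever touching raw footage. The package format and the plain sample-weighted averaging below are illustrative assumptions; a production system would add signing, versioning, and validation gates.

```python
# Sketch: sites export model packages (weights + metadata, no raw data);
# a coordinator merges them by sample-weighted averaging.
import torch
import torch.nn as nn

def export_package(model: nn.Module, site: str, n_samples: int) -> dict:
    """What leaves the air-gapped network: weights plus metadata only."""
    return {"site": site, "n_samples": n_samples,
            "state_dict": {k: v.clone() for k, v in model.state_dict().items()}}

def merge_packages(packages: list) -> dict:
    """Sample-weighted average of parameters from several sites."""
    total = sum(p["n_samples"] for p in packages)
    merged = {}
    for key in packages[0]["state_dict"]:
        merged[key] = sum(p["state_dict"][key] * (p["n_samples"] / total) for p in packages)
    return merged

# Two sites train the same small risk model on their own (never shared) data.
site_a, site_b = nn.Linear(4, 1), nn.Linear(4, 1)
pkg_a = export_package(site_a, site="agency-A", n_samples=8000)
pkg_b = export_package(site_b, site="agency-B", n_samples=2000)

global_model = nn.Linear(4, 1)
global_model.load_state_dict(merge_packages([pkg_a, pkg_b]))
```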

The essence of a disaster AI foundation model is not about building one enormous model. It is about creating public intelligent infrastructure that keeps running when connectivity is lost, that can learn even when data is scarce, that reads precursors even when actual disasters are few, and that adapts to each country's institutions and terrain. Disasters do not wait for the internet to be restored. They do not wait for enough data to accumulate. The AI we must build is therefore clear: a model that starts with limited data yet learns on its own, leverages both real and virtual inputs, combines physical laws with field experience, and can issue immediate warnings even inside an air-gapped network. Only then can disaster AI transcend technology demonstrations and become a global safety platform that protects lives and time.

AI-translated from Korean. Quotes from foreign sources are based on Korean-language reports and may not reflect exact original wording.