In disaster situations, communication is critical but often fragmented, delayed, or inaccessible. In this project with Easol and Coefficient, I explored how Large Language Models (LLMs) can bridge communication gaps between agencies and communities during emergencies. The research culminated in a working prototype that generates accurate, localised crisis messaging and helps teams to prioritise and manage information from multiple sources.
Can AI help save lives in a crisis?
Imagine there’s been an earthquake. Buildings have collapsed, roads are blocked, and people are scared, confused and desperate for information. Now imagine being the person responsible for coordinating and prioritising all the information that comes pouring in from diverse sources, such as social media, field reports, radio communications…the list goes on! You must sort through this data and communicate clear, accurate instructions in real time to different teams to help people find missing loved ones and receive the help they need. No pressure, right?
This is the world I stepped into when I began exploring how Large Language Models (LLMs) like GPT-4 could help improve emergency communications.
What drove this project?
I advocate using AI to support and enhance human teamwork. The more I’ve learned about disaster response, the clearer it has become that effective communication is key.
In sensitive interviews with police, emergency responders and relief workers, I discovered that agencies struggle to coordinate and maintain communication, and that misinformation spreads like wildfire through vulnerable communities because people and systems simply can’t keep up.
That’s where LLMs come in. These models are capable of rapidly processing and summarising huge volumes of data. But can they actually make a difference when every second counts?
Leveraging AI to support communications in the field
To evaluate the potential of LLMs, specifically GPT-style models, the team created a prototype and fed it mock data: field reports, sensor readouts and updates from emergency services. We then asked it to generate messages for different audiences (a minimal sketch of this setup follows the list below):
- Short, plain-language alerts for the public and teams on the ground.
- Tactical briefings for first responders.
- Social media updates tailored to character limits and tone.
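To make this concrete, here is a minimal sketch of how audience-specific requests like these can be structured. The audience specs, prompt wording and the `generate_messages` helper are illustrative assumptions rather than the prototype’s actual code, and the example assumes the current OpenAI Python SDK and a GPT-4-class model.

```python
# Illustrative sketch only: prompt wording, audience specs and helper names are
# assumptions, not the prototype's actual code. Assumes the OpenAI Python SDK (>=1.0).
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

AUDIENCE_SPECS = {
    "public_alert": "A short, plain-language alert for the public and teams on the ground. "
                    "No jargon, at most three sentences, state actions clearly.",
    "responder_briefing": "A tactical briefing for first responders: locations, hazards, "
                          "resources and priorities, in terse bullet points.",
    "social_update": "A social media update under 280 characters, calm and factual in tone.",
}

def generate_messages(incident_reports: list[str]) -> dict[str, str]:
    """Turn raw incident reports into one message per audience."""
    combined = "\n\n".join(incident_reports)
    outputs = {}
    for audience, spec in AUDIENCE_SPECS.items():
        response = client.chat.completions.create(
            model="gpt-4",
            messages=[
                {"role": "system",
                 "content": "You are drafting emergency communications. Use only facts "
                            "present in the reports; say 'unknown' rather than guessing."},
                {"role": "user",
                 "content": f"Reports:\n{combined}\n\nWrite the following:\n{spec}"},
            ],
        )
        outputs[audience] = response.choices[0].message.content
    return outputs
```

The system prompt’s instruction to say “unknown” rather than guess is one of the cheap guardrails we leaned on, given the hallucination risks discussed later.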
We wanted to see if Vyron (we gave our AI prototype a name) could adapt to specific audiences (e.g., civilians vs. emergency personnel) and improve the flow of information through our multi-disciplinary team.
Vyron was then assessed by teams moving through an urban area in a simulated search for a missing person. For each generated message we captured the following (a sketch of how this could be logged follows the list):
- Accuracy of information synthesis.
- Tone, clarity and usability.
- Potential misuse or hallucination risks.
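For context, this is roughly the shape of record an exercise like this produces per generated message. It is a minimal sketch: the field names and the 1–5 scale are assumptions for illustration, not the exact instrument we used.

```python
# Illustrative sketch: field names and the 1-5 scale are assumptions about how
# observations like these could be logged, not the exact instrument we used.
from dataclasses import dataclass, field

@dataclass
class MessageAssessment:
    message_id: str
    audience: str                 # e.g. "public_alert", "responder_briefing"
    accuracy_score: int           # 1-5: does the output match the source reports?
    clarity_score: int            # 1-5: tone, clarity and usability for the audience
    hallucinated_claims: list[str] = field(default_factory=list)   # facts not in any source
    omitted_safety_info: list[str] = field(default_factory=list)   # critical details glossed over
    notes: str = ""
```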
It wasn’t perfect…
We learned that Vyron is capable of generating clear, high-quality outputs, especially when prompts are well structured. The model could simulate audience-specific tone and format effectively (e.g., summarising data into a tweet-style evacuation alert) and could prioritise more recent information over older, less relevant reports when generating responses.
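One way to reinforce that recency behaviour from our side, rather than leaving it entirely to the model, is to order and label reports by age before they ever reach the prompt. This is a sketch under that assumption, not the prototype’s actual pipeline; it assumes timezone-aware timestamps on each report.

```python
# Sketch only: ordering and labelling reports by age before prompting, so recency
# is explicit rather than something the model has to infer. Names are assumptions.
from datetime import datetime, timezone

def format_reports_by_recency(reports: list[dict]) -> str:
    """reports: [{'timestamp': tz-aware datetime, 'source': str, 'text': str}, ...]"""
    now = datetime.now(timezone.utc)
    newest_first = sorted(reports, key=lambda r: r["timestamp"], reverse=True)
    lines = []
    for r in newest_first:
        age_min = int((now - r["timestamp"]).total_seconds() // 60)
        lines.append(f"[{age_min} min ago, {r['source']}] {r['text']}")
    return "\n".join(lines)
```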
But there were moments when the AI hallucinated: it simply made things up. Sometimes it glossed over critical safety information, and of course it doesn’t know the real world; it knows patterns in data.
What that told me was: this isn’t about replacing humans. It’s about building tools to help humans do their job faster, better and more effectively under pressure.
…but the teams trusted it!
While Vyron was originally designed as a data aggregator, the teams on the ground reported that they frequently relied on it to recommend future actions, assist them with planning routes and suggest places to search.
This told me that the team needed to develop prediction and recommendation capabilities, and that certain ethical and practical limitations must be addressed before letting this loose in the field, where real people’s lives are at risk:
- LLMs can hallucinate or oversimplify critical information.
- Human-in-the-loop oversight remains essential (a minimal sketch of such a gate follows this list).
- Training and trust calibration are necessary to prevent over-trust or under-reliance.
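To show what human-in-the-loop oversight could mean in practice here, below is a minimal sketch of a dispatch gate that refuses to send anything a named operator hasn’t signed off. The types and function names are assumptions for illustration, not part of Vyron.

```python
# Illustrative sketch of a human-in-the-loop gate: nothing the model drafts goes
# out until a named operator has reviewed and approved it. Names are assumptions.
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class DraftMessage:
    audience: str
    text: str
    approved_by: Optional[str] = None  # set only after a human signs off

def dispatch(draft: DraftMessage, send: Callable[[str, str], None]) -> None:
    """Refuse to send any model-generated message without a named approver."""
    if not draft.approved_by:
        raise PermissionError("Draft has not been approved by a human operator.")
    send(draft.audience, draft.text)
```

Keeping the approval step in the data itself (rather than in the UI alone) also leaves an audit trail of who approved what, which matters for trust calibration.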
Why this matters (and why now)
We’re seeing increasing interest in this space. Companies like Easol are exploring AI-driven platforms for real-time emergency messaging, and there’s a growing recognition that traditional systems just can’t keep up.
This project sits at the intersection of innovation and impact. It shows that with the right safeguards, LLMs may be able to support life-saving communication, especially for under-resourced agencies and communities.
What’s next?
I’d love to expand this work, partnering with frontline organisations, integrating live data sources and testing it in real drills. I’m also keen to explore policy and ethical frameworks for responsible AI use in crisis contexts.
If you know someone working in disaster response, emergency tech, or public safety who’s open to experimenting with AI, let’s talk.
And if you’re a funder or looking to collaborate, please get in touch.