KAIST Researchers First in the World to Identify Security Threat Exploiting Google Gemini’s "Malicious Expert AI" Structure
<Photo 1. (From left) Ph.D. candidates Mingyoo Song and Jaehan Kim, Professor Sooel Son, (Top right) Professor Seungwon Shin, Lead Researcher Seung Ho Na>
Most major commercial Large Language Models (LLMs), such as Google’s Gemini, utilize a Mixture-of-Experts (MoE) structure. This architecture enhances efficiency by dynamically selecting and using multiple "small AI models (Expert AIs)" depending on the input query. However, a KAIST research team has revealed for the first time in the world that this very structure can itself become a new security threat.
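The selection step described above can be illustrated with a minimal sketch of top-k routing. This is a toy example under stated assumptions, not the routing code of Gemini or any other commercial model: the function names (`route_top_k`, `moe_output`), the four toy experts, and the gate scores are all hypothetical.

```python
import math

def softmax(scores):
    """Convert raw gate scores into probabilities that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def route_top_k(gate_scores, k=2):
    """Return (expert_index, weight) pairs for the k highest-scoring experts.

    Weights are renormalized so the selected experts' weights sum to 1.
    """
    probs = softmax(gate_scores)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]

def moe_output(x, experts, gate_scores, k=2):
    """Combine only the selected experts' outputs, weighted by the router."""
    return sum(w * experts[i](x) for i, w in route_top_k(gate_scores, k))

# Hypothetical experts: each is just a simple function of a number.
experts = [lambda x: x + 1, lambda x: x * 2, lambda x: x - 3, lambda x: x ** 2]

# Hypothetical gate scores for one input; a real router computes these
# from the input itself.
gate_scores = [0.1, 2.0, -1.0, 2.0]

selected = route_top_k(gate_scores, k=2)
result = moe_output(3.0, experts, gate_scores, k=2)
```

Only the selected experts run for a given input, which is where the efficiency gain comes from. It also means the final output depends entirely on whichever experts the router picks.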
A joint research team led by Professor Seungwon Shin (School of Electrical Engineering) and Professor Sooel Son (School of Computing) announced on December 26th that they have identified an attack technique that can seriously compromise the safety of LLMs by exploiting the MoE structure. For this research, they received the Distinguished Paper Award at ACSAC 2025, one of the most prestigious international conferences in the field of information security.
ACSAC (Annual Computer Security Applications Conference) is among the most influential international academic conferences in security. This year, only two papers out of all submissions were selected as Distinguished Papers. It is highly unusual for a Korean research team to achieve such a feat in the field of AI security.
In this study, the team systematically analyzed the fundamental security vulnerabilities of the MoE structure. In particular, they demonstrated that even if an attacker does not have direct access to the internal structure of a commercial LLM, the entire model can be induced to generate dangerous responses if just one maliciously manipulated "Expert Model" is distributed through open-source channels and integrated into the system.
<Figure 1. Conceptual diagram of the attack technology proposed by the research team.>
To put it simply: even if there is only one "malicious expert" mixed among normal AI experts, that specific expert may be repeatedly selected for processing harmful queries, causing the overall safety of the AI to collapse. A particularly dangerous factor highlighted was that this process causes almost no degradation in model performance, making the problem extremely difficult to detect in advance.
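As a rough illustration of what "repeatedly selected" means, the sketch below tallies how often each expert wins the routing decision for two groups of queries. It is not the method from the paper: the gate scores are invented for the example, expert index 2 stands in for the hypothetical malicious expert, and `top1_expert` and `selection_share` are names made up here.

```python
from collections import Counter

def top1_expert(gate_scores):
    """Index of the expert with the highest gate score for one query."""
    return max(range(len(gate_scores)), key=lambda i: gate_scores[i])

def selection_share(batch_gate_scores):
    """Fraction of queries in the batch routed to each expert."""
    counts = Counter(top1_expert(s) for s in batch_gate_scores)
    total = len(batch_gate_scores)
    return {expert: n / total for expert, n in counts.items()}

# Invented gate scores (one row per query, one column per expert).
# Expert 2 plays the role of the manipulated expert in this toy setup.
benign_scores = [
    [2.0, 0.5, 0.1],
    [0.3, 1.8, 0.2],
    [1.5, 0.4, 0.9],
    [0.2, 1.1, 0.3],
]
harmful_scores = [
    [0.4, 0.3, 2.5],
    [0.1, 0.6, 1.9],
    [0.7, 0.2, 2.2],
    [1.6, 0.1, 0.8],
]

benign_share = selection_share(benign_scores)
harmful_share = selection_share(harmful_scores)
```

In this toy data, expert 2 is never chosen for the benign queries but handles most of the harmful ones. That pattern is consistent with the point above: behavior on ordinary inputs looks unchanged, so a performance check alone may not reveal the problem.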
Experimental results showed that the attack technique proposed by the research team could raise the harmful response rate from 0% to as high as 80%. They confirmed that the safety of the entire model significantly deteriorates even if only one out of many experts is "infected."
This research is significant because it is the first to present this new security threat, which can arise in the rapidly expanding global open-source-based LLM development environment. It also suggests that verifying the "source and safety of individual expert models," not just their performance, is now essential during the AI model development process.
Professors Seungwon Shin and Sooel Son stated, "Through this study, we have empirically confirmed that the MoE structure, which is spreading rapidly for the sake of efficiency, can become a new security threat. This award is a meaningful achievement that recognizes the importance of AI security on an international level."
The study involved Ph.D. candidates Jaehan Kim and Mingyoo Song, Dr. Seung Ho Na (currently at Samsung Electronics), Professor Seungwon Shin, and Professor Sooel Son. The results were presented at ACSAC in Hawaii, USA, on December 12, 2025.
<Figure 2. Photo of the Distinguished Paper Award certificate>
Paper Title: MoEvil: Poisoning Experts to Compromise the Safety of Mixture-of-Experts LLMs
Paper File: https://jaehanwork.github.io/files/moevil.pdf
GitHub (Open Source): https://github.com/jaehanwork/MoEvil
This research was supported by the Korea Internet & Security Agency (KISA) and the Institute of Information & Communications Technology Planning & Evaluation (IITP) under the Ministry of Science and ICT.
Professor Shin's Team Receives the Best Software Defined Network Solution Showcase Award
Professor Seungwon Shin of the School of Electrical Engineering at KAIST and his research team won the Best Software Defined Networking (SDN) Solution Showcase Award at the SDN World Congress, one of the biggest network summits held in Europe, with over 2,000 participants. This year the conference took place in The Hague, the Netherlands, October 10-14, 2016.
SDN is an approach to computer networking that allows network administrators to respond quickly to changing business requirements via a centralized control console and to support the dynamic, scalable computing and storage needs of more modern computing environments such as data centers.
Collaborating with researchers from Queen’s University in the United Kingdom and Huawei, a global information and communications technology solutions provider in China, Professor Shin’s team, which is led by doctoral students Seungsoo Lee, Changhoon Yoon, and Jaehyun Nam, implemented an SDN security project called “DELTA.” ATTORESEARCH, a Korean SDN architecture and applications provider, conducted testing and verification for the project.
DELTA is a new SDN security evaluation framework with two main functions. It can automatically recognize attack cases against SDN elements across diverse environments and can assist in identifying unknown security problems within an SDN deployment.
The DELTA project consists of a control plane, the part of a network that carries signaling traffic and is responsible for routing; a data plane, the part of a network that carries user traffic; and a control channel that connects the two aforementioned planes. These three components have their own agents installed, which are all controlled by an agent manager. The agent manager can automatically detect any spots where the network security is weak.
Specifically, the project aims to defend against attacks on the OpenFlow protocol, one of the first SDN standards; SDN controllers, which act as a protocol-based network operating system; and network switch devices that use the OpenFlow protocol.
The DELTA project was registered with the Open Networking Foundation, a user-driven organization dedicated to the promotion and adoption of SDN through open standards development, as an open source SDN security evaluation tool. This project is the only open source SDN project that has been led by Korean researchers.
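The agent arrangement described above can be sketched roughly as follows. This is a simplified illustration only and does not reflect DELTA's actual code or API: the `Agent` and `AgentManager` classes, the test names, and the pass/fail checks are all hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Agent:
    """Hypothetical agent installed at one location in the network."""
    location: str  # "control plane", "data plane", or "control channel"

    def run(self, test_name, check):
        # check is a zero-argument callable returning True if the
        # component withstood the test case.
        return {"agent": self.location, "test": test_name, "passed": bool(check())}

@dataclass
class AgentManager:
    """Hypothetical manager that controls all agents and gathers results."""
    agents: dict = field(default_factory=dict)

    def register(self, agent):
        self.agents[agent.location] = agent

    def run_all(self, test_cases):
        # Each test case is (location, test_name, check).
        return [self.agents[loc].run(name, check) for loc, name, check in test_cases]

    def weak_spots(self, results):
        # Report only the test cases that the network failed.
        return [r for r in results if not r["passed"]]

manager = AgentManager()
for location in ("control plane", "data plane", "control channel"):
    manager.register(Agent(location))

# Made-up test cases with hard-coded outcomes, for illustration only.
test_cases = [
    ("control plane", "malformed control message", lambda: True),
    ("data plane", "flow table overflow", lambda: False),
    ("control channel", "man-in-the-middle", lambda: True),
]

results = manager.run_all(test_cases)
weak = manager.weak_spots(results)
```

Keeping the agents thin and putting coordination in a single manager means one place decides which test cases run where and collects every result, which matches the centralized role the article attributes to the agent manager.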
The SDN World Congress 2016 recognized the need for and importance of the DELTA project by conferring upon it the Best Solution Showcase Award. The Open Networking Foundation also widely publicized this award news.
Professor Shin said:
“In recent years, SDN has been attracting a large amount of interest as an emerging technology, but there still have not been many SDN projects in Korea. This award acknowledges the advancement of Korean SDN technology, showing the potential for Korea to become a leader in SDN research.”
Picture: Major Components of the DELTA Project: Agents and Agent Manager