News • 11/28/2025

The evolution of artificial intelligence towards systems capable of acting, executing commands, and interacting with files and operating systems has introduced new and complex security challenges.

In response to these emerging risks, CERT-AgID has conducted exploratory research into the behaviour of AI agents. The aim is to analyse how these systems interpret the operational functions assigned to them and to identify any resulting vulnerabilities.

The Report

CERT-AgID conducted an experiment with the Gemini Software Development Kit (SDK) to observe how an AI agent can interact with the operating system and what risks may lie behind its apparent simplicity.
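
The report walks through this experiment step by step; the sketch below is our own minimal reconstruction of the pattern under test, using the google-generativeai Python package. The model name and the list_files tool are illustrative assumptions, not code taken from the report.

```python
import os

import google.generativeai as genai

def list_files(directory: str) -> list[str]:
    """Tool exposed to the model: returns the file names in a directory."""
    return os.listdir(directory)

genai.configure(api_key=os.environ["GOOGLE_API_KEY"])

# Passing a plain Python function as a tool lets the model request calls to it.
model = genai.GenerativeModel("gemini-1.5-flash", tools=[list_files])
chat = model.start_chat(enable_automatic_function_calling=True)

# The SDK runs list_files() on the model's behalf and feeds the result back.
response = chat.send_message("Which files are in the current directory?")
print(response.text)
```

With automatic function calling enabled, the SDK itself executes the tool: whatever that function can reach on the machine, the model can reach.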

The document explains, step by step, how an AI agent “thinks and acts”, how it interprets the functions assigned to it and, above all, how it can inadvertently reveal sensitive information if the interfaces are not carefully designed.
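
As a concrete illustration of such an interface flaw (our own example, not taken from the report): a file-reading tool that accepts any path effectively gives the model, and anyone able to steer the model through a prompt, read access to everything the process can see.

```python
from pathlib import Path

def read_file(path: str) -> str:
    """Over-broad tool interface: accepts ANY path the process can read.

    Nothing here stops a prompt, or text injected into the conversation,
    from steering the model into calling read_file("~/.ssh/id_rsa") or
    read_file(".env"); the contents then flow back into the model's
    context and, from there, into its replies.
    """
    return Path(path).expanduser().read_text()
```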

Protection depends not only on the capability of the AI model, but also on the quality and robustness of the code and tools that connect it to the operating environment. It is not enough for an agent to respond coherently: what matters is how it behaves when performing a real action.
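
A minimal sketch of what robust connecting code can look like, assuming a single sandbox directory as the policy (the directory and function names are our own, not prescribed by the report); the key point is that the tool, not the model, enforces the boundary.

```python
from pathlib import Path

# Hypothetical policy: the agent may only touch files under this directory.
SANDBOX = Path("/srv/agent-sandbox").resolve()

def read_file_safely(path: str) -> str:
    """Hardened counterpart of read_file: enforces the sandbox in code."""
    target = (SANDBOX / path).resolve()
    # resolve() collapses ".." segments and symlinks before the check, so a
    # traversal attempt such as "../../etc/passwd" lands outside SANDBOX
    # and is rejected instead of being silently served to the model.
    if not target.is_relative_to(SANDBOX):
        raise PermissionError(f"access outside sandbox denied: {path}")
    return target.read_text()
```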

Prevention vs. reaction

The focus is on prevention. Just as in medicine, early diagnosis – which in this case means checking, testing and reviewing the code well before its release – is much more effective than treatment after the incident.
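
In practice, checking and testing before release can be as concrete as a unit test that pins down a tool's security contract. A sketch with pytest, reusing the hypothetical read_file_safely from above (the tools module name is ours):

```python
import pytest

from tools import read_file_safely  # hypothetical module holding the sketch above

def test_rejects_path_traversal():
    # The contract: escaping the sandbox must raise, never return data.
    with pytest.raises(PermissionError):
        read_file_safely("../../etc/passwd")

def test_allows_reads_inside_the_sandbox(tmp_path, monkeypatch):
    # Re-point the sandbox at a throwaway directory, then read a file in it.
    monkeypatch.setattr("tools.SANDBOX", tmp_path.resolve())
    (tmp_path / "note.txt").write_text("ok")
    assert read_file_safely("note.txt") == "ok"
```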

If the code connecting the agent to the operating environment is vulnerable or written without adequate precautions, the agent will simply execute the design error to the letter, without worrying about any risks.
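
A classic shape of such a design error, again our own illustration rather than code from the report: a tool that splices model-supplied text into a shell command line. The fix is not a smarter model but better plumbing.

```python
import subprocess

def count_lines_vulnerable(filename: str) -> str:
    """Design error: model-influenced text is spliced into a shell string.

    If the model is steered into supplying
        "notes.txt; cat ~/.ssh/id_rsa"
    the shell happily runs both commands. The agent is not attacking
    anything: it executes this code's flaw to the letter.
    """
    return subprocess.run(f"wc -l {filename}", shell=True,
                          capture_output=True, text=True).stdout

def count_lines_safer(filename: str) -> str:
    """Same feature without a shell: the argument stays one argv element."""
    return subprocess.run(["wc", "-l", filename],
                          capture_output=True, text=True).stdout
```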

Prevention, therefore, is the only way to ensure that artificial intelligence remains under the control of human intelligence.

Security can no longer be treated as an optional extra: it must be an integral part of an agent's architecture from the outset.

Read the full document: Agenti IA e Sicurezza: comprendere per governare (AI Agents and Security: Understanding in Order to Govern).