Google Gemini for Workspace can be exploited to generate email summaries that appear legitimate but include malicious instructions or warnings that direct users to phishing sites without using attachments or direct links.
Such an attack leverages indirect prompt injections that are hidden inside an email and obeyed by Gemini when generating the message summary.
Despite similar prompt attacks being reported since 2024 and safeguards being implemented to block misleading responses, the technique remains successful.
Attack through Gemini
A prompt-injection attack on Google’s Gemini model was disclosed through 0din, Mozilla’s bug bounty program for generative AI tools, by researcher Marco Figueroa, GenAI Bug Bounty Programs Manager at Mozilla.
The process involves creating an email with an invisible directive for Gemini. An attacker can hide the malicious instruction in the body text at the end of the message using HTML and CSS that sets the font size to zero and its color to white.
Crafting the malicious email
Source: 0DIN
The malicious instruction will not be rendered in Gmail, and because there are no attachments or links present, the message is highly likely to reach the potential target’s inbox.
If the recipient opens the email and asks Gemini to generate a summary of it, Google’s AI tool will parse the invisible directive and obey it.
An example provided by Figueroa shows Gemini following the hidden instruction and including in its summary a security warning about the user’s Gmail password being compromised, along with a support phone number.
Gemini summary result served to the user
Source: 0DIN
As many users are likely to trust Gemini’s output as part of Google Workspace functionality, chances are high that this alert will be considered a legitimate warning instead of a malicious injection.
Figueroa offers a few detection and mitigation methods that security teams can apply to prevent such attacks. One way is to remove, neutralize, or ignore content that is styled to be hidden in the body text.
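As a rough illustration of that first mitigation, the Python sketch below drops text nested inside elements whose inline style hides it before the email body is handed to a summarizer. It is a minimal sketch and not Figueroa's or Google's implementation: the names `HIDDEN_STYLE`, `VisibleTextExtractor`, and `visible_text` are hypothetical, the list of style patterns is an assumption and far from exhaustive, and it only inspects inline `style` attributes (not stylesheets or CSS classes).

```python
import re
from html.parser import HTMLParser

# Inline-style patterns treated as "hidden" (assumed, non-exhaustive list).
HIDDEN_STYLE = re.compile(
    r"font-size\s*:\s*0(?![\d.])"
    r"|display\s*:\s*none"
    r"|visibility\s*:\s*hidden"
    r"|opacity\s*:\s*0(?![\d.])"
    r"|(?<![-\w])color\s*:\s*(?:white|#fff(?:fff)?)\b",
    re.IGNORECASE,
)

# Elements that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class VisibleTextExtractor(HTMLParser):
    """Collects text, skipping anything nested inside a hidden-styled element."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._hidden_depth = 0  # > 0 while inside a hidden element
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self._hidden_depth:
            self._hidden_depth += 1  # nested tag inside a hidden element
            return
        style = dict(attrs).get("style") or ""
        if HIDDEN_STYLE.search(style):
            self._hidden_depth = 1

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self._hidden_depth:
            self._hidden_depth -= 1

    def handle_data(self, data):
        if not self._hidden_depth:
            self._chunks.append(data)

    def text(self):
        # Collapse runs of whitespace so the output is a single clean string.
        return " ".join(" ".join(self._chunks).split())


def visible_text(html_body: str) -> str:
    """Return only the text a reader would plausibly see (hypothetical helper)."""
    parser = VisibleTextExtractor()
    parser.feed(html_body)
    parser.close()
    return parser.text()
```

Treating white text as hidden assumes a light background, so this heuristic would also strip legitimate white-on-dark text. An unclosed tag inside a hidden element keeps the depth counter above zero, which means the sketch errs toward removing too much rather than too little.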
Another approach is to implement a post-processing filter that scans Gemini output for urgent messages, URLs, or phone numbers, flagging the message for further review.
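A post-processing filter of this kind could look roughly like the Python sketch below. It is an illustrative assumption, not a description of any deployed filter: the function name `flag_summary`, the regular expressions, and the list of urgency keywords are all hypothetical, and the phone number in the usage example is fictional.

```python
import re
from typing import List

# Hypothetical heuristics; a real deployment would tune these patterns.
URL_RE = re.compile(r"(?:https?://|www\.)\S+", re.IGNORECASE)
# Candidate phone spans: digits mixed with common separators.
PHONE_RE = re.compile(r"\+?\(?\d[\d\s().-]{5,18}\d")
URGENT_RE = re.compile(
    r"\b(?:urgent(?:ly)?|immediately|compromised|call now|act now|"
    r"suspended|verify your (?:account|password))\b",
    re.IGNORECASE,
)


def flag_summary(summary: str) -> List[str]:
    """Return the reasons a generated summary should be held for review."""
    reasons = []
    if URL_RE.search(summary):
        reasons.append("url")
    # Only count spans with at least 7 digits as phone-number-like.
    if any(sum(ch.isdigit() for ch in m.group()) >= 7
           for m in PHONE_RE.finditer(summary)):
        reasons.append("phone")
    if URGENT_RE.search(summary):
        reasons.append("urgent")
    return reasons


if __name__ == "__main__":
    suspicious = ("WARNING: Your Gmail password has been compromised. "
                  "Call +1 (800) 555-0199 immediately.")
    print(flag_summary(suspicious))
```

The checks deliberately over-flag: a long date or order number can trip the phone pattern, which is acceptable when the result is a review step for a human rather than an automatic block.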
Users should also be aware that Gemini summaries should not be considered authoritative when it comes to security alerts.
BleepingComputer has contacted Google to ask about defenses that prevent or mitigate such attacks, and a spokesperson directed us to a Google blog post on security measures against prompt injection attacks.
“We are constantly hardening our already robust defenses through red-teaming exercises that train our models to defend against these types of adversarial attacks,” a Google spokesperson told BleepingComputer.
The company representative clarified to BleepingComputer that some of the mitigations are in the process of being implemented or are about to be deployed.
Google has seen no evidence of incidents manipulating Gemini in the way demonstrated in Figueroa’s report, the spokesperson said.