Google DeepMind introduces watermarking for AI-generated text

The chatbot revolution has flooded our world with AI-generated text: it has infiltrated our newsfeeds, term papers, and inboxes. It is so absurdly abundant that entire industries have sprung up around countermeasures and counter-countermeasures. Some companies offer services to identify AI-generated text by analyzing it, while others say their tools will “humanize” your AI-generated text and make it undetectable. Both types of tools have questionable performance, and as chatbots improve, it becomes increasingly difficult to tell whether words were strung together by a human or an algorithm.

Here's another approach: adding some sort of watermark or content credential to the text from the start, so users can easily verify that the text is AI-generated. New research from Google DeepMind, described today in the journal Nature, offers a way to do just that. The system, called SynthID-Text, does not compromise “the quality, accuracy, creativity, or speed of text generation,” says Pushmeet Kohli, vice president of research at Google DeepMind and a co-author of the paper. However, the researchers acknowledge that their system is far from foolproof and is not yet available to everyone; it is more of a demonstration than a scalable solution.

Google has already integrated the new watermarking system into its Gemini chatbot, the company announced today. It has also open-sourced the tool and made it available to developers and businesses, allowing them to use it to determine whether text output comes from their own large language models (LLMs), the AI systems that power chatbots. However, for now only Google and those developers have access to the detector that checks for the watermark. As Kohli says: “While SynthID is not a panacea for identifying AI-generated content, it is an important building block for developing more reliable AI identification tools.”

The rise of content credentials

Content credentials have been a hot topic for images and videos, seen as one way to counter the rise of deepfakes. Tech companies and major media outlets have joined forces in an initiative called C2PA, which has developed a system for attaching cryptographically signed metadata to image and video files, indicating whether they are real or AI-generated. Text, however, is a much harder problem, because text can so easily be altered to obscure or remove a watermark. Although SynthID-Text is not the first attempt to create a text watermarking system, it is the first to be tested on 20 million prompts.

External experts who work on content credentials see the DeepMind research as a good step. “It promises to improve the use of durable content credentials from C2PA for documents and raw text,” says Andrew Jenks, director of media provenance at Microsoft and chair of C2PA. “This is a difficult problem to solve, and it's good to see some progress being made,” says Bruce MacCormack, a member of the C2PA steering committee.

How Google's text watermarks work

SynthID-Text works by discreetly intervening in the generation process: it alters some of the words a chatbot outputs to the user in a way that is invisible to humans but clear to a SynthID detector. “Such modifications give the generated text a statistical signature,” the researchers write in the paper. “During the watermark detection phase, the signature can be measured to determine whether the text was actually generated by the watermarked LLM.”

The LLMs that power chatbots work by generating sentences one word at a time, looking at the context of what came before to choose a likely next word. Essentially, SynthID-Text intervenes by randomly assigning number values to the candidate words and nudging the LLM toward outputting words with higher values. Later, a detector can take in a piece of text and calculate its overall score; watermarked text receives a higher score than non-watermarked text. The DeepMind team compared its system's performance with that of other text watermarking tools that alter the generation process, and found that it did a better job of detecting watermarked text.
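To make the scoring idea above concrete, here is a minimal sketch in Python of how a keyed pseudorandom score could bias word selection at generation time and later be summed up by a detector. This is an illustrative assumption, not DeepMind's actual SynthID-Text algorithm; the secret key, scoring function, and candidate handling are all hypothetical.

```python
# Toy illustration of generation-time watermarking and detection by scoring.
# Not DeepMind's actual SynthID-Text algorithm; key, scoring, and candidates
# are assumptions made for demonstration purposes only.
import hashlib
import random

SECRET_KEY = "watermark-demo-key"  # hypothetical key shared by generator and detector

def token_score(prev_word: str, candidate: str) -> float:
    """Deterministic pseudorandom score in [0, 1), keyed on the preceding word."""
    digest = hashlib.sha256(f"{SECRET_KEY}|{prev_word}|{candidate}".encode()).digest()
    return int.from_bytes(digest[:8], "big") / 2**64

def pick_watermarked(prev_word: str, candidates: list[str]) -> str:
    """Among the model's candidate next words, prefer the highest-scoring one."""
    return max(candidates, key=lambda c: token_score(prev_word, c))

def detect_score(words: list[str]) -> float:
    """Average per-word score; watermarked text trends above the ~0.5 baseline."""
    scores = [token_score(p, w) for p, w in zip(words, words[1:])]
    return sum(scores) / len(scores)

if __name__ == "__main__":
    vocab = ["quick", "brown", "lazy", "small", "clever", "swift"]
    prev, watermarked = "the", ["the"]
    for _ in range(50):
        # Pretend these three words are the LLM's top candidates for the next word.
        nxt = pick_watermarked(prev, random.sample(vocab, 3))
        watermarked.append(nxt)
        prev = nxt
    plain = ["the"] + [random.choice(vocab) for _ in range(50)]
    print("watermarked score:", round(detect_score(watermarked), 3))
    print("plain score:      ", round(detect_score(plain), 3))
```

Because each watermarked word was chosen as the highest-scoring of several candidates, the average per-word score of watermarked text drifts well above the roughly 0.5 expected for unwatermarked text; that gap is the kind of statistical signature a detector looks for.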

However, the researchers admit in their paper that it is still easy to alter text generated by Gemini and fool the detector. Even if users don't know which words to change, significantly editing the text, or asking another chatbot to summarize it, will likely obscure the watermark.

Testing text watermarks at scale

To ensure that SynthID-Text did not degrade the quality of chatbot responses, the team tested it on 20 million prompts given to Gemini. Half of these prompts were routed through the SynthID-Text system and received a watermarked response, while the other half received Gemini's standard response. Judging by users' thumbs-up and thumbs-down feedback, the watermarked answers were just as satisfying to users as the standard ones.

This is great for Google and for the developers building on Gemini. However, addressing the full problem of identifying AI-generated text (which some call AI slop) will require many more AI companies to implement watermarking technologies, ideally in an interoperable way, so that one detector can identify text from many different LLMs. And even in the unlikely event that all the major AI companies signed on to such an agreement, there would still be the problem of open-source LLMs, which can easily be modified to remove any watermarking functionality.

C2PA's MacCormack notes that detection becomes a particular problem when you start thinking about implementation in practice. “There are challenges when inspecting text in the wild,” he says, “where you would have to know what watermarking model was applied in order to know how and where to look for the signal.” Overall, he says, the researchers still have a lot of work to do. This effort “is not a dead end,” MacCormack says, “but it is the first step in a long journey.”
