Artificial intelligence (AI) is here to stay, and as it permeates nearly every facet of our everyday lives, it becomes imperative to take cognizance of the risk levels associated with AI. This is especially true of generative AI, addressed in Article 50(2) of the newly enacted European Artificial Intelligence Act, 2024 (“AI Act”) as “AI systems, including general-purpose AI systems, that generate synthetic content such as audio, images, videos, or text”. This essentially means any AI system that can manipulate media to create artificial content, such as deepfakes. It therefore becomes critical to assess how the necessary safeguards can be put in place for identifying and, where necessary, regulating such artificially generated content.
What does watermarking mean in the context of AI generated content?
Watermarking means marking any output (whether in text, image, or audio-visual format) generated through an artificial intelligence model, so that it is easily distinguishable as content generated by an AI model rather than by a human.
Watermarking may also serve as a digital signature, helping content creators mark ownership, prevent illegal or unauthorized use of their content, and restrict infringement of intellectual property. With watermarking, one can readily identify the creator of manipulated media, which in turn helps in self-regulating the still unfamiliar terrain of generative AI and content created through large language models.
Watermarking methodologies range from simplistic text tagging of content (which can be easily removed) to more complex and technical methods such as embedded graphical or cryptographic watermarks.
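To make the gap between these approaches concrete, here is a minimal sketch of one classic embedding technique, a least-significant-bit (LSB) image watermark. The file names and the mark string are hypothetical placeholders, and the scheme is illustrative only: it does not survive re-encoding or deliberate removal, which is exactly what more robust, cryptographically keyed watermarks aim to withstand.

```python
# A minimal sketch of a least-significant-bit (LSB) image watermark.
# Assumes Pillow and NumPy are installed; "input.png" and MARK are
# hypothetical placeholders. Real watermarking schemes are far more
# robust (keyed, spread-spectrum, designed to survive re-encoding).
import numpy as np
from PIL import Image

MARK = "ai-generated:model-x"  # hypothetical provenance label

def embed_lsb(in_path: str, out_path: str, mark: str) -> None:
    pixels = np.array(Image.open(in_path).convert("RGB"), dtype=np.uint8)
    bits = np.unpackbits(np.frombuffer(mark.encode(), dtype=np.uint8))
    flat = pixels.reshape(-1)  # view onto the pixel data
    if bits.size > flat.size:
        raise ValueError("image too small to hold the mark")
    # Overwrite the lowest bit of the first len(bits) channel values.
    flat[: bits.size] = (flat[: bits.size] & 0xFE) | bits
    # Save losslessly (PNG); lossy formats such as JPEG destroy the mark.
    Image.fromarray(flat.reshape(pixels.shape)).save(out_path)

def extract_lsb(path: str, n_chars: int) -> str:
    flat = np.array(Image.open(path).convert("RGB"), dtype=np.uint8).reshape(-1)
    bits = flat[: n_chars * 8] & 1
    return np.packbits(bits).tobytes().decode(errors="replace")

embed_lsb("input.png", "marked.png", MARK)
print(extract_lsb("marked.png", len(MARK)))  # -> "ai-generated:model-x"
```

Because the mark lives in the lowest pixel bits, even re-saving the file in a lossy format erases it, which is precisely why regulators and vendors are pushing towards more robust, standardized schemes.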
What is the global legal standing around watermarking?
Although there is an ongoing global discourse on whether and how to regulate AI, the AI Act is the first enacted legislation that defines many aspects of AI and adopts a risk-based approach to regulating and governing it. The law applies to any AI system made available within the territory of the European Union and categorizes generative AI systems as limited-risk systems, which are subject to transparency requirements, such as the obligation to inform users.
The law also puts forth mandatory watermarking standards, specifying that providers shall ensure that “outputs of the AI system are marked in a machine-readable format and detectable as artificially generated or manipulated”. Providers shall also ensure that “their technical solutions are effective, interoperable, robust, and reliable as far as this is technically feasible, taking into account specificities and limitations of different types of content, costs of implementation, and the generally acknowledged state-of-the-art, as may be reflected in relevant technical standards”.
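What “marked in a machine-readable format” can mean in practice is easiest to see with a toy example. The sketch below attaches a provenance record as PNG text metadata using the Pillow imaging library; the field names and values are hypothetical assumptions. Metadata of this kind is easily stripped on re-upload, so on its own it would not meet the Act's robustness expectations; in practice it is paired with embedded watermarks and standards such as C2PA content credentials.

```python
# A minimal sketch of machine-readable marking: attaching a provenance
# record as PNG text metadata with Pillow. The field names and values
# are hypothetical; metadata like this is easily stripped on re-upload,
# so real compliance efforts combine it with embedded watermarks.
import json
from PIL import Image
from PIL.PngImagePlugin import PngInfo

record = {
    "synthetic": True,                # detectable as AI-generated
    "generator": "example-model-v1",  # hypothetical model identifier
    "timestamp": "2024-08-01T12:00:00Z",
}

img = Image.open("output.png")
meta = PngInfo()
meta.add_text("ai_provenance", json.dumps(record))
img.save("output_marked.png", pnginfo=meta)

# Any downstream tool can read the mark back without guesswork:
reread = Image.open("output_marked.png")
print(json.loads(reread.text["ai_provenance"]))
```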
In the United States, there is the California Provenance, Authenticity and Watermarking Standards Bill (“Bill”), a California Assembly Bill yet to be passed by the Senate. The Bill specifies that, beginning February 01, 2025, a generative AI provider would be required to “…… among other things, place imperceptible and maximally indelible watermarks containing provenance data into synthetic content produced or significantly modified by a generative AI system that the provider makes available ……”. The Bill in fact goes a step further and provides that, on and from January 01, 2026, all newly manufactured digital cameras and recording devices sold, offered for sale, or distributed in California should offer users the option to place an authenticity watermark and provenance watermark in the content produced by that device.
Both the AI Act and the Bill also provide for hefty penalties for non-compliance, which can go as high as 3% of the total annual global turnover of the non-compliant entity.
In May 2024, Singapore also published its Model AI Governance Framework for Generative AI, which recognizes the need for digital watermarking and cryptographic provenance for AI-generated content.
In India, the Ministry of Electronics and Information Technology (MEITY) has issued advisories emphasizing that intermediaries must exercise due diligence by labelling synthetically created or modified content on their platforms. In its advisory dated March 15, 2024, MEITY clarified that “where any intermediary through its software or any other computer resource permits or facilitates synthetic creation, generation or modification of a text, audio, visual or audio-visual information, in such a manner that such information may be used potentially as misinformation or deepfake, it is advised that such information created, generated, or modified through its software or any other computer resource is labeled or embedded with permanent unique metadata or identifier, in a manner that such label, metadata or identifier can be used to identify that such information has been created, generated or modified using the computer resource of the intermediary. Further, in case any changes are made by a user, the metadata should be so configured to enable identification of such user or computer resource that has effected such change”. Penalties for non-compliance with the advisory have been linked to penalties and/or prosecution, as applicable, under the Information Technology Act, 2000.
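Both Singapore's reference to cryptographic provenance and MEITY's requirement that metadata identify who created or modified content point towards signed, chained provenance records. The sketch below illustrates the idea with a keyed hash (HMAC) from Python's standard library; the key handling, actor names, and field names are hypothetical assumptions, and real deployments (for instance, C2PA content credentials) use public-key signatures with certified signer identities rather than a shared secret.

```python
# A minimal sketch of cryptographic provenance: each record (creation
# or later edit) names an actor and is bound to the content hash with
# an HMAC, so tampering with the content or its history is detectable.
# Key handling and field names are hypothetical; real systems use
# public-key signatures and certified signer identities instead.
import hashlib
import hmac
import json

SECRET = b"demo-key"  # hypothetical; never hard-code keys in practice

def provenance_record(content: bytes, actor: str, prev_sig: str = "") -> dict:
    digest = hashlib.sha256(content).hexdigest()
    payload = json.dumps(
        {"content_sha256": digest, "actor": actor, "prev": prev_sig},
        sort_keys=True,
    )
    sig = hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()
    return {"payload": payload, "sig": sig}

def verify(record: dict) -> bool:
    expected = hmac.new(SECRET, record["payload"].encode(), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, record["sig"])

created = provenance_record(b"original synthetic image bytes", actor="model-x")
edited = provenance_record(b"edited image bytes", actor="user-42",
                           prev_sig=created["sig"])
print(verify(created), verify(edited))  # True True
```

Each record binds a content hash to an actor and to the previous record's signature, so a later edit both identifies the editor and becomes detectable if the chain is tampered with, which is the effect MEITY's advisory appears to contemplate.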
How is the industry moving towards watermarking regulations?
According to research conducted by the Mozilla Foundation (see here), “when it comes to identifying synthetic content, we’re at a glass half full, glass half empty moment. Current watermarking and labeling technologies show promise and ingenuity, particularly when used together. Still, they’re not enough to effectively counter the dangers of undisclosed synthetic content — especially amid dozens of elections around the world.”
Following feedback from the Oversight Board, Meta has been making changes to its Manipulated Media Policy (see here). While the mark for AI-generated content was earlier ‘Made with AI’, Meta has now moved to marking such content ‘AI Info’, which people can click on for more information.
Like any other new technology, generative AI is growing at an extremely fast pace, and legal regulation needs to go hand in hand with industry-driven self-regulation standards and technological means of governance, from both a legal and an ethical perspective. Watermarking seems like a promising tool in that respect.
The level, extent, and standards of watermarking, though, are still evolving and need to be closely followed, so as to strike a fine balance between fostering innovation and stifling it.