Patent specifications and drawings are vital carriers of technical information. Once disclosed by the China National Intellectual Property Administration (CNIPA) or other national patent offices, they become part of society’s publicly accessible knowledge resources. With the rapid development of generative artificial intelligence, a critical question has emerged: can published patent specifications and drawings be used as training data for AI models? In China, this use raises complex issues at the intersection of copyright law, the principle of fair use, and emerging regulatory requirements.
According to China’s Copyright Law, a work must possess originality and be expressed in a tangible form. Although patent specifications and drawings are subject to the formal requirements of patent law and must be examined and published by the patent administration, their essence is not that of administrative documents but rather technical literature created by applicants. As long as they meet the constituent requirements of a work, they should be protected under copyright law.
In terms of textual expression, sentence arrangement, and the design of drawings, patent specifications and drawings retain creative space, reflecting the author’s individual choices and even a “scientific aesthetic.” Thus, specifications and drawings that reach the minimum threshold of originality may indeed qualify as works within the meaning of copyright law.
Judicial practice has already provided clear guidance. In case (2021) Jing 73 Civil Final No. 4384[i], for example, the Beijing IP Court held that patent specifications and drawings are not administrative documents, and the examination and publication by the national patent authority does not alter their nature. The court found that the arrangement and choice of lines in the drawings demonstrated originality, warranting protection as a graphic work. Similarly, in case (2022) Shan IP Civil Final No. 112[ii], the Shanxi High People’s Court emphasized that the textual expression, word choice, and sentence structure in patent specifications exhibited originality, qualifying them as literary works, and that compliance with formal requirements does not negate their copyright attributes.
These cases demonstrate that courts generally recognize that patent specifications and drawings, when meeting the originality requirement, should be protected under copyright law rather than treated merely as technical documents. However, the purpose of disclosure is to disseminate technical knowledge, which naturally limits the exercise of copyright. The examination and publication by patent authorities, as well as the reproduction and dissemination of these documents by the public for the purpose of accessing technical information, are generally regarded as fair use, provided they do not interfere with the normal exploitation of the works or cause unreasonable harm to the copyright holder. Scholars likewise note that the technical disclosure function of patent specifications and drawings necessitates a moderate narrowing of copyright protection, so as not to impede the circulation and utilization of knowledge.
In the development of artificial intelligence, the legality of training data has always been one of the core issues. Incorporating published patent specifications and drawings into training datasets reflects the reuse of technical resources but also raises complex considerations under copyright law. If patent specifications and drawings possess originality, they qualify as works protected by copyright, and training activities may implicate reproduction rights or the right of communication through information networks, thereby creating potential infringement risks.
Balancing copyright protection with technological innovation remains a challenge for both judicial practice and policy. China’s Copyright Law establishes the “three-step test,” requiring that a use fall within the statutory circumstances, not interfere with the normal exploitation of the work, and not unreasonably prejudice the legitimate interests of the copyright holder. Whether AI training satisfies this framework is not yet settled. Although the training process is essentially an analytical use of works rather than a direct substitute for them, the boundaries of “normal exploitation” and “unreasonable prejudice” still require further judicial clarification.
In judicial practice, the Shanghai Intellectual Property Court in the 2023 LoRA/Ultraman case introduced the concept of “analytical use,” holding that the purpose of generative AI training is to parse the conceptual elements and expressive patterns of works rather than to reproduce them, and that such training may therefore be recognized as fair use. At the same time, the court emphasized that AI service providers must adopt measures to prevent users from generating infringing content, noting that fair use does not equate to exemption from liability. This precedent provides important guidance for AI training but also highlights the compliance responsibilities imposed on service providers.
Meanwhile, regulatory frameworks are being progressively refined. The “Interim Measures for the Administration of Generative Artificial Intelligence Services” explicitly require AI service providers to use training data from lawful sources, respect intellectual property rights, enhance the authenticity and diversity of data, and establish transparent mechanisms to prevent infringing outputs. These provisions not only provide enterprises with a compliance framework but also indicate that, even if judicial practice adopts a tolerant stance toward training activities, enterprises must still bear compliance obligations at the regulatory level.
Published patent specifications and drawings provide an important data resource for the development of artificial intelligence. In China, their use for AI training lies at the intersection of copyright law, the principle of fair use, and emerging regulatory requirements. The future direction may involve, on the one hand, gradually clarifying the boundaries of fair use through judicial precedent and, on the other, establishing workable compliance frameworks through policy and industry standards, thereby promoting innovation while safeguarding the order of intellectual property rights.
From an international perspective, the U.S. Copyright Office released its third report on artificial intelligence in May 2025, focusing on the training of generative AI models[iii]. The United States relies on the principle of fair use, and courts in multiple cases involving search engines and data mining have tended to recognize analytical rather than substitutive uses as fair use.
By contrast, the European Union established a text and data mining exception under the Copyright Directive (Directive (EU) 2019/790)[iv], allowing research institutions and certain commercial entities to use protected works under specific conditions, while granting rights holders an opt‑out choice to exclude their works from data mining or AI training.
These differences suggest that China’s future institutional design may draw on the U.S. path of case law in shaping the boundaries of fair use, while also considering the legislative models of the EU and Japan. By combining its own needs for intellectual property protection with industrial development, China can explore a balanced mechanism suited to its context.
This article was published by IAM in January 2026.
[i] https://wenshu.court.gov.cn/
[ii] https://wenshu.court.gov.cn/
[iii] https://www.copyright.gov/ai/Copyright-and-Artificial-Intelligence-Part-3-Generative-AI-Training-Report-Pre-Publication-Version.pdf