Japan AI Copyright Laws: Article 30-4 and Training Data

Japan operates what is widely described as the most permissive statutory regime for AI training data among major copyright jurisdictions. Under Article 30-4 of the Copyright Act of Japan, added by the 2018 amendment and in force since January 1, 2019, anyone may use copyrighted works as AI training data without a license, subject to narrow limits. Understanding those limits matters for every developer, researcher, or company working with Japanese-hosted data.
Information last verified on 2026-06-25. This article presents general legal information, not legal advice. Japanese guidance on AI and copyright continues to develop.
This article covers Japan's copyright framework for AI systems: AI training and text-and-data mining, the copyrightability of AI-generated output, and software protection under Japanese law. For a comparative overview, see how AI and copyright differ worldwide.
Article 30-4: Japan's Broad AI Training Exemption
Japan's 2018 amendment to the Copyright Act introduced Article 30-4, which came into force on January 1, 2019. The provision permits any person to exploit a copyrighted work "to the extent considered necessary" when the purpose is not to personally enjoy or cause another to enjoy the expression in that work. The Agency for Cultural Affairs (Bunka-cho) confirms this covers recording and storing works as AI training data. Because the provision is not limited to academic or non-commercial activity, it applies equally to commercial AI developers, startups, and large technology companies training foundation models. No licensing fee, no opt-out mechanism, and no mandatory disclosure requirement attaches to uses that fall within the rule. Legal scholars and industry observers widely describe Article 30-4 as the most permissive AI-training regime among major copyright jurisdictions.

What the Exemption Covers
| Use Case | Inside Art. 30-4? |
|---|---|
| Crawling public web content to train a general-purpose model | Yes |
| Using commercially licensed datasets for AI training (not reproducing for redistribution) | Generally yes |
| Academic text-and-data mining for non-AI research | Yes (same provision) |
| Fine-tuning a model to replicate a specific author's style for distribution | No (expression-enjoyment purpose) |
| Retrieval-augmented generation (RAG) that surfaces copyrighted text verbatim to end users | No (reproduced for enjoyment) |
| Reproducing a licensed database in a way that displaces its market | No (proviso applies) |

The Article 30-4 Proviso: Where the Exemption Stops
Article 30-4 contains a proviso: the exemption does not apply where the use "would unreasonably prejudice the interests of the copyright owner." This clause prevents the provision from becoming absolute. In 2024, the Agency for Cultural Affairs published its "General Understanding on AI and Copyright in Japan," interpretive guidance that identifies several categories of use that fall outside the exemption.
First, training or fine-tuning whose purpose includes enabling users to enjoy the protected expression is excluded. The clearest example is a system deliberately trained to reproduce a popular novelist's prose or a musician's lyrical style so that consumers can access that expression through the AI rather than purchasing the original work. Second, acquiring training data from known piracy sources does not qualify even if the downstream use would otherwise be permissible. Third, reproducing a licensed database in a manner that harms the market for that database falls under the proviso. The 2024 guidance is interpretive, not binding statute, but courts and practitioners treat it as authoritative administrative interpretation. Its conclusions are consistent with the legislative history of the 2018 amendment.
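For teams that want to record this screening step in a data pipeline, the three exclusions above can be sketched as a simple triage function. This is a minimal illustration, not a legal test: the `ProposedUse` fields, the `Triage` labels, and the yes/no framing are all assumptions made for the example, and real cases turn on facts that a boolean cannot capture.

```python
from dataclasses import dataclass
from enum import Enum


class Triage(Enum):
    LIKELY_INSIDE = "likely inside Art. 30-4"
    NEEDS_REVIEW = "outside or at risk: seek legal review"


@dataclass(frozen=True)
class ProposedUse:
    """Hypothetical description of a planned use of copyrighted works."""
    description: str
    aims_to_deliver_expression: bool   # purpose includes letting users enjoy the expression
    source_is_known_piracy: bool       # data acquired from a known piracy source
    displaces_licensed_database: bool  # reproduces a licensed database and harms its market


def triage(use: ProposedUse) -> tuple[Triage, list[str]]:
    """Return a coarse flag plus the reasons that triggered it."""
    reasons = []
    if use.aims_to_deliver_expression:
        reasons.append("purpose includes enjoyment of protected expression")
    if use.source_is_known_piracy:
        reasons.append("training data acquired from a known piracy source")
    if use.displaces_licensed_database:
        reasons.append("proviso: harms the market for a licensed database")
    flag = Triage.NEEDS_REVIEW if reasons else Triage.LIKELY_INSIDE
    return flag, reasons


if __name__ == "__main__":
    crawl = ProposedUse("general-purpose model trained on a public web crawl",
                        False, False, False)
    rag = ProposedUse("RAG system that shows source text verbatim",
                      True, False, False)
    for use in (crawl, rag):
        flag, reasons = triage(use)
        print(f"{use.description}: {flag.value} {reasons}")
```

Returning the reasons alongside the flag keeps an audit trail of why a dataset or project was escalated, which is more useful to a reviewing lawyer than a bare pass or fail.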
AI-Generated Output: No Copyright Without Human Creative Contribution
Under Japanese copyright doctrine, a work must originate from human creative expression to receive protection. Purely AI-generated output, where no human exercises creative judgment over the expressive choices in the result, is not copyrightable. This position is consistent with the Agency for Cultural Affairs' 2024 guidance and reflects how Japanese courts have historically treated works produced by non-human processes.

An AI system is treated as a tool, analogous to a camera or word processor. If a human directs the tool and makes substantive creative decisions about the form, arrangement, or content of the output, copyright can vest in that human's contribution. The threshold question is whether the human contribution rises above the level of supplying a prompt or idea. Ideas and concepts are not protectable under Japanese copyright law regardless of the medium, so instructing an AI with a detailed concept does not by itself create authorship. The human must shape the expressive result in a way that reflects individual creative choice. Where that threshold is met, the human contributor holds the copyright; where it is not, the output enters the public domain on creation.
Software Protection Under the Copyright Act
Article 10(1)(ix) of the Copyright Act of Japan lists "works of computer programming" among the categories of protected works. Software protection attaches automatically on creation, without registration, provided the program reflects the author's creative expression in the selection and arrangement of instructions. The protection is limited in scope. Article 10(3) expressly excludes from protection the programming language used to write the program, coding conventions (special rules on how that language is used in a particular program), and algorithms (the way instructions to the computer are combined in the program). This exclusion means that a developer who independently reimplements the logic or algorithm of a protected program, including after studying how it works, does not infringe copyright, provided the resulting code is independently authored. Protection runs for the author's life plus 70 years, or 70 years from publication for corporate-authored works.
For AI-generated software, the same human-creative-contribution analysis applies. Code produced autonomously by a model without human creative input in the expressive choices is not protected.
How Japan Differs from the United States
Japan and the United States share one position: neither country recognizes copyright in purely AI-generated output. Beyond that point, the two systems diverge significantly.
| Dimension | Japan | United States |
|---|---|---|
| AI training safe harbor | Broad statutory rule (Art. 30-4, no license required) | No statute; case-by-case fair use analysis |
| Basis for training exemption | Purpose-based: not for expression-enjoyment | Four-factor fair use (transformative use, market harm, etc.) |
| Commercial training | Permitted under Art. 30-4 | Contested; active litigation pending |
| AI output copyright | No (unless human creative contribution) | No (Copyright Office and courts consistent) |
| Software: algorithm protection | Excluded by Art. 10(3) | Excluded (17 U.S.C. § 102(b): ideas, procedures, processes, and methods of operation are not copyrightable) |
| Binding AI guidance | Interpretive only (Bunka-cho 2024) | Copyright Office reports; no binding rule yet |

The practical significance is that a company training a large model on Japanese web content has a clear statutory basis in Japan that does not exist in the United States. That said, the 2024 guidance narrows the margin: a model built to let users access protected expression without paying for it will face the same proviso risk in Japan that it faces as a market-substitution problem in a US fair use analysis.
The information in this article is general legal information as of 2026-06-25. It does not constitute legal advice and does not address every factual situation. Consult a lawyer qualified in Japanese intellectual property law before making decisions that rely on this analysis.
Related Articles
- How AI and copyright differ worldwide
- AI copyright law in the United States
- European Union AI copyright laws
- China AI copyright laws
Last updated: 2026-06-25.
Frequently Asked Questions
Can I train an AI model on copyrighted Japanese books or articles without a license?
Article 30-4 of Japan's Copyright Act generally permits this where the purpose is data analysis rather than enabling users to enjoy the protected expression. Commercial and non-commercial actors both qualify. The proviso applies if the use would unreasonably harm the rights holder, such as by systematically reproducing a licensed database in a way that substitutes for the original product.
Does Japan's AI training exemption apply to content from piracy sites?
No. The Agency for Cultural Affairs' 2024 guidance makes clear that acquiring training data from known piracy sources is outside the Article 30-4 exemption, even if the downstream training use would otherwise qualify. Using illegally distributed content as training data carries copyright infringement risk.
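In practice, one way to act on this is to screen crawl targets against a blocklist before any content is downloaded. The sketch below assumes a hypothetical `BLOCKED_DOMAINS` set with placeholder `.example` domains; where a real list would come from, and how "known" a source must be, are questions the code does not answer.

```python
from urllib.parse import urlsplit

# Hypothetical blocklist for illustration; a real one would need a maintained source.
BLOCKED_DOMAINS = {"pirate-library.example", "leaked-scans.example"}


def is_blocked(url: str, blocked: set[str] = BLOCKED_DOMAINS) -> bool:
    """True if the URL's host is a blocked domain or a subdomain of one."""
    host = (urlsplit(url).hostname or "").lower().rstrip(".")
    return any(host == d or host.endswith("." + d) for d in blocked)


def filter_urls(urls):
    """Split candidate URLs into (kept, dropped) so exclusions can be logged."""
    kept, dropped = [], []
    for url in urls:
        (dropped if is_blocked(url) else kept).append(url)
    return kept, dropped


if __name__ == "__main__":
    kept, dropped = filter_urls([
        "https://news.example/article",
        "https://cdn.pirate-library.example/book.pdf",
    ])
    print("kept:", kept)
    print("dropped:", dropped)
```

Matching on the parsed hostname rather than on the raw URL string avoids false positives from domains that merely contain a blocked name, and keeping the dropped list gives a record that the exclusion was actually applied.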
Who owns copyright in content generated by an AI system in Japan?
No one holds copyright in purely AI-generated content. Japanese copyright law requires a human creative contribution to the expressive choices in the work. If a human directs an AI and makes substantive creative decisions about the output's form and content, that human may hold copyright in their contribution. Simply providing a prompt or idea is not enough to establish authorship under current doctrine.
Is fine-tuning a model on a specific author's works legal in Japan?
It depends on the purpose. The Agency for Cultural Affairs' 2024 guidance states that fine-tuning or retrieval-augmented generation designed to reproduce or surface protected expression for users to enjoy falls outside Article 30-4. A fine-tuning project intended to capture a specific author's style for commercial deployment would likely be treated as outside the safe harbor and require a license.
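Developers sometimes add an output-side guard so that a deployed system does not surface long verbatim passages from its sources. The sketch below measures the longest character run shared by an output and a source text using Python's standard `difflib`. The function names and the `max_chars` threshold are assumptions for illustration only; no character count is a legal line under Article 30-4, and a check like this does not establish that a use falls inside the exemption.

```python
from difflib import SequenceMatcher


def longest_shared_run(output: str, source: str) -> str:
    """Return the longest character run that appears in both strings."""
    # autojunk=False disables difflib's popularity heuristic, which would
    # otherwise ignore frequent characters in long texts.
    matcher = SequenceMatcher(None, output, source, autojunk=False)
    m = matcher.find_longest_match(0, len(output), 0, len(source))
    return output[m.a:m.a + m.size]


def exceeds_verbatim_limit(output: str, source: str, max_chars: int = 50) -> bool:
    """Flag outputs sharing a run longer than max_chars (an arbitrary engineering threshold)."""
    return len(longest_shared_run(output, source)) > max_chars


if __name__ == "__main__":
    source = "the quick brown fox jumps over the lazy dog"
    output = "he said the quick brown fox ran away"
    print(repr(longest_shared_run(output, source)))
    print(exceeds_verbatim_limit(output, source, max_chars=10))
```

Working at the character level rather than on whitespace-separated words keeps the check usable for Japanese text, which does not delimit words with spaces. The comparison is slow on large inputs, so a production system would likely need an indexed approach.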
Are algorithms and programming languages protected by Japanese copyright?
No. Article 10(3) of the Copyright Act expressly excludes the programming language, rules, and algorithms from copyright protection, even when the software that implements them is protected. An independently written program using the same algorithm does not infringe. This aligns with US doctrine, which also treats abstract ideas and mathematical methods as outside copyright.
How is Japan's AI copyright framework different from the European Union's?
The EU's Digital Single Market Directive (Directive (EU) 2019/790) contains two text-and-data-mining exceptions. Article 3 covers research organisations and cultural heritage institutions mining for scientific research. Article 4 covers other users, including commercial ones, but lets rights holders opt out by expressly reserving their rights. Japan's Article 30-4 has no opt-out mechanism and covers commercial use by default. Japan's framework is therefore significantly broader than the EU's for commercial AI developers, though both systems deny copyright to purely AI-generated output.
Sources and References
- Copyright Act of Japan, Art. 30-4 (Agency for Cultural Affairs, 2018 amendment summary) (bunka.go.jp)
- Agency for Cultural Affairs, General Understanding on AI and Copyright in Japan (2024) (bunka.go.jp)
- Copyright Act of Japan, Art. 10 (CRIC English translation) (cric.or.jp)
- Agency for Cultural Affairs (Bunka-cho), Copyright in Japan (overview) (bunka.go.jp)