
APPI 2026: Practical Compliance Checklist for Using AI Training Data and Cross‑border Data Transfers in Japan

By Global Law Experts

The rules governing APPI AI training data in Japan shifted on April 7, 2026, when the Japanese Cabinet approved a package of amendments to the Act on the Protection of Personal Information (APPI) that directly affects every organisation developing, fine‑tuning or deploying machine‑learning models with personal data. The APPI 2026 amendment introduces a conditional consent exemption for certain low‑risk AI training uses, elevates pseudonymisation and biometric‑protection expectations, and tightens scrutiny of cross‑border data transfers under Article 28.

For general counsel, chief privacy officers and AI/ML leads, the practical effect is a simultaneous loosening and tightening: some datasets can now be processed without individual consent, provided the controller can demonstrate low risk and maintain robust documentation, while transfers involving sensitive categories face higher contractual and technical thresholds than before. This article provides a step‑by‑step compliance checklist, a low‑risk scoring matrix, sample contractual clauses, and an implementation roadmap designed to help in‑house teams translate these legislative changes into operational controls.

Key actions at a glance:

  • Consent change. Assess whether your AI training datasets qualify for the new low‑risk exemption, and document the analysis before relying on it.
  • Low‑risk test. Apply a structured scoring matrix covering identifiability, sensitivity, data origin and purpose alignment to each training pipeline.
  • Cross‑border safeguards. Update data‑export contracts, encryption standards and sub‑processor audit rights to meet the amended Article 28 expectations.

Quick APPI 2026 Summary: What Changed for AI Training Data

On April 7, 2026, the Japanese Cabinet approved amendments to the APPI that had been under policy development since late 2025 and early 2026. As reported by Mainichi at the time of Cabinet approval, the amendment package addresses the growing intersection of personal data processing and artificial intelligence by creating explicit, albeit conditional, pathways for using personal information in model training without obtaining prior individual consent. The Personal Information Protection Commission (PPC), Japan’s primary data‑protection regulator, had published policy‑direction papers outlining these changes months earlier, giving practitioners advance notice of the legislative trajectory.

The APPI amendment touches four areas that are directly relevant to AI teams:

  • Consent exemption for low‑risk AI training. Controllers may process personal data for statistical analysis or AI model training without individual consent where the use is assessed and documented as low‑risk, meaning it is unlikely to infringe the rights or interests of data subjects.
  • Pseudonymisation emphasis. The amendment reinforces the expectation that controllers adopt robust pseudonymisation techniques when processing personal data for secondary purposes, including AI development.
  • Biometric and sensitive‑data restrictions. The amended APPI elevates protections for biometric identifiers, health data and other special‑care‑required personal information, effectively excluding these categories from the low‑risk exemption without additional safeguards and justification.
  • Cross‑border transfer scrutiny. Article 28 obligations are expanded to require documented contractual and technical safeguards when personal data, or datasets derived from personal data, are transferred overseas for model training, hosting or inference.

| Change | Impact | Required Action |
|---|---|---|
| Consent exemption for low‑risk AI training (April 7, 2026) | Allows certain personal data uses without consent where low risk and documented | Run low‑risk scoring, document rationale, update privacy notices and DPIAs |
| New emphasis on pseudonymisation / biometric protections | Stronger regulator expectations for de‑identification and elevated controls on biometric data | Adopt technical pseudonymisation standards, restrict biometric training sets |
| Cross‑border transfer scrutiny (Article 28 expectations) | Transfers for model training will require specific contractual and technical safeguards | Update contracts, adopt security measures and review hosting/training flows |

Timeline of Key Compliance Steps

| Date / Period | Milestone | Action for Compliance Teams |
|---|---|---|
| April 7, 2026 | Cabinet approval of APPI amendment bill | Begin internal impact assessment; brief board and AI teams |
| Q2 2026 | Diet deliberation and expected enactment | Monitor legislative progress; prepare updated DPIAs and contract templates |
| Q3–Q4 2026 (anticipated) | PPC implementing guidelines and Q&A publication | Map PPC guidance to internal controls; finalise low‑risk assessment methodology |
| 2027 (anticipated) | Full enforcement of amended provisions | Complete contract remediation, staff training and audit readiness |

Which Datasets Qualify for the Low‑Risk AI Exemption? An Assessment Checklist

The consent exemption for APPI AI training data in Japan is not a blanket permission; it is a conditional pathway that requires a documented risk assessment. Industry observers expect the PPC to treat “low risk” as an objective, multi‑factor determination rather than a subjective business judgment. In‑house counsel should therefore adopt a structured scoring methodology that can be audited and defended.

The legal test centres on whether the proposed use of personal data for AI model training is unlikely to unjustly infringe the rights or legitimate interests of data subjects. The following factors are key to that determination:

  • Identifiability. Can the dataset, alone or in combination with other available data, identify a specific individual? Pseudonymised and aggregated datasets score lower risk; raw data with direct identifiers scores higher.
  • Sensitivity. Does the dataset include special‑care‑required personal information (health, criminal history, beliefs, race, biometric identifiers)? Sensitive categories are presumed higher risk and will generally not qualify for the exemption without additional controls.
  • Data origin. Was the data collected directly from subjects with a disclosed purpose, or was it obtained from a third party or scraped from public sources? Data obtained with a broader original purpose specification scores lower risk.
  • Purpose alignment. How closely does the AI training purpose align with the original collection purpose? Closely aligned purposes (e.g., product improvement analytics) score lower risk than entirely novel uses.
  • Output proximity. Is the model likely to generate outputs that could re‑identify individuals or reveal sensitive attributes? Models producing aggregated statistical outputs score lower risk than generative systems producing individual‑level content.

Scoring Matrix

| Factor | Low Risk (1 pt) | Medium Risk (2 pts) | High Risk (3 pts) |
|---|---|---|---|
| Identifiability | Fully pseudonymised, no re‑identification path | Pseudonymised but combinable with other datasets | Direct identifiers present |
| Sensitivity | Non‑sensitive data only | Includes inferred sensitive attributes | Contains biometrics, health or criminal data |
| Data origin | First‑party, broad purpose disclosed | Third‑party with contractual controls | Scraped or obtained without documented basis |
| Purpose alignment | Closely related to original purpose | Reasonably related, documented justification | Entirely unrelated to original collection purpose |
| Output proximity | Aggregated, non‑individual outputs | Segment‑level outputs possible | Individual‑level or generative outputs |

Threshold guidance: A total score of 5–7 points indicates the dataset is likely eligible for the low‑risk exemption, subject to documentation. A score of 8–11 requires additional safeguards and may need a DPIA. A score of 12–15 means the controller should obtain consent or apply full anonymisation before proceeding. These thresholds reflect the likely practical interpretation of the amendment; early indications suggest the PPC will expect controllers to maintain scored assessments as auditable records.
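For teams that want to operationalise the matrix, the scoring and threshold logic above can be sketched as a small helper. The factor names, point values and 5–7 / 8–11 / 12–15 bands mirror this article's matrix, which reflects anticipated practice rather than an official PPC methodology; the function name and tier wording are illustrative.

```python
# Sketch of the five-factor APPI low-risk scoring matrix described above.
# Thresholds follow the article's anticipated bands, not official PPC guidance.

FACTORS = ("identifiability", "sensitivity", "data_origin",
           "purpose_alignment", "output_proximity")

def classify_dataset(scores: dict) -> tuple:
    """Return (total, tier) for a dataset scored 1-3 on each factor."""
    missing = [f for f in FACTORS if f not in scores]
    if missing:
        raise ValueError(f"unscored factors: {missing}")
    if any(not 1 <= scores[f] <= 3 for f in FACTORS):
        raise ValueError("each factor must score 1 (low) to 3 (high)")
    total = sum(scores[f] for f in FACTORS)
    if total <= 7:
        tier = "likely eligible for low-risk exemption (document assessment)"
    elif total <= 11:
        tier = "additional safeguards required; DPIA may be needed"
    else:
        tier = "obtain consent or fully anonymise before proceeding"
    return total, tier

# Example: pseudonymised first-party analytics dataset
example = {
    "identifiability": 1,      # fully pseudonymised
    "sensitivity": 1,          # non-sensitive data only
    "data_origin": 1,          # first-party, disclosed purpose
    "purpose_alignment": 2,    # reasonably related, documented
    "output_proximity": 1,     # aggregated outputs only
}
total, tier = classify_dataset(example)  # total == 6, within the 5-7 band
```

Retaining the scored dictionary alongside the rationale for each factor gives the auditable record the documentation requirements below call for.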

Documentation and Recordkeeping Requirements

For every dataset processed under the low‑risk exemption, controllers should maintain a compliance file containing: the completed scoring matrix with supporting rationale for each factor; a description of the AI training purpose and expected outputs; evidence of technical controls applied (pseudonymisation method, access restrictions); the date of the assessment and the identity of the responsible officer; and any subsequent reviews or score changes triggered by pipeline modifications. This documentation is likely to become the first item requested in any PPC inquiry.

Practical Compliance Checklist for APPI AI Training Data in Japan

The APPI 2026 amendment requires in‑house teams to implement coordinated legal, technical and operational controls before processing personal data for AI model training. The checklist below is designed as an operational tool that compliance officers can assign, track and evidence.

Legal Controls

  1. Confirm lawful basis. For each training dataset, determine whether consent, the low‑risk exemption, anonymisation or another statutory basis applies. Document the decision in the compliance file.
  2. Update purpose specifications. Review existing privacy notices and terms of service to ensure AI training and model development are explicitly listed as processing purposes. Where notices lack this language, issue supplementary disclosures before processing begins.
  3. Conduct or update DPIAs. Complete a data‑protection impact assessment for every AI training pipeline that involves personal data. The DPIA should address identifiability, data‑flow mapping, sub‑processor involvement and output risks.
  4. Review and amend contracts. Audit existing data‑processing agreements with vendors, cloud providers and sub‑processors to ensure they include AI‑specific clauses covering permitted uses, return/deletion obligations, cross‑border transfer safeguards and audit rights.
  5. Map consent dependencies. Where the low‑risk exemption does not apply (e.g., biometric or health data), confirm that valid, specific consent has been obtained and recorded. Implement consent‑management tooling if necessary.

Technical Controls

  1. Apply pseudonymisation at ingestion. Personal data entering training pipelines should be pseudonymised using hashing, tokenisation or differential privacy techniques before it reaches the model‑training environment.
  2. Enforce data minimisation. Strip or mask any data fields that are not required for the stated training purpose. Avoid over‑collection by specifying the minimum viable feature set for each model.
  3. Implement retention schedules. Define and enforce automated deletion or anonymisation of training datasets after the model development cycle concludes, unless a documented justification exists for retention.
  4. Secure the labelling layer. Where human annotators interact with personal data during labelling, apply access controls, audit logging and contractual confidentiality obligations.
  5. Encrypt data at rest and in transit. All training data should be encrypted using industry‑standard protocols (AES‑256 at rest, TLS 1.3 in transit) with key management separated from the processing environment.
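As a minimal sketch of the first technical control (pseudonymisation at ingestion), the snippet below replaces direct identifiers with keyed-hash tokens using HMAC‑SHA256 from the Python standard library. The field names, key value and token length are illustrative assumptions; in production the key should live in a key-management system separated from the training environment, as the encryption item above notes.

```python
# Hypothetical pseudonymisation-at-ingestion step using a keyed hash.
# A keyed hash (unlike a plain hash) prevents dictionary attacks on
# low-entropy identifiers, provided the key is kept out of the pipeline.

import hmac
import hashlib

def pseudonymise(record: dict, identifier_fields: set, key: bytes) -> dict:
    """Replace direct identifiers with stable keyed-hash tokens."""
    out = {}
    for field, value in record.items():
        if field in identifier_fields:
            digest = hmac.new(key, str(value).encode("utf-8"),
                              hashlib.sha256).hexdigest()
            out[field] = digest[:16]  # truncated token; keep full digest if collision risk matters
        else:
            out[field] = value
    return out

raw = {"user_id": "U-1029", "email": "taro@example.com", "purchase_count": 7}
safe = pseudonymise(raw, {"user_id", "email"}, key=b"rotate-me-regularly")
# safe["purchase_count"] is unchanged; identifiers become opaque tokens
```

Because the same input and key always yield the same token, records can still be joined across tables without exposing the underlying identifier.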

Operational Controls

  1. Maintain a data inventory. Catalogue every dataset used in AI training, including origin, legal basis, volume, retention date and associated low‑risk score. Update the inventory at each pipeline change.
  2. Implement access controls. Restrict access to training data to authorised personnel using role‑based access controls (RBAC). Log all access events.
  3. Log model‑training activities. Record training runs, hyperparameters, dataset versions, dates and responsible engineers. These logs serve as evidence of data protection compliance in Japan and support model governance requirements.
  4. Build explainability artefacts. For models trained on personal data, maintain documentation of training methodology, input data characteristics and model behaviour sufficient to respond to regulator inquiries or data‑subject requests.
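The training-activity logging control above can be illustrated with a simple audit-log entry. The field names and the tamper-evidence approach are assumptions to adapt to your MLOps stack, not a prescribed format.

```python
# Illustrative training-run log entry supporting "Log model-training
# activities". Serialised JSON with a content hash makes later tampering
# detectable when entries are chained or archived.

import json
import hashlib
from datetime import datetime, timezone

def training_run_record(dataset_version: str, hyperparameters: dict,
                        engineer: str, legal_basis: str) -> str:
    """Serialise an auditable training-run entry as a JSON line."""
    entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "dataset_version": dataset_version,
        "hyperparameters": hyperparameters,
        "engineer": engineer,
        "legal_basis": legal_basis,  # e.g. "low-risk exemption (assessment #42)"
    }
    payload = json.dumps(entry, sort_keys=True)
    entry["entry_hash"] = hashlib.sha256(payload.encode()).hexdigest()
    return json.dumps(entry, sort_keys=True)

line = training_run_record("ds-2026-04-v3", {"lr": 3e-4, "epochs": 5},
                           "m.sato", "low-risk exemption")
```

Recording the legal basis in the same entry as the dataset version ties each training run back to its compliance file.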
| Checklist Item | Owner | Timeframe | Evidence to Retain |
|---|---|---|---|
| Confirm lawful basis per dataset | Privacy / Legal | Before processing | Compliance file with scored assessment |
| Update privacy notices | Legal / Comms | Within 30 days | Published notice version with date stamp |
| Complete DPIA | DPO / Privacy | Before processing | Signed DPIA document |
| Pseudonymise at ingestion | Engineering / Data | Before processing | Technical specification and audit log |
| Amend vendor contracts | Legal / Procurement | Within 90 days | Executed amendment or new agreement |
| Implement training logs | ML Engineering | Ongoing | Automated log repository |
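The retention-schedule control in the technical checklist can be enforced with a simple automated check. The 180‑day default and the dates here are illustrative assumptions; the amendment does not prescribe a statutory retention period, so the schedule should come from your documented justification.

```python
# Sketch of an automated retention check ("Implement retention schedules").
# Datasets past their retention window should be deleted or anonymised
# unless a documented justification for retention exists.

from datetime import date, timedelta

def retention_expired(ingested, retention_days=180, today=None):
    """Return True when a dataset has outlived its retention period."""
    today = today or date.today()
    return today > ingested + timedelta(days=retention_days)

# A dataset ingested on 1 Jan 2026 exceeds a 180-day window by 20 July 2026
expired = retention_expired(date(2026, 1, 1), 180, today=date(2026, 7, 20))
```

Running a check like this on the data inventory at each pipeline change keeps the retention evidence current for audits.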

Do companies still need individual consent to use datasets for AI development after the APPI 2026 amendment? The answer is nuanced. For datasets that score within the low‑risk threshold (pseudonymised, non‑sensitive, purpose‑aligned and producing aggregated outputs), the amendment permits processing without individual consent, provided the controller documents the assessment and maintains appropriate technical and operational controls. However, datasets containing biometric identifiers, health records or other special‑care‑required personal information will generally still require consent or full anonymisation. In practical terms, most large‑scale AI training pipelines will include a mix of both categories, requiring a dataset‑by‑dataset determination.

Cross‑Border Data Transfers and Safeguards for AI Training Pipelines

The APPI amendment strengthens expectations for cross‑border data transfer under Article 28, and this has immediate implications for any organisation sending training data to overseas cloud infrastructure, offshore labelling teams, or foreign model‑development partners. The PPC expects controllers to implement transfer mechanisms that provide a level of protection equivalent to what the APPI affords domestically.

Three primary transfer mechanisms are available under the amended framework:

  • Adequacy determination. Transfer to a jurisdiction recognised by the PPC as providing an equivalent level of data protection. The PPC currently recognises the EU and the UK under its adequacy framework.
  • Contractual safeguards. Where no adequacy determination exists, the controller must execute a contract with the foreign recipient that includes specific data‑protection obligations equivalent to APPI requirements.
  • Consent‑based transfer. Obtaining the data subject’s informed consent to the specific cross‑border transfer, including disclosure of the destination country’s data‑protection regime.
| Mechanism | When to Use | Pros & Cons |
|---|---|---|
| Adequacy determination | Transfer to EU/UK‑based cloud or partners | Simplest pathway; limited to recognised jurisdictions |
| Contractual safeguards (Article 28) | Transfer to US, APAC or other non‑adequate jurisdictions | Flexible; requires robust contract drafting and ongoing audit |
| Consent‑based transfer | Where contractual safeguards are impractical or datasets are small | Provides clear basis; burdensome at scale, consent fatigue risk |

Practical Safeguards for Cross‑Border AI Data Transfers

For contractual‑safeguard transfers (the most common mechanism for AI training pipelines), controllers should implement a layered set of protections:

  • Data‑export clauses. Include specific provisions in the data‑processing agreement addressing: the categories of personal data transferred; the purpose limitation (AI model training only); prohibition on secondary use; and mandatory return or destruction upon project completion.
  • Technical measures. Encrypt training data before transfer. Where feasible, consider split‑processing architectures (federated learning, secure enclaves) that keep raw personal data within Japan while transmitting only model updates or aggregated outputs.
  • Sub‑processor controls. Require the foreign recipient to obtain written approval before engaging sub‑processors, and to flow down all APPI‑equivalent obligations contractually.
  • Audit rights. Reserve the right to audit the foreign recipient’s data‑handling practices, including on‑site inspections and documentary reviews, at least annually.
  • Breach notification. Require the foreign recipient to notify the controller within a specified period (industry observers expect 72 hours to become the prevailing standard) of any data breach affecting transferred training data.

When Transfers Are Higher Risk

Transfers involving biometric data, health records or other special‑care‑required personal information face elevated scrutiny under the APPI amendment. For these categories, early indications suggest the PPC will expect: a dedicated DPIA for the cross‑border element; DPO or senior privacy officer sign‑off; enhanced encryption and access‑control standards; and additional contractual warranties regarding the recipient’s security posture and legal environment. Controllers should treat any transfer of sensitive training data as a high‑risk processing activity that requires pre‑approval at the governance level.

Contracts, Vendor Management and Sample Clauses

Effective data protection compliance in Japan now requires AI‑specific contract clauses that go beyond standard data‑processing agreements. Vendor management should include pre‑engagement due diligence on the recipient’s security certifications, data‑handling practices and sub‑processor chain, followed by ongoing monitoring and periodic audits.

The following must‑have clauses should appear in any agreement governing the sharing of personal data for AI model training:

  • Purpose limitation and permitted use. Restrict the recipient’s use of data to the specified AI training purpose; prohibit use for the recipient’s own model development or commercial resale.
  • Model IP and derivatives. Clarify ownership of trained models, weights and derived datasets; specify whether the recipient retains any interest in model outputs produced from the controller’s data.
  • Security warranties. Require the recipient to maintain specified security standards (e.g., ISO 27001, SOC 2 Type II) and to implement pseudonymisation and encryption at agreed levels.
  • Audit and inspection. Grant the controller annual audit rights and the ability to request ad‑hoc audits following a security incident.
  • Return and deletion. Require certified deletion or return of all personal data and derived datasets upon termination, with evidence of destruction.

Sample Clause A: Cross‑Border Transfer Safeguard

“The Recipient shall process Personal Data transferred under this Agreement solely for the purpose of [specified AI model training] and shall not transfer such data to any third party or sub‑processor without the prior written consent of the Controller. The Recipient shall implement and maintain technical and organisational measures at least equivalent to those required by Japan’s Act on the Protection of Personal Information (APPI), including encryption in transit and at rest, access controls, and breach notification within 72 hours of discovery. The Recipient shall submit to audits by the Controller or its designated third party upon reasonable notice.”

Sample Clause B: AI Training Data Use and Retained Rights

“The Controller grants the Recipient a limited, non‑exclusive, non‑transferable licence to use the Datasets solely for the purpose of training the Model described in Schedule [X]. The Recipient shall not use the Datasets for any other purpose, including the training of its own proprietary models. All rights in the trained Model weights and architecture shall vest in the Controller. Upon completion of training or termination of this Agreement, the Recipient shall certify in writing the deletion of all copies of the Datasets from its systems within [30] days.”

Regulator Engagement, Breach Response and Enforcement Risk

The PPC has signalled that its post‑amendment enforcement priorities will focus on three areas: improper use of biometric and sensitive personal information in AI training; failures to implement documented low‑risk assessments before relying on the consent exemption; and cross‑border transfers that lack adequate contractual and technical safeguards. Industry observers expect enforcement actions to increase as the PPC builds institutional capacity in AI governance in Japan.

Controllers should notify the PPC and affected individuals when a breach involves unauthorised disclosure, loss or misuse of personal data processed for AI training, particularly where sensitive categories are involved or where data has been transferred overseas. Maintaining a regulator‑inquiry response kit is prudent. That kit should include: a summary of all datasets processed under the low‑risk exemption; copies of all scored assessments and DPIAs; a current data‑flow map showing cross‑border transfer paths; executed contracts with foreign recipients; and breach‑response logs.

What to Expect from PPC Inquiries

In the event of a PPC inquiry or investigation, controllers should be prepared to demonstrate: the legal basis relied upon for each training dataset; the methodology used for the low‑risk assessment; technical controls applied (pseudonymisation evidence, encryption certificates); contractual safeguards governing any cross‑border transfers; and the chain of custody for training data from ingestion to model deployment. Designating a single point of contact for regulator interactions, typically the DPO or a senior privacy counsel, streamlines response and reduces the risk of inconsistent communications.

Implementation Roadmap and Risk Matrix for APPI 2026 Compliance

A phased approach to implementation helps organisations prioritise high‑risk activities while building sustainable data protection compliance in Japan. The roadmap below allocates actions across three horizons:

  • 0–90 days. Complete initial impact assessment; identify all AI training pipelines using personal data; run low‑risk scoring on active datasets; brief the board and DPO; begin contract‑remediation process for highest‑risk vendor relationships.
  • 90–180 days. Finalise DPIAs for all training pipelines; implement technical controls (pseudonymisation at ingestion, encryption upgrades); execute amended contracts with all foreign data recipients; deploy training‑activity logging.
  • 180–365 days. Conduct first‑cycle audits of vendor compliance; review and update low‑risk assessments following PPC implementing guidelines; train all relevant staff on updated procedures; establish ongoing monitoring cadence.
| Risk Level | Dataset Characteristics | Priority Actions | Owner |
|---|---|---|---|
| High | Biometric, health or sensitive data; cross‑border transfer to non‑adequate jurisdiction | Obtain consent or anonymise; execute enhanced contracts; DPO sign‑off; complete DPIA immediately | DPO / Legal / CISO |
| Medium | Pseudonymised data with partial re‑identification risk; transfer to adequate jurisdiction | Complete low‑risk scoring; update contracts; implement technical controls within 90 days | Privacy / Engineering |
| Low | Fully pseudonymised, non‑sensitive, purpose‑aligned, domestic processing | Document low‑risk assessment; update privacy notice; implement standard technical controls | Privacy / Data team |

Need Legal Advice?

This article was produced by Global Law Experts. For specialist advice on this topic, contact Noboru Kitayama at Mori Hamada & Matsumoto, a member of the Global Law Experts network.

Sources

  1. Personal Information Protection Commission (PPC), English Site
  2. Mainichi, Cabinet Approval Coverage (April 7, 2026)
  3. IAPP, Navigating Japan’s Proposed APPI Amendments
  4. Mori Hamada & Matsumoto, APPI Amendments Newsletter
  5. Nishimura & Asahi, Policy Direction for Amendment of the APPI
  6. Fisher Phillips, Japanese Cabinet Approves APPI Amendments
  7. Global Law Experts, Japan AI and Data Protection Law 2026
  8. Google Cloud, APPI Compliance Overview
  9. Legal 500, Japan APPI Amendments Guide
  10. Chambers Practice Guide, Data Protection & Privacy 2026: Japan

FAQs

Q1: What does the APPI amendment mean for using personal data to train AI models in Japan?
The APPI 2026 amendment introduces a conditional exemption that allows controllers to use personal data for AI model training without individual consent, provided the use is assessed as low‑risk, documented with a scored assessment and supported by appropriate pseudonymisation and technical safeguards. Sensitive categories such as biometric and health data remain subject to elevated restrictions.

Q2: Do companies still need individual consent to use existing datasets for AI development?
Not always. For datasets that meet the low‑risk threshold (pseudonymised, non‑sensitive, purpose‑aligned and producing aggregated outputs), consent is no longer required if the controller documents the risk assessment. However, datasets containing special‑care‑required personal information, including biometric identifiers, will generally still require consent or full anonymisation.

Q3: How should controllers assess whether a dataset qualifies as low‑risk?
Controllers should apply a structured scoring matrix that evaluates five factors: identifiability, sensitivity, data origin, purpose alignment and output proximity. A total score within the low‑risk threshold (typically 5–7 on a 15‑point scale) indicates likely eligibility. The scored assessment must be documented and retained as an auditable record.

Q4: What contractual safeguards are required for cross‑border transfers of training data?
Under the amended Article 28, controllers transferring training data overseas must execute contracts that include purpose limitation, prohibition on secondary use, sub‑processor approval requirements, encryption and security obligations, audit rights and breach‑notification timelines. The contract must provide a level of protection equivalent to what the APPI affords domestically.

Q5: When must a breach involving AI training data be reported?
Controllers must notify the PPC and affected individuals when a breach involves unauthorised disclosure, loss or misuse of personal data, particularly where sensitive categories are affected or data has been transferred cross‑border. Controllers should maintain a breach‑response playbook with designated contacts, notification templates and documented timelines.

Q6: Can pseudonymised data be used for AI training without consent?
Yes, provided the pseudonymisation is robust (no reasonable re‑identification path exists), the low‑risk assessment has been completed and documented, and appropriate technical safeguards are maintained throughout the training pipeline. Controllers should retain evidence of the pseudonymisation method used and conduct periodic reviews to confirm that re‑identification risk has not increased due to new data availability.
