The Claude Code Leak and IP Considerations

06 May 2026

The Anthropic Claude Code Leak

On March 31, 2026, AI industry leader Anthropic shocked the market when the proprietary source code of “Claude Code”, its flagship agentic coding tool, was leaked. A debugging artifact known as a sourcemap file was mistakenly bundled into the version 2.1.88 release on the public registry. The sourcemap file exposed approximately 512,000 lines of Claude Code source code, spread across nearly 2,000 files, to the public.
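For readers less familiar with the mechanics, a sourcemap is a JSON file that links bundled code back to the original files, and the format allows the full original text to be embedded in an optional `sourcesContent` field. The Python sketch below shows the kind of pre-release check a publisher might run to flag such files. It is an illustration only: the function name `find_embedded_sources` and the directory layout are hypothetical, and nothing here describes Anthropic's actual release process or the contents of the leaked file.

```python
import json
from pathlib import Path


def find_embedded_sources(package_dir: Path) -> dict[str, int]:
    """Map each *.map file under package_dir to the number of original
    source files whose full text is embedded in it.

    Hypothetical helper for illustration; not part of any real tool.
    """
    findings: dict[str, int] = {}
    for map_path in sorted(package_dir.rglob("*.map")):
        try:
            data = json.loads(map_path.read_text(encoding="utf-8"))
        except (OSError, ValueError):
            # Unreadable or non-JSON file: not a usable sourcemap, skip it.
            continue
        if not isinstance(data, dict):
            continue
        # "sourcesContent" is optional; entries may be null when the
        # original text was not embedded.
        contents = data.get("sourcesContent")
        if not isinstance(contents, list):
            continue
        embedded = sum(1 for text in contents if isinstance(text, str) and text)
        if embedded:
            findings[map_path.relative_to(package_dir).as_posix()] = embedded
    return findings
```

Calling `find_embedded_sources(Path("dist"))` on a build directory before publishing would list any sourcemap that carries readable source text. An empty result suggests, under the assumptions above, that no original code is being shipped by this route.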

If you follow AI, you will understand why the Claude Code leak is so significant. Claude Code is a groundbreaking AI tool, representing a paradigm shift in how humans interact with AI since the launch of ChatGPT. More than just a chat window, Claude Code is a highly intelligent and autonomous AI coding agent. You can give Claude Code high-level instructions or describe your objectives in natural language. Claude Code will then plan the execution of your instructions logically, step by step. It then works directly on your computer: reading data files, writing and modifying code files, creating directories, and moving and organising[i] files into those directories. Given the right tools, it can also perform tasks such as web search and working with PDFs.

While Anthropic confirmed that model weights, training data, and customer credentials were not compromised, the leak laid bare the highly proprietary “agent harness” layer[ii] and architectural secrets, such as its memory system, agent tools, and other not-yet-released agentic features.

Anthropic officially characterised the event as a “release packaging issue caused by human error” rather than a security breach.[iii]

“Cat Out of the Bag”

Various groups around the world woke up to the leak of one of the most coveted AI tools in existence and immediately engaged with the exposed source code, including:

Security researchers: Chaofan Shou was the researcher who first identified the vulnerability and publicly disclosed that the readable TypeScript codebase could be reconstructed via the accidental sourcemap file.

Power users and independent developers: some initiated an overnight rewrite of the tool – see the “clean-room” discussion below.

Rival AI companies: Anthropic’s competitors were identified as primary beneficiaries who could use the leak as a map to replicate Anthropic’s design logic.

Malicious actors: hackers quickly seeded “trojanized” versions of the leaked repository to spread malware.

Global developer community: thousands of programmers mirrored, forked, and dissected the code across GitHub, with some estimates suggesting over 8,000 repositories were initially involved in the spread.

In this article, we focus on three categories of actions taken by the developers in response to the leak:

Direct copy: Within hours of the leak, developers mirrored the codebase across GitHub and decentralised platforms.

Translation: Many developers translated (most likely by AI) the original TypeScript codebase into other programming languages such as Python.

“Clean-room” reimplementation: A methodology employed to replicate a computer program’s functionality[iv]:

A “Dirty Team” is set up to analyse the original proprietary codebase. Based on the analysis, the Dirty Team writes out purely functional specifications of the exposed program.

A “Clean Team” is set up with no knowledge of the original proprietary codebase. The Clean Team reads the Dirty Team’s functional specifications and builds a new program from them.

By separating the two teams, the argument is that the codebase re-implemented by the Clean Team gives rise to an “independent creation” – an established legal defence in copyright infringement claims.

[Figure omitted. Source: NotebookLM]

Copyright protection of source code under Hong Kong law

We will look at the three categories of actions above from the perspective of Hong Kong IP law.

In Hong Kong, computer source code is protected under the Copyright Ordinance (Cap 528) (“Copyright Ordinance”) as a “literary work”[v]. That means the literal text of the code, whether in a digital or a material medium, is protected. This covers both human-readable “source code” and machine-readable “object code”.

Crucially, copyright law protects only the form of expression rather than the underlying substance or idea. Under this idea-expression dichotomy, the specific text of a program is protected, but its business logic, functional effects, and general methods of operation are not, as these are viewed as the ideas or methods underlying the software.

Beyond the literal text of the code, computer programs can also receive protection for their artistic design. Visual elements created by the source code, such as icons and Graphical User Interfaces (GUIs), are protected as “artistic works”[vi].

Hong Kong law protects both human-generated and computer-generated works. Unlike Europe, the US and some other jurisdictions that maintain a strict human authorship requirement, the Copyright Ordinance specifically provides protection for works that are computer-generated. The government conducted a public consultation on copyright and artificial intelligence in 2024 and reaffirmed that the Copyright Ordinance contains applicable provisions to protect the copyright of AI-generated works.[vii]

Trade Secrets and the Breach of Confidence

In Hong Kong, trade secrets and undisclosed commercial information are protected under the common law doctrine of confidence. This doctrine does not require an express contract to be effective; an obligation of confidence arises whenever information is acquired by a person who knows, or as a reasonable person ought to know, that the owner wishes to keep that information confidential. Consequently, even if a third party is not subject to a formal non-disclosure agreement, they may still be liable for a breach of confidence if they knowingly exploit the leaked data.

A critical concern in the wake of a leak such as Anthropic’s is whether the information entering the “public arena” automatically causes it to lose its status as a secret. While trade secret protection generally lasts only as long as the information remains confidential, the law does not allow a defendant to benefit from a leak if they had notice of its confidential nature.

Taking reasonable steps to safeguard information remains the primary requirement for maintaining legal trade secret status. In the event of an exposure, owners must act within a reasonably short time to apply for injunctions, take-down orders, or delivery up of materials to prevent the information from losing its confidential character entirely. Anthropic’s swift legal action against the repositories that posted the leaked code was therefore a necessary step in asserting trade secret protection.

The recent Hong Kong decision in Best Buy Electric Co Ltd v Built-In Pro Ltd[viii] clarified several vital principles regarding breach of confidence:

Knowledge over Relationship: Citing AG v Guardian Newspapers Ltd (No 2)[ix], the court reaffirmed that a prior confidential relationship is no longer necessary before a duty of confidence arises; the duty is triggered as soon as the confidential information comes to a person’s knowledge in circumstances where they have notice of its confidentiality.

Reasonable Expectation of Privacy: The touchstone for a claim is whether the owner had a reasonable expectation of privacy regarding the information.

Hacking and Intent: When a defendant intentionally takes steps to obtain information they know the owner expects to be private, systematic downloading or hacking would be overwhelming evidence of a breach.

Rejecting the “Loophole” Defence: The court explicitly rejected the argument that information loses protection if it was accessible via a security loophole, stating that such an argument unfairly attempts to “blame the victim” for a technical oversight.

Pre-emptive Injunctions: A plaintiff is entitled to an injunction to restrain the use or further passing on of information even if the defendant does not intend to reveal it to third parties, because mere possession by an unauthorised party puts the confidentiality at risk.

Reverse Engineering and the “Clean-Room” Methodology

“Independent creation” can be a defence against copyright infringement. Anyone is permitted to study how a program functions in order to learn its features and business logic, as these are unprotectable ideas.[x] For example, while Claude Code may be a pioneering CLI (command-line interface) agentic AI tool, the general CLI form of an agentic tool cannot itself be protected by copyright.

However, relying on the “clean-room” methodology to claim independent creation is especially controversial in the age of AI. With AI tools, adapting source code has become all too easy. The distinction between the “Dirty Team” and the “Clean Team” also blurs if both are controlled by the same person using the same AI models. If a developer uses an AI model trained on leaked proprietary data to refactor the code, the model may become a tainted oracle, passing through protected expression and invalidating the claim of independent creation.

At the time of writing (April 2026), the “Claw Code” project, which employed the “clean-room” methodology to reimplement Claude Code, has survived Anthropic’s first wave of enforcement action. Whether the reimplementation can survive a more substantial challenge remains to be seen, but the evidentiary difficulty of proving that a rigorous, multi-stage clean-room process took place on the night of the leak should not be underestimated.

Key Takeaways

If you are the owner of proprietary information, as Anthropic is, take prompt legal action, such as seeking injunctions against the alleged infringers, in order to assert trade secret protection and claim breach of confidence. Otherwise, the continued presence of the secret in the public space may erode its confidential nature.

For developers, remember that the legal effectiveness of the “clean-room” approach is still uncertain. The safest course is to learn from others’ ideas without copying their form of expression.

[i] Yes, that includes deleting files as well. It is critical for users to set boundaries for their AI agents.

[ii] The harness is a software system built around the large language model so that the model can function as a stable and useful application. Typical components of a harness layer include a tool registry, memory management and a sandboxed environment.

[iii] Some commentators suggest that the leak could be caused by a defective AI design. For example, see comments by Allen Au of 0xmd: https://www.youtube.com/watch?v=IVSPh-IGYEo

[iv] The most prominent example is the clean-room project known as “Claw Code” by Korean developer Sigrid Jin.

[v] Section 4(1)(b), Copyright Ordinance. Section 4(1)(c) goes further and classifies “preparatory design material for a computer program” as a literary work.

[vi] Section 2(1)(a) and Section 5, Copyright Ordinance.

[vii] https://www.ipd.gov.hk/en/copyright/current-topics/public-consultation-on-copyright-and-artificial/index.html

[viii] [2025] 2 HKLRD 1157

[ix] [1990] 1 AC 109

[x] Navitaire Inc v Easyjet Airline Co. and BulletProof Technologies, Inc., [2004] EWHC 1725