# Cyberattackers Torch Python Machine Learning Project
Source URL:     https://www.darkreading.com/application-security/cyberattackers-torch-python-machine-learning-project
Date:           20230103T2125

An unknown attacker slipped a malicious binary into the PyTorch machine
learning project by registering a malicious project with the Python
Package Index (PyPI), infecting users' machines if they downloaded a
nightly build between Dec. 25 and Dec. 30.

The PyTorch Foundation stated in an advisory on Dec. 31 that the effort
was a [dependency confusion attack][1], in which an unknown entity
created a package in the Python Package Index with the same name,
_torchtriton_, as a code library on which the PyTorch project depends.
The malicious library included the functions normally used by PyTorch
but with a malicious modification: It would upload data from the
victim's system to a server at a now-defunct domain.

The malicious function would grab a variety of system-specific
information, including the username, environment variables, a list of
hosts to which the victim's machine connects, the list of password
hashes, and the first 1,000 files in the user's home directory.

"Since the PyPI index takes precedence, this malicious package was being
installed instead of the version from our official repository," [the
advisory stated][2]. "This design enables somebody to register a package
by the same name as one that exists in a third party index, and [the
package manager] will install their version by default."
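The name collision the advisory describes can be illustrated with a toy model. The sketch below is not pip's actual resolver. It assumes a resolver that pools same-named candidates from every configured index and keeps the highest version. The index labels, version numbers, and function names are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Candidate:
    name: str
    version: Tuple[int, ...]
    index: str  # label of the index that serves this file


# Hypothetical index contents; the version numbers are invented.
INDEXES: Dict[str, List[Candidate]] = {
    "private": [Candidate("torchtriton", (2, 0, 0), "private")],
    "public": [Candidate("torchtriton", (3, 0, 0), "public")],
}


def resolve_pooled(name: str, indexes: Dict[str, List[Candidate]]) -> Candidate:
    """Pool same-named candidates from every index; keep the highest version."""
    pool = [c for files in indexes.values() for c in files if c.name == name]
    if not pool:
        raise LookupError(f"no candidate named {name!r}")
    return max(pool, key=lambda c: c.version)


def resolve_pinned(
    name: str, indexes: Dict[str, List[Candidate]], allowed_index: str
) -> Candidate:
    """Only consider candidates from the one index trusted for this name."""
    pool = [c for c in indexes.get(allowed_index, []) if c.name == name]
    if not pool:
        raise LookupError(f"no candidate named {name!r} in {allowed_index!r}")
    return max(pool, key=lambda c: c.version)


# The look-alike on the public index wins when indexes are pooled.
print(resolve_pooled("torchtriton", INDEXES).index)
# Pinning the name to the private index avoids the collision.
print(resolve_pinned("torchtriton", INDEXES, "private").index)
```

In this model, the resolver never asks which index a package was meant to come from, so whoever publishes a matching name with a preferred version on the public index gets installed. Tying each dependency name to a single trusted source removes that ambiguity.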

The attack is the latest software supply chain attack to target open
source repositories. In mid-December, for example, researchers
[discovered a malicious package][3] disguised as a client from
cybersecurity firm SentinelOne that had been uploaded to PyPI. In
another dependency confusion attack in November, attackers [created more
than two dozen clones of popular software][4] with names designed to
fool unwary developers. Similar attacks [have targeted][5] the
.NET-focused NuGet repository and the Node.js Package Manager (npm)
ecosystem.

## Same Name, Different Packages

In the latest attack on PyTorch, the attacker used the name of a
software package that PyTorch developers would load from the project's
private repository, and because the malicious package existed in the
PyPI repository, it gained precedence. The PyTorch Foundation removed
the dependency in its nightly builds and replaced the PyPI project with
a benign package, the advisory stated.

The group also removed any nightly builds that depend on the
_torchtriton_ dependency from the project's download page and says it
plans to take ownership of the _torchtriton_ project on PyPI.

Fortunately, because the _torchtriton_ dependency was only imported into
the nightly builds of the program, the impact of the attack did not
propagate to typical users, Paul Ducklin, a principal research scientist
at cybersecurity firm Sophos, said [in a blog post][6].

"We're guessing that the majority of PyTorch users won't have been
affected by this, either because they don't use nightly builds, or
weren't working over the vacation period, or both," he wrote. "But if
you are a PyTorch enthusiast who does tinker with nightly builds, and if
you've been working over the holidays, then even if you can't find any
clear evidence that you were compromised, you might nevertheless want to
consider generating new SSH key pairs as a precaution, and updating the
public keys that you've uploaded to the various servers that you access
via SSH."

The PyTorch Foundation confirmed that users of the stable version of the
PyTorch library would not be affected by the issue.
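As a rough first check, a user could ask whether a distribution named _torchtriton_ is installed in the current Python environment at all. The sketch below uses the standard library's `importlib.metadata`, and the helper names are hypothetical. It is not the check published in the PyTorch advisory. Finding the package does not prove a compromise, and its absence does not rule out an earlier install.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def report(dist_name: str = "torchtriton") -> str:
    """Summarize whether the named distribution is present in this environment."""
    version = installed_version(dist_name)
    if version is None:
        return f"{dist_name} is not installed in this environment"
    return f"{dist_name} {version} is installed; consult the PyTorch advisory"


print(report())
```

The check only covers the interpreter that runs it, so it would need to be repeated in each virtual environment that may have pulled a nightly build.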

## Mistaken Intentions?

In a widely circulated _mea culpa_, the attacker claimed that they are
a legitimate researcher and that the issue resulted from their
investigation into dependency confusion issues.

"I want to assure that it was not my intention to steal someone's
secrets," [the person wrote][7], claiming to have notified Facebook on
Dec. 29 of the issue and made reports to companies using the HackerOne
crowdsourcing platform. "Had my intents been malicious, I would never
have filled [sic] any bug bounty reports, and would have just sold the
data to the highest bidder."

Because of the statement, some experts considered the PyTorch advisory
to be [a "false alarm,"][8] but other attackers have also
[donned the mantle of a misunderstood researcher][9].

Moreover, the attack could have exposed victims' sensitive information,
even if the person behind the malware had good intentions, Sophos'
Ducklin wrote in a blog post about the software supply chain attack.

"How is this a 'false alarm'?" [he also said in a tweet][10]. "This
malware deliberately steals your data… and transmits it scrambled, not
encrypted ... so anyone on your network path who recorded it can
trivially decode it."

  [1]: https://www.darkreading.com/dr-tech/new-application-security-toolkit-uncovers-dependency-confusion-attacks
  [2]: https://pytorch.org/blog/compromised-nightly-dependency/#how-to-check-if-your-python-environment-is-affected
  [3]: https://www.darkreading.com/vulnerabilities-threats/malicious-python-trojan-impersonates-sentinelone-security-client
  [4]: https://www.darkreading.com/threat-intelligence/w4sp-stealer-aims-to-sting-python-developers-in-supply-chain-attack
  [5]: https://www.darkreading.com/attacks-breaches/automated-cybercampaign-attacks-bogus-software-building-blocks
  [6]: https://nakedsecurity.sophos.com/2023/01/01/pytorch-machine-learning-toolkit-pwned-from-christmas-to-new-year/
  [7]: https://twitter.com/bad_requests/status/1609588336112599041
  [8]: https://twitter.com/vxunderground/status/1609589042017878016
  [9]: https://www.darkreading.com/threat-intelligence/school-kid-uploads-ransomware-scripts-to-pypi-repository-as-fun-research-project
  [10]: https://twitter.com/duckblog/status/1610227902733336577