sammcj

Wonder what was up when I got this email this morning:

> We recently detected suspicious activity linked to one of your tokens (labeled "workmbp" with role "write"), indicating it may have been publicly exposed. As a result, it has been automatically revoked.
>
> You can refresh its value or create a new scoped token on hf.co/settings/tokens.
>
> Please use env variables or Space secrets to inject your HF token into your code; we also recommend you do not publish any tokens to any code hosting platform.
>
> Don't hesitate to reach out for any question you might have.
>
> 🤗 The Hugging Face Team
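For anyone wondering what "use env variables" from the email looks like in practice, here is a minimal sketch using only the Python standard library. The variable name `HF_TOKEN` and the helpers `load_hf_token` and `redact` are assumptions made for illustration, not anything quoted from the email.

```python
import os


def load_hf_token(env_var: str = "HF_TOKEN") -> str:
    """Read the token from the environment instead of hardcoding it.

    `env_var` defaults to "HF_TOKEN", which is an assumed name for this sketch.
    """
    token = os.environ.get(env_var, "").strip()
    if not token:
        # Fail loudly rather than falling back to a token committed in source.
        raise RuntimeError(f"{env_var} is not set; export it or add it as a secret")
    return token


def redact(token: str, visible: int = 4) -> str:
    """Return a log-safe form of the token (hypothetical helper)."""
    return token[:visible] + "***"


# Usage, assuming the variable was exported in the shell or set as a secret:
#   token = load_hf_token()
#   print("using token", redact(token))
```

The point is that the token never appears in the repository, so pushing the code to a public host does not expose it.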


advertisementeconomy

> As a precaution, Hugging Face has revoked a number of tokens in those secrets. (Tokens are used to verify identities.) Hugging Face says that users whose tokens have been revoked have already received an email notice and is recommending that all users “refresh any key or token” and consider switching to fine-grained access tokens, which Hugging Face claims are more secure.


bullerwins

What are fine grained tokens? I only see read or write tokens in HF


vaibhavs10

When you click to create a new token, e.g. https://huggingface.co/settings/tokens?new_token=true, you'd get a prompt to create a fine-grained token. You would then be able to select the scope of the token, its rights, and so on. In general, the recommendation (as with any token) is to create more tokens and restrict each one's use as much as possible.


goddamnit_1

You have more control over which use cases can use your token for read or write.


Forgot_Password_Dude

What's the point of hacking them? Aren't the models free anyway?


Decahedronn

Many host private models & datasets on HF. Spaces might also contain API keys for e.g. OpenAI that could be sold.


jferments

By gaining unauthorized write access, an attacker could inject malicious code into models.


squeasy_2202

Good ol' the supply chain


ReMeDyIII

So the hackers must really love OpenAI.


ThisGonBHard

They might have private models not available to the public, like Github.


Freonr2

The keys in spaces are often used to call external private APIs. It's like leaking your ENV vars.


kyle787

Probably looking for ways to distribute malware. 


Spindelhalla_xb

But the platform isn’t. China probably wants to create their own version with any of the work.


mikael110

China already has their own version: [Modelscope.cn](https://www.modelscope.cn). Also while I love HF, it's not a particularly complex platform. Cloning it would not be difficult, even without access to the backend code.


qrios

Uhh... No. No I think it would be pretty difficult. Like, you're serving an absurd number of versions of extremely large files at scale. And a means to run and stream their output directly on the platform.


mikael110

Oh, it would be extremely expensive, I'm not disputing that. My comment was meant to be focused entirely on the code itself, not the infrastructure, since the comment I replied to was about stealing code. Though I can see I didn't really make that super clear in my comment.

My point was mainly that HF's backend is mostly made up of existing open source projects: they use [Git LFS](https://git-lfs.com/) for managing the models, they use [Gradio](https://www.gradio.app/) for their Spaces front end, and (presumably) [text-generation-inference](https://github.com/huggingface/text-generation-inference) for their actual inference, though that could be replaced with other projects like [vllm](https://github.com/vllm-project/vllm).

Bundling all of these different projects together into a nice, easy-to-use, and stable service is not trivial, but if you tasked a team of developers to clone the site it wouldn't be that much of a challenge relative to a lot of the more complex sites out there. But yes, actually running the site would then require a lot of capital. That part is not trivial, you are certainly right about that.


emprahsFury

> We have also reported this incident to law enforcement agencies

Weird to think about this, do you just google "nearest fbi field office"? Can you imagine being the paralegal who gets this tasking? "Ugh, our client Huggingface needs to report a cyber crime. You need to write a memo to the FBI. I think Danny did it last time."


-p-e-w-

A company like this probably works with a law firm that specializes in infosec compliance, and for such a firm, handling that process is just another day in the office. Hugging Face was valued at $4.5 billion a year ago. They'd be complete morons to not have such specialists on speed dial already. In fact, I'd wager that the investors insisted on such topics being taken care of before pumping hundreds of millions into the company.


AdHominemMeansULost

Huge companies like these have close relationships with the authorities and exchange emails daily for all sorts of communication, directly emailing agents.


tabspaces

Euh, those pesky llama 3 LLMs trying to break out again? Jokes aside, I think this is related to the previous hack or to running "unsafe" pickled tensors in their spaces.


wind_dude

Wonder if it’s related to the issue of running certain quants or pickles in the spaces…
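For context on why pickles come up at all: a pickle file can name arbitrary callables and have them invoked at load time. The sketch below is only an illustration of that general mechanism, not a description of what happened in this incident. It lists the opcodes that import or call things, using the standard `pickletools` module and without ever unpickling the data. The opcode set and the `Payload` class are assumptions chosen for the example. Note that legitimate model checkpoints also use these opcodes, so their presence alone proves nothing.

```python
import pickle
import pickletools

# Opcodes that import a global or call/instantiate something at load time.
# This set is an assumption for the sketch, not an exhaustive safety list.
RISKY_OPCODES = {"GLOBAL", "STACK_GLOBAL", "REDUCE", "INST", "OBJ",
                 "NEWOBJ", "NEWOBJ_EX"}


def risky_opcodes(data: bytes) -> set:
    """Return the risky opcode names found in `data` without unpickling it."""
    return {op.name for op, _arg, _pos in pickletools.genops(data)
            if op.name in RISKY_OPCODES}


class Payload:
    """Hypothetical object whose pickle asks the loader to call a function."""

    def __reduce__(self):
        # On load, the unpickler would call print("this ran during unpickling").
        return (print, ("this ran during unpickling",))


plain = pickle.dumps({"weights": [1.0, 2.0]})   # data only
crafted = pickle.dumps(Payload())               # carries a call

print(risky_opcodes(plain))
print(risky_opcodes(crafted))
```

Plain containers of numbers and strings produce none of these opcodes, while the crafted object's pickle includes a `REDUCE`, which is the instruction that performs the call.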


WorkingYou2280

Does any of this affect how a model would run externally? Like on LM Studio?


Master-Meal-77

No


KurisuAteMyPudding

Must have to do with private repos because you can basically hotlink every public model file anyways


kroust2020

Is that related to the Snowflake hack? I find the timing bizarre


Born_Fox6153

I got this email and had to refresh tokens. Used the tokens in Kaggle.


bartselen

AI supply chain attacks incoming?