> trust_remote_code=True This is a hard no from me, anyone know why this is so c...

rfoo · on July 6, 2023

That's because the model architecture hasn't been added to huggingface/transformers yet, because it literally was just published today.

    >>> from transformers import AutoTokenizer, AutoModel
    >>> model = AutoModel.from_pretrained("internlm/internlm-chat-7b", trust_remote_code=True, device='cuda')

Here, the "trust_remote_code=True" means "download the model code from huggingface repo 'internlm/internlm-chat-7b'", along with the weight, and run it. If it's False, the library would use builtin model architectures hardcoded in huggingface/transformers and only download the weight.

The scary flag is here because, of course, newcomers may not realize that model == code and if you load arbitrary model you are likely executing arbitrary code.

Wonder why, for example, you don't remember seeing LLaMA had this on release day? Because they don't use huggingface transformers library and don't use huggingface to distribute their model. You just clone and run their code from GitHub, and... how is this not "trust_remote_code"?

tomsmeding · on July 6, 2023

> newcomers may not realize that model == code

This makes sense in a way given the API of typical ML libraries. But there is no fundamental reason this needs to be the case.

Or, more correctly stated: model == code for sure, but said code need not have any rights to perform side effects. For some reason e.g. TensorFlow has stuff like tf.io.write_file [1] (is that actually an operation you can put in a model???), but one could easily imagine a more appropriate domain-specific model language that your code is compiled to, that can by design not perform any IO. Imagine that a model you distribute is not random Python code that may or may not run a model, but instead the model itself, i.e. the graph encoded in that domain-specific language.

Then downloading a random model from some random untrusted place is no different from downloading some random data from some untrusted place: you're going to execute the model, which may DOS you, but nothing much else will happen.

Unfortunately the ML world is too stuck in the imperative mindset for this (IMO more sensible) way of doing things. :)

[1]: https://www.tensorflow.org/api_docs/python/tf/io/write_file

411111111111111 · on July 6, 2023

At that point you'd need a machine learning DSL and runtime. Currently, it's all python libraries, so you can do everything python can... Which is everything, essentially.

It's highly unlikely that the market for running these models like an appliance securely in an untrusted context will ever manifest. It's just too much of a niche, as it would also reduce their extensibility/usability significantly

DougBTX · on July 6, 2023

Something like this may grow out of the GGML project, which is gaining traction. They already have a weights format which can be loaded with mmap, though AFAIK the model architecture still needs to be defined in C++.

tyfon · on July 6, 2023

I've only used llama via llama.cpp.

In general I think the python ML stuff is a mess. But I still won't execute code that recommend me to trust arbitrary remote code as the remote code can change at any time, it would be better to wait with the release until it was published to the transformers library or just include it in a clonable repo without the trust_remote_code flag.

It is much better to just be able to clone the code and have it locally so you can verify it once and not trust that it won't download any new code suddenly that you haven't been able to look at.

trust_remote_code means you have no control really, cloning a repo means you control when new code is added yourself.

rfoo · on July 6, 2023

Yeah, I agree promoting this usage is as bad as promoting `curl | sh` in README.md.

Similar to how you can inspect the content of a `curl | sh` script and then run it, the model is also in a clonable repo, you may just:

   git clone https://huggingface.co/internlm/internlm-7b-chat

and:

    >>> from transformers import AutoTokenizer, AutoModel
    >>> model = AutoModel.from_pretrained("./internlm-chat-7b", device='cuda')

tyfon · on July 6, 2023

This way is much more palpable for me, thank you for showing :)

m3affan · on July 6, 2023

Interesting attack vector. Malicious model codes.

oars · on July 6, 2023

Thank you for the explanation.

ipsum2 · on July 6, 2023

I believe it's because the model architecture isn't added to Huggingface transformer library, so it needs to eval some python code (i.e. load a pickle) to create the PyTorch model. Have not noticed it to be specific to models from China, almost all lesser known models have to do this.

jabbany · on July 6, 2023

Seems pretty common though, for defining custom architecture configs whatnot?

AFAIK the "remote code" is still openly hosted on huggingface so you can audit it if you like. Seems no more dangerous than things like `pip install some_random_library`?

Lockal · on July 6, 2023

This has become less common in recent days, at least for image generation (e. g. safetensors in Stable Diffusion).

The point of opensource models is that they can be finetuned. When many people create finetuned versions, a zoo of models appear. So far so good (maybe), but the bad practice of using untrusted code from the zoo sooner or later will lead to a wave of cryptominers, ransomware, and credential theft incidents.

rfoo · on July 6, 2023

I like this pip metaphor. If we had required `--trust-remote-code` for every `npm install` we could have avoided left-pad and most of the software supply chain drama in the past years.

SahAssar · on July 6, 2023

How would that have avoided left-pad? Do you just mean that people would have been discouraged from pulling in so many dependencies?

_ea1k · on July 6, 2023

I think that would just teach people to type --trust-remote-code fast.

ShamelessC · on July 6, 2023

Mind pasting the link to that line? Am on mobile and can’t find it myself easily.