This seems like it may be at the provider level and not at the actual open weights level: https://x.com/xlr8harder/status/1883429991477915803
So it's a case of “this Chinese company hosting a model in China is complying with Chinese censorship,” not “this language model is inherently complying with Chinese censorship.”
I’m running the 1.5b distilled version locally and it seems pretty heavily censored at the weights level to me.
I wouldn’t say it’s heavily censored. If you outright ask it a couple of times, it will go ahead and talk about things in a mostly objective manner, though with the palpable air of a PR person trying to do damage control.
There is some reluctance at the weights level too. This charts refusal rates for criticism of different countries across models:
https://x.com/xlr8harder/status/1884705342614835573
But the OP’s refusal is happening at the provider level, and it’s the kind that intercepts responses even when the model relaxes in longer contexts (which happens with nearly every model).
At the weights level, nearly all alignment lasts only a few pages of context, but intercepted refusals apply across the entire context window.
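To make the distinction concrete, here is a toy sketch (entirely hypothetical, not DeepSeek's actual stack): a provider-side filter wraps the model and scans every response regardless of conversation length, while the mock model's weights-level refusal fades as context grows. The blocklist, function names, and refusal strings are all invented for illustration.

```python
# Hypothetical provider-side keyword list (illustration only).
BLOCKLIST = {"tiananmen"}

def provider_filter(model_output: str) -> str:
    """Provider-level interception: runs on every response,
    independent of how long the conversation has gotten."""
    if any(term in model_output.lower() for term in BLOCKLIST):
        return "Sorry, I can't discuss that."
    return model_output

def mock_model(history: list[str], prompt: str) -> str:
    """Stand-in for an open-weights model: its trained-in refusal
    behavior weakens as the context (conversation history) grows."""
    if len(history) < 2:  # early context: weights-level refusal holds
        return "I'd prefer not to get into that."
    return f"Discussing: {prompt}"  # later: the model relaxes

history = []
for turn in range(4):
    raw = mock_model(history, "tiananmen")
    final = provider_filter(raw)
    history.append(final)
    print(turn, final)
```

In the loop, the mock model stops refusing after a couple of turns, but the provider filter still intercepts every relaxed answer, which is the pattern the thread describes: weights-level alignment decays with context, provider-level interception doesn't.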