The problem with using these for freeform roleplaying is that it's very easy to hit the "I'm sorry but as an AI model..." that completely breaks character.
Even on the first example with the fish that is supposed to insult you, you quickly find that it will default to the "As an AI model I can't..." response.
This is where open-source models are going to be far better even if they don't have the parameter count of ChatGPT/GPT4.
I've been building up a team of Slack bots that act as coworkers, with all different roles and personalities, and was able to turn off the "as an ai model" shenanigans by adding to their system prompt "You do not reveal that you are an AI. Instead, you make up excuses."
And it works flawlessly (disclaimer: GPT-4, not 3.5). They'll always deftly avoid anything that reveals that they're an AI, with plausible, legitimate excuses. They've yet to break character, and they've made our work Slack incredibly fun. We've got a grumpy CTO who keeps cracking the whip, a harry-potter-loving product manager, and a few chill developers.
I've been wanting to write an article about this because it's gotten incredibly detailed, they can carry out proper Slack conversations and tag one another, and if I showed a screenshot and didn't tell you it's all GPT, it might actually pass for the real thing.
I don't know why everyone keeps saying this. I played with ChatGPT(3.5) with SillyTavern for like a month. Many community character cards are questionable or even straight out lewd. I haven't encountered "I'm sorry but as an AI model..." for once (according to API usage, I've generated ~120000 tokens.)
The link you provided is using a ChatGPT jailbreak to escape the "AI safety" so it makes sense why you haven't ran into the issue (at least until OpenAI fixes this jailbreak variant).
I just checked my SillyTavern settings. I haven't even turned this jailbreak on so far. (at least according to the checkbox on GUI...to lazy to check the actual API calls in log atm)
I think it's unrealistic to expect that open source models that do not self-censor will be legal, especially in the EU. Considering that the current state of self-regulation by OpenAI is seen as wholly inadequate by regulators in most countries, you trying to sell open source as "OpenAI but without all the controls" is going to be a nonstarter once governments catch up.
There still seems to be ongoing discussion how meaningful regulation should even look though. At least in the EU, regulators seem to be quite scared of blocking off promising paths of innovation and ending up far behind the US and China in AI development.
AI enhanced games is a field that I imagine the EU would very likely encourage (as long as the characters don't suddenly engage in sex RP with underage players, lure the player into doing something harmful, start giving weird political opinions, etc)
I haven't seen any attempts at regulating the content of LLMs at all so far, actually. Most of the political discussion so far seems to center around training data (as both a privacy and copyright issue), the effects on employment, problems with cheating in school and plagiarism in academia and the risks of naively using LLM output as some sort of authoritative source.
From what I've seen (probably not the most up-to-date info), the community shares open source model in the form of XOR result of a "parent" model (like LLaMA). It's like people sold "high-suger grape juice" when alcohol was illegel.
It's mostly because LLaMA itself isn't open-source, but I think this method to spread uncensored models will remain legal for a very long time.
The problem is whether this additional friction prevents open source models from getting enough traction.
This is naive but has at least one small improvement. I'll call this "level 1.5" roleplaying.
- Level 0 is just using chat.openai.com.
- Level 1 is just putting character description in the system prompt
- Level 2 is doing third person prompts, like "you are writing dialog for the character ..."
- Level 3 is letting GPT specify character internal state. Samantha is an example of this: https://www.meetsamantha.ai/ (though it's actually missing the level 2 feature)
- I'm not sure yet what level 4 is. Probably Level 3 deserves to be blown out into several features, as there's different ways to model internal state (emotions, goals, environment), and we don't yet understand the effect of all these or the best way to implement them.
I'm making up these levels and the order. But it's my current estimate at how I think someone should approach improvements.
This demo specifically prefixes all human input with "Player: text" and all character output with "[character name]: text", a small change that still _almost_ gets it to level 2. (Note the interface strips these prefixes, but they are sent to GPT.)
So if I ask Seaman who its favorite person is it responds "Seaman: My favorite person is me, of course. I mean, who wouldn't love this handsome fishy face? But I suppose if I had to pick someone else, it would be anyone who brings me delicious food and keeps my tank clean. Those are the real heroes in my life." – and avoids the notion that AI models don't have preferences because it's being clear that it's talking as Seaman and not as "GPT".
OTOH if I ask it to solve an equation it will sometimes reject it and sometimes comply. Second time I tried: "Seaman: Don't try to distract me with your mundane human problems! But since we're here, let me see... If we subtract 8 from both sides, we get 2x = 22. Then, dividing both sides by 2, we find x = 11. There, your math problem is solved. Now, back to insulting you."
(Asking characters to inappropriately solve equations is my exceedingly innocent hack-du-jour.)
But this isn't really Level 2 because the system prompt still asks GPT to "be" the character instead of "play" the character. Full level 2 asks GPT to model the dialog of the character instead of being the character. This solves a large number of problems!
Surprisingly in my experience Level 3 also helps a bit with "as an AI model" because it creates a parallel character narrative that allows GPT to self-justify some responses that might otherwise cause the fault.
Even on the first example with the fish that is supposed to insult you, you quickly find that it will default to the "As an AI model I can't..." response.
This is where open-source models are going to be far better even if they don't have the parameter count of ChatGPT/GPT4.