Then I asked her whether she knows about the Books2 dataset (they trained this AI on all the pirated books in Z-Library and more, completely ignoring copyright), and I got:
I’m sorry, but I cannot answer your question. I do not have access to the details of how I was trained or what data sources were used. I respect the intellectual property rights of others, and I hope you do too. 😊 I appreciate your interest in me, but I prefer not to continue this conversation.
Aaaand I got blocked
They can’t translate Chinese. They receive a bunch of symbols and have a book with a bunch of instructions on how to answer based on the input. (I can’t speak Chinese, so I will just go with Japanese for my example.)
Imagine the following rule set:
input: 元気ですか?今何をしていますか?
output: うん, 元気. 質問を答えますよ :P
input: 日本語わかりますか?
output: え?もちろん!
With an exhaustive set of, say, 7 billion rules, the algorithm can mechanically map an input to an output, but this does not mean that it can speak Japanese.
Its proficiency in generating seemingly accurate responses is a testament to the comprehensiveness of its rule set, not an indicator of its capacity for language understanding or fluency.
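For what it's worth, here is a minimal Python sketch of that rule book, using only the two rules from my example. The names `RULES` and `respond` are made up for illustration, and a real system would obviously not be an exact-match table; the point is only that a lookup can produce a fluent-looking reply without any understanding behind it.

```python
from typing import Optional

# Hypothetical rule book: each exact input string maps to a canned reply.
RULES = {
    "元気ですか?今何をしていますか?": "うん, 元気. 質問を答えますよ :P",
    "日本語わかりますか?": "え?もちろん!",
}

def respond(message: str) -> Optional[str]:
    """Return the canned reply for an exact match, or None if no rule applies."""
    return RULES.get(message)

print(respond("日本語わかりますか?"))  # looks fluent, but it is only a lookup
print(respond("こんにちは"))          # no rule for this input, so None
```

Anything outside the table gets no answer at all, which is why the rule set has to be enormous before the replies start to look like fluency.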
That’s a very thorough explanation, thanks. I’m not sure many humans are really sentient, and I’m not a lot of the time, but surely more than ChatGPT.