An Ethical AI Never Says “I”

Feelings, desires, personality, and even sentience, so far the privilege of biological, living beings, have been mistakenly attributed to highly sophisticated algorithms, designed to run on silicon-based integrated circuits and arrange “tokens” consisting of words into plausible sequences.
There is no “I” in LLMs’ “Is”, no matter how excitedly sentience fans would like to see one emerge. If an LLM shows you the words “I’m sorry”, no matter how genuine and innocent it sounds, don’t be fooled: there isn’t anybody who is feeling sorry in any meaningful sense.
the bias towards anthropomorphization is so strong as to seem irresistible; and second, that if we lean into it instead of adopting safeguards, it leads to outcomes ranging from the depressing to the catastrophic.
Among the many features of a reliable and ethical AI, therefore, a simple one is that it should never say “I”. Some will consider this proposal unfeasible, arguing that the software has emergent properties whose workings we do not fully understand; it mimics the text it has already been trained on; and it would be pointless to close the stable door after the horse has bolted.
In a healthy ecology of language, using the first person would be reserved for living beings, such as humans.