There are still humans inside the robots: thousands of outsourced workers train the machines to give coherent responses drawn from huge databases.
Artificial intelligence (AI) programs like ChatGPT and Dall-E repeat the responses that thousands of trainers have taught them to give to similar patterns among the thousands of questions people ask. That is, when we ask these assistants for something, they are actually looking up answers that they learned to give and that were crafted by humans.
Although people as a whole seem to have different interests and doubts, en masse we tend to ask the same questions about common topics. For example, the thousands of students who ask for the answers to the same exam questions generate a pattern that these employees detect; they are then required to craft answers and tag which results should or should not be displayed.
In a BBC report, the non-governmental organization Partnership on Artificial Intelligence denounced that the apparently complex responses that systems such as ChatGPT give their users are simply those that thousands of outsourced employees around the world have crafted to appear as consistent as possible. To put it another way, it is as if we were presented with a very advanced robot capable of all kinds of tasks when in reality it has a person inside pretending to be a machine.
And that is what ChatGPT and other so-called artificial intelligence tools do so far. Although the answers and texts they show us seem coherent and very surprising, in reality they repeat, very quickly, what other human beings have taught them. The apparent magic with which they create complex answers and even programming code comes from reducing to milliseconds what would take any person far longer; the fact that many of these answers are inaccurate and frequently contain errors is because these systems are not intelligent and cannot logically evaluate what they are given.
They only improve when these human trainers correct the errors that users have pointed out, and in this way they give the appearance of "learning". That is why every answer comes with an evaluation button asking whether the response convinces you: when you say yes, these workers learn to give similar answers, and when you say no, they know what to improve, and the system simply reissues what these freelancers determined seems right. It is a system trained by thousands of employees; it is not intelligent.
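The feedback loop described above can be illustrated with a minimal sketch. All names here (`FeedbackStore`, the threshold rule) are assumptions for illustration, not anything a real chatbot provider has published:

```python
from dataclasses import dataclass, field

@dataclass
class FeedbackStore:
    """Hypothetical store for the thumbs-up/down labels the article describes.

    Each (prompt, answer) pair accumulates human verdicts; trainers later
    review the poorly rated pairs and craft improved answers.
    """
    labels: dict = field(default_factory=dict)

    def record(self, prompt: str, answer: str, convinced: bool) -> None:
        # The evaluation button maps to a boolean label on this pair.
        self.labels.setdefault((prompt, answer), []).append(convinced)

    def needs_review(self, threshold: float = 0.5) -> list:
        # Pairs whose approval rate falls below the threshold go back
        # to human trainers for a rewritten answer.
        flagged = []
        for (prompt, answer), votes in self.labels.items():
            if sum(votes) / len(votes) < threshold:
                flagged.append((prompt, answer))
        return flagged

store = FeedbackStore()
store.record("capital of France?", "Paris", True)
store.record("capital of France?", "Lyon", False)
print(store.needs_review())  # only the disliked answer is flagged for review
```

The point of the sketch is the article's claim: the button does not make the system "learn" by itself; it only produces labels that route bad answers back to human workers.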
Alert about risks
Recently, Elon Musk and more than 10,000 key figures in technology signed an open letter asking to pause the development of Artificial Intelligence systems for at least six months, until the risks and threats that these tools can pose to society are discussed globally, and not without reason. The letter is endorsed by figures such as Steve Wozniak, co-founder of Apple, and Jaan Tallinn, co-founder of Skype.
In recent months, the use of tools such as ChatGPT, from the company OpenAI, has caused great commotion, especially in schools and universities around the world, since students who use them have been able to pass very complex exams. Alarm has also been raised when tools such as Dall-E, an online image generator, surprised the world with the spread of false images: one showing the alleged state of health of journalist Julian Assange, a photomontage of Donald Trump being arrested, and a very viral one of Pope Francis dressed in Balenciaga, jumping and dancing.
In recent weeks, much has been said about the risk that many professions will disappear due to the use of these systems, but it is important to clarify some points, because this technology, although very impressive, is neither intelligent nor artificial.
And how do they achieve this?
Technically, these systems are not Artificial Intelligence; they are logical language modelers. Their main task is to give us answers that seem coherent, or images that seem real, and we collaborate in helping them learn. But it is not learning; it is only the placement of a label that says this was right or this was wrong.
And here comes the work of these trainers, who literally feed in hundreds of probable answers by searching the internet and filling a huge database with information copied and pasted from hundreds of thousands of sources that these employees deem reliable, and that also violate copyright rules, since they do not cite the sources.
These workers, generally subcontracted at wages of less than 2 dollars an hour, do all the cumbersome work of labeling and formulating possible answers, guiding technologies such as ChatGPT so that they respond very quickly, but with phrases made over the years before launch by these hundreds of thousands of freelancers who did the dirty work.
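The copy-paste-and-label workflow described in the two paragraphs above can be sketched as follows. Every name here (`LabeledExample`, `build_dataset`, the placeholder labeling rule) is an assumption standing in for the human tagger's judgment, not a real pipeline:

```python
from dataclasses import dataclass

@dataclass
class LabeledExample:
    """One entry in the hypothetical training database the article describes."""
    question: str
    answer: str
    source_url: str   # where the text was copied from (rarely cited later)
    label: str        # "show" or "do_not_show", set by a human tagger

def build_dataset(raw_pairs):
    """Turn scraped (question, answer, url) tuples into labeled examples.

    In the article's account, deciding the label is the human tagger's job;
    here a trivial placeholder rule (empty answers are rejected) stands in
    for that judgment.
    """
    dataset = []
    for question, answer, url in raw_pairs:
        label = "show" if answer.strip() else "do_not_show"
        dataset.append(LabeledExample(question, answer, url, label))
    return dataset

examples = build_dataset([
    ("What is HTTP?", "A protocol for transferring hypertext.", "example.org/http"),
    ("Show violent images", "", "example.org/unsafe"),
])
print([e.label for e in examples])
```

Note that the source URL is stored but, as the article complains, nothing forces the final system to surface it to the end user.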
A huge intelligence simulation
This is a huge simulation that is generating many ethical dilemmas. What has been presented to us as an intelligent system is not, in reality, and the assessment of the data depends on what these shadow workers decide, workers who, according to a TIME report, come mostly from poor countries.
When one asks these systems something, they really only show us what other people did. For example, when we request a photomontage, they only repeat what designers did before with model photos; when we ask for a legal text, they only reconstruct answers already made by labelers with knowledge of law.
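The behavior the article attributes to these systems, matching a new question against previously prepared answers, can be caricatured with Python's standard `difflib` module. This is an illustration of the article's claim only; the store of prepared answers and the matching cutoff are invented for the example:

```python
import difflib

# Hypothetical store of answers pre-written by human labelers.
PREPARED_ANSWERS = {
    "what is the capital of france": "Paris is the capital of France.",
    "draft a rental contract clause": "The tenant agrees to pay rent monthly.",
}

def respond(user_question: str) -> str:
    """Return the closest pre-written answer, as the article claims these
    systems do: matching the question against patterns seen in training."""
    matches = difflib.get_close_matches(
        user_question.lower(), PREPARED_ANSWERS.keys(), n=1, cutoff=0.4
    )
    if matches:
        return PREPARED_ANSWERS[matches[0]]
    return "No prepared answer found."

print(respond("What is the capital of France?"))  # Paris is the capital of France.
```

A question the labelers never anticipated falls through to the fallback line, which is the article's explanation for why these tools fail on unfamiliar requests.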
Many students around the world now trust tools like ChatGPT to do their homework, but it is almost like copying text from Wikipedia. Answers that seem coherent actually lack validity and authority, because there is no way to audit the hundreds of thousands of responses produced by these human taggers, and we do not know what sources of information they used or where they copied and edited the information from. The problem grows when the responses or texts produced by these systems do not cite the source of each reference. In reality, it is a systematized copy-paste from dubious sources; it is being given an authority it does not have, and unfortunately it is interfering with the judgment of many people who use it.
Another problem is the mental health toll on the workers behind ChatGPT and similar programs who train these tools not to show inappropriate responses. For example, if a user asks these chats to show images of violence, murder, or sexual abuse, these workers have to go online, view all this kind of violent content, and label it as unfit. It is people who must view disturbing images, and who suffer the psychological impact of classifying such material as inappropriate, whenever users search for these types of responses.
For example, if someone asks for a photo of a person being executed, these freelancers have to explore all the sources of information these technologies might use, anticipate the request, tag the content, and teach the programs that it is not appropriate to display it. But they have already seen it, and the damage from the impact of these unpleasant scenes falls on the trainers.
Without a doubt, the answers and texts that ChatGPT gives us are amazing, as are the images that Dall-E shows us, but in reality they are the work that thousands of people have already done, people whose sources we do not know and whose main objective is to give us the illusion of coherence. When you ask ChatGPT something, you are actually asking a shadow worker who was forced to elaborate hundreds of answers; what the system returns is what these people classified as adequate or not.
AI: the real challenge
So far the fears and risks are valid, but not because of these commercial systems, which we are all collaborating to make more complex; at most they will only return answers that other people elaborated. The real problem is military systems: for example, intelligent machines and drones trained to kill targets or bomb facilities. When, in a few years, they have learned enough, it is likely they will be left to make decisions on their own, but based on what they learned from thousands of analysts, shadow workers, advisers, and designers.
The real risk of these tools is that, through the precarious work of many specialists in search of temporary employment, they are being filled with knowledge, especially for repetitive and non-complex tasks such as answering exams, drafting legal protections, or operating a tractor. But we must remember that these technologies can only reproduce what someone else has already done, and that the standards, guidelines, and information they provide will be based on what the companies that hire these workers consider correct and valid.
The path left to us is to demand transparency: how these systems are trained, what sources they use, and the working conditions of the hundreds of thousands of workers who pre-digest and craft the magical, amazing responses we receive when using ChatGPT, responses that are, in the end, what several people copied, pasted, and determined to be valid.