AI-based automatic text generation tool produces highly convincing text so easily that its developers deem it 'too dangerous'


by rawpixel

OpenAI, a nonprofit AI research organization backed by Elon Musk and others, has developed a text generator that could be considered a text version of Deepfake, the AI-based video synthesis technology. However, the developers are concerned that its ability to automatically generate highly convincing text is 'too dangerous.'

New AI fake text generator may be too dangerous to release, say creators | Technology | The Guardian
https://www.theguardian.com/technology/2019/feb/14/elon-musk-backed-ai-writes-convincing-news-fiction

OpenAI has developed a new text-generation AI model called 'GPT2.' However, because GPT2 performs so well, the risk of it being misused is very high, and publication of its full technical details in a paper has been postponed.

Although publication of the paper has been postponed, The Guardian, a major British newspaper, was given the opportunity to use GPT2, and the following video shows how GPT2 actually generates text automatically.

How OpenAI writes convincing news stories and works of fiction - YouTube


GPT2 is an AI model that can automatically generate news articles and fiction. As a test, The Guardian had it write a Brexit-related article: a human first typed the opening sentence, and GPT2 continued from there.



The human-written opening sentence reads, 'Brexit has already cost the UK economy at least £80 billion (about 11 trillion yen) since the referendum.'



GPT2 then continued the text automatically; all of the underlined text was generated by the model. The first sentence GPT2 produced was, 'Furthermore, many industry experts believe that the economic losses from Brexit will be even greater.'



From just a short opening sentence, GPT2 can seamlessly generate the text that follows.
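As a rough illustration of this prompt-continuation workflow, here is a minimal sketch using the small GPT-2 checkpoint that OpenAI later released publicly, loaded through the Hugging Face transformers library. The model name, prompt, and sampling parameters are illustrative assumptions, not the setup used in the Guardian demo, and the withheld full model would produce far more convincing continuations.

```python
# Minimal sketch of prompt continuation with the publicly released small GPT-2
# checkpoint (an assumption for illustration; not the withheld full model).
# Requires: pip install transformers torch
from transformers import pipeline

# Text-generation pipeline backed by the 124M-parameter "gpt2" checkpoint.
generator = pipeline("text-generation", model="gpt2")

# A human writes only the opening sentence; the model continues it.
prompt = ("Brexit has already cost the UK economy at least £80 billion "
          "since the referendum.")

# Sampling settings here are illustrative defaults, not OpenAI's settings.
result = generator(prompt, max_length=120, do_sample=True, top_k=50)
print(result[0]["generated_text"])
```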



Next, let's enter the opening sentence of Jane Austen's novel 'Pride and Prejudice.'




The resulting continuation was completely different from the original text. In a sense, GPT2 generated a fake version of 'Pride and Prejudice.'



GPT2 is an AI model that predicts and automatically generates text from just a few words of input. According to the researchers who developed it, the high quality of its output and the breadth of its possible applications are exactly what make it potentially dangerous. The Guardian notes that GPT2 easily generates plausible text and rarely exhibits the flaws of existing AI text generators, such as forgetting what has been written mid-paragraph or producing poor syntax in longer sentences.

According to The Guardian, GPT2 is groundbreaking in two respects. The first is its size. Dario Amodei, director of research at OpenAI, said, 'GPT2's AI model is 12 times larger than existing state-of-the-art AI models, and its dataset is 15 times larger and covers a much broader range.' GPT2 was trained on text collected from Reddit by gathering links that received more than three votes, yielding a dataset of approximately 10 million posts. The text alone amounts to 40GB, equivalent to approximately 35,000 copies of the novel 'Moby Dick.'
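As a quick sanity check on that comparison, the snippet below assumes a plain-text copy of 'Moby Dick' is roughly 1.2 MB; that per-copy size is an assumption for illustration, not a figure from the article.

```python
# Rough sanity check of the "40 GB of text ≈ 35,000 copies of Moby Dick" claim.
# The ~1.2 MB per-copy size is an assumed plain-text figure, not from the article.
dataset_bytes = 40 * 10**9        # 40 GB of training text
moby_dick_bytes = 1.2 * 10**6     # ~1.2 MB for one plain-text copy (assumed)
copies = dataset_bytes / moby_dick_bytes
print(round(copies))              # ≈ 33,000 copies, in the same ballpark as 35,000
```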

GPT2 is also far more versatile than existing text generators: simply by structuring the input text appropriately, it can be made to perform tasks such as translation and summarization, and it can generate answers that pass simple reading comprehension tests. GPT2 performs at least as well as AI models built specifically for those tasks, which gives it another major advantage over conventional text-generation models.
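As a sketch of how 'structuring the input' can elicit a task like summarization, the example below appends a 'TL;DR:' cue to the text, the zero-shot summarization convention described in OpenAI's GPT-2 paper. It again uses the later-released small 'gpt2' checkpoint via the Hugging Face transformers library; the article text and parameters are illustrative assumptions, and output quality from the small model will be well below what the article describes.

```python
# Sketch of zero-shot summarization by structuring the input: appending a
# "TL;DR:" cue prompts the model to continue with a summary of the text above it.
# Uses the later-released small "gpt2" checkpoint; quality will be modest.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

article = ("Brexit has already cost the UK economy at least £80 billion since "
           "the referendum, and many industry experts believe the losses from "
           "leaving the EU will grow even larger in the coming years.")
prompt = article + "\nTL;DR:"

# return_full_text=False keeps only the newly generated summary text.
summary = generator(prompt, max_new_tokens=40, do_sample=True, top_k=50,
                    return_full_text=False)
print(summary[0]["generated_text"])
```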

According to Alex Hern, an editor at The Guardian, the following text was generated by GPT2 without any manual editing. He also noted that it was generated in just 15 seconds.



However, OpenAI decided to postpone the release of GPT2: the text it generates is of such high quality that OpenAI needs time to consider in more detail what harm malicious users could cause with it. Jack Clark, head of policy at OpenAI, said, 'If you can't predict all the abilities of an AI model, you have to prod it to see what it can do. There are many more people than us developers who are better at thinking about what it could do maliciously.'

To accurately assess GPT2's potential, OpenAI made a few modest modifications to GPT2 and created versions capable of generating spam and fake news. Because GPT2's training data is text that already exists on the Internet, it is relatively easy to turn it into a generator of conspiracy theories and biased text.

'The cost of adopting a technology continues to decrease, and the rules for controlling it have fundamentally changed,' said Clark. 'We're not saying that what we're doing is right, or that "this is the way to do it." We're just trying to develop more rigorous thinking. It's like trying to build a road while crossing it at the same time.' He emphasized the need to set clear rules for new technologies.

in AI, Video, Software, Science, Posted by logu_ii