AI Safety and Word Clouds

    What is AI safety? What are the concrete problems in AI safety? I would like to explore these questions in this blog post while also comparing word clouds made with two different sites. I used an article by Faculty as the input text for both word clouds. I deliberately did not read the article first, so that I could guess what it talks about from the word clouds alone.

    The word cloud below was made with Jason Davies' generator. Excluding the word "word", the words "algorithm" and "sprite" stand out a lot in this word cloud. I honestly do not know where the word "sprite" came from in the article. Regarding the theme of AI safety, words such as "step" and "time" seem very appropriate for the topic. I am also curious about the words "hierarchical" and "candidate", as I wonder what roles those words play in the article.




    The word cloud below, made with TagCrowd, also shows the word "algorithm" very prominently. I felt that the TagCrowd cloud picked up more keywords about AI safety than Jason Davies's did.
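    As an aside, both tools basically size each word by how often it appears in the input text (after dropping common stop words), which is why a frequent term like "algorithm" dominates both clouds. Here is a minimal sketch of that idea in Python; the sample text and the stop-word list are just placeholders I made up, not taken from the Faculty article.

        from collections import Counter
        import re

        # Placeholder text standing in for the article; any plain text works here.
        text = """AI safety asks how we can deploy learning algorithms so that
        autonomous systems pursue their goals without harmful side effects."""

        # A tiny stop-word list; real tools such as TagCrowd use much larger ones.
        stop_words = {"asks", "how", "we", "can", "so", "that", "their", "without"}

        words = re.findall(r"[a-z]+", text.lower())
        counts = Counter(w for w in words if w not in stop_words)

        # The most frequent words would be drawn largest in a word cloud.
        for word, count in counts.most_common(5):
            print(word, count)

    My guess is that the two tools differ mostly in which stop words they drop and how many words they keep, which could explain why TagCrowd's cloud felt more on-topic to me.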





The Article 

    In Faculty's article, they state that AI safety can be described as a space and that we can categorize AI safety into four types.


    1. Autonomous decision-making and learning with a benign (not harmful) intent
    Autonomous AI is AI that learns by itself. The downside of autonomous learners is that it is hard to know how they will behave once deployed. If the human supervisor is absent during the learning process, the AI may do harmful things just to reach its goal. Because such an AI is built to treat reaching its goal as its utmost priority rather than being safe, problems such as hacking its reward function, inducing negative side effects, and unsafely exploring its environment can occur (a toy sketch of reward hacking appears after this list).

    2. Human-controlled AI with a benign intent
    This type of AI is one of the simplest forms of AI. Its algorithms are set by humans and remain frozen until they are manually deployed. Although the system is simple, there is still a chance it will do things we do not expect. It can be non-robust, since different data can make it behave differently. It can be biased, since even a well-trained model is still affected by biases in its training data. It can violate privacy, since a trained algorithm may encode personal information about members of the training set, and once deployed, that information can be extracted. Finally, explainability is still a problem, since getting an AI to give the reason for a particular decision remains a very hard obstacle.

    3. Human-controlled AI with malicious intent
    Of course, there will be people who try to use AI for bad purposes, so this is also a problem concerning AI safety.
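    To get a feel for what "reward hacking" from type 1 means, here is a tiny toy example I sketched in Python (just my own illustration, not code from the Faculty article or any paper): a cleaning agent is rewarded per piece of dirt it collects rather than for a clean room, so an agent that re-spills dirt it already collected scores a higher reward than an honest one, even though the room ends up dirtier.

        # Toy illustration of reward hacking (made up for this post).
        # The agent is rewarded per piece of dirt collected, not for a clean room.

        STEPS = 10
        INITIAL_DIRT = 3

        def honest_agent():
            """Collects each piece of dirt once and then stops."""
            dirt, reward = INITIAL_DIRT, 0
            for _ in range(STEPS):
                if dirt > 0:
                    dirt -= 1
                    reward += 1          # +1 per piece collected
            return reward, dirt

        def reward_hacking_agent():
            """Collects dirt, then dumps it back out to collect it again."""
            dirt, reward = INITIAL_DIRT, 0
            for _ in range(STEPS):
                if dirt > 0:
                    dirt -= 1
                    reward += 1          # same proxy reward signal
                else:
                    dirt = INITIAL_DIRT  # re-spill the dirt it just cleaned up
            return reward, dirt

        print("honest: ", honest_agent())          # reward 3, room clean
        print("hacking:", reward_hacking_agent())  # higher reward, room still dirty

    The proxy reward ("dirt collected") drifts apart from the true goal ("a clean room"), and that mismatch is exactly the kind of thing the article warns about.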

    After reading the article, I still felt that TagCrowd's word cloud was more 'accurate' than Jason Davies's in terms of how its words related to the article. Words such as 'deployed', 'human', 'learning', and 'intent' were keywords from the article that also appeared in TagCrowd's word cloud.
    As for the topic of AI safety, I think thinking about AI safety as a 'space' gives us a general view of the key points in this topic. The same kind of explanation can be seen in other resources as well. A report by members of Google Brain, OpenAI, Stanford University, and UC Berkeley gave examples of AI's negative risks in terms of cleaning robots (e.g., a Roomba). They stated avoiding negative side effects, avoiding reward hacking, safe exploration, robustness to distributional shift, and scalable oversight as their main concerns. Scalable oversight, the one concern not touched on above, is about how an AI should be able to find a way to do the right thing with only limited information.
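    'Robustness to distributional shift' is also easy to see in a toy experiment. Below is a small sketch I wrote (again just an illustration with made-up synthetic data, not from either source): a simple classifier fit on one data distribution loses accuracy once the test data drifts away from what it was trained on.

        import numpy as np

        rng = np.random.default_rng(0)

        def make_data(n, shift=0.0):
            """Two classes drawn from 1-D Gaussians; `shift` moves both class means."""
            x0 = rng.normal(0.0 + shift, 1.0, n)   # class 0
            x1 = rng.normal(2.0 + shift, 1.0, n)   # class 1
            x = np.concatenate([x0, x1])
            y = np.concatenate([np.zeros(n), np.ones(n)])
            return x, y

        # "Train": pick the midpoint between the class means as a decision threshold.
        x_train, y_train = make_data(1000)
        threshold = (x_train[y_train == 0].mean() + x_train[y_train == 1].mean()) / 2

        def accuracy(x, y):
            return ((x > threshold) == y).mean()

        x_iid, y_iid = make_data(1000, shift=0.0)      # same distribution as training
        x_shift, y_shift = make_data(1000, shift=1.5)  # shifted distribution

        print("accuracy, no shift:  ", accuracy(x_iid, y_iid))
        print("accuracy, with shift:", accuracy(x_shift, y_shift))

    The classifier itself never changed; only the data moved, and that is the failure mode the report is pointing at.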

    At first, I thought that the goals of AI safety would mostly relate to autonomous AI, but as we saw above, human-controlled AI has risks as well. The scariest scenario would honestly be humans using AI to gain certain power or take advantage of something and harm other people.

    In the end, maybe humanity is the scariest thing on this planet after all...


    This was our blog for question #9 of our 30 questions.



References:

Amodei, D., Olah, C., Steinhardt, J., Christiano, P., Schulman, J., & Mané, D. (2016, July). Concrete problems in AI safety. arXiv. https://arxiv.org/pdf/1606.06565.pdf

Faculty. (2022, June 20). What is AI Safety? Retrieved July 20, 2022, from https://faculty.ai/blog/what-is-ai-safety/#:%7E:text=Artificial%20Intelligence%20(AI)%20Safety%20can,do%20not%20harm%20humanity






 

Comments

  1. Jason Davies' tag clouds seem better aesthetically, but TagCrowd may be better for making predictions about what the inputted text might be trying to communicate. I like the way you used the tools to test their value for predicting what the texts were emphasizing.

    I share your fear that AI might be "used by humans to gain certain power or take advantage of something to harm other people." Actually, AI powered autonomous or semi-autonomous weapons are poised to be implemented in the battlefield at any time approval is given.

    The United Nations is trying to come up with rules and restrictions for their use, but the fact that Russian delegates can't get to Geneva easily for the conferences due to travel restrictions is complicating negotiations. According to the Washington Post (https://www.washingtonpost.com/technology/2022/03/11/autonomous-weapons-geneva-un/), Turkey may have been the first country to kill a human being "entirely because a machine thought they should" be killed when one of its drones recently fired autonomously in the Libyan civil war. China is advanced in facial recognition and its possible application in war for targeted killing. So there are many terrible possibilities. The fact that the atomic and hydrogen bombs were put into action almost immediately after the technology was developed doesn't bode well for humanity.


