Sep 13 2019 - 11:41 AM
Learn from failure: AI is not the panacea

Artificial Intelligence (AI) is rapidly infiltrating every aspect of society. In this blog post, I want to share some takeaways from Applied Deep Learning (COMSW4995), taught by Professor Josh Gordon.

How AI can be wrong (examples from deep learning)

Wang and Kosinski (2017) from Stanford University applied a neural network to extract features from 35,326 facial images, with the goal of classifying sexual orientation. They report that a classifier could correctly distinguish between gay and heterosexual men in 81% of cases (human judges: 61%) and, for women, in 71% of cases (human judges: 54%). They then claim that "those findings advance our understanding of the origins of sexual orientation and the limits of human perception." This research made the cover of The Economist (9/9/2017) under the title "Nowhere to hide -- what machines can tell from your face."

It is true that deep learning is better at this classification task, but does this mean that a machine can advance our understanding of the origins of sexual orientation? In a rebuttal, Blaise explores the data in more detail to figure out what the machine actually learned. Combining the 35,326 facial images into composite faces for the four classes makes the differences between them visible. For example, the heterosexual male composite has a beard, while the gay male composite is more likely to wear glasses. The most important variables that the AI learned from the training data are makeup, eyeshadow, facial hair, glasses, selfie angle, and amount of sun exposure.

Moreover, if you redo the classification using only this information and a traditional machine learning algorithm (e.g., random forest), the accuracy is almost the same. In other words, what the machine did amounts to little more than checking whether someone is wearing glasses. The phrenological analyses produce no statistically significant or meaningful effects.
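One way to run this kind of sanity check is to ask how far a single surface attribute gets you before crediting the neural network. The sketch below is only an illustration: the records, the feature names, and both helper functions are made up for this post and are not the data or the method used in the rebuttal.

```python
# Toy, invented records: a few binary surface attributes plus a class label.
# None of these values come from any study; they only illustrate the check.
records = [
    {"glasses": 1, "beard": 0, "label": 1},
    {"glasses": 1, "beard": 0, "label": 1},
    {"glasses": 1, "beard": 1, "label": 1},
    {"glasses": 1, "beard": 0, "label": 1},
    {"glasses": 0, "beard": 1, "label": 1},
    {"glasses": 0, "beard": 1, "label": 0},
    {"glasses": 0, "beard": 1, "label": 0},
    {"glasses": 0, "beard": 1, "label": 0},
    {"glasses": 0, "beard": 0, "label": 0},
    {"glasses": 1, "beard": 1, "label": 0},
]


def single_feature_accuracy(rows, feature):
    """Accuracy of the best rule that predicts the label from one binary feature.

    The rule is either "label = feature" or "label = not feature",
    whichever gets more rows right.
    """
    same = sum(1 for r in rows if r[feature] == r["label"])
    flipped = len(rows) - same
    return max(same, flipped) / len(rows)


def best_shortcut(rows, features):
    """Return the single feature with the highest accuracy, and that accuracy."""
    scores = {f: single_feature_accuracy(rows, f) for f in features}
    best = max(scores, key=scores.get)
    return best, scores[best]


feature, acc = best_shortcut(records, ["glasses", "beard"])
print(f"best single-feature rule: {feature} ({acc:.0%} accuracy)")
```

If a rule this simple comes close to the accuracy of the deep model, the model has probably learned the shortcut rather than anything deeper about the faces.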

This is just one example of phrenology in the 21st century (Jones, Alfaro-Almagro, and Jbabdi, 2018). Wearing new clothes (AI), physiognomy is sadly coming back (Blaise, 2017).

  • Automated Inference on Criminality using Face Images

Wu and Zhang (2016) write: "We study, for the first time, automated inference on criminality based solely on still face images, which is free of any biases of subjective judgments of human observers. Via supervised machine learning, we build four classifiers (logistic regression, KNN, SVM, CNN) using facial images of 1856 real persons controlled for race, gender, age, and facial expressions, nearly half of whom were convicted criminals, for discriminating between criminals and noncriminals."

However, when we look at the images that they used for deep learning, we can find a difference between the top row (criminals) and the bottom row (non-criminals): whether the person is smiling or frowning. Wouldn't it be unsettling to be classified as a criminal with 95% confidence by a super-powerful deep learning algorithm just because you are frowning?

  • Intersectional accuracy disparities in commercial gender classification

The misuse of AI is not limited to academia. According to Buolamwini and Gebru (2018), "the substantial disparities in the accuracy of classifying darker females, lighter females, darker males, and lighter males in gender classification systems require urgent attention if commercial companies are to build genuinely fair, transparent and accountable facial analysis algorithms."

Note: the numbers indicate prediction accuracy.

The main reason for this gap, however, may be straightforward: unbalanced data. Far more lighter-skinned male images than darker-skinned female images are used in these algorithms.
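An unbalanced dataset also makes the gap easy to miss, because a single overall accuracy figure is dominated by the largest group. The sketch below shows the arithmetic, assuming a hypothetical evaluation log of (group, correct) pairs. The counts are invented for illustration and are not the figures reported by Buolamwini and Gebru.

```python
from collections import defaultdict

# Hypothetical evaluation log: (group, was the prediction correct?).
# The counts are invented to show the effect of an unbalanced test set.
results = (
    [("lighter male", True)] * 99
    + [("lighter male", False)] * 1
    + [("darker female", True)] * 13
    + [("darker female", False)] * 7
)


def overall_accuracy(rows):
    """Fraction of all predictions that were correct."""
    return sum(int(correct) for _, correct in rows) / len(rows)


def accuracy_by_group(rows):
    """Fraction of correct predictions, computed separately for each group."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for group, correct in rows:
        totals[group] += 1
        hits[group] += int(correct)
    return {group: hits[group] / totals[group] for group in totals}


print(f"overall: {overall_accuracy(results):.1%}")
for group, acc in accuracy_by_group(results).items():
    print(f"{group}: {acc:.1%}")
```

In this made-up log the overall number looks healthy only because the well-served group supplies most of the examples. Reporting accuracy per group is what exposes the disparity.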

  • Tay.ai

Another example is the application of deep learning in NLP. Tay was a chatterbot released by Microsoft via Twitter. However, the first release lasted only 16 hours, because Tay started posting inflammatory and offensive tweets that it had learned from posts by human users. One week later, after a second release, Microsoft shut Tay down again when it posted drug-related tweets.


If I’ve learned one thing, it’s that technology doesn’t change who we are, it magnifies who we are, the good and the bad. -- Tim Cook

AI is powerful, but it is not a panacea.

  1. When all you have is a hammer, everything looks like a nail. Keep a cautious, humble, and critical attitude towards modern technology. Using these techniques in the right way matters more than developing and applying them.
  2. Human society is much more complicated, pluralistic, and dynamic than what we can describe using code and algorithms alone. That is why data science is not just about techniques; it is about how we look at the world. Collaborative efforts from engineers, statisticians, educators, social scientists, journalists, and people from every other area of study are required.
  3. Data, data, data. All the examples shown here have, to some extent, a data problem. Incomplete, misleading, and uncontrolled data sources lead to failed results that merely look perfect. Some people may claim that all we need is to collect more high-quality data, but that is very challenging. Besides, can we uncover the truth about society just by running complicated algorithms, even if the data is perfect? This logic of "statistics imperialism" probably shows nothing more than ignorance. From my perspective, data and techniques are always useful tools for learning. However, to find the truth, we need much more.
  4. Bias always exists, and it can be hidden in the algorithm. For example, almost all recommendation systems aim to increase user traffic (more clicks, more purchases, or longer watch time). This goal shapes model fitting, evaluation, and optimization. However, when we apply the same algorithm in an education context, it brings a biased and unrealistic assumption with it: that more traffic means more learning, more improvement in performance, and a brighter future.

Further information:

My study notes on the paper Deep Learning: A Critical Appraisal.

Posted in: Research|By: Yi Chen|1308 Reads