Deep Learning is not the AI future
By Fabio Ciucci, CEO at Anfy srl.
Everyone now is learning, or claiming to learn, Deep Learning (DL), the only field of Artificial Intelligence (AI) that went viral. Paid and free DL courses count hundreds of thousands of students of all ages. Too many startups and products are named “deep-something” just as a buzzword: very few really use DL. Most ignore that DL is 1% of the Machine Learning (ML) field, and that ML is 1% of the AI field. The remaining 99% is what’s used in practice for most tasks. A “DL-only expert” is not a “whole AI expert”.
DL is not a synonym of AI! The most advertised AI tools by Google, Facebook etc. are mainly or only DL, so the wider public thinks that all the new AI records are (and will be) set with DL only. This is not true. Gradient-boosted decision trees like XGBoost are not making headlines, but silently beat DL at many Kaggle tabular data competitions. The media implied that AlphaGo is DL-only, but it’s Monte Carlo tree search + DL, evidence that pure DL was not enough to win. Many reinforcement learning tasks are solved with neuroevolution methods such as NEAT, with no backpropagation. There is “deep misinformation” in AI.
I am not saying that DL is not solving tasks: DL is impressive. Trees and other algorithms don’t beat DL often, and for some tasks there is no substitute for DL, but I expect non-DL systems to be (re)discovered in the future to beat DL. Perhaps they will also solve the legal nightmare of DL decisions that, even if correct, can’t be explained when legally questioned. I would also like to read in the press about DL issues like “catastrophic forgetting”, the tendency to abruptly forget previously learned information upon learning new information, and about the daily fight against “overfitting”. About “intelligence”: DL will simply believe the training data given, without understanding what’s true or false, real or imaginary, fair or unfair. Humans believe fake news too, but only up to a certain level, and even kids know that movies are fiction, not real. For more details, if you have time, read my longer article: AI (Deep Learning) explained simply.
Really, DL is 1980s tech, older than HTML: trained with more data, the 1970s “neural networks with hidden layers” gave better results, and were then renamed DL and hyped. In 1992 I briefly checked some neural network source code, together with other stuff like fractals and cellular automata. Like almost everyone else, I dismissed DL at the time as an academic math puzzle with no practical uses. Instead, I focused on learning what gave immediate results: 3D for video games, then the internet, and so on. But we were all wrong: DL can do amazing things with big data! I got fascinated in 2015 by Deep Dream, then by GANs etc. Still, DL is not the last, perfect AI science we can invent.
The ancient DL has already been studied extensively and updated across decades to solve more tasks more accurately, but no DL version (Convolutional, RNN, RNN + LSTM, GANs etc.) can explain its own decisions. While DL will surely solve more tasks and kill more jobs in the future, it is unlikely to solve them all, or to hold in reserve surprising updates that make it capable of arguing a legally valid defense of the fairness of its own decisions.
Deep Learning can’t understand these 2 philosophers
Future AI should explore other ways, new or old but overlooked, not DL only. A limit of DL is that it considers true simply what it spots most frequently in the data, and false what is statistically rarer or the opposite of what is more frequent. The fairness of DL comes not from DL itself, but from the humans selecting and preparing the data. DL can read texts and translate between them, but not in a “human way”. If a DL model is trained on 100 books, 40 telling how hate, war, death and destruction are bad, and 60 telling that Hitler’s Nazi ideas were correct, the DL will end up 100% Nazi!
DL will never figure out on its own that killing Jews, gays and disabled people is bad, if Nazism is the most popular opinion in the training data. No wonder DL cannot explain its own decisions, except with a naive: “I’ve read most often that Nazism is right, so it should be right”. DL will learn and mimic the most flawed logic, including terrorism, without figuring out the flaws. Even small kids understand on their own who the bad guys in a movie are, but not DL, unless humans teach it explicitly first. DL-specific things like gradient descent with backpropagation are cool, as is custom DL hardware, but that’s mostly statistics and geometry, so they probably will not be in the AI of 2037.
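To make the 60/40 point concrete, here is a minimal sketch in Python. It is not a deep network: it assumes a hypothetical one-parameter model (a single logit) trained by gradient descent on cross-entropy, and the labels 1 and 0 simply stand for the two stances in the 100 books. The model learns the 60% frequency, but once it has to pick an answer it picks the majority stance every time.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_single_logit(labels, steps=2000, lr=0.5):
    """Fit one logit by gradient descent on mean cross-entropy.

    Hypothetical toy model: it has no inputs, only a bias term.
    """
    target_rate = sum(labels) / len(labels)
    logit = 0.0
    for _ in range(steps):
        p = sigmoid(logit)
        # Gradient of mean cross-entropy with respect to the logit
        logit -= lr * (p - target_rate)
    return sigmoid(logit)

# 1 = majority stance (60 books), 0 = minority stance (40 books)
books = [1] * 60 + [0] * 40
p_majority = train_single_logit(books)

# Asked 100 times, the model answers with its most probable stance
answers = [1 if p_majority > 0.5 else 0 for _ in range(100)]

print(round(p_majority, 2))   # learned frequency of the majority stance
print(sum(answers))           # how many of 100 answers follow the majority
```

The learned probability matches the data (0.6), yet the decisions are 100% majority: the 40 dissenting books leave no trace in the output.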
For many tasks, Deep Learning AI is or will become illegal, not compliant. Whoever collects data about citizens of the 28 European Union countries must comply with the General Data Protection Regulation (GDPR) by May 25, 2018. This is the date when DL will be abandoned for several apps in the EU, with AI startups quickly replacing DL with whatever else, or risking fines. Fines for noncompliance can reach 4% of global revenue, including USA revenue. GDPR, about automated decision-making, requires the right to an explanation, and the prevention of discriminatory effects based on race, opinions, health, etc. Laws similar to GDPR exist or are planned worldwide; it’s only a matter of time. The US Fair Credit Reporting Act requires disclosing all of the factors that adversely affected the consumer’s credit score, up to a maximum of 4 factors. DL factors normally number in the thousands or millions, not just 4: how to simplify them into 4? AI, like bitcoin ICOs, started out ignoring regulation, but laws and fines always come.
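For contrast, here is a sketch of why “the 4 factors” are easy to produce for a simple non-DL model. It assumes a hypothetical linear scorecard: the feature names, weights, baseline and applicant values below are all invented for illustration and come from no real credit model. Each factor’s contribution is its weight times the applicant’s distance from a baseline, so the most adverse factors can be listed by sorting. A DL model with millions of entangled weights offers no such per-factor reading.

```python
def top_adverse_factors(weights, applicant, baseline, k=4):
    """Return the k factors that lowered a linear score the most.

    All names and numbers passed in are hypothetical examples.
    """
    contributions = {
        name: weights[name] * (applicant[name] - baseline[name])
        for name in weights
    }
    # Keep only factors that pushed the score down, most negative first
    adverse = sorted((c, name) for name, c in contributions.items() if c < 0)
    return [name for _, name in adverse[:k]]

weights = {
    "payment_history": 3.0,
    "utilization": -2.0,
    "account_age_years": 1.0,
    "recent_inquiries": -1.5,
    "income_k": 0.1,
    "open_accounts": 0.5,
}
baseline = {
    "payment_history": 0.95,
    "utilization": 0.3,
    "account_age_years": 8,
    "recent_inquiries": 1,
    "income_k": 55,
    "open_accounts": 4,
}
applicant = {
    "payment_history": 0.6,
    "utilization": 0.9,
    "account_age_years": 2,
    "recent_inquiries": 4,
    "income_k": 40,
    "open_accounts": 3,
}

factors = top_adverse_factors(weights, applicant, baseline)
print(factors)
```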
DL systems taking more relevant decisions than telling if an image is a cat, or where to add bunny ears to selfies, will be replaced with non-DL systems. AI will have to be accountable, so different from DL, with outcomes you can explain to average judges and users in simple, legally valid words. DL complexity, which looks like “magic” to judges and users, is a legal risk, not a cool feature. DL will advise or alert humans, for example by detecting sickness in medical images, to be verified by a medical doctor, but this is only partial automation, lacking details. What do you tell users who get rejected by the AI (denied a loan, a job, etc.) and ask for explanations?
Laws are including the “right to an explanation”, for example of why a job or a loan is denied. DL gives results with no natural (legal) language explanations. Pages of DL variables are available, but are not acceptable to judges or users, since not even the best mathematicians or other algorithms can figure out a DL model and simplify it into words. Even where humans take the final decisions, AI tools should give detailed reasons that humans can either recognize as wrong (and so override, reversing the AI decision), or quickly accept by simply copying, pasting and signing the explanations prepared by the AI. No one knows how to modify DL to give simple human-like explanations, so DL can’t be made compliant! This issue also affects several other AI and Machine Learning algorithms, but not all of them, and not as much as DL. Decision trees also stop being explainable if boosted or used in ensembles. But in the future, new or rediscovered AIs that can defend their own decisions will be used for regulated decisions in place of both DL and humans.
In the case of GDPR, only human staff can reject an application: the AI can automate the positive outcomes, but if the AI denies a loan, job etc., it should pass the task to human staff, who will handle those negative decisions that make users angry and inquisitive. But in case of denial, the human staff will get no help or explanation from a DL-based AI, so they can’t know if the DL logic was right or wrong. They will have to check the data from scratch on their own, to decide whether to ultimately reject or not, and write a reasonable cause for the decision. The risk is that the human staff, to save time and money, will make up fake explanations for AI rejections, and blindly accept AI approvals. But judges called to decide on the fairness of AI rejections will also ask why the others were accepted, to compare. To be safe, you need solid reasons for accepting too, not only for rejecting, no matter what’s in laws like GDPR. Non-DL AI systems providing human-readable explanations of all decisions to users, judges and support staff will ultimately be the only ones used, for both fully and partially automated decisions.
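The workflow described above can be sketched in a few lines of Python. Everything here is an assumption for illustration: the `Decision` type, the `route` function and the 0.5 threshold are hypothetical, and the score stands in for the output of any black-box model. The point is what the human reviewer receives on a denial: a number and nothing else.

```python
from dataclasses import dataclass

@dataclass
class Decision:
    outcome: str      # "approved" or "needs_human_review"
    decided_by: str   # "ai" or "human"
    reasons: list     # explanations available to staff, judges and users

def route(model_score: float, threshold: float = 0.5) -> Decision:
    """Auto-approve positive outcomes, hand every denial to human staff.

    Hypothetical sketch: the score comes from an opaque model, so no
    reasons can be attached in either branch.
    """
    if model_score >= threshold:
        return Decision("approved", "ai", reasons=[])
    return Decision("needs_human_review", "human", reasons=[])

approved = route(0.8)
denied = route(0.2)
print(approved.outcome, approved.decided_by)
print(denied.outcome, denied.decided_by, denied.reasons)
```

In both branches the `reasons` list is empty, so the reviewer must rebuild the case from the raw data, and the approvals have no recorded justification either.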
Explainability was already a big issue before any specific laws and before DL. In antitrust cases, companies like Google are asked why one product rather than others is shown at the top of search results. This was before DL too: many other algorithms also mix data in a crazy way to get results, so no human can easily reconstruct the reasons for a decision. Judges are told that engineers don’t know exactly, and pages of linear algebra are given as evidence. This can’t end well: billions of dollars in fines were imposed in multiple cases, with warnings to change systems, even before a specific law existed. Class action lawsuits by users automatically denied jobs, loans, refunds etc., against the automated decision units of stores, banks, insurers, etc. will be the norm, and being unable to explain will mean “no defense”, fines, and a public relations disaster for the brand.
For most people, “AI” means the AI of sci-fi movies that can give smart explanations, so that humans can quickly decide if they agree or not, which makes legal validation very easy. Most people, including judges and those who write laws like GDPR, hearing that companies are “AI-first” or “adding AI”, expect an “AI” like in the movies, which would defend its own decisions if called to court, impressing users and judges. Instead, we got unexplainable “DL AI”, which will not be used much, even for tasks it can solve, just because it lacks interpretability. DL will not save costs and will not kill jobs where sensitive automated decisions are needed. Even where humans must take the final decision anyway, tool AIs explaining their advice will be much preferable to tool AIs giving responses without causes or reasons. Explainable AIs, when (re)discovered, will be safer, legally compliant, cheaper and faster, and will replace both DL and humans. Since DL was invented in the 1960s-1980s and then rediscovered in the 2010s, the basis of future explainable AIs has probably already been described by some researchers somewhere, but since it is not DL, no one will care to check and develop these AI types for years. Until they are rediscovered and hyped.
GDPR, about automated decision-making, also requires preventing discriminatory effects based on race, opinions, health status, etc. But DL models trained on user-generated data like social media and news (rather than ground truth data like medical or financial records) always implicitly contain evil biases. As said before, DL can read a lot of texts and data and mimic their contents, but will not critically understand them. DL will just believe what it spots most often, underline the patterns and trends found in the data, and so amplify the biases and problems of human society. The data shows that black people are arrested more often than white people: the DL will simply suspect blacks first if any crime is committed. The data shows that more males than females are directors on corporate boards: the DL will simply prefer male candidates in job applications.
DL decisions end up more discriminatory, racist and sexist than the average sample in the training data. This issue happens in all ML algorithms, but DL model bias is one of the hardest to test, detect, control and tune. It is so hard to fix that, rather than being patched, it has already caused the abrupt cancellation of many DL experiments, from chat bots that went Nazi and hateful to apps whitening black faces in photo “beauty” filters.
You can’t fix a discriminatory, racist or sexist DL model by trying to balance it with patches after training. DL is a neural network, and unlike some other AI methods, you can’t edit specific answers with local surgery: you must retrain everything with different, 100% balanced and fair data, which is rare in the wild. DL mimics what it finds in the data without understanding it: DL will not disagree with any data, and will not figure out the injustices in society; it’s all just “data to learn”. You would have to hire dedicated human staff to create fake fair data of an ideal society where white people are arrested as often as blacks, where 50% of directors are women, and so on. But the cost of creating vast amounts of de-biased data edited by human experts, just to train a DL model, makes it not worth replacing humans with AI in the first place! Further, even if you had trained a DL model that really is fair, you would have no evidence to convince a judge or a user of the fairness of any decision, since the DL gives no explanations.
DL will be of secondary importance, used for non-business apps or games that pose no legal risks. When explainable AIs become popular, DL will not be abandoned like magnetic tapes or cathode-ray TVs. People losing games against bots are unlikely to convince a judge to fine the AI company because it can’t explain how the AI won. People unhappy with how FaceApp edited their selfie into an older, younger, or opposite-sex version are unlikely to convince a judge to fine FaceApp because it can’t explain how the AI decided the new looks (except for a “race change” filter, removed after massive protests, no judge needed). Detecting sickness in medical images is a safe DL use, as long as users ask for confirmation from human doctors before taking medication.
The legally safe DL market is very limited: judges can fine in all the cases where the decision outcome can make a financial or health difference or be discriminatory, and where DL will not help to understand if and why the decision was fair. How about self-driving cars? DL seems a legal risk to use in anything more than art, games or jokes in good taste. Existing non-DL methods can replace DL where needed, and new methods will be (re)discovered, so AI progress will continue nicely. Especially if everyone studies (and invests in) all the old and new algorithms of the whole AI and Machine Learning sciences, not only DL: the only way to become a “whole AI lifetime expert”.
Besides being “illegal” to use for many useful tasks it can solve, DL is also unable to solve several tasks: those requiring abstract reasoning to figure out what’s fair and unfair in the data seen, and to explain the logic of its own decisions. Even for tasks not requiring explanation where DL seems the best system, like image recognition, DL is not as safe as human eyes. You can fool DL with “adversarial examples”: photos of something, like a cat, with invisible perturbations added, can fool the DL into seeing something else, like a dog. All humans will still see a cat, but the DL will instead see a dog or whatever the hacker secretly embedded. This can be exploited on street signs to hack current self-driving cars. New AI systems resisting this hack will replace DL.
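A minimal sketch of the adversarial trick, in plain Python. It assumes a hypothetical linear “cat vs dog” scorer over 10,000 pixels instead of a real deep network, with invented weights and pixel values. The perturbation follows the idea of the fast gradient sign method: for a linear scorer the gradient with respect to the image is just the weight vector, so each pixel is nudged by a tiny amount against the sign of its weight. No single pixel changes by more than 2% of its range, yet the label flips.

```python
def score(weights, pixels):
    """Linear score: positive means "cat", otherwise "dog"."""
    return sum(w * x for w, x in zip(weights, pixels))

def classify(weights, pixels):
    return "cat" if score(weights, pixels) > 0 else "dog"

def sign(v):
    return (v > 0) - (v < 0)

def perturb(weights, pixels, eps):
    """Shift every pixel by at most eps in the direction that lowers the score."""
    return [x - eps * sign(w) for w, x in zip(weights, pixels)]

n = 10_000
# Hypothetical weights and image, pixel values on a 0..1 scale
weights = [0.01 if i % 2 == 0 else -0.01 for i in range(n)]
image = [0.51 if i % 2 == 0 else 0.49 for i in range(n)]

adversarial = perturb(weights, image, eps=0.02)
largest_change = max(abs(a - b) for a, b in zip(adversarial, image))

print(classify(weights, image))        # cat
print(classify(weights, adversarial))  # dog
```

The many tiny changes add up across thousands of dimensions, which is why the perturbation can stay invisible to humans while still moving the score across the decision boundary.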
The author of Keras, the most popular DL library, said in his post “The limitations of deep learning”: “the only real success of DL has been the ability to map space X to space Y using a continuous geometric transform, given large amounts of human-annotated data.” These spaces have lots of dimensions, not just 3, and this is how DL can mimic Picasso art styles, Poker bluffs and some human creativity in many tasks. But in layman’s terms, I would say that this means: DL can be trained to recognize cat photos without understanding what a cat is, and to be racist without knowing it is racist. DL can recognize cats or be racist or win at games, which is impressive and at times useful, but DL can’t explain why a photo shows a cat, or whether a decision was racist.
In “The future of deep learning” the Keras author describes his vision of a new system where DL is only in “geometric modules”, which should interact with not-yet-existing “algorithmic modules” and “meta learners”. This would increase the number and types of tasks solved, but would still fail to explain the decisions, due to the DL modules. It’s like when we can’t explain, in words, certain feelings or images computed in our brain. Humans explain everything, but with mostly made-up, oversimplified excuses that everyone seems to believe are accurate. Machines are instead unfairly asked to be really accurate. Other experts are drafting new AI systems that do not include DL at all, but they lack funds: everyone invests only in DL now, and the DL mania will continue for a while. No one knows what the next big AI thing will be, but it is unlikely to be DL 2.0.
DL is hyped because only those who sell DL software and hardware, despite the conflict of interest, are interviewed in the AI debates. Have you noticed any legitimate “natural intelligence” experts, like psychologists and philosophers, supporting DL?
If you have neither AI urgency nor time to study, wait for the next AI system to be ready and study it directly, skipping DL 1.0. Otherwise, if you have AI urgency and/or time to study, be sure to cover the whole of AI and the many Machine Learning fields, not DL only.
Thanks for reading. If you feel good, please click the like and share buttons. But before commenting, especially if you just paid for an expensive DL course or you disagree, please first read my longer article in full: AI (Deep Learning) explained simply. If you are interested in the sci-fi debate on robot safety, read: Will AI kill us all after taking our jobs?
Original. Reposted with permission.
Bio: Fabio Ciucci is a Founder and CEO at Anfy srl in Lucca, Italy. Since 1996 he has created his own companies and advised others (entrepreneurs, family offices, venture capitalists) on private equity hi-tech investments, innovation, and artificial intelligence.