By Ted Smillie on Sunday, February 24th, 2019
Machine learning mistakes range from serious to ridiculous. Here are some examples:
“The answers they come up with are likely to be inaccurate or wrong because the software is identifying patterns that exist only in that data set and not the real world.”
Speaking at the February 2019 annual meeting of the American Association for the Advancement of Science (AAAS), Dr. Genevera Allen from Rice University warned against using machine learning software to analyse data that has already been collected.
“There is general recognition of a reproducibility crisis in science right now. I would venture to argue that a huge part of that does come from the use of machine learning techniques in science.”
Dr. Allen is working with a group of biomedical researchers at Baylor College of Medicine in Houston to develop “machine learning and statistical techniques that can not only sift through large amounts of data to make discoveries, but also report how uncertain their results are and their likely reproducibility”.
On a lighter note, a February 20, 2019 Side View from Crikey’s Bernard Keane links to some tongue-in-cheek examples of spurious correlations from Tyler Vigen.
The Tyler Vigen website gives details of the data behind each correlation and allows visitors to create their own spurious correlations.
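The effect is easy to reproduce at home. The short Python sketch below is an illustration only, not Tyler Vigen's method or Dr. Allen's: it generates a few hundred series of pure random noise, then searches every pair for the strongest correlation. The function names, the number of series and the seed are all assumptions chosen for the example. Because the series are unrelated by construction, any strong correlation it finds is a pattern that exists only in that data set.

```python
import random
from itertools import combinations


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def strongest_spurious_pair(n_series=200, n_points=10, seed=0):
    """Search unrelated random series for the most correlated pair.

    All parameters are illustrative assumptions: 200 series of 10 points
    each, drawn independently from a standard normal distribution.
    """
    rng = random.Random(seed)
    series = [
        [rng.gauss(0, 1) for _ in range(n_points)]
        for _ in range(n_series)
    ]
    # Compare every pair and keep the one with the largest |r|.
    i, j = max(
        combinations(range(n_series), 2),
        key=lambda pair: abs(pearson(series[pair[0]], series[pair[1]])),
    )
    return i, j, pearson(series[i], series[j])


if __name__ == "__main__":
    i, j, r = strongest_spurious_pair()
    print(f"Series {i} and {j} are unrelated noise, yet r = {r:.2f}")
```

With roughly 20,000 pairs to choose from and only ten points per series, a very strong correlation almost always turns up by chance, which is why searching data that has already been collected for patterns needs some measure of uncertainty attached.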