Artificial intelligence and statistics

“Machine learning is based on a number of earlier building blocks, starting with classical statistics.”

Dorian Pyle, data expert, McKinsey

This is an excellent and timely opinion piece in Frontiers of Digital Health on the difference between artificial intelligence and statistics in the published healthcare literature. Notably, it is not merely the number of publications mentioning artificial intelligence that has grown: the proportion of AI studies in the medical literature as a whole rose from about 0.2 in 2000 to about 0.8 in 2020 (and the increase may become exponential in the near future). The subtext is that some authors may be deliberately introducing the words artificial intelligence and machine learning to draw attention to their work.

The authors introduce a table relatively early in the manuscript that delineates some of the vocabulary of the statistics world alongside its typical counterparts in the ML/AI world. For example, “sensitivity” in statistical modeling is “recall” in machine learning. Likewise, the “confusion matrix” familiar to data scientists is known to statisticians as a “contingency table”. And where statisticians speak of “parameters and log likelihood”, our data scientist colleagues are more likely to say “weights and loss”. Perhaps at least a few of the commonly used terms could be unified, but sometimes professional hubris gets in the way of reconciliation.
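As a rough illustration of this terminology mapping, here is a minimal sketch of a lookup table built from the term pairs named above (the dictionary and `translate` helper are hypothetical, not from the paper):

```python
# Statistics vocabulary mapped to its common machine-learning counterpart
# (pairs drawn from the terms discussed above).
STATS_TO_ML = {
    "sensitivity": "recall",
    "contingency table": "confusion matrix",
    "parameters": "weights",
    "log likelihood": "loss",
}

def translate(term: str) -> str:
    """Return the ML name for a statistics term, or the term unchanged."""
    return STATS_TO_ML.get(term.lower(), term)

print(translate("Sensitivity"))  # -> recall
```

Terms with no counterpart pass through unchanged, which mirrors the reality that only part of the two vocabularies overlaps.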

There are a myriad of guidelines for reporting the validation of research using AI methodology, including TRIPOD-AI and PROBAST-AI, as well as DECIDE-AI, STARD-AI, and CONSORT-AI. While these are all well-intentioned efforts to build consensus for AI-related work, it is a bit daunting to work through them all. Moreover, the authors correctly point out that one of the major issues is the failure of these AI projects to be adopted into clinical practice. Even if an AI project conforms perfectly to these guidelines, if the algorithm is not incorporated into clinical practice and does not produce clinical impact, it remains merely a good AI project from the data science perspective, one that does not improve patient outcomes.

One remedy is to embed AI applications into the key drivers of the health system, as exemplified by the National Health Service in the UK. Another strategy is to improve an organization's data (and IT) infrastructure to accommodate AI projects. One important caveat the authors remind us of is that approved algorithms often lack adequate follow-up.

Perhaps the most important insight of this paper is its explanation of why the most successful AI/ML projects arise in high signal-to-noise situations (such as visual recognition, language translation, and games with fixed rules) that offer rapid feedback during training and access to the correct answers. In medicine, diagnostic and descriptive prognostic research often involves low signal-to-noise ratios and small datasets; these are unfavorable substrates for AI/ML, and on them these methodologies are perhaps no more accurate than logistic regression.
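For readers curious what that logistic-regression yardstick looks like, here is a minimal from-scratch sketch (a generic illustration with a made-up toy dataset, not code or data from the paper) showing how little machinery the baseline requires:

```python
import math

def train_logistic(xs, ys, lr=0.1, epochs=1000):
    """Fit a one-feature logistic regression by stochastic gradient descent."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(w * x + b)))  # predicted probability
            w -= lr * (p - y) * x                     # gradient step on weight
            b -= lr * (p - y)                         # gradient step on bias
    return w, b

def predict(w, b, x):
    """Classify x as 1 if the predicted probability exceeds 0.5."""
    return 1 if 1.0 / (1.0 + math.exp(-(w * x + b))) > 0.5 else 0

# Hypothetical toy dataset: a single biomarker value and a binary outcome.
xs = [0.5, 1.0, 1.5, 3.0, 3.5, 4.0]
ys = [0, 0, 0, 1, 1, 1]
w, b = train_logistic(xs, ys)
print([predict(w, b, x) for x in xs])
```

On small, cleanly separable data like this, the simple model recovers the labels; the paper's point is that in low signal-to-noise clinical settings, far more elaborate AI/ML methods often cannot do meaningfully better than this baseline.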

The authors finally point out the key differences between humans and machines. Humans, including children, can learn from very small amounts of data. Machines, by contrast, have no common sense, which leaves AI vulnerable to erroneous data. On the other hand, AI algorithms can learn from huge amounts of data, an advantage over even seasoned clinicians with decades of experience. However, larger amounts of data do not guarantee higher-quality data. These differences are essential to remember as we work to leverage AI to its fullest in improving clinical practice and patient outcomes.

Read the full paper here.


The post Artificial intelligence and statistics appeared first on AIMed.


Author: aimed_aj
Categorized as AI