Progress in large language models (LLMs) has been rapid lately and, I suspect, is moving faster than our understanding of what these models are really capable of. OpenAI's GPT-4 has exhibited evidence of a deeper world-model understanding than even GPT-3.5, which is scary as well as exhilarating.
For the application of helping physicians in practice, an enterprising startup has put out a chat-based app, Nabla, that promises to help physicians with their chart notes. I am not sure that LLM technology is mature enough to deploy for this application. First, the software runs on a cloud server, and that is always a concern. The company claims it is "HIPAA-eligible" and "GDPR-compliant," but it will still have to be approved by hospital or clinic security before it can be deployed. From what I can see, it outputs rather simple statements based on patient input and seems akin to a voice dictation system that pads snippets into complete sentences. It won't create the kind of chart notes I am accustomed to generating, especially in the Assessment and Plan section, which depends on knowledge of the literature and interpretation of clinical findings and lab results, and sets down my line of thinking. So far, I've not encountered software that will save me that effort. Because this software isn't asked to be creative, there is probably little risk of hallucinations or the other unwanted side effects of more complex generative chat.

Still, never before has a physician dictated a chart note containing confidential and sensitive information to a startup corporate entity. Since protected information will be exchanged, will each user's input be stored for use in a future training set? If so, how is that protected information scrubbed?
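To make the concern concrete, here is a minimal sketch of the kind of pipeline such an app implies: a dictated fragment is crudely de-identified and then wrapped in a prompt asking an LLM to expand it into a chart-note sentence. This is purely illustrative and hypothetical, not Nabla's actual design; `build_prompt` and the regex-based redaction are my own placeholders, and real HIPAA de-identification requires far more than a few patterns.

```python
import re

def redact_identifiers(text: str) -> str:
    """Crude de-identification: mask phone numbers, dates, and MRN-like digit runs.
    This is a toy illustration, not a HIPAA-grade scrubber."""
    text = re.sub(r"\b\d{3}[-.\s]?\d{3}[-.\s]?\d{4}\b", "[PHONE]", text)  # phone numbers
    text = re.sub(r"\b\d{1,2}/\d{1,2}/\d{2,4}\b", "[DATE]", text)         # calendar dates
    text = re.sub(r"\b\d{6,}\b", "[ID]", text)                            # MRN-like numbers
    return text

def build_prompt(snippet: str) -> str:
    """Wrap a dictated fragment in an instruction to produce one plain, factual sentence."""
    return (
        "Rewrite the following dictated fragment as one complete sentence for a "
        "clinical chart note. Do not add any facts that are not in the fragment.\n\n"
        f"Fragment: {snippet}"
    )

if __name__ == "__main__":
    dictated = "pt seen 3/14/2024, MRN 00482913, bp 142/90, started lisinopril 10 mg"
    safe = redact_identifiers(dictated)
    print(build_prompt(safe))
    # Whatever text survives redaction is what reaches the vendor's cloud endpoint;
    # whether and how that input is retained for training is exactly the open question above.
```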
In the area of expert systems, Google has made great strides in using LLMs as expert systems. However, it has been recognized that:
The problem is that medicine is a special domain. In contrast to other fields, it raises different issues and even greater safety concerns. As we have seen, models like ChatGPT can hallucinate and are capable of spreading misinformation.