AI language models fail to produce an appropriate early diagnosis more than 80% of the time, suggesting they are not yet safe ...