Information Reasoning and Question Answering in Healthcare: A PubMedQA Benchmark Study
Information extraction and analysis in healthcare have benefited significantly from recent advances in AI and NLP, often directly impacting human lives and well-being. However, processing and analyzing mostly unstructured and semi-structured health and biomedical information presents substantial challenges to researchers. The inherent complexity of the domain, limited data availability, data ambiguity, and the high cost of involving medical experts make building effective healthcare support systems difficult. With the recent revolution in AI, particularly in Large Language Models (LLMs), the quality of analyses of such data has increased significantly, though it has not yet reached the level required by medical practitioners. In this work, we address the challenge of information reasoning and question answering, with a particular focus on healthcare. We conduct a detailed analysis of PubMedQA, one of the most challenging and prominent benchmarks in this area. We design, implement, and evaluate Med-RCQ (Medical Reasoning by Concluding and Questioning), a novel LLM-based method for analyzing health and biomedical information with the objective of supporting informed decisions. Our PubMedQA benchmark results show that Med-RCQ outperforms most competing open approaches in answer quality, achieves results comparable to those of large commercial models, and outperforms all other reported results on an "accuracy per number of model parameters" metric.
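A minimal sketch of how this efficiency metric can be read, assuming it is simply benchmark accuracy normalized by parameter count; the symbol AccPP and the normalization are our illustrative assumptions, as the abstract does not define them:

\[
\mathrm{AccPP}(m) \;=\; \frac{\mathrm{Accuracy}(m)}{\text{number of parameters of model } m}
\]

Under this reading, a hypothetical 7B-parameter model scoring 0.78 accuracy (0.78/7 \(\approx\) 0.111 per billion parameters) would rank above a hypothetical 70B-parameter model scoring 0.80 (0.80/70 \(\approx\) 0.011 per billion parameters).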