Datasets

LLM Pathology Knowledge Graph

A large language model (LLM) is used to mine and analyze private pathology data, and the extracted information is integrated into a fine-grained knowledge graph. Graph relationship analysis and machine learning methods enable multi-dimensional analysis and reasoning over complex pathology data, with the aim of improving the accuracy, domain specificity, and practical usefulness of data queries. The approach is intended to provide more accurate diagnostic aids and to help medical researchers discover knowledge in large-scale datasets, supporting innovation and progress in pathology research.

Cheng Zhang

Cleaned Pathology Image-Caption Dataset (public)

This pathology image and text dataset, collected and organized from public sources, consists mainly of pathology images and their corresponding textual annotations obtained from open sources such as social media platforms, forums, medical papers, pathology tutorial videos, and textbooks. The image-text pairs have undergone basic cleaning and curation: descriptions outside the domain of pathology and textual annotations with low readability have been removed. The dataset contains valuable multimodal knowledge and the corresponding image-text relationships in the field of pathology.

Hanlin Long

Cleaned Pathology Image-Caption Dataset (all)

This pathology image and text dataset, collected and organized from both public and non-public sources, consists primarily of pathology images and their corresponding textual annotations. The public portion is drawn from open sources such as social media platforms, forums, medical papers, pathology tutorial videos, and textbooks. The non-public portion comprises pathology reports and teaching notes from more specialized partner institutions and hospitals. The image-text pairs have undergone basic cleaning and curation: descriptions outside the domain of pathology and textual annotations with low readability have been removed. Compared with the public-only dataset, this dataset contains more precise multimodal knowledge and corresponding image-text relationships in the field of pathology.

Hanlin Long

Breast Cancer LLM Evaluation Benchmark

We have developed a breast cancer benchmark to evaluate the performance of artificial intelligence models in pathological image analysis.

Feiyu Huang