The pathology image-text dataset, collected and organized from public sources, consists mainly of pathology images and their corresponding textual annotations, obtained from open sources such as social media platforms, forums, medical papers, pathology tutorial videos, and textbooks. These image-text pairs have undergone basic cleaning and curation: descriptions outside the domain of pathology and textual annotations with low readability were removed. The dataset captures valuable multimodal knowledge and the corresponding image-text relationships defined in the domain of pathology.
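The cleaning step described above can be illustrated with a minimal sketch. The criteria actually used for curation are not specified here, so everything in the example is an assumption made for illustration: the `ImageTextPair` record, the small `PATHOLOGY_TERMS` vocabulary standing in for a domain check, and the word-count and alphabetic-ratio thresholds standing in for a readability check are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical vocabulary: a stand-in for whatever domain signal a real
# curation pipeline would use (classifier, ontology lookup, manual review).
PATHOLOGY_TERMS = {
    "carcinoma", "biopsy", "histology", "tumor",
    "stain", "epithelium", "lesion",
}


@dataclass
class ImageTextPair:
    """One image and its textual annotation (hypothetical record layout)."""
    image_path: str
    caption: str


def is_in_domain(caption: str) -> bool:
    """Keep captions that mention at least one assumed pathology term."""
    words = {w.strip(".,;:()").lower() for w in caption.split()}
    return bool(words & PATHOLOGY_TERMS)


def is_readable(caption: str, min_words: int = 4,
                min_alpha_ratio: float = 0.6) -> bool:
    """Crude readability proxy: enough words, and mostly alphabetic text.

    Both thresholds are illustrative assumptions, not values from the source.
    """
    words = caption.split()
    if len(words) < min_words:
        return False
    alpha = sum(ch.isalpha() for ch in caption)
    return alpha / len(caption) >= min_alpha_ratio


def clean_pairs(pairs):
    """Drop pairs whose caption is out of domain or hard to read."""
    return [p for p in pairs
            if is_in_domain(p.caption) and is_readable(p.caption)]


if __name__ == "__main__":
    pairs = [
        ImageTextPair("a.png", "Biopsy shows invasive carcinoma with H&E stain."),
        ImageTextPair("b.png", "Great weather at the conference today, everyone!"),
        ImageTextPair("c.png", "tumor ### 12 ?? 0x3f"),
    ]
    for pair in clean_pairs(pairs):
        print(pair.image_path, "|", pair.caption)
```

In this sketch the two filters are kept as separate predicates so that either one could be replaced by a stronger method without changing `clean_pairs`.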