Research Projects

Non-Pathological Image Filtering in Pathological Multimodal Datasets
Existing open-source pathology text-image paired datasets (e.g., Quilt-1m) are constructed by extracting frames from YouTube videos. Although initial filtering strategies have been applied, significant noise (e.g., non-pathological images) remains. Classifiers of different architectures, trained on datasets of varying scales, show substantial performance disparities. Experimental results further indicate that fine-tuning large models on an optimized dataset (filtered to exclude non-pathological data) significantly improves their performance on downstream tasks.
Wenjin Qi
Vector Retrieval and GPU Acceleration
This project introduces a hierarchical retrieval framework and an efficient search method for Whole Slide Images (WSI). The framework preserves the spatial hierarchy and semantic information of pathological images across different magnifications by constructing a multi-level vector index. During retrieval, the system utilizes a GPU-accelerated parallel computing pipeline to enable rapid searches for query regions of any size. It then employs a topological recombination algorithm to aggregate the matched image patches into diagnostically meaningful, coherent areas. This method significantly enhances the accuracy and speed of large-scale pathological image retrieval, addressing the inflexibility and inefficiency of traditional approaches when handling multi-scale and variable-sized queries.

A Training-Free Algorithm for Patch-Level Quality Control in Pathological Images
The production process of pathological digital slides involves multiple critical steps, and potential quality issues in any of these steps may lead to defects such as image defocusing and tissue overlap. These abnormal regions result in the loss of pathological structural information, significantly compromising the accuracy and reliability of clinical diagnoses. Therefore, there is an urgent need to develop a rapid and efficient algorithmic framework to precisely identify and filter problematic regions, while further investigating the interference mechanisms and quantifying the impact of such low-quality image patches on the training of intelligent pathological analysis models.
Wenjin Qi
MLLM Evaluation in Breast Cancer
This project is focused on constructing a comprehensive benchmark to evaluate the performance of multimodal large language models (MLLMs) in breast cancer tasks.
Feiyu Huang
CLAM-based Image-Caption Generation
The high cost of labeling in the medical field has led to a lack of annotated data related to whole slide imaging (WSI), thereby limiting the performance of many downstream tasks, such as training and application of pathology CLIP.
Pengyu Guo
KB-enhanced Pathology CLIP (public datasets)
KB-enhanced Pathology CLIP addresses the variability in performance of pathology foundation models across different branches of pathology.
Hanlin Long
Image-Caption Data Market Demo
Acquiring high-quality training data is critical for the development of highly accurate and robust machine learning models, particularly as foundational models emerge.

Pathology Image-Text Structured Alignment Based on Multiple Instance Learning
The pathology foundation model trained on massive pathological texts provides strong pathology image-text alignment capabilities.
Pengyu Guo
Pivot: Enhancing Pathology Image-Text Alignment with a Pathology Knowledge Base
Pivot aligns pathological images, pathological ontologies, and text.
Hanlin Long
RIVL: Addressing Modality Missing in Pathology Image-Text Alignment Using Interpolation
RIVL infers the semantic vector of the text annotation for a candidate image from the several images most similar to it.
Hanlin Long
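The interpolation idea behind RIVL can be illustrated with a minimal sketch. It assumes each annotated image has an image embedding paired with a text embedding, and that the missing text vector is estimated as a similarity-weighted average over the `k` most similar images. The cosine similarity, the weighting scheme, and the names `cosine` and `interpolate_text_vector` are assumptions made for illustration, not details taken from the project.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def interpolate_text_vector(query_image_vec, paired, k=2):
    """Estimate a text vector for an image that has no annotation.

    paired: list of (image_vec, text_vec) tuples for annotated images.
    Returns the similarity-weighted mean of the text vectors belonging to the
    k images most similar to the query (hypothetical weighting scheme).
    """
    if not paired:
        raise ValueError("at least one annotated image is required")
    scored = sorted(
        ((cosine(query_image_vec, img), txt) for img, txt in paired),
        key=lambda pair: pair[0],
        reverse=True,
    )[:k]
    # Negative similarities are clamped to zero so they cannot flip the sign.
    weights = [max(score, 0.0) for score, _ in scored]
    total = sum(weights)
    if total == 0:
        # No neighbour is positively similar: fall back to a plain average.
        weights = [1.0] * len(scored)
        total = float(len(scored))
    dim = len(scored[0][1])
    return [
        sum(w * txt[i] for w, (_, txt) in zip(weights, scored)) / total
        for i in range(dim)
    ]
```

A weighted average keeps the estimate inside the span of the neighbours' text vectors, which is one simple way to read "interpolation"; other schemes (for example, a learned regressor) would fit the same description.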
A Complex Label Aggregation Method Based on Computer Vision Foundation Models and Crowdsourced Spectral Clustering
This project proposes a novel aggregation method for complex crowdsourced labels, integrating computer vision foundation models with a new crowdsourced spectral clustering technique. The system first refines the initial annotations by incorporating image information via a vision foundation model, overcoming the limitations of annotator abilities. It then employs an iterative clustering process (determining cluster number, clustering, and removing outliers) using a depth-first search-based approach to effectively group complex annotations (e.g., bounding boxes) targeting the same entity. Finally, it performs a weighted aggregation based on quality estimations of both annotators and annotations. This solution significantly improves the accuracy of multi-object, multi-class complex label aggregation and provides a new framework for acquiring high-quality data.
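
The grouping step, in which annotations that target the same entity are collected together, can be sketched in a much simplified form. The example below is not the crowdsourced spectral clustering described above: as a stand-in, it links two bounding boxes when their intersection-over-union (IoU) reaches a threshold and then uses an iterative depth-first search to collect connected groups. The box format `(x1, y1, x2, y2)`, the threshold of 0.5, and the function names are assumptions made for illustration.

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def group_boxes(boxes, threshold=0.5):
    """Group box indices into connected components of the IoU graph.

    Two boxes are neighbours when their IoU is at least `threshold`.
    Components are found with an explicit-stack depth-first search.
    """
    n = len(boxes)
    seen = [False] * n
    groups = []
    for start in range(n):
        if seen[start]:
            continue
        seen[start] = True
        stack = [start]
        component = []
        while stack:
            i = stack.pop()
            component.append(i)
            for j in range(n):
                if not seen[j] and iou(boxes[i], boxes[j]) >= threshold:
                    seen[j] = True
                    stack.append(j)
        groups.append(sorted(component))
    return groups
```

Each resulting group would then be passed to an aggregation step; the outlier removal and annotator-quality weighting mentioned above are outside the scope of this sketch.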

A Question-Answering Processing Method, Apparatus, Electronic Device, Storage Medium and Product
This project introduces a medical pathology knowledge retrieval framework that integrates Large Language Models (LLMs) with Retrieval-Augmented Generation (RAG). It enhances query understanding through a medical knowledge graph and retrieves relevant content via similarity-based indexing. The LLM then summarizes the results and generates interactive dialogues, enabling intelligent and reliable medical knowledge retrieval and analysis.
Ri Su
URICA: Pathology Slide Retrieval System
This work introduces the Unified Region-based Affine Identifier Retrieval Algorithm (URICA) to address region-level representation challenges in whole slide image retrieval, achieving rotation- and scale-invariant matching via semantic tessellation and affinity consistency, with theoretical proof of pixel-level retrieval approximation.
Ri Su
SCAR
This work bridges learning theory and data-centric practice through cross-modal supervision, defining the Foundation Data Size (FDS) for theoretical generalization and proposing SCAR, a four-dimensional framework for unified assessment of data quality and utility across modalities.
Ri Su
Semantic Retrieval-Based Biomedical Literature Indexing System for PubMed
This project develops a semantic retrieval system for biomedical research papers from PubMed, aiming to achieve precise semantic matching and efficient large-scale retrieval. A total of 1,336,133 papers are included, each segmented into 2,000-character chunks to enable fine-grained semantic indexing. The system ultimately generates 23,715,917 searchable nodes (chunks), providing a robust data foundation for advanced semantic retrieval and scientific question answering.
Ri Su
Cheng Zhang
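
For the PubMed indexing project, the reported figures imply an average of about 17.75 chunks per paper (23,715,917 chunks across 1,336,133 papers). The segmentation step can be sketched as below, assuming plain non-overlapping character windows; the project description does not say whether chunks overlap or respect sentence boundaries, so that choice and the name `chunk_text` are assumptions made for illustration.

```python
def chunk_text(text, size=2000):
    """Split text into consecutive, non-overlapping chunks of at most `size` characters."""
    if size <= 0:
        raise ValueError("size must be a positive integer")
    return [text[i:i + size] for i in range(0, len(text), size)]


# Example: a 4,500-character document yields chunks of 2,000, 2,000, and 500 characters.
example_chunks = chunk_text("a" * 4500)
chunk_lengths = [len(chunk) for chunk in example_chunks]
```

Each chunk would then be embedded and stored as one searchable node in the index.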