Academics | The Hong Kong University of Science and Technology

Image-Caption Data Market Demo
Acquiring high-quality training data is critical for developing accurate and robust machine learning models, particularly as foundation models emerge.

However, prior research on data marketplaces faces two key challenges. First, most marketplaces do not let buyers specify quality requirements for the data they purchase. As a result, buyers may be unable to use all of the purchased data because part of it is of subpar quality. Second, the primary optimization objective of existing marketplaces is to maximize sellers' utility without considering buyers' utility. The price produced by this biased objective is therefore unfair, which can reduce buyer purchases and lower the trading volume in data markets.

To tackle these two challenges, we propose FQora, a novel fair and quality-based data market. Specifically, FQora uses two types of quality-based pricing functions together with effective quality assessment functions to support quality-constrained queries. To balance utility between the two sides, we introduce a mean-variance constraint that keeps long-term market development low-risk, and we pursue a fair market objective that maximizes the utilities of both sellers and buyers through a novel Balanced Pareto Optimization. We show theoretically that Balanced Pareto Optimization solves this multi-objective optimization problem via its dual problem and is guaranteed to converge. Extensive experiments on four real-world datasets support our theoretical analysis and confirm the superior performance of FQora.
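
To make the idea of a quality-constrained query concrete, the following is a minimal sketch and not FQora's actual implementation. It assumes each image-caption pair already has a quality score in [0, 1], uses a hypothetical linear pricing function (`linear_price`), and fills the buyer's budget greedily. The names `Item`, `linear_price`, and `quality_constrained_query`, as well as the `base` and `rate` parameters, are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    """One image-caption pair offered by a seller (hypothetical schema)."""
    item_id: str
    quality: float  # assumed quality score in [0, 1]


def linear_price(quality: float, base: float = 1.0, rate: float = 4.0) -> float:
    """Hypothetical quality-based pricing: price grows linearly with quality."""
    return base + rate * quality


def quality_constrained_query(items, min_quality, budget):
    """Return (purchased items, total cost); every item meets min_quality.

    Greedy sketch: consider eligible items from highest to lowest quality
    and skip any item whose price would push the total over the budget.
    """
    eligible = sorted(
        (it for it in items if it.quality >= min_quality),
        key=lambda it: it.quality,
        reverse=True,
    )
    bought, spent = [], 0.0
    for it in eligible:
        cost = linear_price(it.quality)
        if spent + cost > budget:
            continue  # a cheaper, lower-quality item may still fit
        bought.append(it)
        spent += cost
    return bought, spent


# Illustrative catalog; the scores are made up for this example.
catalog = [Item("a", 0.9), Item("b", 0.4), Item("c", 0.75), Item("d", 0.6)]
bought, spent = quality_constrained_query(catalog, min_quality=0.5, budget=9.0)
print([it.item_id for it in bought], round(spent, 2))
```

With this catalog, item `b` is excluded by the quality threshold and item `d` by the budget, so the buyer pays only for data that meets the stated requirement. A real market would replace the linear price and the greedy selection with its own pricing and optimization procedures.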