On the Ability of Developers' Training Data Preservation of Learnware

Abstract

The learnware paradigm aims to enable users to leverage numerous existing well-trained models instead of building machine learning models from scratch. In this paradigm, developers worldwide can spontaneously submit their well-trained models to a learnware dock system, and the system helps each developer generate a specification for the model to form a learnware. As the key component, a specification should characterize the capabilities of the model, enabling the model to be adequately identified and reused, while preserving the developer's original data. Recently, the RKME (Reduced Kernel Mean Embedding) specification was proposed and has become the most commonly utilized. This paper provides a theoretical analysis of the RKME specification's ability to preserve the developer's training data. By modeling it as a geometric problem on manifolds and utilizing tools from geometric analysis, we prove that the RKME specification discloses none of the developer's original data and possesses robust defense against common inference attacks, while preserving sufficient information for effective learnware identification.
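To make the RKME idea concrete, the following is a minimal sketch (not the authors' implementation) of how an RKME-style specification can be constructed: a small set of synthetic "reduced" points and weights is fit so that its weighted kernel mean embedding approximates the empirical embedding of the training data, measured by the squared MMD. Here the reduced points are simply initialized from the data for brevity; function names, the RBF kernel choice, and the `gamma` parameter are illustrative assumptions.

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    # Pairwise RBF kernel matrix: k(a, b) = exp(-gamma * ||a - b||^2)
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq)

def rkme_reduced_set(X, m=5, gamma=1.0, seed=0):
    """Illustrative RKME-style reduction: choose m reduced points Z
    (here, a random subset of X for simplicity; in practice they would
    be optimized synthetic points), then solve for weights beta that
    minimize the RKHS distance to the empirical kernel mean of X."""
    rng = np.random.default_rng(seed)
    Z = X[rng.choice(len(X), size=m, replace=False)].copy()
    # With Z fixed, the optimal weights solve K_zz beta = (1/n) K_zx 1
    Kzz = rbf_kernel(Z, Z, gamma)
    Kzx = rbf_kernel(Z, X, gamma)
    beta = np.linalg.solve(Kzz + 1e-8 * np.eye(m), Kzx.mean(axis=1))
    return Z, beta

def mmd_sq(X, Z, beta, gamma=1.0):
    # Squared MMD between the empirical embedding of X and the
    # weighted embedding defined by (Z, beta)
    term1 = rbf_kernel(X, X, gamma).mean()
    term2 = beta @ rbf_kernel(Z, X, gamma).mean(axis=1)
    term3 = beta @ rbf_kernel(Z, Z, gamma) @ beta
    return term1 - 2 * term2 + term3
```

A user (or the dock system) can then compare the RKME specification against a task's data distribution via `mmd_sq` to identify helpful learnwares, without the developer's raw data ever leaving the reduced set.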

Cite

Text

Lei et al. "On the Ability of Developers' Training Data Preservation of Learnware." Neural Information Processing Systems, 2024. doi:10.52202/079017-1150

Markdown

[Lei et al. "On the Ability of Developers' Training Data Preservation of Learnware." Neural Information Processing Systems, 2024.](https://mlanthology.org/neurips/2024/lei2024neurips-ability/) doi:10.52202/079017-1150

BibTeX

@inproceedings{lei2024neurips-ability,
  title     = {{On the Ability of Developers' Training Data Preservation of Learnware}},
  author    = {Lei, Hao-Yi and Tan, Zhi-Hao and Zhou, Zhi-Hua},
  booktitle = {Neural Information Processing Systems},
  year      = {2024},
  doi       = {10.52202/079017-1150},
  url       = {https://mlanthology.org/neurips/2024/lei2024neurips-ability/}
}