MeRino: Entropy-Driven Design for Generative Language Models on IoT Devices
Abstract
Generative Large Language Models (LLMs) stand as a revolutionary advancement in the modern era of artificial intelligence (AI). However, scaling down LLMs for resource-constrained hardware, such as Internet-of-Things (IoT) devices, requires non-trivial effort and domain knowledge. In this paper, we propose a novel information-entropy framework for designing mobile-friendly generative language models. The whole design procedure involves solving a mathematical programming (MP) problem, which can be done on the CPU within minutes, making it nearly zero-cost. We evaluate our designed models, termed MeRino, across fourteen NLP downstream tasks, showing their competitive performance against state-of-the-art autoregressive transformer models under the mobile setting. Notably, MeRino achieves similar or better performance on both language modeling and zero-shot learning tasks compared to the 350M-parameter OPT, while being 4.9x faster on the NVIDIA Jetson Nano with a 5.5x reduction in model size.
Cite
Text
Zhao et al. "MeRino: Entropy-Driven Design for Generative Language Models on IoT Devices." AAAI Conference on Artificial Intelligence, 2025. doi:10.1609/AAAI.V39I21.34445

Markdown
[Zhao et al. "MeRino: Entropy-Driven Design for Generative Language Models on IoT Devices." AAAI Conference on Artificial Intelligence, 2025.](https://mlanthology.org/aaai/2025/zhao2025aaai-merino/) doi:10.1609/AAAI.V39I21.34445

BibTeX
@inproceedings{zhao2025aaai-merino,
title = {{MeRino: Entropy-Driven Design for Generative Language Models on IoT Devices}},
author = {Zhao, Youpeng and Lin, Ming and Tang, Huadong and Wu, Qiang and Wang, Jun},
booktitle = {AAAI Conference on Artificial Intelligence},
year = {2025},
  pages = {22840--22848},
doi = {10.1609/AAAI.V39I21.34445},
url = {https://mlanthology.org/aaai/2025/zhao2025aaai-merino/}
}