Step-GUI Technical Report

by Liang, Danxun; Dai, Yiting; Zeng, Yuqing; Li, Brian; Li, Qiongyao; Chen, Zelin; Xu, Min; Wang, Zhuoyu; Gao, Shuli; Haolong Yan; Zhu, Xianwei; Xiao, Fengqiong; Yang, Shiliang; Xie, Jingjing; Yan, Chengxu; Wu, Hao; Gao, Qiong; Zeng, Qinlin; Zhou, Xin; Hou, Xiaojie; Dong, Jie; Yuan, Jiayu; Xu, Chunqin; Fan, Zhimin; Liu, Xingbin; Jiao, Binxing; Huang, Zhiguo; Yin, Jianpeng; Shen, Yeqing; Li, Chenyang; Zhong, Shangwu; Kang, An; Lei, Lei; Zhu, Yibo; Sui, Yifan; Li, Hongbing; Feng, Xuanti; Duan, Mengmeng; Jiang, Daxin; Ren, Jiaju; Wang, Ning; Liu, Guodong; Wang, Jia; Yang, Mi; Huang, Xin; Cheng, Hang; Liu, Manjiao; Wang, Zhirui; Huang, Junhao; Liu, Xin; Gao, Shisi; Liang, Yingdan; Zheng, Ge; Zihan Yan; Yan, Zhonghao; Tan, Kaijun; Yu, Renjie; Li, Guopeng; Li, Hang; Wu, Nan; Peng, Guozhen; Zhao, Yingxiu; Li, Dong; Ren, Mengqiang; Fan, Guanghao; Deng, Yineng; Wen, Xuan; Meng, Ziyang; Tan, Liguo; Luo, Shuang; Cao, Kai; Sun, Wen; Li, Shunshan; Cai, Xuedan; Liu, Shaofan; Liang, Xin; Shi, Liying; Zhang, Yixun; Weng, Zejia; Luo, Mao; Shi, Yukang; Chen, Hongming; Wang, Na; Zhang, Xiangyu; Zhang, Jingyang; Ma, Peiyao; Zhou, Xu; Li, Jianyong; Chen, Mei; Liu, Xiaojia; Li, Jing; Gao,

in Annotations / Automation / Graphical user interface / Large language models / Privacy

2025

Paper

Overview

Recent advances in multimodal large language models unlock unprecedented opportunities for GUI automation. However, a fundamental challenge remains: how to efficiently acquire high-quality training data while maintaining annotation reliability? We introduce a self-evolving training pipeline powered by the Calibrated Step Reward System, which converts model-generated trajectories into reliable training signals through trajectory-level calibration, achieving >90% annotation accuracy with 10-100x lower cost. Leveraging this pipeline, we introduce Step-GUI, a family of models (4B/8B) that achieves state-of-the-art GUI performance (8B: 80.2% AndroidWorld, 48.5% OSWorld, 62.6% ScreenSpot-Pro) while maintaining robust general capabilities. As GUI agent capabilities improve, practical deployment demands standardized interfaces across heterogeneous devices while protecting user privacy. To this end, we propose GUI-MCP, the first Model Context Protocol for GUI automation with a hierarchical architecture that combines low-level atomic operations and high-level task delegation to local specialist models, enabling high-privacy execution where sensitive data stays on-device. Finally, to assess whether agents can handle authentic everyday usage, we introduce AndroidDaily, a benchmark grounded in real-world mobile usage patterns with 3,146 static actions and 235 end-to-end tasks across high-frequency daily scenarios (8B: static 89.91%, end-to-end 52.50%). Our work advances the development of practical GUI agents and demonstrates strong potential for real-world deployment in everyday digital interactions.
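
The two-tier idea described for GUI-MCP (low-level atomic operations plus high-level task delegation to a local specialist model) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the class `GuiToolServer`, the tool names `tap`, `swipe`, `type_text` and `delegate_task`, and the stand-in `toy_local_agent` are hypothetical and are not taken from the report or from any GUI-MCP specification.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class GuiToolServer:
    """Hypothetical two-tier tool dispatcher; not the GUI-MCP specification."""

    # Assumed on-device planner: maps an instruction to a list of atomic actions.
    local_agent: Callable[[str], List[dict]]
    # Actions executed on the device; this log never leaves the device here.
    log: List[dict] = field(default_factory=list)

    def call(self, tool: str, **args) -> dict:
        # Low-level tier: a single atomic operation requested by the caller.
        if tool in ("tap", "swipe", "type_text"):
            self.log.append({"tool": tool, **args})
            return {"status": "ok", "executed": 1}
        # High-level tier: the whole task is handed to the local model.
        if tool == "delegate_task":
            steps = self.local_agent(args["instruction"])
            self.log.extend(steps)
            # Only a summary is returned, not screen contents or typed text.
            return {"status": "ok", "executed": len(steps)}
        raise ValueError(f"unknown tool: {tool}")


def toy_local_agent(instruction: str) -> List[dict]:
    """Stand-in for an on-device model: returns a fixed two-step plan."""
    return [
        {"tool": "tap", "x": 120, "y": 300},
        {"tool": "type_text", "text": instruction},
    ]


server = GuiToolServer(local_agent=toy_local_agent)
low = server.call("tap", x=10, y=20)
high = server.call("delegate_task", instruction="search for coffee")
```

In this sketch the privacy property comes from what `delegate_task` returns: the caller sees only a status and a step count, while the concrete actions stay in the on-device log. A real protocol would need to define its own result schema and trust boundary.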

Publisher

Cornell University Library, arXiv.org

Subject

Annotations / Automation / Graphical user interface / Large language models / Privacy