Optimal subsample selection for massive logistic regression with distributed data
Zuo, Lulu (1); Zhang, Haixiang (1); Wang, HaiYing (2); Sun, Liuquan (3)
2021-02-27
Journal: COMPUTATIONAL STATISTICS
ISSN: 0943-4062
Pages: 28
Abstract: With the emergence of big data, it is increasingly common for data to be distributed, i.e., stored at many sites (machines or nodes) owing to data collection, business operations, and similar causes. We propose a distributed subsampling procedure in such a setting to efficiently approximate the maximum likelihood estimator for logistic regression. We establish the consistency and asymptotic normality of the subsample estimator given the full data. The optimal subsampling probabilities and optimal allocation sizes are explicitly obtained. We develop a two-step algorithm to approximate the optimal subsampling procedure. Numerical simulations and an application to airline data are presented to evaluate the performance of our subsampling method.
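The two-step idea summarized in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: the record gives no formulas, so the sampling scores (proportional to |y - p| times the covariate norm), the allocation rule (proportional to each site's total score), and every function and parameter name below are assumptions chosen for illustration only.

```python
import numpy as np

def fit_weighted_logistic(X, y, w, n_iter=50, tol=1e-8):
    """Newton's method for a weighted logistic log-likelihood."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        grad = X.T @ (w * (y - p))
        hess = (X * (w * p * (1.0 - p))[:, None]).T @ X
        step = np.linalg.solve(hess, grad)
        beta += step
        if np.max(np.abs(step)) < tol:
            break
    return beta

def two_step_distributed_subsample(sites, r0, r, rng):
    """sites: list of (X_k, y_k) pairs, one pair per machine (hypothetical layout)."""
    n_total = sum(len(y) for _, y in sites)

    # Step 1: uniform pilot subsample, allocated in proportion to site size.
    pilot_X, pilot_y = [], []
    for X, y in sites:
        r0_k = max(1, int(round(r0 * len(y) / n_total)))
        idx = rng.choice(len(y), size=r0_k, replace=True)
        pilot_X.append(X[idx])
        pilot_y.append(y[idx])
    pilot_y = np.concatenate(pilot_y)
    beta0 = fit_weighted_logistic(np.vstack(pilot_X), pilot_y, np.ones(len(pilot_y)))

    # Step 2: each site scores its own rows using the pilot estimate.
    # ASSUMPTION: score_i = |y_i - p_i| * ||x_i||, not taken from this record.
    scores = []
    for X, y in sites:
        p = 1.0 / (1.0 + np.exp(-X @ beta0))
        scores.append(np.abs(y - p) * np.linalg.norm(X, axis=1))
    total = sum(s.sum() for s in scores)

    sub_X, sub_y, sub_w = [], [], []
    for (X, y), s in zip(sites, scores):
        # ASSUMPTION: allocation size proportional to the site's total score.
        r_k = max(1, int(round(r * s.sum() / total)))
        idx = rng.choice(len(y), size=r_k, replace=True, p=s / s.sum())
        sub_X.append(X[idx])
        sub_y.append(y[idx])
        # Inverse of the expected number of times each row is drawn.
        sub_w.append(s.sum() / (r_k * s[idx]))
    w = np.concatenate(sub_w)
    w = w / w.mean()  # rescaling the weights does not change the maximizer
    return fit_weighted_logistic(np.vstack(sub_X), np.concatenate(sub_y), w)

# Usage on simulated data spread over three sites.
rng = np.random.default_rng(0)
beta_true = np.array([0.5, 1.0, -1.0])
sites = []
for _ in range(3):
    X = np.column_stack([np.ones(20000), rng.normal(size=(20000, 2))])
    y = (rng.random(20000) < 1.0 / (1.0 + np.exp(-X @ beta_true))).astype(float)
    sites.append((X, y))
print(two_step_distributed_subsample(sites, r0=500, r=2000, rng=rng))
```

Only per-site score totals and the sampled rows leave each machine in this sketch, which is the practical appeal of subsampling when the full data cannot be pooled.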
Keywords: Allocation size; Big data; Distributed and massive data; Subsample estimator; Subsampling probabilities
DOI: 10.1007/s00180-021-01089-0
Indexed by: SCI
Language: English
Funding: National Science Foundation (NSF), USA grant [DMS-1812013]; National Natural Science Foundation of China [11771431]; National Natural Science Foundation of China [11690015]; National Natural Science Foundation of China [11926341]; Key Laboratory of RCSDS, CAS [2008DP173182]
WOS research area: Mathematics
WOS category: Statistics & Probability
WOS accession number: WOS:000622671900002
Publisher: SPRINGER HEIDELBERG
Document type: Journal article
Item identifier: http://ir.amss.ac.cn/handle/2S8OKBNM/58237
Collection: Institute of Applied Mathematics
Corresponding author: Zhang, Haixiang
Author affiliations:
1. Tianjin Univ, Ctr Appl Math, Tianjin 300072, Peoples R China
2. Univ Connecticut, Dept Stat, Mansfield, CT 06269 USA
3. Chinese Acad Sci, Acad Math & Syst Sci, Beijing 100190, Peoples R China
Recommended citation
GB/T 7714
Zuo, Lulu, Zhang, Haixiang, Wang, HaiYing, et al. Optimal subsample selection for massive logistic regression with distributed data[J]. COMPUTATIONAL STATISTICS, 2021: 28.
APA Zuo, Lulu, Zhang, Haixiang, Wang, HaiYing, & Sun, Liuquan. (2021). Optimal subsample selection for massive logistic regression with distributed data. COMPUTATIONAL STATISTICS, 28.
MLA Zuo, Lulu, et al. "Optimal subsample selection for massive logistic regression with distributed data". COMPUTATIONAL STATISTICS (2021): 28.