CSpace
A distributed multiple sample testing for massive data
Xie Xiaoyue (1,2); Shi Jian (1,2); Song Kai (3)
2021-04-08
Source Publication: JOURNAL OF APPLIED STATISTICS
ISSN: 0266-4763
Pages: 19
Abstract: When data are stored in a distributed manner, directly applying traditional hypothesis testing procedures is often prohibitive because of communication costs and privacy concerns. This paper develops and investigates a distributed two-node Kolmogorov-Smirnov hypothesis testing scheme implemented with a divide-and-conquer strategy. Building on this scheme, it also provides a distributed fraud detection procedure and a distribution-based classification procedure for multi-node machines. The fraud detection procedure identifies which node in a multi-node system stores fraudulent data; the classification procedure determines whether the distributions across nodes differ and classifies the differing distributions. These methods can improve the accuracy of statistical inference in a distributed storage architecture. The feasibility of the proposed methods is verified through simulation studies and real-data examples.
Keywords: distributed scheme; hypothesis testing; fraud detection; classification
DOI: 10.1080/02664763.2021.1911967
Indexed By: SCI
Language: English
WOS Research Area: Mathematics
WOS Subject: Statistics & Probability
WOS ID: WOS:000637242100001
Publisher: TAYLOR & FRANCIS LTD
Document Type: Journal article
Identifier: http://ir.amss.ac.cn/handle/2S8OKBNM/58424
Collection: Academy of Mathematics and Systems Science, Chinese Academy of Sciences
Corresponding Author: Shi Jian
Affiliation:
1. Chinese Acad Sci, Acad Math & Syst Sci, Beijing, Peoples R China
2. Univ Chinese Acad Sci, Sch Math Sci, Beijing, Peoples R China
3. Beijing Inst Technol, Sch Management & Econ, Beijing, Peoples R China
Recommended Citation
GB/T 7714: Xie Xiaoyue, Shi Jian, Song Kai. A distributed multiple sample testing for massive data[J]. JOURNAL OF APPLIED STATISTICS, 2021: 19.
APA: Xie Xiaoyue, Shi Jian, & Song Kai. (2021). A distributed multiple sample testing for massive data. JOURNAL OF APPLIED STATISTICS, 19.
MLA: Xie Xiaoyue, et al. "A distributed multiple sample testing for massive data". JOURNAL OF APPLIED STATISTICS (2021): 19.
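Illustrative sketch: the abstract describes a two-node Kolmogorov-Smirnov test built on a divide-and-conquer strategy but does not give the procedure itself. The Python sketch below is therefore only a generic illustration of the idea, not the authors' scheme. It assumes that both nodes agree on a shared grid of evaluation points, that each node sends only its empirical CDF counts on that grid (never raw observations), and that the coordinator uses the asymptotic Kolmogorov distribution for the p-value. The names `node_summary`, `kolmogorov_sf`, and `ks_from_summaries` are hypothetical.

```python
import math
import random
from bisect import bisect_right


def node_summary(sample, grid):
    """Runs on one node: number of observations <= each grid point, plus sample size."""
    ordered = sorted(sample)
    counts = [bisect_right(ordered, g) for g in grid]
    return counts, len(ordered)


def kolmogorov_sf(lam, terms=100):
    """Asymptotic Kolmogorov survival function: 2 * sum_{k>=1} (-1)^(k-1) * exp(-2 k^2 lam^2)."""
    if lam < 0.2:
        return 1.0  # the survival function is numerically 1 this close to zero
    total = sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
                for k in range(1, terms + 1))
    return min(1.0, max(0.0, 2.0 * total))


def ks_from_summaries(summary_a, summary_b):
    """Runs on the coordinator: grid-based two-sample KS statistic and asymptotic p-value.

    Because the supremum is taken over the grid only, the statistic is a lower
    bound on the exact KS statistic, so the p-value is approximate.
    """
    counts_a, n_a = summary_a
    counts_b, n_b = summary_b
    d = max(abs(ca / n_a - cb / n_b) for ca, cb in zip(counts_a, counts_b))
    n_eff = n_a * n_b / (n_a + n_b)
    return d, kolmogorov_sf(math.sqrt(n_eff) * d)


# Simulated data for two nodes (an assumption for the demo, not data from the paper).
rng = random.Random(0)
grid = [i / 10 for i in range(-40, 41)]
node_a = [rng.gauss(0.0, 1.0) for _ in range(2000)]
node_b = [rng.gauss(0.5, 1.0) for _ in range(2000)]
d, p = ks_from_summaries(node_summary(node_a, grid), node_summary(node_b, grid))
print(f"KS statistic (grid-based): {d:.3f}, asymptotic p-value: {p:.3g}")
```

In this sketch each node transmits one integer per grid point plus its sample size, so the communication cost depends on the grid length rather than on the amount of data held by the node. A finer grid brings the statistic closer to the exact KS value at the price of a larger message.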
Items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.