Question (please give the correct answer)
[Subjective question]

An outlier is a data object that deviates significantly from the rest of the objects, as if it were generated by a different mechanism.

Asked by: user okboy09 Posted: 2022-01-07
Reference answer
  Sorry, no answer is available yet; an update is in progress.
More questions related to “An outlier is a data object th…”
Question 1
Distance-based outlier mining is not suitable for data sets that do not fit any standard distribution model.
Question 2
Data mining is an (66) research field in database and artificial intelligence. In this paper, the data mining techniques are introduced broadly including its producing background, its application and its classification. The principal techniques used in the data mining are surveyed also, which include rule induction, decision (67), artificial (68) network, genetic algorithm, fuzzy technique, rough set and visualization technique. Association rule mining, classification rule mining, outlier mining and clustering method are discussed in detail. The research achievements in association rule, the shortcomings of association rule measure standards and its (69), the evaluation methods of classification rules are presented. Existing outlier mining approaches are introduced which include outlier mining approach based on statistics, distance-based outlier mining approach, data detection method for deviation, rule-based outlier mining approach and multi-strategy method. Finally, the applications of data mining to science research, financial investment, market, insurance, manufacturing industry and communication network management are introduced. The application (70) of data mining are described.

A. intractable

B. emerging

C. easy

D. scabrous

Question 3
Which one is wrong about clustering and outliers?

A. Clustering belongs to supervised learning.

B. Principles of clustering include maximizing intra-class similarity and minimizing inter-class similarity.

C. Outlier analysis can be useful in fraud detection and rare events analysis.

D. Outlier means a data object that does not comply with the general behavior of the data.

Question 4
Assignment 6 - Outlier mining

You are required to use outlier mining methods to detect the outliers in the given data set. On a section of a city road, several cameras were set up to record the license plates of vehicles from 2017-06-09 to 2017-06-12, together with the date and time at which each vehicle passed the start point and the finish point. Travel time is calculated afterwards. The time serial is another representation derived from the start time. Each instance therefore contains 8 attributes, including serial number, license plate number, date and time of passing the start and end points, time serial, and travel time. There are 4977 instances in total. You need to complete the following tasks.

Task:

(1) Use a statistics-based approach to detect the outliers of travel time. Calculate the mean and the variance of travel time. Write out the confidence interval. Plot a scatter diagram with time serial on the X-axis and travel time on the Y-axis, and mark the outliers you have identified.

(2) Use a distance-based approach to detect the outliers of travel time. An object o in data set D is a DB(r,π) outlier, with parameters r and π, if the fraction of the objects in D that lie at a distance less than r from o is less than π. Let r vary from 0.1 to 0.3 in steps of 0.1 and π vary from 30 to 90 in steps of 30; find the outliers and the number of outliers. You can use the Euclidean distance.

(3) Use a density-based approach to detect the outliers of travel time. For different values of k, the number of neighbors (from 3 to 400 in steps of 5), calculate the LOF of each data point. Use 2.0 as the LOF threshold: an object is labeled as an outlier if its LOF exceeds 2.0. First, plot a line chart with k on the X-axis and the number of outliers on the Y-axis. Second, calculate the LOF of each data point and give the top 4 outliers, using k=350 and the Euclidean distance.
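Tasks (1) and (2) could be sketched roughly as follows. This is only an illustration under stated assumptions, not the assignment's reference solution: the function names and the sample `travel_times` are made up, the interval is taken as mean ± z standard deviations (which presumes roughly normal data), π is treated as a fraction between 0 and 1, and the object o itself is excluded when counting its neighbours.

```python
from statistics import mean, pstdev


def statistical_outliers(values, z=2.0):
    """Flag values outside mean +/- z * standard deviation.

    z=2.0 corresponds to roughly 95% coverage if the data are normal
    (an assumption of this sketch).
    """
    mu = mean(values)
    sigma = pstdev(values)
    low, high = mu - z * sigma, mu + z * sigma
    outliers = [v for v in values if v < low or v > high]
    return outliers, (low, high)


def db_outliers(values, r, pi):
    """Return DB(r, pi) outliers of a 1-D data set.

    An object o is flagged when the fraction of the *other* objects lying
    at a distance less than r from o is below pi (pi given as a fraction).
    """
    n = len(values)
    outliers = []
    for i, o in enumerate(values):
        # In one dimension the Euclidean distance is the absolute difference.
        close = sum(1 for j, v in enumerate(values) if j != i and abs(v - o) < r)
        if close / (n - 1) < pi:
            outliers.append(o)
    return outliers


# Hypothetical travel times; these are NOT values from the assignment data.
travel_times = [5.0, 5.1, 5.2, 4.9, 5.0, 5.1, 4.8, 5.3, 5.0, 12.0]

flagged, interval = statistical_outliers(travel_times, z=2.0)
print("interval:", interval, "statistical outliers:", flagged)
print("DB(0.6, 0.5) outliers:", db_outliers(travel_times, r=0.6, pi=0.5))
```

The pairwise loop is quadratic in the number of instances, which should be acceptable for a few thousand rows but would need an index or sorting trick for much larger data.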

Question 5
Outlier detection algorithms such as Isolation Forest can be used to detect traffic incidents.
Question 6
If you cannot find a reason for an outlier or remove it, you should use the mean and IQR to summarize the center and spread.
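To get a feel for how the spread measures named in this statement react to an extreme value, the following sketch compares the sample standard deviation with the IQR on two small made-up samples. The data and the helper name `iqr` are assumptions for illustration; `statistics.quantiles` is used with its default method.

```python
from statistics import quantiles, stdev


def iqr(values):
    """Interquartile range: third quartile minus first quartile."""
    q1, _, q3 = quantiles(values, n=4)
    return q3 - q1


# Made-up samples: the second replaces the largest value with an extreme one.
clean = [10, 11, 12, 13, 14, 15, 16]
with_outlier = [10, 11, 12, 13, 14, 15, 100]

print("stdev:", stdev(clean), "->", stdev(with_outlier))
print("IQR:  ", iqr(clean), "->", iqr(with_outlier))
```

On these samples the IQR is unchanged while the standard deviation grows by more than a factor of ten, which is why the IQR is usually paired with a similarly resistant measure of center.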
Question 7
Which of the following is most affected by an outlier (extreme value)?

A. mean

B. median

C. mode

D. none of the above
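One way to explore this question is to compute each measure of center before and after an extreme value is introduced. The sketch below does that on made-up scores; the helper `center_shift` and the data are hypothetical.

```python
from statistics import mean, median, mode


def center_shift(before, after):
    """Return how much each measure of center changes between two samples."""
    measures = {"mean": mean, "median": median, "mode": mode}
    return {name: f(after) - f(before) for name, f in measures.items()}


# Made-up scores: the second list replaces the largest value with an extreme one.
scores = [2, 3, 3, 4, 5]
scores_with_outlier = [2, 3, 3, 4, 50]

shifts = center_shift(scores, scores_with_outlier)
for name, change in shifts.items():
    print(f"{name}: changed by {change}")
```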

Question 8
What is an application case of outlier mining?

A. Traffic incident detection

B. Credit card fraud detection

C. Network intrusion detection

D. Medical analysis

Question 9
In an n-dimensional space, the best method for detecting outliers is ( )

A. A normal probability plot

B. A box plot

C. Mahalanobis distance

D. A scatter plot
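For reference, the Mahalanobis distance of a point x from a sample with mean μ and covariance matrix S is sqrt((x − μ)ᵀ S⁻¹ (x − μ)). The sketch below computes it for the two-dimensional case only, so that the 2×2 inverse can be written out by hand. The function name and the sample points are assumptions for illustration; the general n-dimensional case would need a proper matrix inverse.

```python
from math import sqrt


def mahalanobis_2d(points):
    """Return the Mahalanobis distance of each 2-D point from the sample mean."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    # Sample covariance matrix [[sxx, sxy], [sxy, syy]].
    sxx = sum((x - mx) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    det = sxx * syy - sxy ** 2
    if det == 0:
        raise ValueError("covariance matrix is singular")
    # Closed-form inverse of the symmetric 2x2 covariance matrix.
    ixx, ixy, iyy = syy / det, -sxy / det, sxx / det
    distances = []
    for x, y in points:
        dx, dy = x - mx, y - my
        distances.append(sqrt(dx * dx * ixx + 2 * dx * dy * ixy + dy * dy * iyy))
    return distances


# Made-up points lying close to the line y = x, plus one point (4, 1) that is
# unremarkable on each axis separately but breaks the joint trend.
points = [(1, 1), (2, 2.1), (3, 2.9), (4, 4.2), (5, 4.8), (6, 6.1), (7, 6.9), (4, 1)]
distances = mahalanobis_2d(points)
most_unusual = points[distances.index(max(distances))]
print("largest Mahalanobis distance at", most_unusual)
```

Because the distance accounts for the correlation between the two coordinates, the off-trend point stands out even though neither of its coordinates is extreme on its own.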

Question 10
How should the right k be picked heuristically for the density-based outlier mining method?

A. k should be at least 10 to remove unwanted statistical fluctuations.

B. Picking k between 10 and 20 appears to work well in general.

C. Pick the upper bound value for k as the maximum number of “close by” objects that can potentially be global outliers.

D. Pick the upper bound value for k as the maximum number of “close by” objects that can potentially be local outliers.
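To make the role of k concrete, here is a minimal pure-Python sketch of the local outlier factor (LOF), the usual density-based score. It is a simplified illustration under stated assumptions: each point gets exactly its k nearest neighbours (ties at the k-distance are broken by index rather than all included), the data contain no duplicate points, and the function name and sample points are made up. The 2.0 cutoff is the threshold quoted in the assignment text of Question 4.

```python
from math import dist


def lof_scores(points, k):
    """Return the local outlier factor of every point for neighbourhood size k."""
    n = len(points)
    neighbours, k_dist = [], []
    for i in range(n):
        # Sort all other points by Euclidean distance and keep the k nearest.
        by_dist = sorted((dist(points[i], points[j]), j) for j in range(n) if j != i)
        nearest = by_dist[:k]
        neighbours.append([j for _, j in nearest])
        k_dist.append(nearest[-1][0])  # distance to the k-th nearest neighbour

    def reach_dist(i, j):
        # Reachability distance of point i with respect to neighbour j.
        return max(k_dist[j], dist(points[i], points[j]))

    # Local reachability density: inverse of the mean reachability distance.
    lrd = [
        len(neighbours[i]) / sum(reach_dist(i, j) for j in neighbours[i])
        for i in range(n)
    ]
    # LOF: mean density of the neighbours relative to the point's own density.
    return [
        sum(lrd[j] for j in neighbours[i]) / (len(neighbours[i]) * lrd[i])
        for i in range(n)
    ]


# Made-up data: a tight cluster of five points and one isolated point.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5), (5, 5)]
scores = lof_scores(points, k=3)
outliers = [p for p, s in zip(points, scores) if s > 2.0]
print("LOF scores:", [round(s, 2) for s in scores])
print("points with LOF > 2.0:", outliers)
```

Scores near 1 indicate a density similar to that of the neighbours, while scores well above 1 indicate a comparatively sparse region. Rerunning the sketch with different k values shows how sensitive the scores are to that choice.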
