When Is Attribute Agreement Analysis Used
In addition to the sample size problem, the logistics of ensuring that evaluators do not remember the attribute they originally assigned to a scenario when they see it a second time are also a challenge. This can be mitigated somewhat by increasing the sample size and, better yet, by waiting a while (perhaps one to two weeks) before giving the scenarios to the evaluators a second time. Randomizing the order of the scenarios from one review to the next can also help. In addition, evaluators tend to perform differently when they know they are being examined, so the fact that they know it is a test can also bias the results. Disguising the test in some way might help, but that is almost impossible to achieve, quite apart from the fact that it borders on the unethical. And besides being marginally effective at best, these solutions add complexity and time to an already demanding study.
Despite these difficulties, performing an attribute agreement analysis on bug tracking systems is not a waste of time. In fact, it is (or can be) an extremely informative, valuable and necessary exercise. It just needs to be applied with care and with a certain focus.
An attribute agreement analysis is designed to simultaneously assess the effects of repeatability and reproducibility on accuracy.
It allows the analyst to examine the responses of several evaluators as they look at multiple scenarios multiple times. It produces statistics that assess the ability of the evaluators to agree with themselves (repeatability), with each other (reproducibility) and with a known master or correct value (overall accuracy) for each characteristic, over and over again.
Assuming that the accuracy rate (or the most likely error modes) of the bug tracking system is unknown, it is advisable to audit 100 percent of the database over a reasonable window of recent history. What is reasonable? That really depends, but to be safe, at least 100 samples from a recent, representative period should be examined. The definition of reasonable should take into account how the database information will be used: to prioritize projects, investigate root causes or assess performance. One hundred samples is a good starting point for an audit because it gives the analyst a rough idea of the overall accuracy of the database.
Analytically, this technique is a wonderful idea. In practice, however, it can be difficult to execute well. First, there is always the question of sample size. For attribute data, relatively large samples are required to calculate percentages with reasonably narrow confidence intervals.
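To make the three agreement measures described above concrete, here is a minimal Python sketch that treats each one as a plain percentage of scenarios with matching ratings. The data layout, the function names and the category labels ("UI", "DB", "API") are all hypothetical, chosen only for illustration, and this is a simplified percent-agreement calculation rather than what any particular statistics package reports.

```python
# Hypothetical data: 5 bug scenarios, 2 evaluators, 2 review passes each.
# master[i] is the known correct category for scenario i (illustrative labels).
master = ["UI", "DB", "UI", "API", "DB"]
ratings = {
    "A": [["UI", "DB", "UI", "API", "DB"],    # evaluator A, first pass
          ["UI", "DB", "API", "API", "DB"]],  # evaluator A, second pass
    "B": [["UI", "DB", "UI", "API", "UI"],    # evaluator B, first pass
          ["UI", "DB", "UI", "API", "UI"]],   # evaluator B, second pass
}


def repeatability(passes):
    """Share of scenarios one evaluator rated identically on every pass."""
    per_scenario = list(zip(*passes))  # regroup ratings scenario by scenario
    return sum(len(set(r)) == 1 for r in per_scenario) / len(per_scenario)


def reproducibility(all_ratings):
    """Share of scenarios on which all evaluators agree on every pass."""
    every_pass = [p for passes in all_ratings.values() for p in passes]
    per_scenario = list(zip(*every_pass))
    return sum(len(set(r)) == 1 for r in per_scenario) / len(per_scenario)


def accuracy(all_ratings, master_values):
    """Share of scenarios on which every rating equals the master value."""
    every_pass = [p for passes in all_ratings.values() for p in passes]
    per_scenario = zip(*every_pass)
    hits = sum(set(r) == {m} for r, m in zip(per_scenario, master_values))
    return hits / len(master_values)


for name, passes in ratings.items():
    print(f"repeatability of {name}: {repeatability(passes):.0%}")
print(f"reproducibility: {reproducibility(ratings):.0%}")
print(f"overall accuracy: {accuracy(ratings, master):.0%}")
```

With only five scenarios, a single disagreement moves each percentage by 20 points, which is one way to see why sample size matters so much for attribute data.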
If an evaluator looks at 50 different error scenarios twice and the match rate is 96 percent (48 matches out of 50), the 95 percent confidence interval ranges from 86.29 percent to 99.51 percent. That is a fairly large margin of error, especially given the challenge of choosing the scenarios, reviewing them thoroughly to make sure the correct master value is assigned, and then convincing the evaluator to do the work twice.
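The text does not say how the interval above was calculated. The quoted bounds are consistent with an exact (Clopper-Pearson) binomial interval, so the sketch below assumes that method. It uses only the Python standard library and finds each bound by bisection; the function names are illustrative, not taken from any library.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _solve_decreasing(f, target):
    """Find p in [0, 1] with f(p) == target, for f decreasing in p."""
    lo, hi = 0.0, 1.0
    for _ in range(100):  # 100 halvings is far more precision than needed
        mid = (lo + hi) / 2
        if f(mid) > target:
            lo = mid  # f is still too large, so the root lies at a larger p
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(matches, trials, confidence=0.95):
    """Exact two-sided binomial confidence interval for a match rate."""
    alpha = 1 - confidence
    # Lower bound: the p at which P(X >= matches) equals alpha / 2.
    if matches == 0:
        lower = 0.0
    else:
        lower = _solve_decreasing(
            lambda p: binom_cdf(matches - 1, trials, p), 1 - alpha / 2
        )
    # Upper bound: the p at which P(X <= matches) equals alpha / 2.
    if matches == trials:
        upper = 1.0
    else:
        upper = _solve_decreasing(
            lambda p: binom_cdf(matches, trials, p), alpha / 2
        )
    return lower, upper


lower, upper = clopper_pearson(48, 50)
print(f"95% interval for 48 of 50 matches: {lower:.2%} to {upper:.2%}")
```

Under that assumption the result should land close to the figures quoted above. Calling the function again with a larger number of scenarios at the same match rate shows how much the interval narrows as the sample grows.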