How to Discount Double-Counting When It Counts: Some Clarifications
Department of Philosophy, Major Williams Hall, Virginia Tech, Blacksburg, VA, USA
mayod{at}vt.edu
| Abstract |
|---|
The issues of double-counting, use-constructing, and selection effects have long been the subject of debate in the philosophical as well as statistical literature. I have argued that it is the severity, stringency, or probativeness of the test, or lack of it, that should determine if a double-use of data is admissible. Hitchcock and Sober ([2004]) question whether this 'severity criterion' can perform its intended job. I argue that their criticisms stem from a flawed interpretation of the severity criterion. Taking their criticism as a springboard, I elucidate some of the central examples that have long been controversial, and clarify how the severity criterion is properly applied to them.
- Severity and Use-Constructing: Four Points (and Some Clarificatory Notes)
- 1.1 Point 1: Getting beyond all or nothing standpoints
- 1.2 Point 2: The rationale for prohibiting double-counting is the requirement that tests be severe
- 1.3 Point 3: Evaluate severity of a test T by its associated construction rule R
- 1.4 Point 4: The ease of passing vs. ease of erroneous passing: Statistical vs. Definitional probability
- The False Dilemma: Hitchcock and Sober
- 2.1 Marsha measures her desk reliably
- 2.2 A false dilemma
- Canonical Errors of Inference
- 3.1 How construction rules may alter the error-probing performance of tests
- 3.2 Rules for accounting for anomalies
- 3.3 Hunting for statistically significant differences
- Concluding Remarks