validity
Do we measure what we think we are measuring?
Tests used in the empirical sciences are often (implicitly) assumed to be representative of a target mechanism, in the sense that similar tests should lead to similar results. In this talk, using resting-state electroencephalography (EEG) as an example, I will argue that this assumption does not necessarily hold. Typically, EEG studies are conducted by selecting one analysis method thought to be representative of the research question asked. Using multiple methods, we extracted a variety of features from a single resting-state EEG dataset and conducted correlational and case-control analyses. We found that many EEG features revealed a significant effect in the case-control analyses, and that EEG features likewise correlated significantly with performance on cognitive tasks. However, when we compared these features pairwise, we did not find strong correlations. A number of explanations for these results will be discussed.
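As a minimal sketch of the kind of pairwise feature comparison described above (not the authors' actual pipeline; the feature names and data are hypothetical stand-ins), one could correlate per-participant EEG features against each other and inspect the resulting matrix:

```python
# Illustrative sketch only: pairwise Pearson correlations between hypothetical
# resting-state EEG features, one value per participant.
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n_participants = 100

# Hypothetical EEG features (random placeholders for real derived measures).
features = pd.DataFrame({
    "alpha_power": rng.normal(size=n_participants),
    "theta_power": rng.normal(size=n_participants),
    "spectral_entropy": rng.normal(size=n_participants),
    "peak_alpha_freq": rng.normal(size=n_participants),
})

# Strong between-feature correlations would suggest the features index a shared
# mechanism; weak correlations, as reported in the talk, suggest they do not.
corr_matrix = features.corr(method="pearson")
print(corr_matrix.round(2))
```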
Commonly used face cognition tests yield low reliability and inconsistent performance: Implications for test design, analysis, and interpretation of individual differences data
Unfamiliar face processing (face cognition) ability varies considerably in the general population. However, the means of its assessment are not standardised, and the laboratory tests selected vary between studies. It is also unclear whether 1) the most commonly employed tests are reliable, 2) participants show a degree of consistency in their performance, and 3) face cognition tests broadly measure one underlying ability, akin to general intelligence. In this study, we asked participants to perform eight tests frequently employed in the individual differences literature. We examined the reliability of these tests, the relationships between them, and the consistency of participants' performance, and used data-driven approaches to determine the factors underpinning performance. Overall, our findings suggest that the reliability of these tests is poor to moderate, the correlations between them are weak, the consistency of participant performance across tasks is low, and performance can be broadly split into two factors: telling faces together and telling faces apart. We recommend that future studies adjust analyses to account for stimuli (face images) and participants as random factors, routinely assess reliability, and examine newly developed tests of face cognition for convergent validity with other commonly used measures of face cognition ability.
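For illustration, the sketch below shows one common way to assess the reliability of a trial-level task: permutation-based split-half reliability with Spearman-Brown correction. This is an assumed, generic approach run on simulated data, not the analysis reported in the study:

```python
# Illustrative sketch: split-half reliability of a participants x trials
# accuracy matrix, averaged over random splits, with Spearman-Brown correction.
import numpy as np

def split_half_reliability(trial_scores: np.ndarray, n_splits: int = 1000,
                           seed: int = 0) -> float:
    """trial_scores: participants x trials matrix of 0/1 accuracies."""
    rng = np.random.default_rng(seed)
    n_trials = trial_scores.shape[1]
    estimates = []
    for _ in range(n_splits):
        order = rng.permutation(n_trials)
        half_a = trial_scores[:, order[: n_trials // 2]].mean(axis=1)
        half_b = trial_scores[:, order[n_trials // 2:]].mean(axis=1)
        r = np.corrcoef(half_a, half_b)[0, 1]
        # Spearman-Brown correction compensates for halving the test length.
        estimates.append(2 * r / (1 + r))
    return float(np.mean(estimates))

# Simulated example: 80 participants, 60 trials, varying underlying ability.
rng = np.random.default_rng(1)
ability = rng.normal(0.75, 0.1, size=(80, 1)).clip(0.5, 0.95)
scores = (rng.random((80, 60)) < ability).astype(int)
print(f"Split-half reliability: {split_half_reliability(scores):.2f}")
```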
Categories, language, and visual working memory: How verbal labels change capacity limitations
The limited capacity of visual working memory constrains the quantity and quality of the information we can hold in mind for ongoing processing. Research from our lab has demonstrated that verbal labeling/categorization of visual inputs increases their retention and fidelity in visual working memory. In this talk, I will outline the hypotheses that explain the interaction between visual and verbal inputs in working memory, which leads to the boosts we observed. I will further show how manipulations of the categorical distinctiveness of the labels, the timing of their occurrence, the items to which labels are applied, and the labels' validity modulate the benefits one can draw from combining visual and verbal inputs to alleviate capacity limitations. Finally, I will discuss the implications of these results for our understanding of working memory and its interaction with prior knowledge.
Algorithmic advances in face matching: Stability of tests in atypical groups
Face matching tests have traditionally been developed to assess human face perception in the neurotypical range, but the methods underlying their development often make it difficult to apply these measures in atypical populations (developmental prosopagnosics, super recognizers) because item difficulty is not adjusted. We recently presented the Oxford Face Matching Test (OFMT), a measure that bases individual item difficulty on the algorithmically derived similarity of the presented stimuli. The measure is useful in that it can be administered online or in the laboratory, and it shows good discriminability and high test-retest reliability in neurotypical groups. In addition, it has good validity in separating atypical groups at either end of the spectrum. In this talk, I examine the stability of the OFMT and other traditionally used measures in atypical groups. Beyond the theoretical significance of determining whether test reliability is equivalent in atypical populations, this is an important question because of the practical concern of retesting the same participants across different lab groups. Theoretical and practical implications for further test development and data sharing are discussed.
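As a rough sketch of the general idea behind algorithmically derived item difficulty (the embedding function and data here are hypothetical placeholders, not the OFMT's actual algorithm), candidate face pairs can be ranked by the similarity of their image embeddings, with more similar pairs yielding harder "different identity" trials:

```python
# Illustrative sketch only: rank candidate face pairs by embedding similarity
# to grade item difficulty. Embeddings are random stand-ins for the output of
# a face-recognition model.
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

rng = np.random.default_rng(0)
embeddings = {f"face_{i:03d}": rng.normal(size=128) for i in range(20)}

# Score every pair of distinct identities; higher similarity = harder trial.
ids = list(embeddings)
pairs = [
    (a, b, cosine_similarity(embeddings[a], embeddings[b]))
    for i, a in enumerate(ids) for b in ids[i + 1:]
]
pairs.sort(key=lambda p: p[2], reverse=True)
print("Hardest candidate pairs:", [(a, b, round(s, 2)) for a, b, s in pairs[:3]])
```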