Assessment tools meet psychometric requirements if they are reliable and valid. This article describes the validity of our test batteries.
Validity is the ability of a methodology to measure exactly what it was created to measure. To avoid confusion over terms, we will use target shooting as an analogy. The reliability of a methodology can be compared with how tightly the shots are grouped, while its validity can be compared with how close the shots land to the center of the target. As with reliability, a test's validity cannot be established by applying only one method. There are several aspects of validity and, accordingly, several ways of assessing it, ranging from subjective judgments to rigorous statistical procedures.
Construct validity shows whether a test really measures what it purports to measure and what we expect it to measure. For example, when using an intelligence test, we ask whether the test really measures intelligence. Or does it measure erudition? Or only one aspect of intelligence, such as the ability to do basic mathematics? For a personality questionnaire, the question is: are we actually measuring the factors that we want to measure? Several procedures make it possible to estimate the level of construct validity.
As a rule, researchers use an independently developed test that measures the same characteristic as your test. The same respondents complete both tests, and you then calculate the correlation between the scores on your test and the scores on the benchmark test. A high correlation indicates that the two tests measure the same construct.
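The comparison with a benchmark test can be sketched in a few lines of Python. This is a minimal illustration, not the procedure used in our studies: the function name `pearson_r` and the two score lists are hypothetical, and the Pearson coefficient is assumed as the measure of correlation.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squared deviations from the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical raw scores of the same eight respondents on both tests.
own_test = [12, 15, 9, 20, 17, 11, 14, 18]
benchmark = [30, 34, 25, 41, 39, 27, 33, 38]

r = pearson_r(own_test, benchmark)
print(f"correlation with the benchmark test: r = {r:.2f}")
```

The two tests do not need to share a scale, because the coefficient depends only on how the scores vary together.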
Another option is to administer the test to a group that is already known to stand out on the characteristic in question. For example, accountants are good at counting, while architects are adept at abstract thinking. If your test shows a significant difference between this group and other groups, this is evidence that it actually measures the characteristic.
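A group comparison of this kind can be sketched as follows. The scores are hypothetical, and the choice of statistics is an assumption for illustration: Welch's t statistic for the difference between group means and Cohen's d as a measure of its size.

```python
import math
import statistics

def cohens_d(group_a, group_b):
    """Standardized mean difference using the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    var_a = statistics.variance(group_a)
    var_b = statistics.variance(group_b)
    pooled_sd = math.sqrt(((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2))
    return (statistics.mean(group_a) - statistics.mean(group_b)) / pooled_sd

def welch_t(group_a, group_b):
    """Welch's t statistic, which does not assume equal group variances."""
    se = math.sqrt(statistics.variance(group_a) / len(group_a)
                   + statistics.variance(group_b) / len(group_b))
    return (statistics.mean(group_a) - statistics.mean(group_b)) / se

# Hypothetical numerical-test scores for the two groups.
accountants = [24, 27, 22, 29, 26, 25, 28]
other_staff = [19, 21, 17, 23, 20, 18, 22]

d = cohens_d(accountants, other_staff)
t = welch_t(accountants, other_staff)
print(f"Cohen's d = {d:.2f}, Welch's t = {t:.2f}")
```

Turning the t statistic into a p-value requires the t distribution, which is not part of the Python standard library; a statistics package would normally be used for that step.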
A third way is to have experts rate how strongly the trait being measured is expressed in each member of a specific group, and then ask the same group to complete the questionnaire. If the experts' ratings agree with the results of your test, then you can say that it has high construct validity.
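Agreement between expert ratings and test results can be quantified with a rank correlation. The sketch below assumes Spearman's coefficient, which suits ratings given on a short ordinal scale with many ties; the helper names and the data are hypothetical.

```python
import math

def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1  # ranks are 1-based
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks

def spearman_rho(x, y):
    """Spearman correlation: the Pearson correlation of the two rank lists."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)

# Hypothetical data: expert ratings on a 1-5 scale and questionnaire scores.
expert_ratings = [2, 4, 3, 5, 1, 4, 3, 5]
test_scores = [11, 18, 14, 22, 9, 17, 15, 20]

rho = spearman_rho(expert_ratings, test_scores)
print(f"agreement between experts and test: rho = {rho:.2f}")
```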
Criterion validity measures the extent to which test results correlate with performance at work. It may well be the most important indicator of the effectiveness of a test used in a business environment.
Note that a low criterion validity coefficient may indicate either that the test is of poor quality or that the abilities it measures have little to do with performance in the job in question. To obtain reliable criterion validity estimates, a correlational study is conducted. A company's current employees can serve as the sample; in this case, their test results are compared with their performance at work. This type of criterion validity is called concurrent validity.
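How much trust a criterion validity coefficient deserves depends heavily on the sample size. As a rough sketch, a confidence interval for a correlation can be computed with the Fisher z transformation. The function name and the example values (r = 0.30 in samples of 50 and 500 employees) are hypothetical and are not taken from our studies.

```python
import math
from statistics import NormalDist

def correlation_ci(r, n, confidence=0.95):
    """Approximate confidence interval for a correlation via Fisher's z."""
    if n <= 3:
        raise ValueError("the Fisher approximation needs more than 3 cases")
    z = math.atanh(r)                # transform r to the z scale
    se = 1 / math.sqrt(n - 3)        # standard error on the z scale
    crit = NormalDist().inv_cdf(0.5 + confidence / 2)
    # Build the interval on the z scale and transform back to the r scale.
    return math.tanh(z - crit * se), math.tanh(z + crit * se)

# Hypothetical study: the same coefficient observed in two sample sizes.
low_small, high_small = correlation_ci(0.30, 50)
low_large, high_large = correlation_ci(0.30, 500)
print(f"n = 50:  from {low_small:.2f} to {high_small:.2f}")
print(f"n = 500: from {low_large:.2f} to {high_large:.2f}")
```

The interval for the smaller sample is much wider, which is why a single small study gives only a rough estimate of criterion validity.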
Construct Validity of the GREEN Battery “Interpretation of information”
To test the construct (theoretical) validity, ONTARGET conducted a study comparing the results of the tests in the Interpretation of information battery with the results of tests from Psytech International (a British company) adapted for Russian-speaking respondents. The study was conducted in 2013.
Table 1. Correlation of tests from the battery “Interpretation of information” with the tests of Psytech International
Construct Validity of the SAPPHIRE Battery “Analysis of information”
One of the validity indicators of a test battery is the correlation coefficient between the results of the individual tests in the battery. Even though the results of the verbal and numerical tests from the Analysis of information battery are related to a significant degree (which follows from the assumption of the g factor, the general intelligence factor), the level of correlation between these tests indicates that they measure fundamentally different abilities. The study was conducted in 2015.
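The reasoning here rests on how much variance two tests share. For a correlation r between two tests, the squared coefficient gives the proportion of variance they have in common, and the remainder is not shared. The value of r below is hypothetical and is used only to illustrate the arithmetic.

```python
def shared_variance(r):
    """Proportion of variance two tests have in common, given their correlation."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation must lie between -1 and 1")
    return r ** 2

# Hypothetical correlation between a verbal and a numerical test.
r_verbal_numerical = 0.45

common = shared_variance(r_verbal_numerical)
not_shared = 1 - common
print(f"shared variance: {common:.0%}, not shared: {not_shared:.0%}")
```

A clearly positive correlation is consistent with a common general factor, while a large unshared proportion supports the view that each test also captures a distinct ability.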
Table 2. Internal correlations between ONTARGET battery tests.
Criterion Validity of the RED Battery “Understanding of information”
To validate the tests, a study was conducted on the correlation between the test results and the level of development of behavioral competencies. The test results were benchmarked against the competency ratings obtained from assessment centers and development centers conducted by DeTech (Development Technologies Ltd.). The total sample comprised more than 160 managers at various levels. The study was conducted in 2015.
As a variety of competency models were used at the different centers, all the competencies were combined into several large clusters. The following correlations were obtained (only correlation coefficients that are significant at the 0.05 level are shown):
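Selecting only the coefficients that are significant at the 0.05 level can be sketched as follows. The cluster names and coefficients are hypothetical, the sample size of 160 is taken as a round assumption from the description of the study, and the p-value uses the Fisher z normal approximation rather than the exact t test.

```python
import math
from statistics import NormalDist

def p_value_for_r(r, n):
    """Two-sided p-value for the hypothesis of zero correlation (Fisher z approximation)."""
    z = math.atanh(r) * math.sqrt(n - 3)
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Hypothetical correlations between one test and three competency clusters.
correlations = {"cluster_a": 0.24, "cluster_b": 0.09, "cluster_c": -0.18}
sample_size = 160  # assumption for illustration

significant = {
    name: r
    for name, r in correlations.items()
    if p_value_for_r(r, sample_size) < 0.05
}
for name, r in significant.items():
    print(f"{name}: r = {r:+.2f}")
```

With a sample of this size, even fairly small coefficients pass the threshold, so the magnitude of each coefficient still needs to be interpreted separately from its significance.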
Table 3. Correlation of the “Understanding of information” tests with the competencies of assessment and development centers
The results demonstrate correlations of varying strength between the test results and a number of competencies. They suggest that the numerical test is related to a large extent to the ability to analyze and solve problems, while the verbal test is related to commercial thinking.
Criterion Validity of the SCARLET Battery “Administration”
One validity indicator of a test battery is the correlation coefficient between the results of the individual tests in the battery. Even though the results of the verbal and numerical tests from the Administration battery are related to a significant degree (which follows from the assumption of the g factor), the level of correlation between these tests indicates that they measure fundamentally different abilities. The study was conducted in 2015.
Table 4. Internal correlations between the battery tests (Understanding Instructions and Working with Numerical Information)
Construct Validity of the SCARLET Battery “Administration”
To verify the construct (theoretical) validity, ONTARGET conducted a study comparing the results of the tests in the Administration battery with the results of tests from Psytech International adapted for Russian-speaking respondents. The study was conducted in 2013.
Table 5. Correlation of tests from the battery “Administration” with the tests of Psytech International