Reliability DEEP

In 2018 ONTARGET updated the DEEP questionnaire, and in 2019 it studied the scales to check whether their psychometric parameters meet the criteria of the “Russian Standard for Personnel Testing”. This article describes the results of that study with regard to the consistency reliability of the questionnaire scales.


By definition, a tool is reliable if its results are not affected by random factors. For example, a reliable tool shows similar results when the same individual is tested twice with some time between the two sessions. Reliability can also be described in terms of measurement error: every tool has some measurement error, which depends on a number of factors.

As a rule, these factors fall into three groups:

  • factors of the test environment: noise, lighting, uncomfortable temperature (heat/cold), time frame, time of day;
  • factors related to the respondent: state of health, motivation, mood, composure;
  • factors related to the test: definitions, ambiguity in sentences, correctness of the calculations, length of the test.

Users and test administrators should do all they can to minimize errors related to the first two groups of factors. Tool developers are responsible for errors related to the third group. The most reliable tool is one whose results are not significantly affected by measurement error. This article focuses on the consistency reliability of the questionnaire.

Reliability can also be estimated from a single measurement: in this case, the tool is split into two comparable parts and the respondents' scores on the two parts are compared with each other. When measuring consistency reliability, in other words the extent to which each individual question measures the target criterion of the entire test, the main problem is how to split the questionnaire items into two groups. Modern statistical methods make it possible to measure consistency reliability by dividing the assessment tool into an arbitrary number of parts, up to as many parts as there are items in the questionnaire. The resulting measure of consistency reliability is the alpha coefficient developed by Lee Cronbach:

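\[ \alpha = \frac{n}{n-1}\left(1 - \frac{\sum_{i=1}^{n}\sigma_i^2}{\sigma_j^2}\right) \]

where n is the number of tasks in the test, σi is the standard deviation of task i, and σj is the standard deviation of the whole test (the total score). The standard deviations enter the formula squared, that is, as variances.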
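As an illustration of the alpha coefficient described above, here is a minimal Python sketch. It is not the procedure used in the study (which relied on SPSS); the function name `cronbach_alpha` and the toy response matrix are hypothetical and serve only to show how the coefficient is computed from item-level scores.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for one scale.

    scores: list of respondents, each a list of item scores
    (all respondents must answer the same number of items).
    """
    n_items = len(scores[0])
    if n_items < 2:
        raise ValueError("alpha needs at least two items")
    # Variance of each item across respondents (sigma_i squared).
    item_vars = [variance([row[i] for row in scores]) for i in range(n_items)]
    # Variance of the total score across respondents (sigma_j squared).
    total_var = variance([sum(row) for row in scores])
    return n_items / (n_items - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical toy data: 5 respondents x 4 items of a single scale.
responses = [
    [4, 5, 4, 5],
    [2, 3, 2, 2],
    [3, 3, 4, 3],
    [5, 5, 5, 4],
    [1, 2, 1, 2],
]

alpha = cronbach_alpha(responses)
print(f"alpha = {alpha:.3f}")
```

The sketch uses the sample variance for both the items and the total score. Using the population variance instead would give the same alpha, because the scaling factor cancels in the ratio.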

According to the “Russian Standard for Personnel Testing” developed in 2015, the recommended reliability indicator for personality diagnostic methodologies should be no lower than 0.6.

The sample for this study consisted of 10,506 people.

Composition of the sample by gender:

  • Female – 5,550
  • Male – 4,317
  • Not specified – 639

Composition of the sample by age:

  • Up to 25 – 1,471
  • 26-30 – 1,386
  • 31-35 – 1,723
  • 36-40 – 1,438
  • 40-50 – 1,635
  • Older than 50 – 784
  • Not specified – 2,069
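The two breakdowns above can be cross-checked against the stated sample size and converted to shares with a short Python sketch. The counts are copied from the lists above; the variable names and the helper `shares` are hypothetical.

```python
TOTAL = 10_506  # sample size stated in the article

gender = {"Female": 5_550, "Male": 4_317, "Not specified": 639}
age = {
    "Up to 25": 1_471,
    "26-30": 1_386,
    "31-35": 1_723,
    "36-40": 1_438,
    "40-50": 1_635,
    "Older than 50": 784,
    "Not specified": 2_069,
}


def shares(groups, total):
    """Return each group's share of the sample as a fraction of total."""
    if sum(groups.values()) != total:
        raise ValueError("group counts do not add up to the sample size")
    return {label: count / total for label, count in groups.items()}


for title, groups in (("Gender", gender), ("Age", age)):
    print(title)
    for label, share in shares(groups, TOTAL).items():
        print(f"  {label}: {share:.1%}")
```

Both breakdowns add up to 10,506, so every respondent is counted exactly once in each.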

Reliability was calculated using IBM SPSS Statistics 23.

Below we present the results of the consistency reliability for each scale of the DEEP personality questionnaire.

Table №1. Cronbach Alpha for DEEP Questionnaire Scales, 2019

According to the table above, we can conclude that the scales of the DEEP questionnaire have a sufficiently high consistency reliability, which suggests that items within each scale are aimed at measuring the same trait.