Need Simulated IRT Datasets for Your Research? Try Our Tool!

Dataset Generator

This script generates simulated datasets for Item Response Theory (IRT) analysis. IRT is a framework used in psychometrics for designing, analyzing, and scoring tests, questionnaires, and similar instruments measuring abilities, attitudes, or other variables (Embretson & Reise, 2000). The script specifically utilizes the 2-Parameter Logistic (2PL) model of IRT.

IRT and the 2PL Model

Item Response Theory (IRT) represents a significant advancement in psychometrics, focusing on the relationship between individual abilities and test item characteristics. IRT posits that the probability of a correct response to a test item is a function of the person's ability relative to the item's properties.

The 2-Parameter Logistic (2PL) model, developed by Birnbaum (1968), is a foundational model in IRT. It extends the Rasch model (Rasch, 1960), also known as the 1-Parameter Logistic (1PL) model, by introducing an item discrimination parameter, which captures how effectively an item differentiates between individuals with varying levels of the trait being assessed. In the 2PL model, the probability of a correct response therefore depends both on the difference between a person's ability and the item's difficulty and on how sharply the item discriminates between ability levels.

Mathematically, the 2PL model is expressed as:

\[ P(\text{correct}) = \frac{1}{1 + e^{-a_i(\theta - b_i)}} \]

where \( P(\text{correct}) \) is the probability of a correct response, \( \theta \) represents the individual's ability, \( a_i \) is the item's discrimination parameter, and \( b_i \) is the item's difficulty parameter.

In this generator, the 2PL model is operationalized to simulate datasets with varying item characteristics. Each item in the dataset is assigned a difficulty level and a discrimination parameter. This reflects real-world testing scenarios where some items are more informative than others in differentiating among individuals with different abilities. Such an approach facilitates a nuanced simulation of test data, providing a valuable tool for IRT-based research and analysis.
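The 2PL response probability above can be sketched in a few lines of Python; the function name is illustrative, not part of the generator itself:

```python
import math

def p_correct(theta, a, b):
    """Probability of a correct response under the 2PL model
    (logistic metric, no scaling constant)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

# When ability equals difficulty, the probability is exactly 0.5.
p_correct(0.0, 1.0, 0.0)  # 0.5
```

Note how the discrimination parameter steepens the curve: for an examinee above the item's difficulty, a larger `a` yields a higher probability of success.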

Mathematical Formulas

The generator is based on several key mathematical formulas:

1. Item Difficulty Calculation:

\[ b_i = \mu_{b} + (R - 0.5) \times 2 \times \sigma_{b} \]

where \( b_i \) is the difficulty of item \( i \), \( \mu_{b} \) is the mean difficulty, \( R \) is a draw from a uniform distribution on \( [0, 1] \), and \( \sigma_{b} \) is the standard deviation of difficulty. This places \( b_i \) uniformly on the interval \( [\mu_{b} - \sigma_{b}, \mu_{b} + \sigma_{b}] \).
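A minimal Python sketch of formula 1, assuming \( R \sim \text{Uniform}(0, 1) \) (the function name is illustrative):

```python
import random

def sample_difficulty(mu_b, sigma_b, rng=random):
    """Draw b_i = mu_b + (R - 0.5) * 2 * sigma_b with R ~ Uniform(0, 1),
    i.e. uniform on [mu_b - sigma_b, mu_b + sigma_b]."""
    r = rng.random()
    return mu_b + (r - 0.5) * 2.0 * sigma_b
```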

2. Discrimination Sampling from Log-Normal Distribution:

\[ a_i = \exp\left(\ln(\mu_{a}) - 0.5 \times \ln\left(1 + \frac{\sigma_{a}^2}{\mu_{a}^2}\right) + Z \times \sqrt{\ln\left(1 + \frac{\sigma_{a}^2}{\mu_{a}^2}\right)}\right) \]

where \( a_i \) is the discrimination of item \( i \), \( \mu_{a} \) is the mean discrimination, \( \sigma_{a} \) is the standard deviation of discrimination, and \( Z \) is a sample from a standard normal distribution. This parameterization ensures that the sampled discriminations follow a log-normal distribution with arithmetic mean \( \mu_{a} \) and standard deviation \( \sigma_{a} \).
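Formula 2 translates directly into Python; this sketch computes the log-scale mean and variance from \( \mu_{a} \) and \( \sigma_{a} \) exactly as in the expression above (the function name is illustrative):

```python
import math
import random

def sample_discrimination(mu_a, sigma_a, rng=random):
    """Draw a_i from a log-normal distribution whose arithmetic mean
    is mu_a and whose standard deviation is sigma_a."""
    v = math.log(1.0 + (sigma_a ** 2) / (mu_a ** 2))  # log-scale variance
    mu_log = math.log(mu_a) - 0.5 * v                 # log-scale mean
    z = rng.gauss(0.0, 1.0)                           # Z ~ N(0, 1)
    return math.exp(mu_log + z * math.sqrt(v))
```

Sampling on the log scale guarantees positive discriminations, which is why log-normal (rather than normal) sampling is conventional for this parameter.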

3. Probability of Correct Response (IRT Model):

\[ P_i(\theta) = \frac{1}{1 + \exp(-1.702 \times a_i \times (\theta - b_i))} \]

where \( P_i(\theta) \) is the probability of a correct response to item \( i \) by a subject with trait level \( \theta \), \( a_i \) is the item discrimination, and \( b_i \) is the item difficulty. The constant 1.702 is the conventional scaling factor that brings the logistic function into close agreement with the normal-ogive model; it is the only difference between this expression and the unscaled 2PL formula given earlier.
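Formula 3 as a Python sketch, including the 1.702 scaling constant (the function name is illustrative):

```python
import math

D = 1.702  # scaling constant aligning the logistic with the normal ogive

def p_correct_scaled(theta, a, b):
    """2PL probability of a correct response on the normal-ogive metric."""
    return 1.0 / (1.0 + math.exp(-D * a * (theta - b)))
```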

4. Subject Trait Level (Theta) Generation:

\[ \theta = Z \times \sigma_{\theta} + \mu_{\theta} \]

where \( \theta \) is the trait level of a subject, \( Z \) is a sample from a standard normal distribution, \( \sigma_{\theta} \) is the standard deviation, and \( \mu_{\theta} \) is the mean for the trait level.
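Formula 4 is the standard location-scale transform of a standard normal draw; a one-line Python sketch (the function name is illustrative):

```python
import random

def sample_theta(mu_theta, sigma_theta, rng=random):
    """Draw theta = Z * sigma_theta + mu_theta with Z ~ N(0, 1)."""
    return rng.gauss(0.0, 1.0) * sigma_theta + mu_theta
```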

5. Score Generation for Polytomous Items (k Categories):

\[ \text{score} = \min\left(\sum_{m=1}^{k-1} \mathbf{1}\left[ U < \sum_{n=1}^{m} P_{i,n}(\theta) \right], k-1 \right) \]

where \( \text{score} \) is the item score, \( \mathbf{1} \) is the indicator function, \( U \) is a random uniform variable, \( P_{i,n}(\theta) \) is the probability of scoring in category \( n \) for item \( i \) given \( \theta \), and \( k \) is the number of categories.
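A literal Python sketch of formula 5, assuming `category_probs` holds \( P_{i,1}(\theta), \ldots, P_{i,k}(\theta) \) in category order and sums to 1 (names are illustrative):

```python
def polytomous_score(category_probs, u):
    """Score = min(sum over m of 1[u < cumulative prob up to m], k - 1),
    exactly as in formula 5."""
    k = len(category_probs)
    cum = 0.0
    score = 0
    for m in range(k - 1):
        cum += category_probs[m]   # cumulative probability through category m
        if u < cum:                # indicator 1[U < sum_{n=1}^{m} P_{i,n}]
            score += 1
    return min(score, k - 1)       # clamp to the top category, per the formula
```

Each cumulative threshold the uniform draw falls below contributes one to the score, so the result always lies in \( \{0, \ldots, k-1\} \).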

Note: The generator also includes methods for generating skewed trait distributions; missing-data handling is planned for future versions.


References

Baker, F. B. (2001). The Basics of Item Response Theory. ERIC Clearinghouse on Assessment and Evaluation.

Birnbaum, A. (1968). Some Latent Trait Models and Their Use in Inferring an Examinee's Ability. In F. M. Lord & M. R. Novick, Statistical Theories of Mental Test Scores. Addison-Wesley.

Embretson, S. E., & Reise, S. P. (2000). Item Response Theory for Psychologists. Lawrence Erlbaum Associates.

Rasch, G. (1960). Studies in Mathematical Psychology: I. Probabilistic Models for Some Intelligence and Attainment Tests. Nielsen & Lydiche.

Publication: 2023