This study investigates the effects of sample size and test length on item parameter estimation in test development, using three unidimensional dichotomous item response theory (IRT) models. For this purpose, a real language test comprising 50 items was administered to 6,288 students. Data from this test were used to construct data sets of three test lengths (10, 20, and 30 items) and nine sample sizes (150, 250, 350, 500, 750, 1,000, 2,000, 3,000, and 5,000 examinees). These data sets defined the research conditions, in which test length, sample size, and IRT model were manipulated to examine the accuracy of item parameter estimation. The results suggest that neither sample size nor test length alone determines estimation accuracy; what matters is the combination of the two. Depending on test length and the model employed, samples of 150, 250, 350, 500, or 750 examinees can be sufficient to estimate item parameters accurately in the three unidimensional dichotomous IRT models.
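
The design described above can be illustrated with a minimal sketch. If the three factors are fully crossed, three test lengths, nine sample sizes, and three models give 3 × 9 × 3 = 81 conditions. The model labels (1PL, 2PL, 3PL), the function names, and the use of simple random sampling without replacement are assumptions made for illustration only; the abstract does not state which three models were fitted or how examinees and items were selected for each data set.

```python
import itertools
import random

# Factor levels taken from the abstract.
TEST_LENGTHS = (10, 20, 30)
SAMPLE_SIZES = (150, 250, 350, 500, 750, 1000, 2000, 3000, 5000)

# Assumption: the abstract says "three unidimensional dichotomous models"
# without naming them; these labels are hypothetical placeholders.
MODELS = ("1PL", "2PL", "3PL")

# Size of the full data set reported in the abstract.
N_EXAMINEES = 6288
N_ITEMS = 50


def build_conditions():
    """Return every (test_length, sample_size, model) combination."""
    return list(itertools.product(TEST_LENGTHS, SAMPLE_SIZES, MODELS))


def draw_subset(n_examinees, n_items, sample_size, test_length, seed=0):
    """Pick row and column indices for one data set.

    Assumption: simple random sampling without replacement; the actual
    selection procedure is not described in the abstract.
    """
    rng = random.Random(seed)
    examinee_ids = sorted(rng.sample(range(n_examinees), sample_size))
    item_ids = sorted(rng.sample(range(n_items), test_length))
    return examinee_ids, item_ids


conditions = build_conditions()
print(len(conditions))  # 81

# Indices for the smallest condition: 150 examinees, 10 items.
examinees, items = draw_subset(N_EXAMINEES, N_ITEMS, 150, 10)
print(len(examinees), len(items))  # 150 10
```

The returned indices would then be used to slice the full 6,288 × 50 response matrix before fitting each model; the fitting step itself is omitted because the estimation software is not named in the abstract.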