Translating and adapting tests and questionnaires across languages is a common strategy for comparing people who operate in different languages with respect to achievement, attitude, personality, or other psychological constructs. Unfortunately, when tests and questionnaires are translated from one language to another, there is no guarantee that the different language versions are equivalent. In this study, we present and evaluate a methodology for investigating the equivalence of translated and adapted items using bilingual test takers. The methodology involves applying item response theory models to data obtained from randomly equivalent groups of bilingual respondents. The technique was applied to an English-Turkish version of a course evaluation form. The results indicate that the methodology is effective for flagging items that function differentially across languages as well as for informing the test development and test adaptation processes. The utility and limitations of the procedure for evaluating translation equivalence are discussed.
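To illustrate the general idea (not the study's actual IRT models), the sketch below uses a crude Rasch-style difficulty index, the logit of the proportion-incorrect per item, estimated separately in each language group, and flags items whose difficulties diverge by more than a chosen logit threshold. The data, the 0.5-logit threshold, and the function names are all hypothetical; a real analysis would fit full IRT models and test parameter differences statistically.

```python
import math

def item_difficulties(responses):
    """Crude Rasch-style difficulty per item: logit of the
    proportion-incorrect (higher = harder). `responses` is a list of
    0/1 response vectors, one per respondent."""
    n_items = len(responses[0])
    n = len(responses)
    diffs = []
    for j in range(n_items):
        p = sum(r[j] for r in responses) / n   # proportion correct
        p = min(max(p, 1e-3), 1 - 1e-3)        # guard the logit
        diffs.append(math.log((1 - p) / p))
    return diffs

def flag_dif_items(group_a, group_b, threshold=0.5):
    """Flag items whose difficulty estimates differ across two randomly
    equivalent language groups by more than `threshold` logits."""
    da = item_difficulties(group_a)
    db = item_difficulties(group_b)
    return [j for j, (a, b) in enumerate(zip(da, db)) if abs(a - b) > threshold]

def simulate(probs, n=100):
    """Deterministic toy data: item j is answered correctly by the
    first round(p*n) respondents, so proportions are exact."""
    return [[1 if i < round(p * n) else 0 for p in probs] for i in range(n)]

# Hypothetical example: item 2 is much harder in the second language version.
english_group = simulate([0.8, 0.6, 0.7, 0.5])
turkish_group = simulate([0.8, 0.6, 0.3, 0.5])
print(flag_dif_items(english_group, turkish_group))  # → [2]
```

Because the two bilingual groups are randomly equivalent, a difficulty gap on an item points at the translation rather than at the respondents; that is the logic the full IRT analysis formalizes.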