Reference : Using Automatic Item Generation in the context of the Épreuves Standardisées (Épstan)...
Scientific congresses, symposiums and conference proceedings : Unpublished conference
Social & behavioral sciences, psychology : Multidisciplinary, general & others
Educational Sciences
http://hdl.handle.net/10993/48588
Using Automatic Item Generation in the context of the Épreuves Standardisées (Épstan): A pilot study on effects of altering item characteristics and semantic embeddings
English
Michels, Michael Andreas [University of Luxembourg > Faculty of Humanities, Education and Social Sciences (FHSE) > LUCET]
Hornung, Caroline [University of Luxembourg > Faculty of Language and Literature, Humanities, Arts and Education (FLSHASE) > Luxembourg Centre for Educational Testing (LUCET)]
Inostroza Fernandez, Pamela Isabel [University of Luxembourg > Faculty of Humanities, Education and Social Sciences (FHSE) > LUCET]
Sonnleitner, Philipp [University of Luxembourg > Faculty of Humanities, Education and Social Sciences (FHSE) > LUCET]
11-Nov-2021
Yes
LuxERA Emerging Researchers’ Conference 2021
10.11. - 11.11.2021
[en] Assessing mathematical skills in national school monitoring programs such as the Luxembourgish Épreuves Standardisées (ÉpStan) creates a constant demand for high-quality items, whose development is both expensive and time-consuming. One approach to providing high-quality items more efficiently is Automatic Item Generation (AIG; Gierl, 2013). Rather than items being written one at a time, cognitive item models form the basis for the algorithmic generation of a large number of new items with supposedly identical item characteristics. The stability of these item characteristics is questionable, however, when different semantic embeddings are used to present the mathematical problems (Dewolf, Van Dooren, & Verschaffel, 2017; Hoogland et al., 2018). Given culture-specific differences in students' knowledge, there is no guarantee that illustrations showing everyday activities affect item difficulty equally for all students (Martin et al., 2012). Moreover, predicting empirical item difficulties from theoretical rationales has proved difficult (Leighton & Gierl, 2011). This paper presents a first attempt to better understand the impact of (a) different semantic embeddings and (b) problem-related variations on mathematics items in grades 1 (n = 2338), 3 (n = 3835) and 5 (n = 3377) within the context of ÉpStan. In total, 30 mathematical problems were presented in up to 4 different versions, either using different but equally plausible semantic contexts or altering the problem's content characteristics. Preliminary results of IRT scaling and DIF analysis reveal substantial effects of both the embedding and the problem characteristics on overall item difficulty as well as at the subgroup level. Further results and implications for developing mathematics items, and specifically for using AIG in the course of ÉpStan, will be discussed.
Faculty of Language and Literature, Humanities, Arts and Education (FLSHASE) > Luxembourg Centre for Educational Testing (LUCET)
Fonds National de la Recherche - FnR
FnR ; FNR13650128 > Philipp Sonnleitner > FAIR-ITEMS > Fairness Of Latest Innovations In Item And Test Development In Mathematics > 01/09/2020 > 31/08/2023 > 2019

There is no file associated with this reference.

