Batty, Aaron



Faculty of Nursing and Medical Care (Shonan Fujisawa)


Associate Professor


Research Areas

  • English linguistics

Research Keywords

  • Assessment

  • Item Response Theory (IRT)

  • Language Testing


Papers

  • Predicting L2 reading proficiency with modalities of vocabulary knowledge: A bootstrapping approach

    McLean S., Stewart J., Batty A.

Language Testing   2020

    ISSN  02655322


© The Author(s) 2020. Vocabulary’s relationship to reading proficiency is frequently cited as a justification for the assessment of L2 written receptive vocabulary knowledge. However, to date, there has been relatively little research regarding which modalities of vocabulary knowledge have the strongest correlations with reading proficiency, and observed differences have often been statistically non-significant. The present research employs a bootstrapping approach to reach a clearer understanding of the relationships between various modalities of vocabulary knowledge and reading proficiency. Test-takers (N = 103) answered 1000 vocabulary test items spanning the third 1000 most frequent English words in the New General Service List corpus (Browne, Culligan, & Phillips, 2013). Items were answered under four modalities: Yes/No checklists, form recall, meaning recall, and meaning recognition. These pools of test items were then sampled with replacement to create 1000 simulated tests ranging in length from five to 200 items, and the results were correlated with Test of English for International Communication (TOEIC®) Reading scores. For all examined test lengths, meaning-recall vocabulary tests had the highest average correlations with reading proficiency, followed by form-recall vocabulary tests. The results indicated that tests of vocabulary recall are stronger predictors of reading proficiency than tests of vocabulary recognition, despite the theoretically closer relationship of vocabulary recognition to reading.

  • Validity evidence for a sentence repetition test of Swiss German Sign Language

    Haug T., Batty A., Venetz M., Notter C., Girard-Groeber S., Knoch U., Audeoud M.

Language Testing   2020

    ISSN  02655322


    © The Author(s) 2020. In this study we seek evidence of validity according to the socio-cognitive framework (Weir, 2005) for a new sentence repetition test (SRT) for young Deaf L1 Swiss German Sign Language (DSGS) users. SRTs have been developed for various purposes for both spoken and sign languages to assess language development in children. In order to address the need for tests to assess the grammatical development of Deaf L1 DSGS users in a school context, we developed an SRT. The test targets young learners aged 6–17 years, and we administered it to 46 Deaf students aged 6.92–17.33 (M = 11.17) years. In addition to the young learner data, we collected data from Deaf adults (N = 14) and from a sub-sample of the children (n = 19) who also took a test of DSGS narrative comprehension, serving as a criterion measure. We analyzed the data with many-facet Rasch modeling, regression analysis, and analysis of covariance. The results show evidence of scoring, criterion, and context validity, suggesting the suitability of the SRT for the intended purpose, and will inform the revision of the test for future use as an instrument to assess the sign language development of Deaf children.

  • Going online: The effect of mode of delivery on performances and perceptions on an English L2 writing test suite

    T Brunfaut, L Harding, AO Batty

    Assessing Writing 36   3 - 18 2018.04

    Research paper (scientific journal), Joint Work, Accepted,  ISSN  10752935


© 2018 The Authors. In response to changing stakeholder needs, large-scale language test providers have increasingly considered the feasibility of delivering paper-based examinations online. Evidence is required, however, to determine whether online delivery of writing tests results in changes to writing performance reflected in differential test scores across delivery modes, and whether test-takers hold favourable perceptions of online delivery. The current study aimed to determine the effect of delivery mode on the two writing tasks (reading-into-writing and extended writing) within the Trinity College London Integrated Skills in English (ISE) test suite across three proficiency levels (CEFR B1-C1). 283 test-takers (107 at ISE I/B1, 109 at ISE II/B2, and 67 at ISE III/C1) completed both writing tasks in paper-based and online mode. Test-takers also completed a questionnaire to gauge perceptions of the impact, usability and fairness of the delivery modes. Many-facet Rasch measurement (MFRM) analysis of scores revealed that delivery mode had no discernible effect, apart from the reading-into-writing task at ISE I, where the paper-based mode was slightly easier. Test-takers generally held more positive perceptions of the online delivery mode, although technical problems were reported. Findings are discussed with reference to the need for further research into interactions between delivery mode, task and level.

  • Investigating the impact of nonverbal communication cues on listening item types

    A Batty

    Language Learning and Language Teaching (John Benjamins)  50   161 - 175 2018

(MISC) Research paper, Single Work,  ISSN  15699471

  • The impact of visual cues on item response in video-mediated tests of foreign language listening comprehension

    A Batty

    Lancaster University  2017

    Doctoral Thesis, Single Work


The present thesis employed a mixed-methods research design spanning two studies and an intermediate instrument-development step to investigate the interactions of the presence of nonverbal and other visual cues with examinees, individual items, and item task types in video listening tests. The first study employed eye-tracking methodology to determine the specific visual cues to which examinees attend when interacting with a video-mediated listening test. The findings of Study I then informed the development of a new video-mediated listening test for Study II, which investigated the effect of the presence of visual cues on item and item-task difficulty via many-facet Rasch modeling and qualitative item analysis. Individual examinee differences were also explored in both studies with respect to gender, proficiency, and perceptions of the two formats.

Study I found that examinees view facial cues an average of 81% of the time regardless of task type, but spend significantly more time oriented toward the listener (i.e., the character who is not speaking) when presented with an implicit item. Direct viewing of gestures, despite their prominent place in the nonverbal communication literature, only accounted for approximately 1.35% of the total video time. Study II found that the presence of visual information exerted a facilitative effect on all items, but that it was significantly more pronounced with implicit items, despite the fact that they were more difficult than the explicit items in the audio format. Finally, no substantive interactions of proficiency, gender, or perception of the formats with either viewing behavior or the facilitative effect of video were observed.

The thesis ultimately raises questions about the usefulness of video listening tests for listening comprehension assessment, as the effect appears to do little more than raise scores and, in the case of implicit items, may obviate the need to comprehend the linguistic input in order to make correct inferences.

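The bootstrapping approach described in the first paper above (sampling item pools with replacement to build simulated tests of varying length, then correlating simulated test scores with TOEIC Reading scores) can be sketched as follows. The data, item counts, and all variable names here are illustrative assumptions, not the authors' actual code or dataset.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative stand-ins for the real data: 103 test-takers, a pool of
# scored items for one modality (1 = correct, 0 = incorrect), and a
# synthetic reading-proficiency criterion loosely tied to vocabulary size.
n_takers, pool_size = 103, 250
responses = rng.integers(0, 2, size=(n_takers, pool_size))
reading_scores = responses.sum(axis=1) * 1.5 + rng.normal(0, 10, size=n_takers)

def bootstrap_correlations(responses, criterion, test_length, n_tests=1000, rng=rng):
    """Draw `test_length` items with replacement `n_tests` times and
    correlate each simulated test's total score with the criterion."""
    corrs = np.empty(n_tests)
    for t in range(n_tests):
        # Sampling with replacement from the item pool
        items = rng.integers(0, responses.shape[1], size=test_length)
        totals = responses[:, items].sum(axis=1)
        corrs[t] = np.corrcoef(totals, criterion)[0, 1]
    return corrs

# One simulated test length; the study swept lengths from 5 to 200 items
corrs = bootstrap_correlations(responses, reading_scores, test_length=50)
print(f"mean r over {corrs.size} simulated tests: {corrs.mean():.2f}")
```

Repeating this per modality (Yes/No, form recall, meaning recall, meaning recognition) and per test length yields the distributions of correlations the paper compares.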

Papers, etc., Registered in KOARA

Research Projects of Competitive Funds, etc.

  • An Objective Test of Communicative English Proficiency


Keio University, BATTY Aaron Olaf, STEWART Jeffrey, Grant-in-Aid for Challenging Exploratory Research


    The researchers developed a new test of communicative speaking proficiency, called the Objective Communicative Speaking Test (OCST). The OCST is a timed information-gap task-based test delivered via tablet computers. The OCST measures the time required for a speaker to relate a piece of information unknown to the rater, on the assumption that more traditional components of oral proficiency will contribute to time to completion. The test was administered to a sample of 86 first- (L1s) and second-language (L2s) speakers of English, and their task completion times were assigned an L1-referenced score. The data were analyzed via many-facet Rasch analysis. As hypothesized, the objective design of the test reduced rater effects, and raters could be excluded from the model. An examinee reliability coefficient of 0.88 was observed, surpassing that of most subjective tests of speaking proficiency.
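The many-facet Rasch analysis mentioned throughout this profile can be summarized by the standard facets formulation below. This is the general textbook model, not an equation taken from any of the summaries; the facet labels are the conventional ones.

```latex
% Probability of examinee n receiving category k (vs. k-1) on item i from rater j:
\log\!\left(\frac{P_{nijk}}{P_{nij(k-1)}}\right) = B_n - D_i - C_j - F_k
% B_n: examinee ability   D_i: item difficulty
% C_j: rater severity     F_k: category threshold
```

In the OCST study above, the objective timing-based scoring meant the rater facet ($C_j$) contributed little and raters could be excluded from the model, as noted in the summary.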


Courses Taught