JOURNAL ARTICLES: A TO K

Al-Hamly, M., & Coombe, C. (2005). To change or not to change: Investigating the value of MCQ answer changing for Gulf Arab students. Language Testing, 22(4), 509–531.

Akbarian, I. (2012). Book review: Measuring second language vocabulary acquisition. Language Testing, 29(4), 597–601.

Allalouf, A., & Abramzon, A. (2008). Constructing better second language assessments based on differential item functioning analysis. Language Assessment Quarterly, 5(2), 120–141.

Al-Sadan, I. A. (2000). Educational assessment in Saudi Arabian schools. Assessment in Education: Principles, Policy & Practice, 7(1), 143–155.

Aryadoust, V., Goh, C. C. M., & Kim, L. O. (2011). An investigation of differential item functioning in the MELAB Listening Test. Language Assessment Quarterly, 8(4), 361–385.

Aydin, S. (2010). EFL writers’ perceptions of portfolio keeping. Assessing Writing, 15(3), 194–203.

Bae, J. (2007). Development of English skills need not suffer as a result of immersion: Grades 1 and 2 writing assessment in a Korean/English two-way immersion program. Language Learning, 57(2), 299–332.

Bae, J., & Lee, Y. S. (2011). The validation of parallel test forms: ‘Mountain’ and ‘beach’ picture series for assessment of language skills. Language Testing, 28(2), 155–177.

Bae, J., & Lee, Y. S. (2012). Evaluating the development of children’s writing ability in an EFL context. Language Assessment Quarterly, 9(4), 348–374.

Barkaoui, K. (2014). Examining the impact of L2 proficiency and keyboarding skills on scores on TOEFL-iBT writing tasks. Language Testing, 31(2), 241–259.

Bax, S. (2013). The cognitive processing of candidates during reading tests: Evidence from eye-tracking. Language Testing, 30(4), 441–465.

Beglar, D. (2010). A Rasch-based validation of the Vocabulary Size Test. Language Testing, 27(1), 101–118.

Berry, R. (2011). Assessment trends in Hong Kong: Seeking to establish formative assessment in an examination culture. Assessment in Education: Principles, Policy & Practice, 18(2), 199–211.

Bethell, G., & Harutyunyan, K. (2008). Assessment and examinations in Armenia. Assessment in Education: Principles, Policy & Practice, 15(1), 107–119.

Bethell, G., & Zabulionis, A. (2011). The evolution of high-stakes testing at the school–university interface in the former republics of the USSR. Assessment in Education: Principles, Policy & Practice, 19(1), 7–25.

Birenbaum, M. (2002). Assessing self-directed active learning in primary schools. Assessment in Education: Principles, Policy & Practice, 9(1), 119–138.

Birenbaum, M., Tatsuoka, C., & Xin, T. (2005). Large-scale diagnostic assessment: Comparison of eighth graders’ mathematics performance in the United States, Singapore and Israel. Assessment in Education: Principles, Policy & Practice, 12(2), 167–181.

Bonk, W. J., & Ockey, G. J. (2003). A many-facet Rasch analysis of the second language group oral discussion task. Language Testing, 20(1), 89–110.

Brown, A. (2005). Self-assessment of writing in independent language learning programs: The value of annotated samples. Assessing Writing, 10(2), 174–191.

Brown, J. D., & Bailey, K. M. (2008). Language testing courses: What are they in 2007? Language Testing, 25(3), 349–383.

Brown, G. T. L., Kennedy, K. J., Fok, P. K., Chan, J. K. S., & Yu, W. M. (2009). Assessment for student improvement: Understanding Hong Kong teachers’ conceptions and practices of assessment. Assessment in Education: Principles, Policy & Practice, 16(3), 347–363.

Butler, Y. G. (2009). How do teachers observe and evaluate elementary school students’ foreign language performance? A case study from South Korea. TESOL Quarterly, 43(3), 417–444.

Butler, Y. G., & Lee, J. (2006). On-task versus off-task self-assessments among Korean elementary school students studying English. The Modern Language Journal, 90(4), 506–518.

Butler, Y. G., & Lee, J. (2010). The effects of self-assessment among young learners of English. Language Testing, 27(1), 5–31.

Butler, Y. G., & Zeng, W. (2014). Young foreign language learners’ interactions during task-based paired assessments. Language Assessment Quarterly, 11(1), 45–75.

Cai, H. (2013). Partial dictation as a measure of EFL listening proficiency: Evidence from confirmatory factor analysis. Language Testing, 30(2), 177–199.

Carless, D. (2005). Prospects for the implementation of assessment for learning. Assessment in Education: Principles, Policy & Practice, 12(1), 39–54.

Carless, D. (2007). Conceptualizing pre-emptive formative assessment. Assessment in Education: Principles, Policy & Practice, 14(2), 171–184.

Carey, M. D., Mannell, R. H., & Dunn, P. K. (2011). Does a rater’s familiarity with a candidate’s pronunciation affect the rating in oral proficiency interviews? Language Testing, 28(2), 201–219.

Carr, N. T. (2006). The factor structure of test task characteristics and examinee performance. Language Testing, 23(3), 269–289.

Chae, S. (2003). Adaptation of a picture-type creativity test for pre-school children. Language Testing, 20(2), 179–188.

Chang, Y. F. (2006). On the use of the immediate recall task as a measure of second language reading comprehension. Language Testing, 23(4), 520–543.

Chapelle, C. A., Jamieson, J., & Hegelheimer, V. (2003). Validation of a web-based ESL test. Language Testing, 20(4), 409–439.

Chapelle, C. A., Chung, Y. R., Hegelheimer, V., Pendar, N., & Xu, J. (2010). Towards a computer-delivered test of productive grammatical ability. Language Testing, 27(4), 443–469.

Chen, F., & Chalhoub-Deville, M. (2014). Principles of quantile regression and an application. Language Testing, 31(1), 63–87.

Chen, J. (2011). Language assessment: Its development and future—an interview with Lyle F. Bachman. Language Assessment Quarterly, 8(3), 277–290.

Chen, L. (2004). On text structure, language proficiency, and reading comprehension test format interactions: A reply to Kobayashi, 2002. Language Testing, 21(2), 228–234.

Chen, N. (2006). A unique academic reading test. [Review of the book An Empirical Investigation of the Componentiality of L2 Reading in English for Academic Purposes by Weir, C. J. and Yan, J.] Language Assessment Quarterly, 3(1), 81–86.

Chen, N. (2010). Review of the book Building a Validity Argument for the Test of English as a Foreign Language by Chapelle, C. A., Enright, M. K. and Jamieson, J. M. Language Assessment Quarterly, 7(4), 377–382.

Chen, Q., May, L., Klenowski, V., & Kettle, M. (2014). The enactment of formative assessment in English language classrooms in two Chinese universities: Teacher and student responses. Assessment in Education: Principles, Policy & Practice, 21(3), 271–285.

Chen, Y. H. (2012). Cognitive diagnosis of mathematics performance between rural and urban students in Taiwan. Assessment in Education: Principles, Policy & Practice, 19(2), 193–209.

Cheng, L. (2008). The key to success: English language testing in China. Language Testing, 25(1), 15–37.

Cheng, L., Andrews, S., & Yu, Y. (2011). Impact and consequences of school-based assessment (SBA): Students’ and parents’ views of SBA in Hong Kong. Language Testing, 28(3), 221–249.

Cheng, L., & Qi, L. (2006). Description and examination of the National Matriculation English test. Language Assessment Quarterly, 3(1), 53–70.

Cheng, L., Rogers, T., & Hu, H. (2004). ESL/EFL instructors’ classroom assessment practices: Purposes, methods, and procedures. Language Testing, 21(3), 360–389.

Cheng, L., & Wang, X. (2007). Grading, feedback, and reporting in ESL/EFL classroom. Language Assessment Quarterly, 4(1), 85–107.

Cheng, W., & Warren, M. (2005). Peer assessment of language proficiency. Language Testing, 22(1), 93–121.

Chi, Y. (2013). Review of the book Developing, Using, and Analyzing Rubrics in Language Assessment With Case Studies in Asian and Pacific Languages by Brown, J. D. Language Assessment Quarterly, 10(2), 236–239.

Chik, A., & Besser, S. (2011). International language test taking among young learners: A Hong Kong case study. Language Assessment Quarterly, 8(1), 73–91.

Cho, Y. (2003). Assessing writing: Are we bound by only one method? Assessing Writing, 8(3), 165–191.

Choi, I. C. (2008). The impact of EFL testing on EFL education in Korea. Language Testing, 25(1), 39–62.

Choi, I. C., Kim, K. S., & Boo, J. (2003). Comparability of a paper-based language test and a computer-based language test. Language Testing, 20(3), 295–320.

Cohen, A. D., & Upton, T. A. (2007). ‘I want to go back to the text’: Response strategies on the reading subtest of the new TOEFL. Language Testing, 24(2), 209–250.

Coniam, D. (2005). The impact of wearing a face mask in a high-stakes oral examination: An exploratory post-SARS study in Hong Kong. Language Assessment Quarterly, 2(4), 235–261.

Coniam, D., & Falvey, P. (2013). Ten years on: The Hong Kong Language Proficiency Assessment for Teachers of English (LPATE). Language Testing, 30(1), 147–155.

Coombe, C., & Davidson, P. (2014). Common Educational Proficiency Assessment (CEPA) in English. Language Testing, 31(2), 269–276.

Crossley, S. A., Salsbury, T., & McNamara, D. S. (2012). Predicting the proficiency level of language learners using lexical indices. Language Testing, 29(2), 243–263.

Crusan, D. (2002). An assessment of ESL writing placement assessment. Assessing Writing, 8(1), 17–30.

Cumming, A. (2001). ESL/EFL instructors’ practices for writing assessment: Specific purposes or general purposes? Language Testing, 18(2), 207–224.

Currie, M., & Chiramanee, T. (2010). The effect of the multiple-choice item format on the measurement of knowledge of language structure. Language Testing, 27(4), 471–491.

Dastjerdi, H. V., & Talebinezhad, M. R. (2006). Chain-preserving deletion procedure in cloze: A discoursal perspective. Language Testing, 23(1), 58–72.

Dastjerdi, H. V., & Talebinezhad, M. R. (2006). A reply to Ruixia Yan’s critique on CPD procedure in cloze. Language Testing, 23(3), 408–409.

Davies, A. (2001). The logic of testing Languages for Specific Purposes. Language Testing, 18(2), 133–147.

Davis, L. (2009). The influence of interlocutor proficiency in a paired oral assessment. Language Testing, 26(3), 367–396.

Davison, C. (2004). The contradictory culture of teacher-based assessment: ESL teacher assessment practices in Australian and Hong Kong secondary schools. Language Testing, 21(3), 305–334.

Davison, C. (2007). Views from the chalkface: English language school-based assessment in Hong Kong. Language Assessment Quarterly, 4(1), 37–68.

Davison, C., Leung, C., Hill, K., & Sabet, M. (2009). Dynamic speaking assessments. TESOL Quarterly, 43(3), 537–545.

Diab, N. M. (2011). Assessing the relationship between different types of student feedback and the quality of revised writing. Assessing Writing, 16(4), 274–292.

Dudley, A. (2006). Multiple dichotomous-scored items in second language testing: Investigating the multiple true-false item type under norm-referenced conditions. Language Testing, 23(2), 198–228.

Deng, C., & Carless, D. R. (2010). Examination preparation or effective teaching: Conflicting priorities in the implementation of a pedagogic innovation. Language Assessment Quarterly, 7(4), 285–302.

Elgort, I. (2013). Effects of L1 definitions and cognate status of test items on the Vocabulary Size Test. Language Testing, 30(2), 253–272.

Erling, E. J., & Richardson, J. T. E. (2010). Measuring the academic skills of university students: Evaluation of a diagnostic procedure. Assessing Writing, 15(3), 177–193.

Esfandiari, R., & Myford, C. M. (2013). Severity differences among self-assessors, peer-assessors, and teacher assessors rating EFL essays. Assessing Writing, 18(2), 111–131.

Ezer, H., & Sivan, T. (2005). “Good” academic writing in Hebrew: The perceptions of pre-service teachers and their instructors. Assessing Writing, 10(2), 117–133.

Farhady, H. (2005). Language assessment: A linguametric perspective. Language Assessment Quarterly, 2(2), 147–164.

Filipi, A. (2012). Do questions written in the target language make foreign language listening comprehension tests more difficult? Language Testing, 29(4), 511–532.

Finch, A. (2010). Review of the book Unlocking Assessment: Understanding for Reflection and Application by Swaffield, S. Assessment in Education: Principles, Policy & Practice, 17(1), 110–111.

Fitzpatrick, T., & Clenton, J. (2010). The challenge of validation: Assessing the performance of a test of productive vocabulary. Language Testing, 27(4), 537–554.

Fox, J. (2005). Rethinking second language admission requirements: Problems with language-residency criteria and the need for language assessment and support. Language Assessment Quarterly, 2(2), 85–115.

Fritz, E., & Ruegg, R. (2013). Rater sensitivity to lexical accuracy, sophistication and range when assessing writing. Assessing Writing, 18(2), 173–181.

Galaczi, E. D. (2008). Peer–peer interaction in a speaking test: The case of the First Certificate in English examination. Language Assessment Quarterly, 5(2), 89–119.

Gan, Z., Davison, C., & Hamp-Lyons, L. (2009). Topic negotiation in peer group oral assessment situations: A conversation analytic approach. Applied Linguistics, 30(3), 315–334.

Gan, Z. (2010). Interaction in group oral assessment: A case study of higher- and lower-scoring students. Language Testing, 27(4), 585–602.

Gan, Z. (2012). Complexity measures, task type, and analytic evaluations of speaking proficiency in a school-based assessment context. Language Assessment Quarterly, 9(2), 133–151.

Gebril, A. (2009). Score generalizability of academic writing tasks: Does one test method fit it all? Language Testing, 26(4), 507–531.

Gebril, A. (2010). Bringing reading-to-write and writing-only assessment tasks together: A generalizability analysis. Assessing Writing, 15(2), 100–117.

Gebril, A., & Plakans, L. (2014). Assembling validity evidence for assessing academic writing: Rater reactions to integrated tasks. Assessing Writing, 21, 56–73.

Gennaro, D. D. (2013). How different are they? A comparison of Generation 1.5 and international L2 learners’ writing ability. Assessing Writing, 18(2), 154–172.

Geranpayeh, A., & Kunnan, A. J. (2007). Differential item functioning in terms of age in the Certificate of Academic English exam. Language Assessment Quarterly, 4, 190–222.

Ginther, A., Dimova, S., & Yang, R. (2010). Conceptual and empirical relationships between temporal measures of fluency and oral English proficiency with implications for automated scoring. Language Testing, 27(3), 379–399.

Glenwright, P. (2002). Language proficiency assessment for teachers: The effects of benchmarking on writing assessment in Hong Kong schools. Assessing Writing, 8(2), 84–109.

Green, A. (2005). EAP study recommendations and score gains on the IELTS Academic Writing test. Assessing Writing, 10(1), 44–60.

Green, A. (2007). Washback to learning outcomes: A comparative study of IELTS preparation and university pre-sessional language courses. Assessment in Education: Principles, Policy & Practice, 14(1), 75–97.

Green, A. B., & Weir, C. J. (2004). Can placement tests inform instructional decisions? Language Testing, 21(4), 467–494.

Griffin, P., & Thanh, M. T. (2006). Reading achievements of Vietnamese Grade 5 pupils. Assessment in Education: Principles, Policy & Practice, 13(2), 155–177.

Gu, P. Y. (2014). The unbearable lightness of the curriculum: What drives the assessment practices of a teacher of English as a Foreign Language in a Chinese secondary school? Assessment in Education: Principles, Policy & Practice, 21(3), 286–305.

Gui, M. (2012). Exploring differences between Chinese and American EFL teachers’ evaluations of speech performance. Language Assessment Quarterly, 9(2), 186–203.

Guo, L., Crossley, S. A., & McNamara, D. S. (2013). Predicting human judgments of essay quality in both integrated and independent second language writing samples: A comparison study. Assessing Writing, 18(3), 218–238.

Hamp-Lyons, L. (2002). The scope of writing assessment. Assessing Writing, 8(1), 5–16.

Hamp-Lyons, L. (2004). Writing assessment in the world. Assessing Writing, 9(1), 1–3.

Hamp-Lyons, L. (2007). Worrying about rating. Assessing Writing, 12(1), 1–9.

Hamp-Lyons, L. (2009). Principles for large-scale classroom-based teacher assessment of English learners’ language: An initial framework from school-based assessment in Hong Kong. TESOL Quarterly, 43(3), 524–530.

Hamp-Lyons, L., & Lumley, T. (2001). Assessing language for specific purposes. Language Testing, 18(2), 127–132.

Han, M., & Yang, X. (2001). Educational assessment in China: Lessons from history and future prospects. Assessment in Education: Principles, Policy & Practice, 8(1), 5–10.

Harding, L. (2012). Accent, listening assessment and the potential for a shared-L1 advantage: A DIF perspective. Language Testing, 29(2), 163–180.

Hassan, K. E., & Jammal, R. (2005). Validation and development of norms for the Test for Auditory Comprehension of Language-Revised (TACL-R) in Lebanon. Assessment in Education: Principles, Policy & Practice, 12(2), 183–202.

Hassan, N. H., & Shih, C. M. (2013). The Singapore–Cambridge General Certificate of Education Advanced-Level General Paper examination. Language Assessment Quarterly, 10(4), 444–451.

He, L., & Qi, L. (2010). Gui Shichun: Founding father of language testing in China. Language Assessment Quarterly, 7(3), 359–371.

He, L., & Dai, Y. (2006). A corpus-based investigation into the validity of the CET–SET group discussion. Language Testing, 23(3), 370–401.

He, L., & Shi, L. (2008). ESL students’ perceptions and experiences of standardized English writing tests. Assessing Writing, 13(2), 130–149.

Hettige, S. T. (2000). Economic liberalisation, qualifications and livelihoods in Sri Lanka. Assessment in Education: Principles, Policy & Practice, 7(3), 325–333.

Hirai, A., & Koizumi, R. (2009). Development of a practical speaking test with a positive impact on learning using a story retelling technique. Language Assessment Quarterly, 6(2), 151–167.

Hirai, A., & Koizumi, R. (2013). Validation of empirically derived rating scales for a story retelling speaking test. Language Assessment Quarterly, 10(4), 398–422.

Hirvela, A., & Sweetland, Y. L. (2005). Two case studies of L2 writers’ experiences across learning-directed portfolio contexts. Assessing Writing, 10(2), 192–213.

Hsieh, M. C. (2013). An application of Multifaceted Rasch measurement in the Yes/No Angoff standard setting procedure. Language Testing, 30(4), 491–512.

Hsieh, M. (2013). Comparing Yes/No Angoff and bookmark standard setting methods in the context of English assessment. Language Assessment Quarterly, 10(3), 331–350.

Huang, S. C. (2011). Convergent vs. divergent assessment: Impact on college EFL students’ motivation and self-regulated learning strategies. Language Testing, 28(2), 251–271.

Huang, S. C. (2012). Pushing learners to work through tests and marks: Motivating or demotivating? A case in a Taiwanese university. Language Assessment Quarterly, 9(1), 60–77.

Huang, J., & Foote, C. J. (2010). Grading between the lines: What really impacts professors’ holistic evaluation of ESL graduate student writing? Language Assessment Quarterly, 7(3), 219–233.

Inbar-Lourie, O. (2008). Constructing a language assessment knowledge base: A focus on language assessment courses. Language Testing, 25(3), 385–402.

Inbar-Lourie, O. (2013). Guest editorial to the special issue on language assessment literacy. Language Testing, 30(3), 301–307.

Inbar-Lourie, O., & Donitsa-Schmidt, S. (2009). Exploring classroom assessment practices: The case of teachers of English as a foreign language. Assessment in Education: Principles, Policy & Practice, 16(2), 185–204.

In’nami, Y., & Koizumi, R. (2009). A meta-analysis of test format effects on reading and listening test performance: Focus on multiple-choice and open-ended formats. Language Testing, 26(2), 219–244.

In’nami, Y., & Koizumi, R. (2011). Structural Equation Modeling in language testing and learning research: A review. Language Assessment Quarterly, 8(3), 250–276.

In’nami, Y., & Koizumi, R. (2012). Factor structure of the revised TOEIC test: A multiple-sample analysis. Language Testing, 29(1), 131–152.

Isaacs, T., & Thomson, R. I. (2013). Rater experience, rating scale length, and judgments of L2 pronunciation: Revisiting research conventions. Language Assessment Quarterly, 10(2), 135–159.

Ishihara, N. (2009). Teacher-based assessment for foreign language pragmatics. TESOL Quarterly, 43(3), 445–470.

Iwashita, N. (2006). Syntactic complexity measures and their relation to oral proficiency in Japanese as a foreign language. Language Assessment Quarterly, 3(2), 151–169.

Iwashita, N., McNamara, T., & Elder, C. (2001). Can we predict task difficulty in an oral proficiency test? Exploring the potential of an information-processing approach to task design. Language Learning, 51(3), 401–436.

Jafarpur, A. (2003). Is the test constructor a facet? Language Testing, 20(1), 57–87.

James, C. L. (2008). Electronic scoring of essays: Does topic matter? Assessing Writing, 13(2), 80–92.

Jang, E. E. (2009). Demystifying a Q-matrix for making diagnostic inferences about L2 reading skills. Language Assessment Quarterly, 6(3), 210–238.

Jeong, H. J. (2013). Defining assessment literacy: Is it different for language testers and non-language testers? Language Testing, 30(3), 345–362.

Jeong, H. J., Hashizume, H., Sugiura, M., Sassa, Y., Yokoyama, S., Shiozaki, S., & Kawashima, R. (2011). Testing second language oral proficiency in direct and semidirect settings: A social-cognitive neuroscience perspective. Language Learning, 61(3), 675–699.

Jin, T., & Mak, B. (2013). Distinguishing features in scoring L2 Chinese speaking performance: How do they work? Language Testing, 30(1), 23–47.

Jin, T., Mak, B., & Zhou, P. (2012). Confidence scoring of speaking performance: How does fuzziness become exact? Language Testing, 29(1), 43–65.

Jin, Y. (2010). The place of language testing and assessment in the professional preparation of foreign language teachers in China. Language Testing, 27(4), 555–584.

Jin, Y., & Fan, J. (2011). Test for English Majors (TEM) in China. Language Testing, 28(4), 589–596.

Johnson, D. (2003). Activity theory, mediated action and literacy: Assessing how children make meaning in multiple modes. Assessment in Education: Principles, Policy & Practice, 10(1), 103–129.

Johnson, J. S., & Lim, G. S. (2009). The influence of rater language background on writing performance assessment. Language Testing, 26(4), 485–505.

Kang, O., Rubin, D., & Pickering, L. (2010). Suprasegmental measures of accentedness and judgments of language learner proficiency in oral English. The Modern Language Journal, 94(4), 554–566.

Katzenberger, I., & Meilijson, S. (2014). Hebrew language assessment measure for preschool children: A comparison between typically developing children and children with specific language impairment. Language Testing, 31(1), 19–38.

Ke, C. (2006). A model of formative task-based language assessment for Chinese as a foreign language. Language Assessment Quarterly, 3(2), 207–227.

Kennet-Cohen, T., Turvall, E., & Oren, C. (2014). Detecting bias in selection for higher education: Three different methods. Assessment in Education: Principles, Policy & Practice, 21(2), 193–204.

Keppell, M., & Carless, D. (2006). Learning-oriented assessment: A technology-based case study. Assessment in Education: Principles, Policy & Practice, 13(2), 179–191.

Kim, M. (2001). Detecting DIF across the different language groups in a speaking test. Language Testing, 18(1), 89–114.

Kim, Y. H. (2009). An investigation into native and non-native teachers’ judgments of oral English performance: A mixed methods approach. Language Testing, 26(2), 187–217.

Kim, Y. H. (2011). Diagnosing EAP writing ability using the Reduced Reparameterized Unified Model. Language Testing, 28(4), 509–541.

Klein, J., & Taub, D. (2005). The effect of variations in handwriting and print on evaluation of student essays. Assessing Writing, 10(2), 134–148.

Klenowski, V. (2000). Portfolios: Promoting teaching. Assessment in Education: Principles, Policy & Practice, 7(2), 215–236.

Klenowski, V. (2006). Learning oriented assessment in the Asia Pacific region. Assessment in Education: Principles, Policy & Practice, 13(2), 131–134.

Klenowski, V. (2009). Assessment for learning revisited: An Asia-Pacific perspective. Assessment in Education: Principles, Policy & Practice, 16(3), 263–268.

Knoch, U., Rouhshad, A., & Storch, N. (2014). Does the writing of undergraduate ESL students develop after one year of study in an English-medium university? Assessing Writing, 21, 1–17.

Kobayashi, M. (2002). Method effects on reading comprehension test performance: Text organization and response format. Language Testing, 19(2), 193–220.

Kobayashi, M. (2002). Cloze tests revisited: Exploring item characteristics with special attention to scoring methods. The Modern Language Journal, 86(4), 571–586.

Kobayashi, M. (2004). Investigation of test method effects: Text organization and response format: A response to Chen, 2004. Language Testing, 21(2), 235–244.

Kobayashi, M. (2005). Washback, washback, washback… [Review of the book Washback in Language Testing: Research Contexts and Methods by Cheng, L. and Watanabe, Y. with Curtis, A.] Language Assessment Quarterly, 2(4), 321–325.

Kobayashi, M., & Negishi, M. (2008). An interview with Professor Ohtomo: The founding father of language testing in Japan. Language Assessment Quarterly, 5(3), 244–266.

Kobrin, J. L., Deng, H., & Shaw, E. J. (2011). The association between SAT prompt characteristics, response features, and essay scores. Assessing Writing, 16(3), 154–169.

Koh, K., & Luke, A. (2009). Authentic and conventional assessment in Singapore schools: An empirical study of teacher assignments and student work. Assessment in Education: Principles, Policy & Practice, 16(3), 291–318.

Koizumi, R., Sakai, H., Ido, T., Ota, H., Hayama, M., Sato, M., & Nemoto, A. (2011). Development and validation of a diagnostic grammar test for Japanese learners of English. Language Assessment Quarterly, 8(1), 53–72.

Kokhan, K. (2012). Investigating the possibility of using TOEFL scores for university ESL decision-making: Placement trends and effect of time lag. Language Testing, 29(2), 291–308.

Kondo-Brown, K. (2002). A FACETS analysis of rater bias in measuring Japanese second language writing performance. Language Testing, 19(1), 3–31.

Kondo-Brown, K. (2005). Differences in language skills: Heritage language learner subgroups and foreign language learners. The Modern Language Journal, 89(4), 563–581.

Kozaki, Y. (2004). Using GENOVA and FACETS to set multiple standards on performance assessment for certification in medical translation from Japanese into English. Language Testing, 21(1), 1–27.

Kozaki, Y. (2010). An alternative decision-making procedure for performance assessments: Using the Multifaceted Rasch model to generate cut estimates. Language Assessment Quarterly, 7(1), 75–95.

Kozulin, A. (2011). Learning potential and cognitive modifiability. Assessment in Education: Principles, Policy & Practice, 18(2), 169–181.

Kunnan, A. J. (2003). [Review of the book The art of nonconversation by M. Johnson]. The Modern Language Journal, 87, 338–340.

Kunnan, A. J. (2007). Test fairness, test bias & DIF. Language Assessment Quarterly, 4, 109–112.

Kunnan, A. J. (2009). Politics and legislation in citizenship testing in the U.S. Annual Review of Applied Linguistics, 29, 37–48.

Kunnan, A. J. (2009). The U.S. Naturalization Test. Language Assessment Quarterly, 6, 89–97.

Kunnan, A. J. (2010). Publishing in the era of electronic technologies. The Modern Language Journal, 94, 643–645.

Kunnan, A. J. (2010). Statistical analysis for test fairness. Revue Française de Linguistique Appliquée, 16, 39–48.

Kunnan, A. J. (2010). Fairness matters and Toulmin’s argument structures. Language Testing, 27, 183–189.

Kyriakides, L. (2004). Investigating validity from teachers’ perspectives through their engagement in large-scale assessment: The Emergent Literacy Baseline Assessment project. Assessment in Education: Principles, Policy & Practice, 11(2), 143–165.