非線性結構方程模式:以試題反應模式

Nonlinear Structural Equation Models: An Item Response

蘇啟明;王文中
Chi-Ming Su;Wen-Chung Wang


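Journal issue: Volume 6, Issue 4, "Testing and Assessment"
Editor-in-chief: Department of Educational Psychology and Counseling, National Taiwan Normal University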
林世華
System ID: vol023_01
Topic: Testing and Assessment
Year of publication: 2010
Authors: 蘇啟明;王文中
Authors (English): Chi-Ming Su; Wen-Chung Wang
Title: 非線性結構方程模式:以試題反應模式
Title (English): Nonlinear Structural Equation Models: An Item Response
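Pages: 46
Keywords (Chinese): 試題反應理論 (item response theory); 潛在反應 (latent response); 非線性結構模式 (nonlinear structural model); 貝氏統計 (Bayesian statistics); 可能值 (plausible value)
Keywords (English): item response theory; latent response; nonlinear structural equation modeling; Bayesian statistics; plausible value
Affiliation: Doctoral candidate, Graduate Institute of Psychology, National Chung Cheng University; Chair Professor, Department of Psychological Studies, The Hong Kong Institute of Education
Manuscript word count: 11976
Author expertise: Testing and Assessment
Submission date: 2010/11/7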
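Abstract (translated from Chinese): Response variables in the human sciences are usually binary or ordinal rather than interval. Because the relationship between categorical item responses and the latent traits they are intended to measure cannot be linear, item response theory was developed to describe the nonlinear relationship between them. Another approach to categorical data analysis tries to establish a linear relationship between latent continuous responses and the latent traits to be measured, and then converts the latent continuous responses into observed categorical responses through a threshold model. When the relationship between item responses and latent traits in the measurement model of a structural equation model is nonlinear, the model can be called a nonlinear structural model. Its parameters can be estimated with Bayesian (WinBUGS software) or non-Bayesian (Mplus software) methods. In the simulation studies, we found that although both programs estimate the parameters very accurately, WinBUGS performs better than Mplus when the test is short. If the original item responses are unavailable, the plausible-value approach can effectively estimate the structural parameters between endogenous and exogenous variables, whereas the maximum likelihood estimate approach severely underestimates the structural parameters because it ignores measurement error entirely. Two empirical examples illustrate the implications and applications of nonlinear structural models and the plausible-value approach.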
Abstract (English): Response variables in the human sciences are often binary or ordinal rather than interval. Because the relationship between categorical item responses and their underlying latent traits cannot be linear, item response theory (IRT) models have been developed to describe the nonlinear relationship between them. Another approach to categorical data is to establish a linear relationship between latent continuous responses and their underlying latent traits and then convert the latent continuous responses to observed categorical item responses via threshold models. When the relationships between item responses and their underlying latent traits are nonlinear in the measurement part of structural equation modeling (SEM), the resulting SEM can be called nonlinear SEM (NSEM) to emphasize that nonlinearity. Parameters in NSEM can be estimated using Bayesian (WinBUGS) or non-Bayesian (Mplus) approaches. In a series of simulations, it was found that although both WinBUGS and Mplus can recover parameters in NSEM very well, WinBUGS slightly outperforms Mplus when tests are short. When the original item responses are not accessible, the plausible-value approach can recover the structural parameters for exogenous and endogenous variables as satisfactorily as WinBUGS and Mplus can when the original responses are accessible. However, the maximum likelihood estimate approach underestimates the structural parameters substantially because the measurement error is ignored completely. Two empirical examples, on depression and resource planning, illustrate the implications and applications of NSEM and the plausible-value approach.
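The two formulations contrasted in the abstract can be written side by side. The following is a minimal sketch for a single binary item, not the authors' own notation: the symbols $\theta_i$ (latent trait of person $i$), $a_j$, $b_j$ (item slope and location), $\lambda_j$ (loading), $\tau_j$ (threshold), and $\varepsilon_{ij}$ (error) are assumptions introduced here for illustration, and the two-parameter logistic form is only one common IRT model.

```latex
% IRT formulation: nonlinear link between the observed binary response and the trait
P(Y_{ij} = 1 \mid \theta_i)
  = \frac{\exp\{a_j(\theta_i - b_j)\}}{1 + \exp\{a_j(\theta_i - b_j)\}}

% Latent response formulation: linear in a latent continuous response, then a threshold
Y_{ij}^{*} = \lambda_j \theta_i + \varepsilon_{ij}, \qquad
Y_{ij} =
  \begin{cases}
    1 & \text{if } Y_{ij}^{*} > \tau_j \\
    0 & \text{otherwise}
  \end{cases}

% Assuming \varepsilon_{ij} follows a standard logistic distribution, the two coincide:
P(Y_{ij} = 1 \mid \theta_i)
  = \frac{\exp(\lambda_j \theta_i - \tau_j)}{1 + \exp(\lambda_j \theta_i - \tau_j)},
\qquad a_j = \lambda_j, \quad b_j = \tau_j / \lambda_j
```

Under the alternative assumption of a standard normal error, the same steps give the normal-ogive form $\Phi(\lambda_j \theta_i - \tau_j)$ instead of the logistic one.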