Rheumatology Advance Access originally published online on February 3, 2006
Rheumatology 2006 45(7):890-902; doi:10.1093/rheumatology/kei267

© Published by Oxford University Press on behalf of the British Society for Rheumatology 2006.

Performance-based methods for measuring the physical function of patients with osteoarthritis of the hip or knee: a systematic review of measurement properties

C. B. Terwee1, L. B. Mokkink1, M. P. M. Steultjens1,2 and J. Dekker1,2

1 EMGO Institute, 2 Department of Rehabilitation Medicine, VU University Medical Center; Amsterdam, The Netherlands

Correspondence to: C. B. Terwee, EMGO Institute, VU University Medical Center, Van der Boechorststraat 7, 1081 BT Amsterdam, The Netherlands. E-mail: cb.terwee{at}vumc.nl


    Abstract
 
Objective. To systematically review the measurement properties (i.e. internal consistency, reproducibility, validity, responsiveness and interpretability) of all performance-based methods which have been used to measure the physical function of patients with osteoarthritis of the hip or knee.

Methods. A systematic search was conducted in Medline, CINAHL, PsycINFO and Embase. Standardized criteria were applied to assess the quality of the clinimetric studies and the measurement properties.

Results. Twenty-six performance-based methods were included: 13 walking tests, two stair-climb tests, one chair test and 10 multi-item tests. Three out of seven multi-activity tests were tested for internal consistency and two were rated positively. Fourteen tests were tested for reliability and five were rated positively. The absolute measurement error (agreement) was assessed for 10 tests, of which only one received a positive rating. Fourteen tests were tested for construct validity, of which only two received positive ratings. Responsiveness was assessed for 12 tests, but none of them received a positive rating. Many indeterminate ratings were given, mostly for small studies or non-optimal analyses.

Conclusion. Many more well-designed studies are needed to assess the measurement properties of performance-based methods. More importantly, however, before one can make a justified choice of a particular performance-based method, consensus is needed on what activities should be included in a performance-based test for patients with hip or knee osteoarthritis and which aspects of function should be measured.

KEY WORDS: Osteoarthritis, Systematic review, Performance-based method, Reproducibility of results, Validation study


    Introduction
 
Physical function, i.e. the ability to perform daily activities, is generally considered one of the most important outcome measures for patients with osteoarthritis (OA) of the hip or knee [1]. In addition to self-reports, a variety of performance-based methods are used to assess the physical function of patients with OA of the hip or knee, e.g. the ‘6-min walk test’ [2], self-paced walking time tests [3], the timed ‘get up and go’ test [4], different stair-climb tests, etc. Most tests include the recording of the time it takes a patient to perform the requested activity. In addition, several observational methods have been described that use ratings from observers to assess the quality of physical function [5, 6].

There is no overview of the measurement properties of these performance-based methods. Such an overview is necessary to decide which test to choose for routine clinical practice or in designing a study. The aim of this study was to systematically review the literature on the measurement properties (i.e. internal consistency, reproducibility, validity, responsiveness and interpretability) of all performance-based methods that have been used to measure the physical function of patients with OA of the hip or knee. Studies in patients undergoing hip or knee replacement (with or without a further specification of the patient population) were also included because in general the majority of these patients are diagnosed with OA [7]. A standardized set of criteria was applied to assess the quality of the clinimetric studies and the measurement properties. This review aims to offer investigators and clinicians a basis for choosing a performance-based method for clinical practice or for a study.


    Methods
 
Literature search
We searched the following databases: Medline (1966 to April 2004), CINAHL (1982 to September 2003), PsycINFO (1966 to September 2003) and Embase (1980 to week 40, 2003). The full search strategy is listed in Appendix 1. In short, a combination of different variations of the following text words was used: hip or knee, and osteoarthritis or replacement, and performance test or objective test or observational test or gait or stair or chair or walk or gait analysis or kinematic analysis. Additional articles were identified by manually searching references of the retrieved articles and the authors’ own literature databases.

Inclusion and exclusion criteria
We used the following inclusion criteria:

  1. The study population should include patients with OA of the hip or knee, or patients undergoing hip or knee replacement.
  2. A performance-based method should have been used to assess physical function. Such a method was defined as a test in which patients had to perform one or more activities while an observer or a measurement instrument measured aspects related to the quality of physical function, i.e. either the (dis)ability to perform activities or the quality of the performance (e.g. time to perform the activity or kinematic parameters).
  3. The article should have been published in the English language.
  4. Information on the measurement properties of the performance-based method in a sample of patients with OA of the hip or knee, or patients undergoing hip or knee replacement should be provided.

Information on measurement properties was only included if it was intentionally collected or calculated to assess the measurement properties of the particular performance-based method. If, for example, correlations between a performance-based method and a self-report questionnaire were presented to assess the validity of the questionnaire (while the performance-based method was used as an external criterion), the data were not included in this review. We also excluded studies with fewer than 10 patients. Finally, we only included full-text articles; abstracts, books, theses or conference proceedings were excluded.

Selection of articles
Article selection, data extraction and quality assessment were performed by two independent reviewers (CBT and either LBM or MPMS). Disagreements were discussed and resolved. All abstracts were scanned against the inclusion criteria. Full-text articles were retrieved for all abstracts that fulfilled the inclusion criteria, and for abstracts that did not report measurement properties but indicated that these properties were presented in the full-text article. The retrieved articles were then reviewed again against the inclusion criteria.

Data extraction
A description of the performance-based method was extracted from the included articles, including the activities that were performed by the patients, the aspects of function that were measured, and the equipment used to measure these aspects. Information on the measurement properties of the instruments and the quality of the clinimetric studies was extracted, using a recently developed standardized checklist, described below.

Quality assessment of the clinimetric studies and measurement properties
Bot et al. [8] developed a checklist to evaluate the quality and outcomes of clinimetric studies on self-report questionnaires. The checklist was partly based on the review criteria developed by the Scientific Advisory Committee of the Medical Outcomes Trust [9] and a checklist developed by Bombardier and Tugwell [10]. We slightly modified this checklist for the evaluation of performance-based methods (Appendix 2). All measurement properties were rated as positive (+), negative (–) or indeterminate (?) depending on the methods and results of the clinimetric studies (see below). If no information was available, a 0 was recorded. For all measurement properties a sample size of at least 50 patients was considered necessary to receive a positive rating [11]. The following measurement properties were evaluated.

Reproducibility
A distinction was made between reliability and absolute measurement error (agreement).

Reliability refers to the ability to differentiate among patients, despite measurement errors [12]. The intraclass correlation coefficient (ICC), or kappa for dichotomous or ordinal data, was considered an adequate measure of reliability [12]. The use of Pearson correlation coefficients was rated as indeterminate, as it neglects systematic errors [13]. An ICC >0.70, with the lower limit of the confidence interval >0.60 or a sample size of at least 50 patients, was considered acceptable [9, 14].
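
The reliability criterion can be read as a simple decision rule. The following Python sketch is an illustration only, not the procedure used in this review. The function name `rate_reliability` and its parameters are hypothetical. It assumes that a coefficient >0.70 is rated positive when either the lower confidence limit exceeds 0.60 or the sample includes at least 50 patients. The criteria do not state how a value of exactly 0.70 is rated; the sketch treats it as negative.

```python
from typing import Optional


def rate_reliability(coefficient: Optional[float],
                     ci_lower: Optional[float] = None,
                     n_patients: int = 0,
                     statistic: str = "icc",
                     adequate_design: bool = True) -> str:
    """Return '+', '-', '?' or '0' for one reliability study (hypothetical helper)."""
    if coefficient is None:
        return "0"  # no information found on reliability
    # Pearson correlations neglect systematic error, and any design weakness
    # also leads to an indeterminate rating.
    if statistic not in ("icc", "kappa") or not adequate_design:
        return "?"
    if coefficient <= 0.70:
        return "-"  # assumption: exactly 0.70 is treated as not acceptable
    precise_enough = (ci_lower is not None and ci_lower > 0.60) or n_patients >= 50
    return "+" if precise_enough else "?"
```

Under this reading, an ICC of 0.85 from 20 patients with no reported confidence interval is rated '?', which corresponds to the indeterminate ratings given for small samples.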

Absolute measurement error, sometimes called agreement [12], refers to the precision of the instrument and is expressed in the units of measurement of the instrument. The limits of agreement [15], the standard error of measurement (SEM), or smallest detectable change (SDC) [16], were regarded as adequate measures of absolute measurement error. The rating of the absolute measurement error depends on what is considered a minimal important difference (MID) in scores between or within persons. Absolute measurement error was rated positively if the limits of agreement or SDC were smaller than the MID. If the MID was not known, a value of 0.5 S.D. was considered as a general guideline for the MID [17].
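
As a rough illustration of this rule, the sketch below derives an SDC from an SEM and compares it with the MID. It is not the authors' code. The function names are hypothetical, and the formula SDC = 1.96 × √2 × SEM is the commonly used individual-level definition, which is an assumption here because the text does not state it. The limits-of-agreement variant of the criterion is left out for brevity.

```python
import math
from typing import Optional


def smallest_detectable_change(sem: float) -> float:
    """SDC for an individual patient, derived from the SEM."""
    # Assumption: the widely used definition SDC = 1.96 * sqrt(2) * SEM.
    return 1.96 * math.sqrt(2) * sem


def rate_agreement(sdc: Optional[float],
                   mid: Optional[float] = None,
                   baseline_sd: Optional[float] = None,
                   n_patients: int = 0,
                   adequate_design: bool = True) -> str:
    """Return '+', '-', '?' or '0' for absolute measurement error."""
    if sdc is None:
        return "0"  # no information found on agreement
    if mid is None and baseline_sd is not None:
        mid = 0.5 * baseline_sd  # general guideline when the MID is unknown
    if mid is None or not adequate_design or n_patients < 50:
        return "?"
    return "+" if sdc < mid else "-"
```

For example, an SDC of 6 points against an MID of 7 points in a hypothetical sample of 60 patients gives `rate_agreement(6.0, mid=7.0, n_patients=60) == "+"`, whereas the same values from 20 patients give '?'.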

Internal consistency
Factor analysis should have been applied to determine if scores to be summarized measure the same concept [18]. In addition, a Cronbach's alpha should have been calculated for each set of scores to be summarized, as a measure of internal consistency. A Cronbach's alpha of at least 0.70 was considered acceptable [14].
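
For readers unfamiliar with the statistic, Cronbach's alpha for k items is k/(k − 1) × (1 − sum of item variances / variance of the total score). The sketch below implements this standard formula in plain Python. The function name and the data layout (one list of scores per item) are illustrative assumptions, and the factor analysis step is not shown.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha; item_scores holds one list per item, one score per patient."""
    k = len(item_scores)
    if k < 2:
        raise ValueError("need at least two items")
    # Total score per patient: sum across items for each patient position.
    totals = [sum(scores) for scores in zip(*item_scores)]
    total_variance = variance(totals)
    if total_variance == 0:
        raise ValueError("total scores do not vary")
    item_variance_sum = sum(variance(scores) for scores in item_scores)
    return k / (k - 1) * (1 - item_variance_sum / total_variance)
```

A value of at least 0.70 from this function would meet the threshold stated above, provided the other criteria (dimensionality examined, adequate sample size) are also met.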

Construct validity
Validity refers to the ability of an instrument to measure the concept that it is intended to measure [9]. Construct validity was considered to be adequate if specific hypotheses were defined about expected relationships with other measures of physical function or about expected differences in scores between specific subgroups, and if at least 75% of the results of the clinimetric study were in correspondence with these hypotheses. Floor and ceiling effects were considered present if more than 15% of respondents achieved the highest or lowest possible score, respectively [19].
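
The two rules in this paragraph can be sketched as follows. This is a hypothetical illustration with invented function names, assuming that a study without predefined hypotheses, with design weaknesses, or with fewer than 50 patients is rated indeterminate regardless of its results.

```python
def rate_hypothesis_testing(n_hypotheses: int, n_confirmed: int,
                            n_patients: int, adequate_design: bool = True) -> str:
    """Return '+', '-' or '?' for a study that tested predefined hypotheses."""
    if n_hypotheses == 0 or not adequate_design or n_patients < 50:
        return "?"  # no hypotheses, doubtful design or small sample
    return "+" if n_confirmed / n_hypotheses >= 0.75 else "-"


def floor_ceiling_effects(scores, lowest, highest, threshold=0.15):
    """Return (floor_present, ceiling_present) for a list of patient scores."""
    n = len(scores)
    floor_share = sum(s == lowest for s in scores) / n
    ceiling_share = sum(s == highest for s in scores) / n
    return floor_share > threshold, ceiling_share > threshold
```

With four hypotheses, three confirmed and 60 patients, `rate_hypothesis_testing(4, 3, 60)` returns '+'; the same result from 20 patients returns '?'.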

Responsiveness
Responsiveness refers to an instrument's ability to detect important change over time in the concept being measured [20, 21]. It should be considered an aspect of validity in a longitudinal setting [21, 22]. Responsiveness was therefore rated similarly to construct validity.

Reproducibility, validity and responsiveness depend on the setting and the population in which they are assessed. We therefore required a clear description of the design of each individual clinimetric study, including characteristics of the study population (diagnosis and clinical features), measurements, testing conditions and data analysis, for a positive rating. Furthermore, if any methodological weakness in the design or execution of the clinimetric study was found, the evaluated measurement property was rated as indeterminate.

Interpretability
Interpretability was defined as the degree to which one can assign qualitative meaning to quantitative scores [9]. We recorded whether the following information was presented that could aid in interpreting the scores of the performance-based test: (i) mean scores (and S.D.) of the study population before and after treatment; (ii) mean scores (and S.D.) in relevant subgroups of the study population; (iii) mean scores (and S.D.) of patients and mean scores of ‘healthy’ controls; (iv) mean scores (and S.D.) per category of other well-known measures; and (v) mean changes in scores (and S.D.) of patients who consider themselves to be improved and of patients who consider themselves to be unchanged. Investigators had to provide at least two of the previously described types of information for a positive rating of interpretability.

Practical issues
The time needed to complete the test was recorded as a measure of patient burden. In addition, the requirements to perform the test were recorded, in terms of apparatus and space required. These practical issues were not rated because their importance depends on the application of the test, e.g. in clinical practice or in a large study.


    Results
 
The searches yielded the following numbers of abstracts: Medline, 1834 abstracts; Embase, 221 additional abstracts; CINAHL, 64 additional abstracts; PsycINFO, 17 additional abstracts. Seventy-three abstracts were selected for further inspection of the full-text article. The most common reasons for excluding an abstract were a lack of a performance-based method or a lack of information on measurement properties. In addition, 26 articles were identified from our own literature databases or by reference tracking of the retrieved articles. Of the resulting 99 articles, 71 did not meet the inclusion criteria, mostly because of a lack of information on measurement properties. Finally, we included 28 studies, referring to 26 performance-based methods.
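
The selection flow can be checked with a few lines of arithmetic. The sketch below uses only the counts reported in this paragraph; the variable names are illustrative, and the total number of abstracts assumes that 'additional' means abstracts not already retrieved from a previous database.

```python
# Abstracts per database, as reported above.
abstracts = {"Medline": 1834, "Embase": 221, "CINAHL": 64, "PsycINFO": 17}
total_abstracts = sum(abstracts.values())  # assumes no overlap between databases

full_text_from_search = 73   # abstracts selected for full-text inspection
from_other_sources = 26      # own databases and reference tracking
excluded = 71                # full-text articles not meeting the criteria

assessed = full_text_from_search + from_other_sources
included = assessed - excluded

print(total_abstracts, assessed, included)
```

The result agrees with the 28 included studies reported in the text.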

Description of the 26 performance-based methods
Thirteen walking tests were found, representing minor variations on the same theme (Table 1), as well as two different stair-climb tests and one chair test. Ten multi-item tests were found, in which different activities are performed, such as walking, stair climbing, rising from a chair and sitting down. In most of these tests, the time to perform the activities was measured; one test measured level of assistance [the Iowa Level of Assistance Scale (ILAS)], two tests measured observed disability [Steultjens’ test and the Functional Assessment System (FAS)], and two tests measured functional parameters from accelerometer signals (Numact monitor and DynaPort KneeTest).


 
TABLE 1. Description of the performance-based methods

 
Internal consistency
In seven multi-activity tests, scores for different activities are summarized into one total score (Table 2). For three of these tests [Physical Activity Restrictions test (PAR), Steultjens’ test, and FAS], the dimensionality was examined by means of factor analysis or Mokken scale analysis. Only for the PAR and Steultjens’ test were Cronbach's alphas calculated for each subscale. Both tests were rated positively for internal consistency.


 
TABLE 2. Internal consistency and reproducibility

 
Reproducibility
Fourteen tests (54%) have been tested for reliability (Table 2). Most studies calculated ICCs and all ICCs were >0.70. Nine tests received an indeterminate rating for reliability, six times due to a small sample size (self-paced walk 1, 2 and 3, footprint analysis, Marks's test, and FAS) and three times due to an inadequate method or unclear description of the design (Lin's test, PAR and Steultjens’ test). Only five tests received a positive rating for reliability [5-min walking field test, gait analysis 2, get up and go, Aggregated Locomotor Function score (ALF) and ILAS].

Absolute measurement error (agreement) was assessed for 10 tests (38%) (Table 2). A MID was defined only for the ILAS test. The SDC of the ILAS (6 points) was smaller than the MID (7 points). For the self-paced walk 3 test, the get up and go test, the ALF and the Lin test, the SDC was less than 0.5 S.D. of the baseline mean score. These tests, however, received an indeterminate rating for absolute measurement error due to small sample sizes (self-paced walk 3 test, get up and go test and ALF) or flaws in the study design (Lin's test). The ILAS was therefore the only test that received a positive rating for absolute measurement error.

Construct validity
Fourteen of the 26 tests (54%) have been tested for construct validity. Nine studies received an indeterminate rating because no hypotheses were specified. For the footprint analysis test, the get up and go test and the PAR, at least 75% of the results of the clinimetric study were in correspondence with the hypotheses. The footprint analysis test, however, received an indeterminate rating due to a small sample size. Therefore, only the get up and go test and the PAR received positive ratings for construct validity.

Responsiveness
Responsiveness was assessed for 12 tests (46%). Ten studies received an indeterminate rating because no hypotheses were specified. For two tests (gait analysis 3 and Steultjens’ test), less than 75% of the results were in correspondence with the hypotheses. Steultjens’ test therefore received a negative rating for responsiveness; gait analysis 3 received an indeterminate rating due to a small sample size.

Interpretability
For eight tests, at least one type of information was presented. ‘No information found’ was recorded for interpretability of the PAR because three different interventions were used, which makes the mean scores uninterpretable. For five tests (the 6-min gait test, Locométrix, gait analysis 1, gait analysis 4 and the stair ascent test), two types of information were presented, but all received an indeterminate rating due to small sample sizes.

Practical issues
In 11 studies, the amount of time needed to perform the test was mentioned, varying from ‘quick’ to 30 min. From the description of the tests, we estimate that all tests can be performed within 30 min. The equipment needed to perform the test varies from 3 m of indoor space and a stopwatch to standardized equipment (e.g. stair or chair), accelerometers with special software, or a two-dimensional biomechanical gait analysis system.

Quality assessment
The quality assessment of the 26 performance-based methods is summarized in Table 4. Only 11 of the 169 ratings were positive (7%). Forty-six (27%) indeterminate ratings were given, mostly for small studies or non-optimal analyses. Four tests (self-paced walk 3, gait analysis 2, ALF and Steultjens’ test) received one positive rating, two tests (get up and go and PAR) received two positive ratings and one test (ILAS) received three positive ratings.
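
The percentages in this paragraph follow directly from the reported counts, as the short sketch below shows. The dictionary simply restates the positive ratings per test listed above; the variable names are illustrative.

```python
total_ratings = 169
indeterminate = 46

# Positive ratings per test, as listed in the paragraph above.
positive_by_test = {
    "self-paced walk 3": 1, "gait analysis 2": 1, "ALF": 1, "Steultjens' test": 1,
    "get up and go": 2, "PAR": 2,
    "ILAS": 3,
}
positive = sum(positive_by_test.values())

pct_positive = round(100 * positive / total_ratings)
pct_indeterminate = round(100 * indeterminate / total_ratings)
print(positive, pct_positive, pct_indeterminate)
```

The per-test tally reproduces the 11 positive ratings, and both rounded percentages match those reported (7% and 27%).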


 
TABLE 4. Summary of the evaluation of the quality and outcomes of the clinimetric studies

 

 
TABLE 3. Construct validity and responsiveness

 

    Discussion
 
We identified 26 performance-based methods that have been used to measure the physical function of patients with OA of the hip or knee or patients undergoing hip or knee replacement and for which measurement properties have been evaluated. None of the 26 performance-based methods included has been tested for all measurement properties and only 7% of the ratings were positive. Forty-six (27%) indeterminate ratings were given, mostly for small studies or non-optimal analyses. The multi-activity tests were most extensively evaluated. The ILAS received the best ratings.

There are no standardized criteria to evaluate the quality of performance-based methods. Some of our criteria are arbitrary and may be interpreted as strict, but we did not want to base our conclusions on small studies with unreliable results. Furthermore, when hypotheses are not specified in validity or responsiveness studies, the risk of bias is large, because often only the positive results will be presented. Readers who disagree with our criteria can make their own judgement based on the results presented in Tables 2 and 3.

Some limitations of our study should be acknowledged. We only included information on measurement properties that was intentionally calculated to assess the measurement properties of the particular performance-based method. More evidence is probably available in the literature that could be used to determine the validity or responsiveness of the tests, e.g. in studies that compared self-reported questionnaires and performance-based tests or in studies that evaluated self-report questionnaires by using a performance-based method as an external standard. We did not include these articles because without specified hypotheses about the expected relationship of the performance-based test with the other measures used, the results of these studies are difficult to interpret. Furthermore, we included only English language publications, and we therefore might have missed some publications on additional performance-based methods or measurement properties.

Several performance-based tests have been evaluated for their measurement properties in a general (‘healthy’) population or in other patient populations. These studies were excluded because the measurement properties of a given instrument depend on the setting and the population in which it is assessed. Therefore, the results found in these studies may not be generalizable to patients with OA of the hip or knee.

With the currently available evidence it is extremely difficult to formulate recommendations for which instrument to choose. Many performance-based tests, especially the walking tests, represent minor variations on the same theme, but we have no idea which might be the most useful because a proper justification of the choice of the activities included in the test and the aspects of function that were measured is mostly lacking. The PAR is the only test for which a proper justification for the choice of the activities was provided [23]. The PAR also received a positive rating for internal consistency. We therefore consider this test to be a good potential candidate for further testing. The reproducibility has not been studied adequately yet, and the responsiveness of the test has not been studied at all. The ILAS received the best ratings. However, a proper justification of the content of the test is lacking and internal consistency of this test has not been evaluated.

In our opinion, multi-activity tests are more valid for measuring the physical function of patients with OA of the hip or knee than single-activity tests, such as walking tests, because patients with OA of the hip or knee experience functional problems in more activities than just walking. For example, in patients with mild OA walking may be unaffected, but these patients may have problems with climbing stairs or standing up from a chair. In addition, time alone may be inadequate to represent the concept of physical function [24]. Tests that combine aspects of physical function with aspects of impairment, such as joint flexibility or muscle strength (e.g. Lin's test), are difficult to interpret because they measure multiple different underlying constructs [activity limitations and impairments according to the International Classification of Functioning (ICF)]. As the contents of the multi-activity tests are much alike, further testing of ALF, ILAS, Steultjens’ test, FAS or the DynaPort KneeTest could also be considered.

More importantly, however, before one can make a justified choice of a particular performance-based method, consensus is needed on what activities should be included in a performance-based test for patients with OA of the hip or knee and which aspects of function (e.g. time and other aspects of quality of movement) should be measured.
Figure 1

The authors have declared no conflicts of interest.


    Appendix 1. Full search strategy
 
(((knee OR hip) AND (osteoarthritis OR replacement)) OR TKR OR TKA OR THR OR THA)

AND

(‘performance test’ OR ‘performance tests’ OR ‘performance-based test’ OR ‘performance-based tests’ OR ‘performance instrument’ OR ‘performance instruments’ OR ‘performance-based instrument’ OR ‘performance-based instruments’ OR ‘performance-based method’ OR ‘performance-based methods’ OR ‘performance measure’ OR ‘performance measures’ OR ‘performance-based measure’ OR ‘performance-based measures’ OR ‘performance index’ OR ‘performance indices’ OR ‘performance-based index’ OR ‘performance-based indices’ OR ‘performance-based assessment’ OR ‘objective test’ OR ‘objective tests’ OR ‘objective testing’ OR ‘objective instrument’ OR ‘objective instruments’ OR ‘objective method’ OR ‘objective methods’ OR ‘objective measure’ OR ‘objective measures’ OR ‘objective evaluation’ OR ‘objective function’ OR ‘objective functional’ OR ‘objective functioning’ OR ‘objective disability’ OR ‘objective assessment’ OR ‘objective assessments’ OR ‘observational test’ OR ‘observational tests’ OR ‘observation-based test’ OR ‘observation-based tests’ OR ‘observation-based testing’ OR ‘observational testing’ OR ‘observational instrument’ OR ‘observational instruments’ OR ‘observation-based instrument’ OR ‘observation-based instruments’ OR ‘observational method’ OR ‘observational methods’ OR ‘observation-based method’ OR ‘observation-based methods’ OR ‘observational measure’ OR ‘observational measures’ OR ‘observation-based measure’ OR ‘observation-based measures’ OR ‘observational index’ OR ‘observational indices’ OR ‘observation-based index’ OR ‘observation-based indices’ OR ‘observed disability’ OR ‘observed functioning’ OR gait OR stair OR chair OR walk OR walking OR ‘gait analysis’ OR ‘gait analyses’ OR ‘kinetic parameter’ OR ‘kinematic parameter’ OR ‘kinematic analysis’ OR ‘kinematic analyses’ OR ‘gait evaluation’)

Limit: English, human

Search date: 13 April 2004

Hits: 1834.
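
The structure of the search string (a population block combined with a method block by AND) can be sketched as below. This is an illustration only: the helper `or_block` is hypothetical, the method block lists just a handful of the terms given above, straight quotes replace the typographic ones, and database-specific syntax such as field tags is not modelled.

```python
def or_block(terms):
    """Join terms with OR, quoting multi-word or hyphenated phrases."""
    quoted = [f"'{t}'" if (" " in t or "-" in t) else t for t in terms]
    return "(" + " OR ".join(quoted) + ")"


# Population block: joint and condition, or a joint-replacement abbreviation.
joint_and_condition = (or_block(["knee", "hip"]) + " AND "
                       + or_block(["osteoarthritis", "replacement"]))
population = ("((" + joint_and_condition + ") OR "
              + " OR ".join(["TKR", "TKA", "THR", "THA"]) + ")")

# Method block: abbreviated here; the full term list is given in the appendix text.
method = or_block(["performance test", "objective test", "observational test",
                   "gait", "stair", "chair", "walk"])

query = population + " AND " + method
print(query)
```

Keeping the two blocks as separate lists makes it easier to add spelling variants to one block without disturbing the other.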


    Appendix 2. Checklist for the evaluation of performance-based methods
 

Each property is listed with its definition and quality criteria (a, b).

Internal consistency
Definition: The extent to which items in a (sub)scale are intercorrelated, thus measuring the same construct
Quality criteria:
+, factor analyses performed on adequate sample size (7 × number of items) AND Cronbach's alpha(s) calculated per dimension in a sample size of at least 50 patients AND Cronbach's alpha(s) ≥0.70
?, no factor analysis OR doubtful design or method OR sample size too small
–, Cronbach's alpha(s) <0.70, despite adequate design and method
0, no information found on internal consistency

Agreement
Definition: The extent to which the scores on repeated measures are close to each other (absolute measurement error)
Quality criteria:
+, (MIC OR 0.5 S.D.) >SDC OR (MIC OR 0.5 S.D.) outside the LOA AND SDC and MID both determined in a sample size of at least 50 patients
?, doubtful design or method OR sample size <50
–, (MIC OR 0.5 S.D.) ≤SDC OR (MIC OR 0.5 S.D.) inside LOA, despite adequate design and method
0, no information found on agreement

Reliability
Definition: The extent to which patients can be distinguished from each other, despite measurement errors (relative measurement error)
Quality criteria:
+, ICC or kappa >0.70 with the lower limit of the confidence interval >0.60 or a sample size of at least 50 patients
?, doubtful design or method (e.g. time interval not mentioned, Pearson correlation) OR ICC or kappa >0.70 with the lower limit of the confidence interval ≤0.60 or sample size <50
–, ICC or kappa <0.70, despite adequate design and method
0, no information found on reliability

Construct validity
Definition: The extent to which scores on a particular instrument relate to other measures in a manner that is consistent with theoretically derived hypotheses concerning the concepts that are being measured
Quality criteria:
+, specific hypotheses were formulated AND at least 75% of the results are in accordance with these hypotheses in a sample size of at least 50 patients
?, doubtful design or method (e.g. no hypotheses) OR sample size <50
–, less than 75% of hypotheses were confirmed, despite adequate design and methods
0, no information found on construct validity

Responsiveness
Definition: The instrument's ability to detect important change over time in the concept being measured
Quality criteria:
+, specific hypotheses were formulated AND at least 75% of the results are in accordance with these hypotheses in a sample size of at least 50 patients
?, doubtful design or method (e.g. no hypotheses) OR sample size <50
–, less than 75% of hypotheses were confirmed, despite adequate design and methods
0, no information found on responsiveness

Interpretability
Definition: The degree to which one can assign qualitative meaning to quantitative scores
Quality criteria:
+, mean and S.D. scores presented of at least two relevant subgroups of patients in a sample size of at least 50 patients
?, doubtful design or method OR less than two subgroups OR sample size <50
0, no information found on interpretation

ICC, intraclass correlation coefficient; MIC, minimal important change; SDC, smallest detectable change; LOA, limits of agreement; S.D., standard deviation.

(a) +, positive rating; –, negative rating; ?, indeterminate rating; 0, no information available.

(b) Doubtful design or method = lacking a clear description of the design or methods of the clinimetric study, or any important methodological weakness in the design or execution of the study.


    References
 

  1. Bellamy N, Kirwan J, Boers M et al. Recommendations for a core set of outcome measures for future phase III trials in knee, hip, and hand osteoarthritis. Consensus development at OMERACT III. J Rheumatol 1997;24:799–802.
  2. Kreibich DN, Vaz M, Bourne RB et al. What is the best way of assessing outcome after total knee replacement? Clin Orthop 1996:221–5.
  3. Marks R. Walking time measures for evaluating OA of the knee. Physiotherapy 1994;50:5–8.
  4. Piva SR, Fitzgerald GK, Irrgang JJ, Bouzubar F, Starz TW. Get up and go test in patients with knee osteoarthritis. Arch Phys Med Rehabil 2004;85:284–9.
  5. Steultjens MPM, Dekker J, van Baar ME, Oostendorp RAB, Bijlsma JWJ. Internal consistency and validity of an observational method for assessing disability in mobility in patients with osteoarthritis. Arthritis Care Res 1999;12:19–25.
  6. Odding E, Valkenburg HA, Stam HJ, Hofman A. Assessing joint pain complaints and locomotor disability in the Rotterdam Study: effect of population selection and assessment mode. Arch Phys Med Rehabil 2000;81:189–93.
  7. Birrell F, Johnell O, Silman A. Projecting the need for hip replacement over the next three decades: influence of changing demography and threshold for surgery. Ann Rheum Dis 1999;58:569–72.
  8. Bot SDM, Terwee CB, van der Windt DAWM, Bouter LM, Dekker J, de Vet HCW. Psychometric evaluation of self-report questionnaires—the development of a checklist. In: Ader HJ, Mellenbergh GJ, ed. Second workshop on research methodology, 25–27 June, 2003. Amsterdam: VU University Amsterdam 2003: 161–8.
  9. Lohr KN, Aaronson NK, Alonso J et al. Evaluating quality of life and health status instruments: development of scientific review criteria. Clin Ther 1996;18:979–92.[CrossRef][ISI][Medline]
  10. Bombardier C, Tugwell P. Methodological considerations in functional assessment. J Rheumatol 1987;14(Suppl. 15):6–10.
  11. Altman DG. Practical statistics for medical research. London: Chapman and Hall, 1991.
  12. de Vet HCW. Observer reliability and agreement. In: Armitage P, Colton T, eds. Encyclopedia of biostatistics. New York: John Wiley, 1998:3123–8.
  13. Deyo RA, Diehr P, Patrick DL. Reproducibility and responsiveness of health status measures. Controlled Clinical Trials 1991;12: 142S–58S.[Medline]
  14. Nunnally JC, Bernstein IH. Psychometric theory, 3rd edn. New York: McGraw-Hill, 1994.
  15. Bland JM, Altman DG. Statistical methods for assessing agreement between two methods of clinical measurement. Lancet 1986:307–10.
  16. Ravaud P, Giraudeau B, Auleley GR, Edouard-Noël R, Dougados M, Chastang Cl. Assessing smallest detectable change over time in continuous structural outcome measures: application to radiological change in knee osteoarthritis. J Clin Epidemiol 1999;52:1225–30.[CrossRef][ISI][Medline]
  17. Norman GR, Sloan JA, Wyrwich KW. Interpretation of changes in health-related quality of life. The remarkable universality of half a standard deviation. Med Care 2003;41:582–92.[CrossRef][ISI][Medline]
  18. de Vet HCW, Ader HJ, Terwee CB, Pouwer F. Are factor analytical techniques used appropriately in the validation of health status questionnaires? A systematic review on the quality of factor analysis of the SF-36. Qual Life Res 2005;14:1203–18.[CrossRef][ISI][Medline]
  19. McHorney CA, Tarlov AR. Individual-patient monitoring in clinical practice: are available health status surveys adequate? Qual Life Res 1995;4:293–307.
  20. Testa MA, Simonson DC. Assessment of quality of life outcomes. N Engl J Med 1996;334:835–40.[Free Full Text]
  21. Terwee CB, Dekker FW, Wiersinga WM, Prummel MF, Bossuyt PMM. On assessing responsiveness of health-related quality of life instruments: guidelines for instrument evaluation. Qual Life Res 2003;12:349–62.[CrossRef][ISI][Medline]
  22. Hays RD, Hadorn D. Responsiveness to change: an aspect of validity, not a separate dimension. Qual Life Res 1992;1:73–5.[CrossRef][Medline]
  23. Rejeski WJ, Ettinger WH Jr, Schumaker S, James P, Burns R, Elam JT. Assessing performance-related disability in patients with knee osteoarthritis. Osteoarthritis Cartilage 1995;3:157–67.[CrossRef][ISI][Medline]
  24. Stratford PW, Kennedy D, Pagura SMC, Gollish JD. The relationship between self-report and performance-related measures: questioning the content validity of timed tests. Arthritis Rheum 2003; 49:535–40.[CrossRef][ISI][Medline]
  25. Marks R, Quinney HA, Wessel J. Proprioceptive sensibility in women with normal and osteoarthritic knee joints. Clin Rheumatol 1993; 12:170–5.[CrossRef][ISI][Medline]
  26. Marks R. Reliability and validity of self-paced walking time measures for knee osteoarthritis. Arthritis Care Res 1994;7:50–3.[Medline]
  27. Price LG, Hewett JE, Kay DR, Minor MA. Five-minute walking test of aerobic fitness for people with arthritis. Arthritis Care Res 1988;1:33–7.[Medline]
  28. Péloquin L, Gauthier P, Bravo G, Lacombe G, Billiard JS. Reliability and validity of the five-minute walking field test for estimating VO2 peak in elderly subjects with knee osteoarthritis. J Aging Phys Act 1998;6:36–44.
  29. Guyatt GH, Sullivan MJ, Thompson PJ et al. The 6-minute walk: a new measure of exercise capacity in patients with chronic heart failure. J Can Med Assoc 1985;132: 919.[Abstract]
  30. Parent E, Moffet H. Comparative responsiveness of locomotor tests and questionnaires used to follow early recovery after total knee replacement. Arch Phys Med Rehabil 2002;83:70–80.[CrossRef][ISI][Medline]
  31. Bassey EJ, Dallosso HM, Fentem PH, Irving JM, Patrick JM. Validation of a simple mechanical accelerometer (pedometer) for the estimation of walking activity. Eur J Appl Physiol Occup Physiol 1987;56:323–30.[CrossRef][ISI][Medline]
  32. Silva M, Shepherd EF, Jackson WO, Dorey FJ, Schmalzried TP. Average patient walking activity approaches 2 million cycles per year. J Arthroplasty 2002;17:693–7.[CrossRef][ISI][Medline]
  33. Falconer J, Hayes KW. A simple method to measure gait for use in arthritis clinical research. Arthritis Care Res 1991;4:52–7.[Medline]
  34. Aminian K, Rezakhanlou K, De Andres E, Fritsch C, Leyvraz PF, Robert P. Temporal feature estimation during walking using miniature accelerometers: an analysis of gait improvement after hip arthroplasty. Med Biol Eng Comput 1999;37:686–91.[Medline]
  35. Auvinet B, Chaleil D, Barrey E. Accelerometric gait analysis for use in hospital outpatients. Rev Rhum Engl Ed 1999;66:389–97.[Medline]
  36. Fransen M, Crosbie J, Edmonds J. Reliability of gait measurements in people with osteoarthritis of the knee. Phys Ther 1997;77:944–53.[Abstract/Free Full Text]
  37. Lafuente R, Belda JM, Sanchez-Lacuesta J, Soler C, Poveda R, Prat J. Quantitative assessment of gait deviation: contribution to the objective measurement of disability. Gait Posture 2000;11:191–8.[Medline]
  38. Olsson E, Barck A. Correlation between clinical examination and quantitative gait analysis in patients operated upon with the Gunston-Hult knee prosthesis. Scand J Rehabil Med 1986;18:101–6.[Medline]
  39. Mathias S, Nayak USL, Isaacs B. Balance in elderly patients: the ‘Get-up and Go’ test. Arch Phys Med Rehabil 1986;67:387–9.[ISI][Medline]
  40. Madsen OR, Brot C. Assessment of extensor and flexor strength in the individual gonarthrotic patient: interpretation of performance changes. Clin Rheumatol 1996;15:154–60.[Medline]
  41. Marks R. Pilot study of the reproducibility of measures of walking performance variables in persons with osteoarthritis of the knee. Physiother Can 1995;47:40–4.
  42. McCarthy CJ, Oldham JA. The reliability, validity and responsiveness of an aggregated locomotor function (ALF) score in patients with osteoarthritis of the knee. Rheumatology 2004;43:514–17.[Abstract/Free Full Text]
  43. Lin YC, Davey RC, Cochrane T. Tests for physical function of the elderly with knee and hip osteoarthritis. Scand J Med Sci Sports 2001;11:280–6.[CrossRef][ISI][Medline]
  44. Shields RK, Enloe LJ, Evans RE, Smith KB, Steckel SD. Reliability, validity, and responsiveness of functional tests in patients with total joint replacement. Phys Ther 1995;75:169–76 (discussion 76–9).[Abstract/Free Full Text]
  45. Steultjens MPM, Roorda LD, Dekker J, Bijlsma JWJ. Responsiveness of observational and self-report methods for assessing disability in mobility in patients with osteoarthritis. Arthritis Care Res 2001; 45:56–61.[CrossRef]
  46. Nilsdotter AK, Roos EM, Westerlund JP, Roos HP, Lohmander LS. Comparative responsiveness of measures of pain and function after total hip replacement. Arthritis Care Res 2001;45:258–62.
  47. Oberg U, Oberg B, Oberg T. Validity and reliability of a new assessment of lower-extremity dysfunction. Phys Ther 1994; 74:861–71.[Abstract/Free Full Text]
  48. Walker DJ, Heslop PS, Plummer CJ, Essex T, Chandler S. A continuous patient activity monitor: validation and relation to disability. Physiol Meas 1997;18:49–59.[Medline]
  49. Walker DJ, Kidd E, Heslop PS, Chandler C. Spontaneous ambulatory activity as a quantifiable outcome measure for rheumatoid arthritis. Rheumatology 1999;38:1234–8.[Abstract/Free Full Text]
  50. Walker DJ, Heslop PS, Chandler C, Pinder IM. Measured ambulation and self-reported health status following total joint replacement for the osteoarthritic knee. Rheumatology 2002;41:755–8.[Abstract/Free Full Text]
  51. van den Dikkenberg N, Meijer OG, van der Slikke RMA et al. Measuring quality of movement in patients with knee problems: rationale and construction of the DynaPort KneeTest. Knee Surg Sports Traumatol Arthrosc 2002;10:204–12.[Medline]
Submitted 22 April 2005; revised version accepted 22 November 2005.