Epidemiology / Statistics

Sensitivity (ability of a test to detect true positives) = true positives / (true positives + false negatives)
Specificity (ability of a test to detect true negatives) = true negatives / (true negatives + false positives)
Accuracy = (true positives + true negatives) / (true positives + true negatives + false positives + false negatives)
Positive Predictive Value (probability that an individual actually has the condition if the test is positive) = true positives / (true positives + false positives)
Negative Predictive Value (probability that an individual does not have the condition if the test is negative) = true negatives / (true negatives + false negatives)
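
As a rough illustration, the measures above can be computed from the four cells of a 2x2 table. This is a minimal sketch: the function name and the counts are hypothetical, and accuracy is taken here as the fraction of all test results that are correct.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Return test-performance measures from 2x2 table counts."""
    total = tp + fp + fn + tn
    return {
        "sensitivity": tp / (tp + fn),   # true positives among all diseased
        "specificity": tn / (tn + fp),   # true negatives among all healthy
        "accuracy": (tp + tn) / total,   # correct results among all results
        "ppv": tp / (tp + fp),           # diseased among all positive tests
        "npv": tn / (tn + fn),           # healthy among all negative tests
    }

# Hypothetical counts, for illustration only.
metrics = diagnostic_metrics(tp=90, fp=30, fn=10, tn=170)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these made-up counts the test is fairly sensitive (0.90) but its positive predictive value is lower (0.75), because false positives make up a quarter of all positive results.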

Prevalence = number of cases in a population at a given time.

Incidence = Number of newly diagnosed cases over a specific time period.

Alpha (Type I) error

error in hypothesis testing where a statistically significant association is found when none exists (false positive).
rejecting a true null hypothesis
controlled by lowering the significance level (alpha).
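The significance level (alpha) is the probability of committing a Type I error when the null hypothesis is true; a result is called statistically significant when the P value falls below alpha.
P<0.05 = result is significant at alpha = 0.05, i.e. the accepted risk of a Type I error is 5%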

Beta (Type II) error

error in hypothesis testing where no statistically significant association is found when there is a true association (false negative).
may occur if the study lacks power, i.e. more data are needed to prevent Type II errors.
failing to reject a false null hypothesis
Power analysis is used to determine if the sample size is large enough to demonstrate statistical significance.
In general the power of a study should be at least 0.8 (80%)
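
One common way to turn a target power into a sample size is the normal-approximation formula for comparing two means. The sketch below assumes a two-sided test, equal group sizes, and a standardized effect size supplied by the investigator; the function name is hypothetical and the formula is only an approximation to an exact t-based calculation.

```python
from math import ceil
from statistics import NormalDist

def sample_size_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate n per group for a two-sided, two-sample comparison of means.

    effect_size = expected difference in means / common standard deviation
    (an assumption that must come from pilot data or prior studies).
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for Type I error
    z_power = NormalDist().inv_cdf(power)          # quantile matching 1 - beta
    return ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)

# Hypothetical medium effect (0.5 standard deviations).
n = sample_size_per_group(effect_size=0.5)
print(n)
```

For a medium effect at alpha = 0.05 and 80% power this gives roughly 63 subjects per group; halving the effect size roughly quadruples the required sample.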

Discrete Data

Discrete data falls into specific categories such as gender or the presence or absence of a risk factor such as smoking.
Chi-square test
Yates correction for continuity
Fisher's exact test

Continuous Data

Continuous data, such as height, weight, or 40-yard dash time, can be displayed on a curve.
Student's t-test (one-sample [normal distribution], independent two-sample [normal distribution], and paired [non-independent])
ANOVA

ANOVA

test of choice for multiple groups of continuous data.
used to compare means of three or more groups.
useful only for normally distributed, independent data.
One-way ANOVA: used to test differences between 3 or more independent groups.
One-way ANOVA for repeated measures: used when the subjects are dependent groups (same subjects used for each treatment).
Two-way (e.g. 2 x 2) ANOVA: used to evaluate the effects of two or more treatment variables.
Post hoc comparisons = Bonferroni method, Student-Newman-Keuls procedure, Tukey method, Scheffe method.
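
The F statistic behind a one-way ANOVA is the ratio of between-group to within-group variability. A minimal sketch, with a hypothetical function name and made-up data; it returns only the statistic and degrees of freedom, and the P value would still have to be read from an F distribution.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of independent groups."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    k, n = len(groups), len(all_values)
    # Variability of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variability of observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical measurements from three independent groups.
f_stat, df_b, df_w = one_way_anova_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(f_stat, df_b, df_w)
```

A large F only says that at least one group mean differs; identifying which groups differ is the job of the post hoc comparisons.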

T-Test

used for comparison of two groups of continuous data.
Paired t-test = dependent = used for two groups of paired (same individual) and normally distributed continuous data.
Student t-test = independent = used for two independent groups of normally distributed data.
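
The difference between the two forms shows up in how the t statistic is built. The sketch below uses hypothetical function names and data; the independent version assumes equal variances (pooled estimate), and neither function computes a P value.

```python
from math import sqrt
from statistics import mean, stdev, variance

def paired_t(before, after):
    """t statistic for paired data: works on within-subject differences."""
    diffs = [a - b for a, b in zip(after, before)]
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))

def independent_t(x, y):
    """t statistic for two independent groups, assuming equal variances."""
    nx, ny = len(x), len(y)
    pooled_var = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / (nx + ny - 2)
    return (mean(x) - mean(y)) / sqrt(pooled_var * (1 / nx + 1 / ny))

# Hypothetical scores for the same four subjects before and after treatment.
t_paired = paired_t(before=[10, 12, 9, 11], after=[12, 15, 10, 14])
# Hypothetical scores from two unrelated groups.
t_ind = independent_t([1, 2, 3, 4, 5], [3, 4, 5, 6, 7])
print(t_paired, t_ind)
```

The paired statistic has n - 1 degrees of freedom (n pairs), while the pooled independent statistic has nx + ny - 2.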

Chi-Square Test

used for comparing proportions in binary or other categorical data.
compares discrete (categorical) variables as opposed to continuous variables.
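
The statistic compares observed counts with the counts expected if the row and column variables were unrelated. A minimal sketch with a hypothetical function name and table; it omits the Yates continuity correction and returns only the statistic and degrees of freedom.

```python
def chi_square_statistic(table):
    """Return (chi-square, degrees of freedom) for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if rows and columns were independent.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, dof

# Hypothetical 2x2 table: rows = exposed / unexposed, columns = injured / not injured.
chi2, dof = chi_square_statistic([[20, 30], [30, 20]])
print(chi2, dof)
```

With 1 degree of freedom the critical value at the 0.05 level is about 3.84, so a statistic of 4.0 would be considered significant for this made-up table.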

Fisher's Exact Test

used for comparing binary or other categorical data, typically when the sample size is small (for example, an expected cell count below 5).
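
For a 2x2 table the exact P value can be obtained directly from the hypergeometric distribution. The following is a sketch only: the function name and counts are hypothetical, and the two-sided P value is defined here as the sum of the probabilities of all tables that are no more likely than the observed one, which is one common convention.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided exact P value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x):
        # Probability of x in the top-left cell with all margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Small tolerance guards against floating-point ties.
    return sum(p for p in (prob(x) for x in range(lowest, highest + 1))
               if p <= p_observed * (1 + 1e-9))

# Hypothetical small study: 3 of 4 exposed injured, 1 of 4 unexposed injured.
p_value = fisher_exact_two_sided(3, 1, 1, 3)
print(round(p_value, 4))
```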

Kruskal Wallis Test

used for nonparametric testing with more than two groups.
used to compare medians of three or more independent groups where the data is not normally distributed.

Mann-Whitney Test

used to compare two groups of nonparametric data.
nonparametric statistical significance test for assessing whether the difference in medians between 2 observed distributions is statistically significant. Requires two independent samples.
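
The test is built on the U statistic, which counts how often a value from one sample exceeds a value from the other. A minimal sketch with a hypothetical function name and data; it returns the smaller U only, and the P value would come from tables or a normal approximation.

```python
def mann_whitney_u(x, y):
    """Return the smaller Mann-Whitney U for two independent samples."""
    # Each pair scores 1 if the x value is larger, 0.5 for a tie, 0 otherwise.
    u_x = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in x for b in y)
    u_y = len(x) * len(y) - u_x
    return min(u_x, u_y)

# Hypothetical scores from two independent groups.
u = mann_whitney_u([3, 5, 8], [4, 6, 7, 9])
print(u)
```

A U near zero means the two samples barely overlap; a U near half the number of pairs means they are thoroughly interleaved.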

Wilcoxon Test

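used to compare two groups of nonparametric data.
The Wilcoxon signed-rank test is used for two paired (dependent) samples; the Wilcoxon rank-sum test is used for two independent samples and is equivalent to the Mann-Whitney test.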

Simple Regression

used to determine if there is a relationship between a dependent variable and an independent variable.
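
In its simplest form this means fitting a straight line by least squares. A minimal sketch, with a hypothetical function name and made-up data that happen to lie exactly on a line:

```python
from statistics import mean

def simple_linear_regression(x, y):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept

# Hypothetical data: x = independent variable, y = dependent variable.
slope, intercept = simple_linear_regression([1, 2, 3, 4], [3, 5, 7, 9])
print(slope, intercept)
```

A slope that differs significantly from zero is the evidence for a relationship; the sketch does not compute that significance test.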

Logistic Regression

assesses the effect of one or more variables on one dichotomous variable. Dichotomous variables have only two possible values (male/female).

Bonferroni Correction

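performed to adjust for the number of comparisons and decrease the risk of committing a type I error. Performed by dividing the significance level (alpha) by the number of comparisons; each individual P value must fall below this adjusted threshold to be considered significant.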
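
A short sketch of the correction in its usual form, where the significance level is divided by the number of comparisons and each P value is judged against the adjusted threshold. The function name and P values are hypothetical.

```python
def bonferroni(p_values, alpha=0.05):
    """Return the adjusted threshold and a significance flag per comparison."""
    threshold = alpha / len(p_values)
    return threshold, [p < threshold for p in p_values]

# Hypothetical P values from three pairwise comparisons.
threshold, significant = bonferroni([0.01, 0.02, 0.04])
print(round(threshold, 4), significant)
```

All three made-up P values are below 0.05, but only the first survives the adjusted threshold of about 0.0167.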

Relative Risk

Magnitude of association between exposure to a risk factor and an injury.
Determined by dividing the incidence of injury in the exposed group by the incidence of injury in the unexposed (control) group.
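
As a worked example, relative risk is the ratio of two incidence proportions. The function name and counts below are hypothetical.

```python
def relative_risk(exposed_injured, exposed_total, control_injured, control_total):
    """Incidence in the exposed group divided by incidence in the control group."""
    incidence_exposed = exposed_injured / exposed_total
    incidence_control = control_injured / control_total
    return incidence_exposed / incidence_control

# Hypothetical cohort: 30 of 100 exposed injured vs. 10 of 100 controls.
rr = relative_risk(30, 100, 10, 100)
print(round(rr, 2))
```

A relative risk of 1 means no association; here the exposed group is about three times as likely to be injured.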

Absolute Risk

the arithmetic difference in the rate of adverse outcomes between control and experimental subjects.
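
A worked example of that difference, using a hypothetical function name and made-up trial counts:

```python
def absolute_risk_difference(control_events, control_n, treated_events, treated_n):
    """Rate of adverse outcomes in controls minus the rate in treated subjects."""
    return control_events / control_n - treated_events / treated_n

# Hypothetical trial: 20 of 100 controls vs. 12 of 100 treated subjects had the outcome.
diff = absolute_risk_difference(20, 100, 12, 100)
print(round(diff, 2))
```

Unlike relative risk, this figure depends on how common the outcome is: the same ratio of rates gives a much smaller absolute difference when the outcome is rare.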

Incidence

the proportion of new cases within a specific time interval

Odds Ratio

compares the odds of exposure among those with a specific outcome (study group) to the odds of exposure among those without that outcome (control group).
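
Reading the definition as the odds of exposure among cases divided by the odds of exposure among controls, the calculation can be sketched as follows. The function name and counts are hypothetical.

```python
def odds_ratio(cases_exposed, cases_unexposed, controls_exposed, controls_unexposed):
    """Odds of exposure among cases divided by odds of exposure among controls."""
    odds_cases = cases_exposed / cases_unexposed
    odds_controls = controls_exposed / controls_unexposed
    return odds_cases / odds_controls

# Hypothetical case-control study: 40 of 100 cases and 20 of 100 controls were exposed.
or_value = odds_ratio(40, 60, 20, 80)
print(round(or_value, 2))
```

An odds ratio of 1 indicates no association between exposure and outcome; values above 1 indicate that exposure is more common among cases.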

Prevalence

the proportion of individuals with a disease or condition at a single point in time.

Confidence Interval

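range of values, estimated from the sample, that is expected to contain the true population value with a stated level of confidence.
95% confidence interval = mean +/- 1.96 * (SEM). SEM = standard error of the mean = standard deviation / square root of n.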
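
A minimal sketch of the calculation, with a hypothetical function name and data. The 1.96 multiplier comes from the normal distribution and is an approximation; for a sample as small as the one below, a multiplier from the t distribution would normally be used instead.

```python
from math import sqrt
from statistics import mean, stdev

def ci95(values):
    """Approximate 95% confidence interval for the mean (normal multiplier)."""
    m = mean(values)
    sem = stdev(values) / sqrt(len(values))  # standard error of the mean
    return m - 1.96 * sem, m + 1.96 * sem

# Hypothetical measurements.
lower, upper = ci95([4, 6, 8, 10, 12])
print(round(lower, 3), round(upper, 3))
```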

Scheffe comparison

post hoc test that assesses group differences following ANOVA

Newman-Keuls

post hoc test that assesses group differences following ANOVA

Correlation Analysis (r value)

provides a unitless number that summarizes the strength of association between 2 variables.
The closer the value is to 1 or -1, the stronger the association.
A positive number means the values change in the same direction.
A negative number means the values are inversely related.
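
The r value referred to here is usually the Pearson coefficient, which can be sketched as below. The function name and data are hypothetical.

```python
from math import sqrt
from statistics import mean

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical paired measurements.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r = pearson_r(x, y)
r_inverse = pearson_r(x, [-v for v in y])  # flipping one variable flips the sign
print(round(r, 3), round(r_inverse, 3))
```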

R²

the coefficient of determination
represents the proportion of the variance in the dependent variable that is explained by the independent variable(s)
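
Taking R² as the share of variance in the dependent variable accounted for by a fitted line, it can be computed as one minus the ratio of residual to total variation. A sketch for simple linear regression, with a hypothetical function name and data:

```python
from statistics import mean

def r_squared(x, y):
    """R-squared for a least-squares line fitted to x (independent) and y (dependent)."""
    x_bar, y_bar = mean(x), mean(y)
    slope = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
             / sum((xi - x_bar) ** 2 for xi in x))
    intercept = y_bar - slope * x_bar
    # Variation left unexplained by the line vs. total variation in y.
    ss_residual = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_total = sum((yi - y_bar) ** 2 for yi in y)
    return 1 - ss_residual / ss_total

# Hypothetical paired measurements.
r2 = r_squared([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(round(r2, 2))
```

For a single independent variable this equals the square of the correlation coefficient r, so 60% of the variance in these made-up y values is explained by x.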

Variance

=standard deviation squared

Poisson Regression

analysis in which the dependent variable in an experiment or observational study is a count that follows the Poisson distribution.

Analysis of Covariance

tests for equality among group means while adjusting for a covariate, i.e. when the value of the dependent variable is affected by additional information related to the independent variable.

Validity

degree to which the measurement represents a true value

Reliability

ability of researchers to reproduce or repeat the same measurements.

Tests for Interobserver reliability

kappa coefficient, weighted kappa, Cronbach's alpha
Pearson product-moment correlation for continuous variables
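
To illustrate the kappa coefficient: it measures agreement between two observers beyond what chance alone would produce. The sketch below computes unweighted (Cohen's) kappa from an agreement table; the function name and counts are hypothetical.

```python
def cohens_kappa(table):
    """Unweighted kappa for a square table: rows = observer A, columns = observer B."""
    n = sum(sum(row) for row in table)
    # Proportion of cases on the diagonal, where both observers agree.
    observed = sum(table[i][i] for i in range(len(table))) / n
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    # Agreement expected by chance from each observer's marginal rates.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / n ** 2
    return (observed - expected) / (1 - expected)

# Hypothetical ratings of 50 radiographs as positive / negative by two observers.
kappa = cohens_kappa([[20, 5], [10, 15]])
print(round(kappa, 2))
```

The observers agree on 70% of cases, but half of that agreement is expected by chance, so kappa is only 0.4.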

Descriptive Study

demonstrate associations between disease and variables, but do not demonstrate cause-effect relationships.
Case report / case series
Correlational Study: large sample study identifying associations between disease and variables.
Cross sectional study: takes a snapshot of a population and derives associations with disease. Does not determine cause and effect.

Analytical Study

allow for hypothesis testing and statistical analysis
Cohort study: condition is not manipulated but observed and recorded. Can be retrospective or prospective
Case-Control study: observational study with participants selected based on disease status (cases) and compared with controls who do not have the disease. Beneficial for rare diseases. Cannot be used for rare exposures, cannot directly measure incidence, subject to selection and recall bias.
Prospective Cohort study: minimizes potential bias and increases the strength of conclusions.

Intervention Study

clinical trials
prospective, randomized, single vs double blinded
Reduces bias and confounding

Selection Bias: study error resulting when comparisons are made between groups that differ in important ways other than the factor under consideration

Measurement Bias: study error resulting when quantitative or qualitative data collected from the treatment groups differ. Generally found in retrospective studies.

Sampling Bias: study error which occurs when patients in the study differ systematically from the population to which the results are generalized.

Publication Bias: research error which occurs when published studies differ systematically from unpublished studies, which may be of higher quality but do not show as great a statistical significance.

References:

Wojtys EM, AJSM 1996;24:564
Kuhn JE, AJSM 1996;24:702
Greenfield ML, Wojtys EM, Kuhn JE. A statistics primer. Tests for continuous data. Am J Sports Med. 1997 Nov-Dec;25(6):882-4.
Kuhn JE, Greenfield ML, Wojtys EM. A statistics primer. Statistical tests for discrete data. Am J Sports Med. 1997 Jul-Aug;25(4):585-6.
Kocher MS, JBJS 2004;86A:607