Concept: Measures the strength and direction of the relationship between two continuous variables (rank-based methods also accept ordinal data).
Warning: Correlation does not imply causation. A strong relationship between two variables does not mean one causes the other.
1. Pearson (r):
- Best for: Linear relationships (straight line), normally distributed data.
- Sensitive to: Outliers.
- Returns: R-squared (R²) = r squared, the proportion of variance explained by a linear fit.
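A minimal sketch of how Pearson's r and R-squared relate, using only the Python standard library. The helper name `pearson_r` and the data are made up for illustration. In practice a library routine such as `scipy.stats.pearsonr` would normally be used instead.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products of deviations, and sums of squared deviations
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Made-up illustrative data: y is roughly 2 * x
x = [1, 2, 3, 4, 5, 6]
y = [2.1, 3.9, 6.2, 8.1, 9.8, 12.2]

r = pearson_r(x, y)
r_squared = r ** 2  # proportion of variance explained by a straight-line fit
print(f"r = {r:.3f}, R-squared = {r_squared:.3f}")

# Replacing a single point with an extreme value changes r substantially
y_outlier = y[:-1] + [-20.0]
r_outlier = pearson_r(x, y_outlier)
print(f"r with one outlier = {r_outlier:.3f}")
```

The second call illustrates the outlier sensitivity noted above: one extreme point is enough to flip the sign of r for this small sample.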
2. Spearman (rho) & Kendall (tau):
- Best for: Monotonic relationships, non-normal data, or ranks.
- More robust to: Outliers (ranks limit the influence of extreme values).
- Kendall's Tau is often preferred for small datasets with many tied ranks.
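The rank-based coefficients can be sketched as follows, assuming the textbook definitions: Spearman's rho is Pearson's r computed on ranks, and Kendall's tau compares concordant and discordant pairs. All function names are hypothetical. The Kendall version here is tau-a, which has no tie correction. For tied data a tau-b implementation is the usual choice (for example, `scipy.stats.kendalltau` computes tau-b by default).

```python
import math
from itertools import combinations

def pearson_r(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def average_ranks(values):
    """Rank values 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]]
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman's rho: Pearson's r applied to the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))

def kendall_tau_a(x, y):
    """Kendall's tau-a: (concordant - discordant) / number of pairs."""
    concordant = discordant = 0
    for i, j in combinations(range(len(x)), 2):
        s = (x[i] - x[j]) * (y[i] - y[j])
        if s > 0:
            concordant += 1
        elif s < 0:
            discordant += 1
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs

# Made-up data: monotonic but strongly non-linear (exponential growth)
x = [1, 2, 3, 4, 5, 6]
y = [math.exp(v) for v in x]

r = pearson_r(x, y)
rho = spearman_rho(x, y)
tau = kendall_tau_a(x, y)
print(f"Pearson r = {r:.3f}, Spearman rho = {rho:.3f}, Kendall tau = {tau:.3f}")
```

Because y always increases with x, rho and tau both reach 1, while Pearson's r stays below 1 because the relationship is not a straight line.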
Interpretation of Coefficient (r, rho, or tau):
- +1.0: Perfect Positive (As X goes up, Y goes up).
- -1.0: Perfect Negative (As X goes up, Y goes down).
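- 0.0: No linear (Pearson) or monotonic (Spearman, Kendall) relationship. Other patterns, such as a U-shape, can still be present.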
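A short illustration of the three reference values, using made-up data and a hypothetical `pearson_r` helper. The last case shows that a coefficient of 0 does not rule out a relationship: y depends entirely on x, but the pattern is a symmetric U-shape rather than a line.

```python
import math

def pearson_r(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

x = [-3, -2, -1, 0, 1, 2, 3]

rising = [2 * v + 1 for v in x]     # as X goes up, Y goes up
falling = [-2 * v + 1 for v in x]   # as X goes up, Y goes down
u_shape = [v ** 2 for v in x]       # Y is fully determined by X, but not linearly

r_rising = pearson_r(x, rising)
r_falling = pearson_r(x, falling)
r_u_shape = pearson_r(x, u_shape)

print(f"rising:  r = {r_rising:+.2f}")
print(f"falling: r = {r_falling:+.2f}")
print(f"u-shape: r = {r_u_shape:+.2f}")
```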
Strength Guidelines (absolute value of the coefficient):
- 0.9 - 1.0: Very Strong
- 0.7 - 0.9: Strong
- 0.5 - 0.7: Moderate
- 0.3 - 0.5: Weak
- < 0.3: Very Weak/Negligible
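The guideline bands can be expressed as a small lookup, sketched below. The function name `strength_label` is hypothetical. Since adjacent bands share a boundary value (for example 0.7), this sketch assumes each boundary belongs to the stronger band, and it applies the bands to the magnitude of the coefficient so that negative values are labelled the same way.

```python
def strength_label(coef):
    """Map a correlation coefficient to a guideline label based on its magnitude."""
    if not -1.0 <= coef <= 1.0:
        raise ValueError("correlation coefficient must lie in [-1, 1]")
    size = abs(coef)  # the sign gives direction, not strength
    # Assumption: boundary values fall into the stronger band
    if size >= 0.9:
        return "Very Strong"
    if size >= 0.7:
        return "Strong"
    if size >= 0.5:
        return "Moderate"
    if size >= 0.3:
        return "Weak"
    return "Very Weak/Negligible"

for value in (0.95, -0.95, 0.7, 0.45, 0.1):
    print(f"{value:+.2f}: {strength_label(value)}")
```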
Confidence Intervals (95% CI):
- Shows the range where the true correlation likely falls
- Wider CI = less precise estimate (usually with small samples)
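One common way to obtain an approximate confidence interval for Pearson's r is the Fisher z-transformation, sketched below. This is an assumption about method, not a statement of how any particular tool computes its interval. It relies on roughly bivariate-normal data, and the function name `pearson_ci` is hypothetical.

```python
import math
from statistics import NormalDist

def pearson_ci(r, n, confidence=0.95):
    """Approximate CI for Pearson's r via the Fisher z-transformation."""
    if n <= 3:
        raise ValueError("need more than 3 observations")
    if not -1.0 < r < 1.0:
        raise ValueError("r must be strictly between -1 and 1")
    z = math.atanh(r)                  # Fisher transform of r
    se = 1.0 / math.sqrt(n - 3)        # approximate standard error of z
    z_crit = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    # Build the interval on the z scale, then transform back to the r scale
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)

# Same observed r, two different sample sizes
small = pearson_ci(0.6, n=10)
large = pearson_ci(0.6, n=100)
print(f"n = 10:  ({small[0]:.2f}, {small[1]:.2f})")
print(f"n = 100: ({large[0]:.2f}, {large[1]:.2f})")
```

With the same r, the interval for n = 10 is much wider than for n = 100 and even includes 0, which is the small-sample imprecision described above.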