# Chapter 5

**CORRELATIONAL RESEARCH**

## Research Design

In general, a *correlational* study is a quantitative method of research in which you have two or more quantitative variables from the same group of subjects, and you are trying to determine whether there is a relationship (or covariation) between the variables (a similarity between them, not a difference between their means). Theoretically, any two quantitative variables can be correlated as long as you have scores on these variables from the same participants; however, it is probably a waste of time to collect and analyze data when there is little reason to think the two variables would be related to each other. Try to have 30 or more participants; this helps increase the validity of the research.

Your hypothesis might be that there is a positive correlation (for example, between the number of hours of study and your midterm exam scores), or a negative correlation (for example, between your levels of stress and your exam scores). A perfect correlation would be r = +1.0 or r = -1.0, while no correlation would be r = 0. Perfect correlations almost never occur; expect to see correlations much smaller in magnitude than + or - 1.0. Although correlation can't prove a causal relationship, it can be used for prediction, to support a theory, to measure test-retest reliability, etc.

*Correlational* research attempts to determine whether, and to what degree, a relationship exists between two or more variables. The purpose of a *correlational* study is either to establish a relationship (or lack of it) or to use relationships to make predictions. A correlation is a quantitative measure of the degree of correspondence between two or more variables. For example, a college admissions director might be interested in answering the question “How does the performance of high school seniors on the SAT correspond to their first semester college grades?” Is there a high correlation between the two variables, suggesting that SAT scores might be useful in predicting how students will perform in their freshman year of college? Or is there a low correlation between the two variables, suggesting that SAT scores likely will not be useful in predicting freshman performance? The degree of correspondence between variables is measured by a correlation coefficient, which is a number between –1.00 and +1.00. Two variables that are not related will have a correlation coefficient near .00. Two variables that are highly related will have a correlation coefficient near –1.00 or +1.00. A coefficient that is positive means that as one variable increases, the other variable also increases. A coefficient that is negative means that when one variable increases, the other variable decreases. Since very few pairs of variables are perfectly correlated, predictions based on them are also rarely perfect. Although correlations do not indicate a cause-effect relationship, for many decisions, predictions based on known relationships can be useful.

At a minimum, correlational research requires information about at least two variables obtained from a single group of people. In larger *correlational* studies, a number of variables believed to be related to a complex concept such as achievement may be examined. Variables found not to be highly correlated with achievement would be eliminated from further consideration, while variables that were highly correlated with achievement might prompt further examination.

It is very important to note that *correlational* studies do not establish causal relations between variables. Other, more powerful methods are needed to establish cause-effect relationships. Thus, the fact that there is a high correlation between self-concept and achievement does not imply that self-concept “causes” achievement or that achievement “causes” self-concept. The correlation only indicates that students with higher self-concepts tend to have higher levels of achievement and that students with lower self-concepts tend to have lower levels of achievement. Without additional data, we cannot conclude that one variable is the cause of the other. There may be a third factor, such as the amount of encouragement and support parents give their children, that underlies both variables and influences high or low achievement and self-concept. The important point to remember is that *correlational* research never establishes cause-effect links between variables.

The following are examples of *correlational* studies:

1. *The correlation between intelligence and self-esteem*. Scores on an intelligence test and on a measure of self-esteem would be acquired from each member of a given group. The two sets of scores would be correlated and the resulting coefficient would indicate the degree of relationship.

2. *The relationship between anxiety and achievement*. Scores on an anxiety scale and on an achievement test would be correlated and the resulting coefficient would indicate the degree of relationship.

3. *Use of an aptitude test to predict success in an algebra course*. Scores on an algebra aptitude test would be correlated with success in algebra measured by algebra final exam scores. If the resulting correlation were high, the aptitude test might be a good predictor of grades in algebra.

*Correlational research* is sometimes treated as a type of descriptive research, primarily because it does describe an existing condition. However, the condition it describes is distinctly different from the conditions typically described in survey or observational studies. Correlational research involves collecting data in order to determine whether, and to what degree, a relationship exists between two or more quantifiable variables. The degree of relationship is expressed as a correlation coefficient. If a relationship exists between two variables, it means that scores within a certain range on one variable are associated with scores within a certain range on the other variable. For example, there is a relationship between intelligence and academic achievement; persons who score high on intelligence tests tend to have high grade point averages, and persons who score low on intelligence tests tend to have low grade point averages.

The purpose of correlational study is to determine relationships between variables or to use these relationships to make predictions. Correlational studies typically investigate a number of variables believed to be related to a major, complex variable, such as achievement. Variables found not to be highly related to achievement will be dropped from further examination, while variables that are highly related to achievement may be examined in causal-comparative or experimental studies to determine the nature of relationships.

**Participant and Instrument Selection**

The sample for a correlational study is selected using an acceptable sampling method, and 30 participants are generally considered to be a minimal acceptable sample size. There are, however, some factors that influence the size of the sample. The higher the validity and reliability of the variables to be correlated, the smaller the sample can be, but not fewer than 30. If validity and reliability are low, a larger sample is needed, because errors of measurement may mask the true relationship. As in any study, it is important to select or develop valid and reliable measures of the variables being studied.

**Data Collection**

You may collect your data through testing (e.g. scores on a knowledge test such as an exam or math test, or on psychological tests) or through numerical responses on surveys and questionnaires. Even archival data can be used (e.g. kindergarten grades) as long as it is in numerical form.

**Design and Procedure**

The basic *correlational* research design is not complicated. Two (or more) scores are obtained for each member of the sample, one score for each variable of interest, and the paired scores are then correlated. The result is expressed as a correlation coefficient that indicates the degree of relationship between the two variables. Different studies investigate different numbers of variables, and some utilize complex statistical procedures, but the basic design is similar in all correlational studies.

**Data Analysis and Interpretation**

With a program such as Excel, a correlation is probably the easiest statistic to calculate. In Excel, set up three columns: Subject #, Variable 1 (e.g. hours of study), and Variable 2 (e.g. exam scores). Then enter your data in these columns. Select a cell for the correlation to appear in and label it. Click "**fx**" on the toolbar at the top, then "**statistical**", then "**Pearson**". When asked, highlight in turn each of the two columns of data, click "Finish", and your correlation will appear. Tables of critical values in any statistics textbook can tell you whether the correlation is significant, considering the number of participants.
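For readers who prefer a script to a spreadsheet, the same calculation can be sketched in Python. This is a minimal illustration, not a replacement for the Excel steps: the function names `pearson_r` and `t_statistic` and the six pairs of scores are assumptions made up for the example. The t value it prints is the quantity that is usually compared against a textbook table of critical values, using n - 2 degrees of freedom.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation for two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least 2 scores")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products of deviations, and sums of squared deviations.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable has no variance")
    return sxy / math.sqrt(sxx * syy)

def t_statistic(r, n):
    """t value for testing r against zero, with n - 2 degrees of freedom.

    Not defined for a perfect correlation (r = +1 or -1).
    """
    return r * math.sqrt((n - 2) / (1 - r ** 2))

# Hypothetical data for illustration only: hours of study and exam scores.
hours = [2, 4, 5, 7, 8, 10]
scores = [55, 60, 58, 72, 75, 85]

r = pearson_r(hours, scores)
print(f"r = {r:.3f}, t = {t_statistic(r, len(hours)):.3f}, df = {len(hours) - 2}")
```

Six participants are used only to keep the example short; a real study would follow the guideline of at least 30.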

In a *correlational* study, the scores for one variable are correlated with the scores for another variable. If a number of variables are to be correlated with a particular variable of primary interest, each of the variables would be correlated with the variable of primary interest. Each correlation coefficient represents the relationship between a particular variable and the variable of primary interest. The end result of data analysis is a number of correlation coefficients, ranging from -1.00 to +1.00.

There are a number of different methods of computing a correlation coefficient. Which one is appropriate depends on the type of data represented by each variable. The most commonly used technique is the product moment correlation coefficient, usually referred to as the *Pearson r*. The *Pearson r* is used when both variables to be correlated are expressed as continuous data such as ratio or interval data. Since most instruments used in education, such as achievement measures and personality measures, are treated as yielding interval data, the *Pearson r* is usually the appropriate coefficient for determining relationship. Further, since the *Pearson r* results in the most precise estimate of correlation, its use is preferred even when other methods may be applied.

If the data for a variable are expressed as rank or ordinal data, the appropriate correlation coefficient to use is the rank difference correlation, usually referred to as the *Spearman rho*. Rank data are used when participants are arranged in order of scores and each participant is assigned a rank from 1 to however many participants there are. For a group of 30 participants, for example, the participant with the highest score would be assigned a rank of 1, the participant with the second highest score 2, and the participant with the lowest score 30. If two participants have the same score, their ranks are averaged. Thus, if two participants have the same highest score, each would be assigned the average of rank 1 and rank 2, namely 1.5. If only one of the variables to be correlated is in rank order, such as class standing at the time of graduation, the other variable or variables to be correlated with it must also be expressed in terms of ranks in order to use the *Spearman rho* technique.
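The ranking rule described above, including the averaging of tied ranks, can be sketched in Python as follows. The function names `average_ranks` and `spearman_rho` are hypothetical, and the sketch makes one stated assumption: it computes rho as a Pearson coefficient applied to the two sets of ranks, which is a common way of handling ties rather than necessarily the hand-calculation formula a given textbook presents.

```python
import math

def average_ranks(scores):
    """Rank scores from highest (rank 1) to lowest, averaging the ranks of ties."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    ranks = [0.0] * len(scores)
    pos = 0
    while pos < len(order):
        # Find the run of tied scores starting at this position.
        end = pos
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[pos]]:
            end += 1
        # Positions pos..end hold ranks pos+1..end+1; each tied score gets their mean.
        shared = (pos + 1 + end + 1) / 2
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks

def spearman_rho(x, y):
    """Spearman rho computed as the Pearson r of the two sets of ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    if sxx == 0 or syy == 0:
        raise ValueError("rho is undefined when every score on a variable is tied")
    return sxy / math.sqrt(sxx * syy)

print(average_ranks([95, 95, 80, 70]))  # the two tied top scores share rank 1.5
```

Because both variables are converted to ranks inside `spearman_rho`, a variable that is already in rank order (such as class standing) can be passed in alongside raw scores on another variable, provided both are ordered in the same direction.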

**The Pearson r**

The formula for the *Pearson r* looks very, very complicated, but it is not. It looks tough because it has a lot of pieces, but each piece is quite simple to calculate. To calculate a correlation, including a *Pearson r*, we need two sets of scores.
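The chapter does not print the formula itself. One standard raw-score form, given here as an assumption about which version is meant, uses X and Y for the two sets of scores and N for the number of participants (these symbols are not taken from the text):

$$
r = \frac{N\sum XY - \left(\sum X\right)\left(\sum Y\right)}{\sqrt{\left[N\sum X^{2} - \left(\sum X\right)^{2}\right]\left[N\sum Y^{2} - \left(\sum Y\right)^{2}\right]}}
$$

The pieces are the sum of each set of scores, the sum of each set of squared scores, the sum of the products of paired scores, and N. Each of these is a simple running total, which is why the formula is easier to work through than it looks.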

**Things to remember …**

**------------------------------------------------------------------- **

**Two Types of Correlation**

**Positive correlation:** A *positive correlation* between two variables means that both variables change in the same direction (either both increase or both decrease). For example, if GPAs increase as SAT scores increase, there is a positive correlation between SAT scores and GPAs.

**Negative (inverse) correlation:** A *negative correlation* between two variables means that as one variable increases, the other variable decreases. In other words, the variables change in opposite directions. So, if GPAs decrease as SAT scores increase, there is a negative correlation between SAT scores and GPAs.

Tue, 10 May 2011 @23:25