| EES | #1 - #2 - #3 |
| Household | People / HH | Land (hectares) |
| 1 | 7 | 2 |
| 2 | 13 | 1.5 |
| 3 | 6 | 2 |
| 4 | 4 | 1.8 |
| 5 | 8 | 3.5 |
| 6 | 10 | 2.5 |
| 7 | 5 | 1.8 |
| 8 | 9 | 1 |
| 9 | 2 | 1.2 |
| 10 | 6 | 1.9 |
Look at the graphs in Figure 10. What you see is called a scatterplot (also scatter diagram or scattergram). A scatterplot is simply a graph of many individual data points located in a coordinate system. The coordinate system is usually made up of two axes that intersect at a right angle. You can think of the axes as rulers whose scale depends on whatever is being measured along them. Each point is then placed in the coordinate system according to its values in the x- (horizontal) and y- (vertical) directions (in our example above, this would mean plotting the values from the first data column along the x-axis and the values from the second data column along the y-axis).
Figure 10 gives an example. In the scatterplot on the left, each point has a value for population and another for total area of deforestation in 1978. In the scatterplot on the right, each point has a value for population density and another for the deforestation rate between 1975 and 1978. That could mean that someone measured both of these variables in, say, one area of the Amazon, wrote down the two values, then went on to a different location and measured the two quantities there, and so on. Together, the two values determine unambiguously where that point falls in the coordinate system.
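To make the placement rule concrete, here is a minimal Python sketch that pairs the two data columns of the household table above into (x, y) points and prints a rough text version of a scatterplot. The function name `text_scatterplot` and the 0.5-hectare row height are assumptions made for illustration only; a real scatterplot would normally be drawn on graph paper or with plotting software.

```python
# Values copied from the household table above.
people_per_household = [7, 13, 6, 4, 8, 10, 5, 9, 2, 6]          # x-values
land_hectares = [2, 1.5, 2, 1.8, 3.5, 2.5, 1.8, 1, 1.2, 1.9]     # y-values


def text_scatterplot(xs, ys, y_step=0.5):
    """Return a rough text scatterplot.

    Each text row covers a band of y-values that is y_step high;
    each character column stands for one whole unit of x.
    Assumes the x-values are non-negative whole numbers.
    """
    x_max = max(xs)
    top_band = int(max(ys) / y_step)
    rows = []
    for band in range(top_band, -1, -1):          # highest y-band first
        cells = [" "] * (x_max + 1)
        for x, y in zip(xs, ys):
            if int(y / y_step) == band:
                cells[x] = "*"                    # mark this data point
        rows.append(f"{band * y_step:4.1f} |" + "".join(cells))
    rows.append("     +" + "-" * (x_max + 1))     # the x-axis
    return "\n".join(rows)


# Each household becomes one point: (people, land).
points = list(zip(people_per_household, land_hectares))
print(points[0])   # (7, 2): household 1 sits at x = 7, y = 2
print(text_scatterplot(people_per_household, land_hectares))
```

Each `*` marks one household. Two households would share a mark only if they had the same x-value and fell into the same y-band, which does not happen with these ten values.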
Now, why would you want to do that? Usually, you would construct a scatterplot when you have a lot of data and would like to find out whether there is any kind of relationship between the two variables that you measured. Note that at this point we don't really care what kind of relationship that might be, just whether there is one or not. How could you tell?
The scatterplots above might remind you of bugs on a windshield; they look like a rather chaotic, unordered assemblage of points. In the scatterplots in Figure 11, things look a little more orderly: on the left, you can see that as values of population get larger, the values of land under permanent crops tend to get larger as well. In the scatterplot on the right, as values of Gross Domestic Product increase, those of permanent forest losses tend to decrease.
This sort of relationship is called a correlation: increases in one variable tend to go along with increases or decreases in the other variable. You can tell that this is so from the shape of the "cloud" formed by the data points. So think about what it would mean if the "cloud" were made up of rather dispersed points, as opposed to stretching out as a dense mass that almost forms a line. And if two variables were perfectly correlated, what would that scatterplot look like? Think about these questions and then discuss them with your neighbor. When you feel you have answers, note them down below.
______________________________________________________________________________
______________________________________________________________________________
______________________________________________________________________________
______________________________________________________________________________
______________________________________________________________________________
_______________________________________________________________________________
_______________________________________________________________________________
_______________________________________________________________________________
________________________________________________________________________________
________________________________________________________________________________
________________________________________________________________________________
________________________________________________________________________________
Mathematically, the strength of this kind of relationship is expressed by the correlation coefficient. The closer the coefficient is to 1 (positively correlated) or -1 (negatively correlated), the stronger the correlation. The closer the coefficient is to 0, the weaker the correlation. In Figure 11, the graph on the left shows a positive correlation, the one on the right a negative correlation.
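As an illustration of how such a coefficient can be computed, the sketch below assumes that the Pearson product-moment coefficient is meant (the text does not name a specific formula). The function name `pearson_r` is made up for this example, and the data are the two columns of the household table above.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How the two variables vary together, and how each varies on its own.
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / sqrt(s_xx * s_yy)


# Values copied from the household table above.
people_per_household = [7, 13, 6, 4, 8, 10, 5, 9, 2, 6]
land_hectares = [2, 1.5, 2, 1.8, 3.5, 2.5, 1.8, 1, 1.2, 1.9]

r_households = pearson_r(people_per_household, land_hectares)
print(round(r_households, 2))            # 0.16, close to 0

# A made-up pair of lists in which y is always exactly twice x.
r_perfect = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])
print(r_perfect)                         # 1.0
```

For the ten households, the coefficient is close to 0, so household size and land holding are only weakly related in this small sample.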
So what is the general rule as to how values change concurrently if they are (a) not at all correlated, (b) positively correlated, and (c) negatively correlated?
A No Correlation -- _________________________________________________________ ______________________________________________________________________________
B Positive Correlation --______________________________________________________ ______________________________________________________________________________
C Negative Correlation --_____________________________________________________ ______________________________________________________________________________
| Linear means that the actual difference between two neighboring points on a scale is the same everywhere on that scale. Between points 1 and 2 there is a difference of 1, and the same is true between points 101 and 102.
Logarithmic, by contrast, means that each unit step along the scale multiplies the actual value by ten, so the actual difference between neighboring points grows tenfold from one unit to the next. If point 0 stands for an actual value of 1, then point 1 stands for 10, point 2 for 100, point 3 for 1000, and so on. If you have a coordinate system with one axis having a linear and the other a logarithmic scale, the graph is called a semi-log graph. |
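The contrast between the two kinds of scale can be checked with a few lines of Python. This is only a sketch: the function names are invented for the example, and it assumes a base-10 logarithmic scale on which position 0 stands for the value 1.

```python
def linear_value(position):
    """On a linear scale, the value is simply the position."""
    return position


def log_value(position):
    """On a base-10 logarithmic scale, each step multiplies the value by 10."""
    return 10 ** position


positions = [0, 1, 2, 3]

# Step size between neighboring positions on the linear scale: always 1.
linear_steps = [linear_value(p + 1) - linear_value(p) for p in positions]

# Actual values at the same positions on the logarithmic scale.
log_values = [log_value(p) for p in positions]

# Factor by which the value grows from one position to the next: always 10.
log_factors = [log_value(p + 1) // log_value(p) for p in positions]

print(linear_steps)   # [1, 1, 1, 1]
print(log_values)     # [1, 10, 100, 1000]
print(log_factors)    # [10, 10, 10, 10]
```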
| Area | X | Y |
| World | 398 | 0.28 |
| Africa | 212 | 0.3 |
| N/C America | 197 | 0.65 |
| S. America | 166 | 0.49 |
| Asia | 1139 | 0.15 |
| Europe | 1050 | 0.28 |
| USSR | 128 | 0.81 |
| Oceania | 33 | 1.87 |
| Cote d'Ivoire | 380 | 0.3 |
| Nigeria | 1199 | 0.29 |
| Costa Rica | 576 | 0.18 |
| Mexico | 454 | 0.28 |
| Bolivia | 66 | 0.48 |
| Brazil | 174 | 0.53 |
| China | 1201 | 0.09 |
| India | 2811 | 0.20 |
(1) Population density = people per 1000 ha (in 1989)
(2) In 1989
Source: Extracted from Sage (1994: 280; his Table 2), after World Resources Institute (1990).
When you're finished plotting all the data points, what do you find? Does the "point cloud" indicate any kind of relationship between the two variables? If it does, imagine a straight line drawn right into the cloud that would best represent the shape of the "cloud." For example, if you find that -- generally speaking -- x- and y-values increase concurrently (i.e., they are positively correlated), then draw a straight line with a ruler through the middle of the cloud (beginning somewhere in the lower left and pointing toward the upper right end of the cloud). Note that you don't have to try to intersect all plotted points, although some points might fall right on the line. If the correlation is not perfect, it is simply impossible for all points to fall on a single line. But "eyeball" it such that the line comes closest to as many points as possible.
Try now to draw this line in the graph. Have it intersect the y-axis.
The line you just drew is called a regression line, and usually
one finds it not by "eyeballing" but through calculations. The result of
these calculations would be an equation that defines the y-intercept and
the slope of the line, the two things you need in order to accurately determine
where to draw the line. The general form of that equation looks like this,
which is the equation for a straight line:
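| y = a + bx |
Here a is the y-intercept (the value of y where the line crosses the y-axis) and b is the slope of the line.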
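One common way to carry out these calculations is ordinary least squares, which picks the line that makes the sum of the squared vertical distances between the points and the line as small as possible. The sketch below assumes that method (the text does not name one) and applies it to the household table from the beginning of this activity. The function name `least_squares_line` is invented for the example.

```python
def least_squares_line(xs, ys):
    """Return (intercept, slope) of the ordinary least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    slope = s_xy / s_xx
    # The least-squares line always passes through the point (mean_x, mean_y).
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Values copied from the household table.
people_per_household = [7, 13, 6, 4, 8, 10, 5, 9, 2, 6]          # x
land_hectares = [2, 1.5, 2, 1.8, 3.5, 2.5, 1.8, 1, 1.2, 1.9]     # y

intercept, slope = least_squares_line(people_per_household, land_hectares)
print(round(intercept, 2))   # 1.68
print(round(slope, 3))       # 0.034
```

Read as a line, this says: land (hectares) is roughly 1.68 + 0.034 times the number of people in the household. The slope is small, which fits the loose, scattered look of these ten points.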
| Variable | 1961-1965 (1) | 1970 | 1975 | 1980 | 1985 | 1991 |
| Population (2) | 3,288,510 | 3,694,334 | 4,076,906 | 4,449,520 | 4,916,419 | 5,295,000 (4) |
| Arable land (3) | 1,315,212 | 1,319,036 | 1,335,739 | 1,356,170 | 1,375,736 | 1,346,988 |
| Permanent crops | 78,555 | 89,328 | 94,247 | 99,323 | 100,747 | 94,584 |
| Permanent pasture | 3,044,258 | 3,175,222 | 3,191,218 | 3,178,314 | 3,170,822 | 3,357,520 |
| Forest | 4,169,369 | 4,190,664 | 4,169,629 | 4,111,910 | 4,086,636 | 3,861,081 |
Sources: Extracted from Young, S. et al. 1991. Appendix: Global land use/cover: Assessment of data and some general relationships. Report to the Land Use Working Group, Committee for Research on Global Environmental Change, SSRC. Data originally derived from the FAO Production Yearbooks. Data in the last column are from FAO. 1992. FAO Production Yearbook 1991. New York: United Nations; and FAO. 1992. UN Demographic Yearbook. New York: United Nations.
If possible, use a global and a regional or local example, and compare and contrast what you find through regression analysis. Is the relationship apparent at both scales? Is it stronger at one scale than at the other? Why could that be? Be cautious in interpreting your findings, remembering the quality of your data. (The Rudel article is a nice example of such a careful analysis and interpretation, but note some of the comments on Rudel's work in the Background Information of Unit 3.)
Report your findings with graphs, regression equations, and interpretation
in a 3-5 page essay. Alternatively, create a poster that you would display
at a conference or another public place where you would want to teach people
about these land use change issues at different scales.