Problem 1: From our Moodle site, run the lines provided at the very bottom of the code that works with the skull data, and compare the result to the output on pages 65 and 66. Run the command two more times, using Euclidean and Mahalanobis distances. Put all three distance matrices into your lab report, rounded to three significant figures. (1a.) Provide a paragraph of commentary on the distances with respect to the data. (1b.) Carefully look at the provided R code. Tell me which two commands are new to you. Research these two new commands and, in your own words, explain what they do.
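As a rough orientation for the Euclidean and Mahalanobis runs, the sketch below computes both kinds of distance between the rows of a small matrix using only base R (dist() and mahalanobis()) and rounds the results with signif(). The group means and the covariance matrix are invented values, not the skull data, and the distmatrix() function from the course code may well compute its distances differently.

```r
# Hypothetical group means (rows = groups, columns = variables).
# These values are invented for illustration; they are not the skull data.
means <- rbind(g1 = c(131.0, 133.0),
               g2 = c(132.0, 132.0),
               g3 = c(134.0, 130.0))

# Hypothetical pooled covariance matrix (also invented).
S <- matrix(c(20,  2,
               2, 25), nrow = 2, byrow = TRUE)

# Euclidean distances between every pair of rows.
d_euclid <- as.matrix(dist(means, method = "euclidean"))

# mahalanobis() returns SQUARED distances from each row to 'center',
# so take the square root and fill the matrix one row at a time.
k <- nrow(means)
d_mahal <- matrix(0, k, k, dimnames = list(rownames(means), rownames(means)))
for (i in seq_len(k)) {
  d_mahal[i, ] <- sqrt(mahalanobis(means, center = means[i, ], cov = S))
}

# Round to three significant figures for the report.
signif(d_euclid, 3)
signif(d_mahal, 3)
```

Note that signif() rounds to significant figures, whereas round() rounds to a number of decimal places.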
Distances based on proportions
Ex 1. When there is NO overlap of classes:
Proportions | Type 1 | Type 2 | Type 3 |
Colony 1 | 0.7 | 0.3 | 0.0 |
Colony 2 | 0.0 | 0.0 | 1.0 |
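Writing p_i and q_i for the proportions of type i in colonies 1 and 2:
d1 = (1/2) Σ |p_i - q_i| = (1/2)(0.7 + 0.3 + 1.0) = 1. This is called the dissimilarity index.
1 - d1 = 0, and this is called the similarity index.
Another index: d2 = 1 - Σ p_i q_i / sqrt(Σ p_i^2 Σ q_i^2) = 1 - 0 = 1.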
Ex 2. When there is a complete overlap of classes:
Proportions | Type 1 | Type 2 | Type 3 |
Colony 1 | 0.7 | 0.3 | 0.0 |
Colony 2 | 0.7 | 0.3 | 0.0 |
Here the dissimilarity index and the second index are both 0, and the similarity index is 1.
* In other cases these indices have values between 0 and 1.
* Similarity measures are often constructed as 1/D or 1/(1+D), where D = distance measure.
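The two conversions in the last bullet behave differently near D = 0, which a quick check makes visible. The distances used below are arbitrary illustrative values.

```r
# Arbitrary example distances, including D = 0 (identical objects).
D <- c(0, 0.5, 1, 4)

sim_inv      <- 1 / D        # 1/D: unbounded, Inf when D = 0
sim_inv_plus <- 1 / (1 + D)  # 1/(1+D): always in (0, 1], equals 1 when D = 0

data.frame(D, sim_inv, sim_inv_plus)
```

The 1/(1+D) form is the safer choice whenever a distance of exactly zero can occur.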
Problem 2: (2a.) Create a function that returns the results of both equation 5.5 and equation 5.6. (Use the list() command, as done in the distmatrix() function.) The input should be two vectors, p1 and p2, which hold the proportions for each species and each sum to 1. Comment your code. (2b.) Show your results for three different scenarios: no overlap, complete overlap, and partial overlap.
For problem 2a, use the following code to help you.
dissimilarity<- function(p1,p2)
{
  # Your code here
return( list(d1=d1, d2=d2) ) # returning list of both indices
}
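One way the skeleton might be filled in is sketched below. It assumes that equation 5.5 is half the sum of absolute differences and that equation 5.6 is one minus the normalized cross-product of the two vectors; check both formulas against the text before relying on them. The partial-overlap proportions are made-up values.

```r
# A possible implementation. The two formulas are assumptions that
# should be verified against equations 5.5 and 5.6 in the text.
dissimilarity <- function(p1, p2)
{
  # unlist() lets the function accept plain vectors or one-row data frames
  p1 <- as.numeric(unlist(p1))
  p2 <- as.numeric(unlist(p2))
  stopifnot(length(p1) == length(p2))

  # d1: half the sum of absolute differences between the proportions
  d1 <- sum(abs(p1 - p2)) / 2

  # d2: one minus the cross-product scaled by the two vector lengths
  d2 <- 1 - sum(p1 * p2) / sqrt(sum(p1^2) * sum(p2^2))

  return( list(d1 = d1, d2 = d2) ) # returning list of both indices
}

no_overlap      <- dissimilarity(c(0.7, 0.3, 0.0), c(0.0, 0.0, 1.0))
complete        <- dissimilarity(c(0.7, 0.3, 0.0), c(0.7, 0.3, 0.0))
partial_overlap <- dissimilarity(c(0.7, 0.3, 0.0), c(0.2, 0.3, 0.5))  # made-up
str(list(no_overlap, complete, partial_overlap))
```

Under these assumed formulas, d2 is unchanged when both inputs are multiplied by a constant, while d1 scales with the inputs.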
Presence-absence data
(Table 5.5 on text p. 68)
Site | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
Species 1 | 0 | 0 | 1 | 1 | 1 | 0 | 1 | 1 | 1 | 0 |
Species 2 | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 1 | 1 |
Summary:
 | Species 2 present | Species 2 absent | TOTAL |
Species 1 present | a (=3) | b (=3) | a + b |
Species 1 absent | c (=3) | d (=1) | c + d |
TOTAL | a + c | b + d | n |
Similarity measures (all vary between 0 (= no similarity) and 1 (= complete similarity)):
• Simple matching index: (a+d)/n
• Ochiai index: a/sqrt((a+b)(a+c))
• Dice-Sorensen index: 2a/(2a+b+c)
• Jaccard index: a/(a+b+c)
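As a worked check, the counts in the summary table and the four indices can be reproduced from the Table 5.5 records. The Ochiai index is taken here as a/sqrt((a+b)(a+c)), its usual definition; the variable names pp, pa, ap, and aa are illustrative stand-ins for a, b, c, and d.

```r
# Presence-absence records from Table 5.5 (sites 1 to 10).
species1 <- c(0, 0, 1, 1, 1, 0, 1, 1, 1, 0)
species2 <- c(1, 1, 1, 1, 0, 0, 0, 0, 1, 1)

# Count the four presence/absence combinations.
pp <- sum(species1 == 1 & species2 == 1)  # a: both present
pa <- sum(species1 == 1 & species2 == 0)  # b: species 1 only
ap <- sum(species1 == 0 & species2 == 1)  # c: species 2 only
aa <- sum(species1 == 0 & species2 == 0)  # d: both absent
n  <- pp + pa + ap + aa

simple  <- (pp + aa) / n                       # simple matching
ochiai  <- pp / sqrt((pp + pa) * (pp + ap))    # Ochiai (assumed usual form)
dice    <- 2 * pp / (2 * pp + pa + ap)         # Dice-Sorensen
jaccard <- pp / (pp + pa + ap)                 # Jaccard

round(c(simple = simple, ochiai = ochiai, dice = dice, jaccard = jaccard), 3)
```

Only the simple matching index uses d, so it is the only one that counts joint absences as evidence of similarity.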
Problem 3: (3a.) Create a function that returns a list containing all four indices shown above (and on page 68). The input should be four values, PP, PA, AP, and AA, which correspond to a, b, c, and d. Comment your code. (3b.) Run your function for three scenarios: (1) 0’s for AP and PA, but non-zero values for PP and AA; (2) 0’s for PP and AA, but non-zero values for PA and AP; (3) some blend of scenarios (1) and (2).
The following commands may help you:
sp1 <- c(0,1,0,1,1,0,1,1,0,0,0,1)
sp2 <- c(1,1,1,1,0,0,0,0,1,1,1,1)
table(sp1,sp2)
PresAbsIndex<- function(PP,PA,AP,AA)
{
  # Your code goes here
}
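A possible way to complete the skeleton, and to feed it from table(), is sketched below. The list element names and the scenario counts are illustrative choices, and the Ochiai formula is assumed to be a/sqrt((a+b)(a+c)).

```r
# Sketch of one possible implementation (not the only valid one).
PresAbsIndex <- function(PP, PA, AP, AA)
{
  n <- PP + PA + AP + AA                       # total number of sites
  list(simple  = (PP + AA) / n,                # simple matching index
       ochiai  = PP / sqrt((PP + PA) * (PP + AP)),  # assumed usual form
       dice    = 2 * PP / (2 * PP + PA + AP),  # Dice-Sorensen index
       jaccard = PP / (PP + PA + AP))          # Jaccard index
}

sp1 <- c(0,1,0,1,1,0,1,1,0,0,0,1)
sp2 <- c(1,1,1,1,0,0,0,0,1,1,1,1)
tab <- table(sp1, sp2)   # rows index sp1, columns index sp2

# Pull the four counts out of the table by their "0"/"1" labels.
from_table <- PresAbsIndex(PP = tab["1", "1"], PA = tab["1", "0"],
                           AP = tab["0", "1"], AA = tab["0", "0"])

scen1 <- PresAbsIndex(PP = 4, PA = 0, AP = 0, AA = 6)  # made-up counts
scen2 <- PresAbsIndex(PP = 0, PA = 3, AP = 2, AA = 0)  # made-up counts
str(list(from_table, scen1, scen2))
```

With this sketch, a call in which PP, PA, and AP are all zero returns NaN for the last three indices, because their denominators are zero.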
Problem 4: Install the package “ade4”, then run data(butterfly) to get the butterfly dataset described on pages 7 and 8 of our text. Our question is: “Is genetic similarity correlated with geographic distance between butterfly colonies?” Run the code below and fill in the genetic distance matrix using the distance measure we called d1 in Problem 2. (Don’t worry about using percentages rather than proportions; the distances simply range from 0 to 100 instead of 0 to 1.) Perform a Mantel randomization test and interpret the results. Repeat the test using the distance measure d2. Did it make a difference? Start with the R code below. (Show your for-loop code as part of your answer, as well as the scatter plots and randomization plots from the Mantel randomization tests.)
library(ade4)
data(butterfly)
names(butterfly)
help(butterfly)
help(pch)
help(polygon)
help(segments)
plot(butterfly$contour[,1:2], pch=16, cex=.4) # set up x and y limits for graph
#segments(x0=butterfly$contour[,1],y0=butterfly$contour[,2],
# x1=butterfly$contour[,3],y1=butterfly$contour[,4])
polygon(butterfly$contour[,1:2], lty=2) # does same as segments()
points(butterfly$xy, pch=7)
nrow(butterfly$xy)
text(butterfly$xy, labels=1:16, pos=2, cex=.8)
apply(butterfly$genet,1,sum) # confirming rows add to 100
(Ddist<- dist(butterfly$xy))
dissimilarity(butterfly$genet[1,],butterfly$genet[2,])$d2 # testing it out
Dgenet<- matrix(NA,nrow=16,ncol=16)
# [Your code, consisting of two nested for-loops, to fill in the Dgenet matrix. Use the function you created in Problem 2.]
Dgenet # to see what matrix now looks like
Dgenet<- as.dist(Dgenet)
plot(Ddist,Dgenet)
cor(Ddist, Dgenet)
(mantel.out<- mantel.rtest(Dgenet,Ddist, nrepet=10000))
plot(mantel.out)
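To show what the nested loops and the randomization test are doing, here is a self-contained base R sketch on invented data (5 colonies, 3 genetic types). The helper d1() assumes the half-sum-of-absolute-differences formula, and the permutation loop is a hand-written illustration of the Mantel idea, not the code inside ade4's mantel.rtest(). With the real data, the rows would come from butterfly$genet and the distance from your Problem 2 function.

```r
set.seed(1)  # make the random permutations reproducible

# Hypothetical coordinates and genetic percentages (rows sum to 100).
# These numbers are invented; they are NOT the butterfly data.
xy <- cbind(x = c(0, 1, 2, 5, 6), y = c(0, 1, 0, 4, 5))
genet <- rbind(c(70, 30,  0),
               c(60, 30, 10),
               c(65, 25, 10),
               c(10, 30, 60),
               c( 5, 25, 70))

# Assumed form of d1: half the sum of absolute differences.
d1 <- function(p1, p2) sum(abs(p1 - p2)) / 2

# Fill the genetic distance matrix with two nested for-loops.
n <- nrow(genet)
Dgenet <- matrix(NA, nrow = n, ncol = n)
for (i in 1:n) {
  for (j in 1:n) {
    Dgenet[i, j] <- d1(genet[i, ], genet[j, ])
  }
}
Ddist <- as.matrix(dist(xy))  # geographic (Euclidean) distances

# Mantel statistic: correlation between the two lower triangles.
lower <- lower.tri(Dgenet)
r_obs <- cor(Ddist[lower], Dgenet[lower])

# Randomization: relabel the colonies in one matrix (same permutation
# for rows and columns) and recompute the correlation each time.
nrepet <- 999
r_perm <- numeric(nrepet)
for (k in 1:nrepet) {
  perm <- sample(n)
  r_perm[k] <- cor(Ddist[lower], Dgenet[perm, perm][lower])
}

# One-sided p-value: share of correlations at least as large as observed.
p_value <- (sum(r_perm >= r_obs) + 1) / (nrepet + 1)
c(r_obs = r_obs, p_value = p_value)
```

A small p-value here would suggest that colonies that are farther apart also tend to differ more genetically than chance relabeling would produce.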