R graph gallery
Compilation by Eric Lecoutre
December 12, 2003
Contents
3D Bivariate Normal Density
3D Scatterplot - Cloud, regression plane
3D Wireframe - For surfaces
Agreement plot
Conditional Plot 1 - coplot
Fourfold Display
Hexagon Binning Matrix
Mosaicplot - Associations in a contingency table
Parallel Plot - Comparing groups with few subjects
Ternary Plot - Biplot
Tukey’s Hanging Rootogram
Violin Plot - Boxplot showing density, aka vase boxplot
3D Bivariate Normal Density
variables: 2 QT
library: none (base R)
function: persp
submitted by: Bernhard Pfaff <[email protected]>
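Sample graph
[Figure: perspective plot titled "Two dimensional Normal Distribution"; horizontal axes x1 and x2 (both from -10 to 10), vertical axis z (density, ticks from 0.000 to 0.015). The subtitle prints the density:]
$$f(\mathbf{x}) = \frac{1}{2\pi\sqrt{\sigma_{11}\sigma_{22}(1-\rho^2)}} \exp\left\{ -\frac{1}{2(1-\rho^2)} \left[ \frac{(x_1-\mu_1)^2}{\sigma_{11}} - 2\rho\,\frac{x_1-\mu_1}{\sqrt{\sigma_{11}}}\,\frac{x_2-\mu_2}{\sqrt{\sigma_{22}}} + \frac{(x_2-\mu_2)^2}{\sigma_{22}} \right] \right\}$$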
Sample code
# 3-D plot of a bivariate normal density
mu1 <- 0    # expected value of x1
mu2 <- 0    # expected value of x2
s11 <- 10   # variance of x1
s22 <- 10   # variance of x2
rho <- 0.5  # correlation coefficient between x1 and x2
s12 <- rho * sqrt(s11 * s22)  # covariance implied by rho (= 5); not used by f
x1 <- seq(-10, 10, length.out = 41)  # grid of x1 values
x2 <- x1                             # same grid for x2
# bivariate normal density
f <- function(x1, x2)
{
  term1 <- 1 / (2 * pi * sqrt(s11 * s22 * (1 - rho^2)))
  term2 <- -1 / (2 * (1 - rho^2))
  term3 <- (x1 - mu1)^2 / s11
  term4 <- (x2 - mu2)^2 / s22
  term5 <- -2 * rho * ((x1 - mu1) * (x2 - mu2)) / (sqrt(s11) * sqrt(s22))
  # term5 already carries the minus sign of the cross term, so it is added
  term1 * exp(term2 * (term3 + term4 + term5))
}
z <- outer(x1, x2, f)  # density values on the grid
persp(x1, x2, z,
  main="Two dimensional Normal Distribution",
  sub=expression(italic(f)~(bold(x))==frac(1,2~pi~sqrt(sigma[11]~
  sigma[22]~(1-rho^2)))~phantom(0)^bold(.)~exp~bgroup("{",
  list(-frac(1,2(1-rho^2)),
  bgroup("[", frac((x[1]~-~mu[1])^2, sigma[11])~-~2~rho~frac(x[1]~-~mu[1],
  sqrt(sigma[11]))~ frac(x[2]~-~mu[2],sqrt(sigma[22]))~+~
  frac((x[2]~-~mu[2])^2, sigma[22]),"]")),"}")),
  col="lightgreen",
  theta=30, phi=20,
  r=50,
  d=0.1,
  expand=0.5,
  ltheta=90, lphi=180,
  shade=0.75,
  ticktype="detailed",
  nticks=5) # produces the 3-D plot
# add a line of text with the parameter values
mtext(expression(list(mu[1]==0,mu[2]==0,sigma[11]==10,sigma[22]==10,
  sigma[12]==5,rho==0.5)), side=3)
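As a quick sanity check on the density formula of this section, the sketch below rewrites it as a standalone function and verifies numerically that it integrates to roughly 1. The function name dbvnorm, the wider grid (-20 to 20) and the step size are illustrative assumptions, not part of the original submission.

```r
# Hypothetical helper: bivariate normal density written directly from the formula
dbvnorm <- function(x1, x2, mu1 = 0, mu2 = 0, s11 = 10, s22 = 10, rho = 0.5) {
  q <- (x1 - mu1)^2 / s11 -
    2 * rho * (x1 - mu1) * (x2 - mu2) / sqrt(s11 * s22) +
    (x2 - mu2)^2 / s22
  exp(-q / (2 * (1 - rho^2))) / (2 * pi * sqrt(s11 * s22 * (1 - rho^2)))
}

step <- 0.1
grid <- seq(-20, 20, by = step)   # wide enough to hold almost all of the mass
z <- outer(grid, grid, dbvnorm)   # density on the grid
total_mass <- sum(z) * step^2     # Riemann sum; should be close to 1
peak <- dbvnorm(0, 0)             # height of the surface at the mean
```

A total far from 1 would usually point to a sign or scaling slip in one of the terms.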
3D Scatterplot - Cloud, regression plane
variables: 3 QT
library: lattice or scatterplot3d
function: cloud, scatterplot3d
Sample graph
[Figure: 3D point cloud titled "Iris Data"; axes Petal.Length, Petal.Width and Sepal.Length; points grouped by species (setosa, versicolor, virginica).]
Sample code
library(lattice)  # provides cloud()
data(iris)
cloud(Sepal.Length ~ Petal.Length * Petal.Width, data = iris,
      groups = Species, screen = list(z = 20, x = -70),
      perspective = FALSE,
      key = list(title = "Iris Data", x = .15, y = .85, corner = c(0,1),
                 border = TRUE,
                 points = Rows(trellis.par.get("superpose.symbol"), 1:3),
                 text = list(levels(iris$Species))))
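Sample graph
[Figure: 3D scatterplot of the trees data titled "scatterplot3d - 5"; axes Girth, Height and Volume, with vertical drop lines, added blue points and a fitted regression plane.]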
Sample code
library(scatterplot3d)
data(trees)
s3d <- scatterplot3d(trees, type = "h", highlight.3d = TRUE,
                     angle = 55, scale.y = 0.7, pch = 16, main = "scatterplot3d - 5")
# Now adding some points to the "scatterplot3d"
s3d$points3d(seq(10, 20, 2), seq(85, 60, -5), seq(60, 10, -10),
             col = "blue", type = "h", pch = 16)
# Now adding a regression plane to the "scatterplot3d"
# (data = trees replaces attach(trees), which would leave trees on the search path)
my.lm <- lm(Volume ~ Girth + Height, data = trees)
s3d$plane3d(my.lm)
3D Wireframe - For surfaces
variables: 3 QT
library: lattice
function: wireframe
Sample graph
[Figure: draped wireframe surface of z over x and y, with a color key running from -1 to 1.]
Sample code
library(lattice)  # provides wireframe()
x <- seq(-pi, pi, length.out = 20)
y <- seq(-pi, pi, length.out = 20)
g <- expand.grid(x = x, y = y)
g$z <- sin(sqrt(g$x^2 + g$y^2))
wireframe(z ~ x * y, g, drape = TRUE,
          aspect = c(3, 1), colorkey = TRUE)
Agreement plot
variables: one confusion matrix
library: vcd
function: agreementplot
Description
Representation of a k × k confusion matrix, in which the observed and expected diagonal elements are
represented by superposed black and white rectangles, respectively.
The agreement chart makes it easy to see where two judges disagree.
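The quantities compared in such a chart can be computed by hand. The sketch below uses a small made-up confusion matrix (the counts and the rating labels are hypothetical) and takes "expected" to mean the count implied by independence of the two judges, which is an assumption about how the chart is built rather than a description of the vcd code.

```r
# Hypothetical 3 x 3 confusion matrix: rows = judge 1, columns = judge 2
ratings <- c("low", "medium", "high")
confusion <- matrix(c(20,  5,  1,
                       4, 15,  6,
                       2,  7, 25),
                    nrow = 3, byrow = TRUE,
                    dimnames = list(judge1 = ratings, judge2 = ratings))

n <- sum(confusion)
observed_diag <- diag(confusion)

# Expected counts under independence: row total * column total / n
expected <- outer(rowSums(confusion), colSums(confusion)) / n
expected_diag <- diag(expected)

# Proportion of exact agreement: observed, and expected by chance alone
p_observed <- sum(observed_diag) / n
p_expected <- sum(expected_diag) / n
```

When p_observed clearly exceeds p_expected, the judges agree more often than chance would suggest; the off-diagonal cells of confusion show where they differ.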
Sample graph
[Figure: "Agreement Chart" for the SexualFun data; axes Husband and Wife, each with the categories Never Fun, Fairly Often, Very Often and Always fun.]
Sample code
library(vcd)
data(SexualFun)
agreementplot(t(SexualFun))
Conditional Plot 1 - coplot
variables: ≥ 3 QL or QT
library: none (base R)
function: coplot
Description
Sample graph
[Figure: conditioning plot of breaks against 1:54 for the warpbreaks data, with panels given wool (A, B) and given tension (L, M, H).]
Sample code
data(warpbreaks)
## given two factors
coplot(breaks ~ 1:54 | wool * tension, data = warpbreaks,
col = "red", bg = "pink", pch = 21,
bar.bg = c(fac = "light blue"))
Fourfold Display
data: 2x2xk contingency tables
library: vcd
function: fourfoldplot
Description
The fourfold display depicts frequencies by quarter circles whose radius is proportional to $\sqrt{n_{ij}}$, so that the
area is proportional to the cell count. The cell frequencies are usually scaled to equate the marginal
totals, so that the ratio of diagonally opposite segments depicts the odds ratio. Confidence rings for
the observed odds ratio allow a visual test of the hypothesis $H_0: \theta = 1$, which corresponds to no association.
The rings for adjacent quadrants overlap if and only if the observed counts are consistent
with the null hypothesis.
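The numbers behind one panel can be worked out directly. The sketch below takes Department A from the UCBAdmissions table that ships with R and computes unstandardized radii, the sample odds ratio and a Wald-type 95% interval on the log scale. The interval formula is a common textbook choice and an assumption here; it is not necessarily the one used to draw the confidence rings.

```r
# UCBAdmissions ships with base R (datasets package)
dept_a <- UCBAdmissions[, , "A"]   # 2 x 2 table: Admit x Gender

# Quarter-circle radii proportional to sqrt(n_ij), so areas follow the counts
# (raw counts, without the marginal standardization mentioned in the text)
radii <- sqrt(dept_a) / sqrt(max(dept_a))

# Sample odds ratio and an approximate 95% confidence interval
odds_ratio <- (dept_a[1, 1] * dept_a[2, 2]) / (dept_a[1, 2] * dept_a[2, 1])
se_log_or <- sqrt(sum(1 / dept_a))
ci <- exp(log(odds_ratio) + c(-1, 1) * qnorm(0.975) * se_log_or)
```

An interval that excludes 1 corresponds to the visual situation in which the rings of adjacent quadrants do not overlap.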
Sample graph
[Figure: fourfold displays for Departments A, B, C and D of the UCBAdmissions data; each panel crosses Sex (Male, Female) with Admit? (Yes, No) and prints the four cell counts.]
Sample code
library("vcd")
# Load data
data(UCBAdmissions)
x <- aperm(UCBAdmissions, c(2, 1, 3))
dimnames(x)[[2]] <- c("Yes", "No")
names(dimnames(x)) <- c("Sex", "Admit?", "Department")
# Shows for 4 departments
fourfoldplot(x[,,1:4])
Hexagon Binning Matrix
variables: 2 QL x 2 QT
sample size: large datasets (fast also for n ≥ 10^6)
library: hexbin (bioconductor)
function: plot.hexbin, hmatplot
Description
Hexagon binning is a form of bivariate histogram useful for visualizing the structure in datasets with
large n. The underlying concept of hexagon binning is extremely simple:
1. the xy plane over the set (range(x), range(y)) is tessellated by a regular grid of hexagons;
2. the points falling in each hexagon are counted and the counts are stored in a data structure;
3. the hexagons with count > 0 are plotted, using a color ramp or varying the radius of the hexagon
in proportion to the counts.
The algorithm is extremely fast and effective for displaying the structure of datasets with n ≥ 10^6.
If the size of the grid and the cuts in the color ramp are chosen in a clever fashion, then the structure
inherent in the data should emerge in the binned plots. The same caveats apply to hexagon binning as
apply to histograms, and care should be exercised in choosing the binning parameters.
The hexbin library is a set of functions for creating and plotting hexagon bins. The library extends the
basic hexagon binning ideas with several functions for doing bivariate smoothing, finding an approximate
bivariate median, and looking at the difference between two sets of bins on the same scale. The basic
functions can be incorporated into many types of plots.
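The three steps listed above can be sketched in a few lines of base R. This is an illustration of the idea only, not the hexbin package's implementation: the function name hex_counts is hypothetical, and the sketch assumes x and y are on comparable scales, so that hexagons are regular in data units. Each point is assigned to the nearest centre among two interleaved rectangular lattices, which together form the hexagonal grid.

```r
# Minimal hexagon binning in base R (illustrative; not the hexbin package's code)
hex_counts <- function(x, y, xbins = 10) {
  rx <- range(x)
  ry <- range(y)
  w <- diff(rx) / xbins   # horizontal distance between neighbouring centres
  h <- w * sqrt(3)        # vertical spacing within one rectangular lattice
  sx <- (x - rx[1]) / w
  sy <- (y - ry[1]) / h
  # Step 1: candidate centres; lattice 1 at integers, lattice 2 shifted by 1/2
  i1 <- round(sx)
  j1 <- round(sy)
  i2 <- floor(sx) + 0.5
  j2 <- floor(sy) + 0.5
  # Squared distances in data units, up to the constant factor w^2
  d1 <- (sx - i1)^2 + 3 * (sy - j1)^2
  d2 <- (sx - i2)^2 + 3 * (sy - j2)^2
  use1 <- d1 <= d2
  cx <- ifelse(use1, i1, i2)
  cy <- ifelse(use1, j1, j2)
  # Step 2: count points per hexagon; empty hexagons never appear (step 3)
  cells <- aggregate(list(count = rep(1L, length(x))),
                     by = list(cx = cx, cy = cy), FUN = sum)
  # Convert the centres back to data units
  data.frame(x = rx[1] + cells$cx * w,
             y = ry[1] + cells$cy * h,
             count = cells$count)
}

set.seed(1)
x <- rnorm(10000)
y <- x + rnorm(10000)
bins <- hex_counts(x, y, xbins = 20)
```

The resulting data frame holds one row per non-empty hexagon; plotting bins$x against bins$y with symbol size or color driven by bins$count gives a rough version of the binned display.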
Sample graph
[Figure: matrix of hexagon-binned plots; rows Females and Males, columns Age <= 45, 45 < Age <= 65 and Age > 65.]
Sample code
library("hexbin")
data(NHANES)  # pretty large data set!
good <- !(is.na(NHANES$Albumin) | is.na(NHANES$Transferin))
NH.vars <- NHANES[good, c("Age","Sex","Albumin","Transferin")]
# extract dependent variables and find ranges for global binning
x <- NH.vars[,"Albumin"]
rx <- range(x)
y <- NH.vars[,"Transferin"]
ry <- range(y)
age <- cut(NH.vars$Age,c(1,45,65,200))
sex <- NH.vars$Sex
subs <- tapply(age,list(age,sex))
# bivariate bins for each factor combination
for (i in 1:length(unique(subs))) {
  good <- subs==i
  assign(paste("nam",i,sep=""),
         erode.hexbin(hexbin(x[good],y[good],xbins=23,xbnds=rx,ybnds=ry)))
}
nam <- matrix(paste("nam",1:6,sep=""),ncol=3,byrow=TRUE)
rlabels <- c("Females","Males")
clabels <- c("Age <= 45","45 < Age <= 65","Age > 65")
zoom <- hmatplot(nam,rlabels,clabels,border=list(hbox=c("black","white"),
                 hdiff=rep("white",6)))
Mosaicplot - Associations in a contingency table
variables: QL (≥ 2)
library: none (base R)
function: mosaicplot
Description
A mosaic plot represents a contingency table: each cell corresponds to a tile of the plot whose
size is proportional to the cell entry.
Extended mosaic displays show the standardized residuals of a loglinear model of the counts by
the color and outline of the mosaic’s tiles. (Standardized residuals are often referred to a standard normal
distribution.) Negative residuals are drawn in shades of red with broken outlines; positive ones are
drawn in blue with solid outlines.
Mosaic plots are thus well suited to visualizing associations within a table and to detecting the cells that create
dependencies.
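To make the shading rule concrete, the sketch below computes Pearson residuals under independence for the two-way Hair by Eye margin of HairEyeColor. Collapsing over Sex is a simplification chosen for illustration; the shaded plot of the full three-way table fits its own loglinear model, so its residuals will differ from these.

```r
# Collapse the 3-way table to Hair x Eye (HairEyeColor ships with base R)
tab <- margin.table(HairEyeColor, c(1, 2))

# Expected counts under independence and Pearson (standardized) residuals
expected <- outer(rowSums(tab), colSums(tab)) / sum(tab)
pearson <- (tab - expected) / sqrt(expected)

# Cells that fall in the most extreme shading classes (|residual| > 4)
strong <- which(abs(pearson) > 4, arr.ind = TRUE)
```

Cells listed in strong are the ones that contribute most to the departure from independence, for example blond hair with blue eyes.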
Sample graph
[Figure: shaded mosaic plot of the HairEyeColor data; Hair (Black, Brown, Red, Blond) by Eye (Brown, Blue, Hazel, Green) by Sex (Male, Female), with a legend of standardized residuals: <-4, -4:-2, -2:0, 0:2, 2:4, >4.]
Sample code
data(HairEyeColor)
mosaicplot(HairEyeColor, shade = TRUE)
Parallel Plot - Comparing groups with few subjects
variables: 1 QL and at least 2 other variables
library: lattice or MASS
function: parallel (lattice), parcoord (MASS)
Description
Sample graph
[Figure: parallel coordinate plots of Sepal Length, Sepal Width and Petal Length, scaled from Min to Max, with one panel per species (setosa, versicolor, virginica) and the text "Three Varieties of Iris" in the empty fourth panel.]
Sample code
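library(lattice)  # parallel() was renamed parallelplot() in later lattice releases
library(grid)     # provides grid.text() and gpar()
data(iris)
parallelplot(~iris[1:3] | Species, data = iris,
             layout = c(2, 2), pscales = 0,
             varnames = c("Sepal\nLength", "Sepal\nWidth", "Petal\nLength"),
             page = function(...) {
               grid.text(x = seq(.6, .8, length.out = 4),
                         y = seq(.9, .6, length.out = 4),
                         label = c("Three", "Varieties", "of", "Iris"),
                         gp = gpar(fontsize = 20))
             })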
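Sample graph
[Figure: parallel coordinate plot of the log-transformed iris measurements; axes Petal L., Petal W., Sepal W. and Sepal L., with lines colored by species.]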
Sample code
library(MASS)  # provides parcoord()
data(iris3)
ir <- rbind(iris3[,,1], iris3[,,2], iris3[,,3])
parcoord(log(ir)[, c(3, 4, 2, 1)], col = 1 + (0:149)%/%50)
Ternary Plot - Biplot
variables: 3 QT and 1 QL
library: ade4
function: triangle.plot
Description
Graphs for a data frame with 3 columns of positive or null values. ‘triangle.plot’ is a scatterplot; ‘triangle.biplot’ is a pair of scatterplots.
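The idea behind a triangular scatterplot is that three non-negative parts, once rescaled to sum to 1, leave only two degrees of freedom and can be drawn inside a triangle. The sketch below shows one generic way to do that projection. The function name ternary_xy and the placement of the vertices are assumptions made for illustration; ade4 may orient and scale its triangle differently.

```r
# Map rows of three non-negative parts to 2-D triangle coordinates
# (a generic barycentric projection; not ade4's internal code)
ternary_xy <- function(parts) {
  parts <- as.matrix(parts)
  stopifnot(ncol(parts) == 3, all(parts >= 0), all(rowSums(parts) > 0))
  p <- parts / rowSums(parts)       # each row now sums to 1
  # Triangle vertices: part 1 at (0, 0), part 2 at (1, 0), part 3 at (0.5, sqrt(3)/2)
  cbind(x = p[, 2] + p[, 3] / 2,
        y = p[, 3] * sqrt(3) / 2)
}

# Hypothetical compositions: three pure mixtures and an even split
demo <- rbind(c(1, 0, 0), c(0, 1, 0), c(0, 0, 1), c(2, 2, 2))
xy <- ternary_xy(demo)
```

Pure compositions land on the vertices and the even split lands on the centroid, so points near a vertex are dominated by the corresponding column.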
Sample graph
[Figure: four triangular plots of the euro123 data with axes pri, sec and ter; one panel labels the twelve countries (Belgium, Denmark, France, Germany, Greece, Ireland, Italy, Luxembourg, Netherlands, Portugal, Spain, United_Kingdom) and one is subtitled "Principal axis".]
Sample code
library(ade4)  # provides triangle.plot() and triangle.biplot()
data(euro123)
par(mfrow = c(2,2))
triangle.plot(euro123$in78, clab = 0, cpoi = 2, addmean = TRUE,
              show = FALSE)
triangle.plot(euro123$in86, label = row.names(euro123$in78), clab = 0.8)
triangle.biplot(euro123$in78, euro123$in86)
triangle.plot(rbind.data.frame(euro123$in78, euro123$in86), clab = 1,
              addaxes = TRUE, sub = "Principal axis", csub = 2, possub = "topright")
par(mfrow = c(1,1))
Tukey’s Hanging Rootogram
variables: 1 QL
library: vcd
function: rootogram
Description
Discrete frequency distributions are often graphed as histograms, with a theoretical fitted distribution
superimposed. It is hard to compare the observed and fitted frequencies visually, because (a) we must
assess deviations against a curvilinear relation, and (b) the largest frequencies dominate the display.
The hanging rootogram (Tukey, 1977) solves these problems by (a) shifting the histogram bars to
coincide with the fitted curve, so that deviations may be judged by deviations from a horizontal line, and
(b) plotting on a square-root scale, so that smaller frequencies are emphasized. The featured example shows
more clearly that the observed frequencies differ systematically from those predicted under a Poisson
model.
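The two adjustments described above are easy to reproduce by hand. The sketch below uses the same counts as this section's sample code, fits a Poisson model by maximum likelihood (the sample mean) and computes where each hanging bar starts and ends. It is a base R illustration under those assumptions; the variable names are hypothetical and vcd's goodfit and rootogram may differ in detail.

```r
# Counts from this section's example: occurrences 0..6 and their frequencies
k <- 0:6
observed <- c(156, 63, 29, 8, 4, 1, 1)

# Maximum-likelihood Poisson fit: lambda is the sample mean
lambda <- sum(k * observed) / sum(observed)
expected <- sum(observed) * dpois(k, lambda)

# Hanging rootogram: each bar hangs from sqrt(expected) and has height sqrt(observed)
bar_top <- sqrt(expected)
bar_bottom <- sqrt(expected) - sqrt(observed)  # deviation from the line y = 0
```

A bar whose bottom falls below zero marks a count observed more often than the model predicts; one that stops above zero marks a count observed less often.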
Sample graph
[Figure: hanging rootogram; horizontal axis Number of Occurrences (0 to 6), vertical axis sqrt(Frequency).]
Sample code
library("vcd")
# create data
madison=table(rep(0:6,c(156,63,29,8,4,1,1)))
# fit a poisson distribution
madisonPoisson=goodfit(madison,"poisson")
rootogram(madisonPoisson)
Violin Plot - Boxplot showing density, aka vase boxplot
variables: 1 QT and optional QL for groups
library: simpleR - Using R for Introductory Statistics
author: John Verzani, College of Staten Island - http://www.math.csi.cuny.edu/ verzani/
featured in: Simple R
function: simple.violinplot
Description
Violin plots are similar to boxplots, except that they also show the density of the data, estimated by a kernel
method.
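The shape of a single violin can be sketched with the kernel density estimator in base R. The example below estimates the density for one treatment of OrchardSprays and mirrors it around a vertical axis. The choice of treatment "H", the default Gaussian kernel and the 0.4 plotting width are illustrative assumptions, and simple.violinplot may use different settings.

```r
# Kernel density estimate for one group (OrchardSprays ships with base R)
values <- OrchardSprays$decrease[OrchardSprays$treatment == "H"]
dens <- density(values)   # Gaussian kernel, default bandwidth

# Mirror the density around a vertical axis to get the violin outline
half_width <- dens$y / max(dens$y) * 0.4   # 0.4 is an arbitrary plotting width
outline <- data.frame(x = c(-half_width, rev(half_width)),
                      y = c(dens$x, rev(dens$x)))
# polygon(outline$x + 1, outline$y) would draw it at position 1 of an open plot
```

Repeating this for each treatment and shifting the outlines along the horizontal axis gives a row of violins comparable to the sample graph.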
Sample graph
[Figure: violin plots of decrease (axis from 0 to 150) for treatments A to H of the OrchardSprays data.]
Sample code
# library(Simple) - required library which comes with Simple R
# http://www.math.csi.cuny.edu/Statistics/R/simpleR
data(OrchardSprays)
simple.violinplot(decrease ~ treatment, data = OrchardSprays,col="bisque",border="black")