Mapping with R: Choropleth of 3rd grade reading proficiency in the Twin Cities

I decided to make a more advanced map, one with some relevance to my graduate studies in evaluation and educational psychology. This map shows geographic variation in third grade reading proficiency as measured by the 2005 Minnesota Comprehensive Assessment (MCA). Click on the thumbnail to enlarge:

TCMetro_MCAR05_041908.png

Merging test scores to the attribute data frame was difficult. The attribute data frame must evidently remain sorted in the same order in which it was converted to an sp class object. I found this out after merging district test scores to the attribute data frame and finding that Cannon Falls had migrated up to Minneapolis. I created a row number variable to re-sort the data frame back to its original order before adding it back to the sp object. I’m betting there is a better way to achieve the same type of join. In the future, I will probably join additional variables to the DBF portion of the SHP file before calling it into R.

Here is the script:

library(maptools)
library(rgdal)
library(RColorBrewer)
library(classInt)

#get data
mca05r3m <- read.table(file.choose()) #MCA scores from MDE
districts <- readShapePoly(file.choose(), proj4string=CRS("+proj=utm +zone=15 +ellps=GRS80 +datum=NAD83")) #school districts from MetroGIS #prepare data
districts <- spTransform(districts, CRS=CRS("+proj=longlat"))
districts@data$row <- as.numeric(row.names(districts@data)) #necessary for linking attributes back to polygons after merge
mca05r3m$proficient <- with(mca05r3m, percentLevel3 + percentLevel4 + percentLevel5) #MCA levels 3-5 represent proficiency
temp <- merge(districts@data, mca05r3m, by="SD_NUMBER", all.x=T, sort=F) #merge scores to attribute data frame
districts@data <- temp[order(temp$row),] #order is key
districts@data

#set up the choropleth with helpful code from GeogR
plotvar <- districts@data$proficient
nclr <- 6
plotclr <- brewer.pal(nclr,"Oranges")
plotclr <- plotclr[nclr:1] #reordering colors looks better
class <- classIntervals(plotvar, nclr, style="quantile")
colcode <- findColours(class, plotclr, digits=4)
colcode

#map it
plot(districts, density=16, col="grey", axes=T, cex.axis=.75) #hatches missing data
plot(districts, col=colcode, add=T)
text(coordinates(districts), labels=districts$SD_NAME.x, cex=.5)
title(main="Percentage of Third Graders Reading Proficiently in 2005", sub="Created with R",font.sub=2)
title(xlab="Longitude",ylab='Latitude',cex.lab=.75,line=2.25)
legend(-94.1, 45.4, legend=names(attr(colcode, "table")), fill=attr(colcode, "palette"), cex=0.75, bty="n")

Sources of data and code include:
http://education.state.mn.us/
http://www.metrogis.org/
http://geography.uoregon.edu/GeogR/index.html

Posted in Praxes | 4 Comments

About me

Moore.jpg

Current position
-Data scientist

Research interests
-Preventing educational and health disparities
-Program and policy evaluation
-Spatial statistical methods and the geography of education
-Causal theory and inference
-Latent variable models (item response theory, factor analysis, structural equation models)
-Mixed models for longitudinal and multilevel data

Alumnus
Quantitative Methods in Education (QME)
Hubert H. Humphrey School of Public Affairs
Maryville College

Work experience
Minneapolis Public Schools
Minnesota Department of Education
Office of Research Consultation and Services (ORCS)
Human Capital Research Collaborative
Saint Paul Public Schools’ Department of Research, Evaluation and Assessment
Wilder Research
Child Care Resource and Referral (CCR&R)
Widjiwagan

Location
Minneapolis, Minnesota, USA

Curricullum vitae

Posted in About me | Comments Off on About me

Mapping with R

I have been learning R software for statistical computing. I have used it for all of my homework assignments in EPSY 8261 (Probability and Inference) and 8262 (Regression and the General Linear Model). My prof has an R blog here: http://rguide.blogspot.com/.

I envision bridging my studies in educational psychology (quantitative methods, evaluation) with my interest in spatial analysis and mapping. I read yesterday that APA’s science directorate is trying to encourage greater use of GIS in psychological research. You can read more here: http://www.apa.org/science/psa/feb08gis.html.

I decided to explore R’s mapping capabilities today. Here are a couple findings: -The shapefiles package is not really necessary. -The maptools package can read and display shapefiles, but the author encourages converting shapefiles into sp class or polylist objects. -Once shapefiles are converted, one can use the maptools plot.polylist polygon helper function.

Pat Bartlein at the University of Oregon recommends the following packages: sp, maptools, rgdal, maps, mapproj, mapdata, classInt, scatterplot3d, and RColorBrewer. He created a great website for folks like me who are starting out using R for geography: http://geography.uoregon.edu/GeogR/index.html. Thanks, Pat!

Applied researchers in the Twin Cities unfamiliar with GIS should know that the Metropolitan Council and its MetroGIS initiative have done a great job of collaborating with local entities to make a ton of data available at no charge: http://www.metrogis.org/.

I prefer ArcGIS for making maps and performing spatial analyses. I use MapInfo at work because our IT/geographer person prefers it. Mapping with R is similar to statistical computing in R–it’s a blank portrait waiting for color and texture, whereas SPSS and commercial GIS software are like the Etch A Sketch.

Click on the thumbnail for a large version of the Twin Cities map I created with R:

Twin Cities Metro Area 041208.png

Here is the script:

library(maptools)
library(rgdal)
counties=readShapePoly(file.choose(),proj4string=CRS("+proj=utm +zone=15 +ellps=GRS80 +datum=NAD83")) #choose MetroGIS' county.shp
counties=spTransform(counties, CRS=CRS("+proj=longlat"))
summary(counties)
#capitalize first letter of county names
capwords cap tolower(substring(s,2)), sep = "", collapse = " " )
sapply(strsplit(s, split = " "), cap, USE.NAMES = !is.null(names(s)))
}
names=capwords(as.vector(counties$CO_NAME))
names
library(RColorBrewer)
display.brewer.all() #check out the color options
cols water=readShapePoly(file.choose(),proj4string=CRS("+proj=utm +zone=15 +ellps=GRS80 +datum=NAD83")) #choose MetroGIS' hydro_luse2000.shp
water=spTransform(water, CRS=CRS("+proj=longlat"))
coordinates(counties) #centroids
plot(counties,col=cols,axes=T,cex.axis=.75) #plot function in maptools package
plot(water,col='#6BAED6',border=NA,add=T)
text(coordinates(counties), labels=names, cex=.75)
title(main="Twin Cities Metro Area",sub="Created with R",font.sub=2)
title(xlab="Longitude",ylab='Latitude',cex.lab=.75,line=2.25)
Posted in Praxes | Comments Off on Mapping with R