ggplot2 Choropleth of Supreme Court Decisions: A Tutorial

I don't do much GIS but I like to. It's rather enjoyable and involves a tremendous skill set. Often you will find your self grabbing data sets from some site, scraping, data cleaning and reshaping, and graphing. On the ride home from work yesterday I heard an NPR talk about the Supreme Court decisions being very close with this court. This got me wondering if there is a data base with this information and the journey began. This tutorial is purely exploratory but you will learn to:

  1. Grab .zip files from a data base and read into R
  2. Clean data
  3. Reshape data with reshape2
  4. Merge data sets
  5. Plot a Choropleth map in ggplot2
  6. Arrange several grid plots with gridExtra

I'm lazy and like a good challenge. I challenged myself to not manually open a file so I downloaded Biobase from bioconductor to open the pdf files for the codebook. Also I used my own package qdap because it had some functions I like and I'm used to using them. This blog post was created in the dev. version of the reports package using the wordpress_rmd template.

Also note that this is designed to be instructional. I broke the code up into chunks with an explanation in between, This is extremely annoying if you just want to run the code so for this sort of person I have provided the code here. Thanks to Bryan Goodrich of talkstats.com who has many programming and statistics talents and shared much with me on GIS topics.

Enjoy!

Load Packages

## download Biobase so we don't have to manually open codebook
source("http://bioconductor.org/biocLite.R")
biocLite("Biobase", suppressUpdates = TRUE)
library(qdap)
## Load initial required packages
lapply(qcv(ggplot2, maps, ggthemes, Biobase), require, character.only = T)

Get Data

The Supreme Court Codebook and opened without clicking

## download the pdf code book and open it
url_dl(SCDB_2012_01_codebook.pdf, url = "http://scdb.wustl.edu/_brickFiles/2012_01/")
openPDF(file.path(getwd(), "SCDB_2012_01_codebook.pdf"))

The Supreme Court Data; learn to download and open a zip file

temp <-tempfile()
download.file("http://scdb.wustl.edu/_brickFiles/2012_01/SCDB_2012_01_caseCentered_Citation.csv.zip", 
    temp)
dat <-read.csv(unz(temp, "SCDB_2012_01_caseCentered_Citation.csv"))
unlink(temp)
htruncdf(dat, 6, 6)
##   caseId docket caseIs voteId dateDe decisi usCite sctCit ledCit lexisC
## 1 1946-0 1946-0 1946-0 1946-0 11/18/      1 329 U. 67 S.  91 L.  1946 U
## 2 1946-0 1946-0 1946-0 1946-0 11/18/      1 329 U. 67 S.  91 L.  1946 U
## 3 1946-0 1946-0 1946-0 1946-0 11/18/      1 329 U. 67 S.  91 L.  1946 U
## 4 1946-0 1946-0 1946-0 1946-0 11/25/      7 329 U. 67 S.  91 L.  1946 U
## 5 1946-0 1946-0 1946-0 1946-0 11/25/      1 329 U. 67 S.  91 L.  1946 U
## 6 1946-0 1946-0 1946-0 1946-0 11/25/      1 329 U. 67 S.  91 L.  1946 U
##   term natura  chief docket caseNa dateAr dateRe petiti petiti respon
## 1 1946   1301 Vinson     24 HALLIB 1/9/19 10/23/    198   <NA>    172
## 2 1946   1301 Vinson     12 CLEVEL 10/10/ 10/17/    100   <NA>     27
## 3 1946   1301 Vinson     21 CHAMPL 11/8/1 10/18/    209   <NA>     27
## 4 1946   1301 Vinson     26 UNITED 1/31/1 10/25/     27   <NA>    170
## 5 1946   1301 Vinson     50 UNITED 10/25/            27   <NA>    176
## 6 1946   1301 Vinson     46 RICHFI 10/24/           198   <NA>      4
##   respon jurisd adminA adminA threeJ caseOr caseOr caseSo caseSo lcDisa
## 1   <NA>      6   <NA>   <NA>      0     51      6     29   <NA>      0
## 2   <NA>      1   <NA>   <NA>      0    123     52     30   <NA>      0
## 3   <NA>      2     66   <NA>      1    107     42    107     42      0
## 4   <NA>      1   <NA>   <NA>      0      3   <NA>      3   <NA>      0
## 5   <NA>      1   <NA>   <NA>      0      3   <NA>      3   <NA>      0
## 6      6      2    117      6      0    302      6    300      6      1
##   certRe lcDisp lcDisp declar caseDi caseDi partyW preced voteUn issue
## 1     11      2      1      1      3      0      1      0      0 80180
## 2      4      2      1      1      2      0      0      0      0 10500
## 3      1   <NA>      2      1      2      0      0      0      0 80250
## 4     10   <NA>      2      1      2      0      0      0      0 20150
## 5      2   <NA>      2      1      3      0      1      0      0 80060
## 6      1      3      2      3      3      0      1      0      0 80100
##   issueA decisi decisi author author lawTyp lawSup lawMin majOpi majOpi
## 1      8      2      0      4   <NA>      6    600 35 U.S     78     78
## 2      1      1      0      4   <NA>      6    600 18 U.S     81     87
## 3      8      2      0      1   <NA>      2    207            84     78
## 4      2      2      0      4   <NA>      6    600 49 Sta     87     87
## 5      8      1      0      7   <NA>   <NA>   <NA>            78     78
## 6      8      1      0      2   <NA>      1    129            81     87
##   splitV majVot minVot
## 1      1      8      1
## 2      1      6      3
## 3      1      5      4
## 4      1      5      3
## 5      1      6      3
## 6      1      7      1

Source a Codebook for State Keys Used By Supreme Court Data

source("http://copy.com/zEtAXJC8tG7yv7Zz")
head(state.key)
##   code          state
## 1    1        alabama
## 2    2         alaska
## 3    3 american samoa
## 4    4        arizona
## 5    5       arkansas
## 6    6     california

Clean Data

Clean Supreme Court Data

dat$state <-lookup(dat$caseOriginState, state.key)
dat2 <-dat[!is.na(dat$state), ]
dat_state <-data.frame(with(dat2, prop.table(table(state))))
head(dat_state)
##        state     Freq
## 1    alabama 0.030063
## 2     alaska 0.005010
## 3    arizona 0.017954
## 4   arkansas 0.010438
## 5 california 0.103549
## 6   colorado 0.009603

Before I get started with any sizable graphing project I start with the bare minimum and add to the code. Dr. Hadley Wickham has provided just such a minimal example for Choropleth mapping on pages 10-11 in the Changes and Additions guide

Minimal Choropleth Example

crimes <-data.frame(state = tolower(rownames(USArrests)), USArrests)

states_map <-map_data("state")

ggplot(crimes, aes(map_id = state)) +
    geom_map(aes(fill = Murder), map = states_map) +
    expand_limits(x = states_map$long, y = states_map$lat) 

plot of chunk unnamed-chunk-6


Map the Data

Here I use the maps package to get the state longitude and latitudes for the shapes.

states_map <-map_data("state")
head(states_map)
##     long   lat group order  region subregion
## 1 -87.46 30.39     1     1 alabama      <NA>
## 2 -87.48 30.37     1     2 alabama      <NA>
## 3 -87.53 30.37     1     3 alabama      <NA>
## 4 -87.53 30.33     1     4 alabama      <NA>
## 5 -87.57 30.33     1     5 alabama      <NA>
## 6 -87.59 30.33     1     6 alabama      <NA>

Plot the Data

Use map_id to map the states. expand_limits is filled with the states_map data set's latitude and longitude.

ggplot(dat_state, aes(map_id = state)) +
    geom_map(aes(fill = Freq), map = states_map, color ="black") +
    expand_limits(x = states_map$long, y = states_map$lat) +
    theme_few()+
    theme(legend.position = "bottom",
         axis.ticks = element_blank(), 
         axis.title = element_blank(), 
         axis.text =  element_blank()) +
    scale_fill_gradient(low="white", high="blue") +
    guides(fill = guide_colorbar(barwidth = 10, barheight = .5)) + 
    ggtitle("Choropleth Supreme Court")

plot of chunk unnamed-chunk-8

Notice the little trick of moving the legend to the bottom and making it narrow? This is done with:

theme(legend.position = "bottom")

## and

guides(fill = guide_colorbar(barwidth = 10, barheight = 0.5))

Generate labels

I said to myself, “Self I forgot all the state names; this needs labels”. Here is a question on label centering I asked at stackoverflow. The trick to supplying text data to ggplot is it has to be in a data.frame format. Note that you need the state (region) name, latitude and longitude. I also added angle to be able to manually twist the angles of individual labels. The trick here was to take the mean of the range of the shape file lats/longs in an answer provided by Andrie from stackoverflow.com. Note that it is extremely important that you are now adding a new date set to ggplot and you need to unmap the map_id with map_id = NULL otherwise ggplot2 will become enraged and refuse to comply with what you consider to be a reasonable request.

cnames <-aggregate(cbind(long, lat) ~ region, data = states_map, FUN = function(x) mean(range(x)))
cnames$angle <-0
head(cnames)
##        region    long   lat angle
## 1     alabama  -86.69 32.63     0
## 2     arizona -111.92 34.17     0
## 3    arkansas  -92.14 34.75     0
## 4  california -119.26 37.28     0
## 5    colorado -105.55 39.00     0
## 6 connecticut  -72.75 41.53     0

Plot With Labels 1

ggplot(dat_state, aes(map_id = state)) +
    geom_map(aes(fill = Freq), map = states_map, color ="black") +
    expand_limits(x = states_map$long, y = states_map$lat) +
    theme_few()+
    theme(legend.position = "bottom",
         axis.ticks = element_blank(), 
         axis.title = element_blank(), 
         axis.text =  element_blank()) +
    scale_fill_gradient(low="white", high="blue") +
    guides(fill = guide_colorbar(barwidth = 10, barheight = .5)) + 
    geom_text(data=cnames, aes(long, lat, label = region,  
        angle=angle, map_id =NULL), size=2.5) + 
    ggtitle("Choropleth Supreme Court (With Labels 1)")

plot of chunk unnamed-chunk-11

You can manually adjust the labels with indexing the dataframe cnames.

Manually Move State Locations and Change Angle

cnames[11, c(2:3)] <-c(-114.5, 43.5)  # alter idaho's coordinates
cnames[17, 3] <-30.75  # alter louisiana's coordinates
cnames[21, c(2:3)] <-c(-84.5, 43)  # alter michigan's coordinates
cnames[23, 4] <-90  # alter mississippi's angle
cnames[9, c(2, 4)] <-c(-81.5, 90)  # alter florida's angle and coordinates

Plot With Labels 2

ggplot(dat_state, aes(map_id = state)) +
    geom_map(aes(fill = Freq), map = states_map, color ="black") +
    expand_limits(x = states_map$long, y = states_map$lat) +
    theme_few()+
    theme(legend.position = "bottom",
         axis.ticks = element_blank(), 
         axis.title = element_blank(), 
         axis.text =  element_blank()) +
    scale_fill_gradient(low="white", high="blue") +
    guides(fill = guide_colorbar(barwidth = 10, barheight = .5)) + 
    geom_text(data=cnames, aes(long, lat, label = region,  
        angle=angle, map_id =NULL), size=2.5) + 
    ggtitle("Choropleth Supreme Court (With Labels 2)")

plot of chunk unnamed-chunk-13


Further Exploring the Data

From there a new thought emerged. It seemed that a few states had the most percentage of Supreme Court cases originating in them. I wondered if this had something to do with population. I wanted to compare a population Choropleth. This meant grabbing more data and the US Census database is just the place.

Download and read in a zip file just like the Supreme Court Data.

## Download the US census database
temp <-tempfile()
download.file("http://www2.census.gov/census_2000/datasets/demographic_profile/0_All_State/2khxx.zip", 
    temp)
demo <-read.csv(unz(temp, "2khxx.csv"))
unlink(temp)

Clean Data

Clean Census Data and Merge With dat_state From Above

## browseURL("http://www2.census.gov/census_2000/datasets/demographic_profile/Alabama/2kh01.pdf")
vars <-data.frame(codes = qcv(X281421906, X138053563, X143368343, X35.3, United.States), 
    var = qcv(pop, male, female, med_age, state))

colnames(demo)[colnames(demo) %in% vars[, 1]] <-lookup(colnames(demo)[colnames(demo) %in% vars[, 1]], vars)

demo$state <-tolower(demo$state)
demo <-demo[, colnames(demo) %in% vars[, 2]]
demo <-demo[demo$state %in% tolower(state.name), ]

## Merge it
dat_state <-merge(demo, dat_state, by = "state")

One thing led to another and before I knew it I decided to include male to female ratio and median age in the analysis. My first approach (which I did not like) was to combine all the data into a long format data set that I could pass to ggplot2 and then facet by the chosen variables. This required me to use a common scale for the variables. I used the apply function to scale the variables.

Clean Census Data and Reshape it using the melt function from Reshape2.

library(reshape2)
dat_state <-transform(dat_state, per.male = male/c(male + female))
colnames(dat_state)[6] <-"case_origin"
dat_state2 <-melt(data.frame(dat_state[, 1, drop = FALSE], apply(dat_state[, 
    -c(1, 3:4)], 2, scale)))
head(dat_state2)
##        state variable    value
## 1    alabama      pop -0.18913
## 2     alaska      pop -0.80673
## 3    arizona      pop -0.07863
## 4   arkansas      pop -0.47588
## 5 california      pop  4.56783
## 6   colorado      pop -0.21271

Faceted Plot Attempt 1

ggplot(dat_state2, aes(map_id = state)) +
    geom_map(aes(fill = value), map = states_map, color ="black") +
    expand_limits(x = states_map$long, y = states_map$lat) +
    theme_few()+
    theme(legend.position = "bottom",
         axis.ticks = element_blank(), 
         axis.title = element_blank(), 
         axis.text =  element_blank()) +
    scale_fill_gradient(low="white", high="blue") +
    guides(fill = guide_colorbar(barwidth = 10, barheight = .5)) + 
    geom_text(data=cnames, aes(long, lat, label = region,  
        angle=angle, map_id =NULL), size=2.5) +
    facet_grid(variable~.)

plot of chunk unnamed-chunk-17


A Hunger for Better Display

This was unsatisfying in that I had to use a common scale and the meaning was lost. Also I couldn't control individual map colors easily (though I'm sure there's a way). I decided to instead create 4 separate plots and feed them to grid.arange of the gridExtra package (a compliment to ggplot2).

plot1 <-ggplot(dat_state, aes(map_id = state)) +
    geom_map(aes(fill = case_origin), map = states_map, color ="black") +
    expand_limits(x = states_map$long, y = states_map$lat) +
    theme_few()+
    theme(axis.ticks = element_blank(), 
         axis.title = element_blank(), 
         axis.text =  element_blank()) +
    scale_fill_gradient(low="white", high="orange", name="Percent") +
    guides(fill = guide_colorbar(barwidth = .5, barheight = 10)) + 
    geom_text(data=cnames, aes(long, lat, label = region,  
        angle=angle, map_id =NULL), size=2.5) +
    ggtitle("Origin of Supreme Court Case (percent)") 

plot2 <-ggplot(dat_state, aes(map_id = state)) +
    geom_map(aes(fill = pop), map = states_map, color ="black") +
    expand_limits(x = states_map$long, y = states_map$lat) +
    theme_few()+
    theme(axis.ticks = element_blank(), 
         axis.title = element_blank(), 
         axis.text =  element_blank()) +
    scale_fill_gradient(low="white", high="red", name="People") +
    guides(fill = guide_colorbar(barwidth = .5, barheight = 10)) + 
    geom_text(data=cnames, aes(long, lat, label = region,  
        angle=angle, map_id =NULL), size=2.5) +
    ggtitle("State Populations")

plot3 <-ggplot(dat_state, aes(map_id = state)) +
    geom_map(aes(fill = med_age), map = states_map, color ="black") +
    expand_limits(x = states_map$long, y = states_map$lat) +
    theme_few()+
    theme(axis.ticks = element_blank(), 
         axis.title = element_blank(), 
         axis.text =  element_blank()) +
    scale_fill_gradient(low="white", high="darkgreen", name="Age") +
    guides(fill = guide_colorbar(barwidth = .5, barheight = 10)) + 
    geom_text(data=cnames, aes(long, lat, label = region,  
        angle=angle, map_id =NULL), size=2.5) +
    ggtitle("Median Age")

plot4 <-ggplot(dat_state, aes(map_id = state)) +
    geom_map(aes(fill = per.male), map = states_map, color ="black") +
    expand_limits(x = states_map$long, y = states_map$lat) +
    theme_few()+
    theme(axis.ticks = element_blank(), 
         axis.title = element_blank(), 
         axis.text =  element_blank()) +
    scale_fill_gradient(low="white", high="blue", name="Percent Male") +
    guides(fill = guide_colorbar(barwidth = .5, barheight = 10)) + 
    geom_text(data=cnames, aes(long, lat, label = region,  
        angle=angle, map_id =NULL), size=2.5) +
    ggtitle("Gender Distribution")

library(gridExtra)
grid.arrange(plot1, plot3, plot2, plot4, ncol = 2)

plot of chunk unnamed-chunk-18

I did not like the alignment of the plot edges and didn't know how to solve the problem. I asked on stackoverflow.com and Kohske gave the following approaches in his response:

Using grid.draw and Aligning Plot Edges

library(gtable)

p1 <-ggplotGrob(plot1)
p2 <-ggplotGrob(plot2)
p3 <-ggplotGrob(plot3)
p4 <-ggplotGrob(plot4)

library(gtable)
grid.draw(cbind(rbind(p1, p2, size="last"), rbind(p3, p4, size="last"), size = "first"))

plot of chunk unnamed-chunk-19

Eliminating the Plot Box and Aligning the Legends

This approach looks nice (though the plot box had to be removed as the legend covered it). It is not recomended by Kohske as it's a bit hacky. If someone has a better approach please share.

plot1b <-plot1 + theme(panel.border = element_blank())
plot2b <-plot2 + theme(panel.border = element_blank())
plot3b <-plot3 + theme(panel.border = element_blank())
plot4b <-plot4 + theme(panel.border = element_blank())

p1b <-ggplotGrob(plot1b)
p2b <-ggplotGrob(plot2b)
p3b <-ggplotGrob(plot3b)
p4b <-ggplotGrob(plot4b)

gt <-cbind(rbind(p1b, p2b, size="last"), rbind(p3b, p4b, size="last"), size = "first")

for (i in which(gt$layout$name == "guide-box")) {
  gt$grobs[[i]] <-gt$grobs[[i]]$grobs[[1]]
}

grid.draw(gt)

plot of chunk unnamed-chunk-20

The final out put was very satisfying. I did notice that yes indeed the states of high population also had a high number of Supreme Court cases originating in them. That was sensible. I noted that New York and Alabama seemed to have more Supreme Court cases originating in them in comparison to their populations (but only slightly). There is still a ton of data in the two data sets left to explore (particularly from a time series perspective). Feel free to experiment yourself with the data.


Please be sure to provide feedback in the comments below.

About tylerrinker

Data Scientist, open-source developer , #rstats enthusiast, #dataviz geek, and #nlp buff
This entry was posted in data, ggplot2, reports, Uncategorized, visualization and tagged , , , , , , , , . Bookmark the permalink.

31 Responses to ggplot2 Choropleth of Supreme Court Decisions: A Tutorial

  1. piotrek says:

    in the title – “choropleth”, no “chloropleth”

  2. josh says:

    I don’t want to be a curmudgeon troll or anything, but this isn’t an exercise in GIS. This is thematic mapping or spatial analysis with R. With a GIS, you have the ability to create spatial data – i.e. the boundaries, lines, and points; kind of like CAD. I don’t think R does that… yet. It might seem like a small distinction, but technically using a GIS is something separate from simply mapping with software. Hope this doesn’t come off too know-it-all-ish, because I think thematic mapping with R is awesomely cool. Great work.

  3. David Unwin says:

    Please note that the map is called a CHOROPLETH without the l. Nothing to do with the green gas, more the Greek choro = “area” and pleth = “value”. This is a common error but not one I would want to see on R bloggers. Tylerrinker has it correctly (above)
    David Unwin (author “Geographic Information Analysis”)

  4. Pingback: Data Viz News [14] | Visual Loop

  5. Felipe says:

    Tyler, Not sure if you are aware of a broken link to the Rmd file that created choropleth tutorial. Is there a way to get the Rmd from you? Thanks

  6. tylerrinker says:

    @Felipe can you try again? I clicked above and it worked. Here is the url link as well: https://dl.dropboxusercontent.com/u/61803503/wp/choropleth.Rmd

    Please let me know if the link in the blog post works.

    • Felipe says:

      Thank you very much Tyler, it worked. I thought I had seen some colored font on your tutorial that’s why I wanted to see the Rmd. I am looking for a way to color part of a paragraph in an Rmd file..do you by chance know how to do that?

  7. tylerrinker says:

    Are you talking about the way the html file looks after it’s rendered?

    • Felipe says:

      Yes, for example if I have this from your tutorial “Eliminating the Plot Box and Aligning the Legends” and I want to color it blue when the html is rendered..is there a way to do that?

      • tylerrinker says:

        Use `r FT(blue, text=”Eliminating the Plot Box and Aligning the Legends”)` which gives: Eliminating the Plot Box and Aligning the Legends. Note that this is from the development version of the reports package
        .

      • Felipe says:

        Thank you so much Tyler, I will try it out!!!

      • Felipe says:

        Tyler, the FT function is needed to be able to color the text,,,is it in the Reports package? if so, is it in CRAN already?

      • Felipe says:

        Answering my own question…just found it in CRAN

      • tylerrinker says:

        Felipe use the dev version. FT may not even be a part of the CRAN version. There’s a been a ton of development on the reports package since the first upload to cran. I’m waiting to push to CRAN until after slidify has been pushed to CRAN. Click here for the dev version of reports. Installing is pretty straight forward with the devtools package.

      • Felipe says:

        Thank you very much, I downloaded the developing version and it worked well…I got blue font now….Thank you again…great package.

  8. Pingback: Shape File Selfies in ggplot2 | TRinker's R Blog

  9. Pingback: Shape File Selfies in ggplot2 ← Patient 2 Earn

  10. Aaron Zeng says:

    Reblogged this on Aaron Zeng and commented:
    Nice tutorial on ggplot2 choropleth maps

  11. D8Amonk says:

    The state.key isn’t working anymore 😦

  12. sarika says:

    can we draw the same map without using shape file. Instead can we use google map and csv file to map the points?

  13. Hi tylerrinker!

    In the step:

    Source a Codebook for State Keys Used By Supreme Court Data
    source(“http://copy.com/zEtAXJC8tG7yv7Zz”)
    head(state.key)

    The url: http://copy.com/zEtAXJC8tG7yv7Zz is not wroking now. So, I can not get the “state.key” file by now.

    Do you have with you? If so, would you mind sending me a copy of it?
    OR
    Would you mind sending me a link from where I download it by myself?

    Waiting for your answer,

    HA

    • tylerrinker says:

      Sorry copy went away…This code would work:

      pacman::p_load_current_gh('trinker/textreadr', 'trinker/textshape')
      pacman::p_load(dplyr, tidyr)

      state.key = 'http://scdb.wustl.edu/_brickFiles/2012_01/SCDB_2012_01_codebook.pdf' %>%
      download() %>%
      read_pdf() %>%
      filter(page_id %in% 119:120) %>%
      filter(grepl("^(?!.*Values)(?=.*(^\\s*\\d{1,2}\\s+[A-Z][a-z]))", text, perl=TRUE)) %>%
      select(text) %>%
      mutate(text = gsub("\\s+", " ", trimws(text))) %>%
      extract(text, c("code", "state"), "(\\d+) (.+$+)") %>%
      mutate(code = as.numeric(code))

  14. Hector Alvaro Rojas says:

    Thanks for your answer.
    The code doesn’t work!
    At the very beginning at got this answer from R project: ———————————————
    > pacman::p_load_current_gh(‘trinker/textreadr’, ‘trinker/textshape’)
    Error in loadNamespace(name) : there is no package called ‘curl’
    Error in loadNamespace(name) : there is no package called ‘curl’
    Warning messages:
    1: In p_install_gh(package, …) :
    The following may have incorrect capitalization specification:

    textshape
    2: In pacman::p_load_current_gh(“trinker/textreadr”, “trinker/textshape”) :

    Failed to install/load:
    trinker/textreadr

    Anyway, I tried getting all the script done but I got this answer from R project: ————————–
    > pacman::p_load_current_gh(‘trinker/textreadr’, ‘trinker/textshape’)
    Error in loadNamespace(name) : there is no package called ‘curl’
    Error in loadNamespace(name) : there is no package called ‘curl’
    Warning messages:
    1: In p_install_gh(package, …) :
    The following may have incorrect capitalization specification:

    textshape
    2: In pacman::p_load_current_gh(“trinker/textreadr”, “trinker/textshape”) :

    Failed to install/load:
    trinker/textreadr
    > pacman::p_load(dplyr, tidyr)
    >
    > state.key = ‘http://scdb.wustl.edu/_brickFiles/2012_01/SCDB_2012_01_codebook.pdf’ %>%
    + download() %>%
    + read_pdf() %>%
    + filter(page_id %in% 119:120) %>%
    + filter(grepl(“^(?!.*Values)(?=.*(^\\s*\\d{1,2}\\s+[A-Z][a-z]))”, text, perl=TRUE)) %>%
    + select(text) %>%
    + mutate(text = gsub(“\\s+”, ” “, trimws(text))) %>%
    + extract(text, c(“code”, “state”), “(\\d+) (.+$+)”) %>%
    + mutate(code = as.numeric(code))
    Error in function_list[[i]](value) : could not find function “download”
    >
    I will try getting the code by transferring the file “SCDB_2012_01_codebook.pdf” to Excel and reading from there.
    At the same time I will get more information about some R-package that let me read “pdf files” from de web directly.
    I will be waiting from your answer.

    • tylerrinker says:

      Try installing curl first using install.packages(‘curl’)

      Then run the script…

      • Hector Alvaro Rojas says:

        Hi Taylor!

        Thanks again for answer.

        Yes! Now is working?

        I do not know why did not worked before. I tried what you said before but it did not worked for me. Now it does.

        I think it could be because now I started the R-project session installing “curl” package and then continue with the script.

        Before I run the script, I read the error message, I installed “curl” package and then run the script again. Using this way I continued getting the same error message from r-project.

        Thanks Taylor!

        Now, I can read “state.key” file and the script is doing all until the Minimal Choropleth Example. I got this nice map, but I cannot get through the rest of the results. I will try to fix it and will let you know if I can or cannot solve those problems.

        My best regards!

        HA

  15. Hector Alvaro says:

    Hi Taylor!

    How about you?

    I am back trying to get this wonderful tutorial done.

    I already get done the “Minimal Choropleth Example”. But now, when I tried to get done the first map after this (the one you called “Plot the Data”) I got an error message that I couldnt solve. Here it is all the steps involved:

    > head(“)
    state Freq
    1 Alabama 0.030062630
    2 Alaska 0.005010438
    3 Arizona 0.017954071
    4 Arkansas 0.010438413
    5 California 0.103549061
    6 Colorado 0.009603340
    > tail(dat_state)
    state Freq
    47 Vermont 0.0029227557
    48 Virginia 0.0212943633
    49 Washington 0.0225469729
    50 West Virginia 0.0100208768
    51 Wisconsin 0.0179540710
    52 Wyoming 0.0008350731
    > states_map head(states_map)
    long lat group order region subregion
    1 -87.46201 30.38968 1 1 alabama
    2 -87.48493 30.37249 1 2 alabama
    3 -87.52503 30.37249 1 3 alabama
    4 -87.53076 30.33239 1 4 alabama
    5 -87.57087 30.32665 1 5 alabama
    6 -87.58806 30.32665 1 6 alabama
    >
    > ggplot(dat_state, aes(map_id = state)) +
    + geom_map(aes(fill =Freq), map = states_map) +
    + expand_limits(x = states_map$long, y = states_map$lat)
    Error in unit(x, default.units) : ‘x’ and ‘units’ must have length > 0
    >

    I think the Error is on the “dat_state” file becouse I change the “dat” file and I got the plot but for the data in the file “crimes”, of course.

    Here is the script that is functioning.

    ggplot(crimes, aes(map_id = state)) +
    geom_map(aes(fill = Murder), map = states_map, color =”black”) +
    expand_limits(x = states_map$long, y = states_map$lat) +
    theme_few()+
    theme(legend.position = “bottom”,
    axis.ticks = element_blank(),
    axis.title = element_blank(),
    axis.text = element_blank()) +
    scale_fill_gradient(low=”white”, high=”blue”) +
    guides(fill = guide_colorbar(barwidth = 10, barheight = .5)) +
    ggtitle(“Choropleth Supreme Court”)

    I will be waiting from your answer.

    Thanks Taylor!

    HA

  16. Hector Alvaro Rojas says:

    Hi Taylor!

    What’s up?

    I got the entire tutorial done just a few minutes ago!

    I had some problems in the cleaning part of the new data (from the census 2000). The variables in the file 2khxx.csv did not fit with the scripts. Anyway, I fix de problem in my way by merging, sub setting and melting the dataset each time I needed it. I have to change the states names using tolower and that solved most of the problems.

    This is a great tutorial and it has given me another option to get choropleth maps done with R-project.

    Thanks for your support giving me the answers that I need.

    Best regards,

    HA

Leave a reply to Felipe Cancel reply