I recently was reading the book “Functional Art” and came across the work of Stefanie Posavec. Her Sentence Drawings (click here to see and click here to learn) caught my attention. Here is a ggplot2 rendition:
From what I understand about this visualization technique it’s meant to show the aesthetic and organic beauty of language (click here for interview with artist). I was captivated and thus I began the journey of using ggplot2 to recreate a Sentence Drawing.
Getting Started
I decided to use data sets from the qdap package.
Installing Packages from GitHub
# install.packages("devtools") library(devtools) install_github("ggthemes", "jrnold") install.packages("qdap") install.packages("scales") invisible(lapply(c("qdap", "ggplot2", "ggthemes", "scales", "grid"), require, character.only = TRUE))
Right Turn Function
Stefanie Posavec describes the process for creating the Sentence Drawing by making a right turn at the end of each sentence. I went straight to work creating an inefficient solution to making right hand turns. Realizing the inefficiency, I asked for help and utilized this response from flodel. Here is the solution as a function that you’ll need to run.
turn_it <- function(dataframe, len.col, turn = -pi/2) { dat <- dataframe dat[, "turn"] <- rep(turn, nrow(dataframe)) dat <- within(dat, { facing <- pi/2 + cumsum(turn) move <- dat[, len.col] * exp(1i * facing) position <- cumsum(move) x2 <- Re(position) y2 <- Im(position) x1 <- c(0, head(x2, -1)) y1 <- c(0, head(y2, -1)) }) dat[, c("x1", "y1", "x2", "y2")] <- lapply(dat[, c("x1", "y1", "x2", "y2")], round, digits=0) data.frame(dataframe, dat[, c("x1", "y1", "x2", "y2")]) }
Plot It
Here are the turns represented visually.
n <- 15 set.seed(11) (dat <- data.frame(id = paste("X", 1:n, sep="."), lens=sample(1:25, n, replace=TRUE)))
## id lens
## 1 X.1 7
## 2 X.2 1
## 3 X.3 13
## 4 X.4 1
## 5 X.5 2
## 6 X.6 24
## 7 X.7 3
## 8 X.8 8
## 9 X.9 23
## 10 X.10 4
## 11 X.11 5
## 12 X.12 12
## 13 X.13 23
## 14 X.14 22
## 15 X.15 19
ggplot(turn_it(dat, "lens"), aes(x = x1, y = y1, xend = x2, yend = y2)) + geom_segment(aes(color=id), size=3,lineend = "round") + ylim(c(-40, 10)) + xlim(c(-20, 40))
Apply to Romeo and Juliet
Now that I had this accomplished I set to work with Romeo and Juliet.
Setting Up a Data Set
dat2b <- rajSPLIT dat2b$wc <- wc(rajSPLIT$dialogue) dat2b <- dat2b[!is.na(dat2b[, "wc"]), ] ## Reassign names to family affiliation dat2b[, "fam.aff"] <- factor(lookup(as.character(dat2b[, "fam.aff"]), levels(dat2b[, "fam.aff"])[1:3], qcv(Escalus, Capulet, Montague), missing = NULL)) ## Make dataframe with the beginning coordinates of each act beg_act <- do.call(rbind, lapply(with(turn_it(dat2b, "wc"), split(turn_it(dat2b, "wc"), act)), function(x) { x[1, qcv(act, x1, y1, x2, y2)] }))
Romeo and Juliet Plotted
ggplot(turn_it(dat2b, "wc"), aes(x = x1, y = y1, xend = x2, yend = y2)) + geom_segment(aes(color=fam.aff), lineend = "butt", size=1) + #geom_point(x=0, y=0, size=5, shape="S") + #geom_point(data=dat4b, aes(x=-106, y=-273), size=5, shape="E") + geom_point(data=beg_act, aes(x = x1, y=y1), size=2.3, colour = "grey25") + geom_text(data=beg_act, aes(x = x1, y=y1, label = paste("Act", act)), colour = "grey25", hjust = -.1, size=5, fontface = "bold") + guides(colour = guide_legend(override.aes = list(alpha = 1))) + theme_few() + scale_colour_few(name="Family\nAffiliation") + theme(axis.ticks = element_blank(), axis.text = element_blank(), axis.title= element_blank(), legend.position = c(.1, .85), legend.title.align = .5) + ggtitle("Romeo and Juliet Family\nAffiliation: Sentence Drawing")
After this I wanted to try to fill by sentence level polarity using a newer polarity (sentiment) algorithm from qdap.
poldat <- polarity(dat2b[, "dialogue"]) ggplot(turn_it(poldat[["all"]], "wc"), aes(colour=polarity)) + geom_segment(aes(x = x1, y = y1, xend = x2, yend = y2), lineend = "round", size=1) + theme_few() + theme(panel.background = element_rect(fill = "grey20"), axis.ticks = element_blank(), axis.text = element_blank(), axis.title= element_blank(), legend.direction = "horizontal", legend.title = element_text(colour="white"), legend.background = element_blank(), legend.text = element_text(colour="white"), legend.position = c(.80, .07)) + scale_colour_gradient2(name="", low = muted("blue"), mid = "white", high = muted("red")) + guides(colour = guide_colorbar(barwidth = 11, barheight = .75)) + ggtitle("Sentence Polarity: Sentence Drawing")
Thoughts…
While I like the aesthetics and organic feel of Stefanie Posavec’s Sentence Drawings I can’t help but to ask what this is showing me; what does such a visual afford the audience? I concluded that it captures that language isn’t linear but recursive and intricately linked. Posavec describes the tight spirals as choppy and the extended ones as flowing and smooth. However, I believe there are better ways to capture this sentiment while still balancing the notion of organic recursivity with identifying structure.
Visual representations, like this turn of talk plot below, capture meaningful patterns in the data and allow for comparisons but present the data as linear, when it really is not.
out <- tot_plot(dat2b, "dialogue", grouping.var = "fam.aff", facet.vars = "act", tot=FALSE, plot = FALSE) out + theme(legend.position = "bottom") + labs(fill="Family\nAffiliation")
Again, there must be a balance between capturing the essence of language and understanding the structure. Perhaps using pre-attentive attributes in a meaningful way would be a start to allowing Posavec’s representation to be more useful in finding the narrative in the data. The right hand turn she uses is arbitrary. I ask, what if the turn were meaningful, towards a particular demographic variable. I also could see the benefit of the use of Yihui’s animation package to show the fluid nature of the conversation. I may return to this blog post but I invite others to attempt the challenge of showing something meaningful in the data, while capturing the controlled chaos of language.
Click here for a complete script of this blog post
*Blog post created using the reports package
Very cool post! Thanks for providing your code. I have a quick question regarding the affiliation assignment code:
dat2b[, “fam.aff”] <- factor(lookup(as.character(dat2b[, "fam.aff"]),
levels(dat2b[, "fam.aff"])[1:3], qcv(Escalus, Capulet, Montague),
missing = NULL))
I'm getting this error.
Error in sort.list(y) : 'x' must be atomic for 'sort.list'
Have you called 'sort' on a list?
Do you have any thoughts on how I should get this to work?
Thanks,
John
@John Sorry it was so long getting back to you. I ran the code in a clean vanilla environment with R 3.0.2. Please install the dependencies as seen in the script and try it again. If you have further trouble feel free to email me your session info.
Pingback: Sentence Drawing 2: Part II | TRinker's R Blog
really awesome! I have a text file which i wanna do the same thing to in R but I don’t know how to get the data into a readable format. Could you give me some guidance on this? Thanks
Sure. I and others answer questions on http://www.talkstats.com and http://stackoverflow.com. I’d suggest you ask the question there and cut and paste part of your text file as well as link back to this blog post. That’s probably the easiest way to post the code and get help.
Reblogged this on Assessing Psyche, Engaging Gauss, Seeking Sophia.
Pingback: Narrative Charts Tell the Tale…. | OUseful.Info, the blog...
Pingback: Bitcoin Faucet Rotator Blog 16.10 Spiral plots of block solve rates