
I find myself pasting URLs and lots of little pieces together lately. Now paste is a standard go-to guy when you wanna glue some stuff together. But often I find myself pasting and getting stuff like this:
paste(LETTERS)
 [1] "A" "B" "C" "D" "E" "F" "G" "H" "I" "J" "K" "L" "M" "N" "O" "P" "Q"
[18] "R" "S" "T" "U" "V" "W" "X" "Y" "Z"
Rather than the desired…
[1] "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
When I get into those situations I think, “Oh, better use collapse instead”, but I never really think before using paste (that is, whether I need collapse or sep, and why). This is inefficient and causes me to lack the time to write quality articles for Fox News (JK for those taking me seriously). This tutorial will give some basic and clear direction about the following functions:
paste(x)
paste0(x)
sprintf(x, y)
paste
paste has 3 arguments.
paste (..., sep = " ", collapse = NULL)
The ... is the stuff you want to paste together and sep and collapse are the guys to get it done. There are three basic things I paste together:
- A bunch of individual character strings.
- 2 or more vectors pasted element for element.
- One vector smushed together.
Here's an example of each, though not with the correct arguments (I'm building suspense here):
paste("A", 1, "%") #A bunch of individual character strings.
paste(1:4, letters[1:4]) #2 or more vectors pasted element for element.
paste(1:10) #One vector smushed together.
Here's the sep/collapse rule for each:
- A bunch of individual character strings – You want sep
- 2 or more vectors pasted element for element – You want sep
- One vector smushed together – Smushin' requires collapse
So here they are with the correct arguments:
paste("A", 1, "%") #A bunch of individual character strings.
paste(1:4, letters[1:4]) #2 or more vectors pasted element for element.
paste(1:10, collapse="") #One vector smushed together.
This yields:
> paste("A", 1, "%") #A bunch of individual character strings.
[1] "A 1 %"
> paste(1:4, letters[1:4]) #2 or more vectors pasted element for element.
[1] "1 a" "2 b" "3 c" "4 d"
> paste(1:10, collapse="") #One vector smushed together.
[1] "12345678910"
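The examples above all lean on the default sep = " ". Here is a small sketch of what happens when you set sep yourself, and when you use sep and collapse in the same call. The variable names are just made up for illustration:

```r
# sep goes between the pieces that are pasted side by side
no_space <- paste("A", 1, "%", sep = "")          # "A1%"
dashed   <- paste(1:4, letters[1:4], sep = "-")   # "1-a" "2-b" "3-c" "4-d"

# collapse glues the elements of the resulting vector into one string
alphabet <- paste(LETTERS, collapse = "")         # "ABCDEFGHIJKLMNOPQRSTUVWXYZ"

# both at once: sep is applied first, then collapse
combined <- paste(1:4, letters[1:4], sep = "-", collapse = ", ")
combined                                          # "1-a, 2-b, 3-c, 4-d"
```

So sep works across the arguments and collapse works down the result.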
paste0
paste0 is short for:
paste(x, sep="")
So it allows us to be lazier and more efficient. I'm lazy so I use paste0 a lot.
paste0("a", "b") == paste("a", "b", sep="")
## [1] TRUE
'nuff said.
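Two more things worth a quick sketch: paste0 is vectorized just like paste, and it still accepts collapse. The URL below is a made-up placeholder, not anything from the reports package:

```r
# Vectorized: one URL per page number (example.com is a placeholder)
urls <- paste0("http://example.com/page", 1:3, ".html")
urls
# "http://example.com/page1.html" "http://example.com/page2.html"
# "http://example.com/page3.html"

# collapse works with paste0 too, which fixes the LETTERS problem from the intro
alphabet <- paste0(LETTERS, collapse = "")
alphabet   # "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
```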
sprintf
I discovered this guy a while back but realized its value in pasting recently. Much of my work on the reports (Rinker, 2013) package requires that I piece together lots of chunks of URL and insert user-specific pieces. This can be a nightmare with all the quotation marks. A typical take may look like this:
person <- "Grover"
action <- "flying"
message(paste0("On ", Sys.Date(), " I realized ", person, " was...\n", action, " by the street"))
## On 2013-09-14 I realized Grover was... flying by the street
No joke it took me 6 tries before I formatted that without an error (missing quotes, spaces, and commas).
But we can use sprintf to make one string (fewer commas + fewer quotation marks = fewer errors) and feed in the elements that may differ from user to user or time to time. Let's look at an example to see what I mean:
person <- "Grover"
action <- "flying"
message(sprintf("On %s I realized %s was...\n%s by the street", Sys.Date(), person, action))
## On 2013-09-14 I realized Grover was... flying by the street
Boom, first time. It's easy to figure out the spacing and there aren't the commas and quotation marks to deal with. Just use the %s marker to denote that some element goes here and then feed the values in, in order, as additional arguments after the format string. For some applications sprintf is a superior choice over paste/paste0.
Note that these are not extensive, all-encompassing rules but guides for general use. Also be aware that sprintf is even cooler than I demonstrated here.
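As a small taste of the "even cooler" part, here is a sketch showing that sprintf is vectorized over its arguments and that markers other than %s exist. The names and counts are invented for illustration:

```r
people <- c("Grover", "Elmo")   # made-up example data
counts <- c(3L, 12L)            # %d expects whole numbers

# One format string, recycled across the vectors
msgs <- sprintf("%s has %d cookies", people, counts)
msgs     # "Grover has 3 cookies" "Elmo has 12 cookies"

# Zero-padding to a fixed width, handy for file names
padded <- sprintf("%03d", 7L)
padded   # "007"
```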
*Created using the reports package
References
- Tyler Rinker, (2013) reports: Package to assist in report writing. http://github.com/trinker/reports
Why do we need the escape \n?
Good question. If you run the code you’ll see the \n will break to the next line, however in the version of knitr I was using the \n was not respected. Yihui has since changed this behavior: https://github.com/yihui/knitr/issues/602
Nice post, thanks for sharing! I just discovered the laziness that paste0 allows, but sometimes have issues with R recognizing the function. I run R on multiple comps with different versions (a bad practice, yes)… Is it only available on later versions of R?
Yeah that can be difficult. I know my friend Dason Kurkiewicz has a downloadable GitHub package to deal with this sort of behavior. See: https://github.com/Dasonk/future You can ask him more in the issues section of the GitHub repo.
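For anyone who just wants a quick workaround: as far as I know, paste0 arrived in R 2.15.0, so older installs won't have it. A minimal sketch of a stand-in, assuming all you need is the usual behaviour (paste0_compat is a hypothetical name, not a function from any package):

```r
# Hypothetical fallback for old R versions that lack paste0
paste0_compat <- function(..., collapse = NULL) {
  paste(..., sep = "", collapse = collapse)
}

paste0_compat("a", 1:3)                   # "a1" "a2" "a3"
paste0_compat("a", 1:3, collapse = "+")   # "a1+a2+a3"
```

On a machine that does have paste0 the two should give the same results, so scripts using the stand-in stay portable.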
Thanks, I’ll check that out.
what a legendary post. paste drives me nuts.
the only remaining thing is how to format numbers to be:
%s, or
to 1 dp, or
to 2dp with a $/£ sign out front and a “m” at the end.
Indeed, I’d pay a dollar for that post too.
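Not a full post, but here is a rough sketch of the number formats asked about, using sprintf's %f marker with a precision. The numbers are invented, and the "m" suffix assumes you divide by a million yourself:

```r
x <- 3.14159
as_text <- sprintf("%s", x)      # "3.14159"
one_dp  <- sprintf("%.1f", x)    # "3.1"

revenue <- 1234567               # made-up figure
dollars_m <- sprintf("$%.2fm", revenue / 1e6)
dollars_m                        # "$1.23m"

# Same idea with a pound sign, written as a Unicode escape
pounds_m <- sprintf("\u00a3%.2fm", revenue / 1e6)
```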
Great work
Can we add column to csv file using this paste0 function.
Maybe, but you still have to reassign back to the original data frame and then pump it back out as a csv.
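A minimal sketch of that round trip, assuming a small made-up file (the column names and the temporary path are only for illustration):

```r
# Set up a throwaway csv so the example is self-contained
csv_path <- tempfile(fileext = ".csv")
write.csv(data.frame(id = 1:3, name = c("a", "b", "c")),
          csv_path, row.names = FALSE)

# Read it in, build the new column with paste0, assign it back
dat <- read.csv(csv_path, stringsAsFactors = FALSE)
dat$label <- paste0(dat$name, "_", dat$id)

# Pump it back out as a csv
write.csv(dat, csv_path, row.names = FALSE)
```

paste0 only builds the character vector; the reading, assigning and writing are what actually get the column into the file.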
Can I use Paste0() function to add the column in csv file ?
The sprintf function in R is badly broken when it comes to errors, e.g.
# R version 3.2.2 (2015-08-14)
> v=NULL
> sprintf("v = '%s'\n", v)
character(0)
> paste0("v = '", v, "'\n")
[1] "v = ''\n"
sprintf returns character(0). Not an error, not 'v=', just a zero-length character vector with no indication that something is wrong. As hideous as the paste functions are, they at least return something sensible in all cases.
If you use sprintf you will silently lose data and be completely unaware that it has happened.
I wouldn’t use the word broken; it behaves differently than you might want or expect it to. It is definitely easier to work with than pasting bits. paste is certainly viable if NULLs etc. are a concern, or you can do error checking yourself at the end or beginning with is.null or length(sprintf("v = '%s'\n", v)) > 0. But this is a point of preference. One more serious note is that sprintf may behave differently on different machines (i.e. Mac may produce a differently padded string than Windows).
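If the silent character(0) worries you, one option is a thin wrapper that checks the inputs first. This is only a sketch; safe_sprintf is a hypothetical helper, not part of base R or any package:

```r
# Hypothetical wrapper: fail loudly instead of returning character(0)
safe_sprintf <- function(fmt, ...) {
  args <- list(...)
  # NULL and other zero-length inputs both have length 0
  empty <- vapply(args, function(a) length(a) == 0L, logical(1))
  if (any(empty)) {
    stop("zero-length argument(s) at position(s): ",
         paste(which(empty), collapse = ", "))
  }
  sprintf(fmt, ...)
}

v <- NULL
length(sprintf("v = '%s'", v))    # 0, the silent case from the comment above
ok <- safe_sprintf("v = '%s'", "x")
ok                                # "v = 'x'"
# safe_sprintf("v = '%s'", v) would stop with an error instead
```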
I think the default value for sep frequently drives people crazy. I was not expecting an empty space there.
paste(1:4, letters[1:4])
I will get:
1 a
2 b
3 c
4 d
How to write in order to get the following list:
1 a
1 b
1 c
1 d
2 a
2 b
…
So all the combinations of the provided numbers and letters.
Thanks
Well, rep is my go-to for this, but expand.grid is probably easier in this situation:
with(expand.grid(letters[1:4], 1:4), paste(Var2, Var1))
Thanks, it works. Also any way to get the combinations without space: 1a, 1b,…
Just found another, easier solution!
levels(interaction(var1, var2, sep=""))
where var1 and var2 are your lists.
@Gor nice solution
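To pull the thread's suggestions together, here is a sketch of three ways to get every number/letter combination without the space. The variable names are made up, and note that the interaction version may return the same combinations in a different order:

```r
nums <- 1:4
lets <- letters[1:4]

# expand.grid builds every combination; paste0 drops the space
combos_grid <- with(expand.grid(lets, nums), paste0(Var2, Var1))

# rep repeats each number once per letter; the letters get recycled
combos_rep <- paste0(rep(nums, each = length(lets)), lets)

# interaction keeps every combination as a factor level
combos_lvls <- levels(interaction(nums, lets, sep = ""))

head(combos_grid, 5)   # "1a" "1b" "1c" "1d" "2a"
```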
Thanks a lot; from everyone of us.