“I graduated from Rice and joined RStudio right when the concept of data science was taking off. As a result, I feel like I’ve been at ground zero, or one of the ground zeroes, for the birth of a new field: data science.”
Garrett Grolemund, 37, is talking about R, a programming language and free software environment used in statistical computing and graphics. It was developed by researchers at the University of Auckland in New Zealand. Another New Zealander, Hadley Wickham, has developed open-source statistical analysis packages for R.
“The result is nothing like how I suspected corporate employment would be. Our biggest competitor is ourselves, because we make good free versions of most of our products. The goodwill we’ve built is phenomenal,” said Grolemund, who earned an M.S. and Ph.D. in statistics from Rice in 2012.
His doctoral adviser was Wickham, an assistant professor of statistics at Rice from 2008 to 2012, the creator of such R packages as ggplot2, and now the chief scientist at RStudio, the company that makes software for R. Grolemund works there as a data scientist and educator. When he joined, it was a startup with three employees and no income stream. Now it’s a multimillion-dollar firm.
Before coming to Rice, Grolemund earned a B.A. in psychology and an M.A. in statistics from Harvard in 2003. He attended law school at UCLA for one year but dropped out because of what he called “the lure of statistics, the way it discovers real things.”
Grolemund said, “RStudio’s ability to make a profit is tied into our ability to support the R community at large. Most of our hires, myself included, joined RStudio because we were already contributing to the R community and we wanted to keep doing so.”
In 2012, when he first started working at RStudio, he visited businesses like Google, eBay and Genentech to teach their employees how to use R. Now he creates educational materials that teach statisticians how to use R software to analyze data. “Education is one of the unique aspects of R,” he said. “There are so many learning materials to help you get started. It is important because R is a tool for data scientists, who might not see themselves as computer programmers.”
He created the widely used R cheatsheets and developed the Shiny tutorial. Shiny is a software package that lets users build interactive web apps directly from R. Grolemund has also authored two books, Hands-On Programming with R and R for Data Science (co-authored with Hadley Wickham), both published by O’Reilly Media. The second title is a #1 bestseller on Amazon.com.
Grolemund was an early instructor for DataCamp.com, the leading online education provider for data science and statistics. He designed and filmed six courses for DataCamp, and created four video courses on R, R Markdown, and Shiny for O’Reilly Media’s Safari education platform.
He is a frequent speaker and teacher at statistics and R conferences, and delivered keynote addresses at both the 2015 EARL (Effective Applications of the R Language) conference in Boston and the 2016 EARL conference in London. Last year he received the Excellence in Continuing Education Award from the American Statistical Association.
“Before leaving Rice, I wrote the Lubridate R package for working with dates and times data. Lubridate has become one of the foundations for R’s Tidyverse suite of packages. My article on Lubridate in the Journal of Statistical Software has been downloaded more than 68,000 times. That was really my entry point into the world of R and statistical computing,” he said.