CRAN Download counts

I really like developing software and making my own life and work easier with it. But what I enjoy even more is to see others actually use it! So every now and then I look at CRAN download counts of my R packages. I’m not in any top-10 rankings or anything. But that was also never the point. I just like sharing my knowledge and see others use it!

So I’m not sure who the target is for this post. Other developers who want to use the code below to check their download numbers maybe? Mostly I just want to come here to have a quick look myself.

library(tidyverse, quietly = TRUE)
library(cranlogs)
theme_set(theme_minimal() + 
            theme(legend.position = "bottom"))

my_pkgs <- tools::CRAN_package_db() |> 
  filter(str_detect(Author, "Johannes Gruber"))
my_pkgs |> 
  as_tibble() |> 
  select(Package, Version, Author)
## # A tibble: 6 × 3
##   Package             Version Author                                            
##   <chr>               <chr>   <chr>                                             
## 1 askgpt              0.1.2   "Johannes Gruber [aut, cre]"                      
## 2 LexisNexisTools     0.3.7   "Johannes Gruber [aut, cre]"                      
## 3 quanteda.textmodels 0.9.6   "Kenneth Benoit [cre, aut, cph]\n    (<https://or…
## 4 rtoot               0.3.2   "David Schoch [aut, cre] (<https://orcid.org/0000…
## 5 rwhatsapp           0.2.4   "Johannes Gruber [aut, cre]"                      
## 6 stringdist          0.9.10  "Mark van der Loo [aut, cre] (<https://orcid.org/…

As the first step, I load the packages I need and filter the CRAN DB for packages I was involved with. There aren’t that many, so I could also recall them. But just for the future and in case I forget something, I solve the issue with R.

download_data <- my_pkgs |> 
  pull(Package) |> 
  cran_downloads(from = "2018-04-09", to = Sys.Date()) |> 
  filter(count > 0) |> 
  group_by(package) |> 
  mutate(count_total = cumsum(count))

Using cran_downloads from the package cranlogs, I get the downloads of my packages by day from the time my first package was released.1

Now let’s visualise this:

ggplot(download_data, aes(x = date, y = count, colour = package)) +
  geom_line() +
  scale_y_continuous(labels = scales::comma) +
  scale_fill_manual(values = wesanderson::wes_palette("Royal1")) +
  labs(x = NULL, y = NULL, colour = NULL, title = "Package Downloads per Day",
       caption = paste0("Data source: cranlogs.r-pkg.org; Updated: ", Sys.Date()))

That are the numbers by day. Here is the same for total counts:

ggplot(download_data, aes(x = date, y = count_total, colour = package)) +
  geom_line() +
  scale_y_continuous(labels = scales::comma) +
  scale_fill_manual(values = wesanderson::wes_palette("Royal1")) +
  labs(x = NULL, y = NULL, colour = NULL, titletitle = "Package Downloads over time",
       caption = paste0("Data source: cranlogs.r-pkg.org; Updated: ", Sys.Date()))

Now this visualisation is honestly a little silly, since I contributed only a single function to stringdist and some of the others and I don’t want to take credit for the success of these packages. So let’s have a look at the ones where I was actually the main author:

my_own_pkgs <- tools::CRAN_package_db() |> 
  filter(str_detect(Author, fixed("Johannes Gruber [aut, cre]")))

download_own_data <- my_own_pkgs |> 
  filter(str_detect(Author, "Johannes Gruber")) |> 
  pull(Package) |> 
  cran_downloads(from = "2018-04-09", to = Sys.Date()) |> 
  filter(count > 0) |> 
  group_by(package) |> 
  mutate(count_total = cumsum(count))

ggplot(download_own_data, aes(x = date, y = count_total, colour = package)) +
  geom_line() +
  scale_y_continuous(labels = scales::comma) +
  scale_fill_manual(values = wesanderson::wes_palette("Royal1")) +
  labs(x = NULL, y = NULL, colour = NULL, titletitle = "Package Downloads over time",
       caption = paste0("Data source: cranlogs.r-pkg.org; Updated: ", Sys.Date()))

And that’s it. Just replace my name with yours if you want to use the code above.


  1. I looked this up via httr::content(httr::GET(paste0("http://crandb.r-pkg.org/LexisNexisTools/all")))$versions[[1]]$date, but it didn’t seem neccesary to include it in the code↩︎

Related