Maintaining compatibility with R package updates

February 7, 2021   

I frequently use the {furrr} R package to add parallelization to the execution of an R program. It’s incredibly easy to switch a purrr::map() call to furrr::future_map() to improve the performance of a time-intensive code base.

One issue I ran into recently is that the version 0.2.0 release of {furrr} introduced some breaking changes, with the future_options() function now renamed to furrr_options().

This led to some trouble, as I’ve been working on an R program that has to run on two different machines. One has {furrr} version 0.2.0 installed, while the other is still on an older version.

It would have been preferable to use the {renv} package in situations like this, but for various reasons that was not possible given the time constraints of the project. So instead, I wrote a workaround: a wrapper function that calls whichever options function exists in the installed version of {furrr}.

To determine which version of a package is installed on a given machine, I can run:

utils::packageVersion("furrr")
## [1] '0.2.2'

And to determine which *_options() function should be called, I can compare the installed version against the version that introduced the change:

utils::packageVersion("furrr") < package_version("0.2.0")
## [1] FALSE
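
As a side note, here is a minimal sketch of why the comparison goes through package_version() rather than plain character strings. The pkg_older_than() helper is a hypothetical name introduced only for this illustration; it is not part of {utils} or any other package.

```r
# Plain strings are compared character by character, so "0.10.0" sorts
# before "0.2.0" because "1" comes before "2".
"0.10.0" < "0.2.0"

# package_version() compares each numeric component in turn, so 10 is
# correctly treated as larger than 2.
package_version("0.10.0") < package_version("0.2.0")

# Hypothetical helper: TRUE when the installed version of `pkg` is older
# than `version` (given as a string such as "0.2.0").
pkg_older_than <- function(pkg, version) {
  utils::packageVersion(pkg) < package_version(version)
}
```

With a helper like this, the check above could be written as pkg_older_than("furrr", "0.2.0").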

To maintain compatibility across both machines, I wrote a small wrapper to call the correct *_options() function based on which version of {furrr} was installed:

#' Set `{furrr}` options based on package version.
#' 
#' `{furrr}` version 0.2.0 introduced changes that were not backwards 
#' compatible with earlier versions of the package.
#' 
#' @param ... Arguments to be passed to either `furrr::furrr_options()` or 
#' `furrr::future_options()`.
#' 
set_furrr_options <- function(...) {
  
  furrr_version <- utils::packageVersion("furrr")
  
  if (furrr_version < package_version("0.2.0")) {
    
    .options <- furrr::future_options(...)
    
  } else {
    
    .options <- furrr::furrr_options(...)
    
  }
  
  return(.options)
  
}
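
A variation on the wrapper above, offered as a sketch rather than the approach used in this post, is to check whether the installed package exports furrr_options() at all instead of comparing version numbers. The names has_export() and set_furrr_options_by_export() are hypothetical and exist only for this example; getNamespaceExports() and requireNamespace() are base R functions.

```r
# Hypothetical helper: TRUE if the package `pkg` is installed and exports
# an object named `fn`.
has_export <- function(pkg, fn) {
  requireNamespace(pkg, quietly = TRUE) && fn %in% getNamespaceExports(pkg)
}

# Hypothetical alternative to set_furrr_options(): pick the options
# function by checking what the installed {furrr} actually exports.
set_furrr_options_by_export <- function(...) {
  if (has_export("furrr", "furrr_options")) {
    furrr::furrr_options(...)
  } else {
    furrr::future_options(...)
  }
}
```

The trade-off is that this does not record which release introduced the change, so the version comparison is arguably easier to read later. On the other hand, it does not depend on remembering the exact version number.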

Now I can run a program in parallel across servers with different installed versions of {furrr}.

future::plan(future::multisession, workers = 2)

.options <- set_furrr_options(seed = TRUE)

results <- furrr::future_map(
  .x = 1:4,
  .f = sqrt,
  .options = .options
)

I used similar logic in another program that made use of the dplyr::group_split() function. It’s a very useful function, but since it’s still in the experimental lifecycle stage, the API isn’t fully stable yet. The version 1.0.0 release deprecated the keep argument in favor of .keep:

dplyr::group_split(mtcars, cyl, keep = TRUE)
## Warning: The `keep` argument of `group_split()` is deprecated as of dplyr 1.0.0.
## Please use the `.keep` argument instead.
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_warnings()` to see where this warning was generated.

But I can easily catch this as well:

dplyr_version <- utils::packageVersion("dplyr")

if (dplyr_version < package_version("1.0.0")) {
  
  dplyr::group_split(mtcars, cyl, keep = TRUE)
  
} else {
  
  dplyr::group_split(mtcars, cyl, .keep = TRUE)
  
}
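
If the same check is needed in several places, it could be folded into a wrapper in the same spirit as set_furrr_options(). The following is only a sketch: group_split_compat() is a hypothetical name, and it assumes that versions of {dplyr} before 1.0.0 accept keep while later versions accept .keep, as the deprecation warning above suggests.

```r
# Hypothetical wrapper: forwards the grouping columns in `...` and passes
# `keep` under whichever argument name the installed {dplyr} expects.
group_split_compat <- function(.tbl, ..., keep = TRUE) {
  if (utils::packageVersion("dplyr") < package_version("1.0.0")) {
    # Older releases: the argument is named `keep`.
    dplyr::group_split(.tbl, ..., keep = keep)
  } else {
    # 1.0.0 and later: the argument is named `.keep`.
    dplyr::group_split(.tbl, ..., .keep = keep)
  }
}
```

A call such as group_split_compat(mtcars, cyl) would then run without the deprecation warning on either version.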