origin

Unlike some other programming languages, R has no widely established and undisputed style guide comparable to, e.g., PEP 8 for Python. As a data scientist, I helped to establish a company-wide R style guide. While it mainly relies on the tidyverse style guide, we decided to be more explicit in our coding practice. In particular, we always refer to functions from non-native R packages with the double colon operator ::. While it is relatively easy to establish such a convention in new projects, it is challenging to adapt ongoing projects and legacy code. origin makes converting both legacy code and newly written code much faster.

Purpose of origin

The main purpose is to add pkg:: to R function calls, so that each call to a function from a non-native package states explicitly which package it comes from.
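To illustrate the idea, here is a hand-written before/after pair. It is a sketch, not actual output of origin: dplyr is chosen only as a familiar example package, and the variable name mtcars_summary is hypothetical. The call is guarded so the snippet also runs when dplyr is not installed.

```r
# Before: which package a function comes from is implicit and depends
# on the packages attached via library().
# library(dplyr)
# mtcars_summary <- summarise(group_by(mtcars, cyl), mean_mpg = mean(mpg))

# After: every function from a non-base package names its package.
# mean() stays unqualified because it is a base R function.
if (requireNamespace("dplyr", quietly = TRUE)) {
  mtcars_summary <- dplyr::summarise(
    dplyr::group_by(mtcars, cyl),
    mean_mpg = mean(mpg)
  )
}
```

Both versions compute the same result; the second one simply no longer relies on library(dplyr) having been called earlier in the script.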

Usage of origin

In general, you can originize some selected text (more on that later in Addins), a whole script, or all scripts in a specific folder, e.g. your project folder. There is a specifically designed function for each purpose, yet they all share the same options. Therefore, only originize_file() is presented in detail as an example, with its default options.

Code Usage

library(origin)

originize_file(file = "testscript.R",
               pkgs = .packages(),
               overwrite = TRUE,
               ask_before_applying_changes = TRUE,
               ignore_comments = TRUE,
               check_conflicts = TRUE,
               add_base_packages = FALSE,
               check_base_conflicts = TRUE,
               check_local_conflicts = TRUE,
               excluded_functions = list(dplyr = c("%>%", "across"),
                                         data.table = c(":=", "%like%"),
                                         # exclude from all packages:
                                         c("first", "last")),
               verbose = TRUE,
               use_markers = TRUE)

Common Arguments

Addins

Besides using regular R functions to originize files, origin also ships with useful addins. These addins are designed to be used on the fly while coding. You can originize selected text, the currently opened file, or all scripts in the currently opened project. To give you as much control as when calling the functions directly, each function argument corresponds to an option that can be set and is then used by the addins, e.g.

options(origin.pkgs = c("dplyr", "data.table"),
        origin.overwrite = TRUE)

In fact, most function arguments of origin first check whether an option has been declared and, if so, use the assigned value as their default. This gives the same outcome regardless of whether you use an addin or call a function directly.
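The option-as-default pattern can be sketched with base R's getOption(). The function originize_sketch() below is a hypothetical stand-in, not part of origin, and the fallback values are assumptions made for the illustration; only the option names origin.pkgs and origin.overwrite are taken from the example above.

```r
# Hypothetical stand-in for an origin function: each argument defaults
# to the corresponding option and falls back to a fixed value if the
# option is not set.
originize_sketch <- function(pkgs = getOption("origin.pkgs", .packages()),
                             overwrite = getOption("origin.overwrite", TRUE)) {
  list(pkgs = pkgs, overwrite = overwrite)
}

# Set the options once, e.g. in a project-level .Rprofile.
old <- options(origin.pkgs = c("dplyr", "data.table"),
               origin.overwrite = FALSE)

# Called without arguments, the function picks up the options.
from_options <- originize_sketch()

# An explicitly passed argument still takes precedence over the option.
explicit <- originize_sketch(pkgs = "purrr")

# Restore the previous option values.
options(old)
```

With this design, an addin that calls the function without arguments behaves exactly like a scripted call that relies on the same options.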

Safety Measures

Since origin changes files on disk, it is very important that the user has full control over what happens and that user input is required before critical steps.
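The idea of requiring confirmation before a file is overwritten can be sketched as follows. This is not origin's implementation: write_if_confirmed() and its confirm argument are hypothetical names, and the prompt uses utils::menu(), which is only available in interactive sessions.

```r
# Hypothetical helper: write new_lines to path only if confirm()
# returns TRUE. The default asks the user in an interactive session
# and declines otherwise.
write_if_confirmed <- function(path, new_lines,
                               confirm = function() {
                                 interactive() &&
                                   utils::menu(c("Yes", "No"),
                                               title = "Apply changes?") == 1
                               }) {
  if (!isTRUE(confirm())) {
    message("No changes written to ", path)
    return(invisible(FALSE))
  }
  writeLines(new_lines, path)
  invisible(TRUE)
}

# Demonstration on a temporary file, with the answer passed in directly.
tmp <- tempfile(fileext = ".R")
writeLines("summarise(df)", tmp)

declined <- write_if_confirmed(tmp, "dplyr::summarise(df)",
                               confirm = function() FALSE)
after_decline <- readLines(tmp)  # file is unchanged

accepted <- write_if_confirmed(tmp, "dplyr::summarise(df)",
                               confirm = function() TRUE)
after_accept <- readLines(tmp)   # file now holds the new line
```

Passing the confirmation step as a function keeps the default interactive while still allowing the behaviour to be checked in scripts.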

Logging

Most importantly, the user must be aware of what the originized file(s) would look like. To this end, all changes and potentially missed changes are presented, either in the Markers tab (recommended) or in the console.
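As a rough illustration of a console report, the sketch below lists the lines that differ between the original and the originized version of a script. It assumes both versions are available as character vectors of equal length; report_changes() is a hypothetical helper and does not reproduce origin's Markers or console output.

```r
# Hypothetical helper: return one row per changed line, with the line
# number and the text before and after the change.
report_changes <- function(before, after) {
  stopifnot(length(before) == length(after))
  changed <- which(before != after)
  data.frame(line = changed,
             before = before[changed],
             after = after[changed],
             stringsAsFactors = FALSE)
}

before <- c("library(dplyr)",
            "x <- filter(mtcars, cyl == 4)",
            "n <- nrow(x)")
after <- c("library(dplyr)",
           "x <- dplyr::filter(mtcars, cyl == 4)",
           "n <- nrow(x)")

changes <- report_changes(before, after)
print(changes)
```

Reviewing such a report before anything is written to disk lets the user spot unwanted replacements early.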