In contrast to other programming languages, R has no widely
established and undisputed style guide (such as PEP 8 for Python). As a
data scientist, I helped to establish a company-wide R style guide.
While it mainly relies on the tidyverse style guide, we
generally decided to be more explicit in our coding practice. In
particular, we always refer to functions from non-native R packages
with the double colon operator `::`. While it is relatively
easy to establish such a convention in new projects, it is challenging
to adapt ongoing projects and legacy code. `origin` allows
for much faster conversion of both legacy code and code that is
currently being written.
origin
The main purpose is to add `pkg::` to an R function call,
i.e. it turns an unqualified call such as `mutate()` into
`dplyr::mutate()`.
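
As a minimal illustration of what adding `pkg::` means, the sketch below contrasts an unqualified call with its explicitly namespaced form. It uses `median()` from the `stats` package only so that it runs without installing anything. That choice and the input values are assumptions made for illustration, and the snippet does not call `origin` itself.

```r
# Hypothetical input values, for illustration only.
values <- c(4, 8, 15, 16, 23, 42)

# Before: unqualified call. Which package provides median() depends on
# what is attached and in which order.
centre_unqualified <- median(values)

# After: the package is stated explicitly with the double colon operator.
centre_qualified <- stats::median(values)

# Both forms return the same result here; only the explicitness differs.
identical(centre_unqualified, centre_qualified)
```

The qualified form keeps working as intended even if another attached package later defines a function with the same name.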
In general, you can originize some selected text (more on that
later in Addins), a whole script, or all scripts in a specific folder,
e.g. your project folder. There is a specifically designed function for
each purpose, yet they all share the same options. Therefore, only
`originize_file()` is presented in detail here, with its default
options.

    originize_file(file = "testscript.R",
                   pkgs = .packages(),
                   overwrite = TRUE,
                   ask_before_applying_changes = TRUE,
                   ignore_comments = TRUE,
                   check_conflicts = TRUE,
                   add_base_packages = FALSE,
                   check_base_conflicts = TRUE,
                   check_local_conflicts = TRUE,
                   excluded_functions = list(dplyr = c("%>%", "across"),
                                             data.table = c(":=", "%like%"),
                                             # exclude from all packages:
                                             c("first", "last")),
                   verbose = TRUE,
                   use_markers = TRUE)

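
To make the idea behind `check_conflicts` in the call above more concrete: a namespace conflict arises when the same function name is provided by more than one of the considered packages. The sketch below is not how `origin` is implemented. The helper `find_conflicts()` and the package names `pkgA`, `pkgB`, and `pkgC` are hypothetical, and the export lists are made up so that the example is self-contained.

```r
# Hypothetical helper (not part of origin): given a named list that maps
# package names to their exported function names, return the function
# names exported by more than one package, together with those packages.
find_conflicts <- function(exports) {
  # One row per (package, function) pair.
  pairs <- data.frame(
    pkg = rep(names(exports), lengths(exports)),
    fun = unlist(exports, use.names = FALSE),
    stringsAsFactors = FALSE
  )
  # Group the package names by function name.
  by_fun <- split(pairs$pkg, pairs$fun)
  # Keep only functions that are provided by at least two packages.
  by_fun[lengths(by_fun) > 1]
}

# Made-up export lists for three hypothetical packages.
exports <- list(
  pkgA = c("filter", "lag", "first"),
  pkgB = c("filter", "select", "first"),
  pkgC = c("melt")
)

conflicts <- find_conflicts(exports)
print(conflicts)
```

For installed packages, a comparable input list could be built with `sapply(pkgs, getNamespaceExports, simplify = FALSE)`, where `pkgs` is a character vector of package names.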
- `pkgs`: which packages to check for functions used in the
  code (see Considered Packages). The default is all
  packages attached via `library` or `require`.
- `overwrite`: actually insert `pkg::` into the
  code. Otherwise, logging only shows what would happen. Note
  that `ask_before_applying_changes` still lets you keep
  control over your code before `origin` changes anything.
- `ask_before_applying_changes`: whether changes should be
  applied immediately or the user must approve them first.
- `check_conflicts`: should `origin` check for
  potential namespace conflicts, i.e. a used function that is defined in
  more than one considered package? User input is required to solve the
  issue. Setting this to `TRUE` is strongly encouraged.
- `add_base_packages`: should base packages also be added,
  e.g. `base::sum()`?
- `check_base_conflicts`: should `origin` also check for
  conflicts with base R functions?
- `check_local_conflicts`: should `origin` also check for
  conflicts with locally defined functions anywhere in your project? Note
  that it does not check the environment but solely parses files and scans
  them for function definitions.
- `excluded_functions`: a (named) list of functions to
  exclude from checking.
- `verbose`: whether logging is performed, either in
  the console or via the Markers tab in RStudio.
- `use_markers`: whether to use the Markers tab in
  RStudio.

Besides using regular R functions to originize files, there are also
useful addins delivered with `origin`. These addins are
designed to be used on the fly while coding. You can originize
selected text, the currently opened file, or all scripts in the
currently opened project. To give you as much control as when using
the functions, each function argument corresponds to an option that
can be set and is then used inside the addins.
In fact, most function arguments of `origin` first check
whether an option has been declared and use the assigned value as their
default. This gives the same outcome regardless of whether you use an
addin or call a function directly.
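
The pattern described here, where an argument falls back to a declared option, can be sketched in base R with `getOption()`. Both the function `originize_sketch()` and the option name `mypkg.overwrite` are hypothetical. The actual option names used by `origin` are not given in this text and should be looked up in the package documentation.

```r
# Hypothetical function: the default of `overwrite` is taken from an
# option if one has been declared, and falls back to TRUE otherwise.
originize_sketch <- function(overwrite = getOption("mypkg.overwrite", TRUE)) {
  if (overwrite) "changes are written" else "changes are only logged"
}

# Unset the hypothetical option and remember its previous value.
old <- options(mypkg.overwrite = NULL)

# No option declared: the fallback TRUE is used.
res_default <- originize_sketch()

# Option declared: it becomes the default of the argument.
options(mypkg.overwrite = FALSE)
res_option <- originize_sketch()

# An explicitly passed argument still takes precedence over the option.
res_explicit <- originize_sketch(overwrite = TRUE)

# Restore the previous option state.
options(old)
```

Because the option is only read when the argument is not supplied, a setting declared once per session applies both to direct function calls and to any caller that passes no arguments, which is the situation an addin is in.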
Since `origin` changes files on disk, it is very important
that the user has full control over what happens, and user input is
required before critical steps.
Most importantly, the user must be aware of what the originized file(s) would look like. For this, all changes and potentially missed changes are presented, either in the Markers tab (recommended) or in the console.