class: center, middle, title-slide <a href="https://github.com/edgar-treischl/slidesCode" class="github-corner" aria-label="View source on GitHub"><svg width="80" height="80" viewBox="0 0 250 250" style="fill:#151513; color:#fff; position: absolute; top: 0; border: 0; right: 0;" aria-hidden="true"><path d="M0,0 L115,115 L130,115 L142,142 L250,250 L250,0 Z"/><path d="M128.3,109.0 C113.8,99.7 119.0,89.6 119.0,89.6 C122.0,82.7 120.5,78.6 120.5,78.6 C119.2,72.0 123.4,76.3 123.4,76.3 C127.3,80.9 125.5,87.3 125.5,87.3 C122.9,97.6 130.6,101.9 134.4,103.2" fill="currentColor" style="transform-origin: 130px 106px;" class="octo-arm"/><path d="M115.0,115.0 C114.9,115.1 118.7,116.5 119.8,115.4 L133.7,101.6 C136.9,99.2 139.9,98.4 142.2,98.6 C133.8,88.0 127.5,74.4 143.8,58.0 C148.5,53.4 154.0,51.2 159.7,51.0 C160.3,49.4 163.2,43.6 171.4,40.1 C171.4,40.1 176.1,42.5 178.8,56.2 C183.1,58.6 187.2,61.8 190.9,65.4 C194.5,69.0 197.7,73.2 200.1,77.6 C213.8,80.2 216.3,84.9 216.3,84.9 C212.7,93.1 206.9,96.0 205.4,96.6 C205.1,102.4 203.0,107.8 198.3,112.5 C181.9,128.9 168.3,122.5 157.7,114.1 C157.9,116.9 156.7,120.9 152.7,124.9 L141.0,136.5 C139.8,137.7 141.6,141.9 141.8,141.8 Z" fill="currentColor" class="octo-body"/></svg></a><style>.github-corner:hover .octo-arm{animation:octocat-wave 560ms ease-in-out}@keyframes octocat-wave{0%,100%{transform:rotate(0)}20%,60%{transform:rotate(-25deg)}40%,80%{transform:rotate(10deg)}}@media (max-width:500px){.github-corner:hover .octo-arm{animation:none}.github-corner .octo-arm{animation:octocat-wave 560ms ease-in-out}}</style> # Code Quality and Style ### <a href="http://www.edgar-treischl.de" target="_blank">Dr. Edgar J. 
Treischl</a>

.white[Last update: 2026-02-22]

.footnote.slide-footer[
Press ⭕ or ➡️ to navigate
]

<style type="text/css">
.reduced_opacity {
  opacity: 0.5;
}
</style>

---

## Agenda

### 01 Bad Habits

### 02 Clean Code

### 03 Modular Code

### 04 Maintain Code and Develop with Style

---
class: middle, center, inverse

## Let's face the truth, develop code but ...

---
background-image: url("images/beaker2.gif")
background-size: cover
class: bottom, center

# .white[Get rid of bad habits <br> 😄]

<div class="remark-footer"><a href="https://giphy.com/gifs/moodman-batman-smack-smacked-Qumf2QovTD4QxHPjy5" target="_blank" style="color: white;">Images source: MOODMAN</a></div>

---

## Wipe the Slate Clean: Restart R Like a Pro

- Ever had code that worked perfectly on your machine but mysteriously broke when someone else tried to run it?
- Old objects in your global environment linger like ghosts:

``` r
# Leftover objects
important_setting <- "production"

# Previously loaded functions from packages
library(CoolCats)

# Changed global options
options(stringsAsFactors = TRUE) # Please don't do this in 2025
```

Your code may unknowingly rely on these leftovers. It runs perfectly on your machine, but fails for anyone else who doesn’t share your local setup.

---

## Wipe the Slate Clean: Restart R Like a Pro

.pull-left[

- 🔁 Make it a Habit: Restart your R session and run your code from scratch (Ctrl + Shift + F10)
- Don't save the workspace and don't load the workspace from an `.RData` file
- It’s the only way to ensure your code runs in a fresh, clean R session

### Abandon the rm approach 🔫

- `rm(list = ls())` only deletes user-created objects from the global workspace
- **But**: The code may still break due to hidden dependencies: attached packages are not detached, changed options are not restored, and the working directory is untouched!
]

.pull-right[
<figure>
<img src="https://rstats.wtf/img/rstudio-workspace.png" style="width: 100%"/>
</figure>
]

---
background-image: url(https://github.com/rstudio/hex-stickers/blob/main/PNG/knitr.png?raw=true)
background-position: 90% 5%
background-size: 8%

## R vs. Rmd

### 👉 Code lives in R, Code and Text in Rmd files

If you need to create a document from an `.R` file, run:

``` r
# spin() converts an R script to an Rmd file (and knits it by default)
knitr::spin("script.R")
```

- Lines starting with `#'` (roxygen-style comments) are treated as text
- Add a `YAML` header to the script to control the output

If you need to extract the code from an `.Rmd` file, run:

``` r
# Extracts R code chunks from an Rmd file
# ("report.Rmd" is a placeholder file name)
knitr::purl("report.Rmd")
```

- Adjust the level of extraction with the `documentation` parameter (e.g., code only)
- Set the chunk option `purl = FALSE` to exclude a code chunk from the extraction (see Xie 2024)

---
background-image: url("images/beaker1.gif")
background-size: cover
class: bottom, center

## .orange[Write clean code, dude! 👻]

---

## Write Clean Code

### Clean code prioritizes:

- Readability
- Simplicity
- Clarity

### Messy code often reveals itself:

- Long, complicated functions that try to do too much
- Duplicated code scattered throughout scripts
- Inconsistent naming that leaves you guessing what’s going on
- Hardcoded values
- Absolute paths

---

## Don't Go Places Where You Don't Belong ...

<img src="images/drake.png" style="width: 65%"/>

---

## Abandon absolute paths, they will break anyway

``` r
# Don't:
readr::read_csv("~/Documents/reports/data/104_data.csv")
```

```
## Error:
## ! '~/Documents/reports/data/104_data.csv' does not exist.
```

### ⚡ Create a project and use the `here` package:

``` r
# Here returns the path to the project
here::here()
```

```
## [1] "/Users/edgar/Develop/slides/slidesCode"
```

``` r
# Creating a path to the file is a piece of cake
here::here("data", "muppets_income.csv")
```

```
## [1] "/Users/edgar/Develop/slides/slidesCode/data/muppets_income.csv"
```

---

## Avoid Hardcoding Values or I’ll Burn Your Code

- The pain begins when you forget to turn hardcoded values into function parameters:

``` r
library(dplyr)

filtered_data <- mtcars |>
  filter(mpg > 25)
```

### 🐸 This makes your code a mess to maintain

- What if you need to filter on a different value? You’d have to search through your code to find every instance of the hardcoded value and change it, and you would probably miss half of them.
- Another one? Rounding hell in a large report:

``` r
# Hardcoded rounding is not bad, true?
mtcars |>
  mutate(mpg_rounded = round(mpg, 2))
```

---

## Avoid Hardcoding Values II

### Set parameters at the top of your file while developing ...

``` r
round_digits <- 2
mpg_threshold <- 25

mtcars |>
  filter(mpg > mpg_threshold) |>
  mutate(mpg_rounded = round(mpg, round_digits))
```

```
##                 mpg cyl  disp  hp drat    wt  qsec vs am gear carb
## Fiat 128       32.4   4  78.7  66 4.08 2.200 19.47  1  1    4    1
## Honda Civic    30.4   4  75.7  52 4.93 1.615 18.52  1  1    4    2
## Toyota Corolla 33.9   4  71.1  65 4.22 1.835 19.90  1  1    4    1
## Fiat X1-9      27.3   4  79.0  66 4.08 1.935 18.90  1  1    4    1
## Porsche 914-2  26.0   4 120.3  91 4.43 2.140 16.70  0  1    5    2
## Lotus Europa   30.4   4  95.1 113 3.77 1.513 16.90  1  1    5    2
##                mpg_rounded
## Fiat 128              32.4
## Honda Civic           30.4
## Toyota Corolla        33.9
## Fiat X1-9             27.3
## Porsche 914-2         26.0
## Lotus Europa          30.4
```

- Once you start wrapping code into functions, pass values as arguments, even if you don’t plan to change them right now. You’re future-proofing your code.
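
While developing, such settings can also be kept together in one object. A minimal sketch, assuming a plain named list is enough (the name `params` and its elements are hypothetical and not part of the original slides):

```r
# Hypothetical settings object: `params` and its element names are illustrative
params <- list(
  mpg_threshold = 25,
  round_digits = 2
)

# Base R version of the filter-and-round step, driven only by `params`
high_mpg <- mtcars[mtcars$mpg > params$mpg_threshold, ]
high_mpg$mpg_rounded <- round(high_mpg$mpg, params$round_digits)

nrow(high_mpg)
```

Changing a value in `params` updates every step that reads it, and the list can later be handed to a function as a single argument.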
---

## Avoid Hardcoding Values III

### Move it into a function (with a default):

``` r
filter_and_round <- function(data, mpg_limit, digits = 2) {
  data |>
    dplyr::filter(mpg > mpg_limit) |>
    dplyr::mutate(mpg_rounded = round(mpg, digits))
}

# Use the default
filter_and_round(mtcars, mpg_limit = 25)
```

```
##                 mpg cyl  disp  hp drat    wt  qsec vs am gear carb
## Fiat 128       32.4   4  78.7  66 4.08 2.200 19.47  1  1    4    1
## Honda Civic    30.4   4  75.7  52 4.93 1.615 18.52  1  1    4    2
## Toyota Corolla 33.9   4  71.1  65 4.22 1.835 19.90  1  1    4    1
## Fiat X1-9      27.3   4  79.0  66 4.08 1.935 18.90  1  1    4    1
## Porsche 914-2  26.0   4 120.3  91 4.43 2.140 16.70  0  1    5    2
## Lotus Europa   30.4   4  95.1 113 3.77 1.513 16.90  1  1    5    2
##                mpg_rounded
## Fiat 128              32.4
## Honda Civic           30.4
## Toyota Corolla        33.9
## Fiat X1-9             27.3
## Porsche 914-2         26.0
## Lotus Europa          30.4
```

#### 👉 This approach makes your code flexible, readable, and much easier to maintain

---
background-image: url("images/cookimonster.gif")
background-size: cover
class: bottom, center

## .white[Break ’em into nice bites 😜]

---

## Write Clean Code for Humans, not 🤖

.pull-left[

- *Optimize for Readers, not the Writer*: Apply the “Six-Month Rule”
- *Reduce cognitive load*: No deep nesting, no long functions
- *Keep one level of abstraction* per code block (next section: Don't Dump Everything in One Function)
- *Avoid cleverness/hacks*: Write dull but clear code
- *Be consistent* (see section about Tidyverse Style)
- *Make dependencies explicit* (Input > Output)
- *Enable local reasoning*: Can I understand the code without running it? Can I understand it without looking at other parts of the codebase?
]

.pull-right[

``` r
# Deeply nested and hard to read
result <- mean(na.omit(as.numeric(subset(mtcars, am == 1 & cyl == 6)$hp)))

# Apply logical steps that future you/others can understand
filtered_cars <- subset(mtcars, am == 1 & cyl == 6)
numeric_hp <- as.numeric(filtered_cars$hp)
result <- mean(na.omit(numeric_hp))
result
```

```
## [1] 131.6667
```

]

---

## Modular Code

### 👉 Each function should do one thing and do it well

- Modularity means breaking your code into smaller, self-contained pieces, typically functions, where each one handles one specific task
- Instead of writing one long script that tries to do everything, modular code splits the work into clear, manageable chunks (Single Responsibility Principle)

### 👉 Modular code makes your code easier to read, test, and develop

- Each piece can be understood and worked on independently
- Reusable functions save time and reduce duplication
- Testing is simpler because you can write small, focused tests instead of validating everything at once
- Changes are easier to apply, since you only need to update one part of the codebase

---

## Don’t Dump Everything in One Function

.pull-left[

- It’s tempting to just get everything working in a single function or a long script: a pipeline that loads the data, cleans it, analyzes it, calls your mum, and spits out the result.
- As soon as you need to tweak something, explain it to someone else, or reuse part of it, things get messy. Suddenly, you’re stuck picking apart a script that’s hard to read, hard to change, and easy to break.

]

.pull-right[

``` r
# Load data
df <- read.csv("data.csv")

# Data cleaning
df <- df |>
  filter(!is.na(price)) |>
  mutate(price_log = log(price)) |>
  do_further_fancy_things()

# Analysis
model <- lm(price_log ~ bedrooms + sqft, data = df)

# Plotting
plot(df$sqft, df$price_log)
```

]

### 💋 Keep it simple, stupid!
Don't violate the KISS principle

---

## Don’t Dump Everything in One Function

- At first glance, it might seem fine, but now imagine you want to reuse the cleaning logic in a different script. Or test how the model works on another dataset. Or change the plotting code without rerunning the model.

### 🐸 You’re stuck. Everything depends on everything else.

### 👉 Modularize Code: Break your code into logical, reusable pieces

- Break your logic into small, focused functions that each do one thing.
- That’s called *separation of concerns*. Separate *at least* logically:
  - Put code that belongs to data preparation, analysis, and reporting into *separate functions*
  - Organize your codebase, create separate R scripts (e.g., analyze.R, plot.R) that follow your internal logic
  - Define functions for each logical, reusable step

---

## Don’t Dump Everything in One Function

### 🍪 Break ’em into nice bites ...

```r
# source.R
load_data <- function(path) {
  read.csv(path)
}

clean_data <- function(df) {
  df |>
    filter(!is.na(price)) |>
    mutate(price_log = log(price))
  ...
}
#> And so on ...
```

### ⚡ This approach is not a clean separation of concerns if your function does more than one thing, but it’s a start!

---

## Don’t Dump Everything in One Function

### 🍪 Save and source it ...

``` r
# Source the functions
source("source.R")

# Main script
df <- load_data("data.csv")
df_clean <- clean_data(df)
model <- fit_model(df_clean)
plot_relationship(df_clean)
```

<div class="info-box">
<i>💡</i> This version is better because it's modular, readable, and flexible. Want to swap in a new dataset? Easy. Want to test the clean_data function? Just do it, you are on fire!
</div>

---

## Don’t Dump Everything in One Function

### The code needs to run in sequence?
Wrap it in a function

``` r
# Wrap the workflow in a function
run_job <- function(data_path) {
  df <- load_data(data_path)
  df_clean <- clean_data(df)
  model <- fit_model(df_clean)
  send_mail(result = model)
  result <- plot_relationship(df_clean)
  return(result)
}
```

- This makes the workflow explicit: It helps others understand how your code fits together
- Only after breaking the code into logical, reusable pieces can we really see how to refactor it. Of course, it is extra work, but it’s part of the development process.

#### Finally, you can build an R package for your source code 💋

---

## Don’t Repeat Yourself (DRY)

### 👉 Wrap It in a Function

.pull-left[

<div class="info-box">
<i>💡</i> Repeating the same code multiple times increases the risk of errors, bloats scripts, and makes maintenance harder. The DRY principle encourages you to write logic once and reuse it via functions. This makes your code cleaner, easier to test, and more maintainable.
</div>

]

.pull-right[

``` r
# Some repetitive code ...
library(dplyr)
library(gapminder)

# Filter for Germany in 2007
germany_2007 <- gapminder |>
  filter(country == "Germany", year == 2007) |>
  select(country, year, lifeExp, gdpPercap)

# Filter for Japan in 2007
japan_2007 <- gapminder |>
  filter(country == "Japan", year == 2007) |>
  select(country, year, lifeExp, gdpPercap)

# Filter for France in 2007
# And so on ...
```

]

---

## Don’t Repeat Yourself II

Turn repeated code into parameterized functions to save time and avoid bugs

``` r
slice_gapminder <- function(data, country, year) {
  data |>
    dplyr::filter(country == !!country, year == !!year) |>
    dplyr::select(country, year, lifeExp, gdpPercap)
}
```

### 👉 Clean and reusable:

``` r
japan_2007 <- slice_gapminder(gapminder, "Japan", 2007)
japan_2007
```

```
## # A tibble: 1 × 4
##   country  year lifeExp gdpPercap
##   <fct>   <int>   <dbl>     <dbl>
## 1 Japan    2007    82.6    31656.
```

---

## Don’t Repeat Yourself III

Creating a function does not mean that we are done.
There is another very common blind spot:

``` r
# Coding is hard work
japan_2002 <- slice_gapminder(gapminder, "Japan", 2002)
germany_2002 <- slice_gapminder(gapminder, "Germany", 2002)
france_2002 <- slice_gapminder(gapminder, "France", 2002)
```

Functions by themselves do not solve the problem of repetition if we call them repeatedly with only slight variations.

### 👉 Base R and purrr help to avoid repetition

---

## Base R against DRY

``` r
# Define inputs
countries <- c("Japan", "Germany", "France")
years <- rep(2002, length(countries))

# Apply the function to both vectors using mapply()
results_list <- mapply(function(country, year) {
  slice_gapminder(gapminder, country, year)
}, countries, years, SIMPLIFY = FALSE)

# Combine the list of data frames into one
results <- do.call(rbind, results_list)
results
```

```
## # A tibble: 3 × 4
##   country  year lifeExp gdpPercap
## * <fct>   <int>   <dbl>     <dbl>
## 1 Japan    2002    82      28605.
## 2 Germany  2002    78.7    30036.
## 3 France   2002    79.6    28926.
```

---

## Hello Purrr

.pull-left[

<div class="info-box">
<i>🦜</i> The purrr package helps you avoid such repetitive code (Wickham and Henry 2025). The package is part of the tidyverse and provides a consistent, readable way to perform iteration in R. Instead of writing loops or repeating function calls manually, purrr lets you apply functions over lists, vectors, or data frames using a family of tools like *map*, *map2*, and *pmap*.
</div>

]

.pull-right[
<figure>
<a href="https://purrr.tidyverse.org" target="_blank"><img src="https://purrr.tidyverse.org/logo.png" alt="purrr.tidyverse.org" width="55%" align="center"/> </a>
</figure>
]

---
background-image: url(https://purrr.tidyverse.org/logo.png)
background-position: 90% 5%
background-size: 8%

## Purrr against DRY

``` r
# Use purrr to iterate over countries and years
library(purrr)

# Extract the data for multiple countries and a specific year
countries <- c("Japan", "Germany", "France")
years <- rep(2002, length(countries))

# Map2 to apply the function across both vectors
results <- map2(countries, years, ~ slice_gapminder(gapminder, .x, .y))

# Map2 returns a list, so we can bind the results
results |> dplyr::bind_rows()
```

```
## # A tibble: 3 × 4
##   country  year lifeExp gdpPercap
##   <fct>   <int>   <dbl>     <dbl>
## 1 Japan    2002    82      28605.
## 2 Germany  2002    78.7    30036.
## 3 France   2002    79.6    28926.
```

---
background-image: url(https://purrr.tidyverse.org/logo.png)
background-position: 90% 5%
background-size: 8%

## Purrr against DRY

- As outlined, you don’t need purrr to avoid repetition: you can achieve similar results using base R functions like `mapply()`, `sapply()`, and other *apply* variants.
- This can be useful if you want to reduce dependencies or keep your code lightweight.
- However, purrr offers a more consistent and readable syntax, especially when working within the tidyverse ecosystem. Its functions are designed to work seamlessly with tibbles and dplyr verbs, making it easier to integrate into data analysis workflows.
- Purrr has further goodies like:
  - Progress bars
  - Parallel computing

### 👉 Purrr your way and DRY!

---
background-image: url("images/beaker3.gif")
background-size: cover
class: bottom, center

# .white[Robust Code doesn't bite ... 🪿]

---

## Robust Code

A robust function behaves consistently: given the same input, it always returns the same output. This reliability starts with clearly defined inputs and outputs.
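
To make "same input, same output" concrete, here is a minimal sketch; the function names `add_tax_global()` and `add_tax()` and the tax rates are hypothetical and only serve as an illustration:

```r
# Fragile: the result silently depends on a global variable
tax_rate <- 0.19
add_tax_global <- function(price) {
  price * (1 + tax_rate)
}

# Robust: everything the function needs arrives through its arguments
add_tax <- function(price, tax_rate) {
  price * (1 + tax_rate)
}

before <- add_tax_global(100)
tax_rate <- 0.07                 # someone changes the global elsewhere ...
after <- add_tax_global(100)     # ... and the same call returns another value

# The explicit version depends only on its inputs
explicit <- add_tax(100, tax_rate = 0.19)
```

The call `add_tax_global(100)` looks identical both times, yet `before` and `after` differ; `add_tax()` cannot be affected in this way.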
### 👉 A function should receive all it needs through its arguments and return its results explicitly

- It should *not rely* on *global variables* or *hidden dependencies*
- When each function is a *self-contained unit*, it’s easier to reuse and combine across your analysis or pipeline
- Next:
  - How to handle errors gracefully
  - How to avoid unintended side effects (like accidental data exports)
  - How to manage credentials securely

---
background-image: url(https://rlang.r-lib.org/logo.png)
background-position: 90% 5%
background-size: 8%

## Handle Errors Gracefully

.pull-left[

Write code that anticipates problems and handles them in a way that doesn’t crash your script or leave the user guessing (defensive programming)

- **Validating inputs**: check that function arguments meet expected criteria (e.g., correct type, range, format) before proceeding.
- **Catching errors**: use `tryCatch()` to manage potential errors during execution.
- **Helpful feedback**: when something goes wrong, provide clear, actionable error messages that guide the user on how to fix the issue.

]

.pull-right[

#### Check required input: `rlang::check_required(name)`

``` r
# This function requires a name argument
say_hello <- function(name) {
  rlang::check_required(name)
  # Do a lot of steps before we greet
  # Let user wait ;)
  return(paste("Hello", name, "!"))
}

say_hello()
```

```
## Error in `say_hello()`:
## ! `name` is absent but must be supplied.
```

]

---
background-image: url(https://rlang.r-lib.org/logo.png)
background-position: 90% 5%
background-size: 8%

## Handle Errors Gracefully II

.pull-left[

The `cli` package provides functions to create consistent and informative messages, warnings, and errors in R.

- Inform the user:

``` r
cli::cli_alert_info("I am an info!")
```

```
## ℹ I am an info!
```

- Success message:

``` r
cli::cli_alert_success("I am gold.")
```

```
## ✔ I am gold.
```

]

.pull-right[

#### Abort if input is not a character: `rlang::is_character(name)`

``` r
# Validate input with rlang
say_hello <- function(name) {
  abort_message <- "`name` must be a string."
  if (!rlang::is_character(name)) {
    cli::cli_abort(abort_message)
  }
  return(paste("Hello", name, "!"))
}

say_hello(1)
```

```
## Error in `say_hello()`:
## ! `name` must be a string.
```

]

---
background-image: url(https://rlang.r-lib.org/logo.png)
background-position: 90% 5%
background-size: 8%

## Handle Errors Gracefully III

.pull-left[

- We can limit the input to a specific set of options, like a list of valid names or categories, with the `arg_match()` function
- Sometimes one tiny little letter is enough to make a mistake, but rlang has our back.

``` r
# Restrict input to specific options
say_hello <- function(name = c("Tom", "Jerry")) {
  rlang::arg_match(name)
  return(paste("Hello", name, "!"))
}
```

]

.pull-right[

``` r
say_hello(name = "Edgar")
```

```
## Error in `say_hello()`:
## ! `name` must be one of "Tom" or "Jerry", not "Edgar".
```

``` r
# Help users with typos
say_hello(name = "Pom")
```

```
## Error in `say_hello()`:
## ! `name` must be one of "Tom" or "Jerry", not "Pom".
## ℹ Did you mean "Tom"?
```

]

<br>

Sometimes, you need to handle errors that occur during execution, especially when dealing with external resources like files, databases, or APIs. This is where `tryCatch()` comes in.

---

## tryCatch()

.pull-left[

The function allows us to intercept and respond to errors, warnings, and other conditions without crashing the user's session.
The `tryCatch()` function has several components:

- `expr`: the code that is supposed to run
- `error`: what to do if an error happens
- `warning`: what to do if a warning happens
- `finally`: code that always runs, no matter what

]

.pull-right[

``` r
result <- tryCatch({
  warning("Something might be wrong")
  x <- 10 / "a"
  x
}, warning = function(w) {
  message("Warning: ", w$message)
  0
}, error = function(e) {
  message("Error: ", e$message)
  NA
})
```

```
## Warning: Something might be wrong
```

``` r
print(result)
```

```
## [1] 0
```

]

---

## tryCatch II

.pull-left[

Here’s a breakdown of how it works in a real world example:

- Warning handler: If a warning occurs during `readRDS()`, it is caught and reported with `cli::cli_warn()`. Note that the attempt to muffle it fails: `tryCatch()` has already left the failing call when the handler runs, so the `muffleWarning` restart no longer exists (muffling needs `withCallingHandlers()`). The output shows exactly this error.
- Error handler: If an error occurs, a custom error message is shown with `cli::cli_alert_danger()`, and `NULL` is returned as a fallback.
- Finally: The code that runs no matter what. It’s useful for logging, cleanup, or confirming that the operation finished.

]

.pull-right[

``` r
# Attempt to read a file
result <- tryCatch(
  {
    readRDS("not_a_real_file.rds")
  },
  warning = function(w) {
    cli::cli_warn("Warning: {conditionMessage(w)}")
    invokeRestart("muffleWarning") # fails: restart is not available in tryCatch()
  },
  error = function(e) {
    cli::cli_alert_danger("Error: {conditionMessage(e)}")
    NULL # return fallback value
  },
  finally = {
    cli::cli_alert_info("Finished.")
  }
)
```

```
## ✖ Error: no 'restart' 'muffleWarning' found
```

```
## ℹ Finished.
```

]

---

## try()

- Use `try()` if you don’t need detailed error handling logic but still want graceful failure:

.pull-left[

- It catches errors and continues running. The function runs the given expression, but if an error happens, it catches that error as an object instead of stopping the function.
- Using the `silent` option keeps the default error message from showing, so you can handle the problem silently.
- As a second step, the code checks if an error occurred with `inherits(result, "try-error")`. `inherits()` checks if the result is of class `try-error`, which indicates that an error was caught. In case of an error, we can handle it gracefully.

]

.pull-right[

``` r
# Please don't stop ... on error
result <- try(readRDS("file.rds"), silent = TRUE)

# Check if it worked
if (inherits(result, "try-error")) {
  cli::cli_abort(
    "The file could not be found."
  )
} else {
  cli::cli_alert_success(
    "File read successfully!"
  )
  # Continue processing ...
}
```

```
## Error:
## ! The file could not be found.
```

]

---

## Declare Side Effects

.pull-left[

- Don’t just insert code that exports data: This is not a good practice, actually it’s rude.
- The next example is really bad practice because such a code snippet will overwrite the file if it already exists:

``` r
# Using an export step somewhere
# in the code is not a good practice
fwrite(tmp.orig.from,
  "../data.csv",
  sep = ";",
  dec = ","
)
```

]

.pull-right[

- Create a function that takes into consideration whether to export the data or not!

``` r
# Make the export explicit and conditional
myfun <- function(export = FALSE) {
  data <- data.frame(greeting = "Hello World")
  if (export) {
    data.table::fwrite(data, "../data.csv", sep = ";", dec = ",")
  } else {
    return(data)
  }
}
```

]

---

## Declare Side Effects II

.pull-left[

- Ask for Permission:

``` r
usethis::ui_yeah("Shall I export the data?")
#> Shall I export the data?
#>
#> 1: Negative
#> 2: Absolutely not
#> 3: Definitely
#>
#> Selection: 3
#> [1] TRUE
```

]

.pull-right[

- By asking for permission, the user is in control of their environment:

``` r
# Ask for permission before exporting
myfun <- function(export = FALSE) {
  # Shall we export?
  yeah_export <- usethis::ui_yeah("Shall I export the data?")
  if (yeah_export) {
    export_data() # stands for your own export function
  }
}
```

]

---

## Declare Side Effects III

### 👉 Files should be exported to the working directory

If the file needs to be exported to a different location, pass the path as an argument in the function call. This way, the path is not hardcoded and can be adjusted.

``` r
# Respect the user, export only when asked, and allow to set the path
myfun <- function(data, export = FALSE, export_dir = "data") {
  if (export) {
    export_path <- fs::path(export_dir, "data.csv")
    data.table::fwrite(data, export_path)
    cli::cli_alert_success("Exported data to {export_path}")
  } else {
    return(data)
  }
}
```

### 👉 Export only when asked and inform the user!

---

## Manage Secrets Like a Grown-Up

- Hardcoding API keys, tokens, or environment-specific settings into the code might seem harmless, until the keys show up in the Git history, your teammates copy over your local paths, or an app fails because it’s pointing to your local environment
- On GitLab, a committed secret stays visible in the history to anyone with access, even if you delete it later

### For Machine-Level Secrets: .Renviron

.pull-left[

- The usethis package provides a handy function to open and edit your `.Renviron` file, a config file that’s automatically loaded every time R starts
- It’s a great place to store personal, local secrets, or paths that shouldn’t be shared

]

.pull-right[

``` r
# Open your .Renviron file
usethis::edit_r_environ()

# Add a key as a plain line in .Renviron (this is not R code):
# API_KEY=sk_abc123

# Restart R, then access it in your code
API_KEY <- Sys.getenv("API_KEY")
```

]

---

## Manage Secrets Like a Grown-Up II

The `.Renviron` approach works well for local development, but once it’s time to share the code, it’s time to level up your config game.

### For Project-Specific Secrets: dotenv

.pull-left[

- The `dotenv` package lets you keep environment variables in a project-specific `.env` file.
- This approach is perfect when settings vary by project, or when you’re working in a team where everyone has different credentials, file paths, or environments

An example .env file looks like this:

```bash
OPENAI_API_KEY=sk_project_specific_key
```

]

.pull-right[

- You can access these secrets after running the `load_dot_env()` function.
- The function reads the `.env` file and sets the variables as environment variables for your session:

``` r
# Load and access the secret
dotenv::load_dot_env()
key <- Sys.getenv("OPENAI_API_KEY")
```

]

---

## Hello config

It’s a hassle to juggle different setups (secrets, file paths, and other settings) depending on whether you’re developing locally, testing, staging, or deploying to production.

.pull-left[

- The `config` package lets you define multiple named environments (like default, dev, etc.) in a clean YAML file
- The code just loads the correct configuration automatically, based on the current environment. All settings will be pulled from the appropriate section of the YAML file: no need to manually double-check; no need to rewrite part of your code

]

.pull-right[
<figure>
<a href="https://rstudio.github.io/config/index.html" target="_blank"> <img src="https://rstudio.github.io/config/logo.svg" alt="https://rstudio.github.io/config/" width="40%" align="center"/> </a>
</figure>
]

<br>

<div class="info-box">
<i>💡</i> Just switch the environment variable like you switch a light switch, and your code will adapt automatically.
</div>

---

## Manage Secrets Like a Grown-Up III

### For Multi-Environment Settings: config

.pull-left[

An example `config.yml` file:

```YAML
default:
  db_address: "101.166.44.311"
  data_dir: "data/test"
  enable_cache: false

test:
  ...

production:
  db_address: "101.166.44.313"
  data_dir: "/results"
  enable_cache: true
```

]

.pull-right[

After you have added the config file, call the config package.
It uses the default section from the `config.yml` file:

``` r
config::get("db_address")
#> "101.166.44.311"
```

To switch environments, set the `R_CONFIG_ACTIVE` environment variable accordingly:

``` r
Sys.setenv(R_CONFIG_ACTIVE = "production")
config::get("db_address")
#> "101.166.44.313"
```

]

---

## One Last Word of Caution

### 👉 Don’t Commit These Files. Seriously.

.pull-left[

- Once a key is added to a Git repository, it’s written into the project’s history, and even if you delete the line later, the damage is done
- Make sure sensitive files are ignored, and exclude them before you start to add your keys or credentials
- Consider the appropriate ignore file and add the config.yml or .env file to it:
  - .gitignore: for the Git setup
  - .dockerignore: for Docker
  - .Rbuildignore: for R packages

]

.pull-right[
<figure>
<img src="images/kirmit_meme.jpg" alt="https://rstudio.github.io/config/" width="95%" align="center"/>
</figure>
]

---
background-image: url("images/gonzo.gif")
background-size: cover
class: bottom, center

# .white[Maintain Code and Develop with Style 😜]

---

## Maintain Code

- *Version control* is the foundational practice for long-term maintainability (branches, history, etc.)
- *Code reviews* of new features or bug fixes help catch issues early, share knowledge across the team, and ensure that new code adheres to our standards
- *Collaborative maintenance* (through comments, suggestions, or pair programming) helps ensure that the entire team feels ownership of the code and keeps it healthy for the long run
- Code maintenance is an *ongoing process*:
  - It’s not just about writing code that works today, but about ensuring it continues to work well as requirements evolve, dependencies change, and new team members join
  - Regularly revisiting and refactoring code, updating dependencies, and improving tests are all part of keeping your codebase in good shape

---

## Take Your Time, Time to Refactor

<div class="info-box">
<i>💭</i> “Refactoring is the process of changing a software system in such a way that it does not alter the external behavior of the code yet improves its internal structure.” (Martin Fowler, Refactoring: Improving the Design of Existing Code)
</div>

As your project grows, code that was once “good enough” can become fragile, repetitive, or hard to read, even for yourself once you have enough distance from the project. Refactoring gives you the opportunity to:

- Remove duplication (apply DRY more effectively)
- Improve modularity
- Enhance readability and clarity (clearer intent, better naming, consistent style)
- Make testing and debugging easier (well-structured code is easier to isolate and test)
- Reduce technical debt (costs of choosing a quick or easy solution)

---

## The Tidyverse Style Guide

#### 👉 We stick to the tidyverse style guide because it provides a widely accepted, well-documented, and actively maintained standard that aligns with modern R development practices.
.pull-left[

- A clear, consistent style makes it easier to spot bugs, understand logic at a glance, and collaborate across a team
- The tidyverse style guide provides a clear, opinionated set of conventions for writing clean, consistent R code, covering everything from naming variables to how you should format pipes and function arguments

]

.pull-right[

<div class="info-box">
<i>💡</i> By adopting this guide, we avoid the overhead of defining and maintaining our own custom rules; and we ensure that our code is immediately familiar to anyone experienced with the broader R community.
</div>

]

---

## The Tidyverse Style Guide II

.pull-left[

- *Use `snake_case` for all object and function names*: Use `get_user_data()`, not `GetUserData()`
- *Avoid abbreviations unless they are widely recognized*: Use `user_id`, not `u_id`
- *Use ALL_CAPS for constants and environment variables*: Use `API_KEY`, not `api_key`
- *Use descriptive, meaningful names*: Use `download_report()`, not `dl_rp()`
- *Function names should sound like actions (verbs)*: Use `fetch_ls_data()`, not `ls_data`
- *Variable names should sound like things (nouns)*: Use `user_list`, not `calculated_users`
- *Use consistent naming for related objects*: Use `create_data()` and `create_script()`
- *Name `.R` files to match their contents*: `read_data.R` contains `read_data()`

]

.pull-right[

[](https://allisonhorst.com/everything-else)

*Artwork: Allison Horst*

]

---
background-image: url(https://github.com/r-lib/lintr/blob/main/man/figures/logo.png?raw=true)
background-position: 90% 5%
background-size: 8%

## Hello Lintr

The lintr package automatically flags deviations from your style guide.

.pull-left[

<div class="info-box">
<i>🦜️</i>"lintr provides static code analysis for R. It checks for adherence to a given style, identifying syntax errors and possible semantic issues, then reports them to you so you can take action" (Hester et al. 2024).
</div>
]

.pull-right[

``` r
lintr::lint(text = 'myFunction <- function(x, y) {
  if (sum(x, y) == 10) {
    print("Sum is correct!")
  }
}')
```

```
## <text>:1:1: style: [object_name_linter] Variable and function name style should match snake_case or symbols.
## myFunction <- function(x, y) {
## ^~~~~~~~~~
```
]

R Package goodie 🦹:

``` r
# lintr::lint_dir(path = "R")
lintr::lint_package()
```

---
background-image: url(https://styler.r-lib.org/reference/figures/logo.png)
background-position: 90% 5%
background-size: 8%

## Hello Styler

The styler package reformats code to match a defined coding style with a single command.

.pull-left[
<div class="info-box">
<i>🦜️</i> "styler formats your code according to the tidyverse style guide (https://style.tidyverse.org) so you can direct your attention to the content of your code" (Müller and Walthert 2024).
</div>
]

.pull-right[

``` r
styler::style_text(
  "myFunction<-function( x,y){
if(sum( x , y )==10){
print( 'Sum is correct!' )
}
}"
)
```

```
## myFunction <- function(x, y) {
##   if (sum(x, y) == 10) {
##     print("Sum is correct!")
##   }
## }
```
]

R Package goodie 🦹‍♀️:

``` r
styler::style_pkg()
```

---
class: right, middle

## Thank you for your attention!
<img style="border-radius: 50%;" src="https://avatars.githubusercontent.com/u/77931249?s=400&u=eaf1d0871b643dd32cc0ff9f777edef28e6569a8&v=4" width="150px"/> [<svg viewBox="0 0 512 512" style="height:1em;position:relative;display:inline-block;top:.1em;" xmlns="http://www.w3.org/2000/svg"> <path d="M505.12019,19.09375c-1.18945-5.53125-6.65819-11-12.207-12.1875C460.716,0,435.507,0,410.40747,0,307.17523,0,245.26909,55.20312,199.05238,128H94.83772c-16.34763.01562-35.55658,11.875-42.88664,26.48438L2.51562,253.29688A28.4,28.4,0,0,0,0,264a24.00867,24.00867,0,0,0,24.00582,24H127.81618l-22.47457,22.46875c-11.36521,11.36133-12.99607,32.25781,0,45.25L156.24582,406.625c11.15623,11.1875,32.15619,13.15625,45.27726,0l22.47457-22.46875V488a24.00867,24.00867,0,0,0,24.00581,24,28.55934,28.55934,0,0,0,10.707-2.51562l98.72834-49.39063c14.62888-7.29687,26.50776-26.5,26.50776-42.85937V312.79688c72.59753-46.3125,128.03493-108.40626,128.03493-211.09376C512.07526,76.5,512.07526,51.29688,505.12019,19.09375ZM384.04033,168A40,40,0,1,1,424.05,128,40.02322,40.02322,0,0,1,384.04033,168Z"></path></svg> www.edgar-treischl.de](https://www.edgar-treischl.de) [<svg viewBox="0 0 496 512" style="height:1em;position:relative;display:inline-block;top:.1em;" xmlns="http://www.w3.org/2000/svg"> <path d="M165.9 397.4c0 2-2.3 3.6-5.2 3.6-3.3.3-5.6-1.3-5.6-3.6 0-2 2.3-3.6 5.2-3.6 3-.3 5.6 1.3 5.6 3.6zm-31.1-4.5c-.7 2 1.3 4.3 4.3 4.9 2.6 1 5.6 0 6.2-2s-1.3-4.3-4.3-5.2c-2.6-.7-5.5.3-6.2 2.3zm44.2-1.7c-2.9.7-4.9 2.6-4.6 4.9.3 2 2.9 3.3 5.9 2.6 2.9-.7 4.9-2.6 4.6-4.6-.3-1.9-3-3.2-5.9-2.9zM244.8 8C106.1 8 0 113.3 0 252c0 110.9 69.8 205.8 169.5 239.2 12.8 2.3 17.3-5.6 17.3-12.1 0-6.2-.3-40.4-.3-61.4 0 0-70 15-84.7-29.8 0 0-11.4-29.1-27.8-36.6 0 0-22.9-15.7 1.6-15.4 0 0 24.9 2 38.6 25.8 21.9 38.6 58.6 27.5 72.9 20.9 2.3-16 8.8-27.1 16-33.7-55.9-6.2-112.3-14.3-112.3-110.5 0-27.5 7.6-41.3 23.6-58.9-2.6-6.5-11.1-33.3 2.6-67.9 20.9-6.5 69 27 69 27 20-5.6 41.5-8.5 62.8-8.5s42.8 2.9 62.8 8.5c0 0 48.1-33.6 69-27 13.7 34.7 
5.2 61.4 2.6 67.9 16 17.7 25.8 31.5 25.8 58.9 0 96.5-58.9 104.2-114.8 110.5 9.2 7.9 17 22.9 17 46.4 0 33.7-.3 75.4-.3 83.6 0 6.5 4.6 14.4 17.3 12.1C428.2 457.8 496 362.9 496 252 496 113.3 383.5 8 244.8 8zM97.2 352.9c-1.3 1-1 3.3.7 5.2 1.6 1.6 3.9 2.3 5.2 1 1.3-1 1-3.3-.7-5.2-1.6-1.6-3.9-2.3-5.2-1zm-10.8-8.1c-.7 1.3.3 2.9 2.3 3.9 1.6 1 3.6.7 4.3-.7.7-1.3-.3-2.9-2.3-3.9-2-.6-3.6-.3-4.3.7zm32.4 35.6c-1.6 1.3-1 4.3 1.3 6.2 2.3 2.3 5.2 2.6 6.5 1 1.3-1.3.7-4.3-1.3-6.2-2.2-2.3-5.2-2.6-6.5-1zm-11.4-14.7c-1.6 1-1.6 3.6 0 5.9 1.6 2.3 4.3 3.3 5.6 2.3 1.6-1.3 1.6-3.9 0-6.2-1.4-2.3-4-3.3-5.6-2z"></path></svg> edgar-treischl](https://github.com/edgar-treischl) [<svg viewBox="0 0 512 512" style="height:1em;position:relative;display:inline-block;top:.1em;" xmlns="http://www.w3.org/2000/svg"> <path d="M464 64H48C21.49 64 0 85.49 0 112v288c0 26.51 21.49 48 48 48h416c26.51 0 48-21.49 48-48V112c0-26.51-21.49-48-48-48zm0 48v40.805c-22.422 18.259-58.168 46.651-134.587 106.49-16.841 13.247-50.201 45.072-73.413 44.701-23.208.375-56.579-31.459-73.413-44.701C106.18 199.465 70.425 171.067 48 152.805V112h416zM48 400V214.398c22.914 18.251 55.409 43.862 104.938 82.646 21.857 17.205 60.134 55.186 103.062 54.955 42.717.231 80.509-37.199 103.053-54.947 49.528-38.783 82.032-64.401 104.947-82.653V400H48z"></path></svg> edgar.treischl@isb.bayern.de](mailto:edgar.treischl@isb.bayern.de) [<svg viewBox="0 0 384 512" style="height:1em;position:relative;display:inline-block;top:.1em;" xmlns="http://www.w3.org/2000/svg"> <path d="M369.9 97.9L286 14C277 5 264.8-.1 252.1-.1H48C21.5 0 0 21.5 0 48v416c0 26.5 21.5 48 48 48h288c26.5 0 48-21.5 48-48V131.9c0-12.7-5.1-25-14.1-34zM332.1 128H256V51.9l76.1 76.1zM48 464V48h160v104c0 13.3 10.7 24 24 24h104v288H48zm250.2-143.7c-12.2-12-47-8.7-64.4-6.5-17.2-10.5-28.7-25-36.8-46.3 3.9-16.1 10.1-40.6 5.4-56-4.2-26.2-37.8-23.6-42.6-5.9-4.4 16.1-.4 38.5 7 67.1-10 23.9-24.9 56-35.4 74.4-20 10.3-47 26.2-51 46.2-3.3 15.8 26 55.2 76.1-31.2 22.4-7.4 46.8-16.5 68.4-20.1 18.9 10.2 
41 17 55.8 17 25.5 0 28-28.2 17.5-38.7zm-198.1 77.8c5.1-13.7 24.5-29.5 30.4-35-19 30.3-30.4 35.7-30.4 35zm81.6-190.6c7.4 0 6.7 32.1 1.8 40.8-4.4-13.9-4.3-40.8-1.8-40.8zm-24.4 136.6c9.7-16.9 18-37 24.7-54.7 8.3 15.1 18.9 27.2 30.1 35.5-20.8 4.3-38.9 13.1-54.8 19.2zm131.6-5s-5 6-37.3-7.8c35.1-2.6 40.9 5.4 37.3 7.8z"></path></svg> Slides](https://github.com/edgar-treischl/slidesCode/blob/main/slides_code.pdf) --- class: right, middle ## Licence This presentation is licensed under a CC-BY-NC 4.0 license. You may copy, distribute, and use the slides in your own work, as long as you give attribution to the original author on each slide that you use. Commercial use of the contents of these slides is not allowed. <br/> <br/> <img src="images/by-nceu.png" alt="Image left" width="150px"/>