class: center, middle, inverse, title-slide

.title[
# Project-oriented workflow
]
.author[
### Kauê de Sousa
]
.date[
### 2023
]

---

# Content

* Script vs Project
* RStudio project
* Directory structure
* Our workflow

---
class: middle, inverse

# Script vs Project

---

# Script vs Project

## Script

* An R script is simply a text file containing a set of commands and comments. The script can be saved and used later to re-execute the saved commands

* Data analysis in a script-based approach means that you have to set up your working directory every time you need to work on the data

* Sometimes the input files are spread across different directories, which makes it difficult for others to reproduce the analysis. **Even for you, if you change computers!**

.footnote[ [1] [Read more here](https://www.tidyverse.org/blog/2017/12/workflow-vs-script/)]

---

# Script vs Project

## Project

* A folder on your computer that holds all the files relevant to that particular piece of work

* Any resident R script is written assuming that it will be run from a fresh R process with the working directory set to the project directory

* This convention guarantees that the project can be moved around on your computer or onto other computers and will still “just work”

.footnote[ [1] [Read more here](https://r4ds.had.co.nz/workflow-projects.html)]

---

.pull-left[
<img src="img/folder-structure.png" width="80%"/>
]

(**.Rproj**) a file generated by RStudio for the R project in this directory. This is the file that should be opened to work on the project (e.g. clean data, write scripts, write outputs)

(**data**) contains all the input files that are ready to be analysed. In some cases we use a sub-directory called "raw", which may contain sensitive data and/or the raw data

(**script**) contains all the scripts used for the analysis. They are numbered in order, from 01 to `n`. This makes it easier for others to know where to start. Old scripts (no longer relevant to the analysis) are kept in a sub-directory "old-scripts". You may need them in the future

(**output**) is where all the outputs from the analysis are written

(**docs**) an optional sub-directory to store the notebooks or report(s)

.footnote[ [1] [See this folder here](https://github.com/AgrDataSci/template-repo-data-analysis)]

---
class: middle, inverse

# RStudio project

---

# RStudio project

A context for working on a specific data analysis project

* automatically sets the working directory to the project folder

* has a separate workspace and command history

* is easy to share and helps ensure reproducibility

<center>
<img src="img/rstudio-project.png" width="35%"/>
</center>

.footnote[ [1] [Read more here](https://r4ds.had.co.nz/workflow-projects.html#rstudio-projects) ]

---

# **Thank you!**

.pull-right[
<img src="https://img.icons8.com/ios-filled/50/000000/email-open.png" width = "10%">[k.desousa@cgiar.org](mailto:k.desousa@cgiar.org)

<br><br><br><br><br><br><br><br><br><br>

[Back to the main page](index.html)
]
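---

# Appendix: creating the folders from R

A minimal sketch of how the folder structure described in this deck could be created from R. The project name `my-analysis`, the script names and the use of `tempdir()` are assumptions for illustration only. In practice the project folder is usually created by RStudio together with the `.Rproj` file.

```r
# Assumption: 'project' stands in for your project folder; tempdir() is used
# only so that this sketch runs anywhere without touching your own files.
project <- file.path(tempdir(), "my-analysis")

# Sub-directories described in the directory structure slide
folders <- c("data/raw", "script/old-scripts", "output", "docs")

for (f in folders) {
  # recursive = TRUE also creates the parent folder (e.g. "data" for "data/raw")
  dir.create(file.path(project, f), recursive = TRUE, showWarnings = FALSE)
}

# Scripts are numbered so that others know where to start (hypothetical names)
scripts <- c("01-clean-data.R", "02-fit-model.R")
file.create(file.path(project, "script", scripts))

# List what was created, relative to the project folder
created <- list.files(project, recursive = TRUE, include.dirs = TRUE)
print(created)
```

Once the project is opened through its `.Rproj` file, the working directory is the project folder, so a script can read inputs with a relative path such as `read.csv("data/my-file.csv")` (a hypothetical file name) without calling `setwd()`.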