Creating a dynamic R Markdown website for COVID-19 data

Overview

This post refers to code available online as a GitHub repository.

Create a new RStudio project

Open the RStudio “New Project…” menu -> New Directory -> Simple R Markdown Website -> Type a folder name.

Initialize git

You may need to install.packages("git2r") first.

git2r::init()

Then restart RStudio to apply the changes and benefit from the “Git” tab.

Make a new GitHub repository

Click on “New repository”. For simplicity, give it the same name as your local project folder, here “COVID-19-website”.

Do not add a README or a license yet; an empty repository will do.

Add the GitHub remote to your local repository. The remote name should be "origin". The URL is shown on your GitHub repository page.

git2r::remote_add(name = "origin", url = "git@github.com:kevinrue/COVID-19-website.git")
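
The same step can also be done from the terminal. The sketch below runs in a scratch repository (a hypothetical temporary folder) so that it is safe to try anywhere. Adding a remote only records the URL locally and does not contact GitHub.

```shell
# Scratch repository so the example does not touch a real project.
demo_dir="$(mktemp -d)"
git -c init.defaultBranch=master init -q "$demo_dir"

# Terminal equivalent of git2r::remote_add(); nothing is sent over the network.
git -C "$demo_dir" remote add origin git@github.com:kevinrue/COVID-19-website.git

# Confirm which URL "origin" points to.
remote_url="$(git -C "$demo_dir" config --get remote.origin.url)"
echo "$remote_url"
```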

Push your local clone to the GitHub remote

While git2r::push() does exist, it can raise issues related to the detection of your git credentials. For simplicity, I prefer to push and pull directly from the terminal. Before pushing, commit your files (for example from the “Git” tab), as there is nothing to push until a first commit exists. To run the command below, the current working directory must be the root of the RStudio project.

git push -u origin master
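
If the push fails, the usual suspects are a terminal that is not at the project root, a missing first commit, or a branch that is not called master (some git configurations name the first branch main). The sketch below walks through those checks in a throwaway project. The local bare repository and the Demo identity are assumptions that stand in for GitHub and your own git settings.

```shell
work="$(mktemp -d)"

# A local bare repository stands in for the GitHub remote.
git -c init.defaultBranch=master init -q --bare "$work/remote.git"
git -c init.defaultBranch=master init -q "$work/project"
cd "$work/project"

# A first commit is needed before anything can be pushed.
echo "# COVID-19-website" > index.Rmd
git add index.Rmd
git -c user.name="Demo" -c user.email="demo@example.com" commit -q -m "Initial commit"

git remote add origin "$work/remote.git"

# The project root is the directory that contains .git/.
project_root="$(git rev-parse --show-toplevel)"

# Push whichever branch is checked out and record it as the upstream (-u).
current_branch="$(git rev-parse --abbrev-ref HEAD)"
git push -q -u origin "$current_branch"

# List the branches that now exist on the stand-in remote.
pushed_branch="$(git -C "$work/remote.git" for-each-ref --format='%(refname:short)' refs/heads)"
echo "Pushed $pushed_branch from $project_root"
```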

Clone the source data repository as a git submodule

git submodule add git@github.com:CSSEGISandData/COVID-19.git
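
A submodule is pinned to one specific commit of the data repository, so it does not follow upstream updates by itself. In a real project, git submodule update --init --recursive fetches the submodule after a fresh clone, and git submodule update --remote moves it to the latest upstream commit. The sketch below demonstrates the second command with two throwaway local repositories. The folder names, the CSV content and the Demo identity are made up for the example, and the protocol.file.allow setting is only needed because the demo clones from a local path.

```shell
work="$(mktemp -d)"
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

# Hypothetical stand-in for the upstream data repository.
git -c init.defaultBranch=master init -q "$work/data"
echo "date,cases" > "$work/data/cases.csv"
git -C "$work/data" add cases.csv
git -C "$work/data" commit -q -m "First data release"

# Website project that tracks the data as a submodule.
git -c init.defaultBranch=master init -q "$work/site"
cd "$work/site"
git -c protocol.file.allow=always submodule --quiet add "$work/data" COVID-19
git commit -q -m "Add data submodule"

# Upstream publishes new data...
echo "2020-03-01,1" >> "$work/data/cases.csv"
git -C "$work/data" commit -q -am "Daily update"

# ...and the website project moves its submodule to the latest upstream commit.
git -c protocol.file.allow=always submodule --quiet update --remote COVID-19
git commit -q -am "Update data submodule"

submodule_head="$(git -C COVID-19 rev-parse HEAD)"
upstream_head="$(git -C "$work/data" rev-parse HEAD)"
```

After the update, the commit recorded for the submodule matches the tip of the data repository, and rebuilding the website picks up the new data.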

Write a notebook for each webpage

Fast-forwarding in time, I updated the template index.Rmd and about.Rmd notebooks, added some notebooks, and updated the _site.yml to include those new pages in the built website.

Note that _site.yml is also useful to exclude certain subdirectories from the built website. In particular, those include the source data folders cloned as a submodule above.
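
For illustration, a _site.yml along those lines could look like the sketch below, written here from the terminal into a scratch directory. The navigation entries and the excluded folder name are assumptions based on this post, not a copy of the actual file. The name, output_dir, exclude and navbar fields are standard fields of the rmarkdown site configuration.

```shell
# Write a hypothetical _site.yml into a scratch directory.
site_dir="$(mktemp -d)"
cat > "$site_dir/_site.yml" <<'EOF'
name: "COVID-19-website"
output_dir: "_site"
# Keep the data submodule out of the built website.
exclude: ["COVID-19"]
navbar:
  title: "COVID-19"
  left:
    - text: "Home"
      href: index.html
    - text: "About"
      href: about.html
EOF

cat "$site_dir/_site.yml"
```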

I also refactored some recurring code into child notebooks (see the childs/ subdirectory).

Build the website

When you’re ready, click the “Build Website” button in the “Build” panel. This is equivalent to running rmarkdown::render_site(encoding = 'UTF-8') at the R console. This generates a _site/ subdirectory in your project, which contains all the files for your website.

To visualize your website, open the “index.html” file in this subdirectory.
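
The build can also be launched from the terminal, which is convenient when chaining it with the deployment step. This is a minimal sketch, assuming that Rscript is on the PATH and that the rmarkdown package is installed. The helper names build_site and site_is_built are hypothetical.

```shell
build_site() {
  # Run from the project root; requires R and the rmarkdown package.
  Rscript -e 'rmarkdown::render_site(encoding = "UTF-8")'
}

site_is_built() {
  # Succeeds when the rendered home page exists in the given output directory.
  [ -f "${1:-_site}/index.html" ]
}
```

For example, build_site && site_is_built && echo "Website ready" only reports success when the home page was actually rendered.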

Deploy the website

After checking that your website looks as intended, you can deploy it on GitHub Pages. This can be done by pushing the files in the _site/ subdirectory to a gh-pages branch on your GitHub repository.

Initially, you need to create an empty gh-pages orphan branch, as it does not exist yet. For this, I generally follow the instructions in the bookdown book. Briefly:

# create a branch named gh-pages and clean up everything
git checkout --orphan gh-pages
git rm -rf .

# create a hidden file .nojekyll
touch .nojekyll
git add .nojekyll

git commit -m "Initial commit"
git push origin gh-pages
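
Note that these commands leave the working directory on the gh-pages branch, so you need to run git checkout master to get back to your notebooks. The sketch below replays the same steps, minus the push, in a scratch repository (the folder, the file content and the Demo identity are made up). It shows that the orphan branch shares no history with master and that the source files reappear once master is checked out again.

```shell
repo="$(mktemp -d)"
git -c init.defaultBranch=master init -q "$repo"
cd "$repo"
git config user.name "Demo"
git config user.email "demo@example.com"

# A source branch with one notebook.
echo "home page" > index.Rmd
git add index.Rmd
git commit -q -m "Add index.Rmd"

# Same steps as above, without the push (the scratch repository has no remote).
git checkout -q --orphan gh-pages
git rm -rf -q .
touch .nojekyll
git add .nojekyll
git commit -q -m "Initial commit"

# gh-pages holds a single commit and has no ancestor in common with master.
pages_commits="$(git rev-list --count gh-pages)"
shared_ancestor="$(git merge-base master gh-pages || true)"

# Return to the source branch; tracked files such as index.Rmd reappear.
git checkout -q master
```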

Then, every time you build your website, you need to push the new website to that branch. To automate this process, I wrote a _deploy.sh script, which I run as bash _deploy.sh. The script is adapted from … and looks like this.

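# clone the repository with the gh-pages branch checked out
git clone -b gh-pages \
  git@github.com:kevinrue/COVID-19-website.git \
  site-output
cd site-output
# drop the previous deployment commit from the local history
git reset --hard HEAD^
# remove the remaining tracked files, then copy in the freshly built website
git rm -rf *
cp -r ../_site/* ./
git add --all *
git commit -m "Update the site"
# overwrite the remote branch, since the local history was rewritten
git push -f origin gh-pages
# clean up the temporary clone
cd ..
rm -rf site-output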

Note that this script includes two lines that I do not recommend in general. Namely:

git reset --hard HEAD^
git push -f origin gh-pages

The first line deletes the previous version of the website from the git history, as if it never happened. The second line overrides the GitHub history to erase that commit from there as well. I only do this here because my website generates many PNG files in each version of the website. If I didn’t delete the previous version each time, the git history would keep track of every version of every PNG file that I ever generated, making the git clone commands increasingly time-consuming each time I clone the gh-pages branch to push a new version of the website.
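
To see the effect of those two lines, the sketch below replays the deployment three times against a local bare repository that stands in for GitHub. The deploy function, the fake index.html content and the Demo identity are assumptions made for this demonstration; it is a simplified version of the script above, not a replacement for it.

```shell
work="$(mktemp -d)"
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

# Local bare repository standing in for GitHub, seeded with an initial gh-pages commit.
git -c init.defaultBranch=master init -q --bare "$work/remote.git"
git -c init.defaultBranch=master init -q "$work/seed"
(
  cd "$work/seed"
  git checkout -q --orphan gh-pages
  touch .nojekyll
  git add .nojekyll
  git commit -q -m "Initial commit"
  git push -q "$work/remote.git" gh-pages
)

deploy() {
  # $1: text standing in for a freshly built website
  (
    git clone -q -b gh-pages "$work/remote.git" "$work/site-output"
    cd "$work/site-output"
    # Drop the previous deployment; there is nothing to drop on the first run.
    git reset -q --hard HEAD^ 2>/dev/null || true
    echo "$1" > index.html
    git add --all
    git commit -q -m "Update the site"
    git push -q -f origin gh-pages
  )
  rm -rf "$work/site-output"
}

deploy "version 1"
deploy "version 2"
deploy "version 3"

# Only the initial commit and the latest deployment remain on the branch.
commit_count="$(git -C "$work/remote.git" rev-list --count gh-pages)"
echo "$commit_count"   # prints 2
```

Without the reset and the forced push, the same three deployments would leave four commits on the branch, and the count would keep growing with every build.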

Thanks for reading!