Data Carpentry’s teaching is hands-on, so participants need to bring their own laptops to ensure the proper setup of tools for an efficient workflow. In exchange, you will leave with a working version of R and the skills to use it.
These lessons assume no prior knowledge of the skills or tools, but working through them requires working copies of the software described below. To use these materials most effectively, please install everything before the first lesson.
Please note that all of the requirements specified below are free. If you are about to hand over any money for this course, get in contact, because you shouldn’t be.
We’ll provide a power bar, but it will be easier if you have a fully charged laptop. There are two lessons in each evening session, each lasting around 45 minutes, so 2 hours of ‘juice’ should be enough.
We’re going to get you up and running in R, which is a highly respected, free statistics package used by academics around the world. We will also install a friendly interface to R called RStudio.
Download and install R from here
Download and install RStudio. This is a nice shiny interface for R, and the easiest way to use it. Download it here. There should be an ‘installer’ for your operating system.
Operating-system-specific instructions and links are detailed below.
On Windows, install R by downloading and running this .exe file from CRAN. Also, please install the RStudio IDE.
On macOS, install R by downloading and running this .pkg file from CRAN. Also, please install the RStudio IDE.
On Linux, you can download the binary files for your distribution from CRAN, or you can use your package manager: for Debian/Ubuntu run sudo apt-get install r-base, and for Fedora run sudo yum install R. Also, please install the RStudio IDE.
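Once the install has finished, you may want to confirm that R is reachable from a terminal. The snippet below is a minimal sketch for macOS or Linux, not an official check: have_cmd is a hypothetical helper name introduced here for illustration, and it assumes the installer has put R on your PATH.

```shell
#!/bin/sh
# have_cmd is a hypothetical helper (not part of R or RStudio):
# it succeeds if the named command can be found on the PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# If R is available, print the first line of its version banner;
# otherwise suggest repeating the install step.
if have_cmd R; then
    R --version | head -n 1
else
    echo "R was not found on your PATH; try the install step again."
fi
```

If the version line prints, R is installed and you can move on to opening RStudio.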
We’ll be using Google Sheets for two reasons. First, it allows online collaborative editing. Second, starting your data ‘pipeline’ in the cloud will allow you to produce ‘real-time’ analytics.
Sign up for an account here.
Only needed if you don’t already have Excel or a similar program installed.
Spreadsheets are useful for data entry and data organization, for some subsetting and sorting of the data, and for getting an overview of the data. To interact with spreadsheets, we can use LibreOffice, Microsoft Excel, Gnumeric, OpenOffice.org, or other programs. Commands may differ a bit between programs, but the general ideas for thinking about spreadsheets are the same.
For this lesson, if you don't have a spreadsheet program already, you can use LibreOffice. It's a free, open source spreadsheet program.
Slack is a geeky version of WhatsApp, but much better for coding and collaborating. We will use it to share files. You can use it to chat, to ask questions, and later to become part of the Datascience Breakfast Club.
See you there!