canadian-weather

Canadian Historical Weather Data

Prepared by Nathan K. Chan

Project website: https://nathankchan.github.io/canadian-weather/

GitHub repository: https://github.com/nathankchan/canadian-weather

Online application: https://nathankchan.shinyapps.io/canadian-weather/

Overview

This project is an interactive Shiny web application for exploring historical hourly weather data across all 13 Canadian provinces and territories. Data is sourced from Environment and Climate Change Canada (ECCC) and covers 960 active weather stations.

Browse the data through five views: Map, Surface Plot, Line Chart, Heat Map, and Table.

Figure: Demo of the Canadian Historical Weather Data app

Requirements

R 4.5.3 is required. If it is not installed, visit r-project.org.

Dependencies are managed with renv. Restore the packages recorded in the lockfile before running:

renv::restore()

Running Locally

1. Build the data pipeline

pipeline.R is the data pipeline script. It:

  1. Updates station metadata from the ECCC inventory
  2. Downloads raw hourly CSVs per station
  3. Removes stale and corrupted files
  4. Converts all CSVs to Parquet format

cd {project_dir}
Rscript pipeline.R

The initial download covers 960 stations and may take several hours. Subsequent runs only fetch new or updated files and complete in minutes.
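The incremental behaviour described above could be sketched roughly as follows. This is an illustration only, not the logic in pipeline.R: the helper `files_to_fetch()`, the one-file-per-month naming scheme, and the rule that a missing or zero-byte file needs downloading are all assumptions. The sketch does not try to detect files that were updated upstream.

```r
# Hypothetical helper: decide which expected files still need downloading.
# A file is fetched if it is missing locally or empty (treated here as
# corrupted). File names are made up for illustration.
files_to_fetch <- function(station_dir, expected_files) {
  paths <- file.path(station_dir, expected_files)
  sizes <- file.size(paths)                 # NA for files that do not exist
  needs_fetch <- is.na(sizes) | sizes == 0
  expected_files[needs_fetch]
}

# Usage: simulate a station folder with one good, one empty, one missing file.
station_dir <- file.path(tempdir(), "rawdata", "12345")
dir.create(station_dir, recursive = TRUE, showWarnings = FALSE)
writeLines(c("datetime,temp_c", "2020-01-01 00:00,-5.2"),
           file.path(station_dir, "2020-01.csv"))
file.create(file.path(station_dir, "2020-02.csv"))    # zero-byte file

expected <- c("2020-01.csv", "2020-02.csv", "2020-03.csv")
to_fetch <- files_to_fetch(station_dir, expected)
print(to_fetch)
```

Skipping files that are already present and non-empty is one plausible reason later runs would finish far faster than the first.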

2. Launch the app

cd {project_dir}
Rscript app.R

Then open the printed URL (e.g. http://127.0.0.1:{port}) in a browser. Press Ctrl+C to stop the app.

Data Flow

ECCC API → pipeline.R → rawdata/{StationID}/*.csv
    → pipeline.R (combine CSVs, drop missing rows, type-convert, write Parquet)
    → data/{StationID}.parquet
    → app.R (lazy Arrow dataset, date-range filtered)
    → Shiny UI (Map / Surface Plot / Line Chart / Heat Map / Table)
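As a rough sketch of the CSV-to-Parquet stage in the flow above, the snippet below combines a station's CSVs, drops rows without a reading, converts types, and writes a single Parquet file. The function `csvs_to_parquet()`, the column names `datetime` and `temp_c`, and the choice to define a "missing row" as one with no temperature are assumptions for illustration and may differ from what pipeline.R actually does.

```r
library(arrow)

# Hypothetical sketch of the conversion step for one station.
# Column names (datetime, temp_c) are assumed, not the real ECCC schema.
csvs_to_parquet <- function(csv_paths, parquet_path) {
  # Read everything as character so type conversion is explicit below.
  frames <- lapply(csv_paths, read.csv, colClasses = "character")
  combined <- do.call(rbind, frames)

  # Drop rows with no temperature reading (assumed meaning of "missing").
  combined <- combined[!is.na(combined$temp_c) & combined$temp_c != "", ]
  rownames(combined) <- NULL

  # Type-convert the columns.
  combined$datetime <- as.POSIXct(combined$datetime, tz = "UTC",
                                  format = "%Y-%m-%d %H:%M")
  combined$temp_c <- as.numeric(combined$temp_c)

  write_parquet(combined, parquet_path)
  invisible(parquet_path)
}

# Usage with two small made-up CSV files.
csv_dir <- file.path(tempdir(), "csv_demo")
dir.create(csv_dir, showWarnings = FALSE)
writeLines(c("datetime,temp_c", "2020-01-01 00:00,-5.2", "2020-01-01 01:00,"),
           file.path(csv_dir, "a.csv"))
writeLines(c("datetime,temp_c", "2020-01-01 02:00,-4.8"),
           file.path(csv_dir, "b.csv"))

out_path <- file.path(tempdir(), "station_12345.parquet")
csvs_to_parquet(file.path(csv_dir, c("a.csv", "b.csv")), out_path)
result <- read_parquet(out_path)
print(result)
```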

The app opens each station’s Parquet file lazily via arrow::open_dataset() and collects data only after applying the selected date-range filter, so the full dataset is never loaded into memory.
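A minimal sketch of this lazy pattern is shown below. It builds a small Parquet file first so that it can run on its own. The helper `read_date_range()`, the file name, and the column names `datetime` and `temp_c` are hypothetical, and the app's real query code may be organised differently.

```r
library(arrow)
library(dplyr)

# Build a small example file: 72 hourly rows starting 2020-01-01 UTC.
# Column names are assumptions for illustration.
parquet_path <- file.path(tempdir(), "station_demo.parquet")
hours <- seq(as.POSIXct("2020-01-01 00:00:00", tz = "UTC"),
             by = "hour", length.out = 72)
write_parquet(data.frame(datetime = hours, temp_c = seq_along(hours) * 0.1),
              parquet_path)

# Hypothetical helper: open lazily, filter, then collect only matching rows.
read_date_range <- function(path, range_start, range_end) {
  open_dataset(path) |>                       # lazy: no rows are read yet
    filter(datetime >= range_start,
           datetime < range_end) |>           # filter is still lazy
    collect()                                 # rows are materialised here
}

day_two <- read_date_range(
  parquet_path,
  as.POSIXct("2020-01-02 00:00:00", tz = "UTC"),
  as.POSIXct("2020-01-03 00:00:00", tz = "UTC")
)
print(nrow(day_two))
```

Because the filter is applied before `collect()`, only the rows inside the requested range are turned into an in-memory data frame.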

App Features