---
title: "Interactive R Worksheet: F1 Race Data"
subtitle: "Explore Formula 1 data live in your browser"
author: "Dr Gordon Wright"
date: "2026-03-24"
format:
  html:
    theme: [cosmo, ../custom.scss]
    toc: true
    toc-depth: 2
    include-in-header: ../include/fonts.html
    smooth-scroll: true
filters:
  - webr
execute:
  echo: true
---
::: {.callout-note}
## About this worksheet
All R code runs directly in your browser via WebR. No installation needed. Edit any code cell and press the Run button to see the results.
:::
## Part 1: Loading and exploring F1 data
WebR ships with R's built-in datasets, but none of them covers Formula 1. Instead, let us simulate some lap time data for six drivers and explore it.
```{webr-r}
# Simulate F1 lap time data for 6 drivers across 20 laps
set.seed(42)
drivers <- c("Verstappen", "Hamilton", "Norris",
             "Leclerc", "Piastri", "Russell")
base_pace <- c(91.2, 91.8, 92.0, 92.1, 92.3, 92.5)
f1 <- data.frame(
  driver = rep(drivers, each = 20),
  lap = rep(1:20, times = 6),
  time = unlist(lapply(seq_along(drivers), function(i) {
    round(base_pace[i] + rnorm(20, 0, 0.8) +
            seq(0, 1.5, length.out = 20) * runif(1, 0.5, 1.2), 3)
  }))
)
head(f1, 12)
```
```{webr-r}
# How many observations? How many drivers?
cat("Rows:", nrow(f1), "\n")
cat("Drivers:", length(unique(f1$driver)), "\n")
cat("Laps per driver:", nrow(f1) / length(unique(f1$driver)), "\n")
```
## Part 2: Summary statistics
```{webr-r}
# Mean and SD lap time per driver
driver_stats <- aggregate(time ~ driver, data = f1, FUN = function(x) {
  c(mean = round(mean(x), 3), sd = round(sd(x), 3))
})
# aggregate() stores the two statistics as a matrix column;
# unpack it into ordinary columns
result <- data.frame(
  driver = driver_stats$driver,
  mean_time = driver_stats$time[, "mean"],
  sd_time = driver_stats$time[, "sd"]
)
# Sort from fastest to slowest mean lap time
result[order(result$mean_time), ]
```
::: {.callout-tip}
## What to notice
Verstappen was given the quickest base pace in the simulation, so he should come out with the fastest mean lap time. Now look at the standard deviations. Consistency matters as much as raw pace: a fast but erratic driver gives time back on the slow laps.
:::
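To make the pace versus consistency point concrete, here is a minimal sketch with two made-up drivers. The lap times and the names `erratic` and `steady` are invented for illustration and are not part of the simulated `f1` data.

```{webr-r}
# Hypothetical lap times (seconds) for two made-up drivers
erratic <- c(90.0, 93.0, 90.5, 93.5, 91.0)
steady  <- c(91.8, 91.9, 92.0, 91.9, 91.8)

# Compare average pace, spread and slowest lap side by side
demo_summary <- data.frame(
  driver    = c("erratic", "steady"),
  mean_time = c(mean(erratic), mean(steady)),
  sd_time   = c(sd(erratic), sd(steady)),
  slowest   = c(max(erratic), max(steady))
)
demo_summary
```

The erratic driver is quicker on average, but the much larger standard deviation and the slowest lap show how uneven that pace is. Try editing the lap times to see how the summary changes.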
## Part 3: Visualising lap times
```{webr-r}
# Boxplot of lap times by driver
# Team colours, listed in the same order as `drivers`
driver_cols <- setNames(c("#9B1B30", "#00A19C", "#FF8000",
                          "#E80020", "#FF8000", "#27F4D2"), drivers)
# boxplot() sorts the drivers alphabetically, so reorder the colours to match
boxplot(time ~ driver, data = f1,
        col = unname(driver_cols[sort(drivers)]),
        las = 2,
        ylab = "Lap time (seconds)",
        main = "F1 Lap Time Distribution by Driver",
        border = "grey30")
# Add a horizontal line at the overall median
abline(h = median(f1$time), lty = 2, col = "grey50")
text(0.6, median(f1$time) + 0.15, "Median", cex = 0.7, col = "grey50")
```
## Part 4: Lap-by-lap degradation
Tyre degradation means lap times typically get slower as the stint progresses. Let us visualise this.
```{webr-r}
# Plot lap times over laps for each driver
plot(NULL, xlim = c(1, 20), ylim = range(f1$time),
     xlab = "Lap", ylab = "Lap time (seconds)",
     main = "Tyre Degradation: Lap Times Over a Stint")
team_cols <- c("#3366CC", "#CC0000", "#FF8800",
               "#E80020", "#FF8800", "#00A19C")
for (i in seq_along(drivers)) {
  d <- f1[f1$driver == drivers[i], ]
  lines(d$lap, d$time, col = team_cols[i], lwd = 2)
  # Add a trend line
  fit <- lm(time ~ lap, data = d)
  abline(fit, col = team_cols[i], lty = 2, lwd = 1)
}
legend("topleft", drivers, col = team_cols, lwd = 2,
       cex = 0.7, bg = "white")
```
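The dashed trend lines come from `lm()`, and the slope of each line is an estimate of the time lost per lap. As a rough sketch, the cell below fits the same kind of model to one hypothetical stint and pulls out that slope. The stint itself (a 92 second base pace, 0.05 seconds lost per lap, and the `demo_` names) is an assumption made up for this example.

```{webr-r}
# Hypothetical single stint: 20 laps, 0.05 s lost per lap plus random noise
set.seed(1)
demo_lap  <- 1:20
demo_time <- 92 + 0.05 * demo_lap + rnorm(20, 0, 0.3)

# Fit a straight line; the `demo_lap` coefficient is the estimated
# degradation rate in seconds per lap
demo_fit <- lm(demo_time ~ demo_lap)
deg_rate <- unname(coef(demo_fit)["demo_lap"])
cat("Estimated degradation:", round(deg_rate, 3), "s per lap\n")
```

To get the same number for a driver in this worksheet, you could run `coef(lm(time ~ lap, data = f1[f1$driver == "Verstappen", ]))["lap"]`.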
::: {.callout-tip}
## Try this
Go back to Part 1, change `set.seed(42)` to a different number, then re-run every cell in order. The entire analysis updates with new simulated data, and returning to seed 42 brings back exactly the original numbers. This is reproducible research in action.
:::
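If you want to see what a seed does in isolation, the short sketch below draws random numbers three times. The variable names `draw_a`, `draw_b` and `draw_c` are just for this illustration.

```{webr-r}
# Same seed, same random numbers
set.seed(42)
draw_a <- rnorm(5)
set.seed(42)
draw_b <- rnorm(5)
identical(draw_a, draw_b)   # should be TRUE

# A different seed starts a different sequence
set.seed(7)
draw_c <- rnorm(5)
identical(draw_a, draw_c)   # should be FALSE
```

This is why anyone who runs the worksheet with the same seed sees the same lap times.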
## Part 5: Is Verstappen actually faster?
```{webr-r}
# t-test: Verstappen vs Hamilton
# (t.test() runs Welch's test by default, which does not assume equal variances)
ver <- f1$time[f1$driver == "Verstappen"]
ham <- f1$time[f1$driver == "Hamilton"]
t.test(ver, ham)
```
```{webr-r}
# Effect size: Cohen's d
mean_diff <- mean(ver) - mean(ham)
# Averaging the two variances works here because both drivers have 20 laps
pooled_sd <- sqrt((sd(ver)^2 + sd(ham)^2) / 2)
cohens_d <- mean_diff / pooled_sd
cat("Cohen's d =", round(cohens_d, 2), "\n")
cat("Interpretation:",
    ifelse(abs(cohens_d) > 0.8, "Large effect",
    ifelse(abs(cohens_d) > 0.5, "Medium effect",
    ifelse(abs(cohens_d) > 0.2, "Small effect", "Negligible"))), "\n")
```
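The cell above averages the two variances, which matches the pooled variance only when both groups are the same size. As a sketch of the more general case, the helper below weights each variance by its degrees of freedom. The name `effect_size_d` is hypothetical and is not a function from base R or any package.

```{webr-r}
# Hypothetical helper: Cohen's d with the pooled SD weighted by group size
effect_size_d <- function(x, y) {
  nx <- length(x)
  ny <- length(y)
  # Pooled variance: each group's variance weighted by its degrees of freedom
  pooled_var <- ((nx - 1) * var(x) + (ny - 1) * var(y)) / (nx + ny - 2)
  (mean(x) - mean(y)) / sqrt(pooled_var)
}

# Made-up samples: means of 2 and 3, both with SD 1, so d is -1
effect_size_d(c(1, 2, 3), c(2, 3, 4))
```

With equal group sizes, as in this worksheet, `effect_size_d(ver, ham)` should agree with the value computed in the previous cell.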
## Summary
::: {.callout-important}
## What this demonstrates
- **WebR** runs R code directly in the browser with no installation
- Students can explore data, run statistical tests, and create visualisations interactively
- The same workflow used in RStudio works here: load data, summarise, visualise, test, interpret
- Every code cell is editable, so students can experiment freely without breaking anything
:::