library(PenguinR)
library(ggplot2)
#> Warning: package 'ggplot2' was built under R version 4.5.1
library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union

Introduction

The PenguinR package provides a collection of datasets on penguin biology, ecology, and behavior. It includes data on species morphology, clutch completion, blood isotope composition, and heart rate measurements collected from adult foraging penguins near Palmer Station, Antarctica.

The datasets span morphometric, physiological, ecological, and experimental data. Variables include flipper length, body mass, bill dimensions, reproductive success indicators, metabolic activity, and isotopic composition, so the data can be used for exercises in statistical analysis and experimental design.

Dataset Suffixes

Each dataset in the PenguinR package uses a suffix to denote the type of R object:

  • _df: A data frame
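The naming convention can be checked programmatically. The sketch below is illustrative only: `has_df_suffix()` is a hypothetical helper (not part of PenguinR), the names are the ones listed in this vignette, and the data frame check runs only if PenguinR is installed.

```r
# Dataset names as listed in this vignette
dataset_names <- c("penguins_df", "penguins_raw_df", "peng_df", "pinguinos_df")

# Hypothetical helper: TRUE when a name carries the "_df" suffix
has_df_suffix <- function(x) endsWith(x, "_df")

suffix_ok <- has_df_suffix(dataset_names)
print(suffix_ok)

# If PenguinR is installed, load the datasets into a scratch environment
# and confirm that each object really is a data frame
if (requireNamespace("PenguinR", quietly = TRUE)) {
  env <- new.env()
  data(list = dataset_names, package = "PenguinR", envir = env)
  is_df <- vapply(
    dataset_names,
    function(nm) is.data.frame(env[[nm]]),
    logical(1)
  )
  print(is_df)
}
```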

Example Datasets

Below are selected example datasets included in the PenguinR package:

  • penguins_df: Size Measurements for Adult Foraging Penguins near Palmer Station, Antarctica.

  • penguins_raw_df: Penguin Size, Clutch, and Blood Isotope Data for Foraging Adults near Palmer Station, Antarctica.

  • peng_df: Size Measurements for Penguins near Palmer Station, Antarctica.

  • pinguinos_df: Penguin Heart Rate.
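The list above is only a selection. One way to see every dataset a package ships is base R's `data(package = ...)`, which returns an index with `Item` and `Title` columns. The wrapper `list_datasets()` below is a hypothetical convenience function, not something PenguinR provides.

```r
# Hypothetical helper: return the dataset index of an installed package
# as a data frame with one row per dataset
list_datasets <- function(pkg) {
  idx <- data(package = pkg)$results
  data.frame(
    name  = unname(idx[, "Item"]),
    title = unname(idx[, "Title"]),
    stringsAsFactors = FALSE
  )
}

# Works for any installed package; shown for PenguinR when it is available
if (requireNamespace("PenguinR", quietly = TRUE)) {
  print(list_datasets("PenguinR"))
}
```

The same helper can be pointed at any installed package, for example `list_datasets("datasets")`.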

Data Visualization with PenguinR Data

Size Measurements for Penguins near Palmer Station, Antarctica


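# Summary table: mean flipper length and body mass by species and sex,
# after dropping rows with missing measurements (the plot below uses the full peng_df)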
peng_summary <- peng_df %>%
  filter(!is.na(flipper_length), !is.na(body_mass)) %>%
  group_by(species, sex) %>%
  summarise(
    mean_flipper = mean(flipper_length, na.rm = TRUE),
    mean_mass = mean(body_mass, na.rm = TRUE),
    .groups = "drop"
  )

# Scatterplot: Body mass vs Flipper length by species and sex
ggplot(peng_df, aes(x = flipper_length, y = body_mass, color = species, shape = sex)) +
  geom_point(size = 2, alpha = 0.8) +
  geom_smooth(method = "lm", se = FALSE, linetype = "dashed", color = "black") +
  labs(
    title = "Body Mass vs Flipper Length in Penguins",
    subtitle = "Data by Species and Sex near Palmer Station, Antarctica",
    x = "Flipper Length (mm)",
    y = "Body Mass (g)",
    color = "Species",
    shape = "Sex"
  ) +
  theme_minimal() +
  theme(
    plot.title = element_text(face = "bold"),
    axis.text.x = element_text(angle = 45, hjust = 1)
  )
#> `geom_smooth()` using formula = 'y ~ x'
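`geom_smooth(method = "lm")` draws least-squares fits, but it does not report their coefficients. As a rough sketch, a pooled fit over all rows can be obtained with `lm()`. Note the assumptions: `fit_mass_on_flipper()` is a hypothetical helper, the column names are those used in the plot code above, and the pooled fit may differ from the plotted lines if ggplot2 fits separate lines per group.

```r
# Hypothetical helper: pooled least-squares fit of body mass on flipper length.
# Column names follow those used in the plot code above.
fit_mass_on_flipper <- function(df) {
  lm(body_mass ~ flipper_length, data = df)
}

# Apply to peng_df only when it is available in the session
if (exists("peng_df")) {
  fit <- fit_mass_on_flipper(peng_df)
  print(coef(fit))  # intercept and slope (change in body mass per unit flipper length)
}
```

Adding `species` or `sex` to the model formula would be the natural next step for comparing groups.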

Conclusion

The PenguinR package provides a collection of datasets on penguin biology and ecology, intended to support learning, teaching, and research in statistical analysis and experimental design.

By combining data on morphology, reproductive success, blood isotope composition, and heart rate, the package lets users apply a range of statistical methods, including descriptive analysis, ANOVA, regression, and multivariate techniques, to real ecological data.

Whether used for teaching, methodological demonstration, or reproducible research, PenguinR connects data science and biology and lets users build analytical skills with data on penguins near Palmer Station, Antarctica.