Every now and then I search my own code for a specific graph
that I want to reuse. The package includes some of the graphs I have
created, which may be a good starting point for my future self or for
your own project. The plotgraph() function runs the installed source code
and returns the graph. For example, do you know the datasaurus plot?
Without input, the plotgraph() function returns the available graphs.
# attach the package, then list available graphs without input
library(edgar)
plotgraph()
Anscombe quartet
Anscombe's quartet is a set of four datasets that have nearly identical simple descriptive statistics, yet appear very different when graphed. Each dataset consists of eleven (x, y) points. They were constructed in 1973 by the statistician Francis Anscombe to demonstrate both the importance of graphing data before analyzing it and the effect of outliers on statistical properties.
plotgraph("anscombe_quartet.R")
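The claim about nearly identical statistics can be checked with the `anscombe` data frame that ships with base R. The following is a minimal sketch and is independent of the package's own script; the object name `stats` is only illustrative.

```r
# Summary statistics for the four x/y pairs in the built-in `anscombe` data
stats <- sapply(1:4, function(i) {
  x <- anscombe[[paste0("x", i)]]
  y <- anscombe[[paste0("y", i)]]
  c(
    mean_x = mean(x),
    mean_y = mean(y),
    var_x  = var(x),
    var_y  = var(y),
    cor_xy = cor(x, y)
  )
})
colnames(stats) <- paste0("set", 1:4)

# One column per dataset; the rows are almost constant across columns
round(stats, 2)
```

All four sets share a mean of x of 9, a mean of y of about 7.5, and a correlation of roughly 0.82, even though their scatter plots look nothing alike.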
Boxplot Illustration
The Boxplot Illustration shows the main components of a boxplot. The boxplot is a standardized way of displaying the distribution of data based on a five-number summary: minimum, first quartile, median, third quartile, and maximum. It can also show outliers.
plotgraph("boxplot_illustration.R")
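As a small sketch of the numbers behind a boxplot, base R's `fivenum()` returns the five-number summary and `boxplot.stats()` reports which points are flagged as outliers. The vector `x` is made-up example data, not data from the package.

```r
# Made-up example data with one unusually large value
x <- c(2, 4, 4, 5, 6, 7, 8, 9, 10, 25)

# Tukey's five-number summary: minimum, lower hinge, median, upper hinge, maximum
five <- fivenum(x)
five

# boxplot.stats() applies the 1.5 * IQR rule used by boxplot()
bs <- boxplot.stats(x)
bs$stats  # whisker ends, hinges, and median as drawn in the plot
bs$out    # points drawn individually as outliers
```

Note that `fivenum()` uses hinges, which can differ slightly from the quartiles returned by `quantile()` with its default settings.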
Boxplot pitfalls
The boxplot pitfalls graph shows the main problem with boxplots: a boxplot summarizes the distribution of a dataset but hides the underlying data points. A jitter plot is added to reveal them.
plotgraph("boxplot_pitfalls.R")
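The pitfall can be reproduced with two small samples. This is a sketch with made-up values (`spread` and `clustered` are illustrative names, not objects from the package): both have the same five-number summary, so their boxes look identical, and only the jittered points show that one sample is clustered around two values.

```r
# Two made-up samples with identical five-number summaries
spread    <- c(0, 1, 2, 3.5, 5, 6.5, 8, 9, 10)
clustered <- c(0, 2, 2, 2, 5, 8, 8, 8, 10)

same_summary <- identical(fivenum(spread), fivenum(clustered))
same_summary

# Identical boxes; the jittered points reveal the different distributions
samples <- list(spread = spread, clustered = clustered)
boxplot(samples)
stripchart(samples, method = "jitter", vertical = TRUE, add = TRUE, pch = 16)
```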
Data format
The data format graph shows the difference between the long and the wide data format. The long format is often preferred for data analysis and visualization. The wide format is often preferred for data storage and data entry.
plotgraph("long_wide.R")
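To make the two formats concrete, here is a minimal sketch that converts a small made-up wide table to long format. It uses base R's `reshape()` to avoid extra dependencies; `tidyr::pivot_longer()` is a common alternative. The column names (`country`, `y2020`, `y2021`, `year`, `value`) are assumptions for illustration.

```r
# Wide format: one row per country, one column per year (made-up values)
wide <- data.frame(
  country = c("A", "B"),
  y2020   = c(1.2, 3.4),
  y2021   = c(1.5, 3.1)
)

# Long format: one row per country-year combination
long <- reshape(
  wide,
  direction = "long",
  varying   = c("y2020", "y2021"),
  v.names   = "value",
  timevar   = "year",
  times     = c(2020, 2021),
  idvar     = "country"
)
rownames(long) <- NULL
long
```

The wide table has two rows and one column per year, while the long table has four rows and a single `value` column, which is the shape most plotting and modeling functions expect.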
Data joins
The plot shows the different types of joins (inner, left, right, and full join) and is inspired by the Data Wrangling with dplyr and tidyr chapter from the R for Data Science book.
plotgraph("data_joins.R")
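The four join types can be sketched with base R's `merge()` on two tiny made-up tables; `left_tbl` and `right_tbl` are illustrative names. In dplyr, the corresponding functions are `inner_join()`, `left_join()`, `right_join()`, and `full_join()`.

```r
# Two made-up tables that share ids 2 and 3
left_tbl  <- data.frame(id = c(1, 2, 3), x = c("a", "b", "c"))
right_tbl <- data.frame(id = c(2, 3, 4), y = c("B", "C", "D"))

inner <- merge(left_tbl, right_tbl, by = "id")                # ids in both tables
left  <- merge(left_tbl, right_tbl, by = "id", all.x = TRUE)  # all ids from left_tbl
right <- merge(left_tbl, right_tbl, by = "id", all.y = TRUE)  # all ids from right_tbl
full  <- merge(left_tbl, right_tbl, by = "id", all = TRUE)    # ids from either table

# Number of rows returned by each join
rows <- sapply(list(inner = inner, left = left, right = right, full = full), nrow)
rows
```

Rows without a match in the other table are kept only by the left, right, or full join, and their missing columns are filled with `NA`.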
Datasaurus
The datasaurus plot, inspired by the Datasaurus Dozen paper, shows 12 datasets that share the same summary statistics but look completely different when plotted. It illustrates the importance of visualizing data before analyzing it.
plotgraph("datasaurus.R")
Gapminder
The Gapminder bubble chart, inspired by the Gapminder project, shows the life expectancy and GDP per capita for countries over time.
plotgraph("gapminder.R")
Simpson’s paradox
The Simpson’s paradox plot shows how the correlation between two variables can reverse once a third variable is taken into account. It underlines the importance of visualizing data and of causal inference: overall there may seem to be a positive correlation, but when the data is split into groups, the correlation within each group can be negative.
plotgraph("simpson.R")
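The reversal can be reproduced with a few lines of base R. This is a sketch on constructed data, not the data behind the package's plot: within each of three hypothetical groups, y falls as x rises, but groups with larger x values also have larger y values.

```r
# Constructed data: three groups of five points each
group  <- rep(c("a", "b", "c"), each = 5)
offset <- rep(c(0, 5, 10), each = 5)  # shifts each group along x
step   <- rep(0:4, times = 3)         # position within the group

x <- offset + step
y <- 2 * offset - step                # y decreases with x inside every group

# Correlation when the grouping is ignored
overall <- cor(x, y)

# Correlation within each group
within <- sapply(split(data.frame(x, y), group), function(d) cor(d$x, d$y))

overall
within
```

The pooled correlation is positive because the differences between groups dominate, while every within-group correlation is negative.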
UCB Admission
Were students discriminated against? The UCB Admission plot shows the admission rates for different departments at the University of California, Berkeley. The UCB Admission case illustrates the importance of causal inference: overall it seems that more women were rejected, but when the data is split into departments, the opposite can be true.
edgar::plotgraph("ucb_admission.R")
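The same comparison can be sketched with the `UCBAdmissions` table that ships with base R (a three-way table of admission status, gender, and department). Whether the package's script uses this exact object is an assumption; the variable names below are illustrative.

```r
# Pool all departments: counts by admission status and gender
by_gender <- apply(UCBAdmissions, c(1, 2), sum)
overall_rate <- by_gender["Admitted", ] / colSums(by_gender)

# Admission rate by gender within each department
admitted  <- UCBAdmissions["Admitted", , ]         # Gender x Dept counts
applied   <- apply(UCBAdmissions, c(2, 3), sum)    # Gender x Dept totals
dept_rate <- admitted / applied

round(overall_rate, 2)
round(dept_rate, 2)
```

Pooled over departments, men have the higher admission rate, but within several departments the rate for women is higher. The pattern arises because women applied more often to departments with low admission rates.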