Are Statistics Courses Accessible? Part II: Revisited with 2020 hindsight

Introduction

I started writing this file from scratch at 10:02 in the morning of the day I delivered this talk.

I felt there was a need to demonstrate that the claims I made in my paper were realistic.

I have started by writing an R markdown document, which is just a plain text file. I can therefore read or write this file on a braille display if I choose.

I have had no sighted assistance to prepare this document. The final version being used for the talk is an HTML file that is also very readable using standard tools.

I have used:

the WriteR text editor (Godfrey and Curtis 2016), which I use instead of the (currently) inaccessible RStudio IDE (RStudio Team 2020) that my students use;
a standard installation of the R statistical software (R Core Team 2020);
the BrailleR add-on package (Godfrey et al. 2020), which was first introduced in Godfrey (2012);
and a selection of other R packages that we expect our undergraduate students to use, including ggplot2 (Wickham et al. 2020), knitr (Xie 2020), and rmarkdown (Allaire et al. 2020).

An Exploratory Data Analysis Task

A student completing one of our first year statistics courses should be able to complete the following task as part of an assignment.

“Use the diamonds data to highlight differences in the price of diamonds according to a selection of potential predictors of price.”

A sufficient solution is as follows:

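library(ggplot2)
library(magrittr)  # supplies the %>% pipe used below
data(diamonds, package = "ggplot2")
# Scatter plot of price against carat, with a smoothed trend line
Graph1 <- diamonds %>% ggplot(aes(x = carat, y = price)) + geom_point(alpha = 0.01) +
    geom_smooth(color = "blue")
Graph1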

`geom_smooth()` using method = 'gam' and formula 'y ~ s(x, bs = "cs")'

A first look at the price of diamonds according to their carat value

Graph2 <- diamonds %>% ggplot(aes(x = color, y = price)) + geom_boxplot()
Graph2

Boxplots of price for each colour of diamond

A blind person would add the following commands to know what those plots display:

library(BrailleR)
VI(Graph1)

This is an untitled chart with no subtitle or caption.
It has x-axis 'carat' with labels 0, 1, 2, 3, 4 and 5.
It has y-axis 'price' with labels 0, 5000, 10000, 15000 and 20000.
It has 2 layers.
Layer 1 is a set of 53940 points.
Layer 1 has alpha set to 0.01.
Layer 2 is a 'lowess' smoothed curve with 95% confidence intervals.Layer 2 has colour set to vivid violet.

VI(Graph2)

This is an untitled chart with no subtitle or caption.
It has x-axis 'color' with labels D, E, F, G, H, I and J.
It has y-axis 'price' with labels 0, 5000, 10000 and 15000.
The chart is a boxplot comprised of 7 boxes with whiskers.
There is a box at x=D.
It has median 1838. The box goes from 911 to 4213.5, and the whiskers extend to 357 and 9138.
There are 482 outliers for this boxplot.
There is a box at x=E.
It has median 1739. The box goes from 882 to 4003, and the whiskers extend to 326 and 8674.
There are 760 outliers for this boxplot.
There is a box at x=F.
It has median 2343.5. The box goes from 982 to 4868.25, and the whiskers extend to 342 and 10693.
There are 694 outliers for this boxplot.
There is a box at x=G.
It has median 2242. The box goes from 931 to 6048, and the whiskers extend to 354 and 13721.
There are 473 outliers for this boxplot.
There is a box at x=H.
It has median 3460. The box goes from 984 to 5980.25, and the whiskers extend to 337 and 13460.
There are 456 outliers for this boxplot.
There is a box at x=I.
It has median 3730. The box goes from 1120.5 to 7201.75, and the whiskers extend to 334 and 16309.
There are 208 outliers for this boxplot.
There is a box at x=J.
It has median 4234. The box goes from 1860.5 to 7695, and the whiskers extend to 335 and 16427.
There are 66 outliers for this boxplot.
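
The figures reported by VI() can also be checked directly from the data. The sketch below is an illustration, not BrailleR's own method: box_summary() is a hypothetical helper, and it assumes the boxes are built from the default quantile() definition with whiskers reaching to the most extreme observations within 1.5 times the interquartile range of the box.

```r
# box_summary() is a hypothetical helper written for this illustration.
# It returns the numbers a boxplot draws for one group of values.
box_summary <- function(x) {
  q <- quantile(x, probs = c(0.25, 0.5, 0.75), names = FALSE)
  reach <- 1.5 * (q[3] - q[1])  # 1.5 times the interquartile range
  inside <- x[x >= q[1] - reach & x <= q[3] + reach]
  c(lower_whisker = min(inside), q1 = q[1], median = q[2], q3 = q[3],
    upper_whisker = max(inside), outliers = length(x) - length(inside))
}

# A small made-up vector with one obvious outlier
toy <- box_summary(c(1, 2, 3, 4, 5, 100))
toy

# Apply the helper to each colour of diamond, if ggplot2 is installed
if (requireNamespace("ggplot2", quietly = TRUE)) {
  data(diamonds, package = "ggplot2")
  print(t(sapply(split(diamonds$price, diamonds$color), box_summary)))
}
```

Each row of the printed matrix corresponds to one box, so it can be read line by line alongside the VI() description.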

A talented (perhaps blind) student might improve the first graph using the following:

Graph1a <- diamonds %>% ggplot(aes(x = carat, y = price)) + geom_point(alpha = 0.01) + 
    geom_smooth(color = "blue") + facet_wrap(~color)
Graph1a

`geom_smooth()` using method = 'gam' and formula 'y ~ s(x, bs = "cs")'

A second look at the price of diamonds according to their carat value, separated by colour category

VI(Graph1a)

This is an untitled chart with no subtitle or caption.
The chart is comprised of 7 panels containing sub-charts, arranged horizontally.
The panels represent different values of color.
Each sub-chart has x-axis 'carat' with labels 0, 1, 2, 3, 4 and 5.
Each sub-chart has y-axis 'price' with labels 0, 5000, 10000, 15000 and 20000.
Each sub-chart has 2 layers.
Panel 1 represents data for color = D.
Layer 1 of panel 1 is a set of 6775 points.
Layer 1 has alpha set to 0.01.
Layer 2 of panel 1 is a 'lowess' smoothed curve with 95% confidence intervals.Layer 2 has colour set to vivid violet.
Panel 2 represents data for color = E.
Layer 1 of panel 2 is a set of 9797 points.
Layer 1 has alpha set to 0.01.
Layer 2 of panel 2 is a 'lowess' smoothed curve with 95% confidence intervals.Layer 2 has colour set to vivid violet.
Panel 3 represents data for color = F.
Layer 1 of panel 3 is a set of 9542 points.
Layer 1 has alpha set to 0.01.
Layer 2 of panel 3 is a 'lowess' smoothed curve with 95% confidence intervals.Layer 2 has colour set to vivid violet.
Panel 4 represents data for color = G.
Layer 1 of panel 4 is a set of 11292 points.
Layer 1 has alpha set to 0.01.
Layer 2 of panel 4 is a 'lowess' smoothed curve with 95% confidence intervals.Layer 2 has colour set to vivid violet.
Panel 5 represents data for color = H.
Layer 1 of panel 5 is a set of 8304 points.
Layer 1 has alpha set to 0.01.
Layer 2 of panel 5 is a 'lowess' smoothed curve with 95% confidence intervals.Layer 2 has colour set to vivid violet.
Panel 6 represents data for color = I.
Layer 1 of panel 6 is a set of 5422 points.
Layer 1 has alpha set to 0.01.
Layer 2 of panel 6 is a 'lowess' smoothed curve with 95% confidence intervals.Layer 2 has colour set to vivid violet.
Panel 7 represents data for color = J.
Layer 1 of panel 7 is a set of 2808 points.
Layer 1 has alpha set to 0.01.
Layer 2 of panel 7 is a 'lowess' smoothed curve with 95% confidence intervals.Layer 2 has colour set to vivid violet.

Comments

The VI() command has extracted quite a lot of information from the created graph objects. It produces exact details when these are easily obtained, but only structural information for some plot elements.

The BrailleR package is undergoing improvements, with special attention being given to the plots generated using the ggplot2 package. There is definite need for ongoing effort in this regard.

The Four Elements of a Statistics Course

At the first workshop in this series, now eleven years ago, I said there were four main concerns for anyone undertaking a course in statistics (Godfrey 2009).

They haven’t changed in number, but the emphasis in today’s environment is very definitely different, and I suggest that blind students are considerably better off today than they were ten years ago.

Element 1: Graphs

Aside from the dramatic improvements in the ability to know what has been produced, little has changed in the tools being used by blind students today.

We have seen improvements in tactile graphic creation and development of interactive tools for exploration of graphs.

The range of tools has increased, but interactions with students suggest to me that the way these students expect to engage in their statistics courses has not changed. Students rely more on human support than is truly necessary.

Element 2: Statistical Software

Godfrey (2009) unashamedly promoted use of R (R Core Team 2020) for offering the best access to software for blind people.

The last ten years have seen little change in the accessibility of statistical software. While personal preference plays a role in making this proclamation, I still believe R is the best option for any blind person, with only two other options offering anything close to what I consider access.

Ten years of attempting to gain any traction with statistical software developers has been almost fruitless. The advances in R are courtesy of a large and extremely active open source community. The only other options that offer the blind user any hope of access equivalent to that enjoyed by their sighted peers remain SAS and Stata. Rather unfortunately, neither of these options has much traction in the teaching of undergraduates in any discipline nowadays, and only SAS can claim to take access for blind users seriously.

Element 3: Tables

Looking up tables to perform manual calculations ought to be a thing of the past, but due to the inability to successfully modernise assessment procedures, their use remains.

Even if a student must use tabulated information, there is no need for them to rely on the printed version of the tables. We can use software as a calculator; sighted students using modern calculators don’t have to look at printed tables either.

The critical point that separates the lower 75% of the standard normal distribution from the upper 25% is found using:

qnorm(0.75)

[1] 0.6744898

OK, the printed tables are rounded to four decimal places for proportions and two for z values, so the student might need to use:

round(qnorm(0.75), digits = 2)

[1] 0.67
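
The same approach covers the other ways a printed table gets used. As a brief sketch using only base R functions, the forward lookup (from a z value to a proportion) and a critical value from a t table can be found as follows; the 10 degrees of freedom is an arbitrary choice for illustration.

```r
# Forward lookup: proportion of the standard normal distribution below
# z = 0.67, rounded to four decimal places as in a printed table
p <- pnorm(0.67)
round(p, digits = 4)

# Reverse lookup: z value that leaves 2.5% in the upper tail
z <- qnorm(0.975)
round(z, digits = 2)

# Student's t table equivalent; df = 10 is an illustrative assumption
t_crit <- qt(0.975, df = 10)
round(t_crit, digits = 3)
```

These calls should return 0.7486, 1.96 and 2.228 respectively, matching what a student would read from standard printed tables.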

Element 4: Mathematical Formulae

Tools to embed mathematical content in HTML pages have dramatically improved over the last ten years. Tools to help blind users read that content have also improved significantly.

It is now realistic to expect a sighted student to include the proper symbols in their work. We want to see \(\mu\) used for the population mean.

In markdown, this is created using standard LaTeX notation and served to the reader using either MathML or MathJax.
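
As an illustrative sketch (not taken from the source file of this talk), the plain text typed into an R markdown document for an inline symbol and a displayed formula might look like the following. The sample mean formula is simply a familiar example chosen for the illustration.

```latex
The population mean is written $\mu$ and is estimated by the sample mean
$$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$$
```

Because this source is ordinary text, it can be written and checked on a braille display before the document is rendered to HTML.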

Today’s blind authors cannot rely on pdf to create mathematical content independently.

BrailleR in Action

In order to support blind people wishing to make use of the range of tools I’ve found most useful, and those that I’ve created over the last ten years, I have compiled my work into an online book called “BrailleR in Action”.

It uses the same tools as do many other authors in the R community and is based on R markdown. We now use this format for the material we provide to students.

What I have demonstrated

This HTML version of my talk has demonstrated that I (a blind person) can work independently. I’ve

created statistical graphics,
written mathematical content,
incorporated references to other source material of interest,
and verified that the content includes what I intended.

I could have shown you how to use R to do mathematical calculations or to fit statistical models.

I could have shown you hundreds of books written using R markdown that provide me with the vast majority of the source material I need to do my job as a statistician.

I can tell you that my working life is considerably easier in 2021 than it was in 2009 because I embrace the best tools, and because I’m ready to move to anything superior that comes along.

My sighted students now use these same tools to complete their assignments, and this means I am now doing more of the marking, which ten years ago was only possible with the help of a sighted person to read the assignments aloud.

Major Conclusions

Ten years ago, I discovered that I had a role to play in helping improve access to statistics education for blind people. Over those years, I’ve learned a lot from experimentation, trying to show blind people what they can learn, and seeing a lot of good ideas fall flat because they made work for the blind person or the staff that supported them.

In the 20th century, access to printed information was the primary limitation on the blind student. This is a considerably smaller problem for blind students undertaking statistics courses in 2021 than it was even ten years ago.

In my opinion, it is the modern ways of teaching statistics that are driving the inaccessibility in the 21st century.

The ability of the blind student to take courses in qualitative disciplines and succeed is increasing with the improvements in technology. In many sciences, the opposite is true as we lose ground due to advancements in software and other technologies that are not created with the blind user in mind.

It is possible for blind students to succeed in statistics courses, as long as the right choices in software and the way work will be undertaken are made.

We live in information-rich times, and the most current work being done in statistics that is relevant to introductory and advanced students is being presented in a very accessible format. Reliance on obtaining information from the inaccessible pdf format, even with the best enhancements available today, is an inferior way of working and leaves blind students at risk of suffering in a similar fashion to many others who preceded them.

It will still take some serious preparation before the course starts, some additional work during the course, and some reasonable accommodations being made by the course staff. As much as the blind student is afraid of failing statistics, most academic staff are afraid of failing capable students.

It was necessary to prove that some things could be done, even if those actions were not eventually taken up by the thousands of blind students around the world. Numerous glass ceilings have been broken and some hugely innovative tools have been developed. The most enduring successes are those tools that provide improved access to blind people and have become embedded into the everyday software and practices being used by everyone. In statistics, this includes improvements in R and SAS, and in mathematics this includes MathJax.

I put the availability of R markdown at the forefront of accessible tools that have made such a difference for R users, with both the consumption of printed information and the production of statistical analyses in HTML being key to this success.

With respect to other printed information, we need to be sure that research efforts get put into the tools that are the most practical and useful for blind people, instead of those tools that are the most commonly used by publishers. In my opinion, the efforts to improve access to content in pdf are important. The problem is that unless they become built-in so that they are not optional extras that require authors and publishers to learn how to build in access, they will not gain sufficient traction and therefore not be valued by blind consumers. Whether we like it or not, younger blind people do not want to read material in a fixed page format that is second rate when there are slightly better HTML-based solutions ready to hand. They want access on a plethora of devices, running all operating systems, and in my experience, they want them to work right out of the box. It is difficult to see what (if any) benefits a blind person will enjoy from the most accessible pdf that do not already exist when working with documents available in accessible HTML.

The next ten years are going to be very exciting for blind people in STEM. We need to make sure the tools that currently work well continue to do so, and we also need to continue to develop more tools that improve the ability of blind people to succeed in more STEM fields. Access to information is a necessary problem to overcome, but access to modern tools will also continue to pose challenges.

References

Allaire, JJ, Yihui Xie, Jonathan McPherson, Javier Luraschi, Kevin Ushey, Aron Atkins, Hadley Wickham, Joe Cheng, Winston Chang, and Richard Iannone. 2020. rmarkdown: Dynamic Documents for R. https://github.com/rstudio/rmarkdown.

Godfrey, A. Jonathan R. 2009. “Are Statistics Courses Accessible?” In Proceedings of the Workshop on e-Inclusion in Mathematics and Science 2009, 72–80. Fukuoka, Japan.

———. 2012. “The BrailleR Project.” In Proceedings of Digitization and e-Inclusion in Mathematics and Science, edited by Katsuhito Yamaguchi and Masakazu Suzuki, 89–95. Tokyo, Japan. http://workshop.sciaccess.net/DEIMS2012/Proceedings.zip.

Godfrey, A. Jonathan R., and James M. P. Curtis. 2016. “Simple Authoring of Statistical Analyses by and for Blind People.” In Proceedings of the International Workshop on Digitization and e-Inclusion in Mathematics and Science 2016, edited by Katsuhito Yamaguchi and Masakazu Suzuki, 47–54. Kanagawa, Japan. http://workshop.sciaccess.net/DEIMS2016/index.html.

Godfrey, A. Jonathan R., Debra Warren, Paul Murrell, Timothy Bilton, and Volker Sorge. 2020. BrailleR: Improved Access for Blind Users. https://github.com/ajrgodfrey/BrailleR.

R Core Team. 2020. R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing. https://www.R-project.org/.

RStudio Team. 2020. RStudio: Integrated Development Environment for R. Boston, MA: RStudio, PBC. http://www.rstudio.com/.

Wickham, Hadley, Winston Chang, Lionel Henry, Thomas Lin Pedersen, Kohske Takahashi, Claus Wilke, Kara Woo, Hiroaki Yutani, and Dewey Dunnington. 2020. ggplot2: Create Elegant Data Visualisations Using the Grammar of Graphics. https://CRAN.R-project.org/package=ggplot2.

Xie, Yihui. 2020. knitr: A General-Purpose Package for Dynamic Report Generation in R. https://yihui.org/knitr/.