Thursday, 26 July 2018

Maps as Statistics? A call to Adventure for Perception Research in (geo)visualization (LIDA Seminar)

In his talk, “Maps as Statistics? A call to Adventure for Perception Research in (geo)visualization”, Dr Roger Beecham argued that line-up tests (the “Visual Line-up Method”, first proposed by Wickham et al. in 2010) remove some of the potential pitfalls of making inferences from graphics. Line-up tests, depicted in the figure below, are visual analogues of the statistical tests used in most scientific research: a plot of the real data is hidden among a set of simulated decoys constructed under a meaningful null hypothesis, and a viewer is asked to pick out the plot that looks different. If the vagaries of null hypothesis testing are unfamiliar, think of a police line-up. If a witness picks the suspect out of a set of candidate non-suspects, then the presumption of innocence (the null hypothesis that the suspect did not commit the crime) becomes less plausible.
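The construction of a line-up can be sketched in a few lines of Python. This is a minimal illustration, not the code used in the talk: it assumes a permutation null (the decoys reuse the observed values in shuffled order), and the names `make_lineup` and `chance_of_lucky_guess` are hypothetical. Drawing each panel as a plot or map is left out.

```python
import random

def make_lineup(observed, n_panels=20, seed=0):
    """Hide the observed values among permuted decoys.

    Returns (panels, true_position). Assumption: the null hypothesis is
    "no structure", simulated by shuffling the observed values.
    """
    rng = random.Random(seed)
    panels = []
    for _ in range(n_panels - 1):
        decoy = list(observed)
        rng.shuffle(decoy)  # same values, positions scrambled
        panels.append(decoy)
    # Place the real data at a random position so the viewer cannot guess it.
    true_position = rng.randrange(n_panels)
    panels.insert(true_position, list(observed))
    return panels, true_position

def chance_of_lucky_guess(n_panels):
    """Probability that a single viewer picks the real panel by luck alone."""
    return 1.0 / n_panels

observed = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
panels, true_position = make_lineup(observed, n_panels=20, seed=42)
```

With 20 panels, a single viewer who has no real signal to go on picks the real plot one time in 20, which is what links the line-up to a conventional 5% significance level.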

There are numerous possible applications for graphical line-ups. Wickham et al. (2010) argue that they could be used to test complex relational structures, for example spatial relations (Beecham et al. 2017), that cannot easily be captured by summary statistics. As Beecham et al. state in their extended paper, though, for graphical line-ups to be elevated to formal statistics in their own right, there needs to be confidence that any structure implied by a graphic can be reliably perceived, and this is a particular challenge for geovisualization.

Image (c) Roger Beecham.

In his research, Beecham specialises in spatial data analysis, often using new, passively collected data for social science applications. A current focus is how innovations in ‘Data Science’ are reshaping statistical model building, and the role of (geo)visualization in supporting this activity. In their extended paper, Beecham et al. (2017) make the case for using line-up tests to support inferences made from choropleth maps. Starting with Waldo Tobler’s first law of geography, “Everything is related to everything else, but near things are more related than distant things” (a principle known as spatial dependency), Beecham showed how line-up tests could be constructed to support statistical analysis of spatial structure, and to improve upon de facto approaches to testing for spatial dependency in geography. Through large crowdsourced empirical experiments, he also contributes data and models of the stark and unintended effects that differing statistical intensity and differing geometric irregularity in choropleth maps have on visual perception.
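To make "testing for spatial dependency" concrete, here is a minimal sketch of a numerical test of the kind a map line-up is a visual counterpart to. The write-up does not name a specific statistic, so Moran's I is assumed here as a commonly used measure of spatial autocorrelation. The regular grid, the rook (shared-edge) neighbour definition with binary weights, and all function names are illustrative assumptions.

```python
import random

def rook_neighbours(n_rows, n_cols):
    """Map each grid cell index to the cells that share an edge with it."""
    nbrs = {}
    for r in range(n_rows):
        for c in range(n_cols):
            nbrs[r * n_cols + c] = [
                rr * n_cols + cc
                for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
                if 0 <= rr < n_rows and 0 <= cc < n_cols
            ]
    return nbrs

def morans_i(values, nbrs):
    """Moran's I with binary weights: positive when neighbours are alike."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    cross = sum(dev[i] * dev[j] for i in nbrs for j in nbrs[i])
    total_weight = sum(len(js) for js in nbrs.values())
    return (n / total_weight) * cross / sum(d * d for d in dev)

def permutation_p_value(values, nbrs, n_perm=999, seed=0):
    """One-sided p-value: share of shuffled maps at least as clustered."""
    rng = random.Random(seed)
    observed = morans_i(values, nbrs)
    shuffled = list(values)
    as_extreme = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break any link between value and location
        if morans_i(shuffled, nbrs) >= observed:
            as_extreme += 1
    return (as_extreme + 1) / (n_perm + 1)

# Toy data: a 6x6 grid whose values increase from left to right.
n_rows, n_cols = 6, 6
nbrs = rook_neighbours(n_rows, n_cols)
gradient = [float(c) for r in range(n_rows) for c in range(n_cols)]
observed_i = morans_i(gradient, nbrs)
p_value = permutation_p_value(gradient, nbrs)
```

The shuffled maps in `permutation_p_value` play the same role as the decoys in a line-up: both show what the data would look like if location carried no information. Real choropleth units are irregular polygons rather than grid cells, so the neighbour structure would come from the geometry instead.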

He pointed out that, in the case of choropleth maps, geometric irregularity can be problematic for graphical inference because it introduces salient visual artefacts that are often incidental to the spatial process being studied. There is thus a fundamental disconnect between the patterns that are visually perceived in maps and the statistical spatial process underlying those patterns. To an extent, this mismatch between statistical effect and visual effect can be modelled. However, much uncertainty and variation remains that cannot be modelled, and Beecham et al. quantify this and compare it with similar empirical data collected on the perception of non-spatial structure in non-spatial visualizations. Returning to the notion of graphics forming statistics through line-ups, Beecham et al. (2017) identify a link with the concept of statistical power, and show how the data and models derived from this empirical work could inform estimates of statistical power in line-up tests.
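The link to statistical power can be illustrated with a simple binomial model. This is a hedged sketch rather than the model in the paper. It assumes that each of several observers independently views a line-up of `n_panels` plots, so that under the null the number of correct picks is Binomial(observers, 1/`n_panels`). The detection probability `detect_prob` is a hypothetical input, standing in for the kind of estimate a perception model might supply for a given map and effect size.

```python
from math import comb

def binom_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

def lineup_p_value(n_correct, n_observers, n_panels=20):
    """Chance of at least n_correct lucky picks if the real plot is not special."""
    return binom_tail(n_correct, n_observers, 1.0 / n_panels)

def lineup_power(detect_prob, n_observers, n_panels=20, alpha=0.05):
    """Probability of a significant result when observers truly detect the
    real plot with probability detect_prob (an assumed, constant rate)."""
    for critical in range(n_observers + 1):
        # Smallest number of correct picks that is significant at alpha.
        if lineup_p_value(critical, n_observers, n_panels) <= alpha:
            return binom_tail(critical, n_observers, detect_prob)
    return 0.0  # no outcome can reach significance with this few observers

p_three_of_ten = lineup_p_value(3, 10)
power_easy = lineup_power(0.5, 10)
power_hard = lineup_power(0.2, 10)
```

Under these assumptions, a map whose structure is harder to perceive (a lower `detect_prob`) gives a less powerful test for the same number of observers, which is why empirical data on perception matters for line-up tests.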

Roger concluded his presentation by locating his study within a new class of visual perception research that attempts to contribute data and models on the extent and reliability with which statistics are inferred from graphics. This work is currently disrupting visualization research, and potentially the way in which claims are made from data across the sciences. The full paper on the subject, experimental data and analysis code can be viewed from this web page. See also Harrison et al. 2014, Kay and Heer 2016, Correll and Heer 2017 and Albers Szafir 2018, each with fully reproducible data and code.

This talk was delivered as part of the LIDA Seminar: Maps of Time. Roger presented alongside keynote speaker Dr Jonathan Minton, who discussed Seeing Population Data as Structures, not Slices.