useR!2017: Full Schedule

11:00am CEST

Show Me Your Model: tools for visualisation of statistical models

Keywords: Model visualisation, model exploration, structure visualisation, grammar of model visualisation
The ggplot2 (Wickham 2009) package changed how we approach data visualisation. Instead of looking for a suitable type of plot among dozens of predefined templates, we now express the relations among variables with a well-defined grammar based on the excellent book The Grammar of Graphics (Wilkinson 2006).
A similar revolution is happening with tools for the visualisation of statistical models. In the CRAN repository, one may find many great packages that graphically explain the structure or diagnostics of some family of statistical models. To mention just a few well-known and powerful packages: rms, forestmodel and regtools (regression models), survminer (survival models), ggRandomForests (random forest based models), factoextra (multivariate structure exploration), factorMerger (one-way ANOVA) and many, many others. They are great, but they do not share the same logic or structure.
New packages from the tidyverse, like broom (Robinson 2017), create an opportunity to build a unified interface for model exploration and visualisation across a large collection of statistical models. And there are more and more articles that set theoretical foundations for a unified grammar of model visualisation (see for example Wickham, Cook, and Hofmann 2015).
In this talk I am going to present various approaches to model visualisation, give an overview of selected existing packages for the visualisation of statistical models, and discuss a proposal for a unified grammar of model visualisation.
References

Robinson, David. 2017. Broom: Convert Statistical Analysis Objects into Tidy Data Frames. https://CRAN.R-project.org/package=broom.

Wickham, Hadley. 2009. Ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York. http://ggplot2.org.

Wickham, Hadley, Dianne Cook, and Heike Hofmann. 2015. Visualizing Statistical Models: Removing the Blindfold. Statistical Analysis and Data Mining 8 (4).

Wilkinson, Leland. 2006. The Grammar of Graphics. Springer Science & Business Media.

Speakers

Przemyslaw Biecek

Thursday July 6, 2017 11:00am - 11:18am CEST
PLENARY Wild Gallery

Talk, Kaleidoscope II

Company 1046

11:18am CEST

Quantitative fisheries advice using R and FLR

Keywords: Quantitative Fisheries Science, Common Fisheries Policy, Management Strategy Evaluation, advice, simulation
Webpages: https://flr-project.org, https://github.com/flr
The management of the activities of fishing fleets aims at ensuring the sustainable exploitation of the ocean’s living resources, the provision of important food resources to humankind, and the profitability of an industry that is an important economic and social activity in many areas of Europe and elsewhere. These are the principles of the European Union Common Fisheries Policy (CFP), which has driven the management of Europe’s fisheries resources since 1983.
Quantitative scientific advice is at the heart of fisheries management regulations, providing estimates of the likely current and future status of fish stocks through statistical population models, termed stock assessments, but also probabilistic comparisons of the expected effects of alternative management procedures. Management Strategy Evaluation (MSE) uses stochastic simulation to incorporate both the inherent variability of natural systems, and our limited ability to model their dynamics, into analyses of the expected effects of a given management intervention on the sustainability of both fish stocks and fleets.
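The idea of evaluating a management procedure by stochastic simulation can be illustrated with a minimal sketch. It is not FLR code and does not use any FLR classes: the surplus-production (logistic growth) model, the fixed harvest-rate rule, the parameter values and the function names `simulate_stock` and `prob_above_limit` are all assumptions chosen for illustration. Process noise stands in for the variability of the natural system, and observation noise for our limited ability to measure it.

```python
import math
import random


def simulate_stock(harvest_rate, years=30, r=0.5, k=1000.0,
                   process_sd=0.1, obs_sd=0.2, rng=None):
    """Simulate one biomass trajectory under a fixed harvest-rate rule.

    All parameter values are hypothetical and chosen only for illustration.
    """
    rng = rng or random.Random()
    biomass = k / 2.0
    trajectory = [biomass]
    for _ in range(years):
        # Managers only see a noisy estimate of the true biomass.
        observed = biomass * math.exp(rng.gauss(0.0, obs_sd))
        # The catch is set from the estimate, but cannot exceed the stock.
        catch = min(harvest_rate * observed, biomass)
        # Logistic surplus production, perturbed by lognormal process noise.
        growth = r * biomass * (1.0 - biomass / k)
        noise = math.exp(rng.gauss(0.0, process_sd))
        biomass = max((biomass + growth - catch) * noise, 0.0)
        trajectory.append(biomass)
    return trajectory


def prob_above_limit(harvest_rate, limit=200.0, n_sims=500, seed=1):
    """Share of simulated trajectories that never fall below `limit`."""
    rng = random.Random(seed)
    kept = sum(
        min(simulate_stock(harvest_rate, rng=rng)) >= limit
        for _ in range(n_sims)
    )
    return kept / n_sims
```

Calling `prob_above_limit` for several candidate harvest rates gives the kind of probabilistic comparison of alternative management procedures described above, here reduced to a single stock and a single performance measure.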
For the last ten years, the Fishery Library in R (FLR) project has been building an extensible toolset of statistical and simulation methods for quantitative fisheries science (Kell et al. 2007), with the overarching objective of enabling fisheries scientists to carry out analyses of management procedures in a simplified and robust manner through the MSE approach.
FLR has become widely used in many of the scientific bodies providing fisheries management advice, both in Europe and elsewhere. The evaluation of the effects of some elements of the revised CFP, the analysis of the proposed fisheries management plans for the North Sea, and the comparison of management strategies for Atlantic tuna stocks, among others, have used the FLR tools to advise managers on the possible courses of action to favour the sustainable use of many marine fish stocks.
The FLR toolset is currently composed of 20 packages, covering the various steps in the fisheries advice and simulation workflow. They include a large number of S4 classes, and more recently Reference Classes, to model the data structures that represent each of the elements in the fisheries system. Class inheritance and method overloading are essential tools that have allowed the FLR packages to interact with, complement and enrich each other, while still limiting the number of functions a user needs to be aware of. Methods also exist that make use of R’s parallelization facilities and of compiled code to deal with complex computations. Statistical models have also been implemented, making use of both R’s capabilities and external libraries for Automatic Differentiation.
We present the current status of FLR, the new developments taking place, and the challenges faced in the development of a collection of packages based on S4 classes and methods.
References

Kell, L. T., I. Mosqueira, P. Grosjean, J.-M. Fromentin, D. Garcia, R. Hillary, E. Jardim, et al. 2007. “FLR: An Open-Source Framework for the Evaluation and Development of Management Strategies.” ICES Journal of Marine Science 64 (4). http://dx.doi.org/10.1093/icesjms/fsm012.

Speakers

Finlay Scott

Joint Research Centre, European Commission

Thursday July 6, 2017 11:18am - 11:36am CEST
PLENARY Wild Gallery

Talk, Kaleidoscope II

Company 627

11:36am CEST

jamovi: a spreadsheet for R

Keywords: Spreadsheet, User-interface, Learning R
Webpages: https://www.jamovi.org, https://CRAN.R-project.org/package=jmv
In spite of the availability of the powerful and sophisticated R ecosystem, spreadsheets such as Microsoft Excel remain ubiquitous within the business community, and spreadsheet-like software, such as SPSS, continues to be popular in the sciences. This likely reflects that for many people the spreadsheet paradigm is familiar and easy to grasp.
The jamovi project aims to make R and its ecosystem of analyses accessible to this large body of users. jamovi provides a familiar, attractive, interactive spreadsheet with the usual spreadsheet features: data-editing, filtering, sorting, and real-time recomputation of results. Significantly, all analyses in jamovi are powered by R, and are available from CRAN. Additionally, jamovi can be placed in ‘syntax mode’, where the underlying R code for each analysis is produced, allowing for a seamless transition to an interactive R session.
We believe that jamovi represents a significant opportunity for the authors of R packages. With some small modifications, an R package can be augmented to run inside of jamovi, allowing R packages to be driven by an attractive user-interface (in addition to the normal R environment). This makes R packages accessible to a much larger audience, and at the same time provides a clear pathway for users to migrate from a spreadsheet to R scripting.
This talk introduces jamovi, presents its user interface and feature set, and demonstrates the ease with which R packages can be augmented to additionally support the interactive spreadsheet paradigm.
jamovi is available from www.jamovi.org

Speakers

Jonathon Love

Thursday July 6, 2017 11:36am - 11:54am CEST
PLENARY Wild Gallery

Talk, Kaleidoscope II

Company 997

11:54am CEST

The growing popularity of R in data journalism

Online presentation: https://goo.gl/pF9bKU

In this talk, Timo Grossenbacher, data journalist at Swiss Public Broadcast and creator of Rddj.info, will show that R is becoming more and more popular among a new community: data journalists. He will showcase some innovative work that has been done with R in the field of data journalism, both by his own team and by other media outlets all over the world. At the same time, he will point out the strengths (reproducibility, for example) and hurdles (having to learn to code) of using R for a typical data journalism workflow, a workflow that is often centered around quick, exploratory data analysis rather than statistical modeling. During the talk, he will also point out and critically discuss packages that are of great help for journalists especially, such as the tidyverse, readxl and googlesheets packages.

Speakers

Timo Grossenbacher

Project Lead «Automated Journalism», Tamedia

Timo Grossenbacher (born 1987) has been responsible for projects in the field of «Automated Journalism» at Tamedia since summer 2020. Before that, he worked for more than five years as a data journalist at Schweizer Radio und Fernsehen. He studied geography and computer science at the University of Zurich and...

Thursday July 6, 2017 11:54am - 12:12pm CEST
PLENARY Wild Gallery

Talk, Kaleidoscope II

Company 1245

12:12pm CEST

FFTrees: An R package to create, visualise and use fast and frugal decision trees

Online presentation: https://ndphillips.github.io/useR2017_pres/

Keywords: decision trees, decision making, package, visualization
Webpages: https://cran.r-project.org/web/packages/FFTrees/, https://rpubs.com/username/project
Many complex real-world problems call for fast and accurate classification decisions. An emergency room physician faced with a patient complaining of chest pain needs to quickly decide if the patient is having a heart attack or not. A lost hiker, upon discovering a patch of mushrooms, needs to decide whether they are safe to eat or are poisonous. A stock portfolio adviser, upon seeing that, at 3:14 am, an influential figure tweeted about a company he is heavily invested in, needs to decide whether to move his shares or sit tight. These decisions have important consequences and must be made under time-pressure with limited information.
How can and should people make such decisions? One effective way is to use a fast and frugal decision tree (FFT). FFTs are simple heuristics that allow people to make fast, accurate decisions based on limited information (Gigerenzer and Goldstein 1996; Martignon, Katsikopoulos, and Woike 2008). In contrast to compensatory decision algorithms such as regression, or computationally intensive algorithms such as random forests, FFTs allow people to make fast decisions ‘in the head’ without requiring statistical training or a calculation device. Because they are so easy to implement, they are especially helpful in applied decision domains such as emergency rooms, where people need to be able to make decisions quickly and transparently (Gladwell 2007; Green and Mehr 1997).
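To make the structure of an FFT concrete, here is a minimal sketch in Python. It does not use or reproduce the FFTrees package: the `Node` class, the `classify` function, the cue names and the thresholds are all hypothetical and chosen only for illustration. The sketch relies on the usual description of an FFT as an ordered list of cues in which every level offers one exit, and the last level offers two.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Node:
    cue: str             # name of the cue inspected at this level
    threshold: float     # hypothetical cut-off for the cue value
    exit_if_above: bool  # exit when (value > threshold) equals this flag
    exit_label: bool     # decision returned when the exit is taken


def classify(tree: List[Node], case: Dict[str, float]) -> bool:
    """Walk the cues in order and stop at the first exit that applies."""
    for node in tree:
        above = case[node.cue] > node.threshold
        if above == node.exit_if_above:
            return node.exit_label
    # The final level has two exits: if its first exit was not taken,
    # the opposite decision is returned.
    return not tree[-1].exit_label


# A made-up three-cue tree for a chest-pain triage decision
# (True = treat as high risk). Cues and thresholds are illustrative only.
TREE = [
    Node("st_change", 0.5, exit_if_above=True, exit_label=True),
    Node("chest_pain", 0.5, exit_if_above=False, exit_label=False),
    Node("other_factor", 0.5, exit_if_above=True, exit_label=True),
]
```

A case such as `{"st_change": 1, "chest_pain": 0, "other_factor": 0}` is decided at the first cue without looking at the others, which is what makes the tree frugal: most cases exit before all the information has been consulted.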
While FFTs are easy to implement, actually constructing an effective FFT from data is less straightforward. While several FFT construction algorithms have been proposed (Dhami and Ayton 2001; Martignon, Katsikopoulos, and Woike 2008; Martignon et al. 2003), none have been programmed and distributed in an easy-to-use and well-documented tool. The purpose of this paper is to fill this gap by introducing FFTrees (Phillips 2016), an R package (R Core Team 2016) that allows anyone to create, evaluate, and visualize FFTs from their own data. The package requires minimal coding, is documented by many examples, and provides quantitative performance measures and visual displays showing exactly how cases are classified at each level in the tree.
This presentation is structured in three sections: Section 1 provides a theoretical background on binary classification decision tasks and explains how FFTs solve them. Section 2 provides a 5-step tutorial on how to use the FFTrees package to construct and evaluate FFTs from data. Finally, Section 3 compares the prediction performance of FFTrees to alternative algorithms such as logistic regression and random forests. To preview our results, we find that trees created by FFTrees are both more efficient than, and as accurate as, the best of these algorithms across a wide variety of applied datasets. Moreover, the package produces trees much simpler than those of standard decision tree algorithms such as rpart (Therneau, Atkinson, and Ripley 2015), while maintaining similar prediction performance.
References

Dhami, Mandeep K, and Peter Ayton. 2001. “Bailing and Jailing the Fast and Frugal Way.” Journal of Behavioral Decision Making 14 (2). Wiley Online Library: 141–68.

Gigerenzer, Gerd, and Daniel G Goldstein. 1996. “Reasoning the Fast and Frugal Way: Models of Bounded Rationality.” Psychological Review 103 (4). American Psychological Association: 650.

Gladwell, Malcolm. 2007. Blink: The Power of Thinking Without Thinking. Back Bay Books.

Green, Lee, and David R Mehr. 1997. “What Alters Physicians’ Decisions to Admit to the Coronary Care Unit?” Journal of Family Practice 45 (3). [New York, Appleton-Century-Crofts]: 219–26.

Martignon, Laura, Konstantinos V Katsikopoulos, and Jan K Woike. 2008. “Categorization with Limited Resources: A Family of Simple Heuristics.” Journal of Mathematical Psychology 52 (6). Elsevier: 352–61.

Martignon, Laura, Oliver Vitouch, Masanori Takezawa, and Malcolm R Forster. 2003. “Naive and yet Enlightened: From Natural Frequencies to Fast and Frugal Decision Trees.” Thinking: Psychological Perspective on Reasoning, Judgment, and Decision Making, 189–211.

Phillips, Nathaniel. 2016. FFTrees: Generate, Visualise, and Compare Fast and Frugal Decision Trees.

R Core Team. 2016. R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing. https://www.R-project.org/.

Therneau, Terry, Beth Atkinson, and Brian Ripley. 2015. Rpart: Recursive Partitioning and Regression Trees. https://CRAN.R-project.org/package=rpart.

Speakers

Nathaniel Phillips

Thursday July 6, 2017 12:12pm - 12:30pm CEST
PLENARY Wild Gallery

Talk, Kaleidoscope II

Company 1152
