Polar Data Catalogue

-------------

# PDC Explorer Visualization Cookbook

### Polar Data Catalogue

### Patrick Bosworth

Date Published: 31-03-2023

Date of Last Update: 18-03-2023

## Table of contents
1. [INTRODUCTION](#introduction)
    * [Overview](#overview)
2. [REQUIREMENTS](#reqs)
    * [Map Usage Requirements](#usageReqs)
    * [Maintenance and Deployment Requirements](#maintenanceReqs)
3. [DEPLOYMENT AND MODIFICATION](#modifying)
    * [Making Edits and Testing](#editandtest)
    * [How to Deploy](#howto)
4. [TROUBLESHOOTING / COMMON ERRORS](#troubleshooting)
    * [API Integration](#apiintegration)
    * [Bounds Checking and Error Handling](#errors)
5. [VISUALIZATION STRUCTURE AND DESIGN TIPS](#design)
    * [Visualization Intro](#appintro)
        - [Package and Library Setup](#packlib)
        - [Read In and Adjust the Dataset](#readadj)
        - [Shiny Apps UI](#shinyui)
        - [Leaflet and Shiny Server](#shinyserver)
            + [The Filtered Dataset](#filtdata)
            + [Map and Table Display](#maptable)
            + [Language Switcher](#langsw)
6. [CONCLUSION](#conclusion)
7. [APPENDIX](#appendix)
    * [Acronyms](#acryonyms)
    * [Glossary](#glossary)
    * [Links](#links)

## INTRODUCTION

### Overview
The Polar Data Catalogue Explorer (or PDC Explorer) is an interactive data visualization designed to make discovering and using data at the [Polar Data Catalogue](https://www.polardata.ca/) easy and effective. The goal is for the map to align with the FAIR principles (Findable, Accessible, Interoperable, and Reusable) while also enhancing data discoverability.

## REQUIREMENTS

### Map Usage Requirements

- Any modern computing device (Android 10 and above, iOS 13 and above, Windows 10 and above, macOS Mojave and above)
- Internet connection
- A modern web browser using Chromium, WebKit, or Gecko: Google Chrome, Safari, Firefox, Microsoft Edge, Brave, Vivaldi, Opera, etc.
- JavaScript must be enabled

### Maintenance and Deployment Requirements
All of the above, plus the following:

- RStudio, available for Windows, Mac, and Linux.
- The following R libraries must be installed: shiny, shinyWidgets, leaflet, leaflet.extras2, stringr, httr, jsonlite, shiny.i18n, tidyverse, and rsconnect. The source code file has install routines available, commented out at the top of the file for your convenience.
- PDC logo file "pdc.png"
- Access to the PDC metadata API
- Access to the shinyapps.io account. A connection should be made within RStudio following this [setup tutorial](https://shiny.rstudio.com/articles/shinyapps.html)

## DEPLOYMENT AND MODIFICATION

### MAKING EDITS AND TESTING
The PDC Explorer is a Shiny app visualization, written in the R programming language, that uses the Leaflet library for map rendering. Making edits and testing the visualization requires RStudio, an IDE available for Windows, Mac, and Linux.

To create a working development environment:

1. Install the latest version of R from [CRAN](https://cran.rstudio.com/) (the Comprehensive R Archive Network)
2. If you're using Windows, install the latest version of [Rtools](https://cran.r-project.org/bin/windows/Rtools/)
3. Download and install the latest version of [RStudio from Posit](https://posit.co/download/rstudio-desktop/). A [guide to using RStudio](https://docs.posit.co/ide/user/) may help new users familiarize themselves with the development environment.

Edits to the visualization can be made in RStudio if you have access to the source code, which should be a single .R file. Be sure to set the working directory in RStudio to the same location as the source code file. Working and editing in RStudio allows you to test all the visualization functionality in a local environment; changes made here do not replicate to the online map unless they are specifically deployed.

### HOW TO DEPLOY
Deployment is the process by which the local version of the Shiny app visualization is published to the hosting provider and becomes available on the web to a general audience.
Deployment is nearly automatic, as RStudio controls the process as long as it is connected to the authenticated PDC shinyapps.io account. You will need the shinyapps.io account login details to [configure the connection](https://shiny.rstudio.com/articles/shinyapps.html) between it and RStudio.

When you're ready to deploy:

1. Save a copy of the R source code as "app.R"
2. From the "File" menu, choose "Publish"
3. In the file selector on the left, ensure only "app.R", "translationPDC.json", "pdc.png", and "metadata.csv" are selected. If you have added other files for use by the visualization, select those as well.
4. On the right side of the Publish to Server box, ensure the correct account name, "polardata", is selected.
5. Ensure that Title or Update shows "metadata" as the name of the Shiny app.
6. Hit Publish to begin the deployment.

RStudio will take a few minutes to complete the deployment, after which a browser window will open with the visualization running.

## TROUBLESHOOTING / COMMON ERRORS

### API INTEGRATION
The PDC Explorer Visualization sources its data from the [Polar Data Catalogue API](http://hedeby.uwaterloo.ca/api/documentation). If the API is unavailable, the visualization will fall back to using an occasionally-updated CSV file (metadata.csv) that is deployed along with the visualization to shinyapps.io. As of the current release, this failover is silent and the user is not informed that they might be viewing old or out-of-date metadata.

Here's an example of an API call. It uses the "Find records by Research Program" function to retrieve the metadata for a research program by name. Note that spaces in the name are converted to %20 to comply with URL encoding rules. The R library "httr" is required for the GET command. Pay careful attention to the status response code, as it will indicate whether the API call was successful.

```r
library(httr)  # provides GET() and status_code()
apiResponse <- GET("http://hedeby.uwaterloo.ca/api/metadata/program/amundsen%20science?page=0")
status <- status_code(apiResponse)  # 200 indicates a successful call
```

Greg Vey (gvey@uwaterloo.ca) can assist in troubleshooting API issues.

### BOUNDS CHECKING AND ERROR HANDLING
Shiny and Leaflet are a powerful combination for displaying maps, but the system has a tendency to fail in ugly ways when a bug is encountered. Typically, the map will fail to display and a cryptic error message will appear on screen, which is a bad experience for the user. Therefore, take care that failure states do not interrupt the rendering of the map. Whenever possible, include bounds checks on any user input, provide if-statements to route around error states, and code with redundancy in mind. Read on for further understanding of the PDC Explorer App.

## VISUALIZATION STRUCTURE AND DESIGN TIPS

### Visualization Intro
The PDC Explorer visualization is constructed in a standard way for a Shiny app.

1. Package and library setup
2. Read in and adjust the dataset
3. Define the Shiny UI - the appearance and default behaviour of the map/table display and filtering controls
4. Configure the Shiny server - the logic and interaction between the controls and the map/table display
5. A single line to invoke the Shiny app and run the visualization

#### Package and Library Setup
All the libraries and packages required to compile and deploy the visualization are already included in the source code. In your development environment, you are required to install each of the libraries before they can be used.
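
One way to find out which of the required libraries are still missing from a development environment is sketched below. It uses only base R and is not taken from the PDC Explorer source code; the helper names `missing_packages` and `install_missing` are hypothetical.

```r
# Libraries named in the requirements list of this cookbook
required <- c("shiny", "shinyWidgets", "leaflet", "leaflet.extras2", "stringr",
              "httr", "jsonlite", "tidyverse", "shiny.i18n", "rsconnect")

# Hypothetical helper: return the subset of `pkgs` that is not yet installed.
# `installed` defaults to the names of all packages in the local R libraries.
missing_packages <- function(pkgs, installed = rownames(installed.packages())) {
  setdiff(pkgs, installed)
}

# Hypothetical wrapper: install only what is missing and return those names
install_missing <- function(pkgs) {
  to_install <- missing_packages(pkgs)
  if (length(to_install) > 0) {
    install.packages(to_install)
  }
  invisible(to_install)
}
```

Calling `install_missing(required)` in the R console would then install only the libraries that are absent and leave the others untouched.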
Uncomment the first 10 lines of the code and run them to install, or run the following commands in your R console:

```r
install.packages("shiny")
install.packages("shinyWidgets")
install.packages("leaflet")
install.packages("leaflet.extras2")
install.packages("stringr")
install.packages("httr")
install.packages("jsonlite")
install.packages("tidyverse")
install.packages("shiny.i18n")
install.packages("rsconnect")
```

If you add a new library to the code, ensure that its install.packages() and library() lines are listed in this section. This is also the section of the code that reads in the other files: the PDC icon, the backup metadata file, and the translation file.

#### Read In and Adjust the Dataset
The next section connects to the API to retrieve a list of all the PDC programs. Next, the visualization loops through this program list to retrieve the complete metadata of each program. The visualization continues whether or not the API call was successful: if the API is down or an incomplete dataset is assembled, the backup metadata file is used.

In the data adjustment section, multiple helper columns are created in the dataset to assist with data display and filtering. Comments in the source code help explain what each of these steps does. If a new feature needs another helper column, or if additional data are needed to provide a better user experience, place the code in this section and comment clearly.

#### Shiny Apps UI Section
The UI section defines the display and sets display options and defaults. The entire UI section is enclosed within Shiny's "fluidPage", which should adjust for screen size or mobile usage automatically. In the Shiny methodology, the control names or inputIds defined here are used by the Server section later. Note that the controls here automatically refresh their content when new options are chosen.
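
The overall shape of such a UI can be sketched as follows. This is a minimal illustration using only the shiny package, not the PDC Explorer source code: the title, labels, choices, and tab names are assumptions, and `selectInput` stands in for the shinyWidgets `pickerInput` to keep the example self-contained. The inputIds are the ones named in this cookbook.

```r
library(shiny)

# Illustrative skeleton only: title, labels, choices, and tab names are assumptions
ui <- fluidPage(
  titlePanel("PDC Explorer"),
  sidebarLayout(
    sidebarPanel(
      helpText("Use the filters to narrow the programs shown."),
      # Stand-in for the pickerInput "program" filter, with a non-empty default
      selectInput("program", "Program",
                  choices = c("Amundsen Science"),
                  selected = "Amundsen Science", multiple = TRUE),
      # Double-ended date slider running from 2000 to the current date
      sliderInput("daterange2", "Date published",
                  min = as.Date("2000-01-01"), max = Sys.Date(),
                  value = c(as.Date("2000-01-01"), Sys.Date())),
      textOutput("displayCount"),
      actionButton("resetButton", "Reset filters"),
      downloadButton("downloadData", "Download CSV"),
      width = 3  # 3 of 12 layout units
    ),
    mainPanel(
      # One tab for the map and one for the table; outputs omitted in this sketch
      tabsetPanel(
        tabPanel("Map"),
        tabPanel("Table")
      )
    )
  )
)
```

Each inputId given here (for example `"program"`) becomes available to the Server section as `input$program`.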
Some of these automatic refreshes are built into the controls themselves, as with any of the Inputs; others happen because of matching code in the Server section.

The order of the controls defines where on the page they will appear:

- titlePanel: defines the name and tab title of the page/visualization
- actionButton "refresh": control button for the language translation, plus a line of code to set up the language translation
- sidebarLayout: encloses the filtering and main panel. The rest of the page is enclosed by this function
- sidebarPanel: encloses the filtering panel
    - **helpText**: a short text description of how to use the PDC Explorer visualization
    - **pickerInput** *"program"*: defines the programs filter. Note the default is set to "Amundsen Science", which ensures the map displays something when the visualization first runs rather than showing a blank screen. This is a [shinyWidget](https://dreamrs.github.io/shinyWidgets/index.html).
    - **sliderInput** *"daterange2"*: defines a double-ended slider filter for the date a program was published. The filter starts in the year 2000 and goes to whatever the current date is.
    - **pickerInput** *"author"*: defines the author filter. By default, all author names are selected. The list of options in this filter does not change based on other filters, so it is possible to end up with a filter combination that returns no results. This control is a [shinyWidget](https://dreamrs.github.io/shinyWidgets/index.html).
    - **pickerInput** *"tags"*: defines the tags filter. By default, all tags are selected. Tags have a category, so a "subtext" is defined so we can see the category when filtering by tags. This is a [shinyWidget](https://dreamrs.github.io/shinyWidgets/index.html).
    - **textOutput** *"displayCount"*: defines a simple display of how many programs resulted from the selected filters. The text and functions are in the Server section.
    - **actionButton** *"resetButton"*: activates a function to reset the filters to their defaults
    - **downloadButton** *"downloadData"*: activates a function to download the currently filtered metadata to a CSV file on the user's local device
    - **width = 3**: defines the size of the filter panel. The value is 3 out of 12 units, so about 1/4 of the window. Note that if the window is too narrow, the filter panel appears at the top with the map/table below.

Details on these UI functions can be reviewed on the [Shiny function reference website](https://shiny.rstudio.com/reference/shiny/1.7.4/) or the [shinyWidgets function reference](https://dreamrs.github.io/shinyWidgets/index.html).

#### Leaflet and Shiny Server Section
The second half of the Shiny portion of the visualization incorporates Leaflet code and Shiny code together to define the logic of the display. The control names defined in the UI section are used here to write the code that governs how the controls behave. Because the Server is defined as "reactive", updates occur automatically in most cases.

There are multiple functions within the Server section:

1. Construct a filtered dataset
2. Create the Leaflet map
3. Create the Leaflet tabular view
4. Display the count of filtered results
5. Enable a filters reset button
6. Enable a CSV download
7. Enable the language switching functionality

###### The Filtered Dataset: filteredMeta
In concept, the purpose of the filteredMeta expression is to create a subset of the complete metadata dataset. The subset is defined by the filter selections input by the user. This subset is then displayed on the map or table.

Here's the implementation:

1. A copy of the complete metadata is made
2. If the program filter has been cleared with nothing selected, the default Amundsen Science program is selected, which ensures the map always displays something.
3. In a rather complex expression, a subset of the metadata is created with the filter settings defining the selection criteria.
   For any filter to work properly, it must appear within this expression. The general format for each selection is:

   ```r
   tempMeta$DATA_ROW_TO_BE_FILTERED [%in% for text fields; conditional for numeric fields] input$NAME_OF_FILTER &
   ```

   Sub-expressions can be complex and include other functions as needed, but no intermediate variables can be used within the main filtering expression. If a separate variable is needed, define it before the filtering expression.
4. In the event that the user filters out all possible studies (i.e. filteredMeta is empty), a single-row empty dataframe is created. This ensures the map still displays when no programs result from the selected filters.

###### Map and Table Display
The remaining code in the server section connects the data to be displayed with the displays themselves. The map and table are configured here, as well as accessory functions. The order of code blocks doesn't matter - those kinds of display options are configured in the UI section.

The map is enclosed within a **leaflet** command, and each line adds a feature to the map. These features include:

- **setView**: sets the location and zoom level of the default viewpoint on the map, in this case centered on Canadian polar regions.
- **addProviderTiles**: Leaflet is capable of using several map providers. These are the base maps upon which the map display is built. OpenStreetMap is used as the default, and two alternate providers are present as well. More information on how Providers work can be found in the [Leaflet documentation](https://rstudio.github.io/leaflet/basemaps.html)
- **addEasyprint**: allows users to download a screenshot of the map view
- **addRectangles**: based on the filtered data, draws a hollow rectangle on top of the map for each program found. The corners of the rectangles are defined by lat/long pairs in the metadata. A label is drawn with the name of the program, as well as a marker with the PDC icon.
  Finally, a popup is defined so that when a user clicks on a program, several details about the program appear in a box. Note: drawing rectangles can be computationally intensive, and the map will be slow to update if >100 rectangles need to be drawn. Not all of the lat/long pairs in the dataset are accurate, which can lead to some rectangles being tiny dots or taking up the whole world. Duplicate programs can sometimes draw directly on top of each other.
- **renderDataTable**: we already defined a **tabsetPanel** in the UI section, which means the display will have a map in one tab and a data table in the other tab. This function defines the table section, and is quite simple because the default table is already highly functional. If you want different columns to display in the data table, edit the list "tabularCols", which is near the end of the data adjustment section of the code.
- **observeEvent** resetButton: an observeEvent is an ["event handler"](https://shiny.rstudio.com/reference/shiny/1.0.5/observeevent), an interactive method to change the display or data when an action occurs, and is distinct from the filters. This one forces a reset of the filter selections when the button is pressed, by calling **updateTextInput** and passing the default values back into the filters.
- **downloadHandler**: generates a CSV file of the filtered dataset (filteredMeta) and allows for an easy download. A default filename based on the user's system date helps to organize files.
- **observeEvent** for the language switcher: another observeEvent-actionButton pair that enables the user to switch between English and French. The language switching function is described in detail below.

###### Language Switcher
The ability to switch from English to French is enabled with the "shiny.i18n" library. Very little logic is needed to make the function work, but some care must be taken to ensure every piece of text is translated properly.
Setup of the language switcher happens early in the code, where a Translator variable "i18n" is defined and points to a JSON translation file called "translationPDC.json". The button logic is defined in the observeEvent-actionButton pair near the end of the code. Little would ever need to change there unless another language is added.

For language switching to work, any text displayed in the visualization must be enclosed by a special piece of code:

```r
# This text cannot be translated properly
exampleVar <- "text to translate"

# This text can be translated properly
exampleVar <- i18n$t("text to translate")
```

Note: Do not enclose any metadata values within the i18n code.

The second required step is to make sure a translation pair is present in the translationPDC.json file. Open that file in the text editor of your choice and check whether the exact string (in this case: "text to translate") is already present. If not, find the last translation pair and make the following modifications:

Before:

```json
...
{
  "en": "Transdisciplinary",
  "fr": "Transdisciplinaire"
}
```

After:

```json
...
{
  "en": "Transdisciplinary",
  "fr": "Transdisciplinaire"
},
{
  "en": "text to translate",
  "fr": "texte à traduire"
}
```

Repeat for each unique line of text. The translation library is case- and space-sensitive, so the strings must match perfectly. Take care with comma placement and curly braces, or the visualization will fail to load the JSON file and halt.

## CONCLUSION
The PDC Explorer is a Shiny App which allows users to browse, filter, and locate programs in the PDC catalogue on an interactive map or table. The [PDC Metadata API](http://hedeby.uwaterloo.ca/api/documentation) is used to create the dataset, and a download function allows users to download a filtered subset of that metadata for research purposes.
Users can also follow links on the map or in the table to the Polar Data Catalogue website for additional information or to download the program datasets themselves. Ultimately, our goal is to provide easy access to project resources for all interested parties, in the spirit of the [FAIR Data Principles](https://www.go-fair.org/fair-principles/).

This cookbook file is intended to be used by developers or maintainers to keep the Explorer visualization working well, or to add new features.

## APPENDIX

### Acronyms
| Acronym | Description |
| ------ | ------ |
| API | Application Programming Interface |
| JSON | JavaScript Object Notation, a file format used in the visualization for language translation |
| PDC | The Polar Data Catalogue |
| CSV | Comma-Separated Values, a flat file format for storing data. Similar to a spreadsheet. |
| CRAN | The Comprehensive R Archive Network, a repository of R code and libraries |

### Glossary
| Term | Description |
| ------ | ------ |
| R | an interpreted programming language for statistical computing and graphics |
| Shiny app | a web application framework that allows R to be deployed online |
| shinyapps.io | a Shiny app hosting service. The PDC Explorer visualization is hosted on shinyapps.io |
| Leaflet | an open source JavaScript library used to build web mapping applications. PDC Explorer uses Leaflet code to display its map |

### Links
1. [Polar Data Catalogue](https://www.polardata.ca/)
2. [Shiny Apps setup tutorial](https://shiny.rstudio.com/articles/shinyapps.html)
3. [CRAN](https://cran.rstudio.com/)
4. [Rtools download](https://cran.r-project.org/bin/windows/Rtools/)
5. [RStudio download](https://posit.co/download/rstudio-desktop/)
6. [RStudio user's guide](https://docs.posit.co/ide/user/)
7. [RStudio connection to shinyapps.io tutorial](https://shiny.rstudio.com/articles/shinyapps.html)
8. [Polar Data Catalogue API documentation](http://hedeby.uwaterloo.ca/api/documentation)
9. [shinyWidgets function reference](https://dreamrs.github.io/shinyWidgets/index.html)
10. [Shiny function reference website](https://shiny.rstudio.com/reference/shiny/1.7.4/)
11. [Leaflet documentation](https://rstudio.github.io/leaflet/basemaps.html)
12. [Observer-Event reference documentation](https://shiny.rstudio.com/reference/shiny/1.0.5/observeevent)
13. [FAIR Data Principles](https://www.go-fair.org/fair-principles/)

---------------

This work is licensed under the Creative Commons Attribution 4.0 International License. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/ or send a letter to Creative Commons, PO Box 1866, Mountain View, CA 94042, USA.

---------------