R

Run advanced statistical analyses on Pacific data using the rsdmx package


Last updated 4 years ago


Overview

PDH.stat can be accessed from R through the rsdmx package, developed by Emmanuel Blondel with contributors Matthieu Stigler and Eric Persson. Learn more about the original package in its GitHub repository (opensdmx/rsdmx). The package has been configured to include Pacific Data Hub's .Stat API as a default service provider.

Installation

These steps have been tested with R 4.0.2 on Windows 10.

Remove rsdmx if already installed: remove.packages("rsdmx")

Install devtools: install.packages("devtools")

Install the latest development version of rsdmx from GitHub: devtools::install_github("opensdmx/rsdmx")

Basic Usage

Load package: library(rsdmx)

See all service providers

Aside from PDH.stat, the original package offers connectivity with OECD, Eurostat and others. See all available service providers with the getSDMXServiceProviders() function.

as.data.frame(getSDMXServiceProviders())

See available dataflows from PDH.stat

To see the available PDH.stat dataflows (data sets), use the readSDMX() function, setting the providerId parameter to "PDH" and the resource parameter to "dataflow":

as.data.frame(readSDMX(providerId="PDH", resource="dataflow"))

To return the available data set IDs and their English names, filter the dataframe:

as.data.frame(readSDMX(providerId="PDH", resource="dataflow"))[c("id", "Name.en")]
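When the list of dataflows is long, it can help to search it by name. The following is a minimal sketch in base R; find_dataflows() is a hypothetical helper (not part of rsdmx), and it assumes the dataframe has the "id" and "Name.en" columns used above. The demo dataframe is a stand-in so the helper can be tried without a network call, and its second row is invented for illustration.

```r
# Hypothetical helper (not part of rsdmx): case-insensitive search of
# dataflow names. Assumes `flows` has "id" and "Name.en" columns.
find_dataflows <- function(flows, pattern) {
  hits <- grepl(pattern, flows[["Name.en"]], ignore.case = TRUE)
  flows[hits, c("id", "Name.en"), drop = FALSE]
}

# Stand-in data for illustration only; the second row is made up.
demo_flows <- data.frame(
  id = c("DF_CPI", "DF_EXAMPLE"),
  Name.en = c("Inflation Rates", "Hypothetical example flow"),
  stringsAsFactors = FALSE
)

res <- find_dataflows(demo_flows, "inflation")
print(res)

# With a live connection, the same helper could be applied to the real list:
# flows <- as.data.frame(readSDMX(providerId="PDH", resource="dataflow"))
# find_dataflows(flows, "inflation")
```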

Get all data for a dataflow

To retrieve a dataflow, pass its ID to the readSDMX() function in the flowRef parameter and set the resource parameter to "data".

For example, the "Inflation Rates" dataflow has the ID "DF_CPI" (as shown when retrieving all the dataflows for PDH.stat):

sdmx <- readSDMX(providerId="PDH", resource="data", flowRef="DF_CPI")
df <- as.data.frame(sdmx)

Get more specific data for a dataflow

Extra parameters can be supplied to the readSDMX() function to retrieve a filtered view of the dataflow:

  • start is the desired start year (supplied as an integer)

  • end is the desired end year (supplied as an integer)

  • key controls a variety of filters, and by default it is set to "all" (retrieves all data). A further explanation is provided below.

The key parameter controls a different number of variables depending on the dataflow, including frequency, country, currency and others. Each variable is selected with a code, and the codes are separated by a dot (.). Leaving a position empty, which produces two consecutive dots (..), acts as a "wildcard" (selects all available values). A plus (+) allows multiple values to be selected for one variable. Generally the frequency comes first: A for "annual" or M for "monthly" (if the data is available at that level). Some examples:

  • For the DF_CPI "Inflation Rates" dataflow, to get annual data from 2010 to 2015 for Cook Islands and Fiji:

    • The key is "A.CK+FJ.."

    • start is 2010 and end is 2015

    • The R code:

as.data.frame(readSDMX(providerId = "PDH",
                       resource = "data",
                       flowRef = "DF_CPI",
                       key = "A.CK+FJ..",
                       start = 2010,
                       end = 2015))
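Key strings are easy to mistype, so it can help to assemble them from their parts. The sketch below uses a hypothetical helper, build_sdmx_key() (not part of rsdmx), which joins the codes for each variable with a plus and the variables with dots. It assumes the caller already knows how many variables the dataflow has and in what order; the four positions used here simply mirror the example key above.

```r
# Hypothetical helper (not part of rsdmx): builds a key string.
# `dims` is a list with one element per variable, in the dataflow's order.
# A character vector selects one or more codes; NULL leaves the position
# empty, which acts as a wildcard.
build_sdmx_key <- function(dims) {
  parts <- vapply(
    dims,
    function(codes) paste(codes, collapse = "+"),
    character(1)
  )
  paste(parts, collapse = ".")
}

# Annual data for Cook Islands and Fiji, remaining variables left open
key <- build_sdmx_key(list("A", c("CK", "FJ"), NULL, NULL))
print(key)  # "A.CK+FJ.."
```

The resulting string can then be passed as the key argument of readSDMX().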

This is a quick-start guide. See the official documentation for full details.

Given that the key variables can change depending on the dataflow, it can be easier to retrieve all data and then filter manually in R. Alternatively, use the Data Explorer to filter a dataset and then view the relevant API call and key.
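As a rough illustration of filtering manually in R, the sketch below keeps only selected countries and years from a dataframe. filter_observations() is a hypothetical helper, and the column names "GEO_PICT" (country) and "obsTime" (time period) are assumptions about the retrieved dataframe; check names(df) on the real data and adjust the geo_col and time_col arguments as needed. The demo dataframe contains made-up values so the example runs without a network call.

```r
# Hypothetical helper: keep rows for the given country codes and year range.
# The default column names are assumptions; verify them with names(df).
filter_observations <- function(df, geos, start, end,
                                geo_col = "GEO_PICT", time_col = "obsTime") {
  # Take the first four characters of the period as the year,
  # so that values such as "2012" or "2012-03" both work
  years <- as.integer(substr(as.character(df[[time_col]]), 1, 4))
  keep <- df[[geo_col]] %in% geos & years >= start & years <= end
  df[keep, , drop = FALSE]
}

# Made-up stand-in data for illustration only
demo <- data.frame(
  GEO_PICT = c("CK", "FJ", "WS", "FJ"),
  obsTime = c("2009", "2012", "2013", "2016"),
  obsValue = c(1.0, 2.0, 3.0, 4.0),
  stringsAsFactors = FALSE
)

filtered <- filter_observations(demo, geos = c("CK", "FJ"),
                                start = 2010, end = 2015)
print(filtered)
```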
