# R

![](https://3496061366-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MDSpD50SdXxsYnU7P9D%2F-MKHV2uUL2S5HLMigWA6%2F-MKHVi771sb__da9awKH%2Fimage.png?alt=media\&token=7281d340-7229-4a4c-923b-c36cddf34d94)

### Overview

PDH.stat can be accessed from R through the `rsdmx` package, developed by Emmanuel Blondel with contributors Matthieu Stigler and Eric Persson. Learn more about the original package [here](https://github.com/opensdmx/rsdmx). The package has been configured to include Pacific Data Hub's .Stat API as a default service provider.

### Installation

These steps have been tested with R 4.0.2 on Windows 10.

Remove rsdmx if already installed: `remove.packages("rsdmx")`

Install devtools: `install.packages("devtools")`

Install the latest development version of rsdmx from GitHub: `devtools::install_github("opensdmx/rsdmx")`

### Basic Usage

This is a quick-start guide. Go [here](https://cran.r-project.org/web/packages/rsdmx/vignettes/quickstart.html) for the official documentation.

Load package: `library(rsdmx)`

**See all service providers**

Aside from PDH.stat, the original package offers connectivity with OECD, Eurostat and others. See all available service providers with the `getSDMXServiceProviders()` function.

```r
as.data.frame(getSDMXServiceProviders())
```

![](https://3496061366-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MDSpD50SdXxsYnU7P9D%2F-MSZsSloJ-gdAsmzzXtV%2F-MS_1H7ccYyrgZAChKII%2Fr1.png?alt=media\&token=24bcc7af-b92f-46b8-8bf1-99f591839678)

**See available dataflows from PDH.stat**

To see the available PDH.stat dataflows (data sets), use the `readSDMX()` function, setting the `providerId` parameter to "PDH" and the `resource` parameter to "dataflow":

```r
as.data.frame(readSDMX(providerId="PDH", resource="dataflow"))
```

![](https://3496061366-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MDSpD50SdXxsYnU7P9D%2F-MHKJBmNl25KXgZAMCoe%2F-MHKJSMifu4WQpAwan34%2Fgetdataflows.png?alt=media\&token=5cb08ac3-c8ae-4c50-be74-14cd2dc1d141)

To return only the available data set IDs and their English names, select the `id` and `Name.en` columns of the data frame:

```r
as.data.frame(readSDMX(providerId="PDH", resource="dataflow"))[c("id", "Name.en")]
```

![](https://3496061366-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MDSpD50SdXxsYnU7P9D%2F-MHKJBmNl25KXgZAMCoe%2F-MHKJTuY-jSWrAt-LQPq%2Fgetdataflowsandnames.png?alt=media\&token=89ea2666-5aee-42ce-b44e-11d530811c8c)
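Once the list of dataflows is in a data frame, it can be searched by name. The following is a minimal sketch: `find_dataflows()` is a hypothetical helper (not part of `rsdmx`), and the `flows` data frame is a small stand-in for the real result so that the example runs offline. Only the `id` and `Name.en` columns and the "DF_CPI" row come from this guide; the second row is made up.

```r
# Stand-in for as.data.frame(readSDMX(providerId="PDH", resource="dataflow")).
# The second row is a made-up placeholder, not a real PDH.stat dataflow.
flows <- data.frame(
  id = c("DF_CPI", "DF_EXAMPLE"),
  Name.en = c("Inflation Rates", "Example dataflow (hypothetical)"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep dataflows whose English name matches a pattern,
# ignoring case.
find_dataflows <- function(flows, pattern) {
  hits <- grepl(pattern, flows$Name.en, ignore.case = TRUE)
  flows[hits, c("id", "Name.en")]
}

matches <- find_dataflows(flows, "inflation")
print(matches)
```

With a live connection, the same helper could be applied to the data frame returned by `readSDMX()` instead of the stand-in.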

**Get all data for a dataflow**

To retrieve a dataflow, pass the dataflow ID to the `readSDMX()` function in the `flowRef` parameter, and set the `resource` parameter to "data".

For example, to retrieve the "Inflation Rates" dataflow, use the ID "DF\_CPI" (as shown when retrieving all the dataflows for PDH.stat):

```r
sdmx <- readSDMX(providerId="PDH", resource="data", flowRef="DF_CPI")
df <- as.data.frame(sdmx)
```

![](https://3496061366-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MDSpD50SdXxsYnU7P9D%2F-MHKJBmNl25KXgZAMCoe%2F-MHKJQpCcliSBCiPZwr_%2Fgetcpidata.png?alt=media\&token=b66d57d0-efd0-416d-a9b4-26e7e5162281)

**Get more specific data for a dataflow**

Extra parameters can be supplied to the `readSDMX()` function to retrieve a filtered view of the dataflow:

* `start` is the desired start year (supplied as an integer)
* `end` is the desired end year (supplied as an integer)
* `key` controls a variety of filters, and by default it is set to "all" (retrieves all data). A further explanation is provided below.

The `key` parameter controls a different number of variables depending on the dataflow, such as frequency, country, currency and others. Each variable is selected with a code, and the codes are separated by dots (`.`). Leaving a position empty, so that two dots `..` appear together, acts as a "wildcard" (selects all available values for that variable). A plus `+` allows multiple values to be selected for one variable. Generally the frequency comes first: `A` for "annual" or `M` for "monthly" (if the data is available at that level). Some examples:

* For the `DF_CPI` "Inflation Rates" dataflow, to get annual data from 2010 to 2015 for Cook Islands and Fiji:
  * The `key` is `"A.CK+FJ.."`
  * `start` is 2010 and `end` is 2015
  * The R code:

```r
as.data.frame(readSDMX(providerId = "PDH", 
                resource="data", 
                flowRef="DF_CPI", 
                key="A.CK+FJ..", 
                start=2010, 
                end=2015))
```
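The key syntax described above (codes joined by `+` within a variable, variables joined by `.`, empty positions as wildcards) can be assembled programmatically. This is only a sketch: `build_key()` is a hypothetical helper, not an `rsdmx` function, and it assumes the caller already knows how many variables the dataflow expects and in what order.

```r
# Hypothetical helper (not part of rsdmx): build an SDMX key string.
# Each list element holds the codes for one position in the key;
# NULL leaves that position empty, which acts as a wildcard.
build_key <- function(dims) {
  parts <- vapply(
    dims,
    function(codes) paste(codes, collapse = "+"),
    character(1)
  )
  paste(parts, collapse = ".")
}

# Annual data, Cook Islands and Fiji, remaining two positions left as wildcards
key <- build_key(list("A", c("CK", "FJ"), NULL, NULL))
print(key)
# [1] "A.CK+FJ.."
```

The resulting string can then be passed as the `key` argument of `readSDMX()`.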

Given that the `key` variables can change depending on the dataflow, it can be easier to retrieve all data and then filter manually in R. Alternatively, use the [Data Explorer](https://stats.pacificdata.org/?locale=en) to filter a dataset and then view the relevant API call and key as explained [here](https://docs.pacificdata.org/de#get-api-queries-corresponding-to-the-data-selection).

![](https://3496061366-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MDSpD50SdXxsYnU7P9D%2F-MHKJBmNl25KXgZAMCoe%2F-MHKJOGzJP-39JzCu1aZ%2Fcookfiji.png?alt=media\&token=b5ea0ba7-c891-4e85-92df-57a563760669)
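As noted above, it can be simpler to retrieve all data and filter it in R afterwards. The sketch below shows one way to do that with base R. The data frame is a stand-in so the example runs offline: the column names `GEO_PICT`, `obsTime` and `obsValue` are assumptions about what the converted data frame looks like, and the values are placeholders rather than real statistics. Check `names(df)` on the real result before adapting it.

```r
# Stand-in for as.data.frame(readSDMX(providerId="PDH", resource="data",
# flowRef="DF_CPI")). Column names are assumed and values are placeholders.
df <- data.frame(
  GEO_PICT = c("CK", "FJ", "WS", "CK", "FJ"),
  obsTime  = c("2009", "2010", "2012", "2015", "2016"),
  obsValue = c(1.0, 2.0, 3.0, 4.0, 5.0),
  stringsAsFactors = FALSE
)

# Take the year from the first four characters of the time period
year <- as.integer(substr(df$obsTime, 1, 4))

# Keep rows for Cook Islands and Fiji between 2010 and 2015 inclusive
subset_df <- df[df$GEO_PICT %in% c("CK", "FJ") & year >= 2010 & year <= 2015, ]
print(subset_df)
```

Filtering locally avoids having to know the order of the `key` variables, at the cost of downloading the whole dataflow first.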
