Retrieving data in R

We provide an R function, query, that fetches time series data from InfluxDB server and returns data.frame object.

Setting up

The query function uses httr, an HTTP client library for R. Please install its ssl dependency beforehand. On Debian-based systems:

$ sudo apt-get install libssl-dev

Assuming that you’ve already installed r-base and RStudio (external instructions), you need to install httr. Please execute the following inside your RStudio environment to install httr:

> install.packages("httr")

We need two additional packages for the demonstration, devtools to fetch the query source code (api-v1.3.r) from our gist repository and optional data.table to cast the data into columns. Of course, you can skip this step, and download the source code into your environment.

> install.packages("devtools")
> install.packages("data.table")

Provided function

query(…)


Download

to download the source of api-v1.3.r, please use the Raw or Download ZIP button on the GitHubGist page

Description

fetches time series data with matching parameters from the database server and converts into data.frame object

Usage

query(domain, apiKey, timeFilter = "", device = "//", location = "//", sensor = "//", includeNetworkSensors = FALSE, channel = "//", aggFunc = "", aggInterval = "", doCast = TRUE, castAggFunc = mean, convertTimestamp = TRUE, timezone = "UTC")

Arguments

domain

The domain part of the URL

apiKey

Secret key for authenticating with the server

timeFilter

Time filter expression written in InfluxQL, the default value "" ignores time

device

InfluxQL regular expression to select devices, the default value "//" matches all devices

location

InfluxQL regular expression to select locations, the default value "//" matches all locations

sensor

InfluxQL regular expression to select sensors, the default value "//" matches all sensors

includeNetworkSensors
TRUE

allows returning network-related data

FALSE (default)

filters out all network-related data

channel

InfluxQL regular expression to select channels, the default value "//" matches all channels

aggFunc

InfluxQL aggregate function evaluated on the selected data, no aggregate function is applied by default

aggInterval

InfluxQL time interval value, used with aggFunc

doCast
TRUE (default)

return time series into individual columns,

FALSE

return all time series in a single column

castAggFunc

Apply aggregate function when casting data, by default mean function is applied

convertTimestamp
TRUE (default)

convert timestamps into date time objects,

FALSE

return timestamps in unix time in milliseconds

timezone

Timezone name, used with convertTimestamp

Value

data.frame object containing time series data

Demonstration script

The following R script demonstrates the main functionality of query function with our demo.decentlab.com server. In this script, we use the previously installed devtools to source query from our gist repository and enable casting so that the output data can be column oriented. Please note that the casting requires the data.table package.

library(devtools)
devtools::source_gist('79f52bef3778e0dbbd5dc58437621d88',
                      filename = 'api-v1.3.r')

Please remove the filename parameter if above command fails.

data.df <- query(
  domain = "demo.decentlab.com",
  apiKey = "eyJrIjoiclhMRFFvUXFzQXpKVkZydm52b0VMRVg3M3U2b3VqQUciLCJuIjoiZGF0YS1xdWVyeS1hcGktZGVtby0yIiwiaWQiOjF9",
  device = "/26[67]/",
  sensor = "/(temperature)|(humidity)/",
  aggFunc = "mean",
  aggInterval = "7m",
  timeFilter = "time > now() - 1d",
)

The above example will retrieve the data from demo.decentlab.com server, matching the following criteria:

  • devices with ID containing 266 or 267 (will match also 2671, 12670, 4266, etc)

  • sensor names containing either temperature or humidity

  • compute the mean of groups divided by 7 min interval

  • received in the last one day

The value of data.df:

> head(data.df)
                 time 266.sht-humidity 266.sht-temperature 267.sensirion-sht21-humidity 267.sensirion-sht21-temperature
1 2017-02-02 15:28:00               NA                  NA                           NA                              NA
2 2017-02-02 15:35:00         33.86359            21.42597                     32.65051                        21.56540
3 2017-02-02 15:42:00         33.73389            21.45815                           NA                              NA
4 2017-02-02 15:49:00         33.65759            21.52250                     32.67340                        21.53322
5 2017-02-02 15:56:00         33.75677            21.57612                           NA                              NA
6 2017-02-02 16:03:00         33.68811            21.60830                           NA                              NA

Note that the head function shows only the first six rows. Another example demonstrates different possibilities.

data.df <- query(
  domain = "demo.decentlab.com",
  apiKey = "eyJrIjoiclhMRFFvUXFzQXpKVkZydm52b0VMRVg3M3U2b3VqQUciLCJuIjoiZGF0YS1xdWVyeS1hcGktZGVtby0yIiwiaWQiOjF9",
  device = "/^266$/",
  sensor = "/^sht-/",
  timeFilter = "time >= '2016-12-01 00:00:00' AND time < '2017-01-01 00:00:00'",
  timezone = "Europe/Zurich",
  doCast = FALSE
)

In this example, we obtain data received from only 266. ^ and $ indicates the beginning and ending of regular expression so that no other ID containing 266, such as 1266, 2660, or _266a, matches the expression. In addition, all sensors starting with sht- are selected. The casting was disabled to obtain single column output data.

> head(data.df)
                 time      series    value
1 2016-12-01 01:02:43 266.battery 2.668945
2 2016-12-01 01:07:43 266.battery 2.668945
3 2016-12-01 01:12:43 266.battery 2.668945
4 2016-12-01 01:17:43 266.battery 2.668945
5 2016-12-01 01:22:43 266.battery 2.668945
6 2016-12-01 01:27:43 266.battery 2.668945
> tail(data.df)
                     time              series    value
17139 2017-01-01 00:27:02 266.sht-temperature 20.42854
17140 2017-01-01 00:32:02 266.sht-temperature 20.41781
17141 2017-01-01 00:37:02 266.sht-temperature 20.41781
17142 2017-01-01 00:47:02 266.sht-temperature 20.41781
17143 2017-01-01 00:52:02 266.sht-temperature 20.40709
17144 2017-01-01 00:57:02 266.sht-temperature 20.40709

Here, the data received in the December of 2016 is returned. Please be careful with absolute time value in a query because it is always interpreted in UTC, and the output is also in UTC by default. For instance, the output timezone is Europe/Zurich in this example, and therefore the time values are beyond 2016-12-31 23:59:59, which is in the timezone of UTC+01 (winter time).