Retrieving data in R¶
We provide an R function, query, that fetches time series data from InfluxDB server and returns data.frame object.
Setting up¶
The query function uses httr, an HTTP client library for R. Please install its ssl dependency beforehand. On Debian-based systems:
$ sudo apt-get install libssl-dev
Assuming that you’ve already installed r-base and RStudio (external instructions), you need to install httr. Please execute the following inside your RStudio environment to install httr:
> install.packages("httr")
We need two additional packages for the demonstration, devtools to fetch the query source code (api-v1.3.r) from our gist repository and optional data.table to cast the data into columns. Of course, you can skip this step, and download the source code into your environment.
> install.packages("devtools")
> install.packages("data.table")
Provided function¶
query(…)
Download¶
to download the source of api-v1.3.r, please use the
Raw
orDownload ZIP
button on the GitHubGist page
Description¶
fetches time series data with matching parameters from the database server and converts into data.frame object
Usage¶
query(domain, apiKey, timeFilter = "", device = "//", location = "//", sensor = "//", includeNetworkSensors = FALSE, channel = "//", aggFunc = "", aggInterval = "", doCast = TRUE, castAggFunc = mean, convertTimestamp = TRUE, timezone = "UTC")
Arguments¶
- domain
The domain part of the URL
- apiKey
Secret key for authenticating with the server
- timeFilter
Time filter expression written in InfluxQL, the default value
""
ignores time- device
InfluxQL regular expression to select devices, the default value
"//"
matches all devices- location
InfluxQL regular expression to select locations, the default value
"//"
matches all locations- sensor
InfluxQL regular expression to select sensors, the default value
"//"
matches all sensors- includeNetworkSensors
TRUE
allows returning network-related data
FALSE
(default)filters out all network-related data
- channel
InfluxQL regular expression to select channels, the default value
"//"
matches all channels- aggFunc
InfluxQL aggregate function evaluated on the selected data, no aggregate function is applied by default
- aggInterval
InfluxQL time interval value, used with aggFunc
- doCast
TRUE
(default)return time series into individual columns,
FALSE
return all time series in a single column
- castAggFunc
Apply aggregate function when casting data, by default
mean
function is applied- convertTimestamp
TRUE
(default)convert timestamps into date time objects,
FALSE
return timestamps in unix time in milliseconds
- timezone
Timezone name, used with convertTimestamp
Value¶
data.frame object containing time series data
Demonstration script¶
The following R script demonstrates the main functionality of query function with our demo.decentlab.com server. In this script, we use the previously installed devtools to source query from our gist repository and enable casting so that the output data can be column oriented. Please note that the casting requires the data.table package.
library(devtools)
devtools::source_gist('79f52bef3778e0dbbd5dc58437621d88',
filename = 'api-v1.3.r')
Please remove the filename parameter if above command fails.
data.df <- query(
domain = "demo.decentlab.com",
apiKey = "eyJrIjoiclhMRFFvUXFzQXpKVkZydm52b0VMRVg3M3U2b3VqQUciLCJuIjoiZGF0YS1xdWVyeS1hcGktZGVtby0yIiwiaWQiOjF9",
device = "/26[67]/",
sensor = "/(temperature)|(humidity)/",
aggFunc = "mean",
aggInterval = "7m",
timeFilter = "time > now() - 1d",
)
The above example will retrieve the data from demo.decentlab.com server, matching the following criteria:
devices with ID containing
266
or267
(will match also 2671, 12670, 4266, etc)sensor names containing either
temperature
orhumidity
compute the mean of groups divided by 7 min interval
received in the last one day
The value of data.df:
> head(data.df)
time 266.sht-humidity 266.sht-temperature 267.sensirion-sht21-humidity 267.sensirion-sht21-temperature
1 2017-02-02 15:28:00 NA NA NA NA
2 2017-02-02 15:35:00 33.86359 21.42597 32.65051 21.56540
3 2017-02-02 15:42:00 33.73389 21.45815 NA NA
4 2017-02-02 15:49:00 33.65759 21.52250 32.67340 21.53322
5 2017-02-02 15:56:00 33.75677 21.57612 NA NA
6 2017-02-02 16:03:00 33.68811 21.60830 NA NA
Note that the head function shows only the first six rows. Another example demonstrates different possibilities.
data.df <- query(
domain = "demo.decentlab.com",
apiKey = "eyJrIjoiclhMRFFvUXFzQXpKVkZydm52b0VMRVg3M3U2b3VqQUciLCJuIjoiZGF0YS1xdWVyeS1hcGktZGVtby0yIiwiaWQiOjF9",
device = "/^266$/",
sensor = "/^sht-/",
timeFilter = "time >= '2016-12-01 00:00:00' AND time < '2017-01-01 00:00:00'",
timezone = "Europe/Zurich",
doCast = FALSE
)
In this example, we obtain data received from only 266. ^
and $
indicates the beginning and ending of regular expression so that no other ID containing 266
, such as 1266
, 2660
, or _266a
, matches the expression. In addition, all sensors starting with sht-
are selected. The casting was disabled to obtain single column output data.
> head(data.df)
time series value
1 2016-12-01 01:02:43 266.battery 2.668945
2 2016-12-01 01:07:43 266.battery 2.668945
3 2016-12-01 01:12:43 266.battery 2.668945
4 2016-12-01 01:17:43 266.battery 2.668945
5 2016-12-01 01:22:43 266.battery 2.668945
6 2016-12-01 01:27:43 266.battery 2.668945
> tail(data.df)
time series value
17139 2017-01-01 00:27:02 266.sht-temperature 20.42854
17140 2017-01-01 00:32:02 266.sht-temperature 20.41781
17141 2017-01-01 00:37:02 266.sht-temperature 20.41781
17142 2017-01-01 00:47:02 266.sht-temperature 20.41781
17143 2017-01-01 00:52:02 266.sht-temperature 20.40709
17144 2017-01-01 00:57:02 266.sht-temperature 20.40709
Here, the data received in the December of 2016 is returned. Please be careful with absolute time value in a query because it is always interpreted in UTC, and the output is also in UTC by default. For instance, the output timezone is Europe/Zurich in this example, and therefore the time values are beyond 2016-12-31 23:59:59, which is in the timezone of UTC+01 (winter time).