update vignette build

pull/17/head
noerw 7 years ago
parent 2b8762d52c
commit 8b4ec6295d

@ -3,7 +3,7 @@ This R package ingests data (environmental measurements, sensor stations) from
 the API of opensensemap.org for analysis in R.
 The package aims to be compatible with sf and the tidyverse.
-> **Whats up with that package name?** idk, the R people seem to [enjoy][1]
+> *Whats up with that package name?* idk, the R people seem to [enjoy][1]
 [dropping][2] [vovels][3] so.. Unfortunately I couldn't fit the naming
 convention to drop an `y` in there.
@ -21,7 +21,7 @@ devtools::install_github('noerw/opensensmapr')
 ```
 ## Usage
-A usage example is shown in the vignette [`osem-intro`](vignettes/osem-intro.Rmd).
+A usage example is shown in the vignette [`osem-intro`](https://noerw.github.com/opensensmapR/inst/doc/osem-intro.html).
 In general these are the main functions for data retrieval:
 ```r
@ -43,9 +43,9 @@ m = osem_measurements(bbox, phenomenon, filter1, ...)
 osem_counts()
 ```
-Additionally there are some helpers: `summary.sensebox(), plot.sensebox(), osem_as_sf()...`.
+Additionally there are some helpers: `summary.sensebox(), plot.sensebox(), st_as_sf.sensebox(), [.sensebox(), filter.sensebox(), mutate.sensebox(), ...`.
-For parameter options, open each functions' documentation by calling `?<function-name>`.
+For parameter usage, open each functions' documentation by calling `?<function-name>`.
 ## License
 GPL-2.0 - Norwin Roosen

@ -39,12 +39,13 @@ plot(pm25_sensors)
 library(sf)
 library(units)
 library(lubridate)
+library(dplyr)
 # construct a bounding box: 12 kilometers around Berlin
 berlin = st_point(c(13.4034, 52.5120)) %>%
 st_sfc(crs = 4326) %>%
 st_transform(3857) %>% # allow setting a buffer in meters
-st_buffer(units::set_units(12, km)) %>%
+st_buffer(set_units(12, km)) %>%
 st_transform(4326) %>% # the opensensemap expects WGS 84
 st_bbox()
@ -52,13 +53,21 @@ berlin = st_point(c(13.4034, 52.5120)) %>%
 pm25 = osem_measurements(
 berlin,
 phenomenon = 'PM2.5',
-from = now() - days(7), # defaults to 2 days
+from = now() - days(20), # defaults to 2 days
 to = now()
 )
 plot(pm25)
 ## ------------------------------------------------------------------------
-pm25_sf = osem_as_sf(pm25)
-plot(st_geometry(pm25_sf), axes = T)
+outliers = filter(pm25, value > 100)$sensorId
+bad_sensors = outliers[, drop = T] %>% levels()
+pm25 = mutate(pm25, invalid = sensorId %in% bad_sensors)
+## ------------------------------------------------------------------------
+st_as_sf(pm25) %>% st_geometry() %>% plot(col = factor(pm25$invalid), axes = T)
+## ------------------------------------------------------------------------
+pm25 %>% filter(invalid == FALSE) %>% plot()

@ -26,11 +26,6 @@ Its main goals are to provide means for:
 - big data analysis of the measurements stored on the platform
 - sensor metadata analysis (sensor counts, spatial distribution, temporal trends)
-> *Please note:* The openSenseMap API is sometimes a bit unstable when streaming
-long responses, which results in `curl` complaining about `Unexpected EOF`. This
-bug is being worked on upstream. Meanwhile you have to retry the request when
-this occurs.
 ### Exploring the dataset
 Before we look at actual observations, lets get a grasp of the openSenseMap
 datasets' structure.
@ -45,14 +40,14 @@ all_sensors = osem_boxes()
 summary(all_sensors)
 ```
-This gives a good overview already: As of writing this, there are more than 600
+This gives a good overview already: As of writing this, there are more than 700
 sensor stations, of which ~50% are currently running. Most of them are placed
 outdoors and have around 5 sensors each.
 The oldest station is from May 2014, while the latest station was registered a
 couple of minutes ago.
-Another feature of interest is the spatial distribution of the boxes. `plot()`
-can help us out here. This function requires a bunch of optional dependcies though.
+Another feature of interest is the spatial distribution of the boxes: `plot()`
+can help us out here. This function requires a bunch of optional dependencies though.
 ```{r message=F, warning=F}
 if (!require('maps')) install.packages('maps')
@ -112,12 +107,13 @@ Luckily we can get the measurements filtered by a bounding box:
 library(sf)
 library(units)
 library(lubridate)
+library(dplyr)
 # construct a bounding box: 12 kilometers around Berlin
 berlin = st_point(c(13.4034, 52.5120)) %>%
 st_sfc(crs = 4326) %>%
 st_transform(3857) %>% # allow setting a buffer in meters
-st_buffer(units::set_units(12, km)) %>%
+st_buffer(set_units(12, km)) %>%
 st_transform(4326) %>% # the opensensemap expects WGS 84
 st_bbox()
 ```
@ -125,19 +121,33 @@ berlin = st_point(c(13.4034, 52.5120)) %>%
 pm25 = osem_measurements(
 berlin,
 phenomenon = 'PM2.5',
-from = now() - days(7), # defaults to 2 days
+from = now() - days(20), # defaults to 2 days
 to = now()
 )
 plot(pm25)
 ```
-Now we can get started with actual spatiotemporal data analysis. First plot the
-measuring locations:
+Now we can get started with actual spatiotemporal data analysis.
+First, lets mask the seemingly uncalibrated sensors:
+```{r}
+outliers = filter(pm25, value > 100)$sensorId
+bad_sensors = outliers[, drop = T] %>% levels()
+pm25 = mutate(pm25, invalid = sensorId %in% bad_sensors)
+```
+Then plot the measuring locations, flagging the outliers:
+```{r}
+st_as_sf(pm25) %>% st_geometry() %>% plot(col = factor(pm25$invalid), axes = T)
+```
+Removing these sensors yields a nicer time series plot:
 ```{r}
-pm25_sf = osem_as_sf(pm25)
-plot(st_geometry(pm25_sf), axes = T)
+pm25 %>% filter(invalid == FALSE) %>% plot()
 ```
-further analysis: `TODO`
+Further analysis: comparison with LANUV data `TODO`

File diff suppressed because one or more lines are too long