update vignette build

pull/17/head
noerw 7 years ago
parent 2b8762d52c
commit 8b4ec6295d

@ -3,7 +3,7 @@ This R package ingests data (environmental measurements, sensor stations) from
the API of opensensemap.org for analysis in R.
The package aims to be compatible with sf and the tidyverse.
> **What's up with that package name?** idk, the R people seem to [enjoy][1]
> *What's up with that package name?* idk, the R people seem to [enjoy][1]
[dropping][2] [vowels][3] so.. Unfortunately I couldn't fit the naming
convention to drop a `y` in there.
@ -21,7 +21,7 @@ devtools::install_github('noerw/opensensmapr')
```
## Usage
A usage example is shown in the vignette [`osem-intro`](vignettes/osem-intro.Rmd).
A usage example is shown in the vignette [`osem-intro`](https://noerw.github.com/opensensmapR/inst/doc/osem-intro.html).
In general these are the main functions for data retrieval:
```r
@ -43,9 +43,9 @@ m = osem_measurements(bbox, phenomenon, filter1, ...)
osem_counts()
```
Additionally there are some helpers: `summary.sensebox(), plot.sensebox(), osem_as_sf()...`.
Additionally there are some helpers: `summary.sensebox(), plot.sensebox(), st_as_sf.sensebox(), [.sensebox(), filter.sensebox(), mutate.sensebox(), ...`.
For parameter options, open each function's documentation by calling `?<function-name>`.
For parameter usage, open each function's documentation by calling `?<function-name>`.
## License
GPL-2.0 - Norwin Roosen

@ -39,12 +39,13 @@ plot(pm25_sensors)
library(sf)
library(units)
library(lubridate)
library(dplyr)
# construct a bounding box: 12 kilometers around Berlin
berlin = st_point(c(13.4034, 52.5120)) %>%
st_sfc(crs = 4326) %>%
st_transform(3857) %>% # allow setting a buffer in meters
st_buffer(units::set_units(12, km)) %>%
st_buffer(set_units(12, km)) %>%
st_transform(4326) %>% # the opensensemap expects WGS 84
st_bbox()
@ -52,13 +53,21 @@ berlin = st_point(c(13.4034, 52.5120)) %>%
pm25 = osem_measurements(
berlin,
phenomenon = 'PM2.5',
from = now() - days(7), # defaults to 2 days
from = now() - days(20), # defaults to 2 days
to = now()
)
plot(pm25)
## ------------------------------------------------------------------------
pm25_sf = osem_as_sf(pm25)
plot(st_geometry(pm25_sf), axes = T)
outliers = filter(pm25, value > 100)$sensorId
bad_sensors = outliers[, drop = T] %>% levels()
pm25 = mutate(pm25, invalid = sensorId %in% bad_sensors)
## ------------------------------------------------------------------------
st_as_sf(pm25) %>% st_geometry() %>% plot(col = factor(pm25$invalid), axes = T)
## ------------------------------------------------------------------------
pm25 %>% filter(invalid == FALSE) %>% plot()

@ -26,11 +26,6 @@ Its main goals are to provide means for:
- big data analysis of the measurements stored on the platform
- sensor metadata analysis (sensor counts, spatial distribution, temporal trends)
> *Please note:* The openSenseMap API is sometimes a bit unstable when streaming
long responses, which results in `curl` complaining about `Unexpected EOF`. This
bug is being worked on upstream. Meanwhile you have to retry the request when
this occurs.
### Exploring the dataset
Before we look at actual observations, let's get a grasp of the openSenseMap
dataset's structure.
@ -45,14 +40,14 @@ all_sensors = osem_boxes()
summary(all_sensors)
```
This gives a good overview already: As of writing this, there are more than 600
This gives a good overview already: As of writing this, there are more than 700
sensor stations, of which ~50% are currently running. Most of them are placed
outdoors and have around 5 sensors each.
The oldest station is from May 2014, while the latest station was registered a
couple of minutes ago.
Another feature of interest is the spatial distribution of the boxes. `plot()`
can help us out here. This function requires a bunch of optional dependcies though.
Another feature of interest is the spatial distribution of the boxes: `plot()`
can help us out here. This function requires a bunch of optional dependencies though.
```{r message=F, warning=F}
if (!require('maps')) install.packages('maps')
@ -112,12 +107,13 @@ Luckily we can get the measurements filtered by a bounding box:
library(sf)
library(units)
library(lubridate)
library(dplyr)
# construct a bounding box: 12 kilometers around Berlin
berlin = st_point(c(13.4034, 52.5120)) %>%
st_sfc(crs = 4326) %>%
st_transform(3857) %>% # allow setting a buffer in meters
st_buffer(units::set_units(12, km)) %>%
st_buffer(set_units(12, km)) %>%
st_transform(4326) %>% # the opensensemap expects WGS 84
st_bbox()
```
@ -125,19 +121,33 @@ berlin = st_point(c(13.4034, 52.5120)) %>%
pm25 = osem_measurements(
berlin,
phenomenon = 'PM2.5',
from = now() - days(7), # defaults to 2 days
from = now() - days(20), # defaults to 2 days
to = now()
)
plot(pm25)
```
Now we can get started with actual spatiotemporal data analysis. First plot the
measuring locations:
Now we can get started with actual spatiotemporal data analysis.
First, let's mask the seemingly uncalibrated sensors:
```{r}
outliers = filter(pm25, value > 100)$sensorId
bad_sensors = outliers[, drop = T] %>% levels()
pm25 = mutate(pm25, invalid = sensorId %in% bad_sensors)
```
Then plot the measuring locations, flagging the outliers:
```{r}
st_as_sf(pm25) %>% st_geometry() %>% plot(col = factor(pm25$invalid), axes = T)
```
Removing these sensors yields a nicer time series plot:
```{r}
pm25_sf = osem_as_sf(pm25)
plot(st_geometry(pm25_sf), axes = T)
pm25 %>% filter(invalid == FALSE) %>% plot()
```
further analysis: `TODO`
Further analysis: comparison with LANUV data `TODO`
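
A note on the masking step added in this hunk: it appears to flag every measurement of a sensor as invalid as soon as that sensor reported a single value above 100, rather than flagging only the outlying rows. The following is a minimal sketch of that logic in base R, so it runs without dplyr or an API request. The `pm25` data frame here is hypothetical toy data that only mimics the `sensorId` and `value` columns used above; it is not output of `osem_measurements()`.

```r
# hypothetical stand-in for the measurements (assumption: sensorId is a factor)
pm25 = data.frame(
  sensorId = factor(c('a', 'a', 'b', 'b', 'c')),
  value = c(12, 15, 250, 30, 8)
)

# ids of sensors that reported at least one value above the threshold
bad_sensors = unique(as.character(pm25$sensorId[pm25$value > 100]))

# flag all rows of those sensors, including their plausible values
pm25$invalid = pm25$sensorId %in% bad_sensors

# keep only measurements from sensors that never exceeded the threshold
pm25_valid = pm25[!pm25$invalid, ]
```

In this toy data, sensor `b` is masked entirely, although only one of its two values exceeds 100. Whether masking per sensor or per measurement is the better choice depends on how much the remaining values of such a sensor can be trusted.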

File diff suppressed because one or more lines are too long