<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta charset="utf-8" />
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<meta name="generator" content="pandoc" />
<meta name="viewport" content="width=device-width, initial-scale=1">
<meta name="author" content="Norwin Roosen" />
<meta name="date" content="2017-08-23" />
<title>Analyzing environmental sensor data from openSenseMap.org in R</title>
<style type="text/css">code{white-space: pre;}</style>
<style type="text/css">
div.sourceCode { overflow-x: auto; }
table.sourceCode, tr.sourceCode, td.lineNumbers, td.sourceCode {
margin: 0; padding: 0; vertical-align: baseline; border: none; }
table.sourceCode { width: 100%; line-height: 100%; }
td.lineNumbers { text-align: right; padding-right: 4px; padding-left: 4px; color: #aaaaaa; border-right: 1px solid #aaaaaa; }
td.sourceCode { padding-left: 5px; }
code > span.kw { color: #007020; font-weight: bold; } /* Keyword */
code > span.dt { color: #902000; } /* DataType */
code > span.dv { color: #40a070; } /* DecVal */
code > span.bn { color: #40a070; } /* BaseN */
code > span.fl { color: #40a070; } /* Float */
code > span.ch { color: #4070a0; } /* Char */
code > span.st { color: #4070a0; } /* String */
code > span.co { color: #60a0b0; font-style: italic; } /* Comment */
code > span.ot { color: #007020; } /* Other */
code > span.al { color: #ff0000; font-weight: bold; } /* Alert */
code > span.fu { color: #06287e; } /* Function */
code > span.er { color: #ff0000; font-weight: bold; } /* Error */
code > span.wa { color: #60a0b0; font-weight: bold; font-style: italic; } /* Warning */
code > span.cn { color: #880000; } /* Constant */
code > span.sc { color: #4070a0; } /* SpecialChar */
code > span.vs { color: #4070a0; } /* VerbatimString */
code > span.ss { color: #bb6688; } /* SpecialString */
code > span.im { } /* Import */
code > span.va { color: #19177c; } /* Variable */
code > span.cf { color: #007020; font-weight: bold; } /* ControlFlow */
code > span.op { color: #666666; } /* Operator */
code > span.bu { } /* BuiltIn */
code > span.ex { } /* Extension */
code > span.pp { color: #bc7a00; } /* Preprocessor */
code > span.at { color: #7d9029; } /* Attribute */
code > span.do { color: #ba2121; font-style: italic; } /* Documentation */
code > span.an { color: #60a0b0; font-weight: bold; font-style: italic; } /* Annotation */
code > span.cv { color: #60a0b0; font-weight: bold; font-style: italic; } /* CommentVar */
code > span.in { color: #60a0b0; font-weight: bold; font-style: italic; } /* Information */
</style>
</head>
<body>
<h1 class="title toc-ignore">Analyzing environmental sensor data from openSenseMap.org in R</h1>
<h4 class="author"><em>Norwin Roosen</em></h4>
<h4 class="date"><em>2017-08-23</em></h4>
<div id="analyzing-environmental-sensor-data-from-opensensemap.org-in-r" class="section level2">
<h2>Analyzing environmental sensor data from openSenseMap.org in R</h2>
<p>This package provides data ingestion functions for almost any data stored on <a href="https://opensensemap.org" class="uri">https://opensensemap.org</a>, the open data platform for environmental sensor data. Its main goals are to provide means for:</p>
<ul>
<li>big data analysis of the measurements stored on the platform</li>
<li>sensor metadata analysis (sensor counts, spatial distribution, temporal trends)</li>
</ul>
<blockquote>
<p><em>Please note:</em> The openSenseMap API is sometimes a bit unstable when streaming long responses, which results in <code>curl</code> complaining about <code>Unexpected EOF</code>. This bug is being worked on upstream. Meanwhile you have to retry the request when this occurs.</p>
</blockquote>
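<p>A minimal sketch of such a retry in base R. The helper <code>with_retry()</code> and its arguments are hypothetical and not part of the package; it simply re-runs a function whenever that function signals an error:</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r"># hypothetical helper (not part of opensensmapr): call `fn` until it succeeds
# or `attempts` tries are used up, sleeping `wait` seconds between tries
with_retry = function(fn, attempts = 3, wait = 1) {
  for (i in seq_len(attempts)) {
    result = tryCatch(fn(), error = function(e) e)
    if (!inherits(result, 'error')) return(result)
    message('attempt ', i, ' failed: ', conditionMessage(result))
    if (i != attempts) Sys.sleep(wait)
  }
  stop(result) # re-signal the last error
}

# usage, assuming the failed request surfaces as an R error:
# all_sensors = with_retry(function() osem_boxes())</code></pre></div>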
<div id="exploring-the-dataset" class="section level3">
<h3>Exploring the dataset</h3>
<p>Before we look at actual observations, let's get a grasp of the structure of the openSenseMap dataset.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r"><span class="kw">library</span>(magrittr)
<span class="kw">library</span>(opensensmapr)
all_sensors =<span class="st"> </span><span class="kw">osem_boxes</span>()</code></pre></div>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r"><span class="kw">summary</span>(all_sensors)</code></pre></div>
<pre><code>## box total: 701
##
## boxes by exposure:
##  indoor outdoor unknown
##     127     553      21
##
## boxes by model:
##                  custom            homeEthernet   homeEthernetFeinstaub
##                     209                      78                       8
##                homeWifi       homeWifiFeinstaub        luftdaten_sds011
##                     106                      34                      22
## luftdaten_sds011_bme280 luftdaten_sds011_bmp180  luftdaten_sds011_dht11
##                      40                       3                      14
##  luftdaten_sds011_dht22
##                     187
##
## $last_measurement_within
##    1h    1d   30d  365d never
##   312   327   418   554    63
##
## oldest box: 2014-05-28 15:36:14 (CALIMERO)
## newest box: 2017-08-23 08:44:14 (Messstation Steinheim am Albuch)
##
## sensors per box:
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
##   1.000   4.000   5.000   4.609   5.000  17.000</code></pre>
<p>This gives a good overview already: At the time of writing, there are about 700 sensor stations, of which a little under half are currently running (they submitted a measurement within the last day). Most of them are placed outdoors and have around 5 sensors each. The oldest station is from May 2014, while the latest station was registered a couple of minutes ago.</p>
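<p>These shares can be recomputed directly from the summary. The sketch below copies the counts from the output above into plain vectors (the variable names are chosen for illustration only) and converts them to percentages:</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r"># counts copied by hand from the summary output above
box_total = 701
last_measurement_within = c('1h' = 312, '1d' = 327, '30d' = 418, '365d' = 554)
exposure = c(indoor = 127, outdoor = 553, unknown = 21)

# share of boxes (in percent) that reported within each time window
round(100 * last_measurement_within / box_total)

# share of boxes (in percent) per exposure type
round(100 * exposure / box_total)</code></pre></div>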
<p>Another feature of interest is the spatial distribution of the boxes. <code>plot()</code> can help us out here. This function requires a bunch of optional dependencies though.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r">if (!<span class="kw">require</span>(<span class="st">'maps'</span>)) <span class="kw">install.packages</span>(<span class="st">'maps'</span>)
if (!<span class="kw">require</span>(<span class="st">'maptools'</span>)) <span class="kw">install.packages</span>(<span class="st">'maptools'</span>)
if (!<span class="kw">require</span>(<span class="st">'rgeos'</span>)) <span class="kw">install.packages</span>(<span class="st">'rgeos'</span>)
<span class="kw">plot</span>(all_sensors)</code></pre></div>
<p><em>[Figure: output of <code>plot(all_sensors)</code>, a map of the sensor station locations]</em></p>
<p>It seems we have to reduce our area of interest to Germany.</p>
<p>But what do these sensor stations actually measure? Let's find out. <code>osem_phenomena()</code> gives us a named list of the counts of each observed phenomenon for the given set of sensor stations:</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r">phenoms =<span class="st"> </span><span class="kw">osem_phenomena</span>(all_sensors)
<span class="kw">str</span>(phenoms)</code></pre></div>
<pre><code>## List of 191
## $ Temperatur : int 644
## $ rel. Luftfeuchte : int 531
## $ Luftdruck : int 367
## $ PM10 : int 344
## $ PM2.5 : int 344
## $ UV-Intensität : int 255
## $ Beleuchtungsstärke : int 251
## $ Luftfeuchtigkeit : int 83
## $ Schall : int 26
## $ Helligkeit : int 20
## $ Licht : int 20
## $ UV : int 15
## $ Humidity : int 12
## $ Temperature : int 11
## $ Anderer : int 10
## $ Ilmanpaine : int 9
## $ Lämpötila : int 9
## $ Licht (digital) : int 9
## $ Valonmäärä : int 8
## $ Windgeschwindigkeit : int 8
## $ Kosteus : int 7
## $ Luftfeuchte : int 7
## $ Lautstärke : int 6
## $ Signal : int 6
## $ UV-säteily : int 6
## $ Wind speed : int 5
## $ Pressure : int 4
## $ temperature : int 4
## $ Windrichtung : int 4
## $ DS18B20_Probe01 : int 3
## $ DS18B20_Probe02 : int 3
## $ DS18B20_Probe03 : int 3
## $ DS18B20_Probe04 : int 3
## $ DS18B20_Probe05 : int 3
## $ Light : int 3
## $ Niederschlag : int 3
## $ UV Index : int 3
## $ UV-Säteily : int 3
## $ UV-Strahlung : int 3
## $ C2H5OH : int 2
## $ CO : int 2
## $ CPU-Temp : int 2
## $ Feinstaub : int 2
## $ Feinstaub PM10 : int 2
## $ Feinstaub PM2,5 : int 2
## $ H2 : int 2
## $ humidity : int 2
## $ Ilmankosteus : int 2
## $ NH3 : int 2
## $ NO2 : int 2
## $ Regen : int 2
## $ rel. Luftfeuchtigkeit : int 2
## $ Relative Humidity : int 2
## $ Temperatur BMP280 : int 2
## $ Temperatur DHT22 : int 2
## $ Temperatur HDC1008 : int 2
## $ TemperaturBME : int 2
## $ test : int 2
## $ UV-Index : int 2
## $ Wassertemperatur : int 2
## $ Wifi-Stärke : int 2
## $ Windböen : int 2
## $ Wolkenbedeckung : int 2
## $ Air Preassure : int 1
## $ Air pressure : int 1
## $ Air Temperature : int 1
## $ Akkuspannung Terrasse : int 1
## $ Akkuspannung Unten Eingang : int 1
## $ Attendance : int 1
## $ Batterie : int 1
## $ Batterieladung : int 1
## $ Battery : int 1
## $ Beleuchtungsstaerke : int 1
## $ Beleuchtungsstärke des sichtbaren Lichts: int 1
## $ Bodenfeuchte : int 1
## $ Bodentemperatur : int 1
## $ C3H8 : int 1
## $ C4H10 : int 1
## $ CH4 : int 1
## $ CO2 : int 1
## $ CO2-Konzentration : int 1
## $ Dämmerung : int 1
## $ dT : int 1
## $ Dust Sensor : int 1
## $ Dust_Concentration : int 1
## $ Eingangsspannung : int 1
## $ Feinstaub P10 : int 1
## $ Feinstaub P2.5 : int 1
## $ Feinstaubgehalt PM10 : int 1
## $ Feinstaubgehalt PM2.5 : int 1
## $ Feinstaubkonzentration : int 1
## $ Feuchte : int 1
## $ Feuchtigkeit : int 1
## $ filedData : int 1
## $ H2, LPG, CH4, CO, Alcohol : int 1
## $ Höhe : int 1
## $ Illuminance : int 1
## $ Infrared light : int 1
## $ Intensität der ultravioletten Strahlung : int 1
## [list output truncated]</code></pre>
<p>That's quite some noise there, with many phenomena being measured by a single sensor only, and many duplicated phenomena due to slightly different spellings. We should clean that up, but for now let's just filter out the noise and find those phenomena with high sensor numbers:</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r">phenoms[phenoms &gt;<span class="st"> </span><span class="dv">20</span>]</code></pre></div>
<pre><code>## $Temperatur
## [1] 644
##
## $`rel. Luftfeuchte`
## [1] 531
##
## $Luftdruck
## [1] 367
##
## $PM10
## [1] 344
##
## $PM2.5
## [1] 344
##
## $`UV-Intensität`
## [1] 255
##
## $Beleuchtungsstärke
## [1] 251
##
## $Luftfeuchtigkeit
## [1] 83
##
## $Schall
## [1] 26</code></pre>
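<p>As an aside, the clean-up of spelling variants mentioned above could be sketched as follows. The lookup table <code>canonical</code> is hypothetical and hand-written for this example, and treating the listed names as the same phenomenon is an assumption; the counts are copied from the output above:</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r"># a few counts copied from the output of str(phenoms) above
phenoms_toy = list(
  Temperatur = 644, Temperature = 11, temperature = 4,
  'rel. Luftfeuchte' = 531, Luftfeuchtigkeit = 83, Humidity = 12
)

# hypothetical lookup table: spelling variant -> canonical name
canonical = c(
  Temperatur = 'temperature', Temperature = 'temperature',
  temperature = 'temperature', 'rel. Luftfeuchte' = 'humidity',
  Luftfeuchtigkeit = 'humidity', Humidity = 'humidity'
)

# sum the counts of all variants that map to the same canonical name
merged = tapply(unlist(phenoms_toy), canonical[names(phenoms_toy)], sum)
merged</code></pre></div>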
<p>Alright, temperature it is! Fine particulate matter (PM2.5) seems to be more interesting to analyze though. We should check how many sensor stations provide useful data: We only want boxes that have a PM2.5 sensor, are placed outdoors, and are currently submitting measurements:</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r">pm25_sensors =<span class="st"> </span><span class="kw">osem_boxes</span>(
<span class="dt">exposure =</span> <span class="st">'outdoor'</span>,
<span class="dt">date =</span> <span class="kw">Sys.time</span>(), <span class="co"># ±4 hours</span>
<span class="dt">phenomenon =</span> <span class="st">'PM2.5'</span>
)</code></pre></div>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r"><span class="kw">summary</span>(pm25_sensors)</code></pre></div>
<pre><code>## box total: 236
##
## boxes by exposure:
## outdoor
##     236
##
## boxes by model:
##                  custom   homeEthernetFeinstaub                homeWifi
##                      18                       4                       5
##       homeWifiFeinstaub        luftdaten_sds011 luftdaten_sds011_bme280
##                      12                      15                      29
## luftdaten_sds011_bmp180  luftdaten_sds011_dht11  luftdaten_sds011_dht22
##                       1                      11                     141
##
## $last_measurement_within
##    1h    1d   30d  365d never
##   230   233   234   234     2
##
## oldest box: 2016-09-11 08:17:17 (Balkon Gasselstiege)
## newest box: 2017-08-23 08:44:14 (Messstation Steinheim am Albuch)
##
## sensors per box:
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
##   2.000   4.000   4.000   4.271   4.000  10.000</code></pre>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r"><span class="kw">plot</span>(pm25_sensors)</code></pre></div>
<p><em>[Figure: output of <code>plot(pm25_sensors)</code>]</em></p>
<p>That's still more than 200 measuring stations; we can work with that.</p>
</div>
<div id="analyzing-sensor-data" class="section level3">
<h3>Analyzing sensor data</h3>
<p>Having analyzed the available data sources, let's finally get some measurements. We could call <code>osem_measurements(pm25_sensors)</code> now, but we are focusing on a restricted area of interest, the city of Berlin. Luckily, we can get the measurements filtered by a bounding box:</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r"><span class="kw">library</span>(sf)
<span class="kw">library</span>(units)
<span class="kw">library</span>(lubridate)
<span class="co"># construct a bounding box: buffer of 12 km (in EPSG:3857 units) around central Berlin</span>
berlin =<span class="st"> </span><span class="kw">st_point</span>(<span class="kw">c</span>(<span class="fl">13.4034</span>, <span class="fl">52.5120</span>)) %&gt;%
<span class="st"> </span><span class="kw">st_sfc</span>(<span class="dt">crs =</span> <span class="dv">4326</span>) %&gt;%
<span class="st"> </span><span class="kw">st_transform</span>(<span class="dv">3857</span>) %&gt;%<span class="st"> </span><span class="co"># allow setting a buffer in meters</span>
<span class="st"> </span><span class="kw">st_buffer</span>(units::<span class="kw">set_units</span>(<span class="dv">12</span>, km)) %&gt;%
<span class="st"> </span><span class="kw">st_transform</span>(<span class="dv">4326</span>) %&gt;%<span class="st"> </span><span class="co"># the openSenseMap API expects WGS 84</span>
<span class="st"> </span><span class="kw">st_bbox</span>()</code></pre></div>
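<p>A rough back-of-the-envelope check on the buffer size: EPSG:3857 (Web Mercator) stretches distances by a factor of 1 / cos(latitude), so a buffer of 12 km measured in that projection covers less than 12 km on the ground at Berlin's latitude. The sketch below uses only that scale factor; the variable names are chosen for illustration. A projection local to the area (for Berlin, for example EPSG:25833, ETRS89 / UTM zone 33N) would avoid the distortion.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r">lat = 52.5120                     # latitude of the Berlin point used above
scale = 1 / cos(lat * pi / 180)   # Web Mercator distance inflation at this latitude
buffer_mercator_km = 12           # buffer distance as set in EPSG:3857 units
buffer_ground_km = buffer_mercator_km / scale
round(buffer_ground_km, 1)        # approximate ground distance in km, roughly 7.3</code></pre></div>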
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r">pm25 =<span class="st"> </span><span class="kw">osem_measurements</span>(
berlin,
<span class="dt">phenomenon =</span> <span class="st">'PM2.5'</span>,
<span class="dt">from =</span> <span class="kw">now</span>() -<span class="st"> </span><span class="kw">days</span>(<span class="dv">7</span>), <span class="co"># defaults to 2 days</span>
<span class="dt">to =</span> <span class="kw">now</span>()
)
<span class="kw">plot</span>(pm25)</code></pre></div>
<p><em>[Figure: output of <code>plot(pm25)</code>]</em></p>
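<p>The time window passed as <code>from</code> and <code>to</code> is plain date arithmetic. As a sketch, the same window can be built in base R without <code>lubridate</code>. Note that <code>as.difftime(7, units = 'days')</code> is exactly 7 × 24 hours, whereas <code>days(7)</code> is a calendar period, so the two can differ by an hour across a daylight saving time change:</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r"># base R counterpart of now() - days(7) and now()
to = Sys.time()
from = to - as.difftime(7, units = 'days')

# length of the window, expressed in days
difftime(to, from, units = 'days')</code></pre></div>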
<p>Now we can get started with actual spatiotemporal data analysis. First plot the measuring locations:</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r">pm25_sf =<span class="st"> </span><span class="kw">osem_as_sf</span>(pm25)
<span class="kw">plot</span>(<span class="kw">st_geometry</span>(pm25_sf), <span class="dt">axes =</span> <span class="ot">TRUE</span>)</code></pre></div>
<p><em>[Figure: plot of the measuring locations, <code>st_geometry(pm25_sf)</code>]</em></p>
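<p>To illustrate what a conversion to <code>sf</code> amounts to, here is a minimal sketch on a made-up data frame. The values and the column names <code>value</code>, <code>lon</code> and <code>lat</code> are assumptions for this example and do not describe the actual structure returned by <code>osem_measurements()</code>:</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r">library(sf)

# made-up toy measurements; column names are assumptions
toy = data.frame(
  value = c(12.3, 8.1),
  lon = c(13.40, 13.45),
  lat = c(52.51, 52.53)
)

# build point geometries from the coordinate columns, in WGS 84 (EPSG:4326)
toy_sf = st_as_sf(toy, coords = c('lon', 'lat'), crs = 4326)

st_bbox(toy_sf) # spatial extent of the points</code></pre></div>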
<p>further analysis: <code>TODO</code></p>
</div>
</div>
<!-- dynamically load mathjax for compatibility with self-contained -->
<script>
(function () {
var script = document.createElement("script");
script.type = "text/javascript";
script.src = "https://mathjax.rstudio.com/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML";
document.getElementsByTagName("head")[0].appendChild(script);
})();
</script>
</body>
</html>