2018-05-25 17:17:29 +02:00
<!DOCTYPE html>
2018-05-26 12:52:02 +02:00
< html xmlns = "http://www.w3.org/1999/xhtml" >
2018-05-25 17:17:29 +02:00
2018-05-26 12:52:02 +02:00
< head >
2018-05-25 17:17:29 +02:00
2018-05-26 12:52:02 +02:00
< meta charset = "utf-8" / >
< meta http-equiv = "Content-Type" content = "text/html; charset=utf-8" / >
< meta name = "generator" content = "pandoc" / >
2018-05-25 17:17:29 +02:00
2018-05-26 12:52:02 +02:00
< meta name = "viewport" content = "width=device-width, initial-scale=1" >
2018-05-25 17:17:29 +02:00
2018-05-26 12:52:02 +02:00
< meta name = "author" content = "Norwin Roosen" / >
2018-05-25 17:17:29 +02:00
2018-06-07 00:13:15 +02:00
< meta name = "date" content = "2018-06-07" / >
2018-05-25 17:17:29 +02:00
2018-05-26 12:52:02 +02:00
< title > Caching openSenseMap Data for Reproducibility< / title >
2018-05-25 17:17:29 +02:00
2018-05-26 12:52:02 +02:00
< style type = "text/css" > code { white-space : pre ; } < / style >
< style type = "text/css" >
div.sourceCode { overflow-x: auto; }
table.sourceCode, tr.sourceCode, td.lineNumbers, td.sourceCode {
margin: 0; padding: 0; vertical-align: baseline; border: none; }
table.sourceCode { width: 100%; line-height: 100%; }
td.lineNumbers { text-align: right; padding-right: 4px; padding-left: 4px; color: #aaaaaa; border-right: 1px solid #aaaaaa; }
td.sourceCode { padding-left: 5px; }
code > span.kw { color: #007020; font-weight: bold; } /* Keyword */
code > span.dt { color: #902000; } /* DataType */
code > span.dv { color: #40a070; } /* DecVal */
code > span.bn { color: #40a070; } /* BaseN */
code > span.fl { color: #40a070; } /* Float */
code > span.ch { color: #4070a0; } /* Char */
code > span.st { color: #4070a0; } /* String */
code > span.co { color: #60a0b0; font-style: italic; } /* Comment */
code > span.ot { color: #007020; } /* Other */
code > span.al { color: #ff0000; font-weight: bold; } /* Alert */
code > span.fu { color: #06287e; } /* Function */
code > span.er { color: #ff0000; font-weight: bold; } /* Error */
code > span.wa { color: #60a0b0; font-weight: bold; font-style: italic; } /* Warning */
code > span.cn { color: #880000; } /* Constant */
code > span.sc { color: #4070a0; } /* SpecialChar */
code > span.vs { color: #4070a0; } /* VerbatimString */
code > span.ss { color: #bb6688; } /* SpecialString */
code > span.im { } /* Import */
code > span.va { color: #19177c; } /* Variable */
code > span.cf { color: #007020; font-weight: bold; } /* ControlFlow */
code > span.op { color: #666666; } /* Operator */
code > span.bu { } /* BuiltIn */
code > span.ex { } /* Extension */
code > span.pp { color: #bc7a00; } /* Preprocessor */
code > span.at { color: #7d9029; } /* Attribute */
code > span.do { color: #ba2121; font-style: italic; } /* Documentation */
code > span.an { color: #60a0b0; font-weight: bold; font-style: italic; } /* Annotation */
code > span.cv { color: #60a0b0; font-weight: bold; font-style: italic; } /* CommentVar */
code > span.in { color: #60a0b0; font-weight: bold; font-style: italic; } /* Information */
< / style >
2018-05-25 17:17:29 +02:00
2018-05-26 12:52:02 +02:00
< link href = "data:text/css;charset=utf-8,body%20%7B%0Abackground%2Dcolor%3A%20%23fff%3B%0Amargin%3A%201em%20auto%3B%0Amax%2Dwidth%3A%20700px%3B%0Aoverflow%3A%20visible%3B%0Apadding%2Dleft%3A%202em%3B%0Apadding%2Dright%3A%202em%3B%0Afont%2Dfamily%3A%20%22Open%20Sans%22%2C%20%22Helvetica%20Neue%22%2C%20Helvetica%2C%20Arial%2C%20sans%2Dserif%3B%0Afont%2Dsize%3A%2014px%3B%0Aline%2Dheight%3A%201%2E35%3B%0A%7D%0A%23header%20%7B%0Atext%2Dalign%3A%20center%3B%0A%7D%0A%23TOC%20%7B%0Aclear%3A%20both%3B%0Amargin%3A%200%200%2010px%2010px%3B%0Apadding%3A%204px%3B%0Awidth%3A%20400px%3B%0Aborder%3A%201px%20solid%20%23CCCCCC%3B%0Aborder%2Dradius%3A%205px%3B%0Abackground%2Dcolor%3A%20%23f6f6f6%3B%0Afont%2Dsize%3A%2013px%3B%0Aline%2Dheight%3A%201%2E3%3B%0A%7D%0A%23TOC%20%2Etoctitle%20%7B%0Afont%2Dweight%3A%20bold%3B%0Afont%2Dsize%3A%2015px%3B%0Amargin%2Dleft%3A%205px%3B%0A%7D%0A%23TOC%20ul%20%7B%0Apadding%2Dleft%3A%2040px%3B%0Amargin%2Dleft%3A%20%2D1%2E5em%3B%0Amargin%2Dtop%3A%205px%3B%0Amargin%2Dbottom%3A%205px%3B%0A%7D%0A%23TOC%20ul%20ul%20%7B%0Amargin%2Dleft%3A%20%2D2em%3B%0A%7D%0A%23TOC%20li%20%7B%0Aline%2Dheight%3A%2016px%3B%0A%7D%0Atable%20%7B%0Amargin%3A%201em%20auto%3B%0Aborder%2Dwidth%3A%201px%3B%0Aborder%2Dcolor%3A%20%23DDDDDD%3B%0Aborder%2Dstyle%3A%20outset%3B%0Aborder%2Dcollapse%3A%20collapse%3B%0A%7D%0Atable%20th%20%7B%0Aborder%2Dwidth%3A%202px%3B%0Apadding%3A%205px%3B%0Aborder%2Dstyle%3A%20inset%3B%0A%7D%0Atable%20td%20%7B%0Aborder%2Dwidth%3A%201px%3B%0Aborder%2Dstyle%3A%20inset%3B%0Aline%2Dheight%3A%2018px%3B%0Apadding%3A%205px%205px%3B%0A%7D%0Atable%2C%20table%20th%2C%20table%20td%20%7B%0Aborder%2Dleft%2Dstyle%3A%20none%3B%0Aborder%2Dright%2Dstyle%3A%20none%3B%0A%7D%0Atable%20thead%2C%20table%20tr%2Eeven%20%7B%0Abackground%2Dcolor%3A%20%23f7f7f7%3B%0A%7D%0Ap%20%7B%0Amargin%3A%200%2E5em%200%3B%0A%7D%0Ablockquote%20%7B%0Abackground%2Dcolor%3A%20%23f6f6f6%3B%0Apadding%3A%200%2E25em%200%2E75em%3B%0A%7D%0Ahr%20%7B%0Aborder%2Dstyle%3A%20solid%3B%0Aborder%3A%20none%3B%0Aborder%2Dtop%3A%201px%20solid%20%23777%3B%0Amargin%3A%2028px%200%3B%0A%7D%0Adl%20%7B%0Amargin%2Dleft%3A%200%3B%0A%7D%0Adl%20dd%20%7B%0Amargin%2Dbottom%3A%2013px%3B%0Amargin%2Dleft%3A%2013px%3B%0A%7D%0Adl%20dt%20%7B%0Afont%2Dweight%3A%20bold%3B%0A%7D%0Aul%20%7B%0Amargin%2Dtop%3A%200%3B%0A%7D%0Aul%20li%20%7B%0Alist%2Dstyle%3A%20circle%20outside%3B%0A%7D%0Aul%20ul%20%7B%0Amargin%2Dbottom%3A%200%3B%0A%7D%0Apre%2C%20code%20%7B%0Abackground%2Dcolor%3A%20%23f7f7f7%3B%0Aborder%2Dradius%3A%203px%3B%0Acolor%3A%20%23333%3B%0Awhite%2Dspace%3A%20pre%2Dwrap%3B%20%0A%7D%0Apre%20%7B%0Aborder%2Dradius%3A%203px%3B%0Amargin%3A%205px%200px%2010px%200px%3B%0Apadding%3A%2010px%3B%0A%7D%0Apre%3Anot%28%5Bclass%5D%29%20%7B%0Abackground%2Dcolor%3A%20%23f7f7f7%3B%0A%7D%0Acode%20%7B%0Afont%2Dfamily%3A%20Consolas%2C%20Monaco%2C%20%27Courier%20New%27%2C%20monospace%3B%0Afont%2Dsize%3A%2085%25%3B%0A%7D%0Ap%20%3E%20code%2C%20li%20%3E%20code%20%7B%0Apadding%3A%202px%200px%3B%0A%7D%0Adiv%2Efigure%20%7B%0Atext%2Dalign%3A%20center%3B%0A%7D%0Aimg%20%7B%0Abackground%2Dcolor%3A%20%23FFFFFF%3B%0Apadding%3A%202px%3B%0Aborder%3A%201px%20solid%20%23DDDDDD%3B%0Aborder%2Dradius%3A%203px%3B%0Aborder%3A%201px%20solid%20%23CCCCCC%3B%0Amargin%3A%200%205px%3B%0A%7D%0Ah1%20%7B%0Amargin%2Dtop%3A%200%3B%0Afont%2Dsize%3A%2035px%3B%0Aline%2Dheight%3A%2040px%3B%0A%7D%0Ah2%20%7B%0Aborder%2Dbottom%3A%204px%20solid%20%23f7f7f7%3B%0Apadding%2Dtop%3A%2010px%3B%0Apadding%2Dbottom%3A%202px%3B%0Afont%2Dsize%3A%20145%25%3B%0A%7D%0Ah3%20%7B%0Aborder%2Dbottom%3A%202px%20solid%20%23f7f7f7%3B%0Apadding%2Dtop%3A%2010px%3B%0Afont%2Dsize%3A%20120%25%3B%0A%7D%0Ah4%20%7B%0Aborder%2Dbottom%3A%201px%20solid%20%23f7f7f7%3B%0Amargin%2Dleft%3A%208px%3B%0Afont%2Dsize%3A%20105%25%3B%0A%7D%0Ah5%2C%20h6%20%7B%0Aborder%2Dbottom%3A%201px%20solid%20%23ccc%3B%0Afont%2Dsize%3A%20105%25%3B%0A%7D%0Aa%20%7B%0Acolor%3A%20%230033dd%3B%0Atext%2Ddecoration%3A%20none%3B%0A%7D%0Aa%3Ahover%20%7B%0Acolor%3A%20%236666ff%3B%20%7D%0Aa%3Avisited%20%7B%0Acolor%3A%20%23800080%3B%20%7D%0Aa%3Avisited%3Ahover%20%7B%0Acolor%3A%20%23BB00BB%3B%20%7D%0Aa%5Bhref%5E%3D%22http
2018-05-25 17:17:29 +02:00
2018-05-26 12:52:02 +02:00
< / head >
2018-05-25 17:17:29 +02:00
2018-05-26 12:52:02 +02:00
< body >
2018-05-25 17:17:29 +02:00
2018-05-26 12:52:02 +02:00
< h1 class = "title toc-ignore" > Caching openSenseMap Data for Reproducibility< / h1 >
< h4 class = "author" > < em > Norwin Roosen< / em > < / h4 >
2018-06-07 00:13:15 +02:00
< h4 class = "date" > < em > 2018-06-07< / em > < / h4 >
2018-05-25 17:17:29 +02:00
2018-06-07 00:13:15 +02:00
< p > It may be useful to download data from openSenseMap only once. For reproducible results, the data should be saved to disk, and reloaded at a later point.< / p >
2018-05-25 17:17:29 +02:00
< p > This avoids..< / p >
< ul >
< li > changed results for queries without date parameters,< / li >
< li > unnecessary wait times,< / li >
< li > risk of API changes / API unavailability,< / li >
< li > stress on the openSenseMap-server.< / li >
< / ul >
2018-06-07 00:13:15 +02:00
< p > This vignette shows how to use this built in < code > opensensmapr< / code > feature, and how to do it yourself in case you want to save to other data formats.< / p >
< div class = "sourceCode" > < pre class = "sourceCode r" > < code class = "sourceCode r" > < span class = "co" > # this vignette requires:< / span >
< span class = "kw" > library< / span > (opensensmapr)
< span class = "kw" > library< / span > (jsonlite)
< span class = "kw" > library< / span > (readr)< / code > < / pre > < / div >
< div id = "using-the-opensensmapr-caching-feature" class = "section level2" >
< h2 > Using the opensensmapr Caching Feature< / h2 >
< p > All data retrieval functions of < code > opensensmapr< / code > have a built in caching feature, which serializes an API response to disk. Subsequent identical requests will then return the serialized data instead of making another request.< / p >
2018-05-25 17:17:29 +02:00
< p > To use this feature, just add a path to a directory to the < code > cache< / code > parameter:< / p >
2018-06-07 00:13:15 +02:00
< div class = "sourceCode" > < pre class = "sourceCode r" > < code class = "sourceCode r" > b =< span class = "st" > < / span > < span class = "kw" > osem_boxes< / span > (< span class = "dt" > grouptag =< / span > < span class = "st" > 'ifgi'< / span > , < span class = "dt" > cache =< / span > < span class = "kw" > tempdir< / span > ())
< span class = "co" > # the next identical request will hit the cache only!< / span >
b =< span class = "st" > < / span > < span class = "kw" > osem_boxes< / span > (< span class = "dt" > grouptag =< / span > < span class = "st" > 'ifgi'< / span > , < span class = "dt" > cache =< / span > < span class = "kw" > tempdir< / span > ())
2018-05-26 12:52:02 +02:00
< span class = "co" > # requests without the cache parameter will still be performed normally< / span >
2018-06-07 00:13:15 +02:00
b =< span class = "st" > < / span > < span class = "kw" > osem_boxes< / span > (< span class = "dt" > grouptag =< / span > < span class = "st" > 'ifgi'< / span > )< / code > < / pre > < / div >
< p > Looking at the cache directory we can see one file for each request, which is identified through a hash of the request URL:< / p >
< div class = "sourceCode" > < pre class = "sourceCode r" > < code class = "sourceCode r" > < span class = "kw" > list.files< / span > (< span class = "kw" > tempdir< / span > (), < span class = "dt" > pattern =< / span > < span class = "st" > 'osemcache< / span > < span class = "ch" > \\< / span > < span class = "st" > ..*< / span > < span class = "ch" > \\< / span > < span class = "st" > .rds'< / span > )< / code > < / pre > < / div >
< pre > < code > ## [1] " osemcache.17db5c57fc6fca4d836fa2cf30345ce8767cd61a.rds" < / code > < / pre >
< p > You can maintain multiple caches simultaneously which allows to only store data related to a script in the same directory:< / p >
2018-05-26 12:52:02 +02:00
< div class = "sourceCode" > < pre class = "sourceCode r" > < code class = "sourceCode r" > cacheDir =< span class = "st" > < / span > < span class = "kw" > getwd< / span > () < span class = "co" > # current working directory< / span >
2018-06-07 00:13:15 +02:00
b =< span class = "st" > < / span > < span class = "kw" > osem_boxes< / span > (< span class = "dt" > grouptag =< / span > < span class = "st" > 'ifgi'< / span > , < span class = "dt" > cache =< / span > cacheDir)
2018-05-26 12:52:02 +02:00
< span class = "co" > # the next identical request will hit the cache only!< / span >
2018-06-07 00:13:15 +02:00
b =< span class = "st" > < / span > < span class = "kw" > osem_boxes< / span > (< span class = "dt" > grouptag =< / span > < span class = "st" > 'ifgi'< / span > , < span class = "dt" > cache =< / span > cacheDir)< / code > < / pre > < / div >
2018-05-25 17:17:29 +02:00
< p > To get fresh results again, just call < code > osem_clear_cache()< / code > for the respective cache:< / p >
2018-06-07 00:13:15 +02:00
< div class = "sourceCode" > < pre class = "sourceCode r" > < code class = "sourceCode r" > < span class = "kw" > osem_clear_cache< / span > () < span class = "co" > # clears default cache< / span >
< span class = "kw" > osem_clear_cache< / span > (< span class = "kw" > getwd< / span > ()) < span class = "co" > # clears a custom cache< / span > < / code > < / pre > < / div >
2018-05-26 12:52:02 +02:00
< / div >
< div id = "custom-de--serialization" class = "section level2" >
2018-05-25 17:17:29 +02:00
< h2 > Custom (De-) Serialization< / h2 >
2018-05-26 12:52:02 +02:00
< p > If you want to roll your own serialization method to support custom data formats, here’ s how:< / p >
2018-06-07 00:13:15 +02:00
< div class = "sourceCode" > < pre class = "sourceCode r" > < code class = "sourceCode r" > < span class = "co" > # first get our example data:< / span >
measurements =< span class = "st" > < / span > < span class = "kw" > osem_measurements< / span > (< span class = "st" > 'Windrichtung'< / span > )< / code > < / pre > < / div >
2018-05-26 12:52:02 +02:00
< p > If you are paranoid and worry about < code > .rds< / code > files not being decodable anymore in the (distant) future, you could serialize to a plain text format such as JSON. This of course comes at the cost of storage space and performance.< / p >
< div class = "sourceCode" > < pre class = "sourceCode r" > < code class = "sourceCode r" > < span class = "co" > # serializing senseBoxes to JSON, and loading from file again:< / span >
2018-06-07 00:13:15 +02:00
< span class = "kw" > write< / span > (jsonlite::< span class = "kw" > serializeJSON< / span > (measurements), < span class = "st" > 'measurements.json'< / span > )
measurements_from_file =< span class = "st" > < / span > jsonlite::< span class = "kw" > unserializeJSON< / span > (readr::< span class = "kw" > read_file< / span > (< span class = "st" > 'measurements.json'< / span > ))
< span class = "kw" > class< / span > (measurements_from_file)< / code > < / pre > < / div >
< pre > < code > ## [1] " osem_measurements" " tbl_df" " tbl"
## [4] " data.frame" < / code > < / pre >
< p > This method also persists the R object metadata (classes, attributes). If you were to use a serialization method that can’ t persist object metadata, you could re-apply it with the following functions:< / p >
< div class = "sourceCode" > < pre class = "sourceCode r" > < code class = "sourceCode r" > < span class = "co" > # note the toJSON call instead of serializeJSON< / span >
< span class = "kw" > write< / span > (jsonlite::< span class = "kw" > toJSON< / span > (measurements), < span class = "st" > 'measurements_bad.json'< / span > )
measurements_without_attrs =< span class = "st" > < / span > jsonlite::< span class = "kw" > fromJSON< / span > (< span class = "st" > 'measurements_bad.json'< / span > )
< span class = "kw" > class< / span > (measurements_without_attrs)< / code > < / pre > < / div >
< pre > < code > ## [1] " data.frame" < / code > < / pre >
< div class = "sourceCode" > < pre class = "sourceCode r" > < code class = "sourceCode r" > measurements_with_attrs =< span class = "st" > < / span > < span class = "kw" > osem_as_measurements< / span > (measurements_without_attrs)
< span class = "kw" > class< / span > (measurements_with_attrs)< / code > < / pre > < / div >
< pre > < code > ## [1] " osem_measurements" " tbl_df" " tbl"
## [4] " data.frame" < / code > < / pre >
< p > The same goes for boxes via < code > osem_as_sensebox()< / code > .< / p >
2018-05-26 12:52:02 +02:00
< / div >
2018-05-25 17:17:29 +02:00
2018-05-26 12:52:02 +02:00
<!-- dynamically load mathjax for compatibility with self - contained -->
< script >
(function () {
var script = document.createElement("script");
script.type = "text/javascript";
script.src = "https://mathjax.rstudio.com/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML";
document.getElementsByTagName("head")[0].appendChild(script);
})();
< / script >
2018-05-25 17:17:29 +02:00
< / body >
< / html >