You cannot select more than 25 topics Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.
opensensmapR/inst/doc/osem-serialization.html

153 lines
15 KiB
HTML

<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta charset="utf-8" />
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<meta name="generator" content="pandoc" />
<meta name="viewport" content="width=device-width, initial-scale=1">
<meta name="author" content="Norwin Roosen" />
<meta name="date" content="2018-06-07" />
<title>Caching openSenseMap Data for Reproducibility</title>
<style type="text/css">code{white-space: pre;}</style>
<style type="text/css">
div.sourceCode { overflow-x: auto; }
table.sourceCode, tr.sourceCode, td.lineNumbers, td.sourceCode {
margin: 0; padding: 0; vertical-align: baseline; border: none; }
table.sourceCode { width: 100%; line-height: 100%; }
td.lineNumbers { text-align: right; padding-right: 4px; padding-left: 4px; color: #aaaaaa; border-right: 1px solid #aaaaaa; }
td.sourceCode { padding-left: 5px; }
code > span.kw { color: #007020; font-weight: bold; } /* Keyword */
code > span.dt { color: #902000; } /* DataType */
code > span.dv { color: #40a070; } /* DecVal */
code > span.bn { color: #40a070; } /* BaseN */
code > span.fl { color: #40a070; } /* Float */
code > span.ch { color: #4070a0; } /* Char */
code > span.st { color: #4070a0; } /* String */
code > span.co { color: #60a0b0; font-style: italic; } /* Comment */
code > span.ot { color: #007020; } /* Other */
code > span.al { color: #ff0000; font-weight: bold; } /* Alert */
code > span.fu { color: #06287e; } /* Function */
code > span.er { color: #ff0000; font-weight: bold; } /* Error */
code > span.wa { color: #60a0b0; font-weight: bold; font-style: italic; } /* Warning */
code > span.cn { color: #880000; } /* Constant */
code > span.sc { color: #4070a0; } /* SpecialChar */
code > span.vs { color: #4070a0; } /* VerbatimString */
code > span.ss { color: #bb6688; } /* SpecialString */
code > span.im { } /* Import */
code > span.va { color: #19177c; } /* Variable */
code > span.cf { color: #007020; font-weight: bold; } /* ControlFlow */
code > span.op { color: #666666; } /* Operator */
code > span.bu { } /* BuiltIn */
code > span.ex { } /* Extension */
code > span.pp { color: #bc7a00; } /* Preprocessor */
code > span.at { color: #7d9029; } /* Attribute */
code > span.do { color: #ba2121; font-style: italic; } /* Documentation */
code > span.an { color: #60a0b0; font-weight: bold; font-style: italic; } /* Annotation */
code > span.cv { color: #60a0b0; font-weight: bold; font-style: italic; } /* CommentVar */
code > span.in { color: #60a0b0; font-weight: bold; font-style: italic; } /* Information */
</style>
<link href="data:text/css;charset=utf-8,body%20%7B%0Abackground%2Dcolor%3A%20%23fff%3B%0Amargin%3A%201em%20auto%3B%0Amax%2Dwidth%3A%20700px%3B%0Aoverflow%3A%20visible%3B%0Apadding%2Dleft%3A%202em%3B%0Apadding%2Dright%3A%202em%3B%0Afont%2Dfamily%3A%20%22Open%20Sans%22%2C%20%22Helvetica%20Neue%22%2C%20Helvetica%2C%20Arial%2C%20sans%2Dserif%3B%0Afont%2Dsize%3A%2014px%3B%0Aline%2Dheight%3A%201%2E35%3B%0A%7D%0A%23header%20%7B%0Atext%2Dalign%3A%20center%3B%0A%7D%0A%23TOC%20%7B%0Aclear%3A%20both%3B%0Amargin%3A%200%200%2010px%2010px%3B%0Apadding%3A%204px%3B%0Awidth%3A%20400px%3B%0Aborder%3A%201px%20solid%20%23CCCCCC%3B%0Aborder%2Dradius%3A%205px%3B%0Abackground%2Dcolor%3A%20%23f6f6f6%3B%0Afont%2Dsize%3A%2013px%3B%0Aline%2Dheight%3A%201%2E3%3B%0A%7D%0A%23TOC%20%2Etoctitle%20%7B%0Afont%2Dweight%3A%20bold%3B%0Afont%2Dsize%3A%2015px%3B%0Amargin%2Dleft%3A%205px%3B%0A%7D%0A%23TOC%20ul%20%7B%0Apadding%2Dleft%3A%2040px%3B%0Amargin%2Dleft%3A%20%2D1%2E5em%3B%0Amargin%2Dtop%3A%205px%3B%0Amargin%2Dbottom%3A%205px%3B%0A%7D%0A%23TOC%20ul%20ul%20%7B%0Amargin%2Dleft%3A%20%2D2em%3B%0A%7D%0A%23TOC%20li%20%7B%0Aline%2Dheight%3A%2016px%3B%0A%7D%0Atable%20%7B%0Amargin%3A%201em%20auto%3B%0Aborder%2Dwidth%3A%201px%3B%0Aborder%2Dcolor%3A%20%23DDDDDD%3B%0Aborder%2Dstyle%3A%20outset%3B%0Aborder%2Dcollapse%3A%20collapse%3B%0A%7D%0Atable%20th%20%7B%0Aborder%2Dwidth%3A%202px%3B%0Apadding%3A%205px%3B%0Aborder%2Dstyle%3A%20inset%3B%0A%7D%0Atable%20td%20%7B%0Aborder%2Dwidth%3A%201px%3B%0Aborder%2Dstyle%3A%20inset%3B%0Aline%2Dheight%3A%2018px%3B%0Apadding%3A%205px%205px%3B%0A%7D%0Atable%2C%20table%20th%2C%20table%20td%20%7B%0Aborder%2Dleft%2Dstyle%3A%20none%3B%0Aborder%2Dright%2Dstyle%3A%20none%3B%0A%7D%0Atable%20thead%2C%20table%20tr%2Eeven%20%7B%0Abackground%2Dcolor%3A%20%23f7f7f7%3B%0A%7D%0Ap%20%7B%0Amargin%3A%200%2E5em%200%3B%0A%7D%0Ablockquote%20%7B%0Abackground%2Dcolor%3A%20%23f6f6f6%3B%0Apadding%3A%200%2E25em%200%2E75em%3B%0A%7D%0Ahr%20%7B%0Aborder%2Dstyle%3A%20solid%3B%0Aborder%3A%20none%3B%0Aborder%2Dtop%3A%201px%20solid%20%23777%3B%0Amargin%3A%2028px%200%3B%0A%7D%0Adl%20%7B%0Amargin%2Dleft%3A%200%3B%0A%7D%0Adl%20dd%20%7B%0Amargin%2Dbottom%3A%2013px%3B%0Amargin%2Dleft%3A%2013px%3B%0A%7D%0Adl%20dt%20%7B%0Afont%2Dweight%3A%20bold%3B%0A%7D%0Aul%20%7B%0Amargin%2Dtop%3A%200%3B%0A%7D%0Aul%20li%20%7B%0Alist%2Dstyle%3A%20circle%20outside%3B%0A%7D%0Aul%20ul%20%7B%0Amargin%2Dbottom%3A%200%3B%0A%7D%0Apre%2C%20code%20%7B%0Abackground%2Dcolor%3A%20%23f7f7f7%3B%0Aborder%2Dradius%3A%203px%3B%0Acolor%3A%20%23333%3B%0Awhite%2Dspace%3A%20pre%2Dwrap%3B%20%0A%7D%0Apre%20%7B%0Aborder%2Dradius%3A%203px%3B%0Amargin%3A%205px%200px%2010px%200px%3B%0Apadding%3A%2010px%3B%0A%7D%0Apre%3Anot%28%5Bclass%5D%29%20%7B%0Abackground%2Dcolor%3A%20%23f7f7f7%3B%0A%7D%0Acode%20%7B%0Afont%2Dfamily%3A%20Consolas%2C%20Monaco%2C%20%27Courier%20New%27%2C%20monospace%3B%0Afont%2Dsize%3A%2085%25%3B%0A%7D%0Ap%20%3E%20code%2C%20li%20%3E%20code%20%7B%0Apadding%3A%202px%200px%3B%0A%7D%0Adiv%2Efigure%20%7B%0Atext%2Dalign%3A%20center%3B%0A%7D%0Aimg%20%7B%0Abackground%2Dcolor%3A%20%23FFFFFF%3B%0Apadding%3A%202px%3B%0Aborder%3A%201px%20solid%20%23DDDDDD%3B%0Aborder%2Dradius%3A%203px%3B%0Aborder%3A%201px%20solid%20%23CCCCCC%3B%0Amargin%3A%200%205px%3B%0A%7D%0Ah1%20%7B%0Amargin%2Dtop%3A%200%3B%0Afont%2Dsize%3A%2035px%3B%0Aline%2Dheight%3A%2040px%3B%0A%7D%0Ah2%20%7B%0Aborder%2Dbottom%3A%204px%20solid%20%23f7f7f7%3B%0Apadding%2Dtop%3A%2010px%3B%0Apadding%2Dbottom%3A%202px%3B%0Afont%2Dsize%3A%20145%25%3B%0A%7D%0Ah3%20%7B%0Aborder%2Dbottom%3A%202px%20solid%20%23f7f7f7%3B%0Apadding%2Dtop%3A%2010px%3B%0Afont%2Dsize%3A%20120%25%3B%0A%7D%0Ah4%20%7B%0Aborder%2Dbottom%3A%201px%20solid%20%23f7f7f7%3B%0Amargin%2Dleft%3A%208px%3B%0Afont%2Dsize%3A%20105%25%3B%0A%7D%0Ah5%2C%20h6%20%7B%0Aborder%2Dbottom%3A%201px%20solid%20%23ccc%3B%0Afont%2Dsize%3A%20105%25%3B%0A%7D%0Aa%20%7B%0Acolor%3A%20%230033dd%3B%0Atext%2Ddecoration%3A%20none%3B%0A%7D%0Aa%3Ahover%20%7B%0Acolor%3A%20%236666ff%3B%20%7D%0Aa%3Avisited%20%7B%0Acolor%3A%20%23800080%3B%20%7D%0Aa%3Avisited%3Ahover%20%7B%0Acolor%3A%20%23BB00BB%3B%20%7D%0Aa%5Bhref%5E%3D%22http
</head>
<body>
<h1 class="title toc-ignore">Caching openSenseMap Data for Reproducibility</h1>
<h4 class="author"><em>Norwin Roosen</em></h4>
<h4 class="date"><em>2018-06-07</em></h4>
<p>It may be useful to download data from openSenseMap only once. For reproducible results, the data should be saved to disk, and reloaded at a later point.</p>
<p>This avoids..</p>
<ul>
<li>changed results for queries without date parameters,</li>
<li>unnecessary wait times,</li>
<li>risk of API changes / API unavailability,</li>
<li>stress on the openSenseMap-server.</li>
</ul>
<p>This vignette shows how to use this built in <code>opensensmapr</code> feature, and how to do it yourself in case you want to save to other data formats.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r"><span class="co"># this vignette requires:</span>
<span class="kw">library</span>(opensensmapr)
<span class="kw">library</span>(jsonlite)
<span class="kw">library</span>(readr)</code></pre></div>
<div id="using-the-opensensmapr-caching-feature" class="section level2">
<h2>Using the opensensmapr Caching Feature</h2>
<p>All data retrieval functions of <code>opensensmapr</code> have a built in caching feature, which serializes an API response to disk. Subsequent identical requests will then return the serialized data instead of making another request.</p>
<p>To use this feature, just add a path to a directory to the <code>cache</code> parameter:</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r">b =<span class="st"> </span><span class="kw">osem_boxes</span>(<span class="dt">grouptag =</span> <span class="st">'ifgi'</span>, <span class="dt">cache =</span> <span class="kw">tempdir</span>())
<span class="co"># the next identical request will hit the cache only!</span>
b =<span class="st"> </span><span class="kw">osem_boxes</span>(<span class="dt">grouptag =</span> <span class="st">'ifgi'</span>, <span class="dt">cache =</span> <span class="kw">tempdir</span>())
<span class="co"># requests without the cache parameter will still be performed normally</span>
b =<span class="st"> </span><span class="kw">osem_boxes</span>(<span class="dt">grouptag =</span> <span class="st">'ifgi'</span>)</code></pre></div>
<p>Looking at the cache directory we can see one file for each request, which is identified through a hash of the request URL:</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r"><span class="kw">list.files</span>(<span class="kw">tempdir</span>(), <span class="dt">pattern =</span> <span class="st">'osemcache</span><span class="ch">\\</span><span class="st">..*</span><span class="ch">\\</span><span class="st">.rds'</span>)</code></pre></div>
<pre><code>## [1] &quot;osemcache.17db5c57fc6fca4d836fa2cf30345ce8767cd61a.rds&quot;</code></pre>
<p>You can maintain multiple caches simultaneously which allows to only store data related to a script in the same directory:</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r">cacheDir =<span class="st"> </span><span class="kw">getwd</span>() <span class="co"># current working directory</span>
b =<span class="st"> </span><span class="kw">osem_boxes</span>(<span class="dt">grouptag =</span> <span class="st">'ifgi'</span>, <span class="dt">cache =</span> cacheDir)
<span class="co"># the next identical request will hit the cache only!</span>
b =<span class="st"> </span><span class="kw">osem_boxes</span>(<span class="dt">grouptag =</span> <span class="st">'ifgi'</span>, <span class="dt">cache =</span> cacheDir)</code></pre></div>
<p>To get fresh results again, just call <code>osem_clear_cache()</code> for the respective cache:</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r"><span class="kw">osem_clear_cache</span>() <span class="co"># clears default cache</span>
<span class="kw">osem_clear_cache</span>(<span class="kw">getwd</span>()) <span class="co"># clears a custom cache</span></code></pre></div>
</div>
<div id="custom-de--serialization" class="section level2">
<h2>Custom (De-) Serialization</h2>
<p>If you want to roll your own serialization method to support custom data formats, heres how:</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r"><span class="co"># first get our example data:</span>
measurements =<span class="st"> </span><span class="kw">osem_measurements</span>(<span class="st">'Windrichtung'</span>)</code></pre></div>
<p>If you are paranoid and worry about <code>.rds</code> files not being decodable anymore in the (distant) future, you could serialize to a plain text format such as JSON. This of course comes at the cost of storage space and performance.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r"><span class="co"># serializing senseBoxes to JSON, and loading from file again:</span>
<span class="kw">write</span>(jsonlite::<span class="kw">serializeJSON</span>(measurements), <span class="st">'measurements.json'</span>)
measurements_from_file =<span class="st"> </span>jsonlite::<span class="kw">unserializeJSON</span>(readr::<span class="kw">read_file</span>(<span class="st">'measurements.json'</span>))
<span class="kw">class</span>(measurements_from_file)</code></pre></div>
<pre><code>## [1] &quot;osem_measurements&quot; &quot;tbl_df&quot; &quot;tbl&quot;
## [4] &quot;data.frame&quot;</code></pre>
<p>This method also persists the R object metadata (classes, attributes). If you were to use a serialization method that cant persist object metadata, you could re-apply it with the following functions:</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r"><span class="co"># note the toJSON call instead of serializeJSON</span>
<span class="kw">write</span>(jsonlite::<span class="kw">toJSON</span>(measurements), <span class="st">'measurements_bad.json'</span>)
measurements_without_attrs =<span class="st"> </span>jsonlite::<span class="kw">fromJSON</span>(<span class="st">'measurements_bad.json'</span>)
<span class="kw">class</span>(measurements_without_attrs)</code></pre></div>
<pre><code>## [1] &quot;data.frame&quot;</code></pre>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r">measurements_with_attrs =<span class="st"> </span><span class="kw">osem_as_measurements</span>(measurements_without_attrs)
<span class="kw">class</span>(measurements_with_attrs)</code></pre></div>
<pre><code>## [1] &quot;osem_measurements&quot; &quot;tbl_df&quot; &quot;tbl&quot;
## [4] &quot;data.frame&quot;</code></pre>
<p>The same goes for boxes via <code>osem_as_sensebox()</code>.</p>
</div>
<!-- dynamically load mathjax for compatibility with self-contained -->
<script>
(function () {
var script = document.createElement("script");
script.type = "text/javascript";
script.src = "https://mathjax.rstudio.com/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML";
document.getElementsByTagName("head")[0].appendChild(script);
})();
</script>
</body>
</html>