2020-05-03T16:14:03Z /feed.html/ Getting Edgerouter metrics into Prometheus Carl Johan Gustavsson 2020-04-11T20:00:00Z 2020-04-11T20:00:00Z /blog/2020/04/getting-edgerouter-metrics-into-prometheus.html <p>I got an <a href="https://www.ui.com/edgemax/edgerouter-lite/">Ubiquiti Edgerouter Lite</a> as the router at home, and while its admin interface shows the current traffic flowing in and out, it does not store any historic metrics. It also only shows a window of data with an unspecified time span per bar, which makes it quite useless, as you can see in the picture&nbsp;below.</p> <p><img style="width: 100%" src="/media/images/2020/edgerouter-graph.png" /></p> <p>I already have <a href="https://prometheus.io/">Prometheus</a> and <a href="https://grafana.com/">Grafana</a> set up for other purposes, so naturally I wanted to ingest the metrics from my router into&nbsp;Prometheus.</p> <p>The Edgerouter supports exporting metrics via <span class="caps">SNMP</span>, and Prometheus can import them via a separate daemon called <a href="https://github.com/prometheus/snmp_exporter">snmp_exporter</a>. It basically connects to devices over <span class="caps">SNMP</span>, collects the configured metrics and exposes them via a webserver that Prometheus can&nbsp;poll.</p> <p>I installed <code>snmp_exporter</code> on my FreeBSD server which is running Prometheus. <code>snmp_exporter</code> uses a config file called <code>snmp.yml</code>, which can be generated by the utility <code>snmp_exporter_generator</code> that comes bundled with <code>snmp_exporter</code>. To generate the config, <code>snmp_exporter_generator</code> in turn uses yet another yml file called <code>generator.yml</code>. Fortunately someone else has already created one for the Edgerouter, and it can be found at <a href="https://github.com/j18e/prometheus-edgerouter">https://github.com/j18e/prometheus-edgerouter</a>.</p> <p>The first thing I needed to do was to enable <span class="caps">SNMP</span> in the Edgerouter admin and select a community string, which is the password <span class="caps">SNMP</span> uses. Unfortunately I don&#8217;t think it does <a href="https://en.wikipedia.org/wiki/Simple_Network_Management_Protocol#Version_3">SNMPv3</a>, so everything is unencrypted&nbsp;🤷‍♂️.</p> <p>Then I copied the <code>generator.yml</code> from the GitHub repo above, entered my community string and ran <code>snmp_exporter_generator generate</code> to output the <code>snmp.yml</code>. On FreeBSD it uses the <code>/usr/local/etc/snmp_exporter</code> paths by default, so I needed <code>sudo</code> for that, which is a bit odd; I expected it to read the file from the current directory and write the output there as&nbsp;well.</p>
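<p>For reference, a trimmed-down <code>generator.yml</code> for this kind of setup looks roughly like the sketch below; the community string is a placeholder and the list of objects to walk is only illustrative, the repo above has the full&nbsp;set.</p> <pre><code>modules:
  edgerouter:
    version: 2
    auth:
      community: &lt;community string&gt;
    walk:
      - interfaces
      - ifXTable
</code></pre>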
<p>To finish I added a new <span class="caps">SNMP</span> job to the <code>prometheus.yml</code>:</p> <pre>
  - job_name: 'snmp'
    scrape_interval: 5s
    static_configs:
      - targets:
          - 192.168.1.1
    metrics_path: /snmp
    params:
      module:
        - edgerouter
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
      - source_labels: [__param_target]
        target_label: instance
      - target_label: __address__
        replacement: 127.0.0.1:9116
</pre> <p>and restarted Prometheus to reload the&nbsp;config.</p> <p>To graph this in Grafana I added a panel and used <code>8*rate(ifHCInOctets{ifName="eth1"}[60s])</code> as the query for received bits/s, which gives us a graph like&nbsp;this:</p> <p><img style="width: 100%" src="/media/images/2020/graphana-bandwidth.png" /></p> <p>and then <code>8*rate(ifHCOutOctets{ifName="eth1"}[60s])</code> for sent&nbsp;bits/s.</p> Alive and HTTPS Carl Johan Gustavsson 2020-03-01T19:15:00Z 2020-03-01T19:15:00Z /blog/2020/03/alive-and-https.html <p>Finally showing some signs of life here. I have had &#8220;write more blog posts&#8221; on my todo list for a very long time now, and today I finally got some time and motivation to do it. Actually, I silently updated the blog around a year ago to refresh my profile after I started at Shopify. Today&#8217;s post started out by me noticing that the blog wasn&#8217;t served over <span class="caps">TLS</span>; shame on me, it is 2020 after&nbsp;all&#8230;</p> <p>For some background, my blog has been served as static <span class="caps">HTML</span> from <a href="https://aws.amazon.com/s3">Amazon S3</a> since the beginning in 2013. S3 itself does support <span class="caps">HTTPS</span>, but only if you use the actual bucket <span class="caps">URL</span>, which looks something like <code>http(s)://&lt;bucket&gt;.s3.amazonaws.com/&lt;object&gt;</code>. My subdomain <code>blog.prng.se</code> was a <span class="caps">CNAME</span> to my bucket where the static files&nbsp;live.</p> <p><a href="https://aws.amazon.com/cloudfront">Cloudfront</a> (Amazon&#8217;s <span class="caps">CDN</span>), however, can sit in front of S3 and serve over <span class="caps">HTTPS</span> with a custom domain, for no fixed fee. Basically you start by requesting a certificate for your domain in <span class="caps">AWS</span> Certificate Manager and choosing your verification method, either via email or <span class="caps">DNS</span>. I chose <span class="caps">DNS</span> and got a <span class="caps">CNAME</span> target I needed to add to the <span class="caps">DNS</span> records for my domain. It took a few minutes for the certificate validation to complete after my <span class="caps">DNS</span> record was&nbsp;updated.</p> <p>Then I could create the Cloudfront distribution by pointing it at my S3 bucket and my newly provisioned certificate and selecting some other options, such as which POPs to use. The Cloudfront distribution took quite some time (10+ minutes) to provision, but after that the <span class="caps">CDN</span> was available at <a href="https://d23eyjq58b193n.cloudfront.net">d23eyjq58b193n.cloudfront.net</a>. After that I had to change the current <span class="caps">CNAME</span> for <code>blog.prng.se</code> to point to the <span class="caps">CDN</span> domain and wait for that to propagate. I had a 1h <span class="caps">TTL</span> on the domain, which meant I had to wait a bit. When the <span class="caps">DNS</span> propagation completed I still had problems connecting to the blog over <span class="caps">HTTPS</span> though. After some head scratching I realized I needed to specify the valid CNAMEs in the Cloudfront distribution. I added that, saved and waited a minute or two for it to take effect and, voila, the blog is now served over <span class="caps">HTTPS</span>.</p>
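<p>For reference, the resulting <span class="caps">DNS</span> setup looks roughly like this in zone-file notation; the validation record name and target are placeholders, the real values come from the <span class="caps">AWS</span> Certificate Manager&nbsp;console.</p> <pre><code>; certificate validation record from AWS Certificate Manager (placeholder values)
_example1234.blog.prng.se.  CNAME  _example5678.acm-validations.aws.
; point the blog at the Cloudfront distribution
blog.prng.se.               CNAME  d23eyjq58b193n.cloudfront.net.
</code></pre>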
<p>Next up, I&#8217;m going to switch the blog away from <a href="http://hyde.github.io/">Hyde</a>, as it has been more or less abandoned for the last 4 years and doesn&#8217;t support Python 3. So long, and thanks for all the fish, Python 2.7. I found a new static site generator called <a href="https://www.getzola.org">Zola</a>, written in Rust, that I will migrate&nbsp;to.</p> Debugging strange errors with strace Carl Johan Gustavsson 2014-04-23T20:11:00Z 2014-04-23T20:11:00Z /blog/2014/04/debugging-with-strace.html <p>Yesterday my colleague came to me with a strange error in our development environment. For some reason the message queue consumers failed to fetch new messages from Amazon <span class="caps">SQS</span> with an <span class="caps">SSL</span> library error. The error message we got&nbsp;was</p> <pre><code>[Errno 185090050] _ssl.c:340: error:0B084002:x509 certificate routines:X509_load_cert_crl_file:system lib
</code></pre> <p>which could be interpreted as the process not being able to load the certificate revocation list&nbsp;properly.</p> <p>This error appeared after my colleague deployed a very minor and totally unrelated code change. However, we proceeded to deploy the previous version, just to be sure. The error&nbsp;persisted.</p> <p>After googling a bit I found that the file it was supposed to be looking for was the <code>cacerts.txt</code> inside <code>boto</code>. I went ahead and checked if that file existed in our virtualenv, and it did. Very&nbsp;strange.</p> <p>That meant I needed to know the absolute path of the file it could not find. For that I needed to see which syscall it failed on, so strace to the rescue. Strace is a neat utility that traces all syscalls a process makes and the signals it receives. I attached strace to one of our server processes with the following&nbsp;command.</p> <pre><code>strace -v -p &lt;pid&gt; 2&gt;&amp;1 | grep cacert
</code></pre> <p>The output I got was similar to the following (I didn&#8217;t save the output at the&nbsp;time)</p> <pre><code>open("/xxxxxx/1398298388/lib/python2.7/site-packages/boto/cacerts/cacerts.txt", O_RDONLY) = -1
</code></pre> <p>So it was looking for the file I had checked for. However, there is one thing to note here: you might recognize <code>1398298388</code> as a Unix timestamp. It is the time when our virtual environment was created. We create a new one for each build, deploy the entire virtual environment to our servers, activate it and restart the processes. In the <code>xxxxxxx</code> directory we keep the latest virtual environments we have&nbsp;deployed.</p> <p>The specific virtual environment this process had been launched from was now missing from the server, since we only keep the last few environments around. What had happened was that our processes had not been restarted properly after the last deploys. We have supervisor monitoring our server processes, and we had recently changed how processes are started by introducing a wrapper script. This script did not forward the shutdown signals properly. Restarting the processes properly solved the <span class="caps">SSL</span>&nbsp;error.</p> <p>The conclusion is that strace is very useful for debugging errors that have no obvious cause, where you need to look closer at what the process actually tries to&nbsp;do.</p>
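<p>A related trick when you suspect a process is running out of a directory that no longer exists is to ask the kernel what the process actually has open. Something along these lines (on Linux, with <code>&lt;pid&gt;</code> as a placeholder) can reveal paths that have since been&nbsp;removed:</p> <pre><code># show the working directory and executable the process was started from
ls -l /proc/&lt;pid&gt;/cwd /proc/&lt;pid&gt;/exe
# list open files; paths that have been removed show up as (deleted)
lsof -p &lt;pid&gt; | grep -i deleted
</code></pre>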
Asyncio part 1 Carl Johan Gustavsson 2014-04-16T23:45:00Z 2014-04-16T23:45:00Z /blog/2014/03/asyncio-pt-1.html <p>Python 3.4 was released a couple of weeks ago and with it the new asynchronous library <a href="https://docs.python.org/3.4/library/asyncio.html">asyncio</a>. As I have worked quite a lot with Twisted at my day job, I am quite thrilled about this new built-in async&nbsp;library.</p> <p>The new asyncio library is also a good reason to start doing Python 3 development. Due to using Twisted at my day job I have mostly used&nbsp;2.7.</p> <p>To explore asyncio I&#8217;m going to write a small application using it. Recently I have gotten a bit interested in distributed hash tables (<span class="caps">DHT</span>s), and that seems like a good fit for testing out an async library. While researching DHTs I found the <a href="http://www.cs.rice.edu/Conferences/IPTPS02/109.pdf" title="Kademlia: A Peer-to-peer Information System Based on the XOR Metric">paper on Kademlia</a>. Go ahead and read it if you haven&#8217;t already; it is quite easy to read and describes the rationale behind the design decisions. The next resource I found was this <a href="http://xlattice.sourceforge.net/components/protocol/kademlia/specs.html" title="Kademlia: A Design Specification">specification</a>.</p> <p>After reading I started by building a very simple <a href="http://en.wikipedia.org/wiki/JSON-RPC" title="Wikipedia: JSON-RPC"><span class="caps">JSON</span> <span class="caps">RPC</span></a> layer using asyncio primitives, communicating over <span class="caps">UDP</span>. The progress can be found at the <a href="https://github.com/cjgu/asyncio-kademlia">asyncio-kademlia</a> project page on GitHub, and more details will come in part&nbsp;2.</p>
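<p>To give a flavour of what that looks like, here is a minimal sketch of a <span class="caps">JSON</span> <span class="caps">RPC</span>-style handler on top of asyncio&#8217;s <span class="caps">UDP</span> support. It is only illustrative; the message format, names and port below are made up and not necessarily what asyncio-kademlia&nbsp;uses.</p> <pre><code>import asyncio
import json


class JsonRpcProtocol(asyncio.DatagramProtocol):
    """Tiny JSON-RPC-style request handler over UDP (illustrative only)."""

    def connection_made(self, transport):
        self.transport = transport

    def datagram_received(self, data, addr):
        # Decode the incoming datagram and dispatch on the method name.
        request = json.loads(data.decode("utf-8"))
        if request.get("method") == "ping":
            result = "pong"
        else:
            result = None
        response = {"id": request.get("id"), "result": result}
        self.transport.sendto(json.dumps(response).encode("utf-8"), addr)


loop = asyncio.get_event_loop()
# Bind a UDP endpoint; the address and port are arbitrary for this example.
listen = loop.create_datagram_endpoint(
    JsonRpcProtocol, local_addr=("127.0.0.1", 8468))
transport, protocol = loop.run_until_complete(listen)
try:
    loop.run_forever()
except KeyboardInterrupt:
    pass
finally:
    transport.close()
    loop.close()
</code></pre> <p>A real Kademlia node would of course dispatch to handlers for <code>ping</code>, <code>store</code>, <code>find_node</code> and <code>find_value</code>, which is where the interesting parts of the protocol&nbsp;live.</p>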
San Francisco move Carl Johan Gustavsson 2014-02-01T00:00:00Z 2014-02-01T00:00:00Z /blog/2014/02/san-francisco.html <p>I finally got some time over to continue the blog project after my move to San Francisco. Now it is actually published on my domain&nbsp;as well.</p> <p>And by the way, San Francisco is awesome even in winter, kind of like a Swedish summer. I got myself a bike here too, a single-speed this time. I&#8217;ve always been a bit against the whole fixie/single-speed thing, but so far I like my single-speed. It is not a fixie at the moment, but it can be converted, so I guess I at least have to try that before judging it&nbsp;more.</p> <p>San Francisco is great to bike in, I would say better than in Stockholm, except maybe for the hills. The bike lanes are better and car drivers seem to be nicer and more&nbsp;careful.</p> Hyde, the static web generator Carl Johan Gustavsson 2013-12-27T00:00:00Z 2013-12-27T00:00:00Z /blog/2013/12/hyde.html <p>So, I needed a small Christmas project and a place to dump some random thoughts, and what would be a better place than a simple blog? When I say simple I mean static <span class="caps">HTML</span>, preferably served from Amazon&nbsp;S3.</p> <p>After taking a look at the Python-based options, I found <a href="http://blog.getpelican.com/">Pelican</a>, which seemed really nice, but I quickly noticed that it is <span class="caps">AGPL</span>-licensed. The second best alternative was <a href="https://github.com/hyde/hyde/">Hyde</a>, which is <span class="caps">MIT</span>-licensed. As liberal licensing appeals a lot to me, the choice was easy to&nbsp;make.</p> <p>The standard design template certainly needed a bit of work (and it is by no means done), but it is now in an acceptable&nbsp;state.</p> <p>For publishing to S3, I didn&#8217;t like the complexity of the S3 addon publishers that were available, so I decided to use s3cmd instead, making it a one-line&nbsp;deploy:</p> <pre><code>s3cmd sync deploy/ s3://&lt;bucket-name&gt;/
</code></pre> <p>And that&#8217;s it, very easy. Also, <code>s3cmd sync</code> uses the ETag/MD5 of the files, making sure that only files that have changed are uploaded on subsequent&nbsp;deploys.</p>