<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>Michael Bang&apos;s blog</title>
    <description></description>
    <link>https://blog.vbang.dk/</link>
    <atom:link href="https://blog.vbang.dk/feed.xml" rel="self" type="application/rss+xml"/>
    <pubDate>Sun, 29 Jun 2025 19:45:33 +0000</pubDate>
    <lastBuildDate>Sun, 29 Jun 2025 19:45:33 +0000</lastBuildDate>
    <generator>Jekyll v3.10.0</generator>

    
      <item>
        <title>Tools I love: mise(-en-place)</title>
        <description>&lt;p&gt;Once in a while you get introduced to a tool that instantly changes the way you work. For me, &lt;a href=&quot;https://github.com/jdx/mise&quot;&gt;mise&lt;/a&gt; is one of those tools.&lt;/p&gt;

&lt;p&gt;mise is the logical conclusion to a lot of the meta-tooling that exists around language-specific version and package managers like &lt;a href=&quot;https://asdf-vm.com/&quot;&gt;asdf&lt;/a&gt;, &lt;a href=&quot;https://github.com/nvm-sh/nvm&quot;&gt;nvm&lt;/a&gt;, &lt;a href=&quot;https://docs.astral.sh/uv/&quot;&gt;uv&lt;/a&gt;, &lt;a href=&quot;https://github.com/pyenv/pyenv&quot;&gt;pyenv&lt;/a&gt; etc. It makes it exceptionally easy to install, use, and manage software. It also allows you to manage &lt;a href=&quot;https://mise.jdx.dev/environments/&quot;&gt;environment variables&lt;/a&gt; and &lt;a href=&quot;https://mise.jdx.dev/tasks/&quot;&gt;declare tasks&lt;/a&gt; (run commands).&lt;/p&gt;

&lt;h1 id=&quot;trying-out-new-tools&quot;&gt;Trying out new tools&lt;/h1&gt;
&lt;p&gt;The first step in getting an intuitive understanding of what mise can help you with is to use it to install a tool. Pick your favorite and try it out; it supports &lt;a href=&quot;https://mise.jdx.dev/registry.html&quot;&gt;&lt;em&gt;a lot&lt;/em&gt;&lt;/a&gt;!&lt;/p&gt;

&lt;p&gt;I recently read about &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;jj&lt;/code&gt; in &lt;a href=&quot;https://registerspill.thorstenball.com/&quot;&gt;Thorsten Ball’s newsletter&lt;/a&gt; and decided to try it out (again). I crossed my fingers and hoped that &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;jj&lt;/code&gt; was &lt;a href=&quot;https://mise.jdx.dev/registry.html&quot;&gt;one of the tools supported by mise&lt;/a&gt; and, to my delight, it was! The process looked something like this:&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;jj
command_not_found_handler:5: &lt;span class=&quot;nb&quot;&gt;command &lt;/span&gt;not found: jj

&lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;mise use jj
mise ~/projects/examples_mise/mise.toml tools: jj@0.30.0

&lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;jj version
jj 0.30.0

&lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;cd&lt;/span&gt; ..

&lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;jj version
command_not_found_handler:5: &lt;span class=&quot;nb&quot;&gt;command &lt;/span&gt;not found: jj

&lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;cd &lt;/span&gt;examples_mise

&lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;jj version
jj 0.30.0
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;As the above shows, with mise we’re just one command away from installing and trying out a new tool, e.g. &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;mise use jj&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;In the above we see that mise printed &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;mise ~/projects/examples_mise/mise.toml tools: jj@0.30.0&lt;/code&gt;. This tells us that mise has created (or updated) the mise configuration &lt;em&gt;at that path&lt;/em&gt;. &lt;br /&gt;
We also see that if we cd out of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;~/projects/examples_mise&lt;/code&gt;, the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;jj&lt;/code&gt; command is no longer available. If we cd back into &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;~/projects/examples_mise&lt;/code&gt;, it becomes available again: unless you explicitly install tools globally, mise only makes available the tools mentioned in a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;mise.toml&lt;/code&gt; file somewhere on the path from your current directory up to the root of your file system. That of course means that we may encounter multiple &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;mise.toml&lt;/code&gt; files on the way up to the root. mise handles this by merging the configurations, resolving conflicting entries in favor of the file furthest down the tree.&lt;/p&gt;
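&lt;p&gt;To make the layering concrete, here’s a small hypothetical setup (paths and versions invented for illustration) where a parent directory pins one Python version and a subdirectory overrides it:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# ~/projects/mise.toml
[tools]
python = &quot;3.8&quot;

# ~/projects/examples_mise/mise.toml
[tools]
python = &quot;3.11&quot;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Inside &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;~/projects/examples_mise&lt;/code&gt; both files apply, but the deeper one wins, so Python 3.11 is activated; anywhere else under &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;~/projects&lt;/code&gt;, Python 3.8 applies.&lt;/p&gt;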

&lt;p&gt;This is a clever design as it allows us to configure different versions of the same tool to be available in different directories. Let’s have a look at what the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;mise.toml&lt;/code&gt; file looks like:&lt;/p&gt;

&lt;div class=&quot;language-toml highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;nn&quot;&gt;[tools]&lt;/span&gt;
&lt;span class=&quot;py&quot;&gt;jj&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;latest&quot;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;If we want a specific version of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;jj&lt;/code&gt; to be installed in a specific directory, we just update the TOML file to say e.g. &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;jj = &quot;0.30.0&quot;&lt;/code&gt;.&lt;/p&gt;

&lt;h1 id=&quot;managing-multiple-versions-of-a-tool&quot;&gt;Managing multiple versions of a tool&lt;/h1&gt;

&lt;p&gt;Let’s see what it looks like to use mise to manage Python versions for two projects with different requirements:&lt;/p&gt;

&lt;script src=&quot;https://asciinema.org/a/hLKhxRzzoDwHOJyBhkNsVl3pL.js&quot; id=&quot;asciicast-hLKhxRzzoDwHOJyBhkNsVl3pL&quot; async=&quot;true&quot;&gt;&lt;/script&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;tree
&lt;span class=&quot;nb&quot;&gt;.&lt;/span&gt;
├── project_new
│	└── mise.toml
└── project_old
    └── mise.toml

&lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;cat &lt;/span&gt;project_new/mise.toml
&lt;span class=&quot;o&quot;&gt;[&lt;/span&gt;tools]
python &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;3.11&quot;&lt;/span&gt;

&lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;cat &lt;/span&gt;project_old/mise.toml
&lt;span class=&quot;o&quot;&gt;[&lt;/span&gt;tools]
python &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;3.8&quot;&lt;/span&gt;

&lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;cd &lt;/span&gt;project_new
&lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;python &lt;span class=&quot;nt&quot;&gt;--version&lt;/span&gt;
Python 3.11.13

&lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;cd&lt;/span&gt; ../project_old
&lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;python &lt;span class=&quot;nt&quot;&gt;--version&lt;/span&gt;
Python 3.8.20
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;When we cd into one of the directories listed above, mise automatically makes the version of the tool configured in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;mise.toml&lt;/code&gt; available to us. If it isn’t already installed, mise will install it for us. The implication of this is that you can commit a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;mise.toml&lt;/code&gt; to your repository, and anyone that has mise installed will automatically get and use the expected dev tools when they enter the project directory. And when it’s time to upgrade a dev tool, you can just update the version number in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;mise.toml&lt;/code&gt; and everyone will start using the new version!&lt;/p&gt;

&lt;h1 id=&quot;use-in-cicd-pipelines&quot;&gt;Use in CI/CD pipelines&lt;/h1&gt;

&lt;p&gt;The fact that mise makes tools available to you according to the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;mise.toml&lt;/code&gt; file in your current working directory has further implications: it’s not just developer machines that can benefit from using mise; CI/CD pipelines can benefit greatly as well! When you use mise in your pipelines, you avoid the problem of out-of-sync versions between developer and build machines. You get a single place to configure the versions of your dev tools, everywhere!&lt;/p&gt;
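&lt;p&gt;As a rough sketch of what this might look like (the install one-liner is the script documented on the mise website; adapt the steps to your CI provider, and assume &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;python&lt;/code&gt; is one of the tools pinned in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;mise.toml&lt;/code&gt;), a pipeline could boil down to:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# install mise itself
curl https://mise.run | sh

# install all tools pinned in mise.toml
mise install

# run commands with exactly the tool versions developers use
mise exec -- python --version
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;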

&lt;p&gt;As I mentioned in the beginning, besides managing dev tools, mise also allows you to &lt;a href=&quot;https://mise.jdx.dev/tasks/toml-tasks.html&quot;&gt;declare and run so-called tasks&lt;/a&gt;. Think of a task as an advanced invocation of a bash script. Even if we use tasks as just plain bash scripts (they can do a lot more), it can be a major advantage to declare common operations such as building, testing, linting etc. as mise tasks, since all developers get access to them and will run their commands in exactly the same way every time. If you’re diligent in your naming, you can even make the experience of building or testing across projects identical.&lt;/p&gt;

&lt;p&gt;The following are examples of some very simple Python-related tasks declared in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;mise.toml&lt;/code&gt;:&lt;/p&gt;

&lt;div class=&quot;language-toml highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;nn&quot;&gt;[tasks.install-deps]&lt;/span&gt;
&lt;span class=&quot;py&quot;&gt;run&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;uv pip install -r requirements.txt&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;

&lt;span class=&quot;nn&quot;&gt;[tasks.test]&lt;/span&gt;
&lt;span class=&quot;py&quot;&gt;run&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;pytest .&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Adding this to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;mise.toml&lt;/code&gt; will make the commands &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;mise install-deps&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;mise test&lt;/code&gt; available. Again, if you check this in to your repo, the commands will be available to all developers and pipelines. And reusing the same task names in your Rust projects means you can use identical commands to have Cargo fetch your dependencies or run your tests.&lt;/p&gt;
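&lt;p&gt;Tasks can also reference each other. As a small sketch using the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;depends&lt;/code&gt; key from the mise task documentation, we could make &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;test&lt;/code&gt; install dependencies before running:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;[tasks.install-deps]
run = [&quot;uv pip install -r requirements.txt&quot;]

[tasks.test]
depends = [&quot;install-deps&quot;]
run = [&quot;pytest .&quot;]
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;With this, running &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;mise test&lt;/code&gt; first runs &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;install-deps&lt;/code&gt; and then &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;pytest&lt;/code&gt;.&lt;/p&gt;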

&lt;p&gt;Once you’ve declared your tasks you should of course also use them in your CI/CD pipeline. Doing this makes you less dependent on the particular YAML syntax and arbitrary requirements of your provider, and makes it easier to move to another one if you need to. It also ensures that there’s a standard way to build and test your code, helping to further reduce the amount of “it works on my machine”.&lt;/p&gt;

&lt;p&gt;There’s a lot of depth to what mise can help you automate. It’s a lovely tool and I hope I’ve piqued your interest enough to give it a try!&lt;/p&gt;

&lt;h1 id=&quot;security-concerns&quot;&gt;Security concerns&lt;/h1&gt;

&lt;p&gt;Although this is a very obvious problem, I want to make it explicit: a major concern of all software dependency management is control of your supply chain; how easy it is for somebody to insert malicious code into a binary you will run hugely impacts the integrity of your systems and data. Depending on your industry, it might not be feasible to use mise, as it’s pretty opaque where your dependencies will be downloaded from.&lt;/p&gt;

&lt;blockquote class=&quot;twitter-tweet&quot;&gt;&lt;p lang=&quot;en&quot; dir=&quot;ltr&quot;&gt;I&apos;m hoping to find the time to write a series of posts over the summer on tools that I love. Here&apos;s the first one which I fell in love with just 3 months ago: mise&lt;br /&gt; &lt;a href=&quot;https://blog.vbang.dk/2025/06/29/tools-i-love-mise/&quot;&gt;https://blog.vbang.dk/2025/06/29/tools-i-love-mise/&lt;/a&gt;&lt;/p&gt;&amp;mdash; Michael Vinter Bang (@micvbang) &lt;a href=&quot;https://twitter.com/micvbang/status/1939384107823137162?ref_src=twsrc%5Etfw&quot;&gt;December 26, 2024&lt;/a&gt;&lt;/blockquote&gt;
&lt;script async=&quot;&quot; src=&quot;https://platform.twitter.com/widgets.js&quot; charset=&quot;utf-8&quot;&gt;&lt;/script&gt;

</description>
        <pubDate>Sun, 29 Jun 2025 17:07:10 +0000</pubDate>
        <link>https://blog.vbang.dk/2025/06/29/tools-i-love-mise/</link>
        <guid isPermaLink="true">https://blog.vbang.dk/2025/06/29/tools-i-love-mise/</guid>
      </item>
    
      <item>
        <title>On a great discussion</title>
        <description>&lt;p&gt;This isn’t for every problem, it doesn’t happen every day, and it isn’t achievable with just any group of people. But sometimes, under exactly the right circumstances, you might be lucky to find yourself in a discussion where everything and everyone seem to just.. click.&lt;/p&gt;

&lt;p&gt;Although the discussion starts out with many, drastically different points of view, the atmosphere is good and smooth and nice throughout. Everyone is respectful of each other and makes sure that there’s enough space for anybody who wants to make a point. Arguments are given and accepted in good faith. Participants iterate and build on each other’s ideas and will help to argue viewpoints different from, and even opposite to, the ones they started with. This isn’t done out of a feeling of fairness or to make someone feel good, but because someone sees an important point that strengthens another person’s argument. Similarly, everybody pitches in with reasons why some arguments are weaker than they otherwise seem. It doesn’t matter who brought up which argument; everybody understands that the point is to find the best possible solutions under the current constraints, and nobody keeps score.&lt;/p&gt;

&lt;p&gt;When everyone has contributed what they want and the discussion dies out, the path forward has hopefully become clearer. The group makes the best decision it can with the information available. Incentives are aligned just right: everyone is in the same boat and, win or lose, the responsibility, the outcome, and the rewards of the decision are shared within the group.&lt;/p&gt;

&lt;p&gt;I participated in one of these discussions today. I’m happy that I didn’t notice what was happening until it was almost over. I’m afraid that I would’ve sat in awe and ruined the moment and the flow of conversation by making everyone aware.&lt;/p&gt;

&lt;p&gt;When the meeting had ended and we were packing up, I did mention it. Everyone agreed that it had been a great experience. Today was a great day!&lt;/p&gt;
</description>
        <pubDate>Mon, 17 Feb 2025 18:07:10 +0000</pubDate>
        <link>https://blog.vbang.dk/2025/02/17/on-a-great-discussion/</link>
        <guid isPermaLink="true">https://blog.vbang.dk/2025/02/17/on-a-great-discussion/</guid>
      </item>
    
      <item>
        <title>Simple event broker: data serialization is expensive</title>
        <description>&lt;p&gt;In the &lt;a href=&quot;/2024/07/10/seb-tiger-style/&quot;&gt;last post&lt;/a&gt; I described my weekend project of using advice from &lt;a href=&quot;https://github.com/tigerbeetle/tigerbeetle/blob/main/docs/TIGER_STYLE.md&quot;&gt;Tiger Style&lt;/a&gt; to optimize the write path of &lt;a href=&quot;https://github.com/micvbang/simple-event-broker&quot;&gt;Seb&lt;/a&gt;.
Here, we found that data serialization and memory allocations were big contributors to the application being slower than it could be, and profiling helped us identify places on the write path where batching and buffer reuse could greatly improve the throughput. With a few small changes, we doubled the number of records that Seb can write to disk per second!&lt;/p&gt;

&lt;p&gt;In this post we’re going to use those learnings as a guide to do the same thing on the read path. In order for the posts not to be almost identical, this time we’ll focus on how seemingly minor changes to function signatures can have major impacts on performance.&lt;/p&gt;

&lt;h2 id=&quot;overview&quot;&gt;Overview&lt;/h2&gt;

&lt;p&gt;Since we already covered how to record performance profiles in the last post, we’ll skip it here. Instead we’ll go directly to a high-level picture of Seb’s read path, and then look at a profile of the code (at &lt;a href=&quot;https://github.com/micvbang/simple-event-broker/tree/19a5bde1f5359b2b1c556bb7df288273a6b416d8&quot;&gt;19a5bde1&lt;/a&gt;).&lt;/p&gt;

&lt;p&gt;A high-level overview of Seb’s read path:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/static/posts/2024-09-10-seb-read-performance/architecture_seb_read_path.png&quot; alt=&quot;High-level overview of Seb&apos;s read path&quot; /&gt;&lt;/p&gt;

&lt;p&gt;Here, we see that the read path starts with an incoming HTTP request which is handled by an &lt;a href=&quot;https://github.com/micvbang/simple-event-broker/blob/19a5bde1f5359b2b1c556bb7df288273a6b416d8/internal/httphandlers/getrecords.go#L22&quot;&gt;HTTP handler&lt;/a&gt; &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;(1)&lt;/code&gt; and sent to the &lt;a href=&quot;https://github.com/micvbang/simple-event-broker/blob/19a5bde1f5359b2b1c556bb7df288273a6b416d8/internal/sebbroker/broker.go#L26&quot;&gt;Broker&lt;/a&gt; &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;(2)&lt;/code&gt;. The Broker ensures that a relevant instance of &lt;a href=&quot;https://github.com/micvbang/simple-event-broker/blob/19a5bde1f5359b2b1c556bb7df288273a6b416d8/internal/sebtopic/topic.go#L38&quot;&gt;Topic&lt;/a&gt; exists and hands it the request &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;(3)&lt;/code&gt;. The Topic then checks to see if the requested records are available in the locally cached batches &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;(3.1)&lt;/code&gt;, fetching any missing batches from S3 &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;(3.2)&lt;/code&gt; and caching them on disk. The Topic then finally uses the &lt;a href=&quot;https://github.com/micvbang/simple-event-broker/blob/19a5bde1f5359b2b1c556bb7df288273a6b416d8/internal/sebrecords/records.go#L81&quot;&gt;Parser&lt;/a&gt; to extract the requested records &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;(4)&lt;/code&gt;, which might span one or more files in the cache. Finally it sends the retrieved records all the way back up the stack, where the result is turned back into an HTTP response and sent back over the network to the caller.&lt;/p&gt;

&lt;p&gt;It’s important to mention here that, just as on the write path, the HTTP response is encoded using multipart form-data with one part per record. As was evident when we looked at the write path, this is highly inefficient. To give you an intuition of what multipart form-data looks like, here’s an example HTTP request:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;POST /records HTTP/1.1
Host: localhost:51313
Content-Type: multipart/form-data;boundary=&quot;boundary&quot;

--boundary
Content-Disposition: form-data; name=&quot;0&quot;

record-0-data
--boundary
Content-Disposition: form-data; name=&quot;1&quot;

record-1-data
--boundary--
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;h2 id=&quot;profiling&quot;&gt;Profiling&lt;/h2&gt;

&lt;p&gt;Like we did in the last post, we’ll use Go’s excellent profiling tools to identify where Seb is spending its time on the CPU. In order to do this, we need to put some load on the system. The first task of this project therefore was to implement a simple read benchmark that is easy to run. I won’t go into details of the implementation here, but I will note that having a tool to generate reliable, consistent load on your system makes performance optimizations &lt;em&gt;so&lt;/em&gt; much easier to do, and gives us much better odds of making actual improvements. I highly recommend investing the time in building a tool like this for your next project!&lt;/p&gt;

&lt;p&gt;While using the read benchmark to put some load on the system, I recorded a profile of Seb which resulted in the following flame graph:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/static/posts/2024-09-10-seb-read-performance/profiling-mime-multipart-slow.png&quot; alt=&quot;Profiling Seb, retrieving records, before&quot; /&gt;&lt;/p&gt;

&lt;p&gt;I’ve highlighted multipart form-data formatting-related code using red boxes, and memory-related operations (allocations, copying, and garbage collection) using black boxes. We saw exactly this behavior on the write path in the last post as well, so if you read that one this result should come as no surprise. What we’re seeing is that we’re spending loads of time writing all of the records according to the multipart form-data format, generating a lot of garbage while doing so.&lt;/p&gt;

&lt;p&gt;Looking at the left-most red box on the flame graph, we see that most of its time is spent in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Syscall6&lt;/code&gt;. Going a bit up the stack, we see that this originates from &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;net.(*conn).Write&lt;/code&gt;, i.e. writing bytes to a network socket. We want to get a response to our callers, so this work looks productive and isn’t something we’re trying to eliminate.&lt;/p&gt;

&lt;p&gt;Looking at the right-most red box, we see that &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;multipart.(*Writer).CreateFormField&lt;/code&gt; spends a lot of time serializing our HTTP payload using &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;fmt.Fprintf&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;fmt.Sprintf&lt;/code&gt;, both of which cause a lot of allocations and create tons of work for the garbage collector.&lt;/p&gt;

&lt;p&gt;Lastly, looking at the black boxes in the middle of the flame graph, we see that &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;sebtopic.(*Topic).ReadRecords&lt;/code&gt; spends a &lt;em&gt;lot&lt;/em&gt; of time allocating and copying bytes around. If you look carefully, you can see that &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;(*Parser).Records&lt;/code&gt; &lt;em&gt;does disk IO&lt;/em&gt;. And, oh my, although disk IOs are among the most expensive operations we can do, they take up only ~25% of the time spent in that function!&lt;/p&gt;

&lt;p&gt;Now that we have a better understanding of where Seb is spending its precious time on the CPU, we can focus on how to improve it.&lt;/p&gt;

&lt;h2 id=&quot;reflecting&quot;&gt;Reflecting&lt;/h2&gt;

&lt;p&gt;Like we learned in the previous post, data serialization has a major impact on performance. It not only takes time to translate data between formats, it also requires us to allocate and copy bytes between buffers, creating a lot of garbage that has to be cleaned up.&lt;/p&gt;

&lt;p&gt;In the previous post we worked backwards from Seb’s internal on-disk format and redefined the user-facing API to use the same format, thereby avoiding almost all of the serialization-related work we’re now seeing on the read path. Instead of encoding one multipart form-data field per record, we can serialize the payload as one buffer containing all record data plus one list containing the length of each record in that buffer, avoiding a lot of work.&lt;/p&gt;
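&lt;p&gt;As a tiny worked example of that format (record contents borrowed from the multipart example earlier; lengths are in bytes), two records would be encoded as:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;records: &quot;record-0-data&quot;, &quot;record-1-data&quot;

sizes = [13, 13]
data  = &quot;record-0-datarecord-1-data&quot;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Extracting record &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;i&lt;/code&gt; is then just a slice of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;data&lt;/code&gt; at the offset given by summing the first &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;i&lt;/code&gt; sizes; no per-record copying or allocation is required.&lt;/p&gt;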

&lt;p&gt;I’ve visualized the difference between the two formats below:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/static/posts/2024-09-10-seb-read-performance/multipart-form-encoding-to-pointers-and-raw.png&quot; alt=&quot;Payload serialization, multipart form-data vs raw data and lengths&quot; /&gt;&lt;/p&gt;

&lt;p&gt;Looking at the flame graph again, what would it look like if we removed all of the serialization and unproductive allocations that we currently see?&lt;/p&gt;

&lt;p&gt;Assuming that we don’t have to restructure the data but can hand the caller the raw bytes, we can read them from disk and pass them up the stack. This should remove all of the unproductive allocations we saw.&lt;/p&gt;

&lt;p&gt;Since the format shown above only requires us to create two form fields instead of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;N&lt;/code&gt; (one per record), we would also expect the time spent in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;CreateFormField&lt;/code&gt; to almost go away.&lt;/p&gt;

&lt;p&gt;I’ve visualized what these changes might look like, with blue boxes representing avoidable work:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/static/posts/2024-09-10-seb-read-performance/profiling-mime-multipart-slow-parts-avoid.png&quot; alt=&quot;Profiling Seb, retrieving records, work to avoid&quot; /&gt;&lt;/p&gt;

&lt;p&gt;When we disregard the contents of the blue boxes in the above flame graph, we see that we’re almost left with only the essential (and most expensive!) operations: reading from disk and writing to the network.&lt;/p&gt;

&lt;p&gt;This is all well and good in theory, but how do we achieve this in code?&lt;/p&gt;

&lt;h2 id=&quot;fixing&quot;&gt;Fixing&lt;/h2&gt;
&lt;p&gt;Although the specific implementation changes could be interesting to look at, we’ll continue using only the high-level information we already know. I want to highlight that the speedups we’re going to see don’t have as much to do with the exact implementation as with the structure: the flow of data. Both of course play a role, but I think the most important learnings in this case come from focusing on just the structure.&lt;/p&gt;

&lt;p&gt;If you’re interested in digging into implementation details, I suggest you look at the source: &lt;a href=&quot;https://github.com/micvbang/simple-event-broker/tree/19a5bde1f5359b2b1c556bb7df288273a6b416d8&quot;&gt;this is where we start&lt;/a&gt;, &lt;a href=&quot;https://github.com/micvbang/simple-event-broker/tree/d0e3cd56e97e43d68d9df74bc47424a4572cb176&quot;&gt;this is where we end&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;In the first diagram of this post, we saw the functions that make up the read path. Here, we see it again, this time with function signatures:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;type Batch struct {
	Sizes []uint32
	Data  []byte
}

func (s *Broker) GetRecords(ctx context.Context, topicName string, offset uint64, maxRecords int, softMaxBytes int) ([][]byte, error)

func (s *Topic) ReadRecords(ctx context.Context, offset uint64, maxRecords int, softMaxBytes int) (sebrecords.Batch, error)

func (rb *Parser) Records(recordIndexStart uint32, recordIndexEnd uint32) (Batch, error)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;At the bottom of the read path, we see that &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Parser.Records()&lt;/code&gt; returns a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Batch&lt;/code&gt;. Since this is at the bottom of the call hierarchy, the returned &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Batch&lt;/code&gt;es must be allocated within &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Parser.Records()&lt;/code&gt; itself. From the description at the beginning of the post, we know that &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Topic.ReadRecords()&lt;/code&gt; will call &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Parser.Records()&lt;/code&gt; once per file that we need to read. This means that, with the current function signature, we will see at least one allocation per file read. Depending on the number of records requested, this could cause many allocations.&lt;/p&gt;

&lt;p&gt;We are looking to eliminate unproductive allocations, so how do we avoid the current requirement that &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Parser.Records()&lt;/code&gt; must allocate a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Batch&lt;/code&gt; per call? By giving &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;*Batch&lt;/code&gt; as an argument instead of requiring it as a return value:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;func (rb *Parser) Records(batch *Batch, recordIndexStart uint32, recordIndexEnd uint32) error
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The small change we just made to the signature has a very important impact: we moved the responsibility of allocating &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Batch&lt;/code&gt; one level up the stack, from &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Parser.Records()&lt;/code&gt; to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Topic.ReadRecords()&lt;/code&gt;. We can of course do this same trick all the way up the stack, which changes all signatures to the following:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;func (s *Broker) GetRecords(ctx context.Context, batch *sebrecords.Batch, topicName string, offset uint64, maxRecords int, softMaxBytes int) error

func (s *Topic) ReadRecords(ctx context.Context, batch *sebrecords.Batch, offset uint64, maxRecords int, softMaxBytes int) error

func (rb *Parser) Records(batch *Batch, recordIndexStart uint32, recordIndexEnd uint32) error
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;This minor change has moved the responsibility of allocating &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Batch&lt;/code&gt;es from the bottom of the stack to the top. It’s now the responsibility of the code that calls &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Broker.GetRecords()&lt;/code&gt; (in our case an HTTP handler) to provide a pre-allocated batch to be used for each request. As long as the given &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;*Batch&lt;/code&gt; is large enough to satisfy the request, we can now guarantee &lt;em&gt;at most one&lt;/em&gt; &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Batch&lt;/code&gt; allocation per request, regardless of how many files we need to read data from. And, with allocations being made at the top of the call stack, it’s possible to reuse buffers across requests, leading to even fewer allocations.&lt;/p&gt;

&lt;p&gt;To show you what this could look like from the caller’s perspective, here’s a simplified version of the HTTP handler:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;type RecordsGetter interface {
	GetRecords(ctx context.Context, batch *sebrecords.Batch, topicName string, offset uint64, maxRecords int, softMaxBytes int) error
}

func GetRecords(log logger.Logger, batchPool *syncy.Pool[*sebrecords.Batch], rg RecordsGetter) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		// do http stuff

		batch := batchPool.Get()
		defer batchPool.Put(batch)

		err := rg.GetRecords(r.Context(), batch, topicName, offset, maxRecords, softMaxBytes)
		if err != nil {
			// handle various errors
		}

		err = httphelpers.RecordsToMultipartFormDataHTTP(mw, batch.Sizes, batch.Data)
		if err != nil {
			// handle various errors
		}
	}
}

&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Since the write path already uses the same structure, these changes also allow us to share the pool of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Batch&lt;/code&gt;es between the read- and write paths!&lt;/p&gt;

&lt;p&gt;Additionally, since Seb &lt;a href=&quot;https://github.com/micvbang/simple-event-broker/blob/master/cmd/seb/app/serve.go#L105&quot;&gt;limits how many HTTP requests it wants to handle in parallel&lt;/a&gt;, an extra benefit is that it’s now possible to allocate all buffers that the program needs at startup! This of course comes with some drawbacks, e.g. it puts hard limits on the size of payloads, but it also comes with some superhero-like benefits: with all buffers allocated at startup, we can now determine &lt;em&gt;at deployment time&lt;/em&gt; how much memory the application will use&lt;sup id=&quot;fnref:0&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:0&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;1&lt;/a&gt;&lt;/sup&gt;. If the application starts at deployment, we can be confident that &lt;em&gt;it cannot go out-of-memory!&lt;/em&gt; This sounds surreal and is an absolute superpower when doing server planning and provisioning. This one took a few days to sink in for me, but once I realized the power of it, I couldn’t stop thinking about it. Why aren’t we aiming to build our systems like this?&lt;/p&gt;

&lt;p&gt;Alright. With the above changes implemented, it’s time to put some pressure on the system and record another profile. The new recording resulted in the following flame graph:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/static/posts/2024-09-10-seb-read-performance/profiling-mime-multipart-fast.png&quot; alt=&quot;Profiling Seb, retrieving records, after&quot; /&gt;&lt;/p&gt;

&lt;p&gt;Oh my, this is even better than I dared hope for! We’ve eliminated basically all of the serialization and garbage collection overhead, even removing a large &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;memmove&lt;/code&gt; in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;multipart.(*part).Write&lt;/code&gt; that I wasn’t expecting to get rid of.&lt;/p&gt;

&lt;p&gt;On the new flame graph we see that we’re almost literally down to spending time only in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Syscall6&lt;/code&gt;. Clicking around, I can tell you that the flame graph reports that &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Syscall6&lt;/code&gt; now takes up 91.9% of the total runtime! Approximately half of it is for reading from disk, and the other half is for writing to the network.&lt;/p&gt;

&lt;p&gt;With these very promising changes it’s time to benchmark.&lt;/p&gt;

&lt;h2 id=&quot;benchmarking-digression&quot;&gt;Benchmarking digression&lt;/h2&gt;

&lt;p&gt;Before jumping to benchmarking, I want to digress slightly and note something I’ve learned the hard way (many times by now, so maybe I never really learned it…).&lt;/p&gt;

&lt;p&gt;When benchmarking you should &lt;em&gt;always&lt;/em&gt; record and safely store your benchmark parameters. And, importantly, &lt;em&gt;include the version of the code that was used!&lt;/em&gt; This lets you know exactly which code and configuration gave you the results you’re looking at. This is incredibly valuable when you inevitably make more changes to the code than you expected, as it allows you to understand how (or even if) you can sensibly compare different runs of the benchmarks. If you fail to do this, you’re destined to have to re-run all of your benchmarks &lt;em&gt;just this last time&lt;/em&gt; (for the 7th time.) The best strategy I found for remembering to do this is to just dump the benchmark’s parameters along with the results. The parameters are honestly just as important and valuable as the results themselves!&lt;/p&gt;

&lt;h2 id=&quot;benchmarking&quot;&gt;Benchmarking&lt;/h2&gt;

&lt;p&gt;The benchmarks for this post were run on my laptop, a Lenovo T14, plugged into the wall, with the following specs:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;AMD Ryzen 7 PRO 4750U&lt;/li&gt;
  &lt;li&gt;Micron MTFDHBA512TDV 512GB NVMe drive&lt;/li&gt;
  &lt;li&gt;48 gigs of RAM&lt;/li&gt;
  &lt;li&gt;Ubuntu 22.04&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;We’re doing no network requests (all files are cached locally), so the NIC should be irrelevant. Also, since we’re doing buffered IO on 1GiB of records, we expect reads to be mostly served from the page cache.&lt;/p&gt;

&lt;p&gt;The benchmarks were run with the following command:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;seb benchmark-read --local-broker=true -r 5 -w 16 --batches=4096 --record-size=1024 --records-per-batch=256 --records-per-request=1024 --requests 20000
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;This command runs 5 repetitions of a job that utilizes 16 workers to send a total of 20,000 requests. Each request asks for 1024 records (1KiB each, so 1MiB/request), for a total of ~19.5GiB requested. The starting offset for each request is selected uniformly at random from a set of pre-inserted and cached records. The on-disk batch size is 256 records/file, so each request will have to open and read 4 or 5 different files.&lt;/p&gt;

&lt;p&gt;And, as summarized by the benchmark tool:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;Benchmark config:
Num workers:            16
Num requests:           20000                                 
Records/request:        1024                                 
Record size:            1KiB (1024B)                                 
Bytes/request:          1MiB (1048576B)
Total bytes requested:  19.5GiB (20971520000B)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Note: this workload doesn’t really replicate a production scenario where we would probably expect something like a Poisson distribution heavily skewed towards the most recent records. Also, we’re not looking to understand the absolute performance of Seb here but are just looking for the relative impact of our changes.&lt;/p&gt;

&lt;p&gt;Without further ado, the results of the benchmarks:&lt;/p&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th style=&quot;text-align: left&quot;&gt;code&lt;/th&gt;
      &lt;th style=&quot;text-align: center&quot;&gt;seconds/run&lt;/th&gt;
      &lt;th style=&quot;text-align: right&quot;&gt;records/second&lt;/th&gt;
      &lt;th style=&quot;text-align: right&quot;&gt;improvement&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;&lt;a href=&quot;https://github.com/micvbang/simple-event-broker/tree/19a5bde1f5359b2b1c556bb7df288273a6b416d8&quot;&gt;reference&lt;/a&gt;&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;35.82 / 35.32 / 37.21&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;572k / 550k / 580k&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;-&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;&lt;a href=&quot;https://github.com/micvbang/simple-event-broker/tree/d0e3cd56e97e43d68d9df74bc47424a4572cb176&quot;&gt;update&lt;/a&gt;&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;9.76 / 9.50 / 10.30&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;2099k / 1987k / 2154k&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;3.67x&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;Whoop, a 3.67x improvement; we can now run the same workload in about 1/4 of the time!&lt;/p&gt;

&lt;p&gt;For the second time we’re learning that data serialization and unnecessary memory operations have a &lt;em&gt;major&lt;/em&gt; impact on performance. By changing the user-facing interface to match the format that Seb wants the data to be in internally, we’ve removed a lot of work and with it a lot of allocations and memcopying. By using simple tools and comparatively small refactorings, we’re seeing a &lt;em&gt;massive&lt;/em&gt; 3.67x payoff in performance. Awesome!&lt;/p&gt;

&lt;p&gt;Yet again I’ll end my post by tipping my hat and giving a big THANK YOU to &lt;a href=&quot;https://x.com/jorandirkgreef&quot;&gt;Joran Dirk Greef&lt;/a&gt; at TigerBeetle and &lt;a href=&quot;https://x.com/DominikTornow&quot;&gt;Dominik Tornow&lt;/a&gt; at Resonate for sharing all of their knowledge and helping to light a fire in the systems software community!&lt;/p&gt;

&lt;h2 id=&quot;footnotes&quot;&gt;Footnotes&lt;/h2&gt;

&lt;div class=&quot;footnotes&quot; role=&quot;doc-endnotes&quot;&gt;
  &lt;ol&gt;
    &lt;li id=&quot;fn:0&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;this isn’t entirely accurate; I haven’t eliminated all allocations from Seb yet. But the &lt;em&gt;vast&lt;/em&gt; majority of memory used &lt;em&gt;is&lt;/em&gt; coming from these buffers, so the overall point is still valid. &lt;a href=&quot;#fnref:0&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
  &lt;/ol&gt;
&lt;/div&gt;
</description>
        <pubDate>Tue, 10 Sep 2024 13:25:00 +0000</pubDate>
        <link>https://blog.vbang.dk/2024/09/10/seb-tiger-style-read-path/</link>
        <guid isPermaLink="true">https://blog.vbang.dk/2024/09/10/seb-tiger-style-read-path/</guid>
      </item>
    
      <item>
        <title>Simple event broker tries Tiger Style</title>
        <description>&lt;p&gt;I’ve been on a bender for the past few weeks. I haven’t been able to stop reading and watching content about &lt;a href=&quot;https://tigerbeetle.com/&quot;&gt;TigerBeetle&lt;/a&gt;. I was especially enamored by videos in which &lt;a href=&quot;https://x.com/jorandirkgreef&quot;&gt;Joran Dirk Greef&lt;/a&gt; presents &lt;a href=&quot;https://www.youtube.com/watch?v=_jfOk4L7CiY&quot;&gt;TigerBeetle in general&lt;/a&gt;, &lt;a href=&quot;https://www.youtube.com/watch?v=Wii1LX_ltIs&quot;&gt;replication&lt;/a&gt;, and &lt;a href=&quot;https://www.youtube.com/watch?v=w3WYdYyjek4&quot;&gt;Tiger Style&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Joran has been far and wide the past years, doing all he can to spread the message of TigerBeetle and Tiger Style. Lucky for us, this has left a trail of insightful content in his wake!&lt;/p&gt;

&lt;p&gt;My time in the virtual company of Joran has inspired me to try TigerBeetle’s coding style, &lt;a href=&quot;https://github.com/tigerbeetle/tigerbeetle/blob/main/docs/TIGER_STYLE.md&quot;&gt;Tiger Style&lt;/a&gt;. Since I’m already working on &lt;a href=&quot;https://blog.vbang.dk/2024/05/26/seb/&quot;&gt;Seb&lt;/a&gt;, my event broker which I want to be fast and keep my data safe, I thought this would be a good place to try it out.&lt;/p&gt;

&lt;p&gt;With inspiration from Joran and Tiger Style, my past weekend’s project was to improve the write path of Seb. My goal was simple: write more records per second while maintaining correctness (duh!)&lt;/p&gt;

&lt;h1 id=&quot;tiger-style&quot;&gt;Tiger Style&lt;/h1&gt;
&lt;p&gt;The &lt;a href=&quot;https://github.com/tigerbeetle/tigerbeetle/blob/main/docs/TIGER_STYLE.md#performance&quot;&gt;parts of Tiger Style&lt;/a&gt; that most inspired this weekend project were:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;Perform back-of-the-envelope sketches with respect to the four resources (network, disk, memory, CPU) and their two main characteristics (bandwidth, latency). Sketches are cheap. Use sketches to be “roughly right” and land within 90% of the global maximum.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;blockquote&gt;
  &lt;p&gt;Amortize network, disk, memory and CPU costs by batching accesses.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;These were particularly intriguing to me since, in the first implementation of Seb, records could only be added and retrieved one-by-one. This was a fundamental, architectural problem that had to change in order for the event broker to have any reasonable hope of not remaining the slowest kid in class forever. In my first post, &lt;a href=&quot;https://blog.vbang.dk/2024/05/26/seb/&quot;&gt;Hello World, Simple Event Broker&lt;/a&gt;, I showed that my first naive batching implementation gave an easy 2x improvement in the number of records handled per second, going from ~22k to ~50k. This was obviously a welcome improvement, but honestly not very impressive.&lt;/p&gt;

&lt;p&gt;I’ve been focusing more on correctness than performance while building Seb so far, so I haven’t really taken the opportunity to do any profiling. Until now!&lt;/p&gt;

&lt;h1 id=&quot;profiling&quot;&gt;Profiling&lt;/h1&gt;

&lt;p&gt;It has taken me much longer to learn this than is reasonable, but I now finally know, and act as if I know!, that the very first thing you &lt;em&gt;must&lt;/em&gt; do when trying to make your program faster is to measure it and be &lt;strong&gt;very systematic&lt;/strong&gt; about your measurements.
Yes, I &lt;em&gt;know&lt;/em&gt; it is much more fun to guess at the problem and try out random solutions, crossing your fingers in the hope that one of your guesses magically makes things go brrr. But if you plan to make progress instead of trying your luck all day, going straight to some sort of profiling is the winning move. Every. Single. Time. Even if you’re just printf’ing timestamps; &lt;strong&gt;you must measure&lt;/strong&gt;!&lt;/p&gt;

&lt;p&gt;Luckily, Go has some excellent tooling for profiling which makes the decision to stop spinning the roulette that much easier. It’s almost trivial to instrument a Go program to be profiled: just &lt;a href=&quot;https://github.com/micvbang/simple-event-broker/blob/master/internal/infrastructure/httphelpers/pprof.go&quot;&gt;start an HTTP server&lt;/a&gt; on an unused port (on localhost!) and request a CPU profile from it:&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;curl http://localhost:5000/debug/pprof/profile?seconds&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;10 &lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; profile
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Once you’ve got your profile, you can view it using:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;go tool pprof --http &quot;:8001&quot; profile
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;This should open up a browser with an interactive view of the profile you just made. If you haven’t done this before, try it out on one of your programs. As the following will show you, you might be surprised by what you find!&lt;/p&gt;

&lt;p&gt;Alright, on to Seb. On Saturday morning I fired up Seb, ran a workload with a bunch of inserts and requested a CPU profile.&lt;/p&gt;

&lt;p&gt;With the profile in hand, I opened the interactive web view and jumped directly to the flame graph. If you haven’t seen one of these before, check out &lt;a href=&quot;https://www.brendangregg.com/FlameGraphs/cpuflamegraphs.html#Description&quot;&gt;this explanation&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The graph I got was this (sorry - open the screenshot in a new tab, it doesn’t show in a readable size on my blog and I’m an idiot with CSS):&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/static/posts/2024-07-07-seb-write-performance/profiling-mime-multipart-slow.png&quot; alt=&quot;Profiling Seb, adding records, before&quot; /&gt;&lt;/p&gt;

&lt;p&gt;The red box I put on there outlines the HTTP handler &lt;a href=&quot;https://github.com/micvbang/simple-event-broker/blob/76ee8661d98e6988448d88f543f38e304edb92ae/internal/httphandlers/addrecords.go#L25&quot;&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;httphandlers.AddRecords()&lt;/code&gt;&lt;/a&gt;, which takes up almost 50%(!) of the time shown on the graph. &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;AddRecords()&lt;/code&gt;’s job is to parse incoming HTTP requests, pass them to the Broker, and send an HTTP response to the caller. Admittedly I was surprised to see that Seb is spending around half of the time on its write path parsing multipart data and, in the process, generating heaps of garbage that has to be cleaned up again.&lt;/p&gt;

&lt;p&gt;The green box on the screenshot outlines &lt;a href=&quot;https://github.com/micvbang/simple-event-broker/blob/76ee8661d98e6988448d88f543f38e304edb92ae/internal/sebrecords/records.go#L45&quot;&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;sebrecords.Write()&lt;/code&gt;&lt;/a&gt; which is responsible for writing data to the underlying storage.&lt;/p&gt;

&lt;p&gt;The black boxes outline runtime memory operations: allocations, memcopies, and garbage collection. This is a large part of the time spent!&lt;/p&gt;

&lt;p&gt;The flame graph basically tells us that Seb is creating a lot of garbage. Like, a lot. Unlike in real life where making a mess can be quite fun, on the computer it’s doubly bad: it’s expensive to clean up &lt;em&gt;and&lt;/em&gt; it’s expensive to make a mess in the first place. And, to make matters even worse, using all of this memory completely ruins the effectiveness of our hardware caches. Ugh!&lt;/p&gt;

&lt;p&gt;Taking another look at Tiger Style, we see that it has more relevant advice:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;All memory must be statically allocated at startup. No memory may be dynamically allocated (or freed and reallocated) after initialization. This avoids unpredictable behavior that can significantly affect performance, and avoids use-after-free. As a second-order effect, it is our experience that this also makes for more efficient, simpler designs that are more performant and easier to maintain and reason about, compared to designs that do not consider all possible memory usage patterns upfront as part of the design.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;I have never attempted to implement a system of this size that statically allocates everything, but I appreciate that it must be a major effort to do so. I’m absolutely certain that I won’t remove all allocations from Seb’s write path in this small weekend project, but in terms of performance and safety it seems like great advice. Let’s see how far we get.&lt;/p&gt;

&lt;p&gt;Using a stretchy interpretation of the Tiger Style advice of back-of-the-envelope sketching (which is supposed to be done &lt;em&gt;before&lt;/em&gt; you actually write your code), let’s have a high-level look at the implementation of the two functions highlighted by the flame graph. Our aim is to find code that puts pressure on the garbage collector.&lt;/p&gt;

&lt;h1 id=&quot;investigating&quot;&gt;Investigating&lt;/h1&gt;

&lt;p&gt;Since &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;AddRecords()&lt;/code&gt; is taking up most of the time, we’ll focus on that first. I’ve listed the most relevant code below. The full function is available &lt;a href=&quot;https://github.com/micvbang/simple-event-broker/blob/76ee8661d98e6988448d88f543f38e304edb92ae/internal/httphandlers/addrecords.go#L25&quot;&gt;here&lt;/a&gt; if you’re curious. Since the flame graph told us that this function is doing a lot of allocations, I’ve added comments to highlight the most obvious ones.&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;func AddRecords(log logger.Logger, s RecordsAdder) http.HandlerFunc {
    return func(w http.ResponseWriter, r *http.Request) {
        defer r.Body.Close()

        // ...

        records := make([]sebrecords.Record, 0, 256) // &amp;gt;= 1 ALLOC
        mr := multipart.NewReader(r.Body, mediaParams[&quot;boundary&quot;])
        for part, err := mr.NextPart(); err == nil; part, err = mr.NextPart() {
            record, err := io.ReadAll(part)  // &amp;gt;= 1 ALLOC PER LOOP
            if err != nil {
                log.Errorf(&quot;reading parts of multipart/form-data: %s&quot;, err)
                w.WriteHeader(http.StatusInternalServerError)
                return
            }
            part.Close()
            records = append(records, record)
        }

        // ...
    }
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;We’re only doing a back-of-the-envelope kind of investigation here, so we won’t go into the actual implementations of anything but the code listed above. With just this tiny snippet of code we can tell that there is at least one allocation related to the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;records&lt;/code&gt; variable (notice the trailing “s”), and at least one allocation for the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;record&lt;/code&gt; variable; &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;io.ReadAll()&lt;/code&gt; must allocate the byte slice it returns.&lt;/p&gt;

&lt;p&gt;Since the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;record&lt;/code&gt; variable is allocated once in each of the loop’s &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;N&lt;/code&gt; iterations, it looks to be the dominating factor in terms of how many allocations are made. In fancy systems lingo we say that there’s &lt;a href=&quot;https://en.wikipedia.org/wiki/Big_O_notation&quot;&gt;&lt;em&gt;on the order of&lt;/em&gt;&lt;/a&gt; N allocations happening here - at least one allocation per record received in the HTTP request.&lt;/p&gt;

&lt;p&gt;This very high-level understanding of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;AddRecords()&lt;/code&gt; memory usage is enough to satisfy me for now. Let’s turn to the second offender on the list, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;sebrecords.Write()&lt;/code&gt;.&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;func Write(wtr io.Writer, rb []Record) error {
    header := Header{
        MagicBytes:  FileFormatMagicBytes,
        UnixEpochUs: UnixEpochUs(),
        Version:     FileFormatVersion,
        NumRecords:  uint32(len(rb)),
    }

    err := binary.Write(wtr, byteOrder, header)
    if err != nil {
        return fmt.Errorf(&quot;writing header: %w&quot;, err)
    }

    recordIndexes := make([]uint32, len(rb)) // 1 ALLOC, small
    var recordIndex uint32
    for i, record := range rb {
        recordIndexes[i] = recordIndex
        recordIndex += uint32(len(record))
    }

    err = binary.Write(wtr, byteOrder, recordIndexes)
    if err != nil {
        return fmt.Errorf(&quot;writing record indexes %d: %w&quot;, recordIndex, err)
    }

    records := make([]byte, 0, recordIndex) // 1 ALLOC, large
    for _, record := range rb {
        records = append(records, record...)
    }

    err = binary.Write(wtr, byteOrder, records)
    if err != nil {
        return fmt.Errorf(&quot;writing records length %s: %w&quot;, sizey.FormatBytes(len(rb)), err)
    }
    return nil
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;As we saw earlier, the flame graph told us that &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Write()&lt;/code&gt; is spending a lot of time copying things around and doing garbage collection. Looking for big memory accesses, we see that &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Write()&lt;/code&gt; makes two calls to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;make()&lt;/code&gt; - one for &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;recordIndexes&lt;/code&gt; and one for &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;records&lt;/code&gt;. In preparation for the first loop, a single small allocation is made before &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;N uint32&lt;/code&gt;s are memcopied into it. For the second loop, the allocation is likely much larger: &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;N*[avg record size]&lt;/code&gt; bytes are allocated and then copied into.&lt;/p&gt;

&lt;p&gt;We see that both of these allocations are made in preparation of a call to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;binary.Write()&lt;/code&gt;; both are done in order to reduce the number of disk IOs. Calling &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;binary.Write()&lt;/code&gt; once instead of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;N&lt;/code&gt; times will reduce the number of disk IO-related syscalls we make. Since Seb is using buffered IO without fsync (S3 is the source of truth!), we can’t tell exactly how many actual disk IOs each call translates to, but at least we do know that it translates to fewer syscalls and context switches.&lt;/p&gt;

&lt;p&gt;This means that, although it doesn’t look like it on the flame graph, both of these allocations and memcopies are actually beneficial in the current setting. The cost of doing a memory copy is much smaller than the cost of doing a disk IO, so given the chance to trade between a few memory copies and doing a few disk IOs (or syscalls), you’re very likely to get ahead if you bet on memory copying over disk IOs.&lt;/p&gt;

&lt;p&gt;Using &lt;a href=&quot;https://github.com/sirupsen/napkin-math#numbers&quot;&gt;Sirupsen’s napkin math&lt;/a&gt; and a bit of hand waving regarding buffered IOs, we can estimate that it’s on the order of 10 times faster (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;100μ/MB&lt;/code&gt; vs &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;1ms/MB&lt;/code&gt;) to collect all of our data into a single buffer and then do a single IO instead of doing one IO per record using the fragmented buffers that &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Write()&lt;/code&gt; receives as its input.&lt;/p&gt;

&lt;p&gt;Although the flame graph shows that we’re spending a lot of time copying things around in memory, we’ve actually just found that, in this particular example, a bit of memcopy is preferable because it’s done to reduce the number of much more expensive disk IOs.&lt;/p&gt;

&lt;h1 id=&quot;fixing&quot;&gt;Fixing&lt;/h1&gt;

&lt;p&gt;Taking a step back and considering all of the information from our investigation above, we see that the two functions have a common problem: the fact that they’re given records one-by-one impacts how much garbage they generate.&lt;/p&gt;

&lt;p&gt;For &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;AddRecords()&lt;/code&gt;, each record received directly translates to at least one allocation. Receiving a multipart form data-formatted list of records means that it needs to parse the records and make an allocation for each one. In &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Write()&lt;/code&gt;, we need to transform the slice of records created in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;AddRecords()&lt;/code&gt; into a slice of bytes so that we can write it efficiently to disk.&lt;/p&gt;

&lt;p&gt;It looks a lot like we could do the same job with a lot fewer allocations if we simply didn’t have to transform data between different representations!&lt;/p&gt;

&lt;p&gt;But how do we do this? If we work our way backwards, we can try to change the interface of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Write()&lt;/code&gt; so that it doesn’t have to do any transformations:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;func Write(wtr io.Writer, recordIndexes []uint32, records []byte) error {
    // ...
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;That doesn’t look too bad! With &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;recordIndexes&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;records&lt;/code&gt; being given directly as inputs, we can write them to disk without further processing.&lt;/p&gt;

&lt;p&gt;Working our way backwards up the stack, we can do the same to the callers of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;AddRecords()&lt;/code&gt;. If, instead of requiring users to send data as &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;N&lt;/code&gt; multipart form-encoded fields, we request that they send the sizes of each record as one field and the raw record data as another, the number of allocations goes from &lt;em&gt;the order of&lt;/em&gt; &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;N&lt;/code&gt; to &lt;em&gt;the order of&lt;/em&gt; &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;1&lt;/code&gt;, meaning that the number of allocations no longer depends on the number of records in the input. Nice!&lt;/p&gt;

&lt;p&gt;With the changes described, the implementation of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Write()&lt;/code&gt; becomes much simpler and is basically just three calls to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;binary.Write()&lt;/code&gt;:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;func Write(wtr io.Writer, recordIndexes []uint32, allRecords []byte) error {
    header := Header{
        MagicBytes:  FileFormatMagicBytes,
        UnixEpochUs: UnixEpochUs(),
        Version:     FileFormatVersion,
        NumRecords:  uint32(len(recordIndexes)),
    }

    err := binary.Write(wtr, byteOrder, header)
    if err != nil {
        return fmt.Errorf(&quot;writing header: %w&quot;, err)
    }

    err = binary.Write(wtr, byteOrder, recordIndexes)
    if err != nil {
        return fmt.Errorf(&quot;writing record indexes %d: %w&quot;, len(recordIndexes), err)
    }

    err = binary.Write(wtr, byteOrder, allRecords)
    if err != nil {
        return fmt.Errorf(&quot;writing records length %s: %w&quot;, sizey.FormatBytes(len(allRecords)), err)
    }

    return nil
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;AddRecords()&lt;/code&gt; becomes slightly worse to read, but I’m sure another pass could improve it:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;func AddRecords(log logger.Logger, s RecordsAdder) http.HandlerFunc {
    return func(w http.ResponseWriter, r *http.Request) {
        defer r.Body.Close()

        // ...

        var fileSizes []uint32
        var records []byte
        mr := multipart.NewReader(r.Body, mediaParams[&quot;boundary&quot;])
        for part, err := mr.NextPart(); err == nil; part, err = mr.NextPart() {
            bs, err := io.ReadAll(part)
            if err != nil {
                log.Errorf(&quot;reading parts of multipart/form-data: %s&quot;, err)
                w.WriteHeader(http.StatusInternalServerError)
                return
            }
            part.Close()

            switch part.FormName() {
            case httphelpers.RecordsMultipartRecordsKey:
                records = bs

            case httphelpers.RecordsMultipartSizesKey:
                err = json.Unmarshal(bs, &amp;amp;fileSizes)
                if err != nil {
                    log.Errorf(&quot;reading sizes: %v&quot;, err)
                    w.WriteHeader(http.StatusBadRequest)
                    return
                }

            default:
                log.Errorf(&quot;unknown field %s&quot;, part.FormName())
                w.WriteHeader(http.StatusBadRequest)
                return
            }
        }

        // TODO: verify that both &apos;sizes&apos; and &apos;records&apos; were given

        // ...
    }
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Let’s see what our interpretation of Tiger Style back-of-the-envelope changes (and a bit of other make-the-types-match kind of stuff all along the write path that I’ll sweep under the rug for now) has done to decrease the amount of garbage we generate on Seb’s write path:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/static/posts/2024-07-07-seb-write-performance/profiling-mime-multipart-medium.png&quot; alt=&quot;Profiling Seb, adding records, mid&quot; /&gt;&lt;/p&gt;

&lt;p&gt;Not bad! &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;AddRecords()&lt;/code&gt; has changed quite a bit. What I immediately notice is that half of the multipart parsing code has disappeared from the graph: only the left-most part is still there. It’s not exactly perfect yet, as we’re still spending a lot of time in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;runtime.growslice&lt;/code&gt;. This is likely because each byte slice allocated for the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;records&lt;/code&gt; variable must be expanded quite a few times to accommodate all of the record data received.&lt;/p&gt;

&lt;p&gt;Looking at &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Write()&lt;/code&gt; (which is named &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;WriteRaw()&lt;/code&gt; in the new graph), we see that the amount of pressure on the garbage collector has decreased noticeably. You might notice that the allocations have moved from &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Write()&lt;/code&gt; up to its parent, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;collectBatches()&lt;/code&gt; - I’ve swept some minor changes under the rug here, but trust me that this isn’t important to our goal.&lt;/p&gt;

&lt;p&gt;Although we’re seeing definite progress, I’m not entirely satisfied with the results of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;AddRecords()&lt;/code&gt; yet. The flame graph is showing us that a lot of time is being spent growing slices, which makes sense since &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;io.ReadAll()&lt;/code&gt; is a generic function that starts out with a modest allocation which has to grow to accommodate the size of our batches of records.&lt;/p&gt;

&lt;p&gt;In order to fix the problem, we can allocate a pool of larger buffers that can be reused between requests. I’ve highlighted the added lines with comments.&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;var bufPool = syncy.NewPool(func() *bytes.Buffer { // NEW
    return bytes.NewBuffer(make([]byte, 5*sizey.MB)) // NEW
})                                                   // NEW

func AddRecords(log logger.Logger, s RecordsAdder) http.HandlerFunc {
    return func(w http.ResponseWriter, r *http.Request) {
        defer r.Body.Close()

        // ...

        var fileSizes []uint32
        var records []byte
        mr := multipart.NewReader(r.Body, mediaParams[&quot;boundary&quot;])
        for part, err := mr.NextPart(); err == nil; part, err = mr.NextPart() {
            buf := bufPool.Get()        // NEW
            buf.Reset()                 // NEW
            defer bufPool.Put(buf)      // NEW

            _, err = io.Copy(buf, part) // NEW
            if err != nil {
                log.Errorf(&quot;reading parts of multipart/form-data: %s&quot;, err)
                w.WriteHeader(http.StatusInternalServerError)
                return
            }
            part.Close()

            switch part.FormName() {
            case httphelpers.RecordsMultipartRecordsKey:
                records = buf.Bytes()                         // NEW

            case httphelpers.RecordsMultipartSizesKey:
                err = json.Unmarshal(buf.Bytes(), &amp;amp;fileSizes) // NEW
                if err != nil {
                    log.Errorf(&quot;reading sizes: %v&quot;, err)
                    w.WriteHeader(http.StatusBadRequest)
                    return
                }

            default:
                log.Errorf(&quot;unknown field %s&quot;, part.FormName())
                w.WriteHeader(http.StatusBadRequest)
                return
            }
        }

        // TODO: verify that both &apos;sizes&apos; and &apos;records&apos; were given

        // ...
    }
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Running the same workload again shows that our buffer pool was a great help:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/static/posts/2024-07-07-seb-write-performance/profiling-mime-multipart-faster.png&quot; alt=&quot;Profiling Seb, adding records, after&quot; /&gt;&lt;/p&gt;

&lt;p&gt;We’re now seeing &lt;em&gt;much&lt;/em&gt; less pressure on the garbage collector, with only a few large &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;runtime.memmove()&lt;/code&gt; calls left.&lt;/p&gt;

&lt;p&gt;This is where we’ll leave the optimization work for now. The only thing left is to do some benchmarking to see how these changes affect the goal of the project, namely increasing the number of records per second we can push through Seb.&lt;/p&gt;

&lt;h1 id=&quot;benchmarking&quot;&gt;Benchmarking&lt;/h1&gt;

&lt;p&gt;Part of the work I did during the weekend was to update &lt;a href=&quot;https://github.com/micvbang/simple-event-broker/blob/76ee8661d98e6988448d88f543f38e304edb92ae/cmd/seb/app/benchmark.go#L64&quot;&gt;Seb’s benchmarking tool&lt;/a&gt;. It’s nothing fancy, but should work well to get an understanding of the relative improvements of the changes implemented above.&lt;/p&gt;

&lt;p&gt;I started out benchmarking using Seb’s S3 storage implementation, but because of highly variable latencies I decided that writing to disk would serve us better for these experiments; the purpose isn’t to show how many records Seb can handle in a production scenario, but rather to see relative improvements of the changes discussed above. A final note is that this workload uses buffered IO without fsync, so don’t read too much into the absolute numbers. We’re looking for relative changes, nothing else.&lt;/p&gt;

&lt;p&gt;All benchmarks were run on one of Hetzner’s tiny, cheap, 2-core CAX11 machines, and were repeated 10 times each. Each benchmark starts a new Seb broker, exposes it on a local HTTP port and starts 16 goroutines that use the Seb client to pepper the broker with &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;POST /records&lt;/code&gt;. They were run like this:&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;./seb benchmark &lt;span class=&quot;nt&quot;&gt;-r&lt;/span&gt; 10 &lt;span class=&quot;nt&quot;&gt;-w&lt;/span&gt; 16
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The configuration for each benchmark is as follows:&lt;/p&gt;
&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;Config:
Num workers:            16
Num batches:            4096
Num records/batch:      1024
Record size:            1KiB (1024B)
Total bytes:            4GiB (4294967296B)
Batch block time:       5ms
Batch bytes max:        10MiB (10485760)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;And the results, given as avg / min / max, are:&lt;/p&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th style=&quot;text-align: left&quot;&gt;code&lt;/th&gt;
      &lt;th style=&quot;text-align: center&quot;&gt;seconds/run&lt;/th&gt;
      &lt;th style=&quot;text-align: right&quot;&gt;records/second&lt;/th&gt;
      &lt;th style=&quot;text-align: right&quot;&gt;improvement&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;reference&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;24.21 / 23.37 / 25.11&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;173k / 167k / 179k&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;-&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;updated, no buffers&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;15.82 / 15.51 / 16.13&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;265k / 260k / 270k&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;1.53x&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;updated, with buffers&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;12.41 / 12.17 / 12.57&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;338k / 334k / 345k&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;1.95x&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;Nice! By running three CPU profiles and looking at Seb’s code at a very high level, we managed to identify a few locations where we could avoid a bunch of unnecessary memory allocations and thereby alleviate pressure on the garbage collector. These simple changes have almost doubled the number of records that we can push through Seb. Not bad for a weekend project!&lt;/p&gt;

&lt;p&gt;With that, I’ll say that it has been fun to try out Tiger Style, and I’ll definitely continue learning from it in the future. I’m particularly interested in deterministic testing; if you happen to have great references and/or code examples to study, please let me know!&lt;/p&gt;

&lt;p&gt;Thanks to Joran and the TigerBeetle team for sharing their many insights with all of us - it’s a major source of inspiration!&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;If this post resonated with you and you’re looking for someone to help you do hard things with a computer, you can &lt;a href=&quot;/hire_me.html&quot;&gt;hire me&lt;/a&gt;!&lt;/strong&gt;&lt;/p&gt;
</description>
        <pubDate>Wed, 10 Jul 2024 11:20:00 +0000</pubDate>
        <link>https://blog.vbang.dk/2024/07/10/seb-tiger-style/</link>
        <guid isPermaLink="true">https://blog.vbang.dk/2024/07/10/seb-tiger-style/</guid>
      </item>
    
      <item>
        <title>Driplang: triggering when events happen (or don&apos;t)</title>
        <description>&lt;p&gt;This post describes multiple ways I’ve seen projects handle event triggering in the past and suggests a minor tweak that I believe will greatly benefit projects that have nontrivial event triggering requirements. The tweak is simple and helps to avoid creating unnecessary dependencies between unrelated parts of your system.&lt;/p&gt;

&lt;p&gt;It also describes how a tiny domain-specific language can be used in the implementation of this, aiming to make it possible for even non-developers to manage and create event triggers. Perhaps even using a visual tool! I never got this far in my own implementation, but it’s a very obvious next step from where the post ends.&lt;/p&gt;

&lt;p&gt;The ideas discussed here aren’t new. The functional outcome of my ideas has been available in various SaaS solutions for probably a decade. Nonetheless, I think there’s an important lesson here regarding software in general, in how seemingly minor changes in structure can have outsized benefits when it comes to the cost and complexity of developing and maintaining a system.&lt;/p&gt;

&lt;p&gt;Before we really get going I want to note that, although we’ll be talking about sending emails, the point I’m trying to make is much more general. It just so happens that notifications are a &lt;em&gt;very&lt;/em&gt; natural context to describe this problem with. Every time I’ve tried to explain these ideas, I always end up going back to notifications.&lt;/p&gt;

&lt;p&gt;A final thing before we continue: I’ll need a pinky promise that you won’t use this to spam people. No. Yes, &lt;em&gt;seriously&lt;/em&gt;. Spam is easily top 3 on the list of the 7 deadly sins.&lt;/p&gt;

&lt;p&gt;We good? Alright.&lt;/p&gt;

&lt;h1 id=&quot;the-problem&quot;&gt;The problem&lt;/h1&gt;

&lt;p&gt;On most projects I’ve worked on, it has at some point been a requirement to trigger certain functionality when specific events happen. A classic example is &lt;em&gt;“send an email to users who haven’t used feature X within their first week of signing up”&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;Even though this example is rather basic, it can be surprisingly difficult to implement well. If we’re not careful when we implement event triggering, we can inadvertently start introducing dependencies between otherwise unrelated components, which over time can become a burden that slows development significantly. What once started as a simple one-liner to send an email can suddenly require us to consider large parts of the system whenever we want to make even a small change.&lt;/p&gt;

&lt;h1 id=&quot;simple-triggers&quot;&gt;Simple triggers&lt;/h1&gt;

&lt;p&gt;At the beginning of a project there isn’t a lot of functionality yet. This hopefully means that there aren’t a lot of accidental or unnecessary dependencies between components, and that it’s still pretty cheap and easy to add new features and maintain existing ones. Not wanting to introduce new abstractions before they are truly needed, at this stage it can easily be argued that sending an email when a user is created is most simply done somewhere on the code path that naturally exists for user creation. This could, for example, be just after the user has been persisted to storage:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;class UserController:
   def add_user(self, user):
      self.user_repository.create(user)
      self.email_service.send_intro_email(user)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Depending on the full context of the rest of your system, the project’s goals and scope, your team and the position of the moon, this very well could be a nice and simple, non-overengineered solution to a simple problem. Lovely!&lt;/p&gt;

&lt;p&gt;A benefit of this simple solution is that it’s easy to look up what happens when you add a user: it’s all right there in the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;add_user()&lt;/code&gt; function! This of course comes with the assumption that everything that happens when you add a user &lt;em&gt;actually&lt;/em&gt; happens in that function.&lt;/p&gt;

&lt;p&gt;Depending on how many event triggers we need to implement, a potential drawback of this simple implementation is that we will be scattering email-sending code all over the system. This might make it difficult to get an overview of all of the places from which we’re sending emails. Although this &lt;em&gt;could&lt;/em&gt; become a problem, the thing that tickles my spidey sense is that there are examples of reasonably simple event triggering logic that simply cannot be implemented this way. At least not in any advisable way that I know of. Triggers that require more information than naturally exists on existing code paths are super difficult to implement without introducing coupling between otherwise unrelated components. In the above example we wanted to trigger on the “user created” event, which happened right there in the code. For more complex triggers such a code path might simply not exist.&lt;/p&gt;

&lt;h1 id=&quot;more-advanced-triggers&quot;&gt;More advanced triggers&lt;/h1&gt;

&lt;p&gt;As time passes and new and more complex features are added to the project, we might want to create event triggers that aren’t a direct response to something that happens in the system. Such event triggers rarely have an obvious location where we can just add a one-liner. The problem is that we need knowledge from different parts of the system in one place.&lt;/p&gt;

&lt;p&gt;One obvious way to tackle this problem is to create an omniscient cron job-thingy that can pull information from all relevant parts of the system. In my mind I imagine this as an octopus that gets to roam around freely in your database, sticking its tentacles into anything it likes.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/static/posts/2024-06-19-eventdripper/octopus_grabbing_data.png&quot; alt=&quot;Octopus inside your database&quot; /&gt;&lt;/p&gt;

&lt;p&gt;The benefit of this strategy is that it can make it very explicit what information is required to trigger a certain event and where that information comes from. Additionally, depending on our needs, it might be an advantage that this allows us to place all code relating to sending notifications close together instead of sprinkling it throughout the system. Below is an example of what this might look like:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;class OmniscientCronJobThingy:
   def x_not_used_in_first_week(self):
      for user in self.user_repository.list(created_within=&apos;1 week&apos;):
         if not self.feature_x_repository.used_by(user):
            self.email_service.send_feature_x_intro_email(user)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Since this is a cron job, we have to run it at some meaningful interval, ensure that it actually runs, probably handle errors asynchronously (we don’t want to stop sending emails to the rest of the users just because sending an email to one of them fails), and so on. All of these are problems that can be overcome, but it does come with the price of added complexity compared with the one-liner we first saw.&lt;/p&gt;

&lt;p&gt;A major drawback that our omniscient octopus introduces is that it adds a dependency on potentially the entire data model of the system. Since it’s basically a component with license to &lt;del&gt;kill&lt;/del&gt; read data from anywhere, we have to take it into account whenever we consider making a change to almost literally any part of the system; &lt;em&gt;did one of our co-workers add an event trigger that requires knowledge from the part of the system we’re currently considering changing&lt;/em&gt;? This problem can be mitigated somewhat by forcing the component to go through repositories instead of raw-dogging the database, but this doesn’t eliminate the problem entirely. When there’s an omniscient octopus tasting various parts of your data, you never quite know whether it’s safe to change your data model or not. At the very least, the loose octopus will make it more cumbersome to change the data model. Been there, done that. Although pets are nice and cute, you really don’t want them running around your database!&lt;/p&gt;

&lt;p&gt;Another problem we haven’t discussed yet is that of using existing data models to infer the state of something that we want to trigger on. In some cases we’re lucky that the data model naturally happens to contain exactly the information we want to trigger on. In other cases, not so much. What do we do then? Do we muddy the existing data model by adding &lt;em&gt;just one more&lt;/em&gt; field, to keep our omniscient cron job satisfied? I would personally be looking for different options very quickly.&lt;/p&gt;

&lt;p&gt;To summarize: we are looking for a solution that&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;avoids sprinkling email-sending code all across our application&lt;/li&gt;
  &lt;li&gt;avoids unwanted tentacles fiddling around our tables&lt;/li&gt;
  &lt;li&gt;does not create unnecessary dependencies between components&lt;/li&gt;
  &lt;li&gt;does not lead us into the temptation of introducing “unnecessary” data into our existing data models&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;As advertised earlier, the path I’m suggesting is in no way new nor sophisticated. It’s fundamental programming. One of the classics. It’s decoupling.&lt;/p&gt;

&lt;p&gt;If we simply separate &lt;em&gt;tracking&lt;/em&gt; of events and &lt;em&gt;reacting&lt;/em&gt; to events, we can have all of the benefits from our two solutions with very few of the drawbacks. We might even be able to move a large part of the human responsibility for declaring event triggers to non-developers!&lt;/p&gt;

&lt;p&gt;The following snippet looks very similar to our first one-liner snippet, but the result is quite different.&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;class UserController:
   def add_user(self, user):
      self.user_repository.create(user)
      self.eventdripper.log(
         event_id=&quot;user_created&quot;,
         entity_id=user.id,
         data={&apos;name&apos;: user.name, &apos;email&apos;: user.email},
      )
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Although there’s a new name here that I haven’t introduced yet (eventdripper - yay naming!), there are no tricks and it should be fairly obvious that by logging the occurrence of an event instead of reacting to it immediately, we can move the responsibility of sending emails away from the place that the event naturally occurs. In this case, the responsibility has been moved to the mysterious Ms Eventdripper.&lt;/p&gt;

&lt;p&gt;Besides delegating responsibility, another benefit of logging events is that we no longer need to keep our handsy octopus on staff. Since eventdripper is given all information required to determine which event triggers to trigger, we no longer need an omniscient entity that can snoop on the existing data model to gather information about the current state of things. This also avoids the temptation of adding new fields to our data models just to satisfy the needs of our snoop.&lt;/p&gt;

&lt;p&gt;As you might have guessed from the poor naming, I’ve implemented a service that makes it easy to log events and react to them later. It tries to solve the problems described in this post, and it works for complex event triggers with restrictions on real-world timings. That service is called… Eventdripper!&lt;/p&gt;

&lt;h1 id=&quot;eventdripper&quot;&gt;Eventdripper&lt;/h1&gt;

&lt;p&gt;As indicated by the snippet above, the interface of eventdripper is dead simple:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt; POST /event
 {
 	&quot;event&quot;: &quot;user_created&quot;,
 	&quot;entity_id&quot;: &quot;user-id&quot;,
 	&quot;data&quot;: { /* data relevant when reacting to the event */ }
 }
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;All the information it needs to do its magic is:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;the name of the event that happened&lt;/li&gt;
  &lt;li&gt;a unique identifier for the entity the event relates to&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The third parameter, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;&quot;data&quot;&lt;/code&gt;, is an optional, opaque value that the consumer can use to add metadata needed when reacting to the event. In our example, since we’re sending an email, it might be nice to have the user’s name and email.&lt;/p&gt;

&lt;p&gt;In order to get data into eventdripper, we just have to send the above payload over our preferred transport (&lt;a href=&quot;https://blog.vbang.dk/2024/05/26/seb/&quot;&gt;Seb&lt;/a&gt; anyone?). Eventdripper then collects the events and shoves them into a database, indexing them on &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;&quot;event&quot;&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;&quot;entity id&quot;&lt;/code&gt;. For the purposes of this post, the way data is transported and stored isn’t super important. As long as events are received in-order and the database allows fast lookup by event and entity id, we’re golden.&lt;/p&gt;

&lt;p&gt;With all of our events now happily inhabiting the databases of eventdripper, we have a new problem to solve: how do the users of eventdripper describe which sequences of events should satisfy a trigger? And, related to that, how does eventdripper decide whether the user’s description is satisfied by a given sequence of events? If you’re anything like me, requirements like these simply &lt;em&gt;beg&lt;/em&gt; for an implementation of a domain-specific language. This is the story of how driplang was born!&lt;/p&gt;

&lt;h1 id=&quot;driplang&quot;&gt;Driplang&lt;/h1&gt;

&lt;p&gt;Driplang is a tiny domain-specific language (DSL) inspired by boolean and temporal logic. The DSL makes it easy (okay, possible at least…) to define expressions that can either be satisfied or not by a given sequence of events. A driplang expression can’t be evaluated by itself, but must be evaluated against a sequence of events.&lt;/p&gt;

&lt;p&gt;Driplang has four operators: &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;AND&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;OR&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;NOT&lt;/code&gt;, and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;THEN&lt;/code&gt;. The only two possible outcomes of expression evaluation are &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;true&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;false&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;The driplang operators work just like you would expect them to in boolean logic, with the caveat that &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;THEN&lt;/code&gt; is (very) special.&lt;/p&gt;

&lt;p&gt;Let’s start by looking at a few simple boolean examples. The contents of this table shouldn’t be surprising if you already know boolean logic.&lt;/p&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th style=&quot;text-align: left&quot;&gt;expression&lt;/th&gt;
      &lt;th style=&quot;text-align: left&quot;&gt;events&lt;/th&gt;
      &lt;th style=&quot;text-align: center&quot;&gt;output&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;A &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;AND&lt;/code&gt; B&lt;/td&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;[A]&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;false&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;A &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;AND&lt;/code&gt; B&lt;/td&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;[B]&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;false&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;A &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;AND&lt;/code&gt; B&lt;/td&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;[B, A]&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;true&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;A &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;AND&lt;/code&gt; B&lt;/td&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;[A, B]&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;true&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;A &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;AND&lt;/code&gt; (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;NOT&lt;/code&gt; (B &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;OR&lt;/code&gt; C))&lt;/td&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;[A]&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;true&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;A &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;AND&lt;/code&gt; (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;NOT&lt;/code&gt; (B &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;OR&lt;/code&gt; C))&lt;/td&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;[B, A]&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;false&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;A &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;AND&lt;/code&gt; (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;NOT&lt;/code&gt; (B &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;OR&lt;/code&gt; C))&lt;/td&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;[D, A]&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;true&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;The important point to notice here is that the &lt;em&gt;order&lt;/em&gt; of events doesn’t matter for &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;AND&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;OR&lt;/code&gt;, and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;NOT&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;As the name hopefully suggests, the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;THEN&lt;/code&gt; operator is needed when we require an ordering, e.g. if we only want our expression to be satisfied when &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;A&lt;/code&gt; happens before &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;B&lt;/code&gt;. In driplang that requirement would look like this: A &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;THEN&lt;/code&gt; B.&lt;/p&gt;
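&lt;p&gt;To make these semantics concrete, here’s a hypothetical evaluator sketch (driplang’s actual implementation isn’t shown in this post, so all names below are made up). &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;AND&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;OR&lt;/code&gt;, and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;NOT&lt;/code&gt; only ask &lt;em&gt;whether&lt;/em&gt; events occurred, while &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;THEN&lt;/code&gt; evaluates its right-hand side only against the events that occur &lt;em&gt;after&lt;/em&gt; its left-hand side is first satisfied:&lt;/p&gt;

```go
package main

import "fmt"

// Expr is a sketch of a driplang-like expression tree.
type Expr interface {
	Eval(events []string) bool
}

// Event is satisfied when the named event occurs anywhere in the sequence.
type Event string

func (e Event) Eval(events []string) bool {
	for _, ev := range events {
		if ev == string(e) {
			return true
		}
	}
	return false
}

type And struct{ L, R Expr }
type Or struct{ L, R Expr }
type Not struct{ X Expr }

func (a And) Eval(events []string) bool { return a.L.Eval(events) && a.R.Eval(events) }
func (o Or) Eval(events []string) bool  { return o.L.Eval(events) || o.R.Eval(events) }
func (n Not) Eval(events []string) bool { return !n.X.Eval(events) }

// Then is satisfied when its left side first becomes satisfied by some
// prefix of the events and its right side is satisfied by the remaining
// suffix. This is what makes A THEN (NOT B) false for [A, B]: once A has
// happened, B must not appear afterwards.
type Then struct{ L, R Expr }

func (t Then) Eval(events []string) bool {
	for i := 0; i <= len(events); i++ {
		if t.L.Eval(events[:i]) {
			return t.R.Eval(events[i:])
		}
	}
	return false
}

func main() {
	expr := Then{Event("A"), Not{Event("B")}}  // A THEN (NOT B)
	fmt.Println(expr.Eval([]string{"A"}))      // true
	fmt.Println(expr.Eval([]string{"A", "B"})) // false
}
```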

&lt;p&gt;Here’s a table to give you an intuition for how &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;THEN&lt;/code&gt; works. I left a tiny surprise for you at the end.&lt;/p&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th style=&quot;text-align: left&quot;&gt;expression&lt;/th&gt;
      &lt;th style=&quot;text-align: left&quot;&gt;events&lt;/th&gt;
      &lt;th style=&quot;text-align: center&quot;&gt;output&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;A &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;THEN&lt;/code&gt; B&lt;/td&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;[A, B]&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;true&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;A &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;THEN&lt;/code&gt; B&lt;/td&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;[B, A]&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;false&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;A &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;THEN&lt;/code&gt; (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;NOT&lt;/code&gt; B)&lt;/td&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;[A]&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;true&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;A &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;THEN&lt;/code&gt; (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;NOT&lt;/code&gt; B)&lt;/td&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;[A, B]&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;false&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;A &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;THEN&lt;/code&gt; (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;NOT&lt;/code&gt; B)&lt;/td&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;[A, C]&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;true&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;A &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;THEN&lt;/code&gt; (B &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;WITHIN&lt;/code&gt; 2 days)&lt;/td&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;[A, B]&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;it depends&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;Hopefully everything makes sense until the last expression in the table above.&lt;/p&gt;

&lt;p&gt;A minor but very important detail that I left out is that &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;THEN&lt;/code&gt; has an optional argument: &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;WITHIN&lt;/code&gt;. &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;WITHIN&lt;/code&gt; causes &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;THEN&lt;/code&gt; to consider real-world time: the events must not only arrive in the required order, they must also arrive within the given time constraint. The expression &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;A THEN (B WITHIN 2 days)&lt;/code&gt; from the table above will thus only be satisfied if &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;B&lt;/code&gt; happened within 2 days of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;A&lt;/code&gt;.&lt;/p&gt;
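To make the semantics concrete, here is a minimal Python sketch of THEN with an optional WITHIN window. It is not the actual driplang implementation; the function name and the (event, day) representation are made up for illustration.

```python
# Hypothetical sketch of THEN/WITHIN semantics; not the actual driplang code.
# Events are (name, day) pairs, ordered by arrival time.
def then_satisfied(events, first, second, within_days=None):
    for i, (name, day) in enumerate(events):
        if name == first:
            # Look for `second` arriving after `first`.
            for later_name, later_day in events[i + 1:]:
                if later_name == second:
                    if within_days is None:
                        return True
                    if within_days >= later_day - day:
                        return True
    return False


# A THEN B: order matters.
assert then_satisfied([("A", 0), ("B", 1)], "A", "B") is True
assert then_satisfied([("B", 0), ("A", 1)], "A", "B") is False

# A THEN (B WITHIN 2 days): "it depends" on the timestamps.
assert then_satisfied([("A", 0), ("B", 1)], "A", "B", within_days=2) is True
assert then_satisfied([("A", 0), ("B", 5)], "A", "B", within_days=2) is False
```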

&lt;p&gt;All the way in the beginning of this post, we talked about triggering on events based on real-world time: &lt;em&gt;“send an email to users who haven’t used feature X within their first week of signing up”&lt;/em&gt;. &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;WITHIN&lt;/code&gt; is the piece of the puzzle that allows driplang to handle this. We now know enough to express this as a driplang expression: &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;user_created THEN ((NOT use_feature_x) WITHIN 7 days)&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;A great benefit of using a DSL to implement this is that it can be used two-fold: we can use it both to describe event triggers &lt;em&gt;and&lt;/em&gt; to evaluate them. And, since driplang is easily expressed as text (via a stupid-simple JSON format), we can easily store driplang expressions in a database, close to where the events we need to evaluate them on live.&lt;/p&gt;
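To give an idea of how little is needed, here is one hypothetical JSON shape for the signup expression, sketched in Python; the actual driplang wire format may well look different.

```python
import json

# One hypothetical JSON shape for the expression
# user_created THEN ((NOT use_feature_x) WITHIN 7 days);
# the real driplang format is not shown in the post and may differ.
expression = {
    "op": "THEN",
    "first": {"event": "user_created"},
    "second": {"op": "NOT", "arg": {"event": "use_feature_x"}},
    "within_days": 7,
}

# Because it is plain JSON, it round-trips through text and can be stored
# in an ordinary database column, next to the events it is evaluated on.
serialized = json.dumps(expression)
assert json.loads(serialized) == expression
```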

&lt;p&gt;As I hinted at in the beginning, this post ends at a point where the obvious next step is to make a visual tool that can generate driplang expressions behind the scenes. I’m no UX designer, but I imagine it might look something like this:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/static/posts/2024-06-19-eventdripper/boxes_and_arrows.png&quot; alt=&quot;Visual declaration of driplang expressions&quot; /&gt;&lt;/p&gt;

&lt;p&gt;I think this could be helpful by letting non-developers, who often are the people that declare the trigger requirements anyway, be responsible for actually managing event triggers. This would leave developers “only” with the job of logging events and implementing the functionality that must be triggered. In my experience, the functionality to be triggered (sending non-spammy emails) can often be abstracted enough that developers don’t have to be part of this in the long run, e.g. using email templates with variables.&lt;/p&gt;

&lt;h2 id=&quot;performance-and-implementation&quot;&gt;Performance and implementation&lt;/h2&gt;

&lt;p&gt;In terms of performance, there are a few things we can do to optimize the scheduling of expression evaluation. For expressions that don’t contain &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;THEN&lt;/code&gt; operators with &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;WITHIN&lt;/code&gt; arguments, we only need to evaluate when a new event is added; that is the only time their output can change.&lt;/p&gt;

&lt;p&gt;Additionally, we only need to evaluate expressions that contain references to the event that just arrived. For example, the expression A &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;THEN&lt;/code&gt; B will not change if the event C arrives.
So even if we have loads of event triggers declared in eventdripper, waiting to potentially be triggered, we only have to evaluate the expressions that contain the new event. With a bit of semi-clever SQL, we can ensure that we only evaluate expressions when there’s a chance that the output changed.&lt;/p&gt;
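A sketch of that filtering idea, using an in-memory inverted index rather than SQL; the expression ids and event names are made up for illustration.

```python
from collections import defaultdict

# Map each expression id to the set of event names it references
# (hypothetical data for illustration).
expressions = {
    "expr1": {"A", "B"},
    "expr2": {"A", "C"},
    "expr3": {"D"},
}

# Build an inverted index: event name to the expressions mentioning it.
index = defaultdict(set)
for expr_id, events in expressions.items():
    for event in events:
        index[event].add(expr_id)


def to_evaluate(event_name):
    """Only these expressions can change output when `event_name` arrives."""
    return index.get(event_name, set())


assert to_evaluate("A") == {"expr1", "expr2"}
assert to_evaluate("C") == {"expr2"}
assert to_evaluate("Z") == set()  # unknown event: nothing to re-evaluate
```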

&lt;p&gt;The only type of expression we haven’t considered yet is &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;THEN&lt;/code&gt; expressions with &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;WITHIN&lt;/code&gt; arguments. Here, it’s not only the arrival of new events that contributes to whether an expression is satisfied, but also the fact that time continues on its infinite march. I don’t currently see a way of doing this that doesn’t rely on a cron job having to run in the background, reevaluating expressions at some fraction of the interval of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;WITHIN&lt;/code&gt;’s time constraint. If you’ve got any ideas for how this could work, do reach out!&lt;/p&gt;
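One way to bound how late such a trigger can fire is to run the cron job at some fraction of the smallest declared WITHIN constraint. A toy calculation with illustrative numbers only:

```python
# Pick a cron interval as a fraction of the smallest WITHIN constraint
# among the declared expressions; hours and the 0.5 fraction are
# arbitrary illustrative choices, not anything eventdripper prescribes.
def cron_interval_hours(within_constraints_hours, fraction=0.5):
    return min(within_constraints_hours) * fraction


# WITHIN constraints of 2 days, 1 day, and 7 days, expressed in hours:
assert cron_interval_hours([48, 24, 168]) == 12.0
```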

&lt;p&gt;Although the implementation of eventdripper and driplang is interesting to discuss, I’ll leave these details for another blog post. For driplang, however, I will say that it requires surprisingly little and rather simple code, especially considering that it allows us to both describe and evaluate rather complex event triggers which otherwise have a tendency to turn into a big ball of mud.&lt;/p&gt;

&lt;h1 id=&quot;heading-back-to-the-surface&quot;&gt;Heading back to the surface&lt;/h1&gt;

&lt;p&gt;Having just been introduced to eventdripper and driplang, you might be thinking that this looks like an overly complex solution to something that in many cases can be solved much more simply, with code closer to the first snippet I showed. In situations where your needs are simple and you don’t expect to need more advanced triggers, I will most likely agree with you. In general we should not waste time overengineering things that we will not need.&lt;/p&gt;

&lt;p&gt;The one thing I hope you take away from this: once your event triggering needs become non-trivial, I think decoupling the code that tracks and the code that reacts to events is definitely worth your while. Whether you use a DSL to implement this is another discussion. So far, it has served me well and helped solve exactly the problems I set out to solve. I’m very happy with the results!&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;If the post resonated with you and you are looking for someone to help you to do hard things with a computer, you can &lt;a href=&quot;/hire_me.html&quot;&gt;hire me&lt;/a&gt; to help you!&lt;/strong&gt;&lt;/p&gt;
</description>
        <pubDate>Wed, 19 Jun 2024 14:20:00 +0000</pubDate>
        <link>https://blog.vbang.dk/2024/06/19/eventdripper/</link>
        <guid isPermaLink="true">https://blog.vbang.dk/2024/06/19/eventdripper/</guid>
      </item>
    
      <item>
        <title>Data exploration using VIM</title>
        <description>&lt;p&gt;I’ve used vim and/or vim bindings for the better part of 10 years. But apparently there’s this tiny piece of magic that has completely escaped me all this time.&lt;/p&gt;

&lt;p&gt;About half a year ago I received a tip from a good friend (thanks Jörn ❤️) that I kind of forgot about and never took the time to actually try out.&lt;/p&gt;

&lt;p&gt;Then, this week I had to do a bunch of random data exploration and, luckily, it somehow jumped back into my brain. Just this week I’ve saved countless hours looking through gigabytes and gigabytes of sketchy data from the Danish Business Authority. Public data is &lt;em&gt;awesome&lt;/em&gt;, but the quality of that data? Often not so much :(&lt;/p&gt;

&lt;p&gt;Anyway. The tip is this: you can use the vim command (is it called that?) &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;:%! [cmd]&lt;/code&gt; to invoke CLI programs on data in vim’s buffer. That’s it. It’s crazy powerful and I love it.&lt;/p&gt;

&lt;p&gt;I’ve made a 3.5 minute screencast of how this looks in practice.
If you don’t want to watch the video, I’ll give you a short description of how it works below.&lt;/p&gt;

&lt;script src=&quot;https://asciinema.org/a/662066.js&quot; id=&quot;asciicast-662066&quot; async=&quot;true&quot;&gt;&lt;/script&gt;

&lt;p&gt;For example, let’s say you have the following in your buffer:&lt;/p&gt;

&lt;div class=&quot;language-text highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;9msAkRIqstFcQAdfpvFZqgWGPBbReNS
3JEFbIfJuIGZBZodTONfnzyCykPtsBR
4KdSIqYYlDEIxpGiHFbRpqiZsFlgLxL
7UNqzFGgxEkzfzWLdTSKabDsUtTcSDs
5IqHRWKquwsekkritCxsnInXbsPeLvx
2ZdEuPTvYKFXNpOkhOytByqaDUQRSQI
0UreGiTTUnRxxrtNtaBfNYfbDhDlKwJ
1aOaHMrQzwGFjFtmwcPwdTfKVwteivR
6abgfdynLiidyiSBPUVMbkhKEsJMNVy
4doltlrfrOLmkuvCdVyJzqZRGkCOzkD
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Running &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;:%! sort&lt;/code&gt; in vim will pipe the data from our buffer into sort and put it back in the buffer:&lt;/p&gt;

&lt;div class=&quot;language-text highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;0UreGiTTUnRxxrtNtaBfNYfbDhDlKwJ
1aOaHMrQzwGFjFtmwcPwdTfKVwteivR
2ZdEuPTvYKFXNpOkhOytByqaDUQRSQI
3JEFbIfJuIGZBZodTONfnzyCykPtsBR
4doltlrfrOLmkuvCdVyJzqZRGkCOzkD
4KdSIqYYlDEIxpGiHFbRpqiZsFlgLxL
5IqHRWKquwsekkritCxsnInXbsPeLvx
6abgfdynLiidyiSBPUVMbkhKEsJMNVy
7UNqzFGgxEkzfzWLdTSKabDsUtTcSDs
9msAkRIqstFcQAdfpvFZqgWGPBbReNS
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;We can continue doing this as much as we like, using all of our normal CLI tools, e.g. &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;:%! grep &quot;sekkrit&quot;&lt;/code&gt;&lt;/p&gt;

&lt;div class=&quot;language-text highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;5IqHRWKquwsekkritCxsnInXbsPeLvx
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;And, the best part, because we’re in vim, we can undo and redo all of the commands that we run, retry failed commands (remember to add those pesky quotes around spaces for grep!!), search and replace, and the list goes on. You’re only limited by your imagination and the tools you have available on the CLI.&lt;/p&gt;

&lt;p&gt;So, now you also know. Go spread the word!&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;If the post resonated with you and you are looking for someone to help you to do hard things with a computer, you can &lt;a href=&quot;/hire_me.html&quot;&gt;hire me&lt;/a&gt; to help you!&lt;/strong&gt;&lt;/p&gt;
</description>
        <pubDate>Sat, 01 Jun 2024 17:08:00 +0000</pubDate>
        <link>https://blog.vbang.dk/2024/06/01/vim-data-exploration/</link>
        <guid isPermaLink="true">https://blog.vbang.dk/2024/06/01/vim-data-exploration/</guid>
      </item>
    
      <item>
        <title>Hello World, Simple Event Broker!</title>
        <description>&lt;p&gt;For various side projects I’ve worked on, I’ve wanted to introduce event queues in order to simplify some things. Normally, I just go with the “one DB to rule them all”, and shove things into Postgres. Sometimes though, the workload becomes too much and the burst- and credit balance of my puny RDS instances start looking like ski slopes that would kill most skiers.&lt;/p&gt;

&lt;p&gt;Every time this has happened I’ve looked into hosting or renting actual event queuing systems, but never found anything that fit the bill: dedicated event queuing systems are built to scale to insane workloads with the smallest latency possible and, to me at least, they all either seemed like a handful to self-host or were too expensive to rent. I just needed something that would not lose my data if the VM and/or its disk died, something that would run on tiny, cheap hardware, and was able to put up with a reasonable amount of load. I took some time off recently and thought a fun way to spend some of this time would be to build a system that matches these requirements.&lt;/p&gt;

&lt;p&gt;So, I started work on &lt;a href=&quot;https://github.com/micvbang/simple-event-broker&quot;&gt;Seb&lt;/a&gt; (Simple Event Broker. Yay naming!)&lt;/p&gt;

&lt;h2 id=&quot;goals-and-status&quot;&gt;Goals and status&lt;/h2&gt;

&lt;p&gt;Seb is an event broker designed with the goals of being 1) cheap to run 2) easy to manage 3) easy to use, in that order. It actually has “don’t lose my data” as the very first goal on that list, but I wanted a list of three, and I thought not losing data reasonably could be assumed to be table stakes. Let’s call it item 0.&lt;/p&gt;

&lt;p&gt;Seb explicitly does not attempt to reach sub-millisecond latencies nor scale to fantastic workloads. If you need this, there are systems infinitely more capable, designed for exactly these workloads, and which handle them &lt;em&gt;very&lt;/em&gt; well. See Kafka, Redpanda, RabbitMQ et al.&lt;/p&gt;

&lt;p&gt;In order to reach the goals of being both cheap to run and easy to manage, Seb embraces the fact that writing data to disk and &lt;em&gt;ensuring that data is actually written and stays written&lt;/em&gt; is rather difficult. It utilizes the hundreds of thousands of engineering hours that were poured into object stores and pays the price of latency at the gates of the cloud vendors. For the use cases I have in mind, this trade-off is perfect; it gives me reasonable throughput at a (very) low price.&lt;/p&gt;

&lt;p&gt;I expect the target audience for a system like this will be small and niche. Who knows? Maybe there are more people like me who need event queues but aren’t rich enough to rent them!&lt;/p&gt;

&lt;p&gt;Anyway, working on Seb has been a lot of fun and it solves exactly the problem I was looking to solve. It’s by no means “done” yet (is anything ever?), but it’s currently in a state where I can use it for what I need to. There’s of course loads of stuff I’d love to add and improve; only supporting a single, static API-key for authentication, for instance, is laughable. But things take time and this is how far I’ve come.&lt;/p&gt;

&lt;h2 id=&quot;architecture&quot;&gt;Architecture&lt;/h2&gt;

&lt;p&gt;Although Seb doesn’t have a clever play on words including “go” in its name, it’s written in Go. I kinda want to evolve it to be embeddable (even easier to manage when it lives &lt;em&gt;inside&lt;/em&gt; your application!), but for now I’ve hidden everything from the public in the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;internal/&lt;/code&gt; folder so that I don’t have to play nice with anyone that might be foolish enough to try and use it just yet. It’s currently &lt;em&gt;very&lt;/em&gt; actively under development, and I might change anything at any time. Force-push-to-master kind of active; be warned!&lt;/p&gt;

&lt;p&gt;Seb is split into three main parts: the &lt;em&gt;&lt;a href=&quot;https://github.com/micvbang/simple-event-broker/blob/master/internal/sebbroker/broker.go#L27&quot;&gt;Broker&lt;/a&gt;&lt;/em&gt;, which is responsible for managing and multiplexing Topics; the &lt;em&gt;&lt;a href=&quot;https://github.com/micvbang/simple-event-broker/blob/master/internal/sebtopic/topic.go#L38&quot;&gt;Topic&lt;/a&gt;&lt;/em&gt;, which is responsible for persisting data to the underlying storage; and the &lt;em&gt;&lt;a href=&quot;https://github.com/micvbang/simple-event-broker/blob/master/internal/sebcache/cache.go#L22&quot;&gt;Cache&lt;/a&gt;&lt;/em&gt;, which is responsible for caching data locally so that we can minimize the number of times we pass through the gates of the cloud vendors, saving both latency and cash money. This is shown below.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/static/posts/2024-05-26-seb/architecture.png&quot; alt=&quot;Seb high-level architecture&quot; /&gt;&lt;/p&gt;

&lt;p&gt;The Broker assumes that data is durably persisted when a Topic’s &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;AddRecords()&lt;/code&gt; method returns. As might be legible from my doodles above, Topic currently has three different storage backends: &lt;a href=&quot;https://github.com/micvbang/simple-event-broker/blob/master/internal/sebtopic/s3storage.go#L21&quot;&gt;S3&lt;/a&gt;, &lt;a href=&quot;https://github.com/micvbang/simple-event-broker/blob/master/internal/sebtopic/diskstorage.go#L16&quot;&gt;local disk&lt;/a&gt;, and &lt;a href=&quot;https://github.com/micvbang/simple-event-broker/blob/master/internal/sebtopic/memorystorage.go#L17&quot;&gt;local memory&lt;/a&gt;. S3 is the only one that anyone should trust with production data (remember I said that writing to disk is hard?). Disk and memory are super-duper only to be used for data that you don’t care about. Pinky-promises required before use!&lt;/p&gt;

&lt;p&gt;The simple but important realization I had when initially trying to design Seb on paper was that if I can trust the cloud vendors’ object stores that a file is durably stored once they’ve given me a 200 OK, the hardest part of the system (besides concurrency?) wouldn’t have to be handled by me. With this assumption it’s a non-event in terms of durability if my VM or local disk dies during operation. The data lives on in the skies and no caller believes that they have added data to the queue which wasn’t actually added. Argument for why this last part is true coming right up!&lt;/p&gt;

&lt;h2 id=&quot;durability-and-latency-money-trade-off&quot;&gt;Durability and latency-money trade-off&lt;/h2&gt;

&lt;p&gt;In order to not have to wait a full roundtrip every time we write data to S3 (and to save money on the $0.005-per-1,000-requests of S3!) we collect records in batches before sending them off to S3. Whenever “the first” record of a batch comes in the door, Seb will wait for a configurable amount of time in the hope that more records will arrive and can be included in the batch. Callers are blocked while waiting for the batch to finish. This is a very direct trade-off between money and latency, and your specific situation will dictate how long it makes sense to wait. Once the wait time has expired, Seb will attempt to write the accumulated records to S3. Only when we’ve gotten our response from the S3 API do we tell the callers whether their request succeeded or not. If it succeeded we send them the offset of their record, and if not we send them an error. This is &lt;em&gt;it&lt;/em&gt;: the main argument that Seb won’t lose our data. There are of course still a lot of other ways that things can go wrong, but, in terms of durability, this is the central argument: Seb only tells callers that their data has been persisted once it has gotten a 200 OK from S3.&lt;/p&gt;
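The batching behaviour described above can be sketched deterministically, ignoring concurrency. The names, the timestamp representation, and the window length are made up; the real BlockingBatcher blocks concurrent callers rather than processing a pre-sorted list.

```python
# Toy sketch of the batching trade-off: the first record opens a collection
# window, and every record arriving inside that window joins the same batch.
# Purely illustrative; not the actual Seb implementation.
def batch_records(arrivals, window):
    """arrivals: list of (timestamp, record) pairs, sorted by timestamp."""
    batches = []
    current = []
    deadline = None
    for ts, record in arrivals:
        if deadline is None:
            deadline = ts + window  # first record opens the window
        if ts > deadline:
            batches.append(current)  # flush: one S3 upload per batch
            current = []
            deadline = ts + window
        current.append(record)
    if current:
        batches.append(current)
    return batches


# Records at t=0 and t=5 share a window of length 10; t=20 starts a new one.
assert batch_records([(0, "a"), (5, "b"), (20, "c")], 10) == [["a", "b"], ["c"]]
```

A longer window means fewer S3 requests (cheaper) but higher latency for every blocked caller, which is exactly the money-versus-latency trade-off described above.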

&lt;p&gt;You might have noticed that it’s still possible that Seb will crash in the time between getting a 200 OK from S3 and replying to the caller. In this situation the data &lt;em&gt;has&lt;/em&gt; been added to the queue, and can be retrieved by consumers, but the caller has no way of knowing. So, if the caller really cares about adding their data to the queue, they will retry the call and the data will be added twice. In fancy systems lingo we would say that the producer has “at-least-once” delivery semantics. This problem is somewhat easily circumvented: if producers include a unique id in each record, consumers can use this to ignore records they’ve already handled. It would of course also be possible to handle this directly in Seb, but that would require that all producers include a unique ID for every record, and that Seb has some way of keeping track of which IDs were already added. In order to keep Seb simple, this is not a goal.&lt;/p&gt;
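A consumer-side deduplication sketch of that idea, with hypothetical names; a real consumer would keep the seen-set in durable storage rather than in memory.

```python
# Deduplicate at-least-once delivery: producers attach a unique id to each
# record, and consumers skip ids they have already seen (illustrative only).
def consume(records, handle, seen=None):
    seen = set() if seen is None else seen
    handled = []
    for record in records:
        if record["id"] in seen:
            continue  # duplicate caused by a producer retry: already handled
        seen.add(record["id"])
        handled.append(handle(record))
    return handled


# The retried record with id=1 is handled exactly once.
out = consume(
    [{"id": 1, "data": "x"}, {"id": 1, "data": "x"}, {"id": 2, "data": "y"}],
    lambda r: r["data"],
)
assert out == ["x", "y"]
```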

&lt;p&gt;The strategy for batching records is configurable and hidden behind the &lt;a href=&quot;https://github.com/micvbang/simple-event-broker/blob/master/internal/sebbroker/broker.go#L17&quot;&gt;RecordBatcher&lt;/a&gt; interface. The strategy described above is implemented as &lt;a href=&quot;https://github.com/micvbang/simple-event-broker/blob/master/internal/sebbroker/blockingbatcher.go#L39&quot;&gt;BlockingBatcher&lt;/a&gt;. There’s also a batching strategy called &lt;a href=&quot;https://github.com/micvbang/simple-event-broker/blob/master/internal/sebbroker/nullbatcher.go#L12&quot;&gt;NullBatcher&lt;/a&gt; which doesn’t do any batching, and just sends records straight through to S3, creating and uploading one file per record. This is mostly useful for testing.&lt;/p&gt;

&lt;h2 id=&quot;data-layout&quot;&gt;Data layout&lt;/h2&gt;

&lt;p&gt;The data format used in a system like this can have a large impact on read and write performance. I initially looked around for existing file formats to use but didn’t manage to find any that would be particularly helpful. Instead, I came up with the simplest and stupidest file format that I thought would work, which would be fast and simple to both write and parse. I started out being kinda inspired by LSM trees, but since I’ve yet to implement support for record keys, I’ve done nothing of the sort. It’s just a tiny header concatenated with pointers into raw record data. Oh, and files are immutable, so they’re infinitely cacheable and only ever have to be “constructed” once.&lt;/p&gt;

&lt;p&gt;This is what the format looks like:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/static/posts/2024-05-26-seb/file_format.png&quot; alt=&quot;Seb file format&quot; /&gt;&lt;/p&gt;

&lt;p&gt;As I’ve tried to show in the visualization, the file format has three sections:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;header (32 bytes)&lt;/li&gt;
  &lt;li&gt;pointers to each record (N * 4 bytes)&lt;/li&gt;
  &lt;li&gt;record data (however much data the records are)&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;For anyone who has tried to come up with a custom file format before, one of the things you’re likely to learn the hard way is that you should include a version number in the header. It’s unlikely we’ll get the file format right on the first try, and adding a version number will give us the opportunity to change the format in the future while keeping the parser code compatible with old versions without too many hacks: read the header and dispatch based on the version number.&lt;/p&gt;

&lt;p&gt;The static part of the &lt;a href=&quot;https://github.com/micvbang/simple-event-broker/blob/master/internal/sebrecords/records.go#L24&quot;&gt;header&lt;/a&gt; is declared as follows:&lt;/p&gt;

&lt;div class=&quot;language-go highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;type&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Header&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;struct&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
	&lt;span class=&quot;n&quot;&gt;MagicBytes&lt;/span&gt;  &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;byte&lt;/span&gt;
	&lt;span class=&quot;n&quot;&gt;Version&lt;/span&gt;     &lt;span class=&quot;kt&quot;&gt;int16&lt;/span&gt;
	&lt;span class=&quot;n&quot;&gt;UnixEpochUs&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;int64&lt;/span&gt;
	&lt;span class=&quot;n&quot;&gt;NumRecords&lt;/span&gt;  &lt;span class=&quot;kt&quot;&gt;uint32&lt;/span&gt;
	&lt;span class=&quot;n&quot;&gt;Reserved&lt;/span&gt;    &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;14&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;byte&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;It weighs in at 32 bytes and dictates that each file can contain a maximum of 2^32 records (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;NumRecords&lt;/code&gt; is uint32). Each offset into the file is given as a uint32, so the maximum offset into the file we can point to is 4GB. Both of these numbers are obviously &lt;em&gt;way&lt;/em&gt; larger than we are likely to want to use in practice. We want to keep the size of each file reasonably small so that it’s not too expensive to fetch it from S3 if we don’t have it in the local cache, but at the same time we don’t want it to be too small because this would mean that we have to go to S3 more often. Trade-offs everywhere!&lt;/p&gt;
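The 32-byte figure can be sanity-checked by mirroring the Go struct above in Python's struct notation. This assumes standard sizes with no padding; the exact byte order and encoding on disk are whatever Seb's Go code uses, so treat this purely as an arithmetic check.

```python
import struct

# Mirror of the header: 4s magic, h int16 version, q int64 timestamp,
# I uint32 record count, 14s reserved. "=" means standard sizes, no padding.
HEADER_FORMAT = "=4shqI14s"
assert struct.calcsize(HEADER_FORMAT) == 32  # 4 + 2 + 8 + 4 + 14

packed = struct.pack(HEADER_FORMAT, b"seb!", 1, 1716897600000000, 3, bytes(14))
magic, version, unix_epoch_us, num_records, _ = struct.unpack(HEADER_FORMAT, packed)
assert (magic, version, num_records) == (b"seb!", 1, 3)
```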

&lt;p&gt;Let’s see what everything looks like when we create a file with a few records. I’ll do the example in human readable format so that you don’t have to dust off the good-ol’ ASCII chart.&lt;/p&gt;

&lt;p&gt;Here’s our file:&lt;/p&gt;

&lt;div class=&quot;language-text highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;Data                   Field        Size      File offset
----------------------------------------------------------
seb!                 Magic bytes   4 bytes        0
1                    Version       2 bytes        4
2024-05-28 12:00:00  UnixEpochUs   8 bytes        6
3                    NumRecords    4 bytes       14
00000000000000       Reserved     14 bytes       18
44                   Index0        4 bytes       32
61                   Index1        4 bytes       36
79                   Index2        4 bytes       40
first-record-data    Data         17 bytes       44
second-record-data   Data         18 bytes       61
third-record-data    Data         17 bytes       79
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;As is hopefully clear from the above snippet, the three records we added to the file contain the rather boring data “first-record-data”, “second-record-data”, and “third-record-data”.&lt;/p&gt;

&lt;p&gt;The first step of reading back records from our file is to read the static part of the header, namely the first 32 bytes. Having read this, we can verify that the magic bytes (“seb!”) and the version number (1) match our expectations and, additionally, we have information on how many records the file contains (3). The second step is to use the number of records to calculate the size of the file’s index (3 records * 4 bytes). Now, having read both the header and the index, we know exactly where each record starts and ends.&lt;/p&gt;

&lt;p&gt;In order to read the second record, for example, we look up entry 1 in our index, which is zero-indexed. Looking at Index1 in our file, we see that the record starts at file offset 61. We can tell the length of our record by looking up the offset of the next one and subtracting the two: 79 - 61. We now know that our record starts at file offset 61 and is 18 bytes long; the code has been cracked and we can continue our adventure!&lt;/p&gt;
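The lookup walked through above, as a tiny Python helper. The offsets are taken from the worked example; the helper name is made up.

```python
# A record's length is the next index entry minus its own start offset;
# the last record runs to the end of the file. Illustrative sketch only.
def record_extent(index, file_size, n):
    """Return (start, length) of record n (zero-indexed)."""
    start = index[n]
    if n + 1 == len(index):
        end = file_size
    else:
        end = index[n + 1]
    return start, end - start


index = [44, 61, 79]  # start offsets of the three example records
assert record_extent(index, 96, 0) == (44, 17)  # first-record-data
assert record_extent(index, 96, 1) == (61, 18)  # second-record-data
assert record_extent(index, 96, 2) == (79, 17)  # third-record-data
```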

&lt;h2 id=&quot;benchmarking&quot;&gt;Benchmarking&lt;/h2&gt;

&lt;p&gt;This post has already become way too long. If you’re still reading: well done! We’re almost through. If you’re out of breath and need to take a break: I hear you. Go lie down. But, if you want to finish this before doing so, I’ve written a summary TLDR below. If you don’t want the spoiler, quickly cover your screen and scroll past the following handful of lines!&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;TLDR Summary&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Hardware: Hetzner CAX11, 2 core ARM Ampere, 4GB memory&lt;/li&gt;
  &lt;li&gt;Seb configuration: batch collection time: 10ms&lt;/li&gt;
  &lt;li&gt;Each test sends 100k records&lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Requests are sent from T14 laptop on fiber in Copenhagen, Denmark to CAX11 in Falkenstein, Germany&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;Max performance non-batched: 22k requests/s with 4800 workers (1 record/request)&lt;/li&gt;
  &lt;li&gt;Max performance batched*: 50k requests/s with 600 workers (32 records/request)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Now that I’ve spent some time building and discussing Seb, I thought it would be nice to understand how it behaves if we put it under a bit of stress. These benchmarks aren’t going to be particularly scientific. I’m aiming for getting an overall feeling for what this thing can do, not winning benchmark of the year. Each test in the following data was run just once, so you don’t have to look at those pesky error bars. Yes, I know. You’re welcome.&lt;/p&gt;

&lt;p&gt;Since Seb was designed to be cheap to run, I wanted to try it out on a cheap machine. At €4.51/month, Hetzner’s CAX11 ARM VMs are exactly what I’m looking for. They come with 2 ARM Ampere cores and 4GB memory. Hetzner provide no specs on their disks, but do state the following&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;They are optimized for high I/O performance and low latency and are especially suited for applications which require fast access to disks with low latency, such as databases.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;I expect the latency to AWS to be the dominating factor in this test anyway, so the performance of the disk shouldn’t matter &lt;em&gt;too much&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;Since we’re going for speed in these benchmarks, I decided to set the batch collection time low at 10ms. This means that, whenever the first request comes in, Seb will collect all incoming requests for the next 10ms into a batch. Once the batch is collected, Seb writes it to a file and sends it to S3 before putting it into the local disk-cache.&lt;/p&gt;

&lt;p&gt;An important detail: since Seb blocks callers while collecting a batch, we have to send a lot of HTTP requests in parallel in order to be able to saturate the system.&lt;/p&gt;

&lt;h3 id=&quot;graphs-and-numbers&quot;&gt;Graphs and numbers&lt;/h3&gt;

&lt;p&gt;The first graph we’re going to look at is runtime vs number of workers for different payloads.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/static/posts/2024-05-26-seb/benchmark_time_vs_workers.png&quot; alt=&quot;Time vs workers&quot; /&gt;&lt;/p&gt;

&lt;p&gt;We see that it’s faster to use more workers, but that the returns of adding more workers start diminishing at around 1200. I speculate that our small 2-core server starts to buckle at the knees because of the overhead of handling that many HTTP connections simultaneously.&lt;/p&gt;

&lt;p&gt;On the above graph we also see that it’s generally slower to send requests with larger payloads, but that requests of size &amp;lt;= 1024 bytes are roughly the same. This makes sense since we aren’t even filling up our Ethernet packets at this point.&lt;/p&gt;

&lt;p&gt;The next graph is requests/second vs workers for different record sizes.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/static/posts/2024-05-26-seb/benchmark_requestsps_vs_workers.png&quot; alt=&quot;Requests/second vs workers&quot; /&gt;&lt;/p&gt;

&lt;p&gt;Here, we see the maximum number of requests/second hit ~20k for record sizes 64 and 256 bytes. I can’t come up with a reason why 256 bytes should be faster than 64, so I’m going to assume that this is just noise. After all, we are running this on a shared VM and giving it a bit of a hard time. See, I promised: no error bars!&lt;/p&gt;

&lt;p&gt;Starting at 1200 workers, we see that the requests/second drops by roughly half with a quadrupling of the record size. This is another indication to me that we have found the point at which we’re starting to confuse our hardworking CAX11 with the sheer number of requests we’re sending to it. If &lt;em&gt;only&lt;/em&gt; the record size had been the bottleneck, I would expect the number of requests/second to drop by something closer to a factor of four. Another way to look at this: ~3000 requests/second at 16kb/request is around 375 mbit/s, whereas ~7k requests/second at 4kb/request is around 220 mbit/s. Even though the number of requests is much lower, we’re still pushing almost double the amount of data through with our 16kb payload. The record size does seem to have an impact, though, which we can see from how the graphs flatten out a lot quicker for the higher record sizes.&lt;/p&gt;
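&lt;p&gt;If you want to check the back-of-the-envelope arithmetic, it’s a couple of lines of Python. Note the assumption that mbit here means binary megabits (2^20 bits), which is what matches the figures above:&lt;/p&gt;

```python
def mbit_per_second(requests_per_second, bytes_per_request):
    # 8 bits per byte; divide by 2**20 for (binary) megabits.
    return requests_per_second * bytes_per_request * 8 / 2**20

print(round(mbit_per_second(3000, 16 * 1024)))  # 16kb records
print(round(mbit_per_second(7000, 4 * 1024)))   # 4kb records
```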

&lt;p&gt;I didn’t really plan to benchmark any further, but after finding that we’re probably saturating the server with the number of requests rather than the amount of data we’re pushing through, I decided to do one more benchmark. This time I’m using Seb’s batch API, which allows us to queue multiple records per request.&lt;/p&gt;

&lt;p&gt;The final graph shows us records/second vs workers, for batch sizes of 1 and 32 with a record size of 1kb.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/static/posts/2024-05-26-seb/benchmark_batch_recordsps_vs_workers.png&quot; alt=&quot;Records/second vs workers&quot; /&gt;&lt;/p&gt;

&lt;p&gt;As we would expect from our above analysis, the graph shows that the number of records/second increases dramatically (more than doubling from ~22k to ~50k!) when records are batched. On the graph, we also see that the system starts to deteriorate at 1200 workers. This matches our previous observations. I believe the main difference now is that we’re not just stressing it with the sheer number of requests, but also giving it more work per request than it has time to handle. The system simply can’t keep up anymore and performance starts to degrade.&lt;/p&gt;

&lt;p&gt;Alright, that’s it, folks! I must say I’m pretty happy with how much work we can push through this system. ~22k and ~50k records/second is a lot more than I expect to need in the foreseeable future. Turns out that Seb packs a decent punch!&lt;/p&gt;

&lt;h2 id=&quot;todos-and-missing-features&quot;&gt;TODOs and missing features&lt;/h2&gt;

&lt;p&gt;There’s still a bunch of things I’d love to work on to improve Seb. I’ve spent too much time writing the above, so I’ll just outline the TODOs and missing features in a bullet list below. Perhaps some of these will be the topic of another post?&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Authentication
    &lt;ul&gt;
      &lt;li&gt;currently only supports a single, deployment-wide API key&lt;/li&gt;
      &lt;li&gt;considering: certificate-based authentication&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;keep state
    &lt;ul&gt;
      &lt;li&gt;probably sqlite&lt;/li&gt;
      &lt;li&gt;track consumer offsets&lt;/li&gt;
      &lt;li&gt;track record keys&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;record keys
    &lt;ul&gt;
      &lt;li&gt;compaction&lt;/li&gt;
      &lt;li&gt;history of values for key&lt;/li&gt;
      &lt;li&gt;iterate over all keys&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;clean up old data
    &lt;ul&gt;
      &lt;li&gt;LSM compaction (requires record keys)&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;If the post resonated with you and you are looking for someone to help you to do hard things with a computer, you can &lt;a href=&quot;/hire_me.html&quot;&gt;hire me&lt;/a&gt; to help you!&lt;/strong&gt;&lt;/p&gt;
</description>
        <pubDate>Sun, 26 May 2024 16:20:00 +0000</pubDate>
        <link>https://blog.vbang.dk/2024/05/26/seb/</link>
        <guid isPermaLink="true">https://blog.vbang.dk/2024/05/26/seb/</guid>
      </item>
          <item>
            <title>About</title>
            <description>&lt;p&gt;I’m Michael, based in Copenhagen, Denmark. I’ve made a living as a programmer since 2007 and I now run a consultancy that helps customers do things with computers.&lt;/p&gt;

&lt;p&gt;Since you’re reading this, I’m assuming that you read one of my blog posts and it resonated with you somehow. That makes me happy! Or, perhaps you’re furious and need to tell me how wrong I am. In that case, do tell!&lt;/p&gt;

&lt;p&gt;You can hire me on an hourly, weekly, or monthly basis. Depending on the type of project, I’m also available for longer contracts.&lt;/p&gt;

&lt;p&gt;My contact information is at &lt;a href=&quot;#contact&quot;&gt;the bottom of the page&lt;/a&gt;.&lt;/p&gt;

&lt;h2 id=&quot;experience&quot;&gt;Experience&lt;/h2&gt;

&lt;p&gt;I love working behind the scenes, doing backend, infrastructure, and systems programming. Basically anywhere that maintainability, correctness, and performance are important factors, and where people mostly expect things to Just Work™.&lt;/p&gt;

&lt;p&gt;I have experience from the following industries:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Health care (Novo Nordisk, Adent Health)&lt;/li&gt;
  &lt;li&gt;Semiconductors/systems programming research (Samsung Research)&lt;/li&gt;
  &lt;li&gt;Banking (Danske Bank)&lt;/li&gt;
  &lt;li&gt;Gaming (noesis.gg)&lt;/li&gt;
  &lt;li&gt;Real estate (Ejendomstorvet)&lt;/li&gt;
  &lt;li&gt;Consulting (Eksponent, Big Bang Holding)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;On a contract, I’m happy to help in any way that I can: architecting, writing code, mentoring, doing code reviews, setting up CI/CD pipelines. All of it is important and required for teams to be great. I’m pragmatic, open, easy-going, and I love keeping things light and fun. I’m professional and I’m on time.&lt;/p&gt;

&lt;p&gt;I care deeply about, and have proven experience with:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;technical project leadership&lt;/li&gt;
  &lt;li&gt;mentoring&lt;/li&gt;
  &lt;li&gt;designing and implementing greenfield projects&lt;/li&gt;
  &lt;li&gt;writing testable, maintainable code&lt;/li&gt;
  &lt;li&gt;writing tests that actually provide value&lt;/li&gt;
  &lt;li&gt;improving maintainability and testability of existing systems&lt;/li&gt;
  &lt;li&gt;using performance profiling to guide development&lt;/li&gt;
  &lt;li&gt;doing code reviews&lt;/li&gt;
  &lt;li&gt;technical writing&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I’m also experienced with technical reviewing, having reviewed:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Matt Boyle’s &lt;a href=&quot;https://www.bytesizego.com/the-ultimate-guide-to-debugging-with-go-book&quot;&gt;Foundations of Debugging with Go&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;Inanc Gumus’s &lt;a href=&quot;https://www.manning.com/books/go-by-example&quot;&gt;Go by Example&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;Bartlomiej Plotka’s &lt;a href=&quot;https://www.oreilly.com/library/view/efficient-go/9781098105709/&quot;&gt;Efficient Go&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;William Kennedy’s &lt;a href=&quot;https://education.ardanlabs.com/courses/ultimate-go-notebook&quot;&gt;Ultimate Go Notebook&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;references&quot;&gt;References&lt;/h2&gt;

&lt;p&gt;You don’t have to take my word for it. Below are references from previous employers:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;/static/references/novo_nordisk_joanna_sharman_soares.pdf&quot;&gt;2023-2024 Novo Nordisk&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;/static/references/foss_nicolas_arogvi.pdf&quot;&gt;2023 FOSS&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;/static/references/ejendomstorvet_jonas_krat.pdf&quot;&gt;2022 Ejendomstorvet&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;/static/references/samsung_javier_gonzalez.pdf&quot;&gt;2021 Samsung&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;2018-2020 Co-founding &lt;a href=&quot;https://noesis.gg&quot;&gt;noesis.gg&lt;/a&gt; and founding &lt;a href=&quot;https://cvr.dev&quot;&gt;cvr.dev&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;/static/references/danske_bank_jacob_avlund.pdf&quot;&gt;2017-2018 Danske Bank&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;/static/references/eksponent_christian_dalager.pdf&quot;&gt;2014-2017 Eksponent&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;contact&quot;&gt;Contact&lt;/h2&gt;

&lt;p&gt;You can contact me here:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Email: &lt;a href=&quot;mailto:project@vbang.dk&quot;&gt;project@vbang.dk&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;LinkedIn: &lt;a href=&quot;https://www.linkedin.com/in/micvbang&quot;&gt;micvbang&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;Twitter: &lt;a href=&quot;https://x.com/micvbang&quot;&gt;@micvbang&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
            <link>https://blog.vbang.dk/about.html</link>
          </item>
  </channel>
</rss>