<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <author>
    <name>Daniel Shih</name>
  </author>
  <generator uri="https://hexo.io/">Hexo</generator>
  <id>https://isdaniel.github.io/</id>
  <link href="https://isdaniel.github.io/" rel="alternate"/>
  <link href="https://isdaniel.github.io/atom.xml" rel="self"/>
  <rights>All rights reserved 2026, Daniel Shih</rights>
  <subtitle>A good idea has no value by itself; the value lies in making it real</subtitle>
  <title>石頭的coding之路</title>
  <updated>2026-04-22T03:00:22.029Z</updated>
  <entry>
    <author>
      <name>Daniel Shih</name>
    </author>
    <category term="Rust" scheme="https://isdaniel.github.io/categories/Rust/"/>
    <category term="PostgreSQL" scheme="https://isdaniel.github.io/categories/Rust/PostgreSQL/"/>
    <category term="logical-replication" scheme="https://isdaniel.github.io/categories/Rust/PostgreSQL/logical-replication/"/>
    <category term="Rust" scheme="https://isdaniel.github.io/tags/Rust/"/>
    <category term="PostgreSQL" scheme="https://isdaniel.github.io/tags/PostgreSQL/"/>
    <category term="logical-replication" scheme="https://isdaniel.github.io/tags/logical-replication/"/>
    <category term="WAL" scheme="https://isdaniel.github.io/tags/WAL/"/>
    <category term="CDC" scheme="https://isdaniel.github.io/tags/CDC/"/>
    <content>
      <![CDATA[<h1 id="pg-walstream-High-Performance-PostgreSQL-WAL-Streaming-in-Rust"><a href="#pg-walstream-High-Performance-PostgreSQL-WAL-Streaming-in-Rust" class="headerlink" title="pg-walstream: High-Performance PostgreSQL WAL Streaming in Rust"></a>pg-walstream: High-Performance PostgreSQL WAL Streaming in Rust</h1><p><a href="https://github.com/isdaniel/pg-walstream">pg-walstream</a> is a Rust library for parsing and streaming PostgreSQL Write-Ahead Log (WAL) messages through logical and physical replication protocols. It provides a type-safe, async-first interface for building real-time Change Data Capture (CDC) pipelines.</p><p>If you need to react to database changes in real-time — event-driven architectures, data pipelines, audit logging, cache invalidation, or search index syncing — pg-walstream abstracts the complex PostgreSQL replication protocol into a clean Rust API.</p><p><a href="https://crates.io/crates/pg_walstream">crates.io</a> | <a href="https://docs.rs/pg-walstream">API Docs</a></p><h2 id="Background-PostgreSQL-WAL-and-Replication"><a href="#Background-PostgreSQL-WAL-and-Replication" class="headerlink" title="Background: PostgreSQL WAL and Replication"></a>Background: PostgreSQL WAL and Replication</h2><p>Before diving into pg-walstream, it helps to understand how PostgreSQL replication works at a fundamental level.</p><h3 id="What-is-WAL-Write-Ahead-Logging"><a href="#What-is-WAL-Write-Ahead-Logging" class="headerlink" title="What is WAL (Write-Ahead Logging)?"></a>What is WAL (Write-Ahead Logging)?</h3><p>WAL is PostgreSQL’s mechanism for ensuring data durability. The core idea: <strong>write the transaction log first, then write the actual data</strong>. 
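</p><p>The principle can be sketched with a toy model (plain Rust invented for this illustration, not pg-walstream or PostgreSQL code): append every change to a sequential log before touching the data, then rebuild the data by replaying the log after a crash:</p>

```rust
// Toy write-ahead log (illustrative only): every change is appended to
// a sequential log *before* the in-memory "data pages" are modified,
// so a crash can be survived by replaying (REDO) the surviving log.
use std::collections::HashMap;

enum Record {
    Insert { key: String, value: String },
    #[allow(dead_code)]
    Delete { key: String },
}

#[derive(Default)]
struct ToyDb {
    wal: Vec<Record>,              // durable, append-only log
    data: HashMap<String, String>, // "data pages"; may lag behind the log
}

impl ToyDb {
    fn insert(&mut self, key: &str, value: &str) {
        // 1. Log first ...
        self.wal.push(Record::Insert { key: key.into(), value: value.into() });
        // 2. ... then modify the data.
        self.data.insert(key.into(), value.into());
    }

    /// Crash recovery: rebuild the data by replaying the log in order.
    fn recover(wal: Vec<Record>) -> Self {
        let mut db = ToyDb { wal, data: HashMap::new() };
        for rec in &db.wal {
            match rec {
                Record::Insert { key, value } => {
                    db.data.insert(key.clone(), value.clone());
                }
                Record::Delete { key } => {
                    db.data.remove(key);
                }
            }
        }
        db
    }
}

fn main() {
    let mut db = ToyDb::default();
    db.insert("user:1", "daniel");
    // Simulate a crash in which only the log survived.
    let recovered = ToyDb::recover(db.wal);
    assert_eq!(recovered.data.get("user:1").map(String::as_str), Some("daniel"));
    println!("replayed {} record(s)", recovered.wal.len());
}
```

<p>Real WAL records are binary, page-oriented, and flushed durably on commit, but the ordering guarantee is the same: log first, data second.</p><p>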
Every INSERT, UPDATE, DELETE, and DDL change is recorded sequentially in WAL files before the corresponding data pages are flushed to disk.</p><p>This design provides two key benefits:</p><ol><li><strong>Crash recovery</strong> — if the system crashes, PostgreSQL can replay (REDO) the WAL to recover committed transactions that weren’t yet flushed to disk.</li><li><strong>I&#x2F;O efficiency</strong> — sequential WAL writes are much faster than random page writes, so PostgreSQL doesn’t need to flush dirty pages on every commit.</li></ol><p>Each WAL record is identified by a <strong>Log Sequence Number (LSN)</strong> — a monotonically increasing pointer into the WAL stream. LSN is the backbone of replication: it tells both the sender and receiver exactly where they are in the change history.</p><figure class="highlight n1ql"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line">               WAL Stream</span><br><span class="line">┌───────┬───────┬───────┬───────┬───────┐</span><br><span class="line">│ BEGIN │<span class="keyword">INSERT</span> │<span class="keyword">UPDATE</span> │<span class="keyword">DELETE</span> │<span class="keyword">COMMIT</span> │  ...</span><br><span class="line">│LSN: <span class="number">1</span> │LSN: <span class="number">2</span> │LSN: <span class="number">3</span> │LSN: <span class="number">4</span> │LSN: <span class="number">5</span> │</span><br><span class="line">└───────┴───────┴───────┴───────┴───────┘</span><br><span class="line">                      │</span><br><span class="line">           Replayed <span class="keyword">by</span> standby / consumed <span class="keyword">by</span> CDC</span><br></pre></td></tr></table></figure><p>For a deeper dive into WAL internals, see my earlier 
post: <a href="https://isdaniel.github.io/postgresql-wal-introduce/">PostgreSQL WAL (Write-Ahead Logging) mechanism</a>.</p><h3 id="Physical-vs-Logical-Replication"><a href="#Physical-vs-Logical-Replication" class="headerlink" title="Physical vs. Logical Replication"></a>Physical vs. Logical Replication</h3><p>PostgreSQL supports two replication modes, each serving different use cases:</p><p><strong>Physical Replication</strong> streams raw WAL bytes — the exact disk-level changes — to a standby server. The standby replays these byte-for-byte, producing an identical copy of the primary. This is what powers read replicas and high-availability setups.</p><p><strong>Logical Replication</strong> decodes WAL into higher-level change events (INSERT, UPDATE, DELETE) using an output plugin (e.g., <code>pgoutput</code>). Instead of raw disk blocks, consumers receive structured messages like “row X was inserted into table Y with these column values.” This is what powers CDC pipelines.</p><figure class="highlight sql"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line">Physical Replication:</span><br><span class="line">  <span class="keyword">Primary</span> ──[raw WAL bytes]──▶ Standby (byte<span class="operator">-</span>identical <span class="keyword">copy</span>)</span><br><span class="line"></span><br><span class="line">Logical Replication:</span><br><span class="line">  <span class="keyword">Primary</span> ──[pgoutput plugin]──▶ Decoded Change Events ──▶ CDC Consumer</span><br><span class="line">                                  (<span class="keyword">INSERT</span><span class="operator">/</span><span class="keyword">UPDATE</span><span class="operator">/</span><span class="keyword">DELETE</span>)</span><br></pre></td></tr></table></figure><p>pg-walstream 
supports <strong>both</strong> modes — physical replication for standby&#x2F;backup scenarios and logical replication for CDC.</p><h3 id="The-Logical-Replication-Protocol"><a href="#The-Logical-Replication-Protocol" class="headerlink" title="The Logical Replication Protocol"></a>The Logical Replication Protocol</h3><p>When a client connects to PostgreSQL in replication mode, they communicate via the <strong>Streaming Replication Protocol</strong>. For logical replication, the flow works as follows:</p><ol><li><strong>Client creates a replication slot</strong> — this tells PostgreSQL to retain WAL segments needed by this consumer, preventing them from being recycled.</li><li><strong>Client starts replication</strong> from a slot, specifying the output plugin (<code>pgoutput</code>) and which publication to subscribe to.</li><li><strong>PostgreSQL streams messages</strong> — the server continuously sends WAL data messages (<code>XLogData</code>) containing decoded change events.</li><li><strong>Client sends feedback</strong> — periodically, the client reports its progress (flushed LSN, applied LSN) back to the server. 
This lets PostgreSQL know which WAL segments can be safely recycled.</li></ol><p>The decoded messages follow the <strong>logical replication message format</strong>, which has evolved across four protocol versions:</p><table><thead><tr><th align="center">Protocol Version</th><th align="center">PostgreSQL</th><th>Key Additions</th></tr></thead><tbody><tr><td align="center">v1</td><td align="center">10+</td><td>Core messages: BEGIN, COMMIT, INSERT, UPDATE, DELETE, TRUNCATE, RELATION, TYPE, ORIGIN</td></tr><tr><td align="center">v2</td><td align="center">14+</td><td><strong>Streaming transactions</strong>: STREAM_START, STREAM_STOP, STREAM_COMMIT, STREAM_ABORT — allows consuming large, in-progress transactions before COMMIT</td></tr><tr><td align="center">v3</td><td align="center">15+</td><td><strong>Two-phase commit</strong>: BEGIN_PREPARE, PREPARE, COMMIT_PREPARED, ROLLBACK_PREPARED, STREAM_PREPARE</td></tr><tr><td align="center">v4</td><td align="center">16+</td><td><strong>Parallel streaming</strong>, <code>abort_lsn</code> field for more precise abort handling</td></tr></tbody></table><p>Each message type carries specific data. 
For example, an INSERT message contains:</p><ul><li><strong>Relation ID</strong> — which table the row belongs to</li><li><strong>Tuple data</strong> — the column values of the new row, typed by OID</li></ul><p>The RELATION message (sent once per table, or when a schema changes) maps the relation ID to a table name, namespace, and column definitions — so the consumer can interpret the tuple data.</p><h3 id="Why-Build-a-Library-for-This"><a href="#Why-Build-a-Library-for-This" class="headerlink" title="Why Build a Library for This?"></a>Why Build a Library for This?</h3><p>Implementing the replication protocol from scratch involves:</p><ul><li>Managing the PostgreSQL wire protocol and authentication (cleartext, MD5, SCRAM-SHA-256)</li><li>Parsing binary WAL messages with protocol-version-specific formats</li><li>Tracking LSN positions and sending periodic feedback to avoid WAL bloat</li><li>Handling connection drops, retries, and replication slot lifecycle</li><li>Dealing with streaming transactions that may arrive interleaved</li></ul><p>pg-walstream encapsulates all of this complexity into a type-safe Rust API, so you can focus on what to <strong>do</strong> with the change events rather than how to <strong>receive</strong> them.</p><h2 id="Key-Features"><a href="#Key-Features" class="headerlink" title="Key Features"></a>Key Features</h2><ul><li><strong>Protocol v1–v4 support</strong> including streaming transactions (v2), two-phase commit (v3), and parallel streaming (v4)</li><li><strong>Two connection backends</strong>: <code>libpq</code> (C FFI, default) and <code>rustls-tls</code> (pure Rust, no runtime C deps)</li><li><strong>Zero-copy buffers</strong> via the <code>bytes</code> crate — no unnecessary data cloning</li><li><strong>Serde-based deserialization</strong> — map WAL events directly to Rust structs</li><li><strong>Automatic retry</strong> with exponential backoff for transient failures</li><li><strong>Async&#x2F;await</strong> with <code>tokio</code> 
and <code>futures::Stream</code> integration</li><li><strong>Memory efficient</strong> — all configurations stay under 18 MB RSS</li></ul><h2 id="Architecture"><a href="#Architecture" class="headerlink" title="Architecture"></a>Architecture</h2><figure class="highlight pgsql"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br></pre></td><td class="code"><pre><span class="line">┌──────────────────────────────────────────┐</span><br><span class="line">│          Application Layer               │</span><br><span class="line">│  (Your CDC / <span class="keyword">Replication</span> Logic)          │</span><br><span class="line">└──────────────┬───────────────────────────┘</span><br><span class="line">               │</span><br><span class="line">┌──────────────▼───────────────────────────┐</span><br><span class="line">│    LogicalReplicationStream              │</span><br><span class="line">│  - <span class="keyword">Connection</span> management &amp; retry   
      │</span><br><span class="line">│  - Event processing &amp; LSN feedback       │</span><br><span class="line">│  - <span class="keyword">Snapshot</span> export support               │</span><br><span class="line">└──────────────┬───────────────────────────┘</span><br><span class="line">               │</span><br><span class="line">┌──────────────▼───────────────────────────┐</span><br><span class="line">│  LogicalReplicationParser                │</span><br><span class="line">│  - Protocol v1-v4 parsing                │</span><br><span class="line">│  - Zero-<span class="keyword">copy</span> message deserialization     │</span><br><span class="line">│  - Streaming <span class="keyword">transaction</span> support         │</span><br><span class="line">└──────────────┬───────────────────────────┘</span><br><span class="line">               │</span><br><span class="line">┌──────────────▼───────────────────────────┐</span><br><span class="line">│     PgReplicationConnection              │</span><br><span class="line">│  ┌─────────────────┬──────────────────┐  │</span><br><span class="line">│  │  libpq backend  │ rustls-tls       │  │</span><br><span class="line">│  │  (C FFI)        │ (pure Rust)      │  │</span><br><span class="line">│  └─────────────────┴──────────────────┘  │</span><br><span class="line">│  Compile-<span class="type">time</span> feature flag selection     │</span><br><span class="line">└──────────────┬───────────────────────────┘</span><br><span class="line">               │</span><br><span class="line">┌──────────────▼───────────────────────────┐</span><br><span class="line">│     BufferReader / BufferWriter          │</span><br><span class="line">│  - Zero-<span class="keyword">copy</span> operations (bytes crate)    │</span><br><span class="line">│  - Binary protocol handling              │</span><br><span class="line">└──────────────────────────────────────────┘</span><br></pre></td></tr></table></figure><p>The library has two connection 
backends selected at compile time:</p><table><thead><tr><th>Backend</th><th>Feature Flag</th><th>Dependencies</th><th>Description</th></tr></thead><tbody><tr><td>libpq (default)</td><td><code>libpq</code></td><td><code>libpq-dev</code>, <code>libclang-dev</code></td><td>FFI wrapper around PostgreSQL’s C client library</td></tr><tr><td>rustls-tls</td><td><code>rustls-tls</code></td><td><code>cmake</code> (build-time only)</td><td>Pure-Rust TLS via rustls + aws-lc-rs (hardware-accelerated)</td></tr></tbody></table><p>When both features are enabled, <code>rustls-tls</code> takes priority automatically.</p><h2 id="Getting-Started"><a href="#Getting-Started" class="headerlink" title="Getting Started"></a>Getting Started</h2><h3 id="Installation"><a href="#Installation" class="headerlink" title="Installation"></a>Installation</h3><figure class="highlight toml"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># Default (libpq backend)</span></span><br><span class="line"><span class="section">[dependencies]</span></span><br><span class="line"><span class="attr">pg_walstream</span> = <span class="string">&quot;0.6.2&quot;</span></span><br><span class="line"></span><br><span class="line"><span class="comment"># Pure-Rust backend (no C runtime deps)</span></span><br><span class="line"><span class="attr">pg_walstream</span> = &#123; version = <span class="string">&quot;0.6.2&quot;</span>, default-features = <span class="literal">false</span>, features = [<span class="string">&quot;rustls-tls&quot;</span>] &#125;</span><br></pre></td></tr></table></figure><h3 id="PostgreSQL-Setup"><a href="#PostgreSQL-Setup" class="headerlink" title="PostgreSQL Setup"></a>PostgreSQL Setup</h3><p>Enable logical replication in 
<code>postgresql.conf</code>:</p><figure class="highlight ini"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line"><span class="attr">wal_level</span> = logical</span><br><span class="line"><span class="attr">max_replication_slots</span> = <span class="number">4</span></span><br><span class="line"><span class="attr">max_wal_senders</span> = <span class="number">4</span></span><br></pre></td></tr></table></figure><p>Create a publication and replication user:</p><figure class="highlight sql"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">CREATE</span> PUBLICATION my_publication <span class="keyword">FOR</span> <span class="keyword">TABLE</span> users, orders;</span><br><span class="line"></span><br><span class="line"><span class="keyword">CREATE</span> <span class="keyword">USER</span> replication_user <span class="keyword">WITH</span> REPLICATION PASSWORD <span class="string">&#x27;secure_password&#x27;</span>;</span><br><span class="line"><span class="keyword">GRANT</span> <span class="keyword">SELECT</span> <span class="keyword">ON</span> <span class="keyword">ALL</span> TABLES <span class="keyword">IN</span> SCHEMA public <span class="keyword">TO</span> replication_user;</span><br></pre></td></tr></table></figure><h3 id="Basic-Streaming-Example"><a href="#Basic-Streaming-Example" class="headerlink" title="Basic Streaming Example"></a>Basic Streaming Example</h3><figure class="highlight rust"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span 
class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">use</span> pg_walstream::&#123;</span><br><span class="line">    LogicalReplicationStream, ReplicationStreamConfig, RetryConfig,</span><br><span class="line">    StreamingMode, CancellationToken,</span><br><span class="line">&#125;;</span><br><span class="line"><span class="keyword">use</span> std::time::Duration;</span><br><span class="line"></span><br><span class="line"><span class="meta">#[tokio::main]</span></span><br><span class="line"><span class="keyword">async</span> <span class="keyword">fn</span> <span class="title function_">main</span>() <span class="punctuation">-&gt;</span> <span class="type">Result</span>&lt;(), <span class="type">Box</span>&lt;<span 
class="keyword">dyn</span> std::error::Error&gt;&gt; &#123;</span><br><span class="line">    <span class="keyword">let</span> <span class="variable">config</span> = ReplicationStreamConfig::<span class="title function_ invoke__">new</span>(</span><br><span class="line">        <span class="string">&quot;my_slot&quot;</span>.<span class="title function_ invoke__">to_string</span>(),</span><br><span class="line">        <span class="string">&quot;my_publication&quot;</span>.<span class="title function_ invoke__">to_string</span>(),</span><br><span class="line">        <span class="number">2</span>,                              <span class="comment">// Protocol version</span></span><br><span class="line">        StreamingMode::On,</span><br><span class="line">        Duration::<span class="title function_ invoke__">from_secs</span>(<span class="number">10</span>),        <span class="comment">// Feedback interval</span></span><br><span class="line">        Duration::<span class="title function_ invoke__">from_secs</span>(<span class="number">30</span>),        <span class="comment">// Connection timeout</span></span><br><span class="line">        Duration::<span class="title function_ invoke__">from_secs</span>(<span class="number">60</span>),        <span class="comment">// Health check interval</span></span><br><span class="line">        RetryConfig::<span class="title function_ invoke__">default</span>(),</span><br><span class="line">    );</span><br><span class="line"></span><br><span class="line">    <span class="keyword">let</span> <span class="keyword">mut </span><span class="variable">stream</span> = LogicalReplicationStream::<span class="title function_ invoke__">new</span>(</span><br><span class="line">        <span class="string">&quot;postgresql://postgres:password@localhost:5432/mydb?replication=database&quot;</span>,</span><br><span class="line">        config,</span><br><span class="line">    ).<span class="keyword">await</span>?;</span><br><span 
class="line"></span><br><span class="line">    stream.<span class="title function_ invoke__">start</span>(<span class="literal">None</span>).<span class="keyword">await</span>?;</span><br><span class="line">    <span class="keyword">let</span> <span class="variable">cancel_token</span> = CancellationToken::<span class="title function_ invoke__">new</span>();</span><br><span class="line"></span><br><span class="line">    <span class="keyword">loop</span> &#123;</span><br><span class="line">        <span class="keyword">match</span> stream.<span class="title function_ invoke__">next_event_with_retry</span>(&amp;cancel_token).<span class="keyword">await</span> &#123;</span><br><span class="line">            <span class="title function_ invoke__">Ok</span>(event) =&gt; &#123;</span><br><span class="line">                <span class="built_in">println!</span>(<span class="string">&quot;Received: &#123;:?&#125;&quot;</span>, event);</span><br><span class="line">                stream.shared_lsn_feedback.<span class="title function_ invoke__">update_applied_lsn</span>(event.lsn.<span class="title function_ invoke__">value</span>());</span><br><span class="line">            &#125;</span><br><span class="line">            <span class="title function_ invoke__">Err</span>(e) <span class="keyword">if</span> matches!(e, pg_walstream::ReplicationError::<span class="title function_ invoke__">Cancelled</span>(_)) =&gt; &#123;</span><br><span class="line">                <span class="built_in">println!</span>(<span class="string">&quot;Shutting down gracefully&quot;</span>);</span><br><span class="line">                <span class="keyword">break</span>;</span><br><span class="line">            &#125;</span><br><span class="line">            <span class="title function_ invoke__">Err</span>(e) =&gt; &#123;</span><br><span class="line">                <span class="built_in">eprintln!</span>(<span class="string">&quot;Error: &#123;&#125;&quot;</span>, e);</span><br><span 
class="line">                <span class="keyword">break</span>;</span><br><span class="line">            &#125;</span><br><span class="line">        &#125;</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    <span class="title function_ invoke__">Ok</span>(())</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><h3 id="Typed-Deserialization"><a href="#Typed-Deserialization" class="headerlink" title="Typed Deserialization"></a>Typed Deserialization</h3><p>pg-walstream supports mapping WAL events directly to Rust structs via Serde:</p><figure class="highlight rust"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">use</span> serde::Deserialize;</span><br><span class="line"><span class="keyword">use</span> pg_walstream::EventType;</span><br><span class="line"></span><br><span class="line"><span class="meta">#[derive(Debug, Deserialize)]</span></span><br><span class="line"><span class="keyword">struct</span> <span class="title class_">User</span> 
&#123;</span><br><span class="line">    id: <span class="type">i64</span>,</span><br><span class="line">    username: <span class="type">String</span>,</span><br><span class="line">    email: <span class="type">Option</span>&lt;<span class="type">String</span>&gt;,</span><br><span class="line">    score: <span class="type">f64</span>,</span><br><span class="line">    active: <span class="type">bool</span>,</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="comment">// Inside your event loop:</span></span><br><span class="line"><span class="keyword">match</span> &amp;event.event_type &#123;</span><br><span class="line">    EventType::Insert &#123; .. &#125; =&gt; &#123;</span><br><span class="line">        <span class="keyword">let</span> <span class="variable">user</span>: User = event.<span class="title function_ invoke__">deserialize_insert</span>()?;</span><br><span class="line">        <span class="built_in">println!</span>(<span class="string">&quot;New user: &#123;:?&#125;&quot;</span>, user);</span><br><span class="line">    &#125;</span><br><span class="line">    EventType::Update &#123; .. &#125; =&gt; &#123;</span><br><span class="line">        <span class="keyword">let</span> (old, new): (<span class="type">Option</span>&lt;User&gt;, User) = event.<span class="title function_ invoke__">deserialize_update</span>()?;</span><br><span class="line">        <span class="built_in">println!</span>(<span class="string">&quot;Updated: &#123;:?&#125; -&gt; &#123;:?&#125;&quot;</span>, old, new);</span><br><span class="line">    &#125;</span><br><span class="line">    EventType::Delete &#123; .. 
&#125; =&gt; &#123;</span><br><span class="line">        <span class="keyword">let</span> <span class="variable">user</span>: User = event.<span class="title function_ invoke__">deserialize_delete</span>()?;</span><br><span class="line">        <span class="built_in">println!</span>(<span class="string">&quot;Deleted: &#123;:?&#125;&quot;</span>, user);</span><br><span class="line">    &#125;</span><br><span class="line">    _ =&gt; &#123;&#125;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><h3 id="Producer-Consumer-Pattern-with-tokio-spawn"><a href="#Producer-Consumer-Pattern-with-tokio-spawn" class="headerlink" title="Producer&#x2F;Consumer Pattern with tokio::spawn"></a>Producer&#x2F;Consumer Pattern with tokio::spawn</h3><p>For high-throughput scenarios, decouple WAL reading from event processing using channels:</p><figure class="highlight rust"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">use</span> tokio::sync::mpsc;</span><br><span class="line"><span class="keyword">use</span> pg_walstream::&#123;ChangeEvent, ReplicationError&#125;;</span><br><span class="line"></span><br><span class="line"><span 
class="keyword">async</span> <span class="keyword">fn</span> <span class="title function_">run_producer</span>(</span><br><span class="line">    <span class="keyword">mut</span> stream: LogicalReplicationStream,</span><br><span class="line">    cancel_token: CancellationToken,</span><br><span class="line">    tx: mpsc::Sender&lt;ChangeEvent&gt;,</span><br><span class="line">) &#123;</span><br><span class="line">    stream.<span class="title function_ invoke__">start</span>(<span class="literal">None</span>).<span class="keyword">await</span>.<span class="title function_ invoke__">unwrap</span>();</span><br><span class="line">    <span class="keyword">loop</span> &#123;</span><br><span class="line">        <span class="keyword">match</span> stream.<span class="title function_ invoke__">next_event_with_retry</span>(&amp;cancel_token).<span class="keyword">await</span> &#123;</span><br><span class="line">            <span class="title function_ invoke__">Ok</span>(event) =&gt; &#123;</span><br><span class="line">                stream.shared_lsn_feedback.<span class="title function_ invoke__">update_applied_lsn</span>(event.lsn.<span class="title function_ invoke__">value</span>());</span><br><span class="line">                <span class="keyword">if</span> tx.<span class="title function_ invoke__">send</span>(event).<span class="keyword">await</span>.<span class="title function_ invoke__">is_err</span>() &#123; <span class="keyword">break</span>; &#125;</span><br><span class="line">            &#125;</span><br><span class="line">            <span class="title function_ invoke__">Err</span>(ReplicationError::<span class="title function_ invoke__">Cancelled</span>(_)) =&gt; <span class="keyword">break</span>,</span><br><span class="line">            <span class="title function_ invoke__">Err</span>(e) =&gt; &#123;</span><br><span class="line">                <span class="built_in">eprintln!</span>(<span class="string">&quot;Fatal: 
&#123;e&#125;&quot;</span>);</span><br><span class="line">                <span class="keyword">break</span>;</span><br><span class="line">            &#125;</span><br><span class="line">        &#125;</span><br><span class="line">    &#125;</span><br><span class="line">    stream.<span class="title function_ invoke__">stop</span>().<span class="keyword">await</span>.<span class="title function_ invoke__">ok</span>();</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><h2 id="Load-Testing-Results"><a href="#Load-Testing-Results" class="headerlink" title="Load Testing Results"></a>Load Testing Results</h2><p>The library was benchmarked on an 8-core Intel Xeon Platinum 8370C (16 GB RAM, Ubuntu 22.04) across three PostgreSQL configurations:</p><ul><li><strong>PG16</strong> with protocol v4 + parallel streaming (rustls-tls backend)</li><li><strong>PG18</strong> with binary mode + direct TLS negotiation</li><li><strong>PG18 + COPY</strong> with COPY generator optimization</li></ul><h3 id="DML-Throughput-events-sec"><a href="#DML-Throughput-events-sec" class="headerlink" title="DML Throughput (events&#x2F;sec)"></a>DML Throughput (events&#x2F;sec)</h3><table><thead><tr><th>Scenario</th><th align="right">PG16</th><th align="right">PG18 Binary+DirectTLS</th><th align="right">PG18 +COPY</th></tr></thead><tbody><tr><td>Baseline</td><td align="right">148,533</td><td align="right">168,205</td><td align="right"><strong>209,193</strong></td></tr><tr><td>Batch-5000</td><td align="right">132,623</td><td align="right">151,820</td><td align="right"><strong>190,687</strong></td></tr><tr><td>4-Writers</td><td align="right">135,036</td><td align="right">159,233</td><td align="right"><strong>193,354</strong></td></tr><tr><td>Mixed-DML</td><td align="right">42,198</td><td align="right">176,580</td><td align="right"><strong>186,019</strong></td></tr><tr><td>Batch-100</td><td align="right">22,270</td><td align="right">141,597</td><td 
align="right"><strong>199,780</strong></td></tr><tr><td>Wide-20col</td><td align="right">18,917</td><td align="right">172,772</td><td align="right"><strong>173,283</strong></td></tr><tr><td>Payload-2KB</td><td align="right">14,017</td><td align="right">114,884</td><td align="right"><strong>134,323</strong></td></tr></tbody></table><p>Peak throughput: <strong>209,193 events&#x2F;sec</strong> (PG18 + COPY, Baseline scenario).</p><h3 id="Data-Throughput-MB-s"><a href="#Data-Throughput-MB-s" class="headerlink" title="Data Throughput (MB&#x2F;s)"></a>Data Throughput (MB&#x2F;s)</h3><table><thead><tr><th>Scenario</th><th align="right">PG16</th><th align="right">PG18 Binary+DirectTLS</th><th align="right">PG18 +COPY</th></tr></thead><tbody><tr><td>Baseline</td><td align="right">30.4</td><td align="right">31.1</td><td align="right"><strong>38.7</strong></td></tr><tr><td>Wide-20col</td><td align="right">21.3</td><td align="right">50.7</td><td align="right"><strong>51.5</strong></td></tr><tr><td>Payload-2KB</td><td align="right">28.3</td><td align="right">43.9</td><td align="right"><strong>57.1</strong></td></tr><tr><td>Mixed-DML</td><td align="right">7.9</td><td align="right">32.1</td><td align="right"><strong>33.8</strong></td></tr><tr><td>Batch-100</td><td align="right">4.6</td><td align="right">26.2</td><td align="right"><strong>37.0</strong></td></tr></tbody></table><p>Best data throughput: <strong>57.1 MB&#x2F;s</strong> (PG18 + COPY, Payload-2KB scenario).</p><h3 id="Stress-Scaling-16-to-192-Concurrent-Writers"><a href="#Stress-Scaling-16-to-192-Concurrent-Writers" class="headerlink" title="Stress Scaling: 16 to 192 Concurrent Writers"></a>Stress Scaling: 16 to 192 Concurrent Writers</h3><table><thead><tr><th align="right">Writers</th><th align="right">PG16</th><th align="right">PG18 Binary+DirectTLS</th><th align="right">PG18 +COPY</th></tr></thead><tbody><tr><td align="right">16</td><td align="right">125,657</td><td align="right">130,625</td><td 
align="right"><strong>185,044</strong></td></tr><tr><td align="right">32</td><td align="right">111,970</td><td align="right">133,880</td><td align="right"><strong>184,718</strong></td></tr><tr><td align="right">64</td><td align="right">103,937</td><td align="right">125,082</td><td align="right"><strong>182,349</strong></td></tr><tr><td align="right">128</td><td align="right">87,352</td><td align="right">109,594</td><td align="right"><strong>160,293</strong></td></tr><tr><td align="right">192</td><td align="right">71,316</td><td align="right">98,482</td><td align="right"><strong>171,585</strong></td></tr></tbody></table><p>Under high concurrency (16 to 192 writers), PG16 degrades by <strong>43%</strong> while PG18 + COPY only degrades by <strong>~7%</strong>, demonstrating significantly better scalability.</p><h3 id="CPU-Efficiency-events-sec-per-1-CPU"><a href="#CPU-Efficiency-events-sec-per-1-CPU" class="headerlink" title="CPU Efficiency (events&#x2F;sec per 1% CPU)"></a>CPU Efficiency (events&#x2F;sec per 1% CPU)</h3><table><thead><tr><th>Scenario</th><th align="right">PG16</th><th align="right">PG18 Binary+DirectTLS</th><th align="right">PG18 +COPY</th></tr></thead><tbody><tr><td>Baseline</td><td align="right">5,689</td><td align="right">5,637</td><td align="right"><strong>5,920</strong></td></tr><tr><td>Batch-5000</td><td align="right">5,379</td><td align="right"><strong>5,733</strong></td><td align="right">5,440</td></tr><tr><td>Wide-20col</td><td align="right">2,369</td><td align="right">5,059</td><td align="right"><strong>5,517</strong></td></tr><tr><td>Batch-100</td><td align="right">3,966</td><td align="right">5,572</td><td align="right"><strong>5,693</strong></td></tr></tbody></table><p>PG18 variants deliver consistently higher CPU efficiency, averaging <strong>5,200+ events&#x2F;sec per 1% CPU</strong> compared to PG16’s <strong>~4,700</strong>.</p><h3 id="Memory-Usage"><a href="#Memory-Usage" class="headerlink" title="Memory Usage"></a>Memory 
Usage</h3><p>All configurations remain extremely lightweight — between <strong>15–18 MB RSS</strong> regardless of load. Memory stays flat even under 192 concurrent writers, demonstrating that the zero-copy buffer design pays off.</p><h3 id="Key-Takeaways-from-Load-Tests"><a href="#Key-Takeaways-from-Load-Tests" class="headerlink" title="Key Takeaways from Load Tests"></a>Key Takeaways from Load Tests</h3><ol><li><strong>PG18 + COPY + binary mode</strong> is the clear winner, peaking at <strong>209K events&#x2F;sec</strong></li><li><strong>Stress resilience</strong> — PG18 + COPY maintains throughput under heavy concurrency where PG16 degrades sharply</li><li><strong>CPU efficient</strong> — the rustls-tls backend was roughly 2.6x more CPU-efficient than libpq in prior benchmarks (4,252 vs 1,628 events&#x2F;sec per 1% CPU)</li><li><strong>Memory stable</strong> — sub-18 MB footprint under all tested conditions</li><li><strong>Binary mode + direct TLS</strong> provide significant improvements even without COPY optimization</li></ol>
<h2 id="Performance-Design-Decisions"><a href="#Performance-Design-Decisions" class="headerlink" title="Performance Design Decisions"></a>Performance Design Decisions</h2><p>Several design choices contribute to pg-walstream’s performance:</p><ul><li><strong>SmallVec</strong> for tuple data — up to 16 columns stored inline on the stack, avoiding heap allocation for common cases</li><li><strong>Custom OidHasher</strong> — eliminates SipHash overhead for 32-bit OID integer keys</li><li><strong>Arc&lt;str&gt;</strong> for column&#x2F;namespace names — shared immutable strings across events</li><li><strong>CachePadded atomics</strong> for LSN feedback — avoids false sharing in concurrent scenarios</li><li><strong>Feedback throttling</strong> — time checks only every 128 events via bitmask (<code>count &amp; 0x7F == 0</code>)</li></ul><h2 id="TCP-Tuning-for-Production"><a href="#TCP-Tuning-for-Production" class="headerlink" title="TCP Tuning for Production"></a>TCP Tuning for Production</h2><p>For high-throughput deployments, the following Linux kernel parameters are recommended:</p>
<figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line">net.core.rmem_max = 67108864</span><br><span class="line">net.core.wmem_max = 67108864</span><br><span class="line">net.ipv4.tcp_rmem = 4096 262144 67108864</span><br><span class="line">net.ipv4.tcp_wmem = 4096 262144 67108864</span><br><span class="line">net.ipv4.tcp_congestion_control = bbr</span><br><span class="line">net.core.netdev_max_backlog = 5000</span><br></pre></td></tr></table></figure>
<h2 id="Conclusion"><a href="#Conclusion" class="headerlink" title="Conclusion"></a>Conclusion</h2><p>pg-walstream fills a gap in the Rust ecosystem by providing a production-grade PostgreSQL WAL streaming library. With protocol v1–v4 support, dual connection backends, zero-copy parsing, and throughput exceeding 200K events&#x2F;sec, it offers a solid foundation for building CDC pipelines, event-driven systems, and real-time data synchronization.</p><p>The load testing results demonstrate that pairing pg-walstream with PostgreSQL 18’s binary mode and COPY optimization delivers exceptional performance and scalability — maintaining high throughput even under 192 concurrent writers while keeping memory usage under 18 MB.</p>
<h2 id="References"><a href="#References" class="headerlink" title="References"></a>References</h2><ul><li><a href="https://github.com/isdaniel/pg-walstream">pg-walstream GitHub Repository</a></li><li><a href="https://crates.io/crates/pg_walstream">pg-walstream on crates.io</a></li><li><a href="https://docs.rs/pg-walstream">API Documentation on docs.rs</a></li><li><a href="https://github.com/isdaniel/pg-walstream/blob/main/LOAD_TEST_COMPARISON.md">Load Test Comparison Report</a></li><li><a href="https://www.postgresql.org/docs/current/logical-replication.html">PostgreSQL Logical Replication Documentation</a></li></ul>
<p><strong>Author</strong>: Daniel Shih (石頭)<br /><strong>Article link</strong>: <a href="https://isdaniel.github.io/pg-walstream-rust-postgresql-wal-streaming/">https://isdaniel.github.io/pg-walstream-rust-postgresql-wal-streaming/</a> <br /><strong>License</strong>: Unless otherwise noted, all articles on this blog are licensed under <a href="https://creativecommons.org/licenses/by-nc-sa/3.0/tw/">CC BY-NC-SA 3.0 TW</a>. Please credit the source when republishing!</p>]]>
    </content>
    <id>https://isdaniel.github.io/pg-walstream-rust-postgresql-wal-streaming/</id>
    <link href="https://isdaniel.github.io/pg-walstream-rust-postgresql-wal-streaming/"/>
    <published>2026-04-21T22:00:00.000Z</published>
    <summary>A high-performance Rust library for parsing and streaming PostgreSQL WAL messages via logical replication — with load testing results showing 200K+ events/sec.</summary>
    <title>pg-walstream: High-Performance PostgreSQL WAL Streaming in Rust</title>
    <updated>2026-04-22T03:00:22.029Z</updated>
  </entry>
  <entry>
    <author>
      <name>Daniel Shih</name>
    </author>
    <category term="Rust" scheme="https://isdaniel.github.io/categories/Rust/"/>
    <category term="PostgreSQL" scheme="https://isdaniel.github.io/categories/Rust/PostgreSQL/"/>
    <category term="logical-replication" scheme="https://isdaniel.github.io/categories/Rust/PostgreSQL/logical-replication/"/>
    <category term="Rust" scheme="https://isdaniel.github.io/tags/Rust/"/>
    <category term="PostgreSQL" scheme="https://isdaniel.github.io/tags/PostgreSQL/"/>
    <category term="logical-replication" scheme="https://isdaniel.github.io/tags/logical-replication/"/>
    <category term="CDC" scheme="https://isdaniel.github.io/tags/CDC/"/>
    <content>
      <![CDATA[<h2 id="What-is-pg2any"><a href="#What-is-pg2any" class="headerlink" title="What is pg2any?"></a>What is pg2any?</h2><p><a href="https://github.com/isdaniel/pg2any">pg2any</a> is a Rust library (published as <code>pg2any_lib</code> on <a href="https://crates.io/crates/pg2any_lib">crates.io</a>) for building production-ready Change Data Capture (CDC) pipelines. It reads PostgreSQL’s Write-Ahead Log (WAL) through logical replication and replays changes — inserts, updates, deletes, and truncates — to destination databases in real time.</p><p>Supported destinations:</p><ul><li><strong>MySQL</strong> (via SQLx)</li><li><strong>SQL Server</strong> (via Tiberius)</li><li><strong>SQLite</strong> (via SQLx)</li></ul><p>Each destination is behind a Cargo feature flag (<code>mysql</code>, <code>sqlserver</code>, <code>sqlite</code>), so you compile only the drivers you need.</p><p>A ready-to-run example application lives at <a href="https://github.com/isdaniel/rust_playground/tree/main/pg2any-example">pg2any-example</a>, and the underlying PostgreSQL streaming replication protocol is handled by the companion crate <a href="https://github.com/isdaniel/pg-walstream">pg_walstream</a>.</p><h2 id="Architecture"><a href="#Architecture" class="headerlink" title="Architecture"></a>Architecture</h2><p>pg2any follows a <strong>producer–consumer</strong> pattern with <strong>file-based transaction persistence</strong> as the intermediary. 
This design gives crash safety: if the process dies mid-stream, committed-but-unexecuted transactions survive on disk and are replayed on restart.</p><figure class="highlight nix"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br></pre></td><td class="code"><pre><span class="line">┌─────────────────┐        ┌──────────────────────────────────────┐       ┌──────────────────┐</span><br><span class="line">│   PostgreSQL    │        │          pg2any CDC Engine            │       │   Destination    │</span><br><span class="line">│                 │  WAL   │                                      │  SQL  │  MySQL <span class="symbol">/</span> MSSQL   │</span><br><span class="line">│  Logical Repli- │───────▶│  ┌──────────┐     ┌───────────────┐ │──────▶│  <span class="symbol">/</span> SQLite        │</span><br><span class="line">│  cation Stream  │        │  │ Producer  │────▶│ File Storage  │ │       │                  │</span><br><span class="line">│                 │        │  └──────────┘     └───────┬───────┘ │       └──────────────────┘</span><br><span class="line">└─────────────────┘        │                          │         │</span><br><span class="line">                           │                   ┌──────▼───────┐ │       ┌──────────────────┐</span><br><span class="line">                           │                   │   Consumer   │ │       │   Prometheus     │</span><br><span class="line">                           │                   │ (Priority Q) │ │       │   <span class="symbol">/metrics</span>       │</span><br><span class="line">                           │                   └──────────────┘ │     
  │   <span class="symbol">/health</span>        │</span><br><span class="line">                           └──────────────────────────────────────┘       └──────────────────┘</span><br></pre></td></tr></table></figure><h3 id="Three-Directory-Transaction-Lifecycle"><a href="#Three-Directory-Transaction-Lifecycle" class="headerlink" title="Three-Directory Transaction Lifecycle"></a>Three-Directory Transaction Lifecycle</h3><p>Transactions flow through three directories during their lifetime:</p><table><thead><tr><th>Directory</th><th>Purpose</th></tr></thead><tbody><tr><td><code>sql_data_tx/</code></td><td>Stores actual SQL content. Files are append-only and rotate at 64 MB segments for large transactions.</td></tr><tr><td><code>sql_received_tx/</code></td><td>Metadata for <strong>in-progress</strong> transactions (created at <code>BEGIN</code>).</td></tr><tr><td><code>sql_pending_tx/</code></td><td>Metadata for <strong>committed</strong> transactions ready for the consumer (atomically moved from <code>sql_received_tx/</code> on <code>COMMIT</code>).</td></tr></tbody></table><p>This three-phase approach means that only fully committed transactions ever reach the consumer, and incomplete transactions are cleaned up on restart.</p><h3 id="Producer"><a href="#Producer" class="headerlink" title="Producer"></a>Producer</h3><p>The producer reads the logical replication stream event by event:</p><ol><li>On <strong>BEGIN</strong> — creates a metadata file in <code>sql_received_tx/</code> and a data file in <code>sql_data_tx/</code>.</li><li>On <strong>INSERT &#x2F; UPDATE &#x2F; DELETE &#x2F; TRUNCATE</strong> — converts each event to destination-dialect SQL and appends it to the data file via a <code>BufferedEventWriter</code>.</li><li>On <strong>COMMIT</strong> — atomically moves metadata from <code>sql_received_tx/</code> to <code>sql_pending_tx/</code>, making the transaction visible to the consumer.</li></ol><p>For protocol version 2+, the producer also handles 
<strong>streaming transactions</strong> (<code>StreamStart</code> &#x2F; <code>StreamStop</code> &#x2F; <code>StreamCommit</code>), which allow PostgreSQL to send chunks of large in-progress transactions before the final commit.</p><h3 id="Consumer"><a href="#Consumer" class="headerlink" title="Consumer"></a>Consumer</h3><p>The consumer maintains a <strong>priority queue ordered by commit LSN</strong> to guarantee correct replay order:</p><ol><li>Reads pending transaction metadata from <code>sql_pending_tx/</code>.</li><li>Parses SQL from <code>sql_data_tx/</code> using a streaming SQL parser (constant memory regardless of transaction size).</li><li>Executes statements atomically in a destination-side database transaction.</li><li>Invokes a <strong>PreCommitHook</strong> — a callback that runs inside the destination transaction before <code>COMMIT</code>, used to atomically persist the LSN checkpoint alongside the data. This eliminates the window where data is committed but the checkpoint is not (or vice versa).</li><li>Commits, then deletes processed files.</li></ol><h3 id="Crash-Recovery"><a href="#Crash-Recovery" class="headerlink" title="Crash Recovery"></a>Crash Recovery</h3><p>On startup, pg2any scans:</p><ul><li><code>sql_received_tx/</code> for incomplete transactions → <strong>aborts</strong> them.</li><li><code>sql_pending_tx/</code> for committed-but-unexecuted transactions → <strong>replays</strong> them.</li></ul><p>The <code>LsnTracker</code> persists the last successfully applied LSN, so replication resumes exactly where it left off.</p><h2 id="Key-Features"><a href="#Key-Features" class="headerlink" title="Key Features"></a>Key Features</h2><h3 id="DML-Coalescing"><a href="#DML-Coalescing" class="headerlink" title="DML Coalescing"></a>DML Coalescing</h3><p>One of pg2any’s most impactful optimizations. 
Instead of executing individual DML statements one by one, the coalescing engine merges consecutive same-table operations:</p><ul><li>Multiple <code>INSERT</code>s → multi-value <code>INSERT INTO ... VALUES (...), (...), (...)</code>.</li><li>Multiple <code>UPDATE</code>s → <code>CASE</code>-<code>WHEN</code> batch updates.</li><li>Multiple <code>DELETE</code>s → combined <code>WHERE</code> clauses with <code>OR</code>.</li></ul><p>Coalescing is applied across all three destination types with dialect-aware identifier quoting (backticks for MySQL, brackets for SQL Server, double quotes for SQLite), and it respects MySQL’s <code>max_allowed_packet</code> limit with an 80% safety margin.</p><h3 id="Compressed-Storage"><a href="#Compressed-Storage" class="headerlink" title="Compressed Storage"></a>Compressed Storage</h3><p>When enabled via <code>PG2ANY_ENABLE_COMPRESSION=true</code>, transaction files are stored as <code>.sql.gz</code> with accompanying <code>.sql.gz.idx</code> index files. Sync points are created every 1,000 statements, enabling O(1) seeking to arbitrary positions without decompressing the entire file — critical for efficient crash recovery of large transactions.</p><h3 id="Monitoring"><a href="#Monitoring" class="headerlink" title="Monitoring"></a>Monitoring</h3><p>With the <code>metrics</code> feature enabled, pg2any exposes a Prometheus-compatible HTTP server (default port 8080):</p><ul><li><code>GET /metrics</code> — Prometheus text format with event counters, LSN progress, processing rates, error counts, and transaction statistics.</li><li><code>GET /health</code> — JSON health status.</li></ul><p>Metrics use <code>AtomicU64</code> counters (lock-free) to minimize overhead on the hot path. 
When compiled without the <code>metrics</code> feature, all metric calls become zero-cost no-ops.</p><h3 id="Protocol-Version-Support"><a href="#Protocol-Version-Support" class="headerlink" title="Protocol Version Support"></a>Protocol Version Support</h3><table><thead><tr><th>Version</th><th>Capabilities</th></tr></thead><tbody><tr><td>v1</td><td>Basic logical replication (BEGIN, INSERT, UPDATE, DELETE, TRUNCATE, COMMIT)</td></tr><tr><td>v2</td><td>Adds streaming transactions for large in-progress transactions</td></tr><tr><td>v3</td><td>Adds two-phase commit support</td></tr><tr><td>v4</td><td>Adds parallel apply of streamed transactions (PostgreSQL 16+)</td></tr></tbody></table><h2 id="Quick-Start"><a href="#Quick-Start" class="headerlink" title="Quick Start"></a>Quick Start</h2><h3 id="Prerequisites"><a href="#Prerequisites" class="headerlink" title="Prerequisites"></a>Prerequisites</h3><ul><li>PostgreSQL 10+ with <code>wal_level = logical</code></li><li>A destination database (MySQL 8.0+, SQL Server, or SQLite)</li></ul><h3 id="PostgreSQL-Setup"><a href="#PostgreSQL-Setup" class="headerlink" title="PostgreSQL Setup"></a>PostgreSQL Setup</h3><figure class="highlight sql"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">-- Verify logical replication is enabled</span></span><br><span class="line"><span class="keyword">SHOW</span> wal_level;           <span class="comment">-- must be &#x27;logical&#x27;</span></span><br><span class="line"><span class="keyword">SHOW</span> max_replication_slots;</span><br><span class="line"><span class="keyword">SHOW</span> max_wal_senders;</span><br><span class="line"></span><br><span class="line"><span class="comment">-- 
Create a publication for the tables you want to replicate</span></span><br><span class="line"><span class="keyword">CREATE</span> PUBLICATION cdc_pub <span class="keyword">FOR</span> <span class="keyword">ALL</span> TABLES;</span><br><span class="line"><span class="comment">-- Or for specific tables:</span></span><br><span class="line"><span class="comment">-- CREATE PUBLICATION cdc_pub FOR TABLE orders, customers;</span></span><br></pre></td></tr></table></figure><h3 id="Using-pg2any-as-a-Library"><a href="#Using-pg2any-as-a-Library" class="headerlink" title="Using pg2any as a Library"></a>Using pg2any as a Library</h3><p>Add <code>pg2any_lib</code> to your <code>Cargo.toml</code> with the destination features you need:</p><figure class="highlight toml"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line"><span class="section">[dependencies]</span></span><br><span class="line"><span class="attr">pg2any_lib</span> = &#123; version = <span class="string">&quot;0.9&quot;</span>, features = [<span class="string">&quot;mysql&quot;</span>, <span class="string">&quot;metrics&quot;</span>] &#125;</span><br><span class="line"><span class="attr">tokio</span> = &#123; version = <span class="string">&quot;1&quot;</span>, features = [<span class="string">&quot;full&quot;</span>] &#125;</span><br></pre></td></tr></table></figure><h3 id="Configuration-via-Environment-Variables"><a href="#Configuration-via-Environment-Variables" class="headerlink" title="Configuration via Environment Variables"></a>Configuration via Environment Variables</h3><table><thead><tr><th>Variable</th><th>Description</th><th>Default</th></tr></thead><tbody><tr><td><code>CDC_SOURCE_CONNECTION_STRING</code></td><td>PostgreSQL URI with <code>?replication=database</code></td><td>Required</td></tr><tr><td><code>CDC_DEST_TYPE</code></td><td><code>MySQL</code>, <code>SqlServer</code>, or 
<code>SQLite</code></td><td>Required</td></tr><tr><td><code>CDC_DEST_URI</code></td><td>Destination connection string</td><td>Required</td></tr><tr><td><code>CDC_REPLICATION_SLOT</code></td><td>Replication slot name</td><td>Required</td></tr><tr><td><code>CDC_PUBLICATION</code></td><td>Publication name</td><td>Required</td></tr><tr><td><code>CDC_SCHEMA_MAPPING</code></td><td>Comma-separated <code>source:dest</code> pairs (e.g., <code>public:cdc_db</code>)</td><td>None</td></tr><tr><td><code>CDC_PROTOCOL_VERSION</code></td><td>Protocol version (1–4)</td><td><code>1</code></td></tr><tr><td><code>CDC_STREAMING_MODE</code></td><td>Enable streaming transactions (requires v2+)</td><td><code>false</code></td></tr><tr><td><code>CDC_BINARY_MODE</code></td><td>Binary format for protocol</td><td><code>false</code></td></tr><tr><td><code>CDC_CONNECTION_TIMEOUT</code></td><td>Connection timeout (seconds)</td><td><code>30</code></td></tr><tr><td><code>CDC_QUERY_TIMEOUT</code></td><td>Query timeout (seconds)</td><td><code>60</code></td></tr><tr><td><code>CDC_BUFFER_SIZE</code></td><td>Transaction channel queue capacity</td><td><code>1000</code></td></tr><tr><td><code>CDC_BATCH_SIZE</code></td><td>Batch size for destination inserts</td><td><code>1000</code></td></tr><tr><td><code>PG2ANY_ENABLE_COMPRESSION</code></td><td>Enable gzip compression for SQL files</td><td><code>false</code></td></tr><tr><td><code>PG2ANY_METRICS_PORT</code></td><td>Prometheus HTTP port</td><td><code>8080</code></td></tr></tbody></table><h3 id="Run-with-Docker-Compose"><a href="#Run-with-Docker-Compose" class="headerlink" title="Run with Docker Compose"></a>Run with Docker Compose</h3><p>The <a href="https://github.com/isdaniel/rust_playground/tree/main/pg2any-example">example project</a> includes a full Docker Compose stack with PostgreSQL, MySQL, the CDC application, and Prometheus:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span 
class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line">git <span class="built_in">clone</span> https://github.com/isdaniel/rust_playground.git</span><br><span class="line"><span class="built_in">cd</span> rust_playground/pg2any-example</span><br><span class="line"></span><br><span class="line"><span class="comment"># Start all services</span></span><br><span class="line">docker-compose up -d</span><br><span class="line"></span><br><span class="line"><span class="comment"># Watch CDC logs</span></span><br><span class="line">docker-compose logs -f cdc_app</span><br></pre></td></tr></table></figure><h2 id="Design-Decisions-Worth-Noting"><a href="#Design-Decisions-Worth-Noting" class="headerlink" title="Design Decisions Worth Noting"></a>Design Decisions Worth Noting</h2><p><strong>File-based persistence over in-memory queues</strong> — Using the filesystem as the intermediary between producer and consumer trades some latency for crash safety. If the process is killed, no committed transaction data is lost.</p><p><strong>PreCommitHook for atomic checkpoints</strong> — Executing the LSN checkpoint update inside the same destination transaction as the data changes eliminates an entire class of consistency bugs where the checkpoint and data can diverge.</p><p><strong>Feature-gated compilation</strong> — Database drivers and monitoring are behind Cargo features, so the binary only includes what you actually use. 
This reduces compile time, binary size, and attack surface.</p><p><strong>Transaction segmentation at 64 MB</strong> — Large transactions (e.g., bulk imports) are split across multiple files to prevent unbounded memory and disk usage.</p><h2 id="Testing"><a href="#Testing" class="headerlink" title="Testing"></a>Testing</h2><p>pg2any has 104+ tests across 16 test files, covering:</p><ul><li>Integration tests for all three destination types</li><li>Streaming transaction correctness</li><li>Compression and large file handling</li><li>WHERE clause generation for UPDATE&#x2F;DELETE with various replica identity configurations</li><li>Position tracking for crash recovery</li><li>Metrics logic</li></ul><p>Beyond unit and integration tests, the project runs <strong>chaos testing</strong> in CI — randomly restarting the CDC application during pgbench workloads to validate graceful shutdown and recovery under real conditions.</p><h2 id="Links"><a href="#Links" class="headerlink" title="Links"></a>Links</h2><ul><li><a href="https://github.com/isdaniel/pg2any">pg2any source code (GitHub)</a></li><li><a href="https://crates.io/crates/pg2any_lib">pg2any_lib on crates.io</a></li><li><a href="https://github.com/isdaniel/rust_playground/tree/main/pg2any-example">Example application</a></li><li><a href="https://github.com/isdaniel/pg-walstream">pg_walstream — PostgreSQL replication protocol crate</a></li></ul><p><strong>Author</strong>: Daniel Shih (石頭)<br /><strong>Article link</strong>: <a href="https://isdaniel.github.io/pg2any-rust-introduce/">https://isdaniel.github.io/pg2any-rust-introduce/</a> <br /><strong>License</strong>: Unless otherwise noted, all articles on this blog are licensed under <a href="https://creativecommons.org/licenses/by-nc-sa/3.0/tw/">CC BY-NC-SA 3.0 TW</a>. Please credit the source when republishing!</p>]]>
    </content>
    <id>https://isdaniel.github.io/pg2any-rust-introduce/</id>
    <link href="https://isdaniel.github.io/pg2any-rust-introduce/"/>
    <published>2025-09-09T21:10:43.000Z</published>
    <summary>pg2any is a production-ready Rust library that captures real-time data changes from PostgreSQL via logical replication and streams them to MySQL, SQL Server, or SQLite with crash-safe, file-based transaction persistence.</summary>
    <title>pg2any: A Rust CDC Library for Streaming PostgreSQL Changes to Any Database</title>
    <updated>2026-04-22T03:00:22.030Z</updated>
  </entry>
  <entry>
    <author>
      <name>Daniel Shih</name>
    </author>
    <category term="rust" scheme="https://isdaniel.github.io/categories/rust/"/>
    <category term="opensource" scheme="https://isdaniel.github.io/categories/rust/opensource/"/>
    <category term="redis" scheme="https://isdaniel.github.io/categories/rust/opensource/redis/"/>
    <category term="fdw" scheme="https://isdaniel.github.io/categories/rust/opensource/redis/fdw/"/>
    <category term="rust" scheme="https://isdaniel.github.io/tags/rust/"/>
    <category term="opensource" scheme="https://isdaniel.github.io/tags/opensource/"/>
    <category term="redis" scheme="https://isdaniel.github.io/tags/redis/"/>
    <content>
      <![CDATA[<p>Hi everyone! Today I want to introduce an open-source project I have been working on recently: <a href="https://github.com/isdaniel/redis_fdw_rs"><code>redis_fdw_rs</code></a>, a <strong>Redis Foreign Data Wrapper (FDW)</strong> implemented in Rust with the <a href="https://github.com/pgcentralfoundation/pgrx">pgrx</a> framework. It lets you query Redis data directly from PostgreSQL, as if you were working with an ordinary table.</p><h2 id="為什麼需要-Redis-FDW？"><a href="#為什麼需要-Redis-FDW？" class="headerlink" title="為什麼需要 Redis FDW？"></a>Why a Redis FDW?</h2><p>Redis is a high-performance cache store, commonly used for sessions, leaderboards, event streams, and similar data. But when you want to access Redis data from PostgreSQL, you normally need extra application code or ETL tooling, which is cumbersome.</p><p><code>redis_fdw_rs</code> was built to solve exactly this pain point: <strong>through the FDW interface, PostgreSQL can query Redis with plain SQL!</strong></p><hr><h2 id="🚀-專案特色與支援功能"><a href="#🚀-專案特色與支援功能" class="headerlink" title="🚀 專案特色與支援功能"></a>🚀 Project Features and Supported Functionality</h2><p>The FDW currently supports the following features and is suitable for real-world deployment:</p><ul><li>✅ <strong>Redis Cluster support</strong></li><li>✅ <strong>WHERE clause pushdown</strong>: reduces data movement and improves query efficiency</li><li>✅ <strong>Connection pooling</strong>: avoids the overhead of repeatedly reconnecting to Redis</li><li>✅ <strong>Streaming support for large data sets</strong>: handles batched queries, pagination, and similar scenarios</li><li>✅ <strong>PostgreSQL 14~17 support</strong></li><li>✅ <strong>Unit &amp; integration tests</strong>: the project has test coverage to ensure stability</li></ul><hr>
<h2 id="使用範例（超簡單）"><a href="#使用範例（超簡單）" class="headerlink" title="使用範例（超簡單）"></a>Usage Example (Super Simple)</h2><p>A few lines of SQL are all it takes to connect to Redis and start querying:</p><figure class="highlight sql"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">-- Create the Redis server connection</span></span><br><span class="line"><span class="keyword">CREATE</span> SERVER redis_server</span><br><span class="line"><span class="keyword">FOREIGN</span> DATA WRAPPER redis_wrapper</span><br><span class="line">OPTIONS (host_port <span class="string">&#x27;127.0.0.1:6379&#x27;</span>);</span><br><span class="line"></span><br><span class="line"><span class="comment">-- Declare a foreign table backed by a Redis hash</span></span><br><span class="line"><span class="keyword">CREATE</span> <span class="keyword">FOREIGN</span> <span class="keyword">TABLE</span> user_profiles (</span><br><span class="line">  field text,</span><br><span class="line">  <span class="keyword">value</span> text</span><br><span class="line">)</span><br><span class="line">SERVER redis_server</span><br><span class="line">OPTIONS (table_type <span class="string">&#x27;hash&#x27;</span>, table_key_prefix <span class="string">&#x27;user:profiles&#x27;</span>);</span><br><span class="line"></span><br><span class="line"><span class="comment">-- Start working with Redis through SQL!</span></span><br><span class="line"><span class="keyword">INSERT INTO</span> user_profiles <span class="keyword">VALUES</span> (<span class="string">&#x27;name&#x27;</span>, <span class="string">&#x27;John&#x27;</span>);</span><br><span class="line"><span class="keyword">SELECT</span> <span class="operator">*</span> <span class="keyword">FROM</span> user_profiles <span class="keyword">WHERE</span> field <span class="operator">=</span> <span class="string">&#x27;email&#x27;</span>;</span><br></pre></td></tr></table></figure>
<h2 id="Redis-Cluster-模式支援"><a href="#Redis-Cluster-模式支援" class="headerlink" title="Redis Cluster 模式支援"></a>Redis Cluster Mode Support</h2><p><code>redis_fdw_rs</code> also fully supports the Redis Cluster architecture. Just specify the <code>host_port</code> of multiple nodes to get the following benefits:</p><h3 id="Cluster-優勢"><a href="#Cluster-優勢" class="headerlink" title="Cluster 優勢"></a>Cluster Advantages</h3><ul><li><strong>Automatic failover</strong>: traffic moves to healthy nodes when a node fails</li><li><strong>自動 
sharding</strong>：資料分散在多節點，自動分片</li><li><strong>節點自動探索</strong>：只需指定一個節點，驅動程式會自動發現整個叢集</li><li><strong>高可用性</strong>：節點損壞仍可正常讀寫</li></ul><h3 id="範例設定："><a href="#範例設定：" class="headerlink" title="範例設定："></a>範例設定：</h3><figure class="highlight sql"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">-- 建立 cluster foreign server</span></span><br><span class="line"><span class="keyword">CREATE</span> SERVER redis_cluster_server</span><br><span class="line"><span class="keyword">FOREIGN</span> DATA WRAPPER redis_wrapper</span><br><span class="line">OPTIONS (</span><br><span class="line">    host_port <span class="string">&#x27;127.0.0.1:7000,127.0.0.1:7001,127.0.0.1:7002&#x27;</span>,</span><br><span class="line">    password <span class="string">&#x27;your_redis_password&#x27;</span>  <span class="comment">-- 可選</span></span><br><span class="line">);</span><br><span class="line"></span><br><span class="line"><span class="comment">-- 建立 cluster 對應的外部表格</span></span><br><span class="line"><span class="keyword">CREATE</span> <span class="keyword">FOREIGN</span> <span class="keyword">TABLE</span> user_sessions (</span><br><span class="line">    field TEXT,</span><br><span class="line">    <span 
class="keyword">value</span> TEXT</span><br><span class="line">)</span><br><span class="line">SERVER redis_cluster_server</span><br><span class="line">OPTIONS (</span><br><span class="line">    database <span class="string">&#x27;0&#x27;</span>,</span><br><span class="line">    table_type <span class="string">&#x27;hash&#x27;</span>,</span><br><span class="line">    table_key_prefix <span class="string">&#x27;session:active&#x27;</span></span><br><span class="line">);</span><br><span class="line"></span><br><span class="line"><span class="comment">-- 與單節點操作無異</span></span><br><span class="line"><span class="keyword">INSERT INTO</span> user_sessions <span class="keyword">VALUES</span> (<span class="string">&#x27;user123&#x27;</span>, <span class="string">&#x27;session_token_abc&#x27;</span>);</span><br><span class="line"><span class="keyword">SELECT</span> <span class="operator">*</span> <span class="keyword">FROM</span> user_sessions <span class="keyword">WHERE</span> field <span class="operator">=</span> <span class="string">&#x27;user123&#x27;</span>;</span><br></pre></td></tr></table></figure><h3 id="範例結果"><a href="#範例結果" class="headerlink" title="範例結果"></a>範例結果</h3><figure class="highlight sql"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span 
class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br></pre></td><td class="code"><pre><span class="line">redis_fdw_rs<span class="operator">=</span># <span class="keyword">INSERT INTO</span> user_profiles (key, <span class="keyword">value</span>)</span><br><span class="line"><span class="keyword">SELECT</span> i, <span class="string">&#x27;value_&#x27;</span> <span class="operator">||</span> i</span><br><span class="line"><span class="keyword">FROM</span> generate_series(<span class="number">1</span>,<span class="number">100000</span>) i;</span><br><span class="line"><span class="keyword">INSERT</span> <span class="number">0</span> <span class="number">100000</span></span><br><span class="line"><span class="type">Time</span>: <span class="number">12911.183</span> ms (<span class="number">00</span>:<span class="number">12.911</span>)</span><br><span class="line">redis_fdw_rs<span class="operator">=</span># <span 
class="keyword">SELECT</span> <span class="operator">*</span> <span class="keyword">FROM</span> user_profiles <span class="keyword">where</span> key <span class="operator">=</span> <span class="string">&#x27;5&#x27;</span>;</span><br><span class="line"> key <span class="operator">|</span>  <span class="keyword">value</span></span><br><span class="line"><span class="comment">-----+---------</span></span><br><span class="line"> <span class="number">5</span>   <span class="operator">|</span> value_5</span><br><span class="line">(<span class="number">1</span> <span class="type">row</span>)</span><br><span class="line"></span><br><span class="line"><span class="type">Time</span>: <span class="number">15.380</span> ms</span><br><span class="line">redis_fdw_rs<span class="operator">=</span># <span class="keyword">SELECT</span> <span class="operator">*</span> <span class="keyword">FROM</span> user_profiles <span class="keyword">where</span> key <span class="keyword">in</span> (<span class="string">&#x27;10&#x27;</span>, <span class="string">&#x27;15&#x27;</span>, <span class="string">&#x27;20&#x27;</span>);</span><br><span class="line"> key <span class="operator">|</span>  <span class="keyword">value</span></span><br><span class="line"><span class="comment">-----+----------</span></span><br><span class="line"> <span class="number">10</span>  <span class="operator">|</span> value_10</span><br><span class="line"> <span class="number">15</span>  <span class="operator">|</span> value_15</span><br><span class="line"> <span class="number">20</span>  <span class="operator">|</span> value_20</span><br><span class="line">(<span class="number">3</span> <span class="keyword">rows</span>)</span><br><span class="line"></span><br><span class="line">redis_fdw_rs<span class="operator">=</span>#  <span class="keyword">SELECT</span> <span class="operator">*</span> <span class="keyword">FROM</span> user_profiles <span class="keyword">where</span> key <span class="keyword">like</span> <span 
class="string">&#x27;555%&#x27;</span>;</span><br><span class="line">  key  <span class="operator">|</span>    <span class="keyword">value</span></span><br><span class="line"><span class="comment">-------+-------------</span></span><br><span class="line"> <span class="number">55556</span> <span class="operator">|</span> value_55556</span><br><span class="line"> <span class="number">55581</span> <span class="operator">|</span> value_55581</span><br><span class="line"> <span class="number">55569</span> <span class="operator">|</span> value_55569</span><br><span class="line"> <span class="number">55561</span> <span class="operator">|</span> value_55561</span><br><span class="line"> <span class="number">55516</span> <span class="operator">|</span> value_55516</span><br><span class="line"> <span class="number">55538</span> <span class="operator">|</span> value_55538</span><br><span class="line"> <span class="number">55549</span> <span class="operator">|</span> value_55549</span><br><span class="line"> <span class="number">55539</span> <span class="operator">|</span> value_55539</span><br><span class="line"> <span class="number">55531</span> <span class="operator">|</span> value_55531</span><br><span class="line"> <span class="number">55545</span> <span class="operator">|</span> value_55545</span><br><span class="line"> <span class="number">55590</span> <span class="operator">|</span> value_55590</span><br><span class="line"> <span class="number">55512</span> <span class="operator">|</span> value_55512</span><br><span class="line"> <span class="number">55523</span> <span class="operator">|</span> value_55523</span><br><span class="line"> <span class="number">55534</span> <span class="operator">|</span> value_55534</span><br><span class="line"> <span class="number">55518</span> <span class="operator">|</span> value_55518</span><br><span class="line"> <span class="number">55560</span> <span class="operator">|</span> value_55560</span><br><span class="line"> <span 
class="number">55564</span> <span class="operator">|</span> value_55564</span><br><span class="line"> <span class="number">55592</span> <span class="operator">|</span> value_55592</span><br><span class="line"> <span class="number">55572</span> <span class="operator">|</span> value_55572</span><br><span class="line"> <span class="number">55519</span> <span class="operator">|</span> value_55519</span><br><span class="line"> <span class="number">55526</span> <span class="operator">|</span> value_55526</span><br><span class="line"> <span class="number">5559</span>  <span class="operator">|</span> value_5559</span><br><span class="line"> <span class="number">55530</span> <span class="operator">|</span> value_55530</span><br><span class="line"> <span class="number">55511</span> <span class="operator">|</span> value_55511</span><br><span class="line"> <span class="number">55562</span> <span class="operator">|</span> value_55562</span><br><span class="line"> <span class="number">55542</span> <span class="operator">|</span> value_55542</span><br><span class="line"> <span class="number">55582</span> <span class="operator">|</span> value_55582</span><br><span class="line"> <span class="number">55580</span> <span class="operator">|</span> value_55580</span><br><span class="line"> <span class="number">55501</span> <span class="operator">|</span> value_55501</span><br><span class="line"> <span class="number">55540</span> <span class="operator">|</span> value_55540</span><br><span class="line"> <span class="number">55554</span> <span class="operator">|</span> value_55554</span><br><span class="line"> <span class="number">55546</span> <span class="operator">|</span> value_55546</span><br><span class="line"> <span class="number">55513</span> <span class="operator">|</span> value_55513</span><br><span class="line"> <span class="number">55548</span> <span class="operator">|</span> value_55548</span><br><span class="line"><span 
class="comment">--More--</span></span><br></pre></td></tr></table></figure><hr><h2 id="支援的-Redis-資料型態"><a href="#支援的-Redis-資料型態" class="headerlink" title="支援的 Redis 資料型態"></a>支援的 Redis 資料型態</h2><p>目前支援以下 Redis 資料型別，並標示已實作的操作：</p><table><thead><tr><th>Redis Type</th><th>SELECT</th><th>INSERT</th><th>UPDATE</th><th>DELETE</th><th>Status</th></tr></thead><tbody><tr><td>Hash</td><td>✅</td><td>✅</td><td>❌</td><td>✅</td><td><strong>Partial</strong> (UPDATE not supported)</td></tr><tr><td>List</td><td>✅</td><td>✅</td><td>❌</td><td>✅</td><td><strong>Partial</strong> (UPDATE not supported)</td></tr><tr><td>Set</td><td>✅</td><td>✅</td><td>❌</td><td>✅</td><td><strong>Partial</strong> (UPDATE not supported)</td></tr><tr><td>ZSet</td><td>✅</td><td>✅</td><td>❌</td><td>✅</td><td><strong>Partial</strong> (UPDATE not supported)</td></tr><tr><td>String</td><td>✅</td><td>✅</td><td>❌</td><td>✅</td><td><strong>Partial</strong> (UPDATE not supported)</td></tr><tr><td>Stream</td><td>✅</td><td>✅</td><td>❌</td><td>✅</td><td><strong>Full</strong> (Large data set support with pagination)</td></tr></tbody></table><p>可透過 <code>table_type</code> 和 <code>table_key_prefix</code> 指定資料類型與 key 前綴，也支援選擇 Redis 的 <code>database</code>（預設為 0）。</p><hr><h2 id="專案資源"><a href="#專案資源" class="headerlink" title="專案資源"></a>專案資源</h2><ul><li>GitHub 倉庫：<a href="https://github.com/isdaniel/redis_fdw_rs">https://github.com/isdaniel/redis_fdw_rs</a></li><li>文件與安裝說明完整、範例齊全</li><li>歡迎 Star、開 issue、PR 一起改進專案！</li></ul><hr><h2 id="結語：一起打造更強的資料存取能力！"><a href="#結語：一起打造更強的資料存取能力！" class="headerlink" title="結語：一起打造更強的資料存取能力！"></a>結語：一起打造更強的資料存取能力！</h2><p>這個專案仍持續演進中，如果你有 Redis &#x2F; PostgreSQL 混合架構的需求，或是對 Rust、FDW 開發有興趣，非常歡迎你一起參與改進。</p><p>有任何回饋，歡迎透過 GitHub 討論，我會持續更新與優化這個實用的工具！</p><p><strong>此文作者</strong>：Daniel Shih(石頭)<br /><strong>此文地址</strong>： <a href="https://isdaniel.github.io/redis-fdw-rs/">https://isdaniel.github.io/redis-fdw-rs/</a> <br /><strong>版權聲明</strong>：本博客所有文章除特別聲明外，均採用 <a 
href="https://creativecommons.org/licenses/by-nc-sa/3.0/tw/">CC BY-NC-SA 3.0 TW</a> 許可協議。轉載請註明出處！</p>]]>
    </content>
    <id>https://isdaniel.github.io/redis-fdw-rs/</id>
    <link href="https://isdaniel.github.io/redis-fdw-rs/"/>
    <published>2025-08-16T11:12:43.000Z</published>
    <summary>大家好，今天要和大家介紹我近期開發的一個開源專案 —— redis_fdw_rs，這是一個使用 Rust 語言與 pgrx 框架實作的 Redis Foreign Data Wrapper (FDW)，讓你能夠在 PostgreSQL 中直接查詢 Redis 資料，就像操作一般的資料表一樣。</summary>
    <title>【開源介紹】redis_fdw_rs：讓 PostgreSQL 直接查 Redis 的 FDW 擴充套件（Rust 編寫）</title>
    <updated>2026-04-22T03:00:22.031Z</updated>
  </entry>
  <entry>
    <author>
      <name>Daniel Shih</name>
    </author>
    <category term="Rust" scheme="https://isdaniel.github.io/categories/Rust/"/>
    <category term="PostgreSQL" scheme="https://isdaniel.github.io/categories/Rust/PostgreSQL/"/>
    <category term="logical-replication" scheme="https://isdaniel.github.io/categories/Rust/PostgreSQL/logical-replication/"/>
    <category term="Rust" scheme="https://isdaniel.github.io/tags/Rust/"/>
    <category term="PostgreSQL" scheme="https://isdaniel.github.io/tags/PostgreSQL/"/>
    <category term="logical-replication" scheme="https://isdaniel.github.io/tags/logical-replication/"/>
    <content>
      <![CDATA[<h2 id="前言"><a href="#前言" class="headerlink" title="前言"></a>前言</h2><p>大家好，今天要和大家介紹我近期開發的一個開源專案 <a href="https://github.com/isdaniel/replication_checker_rs" title="GitHub - isdaniel&#x2F;replication_checker_rs">replication_checker_rs</a> 如果你想用一個輕量、可讀性高的工具實時觀察 PostgreSQL 邏輯複寫（logical replication）流，或想把複寫協議的學習變成可執行的實驗場，<code>replication_checker_rs</code> 是一個不錯的起點：它是 Rust 實現 PostgreSQL logical replication protocol 使用 <code>libpq-sys</code> 的實作，能連上資料庫、建立 replication slot，並將 INSERT &#x2F; UPDATE &#x2F; DELETE &#x2F; TRUNCATE 等變更以可讀格式顯示出來。(<a href="https://github.com/isdaniel/replication_checker_rs">https://github.com/isdaniel/replication_checker_rs</a>)</p><hr><h2 id="為什麼會想用它？"><a href="#為什麼會想用它？" class="headerlink" title="為什麼會想用它？"></a>為什麼會想用它？</h2><ul><li><strong>學習角度</strong>：它直接實作 PostgreSQL 的 logical replication protocol（WAL message parsing、relation&#x2F;tuple 格式）以可讀、可改造的 Rust 程式碼呈現，適合希望理解底層協議的人。</li><li><strong>快速驗證</strong>：想知道 publication 有沒有正確產生事件、或某些 schema 變更會如何呈現時，可以直接跑這個工具觀察實際輸出。</li><li><strong>Rust + libpq 真實範例</strong>：展示如何用 <code>libpq-sys</code> 與 Tokio 做低階連線管理與 parser 實作</li><li><strong>延伸空間大</strong>：可以把它當作 PoC（proof of concept），加上 JSON 化、推到 Kafka&#x2F;Redis、做到可重啟的 consumer。</li></ul><hr><h2 id="功能與限制（快速掃描）"><a href="#功能與限制（快速掃描）" class="headerlink" title="功能與限制（快速掃描）"></a>功能與限制（快速掃描）</h2><p><strong>主要功能</strong></p><ul><li>用作邏輯複寫客戶端（logical replication client），可建立 replication slot 並接收變更。</li><li>支援顯示 <code>BEGIN</code>、<code>COMMIT</code>、<code>INSERT</code>、<code>UPDATE</code>、<code>DELETE</code>、<code>TRUNCATE</code> 以及 relation／tuple 資訊，並能處理 streaming（大型）交易。</li></ul><p><strong>目前限制</strong></p><ul><li>目前只把變更<strong>顯示為人類可讀格式</strong>，沒有把事件推到 Kafka&#x2F;Redis 等下游處理器。</li><li>只對文字型別（text types）有良好處理；binary 類型會以 raw 形式顯示。</li><li>slot 管理、錯誤復原邏輯較簡單（遇到大部分錯誤會結束程式）</li></ul><hr><h2 id="實作前準備（PostgreSQL-與系統）"><a href="#實作前準備（PostgreSQL-與系統）" class="headerlink" title="實作前準備（PostgreSQL 與系統）"></a>實作前準備（PostgreSQL 
與系統）</h2><ol><li><strong>PostgreSQL 必須開啟 logical WAL</strong>：<code>wal_level = logical</code>（修改後需重啟 PostgreSQL）。這是 logical replication 的必要條件。</li><li><strong>建立 publication</strong>（只複寫你想要的 table，或 <code>FOR ALL TABLES</code>）：<code>CREATE PUBLICATION my_publication FOR TABLE table1, table2;</code>。</li><li><strong>建立有 replication 權限的 user</strong>：例如 <code>CREATE USER replicator WITH REPLICATION LOGIN PASSWORD &#39;password&#39;;</code> 並給予必要的 SELECT 權限。</li><li><strong>系統相依套件</strong>：需要 <code>libpq</code> 開發檔（例如 Debian&#x2F;Ubuntu 的 <code>libpq-dev</code>、macOS 用 Homebrew 安裝 <code>postgresql</code>），並使用 Rust 1.70+。</li></ol><blockquote><p>請注意：PostgreSQL DB 版本必須等於或高於版本 14，更多資訊請參閱以下連結。</p></blockquote><p><a href="https://www.postgresql.org/docs/14/protocol-replication.html">https://www.postgresql.org/docs/14/protocol-replication.html</a><br><a href="https://www.postgresql.org/docs/current/protocol-logical-replication.html#PROTOCOL-LOGICAL-REPLICATION-PARAMS">https://www.postgresql.org/docs/current/protocol-logical-replication.html#PROTOCOL-LOGICAL-REPLICATION-PARAMS</a></p><h3 id="PostgreSQL-設定（必要）"><a href="#PostgreSQL-設定（必要）" class="headerlink" title="PostgreSQL 設定（必要）"></a>PostgreSQL 設定（必要）</h3><ol><li><code>postgresql.conf</code>：</li></ol><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line">wal_level = logical</span><br><span class="line">max_replication_slots = 10    # 視需求調整</span><br><span class="line">max_wal_senders = 10</span><br></pre></td></tr></table></figure><p>修改後重啟 PostgreSQL。</p><ol start="2"><li>建 publication 與 replication user（在 psql 下執行）：</li></ol><figure class="highlight sql"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span 
class="line"><span class="comment">-- 建 replication user</span></span><br><span class="line"><span class="keyword">CREATE</span> ROLE replicator <span class="keyword">WITH</span> REPLICATION LOGIN PASSWORD <span class="string">&#x27;replicator_pw&#x27;</span>;</span><br><span class="line"><span class="comment">-- 建 publication（只複寫特定 table 或使用 FOR ALL TABLES）</span></span><br><span class="line"><span class="keyword">CREATE</span> PUBLICATION my_publication <span class="keyword">FOR</span> <span class="keyword">TABLE</span> public.my_table;</span><br></pre></td></tr></table></figure><blockquote><p>注意：不要無限制地建立 replication slot（未刪除會造成 WAL 快速累積），測試時留意 WAL 使用量。</p></blockquote><hr><h2 id="快速上手（編譯-執行）"><a href="#快速上手（編譯-執行）" class="headerlink" title="快速上手（編譯 &amp; 執行）"></a>快速上手（編譯 &amp; 執行）</h2><p><strong>從原始碼編譯</strong></p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line">git <span class="built_in">clone</span> https://github.com/isdaniel/replication_checker_rs.git</span><br><span class="line"><span class="built_in">cd</span> replication_checker_rs</span><br><span class="line">cargo build --release</span><br></pre></td></tr></table></figure><p>（注意系統需先安裝 libpq 開發庫）。</p><p><strong>範例執行方式</strong>（README 範例改寫，實務上把 db 參數用空格分開的 key-value 傳入）</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># 設定 slot / publication 名稱（可用環境變數）</span></span><br><span class="line"><span class="built_in">export</span> slot_name=<span class="string">&quot;my_slot&quot;</span></span><br><span class="line"><span class="built_in">export</span> pub_name=<span 
class="string">&quot;my_publication&quot;</span></span><br><span class="line"></span><br><span class="line"><span class="comment"># 執行（參數順序為 key value key value ...）</span></span><br><span class="line">./target/release/pg_replica_rs user azureuser replication database host 127.0.0.1 dbname redis_fdw_rs port 5432</span><br></pre></td></tr></table></figure><blockquote><p>連線字串需要帶上 replication database，代表這是 replication 連線<br>你也可以用 <code>RUST_LOG</code> 控制日誌等級（例：<code>RUST_LOG=debug</code>）。</p></blockquote><p><strong>Docker 方式</strong></p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line">docker build -t pg_replica_rs .</span><br><span class="line">docker run -e slot_name=my_slot -e pub_name=my_pub \</span><br><span class="line">  pg_replica_rs user postgres password secret host host.docker.internal port 5432 dbname mydb</span><br></pre></td></tr></table></figure><p>方便在隔離環境做測試。</p><h2 id="實戰：建立-publication、產生測試資料、觀察輸出"><a href="#實戰：建立-publication、產生測試資料、觀察輸出" class="headerlink" title="實戰：建立 publication、產生測試資料、觀察輸出"></a>實戰：建立 publication、產生測試資料、觀察輸出</h2><h3 id="在資料庫上建立測試-table-與-publication"><a href="#在資料庫上建立測試-table-與-publication" class="headerlink" title="在資料庫上建立測試 table 與 publication"></a>在資料庫上建立測試 table 與 publication</h3><figure class="highlight sql"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">CREATE TABLE</span> public.my_table (</span><br><span class="line">  id serial <span class="keyword">primary key</span>,</span><br><span class="line">  msg text</span><br><span class="line">);</span><br><span class="line"></span><br><span class="line"><span 
class="keyword">CREATE</span> PUBLICATION my_publication <span class="keyword">FOR</span> <span class="keyword">TABLE</span> public.my_table;</span><br></pre></td></tr></table></figure><h3 id="在另一個-psql-session-產生變更"><a href="#在另一個-psql-session-產生變更" class="headerlink" title="在另一個 psql session 產生變更"></a>在另一個 psql session 產生變更</h3><figure class="highlight sql"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">INSERT INTO</span> public.my_table (msg) <span class="keyword">VALUES</span> (<span class="string">&#x27;hello replication&#x27;</span>);</span><br><span class="line"><span class="keyword">UPDATE</span> public.my_table <span class="keyword">SET</span> msg <span class="operator">=</span> <span class="string">&#x27;altered&#x27;</span> <span class="keyword">WHERE</span> id <span class="operator">=</span> <span class="number">1</span>;</span><br><span class="line"><span class="keyword">DELETE</span> <span class="keyword">FROM</span> public.my_table <span class="keyword">WHERE</span> id <span class="operator">=</span> <span class="number">1</span>;</span><br></pre></td></tr></table></figure><h3 id="觀察-replication-checker-rs-的輸出（示例）"><a href="#觀察-replication-checker-rs-的輸出（示例）" class="headerlink" title="觀察 replication_checker_rs 的輸出（示例）"></a>觀察 <code>replication_checker_rs</code> 的輸出（示例）</h3><figure class="highlight apache"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br></pre></td><td 
class="code"><pre><span class="line"><span class="attribute">2025</span>-<span class="number">08</span>-<span class="number">10</span>T02:<span class="number">57</span>:<span class="number">54</span>.<span class="number">417412</span>Z  INFO Started receiving data from database server</span><br><span class="line"><span class="attribute">2025</span>-<span class="number">08</span>-<span class="number">10</span>T02:<span class="number">58</span>:<span class="number">01</span>.<span class="number">488499</span>Z  INFO BEGIN: Xid <span class="number">1522</span></span><br><span class="line"><span class="attribute">2025</span>-<span class="number">08</span>-<span class="number">10</span>T02:<span class="number">58</span>:<span class="number">01</span>.<span class="number">489158</span>Z  INFO Received relation info for public.t1</span><br><span class="line"><span class="attribute">2025</span>-<span class="number">08</span>-<span class="number">10</span>T02:<span class="number">58</span>:<span class="number">01</span>.<span class="number">489235</span>Z  INFO TRUNCATE</span><br><span class="line"><span class="attribute">2025</span>-<span class="number">08</span>-<span class="number">10</span>T02:<span class="number">58</span>:<span class="number">01</span>.<span class="number">489255</span>Z  INFO public.t1</span><br><span class="line"><span class="attribute">2025</span>-<span class="number">08</span>-<span class="number">10</span>T02:<span class="number">58</span>:<span class="number">01</span>.<span class="number">489318</span>Z  INFO COMMIT: flags: <span class="number">0</span>, lsn: <span class="number">43614824</span>, end_lsn: <span class="number">43614944</span>, commit_time: <span class="number">2025</span>-<span class="number">08</span>-<span class="number">10</span> <span class="number">02</span>:<span class="number">58</span>:<span class="number">01</span>.<span class="number">484</span> UTC</span><br><span class="line"><span class="attribute">2025</span>-<span 
class="number">08</span>-<span class="number">10</span>T02:<span class="number">58</span>:<span class="number">07</span>.<span class="number">583760</span>Z  INFO BEGIN: Xid <span class="number">1523</span></span><br><span class="line"><span class="attribute">2025</span>-<span class="number">08</span>-<span class="number">10</span>T02:<span class="number">58</span>:<span class="number">07</span>.<span class="number">583925</span>Z  INFO Received relation info for public.t1</span><br><span class="line"><span class="attribute">2025</span>-<span class="number">08</span>-<span class="number">10</span>T02:<span class="number">58</span>:<span class="number">07</span>.<span class="number">584000</span>Z  INFO table public.t1: INSERT:</span><br><span class="line"><span class="attribute">2025</span>-<span class="number">08</span>-<span class="number">10</span>T02:<span class="number">58</span>:<span class="number">07</span>.<span class="number">584012</span>Z  INFO a: <span class="number">1</span></span><br><span class="line"><span class="attribute">2025</span>-<span class="number">08</span>-<span class="number">10</span>T02:<span class="number">58</span>:<span class="number">07</span>.<span class="number">584040</span>Z  INFO table public.t1: INSERT:</span><br><span class="line"><span class="attribute">2025</span>-<span class="number">08</span>-<span class="number">10</span>T02:<span class="number">58</span>:<span class="number">07</span>.<span class="number">584062</span>Z  INFO a: <span class="number">2</span></span><br><span class="line"><span class="attribute">2025</span>-<span class="number">08</span>-<span class="number">10</span>T02:<span class="number">58</span>:<span class="number">07</span>.<span class="number">584104</span>Z  INFO COMMIT: flags: <span class="number">0</span>, lsn: <span class="number">43615128</span>, end_lsn: <span class="number">43615176</span>, commit_time: <span class="number">2025</span>-<span class="number">08</span>-<span 
class="number">10</span> <span class="number">02</span>:<span class="number">58</span>:<span class="number">07</span>.<span class="number">580</span> UTC</span><br><span class="line"></span><br></pre></td></tr></table></figure><hr><h2 id="程式碼結構與關鍵模組導覽"><a href="#程式碼結構與關鍵模組導覽" class="headerlink" title="程式碼結構與關鍵模組導覽"></a>程式碼結構與關鍵模組導覽</h2><p>（以 repo 常見的檔案分佈為範例）</p><ul><li><code>main.rs</code>：啟動、參數解析、log 設定</li><li><code>server.rs</code> &#x2F; <code>connection.rs</code>：與 Postgres 建立 replication 連線、處理 libpq loop</li><li><code>parser.rs</code>：負責把 raw WAL message 解析成內部事件（BEGIN&#x2F;RELATION&#x2F;INSERT&#x2F;UPDATE&#x2F;DELETE&#x2F;TRUNCATE&#x2F;COMMIT）</li><li><code>types.rs</code>：定義 relation、tuple 與欄位型別</li><li><code>utils.rs</code>：byte 解析、LSN 處理、輔助 function</li></ul><hr><h2 id="深入：實踐中的概念與操作建議"><a href="#深入：實踐中的概念與操作建議" class="headerlink" title="深入：實踐中的概念與操作建議"></a>深入：實踐中的概念與操作建議</h2><p>下面幾點是把工具從「觀察器」進化到「可實際應用」時會用到的概念與工程建議。</p><ol><li><p><strong>Replication Slot</strong></p><ul><li>Slot 決定了你從哪個 WAL LSN 開始讀，並讓 PostgreSQL 保留 WAL（直到被確認消費）。測試時注意不要無限建立未刪除的 slot（會造成 WAL 累積）。<code>replication_checker_rs</code> 可建立 slot，但目前在程式中 slot cleanup 還是簡單處理，所以測試環境中你要管理好。</li></ul></li><li><p><strong>Publication 與 Schema 一致性</strong></p><ul><li>Publication 決定哪些 table 的變更會被發送。上線前請確認 schema（尤其 REPLICA IDENTITY、nullable、type 改動）在 source 與 downstream 處理端的一致性，否則解析或重放會有問題。<code>replication_checker_rs</code> 會顯示 relation 資訊，能幫你驗證。</li></ul></li><li><p><strong>Streaming 交易</strong></p><ul><li>對大交易（very large transactions），Postgres 可能以 streaming 模式傳送。此專案已處理 streaming 交易，這讓它在面對大批量資料變更時不會輕易崩潰。</li></ul></li><li><p><strong>Feedback（ack）機制</strong></p><ul><li>Logical replication protocol 支援回報已處理的 WAL 位置（可用於讓 primary 安全移除 WAL）。專案實作有 feedback 機制，但在 production 要確保回報策略（多久 ack、持久化 LSN 等）與你下游同步策略一致。</li></ul></li><li><p><strong>從「顯示」到「處理」：把事件送到下游</strong></p><ul><li>如果你要把變更送到 Kafka、Redis 或寫入另一個 DB，建議把 parser 與 message handling 拆成兩層：<strong>（1）可靠地接收並 ack 
WAL（LSN）</strong>、<strong>（2）異步或批次地把事件轉送到下游並重試</strong>。目前 <code>replication_checker_rs</code> 主要做第（1）與可視化，延伸第（2）需要你加上連線池、backpressure 與錯誤重試。</li></ul></li></ol><hr><h2 id="實務上常見問題（與排解）"><a href="#實務上常見問題（與排解）" class="headerlink" title="實務上常見問題（與排解）"></a>實務上常見問題（與排解）</h2><ul><li><code>libpq not found</code>：請安裝系統的 PostgreSQL 開發套件（如 <code>libpq-dev</code>、<code>postgresql-devel</code> 或 Homebrew 的 <code>postgresql</code>）。</li><li>權限錯誤：Replication user 需要 <code>REPLICATION</code> 權限，且 publication 應包含你想觀察的 tables。</li><li>Slot 已存在：若 slot 名稱衝突，請手動刪除舊 slot 或指定不同名稱再試。</li></ul><hr><h2 id="延伸建議（如果你想把它用到更真實的場景）"><a href="#延伸建議（如果你想把它用到更真實的場景）" class="headerlink" title="延伸建議（如果你想把它用到更真實的場景）"></a>延伸建議（如果你想把它用到更真實的場景）</h2><ol><li>將輸出轉成 <strong>structured JSON</strong> 並發到 Kafka 或 Event Hub（便於 downstream consumer 處理）。</li><li>增加 <strong>slot cleanup &amp; resume 機制</strong>：遇錯不要直接停，保存最後 ack 的 LSN，重啟時從該位置恢復。</li><li>支援 binary 類型與更完整的 type mapping（目前以 raw 顯示 binary）。</li><li>把 <code>replication_checker_rs</code> 包成一個可監控的 service，加上 metrics（Prometheus）與 health-check endpoint。<br>這些方向都屬於從 PoC → production 的典型演進路線。</li></ol><hr><h2 id="結語"><a href="#結語" class="headerlink" title="結語"></a>結語</h2><p><code>replication_checker_rs</code> 是一個很實用的學習與測試工具：它把 PostgreSQL logical replication protocol 的各個重要面向（slot、publication、BEGIN&#x2F;COMMIT、tuple parsing、streaming 交易、feedback）用 Rust + libpq 呈現出來，適合用來做教學、驗證複寫行為或作為你自製複寫處理器的起點。想進一步把它變成 production-ready，需要在錯誤復原、slot 管理、下游整合跟 binary type 處理上補強。(<a href="https://github.com/isdaniel/replication_checker_rs" title="GitHub - isdaniel&#x2F;replication_checker_rs">GitHub</a>)</p><ul><li><a href="https://github.com/isdaniel/replication_checker_rs" title="GitHub - isdaniel&#x2F;replication_checker_rs">replication_checker_rs</a></li><li>歡迎 Star、開 issue、PR 一起改進專案！</li></ul><p><strong>此文作者</strong>：Daniel Shih(石頭)<br /><strong>此文地址</strong>： <a 
href="https://isdaniel.github.io/logical-replication-checker-replication_checker_rs/">https://isdaniel.github.io/logical-replication-checker-replication_checker_rs&#x2F;</a> <br /><strong>版權聲明</strong>：本博客所有文章除特別聲明外，均採用 <a href="https://creativecommons.org/licenses/by-nc-sa/3.0/tw/">CC BY-NC-SA 3.0 TW</a> 許可協議。轉載請註明出處！</p>]]>
    </content>
    <id>https://isdaniel.github.io/logical-replication-checker-replication_checker_rs/</id>
    <link href="https://isdaniel.github.io/logical-replication-checker-replication_checker_rs/"/>
    <published>2025-08-10T10:30:11.000Z</published>
    <summary>Hello everyone! Today I would like to introduce an open-source project I have been developing recently, [replication_checker_rs][1]. If you want a lightweight, highly readable tool for observing the PostgreSQL logical replication stream in real time, or want to take the replication protocol</summary>
    <title>A PostgreSQL logical replication checker written in Rust (replication_checker_rs)</title>
    <updated>2026-04-22T03:00:22.028Z</updated>
  </entry>
  <entry>
    <author>
      <name>Daniel Shih</name>
    </author>
    <category term="MySQL" scheme="https://isdaniel.github.io/categories/MySQL/"/>
    <category term="DataBase" scheme="https://isdaniel.github.io/categories/MySQL/DataBase/"/>
    <category term="DataBase" scheme="https://isdaniel.github.io/tags/DataBase/"/>
    <category term="Charset" scheme="https://isdaniel.github.io/tags/Charset/"/>
    <category term="MySQL" scheme="https://isdaniel.github.io/tags/MySQL/"/>
    <content>
      <![CDATA[<h1 id="為什麼修改-MySQL-的-character-set-server-後仍需重啟？從-mysql-connector-net-探討字元集的陷阱"><a href="#為什麼修改-MySQL-的-character-set-server-後仍需重啟？從-mysql-connector-net-探討字元集的陷阱" class="headerlink" title="Why does MySQL still require a restart after changing character_set_server? Character-set pitfalls seen from mysql-connector-net"></a>Why Does MySQL Still Require a Restart After Changing <code>character_set_server</code>? Character-Set Pitfalls Seen from mysql-connector-net</h1><p>While investigating a MySQL character-set issue recently, I dug into MySQL Server&#x27;s handshake mechanism and the <code>mysql-connector-net</code> source code, and found an easily overlooked detail that can cause serious errors: <strong>even though <code>character_set_server</code> is a dynamic parameter, changing it still requires restarting MySQL Server; otherwise the driver will decode data incorrectly.</strong></p><h2 id="問題背景：為什麼驅動程式仍使用舊的字元集？"><a href="#問題背景：為什麼驅動程式仍使用舊的字元集？" class="headerlink" title="Background: why does the driver still use the old character set?"></a>Background: Why Does the Driver Still Use the Old Character Set?</h2><p>By design, when a client connects, MySQL Server returns some basic information in the initial handshake packet, including the server&#x27;s default character set (<code>character_set_server</code>). This information comes from the following code:</p><p>🔗 <a href="https://github.com/mysql/mysql-server/blob/61a3a1d8ef15512396b4c2af46e922a19bf2b174/sql/auth/sql_authentication.cc#L1872">MySQL source code reference</a></p><figure class="highlight c++"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">packet-&gt;<span class="built_in">append_int1</span>(default_charset_info-&gt;number);</span><br></pre></td></tr></table></figure><p>According to the official MySQL documentation, this information is packaged and sent in the Handshake v10 protocol (<a href="https://dev.mysql.com/doc/dev/mysql-server/latest/page_protocol_connection_phase_packets_protocol_handshake_v10.html">official docs</a>).</p><p>However, this value is loaded when MySQL Server starts. <strong>Even if you change <code>character_set_server</code> dynamically at runtime, the character-set value in the greeting packet will still not be updated for new connections.</strong></p><p><img src="/../images/2025-07-10_12h42_13.png" alt="img"></p><h2 id="驅動程式的行為：根據-Greeting-決定後續欄位解碼"><a href="#驅動程式的行為：根據-Greeting-決定後續欄位解碼" class="headerlink" title="Driver behavior: the greeting determines how subsequent columns are decoded"></a>Driver Behavior: the Greeting Determines How Subsequent Columns Are Decoded</h2><p>When <code>mysql-connector-net</code> receives the greeting packet, it stores the character set in <code>ConnectionCharSetIndex</code>:</p><figure class="highlight csharp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">/* New protocol with 16 bytes to describe server characteristics */</span></span><br><span class="line">owner.ConnectionCharSetIndex = (<span class="built_in">int</span>)packet.ReadByte(); <span class="comment">// e.g. 54 = UTF16</span></span><br></pre></td></tr></table></figure><p>🔗 <a href="https://github.com/mysql/mysql-connector-net/blob/9.1.0/MySQL.Data/src/NativeDriver.cs#L241">Code location</a></p><p>Next, when processing each column&#x27;s data, if the column&#x27;s charset is binary (63), the driver falls back to the connection default:</p><figure class="highlight csharp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">if</span> (CharacterSetIndex == <span class="number">63</span>)</span><br><span class="line">    CharacterSetIndex = driver.ConnectionCharSetIndex;</span><br></pre></td></tr></table></figure><p>🔗 <a href="https://github.com/mysql/mysql-connector-net/blob/9.1.0/MySQL.Data/src/Field.cs#L250">Field-handling code location</a></p><p>This is where the problem arises: in the MySQL protocol, even binary-typed columns such as integers and timestamps have their charset set to binary (63). That means the driver falls back to UTF-16 (the character set from the greeting) when actually decoding the data, producing garbled output or exceptions.</p><h2 id="實驗觀察與錯誤範例"><a href="#實驗觀察與錯誤範例" class="headerlink" title="Observations and error examples"></a>Observations and Error Examples</h2><p>Here is what I observed through packet captures and actual runs:</p><h3 id="Greeting-Packet："><a href="#Greeting-Packet：" class="headerlink" title="Greeting Packet:"></a>Greeting Packet:</h3><p>On connection, we observed the greeting charset set to 54 (UTF-16):</p><figure class="highlight node-repl"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line">Server Greeting</span><br><span class="line"><span class="meta prompt_">...</span></span><br><span class="line">    Language: utf16 COLLATE utf16_general_ci (54)</span><br></pre></td></tr></table></figure><p><img src="/../images/2025-07-10_12h42_49.png" alt="img"></p><h3 id="整數欄位錯誤解析："><a href="#整數欄位錯誤解析：" class="headerlink" title="Incorrect parsing of integer columns:"></a>Incorrect Parsing of Integer Columns:</h3><p>The column&#x27;s <code>characterSet</code> is 63 (binary), which falls back to 54 (UTF-16) for decoding, causing a <code>System.FormatException</code> like this:</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">System.FormatException: Input string was not in a correct format.</span><br><span class="line">   at System.Int32.Parse(String s, IFormatProvider provider)</span><br></pre></td></tr></table></figure><p><img src="/../images/2025-07-10_12h43_10.png" alt="img"></p><p>Because the <code>mysql-connector-net</code> client library does not yet fully support UTF-16, this issue may be addressed in a future release.</p><h2 id="修正建議與討論"><a href="#修正建議與討論" class="headerlink" title="Fixes and discussion"></a>Fixes and Discussion</h2><h3 id="為什麼需要重啟？"><a href="#為什麼需要重啟？" class="headerlink" title="Why is a restart needed?"></a>Why Is a Restart Needed?</h3><p>Although <code>character_set_server</code> is a dynamic parameter, the character set in the greeting is defined at startup (loaded from <code>default_charset_info</code>). So if you change this parameter <strong>without restarting MySQL Server</strong>, new connections will still receive the old greeting charset.</p><p>With a driver such as <code>mysql-connector-net</code>, because the greeting charset is used as the fallback for decoding binary columns, this causes errors or garbled output when parsing numeric data.</p><h2 id="建議改善方向"><a href="#建議改善方向" class="headerlink" title="Suggested improvements"></a>Suggested Improvements</h2><ol><li><p><strong>Driver-side fixes</strong>:</p><ul><li>Before falling back to the greeting charset, the driver should check whether the field type actually requires character decoding.</li><li>For example, for Int32, Int64, and Timestamp columns, the charset fallback can be skipped.</li></ul></li><li><p><strong>Server-side operational advice</strong>:</p><ul><li>If you need to change <code>character_set_server</code>, always restart the MySQL server afterward so the greeting packet is updated in sync.</li><li>If you use the mysql-connector-net library, use utf8mb4 instead of UTF-16.</li></ul></li></ol><h2 id="總結"><a href="#總結" class="headerlink" title="Summary"></a>Summary</h2><p>This investigation highlights a subtle source of errors between the MySQL greeting packet and the driver&#x27;s decoding logic. In particular, when using the <code>mysql-connector-net</code> driver to handle binary or integer data, a stale greeting charset on a server that has not been restarted can trigger decoding errors and runtime exceptions.</p><p><strong>Author</strong>: Daniel Shih (石頭)<br /><strong>Post URL</strong>: <a href="https://isdaniel.github.io/mysql-connector-net-charset-issue/">https://isdaniel.github.io/mysql-connector-net-charset-issue/</a> <br /><strong>Copyright</strong>: Unless otherwise stated, all posts on this blog are licensed under <a href="https://creativecommons.org/licenses/by-nc-sa/3.0/tw/">CC BY-NC-SA 3.0 TW</a>. Please credit the source when reposting!</p>]]>
    </content>
    <id>https://isdaniel.github.io/mysql-connector-net-charset-issue/</id>
    <link href="https://isdaniel.github.io/mysql-connector-net-charset-issue/"/>
    <published>2025-07-09T23:10:43.000Z</published>
    <summary>Why does MySQL still require a restart after changing character_set_server? Character-set pitfalls seen from mysql-connector-net</summary>
    <title>Why does MySQL still require a restart after changing character_set_server? Character-set pitfalls seen from mysql-connector-net</title>
    <updated>2026-04-22T03:00:22.029Z</updated>
  </entry>
  <entry>
    <author>
      <name>Daniel Shih</name>
    </author>
    <category term="Rust" scheme="https://isdaniel.github.io/categories/Rust/"/>
    <category term="pgrx" scheme="https://isdaniel.github.io/categories/Rust/pgrx/"/>
    <category term="PostgreSQL" scheme="https://isdaniel.github.io/categories/Rust/pgrx/PostgreSQL/"/>
    <category term="Rust" scheme="https://isdaniel.github.io/tags/Rust/"/>
    <category term="PostgreSQL" scheme="https://isdaniel.github.io/tags/PostgreSQL/"/>
    <category term="pgrx" scheme="https://isdaniel.github.io/tags/pgrx/"/>
    <content>
      <![CDATA[<h1 id="🚀-Building-a-Simple-PostgreSQL-FDW-with-Rust-and-pgrx"><a href="#🚀-Building-a-Simple-PostgreSQL-FDW-with-Rust-and-pgrx" class="headerlink" title="🚀 Building a Simple PostgreSQL FDW with Rust and pgrx"></a>🚀 Building a Simple PostgreSQL FDW with Rust and pgrx</h1><p>PostgreSQL Foreign Data Wrappers (FDW) enable PostgreSQL to query external data sources as if they were regular tables. Traditionally, FDWs are written in C, but with <a href="https://github.com/pgcentralfoundation/pgrx"><code>pgrx</code></a>, we can now build PostgreSQL extensions — including FDWs — in <strong>Rust</strong>, unlocking safety and modern tooling.</p><p>In this post, we’ll walk through creating a simple FDW using Rust and <code>pgrx</code> that simulates reading rows from an external source (e.g., Redis or an API). While it’s a stub, it demonstrates how to implement the core FDW lifecycle.</p><h2 id="🛠️-What-Can-You-Build-with-pgrx"><a href="#🛠️-What-Can-You-Build-with-pgrx" class="headerlink" title="🛠️ What Can You Build with pgrx?"></a>🛠️ What Can You Build with <code>pgrx</code>?</h2><ul><li>✅ <strong>SQL Functions:</strong> Scalar, aggregate, and set-returning functions.</li><li>✅ <strong>Custom Types:</strong> Define composite types or enums in Rust.</li><li>✅ <strong>Foreign Data Wrappers (FDWs):</strong> Like the one in this example — connect PostgreSQL to external systems (Redis, APIs, file systems, etc.).</li><li>✅ <strong>Index Access Methods:</strong> Implement new index types.</li><li>✅ <strong>Background Workers:</strong> Run tasks in the background inside PostgreSQL.</li><li>✅ <strong>Hooks:</strong> Intercept or modify PostgreSQL internal behavior (like planner or executor hooks).</li></ul><h3 id="🌐-Under-the-hood"><a href="#🌐-Under-the-hood" class="headerlink" title="🌐 Under the hood:"></a>🌐 Under the 
hood:</h3><ul><li>PostgreSQL communicates via C APIs.</li><li><code>pgrx</code> provides Rust-safe bindings to these APIs.</li><li>Memory management is handled carefully via <code>PgMemoryContexts</code>, matching PostgreSQL’s memory context model.</li><li>Rust functions are exposed to PostgreSQL as SQL-callable functions with the <code>#[pg_extern]</code> macro.</li></ul><h2 id="🏗️-Key-Components-of-an-default-fdw"><a href="#🏗️-Key-Components-of-an-default-fdw" class="headerlink" title="🏗️ Key Components of a default_fdw"></a>🏗️ Key Components of a default_fdw</h2><p>PostgreSQL FDWs consist of several callback functions that handle different phases of query planning and execution:</p><p>Example extension: <a href="https://github.com/isdaniel/rust_pg_extensions/blob/main/src/default_fdw.rs">default_fdw</a></p><ul><li><p><strong>Planning Phase:</strong></p><ul><li><code>GetForeignRelSize</code>: Estimate rows.</li><li><code>GetForeignPaths</code>: Generate access paths.</li><li><code>GetForeignPlan</code>: Create the scan plan.</li></ul></li><li><p><strong>Execution Phase:</strong></p><ul><li><code>BeginForeignScan</code>: Initialize the scan.</li><li><code>IterateForeignScan</code>: Produce each row.</li><li><code>ReScanForeignScan</code>: Restart the scan if needed.</li><li><code>EndForeignScan</code>: Cleanup.</li></ul></li></ul><h3 id="How-to-use"><a href="#How-to-use" class="headerlink" title="How to use"></a>How to use</h3><figure class="highlight sql"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span 
class="line">16</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">create</span> <span class="keyword">foreign</span> data wrapper default_wrapper</span><br><span class="line">  handler default_fdw_handler;</span><br><span class="line"></span><br><span class="line"><span class="keyword">create</span> server my_default_server</span><br><span class="line">  <span class="keyword">foreign</span> data wrapper default_wrapper</span><br><span class="line">  options (</span><br><span class="line">    foo <span class="string">&#x27;bar&#x27;</span></span><br><span class="line">  );</span><br><span class="line"></span><br><span class="line"><span class="keyword">create</span> <span class="keyword">foreign</span> <span class="keyword">table</span> hello (</span><br><span class="line">  id <span class="type">bigint</span>,</span><br><span class="line">  col text</span><br><span class="line">)</span><br><span class="line">server my_default_server options (</span><br><span class="line">foo <span class="string">&#x27;bar&#x27;</span></span><br><span class="line">);</span><br></pre></td></tr></table></figure><p>Then we can select hello table.</p><figure class="highlight axapta"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">select</span> * <span class="keyword">from</span> hello;</span><br></pre></td></tr></table></figure><h2 id="🔧-Setting-Up-the-FDW-Handler"><a href="#🔧-Setting-Up-the-FDW-Handler" class="headerlink" title="🔧 Setting Up the FDW Handler"></a>🔧 Setting Up the FDW Handler</h2><p>The entry point is the FDW handler function, which PostgreSQL calls to retrieve a set of function pointers.</p><figure class="highlight rust"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span 
class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">#[pg_extern(create_or_replace)]</span></span><br><span class="line"><span class="keyword">pub</span> <span class="keyword">extern</span> <span class="string">&quot;C&quot;</span> <span class="keyword">fn</span> <span class="title function_">default_fdw_handler</span>() <span class="punctuation">-&gt;</span> PgBox&lt;pg_sys::FdwRoutine&gt; &#123;</span><br><span class="line">    log!(<span class="string">&quot;&gt; default_fdw_handler&quot;</span>);</span><br><span class="line">    <span class="keyword">unsafe</span> &#123;</span><br><span class="line">        <span class="keyword">let</span> <span class="keyword">mut </span><span class="variable">fdw_routine</span> = PgBox::&lt;pg_sys::FdwRoutine, AllocatedByRust&gt;::<span class="title function_ invoke__">alloc_node</span>(pg_sys::NodeTag::T_FdwRoutine);</span><br><span class="line"></span><br><span class="line">        <span class="comment">// Planning callbacks</span></span><br><span class="line">        fdw_routine.GetForeignRelSize = <span class="title function_ invoke__">Some</span>(get_foreign_rel_size);</span><br><span class="line">        fdw_routine.GetForeignPaths = <span class="title function_ invoke__">Some</span>(get_foreign_paths);</span><br><span class="line">        fdw_routine.GetForeignPlan = <span class="title function_ invoke__">Some</span>(get_foreign_plan);</span><br><span class="line">        fdw_routine.ExplainForeignScan = <span class="title function_ 
invoke__">Some</span>(explain_foreign_scan);</span><br><span class="line"></span><br><span class="line">        <span class="comment">// Execution callbacks</span></span><br><span class="line">        fdw_routine.BeginForeignScan = <span class="title function_ invoke__">Some</span>(begin_foreign_scan);</span><br><span class="line">        fdw_routine.IterateForeignScan = <span class="title function_ invoke__">Some</span>(iterate_foreign_scan);</span><br><span class="line">        fdw_routine.ReScanForeignScan = <span class="title function_ invoke__">Some</span>(re_scan_foreign_scan);</span><br><span class="line">        fdw_routine.EndForeignScan = <span class="title function_ invoke__">Some</span>(end_foreign_scan);</span><br><span class="line"></span><br><span class="line">        fdw_routine.<span class="title function_ invoke__">into_pg_boxed</span>()</span><br><span class="line">    &#125;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><h2 id="📦-Extracting-Foreign-Table-Options"><a href="#📦-Extracting-Foreign-Table-Options" class="headerlink" title="📦 Extracting Foreign Table Options"></a>📦 Extracting Foreign Table Options</h2><p>PostgreSQL allows specifying options like hostnames or credentials when creating a foreign table. 
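</p><p>The body of <code>get_foreign_table_options</code> is elided above, so here is a minimal, hypothetical sketch of the shape of work it performs. The pgrx/PostgreSQL calls that actually read the catalog (for example <code>GetForeignTable</code>) are replaced by an already-extracted list of key/value pairs, so this snippet is plain, self-contained Rust rather than the extension&#x27;s real implementation:</p>

```rust
use std::collections::HashMap;

// Hypothetical stand-in: in a real FDW these pairs would be read from the
// foreign table's catalog entry, e.g. `options (host '127.0.0.1', port '6379')`.
fn collect_options(raw: &[(&str, &str)]) -> HashMap<String, String> {
    raw.iter()
        .map(|(k, v)| (k.to_string(), v.to_string()))
        .collect()
}

fn main() {
    let opts = collect_options(&[("host", "127.0.0.1"), ("port", "6379")]);
    // A real FDW would now use these values to open its external connection.
    assert_eq!(opts.get("host").map(String::as_str), Some("127.0.0.1"));
    println!("{opts:?}");
}
```

<p>The real callback pulls the key/value pairs out of <code>pg_sys</code> structures instead of a slice, but the result it hands back, a <code>String</code>-to-<code>String</code> map, has the same shape.</p><p>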
This function retrieves those options:</p><figure class="highlight rust"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">unsafe</span> <span class="keyword">fn</span> <span class="title function_">get_foreign_table_options</span>(relid: pg_sys::Oid) <span class="punctuation">-&gt;</span> HashMap&lt;<span class="type">String</span>, <span class="type">String</span>&gt; &#123;</span><br><span class="line">    ...</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>This is crucial when your FDW needs to connect to external systems like Redis, REST APIs, or filesystems.</p><h2 id="📊-Planner-Callbacks"><a href="#📊-Planner-Callbacks" class="headerlink" title="📊 Planner Callbacks"></a>📊 Planner Callbacks</h2><h3 id="1️⃣-GetForeignRelSize"><a href="#1️⃣-GetForeignRelSize" class="headerlink" title="1️⃣ GetForeignRelSize"></a>1️⃣ <strong>GetForeignRelSize</strong></h3><p>Estimates the number of rows in the foreign table.</p><figure class="highlight rust"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">#[pg_guard]</span></span><br><span class="line"><span class="keyword">extern</span> <span class="string">&quot;C&quot;</span> <span class="keyword">fn</span> <span class="title function_">get_foreign_rel_size</span>(..., baserel: *<span class="keyword">mut</span> pg_sys::RelOptInfo, ...) 
&#123;</span><br><span class="line">    log!(<span class="string">&quot;&gt; get_foreign_rel_size&quot;</span>);</span><br><span class="line">    <span class="keyword">unsafe</span> &#123;</span><br><span class="line">        (*baserel).rows = <span class="number">1000.0</span>;</span><br><span class="line">    &#125;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><h3 id="2️⃣-GetForeignPaths"><a href="#2️⃣-GetForeignPaths" class="headerlink" title="2️⃣ GetForeignPaths"></a>2️⃣ <strong>GetForeignPaths</strong></h3><p>Defines possible access paths for the planner.</p><figure class="highlight rust"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">#[pg_guard]</span></span><br><span class="line"><span class="keyword">extern</span> <span class="string">&quot;C&quot;</span> <span class="keyword">fn</span> <span class="title function_">get_foreign_paths</span>(..., baserel: *<span class="keyword">mut</span> pg_sys::RelOptInfo, ...) 
&#123;</span><br><span class="line">    log!(<span class="string">&quot;&gt; get_foreign_paths&quot;</span>);</span><br><span class="line">    <span class="keyword">unsafe</span> &#123;</span><br><span class="line">        <span class="keyword">let</span> <span class="variable">path</span> = pg_sys::<span class="title function_ invoke__">create_foreignscan_path</span>(</span><br><span class="line">            ...,</span><br><span class="line">            (*baserel).rows,</span><br><span class="line">            <span class="number">10.0</span>,   <span class="comment">// startup cost</span></span><br><span class="line">            <span class="number">100.0</span>,  <span class="comment">// total cost</span></span><br><span class="line">            ...</span><br><span class="line">        );</span><br><span class="line">        pg_sys::<span class="title function_ invoke__">add_path</span>(baserel, path <span class="keyword">as</span> *<span class="keyword">mut</span> pg_sys::Path);</span><br><span class="line">    &#125;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><h3 id="3️⃣-GetForeignPlan"><a href="#3️⃣-GetForeignPlan" class="headerlink" title="3️⃣ GetForeignPlan"></a>3️⃣ <strong>GetForeignPlan</strong></h3><p>Generates the actual execution plan.</p><figure class="highlight rust"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">#[pg_guard]</span></span><br><span class="line"><span class="keyword">extern</span> <span class="string">&quot;C&quot;</span> <span class="keyword">fn</span> <span class="title function_">get_foreign_plan</span>(...) 
<span class="punctuation">-&gt;</span> *<span class="keyword">mut</span> pg_sys::ForeignScan &#123;</span><br><span class="line">    log!(<span class="string">&quot;&gt; get_foreign_plan&quot;</span>);</span><br><span class="line">    <span class="keyword">unsafe</span> &#123;</span><br><span class="line">        pg_sys::<span class="title function_ invoke__">make_foreignscan</span>(...)</span><br><span class="line">    &#125;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><h2 id="▶️-Execution-Callbacks"><a href="#▶️-Execution-Callbacks" class="headerlink" title="▶️ Execution Callbacks"></a>▶️ Execution Callbacks</h2><h3 id="🏁-BeginForeignScan"><a href="#🏁-BeginForeignScan" class="headerlink" title="🏁 BeginForeignScan"></a>🏁 <strong>BeginForeignScan</strong></h3><p>Initializes the scan.</p><figure class="highlight rust"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">#[pg_guard]</span></span><br><span class="line"><span class="keyword">extern</span> <span class="string">&quot;C&quot;</span> <span class="keyword">fn</span> <span class="title function_">begin_foreign_scan</span>(node: *<span class="keyword">mut</span> pg_sys::ForeignScanState, ...) 
&#123;</span><br><span class="line">    log!(<span class="string">&quot;&gt; begin_foreign_scan&quot;</span>);</span><br><span class="line">    <span class="keyword">unsafe</span> &#123;</span><br><span class="line">        <span class="keyword">let</span> <span class="variable">relid</span> = (*(*node).ss.ss_currentRelation).rd_id;</span><br><span class="line">        <span class="keyword">let</span> <span class="variable">options</span> = <span class="title function_ invoke__">get_foreign_table_options</span>(relid);</span><br><span class="line">        log!(<span class="string">&quot;Foreign table options: &#123;:?&#125;&quot;</span>, options);</span><br><span class="line"></span><br><span class="line">        <span class="keyword">let</span> <span class="variable">state</span> = PgMemoryContexts::CurrentMemoryContext</span><br><span class="line">            .<span class="title function_ invoke__">leak_and_drop_on_delete</span>(RedisFdwState &#123; row: <span class="number">0</span> &#125;);</span><br><span class="line"></span><br><span class="line">        (*node).fdw_state = state <span class="keyword">as</span> *<span class="keyword">mut</span> std::ffi::c_void;</span><br><span class="line">    &#125;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><h3 id="🔁-IterateForeignScan"><a href="#🔁-IterateForeignScan" class="headerlink" title="🔁 IterateForeignScan"></a>🔁 <strong>IterateForeignScan</strong></h3><p>Produces rows one at a time.</p><figure class="highlight rust"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span 
class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">#[pg_guard]</span></span><br><span class="line"><span class="keyword">extern</span> <span class="string">&quot;C&quot;</span> <span class="keyword">fn</span> <span class="title function_">iterate_foreign_scan</span>(node: *<span class="keyword">mut</span> pg_sys::ForeignScanState) <span class="punctuation">-&gt;</span> *<span class="keyword">mut</span> pg_sys::TupleTableSlot &#123;</span><br><span class="line">    log!(<span class="string">&quot;&gt; iterate_foreign_scan&quot;</span>);</span><br><span class="line"></span><br><span class="line">    <span class="keyword">unsafe</span> &#123;</span><br><span class="line">        <span class="keyword">let</span> <span class="variable">state</span> = &amp;<span class="keyword">mut</span> *((*node).fdw_state <span class="keyword">as</span> *<span class="keyword">mut</span> RedisFdwState);</span><br><span class="line">        <span class="keyword">let</span> <span class="variable">slot</span> = (*node).ss.ss_ScanTupleSlot;</span><br><span class="line">        <span 
class="keyword">let</span> <span class="variable">tupdesc</span> = (*slot).tts_tupleDescriptor;</span><br><span class="line">        <span class="keyword">let</span> <span class="variable">natts</span> = (*tupdesc).natts <span class="keyword">as</span> <span class="type">usize</span>;</span><br><span class="line"></span><br><span class="line">        <span class="keyword">if</span> state.row &gt;= <span class="number">5</span> &#123;</span><br><span class="line">            <span class="title function_ invoke__">exec_clear_tuple</span>(slot);</span><br><span class="line">            <span class="keyword">return</span> slot;</span><br><span class="line">        &#125;</span><br><span class="line"></span><br><span class="line">        <span class="title function_ invoke__">exec_clear_tuple</span>(slot);</span><br><span class="line"></span><br><span class="line">        <span class="keyword">let</span> <span class="variable">values_ptr</span> = PgMemoryContexts::<span class="title function_ invoke__">For</span>((*slot).tts_mcxt)</span><br><span class="line">            .<span class="title function_ invoke__">palloc</span>(std::mem::size_of::&lt;pg_sys::Datum&gt;() * natts) <span class="keyword">as</span> *<span class="keyword">mut</span> pg_sys::Datum;</span><br><span class="line"></span><br><span class="line">        <span class="keyword">let</span> <span class="variable">nulls_ptr</span> = PgMemoryContexts::<span class="title function_ invoke__">For</span>((*slot).tts_mcxt)</span><br><span class="line">            .<span class="title function_ invoke__">palloc</span>(std::mem::size_of::&lt;<span class="type">bool</span>&gt;() * natts) <span class="keyword">as</span> *<span class="keyword">mut</span> <span class="type">bool</span>;</span><br><span class="line"></span><br><span class="line">        *values_ptr.<span class="title function_ invoke__">add</span>(<span class="number">0</span>) = (state.row + <span class="number">1</span>).<span class="title function_ 
invoke__">into</span>();</span><br><span class="line">        <span class="keyword">let</span> <span class="variable">name</span> = <span class="built_in">format!</span>(<span class="string">&quot;hello_&#123;&#125;&quot;</span>, state.row + <span class="number">1</span>);</span><br><span class="line">        <span class="keyword">let</span> <span class="variable">cstring</span> = CString::<span class="title function_ invoke__">new</span>(name).<span class="title function_ invoke__">unwrap</span>();</span><br><span class="line">        *values_ptr.<span class="title function_ invoke__">add</span>(<span class="number">1</span>) = Datum::<span class="title function_ invoke__">from</span>(pg_sys::<span class="title function_ invoke__">cstring_to_text</span>(cstring.<span class="title function_ invoke__">as_ptr</span>()));</span><br><span class="line"></span><br><span class="line">        *nulls_ptr.<span class="title function_ invoke__">add</span>(<span class="number">0</span>) = <span class="literal">false</span>;</span><br><span class="line">        *nulls_ptr.<span class="title function_ invoke__">add</span>(<span class="number">1</span>) = <span class="literal">false</span>;</span><br><span class="line"></span><br><span class="line">        (*slot).tts_values = values_ptr;</span><br><span class="line">        (*slot).tts_isnull = nulls_ptr;</span><br><span class="line"></span><br><span class="line">        pg_sys::<span class="title function_ invoke__">ExecStoreVirtualTuple</span>(slot);</span><br><span class="line"></span><br><span class="line">        state.row += <span class="number">1</span>;</span><br><span class="line"></span><br><span class="line">        slot</span><br><span class="line">    &#125;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>This example emits five rows with <code>(id, name)</code> pairs like <code>(1, hello_1)</code>.</p><h3 id="🔄-ReScanForeignScan"><a href="#🔄-ReScanForeignScan" class="headerlink" 
title="🔄 ReScanForeignScan"></a>🔄 <strong>ReScanForeignScan</strong></h3><p>Handles rescan requests.</p><figure class="highlight rust"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">#[pg_guard]</span></span><br><span class="line"><span class="keyword">extern</span> <span class="string">&quot;C&quot;</span> <span class="keyword">fn</span> <span class="title function_">re_scan_foreign_scan</span>(_node: *<span class="keyword">mut</span> pg_sys::ForeignScanState) &#123;</span><br><span class="line">    log!(<span class="string">&quot;&gt; re_scan_foreign_scan&quot;</span>);</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><h3 id="🛑-EndForeignScan"><a href="#🛑-EndForeignScan" class="headerlink" title="🛑 EndForeignScan"></a>🛑 <strong>EndForeignScan</strong></h3><p>Frees resources.</p><figure class="highlight rust"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">#[pg_guard]</span></span><br><span class="line"><span class="keyword">extern</span> <span class="string">&quot;C&quot;</span> <span class="keyword">fn</span> <span class="title function_">end_foreign_scan</span>(node: *<span class="keyword">mut</span> pg_sys::ForeignScanState) &#123;</span><br><span class="line">    log!(<span class="string">&quot;&gt; end_foreign_scan&quot;</span>);</span><br><span class="line">    <span class="keyword">unsafe</span> &#123;</span><br><span class="line">        <span class="keyword">if</span> !(*node).fdw_state.<span class="title function_ 
invoke__">is_null</span>() &#123;</span><br><span class="line">            (*node).fdw_state = std::ptr::<span class="title function_ invoke__">null_mut</span>();</span><br><span class="line">        &#125;</span><br><span class="line">    &#125;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><h2 id="🗒️-Utilities"><a href="#🗒️-Utilities" class="headerlink" title="🗒️ Utilities"></a>🗒️ Utilities</h2><p>Tuple clearing is handled by this helper:</p><figure class="highlight rust"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">unsafe</span> <span class="keyword">fn</span> <span class="title function_">exec_clear_tuple</span>(slot: *<span class="keyword">mut</span> pg_sys::TupleTableSlot) &#123;</span><br><span class="line">    <span class="keyword">if</span> <span class="keyword">let</span> <span class="variable">Some</span>(clear) = (*(*slot).tts_ops).clear &#123;</span><br><span class="line">        <span class="title function_ invoke__">clear</span>(slot);</span><br><span class="line">    &#125;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><h2 id="🏁-Conclusion"><a href="#🏁-Conclusion" class="headerlink" title="🏁 Conclusion"></a>🏁 Conclusion</h2><p>This post walked you through the basics of building a PostgreSQL FDW using Rust and <code>pgrx</code>. 
While this example generates dummy data, the same structure can be extended to connect with real-world systems like Redis, REST APIs, or message queues.</p><h2 id="🚀-Next-Steps"><a href="#🚀-Next-Steps" class="headerlink" title="🚀 Next Steps"></a>🚀 Next Steps</h2><ul><li>Add connection logic to Redis or any backend.</li><li>Support <code>INSERT</code>, <code>UPDATE</code>, <code>DELETE</code> by implementing the modification callbacks.</li><li>Package and distribute as a PostgreSQL extension.</li></ul><h2 id="📚-References"><a href="#📚-References" class="headerlink" title="📚 References"></a>📚 References</h2><ul><li><a href="https://github.com/pgcentralfoundation/pgrx">pgrx GitHub</a></li><li><a href="https://www.postgresql.org/docs/current/fdw-callbacks.html">PostgreSQL FDW API Documentation</a></li></ul><p><strong>Author</strong>: Daniel Shih(石頭)<br /><strong>Permalink</strong>: <a href="https://isdaniel.github.io/rust-pgrx-extension-fdw/">https://isdaniel.github.io/rust-pgrx-extension-fdw/</a> <br /><strong>License</strong>: Unless otherwise noted, all posts on this blog are licensed under <a href="https://creativecommons.org/licenses/by-nc-sa/3.0/tw/">CC BY-NC-SA 3.0 TW</a>. Please credit the source when republishing!</p>]]>
    </content>
    <id>https://isdaniel.github.io/rust-pgrx-extension-fdw/</id>
    <link href="https://isdaniel.github.io/rust-pgrx-extension-fdw/"/>
    <published>2025-06-30T22:30:11.000Z</published>
    <summary>A step-by-step walkthrough of building a minimal PostgreSQL Foreign Data Wrapper (FDW) in Rust with pgrx, implementing the foreign-scan callbacks and returning dummy rows in a structure that can be extended to real backends such as Redis.</summary>
    <title>Building a PostgreSQL Foreign Data Wrapper (FDW) in Rust with pgrx</title>
    <updated>2026-04-22T03:00:22.031Z</updated>
  </entry>
  <entry>
    <author>
      <name>Daniel Shih</name>
    </author>
    <category term="Python" scheme="https://isdaniel.github.io/categories/Python/"/>
    <category term="Python" scheme="https://isdaniel.github.io/tags/Python/"/>
    <category term="AI" scheme="https://isdaniel.github.io/tags/AI/"/>
    <category term="MCP" scheme="https://isdaniel.github.io/tags/MCP/"/>
    <content>
      <![CDATA[<h2 id="什麼是-MCP"><a href="#什麼是-MCP" class="headerlink" title="What is MCP?"></a>What is MCP?</h2><p>MCP (Model Context Protocol) is a protocol for communication and collaboration between tools. With MCP, independent tools (such as models, plugins, and services) can exchange data and commands with one another in a consistent format. An MCP server is a server-side program that provides specific functionality and interacts with MCP-capable front ends.</p><h2 id="Weather-MCP-Server-是什麼？"><a href="#Weather-MCP-Server-是什麼？" class="headerlink" title="What is the Weather MCP Server?"></a>What is the Weather MCP Server?</h2><p>The Weather MCP Server is a weather-information server built on the MCP protocol, using the <a href="https://open-meteo.com/">Open-Meteo API</a> to provide free weather data. Through this server you can query:</p><ul><li>The current weather for a given city</li><li>The weather forecast for a city over a given date range</li><li>The current time in a specified time zone</li></ul><p><a href="https://github.com/isdaniel/mcp_weather_server">mcp_weather_server source code</a><br><a href="https://smithery.ai/server/@isdaniel/mcp_weather_server">smithery AI</a></p><p>Pairing this MCP server with an AI model makes it easy to build a real-time weather assistant, as my AI bot below shows.</p><p><img src="/../images/20096630aZiRf1dO1g.png"><br><img src="/../images/20096630ecDVbg8JZL.png"></p><h2 id="功能特色"><a href="#功能特色" class="headerlink" title="Features"></a>Features</h2><ul><li>Query the current weather for a specified city</li><li>Query the weather forecast for a specified date range</li><li>Query the current time (with time zone support)</li></ul><h2 id="安裝方式"><a href="#安裝方式" class="headerlink" title="Installation"></a>Installation</h2><p>Install with pip:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">pip install mcp_weather_server</span><br></pre></td></tr></table></figure><p>Next, manually add the Weather Server's launch settings to your MCP configuration file.</p><h3 id="設定-cline-mcp-settings-json"><a href="#設定-cline-mcp-settings-json" class="headerlink" title="Configuring cline_mcp_settings.json"></a>Configuring <code>cline_mcp_settings.json</code></h3><p>Add the following to the <code>mcpServers</code> block of your <code>cline_mcp_settings.json</code> file:</p><figure class="highlight json"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br></pre></td><td class="code"><pre><span class="line"><span class="punctuation">&#123;</span></span><br><span class="line">  <span class="attr">&quot;mcpServers&quot;</span><span class="punctuation">:</span> <span class="punctuation">&#123;</span></span><br><span class="line">    <span class="attr">&quot;weather&quot;</span><span class="punctuation">:</span> <span class="punctuation">&#123;</span></span><br><span class="line">      <span class="attr">&quot;command&quot;</span><span class="punctuation">:</span> <span class="string">&quot;python&quot;</span><span class="punctuation">,</span></span><br><span class="line">      <span class="attr">&quot;args&quot;</span><span class="punctuation">:</span> <span class="punctuation">[</span></span><br><span class="line">        <span class="string">&quot;-m&quot;</span><span class="punctuation">,</span></span><br><span class="line">        <span class="string">&quot;mcp_weather_server&quot;</span></span><br><span class="line">      <span class="punctuation">]</span><span class="punctuation">,</span></span><br><span class="line">      <span class="attr">&quot;disabled&quot;</span><span class="punctuation">:</span> <span class="literal"><span class="keyword">false</span></span><span class="punctuation">,</span></span><br><span class="line">      <span class="attr">&quot;autoApprove&quot;</span><span class="punctuation">:</span> <span class="punctuation">[</span><span class="punctuation">]</span></span><br><span class="line">    <span class="punctuation">&#125;</span></span><br><span class="line">  <span class="punctuation">&#125;</span></span><br><span class="line"><span class="punctuation">&#125;</span></span><br></pre></td></tr></table></figure><p>Once saved, the Weather Server can be started and used within the MCP framework.</p><h2 id="使用方式"><a href="#使用方式" class="headerlink" title="Usage"></a>Usage</h2><p>The Weather MCP Server provides the following three tools:</p><h3 id="1-get-weather：查詢指定城市目前天氣"><a href="#1-get-weather：查詢指定城市目前天氣" class="headerlink" title="1. get_weather: query the current weather for a city"></a>1. <code>get_weather</code>: query the current weather for a city</h3><p><strong>Parameters:</strong></p><ul><li><code>city</code> (string, required): the city name, e.g. “Taipei”</li></ul><p><strong>Example:</strong></p><figure class="highlight xml"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line"><span class="tag">&lt;<span class="name">use_mcp_tool</span>&gt;</span></span><br><span class="line"><span class="tag">&lt;<span class="name">server_name</span>&gt;</span>weather<span class="tag">&lt;/<span class="name">server_name</span>&gt;</span></span><br><span class="line"><span class="tag">&lt;<span class="name">tool_name</span>&gt;</span>get_weather<span class="tag">&lt;/<span class="name">tool_name</span>&gt;</span></span><br><span class="line"><span class="tag">&lt;<span class="name">arguments</span>&gt;</span></span><br><span class="line">&#123;</span><br><span class="line">  &quot;city&quot;: &quot;Taipei&quot;</span><br><span class="line">&#125;</span><br><span class="line"><span class="tag">&lt;/<span class="name">arguments</span>&gt;</span></span><br><span class="line"><span class="tag">&lt;/<span class="name">use_mcp_tool</span>&gt;</span></span><br></pre></td></tr></table></figure><h3 id="2-get-weather-by-datetime-range：查詢日期區間的天氣預報"><a href="#2-get-weather-by-datetime-range：查詢日期區間的天氣預報" class="headerlink" title="2. get_weather_by_datetime_range: query the forecast for a date range"></a>2. <code>get_weather_by_datetime_range</code>: query the forecast for a date range</h3><p><strong>Parameters:</strong></p><ul><li><code>city</code> (string, required): the city name</li><li><code>start_date</code> (string, required): the start date, in YYYY-MM-DD format</li><li><code>end_date</code> (string, required): the end date, in YYYY-MM-DD format</li></ul><p><strong>Example:</strong></p><figure class="highlight xml"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br></pre></td><td class="code"><pre><span class="line"><span class="tag">&lt;<span class="name">use_mcp_tool</span>&gt;</span></span><br><span class="line"><span class="tag">&lt;<span class="name">server_name</span>&gt;</span>weather<span class="tag">&lt;/<span class="name">server_name</span>&gt;</span></span><br><span class="line"><span class="tag">&lt;<span class="name">tool_name</span>&gt;</span>get_weather_by_datetime_range<span class="tag">&lt;/<span class="name">tool_name</span>&gt;</span></span><br><span class="line"><span class="tag">&lt;<span class="name">arguments</span>&gt;</span></span><br><span class="line">&#123;</span><br><span class="line">  &quot;city&quot;: &quot;London&quot;,</span><br><span class="line">  &quot;start_date&quot;: &quot;2024-01-01&quot;,</span><br><span class="line">  &quot;end_date&quot;: &quot;2024-01-07&quot;</span><br><span class="line">&#125;</span><br><span class="line"><span class="tag">&lt;/<span class="name">arguments</span>&gt;</span></span><br><span class="line"><span class="tag">&lt;/<span class="name">use_mcp_tool</span>&gt;</span></span><br></pre></td></tr></table></figure><h3 id="3-get-current-datetime：查詢指定時區目前時間"><a href="#3-get-current-datetime：查詢指定時區目前時間" class="headerlink" title="3. get_current_datetime: query the current time in a given time zone"></a>3. <code>get_current_datetime</code>: query the current time in a given time zone</h3><p><strong>Parameters:</strong></p><ul><li><code>timezone_name</code> (string, required): an IANA time zone name, e.g. “America&#x2F;New_York” or ”Europe&#x2F;London”. Defaults to UTC if not specified.</li></ul><p><strong>Example:</strong></p><figure class="highlight xml"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line"><span class="tag">&lt;<span class="name">use_mcp_tool</span>&gt;</span></span><br><span class="line"><span class="tag">&lt;<span class="name">server_name</span>&gt;</span>weather<span class="tag">&lt;/<span class="name">server_name</span>&gt;</span></span><br><span class="line"><span class="tag">&lt;<span class="name">tool_name</span>&gt;</span>get_current_datetime<span class="tag">&lt;/<span class="name">tool_name</span>&gt;</span></span><br><span class="line"><span class="tag">&lt;<span class="name">arguments</span>&gt;</span></span><br><span class="line">&#123;</span><br><span class="line">  &quot;timezone_name&quot;: &quot;America/New_York&quot;</span><br><span class="line">&#125;</span><br><span class="line"><span class="tag">&lt;/<span class="name">arguments</span>&gt;</span></span><br><span class="line"><span class="tag">&lt;/<span class="name">use_mcp_tool</span>&gt;</span></span><br></pre></td></tr></table></figure><h2 id="開發者注意事項"><a href="#開發者注意事項" class="headerlink" title="Notes for developers"></a>Notes for developers</h2><p>To run the Weather MCP Server manually during development or debugging, execute:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">python -m mcp_weather_server</span><br></pre></td></tr></table></figure><h2 id="結語"><a href="#結語" class="headerlink" title="Conclusion"></a>Conclusion</h2><p>The Weather MCP Server is a lightweight weather-information service that requires no API key, making it a good fit for education, research, and prototyping. Thanks to MCP integration, weather lookups can easily be added to all kinds of automated or intelligent applications.</p><p><strong>Author</strong>: Daniel Shih(石頭)<br /><strong>Permalink</strong>: <a href="https://isdaniel.github.io/mcp-server-weather/">https://isdaniel.github.io/mcp-server-weather/</a> <br /><strong>License</strong>: Unless otherwise noted, all posts on this blog are licensed under <a href="https://creativecommons.org/licenses/by-nc-sa/3.0/tw/">CC BY-NC-SA 3.0 TW</a>. Please credit the source when republishing!</p>]]>
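All three tools ultimately read from the Open-Meteo API. As a rough illustration of what a current-weather lookup involves, here is a minimal Python sketch that queries the public Open-Meteo forecast endpoint directly with only the standard library; the endpoint and its parameters are assumptions taken from Open-Meteo's public documentation, not the server's actual implementation.

```python
import json
import urllib.parse
import urllib.request

# Assumed public Open-Meteo forecast endpoint (from Open-Meteo's docs,
# not from the mcp_weather_server source).
OPEN_METEO_URL = "https://api.open-meteo.com/v1/forecast"

def build_forecast_url(latitude: float, longitude: float) -> str:
    # urlencode assembles the query string for us.
    query = urllib.parse.urlencode({
        "latitude": latitude,
        "longitude": longitude,
        "current_weather": "true",
    })
    return OPEN_METEO_URL + "?" + query

def fetch_current_weather(latitude: float, longitude: float) -> dict:
    # Performs a real HTTP request; requires network access.
    with urllib.request.urlopen(build_forecast_url(latitude, longitude)) as resp:
        return json.load(resp)["current_weather"]
```

In the real server, a city name would first have to be resolved to coordinates through a geocoding step.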
    </content>
    <id>https://isdaniel.github.io/mcp-server-weather/</id>
    <link href="https://isdaniel.github.io/mcp-server-weather/"/>
    <published>2025-06-22T20:42:00.000Z</published>
    <summary>MCP (Model Context Protocol) is a protocol for communication and collaboration between tools. With MCP, independent tools (such as models, plugins, and services) can exchange data and commands in a consistent format. An MCP server is a server-side program that provides specific functionality and interacts with MCP-capable front ends.</summary>
    <title>Using the Model Context Protocol: A Weather MCP Server Example</title>
    <updated>2026-04-22T03:00:22.028Z</updated>
  </entry>
  <entry>
    <author>
      <name>Daniel Shih</name>
    </author>
    <category term="C#" scheme="https://isdaniel.github.io/categories/C/"/>
    <category term="OOP" scheme="https://isdaniel.github.io/categories/C/OOP/"/>
    <category term="framework" scheme="https://isdaniel.github.io/categories/C/OOP/framework/"/>
    <category term="C#" scheme="https://isdaniel.github.io/tags/C/"/>
    <category term="OOP" scheme="https://isdaniel.github.io/tags/OOP/"/>
    <category term="Process" scheme="https://isdaniel.github.io/tags/Process/"/>
    <category term="workerpool" scheme="https://isdaniel.github.io/tags/workerpool/"/>
    <category term="message-queue" scheme="https://isdaniel.github.io/tags/message-queue/"/>
    <category term="open-source" scheme="https://isdaniel.github.io/tags/open-source/"/>
    <content>
      <![CDATA[<h2 id="簡介"><a href="#簡介" class="headerlink" title="簡介"></a>簡介</h2><p>最近我開發了 <code>MessageWorkerPool</code> 專案。其主要概念是提供一個平台框架，使使用者能夠快速且輕鬆地在 <code>Worker</code> 內實作邏輯。該設計高度靈活，允許基於我創建的 Worker 通訊協議，以多種程式語言實作 <code>Worker</code>。目前，我已提供使用 C#、Rust 和 Python 編寫的 Worker 範例。</p><p>這個函式庫在多進程環境中處理任務表現優異。此外，它還支援優雅關閉 (graceful shutdown)，確保在隨時 consumer worker 能順利終止處理程序。</p><p><a href="https://github.com/isdaniel/MessageWorkerPool">MessageWorkerPool GitHub</a></p><h2 id="為什麼選擇-ProcessPool-而非-ThreadPool"><a href="#為什麼選擇-ProcessPool-而非-ThreadPool" class="headerlink" title="為什麼選擇 ProcessPool 而非 ThreadPool?"></a>為什麼選擇 ProcessPool 而非 ThreadPool?</h2><p>當你需要強大的隔離性，以防止某個任務影響其他任務時，應該選擇 ProcessPool，特別是針對關鍵操作或容易崩潰的任務。雖然 ThreadPool 較為輕量（因為執行緒共用記憶體並且具有較低的上下文切換開銷），但 ProcessPool 能夠提供更靈活的解決方案，允許使用不同的程式語言來實作 Worker。</p><h2 id="安裝"><a href="#安裝" class="headerlink" title="安裝"></a>安裝</h2><p>要安裝 <code>MessageWorkerPool</code> 套件，請使用以下 NuGet 指令：</p><figure class="highlight sh"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">PM &gt; Install-Package MessageWorkerPool</span><br></pre></td></tr></table></figure><p>若要手動安裝此函式庫，可克隆儲存庫並建置專案：</p><figure class="highlight sh"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line">git <span class="built_in">clone</span> https://github.com/isdaniel/MessageWorkerPool.git</span><br><span class="line"><span class="built_in">cd</span> MessageWorkerPool</span><br><span class="line">dotnet build</span><br></pre></td></tr></table></figure><h2 id="架構概覽"><a href="#架構概覽" class="headerlink" title="架構概覽"></a>架構概覽</h2><p><img src="https://raw.githubusercontent.com/isdaniel/MessageWorkerPool/refs/heads/main/images/arhc-overview.png"></p><h2 id="快速開始"><a href="#快速開始" class="headerlink" title="快速開始"></a>快速開始</h2><p>這是部署 RabbitMQ 和相關服務的快速開始指南，使用提供的 docker-compose.yml 檔案和 
.env 中的環境變數。</p><figure class="highlight mel"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">docker-compose --<span class="keyword">env</span>-<span class="keyword">file</span> .\<span class="keyword">env</span>\.<span class="keyword">env</span> up --build -d</span><br></pre></td></tr></table></figure><ol><li>檢查 RabbitMQ 健康狀態：在瀏覽器中開啟 <a href="http://localhost:8888/">http://localhost:8888</a> 以訪問 RabbitMQ 管理面板。</li></ol><ul><li>使用者名稱: guest</li><li>密碼: guest</li></ul><ol start="2"><li>檢查 OrleansDashboard <a href="http://localhost:8899/">http://localhost:8899</a></li></ol><ul><li>使用者名稱: admin</li><li>密碼: test.123</li></ul><h2 id="程式結構"><a href="#程式結構" class="headerlink" title="程式結構"></a>程式結構</h2><p>以下是創建並配置與 RabbitMQ 互動的 workerpool 的範例程式碼。以下是其功能的解析：workerpool 將根據您的 RabbitMqSetting 設定從 RabbitMQ 伺服器獲取訊息，並通過 Process.StandardInput 將訊息傳遞給用戶創建的真實 worker node</p><figure class="highlight c#"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br></pre></td><td class="code"><pre><span 
class="line"><span class="keyword">public</span> <span class="keyword">class</span> <span class="title">Program</span></span><br><span class="line">&#123;</span><br><span class="line">    <span class="function"><span class="keyword">public</span> <span class="keyword">static</span> <span class="keyword">async</span> Task <span class="title">Main</span>(<span class="params"><span class="built_in">string</span>[] <span class="keyword">args</span></span>)</span></span><br><span class="line">    &#123;</span><br><span class="line">        CreateHostBuilder(<span class="keyword">args</span>).Build().Run();</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    <span class="function"><span class="keyword">public</span> <span class="keyword">static</span> IHostBuilder <span class="title">CreateHostBuilder</span>(<span class="params"><span class="built_in">string</span>[] <span class="keyword">args</span></span>)</span> =&gt;</span><br><span class="line">        Host.CreateDefaultBuilder(<span class="keyword">args</span>)</span><br><span class="line">            .ConfigureLogging(logging =&gt;</span><br><span class="line">            &#123;</span><br><span class="line">                logging.ClearProviders();</span><br><span class="line">                logging.AddConsole(options =&gt; &#123;</span><br><span class="line">                    options.FormatterName = ConsoleFormatterNames.Simple;</span><br><span class="line">                &#125;);</span><br><span class="line">                logging.Services.Configure&lt;SimpleConsoleFormatterOptions&gt;(options =&gt; &#123;</span><br><span class="line">                    options.IncludeScopes = <span class="literal">true</span>;</span><br><span class="line">                    options.TimestampFormat = <span class="string">&quot; yyyy-MM-dd HH:mm:ss &quot;</span>;</span><br><span class="line">                &#125;);</span><br><span class="line">            
&#125;).AddRabbitMqWorkerPool(<span class="keyword">new</span> RabbitMqSetting</span><br><span class="line">            &#123;</span><br><span class="line">                UserName = Environment.GetEnvironmentVariable(<span class="string">&quot;USERNAME&quot;</span>) ?? <span class="string">&quot;guest&quot;</span>,</span><br><span class="line">                Password = Environment.GetEnvironmentVariable(<span class="string">&quot;PASSWORD&quot;</span>) ?? <span class="string">&quot;guest&quot;</span>,</span><br><span class="line">                HostName = Environment.GetEnvironmentVariable(<span class="string">&quot;RABBITMQ_HOSTNAME&quot;</span>),</span><br><span class="line">                Port = <span class="built_in">ushort</span>.TryParse(Environment.GetEnvironmentVariable(<span class="string">&quot;RABBITMQ_PORT&quot;</span>), <span class="keyword">out</span> <span class="built_in">ushort</span> p) ? p : (<span class="built_in">ushort</span>) <span class="number">5672</span>,</span><br><span class="line">                PrefetchTaskCount = <span class="number">3</span></span><br><span class="line">            &#125;, <span class="keyword">new</span> WorkerPoolSetting() &#123; WorkerUnitCount = <span class="number">9</span>, CommandLine = <span class="string">&quot;dotnet&quot;</span>, Arguments = <span class="string">@&quot;./ProcessBin/WorkerProcessSample.dll&quot;</span>, QueueName = Environment.GetEnvironmentVariable(<span class="string">&quot;QUEUENAME&quot;</span>), &#125;</span><br><span class="line">            );</span><br><span class="line"></span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><h2 id="worker-process-與-workerPool-之間的協議"><a href="#worker-process-與-workerPool-之間的協議" class="headerlink" title="worker process 與 workerPool 之間的協議"></a>worker process 與 workerPool 之間的協議</h2><p>worker node 與任務進程之間的協議使用 MessagePack 二進制格式來進行更快且更小的資料傳輸，標準輸入將發送信號來控制 worker process。</p><p>一開始 workerPool 將通過標準輸入傳遞 NamedPipe 名稱，因此 worker 
node 需要接收該名稱並建立 worker process 和 workerPool 之間的 NamedPipe。</p><h3 id="workerPool-發送的操作指令"><a href="#workerPool-發送的操作指令" class="headerlink" title="workerPool 發送的操作指令"></a>workerPool 發送的操作指令</h3><p>目前，workerPool將通過標準輸入向 worker process 發送操作信號或指令。</p><ul><li>CLOSED_SIGNAL (<code>__quit__</code>): 代表 workerPool 發送關閉或關機信號給 worker node，worker process 應盡快執行優雅關機。<br>通過 (Data Named Pipe Stream) 進行資料傳輸<br>命名管道是一種強大的進程間通信 (IPC) 機制，它允許兩個或更多的進程之間進行通信，即使它們運行在不同的機器上（例如 Windows 等支持的平台）。我們的 worker 使用此方式在 worker node 與 workerPool 之間傳輸資料。</li></ul><p>msgpack 協議支持的資料類型如下類別與 byte[] 格式。</p><p>對應的 byte[] 資料是：</p><figure class="highlight dns"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">[<span class="number">132,161,48,179</span>,<span class="number">78,101,119,32</span>,<span class="number">79,117,116,80</span>,<span class="number">117,116,32,77</span>,<span class="number">101,115,115,97</span>,<span class="number">103,101,33,161</span>,<span class="number">49,204,200,161</span>,<span class="number">50,129,164,116</span>,<span class="number">101,115,116,167</span>,<span class="number">116,101,115,116</span>,<span class="number">118,97,108,161</span>,<span class="number">51,169,116,101</span>,<span class="number">115,116,81,117</span>,<span class="number">101,117,101</span>]</span><br><span class="line"></span><br></pre></td></tr></table></figure><p>要將提供的偽 JSON 結構表示為 <code>MsgPack</code> 格式（byte[]），我們可以分解過程如下：</p><figure class="highlight json"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line">Edit</span><br><span class="line"><span 
class="punctuation">&#123;</span></span><br><span class="line">    <span class="attr">&quot;0&quot;</span><span class="punctuation">:</span> <span class="string">&quot;New OutPut Message!&quot;</span><span class="punctuation">,</span></span><br><span class="line">    <span class="attr">&quot;1&quot;</span><span class="punctuation">:</span> <span class="number">200</span><span class="punctuation">,</span></span><br><span class="line">    <span class="attr">&quot;2&quot;</span><span class="punctuation">:</span> <span class="punctuation">&#123;</span></span><br><span class="line">        <span class="attr">&quot;test&quot;</span><span class="punctuation">:</span> <span class="string">&quot;testval&quot;</span></span><br><span class="line">    <span class="punctuation">&#125;</span><span class="punctuation">,</span></span><br><span class="line">    <span class="attr">&quot;3&quot;</span><span class="punctuation">:</span> <span class="string">&quot;testQueue&quot;</span></span><br><span class="line"><span class="punctuation">&#125;</span></span><br></pre></td></tr></table></figure><p>更多資訊，您可以使用 <a href="https://ref45638.github.io/msgpack-converter/">msgpack-converter</a> 來解碼和編碼。</p><figure class="highlight c#"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span 
class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br></pre></td><td class="code"><pre><span class="line"> <span class="comment"><span class="doctag">///</span> <span class="doctag">&lt;summary&gt;</span></span></span><br><span class="line"><span class="comment"><span class="doctag">///</span> 封裝來自 MQ 服務的訊息</span></span><br><span class="line"><span class="comment"><span class="doctag">///</span> <span class="doctag">&lt;/summary&gt;</span></span></span><br><span class="line">[<span class="meta">MessagePackObject</span>]</span><br><span class="line"><span class="keyword">public</span> <span class="keyword">class</span> <span class="title">MessageOutputTask</span></span><br><span class="line">&#123;</span><br><span class="line">   <span class="comment"><span class="doctag">///</span> <span class="doctag">&lt;summary&gt;</span></span></span><br><span class="line">   <span class="comment"><span class="doctag">///</span> 來自進程的輸出訊息</span></span><br><span class="line">   <span class="comment"><span class="doctag">///</span> <span class="doctag">&lt;/summary&gt;</span></span></span><br><span class="line">   [<span class="meta">Key(<span class="string">&quot;0&quot;</span>)</span>]</span><br><span class="line">   <span class="keyword">public</span> <span class="built_in">string</span> Message &#123; <span class="keyword">get</span>; <span class="keyword">set</span>; &#125;</span><br><span class="line">   [<span class="meta">Key(<span class="string">&quot;1&quot;</span>)</span>]</span><br><span class="line">   <span class="keyword">public</span> MessageStatus Status &#123; <span class="keyword">get</span>; <span class="keyword">set</span>; &#125;</span><br><span class="line">   <span class="comment"><span class="doctag">///</span> <span class="doctag">&lt;summary&gt;</span></span></span><br><span class="line">   <span class="comment"><span class="doctag">///</span> 我們希望儲存的回應資訊以便繼續執行訊息。</span></span><br><span class="line">   <span 
class="comment"><span class="doctag">///</span> <span class="doctag">&lt;/summary&gt;</span></span></span><br><span class="line">   [<span class="meta">Key(<span class="string">&quot;2&quot;</span>)</span>]</span><br><span class="line">   [<span class="meta">MessagePackFormatter(typeof(PrimitiveObjectResolver))</span>]</span><br><span class="line">   <span class="keyword">public</span> IDictionary&lt;<span class="built_in">string</span>, <span class="built_in">object</span>&gt; Headers &#123; <span class="keyword">get</span>; <span class="keyword">set</span>; &#125;</span><br><span class="line">   <span class="comment"><span class="doctag">///</span> <span class="doctag">&lt;summary&gt;</span></span></span><br><span class="line">   <span class="comment"><span class="doctag">///</span> Uses the BasicProperties.ReplyTo queue name by default; the task handler can override the reply queue name.</span></span><br><span class="line">   <span class="comment"><span class="doctag">///</span> <span class="doctag">&lt;/summary&gt;</span></span></span><br><span class="line">   <span class="comment"><span class="doctag">///</span> <span class="doctag">&lt;value&gt;</span>Defaults to BasicProperties.Reply<span class="doctag">&lt;/value&gt;</span></span></span><br><span class="line">   [<span class="meta">Key(<span class="string">&quot;3&quot;</span>)</span>]</span><br><span class="line">   <span class="keyword">public</span> <span class="built_in">string</span> ReplyQueueName &#123; <span class="keyword">get</span>; <span class="keyword">set</span>; &#125;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>Here I will explain the meaning of each MessageStatus value.</p><ul><li>IGNORE_MESSAGE (-1) : Appends the message to the data pipeline without any further processing.<ul><li>Status &#x3D; -1: the task handler tells the worker process that this is not a reply or acknowledge message, only feedback into the data pipeline.</li></ul></li><li>MESSAGE_DONE (200) : Notifies the worker process that the message can be acknowledged by the message queue service.<ul><li>Status &#x3D; 200: the task handler tells the worker process that the task is done and can be acknowledged.</li></ul></li><li>MESSAGE_DONE_WITH_REPLY (201) : Make sure the following steps are satisfied to support RPC.<ul><li>The client code must provide ReplyTo information.</li><li>The task handler will use the Message field in the JSON payload to reply to the queue.</li><li>For example: when Status &#x3D; 201 is sent through the data pipeline, the task handler instructs the worker process that its output (e.g. 1010) must then be sent to the reply queue.</li></ul></li></ul><p>We can write our own worker node in different programming languages (I have already provided Python, .NET, and Rust example code on GitHub).</p><h2 id="如何處理長時間運行的任務或涉及處理大量數據行的任務？"><a href="#如何處理長時間運行的任務或涉及處理大量數據行的任務？" class="headerlink" title="How do we handle long-running tasks or tasks that process large numbers of data rows?"></a>How do we handle long-running tasks or tasks that process large numbers of data rows?</h2><p>This is similar to how processes in an operating system undergo context switches (interrupts, etc.).</p><p>The client can send a <code>TimeoutMilliseconds</code> value via the headers: the time to wait, in milliseconds, before cancelling. If task execution exceeds this value, the worker process can use it to set up an interrupt, for example with a CancellationToken.</p><p>For example, the JSON for a <code>MessageOutputTask</code> can look like the following; <code>status&#x3D;201</code> means this message will be requeued for the next round of processing, and the message will carry its <code>Headers</code> information when it is requeued.</p><figure class="highlight json"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br></pre></td><td class="code"><pre><span class="line"><span class="punctuation">&#123;</span></span><br><span class="line">  <span class="attr">&quot;Message&quot;</span><span class="punctuation">:</span> <span class="string">&quot;This is Mock Json Data&quot;</span><span class="punctuation">,</span></span><br><span class="line">  <span class="attr">&quot;Status&quot;</span><span class="punctuation">:</span> <span class="number">201</span><span class="punctuation">,</span></span><br><span class="line">  <span class="attr">&quot;Headers&quot;</span><span class="punctuation">:</span> <span class="punctuation">&#123;</span></span><br><span class="line">    <span class="attr">&quot;CreateTimestamp&quot;</span><span class="punctuation">:</span> <span class="string">&quot;2025-01-01T14:35:00Z&quot;</span><span class="punctuation">,</span></span><br><span class="line">    <span 
class="attr">&quot;PreviousProcessingTimestamp&quot;</span><span class="punctuation">:</span> <span class="string">&quot;2025-01-01T14:40:00Z&quot;</span><span class="punctuation">,</span></span><br><span class="line"><span class="attr">&quot;Source&quot;</span><span class="punctuation">:</span> <span class="string">&quot;OrderProcessingService&quot;</span><span class="punctuation">,</span></span><br><span class="line">    <span class="attr">&quot;PreviousExecutedRows&quot;</span><span class="punctuation">:</span> <span class="string">&quot;123&quot;</span><span class="punctuation">,</span></span><br><span class="line">    <span class="attr">&quot;RequeueTimes&quot;</span><span class="punctuation">:</span> <span class="string">&quot;3&quot;</span></span><br><span class="line">  <span class="punctuation">&#125;</span></span><br><span class="line"><span class="punctuation">&#125;</span></span><br></pre></td></tr></table></figure><p>此專案還包括 integration、unit test 和 github action pipeline。雖然 API 文件（專案仍在 beta 階段），但我計劃在未來逐步添加。如果您對此專案有任何想法或建議，請隨時創建問題或發送 PR。</p><p><strong>此文作者</strong>：Daniel Shih(石頭)<br /><strong>此文地址</strong>： <a href="https://isdaniel.github.io/mq-workerpool-introduction/">https://isdaniel.github.io/mq-workerpool-introduction/</a> <br /><strong>版權聲明</strong>：本博客所有文章除特別聲明外，均採用 <a href="https://creativecommons.org/licenses/by-nc-sa/3.0/tw/">CC BY-NC-SA 3.0 TW</a> 許可協議。轉載請註明出處！</p>]]>
    </content>
    <id>https://isdaniel.github.io/mq-workerpool-introduction/</id>
    <link href="https://isdaniel.github.io/mq-workerpool-introduction/"/>
    <published>2025-06-02T16:00:00.000Z</published>
    <summary>I recently developed the MessageWorkerPool project. Its core idea is to provide a platform framework that lets users implement their logic inside a Worker quickly and easily. The design is highly flexible, allowing Workers to be implemented in multiple programming languages based on the Worker communication protocol I created. Currently, I have provided Worker examples written in C#, Rust, and Python.</summary>
    <title>MessageWorkerPool framework introduction</title>
    <updated>2026-04-22T03:00:22.029Z</updated>
  </entry>
  <entry>
    <author>
      <name>Daniel Shih</name>
    </author>
    <category term="Rust" scheme="https://isdaniel.github.io/categories/Rust/"/>
    <category term="Linux" scheme="https://isdaniel.github.io/categories/Rust/Linux/"/>
    <category term="Rust" scheme="https://isdaniel.github.io/tags/Rust/"/>
    <category term="Linux" scheme="https://isdaniel.github.io/tags/Linux/"/>
    <category term="docker" scheme="https://isdaniel.github.io/tags/docker/"/>
    <content>
      <![CDATA[<h1 id="RustBox"><a href="#RustBox" class="headerlink" title="RustBox"></a>RustBox</h1><blockquote><p>A Docker-like container runtime written in Rust with daemon architecture, supporting multi-container orchestration, persistent state management, and comprehensive CLI commands.</p></blockquote><h2 id="Overview"><a href="#Overview" class="headerlink" title="Overview"></a>Overview</h2><p><strong>RustBox</strong> is a container runtime that isn’t trying to compete with Docker or Kubernetes. Instead, it goes back to the core and builds the simplest possible “sandbox&#x2F;isolated runtime environment” from low-level Linux kernel mechanisms (namespaces, cgroups, OverlayFS, etc.), providing Docker-like functionality using:</p><ul><li><strong>Daemon Architecture</strong> with Unix domain socket communication</li><li><strong>Multi-container Management</strong> with persistent state</li><li><strong>OverlayFS</strong> for isolated container filesystems</li><li><strong>Cgroups v2</strong> for resource limits (memory, CPU)</li><li><strong>Linux namespaces</strong> for complete process isolation</li><li><strong>Comprehensive CLI</strong> with run, stop, list, inspect, remove, logs, and attach commands</li></ul><p>This tool is designed for container orchestration, testing environments, and secure code execution.</p><h2 id="Architecture"><a href="#Architecture" class="headerlink" title="Architecture"></a>Architecture</h2><h3 id="Daemon-Client-Model"><a href="#Daemon-Client-Model" class="headerlink" title="Daemon-Client Model"></a>Daemon-Client Model</h3><figure class="highlight axapta"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span 
class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br><span class="line">62</span><br><span class="line">63</span><br><span class="line">64</span><br><span class="line">65</span><br><span class="line">66</span><br><span class="line">67</span><br><span class="line">68</span><br><span class="line">69</span><br><span class="line">70</span><br><span class="line">71</span><br><span class="line">72</span><br><span class="line">73</span><br><span 
class="line">74</span><br><span class="line">75</span><br><span class="line">76</span><br><span class="line">77</span><br><span class="line">78</span><br><span class="line">79</span><br><span class="line">80</span><br><span class="line">81</span><br><span class="line">82</span><br><span class="line">83</span><br><span class="line">84</span><br><span class="line">85</span><br><span class="line">86</span><br><span class="line">87</span><br><span class="line">88</span><br><span class="line">89</span><br><span class="line">90</span><br><span class="line">91</span><br><span class="line">92</span><br><span class="line">93</span><br><span class="line">94</span><br><span class="line">95</span><br><span class="line">96</span><br><span class="line">97</span><br><span class="line">98</span><br><span class="line">99</span><br><span class="line">100</span><br><span class="line">101</span><br><span class="line">102</span><br></pre></td><td class="code"><pre><span class="line">┌──────────────────────────────────────────────────────────────────────────────┐</span><br><span class="line">│                                RustBox Architecture                          │</span><br><span class="line">└──────────────────────────────────────────────────────────────────────────────┘</span><br><span class="line"></span><br><span class="line">[rustbox CLI]                                 [rustboxd Daemon]</span><br><span class="line">     │                                               │</span><br><span class="line">     │  Unix Socket                                  │</span><br><span class="line">     │  /tmp/rustbox-daemon.sock                     │</span><br><span class="line">     │                                               │</span><br><span class="line">     │  IPC Protocol (JSON messages)                 │</span><br><span class="line">     │  ───────────────────────────────────────────▶ │</span><br><span class="line">     │  Commands:                                   
│</span><br><span class="line">     │   • run                                      │</span><br><span class="line">     │   • stop                                     │</span><br><span class="line">     │   • list                                     │</span><br><span class="line">     │   • inspect                                  │</span><br><span class="line">     │   • remove                                   │</span><br><span class="line">     │   • logs                                     │</span><br><span class="line">     │   • attach                                   │</span><br><span class="line">     │                                               ▼</span><br><span class="line">     │                                    ┌────────────────────────────┐</span><br><span class="line">     │                                    │  Container Manager         │</span><br><span class="line">     │                                    │  ───────────────────────── │</span><br><span class="line">     │                                    │  • Controls lifecycle      │</span><br><span class="line">     │                                    │  • Creates sandbox env     │</span><br><span class="line">     │                                    │  • Manages PTY + process   │</span><br><span class="line">     │                                    └────────────────────────────┘</span><br><span class="line">     │                                               │</span><br><span class="line">     │                                               ▼</span><br><span class="line">     │                                    ┌────────────────────────────┐</span><br><span class="line">     │                                    │  Registry (HashMap&lt;ID, Container&gt;)│</span><br><span class="line">     │                                    └────────────────────────────┘</span><br><span class="line">     │                                               │</span><br><span class="line">     │             
        ┌────────────────────────────────────────────┐</span><br><span class="line">     │                     │           Container Instances              │</span><br><span class="line">     │                     └────────────────────────────────────────────┘</span><br><span class="line">     │                         │              │              │</span><br><span class="line">     │                         ▼              ▼              ▼</span><br><span class="line">     │                    [Container <span class="number">1</span>]  [Container <span class="number">2</span>]  [Container N]</span><br><span class="line">     │                         │              │              │</span><br><span class="line">     │                  ┌──────────────────────────────────────────────┐</span><br><span class="line">     │                  │              Sandbox Components               │</span><br><span class="line">     │                  │  overlayfs + cgroups + namespaces             │</span><br><span class="line">     │                  └──────────────────────────────────────────────┘</span><br><span class="line">     │                         │</span><br><span class="line">     │                         │</span><br><span class="line">     │  (When attaching)       │</span><br><span class="line">     │  ───────────────────────────────────────────────────────────────────────────</span><br><span class="line">    ┌─────────────────────────────────────────────────────────────────────────────┐</span><br><span class="line">    │                            Container Attach Flow                            │</span><br><span class="line">    └─────────────────────────────────────────────────────────────────────────────┘</span><br><span class="line"></span><br><span class="line">    Client (e.g. 
docker attach, CLI, web terminal)</span><br><span class="line">    │</span><br><span class="line">    │  <span class="number">1.</span> Send/receive stdin/stdout over Unix socket</span><br><span class="line">    ▼</span><br><span class="line">    ┌───────────────────────────────────────────────────────────┐</span><br><span class="line">    │ Daemon Process                                            │</span><br><span class="line">    │ ───────────────────────────────────────────────────────── │</span><br><span class="line">    │  • Manages <span class="built_in">container</span> lifecycle                            │</span><br><span class="line">    │  • Holds PTY master side                                  │</span><br><span class="line">    │  • Forwards data between <span class="keyword">client</span> and <span class="built_in">container</span>             │</span><br><span class="line">    │                                                           │</span><br><span class="line">    │  ┌──────────────────────────────────────────────────────┐ │</span><br><span class="line">    │  │ Unix Socket (Client ↔ Daemon)                        │ │</span><br><span class="line">    │  │  - AttachStdin  (<span class="keyword">client</span> → daemon)                    │ │</span><br><span class="line">    │  │  - AttachStdout (daemon → <span class="keyword">client</span>)                    │ │</span><br><span class="line">    │  └──────────────────────────────────────────────────────┘ │</span><br><span class="line">    │                              │</span><br><span class="line">    │                              │ (I/O forwarding loop)</span><br><span class="line">    │                              ▼</span><br><span class="line">    │  ┌──────────────────────────────────────────────────────┐ │</span><br><span class="line">    │  │ PTY Master                                           │ │</span><br><span class="line">    │  │  - Pseudo terminal device endpoint controlled 
<span class="keyword">by</span>     │ │</span><br><span class="line">    │  │    the daemon                                        │ │</span><br><span class="line">    │  │  - Reads <span class="built_in">container</span> output                            │ │</span><br><span class="line">    │  │  - Writes <span class="keyword">client</span> input                               │ │</span><br><span class="line">    │  └──────────────────────────────────────────────────────┘ │</span><br><span class="line">    │                              │</span><br><span class="line">    │                              │ (kernel-level link)</span><br><span class="line">    │                              ▼</span><br><span class="line">    │  ┌──────────────────────────────────────────────────────┐ │</span><br><span class="line">    │  │ PTY Slave                                            │ │</span><br><span class="line">    │  │  - Exposed inside the <span class="built_in">container</span> <span class="keyword">as</span> /dev/tty or stdin │ │</span><br><span class="line">    │  │  - Attached to the <span class="built_in">container</span>’s process (e.g. /bin/bash)││</span><br><span class="line">    │  │  - Container writes stdout/stderr → goes to Master   │ │</span><br><span class="line">    │  │  - Container reads stdin ← comes <span class="keyword">from</span> Master         │ │</span><br><span class="line">    │  └──────────────────────────────────────────────────────┘ │</span><br><span class="line">    └───────────────────────────────────────────────────────────┘</span><br><span class="line">    │</span><br><span class="line">    ▼</span><br><span class="line">    Container Process (e.g. 
/bin/bash, sh)</span><br><span class="line">    • Reads <span class="keyword">from</span> stdin (/dev/tty)</span><br><span class="line">    • Writes to stdout/stderr (/dev/tty)</span><br><span class="line"></span><br><span class="line">    ───────────────────────────────────────────────────────────────</span><br><span class="line">    Summary:</span><br><span class="line">    - PTY Master: controlled <span class="keyword">by</span> the daemon, mediates all I/O</span><br><span class="line">    - PTY Slave : presented to the <span class="built_in">container</span> process <span class="keyword">as</span> its terminal</span><br><span class="line">    - Unix Socket: transports attach stream between <span class="keyword">client</span> ↔ daemon</span><br><span class="line">    ───────────────────────────────────────────────────────────────</span><br><span class="line"></span><br></pre></td></tr></table></figure><h3 id="Container-Isolation"><a href="#Container-Isolation" class="headerlink" title="Container Isolation"></a>Container Isolation</h3><p>RustBox employs a <strong>double fork</strong> pattern for each container to ensure proper isolation:</p><h3 id="Process-Hierarchy-Per-Container"><a href="#Process-Hierarchy-Per-Container" class="headerlink" title="Process Hierarchy (Per Container)"></a>Process Hierarchy (Per Container)</h3><figure class="highlight tcl"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span 
class="line">19</span><br></pre></td><td class="code"><pre><span class="line">[Daemon Process]</span><br><span class="line">    └─&gt; spawn_blocking()</span><br><span class="line">        └─&gt; [Container Task]</span><br><span class="line">            └─&gt; fork() #<span class="number">1</span></span><br><span class="line">                ├─&gt; [Namespaced Parent Process]</span><br><span class="line">                │   ├─&gt; unshare() - Creates new namespaces</span><br><span class="line">                │   ├─&gt; setup cgroups and overlay</span><br><span class="line">                │   └─&gt; fork() #<span class="number">2</span></span><br><span class="line">                │       ├─&gt; [Inner Child Process]</span><br><span class="line">                │       │   ├─&gt; Mount /<span class="keyword">proc</span><span class="title"> and</span> /dev</span><br><span class="line">                │       │   ├─&gt;<span class="title"> chroot()</span> to<span class="title"> merged</span> overlay</span><br><span class="line">                │       │   ├─&gt;<span class="title"> chdir()</span> to<span class="title"> working</span> directory</span><br><span class="line">                │       │   └─&gt;<span class="title"> execv()</span> -<span class="title"> Execute</span> command</span><br><span class="line">                │       └─&gt; [Namespaced<span class="title"> Parent]</span> waits<span class="title"> for</span> inner<span class="title"> child</span></span><br><span class="line">                │           └─&gt;<span class="title"> Unmounts</span> /<span class="keyword">proc</span><span class="title"> and</span> /dev<span class="title"> inside</span> namespace</span><br><span class="line">                └─&gt; [Container<span class="title"> Task]</span> waits<span class="title"> for</span> namespaced<span class="title"> parent</span></span><br><span class="line">                    ├─&gt;<span class="title"> Unmounts</span> overlay<span 
class="title"> filesystem</span></span><br><span class="line">                    ├─&gt;<span class="title"> Cleans</span> up<span class="title"> cgroups</span></span><br><span class="line">                    └─&gt;<span class="title"> Updates</span> container<span class="title"> state</span> in<span class="title"> registry</span></span><br></pre></td></tr></table></figure><h3 id="Container-Lifecycle-States"><a href="#Container-Lifecycle-States" class="headerlink" title="Container Lifecycle States"></a>Container Lifecycle States</h3><figure class="highlight awk"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line">Created ──(start)──&gt; Running ──(<span class="keyword">exit</span>)──────&gt; Exited</span><br><span class="line">                       │</span><br><span class="line">                       └──(stop)──&gt; Stopped ──(<span class="keyword">exit</span>)──&gt; Exited</span><br></pre></td></tr></table></figure><h3 id="Persistent-State-Management"><a href="#Persistent-State-Management" class="headerlink" title="Persistent State Management"></a>Persistent State Management</h3><ul><li><strong>Container metadata</strong>: <code>/var/lib/rustbox/containers/&lt;container-id&gt;.json</code></li><li><strong>Container logs</strong>: <code>/var/lib/rustbox/logs/&lt;container-id&gt;/</code></li><li><strong>Overlay filesystems</strong>: <code>/var/lib/rustbox/overlay/&lt;container-id&gt;/</code></li><li><strong>State recovery</strong>: Daemon recovers container state on restart</li></ul><h2 id="Features"><a href="#Features" class="headerlink" title="Features"></a>Features</h2><ul><li><strong>Daemon Architecture</strong> with background process and client-server communication</li><li><strong>Multi-container Management</strong> supporting concurrent container execution</li><li><strong>Persistent State Management</strong> with automatic recovery 
across daemon restarts</li><li><strong>Complete Container Lifecycle</strong> (create, start, stop, remove, inspect)</li><li><strong>Interactive Attach Support</strong> with TTY allocation and real-time I&#x2F;O streaming</li><li><strong>Real-time Logging</strong> with per-container stdout&#x2F;stderr files</li><li><strong>Resource Isolation</strong> using cgroups v2 (memory, CPU limits)</li><li><strong>Filesystem Isolation</strong> using overlayfs with automatic cleanup</li><li><strong>Full Namespace Isolation</strong> (PID, UTS, NET, USER, IPC)</li><li><strong>Docker-like CLI</strong> with familiar commands (run, ps, logs, inspect, rm, attach)</li><li><strong>Graceful Shutdown</strong> with proper signal handling and resource cleanup</li><li><strong>Security</strong> with proper privilege separation and input validation</li></ul><h2 id="Requirements"><a href="#Requirements" class="headerlink" title="Requirements"></a>Requirements</h2><ul><li>Linux kernel 5.x or higher (with overlayfs and cgroups v2 support)</li><li>Rust (1.70+ recommended)</li><li>Root privileges (for daemon operations, mounting, and namespace creation)</li></ul><h2 id="Installation"><a href="#Installation" class="headerlink" title="Installation"></a>Installation</h2><h3 id="Build-from-Source"><a href="#Build-from-Source" class="headerlink" title="Build from Source"></a>Build from Source</h3><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line">git <span class="built_in">clone</span> https://github.com/isdaniel/RustBox.git</span><br><span class="line"><span class="built_in">cd</span> RustBox</span><br><span class="line">cargo build --release</span><br></pre></td></tr></table></figure><h3 id="Binaries"><a href="#Binaries" class="headerlink" title="Binaries"></a>Binaries</h3><p>After building, you’ll have two 
binaries:</p><ul><li><code>rustbox</code> - Client CLI tool</li><li><code>daemon_rs</code> - Background daemon process</li></ul><h2 id="Usage"><a href="#Usage" class="headerlink" title="Usage"></a>Usage</h2><h3 id="Start-the-Daemon"><a href="#Start-the-Daemon" class="headerlink" title="Start the Daemon"></a>Start the Daemon</h3><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># Start the daemon in background (requires root)</span></span><br><span class="line"><span class="built_in">sudo</span> ./target/release/daemon_rs 2&gt;&amp;1 &amp;</span><br></pre></td></tr></table></figure><p>The daemon will:</p><ul><li>Listen on Unix socket <code>/tmp/rustbox-daemon.sock</code></li><li>Create system directories under <code>/var/lib/rustbox/</code></li><li>Recover existing container state from disk</li><li>Handle graceful shutdown on SIGTERM&#x2F;SIGINT</li></ul><h3 id="Container-Management"><a href="#Container-Management" class="headerlink" title="Container Management"></a>Container Management</h3><h4 id="Create-and-Run-Containers"><a href="#Create-and-Run-Containers" class="headerlink" title="Create and Run Containers"></a>Create and Run Containers</h4><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># Run a container in background with TTY support (allows interactive attach)</span></span><br><span class="line"><span class="built_in">sudo</span> ./target/release/rustbox run --<span class="built_in">tty</span> --memory 256M --cpu 0.5 /bin/bash</span><br><span class="line"></span><br><span 
class="line"><span class="comment"># Run a container with a custom name</span></span><br><span class="line"><span class="built_in">sudo</span> ./target/release/rustbox run --<span class="built_in">tty</span> --name my-container --memory 256M --cpu 0.5 /bin/bash</span><br><span class="line"></span><br><span class="line"><span class="comment"># Run a non-interactive container</span></span><br><span class="line"><span class="built_in">sudo</span> ./target/release/rustbox run --memory 256M /usr/bin/python3 script.py</span><br></pre></td></tr></table></figure><p><strong>Note</strong>: The <code>--tty</code> flag is required if you want to attach to the container later.</p><h4 id="List-Containers"><a href="#List-Containers" class="headerlink" title="List Containers"></a>List Containers</h4><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># List running containers</span></span><br><span class="line"><span class="built_in">sudo</span> ./target/release/rustbox list</span><br><span class="line"></span><br><span class="line"><span class="comment"># List all containers (including stopped)</span></span><br><span class="line"><span class="built_in">sudo</span> ./target/release/rustbox list -a</span><br><span class="line"><span class="comment"># or</span></span><br><span class="line"><span class="built_in">sudo</span> ./target/release/rustbox list --all</span><br></pre></td></tr></table></figure><h4 id="Attach-to-Running-Containers"><a href="#Attach-to-Running-Containers" class="headerlink" title="Attach to Running Containers"></a>Attach to Running Containers</h4><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span 
class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># Attach to a running container (container must have been created with --tty flag)</span></span><br><span class="line"><span class="built_in">sudo</span> ./target/release/rustbox attach &lt;container-id&gt;</span><br><span class="line"></span><br><span class="line"><span class="comment"># Example:</span></span><br><span class="line"><span class="built_in">sudo</span> ./target/release/rustbox attach f1a5f84880a1</span><br></pre></td></tr></table></figure><p><strong>Interactive Controls</strong>:</p><ul><li>Press <code>Ctrl+P</code> followed by <code>Ctrl+Q</code> to detach from container (leaves it running)</li><li>Press <code>Ctrl+C</code> to send interrupt signal and exit</li></ul><p><strong>Requirements</strong>:</p><ul><li>Container must have been started with <code>--tty</code> flag</li><li>Container must be in <code>Running</code> state</li></ul><h4 id="Stop-Remove-and-Inspect-Containers"><a href="#Stop-Remove-and-Inspect-Containers" class="headerlink" title="Stop, Remove, and Inspect Containers"></a>Stop, Remove, and Inspect Containers</h4><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># Stop a running container</span></span><br><span class="line"><span class="built_in">sudo</span> ./target/release/rustbox stop 
&lt;container-id&gt;</span><br><span class="line"></span><br><span class="line"><span class="comment"># View container logs</span></span><br><span class="line"><span class="built_in">sudo</span> ./target/release/rustbox logs &lt;container-id&gt;</span><br><span class="line"><span class="built_in">sudo</span> ./target/release/rustbox logs --<span class="built_in">tail</span> 50 &lt;container-id&gt;</span><br><span class="line"></span><br><span class="line"><span class="comment"># Inspect container details</span></span><br><span class="line"><span class="built_in">sudo</span> ./target/release/rustbox inspect &lt;container-id&gt;</span><br><span class="line"></span><br><span class="line"><span class="comment"># Remove a stopped container</span></span><br><span class="line"><span class="built_in">sudo</span> ./target/release/rustbox remove &lt;container-id&gt;</span><br><span class="line"></span><br><span class="line"><span class="comment"># Force remove a running container</span></span><br><span class="line"><span class="built_in">sudo</span> ./target/release/rustbox remove --force &lt;container-id&gt;</span><br></pre></td></tr></table></figure><p><strong>Available Options</strong>:</p><ul><li><code>--name</code> - Custom container name (auto-generated if not provided)</li><li><code>--memory</code> - Memory limit (e.g., “256M”, “1G”, “512000”)</li><li><code>--cpu</code> - CPU limit as fraction of one core (e.g., “0.5”, “1.0”)</li><li><code>--workdir</code> - Working directory inside container (default: “&#x2F;“)</li><li><code>--rootfs</code> - Path to rootfs directory (default: “.&#x2F;rootfs”)</li><li><code>--tty</code> - Allocate a pseudo-TTY for interactive use (required for attach)</li></ul><h2 id="Directory-Structure"><a href="#Directory-Structure" class="headerlink" title="Directory Structure"></a>Directory Structure</h2><h3 id="Runtime-Directories-created-by-daemon"><a href="#Runtime-Directories-created-by-daemon" class="headerlink" title="Runtime Directories 
(created by daemon)"></a>Runtime Directories (created by daemon)</h3><figure class="highlight nix"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br></pre></td><td class="code"><pre><span class="line"><span class="operator">/</span>var<span class="operator">/</span>lib<span class="operator">/</span>rustbox<span class="symbol">/</span></span><br><span class="line">├── containers<span class="symbol">/</span>           <span class="comment"># Container metadata (JSON files)</span></span><br><span class="line">│   ├── a1b2c3d4e5f6.json</span><br><span class="line">│   └── f6e5d4c3b2a1.json</span><br><span class="line">├── logs<span class="symbol">/</span>                 <span class="comment"># Container logs</span></span><br><span class="line">│   ├── a1b2c3d4e5f6<span class="symbol">/</span></span><br><span class="line">│   │   ├── stdout.log</span><br><span class="line">│   │   └── stderr.log</span><br><span class="line">│   └── f6e5d4c3b2a1<span class="symbol">/</span></span><br><span class="line">│       ├── stdout.log</span><br><span class="line">│       └── stderr.log</span><br><span class="line">└── overlay<span class="symbol">/</span>              <span class="comment"># Overlay filesystem layers</span></span><br><span class="line">    ├── a1b2c3d4e5f6<span 
class="symbol">/</span></span><br><span class="line">    │   ├── lowerdir<span class="symbol">/</span>     <span class="comment"># Read-only base layer</span></span><br><span class="line">    │   ├── upperdir<span class="symbol">/</span>     <span class="comment"># Container changes</span></span><br><span class="line">    │   ├── workdir<span class="symbol">/</span>      <span class="comment"># Overlay work directory</span></span><br><span class="line">    │   └── merged<span class="symbol">/</span>       <span class="comment"># Final mounted filesystem</span></span><br><span class="line">    └── f6e5d4c3b2a1<span class="symbol">/</span></span><br><span class="line">        ├── lowerdir<span class="symbol">/</span></span><br><span class="line">        ├── upperdir<span class="symbol">/</span></span><br><span class="line">        ├── workdir<span class="symbol">/</span></span><br><span class="line">        └── merged<span class="operator">/</span></span><br></pre></td></tr></table></figure><h3 id="Source-Code-Structure"><a href="#Source-Code-Structure" class="headerlink" title="Source Code Structure"></a>Source Code Structure</h3><figure class="highlight nix"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span 
class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br></pre></td><td class="code"><pre><span class="line">src<span class="symbol">/</span></span><br><span class="line">├── lib.rs                <span class="comment"># Public API exports</span></span><br><span class="line">├── main.rs               <span class="comment"># Client CLI entry point</span></span><br><span class="line">├── daemon<span class="symbol">/</span>               <span class="comment"># Daemon implementation</span></span><br><span class="line">│   ├── main.rs          <span class="comment"># Daemon entry point</span></span><br><span class="line">│   ├── server.rs        <span class="comment"># Unix socket server</span></span><br><span class="line">│   ├── container_manager.rs <span class="comment"># Container lifecycle management</span></span><br><span class="line">│   └── signal_handler.rs <span class="comment"># Graceful shutdown handling</span></span><br><span class="line">├── ipc<span class="symbol">/</span>                  <span class="comment"># Inter-process communication</span></span><br><span class="line">│   ├── protocol.rs      <span class="comment"># Message types and framing</span></span><br><span class="line">│   └── client.rs        <span class="comment"># Client-side socket communication</span></span><br><span class="line">├── container<span class="symbol">/</span>            <span class="comment"># Container abstractions</span></span><br><span class="line">│   ├── mod.rs           <span class="comment"># Container data structures</span></span><br><span class="line">│   ├── config.rs        <span class="comment"># Configuration and validation</span></span><br><span class="line">│   ├── sandbox.rs       <span class="comment"># Core isolation logic</span></span><br><span class="line">│   ├── state_machine.rs <span class="comment"># Container state transitions</span></span><br><span 
class="line">│   └── id.rs            <span class="comment"># ID generation and validation</span></span><br><span class="line">├── storage<span class="symbol">/</span>              <span class="comment"># Persistent storage</span></span><br><span class="line">│   ├── metadata.rs      <span class="comment"># Container metadata management</span></span><br><span class="line">│   └── logs.rs          <span class="comment"># Log file management</span></span><br><span class="line">├── cli<span class="symbol">/</span>                  <span class="comment"># CLI command implementations</span></span><br><span class="line">│   ├── run.rs           <span class="comment"># Create and start containers</span></span><br><span class="line">│   ├── stop.rs          <span class="comment"># Stop containers</span></span><br><span class="line">│   ├── list.rs          <span class="comment"># List containers</span></span><br><span class="line">│   ├── inspect.rs       <span class="comment"># Container details</span></span><br><span class="line">│   ├── remove.rs        <span class="comment"># Remove containers</span></span><br><span class="line">│   ├── logs.rs          <span class="comment"># View container logs</span></span><br><span class="line">│   └── attach.rs        <span class="comment"># Attach to containers</span></span><br><span class="line">└── error.rs             <span class="comment"># Error handling</span></span><br></pre></td></tr></table></figure><h2 id="Technical-Details"><a href="#Technical-Details" class="headerlink" title="Technical Details"></a>Technical Details</h2><h3 id="IPC-Protocol"><a href="#IPC-Protocol" class="headerlink" title="IPC Protocol"></a>IPC Protocol</h3><p>Communication between client and daemon uses length-prefixed JSON messages over Unix domain sockets:</p><figure class="highlight scheme"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">[<span class="name">4-byte</span> 
length (<span class="name">u32</span>, big-endian)][<span class="name">JSON</span> payload]</span><br></pre></td></tr></table></figure><p>Example (the payload <code>&#123;"type":"ListRequest","all":true&#125;</code> is 33 bytes, so the prefix is <code>0x00000021</code>):</p><figure class="highlight puppet"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">0<span class="keyword">x00000021</span>  &#123;<span class="string">&quot;type&quot;</span>:<span class="string">&quot;ListRequest&quot;</span>,<span class="string">&quot;all&quot;</span>:<span class="keyword">true</span>&#125;</span><br></pre></td></tr></table></figure><h3 id="Container-ID-Format"><a href="#Container-ID-Format" class="headerlink" title="Container ID Format"></a>Container ID Format</h3><ul><li>12-character hexadecimal identifiers (e.g., <code>a1b2c3d4e5f6</code>)</li><li>Auto-generated names follow <code>adjective-noun</code> pattern (e.g., <code>happy-elephant</code>)</li><li>CLI commands accept either ID or name</li></ul><h3 id="Resource-Limits"><a href="#Resource-Limits" class="headerlink" title="Resource Limits"></a>Resource Limits</h3><ul><li><strong>Memory</strong>: Supports units like <code>100M</code>, <code>1G</code>, <code>512000</code> (bytes)</li><li><strong>CPU</strong>: Fraction of one core, e.g., <code>0.5</code> for 50% CPU limit</li><li>Enforced via cgroups v2 at <code>/sys/fs/cgroup/rustbox/&lt;container-id&gt;/</code></li></ul><h3 id="Security-Model"><a href="#Security-Model" class="headerlink" title="Security Model"></a>Security Model</h3><ul><li>Daemon runs as root for privileged operations</li><li>Client commands run as user, connect via Unix socket</li><li>Containers run in isolated namespaces (PID, NET, UTS, IPC, USER)</li><li>Input validation prevents directory traversal and injection attacks</li></ul><h2 id="Final-Thoughts"><a href="#Final-Thoughts" class="headerlink" title="Final Thoughts"></a>Final Thoughts</h2><p><strong>RustBox</strong> is not a full container system, and that’s by design — it’s <strong>transparent</strong>, 
<strong>hackable</strong>, and <strong>educational</strong>. Whether you’re looking to secure untrusted code, explore low-level Linux features, or just love writing systems code in Rust, RustBox is a fantastic playground.</p><blockquote><p>Give it a ⭐ on <a href="https://github.com/isdaniel/RustBox">GitHub</a> and explore the source!</p></blockquote><p><strong>此文作者</strong>：Daniel Shih(石頭)<br /><strong>此文地址</strong>： <a href="https://isdaniel.github.io/rustbox-introduce/">https://isdaniel.github.io/rustbox-introduce/</a> <br /><strong>版權聲明</strong>：本博客所有文章除特別聲明外，均採用 <a href="https://creativecommons.org/licenses/by-nc-sa/3.0/tw/">CC BY-NC-SA 3.0 TW</a> 許可協議。轉載請註明出處！</p>]]>
    </content>
    <id>https://isdaniel.github.io/rustbox-introduce/</id>
    <link href="https://isdaniel.github.io/rustbox-introduce/"/>
    <published>2025-06-01T23:30:11.000Z</published>
    <summary>A Docker-like container runtime written in Rust with daemon architecture, supporting multi-container orchestration, persistent state management, and</summary>
    <title>RustBox - Docker-Lite Sandbox for Hackers and Learners</title>
    <updated>2026-04-22T03:00:22.032Z</updated>
  </entry>
  <entry>
    <author>
      <name>Daniel Shih</name>
    </author>
    <category term="Rust" scheme="https://isdaniel.github.io/categories/Rust/"/>
    <category term="C#" scheme="https://isdaniel.github.io/categories/Rust/C/"/>
    <category term="Sidecar" scheme="https://isdaniel.github.io/categories/Rust/C/Sidecar/"/>
    <category term="C#" scheme="https://isdaniel.github.io/tags/C/"/>
    <category term="Rust" scheme="https://isdaniel.github.io/tags/Rust/"/>
    <category term="Sidecar" scheme="https://isdaniel.github.io/tags/Sidecar/"/>
    <content>
<![CDATA[<h2 id="introduce"><a href="#introduce" class="headerlink" title="Introduction"></a>Introduction</h2><p>If you’ve ever struggled to instrument legacy apps or non-standard services for observability, <code>OpenTelemetry_SideCar</code> is here to help. This project offers a <strong>non-intrusive way to export metrics and traces</strong> via <strong>OpenTelemetry</strong> using a <strong>sidecar approach</strong> — no SDK required in your main app.</p><hr><h2 id="🌐-Project-Overview"><a href="#🌐-Project-Overview" class="headerlink" title="🌐 Project Overview"></a>🌐 Project Overview</h2><p>📦 <strong>Repo:</strong> <a href="https://github.com/isdaniel/OpenTelemetry_SideCar">isdaniel&#x2F;OpenTelemetry_SideCar</a></p><p><code>OpenTelemetry_SideCar</code> is a standalone proxy service that collects metrics and traces <strong>outside of your application</strong> and forwards them to a telemetry backend (e.g., Prometheus, Jaeger, or Azure Monitor).</p><h3 id="💡-Why-a-Sidecar"><a href="#💡-Why-a-Sidecar" class="headerlink" title="💡 Why a Sidecar?"></a>💡 Why a Sidecar?</h3><p>In cloud-native systems, a <em>sidecar</em> is a helper container or process that runs alongside your main application. It can observe, extend, or enhance app behavior <strong>without changing application code</strong>. 
This pattern is ideal for adding <strong>observability</strong> when:</p><ul><li>You can’t modify the original code (e.g., closed-source, legacy binaries).</li><li>You want to centralize telemetry logic.</li><li>You’re aiming for a unified instrumentation strategy.</li></ul><hr><h2 id="🧠-Key-Concepts"><a href="#🧠-Key-Concepts" class="headerlink" title="🧠 Key Concepts"></a>🧠 Key Concepts</h2><h3 id="📊-OpenTelemetry"><a href="#📊-OpenTelemetry" class="headerlink" title="📊 OpenTelemetry"></a>📊 OpenTelemetry</h3><p>OpenTelemetry is the CNCF-backed observability framework offering a vendor-neutral standard to collect metrics, logs, and traces.</p><p>This project leverages:</p><ul><li><strong>OTLP (OpenTelemetry Protocol)</strong> for data transport</li><li><strong>Push-based metrics</strong> collection via HTTP endpoints</li><li><strong>Custom trace generation</strong> from event messages</li></ul><h3 id="🧱-Sidecar-Design-Pattern"><a href="#🧱-Sidecar-Design-Pattern" class="headerlink" title="🧱 Sidecar Design Pattern"></a>🧱 Sidecar Design Pattern</h3><p>This service runs in parallel with your main app and exposes lightweight endpoints for:</p><ul><li>Sending <strong>metrics</strong> via <code>/metrics</code></li><li>Sending <strong>traces</strong> via <code>/trace</code></li></ul><p>Apps interact with the sidecar using simple HTTP POST requests.</p><hr><h2 id="⚙️-How-It-Works"><a href="#⚙️-How-It-Works" class="headerlink" title="⚙️ How It Works"></a>⚙️ How It Works</h2><h2 id="Architecture"><a href="#Architecture" class="headerlink" title="Architecture"></a>Architecture</h2><p>The project consists of the following components:</p><ul><li><strong>.NET Web Application</strong>: A simple web service with custom metrics and tracing</li><li><strong>OpenTelemetry Collector</strong>: Receives telemetry data and exports it to backends</li><li><strong>Prometheus</strong>: Time-series database for storing and querying metrics</li><li><strong>Jaeger</strong>: Distributed tracing system for 
monitoring and troubleshooting</li><li><strong>Grafana</strong>: Visualization and dashboarding platform</li></ul><figure class="highlight stata"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br></pre></td><td class="code"><pre><span class="line">┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐</span><br><span class="line">│                 │     │                 │     │                 │</span><br><span class="line">│  .<span class="keyword">NET</span> <span class="keyword">App</span>       │────▶│  OTel Collector │────▶│  Prometheus     │</span><br><span class="line">│                 │     │                 │     │                 │</span><br><span class="line">└─────────────────┘     └────────┬────────┘     └─────────────────┘</span><br><span class="line">                                 │                        ▲</span><br><span class="line">                                 │                        │</span><br><span class="line">                                 ▼                        │</span><br><span class="line">                        ┌─────────────────┐      ┌────────┴────────┐</span><br><span class="line">                        │                 │      │                 │</span><br><span class="line">                        │  Jaeger         │      │  Grafana        │</span><br><span class="line">                        └─────────────────┘      └─────────────────┘</span><br></pre></td></tr></table></figure><h2 id="Features"><a href="#Features" 
class="headerlink" title="Features"></a>Features</h2><ul><li>Custom metrics using OpenTelemetry Metrics API</li><li>Distributed tracing with OpenTelemetry Tracing API</li><li>Nested HTTP calls with context propagation</li><li>Prometheus metrics collection</li><li>Jaeger trace visualization</li><li>Grafana dashboards for metrics visualization</li></ul><h2 id="Prerequisites"><a href="#Prerequisites" class="headerlink" title="Prerequisites"></a>Prerequisites</h2><ul><li>Docker and Docker Compose</li><li>.NET 9.0 SDK (for local development)</li></ul><h2 id="Getting-Started"><a href="#Getting-Started" class="headerlink" title="Getting Started"></a>Getting Started</h2><h3 id="Running-the-Application"><a href="#Running-the-Application" class="headerlink" title="Running the Application"></a>Running the Application</h3><p>Clone the repository:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">git <span class="built_in">clone</span> https://github.com/isdaniel/OpenTelemetry_SideCar.git</span><br><span class="line"><span class="built_in">cd</span> OpenTelemetry_SideCar</span><br></pre></td></tr></table></figure><p>Start the services using Docker Compose:<br> <figure class="highlight ebnf"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line"><span class="attribute">docker-compose up -d</span></span><br></pre></td></tr></table></figure></p><ol><li>Access the application:<ul><li>Web Application: <a href="http://localhost:8080/">http://localhost:8080</a></li><li>Nested Greeting: <a href="http://localhost:8080/NestedGreeting?nestlevel=3">http://localhost:8080/NestedGreeting?nestlevel=3</a></li></ul></li></ol><h3 id="Accessing-Observability-Tools"><a href="#Accessing-Observability-Tools" class="headerlink" title="Accessing Observability Tools"></a>Accessing Observability Tools</h3><ul><li><strong>Prometheus</strong>: <a 
href="http://localhost:9090/">http://localhost:9090</a><ul><li>Query metrics with PromQL</li><li>Example: <code>greetings_count_total</code></li></ul></li><li><strong>Jaeger UI</strong>: <a href="http://localhost:16686/">http://localhost:16686</a><ul><li>View distributed traces</li><li>Filter by service: <code>telemetry_example</code></li></ul></li><li><strong>Grafana</strong>: <a href="http://localhost:3000/">http://localhost:3000</a><ul><li>Default credentials: admin&#x2F;admin</li><li>Pre-configured dashboards for application metrics</li></ul></li></ul><h2 id="Application-Endpoints"><a href="#Application-Endpoints" class="headerlink" title="Application Endpoints"></a>Application Endpoints</h2><ul><li><strong>&#x2F;</strong> - Returns a simple greeting and increments the greeting counter</li><li><strong>&#x2F;NestedGreeting?nestlevel&#x3D;N</strong> - Creates a chain of N nested HTTP calls, demonstrating trace context propagation</li></ul><h2 id="Configuration-Files"><a href="#Configuration-Files" class="headerlink" title="Configuration Files"></a>Configuration Files</h2><ul><li><strong>docker-compose.yml</strong>: Defines all services and their connections</li><li><strong>otel-collector-config.yaml</strong>: Configures the OpenTelemetry Collector</li><li><strong>prometheus.yml</strong>: Prometheus scraping configuration</li><li><strong>Dockerfile</strong>: Builds the .NET application</li></ul><h2 id="Troubleshooting"><a href="#Troubleshooting" class="headerlink" title="Troubleshooting"></a>Troubleshooting</h2><h3 id="Common-Issues"><a href="#Common-Issues" class="headerlink" title="Common Issues"></a>Common Issues</h3><ol><li><strong>No metrics in Prometheus</strong>:<ul><li>Verify the OpenTelemetry Collector is running: <code>docker-compose ps</code></li><li>Check collector logs: <code>docker-compose logs otel-collector</code></li><li>Ensure Prometheus is scraping the collector: <a href="http://localhost:9090/targets">http://localhost:9090/targets</a></li></ul></li><li><strong>No traces in 
Jaeger</strong>:<ul><li>Verify Jaeger is running: <code>docker-compose ps jaeger</code></li><li>Check that OTLP is enabled in Jaeger</li><li>Generate some traces by accessing the application endpoints</li></ul></li><li><strong>Application errors</strong>:<ul><li>Check application logs: <code>docker-compose logs app</code></li></ul></li></ol><h2 id="Development"><a href="#Development" class="headerlink" title="Development"></a>Development</h2><h3 id="Local-Development"><a href="#Local-Development" class="headerlink" title="Local Development"></a>Local Development</h3><p>To run the application locally:</p><ol><li>Navigate to the src directory</li><li>Run <code>dotnet run</code><br>Note: When running locally, you’ll need to update the OTLP endpoint in Program.cs to point to your local OpenTelemetry Collector.</li></ol><h3 id="Adding-Custom-Metrics"><a href="#Adding-Custom-Metrics" class="headerlink" title="Adding Custom Metrics"></a>Adding Custom Metrics</h3><ol><li>Create a new meter:<figure class="highlight csharp"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">var</span> myMeter = <span class="keyword">new</span> Meter(<span class="string">&quot;MyApp.Metrics&quot;</span>, <span class="string">&quot;1.0.0&quot;</span>);</span><br></pre></td></tr></table></figure></li><li>Create metrics instruments:<figure class="highlight csharp"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">var</span> myCounter = myMeter.CreateCounter&lt;<span class="built_in">int</span>&gt;(<span class="string">&quot;my.counter&quot;</span>, <span class="string">&quot;Count of operations&quot;</span>);</span><br></pre></td></tr></table></figure></li><li>Record measurements:<figure class="highlight csharp"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span 
class="line">myCounter.Add(<span class="number">1</span>);</span><br></pre></td></tr></table></figure></li><li>Register the meter in the OpenTelemetry configuration:<figure class="highlight csharp"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">metricsProviderBuilder.AddMeter(<span class="string">&quot;MyApp.Metrics&quot;</span>);</span><br></pre></td></tr></table></figure></li></ol><hr><h2 id="📥-Example-Sending-Metrics"><a href="#📥-Example-Sending-Metrics" class="headerlink" title="📥 Example: Sending Metrics"></a>📥 Example: Sending Metrics</h2><p>Suppose your app wants to record a counter metric for user logins. All it needs to do is POST to the sidecar:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br></pre></td><td class="code"><pre><span class="line">curl -X POST http://localhost:8080/metrics \</span><br><span class="line">  -H <span class="string">&quot;Content-Type: application/json&quot;</span> \</span><br><span class="line">  -d <span class="string">&#x27;&#123;</span></span><br><span class="line"><span class="string">    &quot;name&quot;: &quot;user_login_total&quot;,</span></span><br><span class="line"><span class="string">    &quot;kind&quot;: &quot;counter&quot;,</span></span><br><span class="line"><span class="string">    &quot;value&quot;: 1,</span></span><br><span class="line"><span class="string">    &quot;attributes&quot;: &#123;</span></span><br><span class="line"><span class="string">      &quot;service&quot;: &quot;auth-service&quot;,</span></span><br><span class="line"><span class="string">      &quot;status&quot;: 
&quot;success&quot;</span></span><br><span class="line"><span class="string">    &#125;</span></span><br><span class="line"><span class="string">  &#125;&#x27;</span></span><br></pre></td></tr></table></figure><p>📌 This will be transformed into an OpenTelemetry metric and pushed to your configured OTLP collector.</p><hr><h2 id="📡-Example-Sending-Traces"><a href="#📡-Example-Sending-Traces" class="headerlink" title="📡 Example: Sending Traces"></a>📡 Example: Sending Traces</h2><p>To record a trace span (e.g., for a request to <code>/api/data</code>), POST to <code>/trace</code>:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br></pre></td><td class="code"><pre><span class="line">curl -X POST http://localhost:8080/trace \</span><br><span class="line">  -H <span class="string">&quot;Content-Type: application/json&quot;</span> \</span><br><span class="line">  -d <span class="string">&#x27;&#123;</span></span><br><span class="line"><span class="string">    &quot;trace_id&quot;: &quot;abc123&quot;,</span></span><br><span class="line"><span class="string">    &quot;span_id&quot;: &quot;def456&quot;,</span></span><br><span class="line"><span class="string">    &quot;name&quot;: &quot;GET /api/data&quot;,</span></span><br><span class="line"><span class="string">    &quot;kind&quot;: &quot;server&quot;,</span></span><br><span class="line"><span class="string">    &quot;start_time&quot;: &quot;2024-01-01T00:00:00Z&quot;,</span></span><br><span class="line"><span class="string">    &quot;end_time&quot;: 
&quot;2024-01-01T00:00:01Z&quot;,</span></span><br><span class="line"><span class="string">    &quot;attributes&quot;: &#123;</span></span><br><span class="line"><span class="string">      &quot;http.method&quot;: &quot;GET&quot;,</span></span><br><span class="line"><span class="string">      &quot;http.status_code&quot;: 200</span></span><br><span class="line"><span class="string">    &#125;</span></span><br><span class="line"><span class="string">  &#125;&#x27;</span></span><br></pre></td></tr></table></figure><p>The sidecar will:</p><ul><li>Create the span</li><li>Set its metadata and timing</li><li>Export it via OTLP to your backend (e.g., Jaeger or Zipkin)</li></ul><hr><h2 id="🛠️-Configuration"><a href="#🛠️-Configuration" class="headerlink" title="🛠️ Configuration"></a>🛠️ Configuration</h2><p>Set the following environment variables:</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line">OTEL_EXPORTER_OTLP_ENDPOINT=http://collector:4317</span><br><span class="line">OTEL_SERVICE_NAME=my-sidecar</span><br><span class="line">OTEL_METRICS_EXPORT_INTERVAL=1000</span><br></pre></td></tr></table></figure><p>These control how frequently data is flushed and where it’s sent.</p><hr><h2 id="🔒-No-SDK-No-Problem"><a href="#🔒-No-SDK-No-Problem" class="headerlink" title="🔒 No SDK, No Problem"></a>🔒 No SDK, No Problem</h2><p>One of the biggest benefits of <code>OpenTelemetry_SideCar</code> is that your main app doesn’t need to:</p><ul><li>Link or compile with any OpenTelemetry SDK</li><li>Maintain exporter or collector logic</li><li>Handle telemetry lifecycle</li></ul><p>Your app stays clean — just send HTTP!</p><hr><h2 id="🚀-Get-Started"><a href="#🚀-Get-Started" class="headerlink" title="🚀 Get Started"></a>🚀 Get Started</h2><figure class="highlight bash"><table><tr><td class="gutter"><pre><span 
class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line">git <span class="built_in">clone</span> https://github.com/isdaniel/OpenTelemetry_SideCar.git</span><br><span class="line"><span class="built_in">cd</span> OpenTelemetry_SideCar</span><br><span class="line">cargo run</span><br></pre></td></tr></table></figure><p>Then, start POSTing traces and metrics from your apps.</p><hr><h2 id="🙌-Final-Thoughts"><a href="#🙌-Final-Thoughts" class="headerlink" title="🙌 Final Thoughts"></a>🙌 Final Thoughts</h2><p><code>OpenTelemetry_SideCar</code> empowers teams to <strong>add observability with zero code changes</strong> to their applications. It’s perfect for teams looking to modernize telemetry practices without touching production binaries.</p><p>If you’re working with mixed environments or maintaining legacy services, give it a try!</p><blockquote><p>⭐️ Star the repo: <a href="https://github.com/isdaniel/OpenTelemetry_SideCar">isdaniel&#x2F;OpenTelemetry_SideCar</a></p></blockquote><p><strong>此文作者</strong>：Daniel Shih(石頭)<br /><strong>此文地址</strong>： <a href="https://isdaniel.github.io/opentelemetry-sidecar/">https://isdaniel.github.io/opentelemetry-sidecar/</a> <br /><strong>版權聲明</strong>：本博客所有文章除特別聲明外，均採用 <a href="https://creativecommons.org/licenses/by-nc-sa/3.0/tw/">CC BY-NC-SA 3.0 TW</a> 許可協議。轉載請註明出處！</p>]]>
    </content>
    <id>https://isdaniel.github.io/opentelemetry-sidecar/</id>
    <link href="https://isdaniel.github.io/opentelemetry-sidecar/"/>
    <published>2025-06-01T22:30:11.000Z</published>
    <summary>If you’ve ever struggled to instrument legacy apps or non-standard services for observability, OpenTelemetry_SideCar is here to help. This project offers a</summary>
    <title>Instrument Any App Instantly Using OpenTelemetry_SideCar</title>
    <updated>2026-04-22T03:00:22.029Z</updated>
  </entry>
  <entry>
    <author>
      <name>Daniel Shih</name>
    </author>
    <category term="Python" scheme="https://isdaniel.github.io/categories/Python/"/>
    <category term="PostgreSQL" scheme="https://isdaniel.github.io/categories/Python/PostgreSQL/"/>
    <category term="AI" scheme="https://isdaniel.github.io/categories/Python/PostgreSQL/AI/"/>
    <category term="Performance" scheme="https://isdaniel.github.io/categories/Python/PostgreSQL/AI/Performance/"/>
    <category term="PostgreSQL" scheme="https://isdaniel.github.io/tags/PostgreSQL/"/>
    <category term="Python" scheme="https://isdaniel.github.io/tags/Python/"/>
    <category term="AI" scheme="https://isdaniel.github.io/tags/AI/"/>
    <category term="MCP" scheme="https://isdaniel.github.io/tags/MCP/"/>
    <category term="Performance-Tuning" scheme="https://isdaniel.github.io/tags/Performance-Tuning/"/>
    <category term="Database-Optimization" scheme="https://isdaniel.github.io/tags/Database-Optimization/"/>
    <content>
      <![CDATA[<h1 id="AI-Powered-PostgreSQL-Performance-Tuning-with-MCP-Introducing-pgtuner-mcp"><a href="#AI-Powered-PostgreSQL-Performance-Tuning-with-MCP-Introducing-pgtuner-mcp" class="headerlink" title="AI-Powered PostgreSQL Performance Tuning with MCP - Introducing pgtuner_mcp"></a>AI-Powered PostgreSQL Performance Tuning with MCP - Introducing pgtuner_mcp</h1><p>Database performance optimization is one of the most critical yet challenging aspects of maintaining production systems. Identifying slow queries, optimizing indexes, and monitoring database health require deep expertise and constant vigilance. Today, I’m excited to introduce <strong>pgtuner_mcp</strong>, a Model Context Protocol (MCP) server that brings AI-powered PostgreSQL performance tuning capabilities directly into your development workflow.</p><h2 id="What-is-pgtuner-mcp"><a href="#What-is-pgtuner-mcp" class="headerlink" title="What is pgtuner_mcp?"></a>What is pgtuner_mcp?</h2><p><strong>pgtuner_mcp</strong> is an intelligent PostgreSQL performance analysis server built on the Model Context Protocol (MCP). 
It bridges the gap between AI assistants (like Claude) and your PostgreSQL database, enabling natural language interactions for complex database optimization tasks.</p><p><a href="https://github.com/isdaniel/pgtuner_mcp">GitHub Repository</a></p><h3 id="Key-Capabilities"><a href="#Key-Capabilities" class="headerlink" title="Key Capabilities"></a>Key Capabilities</h3><ul><li><strong>Intelligent Query Analysis</strong>: Identify slow queries with detailed statistics from <code>pg_stat_statements</code></li><li><strong>AI-Powered Index Recommendations</strong>: Get smart indexing suggestions based on actual workload patterns</li><li><strong>Hypothetical Index Testing</strong>: Test indexes without creating them using HypoPG</li><li><strong>Comprehensive Health Checks</strong>: Monitor connections, cache efficiency, locks, and replication</li><li><strong>Bloat Detection</strong>: Identify and quantify table&#x2F;index bloat for maintenance</li><li><strong>Vacuum Monitoring</strong>: Track vacuum operations and autovacuum effectiveness</li><li><strong>I&#x2F;O Analysis</strong>: Analyze disk read&#x2F;write patterns and identify bottlenecks</li><li><strong>Configuration Review</strong>: Get recommendations for memory, checkpoint, and connection settings</li></ul><h2 id="Architecture-Overview"><a href="#Architecture-Overview" class="headerlink" title="Architecture Overview"></a>Architecture Overview</h2><p>pgtuner_mcp leverages several PostgreSQL extensions and Python libraries to provide comprehensive analysis:</p><figure class="highlight pgsql"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span 
class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br></pre></td><td class="code"><pre><span class="line">┌─────────────────┐</span><br><span class="line">│  AI Assistant   │</span><br><span class="line">│  (Claude, etc)  │</span><br><span class="line">└────────┬────────┘</span><br><span class="line">         │ MCP Protocol</span><br><span class="line">         ▼</span><br><span class="line">┌─────────────────┐    ┌──────────────────┐</span><br><span class="line">│  pgtuner_mcp    │───▶│   PostgreSQL     │</span><br><span class="line">│  (Python MCP    │    │   + Extensions   │</span><br><span class="line">│   <span class="keyword">Server</span>)       │    │  - pg_stat_      │</span><br><span class="line">└─────────────────┘    │    statements    │</span><br><span class="line">         │              │  - hypopg        │</span><br><span class="line">         │              │  - pgstattuple   │</span><br><span class="line">         ▼              └──────────────────┘</span><br><span class="line">┌─────────────────┐</span><br><span class="line">│  Analysis &amp;     │</span><br><span class="line">│  Recommendations│</span><br><span class="line">└─────────────────┘</span><br></pre></td></tr></table></figure><h3 id="Core-Components"><a href="#Core-Components" class="headerlink" title="Core Components"></a>Core Components</h3><ol><li><strong>MCP Server</strong>: Provides tools, prompts, and resources via Model Context Protocol</li><li><strong>Query Analyzer</strong>: Parses and analyzes SQL using <code>pglast</code> library</li><li><strong>Performance Metrics</strong>: Collects statistics from PostgreSQL system views</li><li><strong>AI Recommendations</strong>: Generates intelligent suggestions based on workload patterns</li><li><strong>Multiple Transport Modes</strong>: Supports stdio, SSE, and streamable HTTP</li></ol><h2 
id="Installation-and-Setup"><a href="#Installation-and-Setup" class="headerlink" title="Installation and Setup"></a>Installation and Setup</h2><h3 id="Prerequisites"><a href="#Prerequisites" class="headerlink" title="Prerequisites"></a>Prerequisites</h3><p>Before installing pgtuner_mcp, ensure you have:</p><ul><li>Python 3.10+</li><li>PostgreSQL 12+ (recommended: 14+)</li><li>Access to install PostgreSQL extensions</li></ul><h3 id="Quick-Installation"><a href="#Quick-Installation" class="headerlink" title="Quick Installation"></a>Quick Installation</h3><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># Install via pip</span></span><br><span class="line">pip install pgtuner_mcp</span><br><span class="line"></span><br><span class="line"><span class="comment"># Or using uv (faster)</span></span><br><span class="line">uv pip install pgtuner_mcp</span><br></pre></td></tr></table></figure><h3 id="PostgreSQL-Extensions-Setup"><a href="#PostgreSQL-Extensions-Setup" class="headerlink" title="PostgreSQL Extensions Setup"></a>PostgreSQL Extensions Setup</h3><p>pgtuner_mcp requires specific PostgreSQL extensions for full functionality:</p><h4 id="1-pg-stat-statements-Required"><a href="#1-pg-stat-statements-Required" class="headerlink" title="1. pg_stat_statements (Required)"></a>1. 
pg_stat_statements (Required)</h4><p>This extension tracks query execution statistics:</p><figure class="highlight sql"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">-- Add to postgresql.conf</span></span><br><span class="line">shared_preload_libraries <span class="operator">=</span> <span class="string">&#x27;pg_stat_statements&#x27;</span></span><br><span class="line">compute_query_id <span class="operator">=</span> <span class="keyword">on</span></span><br><span class="line">pg_stat_statements.max <span class="operator">=</span> <span class="number">10000</span></span><br><span class="line">pg_stat_statements.track <span class="operator">=</span> top</span><br><span class="line">pg_stat_statements.track_utility <span class="operator">=</span> <span class="keyword">on</span></span><br><span class="line"></span><br><span class="line"><span class="comment">-- Restart PostgreSQL, then:</span></span><br><span class="line"><span class="keyword">CREATE</span> EXTENSION IF <span class="keyword">NOT</span> <span class="keyword">EXISTS</span> pg_stat_statements;</span><br><span class="line"></span><br><span class="line"><span class="comment">-- Verify installation</span></span><br><span class="line"><span class="keyword">SELECT</span> <span class="operator">*</span> <span class="keyword">FROM</span> pg_stat_statements LIMIT <span class="number">1</span>;</span><br></pre></td></tr></table></figure><h4 id="2-HypoPG-Optional-Recommended"><a href="#2-HypoPG-Optional-Recommended" class="headerlink" title="2. HypoPG (Optional, Recommended)"></a>2. 
HypoPG (Optional, Recommended)</h4><p>Enables hypothetical index testing without disk usage:</p><figure class="highlight sql"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">CREATE</span> EXTENSION IF <span class="keyword">NOT</span> <span class="keyword">EXISTS</span> hypopg;</span><br><span class="line"></span><br><span class="line"><span class="comment">-- Verify installation</span></span><br><span class="line"><span class="keyword">SELECT</span> <span class="operator">*</span> <span class="keyword">FROM</span> hypopg_list_indexes();</span><br></pre></td></tr></table></figure><h4 id="3-pgstattuple-Optional-for-Bloat-Detection"><a href="#3-pgstattuple-Optional-for-Bloat-Detection" class="headerlink" title="3. pgstattuple (Optional, for Bloat Detection)"></a>3. pgstattuple (Optional, for Bloat Detection)</h4><p>Provides tuple-level statistics for bloat analysis:</p><figure class="highlight sql"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">CREATE</span> EXTENSION IF <span class="keyword">NOT</span> <span class="keyword">EXISTS</span> pgstattuple;</span><br><span class="line"></span><br><span class="line"><span class="comment">-- Verify installation</span></span><br><span class="line"><span class="keyword">SELECT</span> <span class="operator">*</span> <span class="keyword">FROM</span> pgstattuple(<span class="string">&#x27;pg_class&#x27;</span>) LIMIT <span class="number">1</span>;</span><br></pre></td></tr></table></figure><h3 id="User-Permissions"><a href="#User-Permissions" class="headerlink" title="User Permissions"></a>User Permissions</h3><p>Create a dedicated monitoring user with minimal 
required permissions:</p><figure class="highlight sql"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">-- Create monitoring user</span></span><br><span class="line"><span class="keyword">CREATE</span> <span class="keyword">USER</span> pgtuner_monitor <span class="keyword">WITH</span> PASSWORD <span class="string">&#x27;secure_password&#x27;</span>;</span><br><span class="line"></span><br><span class="line"><span class="comment">-- Grant connection and schema access</span></span><br><span class="line"><span class="keyword">GRANT</span> <span class="keyword">CONNECT</span> <span class="keyword">ON</span> DATABASE your_database <span class="keyword">TO</span> pgtuner_monitor;</span><br><span class="line"><span class="keyword">GRANT</span> USAGE <span class="keyword">ON</span> SCHEMA public <span class="keyword">TO</span> pgtuner_monitor;</span><br><span class="line"></span><br><span class="line"><span class="comment">-- Grant read access to user tables</span></span><br><span class="line"><span class="keyword">GRANT</span> <span class="keyword">SELECT</span> <span 
class="keyword">ON</span> <span class="keyword">ALL</span> TABLES <span class="keyword">IN</span> SCHEMA public <span class="keyword">TO</span> pgtuner_monitor;</span><br><span class="line"><span class="keyword">ALTER</span> <span class="keyword">DEFAULT</span> PRIVILEGES <span class="keyword">IN</span> SCHEMA public</span><br><span class="line">  <span class="keyword">GRANT</span> <span class="keyword">SELECT</span> <span class="keyword">ON</span> TABLES <span class="keyword">TO</span> pgtuner_monitor;</span><br><span class="line"></span><br><span class="line"><span class="comment">-- Grant system statistics access (PostgreSQL 10+)</span></span><br><span class="line"><span class="keyword">GRANT</span> pg_read_all_stats <span class="keyword">TO</span> pgtuner_monitor;</span><br><span class="line"></span><br><span class="line"><span class="comment">-- Grant access to pg_stat_statements</span></span><br><span class="line"><span class="keyword">GRANT</span> <span class="keyword">SELECT</span> <span class="keyword">ON</span> pg_stat_statements <span class="keyword">TO</span> pgtuner_monitor;</span><br><span class="line"><span class="keyword">GRANT</span> <span class="keyword">SELECT</span> <span class="keyword">ON</span> pg_stat_statements_info <span class="keyword">TO</span> pgtuner_monitor;</span><br><span class="line"></span><br><span class="line"><span class="comment">-- For bloat detection (PostgreSQL 14+)</span></span><br><span class="line"><span class="keyword">GRANT</span> pg_stat_scan_tables <span class="keyword">TO</span> pgtuner_monitor;</span><br><span class="line"></span><br><span class="line"><span class="comment">-- For HypoPG functions</span></span><br><span class="line"><span class="keyword">GRANT</span> <span class="keyword">SELECT</span> <span class="keyword">ON</span> hypopg_list_indexes <span class="keyword">TO</span> pgtuner_monitor;</span><br><span class="line"><span class="keyword">GRANT</span> <span class="keyword">EXECUTE</span> <span 
class="keyword">ON</span> <span class="keyword">FUNCTION</span> hypopg_create_index(text) <span class="keyword">TO</span> pgtuner_monitor;</span><br><span class="line"><span class="keyword">GRANT</span> <span class="keyword">EXECUTE</span> <span class="keyword">ON</span> <span class="keyword">FUNCTION</span> hypopg_drop_index(oid) <span class="keyword">TO</span> pgtuner_monitor;</span><br><span class="line"><span class="keyword">GRANT</span> <span class="keyword">EXECUTE</span> <span class="keyword">ON</span> <span class="keyword">FUNCTION</span> hypopg_reset() <span class="keyword">TO</span> pgtuner_monitor;</span><br></pre></td></tr></table></figure><h2 id="Configuration"><a href="#Configuration" class="headerlink" title="Configuration"></a>Configuration</h2><h3 id="Server-Modes"><a href="#Server-Modes" class="headerlink" title="Server Modes"></a>Server Modes</h3><p>pgtuner_mcp supports three deployment modes:</p><h4 id="1-Standard-MCP-Mode-stdio"><a href="#1-Standard-MCP-Mode-stdio" class="headerlink" title="1. Standard MCP Mode (stdio)"></a>1. 
Standard MCP Mode (stdio)</h4><p>Best for MCP clients like Claude Desktop or Cline:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># Default mode</span></span><br><span class="line">python -m pgtuner_mcp</span><br><span class="line"></span><br><span class="line"><span class="comment"># Explicit stdio mode</span></span><br><span class="line">python -m pgtuner_mcp --mode stdio</span><br></pre></td></tr></table></figure><p><strong>Configuration for Claude Desktop</strong> (<code>cline_mcp_settings.json</code>):</p><figure class="highlight json"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br></pre></td><td class="code"><pre><span class="line"><span class="punctuation">&#123;</span></span><br><span class="line">  <span class="attr">&quot;mcpServers&quot;</span><span class="punctuation">:</span> <span class="punctuation">&#123;</span></span><br><span class="line">    <span class="attr">&quot;pgtuner_mcp&quot;</span><span class="punctuation">:</span> <span class="punctuation">&#123;</span></span><br><span class="line">      <span class="attr">&quot;command&quot;</span><span class="punctuation">:</span> <span class="string">&quot;python&quot;</span><span class="punctuation">,</span></span><br><span class="line">      <span class="attr">&quot;args&quot;</span><span class="punctuation">:</span> <span class="punctuation">[</span><span 
class="string">&quot;-m&quot;</span><span class="punctuation">,</span> <span class="string">&quot;pgtuner_mcp&quot;</span><span class="punctuation">]</span><span class="punctuation">,</span></span><br><span class="line">      <span class="attr">&quot;env&quot;</span><span class="punctuation">:</span> <span class="punctuation">&#123;</span></span><br><span class="line">        <span class="attr">&quot;DATABASE_URI&quot;</span><span class="punctuation">:</span> <span class="string">&quot;postgresql://pgtuner_monitor:password@localhost:5432/mydb&quot;</span></span><br><span class="line">      <span class="punctuation">&#125;</span><span class="punctuation">,</span></span><br><span class="line">      <span class="attr">&quot;disabled&quot;</span><span class="punctuation">:</span> <span class="literal"><span class="keyword">false</span></span><span class="punctuation">,</span></span><br><span class="line">      <span class="attr">&quot;autoApprove&quot;</span><span class="punctuation">:</span> <span class="punctuation">[</span><span class="punctuation">]</span></span><br><span class="line">    <span class="punctuation">&#125;</span></span><br><span class="line">  <span class="punctuation">&#125;</span></span><br><span class="line"><span class="punctuation">&#125;</span></span><br></pre></td></tr></table></figure><h4 id="2-HTTP-SSE-Mode-Legacy-Web-Applications"><a href="#2-HTTP-SSE-Mode-Legacy-Web-Applications" class="headerlink" title="2. HTTP SSE Mode (Legacy Web Applications)"></a>2. 
HTTP SSE Mode (Legacy Web Applications)</h4><p>Server-Sent Events for web-based MCP communication:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># Start SSE server</span></span><br><span class="line">python -m pgtuner_mcp --mode sse --host 0.0.0.0 --port 8080</span><br><span class="line"></span><br><span class="line"><span class="comment"># With debug mode</span></span><br><span class="line">python -m pgtuner_mcp --mode sse --debug</span><br></pre></td></tr></table></figure><p><strong>Endpoints</strong>:</p><ul><li><code>GET /sse</code> - SSE connection endpoint</li><li><code>POST /messages</code> - Send messages&#x2F;requests</li></ul><h4 id="3-Streamable-HTTP-Mode-Recommended-for-Web"><a href="#3-Streamable-HTTP-Mode-Recommended-for-Web" class="headerlink" title="3. Streamable HTTP Mode (Recommended for Web)"></a>3. 
Streamable HTTP Mode (Recommended for Web)</h4><p>Modern MCP protocol with single <code>/mcp</code> endpoint:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># Stateful mode (maintains session state)</span></span><br><span class="line">python -m pgtuner_mcp --mode streamable-http</span><br><span class="line"></span><br><span class="line"><span class="comment"># Stateless mode (serverless-friendly)</span></span><br><span class="line">python -m pgtuner_mcp --mode streamable-http --stateless</span><br><span class="line"></span><br><span class="line"><span class="comment"># Custom host/port</span></span><br><span class="line">python -m pgtuner_mcp --mode streamable-http --host localhost --port 8080</span><br></pre></td></tr></table></figure><p><strong>Configuration</strong>:</p><figure class="highlight json"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line"><span class="punctuation">&#123;</span></span><br><span class="line">  <span class="attr">&quot;mcpServers&quot;</span><span class="punctuation">:</span> <span class="punctuation">&#123;</span></span><br><span class="line">    <span class="attr">&quot;pgtuner_mcp&quot;</span><span class="punctuation">:</span> <span class="punctuation">&#123;</span></span><br><span class="line">      <span class="attr">&quot;type&quot;</span><span class="punctuation">:</span> <span 
class="string">&quot;http&quot;</span><span class="punctuation">,</span></span><br><span class="line">      <span class="attr">&quot;url&quot;</span><span class="punctuation">:</span> <span class="string">&quot;http://localhost:8080/mcp&quot;</span></span><br><span class="line">    <span class="punctuation">&#125;</span></span><br><span class="line">  <span class="punctuation">&#125;</span></span><br><span class="line"><span class="punctuation">&#125;</span></span><br></pre></td></tr></table></figure><h3 id="Environment-Variables"><a href="#Environment-Variables" class="headerlink" title="Environment Variables"></a>Environment Variables</h3><table><thead><tr><th>Variable</th><th>Description</th><th>Required</th></tr></thead><tbody><tr><td><code>DATABASE_URI</code></td><td>PostgreSQL connection string</td><td>Yes</td></tr><tr><td><code>PGTUNER_EXCLUDE_USERIDS</code></td><td>Comma-separated user OIDs to exclude</td><td>No</td></tr></tbody></table><p><strong>Connection String Format</strong>:</p><figure class="highlight elixir"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line"><span class="symbol">postgresql:</span>//<span class="symbol">user:</span>password<span class="variable">@host</span><span class="symbol">:port/database</span></span><br></pre></td></tr></table></figure><h2 id="Available-Tools"><a href="#Available-Tools" class="headerlink" title="Available Tools"></a>Available Tools</h2><p>pgtuner_mcp provides 15+ specialized tools organized into categories:</p><h3 id="Performance-Analysis-Tools"><a href="#Performance-Analysis-Tools" class="headerlink" title="Performance Analysis Tools"></a>Performance Analysis Tools</h3><table><thead><tr><th>Tool</th><th>Description</th></tr></thead><tbody><tr><td><code>get_slow_queries</code></td><td>Retrieve slow queries with detailed statistics (time, calls, cache hit ratio)</td></tr><tr><td><code>analyze_query</code></td><td>Analyze execution plans with 
EXPLAIN ANALYZE and automated issue detection</td></tr><tr><td><code>get_table_stats</code></td><td>Get table statistics: size, row counts, dead tuples, access patterns</td></tr><tr><td><code>analyze_disk_io_patterns</code></td><td>Analyze I&#x2F;O patterns, identify hot tables and bottlenecks</td></tr></tbody></table><h3 id="Index-Tuning-Tools"><a href="#Index-Tuning-Tools" class="headerlink" title="Index Tuning Tools"></a>Index Tuning Tools</h3><table><thead><tr><th>Tool</th><th>Description</th></tr></thead><tbody><tr><td><code>get_index_recommendations</code></td><td>AI-powered index recommendations based on workload analysis</td></tr><tr><td><code>explain_with_indexes</code></td><td>Test hypothetical indexes without creating them</td></tr><tr><td><code>manage_hypothetical_indexes</code></td><td>Create, list, drop, or reset HypoPG hypothetical indexes</td></tr><tr><td><code>find_unused_indexes</code></td><td>Find unused and duplicate indexes for cleanup</td></tr></tbody></table><h3 id="Database-Health-Tools"><a href="#Database-Health-Tools" class="headerlink" title="Database Health Tools"></a>Database Health Tools</h3><table><thead><tr><th>Tool</th><th>Description</th></tr></thead><tbody><tr><td><code>check_database_health</code></td><td>Comprehensive health check with scoring</td></tr><tr><td><code>get_active_queries</code></td><td>Monitor active queries and find long-running transactions</td></tr><tr><td><code>analyze_wait_events</code></td><td>Identify I&#x2F;O, lock, or CPU bottlenecks</td></tr><tr><td><code>review_settings</code></td><td>Review PostgreSQL configuration with recommendations</td></tr></tbody></table><h3 id="Bloat-Detection-Tools"><a href="#Bloat-Detection-Tools" class="headerlink" title="Bloat Detection Tools"></a>Bloat Detection Tools</h3><table><thead><tr><th>Tool</th><th>Description</th></tr></thead><tbody><tr><td><code>analyze_table_bloat</code></td><td>Analyze table bloat using pgstattuple 
extension</td></tr><tr><td><code>analyze_index_bloat</code></td><td>Analyze B-tree index bloat (also supports GIN&#x2F;Hash)</td></tr><tr><td><code>get_bloat_summary</code></td><td>Comprehensive bloat overview with maintenance priorities</td></tr></tbody></table><h3 id="Vacuum-Monitoring-Tools"><a href="#Vacuum-Monitoring-Tools" class="headerlink" title="Vacuum Monitoring Tools"></a>Vacuum Monitoring Tools</h3><table><thead><tr><th>Tool</th><th>Description</th></tr></thead><tbody><tr><td><code>monitor_vacuum_progress</code></td><td>Track VACUUM, VACUUM FULL, and autovacuum operations</td></tr></tbody></table><h2 id="Docker-Deployment"><a href="#Docker-Deployment" class="headerlink" title="Docker Deployment"></a>Docker Deployment</h2><p>pgtuner_mcp is available as a Docker image for easy deployment:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># Pull the image</span></span><br><span class="line">docker pull dog830228/pgtuner_mcp</span><br><span class="line"></span><br><span class="line"><span class="comment"># Run in streamable HTTP mode (recommended)</span></span><br><span class="line">docker run -p 8080:8080 \</span><br><span class="line">  -e DATABASE_URI=postgresql://user:pass@host:5432/db \</span><br><span class="line">  dog830228/pgtuner_mcp --mode streamable-http</span><br><span class="line"></span><br><span class="line"><span 
class="comment"># Run in stateless mode (serverless-friendly)</span></span><br><span class="line">docker run -p 8080:8080 \</span><br><span class="line">  -e DATABASE_URI=postgresql://user:pass@host:5432/db \</span><br><span class="line">  dog830228/pgtuner_mcp --mode streamable-http --stateless</span><br><span class="line"></span><br><span class="line"><span class="comment"># Run in stdio mode for MCP clients</span></span><br><span class="line">docker run -i \</span><br><span class="line">  -e DATABASE_URI=postgresql://user:pass@host:5432/db \</span><br><span class="line">  dog830228/pgtuner_mcp --mode stdio</span><br></pre></td></tr></table></figure><h2 id="Real-World-Use-Cases"><a href="#Real-World-Use-Cases" class="headerlink" title="Real-World Use Cases"></a>Real-World Use Cases</h2><h3 id="1-Slow-Query-Investigation"><a href="#1-Slow-Query-Investigation" class="headerlink" title="1. Slow Query Investigation"></a>1. Slow Query Investigation</h3><p><strong>Scenario</strong>: Application experiencing slow response times.</p><p><strong>Workflow</strong>:</p><figure class="highlight n1ql"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line">User: &quot;Find the slowest queries in my database&quot;</span><br><span class="line">AI + pgtuner_mcp:</span><br><span class="line">  1. 
Calls get_slow_queries(limit=10, order_by=<span class="string">&quot;total_time&quot;</span>)</span><br><span class="line">  <span class="number">2.</span> Identifies top <span class="number">3</span> problematic queries</span><br><span class="line">  <span class="number">3.</span> Calls analyze_query() <span class="keyword">for</span> <span class="keyword">each</span></span><br><span class="line">  <span class="number">4.</span> Detects sequential scans <span class="keyword">and</span> <span class="literal">missing</span> indexes</span><br><span class="line">  <span class="number">5.</span> Calls get_index_recommendations()</span><br><span class="line">  <span class="number">6.</span> Provides <span class="keyword">CREATE</span> <span class="keyword">INDEX</span> statements <span class="keyword">with</span> impact estimates</span><br></pre></td></tr></table></figure><h3 id="2-Index-Optimization"><a href="#2-Index-Optimization" class="headerlink" title="2. Index Optimization"></a>2. Index Optimization</h3><p><strong>Scenario</strong>: Database growing, need to optimize indexes.</p><p><strong>Workflow</strong>:</p><figure class="highlight csharp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line">User: <span class="string">&quot;Help me optimize my database indexes&quot;</span></span><br><span class="line">AI + pgtuner_mcp:</span><br><span class="line">  <span class="number">1.</span> <span class="function">Calls <span class="title">find_unused_indexes</span>() to identify cleanup candidates</span></span><br><span class="line"><span class="function">  2. 
Calls <span class="title">get_index_recommendations</span>() <span class="keyword">for</span> <span class="keyword">new</span> index suggestions</span></span><br><span class="line"><span class="function">  3. Calls <span class="title">explain_with_indexes</span>() to test hypothetical indexes</span></span><br><span class="line"><span class="function">  4. Estimates storage savings <span class="keyword">and</span> performance improvements</span></span><br><span class="line"><span class="function">  5. Provides prioritized action plan</span></span><br></pre></td></tr></table></figure><h3 id="3-Health-Check-Before-Production-Deploy"><a href="#3-Health-Check-Before-Production-Deploy" class="headerlink" title="3. Health Check Before Production Deploy"></a>3. Health Check Before Production Deploy</h3><p><strong>Scenario</strong>: Pre-deployment database health validation.</p><p><strong>Workflow</strong>:</p><figure class="highlight routeros"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line">User: <span class="string">&quot;Is my database ready for production traffic?&quot;</span></span><br><span class="line">AI + pgtuner_mcp:</span><br><span class="line">  1. Calls check_database_health(<span class="attribute">verbose</span>=<span class="literal">True</span>)</span><br><span class="line">  2. Analyzes<span class="built_in"> connection pool </span>capacity</span><br><span class="line">  3. Checks cache hit ratios</span><br><span class="line">  4. Reviews vacuum <span class="keyword">and</span> autovacuum status</span><br><span class="line">  5. Analyzes wait events</span><br><span class="line">  6. 
Reviews configuration<span class="built_in"> settings</span></span><br><span class="line"><span class="built_in"></span>  7. Provides comprehensive<span class="built_in"> health </span>report with recommendations</span><br></pre></td></tr></table></figure><h3 id="4-Performance-Regression-Investigation"><a href="#4-Performance-Regression-Investigation" class="headerlink" title="4. Performance Regression Investigation"></a>4. Performance Regression Investigation</h3><p><strong>Scenario</strong>: Performance degraded after recent changes.</p><p><strong>Workflow</strong>:</p><figure class="highlight smali"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line">User: <span class="string">&quot;Why is my database slower than last week?&quot;</span></span><br><span class="line">AI + pgtuner_mcp:</span><br><span class="line">  1. Calls get_table_stats() to identify growth patterns</span><br><span class="line">  2. Calls analyze_disk_io_patterns() for I/O bottlenecks</span><br><span class="line">  3. Calls get_bloat_summary() to detect table/index bloat</span><br><span class="line">  4. Calls monitor_vacuum_progress() to<span class="built_in"> check </span>maintenance</span><br><span class="line">  5. Calls analyze_wait_events() to find resource contention</span><br><span class="line">  6. 
Identifies root causes<span class="built_in"> and </span>provides remediation steps</span><br></pre></td></tr></table></figure><h2 id="Performance-Considerations"><a href="#Performance-Considerations" class="headerlink" title="Performance Considerations"></a>Performance Considerations</h2><h3 id="Extension-Overhead"><a href="#Extension-Overhead" class="headerlink" title="Extension Overhead"></a>Extension Overhead</h3><table><thead><tr><th>Extension</th><th>Performance Impact</th><th>Recommendation</th></tr></thead><tbody><tr><td><code>pg_stat_statements</code></td><td>Low (~1-2%)</td><td>Always enable</td></tr><tr><td><code>track_io_timing</code></td><td>Low-Medium (~2-5%)</td><td>Enable in production, test first</td></tr><tr><td><code>track_functions = all</code></td><td>Low</td><td>Enable for function-heavy workloads</td></tr><tr><td><code>pgstattuple</code> functions</td><td>Varies by table size</td><td>Use <code>_approx</code> for large tables</td></tr><tr><td><code>HypoPG</code></td><td>Zero (in-memory only)</td><td>Safe for all environments</td></tr></tbody></table><p><strong>Tip</strong>: Use the <code>pg_test_timing</code> command-line utility that ships with PostgreSQL to measure timing overhead on your specific hardware:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">pg_test_timing</span><br></pre></td></tr></table></figure><h3 id="Best-Practices"><a href="#Best-Practices" class="headerlink" title="Best Practices"></a>Best Practices</h3><ol><li><strong>Use Approximate Analysis</strong>: For large tables (&gt;5GB), use <code>pgstattuple_approx</code> instead of <code>pgstattuple</code></li><li><strong>Filter System Users</strong>: Exclude monitoring&#x2F;replication users using <code>PGTUNER_EXCLUDE_USERIDS</code></li><li><strong>Limit Query History</strong>: Configure <code>pg_stat_statements.max</code> based on your workload</li><li><strong>Regular 
Maintenance</strong>: Use vacuum monitoring tools to ensure optimal performance</li><li><strong>Test Hypothetical Indexes</strong>: Always test with HypoPG before creating real indexes</li></ol><h2 id="Conclusion"><a href="#Conclusion" class="headerlink" title="Conclusion"></a>Conclusion</h2><p>pgtuner_mcp represents a paradigm shift in database performance optimization. By combining the power of AI assistants with deep PostgreSQL expertise through the Model Context Protocol, it makes advanced database tuning accessible to developers at all skill levels.</p><p>The tool doesn’t replace database administrators—it augments their capabilities and democratizes access to expert-level analysis. Whether you’re debugging a slow query, planning index strategies, or conducting pre-deployment health checks, pgtuner_mcp provides intelligent, context-aware assistance.</p><h3 id="Key-Takeaways"><a href="#Key-Takeaways" class="headerlink" title="Key Takeaways"></a>Key Takeaways</h3><ol><li><strong>AI-Native Performance Tuning</strong>: Natural language interface to complex database operations</li><li><strong>Risk-Free Testing</strong>: HypoPG enables index testing without disk usage</li><li><strong>Comprehensive Analysis</strong>: 15+ tools covering queries, indexes, health, bloat, vacuum, and I&#x2F;O</li><li><strong>Flexible Deployment</strong>: stdio, HTTP SSE, or streamable HTTP modes</li><li><strong>Production-Ready</strong>: Minimal overhead, proper permissions, comprehensive monitoring</li></ol><p>Whether you’re a seasoned DBA looking to leverage AI for faster workflows or a developer seeking to understand and optimize database performance, pgtuner_mcp offers a powerful, modern approach to PostgreSQL tuning.</p><h3 id="Resources"><a href="#Resources" class="headerlink" title="Resources"></a>Resources</h3><ul><li><a href="https://github.com/isdaniel/pgtuner_mcp">pgtuner_mcp GitHub Repository</a></li><li><a href="https://modelcontextprotocol.io/">Model Context Protocol 
Documentation</a></li><li><a href="https://www.postgresql.org/docs/current/performance-tips.html">PostgreSQL Performance Tuning Guide</a></li><li><a href="https://github.com/HypoPG/hypopg">HypoPG Extension</a></li><li><a href="https://www.postgresql.org/docs/current/pgstatstatements.html">pg_stat_statements Documentation</a></li></ul><p><strong>Author</strong>: Daniel Shih (石頭)<br /><strong>Permalink</strong>: <a href="https://isdaniel.github.io/pgtuner-mcp-ai-powered-postgresql-performance/">https://isdaniel.github.io/pgtuner-mcp-ai-powered-postgresql-performance/</a> <br /><strong>Copyright</strong>: Unless otherwise noted, all articles on this blog are licensed under <a href="https://creativecommons.org/licenses/by-nc-sa/3.0/tw/">CC BY-NC-SA 3.0 TW</a>. Please credit the source when reposting!</p>]]>
    </content>
    <id>https://isdaniel.github.io/pgtuner-mcp-ai-powered-postgresql-performance/</id>
    <link href="https://isdaniel.github.io/pgtuner-mcp-ai-powered-postgresql-performance/"/>
    <published>2025-01-13T10:30:00.000Z</published>
    <summary>An MCP server that delivers AI-driven PostgreSQL performance analysis, tuning advice, and health checks.</summary>
    <title>AI-Powered PostgreSQL Performance Tuning with MCP - Introducing pgtuner_mcp</title>
    <updated>2026-04-22T03:00:22.030Z</updated>
  </entry>
  <entry>
    <author>
      <name>Daniel Shih</name>
    </author>
    <category term="Rust" scheme="https://isdaniel.github.io/categories/Rust/"/>
    <category term="PostgreSQL" scheme="https://isdaniel.github.io/categories/Rust/PostgreSQL/"/>
    <category term="Database-Safety" scheme="https://isdaniel.github.io/categories/Rust/PostgreSQL/Database-Safety/"/>
    <category term="Rust" scheme="https://isdaniel.github.io/tags/Rust/"/>
    <category term="PostgreSQL" scheme="https://isdaniel.github.io/tags/PostgreSQL/"/>
    <category term="Database-Safety" scheme="https://isdaniel.github.io/tags/Database-Safety/"/>
    <category term="pgrx" scheme="https://isdaniel.github.io/tags/pgrx/"/>
    <category term="Extension" scheme="https://isdaniel.github.io/tags/Extension/"/>
    <content>
      <![CDATA[<h1 id="Building-Safe-PostgreSQL-Extensions-with-Rust-Introducing-pg-where-guard"><a href="#Building-Safe-PostgreSQL-Extensions-with-Rust-Introducing-pg-where-guard" class="headerlink" title="Building Safe PostgreSQL Extensions with Rust - Introducing pg_where_guard"></a>Building Safe PostgreSQL Extensions with Rust - Introducing pg_where_guard</h1><p>Database safety is a critical concern for any production system. Accidental data loss from <code>DELETE</code> or <code>UPDATE</code> statements without <code>WHERE</code> clauses can be catastrophic. Today, I’ll introduce <strong>pg_where_guard</strong>, a PostgreSQL extension built with Rust and the pgrx framework that prevents these dangerous operations.</p><h2 id="What-is-pg-where-guard"><a href="#What-is-pg-where-guard" class="headerlink" title="What is pg_where_guard?"></a>What is pg_where_guard?</h2><p><strong>pg_where_guard</strong> is a PostgreSQL extension that acts as a safety net for your database by intercepting and blocking potentially dangerous SQL operations:</p><ul><li><strong>DELETE Protection</strong>: Prevents <code>DELETE FROM table</code> without WHERE clause</li><li><strong>UPDATE Protection</strong>: Prevents <code>UPDATE table SET ...</code> without WHERE clause</li><li><strong>CTE Support</strong>: Recursively checks Common Table Expressions</li><li><strong>Hook Integration</strong>: Uses PostgreSQL’s <code>post_parse_analyze_hook</code> for query interception</li><li><strong>Memory Safe</strong>: Written in Rust with pgrx for safety and performance</li></ul><p><a href="https://github.com/isdaniel/pg_where_guard">GitHub Repository</a></p><h2 id="Why-Rust-for-PostgreSQL-Extensions"><a href="#Why-Rust-for-PostgreSQL-Extensions" class="headerlink" title="Why Rust for PostgreSQL Extensions?"></a>Why Rust for PostgreSQL Extensions?</h2><p>Building PostgreSQL extensions traditionally meant working with C and dealing with manual memory management, potential segmentation faults, and 
complex debugging. Rust changes this paradigm by offering:</p><h3 id="Performance"><a href="#Performance" class="headerlink" title="Performance"></a>Performance</h3><p>Zero-cost abstractions mean Rust code performs as well as equivalent C code while being much safer.</p><h3 id="pgrx-Framework"><a href="#pgrx-Framework" class="headerlink" title="pgrx Framework"></a>pgrx Framework</h3><p>The <a href="https://github.com/pgcentralfoundation/pgrx">pgrx framework</a> provides:</p><ul><li>Type-safe PostgreSQL API bindings</li><li>Automatic SQL schema generation</li><li>Comprehensive testing support</li><li>Easy development workflow</li></ul><h2 id="Technical-Architecture"><a href="#Technical-Architecture" class="headerlink" title="Technical Architecture"></a>Technical Architecture</h2><h3 id="Hook-Based-Implementation"><a href="#Hook-Based-Implementation" class="headerlink" title="Hook-Based Implementation"></a>Hook-Based Implementation</h3><p>pg_where_guard leverages PostgreSQL’s hook system to intercept queries after parsing:</p><figure class="highlight rust"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// Hook registration in _PG_init</span></span><br><span class="line">PREV_POST_PARSE_ANALYZE_HOOK = pg_sys::post_parse_analyze_hook;</span><br><span class="line">pg_sys::post_parse_analyze_hook = <span class="title function_ invoke__">Some</span>(delete_needs_where_check);</span><br></pre></td></tr></table></figure><h3 id="Query-Analysis-Engine"><a href="#Query-Analysis-Engine" class="headerlink" title="Query Analysis Engine"></a>Query Analysis Engine</h3><p>The extension examines the parsed query tree to detect dangerous operations:</p><figure class="highlight rust"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span 
class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// Query checking logic</span></span><br><span class="line"><span class="keyword">match</span> query.commandType &#123;</span><br><span class="line">    pg_sys::CmdType::CMD_DELETE =&gt; &#123;</span><br><span class="line">        <span class="keyword">if</span> !query.jointree.<span class="title function_ invoke__">is_null</span>() &#123;</span><br><span class="line">            <span class="keyword">let</span> <span class="variable">jointree</span> = &amp;*query.jointree;</span><br><span class="line">            <span class="keyword">if</span> jointree.quals.<span class="title function_ invoke__">is_null</span>() &#123;</span><br><span class="line">                error!(<span class="string">&quot;DELETE requires a WHERE clause&quot;</span>);</span><br><span class="line">            &#125;</span><br><span class="line">        &#125;</span><br><span class="line">    &#125;</span><br><span class="line">    pg_sys::CmdType::CMD_UPDATE =&gt; &#123;</span><br><span class="line">        <span class="keyword">if</span> !query.jointree.<span class="title function_ invoke__">is_null</span>() &#123;</span><br><span class="line">            <span class="keyword">let</span> <span class="variable">jointree</span> = &amp;*query.jointree;</span><br><span class="line">            <span class="keyword">if</span> 
jointree.quals.<span class="title function_ invoke__">is_null</span>() &#123;</span><br><span class="line">                error!(<span class="string">&quot;UPDATE requires a WHERE clause&quot;</span>);</span><br><span class="line">            &#125;</span><br><span class="line">        &#125;</span><br><span class="line">    &#125;</span><br><span class="line">    _ =&gt; &#123;</span><br><span class="line">        <span class="comment">// Other command types are allowed</span></span><br><span class="line">    &#125;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><h3 id="Key-Components"><a href="#Key-Components" class="headerlink" title="Key Components"></a>Key Components</h3><ol><li><p><strong>Hook Function</strong> (<code>delete_needs_where_check</code>):</p><ul><li>Intercepts queries via <code>post_parse_analyze_hook</code></li><li>Checks command types (DELETE&#x2F;UPDATE)</li><li>Validates presence of WHERE clauses</li><li>Handles Common Table Expressions recursively</li></ul></li><li><p><strong>Query Analysis</strong> (<code>check_query_for_where_clause</code>):</p><ul><li>Examines the query’s <code>jointree</code> structure</li><li>Looks for <code>quals</code> (qualification&#x2F;WHERE conditions)</li><li>Throws errors for unqualified modifications</li></ul></li><li><p><strong>Extension Functions</strong>:</p><ul><li><code>pg_where_guard_is_enabled()</code>: Check if protection is active</li><li><code>pg_where_guard_enable()</code>: Enable protection</li></ul></li></ol><h2 id="Installation-and-Setup"><a href="#Installation-and-Setup" class="headerlink" title="Installation and Setup"></a>Installation and Setup</h2><h3 id="Prerequisites"><a href="#Prerequisites" class="headerlink" title="Prerequisites"></a>Prerequisites</h3><p>Before installing pg_where_guard, ensure you have:</p><ul><li>Rust toolchain (1.70+)</li><li>pgrx framework</li><li>PostgreSQL development headers</li><li>cargo-pgrx</li></ul><h3 id="Build-and-Install"><a 
href="#Build-and-Install" class="headerlink" title="Build and Install"></a>Build and Install</h3><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># Clone the repository</span></span><br><span class="line">git <span class="built_in">clone</span> https://github.com/isdaniel/pg_where_guard.git</span><br><span class="line"><span class="built_in">cd</span> pg_where_guard</span><br><span class="line"></span><br><span class="line"><span class="comment"># Install cargo-pgrx if not already installed</span></span><br><span class="line">cargo install cargo-pgrx</span><br><span class="line"></span><br><span class="line"><span class="comment"># Initialize pgrx for your PostgreSQL version</span></span><br><span class="line">cargo pgrx init</span><br><span class="line"></span><br><span class="line"><span class="comment"># Install the extension</span></span><br><span class="line">cargo pgrx install</span><br></pre></td></tr></table></figure><h3 id="Database-Setup"><a href="#Database-Setup" class="headerlink" title="Database Setup"></a>Database Setup</h3><figure class="highlight sql"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">-- Create the extension</span></span><br><span class="line"><span class="keyword">CREATE</span> EXTENSION pg_where_guard;</span><br><span class="line"></span><br><span 
class="line"><span class="comment">-- Verify installation</span></span><br><span class="line"><span class="keyword">SELECT</span> pg_where_guard_is_enabled();  <span class="comment">-- Returns: true</span></span><br></pre></td></tr></table></figure><h2 id="Usage-Examples"><a href="#Usage-Examples" class="headerlink" title="Usage Examples"></a>Usage Examples</h2><h3 id="Safe-Operations-Allowed"><a href="#Safe-Operations-Allowed" class="headerlink" title="Safe Operations (Allowed)"></a>Safe Operations (Allowed)</h3><figure class="highlight sql"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">-- Create a test table</span></span><br><span class="line"><span class="keyword">CREATE TABLE</span> employees (</span><br><span class="line">    id SERIAL <span class="keyword">PRIMARY KEY</span>,</span><br><span class="line">    name TEXT <span class="keyword">NOT NULL</span>,</span><br><span class="line">    department TEXT,</span><br><span class="line">    salary <span class="type">INTEGER</span></span><br><span class="line">);</span><br><span class="line"></span><br><span class="line"><span class="comment">-- Insert test data</span></span><br><span class="line"><span class="keyword">INSERT INTO</span> employees (name, department, salary) <span class="keyword">VALUES</span></span><br><span class="line">    (<span class="string">&#x27;Alice Johnson&#x27;</span>, <span 
class="string">&#x27;Engineering&#x27;</span>, <span class="number">75000</span>),</span><br><span class="line">    (<span class="string">&#x27;Bob Smith&#x27;</span>, <span class="string">&#x27;Marketing&#x27;</span>, <span class="number">65000</span>),</span><br><span class="line">    (<span class="string">&#x27;Charlie Brown&#x27;</span>, <span class="string">&#x27;Engineering&#x27;</span>, <span class="number">80000</span>);</span><br><span class="line"></span><br><span class="line"><span class="comment">-- These operations work fine (have WHERE clauses)</span></span><br><span class="line"><span class="keyword">UPDATE</span> employees <span class="keyword">SET</span> salary <span class="operator">=</span> <span class="number">78000</span> <span class="keyword">WHERE</span> name <span class="operator">=</span> <span class="string">&#x27;Alice Johnson&#x27;</span>;</span><br><span class="line"><span class="keyword">DELETE</span> <span class="keyword">FROM</span> employees <span class="keyword">WHERE</span> department <span class="operator">=</span> <span class="string">&#x27;Marketing&#x27;</span>;</span><br></pre></td></tr></table></figure><h3 id="Dangerous-Operations-Blocked"><a href="#Dangerous-Operations-Blocked" class="headerlink" title="Dangerous Operations (Blocked)"></a>Dangerous Operations (Blocked)</h3><figure class="highlight sql"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">-- These commands will FAIL due to pg_where_guard protection:</span></span><br><span class="line"></span><br><span class="line"><span class="comment">-- This will fail: UPDATE without WHERE clause</span></span><br><span class="line"><span 
class="keyword">UPDATE</span> employees <span class="keyword">SET</span> salary <span class="operator">=</span> <span class="number">100000</span>;</span><br><span class="line"><span class="comment">-- ERROR: UPDATE requires a WHERE clause</span></span><br><span class="line"></span><br><span class="line"><span class="comment">-- This will fail: DELETE without WHERE clause</span></span><br><span class="line"><span class="keyword">DELETE</span> <span class="keyword">FROM</span> employees;</span><br><span class="line"><span class="comment">-- ERROR: DELETE requires a WHERE clause</span></span><br></pre></td></tr></table></figure><h3 id="Common-Table-Expression-Support"><a href="#Common-Table-Expression-Support" class="headerlink" title="Common Table Expression Support"></a>Common Table Expression Support</h3><p>The extension also protects CTEs:</p><figure class="highlight sql"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">-- This would also be blocked</span></span><br><span class="line"><span class="keyword">WITH</span> department_update <span class="keyword">AS</span> (</span><br><span class="line">    <span class="keyword">UPDATE</span> employees <span class="keyword">SET</span> salary <span class="operator">=</span> salary <span class="operator">*</span> <span class="number">1.1</span>  <span class="comment">-- No WHERE clause!</span></span><br><span class="line">    RETURNING <span class="operator">*</span></span><br><span class="line">)</span><br><span class="line"><span class="keyword">SELECT</span> <span class="operator">*</span> <span class="keyword">FROM</span> department_update;</span><br></pre></td></tr></table></figure><h2 id="Performance-Considerations"><a href="#Performance-Considerations" class="headerlink" 
title="Performance Considerations"></a>Performance Considerations</h2><h3 id="Minimal-Overhead"><a href="#Minimal-Overhead" class="headerlink" title="Minimal Overhead"></a>Minimal Overhead</h3><p>pg_where_guard adds minimal performance overhead because it:</p><ul><li>Only analyzes DELETE and UPDATE statements</li><li>Performs lightweight checks on the parsed query tree</li><li>Uses efficient Rust code with zero-cost abstractions</li><li>Operates at parse time, not execution time</li></ul><h3 id="Production-Readiness"><a href="#Production-Readiness" class="headerlink" title="Production Readiness"></a>Production Readiness</h3><p>The extension is designed for production use with:</p><ul><li>Comprehensive error handling</li><li>Memory-safe implementation</li><li>Minimal system resource usage</li><li>Support for PostgreSQL 12-16</li></ul><h2 id="Development-and-Testing"><a href="#Development-and-Testing" class="headerlink" title="Development and Testing"></a>Development and Testing</h2><h3 id="Project-Structure"><a href="#Project-Structure" class="headerlink" title="Project Structure"></a>Project Structure</h3><figure class="highlight pgsql"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line">pg_where_guard/</span><br><span class="line">├── Cargo.toml              # Rust project <span class="keyword">configuration</span></span><br><span class="line">├── pg_where_guard.control  # PostgreSQL <span class="keyword">extension</span> control file</span><br><span class="line">├── src/</span><br><span class="line">│   ├── lib.rs             # Main <span class="keyword">extension</span> code</span><br><span class="line">│   └── bin/</span><br><span class="line">│       
└── pgrx_embed.rs  # pgrx <span class="keyword">schema</span> generation</span><br><span class="line">├── <span class="keyword">sql</span>/                   # <span class="keyword">SQL</span> test scripts</span><br><span class="line">└── tests/                 # Test files</span><br></pre></td></tr></table></figure><h3 id="Running-Tests"><a href="#Running-Tests" class="headerlink" title="Running Tests"></a>Running Tests</h3><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># Run the test suite</span></span><br><span class="line">cargo pgrx <span class="built_in">test</span></span><br><span class="line"></span><br><span class="line"><span class="comment"># Test with specific PostgreSQL version</span></span><br><span class="line">cargo pgrx <span class="built_in">test</span> pg15</span><br></pre></td></tr></table></figure><h3 id="Development-Workflow"><a href="#Development-Workflow" class="headerlink" title="Development Workflow"></a>Development Workflow</h3><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># Start a development PostgreSQL instance</span></span><br><span class="line">cargo pgrx run</span><br><span class="line"></span><br><span class="line"><span class="comment"># Install the extension in development</span></span><br><span class="line">cargo pgrx install --debug</span><br></pre></td></tr></table></figure><h2 id="Benefits-of-the-Rust-pgrx-Approach"><a href="#Benefits-of-the-Rust-pgrx-Approach" class="headerlink" title="Benefits of the Rust + pgrx Approach"></a>Benefits 
of the Rust + pgrx Approach</h2><h3 id="Developer-Experience"><a href="#Developer-Experience" class="headerlink" title="Developer Experience"></a>Developer Experience</h3><ol><li><strong>Type Safety</strong>: Compile-time guarantees prevent runtime errors</li><li><strong>Modern Tooling</strong>: Cargo ecosystem and excellent IDE support</li><li><strong>Testing</strong>: Built-in unit testing and integration testing</li><li><strong>Documentation</strong>: Automatic documentation generation</li></ol><h3 id="Safety-Guarantees"><a href="#Safety-Guarantees" class="headerlink" title="Safety Guarantees"></a>Safety Guarantees</h3><ol><li><strong>Memory Safety</strong>: No buffer overflows or memory leaks</li><li><strong>Thread Safety</strong>: Rust’s ownership model prevents data races</li><li><strong>Error Handling</strong>: Explicit error handling with Result types</li><li><strong>Null Safety</strong>: No null pointer dereferences</li></ol><h3 id="Performance-Benefits"><a href="#Performance-Benefits" class="headerlink" title="Performance Benefits"></a>Performance Benefits</h3><ol><li><strong>Zero-Cost Abstractions</strong>: High-level code without runtime overhead</li><li><strong>Optimized Compilation</strong>: LLVM backend generates efficient machine code</li><li><strong>Minimal Dependencies</strong>: Small runtime footprint</li><li><strong>Efficient Resource Usage</strong>: Predictable memory usage patterns</li></ol><h2 id="Comparison-with-Traditional-C-Extensions"><a href="#Comparison-with-Traditional-C-Extensions" class="headerlink" title="Comparison with Traditional C Extensions"></a>Comparison with Traditional C Extensions</h2><table><thead><tr><th>Aspect</th><th>C Extension</th><th>Rust + pgrx Extension</th></tr></thead><tbody><tr><td>Memory Safety</td><td>Manual management</td><td>Automatic, compile-time guaranteed</td></tr><tr><td>Development Speed</td><td>Slow, error-prone</td><td>Fast, safe development</td></tr><tr><td>Debugging</td><td>GDB, 
complex</td><td>Standard Rust tooling</td></tr><tr><td>Testing</td><td>Manual, limited</td><td>Built-in unit&#x2F;integration tests</td></tr><tr><td>Maintenance</td><td>High overhead</td><td>Low overhead</td></tr><tr><td>Performance</td><td>Optimal</td><td>Near-optimal with safety</td></tr></tbody></table><h3 id="Integration-Possibilities"><a href="#Integration-Possibilities" class="headerlink" title="Integration Possibilities"></a>Integration Possibilities</h3><p>pg_where_guard can be integrated with:</p><ul><li><strong>Database Migration Tools</strong>: Validate migrations before execution</li><li><strong>ORM Frameworks</strong>: Add safety checks to generated queries</li><li><strong>Monitoring Systems</strong>: Alert on attempted dangerous operations</li><li><strong>Audit Systems</strong>: Log blocked operations for compliance</li></ul><h2 id="Conclusion"><a href="#Conclusion" class="headerlink" title="Conclusion"></a>Conclusion</h2><p>pg_where_guard demonstrates the power of modern Rust tooling for PostgreSQL extension development. By combining Rust’s safety guarantees with pgrx’s ease of use, we can build robust database tools that protect against common but dangerous operations.</p><p>The extension serves as both a practical safety tool and an example of how Rust is revolutionizing systems programming beyond traditional applications. 
As the PostgreSQL ecosystem continues to evolve, Rust-based extensions like pg_where_guard pave the way for safer, more maintainable database tools.</p><h3 id="Key-Takeaways"><a href="#Key-Takeaways" class="headerlink" title="Key Takeaways"></a>Key Takeaways</h3><ol><li><strong>Safety First</strong>: Rust eliminates entire classes of bugs that plague C extensions</li><li><strong>Developer Productivity</strong>: pgrx makes PostgreSQL extension development accessible</li><li><strong>Performance</strong>: Memory safety doesn’t require sacrificing performance</li><li><strong>Future-Proof</strong>: Rust’s growing ecosystem ensures long-term maintainability</li></ol><p>Whether you’re looking to protect your database from accidental data loss or explore modern PostgreSQL extension development, pg_where_guard offers a compelling example of what’s possible with Rust and pgrx.</p><h3 id="Resources"><a href="#Resources" class="headerlink" title="Resources"></a>Resources</h3><ul><li><a href="https://github.com/isdaniel/pg_where_guard">pg_where_guard GitHub Repository</a></li><li><a href="https://github.com/pgcentralfoundation/pgrx">pgrx Framework Documentation</a></li><li><a href="https://www.postgresql.org/docs/current/extend.html">PostgreSQL Extension Development Guide</a></li></ul><p><strong>此文作者</strong>：Daniel Shih(石頭)<br /><strong>此文地址</strong>： <a href="https://isdaniel.github.io/pg-where-guard-rust-postgresql-extension/">https://isdaniel.github.io/pg-where-guard-rust-postgresql-extension/</a> <br /><strong>版權聲明</strong>：本博客所有文章除特別聲明外，均採用 <a href="https://creativecommons.org/licenses/by-nc-sa/3.0/tw/">CC BY-NC-SA 3.0 TW</a> 許可協議。轉載請註明出處！</p>]]>
    </content>
    <id>https://isdaniel.github.io/pg-where-guard-rust-postgresql-extension/</id>
    <link href="https://isdaniel.github.io/pg-where-guard-rust-postgresql-extension/"/>
    <published>2025-01-03T07:20:00.000Z</published>
    <summary>A Rust/pgrx PostgreSQL extension that blocks unsafe UPDATE/DELETE statements without WHERE clauses.</summary>
    <title>Building Safe PostgreSQL Extensions with Rust - Introducing pg_where_guard</title>
    <updated>2026-04-22T03:00:22.029Z</updated>
  </entry>
  <entry>
    <author>
      <name>Daniel Shih</name>
    </author>
    <category term="ssl" scheme="https://isdaniel.github.io/categories/ssl/"/>
    <category term="tls" scheme="https://isdaniel.github.io/categories/ssl/tls/"/>
    <category term="ssl" scheme="https://isdaniel.github.io/tags/ssl/"/>
    <category term="tls" scheme="https://isdaniel.github.io/tags/tls/"/>
    <content>
<![CDATA[<h2 id="Foreword"><a href="#Foreword" class="headerlink" title="Foreword"></a>Foreword</h2><p>TLS (Transport Layer Security) is a cryptographic protocol that provides secure communication over a network, commonly used to secure HTTP traffic (i.e., HTTPS). Here’s a high-level overview of the TLS workflow, which includes handshake and data transfer phases.</p><p>After the TCP handshake, a TLS handshake is performed if the client requests it.</p><p>The image below shows a TLS 1.2 workflow captured from my PostgreSQL server experiment; the red frame marks the TCP three-way handshake, and the yellow frame marks the TLS handshake.</p><p><img src="/images/tls-ssl/2024-11-11_15h17_11.png" alt="img"></p><p>First, the client sends a request asking for an SSL&#x2F;TLS connection (sslmode); if the server supports it, it replies with ‘S’.</p><p><img src="/images/tls-ssl/2024-11-11_17h48_32.png" alt="img"></p><p>The TLS handshake then proceeds through the following steps:</p><p>ClientHello → ServerHello → Server Certificate → ServerHelloDone → Client Key Exchange</p><h3 id="TLS-work-flow"><a href="#TLS-work-flow" class="headerlink" title="TLS work-flow"></a>TLS work-flow</h3><ol><li>Client Hello: The client sends:<ul><li>The TLS version the client supports.</li><li>A list of cipher suites (encryption algorithms) it supports.</li><li>A random number (used later for generating encryption keys).</li></ul></li><li>Server Hello: The server replies with:<ul><li>The TLS version and cipher suite that it chose based on the client’s list.</li><li>A random number (used later for generating encryption keys).</li></ul></li><li>Server Certificate and Optional Server Key Exchange: The server sends its digital certificate to the client to prove its identity. This certificate includes the server’s public key and is typically signed by a trusted Certificate Authority (CA). The client validates the certificate by:<ul><li>Checking its expiration date.</li><li>Ensuring that the certificate is signed by a CA trusted by the client’s operating system or browser.</li></ul></li><li>Server Hello Done: The server sends a ServerHelloDone message, indicating it has finished its part of the handshake.</li><li>Client Key Exchange:<ul><li>The client generates a pre-master secret (a random value used for encryption key generation) and encrypts it with the server’s public key (from the server’s certificate).</li><li>This encrypted pre-master secret is sent to the server. Only the server, with its private key, can decrypt this secret.<br><img src="/images/tls-ssl/2024-11-11_18h53_01.png" alt="img"></li></ul></li><li>Generating Session Keys: Both the client and server now have enough information (random numbers and the pre-master secret) to generate session keys, which are used for:<ul><li>Symmetric encryption of the data sent over the connection.</li><li>Message integrity to ensure data is not tampered with.</li></ul></li></ol><p>I have annotated these steps in the snapshot of the workflow below.</p><p><img src="/images/tls-ssl/2024-11-11_18h47_35.png" alt="img"></p><h2 id="Certificate-File"><a href="#Certificate-File" class="headerlink" title="Certificate File"></a>Certificate File</h2><p>A certificate file plays a key role in the TLS workflow: during the Client Key Exchange, the client encrypts data with the server’s public key, and the server decrypts it with its private key.</p><p>A certificate file contains several important pieces of information:</p><ul><li>Public Key: The certificate includes the public key of the entity being certified (e.g., a server or individual). 
This key can be used to encrypt data or verify signatures.</li><li>Subject Information: Identifying information about the certificate holder, such as:<ul><li>Common Name (CN) – often the domain name for websites.</li><li>Organization (O), Organizational Unit (OU).</li><li>Country (C).</li></ul></li><li>Issuer Information: Identifying information about the Certificate Authority (CA) that issued the certificate.</li><li>Validity Period: The start and end dates defining the period during which the certificate is valid.</li><li>Digital Signature: A cryptographic signature from the CA, which verifies the certificate’s authenticity.</li><li>Certificate Serial Number: A unique identifier for the certificate issued by the CA.</li></ul><h3 id="Certificate-Root-Certificate-file"><a href="#Certificate-Root-Certificate-file" class="headerlink" title="Certificate &amp; Root Certificate file"></a>Certificate &amp; Root Certificate file</h3><ul><li>Root Certificate file: A root certificate is the top-level certificate in a certificate chain and serves as the foundation of trust for all other certificates within the hierarchy.<ul><li>Lifespan: Root certificates are long-lived; they are used to issue intermediate certificates, which in turn issue end-entity certificates. This hierarchy enhances security by limiting direct exposure of the root certificate.</li></ul></li><li>Certificate file: This type of certificate is used to verify the identity of an entity, like a website, individual, or organization, and is typically issued by an intermediate certificate, not directly by the root.<ul><li>Lifespan: They usually have shorter lifespans (often 1-2 years) to enhance security through periodic renewal and revocation if compromised.</li></ul></li></ul><p>A certificate is validated against the root certificate, and operating systems install the trusted root certificates when the machine is set up.</p><blockquote><p>The root certificate is the trust anchor, while end-entity certificates rely on this anchor for trust. This hierarchy allows a scalable, secure infrastructure where trust flows from the root certificate down to the individual end-entity certificates.</p></blockquote><p><strong>此文作者</strong>：Daniel Shih(石頭)<br /><strong>此文地址</strong>： <a href="https://isdaniel.github.io/tls-ssl/">https://isdaniel.github.io/tls-ssl/</a> <br /><strong>版權聲明</strong>：本博客所有文章除特別聲明外，均採用 <a href="https://creativecommons.org/licenses/by-nc-sa/3.0/tw/">CC BY-NC-SA 3.0 TW</a> 許可協議。轉載請註明出處！</p>]]>
    </content>
    <id>https://isdaniel.github.io/tls-ssl/</id>
    <link href="https://isdaniel.github.io/tls-ssl/"/>
    <published>2024-11-11T22:30:11.000Z</published>
    <summary>TLS (Transport Layer Security) is a cryptographic protocol that provides secure communication over a network, commonly used to secure HTTP traffic (i.e.,</summary>
    <title>Understand TLS/SSL networking flow</title>
    <updated>2026-04-22T03:00:22.032Z</updated>
  </entry>
  <entry>
    <author>
      <name>Daniel Shih</name>
    </author>
    <category term="C#" scheme="https://isdaniel.github.io/categories/C/"/>
    <category term="DesignPattern" scheme="https://isdaniel.github.io/categories/C/DesignPattern/"/>
    <category term="Design-Pattern" scheme="https://isdaniel.github.io/tags/Design-Pattern/"/>
    <category term="C" scheme="https://isdaniel.github.io/tags/C/"/>
    <content>
      <![CDATA[<h2 id="Foreword"><a href="#Foreword" class="headerlink" title="Foreword"></a>Foreword</h2><p>In PostgreSQL, there isn’t a native foreach loop construct in C, because C itself doesn’t have a foreach loop as you might find in higher-level languages like Python or PHP. However, PostgreSQL often implements loop iterations over elements using <strong>Macros</strong> that simplify the handling of data structures, such as linked lists, which are commonly used within its codebase.</p><h2 id="Common-Loop-Macros-in-PostgreSQL"><a href="#Common-Loop-Macros-in-PostgreSQL" class="headerlink" title="Common Loop Macros in PostgreSQL"></a>Common Loop Macros in PostgreSQL</h2><ol><li><p><code>lfirst(lc)</code>:</p><ul><li>This macro retrieves the data stored in a <code>ListCell</code>. The <code>ListCell</code> structure typically contains a union that can hold various types of pointers (like <code>void*</code>, <code>int</code>, etc.). The <code>ptr_value</code> is a generic pointer that can point to any node or structure, and <code>lfirst</code> simply casts it back from the <code>void *</code>.</li></ul></li><li><p><code>lfirst_node(type, lc)</code>:</p><ul><li>This macro is used when the list elements are known to be of a specific node type, which is common in the parser and planner where lists often contain specific types of nodes (e.g., expression or plan nodes). <code>lfirst_node</code> uses <code>castNode</code> to cast the pointer retrieved by <code>lfirst</code> to the specified type, ensuring type safety and readability in the code.</li></ul></li><li><p><code>castNode(_type_, nodeptr)</code>:</p><ul><li>A simple cast to the specified type <code>_type_</code>. 
It enhances readability and ensures that the casting is explicit in the code, which is crucial for understanding that a type conversion is taking place, particularly when navigating complex data structures common in PostgreSQL’s internals.</li></ul></li></ol><figure class="highlight c"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">#<span class="keyword">define</span> lfirst(lc) ((lc)-&gt;ptr_value)</span></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> lfirst_node(type,lc) castNode(type, lfirst(lc))</span></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> castNode(_type_, nodeptr) ((_type_ *) (nodeptr))</span></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> true 1</span></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> false 0</span></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> foreach(cell, lst) \</span></span><br><span class="line"><span class="meta">for (ForEachState cell##__state = &#123;(lst), 0&#125;; \</span></span><br><span class="line"><span class="meta"> (cell##__state.l != NIL &amp;&amp; \</span></span><br><span class="line"><span class="meta">  cell##__state.i &lt; cell##__state.l-&gt;length) ? \</span></span><br><span class="line"><span class="meta"> (cell = &amp;cell##__state.l-&gt;elements[cell##__state.i], true) : \</span></span><br><span class="line"><span class="meta"> (cell = NULL, false); \</span></span><br><span class="line"><span class="meta"> cell##__state.i++)</span></span><br><span class="line"></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> NIL ((List *) NULL)</span></span><br></pre></td></tr></table></figure><p>The <code>ListCell</code> union consists of a single member, <code>ptr_value</code>, which is a generic pointer <code>(void *)</code>.</p><p>This pointer can hold a reference to any type of data, allowing for flexibility in what kind of data the list can contain.<br>This structure is useful for managing lists of generic data types.</p><p>The List structure represents a dynamic list in PostgreSQL.<br>It contains:</p><ul><li><code>length</code>: An integer that specifies the current number of elements in the list.</li><li><code>elements</code>: A pointer to an array of <code>ListCell</code> elements, which holds the actual data in the list. This array can be re-allocated as the list grows or shrinks, allowing for dynamic resizing.</li><li>The comment suggests that sometimes <code>ListCell</code> elements may be allocated directly alongside the List structure itself. 
This can optimize memory usage and improve performance.</li></ul><figure class="highlight c"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">typedef</span> <span class="class"><span class="keyword">union</span> <span class="title">ListCell</span></span></span><br><span class="line"><span class="class">&#123;</span></span><br><span class="line"><span class="type">void</span>   *ptr_value;</span><br><span class="line">&#125; ListCell;</span><br><span class="line"></span><br><span class="line"><span class="keyword">typedef</span> <span class="class"><span class="keyword">struct</span> <span class="title">List</span></span></span><br><span class="line"><span class="class">&#123;</span></span><br><span class="line"><span class="type">int</span> length; <span class="comment">/* number of elements currently present */</span></span><br><span class="line">ListCell   *elements; <span class="comment">/* re-allocatable array of cells */</span></span><br><span class="line">&#125; List;</span><br></pre></td></tr></table></figure><p>The ForEachState structure is used to manage state while iterating over a list in PostgreSQL.</p><ul><li><code>l</code>: A constant pointer to the list being iterated. The list is not meant to be modified during iteration.</li><li><code>i</code>: An integer tracking the current index of the element in the list being processed. This helps keep track of the iteration progress.</li></ul><p>These structures work together to handle lists of data in PostgreSQL, providing the flexibility to work with generic data types and iterate over lists efficiently and safely. The List structure allows for dynamic lists, while <code>ForEachState</code> helps manage the state of iteration over the list.</p><figure class="highlight c"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">typedef</span> <span class="class"><span class="keyword">struct</span> <span class="title">ForEachState</span></span></span><br><span class="line"><span class="class">&#123;</span></span><br><span class="line"><span class="type">const</span> List *l; <span class="comment">/* list we&#x27;re looping through */</span></span><br><span class="line"><span class="type">int</span> i; <span class="comment">/* current element index */</span></span><br><span class="line">&#125; ForEachState;</span><br><span class="line"></span><br></pre></td></tr></table></figure><p>Here is some sample code; we can easily use <code>foreach</code> to iterate over <code>List*</code> objects.</p><figure class="highlight c"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br></pre></td><td class="code"><pre><span class="line"><span class="type">int</span> <span class="title function_">main</span><span class="params">(<span class="type">void</span>)</span> &#123;</span><br><span class="line">    srand( time(<span class="literal">NULL</span>) );</span><br><span class="line">    ListCell   *item;</span><br><span class="line">    List *<span class="built_in">list</span> = InitialStudents();</span><br><span class="line"></span><br><span class="line">    foreach(item, <span class="built_in">list</span>) &#123;</span><br><span class="line">        Student *stu = lfirst_node(Student, item);</span><br><span class="line">        <span class="built_in">printf</span>(<span class="string">&quot;student name: %s, age: %d\n&quot;</span>, stu-&gt;name, stu-&gt;age);</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    <span class="comment">// Free allocated memory</span></span><br><span class="line">    <span class="built_in">free</span>(<span class="built_in">list</span>-&gt;elements-&gt;ptr_value);</span><br><span class="line">    <span class="built_in">free</span>(<span class="built_in">list</span>-&gt;elements);</span><br><span class="line">    <span class="built_in">free</span>(<span class="built_in">list</span>);</span><br><span class="line">    <span class="keyword">return</span> <span class="number">0</span>;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>Example code: <a href="https://github.com/isdaniel/BlogSample/tree/master/src/C_Sample/foreach_loop">foreach loop</a></p><p><strong>此文作者</strong>：Daniel Shih(石頭)<br /><strong>此文地址</strong>： <a href="https://isdaniel.github.io/c_foreach/">https://isdaniel.github.io/c_foreach&#x2F;</a> <br /><strong>版權聲明</strong>：本博客所有文章除特別聲明外，均採用 <a href="https://creativecommons.org/licenses/by-nc-sa/3.0/tw/">CC BY-NC-SA 3.0 TW</a> 許可協議。轉載請註明出處！</p>]]>
    </content>
    <id>https://isdaniel.github.io/c_foreach/</id>
    <link href="https://isdaniel.github.io/c_foreach/"/>
    <published>2024-04-17T22:30:11.000Z</published>
    <summary>In PostgreSQL, there isn't a native foreach loop construct in C, because C itself doesn't have a foreach loop as you might find in higher-level languages like</summary>
    <title>C language implement foreach</title>
    <updated>2026-04-22T03:00:22.023Z</updated>
  </entry>
  <entry>
    <author>
      <name>Daniel Shih</name>
    </author>
    <category term="vscode" scheme="https://isdaniel.github.io/categories/vscode/"/>
    <category term="postgresql" scheme="https://isdaniel.github.io/categories/vscode/postgresql/"/>
    <category term="debugger" scheme="https://isdaniel.github.io/categories/vscode/postgresql/debugger/"/>
    <category term="sourcecode" scheme="https://isdaniel.github.io/categories/vscode/postgresql/debugger/sourcecode/"/>
    <category term="postgresql" scheme="https://isdaniel.github.io/tags/postgresql/"/>
    <category term="vscode" scheme="https://isdaniel.github.io/tags/vscode/"/>
    <category term="debugger" scheme="https://isdaniel.github.io/tags/debugger/"/>
    <content>
<![CDATA[<h2 id="Foreword"><a href="#Foreword" class="headerlink" title="Foreword"></a>Foreword</h2><p>Using gdb for command-line debugging still feels inconvenient. I initially wanted to find a simpler way to directly debug the PostgreSQL source code under Windows. After searching for a while, I found that Visual Studio (VS) was the only option available, but it is heavy and the steps are quite complex. Since most real environments run on Linux, it is better to debug the PostgreSQL source code under Linux.</p><h2 id="How-to-build-install-PostgreSQL-from-source-code"><a href="#How-to-build-install-PostgreSQL-from-source-code" class="headerlink" title="How to build &amp; install PostgreSQL from source code."></a>How to build &amp; install PostgreSQL from source code.</h2><p>I used an Ubuntu Linux environment. The first step is to install the prerequisite tools for building PostgreSQL.</p><figure class="highlight q"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">sudo apt-<span class="built_in">get</span> <span class="keyword">update</span></span><br><span class="line">sudo apt-<span class="built_in">get</span> install build-essential libreadline-<span class="built_in">dev</span> zlib1g-<span class="built_in">dev</span> flex bison libxml2-<span class="built_in">dev</span> libxslt-<span class="built_in">dev</span> libssl-<span class="built_in">dev</span> libxml2-utils xsltproc ccache libsystemd-<span class="built_in">dev</span> -y</span><br></pre></td></tr></table></figure><p>Download the PostgreSQL source code.</p><figure class="highlight apache"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line"><span class="attribute">wget</span> https://ftp.postgresql.org/pub/source/v14.<span class="number">8</span>/postgresql-<span class="number">14</span>.<span class="number">8</span>.tar.gz</span><br><span class="line"><span class="attribute">tar</span> xvfz postgresql-<span class="number">14</span>.<span class="number">8</span>.tar.gz</span><br><span class="line"><span class="attribute">cd</span> postgresql-<span class="number">14</span>.<span class="number">8</span></span><br></pre></td></tr></table></figure><p>We need to make sure the path (<code>--prefix</code>) exists on your system.</p><figure class="highlight jboss-cli"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line"><span class="string">./configure</span> <span class="params">--prefix=/home/daniel/postgresql-14</span>.8/pgsql <span class="params">--with-icu</span> <span class="params">--with-openssl</span> <span class="params">--with-systemd</span> <span class="params">--with-libxml</span> <span class="params">--enable-debug</span></span><br><span class="line"></span><br><span class="line"><span class="comment">#or debug -g3 mode</span></span><br><span class="line"></span><br><span class="line"><span class="string">./configure</span> <span class="params">--prefix=/home/daniel/postgresql-14</span>.8/pgsql <span class="params">--with-icu</span> <span class="params">--with-openssl</span> <span class="params">--with-systemd</span> <span class="params">--with-libxml</span> <span class="params">--enable-debug</span> CFLAGS=<span class="string">&quot;-DGCC_HASCLASSVISIBILITY -O0 -Wall -W -g3 -gdwarf-2&quot;</span></span><br><span class="line"></span><br><span class="line">make -j 8</span><br><span class="line">make install</span><br></pre></td></tr></table></figure><blockquote><p>We must build with the <code>--enable-debug</code> parameter; otherwise we can’t debug the source code.</p></blockquote><p>Here are the commands we will use later.</p><figure class="highlight awk"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line"><span class="regexp">/home/</span>daniel<span class="regexp">/postgresql-14.8/</span>pgsql<span class="regexp">/bin/</span>psql</span><br><span class="line"><span class="regexp">/home/</span>daniel<span class="regexp">/postgresql-14.8/</span>pgsql<span class="regexp">/bin/i</span>nitdb -D <span class="regexp">/home/</span>daniel<span class="regexp">/postgresql-14.8/</span>pgsql/data</span><br><span class="line"><span class="regexp">/home/</span>daniel<span class="regexp">/postgresql-14.8/</span>pgsql<span class="regexp">/bin/</span>pg_ctl -D <span class="regexp">/home/</span>daniel<span class="regexp">/postgresql-14.8/</span>pgsql/data -l logfile start</span><br></pre></td></tr></table></figure><h2 id="setup-PostgreSQL-environment-path"><a href="#setup-PostgreSQL-environment-path" class="headerlink" title="setup PostgreSQL environment path"></a>setup PostgreSQL environment path</h2><p>I recommend setting up the PostgreSQL environment paths after building &amp; installing PostgreSQL from source; this will make the next steps easier.</p><p>Identify your shell: Determine which shell you are using by running the following command:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line"><span class="built_in">echo</span> <span class="variable">$SHELL</span></span><br></pre></td></tr></table></figure><p>It will display the path to your current shell.</p><p>Locate the configuration file: The configuration file you need to modify depends on your shell. 
Here are the common ones:</p><ul><li>Bash: ~&#x2F;.bashrc or ~&#x2F;.bash_profile</li><li>Zsh: ~&#x2F;.zshrc or ~&#x2F;.zprofile</li><li>Fish: ~&#x2F;.config&#x2F;fish&#x2F;config.fish</li></ul><p>Open the configuration file: Use a text editor, such as nano or vim, to open the configuration file. For example, if you are using Bash and the file is <code>~/.bashrc</code>, you can run:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">vim ~/.bashrc</span><br></pre></td></tr></table></figure><p>Please modify the <code>PGDATA</code> &amp; <code>PATH</code> settings to align with your PostgreSQL build’s bin &amp; data paths.</p><blockquote><p>Because my build used <code>--prefix=/home/daniel/postgresql-14.8/pgsql</code>, the settings below align with that path.</p></blockquote><figure class="highlight routeros"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="built_in">export</span> <span class="attribute">PGDATA</span>=<span class="string">&quot;/home/daniel/postgresql-14.8/pgsql/data&quot;</span></span><br><span class="line"><span class="built_in">export</span> <span class="attribute">PATH</span>=<span class="string">&quot;/home/daniel/postgresql-14.8/pgsql/bin:<span class="variable">$PATH</span>&quot;</span></span><br></pre></td></tr></table></figure><p>Update the environment: To apply the changes to your current session, run the following command in your terminal:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line"><span class="built_in">source</span> ~/.bashrc  <span class="comment"># or the appropriate file for your shell</span></span><br></pre></td></tr></table></figure><h2 id="build-pem-file-to-a-create-user"><a href="#build-pem-file-to-a-create-user" class="headerlink" title="build a pem file for a user"></a>build a pem file for a user</h2><p>To create the keys, a preferred command is ssh-keygen, which is available with OpenSSH utilities.</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br></pre></td><td class="code"><pre><span class="line">ssh-keygen -m PEM -t rsa -b 4096</span><br><span class="line"></span><br><span class="line">danielss@postgresql-debuger:~$ ssh-keygen -m PEM -t rsa -b 4096</span><br><span class="line">Generating public/private rsa key pair.</span><br><span class="line">Enter file <span class="keyword">in</span> <span class="built_in">which</span> to save the key (/home/danielss/.ssh/id_rsa): /home/danielss/.ssh/danielkey</span><br><span class="line">Created directory <span class="string">&#x27;/home/danielss/.ssh&#x27;</span>.</span><br><span class="line">Enter passphrase (empty <span class="keyword">for</span> no passphrase):</span><br><span class="line">Enter same passphrase again:</span><br><span class="line">Your identification has been saved <span class="keyword">in</span> /home/danielss/.ssh/danielkey</span><br><span class="line">Your public key has been saved <span class="keyword">in</span> /home/danielss/.ssh/danielkey.pub</span><br></pre></td></tr></table></figure><p>When you create an Azure VM by specifying the public key, Azure copies the public key (in the .pub format) to the <code>~/.ssh/authorized_keys</code> file on the VM. 
SSH keys in <code>~/.ssh/authorized_keys</code> ensure that connecting clients present the corresponding private key during an SSH connection.</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line">danielss@postgresql-debuger:~/.ssh$ ll</span><br><span class="line">total 20</span><br><span class="line">drwx------ 2 danielss danielss 4096 Jun  9 02:51 ./</span><br><span class="line">drwxr-xr-x 4 danielss danielss 4096 Jun  9 02:52 ../</span><br><span class="line">-rw-rw-r-- 1 danielss danielss  753 Jun  9 02:51 authorized_keys</span><br><span class="line">-rw------- 1 danielss danielss 3247 Jun  9 02:50 danielkey</span><br><span class="line">-rw-r--r-- 1 danielss danielss  753 Jun  9 02:50 danielkey.pub</span><br></pre></td></tr></table></figure><p>Copy <code>danielkey</code> (the pem file) to your local machine; you can then log in using it.</p><h3 id="permission-problem-on-pem-file"><a href="#permission-problem-on-pem-file" class="headerlink" title="permission problem on pem file"></a>permission problem on pem file</h3><p>Confirm that the permissions of the <code>.ssh</code> directory and the authorized_keys file on the server are set correctly. 
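A related pitfall on the client side: ssh also refuses to use a private key (pem) file that is group- or world-readable, printing an "UNPROTECTED PRIVATE KEY FILE" warning. A minimal sketch of the fix (the key path below is a throwaway placeholder, and the commented-out ssh target is hypothetical):

```shell
# Simulate a copied pem file with overly open permissions, then fix it.
demo_key=$(mktemp)
chmod 644 "$demo_key"     # a freshly copied key is often group/world-readable
chmod 600 "$demo_key"     # ssh requires the private key to be user-only
key_mode=$(stat -c '%a' "$demo_key")
echo "key mode: $key_mode"   # prints: key mode: 600
# ssh -i "$demo_key" danielss@your-vm-host   # host and user are placeholders
```

The same 600 mode applies to the real danielkey once it has been copied to the client.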
Run the following commands on the server:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="built_in">chmod</span> 700 ~/.ssh</span><br><span class="line"><span class="built_in">chmod</span> 600 ~/.ssh/authorized_keys</span><br></pre></td></tr></table></figure><p>These commands ensure that the directory has read, write, and execute permissions only for the user, and the <code>authorized_keys</code> file has read and write permissions only for the user.</p><h2 id="setup-vscode"><a href="#setup-vscode" class="headerlink" title="Set up VS Code"></a>Set up VS Code</h2><p>Use the configuration below.</p><blockquote><p>Set the “program” field in the JSON file to the path of your PostgreSQL binary. </p></blockquote><figure class="highlight json"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br></pre></td><td class="code"><pre><span class="line"><span class="punctuation">&#123;</span></span><br><span class="line">    <span class="attr">&quot;version&quot;</span><span class="punctuation">:</span> <span class="string">&quot;0.2.0&quot;</span><span class="punctuation">,</span></span><br><span class="line">    <span class="attr">&quot;configurations&quot;</span><span class="punctuation">:</span> <span 
class="punctuation">[</span></span><br><span class="line">        <span class="punctuation">&#123;</span></span><br><span class="line">            <span class="attr">&quot;name&quot;</span><span class="punctuation">:</span> <span class="string">&quot;dbg postgresql&quot;</span><span class="punctuation">,</span></span><br><span class="line">            <span class="attr">&quot;type&quot;</span><span class="punctuation">:</span> <span class="string">&quot;cppdbg&quot;</span><span class="punctuation">,</span></span><br><span class="line">            <span class="attr">&quot;request&quot;</span><span class="punctuation">:</span> <span class="string">&quot;attach&quot;</span><span class="punctuation">,</span></span><br><span class="line">            <span class="attr">&quot;program&quot;</span><span class="punctuation">:</span> <span class="string">&quot;/home/daniel/postgresql-14.8/pgsql/bin/postgres&quot;</span><span class="punctuation">,</span></span><br><span class="line">            <span class="attr">&quot;processId&quot;</span><span class="punctuation">:</span> <span class="string">&quot;$&#123;command:pickProcess&#125;&quot;</span><span class="punctuation">,</span></span><br><span class="line">            <span class="attr">&quot;MIMode&quot;</span><span class="punctuation">:</span> <span class="string">&quot;gdb&quot;</span><span class="punctuation">,</span></span><br><span class="line">            <span class="attr">&quot;setupCommands&quot;</span><span class="punctuation">:</span> <span class="punctuation">[</span></span><br><span class="line">                <span class="punctuation">&#123;</span></span><br><span class="line">                    <span class="attr">&quot;description&quot;</span><span class="punctuation">:</span> <span class="string">&quot;Enable pretty-printing for gdb&quot;</span><span class="punctuation">,</span></span><br><span class="line">                    <span class="attr">&quot;text&quot;</span><span class="punctuation">:</span> 
<span class="string">&quot;-enable-pretty-printing&quot;</span><span class="punctuation">,</span></span><br><span class="line">                    <span class="attr">&quot;ignoreFailures&quot;</span><span class="punctuation">:</span> <span class="literal"><span class="keyword">true</span></span></span><br><span class="line">                <span class="punctuation">&#125;</span></span><br><span class="line">          <span class="punctuation">]</span></span><br><span class="line">        <span class="punctuation">&#125;</span></span><br><span class="line">    <span class="punctuation">]</span></span><br><span class="line"><span class="punctuation">&#125;</span></span><br></pre></td></tr></table></figure><p>Finally, use SSH with the PEM file to log in to your VM from VS Code.</p><h2 id="demo"><a href="#demo" class="headerlink" title="demo"></a>demo</h2><p>Here is a demo of debugging a running PostgreSQL process in VS Code using the source code.</p><p><img src="/../images/pg-debugger/postgres_debuger.gif" alt="img"></p><h2 id="How-can-solve-“ptrace-Operation-not-permitted”-problem"><a href="#How-can-solve-“ptrace-Operation-not-permitted”-problem" class="headerlink" title="How to solve the “ptrace: Operation not permitted” problem"></a>How to solve the “ptrace: Operation not permitted” problem</h2><p>See <a href="https://github.com/microsoft/MIEngine/wiki/Troubleshoot-attaching-to-processes-using-GDB">Troubleshoot attaching to processes using GDB</a>.</p><p>Run the following command as the super user: <code>echo 0 | sudo tee /proc/sys/kernel/yama/ptrace_scope</code></p><p>This sets the ptrace scope to 0; after that, you can attach to processes that were not launched by the debugger using only user permissions.</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line">postgres@postgresql-debuger:~$ <span class="built_in">echo</span> 0| <span 
class="built_in">sudo</span> <span class="built_in">tee</span> /proc/sys/kernel/yama/ptrace_scope</span><br><span class="line">[<span class="built_in">sudo</span>] password <span class="keyword">for</span> postgres:</span><br><span class="line">0</span><br></pre></td></tr></table></figure><p>More information</p><p><a href="https://github.com/microsoft/MIEngine/wiki/Troubleshoot-attaching-to-processes-using-GDB">https://github.com/microsoft/MIEngine/wiki/Troubleshoot-attaching-to-processes-using-GDB</a></p><p><a href="https://learn.microsoft.com/en-us/azure/virtual-machines/linux/create-ssh-keys-detailed#overview-of-ssh-and-keys">https://learn.microsoft.com/en-us/azure/virtual-machines/linux/create-ssh-keys-detailed#overview-of-ssh-and-keys</a></p><p><strong>Author</strong>: Daniel Shih (石頭)<br /><strong>Permalink</strong>: <a href="https://isdaniel.github.io/postgresql-vscode-sourcecode-debugger/">https://isdaniel.github.io/postgresql-vscode-sourcecode-debugger/</a> <br /><strong>License</strong>: Unless otherwise stated, all articles on this blog are licensed under <a href="https://creativecommons.org/licenses/by-nc-sa/3.0/tw/">CC BY-NC-SA 3.0 TW</a>. Please credit the source when reposting!</p>]]>
    </content>
    <id>https://isdaniel.github.io/postgresql-vscode-sourcecode-debugger/</id>
    <link href="https://isdaniel.github.io/postgresql-vscode-sourcecode-debugger/"/>
    <published>2023-06-02T22:30:11.000Z</published>
    <summary>Using gdb for command-line debugging still feels inconvenient. I initially wanted to find a simpler way to directly debug the PostgreSQL source code under</summary>
    <title>How can we use VS Code to debug a running PostgreSQL process with the source code?</title>
    <updated>2026-04-22T03:00:22.030Z</updated>
  </entry>
  <entry>
    <author>
      <name>Daniel Shih</name>
    </author>
    <category term="Azure" scheme="https://isdaniel.github.io/categories/Azure/"/>
    <category term="flexibleServer" scheme="https://isdaniel.github.io/categories/Azure/flexibleServer/"/>
    <category term="Managed Identity" scheme="https://isdaniel.github.io/categories/Azure/flexibleServer/Managed-Identity/"/>
    <category term="Azure" scheme="https://isdaniel.github.io/tags/Azure/"/>
    <content>
      <![CDATA[<h2 id="Introduction"><a href="#Introduction" class="headerlink" title="Introduction"></a>Introduction</h2><p>This article explains how to set up long-term backups (retained for more than 35 days) on Azure PostgreSQL Flexible Server.</p><h2 id="Prerequisites"><a href="#Prerequisites" class="headerlink" title="Prerequisites:"></a>Prerequisites:</h2><ul><li>A PostgreSQL Flexible Server and an Azure VM (Linux, Ubuntu 20.04) that has access to it.</li><li>A Managed Identity (MI) in your subscription.</li><li>Make sure the version of the PostgreSQL Flexible Server you back up matches your pg_dump version.</li></ul><p>Here is a sample script for long-term backup of Azure PostgreSQL Flexible Server with a Managed Identity.</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br></pre></td><td class="code"><pre><span class="line"><span class="built_in">echo</span> <span class="string">&#x27;start program&#x27;</span></span><br><span class="line"><span class="built_in">echo</span> <span class="string">&#x27;=====================&#x27;</span></span><br><span class="line"><span 
class="built_in">echo</span> <span class="string">&#x27;start installing postgresql client suit&#x27;</span></span><br><span class="line"><span class="built_in">sudo</span> apt update</span><br><span class="line"><span class="built_in">sudo</span> apt -y install postgresql-client</span><br><span class="line"><span class="built_in">echo</span> <span class="string">&#x27;======================&#x27;</span></span><br><span class="line"><span class="built_in">echo</span> <span class="string">&#x27;postgresql client install ends&#x27;</span></span><br><span class="line"><span class="built_in">echo</span> <span class="string">&#x27;start mount storage&#x27;</span></span><br><span class="line"></span><br><span class="line">wget https://packages.microsoft.com/config/ubuntu/20.04/packages-microsoft-prod.deb</span><br><span class="line"><span class="built_in">sudo</span> dpkg -i packages-microsoft-prod.deb</span><br><span class="line"><span class="built_in">sudo</span> apt-get update</span><br><span class="line"><span class="built_in">sudo</span> apt-get install blobfuse -y</span><br><span class="line"><span class="built_in">sudo</span> apt-get install jq -y</span><br><span class="line"></span><br><span class="line"><span class="keyword">if</span> [ ! 
-f blob.conf  ]; <span class="keyword">then</span></span><br><span class="line">    <span class="built_in">echo</span> <span class="string">&#x27;accountName &lt;&lt;Your blob accountName&gt;&gt;&#x27;</span> &gt;&gt; blob.conf </span><br><span class="line">    <span class="built_in">echo</span> <span class="string">&#x27;authType MSI&#x27;</span> &gt;&gt; blob.conf</span><br><span class="line">    <span class="built_in">echo</span> <span class="string">&#x27;identityObjectId &lt;&lt;Your MI Object ID&gt;&gt;&#x27;</span> &gt;&gt; blob.conf </span><br><span class="line">    <span class="built_in">echo</span> <span class="string">&#x27;containerName &lt;&lt;Your blob Container Name&gt;&gt;&#x27;</span> &gt;&gt; blob.conf </span><br><span class="line"><span class="keyword">fi</span></span><br><span class="line"><span class="comment">#create a folder which can mount to blob</span></span><br><span class="line"><span class="built_in">mkdir</span> ~/data</span><br><span class="line"><span class="built_in">sudo</span> blobfuse ~/data --tmp-path=/mnt/resource/mycontainer  --config-file=./blob.conf -o attr_timeout=240 -o entry_timeout=240 -o negative_timeout=120</span><br><span class="line"></span><br><span class="line"><span class="built_in">echo</span> <span class="string">&#x27;======= starting backup============&#x27;</span></span><br><span class="line">DNAME=`<span class="built_in">date</span> +%Y%m%d%H%M%S`</span><br><span class="line"><span class="built_in">export</span> PGPASSWORD=`curl -s <span class="string">&#x27;http://169.254.169.254/metadata/identity/oauth2/token?api-version=2018-02-01&amp;resource=https%3A%2F%2Fossrdbms-aad.database.windows.net&amp;client_id=&lt;&lt;MI client ID&gt;&gt;&#x27;</span> -H Metadata:<span class="literal">true</span> | jq -r .access_token`</span><br><span class="line">pg_dump --host=test-conn.postgres.database.azure.com --username=<span class="string">&#x27;MI-Demo&#x27;</span> -Fc -c  testdb &gt; ~/data/dump<span 
class="variable">$DNAME</span>.sql</span><br></pre></td></tr></table></figure><h2 id="Experiments"><a href="#Experiments" class="headerlink" title="Experiments:"></a>Experiments:</h2><h3 id="Here-is-the-guideline-for-using-managed-identity-to-connect-to-your-DB-server-in-VM"><a href="#Here-is-the-guideline-for-using-managed-identity-to-connect-to-your-DB-server-in-VM" class="headerlink" title="Here is the guideline for using a managed identity to connect to your DB server from a VM."></a>Here is the guideline for using a managed identity to connect to your DB server from a VM.</h3><p>Make sure the Authentication setting is “PostgreSQL and Azure Active Directory authentication”.</p><p><img src="https://i.imgur.com/BVNlWCx.png"></p><p>Select the managed identity you want to add to the PostgreSQL Flexible Server.</p><p><img src="https://i.imgur.com/61LOp1w.png"></p><p>Adding the managed identity in the Azure portal registers it as a user on the PostgreSQL server.</p><p>In the script below, replace <code>&lt;&lt;MI client ID&gt;&gt;</code> with your MI client ID and the username with your MI login name (marked by the red frames); this tells the Azure PostgreSQL server to verify the MI user with an access token instead of a password.</p><p><img src="https://i.imgur.com/r8IBDz8.png"></p><p>After replacing everything with your actual connection information, run the script; pg_dump then creates the backup file without asking for a password.</p><p><img src="https://i.imgur.com/qhOIDsV.png"></p><h3 id="Here-is-guideline-to-guide-us-how-to-mount-Linux-VM-from-blob-storage-by-MI"><a href="#Here-is-guideline-to-guide-us-how-to-mount-Linux-VM-from-blob-storage-by-MI" class="headerlink" title="Here is the guideline for mounting blob storage on a Linux VM with MI."></a>Here is the guideline for mounting blob storage on a Linux VM with MI.</h3><p>If you are using a user-assigned managed identity, add the identity under the <code>User assigned</code> 
configuration of your Linux VM as shown below (choose the MI you want to use for verification).</p><p><img src="https://i.imgur.com/BgKLeeS.png"></p><p>Follow the steps “Managed Identities -&gt; Azure role assignments -&gt; Add role assignment (Preview)” to choose which blob storage the MI is allowed to mount and access.</p><p><img src="https://i.imgur.com/1Tpvopj.png"></p><p>Edit the storage account information marked by the red frame in the script provided above.</p><ul><li>accountName: your blob storage account name</li><li>authType: the auth type must be MSI</li><li>identityObjectId: your managed identity object ID, as indicated by the red arrow below.</li><li>containerName: your blob container name<br><img src="https://i.imgur.com/cJLGI1i.png"></li></ul><p>After execution, you can see that the dump file was successfully uploaded to the storage account.</p><p><img src="https://i.imgur.com/ROSZvsp.png"></p><h2 id="More-information"><a href="#More-information" class="headerlink" title="More information"></a>More information</h2><p><a href="https://techcommunity.microsoft.com/t5/azure-paas-blog/mount-blob-storage-on-linux-vm-using-managed-identities-or/ba-p/1821744">https://techcommunity.microsoft.com/t5/azure-paas-blog/mount-blob-storage-on-linux-vm-using-managed-identities-or/ba-p/1821744</a></p><p><a href="https://github.com/Azure/azure-storage-fuse/blob/master/README.md#valid-authentication-setups">https://github.com/Azure/azure-storage-fuse/blob/master/README.md#valid-authentication-setups</a></p><p><strong>Author</strong>: Daniel Shih (石頭)<br /><strong>Permalink</strong>: <a href="https://isdaniel.github.io/azure-flexible-longterm-backup/">https://isdaniel.github.io/azure-flexible-longterm-backup/</a> <br /><strong>License</strong>: Unless otherwise stated, all articles on this blog are licensed under <a href="https://creativecommons.org/licenses/by-nc-sa/3.0/tw/">CC BY-NC-SA 3.0 TW</a>. Please credit the source when reposting!</p>]]>
    </content>
    <id>https://isdaniel.github.io/azure-flexible-longterm-backup/</id>
    <link href="https://isdaniel.github.io/azure-flexible-longterm-backup/"/>
    <published>2022-12-25T22:30:11.000Z</published>
    <summary>This article explains how to set up long-term backups (retained for more than 35 days) on Azure PostgreSQL Flexible Server.</summary>
    <title>Azure PostgreSQL Flexible long term backup with Managed Identity</title>
    <updated>2026-04-22T03:00:22.022Z</updated>
  </entry>
  <entry>
    <author>
      <name>Daniel Shih</name>
    </author>
    <category term="postgresql" scheme="https://isdaniel.github.io/categories/postgresql/"/>
    <category term="DataBase" scheme="https://isdaniel.github.io/tags/DataBase/"/>
    <category term="postgresql" scheme="https://isdaniel.github.io/tags/postgresql/"/>
    <content>
      <![CDATA[<h2 id="前言"><a href="#前言" class="headerlink" title="Preface"></a>Preface</h2><p>I have recently been studying PostgreSQL for work. PostgreSQL is an open-source RDBMS, so whenever a question comes up you can download the source code and debug it to find the answer. This post aims to quickly help anyone who wants to install PostgreSQL from source code.</p><h3 id="Install-Postgresl"><a href="#Install-Postgresl" class="headerlink" title="Install PostgreSQL"></a>Install PostgreSQL</h3><p>If, like me, you are on Ubuntu, you need to set up the development environment before installing; see <a href="https://wiki.postgresql.org/wiki/Compile_and_Install_from_source_code">Postgresql Compile_and_Install_from_source_code</a> for details.</p><figure class="highlight cmd"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line">sudo apt-get update</span><br><span class="line"></span><br><span class="line">sudo apt-get install build-essential libreadline-dev zlib1g-dev flex bison libxml2-dev libxslt-dev libssl-dev libxml2-utils xsltproc ccache</span><br></pre></td></tr></table></figure><p>We can pick the PostgreSQL source version to install from the <a href="https://www.postgresql.org/ftp/source/">FTP Source Code</a> archive.</p><p>For example, to install PostgreSQL v12.9, I run the following commands to download the source code &amp; extract it.</p><p>After extraction, you should see a directory named after the version number; it contains the PostgreSQL source code for that version.</p><figure class="highlight cmd"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line">wget https://ftp.postgresql.org/pub/source/v12.<span class="number">9</span>/postgresql-<span class="number">12</span>.<span class="number">9</span>.tar.gz</span><br><span class="line">tar xvfz postgresql-<span class="number">12</span>.<span class="number">9</span>.tar.gz</span><br><span class="line"></span><br><span class="line"><span class="built_in">cd</span> postgresql-<span class="number">12</span>.<span class="number">9</span> </span><br></pre></td></tr></table></figure><p>Install PostgreSQL from the source code:</p><ul><li><code>configure</code>: set PostgreSQL options and the installation location via parameters.</li><li><code>make</code>: build and install PostgreSQL via the Makefile.</li></ul><figure class="highlight gauss"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line">./configure</span><br><span class="line"></span><br><span class="line"><span class="built_in">make</span> &amp;&amp; <span class="built_in">make</span> install </span><br></pre></td></tr></table></figure><p>PostgreSQL <a href="https://www.postgresql.org/docs/6.5/config.htm">.&#x2F;configure</a> options</p><ul><li>--prefix&#x3D;PREFIX install  architecture-independent files in PREFIX. Default installation location is &#x2F;usr&#x2F;local&#x2F;pgsql</li><li>--enable-integer-datetimes  enable 64-bit integer date&#x2F;time support</li><li>--enable-nls[&#x3D;LANGUAGES]  enable Native Language Support</li><li>--disable-shared         do not build shared libraries</li><li>--disable-rpath           do not embed shared library search path in executables</li><li>--disable-spinlocks    do not use spinlocks</li><li>--enable-debug           build with debugging symbols (-g)</li><li>--enable-profiling       build with profiling enabled</li><li>--enable-dtrace           build with DTrace support</li><li>--enable-depend         turn on automatic dependency tracking</li><li>--enable-cassert         enable assertion checks (for debugging)</li><li>--enable-thread-safety  make client libraries thread-safe</li><li>--enable-thread-safety-force  force thread-safety despite thread test failure</li><li>--disable-largefile       omit support for large files</li><li>--with-docdir&#x3D;DIR      install the documentation in DIR [PREFIX&#x2F;doc]</li><li>--without-docdir         do not install the documentation</li><li>--with-includes&#x3D;DIRS  look for additional header files in DIRS</li><li>--with-libraries&#x3D;DIRS  look for additional libraries in DIRS</li><li>--with-libs&#x3D;DIRS         alternative spelling of --with-libraries</li><li>--with-pgport&#x3D;PORTNUM   change default port number [5432]</li><li>--with-tcl                     build Tcl modules (PL&#x2F;Tcl)</li><li>--with-tclconfig&#x3D;DIR   tclConfig.sh is in DIR</li><li>--with-perl                   build Perl modules (PL&#x2F;Perl)</li><li>--with-python              build Python modules (PL&#x2F;Python)</li><li>--with-gssapi               build with GSSAPI support</li><li>--with-krb5                  build with Kerberos 5 support</li><li>--with-krb-srvnam&#x3D;NAME  default service principal name in Kerberos [postgres]</li><li>--with-pam                  build with PAM support</li><li>--with-ldap                  build with LDAP support</li><li>--with-bonjour            build with Bonjour support</li><li>--with-openssl            build with OpenSSL support</li><li>--without-readline      do not use GNU Readline nor BSD Libedit for editing</li><li>--with-libedit-preferred  prefer BSD Libedit over GNU Readline</li><li>--with-ossp-uuid        use OSSP UUID library when building contrib&#x2F;uuid-ossp</li><li>--with-libxml               build with XML support</li><li>--with-libxslt               use XSLT support when building contrib&#x2F;xml2</li><li>--with-system-tzdata&#x3D;DIR  use system time zone data in DIR</li><li>--without-zlib              do not use Zlib</li><li>--with-gnu-ld              assume the C compiler uses GNU ld [default&#x3D;no]</li></ul><p>If you did not set <code>--prefix</code> in <code>./configure</code>, the default installation location is <code>/usr/local/pgsql/</code>.</p><p>We can verify the installation result with the following command:</p><figure class="highlight cmd"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line"># ls -l /usr/local/pgsql/</span><br><span 
class="line">total <span class="number">20</span></span><br><span class="line">drwxr-xr-x  <span class="number">2</span> root     root     <span class="number">4096</span> Sep  <span class="number">2</span> <span class="number">09</span>:<span class="number">21</span> bin</span><br><span class="line">drwx------ <span class="number">19</span> postgres postgres <span class="number">4096</span> Sep  <span class="number">2</span> <span class="number">09</span>:<span class="number">23</span> data</span><br><span class="line">drwxr-xr-x  <span class="number">6</span> root     root     <span class="number">4096</span> Sep  <span class="number">2</span> <span class="number">09</span>:<span class="number">21</span> include</span><br><span class="line">drwxr-xr-x  <span class="number">4</span> root     root     <span class="number">4096</span> Sep  <span class="number">2</span> <span class="number">09</span>:<span class="number">21</span> lib</span><br><span class="line">drwxr-xr-x  <span class="number">6</span> root     root     <span class="number">4096</span> Sep  <span class="number">2</span> <span class="number">09</span>:<span class="number">21</span> share</span><br></pre></td></tr></table></figure><h3 id="建立-postgres-user"><a href="#建立-postgres-user" class="headerlink" title="Create a postgres user"></a>Create a postgres user</h3><p>Create a postgres user and set a password for it.</p><figure class="highlight cmd"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line"># adduser postgres</span><br><span class="line"></span><br><span class="line"># passwd postgres</span><br><span class="line">Changing password <span class="keyword">for</span> user postgres.</span><br><span class="line">New UNIX password:</span><br><span class="line">Retype new UNIX 
password:</span><br><span class="line">passwd: all authentication tokens updated successfully.</span><br></pre></td></tr></table></figure><h3 id="初始化-postgres-data-路徑"><a href="#初始化-postgres-data-路徑" class="headerlink" title="Initialize the postgres data directory"></a>Initialize the postgres data directory</h3><p>Create a directory and assign its ownership to the postgres user we just created.</p><blockquote><p>The user needs permission to write to this directory.</p></blockquote><figure class="highlight cmd"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="built_in">mkdir</span> /usr/local/pgsql/data</span><br><span class="line">chown postgres:postgres /usr/local/pgsql/data</span><br></pre></td></tr></table></figure><p>Switch to the postgres user and initialize the database with <code>initdb</code>.</p><figure class="highlight gradle"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"># su - postgres</span><br><span class="line"># /usr/local/pgsql/bin/initdb -D /usr/local/pgsql/data/</span><br></pre></td></tr></table></figure><p>The <code>/usr/local/pgsql/data</code> directory should now contain the following files:</p><figure class="highlight cmd"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span 
class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br></pre></td><td class="code"><pre><span class="line">ls -l /usr/local/pgsql/data</span><br><span class="line">-rw------- <span class="number">1</span> postgres postgres     <span class="number">3</span> Aug <span class="number">31</span> <span class="number">06</span>:<span class="number">02</span> PG_VERSION</span><br><span class="line">drwx------ <span class="number">6</span> postgres postgres  <span class="number">4096</span> Aug <span class="number">31</span> <span class="number">06</span>:<span class="number">03</span> base</span><br><span class="line">drwx------ <span class="number">2</span> postgres postgres  <span class="number">4096</span> Sep  <span class="number">2</span> <span class="number">07</span>:<span class="number">47</span> global</span><br><span class="line">drwx------ <span class="number">2</span> postgres postgres  <span class="number">4096</span> Aug <span class="number">31</span> <span class="number">06</span>:<span class="number">02</span> pg_commit_ts</span><br><span class="line">drwx------ <span class="number">2</span> postgres postgres  <span class="number">4096</span> Aug <span class="number">31</span> <span class="number">06</span>:<span class="number">02</span> pg_dynshmem</span><br><span class="line">-rw------- <span class="number">1</span> postgres postgres  <span class="number">4760</span> Aug <span class="number">31</span> <span class="number">06</span>:<span class="number">02</span> pg_hba.conf</span><br><span class="line">-rw------- <span class="number">1</span> postgres postgres  <span class="number">1636</span> Aug <span class="number">31</span> <span class="number">06</span>:<span 
class="number">02</span> pg_ident.conf</span><br><span class="line">drwx------ <span class="number">4</span> postgres postgres  <span class="number">4096</span> Sep  <span class="number">1</span> <span class="number">09</span>:<span class="number">23</span> pg_logical</span><br><span class="line">drwx------ <span class="number">4</span> postgres postgres  <span class="number">4096</span> Aug <span class="number">31</span> <span class="number">06</span>:<span class="number">02</span> pg_multixact</span><br><span class="line">drwx------ <span class="number">2</span> postgres postgres  <span class="number">4096</span> Sep  <span class="number">2</span> <span class="number">07</span>:<span class="number">46</span> pg_notify</span><br><span class="line">drwx------ <span class="number">2</span> postgres postgres  <span class="number">4096</span> Aug <span class="number">31</span> <span class="number">06</span>:<span class="number">02</span> pg_replslot</span><br><span class="line">drwx------ <span class="number">2</span> postgres postgres  <span class="number">4096</span> Aug <span class="number">31</span> <span class="number">06</span>:<span class="number">02</span> pg_serial</span><br><span class="line">drwx------ <span class="number">2</span> postgres postgres  <span class="number">4096</span> Aug <span class="number">31</span> <span class="number">06</span>:<span class="number">02</span> pg_snapshots</span><br><span class="line">drwx------ <span class="number">2</span> postgres postgres  <span class="number">4096</span> Sep  <span class="number">2</span> <span class="number">07</span>:<span class="number">46</span> pg_stat</span><br><span class="line">drwx------ <span class="number">2</span> postgres postgres  <span class="number">4096</span> Sep  <span class="number">2</span> <span class="number">07</span>:<span class="number">50</span> pg_stat_tmp</span><br><span class="line">drwx------ <span class="number">2</span> postgres postgres  <span 
class="number">4096</span> Aug <span class="number">31</span> <span class="number">06</span>:<span class="number">02</span> pg_subtrans</span><br><span class="line">drwx------ <span class="number">2</span> postgres postgres  <span class="number">4096</span> Aug <span class="number">31</span> <span class="number">06</span>:<span class="number">02</span> pg_tblspc</span><br><span class="line">drwx------ <span class="number">2</span> postgres postgres  <span class="number">4096</span> Aug <span class="number">31</span> <span class="number">06</span>:<span class="number">02</span> pg_twophase</span><br><span class="line">drwx------ <span class="number">3</span> postgres postgres  <span class="number">4096</span> Aug <span class="number">31</span> <span class="number">08</span>:<span class="number">05</span> pg_wal</span><br><span class="line">drwx------ <span class="number">2</span> postgres postgres  <span class="number">4096</span> Aug <span class="number">31</span> <span class="number">06</span>:<span class="number">02</span> pg_xact</span><br><span class="line">-rw------- <span class="number">1</span> postgres postgres    <span class="number">88</span> Aug <span class="number">31</span> <span class="number">06</span>:<span class="number">02</span> postgresql.auto.conf</span><br><span class="line">-rw------- <span class="number">1</span> postgres postgres <span class="number">26720</span> Aug <span class="number">31</span> <span class="number">06</span>:<span class="number">02</span> postgresql.conf</span><br><span class="line">-rw------- <span class="number">1</span> postgres postgres    <span class="number">59</span> Sep  <span class="number">2</span> <span class="number">07</span>:<span class="number">46</span> postmaster.opts</span><br><span class="line">-rw------- <span class="number">1</span> postgres postgres    <span class="number">87</span> Sep  <span class="number">2</span> <span class="number">07</span>:<span class="number">46</span> 
postmaster.pid</span><br></pre></td></tr></table></figure><h3 id="啟動-postgres-db"><a href="#啟動-postgres-db" class="headerlink" title="啟動 postgres db"></a>啟動 postgres db</h3><p>最後利用 postgres user 執行 <code>postmaster</code> 啟動 postgres db</p><figure class="highlight cmd"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">$ /usr/local/pgsql/bin/postmaster -D /usr/local/pgsql/data &gt;logfile <span class="number">2</span>&gt;&amp;<span class="number">1</span> &amp;</span><br><span class="line">[<span class="number">1</span>] <span class="number">7936</span></span><br></pre></td></tr></table></figure><p>啟動完畢後可以利用 <code>psql</code> 進入 postgres 操作 db</p><figure class="highlight cmd"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line">#/usr/local/pgsql/bin/psql</span><br><span class="line"></span><br><span class="line">psql (<span class="number">12</span>.<span class="number">12</span>)</span><br><span class="line"><span class="built_in">Type</span> &quot;<span class="built_in">help</span>&quot; <span class="keyword">for</span> <span class="built_in">help</span>.</span><br><span class="line"></span><br><span class="line">postgres=# </span><br></pre></td></tr></table></figure><h2 id="小結"><a href="#小結" class="headerlink" title="小結"></a>小結</h2><p>希望透過這篇文章可以幫助大家，利用 postgres source code 安裝 db 環境，並進行 debug &amp; 調試，詳細文章可以參考官網 <a href="https://www.postgresql.org/docs/current/installation.html">Installation from Source Code</a></p><p><strong>此文作者</strong>：Daniel Shih(石頭)<br /><strong>此文地址</strong>： <a href="https://isdaniel.github.io/postgresql-debug-source-code/">https://isdaniel.github.io/postgresql-debug-source-code/</a> <br 
/><strong>版權聲明</strong>：本博客所有文章除特別聲明外，均採用 <a href="https://creativecommons.org/licenses/by-nc-sa/3.0/tw/">CC BY-NC-SA 3.0 TW</a> 許可協議。轉載請註明出處！</p>]]>
    </content>
    <id>https://isdaniel.github.io/postgresql-debug-source-code/</id>
    <link href="https://isdaniel.github.io/postgresql-debug-source-code/"/>
    <published>2022-09-03T12:30:11.000Z</published>
    <summary>因為工作需要，最近在研究 PostgreSQL DB。PostgreSQL 是一個 Open Source RDBMS，所以有任何疑問都可以把 source code 下載並 debug 了解原因，本篇希望可以快速幫助想要透過 source code 安裝 PostgreSQL DB 的人</summary>
    <title>PostgreSQL source code installation</title>
    <updated>2026-04-22T03:00:22.030Z</updated>
  </entry>
  <entry>
    <author>
      <name>Daniel Shih</name>
    </author>
    <category term="AWS" scheme="https://isdaniel.github.io/categories/AWS/"/>
    <category term="Certified" scheme="https://isdaniel.github.io/categories/AWS/Certified/"/>
    <category term="AWS" scheme="https://isdaniel.github.io/tags/AWS/"/>
    <category term="Certified" scheme="https://isdaniel.github.io/tags/Certified/"/>
    <category term="Solutions Architect" scheme="https://isdaniel.github.io/tags/Solutions-Architect/"/>
    <content>
      <![CDATA[<h2 id="前言"><a href="#前言" class="headerlink" title="前言"></a>前言</h2><p>我算是一個雲端小白，自從進入目前公司後才開始接觸雲端相關概念並感受到雲端服務威力</p><p>我會想參加 SAA 考試主要有下面幾個原因</p><ol><li>主要是想趕在 SAA-C02 (2022-08) 換考試範圍前來測驗</li><li>感覺放在履歷上，增加職場競爭力</li><li>驗收自己目前對於雲端的相關知識</li><li>目前在架構 Team 擔任工程師，希望規劃出好的系統架構</li></ol><p><img src="/../images/aws/2022-06-26_20h25_16.png" alt="img"></p><h2 id="考試重點"><a href="#考試重點" class="headerlink" title="考試重點"></a>考試重點</h2><h3 id="S3-考試必考"><a href="#S3-考試必考" class="headerlink" title="S3 (考試必考)"></a>S3 (考試必考)</h3><ul><li>S3 Standard</li><li>S3 IA (Infrequent Access)</li><li>Intelligent-Tiering：設計來優化成本，基於其自動移動 data 到相對便宜的 tier，不影響 performance，也不需提前處理。</li><li>S3 One Zone-IA</li><li>S3 Glacier：價格便宜，適合存放 Archive 檔案 (預設會加密)<ul><li>Instant Retrieval：最少存放 90 天</li><li>Flexible Retrieval (formerly S3 Glacier)：最少存放 90 天<ul><li>Expedited：1~5 min</li><li>Standard：3~5 hours</li><li>Bulk：5~12 hours</li></ul></li><li>Deep Archive - for long term storage: 最少存放 180 天<ul><li>Standard：12 hours</li><li>Bulk：48 hours</li></ul></li></ul></li></ul><h4 id="重點功能"><a href="#重點功能" class="headerlink" title="重點功能"></a>重點功能</h4><p>S3 Lifecycle rule 控制 S3 物件存放的 S3 Type 轉換規則，這樣可以讓 S3 使用費用更有效率</p><p>以下是 S3 type 使用 Lifecycle 轉換表 <a href="https://docs.aws.amazon.com/AmazonS3/latest/userguide/lifecycle-transition-general-considerations.html">lifecycle-transition-general-considerations</a></p><p><img src="https://i.imgur.com/liydRJs.png"></p><p>S3 Transfer Acceleration 啟用後可以更快速、簡單地跨長距離傳輸資料</p><p>因為 S3 上傳有 5 GB 限制，如果要傳大型檔案可以使用 S3 multipart uploads 將檔案切分快速上傳，如果部分上傳失敗，只會針對部分重傳</p><h3 id="Storage-價格高到低"><a href="#Storage-價格高到低" class="headerlink" title="Storage 價格高到低"></a>Storage 價格高到低</h3><p>Standard &gt; IA &gt; Intelligent &gt; IA One Zone &gt; Glacier &gt; Glacier Deep Archive</p><h3 id="S3-內容加解密"><a href="#S3-內容加解密" class="headerlink" title="S3 內容加解密"></a>S3 內容加解密</h3><ul><li><p>SSE-S3: Key 存在 AWS S3 上，在 header 必須帶入 (<code>x-amz-server-side-encryption:AES256</code>)，由 AWS S3 
管理</p></li><li><p>SSE-KMS： Key 存在 AWS KMS 上，在 header 必須帶入 (<code>x-amz-server-side-encryption:aws:kms</code>)，可較好管控 key 使用權限，自行管理</p></li><li><p>SSE-C：client Key 保存在 Client，提供 Data Key in Header 所以必須使用 Https</p></li><li><p>Client Side Encryption：在 Client 加密並上傳</p></li></ul><h3 id="S3-access-logs"><a href="#S3-access-logs" class="headerlink" title="S3 access logs"></a>S3 access logs</h3><p>針對 S3 操作動作進行 audit log, 並可使用 Athena 來查詢</p><h3 id="S3-Object-Lock"><a href="#S3-Object-Lock" class="headerlink" title="S3 Object Lock"></a>S3 Object Lock</h3><p>必須啟用 versioning</p><blockquote><p>Version 一旦被啟動後只能被 suspended</p></blockquote><h3 id="Use-S3-Glacier-vault"><a href="#Use-S3-Glacier-vault" class="headerlink" title="Use S3 Glacier vault"></a>Use S3 Glacier vault</h3><p>Glacier Vault Lock 允許您使用保險庫鎖定策略輕鬆部署和實施單個 S3 Glacier 保險庫的合規性控制。</p><p>您可以在 Vault Lock policy 中指定諸如“一次寫入多次讀取”(WORM) 之類的控制，並鎖定該策略以防止將來進行編輯。</p><h3 id="S3-WebHost"><a href="#S3-WebHost" class="headerlink" title="S3 WebHost"></a>S3 WebHost</h3><p>S3 Host Web 使用 URL 規則如下</p><figure class="highlight text"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line">http://bucket-name.s3-website.Region.amazonaws.com</span><br><span class="line"></span><br><span class="line">http://bucket-name.s3-website-Region.amazonaws.com</span><br></pre></td></tr></table></figure><h3 id="EBS-Elastic-Block-Store"><a href="#EBS-Elastic-Block-Store" class="headerlink" title="EBS (Elastic Block Store)"></a>EBS (Elastic Block Store)</h3><p>非暫時的 network drive Storage</p><blockquote><p>如果長時間儲存還是較適合放在 S3</p></blockquote><p>EBS Volumes 類型</p><ul><li>gp2&#x2F;gp3 (SSD)：通用型，價格和性能都不錯</li><li>io1&#x2F;io2 (SSD)：價格高，高性能 (適合用在 DataBase Workloads)</li><li>st1 (HDD)：價格便宜</li><li>sc1 (HDD)：價格最便宜</li></ul><blockquote><p>Boot volumes 只能是 <code>gp</code> 或 <code>io</code></p></blockquote><p>EBS Multi-Attach 只有在 io1&#x2F;io2 使用 
,通常用在大量讀寫 Cluster</p><p>EBS volume 不能跨 AZ，如果要跨 AZ 請使用 EBS Snapshot</p><h3 id="Instance-Store"><a href="#Instance-Store" class="headerlink" title="Instance Store"></a>Instance Store</h3><p>Instance Storage 又稱 Ephemeral Storage，暫時的 Storage，適合 high-performance disk 用途</p><p>需要 high-performance disk 可使用 EC2 instance store，成本較 EBS Provisioned IOPS SSD (io1) 低</p><blockquote><p>Instance Store 跟 EC2 在同一台 Host 中，所以相較於 EBS 有較快執行效率</p></blockquote><h4 id="EBS-vs-Instance-Storage"><a href="#EBS-vs-Instance-Storage" class="headerlink" title="EBS vs Instance Storage"></a>EBS vs Instance Storage</h4><p>狀態保留 EBS vs Instance Storage</p><ul><li>Reboot instance: 兩者皆不會丟失資料；</li><li>Stop instance: EBS 會保留、Instance Storage 資料會遺失。</li><li>Terminate instance: 預設來說，兩者的 ROOT volume 都會被刪除，然而，EBS 可以選擇要不要保留。</li></ul><h3 id="EFS-Elastic-File-System"><a href="#EFS-Elastic-File-System" class="headerlink" title="EFS(Elastic File System)"></a>EFS(Elastic File System)</h3><p>File storage 服務，讓你可以共享檔案資料，資料可以儲存 across multi-AZ（單一 region），POSIX。</p><p>它是 Linux Based，所以不能跑在 Windows 上</p><h2 id="SnowBall"><a href="#SnowBall" class="headerlink" title="SnowBall"></a>SnowBall</h2><p>AWS 提供 Snow 方案來作超大量 Data 轉移到 AWS 上</p><ul><li><p>SnowBall Edge：物理資料轉移解決方案 (轉移 TBs or PBs 等級資料)，AWS 提供物理硬碟，把資料放進後 AWS 會再派人把資料帶走匯入 AWS 雲端</p></li><li><p>Snowcone：比起 SnowBall 更輕便，更安全，可以存放 8 TB 資料，可以選擇直連 DataSync 到 DataCenter</p></li><li><p>Snowmobile：資料卡車可轉移 (1 EB &#x3D; 1000 PB) 資料</p></li></ul><h2 id="Route53"><a href="#Route53" class="headerlink" title="Route53"></a>Route53</h2><p>AWS Route 53 是 DNS 服務，有以下 Routing 類型</p><ul><li>Simple Routing</li><li>Weighted Routing</li><li>Latency-based Routing</li><li>Failover Routing</li><li>Geolocation Routing</li><li>Geoproximity Routing (Traffic Flow Only)</li><li>Multivalue Answer Routing</li></ul><h3 id="A-Record-vs-CNAME-vs-Alias-Record"><a href="#A-Record-vs-CNAME-vs-Alias-Record" class="headerlink" title="A Record vs CNAME vs Alias Record"></a>A 
Record：把 Domain 跟 IP 對照起來</li><li>CNAME：FQDN 指向一個 top level domain，類似一個參考<ul><li>Cannot use <code>example.com</code></li><li>Can use <code>www.example.com</code></li></ul></li><li>Alias Record：Alias Record 跟 CNAME 很類似，但它可以參考 subdomain or top level domain，底層實作使用 A Record</li></ul><h3 id="Health-Check"><a href="#Health-Check" class="headerlink" title="Health Check"></a>Health Check</h3><p>支援：HTTP、HTTPS、TCP<br>如果超過 18% 的 health checker 判斷 healthy，Route53 就會標示 healthy<br>Status Code 介於 200 ~ 299 都算 healthy，或是可以用回應的前 5120 bytes 內容判斷是否 healthy</p><h2 id="EC2"><a href="#EC2" class="headerlink" title="EC2"></a>EC2</h2><ul><li>On-Demand：一般租借 full price</li><li>Reserved：可以預約 1 或 3 年，價格相較於即時會比較便宜</li><li>Savings Plan：每個月固定給租金，可以使用任何主機，任何超出承諾的用量，則將以正常的隨需費率收費</li><li>Dedicated Hosts：專用主機，可讓您沿用現有以每個通訊端、每個核心或每個 VM 計算的軟體授權 (取決於您的授權條款規定)，這可協助您充分利用現有投資來節省資金。</li><li>Dedicated instance：專用執行個體，會放置於 VPC 內，會在硬體層級就進行隔離，只專屬於單一用戶</li></ul><p>透過以下網址可以取得 EC2 instance metadata</p><figure class="highlight text"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">http://169.254.169.254/latest/meta-data/public-ipv4</span><br></pre></td></tr></table></figure><p>在運行的 EC2 想要修改 <code>DeleteOnTermination</code> 可以透過 commandline 修改此值</p><h3 id="Spot-Instance"><a href="#Spot-Instance" class="headerlink" title="Spot Instance"></a>Spot Instance</h3><p>在 Launch instance 可以設定</p><p>Spot Instance：Stateless，設定一個 max price 如果低於這個價格可得到機器工作 (類似拍賣)，比起 On-Demand 擁有更便宜價格。</p><p>因為隨時可能都會被中斷，所以需要程式紀錄運行狀態，後續接續執行</p><blockquote><p>取消 Spot Request 不代表終止 Spot Instance，所以要手動終止 Spot Instance</p></blockquote><h3 id="Spot-Fleet"><a href="#Spot-Fleet" class="headerlink" title="Spot Fleet"></a>Spot Fleet</h3><p>Spot Fleet &#x3D; Spot Instance + (option) On-Demand Instance</p><p>盡量選擇最便宜價格，適合長運行可用 worker，可以設定要多少資源使用</p><h3 id="Spot-blocks"><a href="#Spot-blocks" class="headerlink" title="Spot blocks"></a>Spot blocks</h3><p>Spot blocks 允許使用者 booking 1 ~ 6 小時，來運作 EC2 
可以避免突然被中斷</p><h3 id="EC2-Placement-Groups"><a href="#EC2-Placement-Groups" class="headerlink" title="EC2 Placement Groups"></a>EC2 Placement Groups</h3><p>把 EC2 分群策略</p><ul><li><p>Cluster：</p><ul><li>優點：放在相同 AZ，互連速度快</li><li>缺點：如果該 AZ 或硬體故障，全部 instance 都連線不到（可用性較低）<br><img src="https://i.imgur.com/eWM7Ytk.png"></li></ul></li><li><p>Spread：</p><ul><li>優點：放置在不同區域 AZ &amp; 硬體，可靠性提高</li><li>缺點：限制 7 個 instance per AZ per group<br><img src="https://i.imgur.com/AL9ieFf.png"></li></ul></li><li><p>Partition：每個 Partition 都是獨立的</p><ul><li>適合用到 big data 或大量資料分析情境<br>  <img src="https://i.imgur.com/kxiu5DD.png"></li></ul></li></ul><p>EC2 User Data：預設只在第一次開機時執行，常用來安裝軟體</p><h3 id="Hibernate"><a href="#Hibernate" class="headerlink" title="Hibernate"></a>Hibernate</h3><p>因為保留 in-memory (RAM) state，所以在開機會比較快速 (OS is not stopped)</p><p>可用於 On-Demand、Reserved、Spot Instance Type</p><p>EC2 Hibernate the EC2 Instance Root Volume type 必須要是 EBS volume 且加密。</p><h2 id="Service-Control-Policies-SCP"><a href="#Service-Control-Policies-SCP" class="headerlink" title="Service Control Policies(SCP)"></a>Service Control Policies(SCP)</h2><p>黑白名單 IAM，適用於 OU or Account Level，除了 master account 外 User &amp; Roles 都適用 (包含 Root)</p><blockquote><p>預設全部都不允許</p></blockquote><h2 id="Redis-vs-Memcached"><a href="#Redis-vs-Memcached" class="headerlink" title="Redis vs Memcached"></a>Redis vs Memcached</h2><p>基本上都選擇 Redis，除非你的架構需要 Multithreaded architecture</p><blockquote><p><a href="https://aws.amazon.com/tw/elasticache/redis-vs-memcached/">https://aws.amazon.com/tw/elasticache/redis-vs-memcached/</a></p></blockquote><h2 id="考試建議"><a href="#考試建議" class="headerlink" title="考試建議"></a>考試建議</h2><p>我個人覺得考試內容蠻側重 AWS 服務之間的整合使用，ex: 身為架構師依照客戶目前情況，給出一個合理且最有經濟效益的方案…等等問題</p><p>所以對於每個服務種類的比較（VPC、網路服務、S3 各種類型的優缺點）必須要了解</p><p>SAA 考得真的蠻廣，像我這樣沒有任何雲端概念開始準備考試，我有下面幾個建議（依照重要程度來排序）</p><ol><li>買個線上課程，有線上課程可以省去很多時間找資料，跟著講師走可以知道 AWS 大致上服務功能</li><li>模擬考卷來做(很重要)，我使用 <a 
href="https://www.udemy.com/course/practice-exams-aws-certified-solutions-architect-associate/">Practice Exams | AWS Certified Solutions Architect Associate</a> 模擬試卷，個人覺得和考試卷蠻類似且都有附贈詳細解答說明，個人蠻推薦</li><li>看 AWS 白皮書和官網綜合案例，考題大多會給一個情境要你給出最合適答案，這和官網上 Best Practice 有些部分可以吻合，只是案例真的太多了建議有時間再來看</li></ol><blockquote><p>個人覺得此模擬試卷比真實考試簡單，如果購買此模擬試卷建議每個模擬試卷都要考 80% 以上再去考試</p></blockquote><p>我本次是在考試中心考試，我個人比較喜歡考試中心，原因是如果中途尿急可以上廁所 (考試時長 2 小時多)，如果選擇線上考試要注意不能中途上廁所不然就取消資格…</p><p>最後祝大家可以通過 SAA 證照</p><p><strong>此文作者</strong>：Daniel Shih(石頭)<br /><strong>此文地址</strong>： <a href="https://isdaniel.github.io/aws-certified-ssa-experience/">https://isdaniel.github.io/aws-certified-ssa-experience/</a> <br /><strong>版權聲明</strong>：本博客所有文章除特別聲明外，均採用 <a href="https://creativecommons.org/licenses/by-nc-sa/3.0/tw/">CC BY-NC-SA 3.0 TW</a> 許可協議。轉載請註明出處！</p>]]>
    </content>
    <id>https://isdaniel.github.io/aws-certified-ssa-experience/</id>
    <link href="https://isdaniel.github.io/aws-certified-ssa-experience/"/>
    <published>2022-06-26T12:00:00.000Z</published>
    <summary>AWS SAA-C02 認證考試完整指南：涵蓋 S3 儲存類型、EC2 執行個體、EBS、Route53、SnowBall 等核心服務重點整理，包含服務比較、最佳實踐案例與考試準備建議，助您順利通過 AWS 解決方案架構師認證考試</summary>
    <title>AWS Certified Solutions Architect - Associate (SAA-C02) 考試重點與心得</title>
    <updated>2026-04-22T03:00:22.022Z</updated>
  </entry>
  <entry>
    <author>
      <name>Daniel Shih</name>
    </author>
    <category term="AWS" scheme="https://isdaniel.github.io/categories/AWS/"/>
    <category term="Lambda" scheme="https://isdaniel.github.io/categories/AWS/Lambda/"/>
    <category term="AWS" scheme="https://isdaniel.github.io/tags/AWS/"/>
    <category term="Lambda" scheme="https://isdaniel.github.io/tags/Lambda/"/>
    <category term=".netcore" scheme="https://isdaniel.github.io/tags/netcore/"/>
    <content>
      <![CDATA[<h2 id="前言"><a href="#前言" class="headerlink" title="前言"></a>前言</h2><p>AWS lambda 作為 serverless 服務，之前有介紹過 <a href="aws-first-lambda.md">AWS Lambda 初體驗 by .net core</a>，本次要介紹 <a href="https://www.serverless.com/framework/docs">serverless</a> 框架搭配 AWS <code>CloudFormation</code> (IaC)</p><p>Serverless 預設使用的 provider 是 AWS</p><blockquote><p>AWS is the default cloud provider used by Serverless Framework.</p></blockquote><h2 id="建立第一個-serverless"><a href="#建立第一個-serverless" class="headerlink" title="建立第一個 serverless"></a>建立第一個 serverless</h2><p>本次案例我們利用 <a href="https://github.com/serverless/serverless#install-via-npm">serverless cli</a> 建立 dotnet template，利用 npm 安裝 </p><figure class="highlight cmd"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">npm install -g serverless</span><br></pre></td></tr></table></figure><p>安裝完後建立一個 dotnet core serverless project</p><figure class="highlight cmd"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">serverless create -t aws-csharp -n dotnetServerless</span><br></pre></td></tr></table></figure><p>本次使用參數說明</p><ul><li>--template &#x2F; -t：Template for the service</li><li>--name &#x2F; -n：Name for the service. 
Overwrites the default name of the created service</li></ul><p>跑完命令後會出現下圖專案結構</p><ul><li>Build script： template 產生 build 專案腳本<ul><li>build.cmd：window 使用</li><li>build.sh：linux 使用</li></ul></li><li>Handler.cs：template 預設 lambda 呼叫點</li><li>serverless.yml：<ul><li>provider：<a href="https://www.serverless.com/framework/docs/providers">Serverless Infrastructure Providers</a> AWS,Azure,GCP…</li><li>service：部署上 serverless 名稱(本次使用 lambda)</li><li>frameworkVersion：使用 Serverless 版本(建議使用)</li><li>runtime：運行環境</li><li>functions：<ul><li>handler：運行執行 serverless entry point</li></ul></li></ul></li></ul><p>長出來的 <code>serverless.yml</code> 會如下</p><figure class="highlight yaml"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br></pre></td><td class="code"><pre><span class="line"><span class="attr">service:</span> <span class="string">dotnetServerless</span></span><br><span class="line"></span><br><span class="line"><span class="attr">frameworkVersion:</span> <span class="string">&#x27;3&#x27;</span></span><br><span class="line"></span><br><span class="line"><span class="attr">provider:</span></span><br><span class="line">  <span class="attr">name:</span> <span class="string">aws</span></span><br><span class="line">  <span class="attr">runtime:</span> <span class="string">dotnet6</span></span><br><span class="line"></span><br><span class="line"><span class="attr">package:</span></span><br><span class="line">  <span 
class="attr">individually:</span> <span class="literal">true</span></span><br><span class="line"></span><br><span class="line"><span class="attr">functions:</span></span><br><span class="line">  <span class="attr">hello:</span></span><br><span class="line">    <span class="attr">handler:</span> <span class="string">CsharpHandlers::AwsDotnetCsharp.Handler::Hello</span></span><br><span class="line"></span><br><span class="line">    <span class="attr">package:</span></span><br><span class="line">      <span class="attr">artifact:</span> <span class="string">bin/Release/net6.0/hello.zip</span></span><br><span class="line">  </span><br></pre></td></tr></table></figure><p><img src="https://i.imgur.com/TRP3aX4.png"></p><h3 id="Deploy-serverless-package"><a href="#Deploy-serverless-package" class="headerlink" title="Deploy serverless package"></a>Deploy serverless package</h3><p>執行 <code>serverless deploy</code> 命令我們會將</p><blockquote><p>預設 Deploy Region : <code>us-east-1</code> 或是使用參數 <code>--region / -r</code> 指定上傳 Region</p></blockquote><figure class="highlight lasso"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line">Deploying dotnetServerless <span class="keyword">to</span> stage dev (us<span class="params">-east</span><span class="number">-1</span>)</span><br><span class="line"></span><br><span class="line">✔ Service deployed <span class="keyword">to</span> <span class="built_in">stack</span> dotnetServerless<span class="params">-dev</span> (<span class="number">113</span>s)</span><br></pre></td></tr></table></figure><p>上傳完畢後在 <code>CloudFormation</code> 應該可以看到我們建立資源如下</p><ul><li>AWS::Lambda::Function</li><li>AWS::Lambda::Version</li><li>AWS::Logs::LogGroup</li><li>AWS::IAM::Role</li><li>AWS::S3::Bucket</li><li>AWS::S3::BucketPolicy</li></ul><p>在 lambda 會自動建立 <code>dotnetServerless-dev-hello</code></p><p>我們利用 UI 測試 lambda 會得到如下資訊</p><figure 
class="highlight json"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line"><span class="punctuation">&#123;</span></span><br><span class="line">  <span class="attr">&quot;Message&quot;</span><span class="punctuation">:</span> <span class="string">&quot;Go Serverless v1.0! Your function executed successfully!&quot;</span><span class="punctuation">,</span></span><br><span class="line">  <span class="attr">&quot;Request&quot;</span><span class="punctuation">:</span> <span class="punctuation">&#123;</span></span><br><span class="line">    <span class="attr">&quot;Key1&quot;</span><span class="punctuation">:</span> <span class="string">&quot;value1&quot;</span><span class="punctuation">,</span></span><br><span class="line">    <span class="attr">&quot;Key2&quot;</span><span class="punctuation">:</span> <span class="string">&quot;value2&quot;</span><span class="punctuation">,</span></span><br><span class="line">    <span class="attr">&quot;Key3&quot;</span><span class="punctuation">:</span> <span class="string">&quot;value3&quot;</span></span><br><span class="line">  <span class="punctuation">&#125;</span></span><br><span class="line"><span class="punctuation">&#125;</span></span><br></pre></td></tr></table></figure><h3 id="lambda-ReDeploy"><a href="#lambda-ReDeploy" class="headerlink" title="lambda ReDeploy"></a>lambda ReDeploy</h3><p>我們稍微更新 lambda 回應資訊</p><figure class="highlight c#"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="keyword">public</span> Response <span 
class="title">Hello</span>(<span class="params">Request request</span>)</span></span><br><span class="line">&#123;</span><br><span class="line">    <span class="keyword">return</span> <span class="keyword">new</span> Response(<span class="string">&quot;Lambda Upgrade !!&quot;</span>, request);</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>再次執行 <code>.\build.cmd &amp; serverless deploy</code> 後並測試 lambda 會得到更新後資訊，讓我們更新 lambda 變得很簡單，是不是很猛</p><figure class="highlight json"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line"><span class="punctuation">&#123;</span></span><br><span class="line">  <span class="attr">&quot;Message&quot;</span><span class="punctuation">:</span> <span class="string">&quot;Lambda Upgrade !!&quot;</span><span class="punctuation">,</span></span><br><span class="line">  <span class="attr">&quot;Request&quot;</span><span class="punctuation">:</span> <span class="punctuation">&#123;</span></span><br><span class="line">    <span class="attr">&quot;Key1&quot;</span><span class="punctuation">:</span> <span class="string">&quot;value1&quot;</span><span class="punctuation">,</span></span><br><span class="line">    <span class="attr">&quot;Key2&quot;</span><span class="punctuation">:</span> <span class="string">&quot;value2&quot;</span><span class="punctuation">,</span></span><br><span class="line">    <span class="attr">&quot;Key3&quot;</span><span class="punctuation">:</span> <span class="string">&quot;value3&quot;</span></span><br><span class="line">  <span class="punctuation">&#125;</span></span><br><span class="line"><span class="punctuation">&#125;</span></span><br></pre></td></tr></table></figure><h2 id="小結"><a href="#小結" 
class="headerlink" title="小結"></a>小結</h2><p>Serverless 框架搭配 <code>CloudFormation</code> 幫助我們把許多自動化細節封裝起來，讓我們只需要專注於開發，利用 Serverless cli 我們可以快速把一個 CI&#x2F;CD lambda 流程建立起來</p><p>本次 Sample code 連結 <a href="https://github.com/isdaniel/BlogSample/tree/master/src/AWS_Sample/DotNetServerless">DotNetServerless</a></p><p><strong>此文作者</strong>：Daniel Shih(石頭)<br /><strong>此文地址</strong>： <a href="https://isdaniel.github.io/aws-serverless/">https://isdaniel.github.io/aws-serverless/</a> <br /><strong>版權聲明</strong>：本博客所有文章除特別聲明外，均採用 <a href="https://creativecommons.org/licenses/by-nc-sa/3.0/tw/">CC BY-NC-SA 3.0 TW</a> 許可協議。轉載請註明出處！</p>]]>
    </content>
    <id>https://isdaniel.github.io/aws-serverless/</id>
    <link href="https://isdaniel.github.io/aws-serverless/"/>
    <published>2022-05-16T22:30:11.000Z</published>
    <summary>AWS lambda 作為 serverless 服務，之前有介紹過 AWS Lambda 初體驗 by .net core，本次要介紹 serverless 框架搭配 AWS CloudFormation (IaC)</summary>
    <title>Serverless + CloudFormation 撰寫 lambda</title>
    <updated>2026-04-22T03:00:22.022Z</updated>
  </entry>
  <entry>
    <author>
      <name>Daniel Shih</name>
    </author>
    <category term="Postgresql" scheme="https://isdaniel.github.io/categories/Postgresql/"/>
    <category term="Vacuum" scheme="https://isdaniel.github.io/categories/Postgresql/Vacuum/"/>
    <category term="Postgresql" scheme="https://isdaniel.github.io/tags/Postgresql/"/>
    <category term="Vacuum" scheme="https://isdaniel.github.io/tags/Vacuum/"/>
    <category term="AutoVacuum" scheme="https://isdaniel.github.io/tags/AutoVacuum/"/>
    <content>
      <![CDATA[<h2 id="前言"><a href="#前言" class="headerlink" title="前言"></a>前言</h2><p>AutoVacuum 在 Postgresql 是一個很重要的機制(甚至可以說最重要也不為過)，但裡面有些地方需要了解，今天就帶大家初探</p><h2 id="資料-測試資料資訊"><a href="#資料-測試資料資訊" class="headerlink" title="資料 &amp; 測試資料資訊"></a>資料 &amp; 測試資料資訊</h2><p>本次執行 Sample Data</p><figure class="highlight sql"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">CREATE TABLE</span> T1 (</span><br><span 
class="line">    ID <span class="type">INT</span> <span class="keyword">NOT NULL</span> <span class="keyword">PRIMARY KEY</span>,</span><br><span class="line">val <span class="type">INT</span> <span class="keyword">NOT NULL</span>,</span><br><span class="line">col1 UUID <span class="keyword">NOT NULL</span>,</span><br><span class="line">col2 UUID <span class="keyword">NOT NULL</span>,</span><br><span class="line">col3 UUID <span class="keyword">NOT NULL</span>,</span><br><span class="line">col4 UUID <span class="keyword">NOT NULL</span>,</span><br><span class="line">col5 UUID <span class="keyword">NOT NULL</span>,</span><br><span class="line">col6 UUID <span class="keyword">NOT NULL</span></span><br><span class="line">);</span><br><span class="line"></span><br><span class="line"></span><br><span class="line"><span class="keyword">INSERT INTO</span> T1</span><br><span class="line"><span class="keyword">SELECT</span> i,</span><br><span class="line">       RANDOM() <span class="operator">*</span> <span class="number">1000000</span>,</span><br><span class="line">   md5(random()::text <span class="operator">||</span> clock_timestamp()::text)::uuid,</span><br><span class="line">   md5(random()::text <span class="operator">||</span> clock_timestamp()::text)::uuid,</span><br><span class="line">   md5(random()::text <span class="operator">||</span> clock_timestamp()::text)::uuid,</span><br><span class="line">   md5(random()::text <span class="operator">||</span> clock_timestamp()::text)::uuid,</span><br><span class="line">   md5(random()::text <span class="operator">||</span> clock_timestamp()::text)::uuid,</span><br><span class="line">   md5(random()::text <span class="operator">||</span> clock_timestamp()::text)::uuid</span><br><span class="line"><span class="keyword">FROM</span> generate_series(<span class="number">1</span>,<span class="number">20000000</span>) i;</span><br><span class="line"></span><br><span class="line"></span><br><span class="line"><span 
class="keyword">CREATE TABLE</span> T2 (</span><br><span class="line">    ID <span class="type">INT</span> <span class="keyword">NOT NULL</span> <span class="keyword">PRIMARY KEY</span>,</span><br><span class="line">val <span class="type">INT</span> <span class="keyword">NOT NULL</span>,</span><br><span class="line">col1 UUID <span class="keyword">NOT NULL</span>,</span><br><span class="line">col2 UUID <span class="keyword">NOT NULL</span>,</span><br><span class="line">col3 UUID <span class="keyword">NOT NULL</span>,</span><br><span class="line">col4 UUID <span class="keyword">NOT NULL</span>,</span><br><span class="line">col5 UUID <span class="keyword">NOT NULL</span>,</span><br><span class="line">col6 UUID <span class="keyword">NOT NULL</span></span><br><span class="line">);</span><br><span class="line"></span><br><span class="line"><span class="keyword">INSERT INTO</span> T2</span><br><span class="line"><span class="keyword">SELECT</span> i,</span><br><span class="line">       RANDOM() <span class="operator">*</span> <span class="number">1000000</span>,</span><br><span class="line">   md5(random()::text <span class="operator">||</span> clock_timestamp()::text)::uuid,</span><br><span class="line">   md5(random()::text <span class="operator">||</span> clock_timestamp()::text)::uuid,</span><br><span class="line">   md5(random()::text <span class="operator">||</span> clock_timestamp()::text)::uuid,</span><br><span class="line">   md5(random()::text <span class="operator">||</span> clock_timestamp()::text)::uuid,</span><br><span class="line">   md5(random()::text <span class="operator">||</span> clock_timestamp()::text)::uuid,</span><br><span class="line">   md5(random()::text <span class="operator">||</span> clock_timestamp()::text)::uuid</span><br><span class="line"><span class="keyword">FROM</span> generate_series(<span class="number">1</span>,<span class="number">1000000</span>) i;</span><br><span class="line"></span><br><span class="line">vacuum ANALYZE 
T1;</span><br><span class="line">vacuum ANALYZE T2;</span><br></pre></td></tr></table></figure><p>查詢 sample code</p><figure class="highlight sql"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line">EXPLAIN (ANALYZE,TIMING <span class="keyword">ON</span>,BUFFERS <span class="keyword">ON</span>)</span><br><span class="line"><span class="keyword">SELECT</span> t1.<span class="operator">*</span></span><br><span class="line"><span class="keyword">FROM</span> T1 </span><br><span class="line"><span class="keyword">INNER</span> <span class="keyword">JOIN</span> T2 <span class="keyword">ON</span> t1.id <span class="operator">=</span> t2.id </span><br><span class="line"><span class="keyword">WHERE</span> t1.id <span class="operator">&lt;</span> <span class="number">1000000</span> </span><br></pre></td></tr></table></figure><p>此次查詢如期走 <code>Merge Join</code></p><figure class="highlight n1ql"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br></pre></td><td class="code"><pre><span class="line">&quot;Gather  (cost=1016.37..30569.85 rows=53968 width=104) (actual time=0.278..837.297 rows=999999 loops=1)&quot;</span><br><span class="line">&quot;  Workers Planned: 2&quot;</span><br><span class="line">&quot;  Workers Launched: 2&quot;</span><br><span class="line">&quot;  Buffers: shared 
hit=38273 read=21841&quot;</span><br><span class="line">&quot;  -&gt;  <span class="keyword">Merge</span> <span class="keyword">Join</span>  (cost=<span class="number">16.37</span>.<span class="number">.24173</span><span class="number">.05</span> rows=<span class="number">22487</span> width=<span class="number">104</span>) (actual time=<span class="number">11.993</span>.<span class="number">.662</span><span class="number">.770</span> rows=<span class="number">333333</span> loops=<span class="number">3</span>)<span class="string">&quot;</span></span><br><span class="line"><span class="string">&quot;</span>        <span class="keyword">Merge</span> Cond: (t2.id = t1.id)<span class="string">&quot;</span></span><br><span class="line"><span class="string">&quot;</span>        Buffers: shared hit=<span class="number">38273</span> read=<span class="number">21841</span><span class="string">&quot;</span></span><br><span class="line"><span class="string">&quot;</span>        -&gt;  Parallel <span class="keyword">Index</span> Only Scan <span class="keyword">using</span> t2_pkey <span class="keyword">on</span> t2  (cost=<span class="number">0.42</span>.<span class="number">.20147</span><span class="number">.09</span> rows=<span class="number">416667</span> width=<span class="number">4</span>) (actual time=<span class="number">0.041</span>.<span class="number">.69</span><span class="number">.947</span> rows=<span class="number">333333</span> loops=<span class="number">3</span>)<span class="string">&quot;</span></span><br><span class="line"><span class="string">&quot;</span>              Heap Fetches: <span class="number">0</span><span class="string">&quot;</span></span><br><span class="line"><span class="string">&quot;</span>              Buffers: shared hit=<span class="number">6</span> read=<span class="number">2732</span><span class="string">&quot;</span></span><br><span class="line"><span class="string">&quot;</span>        -&gt;  <span class="keyword">Index</span> Scan 
<span class="keyword">using</span> t1_pkey <span class="keyword">on</span> t1  (cost=<span class="number">0.44</span>.<span class="number">.48427</span><span class="number">.24</span> rows=<span class="number">1079360</span> width=<span class="number">104</span>) (actual time=<span class="number">0.041</span>.<span class="number">.329</span><span class="number">.874</span> rows=<span class="number">999819</span> loops=<span class="number">3</span>)<span class="string">&quot;</span></span><br><span class="line"><span class="string">&quot;</span>              <span class="keyword">Index</span> Cond: (id &lt; <span class="number">1000000</span>)<span class="string">&quot;</span></span><br><span class="line"><span class="string">&quot;</span>              Buffers: shared hit=<span class="number">38267</span> read=<span class="number">19109</span><span class="string">&quot;</span></span><br><span class="line"><span class="string">&quot;</span>Planning:<span class="string">&quot;</span></span><br><span class="line"><span class="string">&quot;</span>  Buffers: shared hit=<span class="number">4</span> read=<span class="number">8</span><span class="string">&quot;</span></span><br><span class="line"><span class="string">&quot;</span>Planning Time: <span class="number">0.228</span> ms<span class="string">&quot;</span></span><br><span class="line"><span class="string">&quot;</span>Execution Time: <span class="number">906.760</span> ms<span class="string">&quot;</span></span><br></pre></td></tr></table></figure><p>Now suppose we update a large number of rows, but not enough to cross the autovacuum threshold:</p><figure class="highlight sql"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">update</span> T1</span><br><span class="line"><span class="keyword">set</span> id <span class="operator">=</span> id <span class="operator">+</span> <span class="number">100000000</span></span><br><span
class="line"><span class="keyword">where</span> id <span class="operator">&lt;</span> <span class="number">1000000</span></span><br></pre></td></tr></table></figure><p>Querying again, we can see the execution plan still uses a <code>Merge Join</code> (even though, given the current data volume, a <code>Merge Join</code> should no longer be chosen). What causes this?</p><p>It happens because the statistics have not yet been brought in line with the latest data volume.</p><figure class="highlight text"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br></pre></td><td class="code"><pre><span class="line">&quot;Gather  (cost=1016.37..30707.83 rows=53968 width=104) (actual time=51.403..55.517 rows=0 loops=1)&quot;</span><br><span class="line">&quot;  Workers Planned: 2&quot;</span><br><span class="line">&quot;  Workers Launched: 2&quot;</span><br><span class="line">&quot;  Buffers: shared hit=8215&quot;</span><br><span class="line">&quot;  -&gt;  Merge Join  (cost=16.37..24311.03 rows=22487 width=104) (actual time=6.736..6.738 rows=0 loops=3)&quot;</span><br><span class="line">&quot;        Merge Cond: (t2.id = t1.id)&quot;</span><br><span class="line">&quot;        Buffers: shared hit=8215&quot;</span><br><span class="line">&quot;        -&gt;  Parallel Index Only Scan using t2_pkey on t2  (cost=0.42..20147.09 rows=416667 width=4) (actual time=0.024..0.024 rows=1 loops=3)&quot;</span><br><span class="line">&quot;              Heap Fetches: 0&quot;</span><br><span class="line">&quot;              Buffers: shared hit=8&quot;</span><br><span class="line">&quot;        -&gt;  Index Scan using t1_pkey on t1  (cost=0.44..50848.71 rows=1133330
width=104) (actual time=6.710..6.710 rows=0 loops=3)&quot;</span><br><span class="line">&quot;              Index Cond: (id &lt; 1000000)&quot;</span><br><span class="line">&quot;              Buffers: shared hit=8207&quot;</span><br><span class="line">&quot;Planning:&quot;</span><br><span class="line">&quot;  Buffers: shared hit=2745&quot;</span><br><span class="line">&quot;Planning Time: 3.938 ms&quot;</span><br><span class="line">&quot;Execution Time: 55.550 ms&quot;</span><br></pre></td></tr></table></figure><h3 id="ANALYZE-VACUUM"><a href="#ANALYZE-VACUUM" class="headerlink" title="ANALYZE &amp; VACUUM"></a>ANALYZE &amp; VACUUM</h3><ul><li>ANALYZE:<ol><li>Mainly updates table statistics, which lets the query optimizer (QO) build better execution plans</li></ol></li><li>VACUUM:<ol><li>Frees dead tuples for reuse, although the disk space is not returned to the operating system (releasing the disk space requires VACUUM FULL)</li><li>Freezes old transaction IDs (to avoid transaction ID wraparound)</li><li>Creates &amp; updates the visibility map files</li></ol></li></ul><h3 id="使用-ANALYZE-VACUUM-後續"><a href="#使用-ANALYZE-VACUUM-後續" class="headerlink" title="After running ANALYZE &amp; VACUUM"></a>After running ANALYZE &amp; VACUUM</h3><p>After executing <code>vacuum ANALYZE T1;</code>, running the query again shows a correct plan: it uses the <code>Nested Loop</code> algorithm, reads only 3 blocks, and the execution time drops dramatically.</p><figure class="highlight text"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br></pre></td><td class="code"><pre><span class="line">&quot;QUERY PLAN&quot;</span><br><span class="line">&quot;Nested Loop  (cost=0.86..8.90 rows=1 width=104) (actual time=0.004..0.004 rows=0 loops=1)&quot;</span><br><span class="line">&quot;  Buffers: shared hit=3&quot;</span><br><span class="line">&quot;  -&gt;  Index Scan using t1_pkey on t1
(cost=0.44..4.46 rows=1 width=104) (actual time=0.003..0.003 rows=0 loops=1)&quot;</span><br><span class="line">&quot;        Index Cond: (id &lt; 1000000)&quot;</span><br><span class="line">&quot;        Buffers: shared hit=3&quot;</span><br><span class="line">&quot;  -&gt;  Index Only Scan using t2_pkey on t2  (cost=0.42..4.44 rows=1 width=4) (never executed)&quot;</span><br><span class="line">&quot;        Index Cond: (id = t1.id)&quot;</span><br><span class="line">&quot;        Heap Fetches: 0&quot;</span><br><span class="line">&quot;Planning:&quot;</span><br><span class="line">&quot;  Buffers: shared hit=20&quot;</span><br><span class="line">&quot;Planning Time: 0.232 ms&quot;</span><br><span class="line">&quot;Execution Time: 0.027 ms&quot;</span><br></pre></td></tr></table></figure><p>With accurate statistics, the QO can make the right decisions.</p><h2 id="Auto-vacuum-時機"><a href="#Auto-vacuum-時機" class="headerlink" title="When autovacuum kicks in"></a>When autovacuum kicks in</h2><p>Anyone unfamiliar with when autovacuum triggers might ask: PostgreSQL runs <code>vacuum</code> for us automatically, so why did the example above still need a manual run?</p><blockquote><p>Because the case above had not yet crossed the threshold, no vacuum was performed.</p></blockquote><p>We can query <code>pg_settings</code> to inspect the key autovacuum settings of the current postgres instance:</p><figure class="highlight sql"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">select</span> name,setting</span><br><span class="line"><span class="keyword">from</span> pg_settings </span><br><span class="line"><span class="keyword">where</span> name <span class="keyword">in</span> (<span class="string">&#x27;autovacuum_vacuum_scale_factor&#x27;</span>,<span class="string">&#x27;autovacuum_analyze_scale_factor&#x27;</span>,<span class="string">&#x27;autovacuum_analyze_threshold&#x27;</span>,<span class="string">&#x27;autovacuum_vacuum_threshold&#x27;</span>);</span><br></pre></td></tr></table></figure><p>Autovacuum uses the two formulas below to decide whether a vacuum or analyze is needed; think of it as an accumulating pool that tracks the current number of dead tuples in the table.</p><figure class="highlight text"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">auto_analyze trigger = autovacuum_analyze_scale_factor * number of tuples + autovacuum_analyze_threshold</span><br><span class="line">auto_vacuum trigger = autovacuum_vacuum_scale_factor * number of tuples + autovacuum_vacuum_threshold</span><br></pre></td></tr></table></figure><p><a href="https://www.postgresql.org/docs/current/routine-vacuuming.html#AUTOVACUUM">Autovacuum Daemon</a></p><hr><table><thead><tr><th>name</th><th>setting</th></tr></thead><tbody><tr><td>autovacuum_analyze_scale_factor</td><td>0.05</td></tr><tr><td>autovacuum_analyze_threshold</td><td>50</td></tr><tr><td>autovacuum_vacuum_scale_factor</td><td>0.1</td></tr><tr><td>autovacuum_vacuum_threshold</td><td>50</td></tr></tbody></table><p>For example, the <code>T1</code> table above currently holds <code>20000000</code> rows.</p><p><code>0.05 * 20000000 + 50 = 1000050</code>, so <code>auto_analyze</code> fires once the table has accumulated <code>1000050</code> dead tuples.</p><p><code>0.1 * 20000000 + 50 = 2000050</code>, so <code>auto_vacuum</code> fires once the table has accumulated <code>2000050</code> dead tuples.</p><p>A problem becomes apparent: the larger the table, the harder it is to reach the trigger condition.</p><p>So if a production execution plan goes off the rails, you can use the query below to check when vacuum last ran on the table:</p><figure class="highlight sql"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">SELECT</span></span><br><span class="line">  schemaname, relname,</span><br><span class="line">  last_vacuum, last_autovacuum,</span><br><span class="line">  <span class="comment">--vacuum_count, autovacuum_count,</span></span><br><span class="line">  last_analyze,last_autoanalyze</span><br><span class="line"><span class="keyword">FROM</span>
pg_stat_user_tables</span><br><span class="line"><span class="keyword">where</span> relname <span class="operator">=</span> <span class="string">&#x27;t2&#x27;</span>;</span><br></pre></td></tr></table></figure><p><img src="https://i.imgur.com/OadcNdT.png"></p><h3 id="By-Table-設定-autovacuum"><a href="#By-Table-設定-autovacuum" class="headerlink" title="Setting autovacuum per table"></a>Setting autovacuum per table</h3><p>Because every table has its own usage pattern, postgres lets you set the thresholds per table.</p><p>Here we change the <code>auto_analyze</code> &amp; <code>auto_vacuum</code> thresholds of t2 to 1000 so that autovacuum triggers more easily.</p><blockquote><p>To keep things easy to follow, I set <code>autovacuum_vacuum_threshold</code> and <code>autovacuum_analyze_threshold</code> to 0.</p></blockquote><figure class="highlight sql"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">ALTER TABLE</span> t2 <span class="keyword">SET</span> (autovacuum_vacuum_scale_factor <span class="operator">=</span> <span class="number">0.001</span>);</span><br><span class="line"><span class="keyword">ALTER TABLE</span> t2 <span class="keyword">SET</span> (autovacuum_vacuum_threshold <span class="operator">=</span> <span class="number">0</span>);</span><br><span class="line"><span class="keyword">ALTER TABLE</span> t2 <span class="keyword">SET</span> (autovacuum_analyze_scale_factor <span class="operator">=</span> <span class="number">0.001</span>);</span><br><span class="line"><span class="keyword">ALTER TABLE</span> t2 <span class="keyword">SET</span> (autovacuum_analyze_threshold <span class="operator">=</span> <span class="number">0</span>);</span><br></pre></td></tr></table></figure><ul><li>autovacuum_analyze &#x3D; <code>0.001 * 1000000 + 0 = 1000</code></li><li>autovacuum_vacuum &#x3D; <code>0.001 * 1000000 + 0 = 1000</code></li></ul><p>I use the query below to check how many dead tuples a given table currently has:</p><figure class="highlight sql"><table><tr><td
class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">select</span> relname,schemaname,n_dead_tup <span class="keyword">as</span> &quot;dead tuple count&quot;,</span><br><span class="line">(<span class="keyword">case</span>  <span class="keyword">when</span> n_live_tup <span class="operator">&gt;</span> <span class="number">0</span> <span class="keyword">then</span> </span><br><span class="line"> n_dead_tup::float8<span class="operator">/</span>n_live_tup::float8</span><br><span class="line"><span class="keyword">else</span></span><br><span class="line"> <span class="number">0</span></span><br><span class="line"><span class="keyword">end</span>) <span class="keyword">as</span> &quot;dead/live tuple ratio&quot;</span><br><span class="line"><span class="keyword">from</span> pg_stat_all_tables</span><br><span class="line"><span class="keyword">where</span> relname <span class="operator">=</span> <span class="string">&#x27;t2&#x27;</span></span><br></pre></td></tr></table></figure><p>In other words, once the t2 table accumulates more than 1000 dead tuples, the statistics refresh is triggered (a bit like a &quot;retroactive correction&quot; XD):</p><figure class="highlight sql"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">UPDATE</span> t2</span><br><span class="line"><span class="keyword">SET</span> val <span class="operator">=</span> <span class="number">20000</span></span><br><span class="line"><span class="keyword">WHERE</span> id <span class="operator">&lt;</span> <span class="number">1002</span></span><br></pre></td></tr></table></figure><p>A short while later, the postgresql console log shows autovacuum running in the background and marking those tuples:</p><p><img
src="https://i.imgur.com/4hec84Q.png"></p><p>If the rows we update never reach the threshold, however, the statistics go stale and dead tuples pile up:</p><figure class="highlight sql"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">UPDATE</span> t2</span><br><span class="line"><span class="keyword">SET</span> val <span class="operator">=</span> <span class="number">100</span></span><br><span class="line"><span class="keyword">WHERE</span> id <span class="operator">&lt;=</span> <span class="number">999</span></span><br></pre></td></tr></table></figure><h3 id="auto-vaccum-與-insert-data"><a href="#auto-vaccum-與-insert-data" class="headerlink" title="auto vacuum and insert data"></a>auto vacuum and insert data</h3><p>Everything above deals with dead tuples. Do inserts also trigger auto vacuum?</p><p>Answer: it depends on your PostgreSQL version. Before PG 13, inserting rows beyond the thresholds does not trigger it, but from PG 13 onward auto vacuum will run.</p><p>If your workload involves frequent bulk inserts, I recommend upgrading to PG 13 or later so that statistics are refreshed on a more regular basis.</p><blockquote><p><a href="https://postgrespro.com/list/id/CAODZiv5M+g7DmtVYi2VqXWh44FNgnBNMkwEFbC_WCLLDKn+=7g@mail.gmail.com#head">https://postgrespro.com/list/id/CAODZiv5M+g7DmtVYi2VqXWh44FNgnBNMkwEFbC_WCLLDKn+=7g@mail.gmail.com#head</a></p></blockquote><h2 id="小結"><a href="#小結" class="headerlink" title="Summary"></a>Summary</h2><p>After working through these resources and comparing them with my earlier sql-server experience: postgres lets you set an <code>autovacuum</code> threshold suited to the characteristics of each table, so statistics are refreshed regularly.</p><p>Running <code>vacuum</code> takes a <code>SHARE UPDATE EXCLUSIVE</code> lock on the table and performs some IO.</p><blockquote><p>The table can still serve CRUD as usual, but there is still some extra load on it.</p></blockquote><p>So I recommend running vacuum during off-peak hours; that said, if the statistics have drifted badly, update them anyway, otherwise the problem will only grow.</p><p>Whether to tune autovacuum thresholds table by table depends on the situation.</p><p>If the data volume grows large enough to require read/write splitting, I suggest running <code>VACUUM</code> and <code>ANALYZE</code> on the read DB once a day during off-peak hours.</p><blockquote><p>A common strategy for read-mostly databases is to run VACUUM and ANALYZE once a day during a low-usage time of day.
(This will not be sufficient if there is heavy update activity.)</p></blockquote><p>Reference: <a href="https://www.postgresql.org/docs/14/sql-analyze.html">https://www.postgresql.org/docs/14/sql-analyze.html</a></p><p><strong>Author</strong>: Daniel Shih (石頭)<br /><strong>Link</strong>: <a href="https://isdaniel.github.io/postgresql-autovacuum/">https://isdaniel.github.io/postgresql-autovacuum/</a> <br /><strong>License</strong>: Unless otherwise stated, all posts on this blog are licensed under <a href="https://creativecommons.org/licenses/by-nc-sa/3.0/tw/">CC BY-NC-SA 3.0 TW</a>. Please credit the source when reposting!</p>]]>
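<![CDATA[<hr><p>Appendix: building on the trigger formulas above, the query below estimates how close each table is to its autovacuum trigger. It is only a sketch: it reads the global <code>autovacuum_vacuum_scale_factor</code> / <code>autovacuum_vacuum_threshold</code> settings and ignores any per-table overrides set via <code>ALTER TABLE ... SET (...)</code>, so treat the numbers as an approximation.</p><figure class="highlight sql"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line">-- dead tuples vs. the estimated auto_vacuum trigger threshold</span><br><span class="line">SELECT s.relname,</span><br><span class="line">       s.n_dead_tup,</span><br><span class="line">       current_setting(&#x27;autovacuum_vacuum_scale_factor&#x27;)::float8 * c.reltuples</span><br><span class="line">         + current_setting(&#x27;autovacuum_vacuum_threshold&#x27;)::int AS vacuum_trigger</span><br><span class="line">FROM pg_stat_user_tables s</span><br><span class="line">JOIN pg_class c ON c.oid = s.relid</span><br><span class="line">ORDER BY s.n_dead_tup DESC;</span><br></pre></td></tr></table></figure><p>A table whose <code>n_dead_tup</code> exceeds <code>vacuum_trigger</code> is due to be vacuumed on the next autovacuum pass.</p>]]>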
    </content>
    <id>https://isdaniel.github.io/postgresql-autovacuum/</id>
    <link href="https://isdaniel.github.io/postgresql-autovacuum/"/>
    <published>2022-01-21T08:12:43.000Z</published>
    <summary>AutoVacuum 在 Postgresql 是一個很重要的機制(甚至可以說最重要也不為過)，但裡面有些地方需要了解今天就帶大家初探</summary>
    <title>Postgresql AutoVacuum 介紹</title>
    <updated>2026-04-22T03:00:22.030Z</updated>
  </entry>
</feed>
