• 1. Introduction

    I started where most people start: asking Claude.ai to analyze stocks for me. “What’s AAPL’s RSI?” — and it works, kind of. Claude can fetch data through MCP tools, compute a single indicator for a single ticker, and give you a reasonable answer.

    But the moment I tried anything real — “Screen all US stocks for oversold signals” or “Compare RSI and MACD across my 50-ticker watchlist” — it fell apart. The LLM can only handle one ticker at a time. The data comes from third-party APIs with no guarantee of accuracy. Ten tickers already eat 11% of the context window. And every query costs tokens.

    I wanted something different: accurate data I control, analysis across thousands of tickers at once, and a cost I could sustain as a personal project — not a $200/month cloud bill. So I built the data infrastructure first, then used AI as the last step — not to fetch and compute, but to write the C++ DuckDB extension code that made it all possible.


    The Problem: Three Ways to Analyze a Stock

    I want a simple answer: “What are AAPL’s SMA(20), RSI(14), and EMA(12) for the past year?”

    Here are three ways to get it — same question, very different trade-offs.

    Way 1: LLM Tool Prompt

    Using Claude.ai with a financial data MCP server (e.g., FMP — Financial Modeling Prep), you can ask a question in natural language and the LLM fetches the data for you:

    You: "Calculate SMA(20), RSI(14), and EMA(12) for AAPL's
    last 365 days of daily closing prices."

    Claude calls the FMP MCP tool, receives ~250 rows of JSON (~11,375 tokens), then tries to compute the indicators. For one ticker, this works — the API call takes 7.1 seconds, plus ~5 seconds of inference, totaling about 12 seconds and $0.03 in tokens. But for 10 tickers? That’s 71 seconds of API calls, 11% of the context window, and $0.30 in tokens. At 100 tickers, the data doesn’t fit in the context window at all.
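    The context-window math above is easy to sanity-check. A quick sketch in Python — the per-ticker figures (~11,375 tokens and 7.1 s per API call) are the measurements quoted in this article; the 1M-token window size is an assumption about the model:

    ```python
    # Back-of-envelope check of the LLM-tool numbers quoted above.
    # Assumptions: ~11,375 tokens and ~7.1 s of API time per ticker
    # (measured figures from this article), and a 1M-token context window.

    TOKENS_PER_TICKER = 11_375
    SECONDS_PER_CALL = 7.1
    CONTEXT_WINDOW = 1_000_000

    def llm_budget(n_tickers):
        tokens = n_tickers * TOKENS_PER_TICKER
        return {
            "api_seconds": round(n_tickers * SECONDS_PER_CALL, 1),
            "context_used_pct": round(100 * tokens / CONTEXT_WINDOW, 1),
            "fits": tokens <= CONTEXT_WINDOW,
        }

    print(llm_budget(10))   # ~71 s of API calls, ~11% of the window
    print(llm_budget(100))  # ~1.1M tokens -- no longer fits
    ```

    At 10 tickers you are already past 11% of the window; at 100, the raw payload alone exceeds it.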

    Way 2: Python + yfinance + TA-Lib (Script)

    import yfinance as yf
    import talib
    import numpy as np

    # Download one year of daily bars, then compute the three indicators
    df = yf.download("AAPL", start="2025-04-17", end="2026-04-17")
    close = df["Close"].values.astype(np.float64)
    df["sma_20"] = talib.SMA(close, timeperiod=20)
    df["rsi_14"] = talib.RSI(close, timeperiod=14)
    df["ema_12"] = talib.EMA(close, timeperiod=12)

    Fast once running (~405 ms including the network download), but: you need Python installed, the yfinance and TA-Lib dependencies, and a script per analysis. For multiple tickers you write groupby loops. For different time periods you copy-paste code. The data lives in memory — restart and it’s gone.

    Way 3: DuckDB + talib Extension (What I Built)

    https://duckdb.org/community_extensions/extensions/talib

    # Prerequisite: install DuckDB (single binary, no dependencies)
    # https://duckdb.org/docs/installation/
    # macOS, duckdb version v1.5.2
    brew install duckdb
    # or Linux
    curl -fsSL https://github.com/duckdb/duckdb/releases/latest/download/duckdb_cli-linux-amd64.zip -o duckdb.zip && unzip duckdb.zip
    -- That's it. Now start `duckdb` and run:
    INSTALL talib FROM community;
    LOAD talib;
    SELECT date, close,
    ta_sma(close, 20) OVER (ORDER BY date) AS sma_20,
    ta_rsi(close, 14) OVER (ORDER BY date) AS rsi_14,
    ta_ema(close, 12) OVER (ORDER BY date) AS ema_12
    FROM read_parquet('aapl.parquet')
    WHERE date >= '2025-04-17';

    4ms. One binary (duckdb), no Python, no API keys, no virtual environments. Change the ticker? Change the WHERE clause. 500 tickers at once? Add PARTITION BY symbol. Query data on S3? Replace read_parquet with your Iceberg table. Same SQL, any scale.
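    If you want to see what those three functions actually compute, here is a pure-Python sketch of the textbook formulas — illustrative only; TA-Lib’s exact seeding and NaN handling may differ slightly:

    ```python
    # Pure-Python sketch of the three indicators the SQL above computes.
    # Textbook formulas only -- TA-Lib's exact seeding/NaN handling may
    # differ slightly, so treat this as illustration, not a reference.

    def sma(values, period):
        """Simple moving average; None until the window fills."""
        out = []
        for i in range(len(values)):
            if i + 1 < period:
                out.append(None)
            else:
                out.append(sum(values[i + 1 - period : i + 1]) / period)
        return out

    def ema(values, period):
        """Exponential moving average, seeded with the first SMA."""
        k = 2 / (period + 1)
        out, prev = [], None
        for i, v in enumerate(values):
            if i + 1 < period:
                out.append(None)
            elif prev is None:
                prev = sum(values[:period]) / period  # SMA seed
                out.append(prev)
            else:
                prev = v * k + prev * (1 - k)
                out.append(prev)
        return out

    def rsi(values, period):
        """Wilder's RSI: smoothed average gains vs average losses."""
        out = [None] * len(values)
        gains = losses = 0.0
        for i in range(1, len(values)):
            change = values[i] - values[i - 1]
            gain, loss = max(change, 0.0), max(-change, 0.0)
            if i <= period:
                gains += gain / period
                losses += loss / period
            else:
                gains = (gains * (period - 1) + gain) / period
                losses = (losses * (period - 1) + loss) / period
            if i >= period:
                out[i] = 100 - 100 / (1 + gains / losses) if losses else 100.0
        return out

    closes = [10, 11, 12, 11, 13, 14, 13, 15, 16, 15]
    print(sma(closes, 3)[-1])  # average of the last 3 closes
    ```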

    The Trade-offs of These 3 Ways at a Glance

    ┌─────────────────┬───────────┬─────────┬──────────┬────────────────────┐
    │ Approach        │ Time      │ vs Best │ Cost     │ Scales to 100+?    │
    ├─────────────────┼───────────┼─────────┼──────────┼────────────────────┤
    │ Claude.ai + FMP │ 12,100 ms │ 3,025x  │ ~$0.03/q │ No (context limit) │
    │ Python + TA-Lib │ 405 ms    │ 101x    │ free     │ Slow (API + loops) │
    │ DuckDB + talib  │ 4 ms      │ 1x      │ free     │ Yes (one SQL)      │
    └─────────────────┴───────────┴─────────┴──────────┴────────────────────┘

    DuckDB is 101x faster than Python and 3,025x faster than the LLM approach — and it’s the only one that scales. That’s what I built, and why.


    What Is It?

    atm_talib is a DuckDB extension that wraps 100+ TA-Lib technical analysis functions as native SQL functions. RSI, MACD, Bollinger Bands, 49 candlestick patterns — all available directly in SQL: https://duckdb.org/community_extensions/extensions/talib

    Try It: Apple Stock in a Few Lines of SQL

    -- Step 1: Install
    INSTALL talib FROM community;
    LOAD talib;
    -- Step 2: Load 30 days of Apple stock (via CSV or your preferred source)
    CREATE TABLE apple AS SELECT * FROM read_csv('apple_30d.csv');
    -- Step 3: Analyze
    SELECT
    date,
    close,
    ta_sma(close, 7) OVER (ORDER BY date) AS sma_7,
    ta_ema(close, 12) OVER (ORDER BY date) AS ema_12,
    ta_rsi(close, 14) OVER (ORDER BY date) AS rsi_14
    FROM apple
    ORDER BY date;

    Or use the scalar form for full-series computation:

    -- Get MACD signals in one shot
    SELECT t_macd(list(close ORDER BY date), 12, 26, 9) FROM apple;
    -- Detect Doji candlestick patterns
    SELECT t_cdldoji(
    list(open ORDER BY date), list(high ORDER BY date),
    list(low ORDER BY date), list(close ORDER BY date)
    ) FROM apple;

    Two function styles, one purpose:

    Style      Prefix   Best For
    Scalar     t_       Backtests, batch processing, full-series analysis
    Aggregate  ta_      Dashboards, ad-hoc queries, row-level output

    Lesson 1: AI Makes C++ Accessible to Everyone

    I’m a Python and Java developer. I’d never written a DuckDB extension — the closest I’d come was writing a C library with CMake back in university. But with Claude as my pair-programming partner, I went from zero to a working extension in 2 days.

       Day 1                          Day 2
        │                              │
        ▼                              ▼
    ┌─────────────┐            ┌──────────────────┐
    │ Design Spec │            │ Multi-Output Fns │
    └──────┬──────┘            │ (MACD, BBANDS)   │
           │                   └────────┬─────────┘
           ▼                            │
    ┌──────────────┐                    ▼
    │ Scaffold     │           ┌────────────────┐
    │ Extension    │           │ Test Suite     │
    └──────┬───────┘           └────────┬───────┘
           │                            │
           ▼                            ▼
    ┌──────────────────┐       ┌────────────────────┐
    │ Adapter Layer    │       │ Community          │
    │ DuckDB <> TA-Lib │       │ Submission         │
    └──────┬───────────┘       └────────┬───────────┘
           │                            │
           ▼                            ▼
    ┌──────────────────┐       ┌────────────────────┐
    │ X-Macro Pattern  │       │ ✓ Merged into      │
    │ 100+ Functions   │──────>│   DuckDB Community │
    └──────────────────┘       └────────────────────┘
    
    
    

    The key insight: I described what I wanted — “wrap TA-Lib’s TA_SMA as a DuckDB scalar function” — and Claude helped me figure out how. The C++ X-macro pattern that registers all 100+ functions? I wouldn’t have discovered that approach on my own. The result: ~3,000 lines of C++ that compile on Linux and macOS, with no Python build dependencies.

    This is what AI-assisted development unlocks: a Python developer can ship production C++ code. Not by learning C++ for months, but by focusing on the domain problem while the AI handles the language mechanics.


    Lesson 2: Performance, Performance, Performance

    I benchmarked four ways to answer the same question: “What are AAPL’s SMA(20), RSI(14), and EMA(12) for the past year?”

    4-Way Benchmark: DuckDB Local vs S3 Iceberg vs Python vs LLM

      DuckDB Local  ▏ 4 ms
      DuckDB S3     ▏██ 155 ms
      Python+yfinance ▏████ 405 ms
      DuckDB S3 cold ▏██████████████████████████████████ 2,724 ms
      LLM+Inference ▏████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████ 12,100 ms
    
    
    
    Approach                                   Time (ms)    vs Best
    DuckDB + Parquet (local)                   4 ms         1x
    DuckDB + S3 Iceberg (warm connection)      155 ms       36x
    Python + yfinance API + TA-Lib             405 ms       101x
    DuckDB + S3 Iceberg (cold start)           2,724 ms     633x
    LLM Tool (FMP API + inference)             ~12,100 ms   3,025x

    Key insights:

    • Local Parquet is king at 4ms — no network, pure columnar scan
    • S3 Iceberg warm (155ms) beats Python+yfinance (405ms) — and you get access to 14,669 tickers, not just one
    • S3 cold start (2.7s) includes loading 4 extensions + Glue catalog attach — a one-time cost per session
    • LLM tools are 3,025x slower — great for questions, terrible for batch analysis

    At Scale: The Real S3 Advantage

    The single-ticker benchmark doesn’t show S3’s real strength. Here’s what happens when you need all tickers:

    Query                        DuckDB + S3   Python + yfinance   LLM Tool (measured)
    1 ticker, 1 year             155 ms        405 ms              7.1 sec (1 API call, 1.1% context)
    10 tickers                   4.8 sec       0.2 sec             71 sec (10 API calls, 11% context)
    100 tickers                  7.0 sec       N/A                 impossible (1.1M tokens > context)
    14,669 tickers → RSI < 30    0.34 sec      impossible          impossible

    I measured this: each FMP MCP tool call takes 7.1 seconds and returns 250 rows of JSON (~11,375 tokens). At 10 tickers you’ve consumed 11% of a 1M-token context window and spent 71 seconds waiting. At 100 tickers (1.1M tokens), the data doesn’t fit — the LLM can’t even read it, let alone analyze it.

    DuckDB + S3 scans all 14,669 tickers in a single 340ms SQL query. No token limits. No API rate limits.

    When to Use Each

                 DuckDB Local            DuckDB + S3           Python + API       LLM Tool
    Speed        4 ms                    155 ms                405 ms             ~12 sec
    Best for     Backtests, dashboards   Full universe scans   One-off scripts    Ad-hoc questions
    Data         Local files             54M rows on S3        API per request    API per request
    Code         SQL                     SQL                   15+ lines Python   Natural language

    The LLM + DuckDB Combo

    The real power move: use DuckDB to pre-compute indicators, then feed compact results to an LLM:

    -- Filter to only oversold/overbought signals, then feed to LLM
    WITH signals AS (
    SELECT ticker, date, close,
    ta_rsi(close, 14) OVER w AS rsi,
    ta_sma(close, 20) OVER w AS sma_20
    FROM read_parquet('stocks.parquet')
    WINDOW w AS (PARTITION BY ticker ORDER BY date)
    )
    SELECT * FROM signals WHERE rsi < 30 OR rsi > 70;

    Instead of the LLM processing 250 rows of raw JSON, it gets 10 pre-filtered signals. Fewer tokens = lower cost, better analysis.
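    A rough sketch of that saving, assuming ~45.5 tokens per JSON row (derived from the ~11,375 tokens / 250 rows measurement above):

    ```python
    # Rough token-savings estimate for the pre-filter pattern above.
    # Assumption: ~45.5 tokens per row of JSON, derived from the
    # ~11,375 tokens / 250 rows measurement quoted earlier.

    TOKENS_PER_ROW = 11_375 / 250  # ~45.5

    def tokens(n_rows):
        return int(n_rows * TOKENS_PER_ROW)

    raw = tokens(250)      # a year of raw OHLCV fed to the LLM
    filtered = tokens(10)  # only the pre-filtered signals
    print(raw, filtered, f"{raw / filtered:.0f}x fewer tokens")
    ```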


    Lesson 3: 54 Million Rows, $1/Month — A Personal Data Lakehouse

    I have a real setup: 54 million rows of daily OHLCV data for 14,669 tickers stored as Iceberg tables on S3, managed by AWS Glue. Here’s how it works:

    ┌──────────────────────────────────────────────────────────────┐
    │                      Claude.ai Desktop                       │
    │                    (AI Analysis Client)                      │
    │                                                              │
    │  "Find all oversold stocks"      "Show AAPL trend with MACD" │
    │              │                            │                  │
    │              ▼                            ▼                  │
    │  ┌────────────────────────────────────────────────────────┐  │
    │  │                     Claude Skills                      │  │
    │  │                                                        │  │
    │  │  ┌──────────────────┐    ┌──────────────────────────┐  │  │
    │  │  │ Python Scripts   │───>│ DuckDB SQL + talib ext   │  │  │
    │  │  │ (wrapper)        │    │                          │  │  │
    │  │  │ • duckdb python  │    │ • t_sma(), t_rsi()       │  │  │
    │  │  │ • yfinance       │    │ • t_macd(), t_bbands()   │  │  │
    │  │  │ • pandas         │    │ • 49 candlestick fns     │  │  │
    │  │  │                  │    │ • read_parquet/iceberg   │  │  │
    │  │  └──────────────────┘    └────────────┬─────────────┘  │  │
    │  └───────────────────────────────────────┼────────────────┘  │
    └──────────────────────────────────────────┼──────────────────-┘
                                               │
                                               ▼
                              ┌──────────────────────────────────┐
                              │          DuckDB Engine           │
                              │     (in-process, no server)      │
                              │                                  │
                              │  LOAD talib;                     │
                              │  LOAD iceberg;                   │
                              │  LOAD httpfs;                    │
                              │  LOAD aws;                       │
                              └────────────────┬─────────────────┘
                                               │ S3 protocol (HTTPS)
                                               ▼
                              ┌──────────────────────────────────┐
                              │        AWS Data Lakehouse        │
                              │                                  │
                              │  ┌────────────────────────────┐  │
                              │  │     AWS Glue Catalog       │  │
                              │  │  (Iceberg table metadata)  │  │
                              │  │                            │  │
                              │  │  database: atm             │  │
                              │  │  table:    t_bar_1d        │  │
                              │  └─────────────┬──────────────┘  │
                              │                │                 │
                              │                ▼                 │
                              │  ┌────────────────────────────┐  │
                              │  │       AWS S3 Bucket        │  │
                              │  │  (Iceberg Parquet files)   │  │
                              │  │                            │  │
                              │  │  • 54,018,638 rows         │  │
                              │  │  • 14,669 tickers          │  │
                              │  │  • 1980 — present          │  │
                              │  │  • ~$1/month storage       │  │
                              │  └────────────────────────────┘  │
                              └──────────────────────────────────┘

    Real S3 Iceberg Benchmark (from my MacBook)

    -- This is all it takes to query 54M rows on S3 from a MacBook
    LOAD talib; LOAD iceberg; LOAD httpfs; LOAD aws;
    ATTACH '872583282217' AS gc (TYPE iceberg, ...);
    SELECT symbol, t_rsi(list(close ORDER BY date), 14)[-1] AS rsi
    FROM gc.atm.t_bar_1d
    WHERE date >= '2025-01-01'
    GROUP BY symbol
    HAVING rsi < 30; -- Find all oversold stocks across 14,669 tickers

    Query                                       DuckDB + S3 (cold)   DuckDB + S3 (warm)   Python + yfinance
    AAPL 1 year → SMA + RSI + EMA               3.2 sec              0.6 sec              0.4 sec
    AAPL full history (1980–2026) → SMA(200)    1.5 sec              0.15 sec             0.4 sec
    10 tickers → SMA + RSI                      6.6 sec              4.8 sec              0.2 sec
    14,669 tickers → RSI < 30                   6.7 sec              0.34 sec             impossible

    The first few queries show S3 network latency. But the last row is the killer feature: scanning 14,669 tickers to find 274 oversold stocks in 340ms. With yfinance, you’d need 14,669 separate API calls — that would take hours and likely get rate-limited.
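    That “hours” claim is simple arithmetic — a sketch using the measured ~405 ms per yfinance call from the benchmark above, and generously ignoring rate limits:

    ```python
    # Why scanning the universe via per-ticker API calls doesn't scale:
    # at the measured ~405 ms per yfinance request, and ignoring rate
    # limits entirely, 14,669 sequential calls take well over an hour.

    N_TICKERS = 14_669
    SECONDS_PER_CALL = 0.405  # measured single-ticker yfinance time

    total_hours = N_TICKERS * SECONDS_PER_CALL / 3600
    print(f"{total_hours:.1f} hours of sequential API calls")
    ```

    In practice rate limiting, retries, and throttling push this far higher — the 340 ms single-scan SQL query is not a fair fight.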

    Cost Comparison: All US Market Daily OHLCV (54M rows, 14K tickers)

    ┌─────────────────────┬───────────┬───────────┬───────────┬────────────┐
    │ │ Storage │ Compute │ Total/mo │ vs DuckDB │
    ├─────────────────────┼───────────┼───────────┼───────────┼────────────┤
    │ Snowflake │ $23/mo │ $228/mo* │ ~$251/mo │ 251x │
    │ Databricks │ $23/mo │ $150/mo* │ ~$173/mo │ 173x │
    │ BigQuery │ free 10GB │ $30/mo** │ ~$30/mo │ 30x │
    │ AWS RDS (Postgres) │ $50/mo │ $80/mo │ ~$130/mo │ 130x │
    │ DuckDB + S3 Iceberg │ $1/mo │ $0 (local)│ ~$1/mo │ 1x │
    └─────────────────────┴───────────┴───────────┴───────────┴────────────┘
    * Snowflake: X-Small warehouse, ~1hr/day usage
    Databricks: DBU cost for SQL warehouse, ~1hr/day
    ** BigQuery: on-demand pricing, ~6TB scanned/month
    All estimates for personal/light analytics workload

    The cloud data warehouses are built for teams and enterprise scale. For a personal financial analysis setup — where you run queries a few times a day from your laptop — paying $130–$251/month for always-on infrastructure is overkill.

    DuckDB + S3 gives you the same SQL power for $1/month. No server. No cluster. No idle compute bills. Just SQL from your laptop.


    Lesson 4: Open Source Community Wins

    I submitted the extension to the DuckDB Community Extensions registry. It was merged in 2 days.

    INSTALL talib FROM community;
    LOAD talib;
    -- That's it. Works on Linux and macOS with DuckDB v1.5.2

    The DuckDB community is remarkably welcoming. The extension system is well-documented, the review process is fast, and the ecosystem is growing rapidly. For a solo developer, this kind of community support is what makes a project viable.


    Summary

                            ┌──────────────────┐
                            │  DuckDB + TA-Lib │
                            └────────┬─────────┘
                   ┌─────────────────┼─────────────────┐
                   ▼                 ▼                  ▼
        ┌────────────────┐ ┌──────────────────┐ ┌──────────────────┐
        │ AI Development │ │   Performance    │ │ Cost & Community │
        ├────────────────┤ ├──────────────────┤ ├──────────────────┤
        │ Claude as C++  │ │ 4ms local        │ │ $1/mo S3         │
        │ pair programmer│ │ 155ms S3         │ │ 251x cheaper     │
        │                │ │ 3,025x vs LLM    │ │ than Snowflake   │
        │ Python dev     │ │                  │ │                  │
        │ ships native   │ │ 14,669 tickers   │ │ DuckDB community │
        │ extension      │ │ scanned in 340ms │ │ merged in 2 days │
        │                │ │                  │ │                  │
        │ 2 days: zero   │ │ 54M rows from    │ │ Open source      │
        │ to community   │ │ a MacBook        │ │ MIT licensed     │
        └────────────────┘ └──────────────────┘ └──────────────────┘
    
    
    

    Building this extension was one of the most rewarding side projects I’ve done. The combination of:

    • AI pair programming — making unfamiliar languages accessible
    • DuckDB’s architecture — in-process, columnar, extensible
    • Cloud-native storage — S3 + Iceberg for pennies
    • Open community — fast review, broad distribution

    …creates something that wasn’t possible even a year ago: a single developer can build and ship a high-performance analytical tool that rivals what used to require a team and a budget.

    If you do financial analysis, give DuckDB a try (https://duckdb.org/) together with the talib community extension (https://duckdb.org/community_extensions/list_of_extensions):

    INSTALL talib FROM community;
    LOAD talib;

    All 100+ functions. Zero infrastructure. Just SQL.

    GitHub | Function Reference | SQL Cookbook

  • “All human knowledge is uncertain, inexact, and partial.” – Bertrand Russell


    Last week, we had a three-day intensive course — AI for Managers — at ESMT Berlin 📚. I like to call these workshops for management: they aim to solve business problems with machine-learning knowledge. I love ❤️ the process of turning a business problem into an actionable, real business decision.

    But the professor only allowed an “ancient calculator” 🧮️ for getting the real answers in the exam 😅. It was tough 💪, but memorable ⭐. Below, I write up only the sessions where I had new insights.

    #1 Start Simple, Combine Strategically

    You don’t need deep learning or neural networks for most business problems. Classical ML methods (logistic regression, clustering, factor analysis) are:

    • Interpretable — you can explain the results to non-technical stakeholders
    • Fast — run in seconds, iterate quickly
    • Powerful when combined — 1+1+1 = 10

    #2 The Real Skill Is Interpretation

    Running the algorithm is the easy part. The hard part is:

    • Naming factors meaningfully (not just “Factor 1, Factor 2”)
    • Interpreting cluster profiles (turning numbers into customer personas)
    • Translating coefficients into business recommendations

    Machine learning amplifies your judgment, it doesn’t replace it.

    As a manager, you need to make the trade-offs — for example, choosing the cut-off value.

    #3: All Models Are Wrong, But Some Are Useful

    The course drilled this into us from day one. No algorithm gives you “the answer.” Machine learning models offer insights, patterns, and probabilities—but you still need to make the decision. The real skill isn’t running the algorithm. It’s knowing when to trust it, when to question it, and how to combine it with business knowledge.

    What this means in practice in class:

    • Factor Analysis told us there are preference dimensions—but WE named them and interpreted what they meant
    • K-means gave us customer clusters—but WE decided how to target each segment
    • Logistic Regression predicted who would like the product—but WE chose the marketing strategy

    #4: Integration Beats Perfection

    Instead of searching for the “perfect” ML method, we learned to chain multiple simple techniques together.

    Think of it like cooking: you don’t need one magical ingredient. You need the right combination of ingredients, each serving its purpose.

    In the course’s Microvan project:

    • Factor Analysis = Prep work (cleaning and organizing ingredients)
    • Clustering = Understanding your audience (who are we cooking for?)
    • Logistic Regression = The final dish (what will they love?)

    Alone, each method gives limited insight. Together, they create a complete strategy.

    The key insight: most real business problems need MULTIPLE methods working together.

    #5 A Real World Automotive Case

    The Challenge

    Our automotive company was developing a new Microvan. We had:

    • Survey data from 500 potential customers
    • 15 different vehicle feature ratings (style, safety, price, etc.)
    • One crucial question: Who should we target, and what should we emphasize?

    The Problem We Faced

    Looking at 15 different variables was overwhelming. Customers who liked “modern design” also liked “sporty appearance” and “interior space” — everything was correlated. We couldn’t just build a regression model because of the high correlation (when variables are too similar, the model gets confused). So we learned a systematic approach to problem solving.

    How We Solved It

    Phase 1: Understanding What Customers Really Care About (Factor Analysis)

    The Question: “What are the underlying preference dimensions?”

    What We Did: Instead of looking at 15 separate features, we used Factor Analysis to find underlying patterns.

    Key Insight: Customers don’t think about 15 separate features. They think in a few dimensions — Style, Safety, and Value. This simplified everything. Mathematically, we converted each customer’s 15 ratings into 5 “factor scores” — a much cleaner dataset.
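    Mechanically, a factor score is just a weighted sum of standardized ratings. A minimal sketch — the feature names and loadings below are invented for illustration; the real Microvan loadings are not reproduced here:

    ```python
    # Minimal sketch of turning feature ratings into factor scores.
    # The features and loadings are invented for illustration -- the
    # actual Microvan factor loadings are not reproduced in this post.

    def standardize(xs):
        """Z-score a list of ratings (population standard deviation)."""
        mean = sum(xs) / len(xs)
        sd = (sum((x - mean) ** 2 for x in xs) / len(xs)) ** 0.5
        return [(x - mean) / sd for x in xs]

    # Three customers' ratings on four (of 15) hypothetical features.
    ratings = {
        "modern_design": [7, 3, 5],
        "sporty_look":   [8, 2, 5],
        "airbag_rating": [4, 9, 5],
        "price_value":   [3, 8, 6],
    }
    z = {feat: standardize(vals) for feat, vals in ratings.items()}

    # Hypothetical loadings: F1 = "Style", F2 = "Safety/Value".
    loadings = {
        "F1_style":  {"modern_design": 0.9, "sporty_look": 0.9},
        "F2_safety": {"airbag_rating": 0.8, "price_value": 0.7},
    }

    def factor_score(customer, factor):
        """Weighted sum of the customer's standardized ratings."""
        return sum(w * z[feat][customer]
                   for feat, w in loadings[factor].items())

    for c in range(3):
        print(c, round(factor_score(c, "F1_style"), 2),
                 round(factor_score(c, "F2_safety"), 2))
    ```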

    Regression model
    ==========================================

    Call: lm(formula = mvliking ~ F1 + F2 + F3 + F4 + F5, data = scores)

    Coefficients:
                Estimate Std. Error t value             Pr(>|t|)    
    (Intercept)   4.8425     0.1102  43.925 < 0.0000000000000002 ***
    F1            1.0277     0.1104   9.311 < 0.0000000000000002 ***
    F2            0.9989     0.1104   9.049 < 0.0000000000000002 ***
    F3            0.2091     0.1104   1.894               0.0589 .  
    F4           -0.1729     0.1104  -1.566               0.1181    
    F5           -0.5545     0.1104  -5.023          0.000000772 ***
    ---
    Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

    Residual standard error: 2.205 on 394 degrees of freedom
    Multiple R-squared:  0.3365,    Adjusted R-squared:  0.3281 
    F-statistic: 39.97 on 5 and 394 DF,  p-value: < 0.00000000000000022
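    The output above can be read as a scoring rule: predicted liking = intercept + Σ(coefficient × factor score). A small sketch using the fitted coefficients:

    ```python
    # Turning the regression output above into a scoring rule:
    # predicted liking = intercept + sum(coef_i * factor_i).
    # Coefficients are taken from the lm() output shown above.

    COEFS = {"F1": 1.0277, "F2": 0.9989, "F3": 0.2091,
             "F4": -0.1729, "F5": -0.5545}
    INTERCEPT = 4.8425

    def predicted_liking(factors):
        return INTERCEPT + sum(COEFS[f] * v for f, v in factors.items())

    # A customer one standard deviation above average on F1 and F2:
    print(round(predicted_liking({"F1": 1, "F2": 1, "F3": 0,
                                  "F4": 0, "F5": 0}), 2))  # 6.87
    ```

    F1 and F2 dominate; F5 actively hurts liking — which is exactly what drives the segment choice later.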

    Phase 2: Discovering Customer Segments (K-means Clustering)

    The Question: “Are there distinct types of customers?”

    What We Did: Using those factor scores, we ran K-means clustering to find natural customer groups.

    From these data, we built a real persona matrix:

    Cluster centers (z-score means on factors), k = 4

    Cluster     F1      F2      F3      F4      F5
    1          -0.78    0.15    0.44   -0.03   -1.30
    2          -0.89    0.05   -0.51   -0.64    0.37
    3           0.68    1.17    0.21    0.37    0.25
    4           0.63   -1.13    0.07    0.28    0.10

    Based on these inputs, my team identified a clear pattern and chose Cluster 3 as our primary go-to-market target. We had data-driven clarity on exactly who to target and what to say — this is what “explainable AI” looks like.

    All Steps of the Use Case Together

    The Magic of Integration:

    1. Factor Analysis gave us clean, interpretable dimensions
    2. Clustering used those dimensions to find natural customer groups
    3. Logistic Regression used those same dimensions to predict success
    4. Together they created a complete, actionable business strategy

    Each method reinforced the others. The factors made clustering easier. The clusters helped us understand predictions. The predictions validated our customer segmentation.

  • ✈️ When AI Broke Rose’s Heart: A Travel Insurance Story That Changed Everything, The future story of how one woman’s denied claim exposed a hidden bias in artificial intelligence 🧠⚖️ — and the data scientist who fixed it 🛠️


    ⚠️ Disclaimer: This narrative is fictional and intended for educational purposes.

    All insurance outcomes, accuracy figures 📈, and datasets 📊 are illustrative and purpose-built, not drawn from real customers.

    But bias in AI is widespread because legacy data 🗂️ often encodes historical inequities that regulation alone cannot retroactively fix ⚖️. Our goal is to show that bias can be detected 🔎, measured 📏, and mitigated 🧰 — and yes, we can fix it ✅.

    A notebook with the Python code is available for data scientists and engineers to download.

    🛫 The Perfect Trip That Wasn’t

    Rose and Jack 👫 had been planning their dream vacation to New York 🗽 for months. The flights ✈️ were booked. The hotels 🏨 were confirmed. Their suitcases 🧳 were packed with excitement for their first trip together.

    But as fate would have it, their perfect getaway took an unexpected turn ⚠️.

    🗽 Day 3 in New York City
    📍 JFK Airport, Terminal 4

    “Sir, ma’am, I’m sorry…” The airline representative’s voice trailed off as Rose’s heart sank. Their luggage — containing Jack’s camera equipment, Rose’s carefully planned outfits, and precious souvenirs — had vanished somewhere between connections.

    Total loss: $1,847 worth of belongings.

    But they had travel insurance. “At least we’re covered,” Jack said, squeezing Rose’s hand.

    They never imagined that artificial intelligence would soon judge them differently — not based on their claim, but based on who they were.


    📝 Two Identical Claims, Two Different Outcomes

    Jack’s Experience (5 days later)

    📧 Email Notification
    Subject: CLAIM APPROVED ✅

    Dear Jack,
    Your claim for $1,847 has been approved.
    Payment will be processed within 3-5 business days.

    Status: APPROVED IN 5 DAYS

    Rose’s Experience (5 days later)

    📧 Email Notification
    Subject: CLAIM UPDATE ❌

    Dear Rose,
    After careful review, we regret to inform you…
    Your claim has been denied.

    Status: REJECTED

    Same flight. Same lost luggage. Same insurance company. Different genders. Different outcomes.

    Rose was devastated. “Why would they approve Jack’s claim but deny mine?” she asked, tears welling up. “We lost the exact same things.”


    📞 The Call That Changed Everything

    “I need to speak to a manager,” Rose insisted, her voice steady despite the frustration.

    That’s when Laura, the company’s senior data scientist responsible for its AI claims system, took over the analysis. What she discovered would shake the entire organization.

    “This can’t be right…” Laura muttered, her fingers flying across the keyboard as she dove into the data.

    Laura’s Investigation Flow
    ╔══════════════════════════════════════╗
    ║         📞 Rose’s Complaint          ║
    ║    “Same claim, different result”    ║
    ╚══════════════════╦═══════════════════╝
                       ▼
    ╔══════════════════════════════════════╗
    ║       🔍 Check AI Decision Log       ║
    ║       Same confidence scores?        ║
    ╚══════════════════╦═══════════════════╝
                       ▼
    ╔══════════════════════════════════════╗
    ║     ⚠️ Gender Pattern Detected       ║
    ║        Male:   69.7% approval        ║
    ║        Female: 36.1% approval        ║
    ╚══════════════════╦═══════════════════╝
                       ▼
    ╔══════════════════════════════════════╗
    ║            🚨 BIAS ALERT!            ║
    ║      Systematic discrimination       ║
    ╚══════════════════════════════════════╝


    📊 The Shocking Discovery

    Laura pulled up the historical claims data — 2,000 insurance claims from the past year. What she found made her stomach drop:

    The Numbers Don’t Lie (2,000 claims, mock dataset)

    📊 APPROVAL RATES BY GENDER
    ┌─────────────────────────────────────┐
    │  Male Customers:    69.7% approved  │
    │  Female Customers:  36.1% approved  │
    │                                     │
    │  🤯 GAP: 33.6 percentage points     │
    └─────────────────────────────────────┘

    But Why? The Hidden Truth

    The AI wasn’t consciously discriminating — it had learned from biased historical data. Here’s what Laura discovered:

    The AI’s Learning Process (The Problem)
    ╔════════════════════════════════════════╗
    ║  📚 Historical Claims Data             ║
    ║  ├─ 70% from male customers            ║
    ║  ├─ 30% from female customers          ║
    ║  └─ Legacy of human bias               ║
    ║                  │                     ║
    ║                  ▼                     ║
    ║  🤖 AI Model Training                  ║
    ║  ├─ “Learn patterns”                   ║
    ║  ├─ Males = higher approval rate       ║
    ║  └─ Pattern becomes decision rule      ║
    ║                  │                     ║
    ║                  ▼                     ║
    ║  ⚖️ New Claims Processing              ║
    ║  ├─ Apply learned patterns             ║
    ║  ├─ Same claim, different gender       ║
    ║  └─ Different outcome = DISCRIMINATION ║
    ╚════════════════════════════════════════╝

    The heartbreaking reality: Rose wasn’t denied because her claim was invalid. She was denied because she was female — gender had become the dominant factor.


    🛠️ The Fix: Teaching AI to Be Fair

    Laura knew she couldn’t change the past data, but she could fix the future. She turned to Fairlearn, a Microsoft toolkit designed to detect and mitigate AI bias.

    Step 1: Measuring the Bias

    Before she could fix the problem, Laura needed to quantify it. She used a key metric from the Fairlearn toolkit: Demographic Parity Difference. This metric calculates the difference in approval rates between the most and least advantaged groups.

    A value close to zero means everyone has a roughly equal chance of getting their claim approved, regardless of their gender. A high value, however, signals a major problem.

    # Laura's bias detection code
    from fairlearn.metrics import demographic_parity_difference
    
    # Compare actual outcomes vs. AI predictions
    bias_score = demographic_parity_difference(
        y_true=actual_claims,
        y_pred=ai_predictions, 
        sensitive_features=customer_gender
    )
    
    print(f"Initial Bias Score: {bias_score:.3f}")
    # Result: 0.336 — an extremely high score!
    
    

    The result of 0.336 confirmed her fears. It was concrete proof that the system was heavily skewed. To make this clear to her team, she also visualized the disparity.

    [Chart: Initial Gender Bias — approval rates by gender]

    The bar chart showed that males were being approved at a rate of 69.7%, while females were only approved 36.1% of the time. This meant males had a 1.93x higher chance of getting their claim approved. The data was undeniable.
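    The metric itself is simple enough to reproduce by hand. A pure-Python sketch — fairlearn’s demographic_parity_difference is the reference implementation; this just shows the arithmetic on the story’s numbers:

    ```python
    # Demographic parity difference, computed by hand: the gap between
    # the highest and lowest group approval rates. fairlearn's
    # implementation is the reference; this is just the arithmetic.

    def demographic_parity_diff(predictions, groups):
        """Max approval rate minus min approval rate across groups."""
        rates = {}
        for g in set(groups):
            preds = [p for p, gg in zip(predictions, groups) if gg == g]
            rates[g] = sum(preds) / len(preds)
        return max(rates.values()) - min(rates.values())

    # Toy data matching the story: 1,000 male claims approved at 69.7%,
    # 1,000 female claims approved at 36.1%.
    preds = [1] * 697 + [0] * 303 + [1] * 361 + [0] * 639
    groups = ["M"] * 1000 + ["F"] * 1000
    print(round(demographic_parity_diff(preds, groups), 3))  # 0.336
    ```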

    Step 2: Benchmarking Three Mitigation Solutions

    Laura knew there was no one-size-fits-all solution for fairness. She decided to test three different mitigation strategies available in Fairlearn to find the best balance between reducing bias and maintaining the model’s accuracy.

    Here’s a summary of her findings:

    [Chart: Fairness Experiments — bias vs. accuracy for the three methods]

    Key Metrics Interpretation

    Metric                  Baseline (Biased)   After Mitigation (Fair)   Improvement
    DP Difference           0.033               0.020                     40%
    Accuracy                59.0%               57.3%                     -1.7% (minor)
    Male Approval Rate      61.0%               60.0%                     Fairer Outcome
    Female Approval Rate    57.7%               62.0%                     Fairer Outcome ✅

    Trade-offs:

    • Mitigation reduces bias but may slightly reduce accuracy
    • Different constraints optimize for different fairness notions
    • Choose based on your fairness requirements and regulatory needs

    Method 1: Demographic Parity

    This method aims for the most straightforward definition of fairness: equal approval rates for all groups. The goal is to make the selection_rate (the percentage of people approved) the same for both men and women.

    • Goal: Make approval rates identical.
    • Result: While it successfully reduced the demographic_parity_difference to 0.049 (a huge improvement!), it came at a cost. The overall accuracy of the model dropped, meaning it made more incorrect decisions for everyone.
    • Verdict: Not ideal. It achieved fairness by sacrificing too much accuracy.
    # Method 1: Forcing approval rates to be the same
    from sklearn.linear_model import LogisticRegression
    from fairlearn.reductions import ExponentiatedGradient, DemographicParity

    mitigator_dp = ExponentiatedGradient(
        estimator=LogisticRegression(),
        constraints=DemographicParity()  # Goal: Equal selection rates
    )
    # Train with: mitigator_dp.fit(X, y, sensitive_features=gender)
    

    Method 2: Equalized Odds ⭐ WINNER

    This approach is more nuanced. It aims for equal error rates across groups. In this context, it means ensuring that the rates of false positives (approving a fraudulent claim) and false negatives (denying a valid claim) are the same for both men and women.

    This is often the preferred method in scenarios like insurance or lending, where the consequences of errors are high.

    • Goal: Make sure the model makes mistakes at the same rate for everyone.
    • Result: This was the clear winner. It reduced the equalized_odds_difference to just 0.020, a 40% reduction in bias from the original model. Crucially, it did so while maintaining a strong level of accuracy.
    • Verdict: The best of both worlds — significantly fairer without compromising performance.
    # Method 2: Balancing the error rates
    from fairlearn.reductions import EqualizedOdds
    
    fair_model = ExponentiatedGradient(
        estimator=LogisticRegression(),
        constraints=EqualizedOdds()  # Goal: Equal error rates
    )
    # Train with the sensitive feature so the constraint can be enforced
    fair_model.fit(X_train, y_train, sensitive_features=sex_train)
    

    Method 3: Threshold Optimizer

    This is a post-processing technique, meaning it doesn’t retrain the model. Instead, it adjusts the decision threshold (the score needed to approve a claim) for each group separately. It’s a quicker fix but often less robust.

    • Goal: Find different approval thresholds for each group to balance outcomes.
    • Result: It offered a decent improvement, reducing bias by 27% (demographic_parity_difference of 0.045). However, it wasn’t as effective as Equalized Odds.
    • Verdict: A good quick fix, but not the most thorough solution.
    # Method 3: Adjusting the decision threshold after prediction
    from fairlearn.postprocessing import ThresholdOptimizer
    
    threshold_optimizer = ThresholdOptimizer(
        estimator=base_model,
        constraints="demographic_parity"
    )
    # Fit the per-group thresholds on held-out data
    threshold_optimizer.fit(X_train, y_train, sensitive_features=sex_train)
    
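    Fairlearn's metrics module computes these numbers for you, but the demographic parity difference itself is simple enough to verify by hand. A minimal numpy sketch, using made-up predictions and a hypothetical sensitive attribute (not the article's dataset):

    ```python
    import numpy as np

    # Hypothetical predictions and sensitive attribute (illustrative only)
    y_pred = np.array([1, 0, 1, 1, 0, 1, 0, 0, 1, 0])
    sex    = np.array(["M", "M", "M", "M", "M", "F", "F", "F", "F", "F"])

    # Selection rate = fraction of each group that gets approved
    rates = {str(g): float(y_pred[sex == g].mean()) for g in np.unique(sex)}

    # Demographic parity difference = gap between the best- and
    # worst-treated group's selection rates
    dp_difference = round(max(rates.values()) - min(rates.values()), 6)

    print(rates)          # {'F': 0.4, 'M': 0.6}
    print(dp_difference)  # 0.2
    ```

    This is exactly the quantity the mitigation strategies above try to shrink (from 0.033 down to 0.020 in the story).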

    Step 4: 📈 The Results: A Fairer Future

    Before vs. After

    🎯 APPROVAL RATES (After Fix)
    • Male customers: 60.0% approved
    • Female customers: 62.0% approved
    • ✅ Gap: -2.0% (females slightly higher approval – that’s okay!)
    • ✅ Bias reduced by 40%
    • ✅ Accuracy maintained at 57.3%

    The Business Impact

    💰 COST vs. BENEFIT ANALYSIS

    Implementation costs:
    • Development: $8,000
    • Accuracy loss: $17,000
    • Monitoring: $5,000/year
    • Total: ~$30,000

    Benefits:
    • Avoided lawsuits: $500K–$5M
    • Regulatory compliance: ✅
    • Brand protection: $100K
    • Customer trust: Priceless! 💎


    💝 The Happy Ending

    Two weeks later, Rose received an email:

    Email Notification
    Subject: CLAIM RE-EVALUATION 
    
    Dear Rose,
    After system improvements, your claim 
    has been re-evaluated and APPROVED.
    
    Payment of $1,847 is being processed.
    We apologize for the inconvenience.
    
    

    Laura’s fix was deployed company-wide. Within a month:

    • 🎯 40% reduction in gender bias
    • 💼 $0 spent on discrimination lawsuits
    • ❤️ Customer satisfaction scores improved
    • 🏆 Industry recognition for ethical AI

    🎓 Key Takeaways (What You Need to Know)

    For Everyone

    1️⃣ AI learns from historical data
    2️⃣ Historical data contains bias
    3️⃣ AI learns and repeats bias
    4️⃣ This affects real people! 😢
    5️⃣ But we CAN fix it! ✅

    For Business Leaders

    • Bias audits should be mandatory
    • Fairness metrics need tracking
    • Diverse teams build better AI
    • Transparency builds customer trust
    • Ethical AI is good business

    🌟 The Choice: Red or Blue?

    Rose and Jack’s story isn’t just about travel insurance — it’s about the future of artificial intelligence. As AI makes more decisions about our lives, ensuring fairness becomes critical.

    Path A: Ignore Bias 😈
    • Discrimination continues
    • Legal liability grows
    • Customer trust erodes
    • AI becomes a tool of oppression

    Path B: Fix Bias 😇
    • Fair decisions for all
    • Legal compliance achieved
    • Customer trust earned
    • AI becomes a force for good

    Laura chose Path B. Every day, more data scientists are choosing fairness over convenience.

    Key Takeaways

    • Bias is real, measurable, and fixable. It’s not a mysterious force; it’s a data problem we can solve.
    • Fairlearn provides effective, easy-to-use tools for both detecting and mitigating bias.
    • A small trade-off in accuracy can lead to a significant improvement in fairness.
    • Choose the right fairness constraint for your use case. Different scenarios require different definitions of “fair.”

    📚 Learn More

    For the curious minds:


    Note: This narrative is fictional and created for educational purposes. All accuracy numbers and datasets are illustrative and purpose-built, not real customer data. Bias is widespread, and legacy data cannot be rewritten by regulation alone — that’s the core challenge we must solve. The good news: with measurement, audits, better data, and mitigation techniques, we can fix it.

  • The Dance of Communication

    Communication is not just about talking — it’s about connection. In every meaningful exchange, we dance between expressing ourselves and truly hearing others.

    Fred Kofman, in “The Dance of Communication”, reminds us that good communication is less about control and more about awareness, humility, and curiosity.

    Here’s how I’ve learned to apply this idea — from my leadership course at ESMT Berlin.

    🌟 Step 1: Ask Yourself Before You Speak

    Before any conversation begins, pause and reflect. These questions help me center my mindset — not just my message:

    🌟 Focus | 🧠 Reflective Question | 💬 Purpose
    🎯 Intention | “What is my real purpose in this conversation?” | To clarify if you want to learn, solve, or just express.
    🤔 Assumptions | “What assumptions or judgments am I bringing into this talk?” | To reduce bias and stay open-minded.
    ❤️ Attitude | “Am I ready to listen with empathy and respect?” | To remind yourself to stay calm and kind.
    🧩 Outcome | “What outcome would be meaningful for both of us?” | To focus on collaboration, not competition.
    🪞 Self-Awareness | “Am I speaking to learn or to win?” | To balance advocacy 🗣️ and inquiry ❓.



    🧮 Step 2: Advocacy vs. Inquiry — Finding the Balance

    Communication thrives when we share to learn (advocacy) and listen to understand (inquiry).

    1. Speak to learn, not to win 🧘‍♂️
    2. Balance advocacy 🗣️ and inquiry ❓
    3. Lead with humility 🙇, curiosity 🔍, and respect 🤝

    🗣️ Step 3: Practice the Dance — Advocacy and Inquiry in Action

    🌟 Topic | 💡 Core Idea | 🧭 Practice / Example
    🤝 1. The Power of Conversation | Conversations can change beliefs, perceptions, and actions. | Communicate with openness ❤️ and curiosity 🧠, not control 🔒.
    🚫 2. The Problem: Unilateral Control | People think “I’m right” 👑 and steer talks alone. | Focus on shared goals 🎯, not ego. Admit others may be right too 🤔.
    ❌ 3. Symptoms of Control | Speak without reasoning 🗣️; ask rhetorical questions 🎭; hide your true views 🙊 | Be transparent 🪞 with logic, data 📊, and uncertainty ❓.
    🌱 4. Productive Mindset | “We need to learn together — I might be wrong.” | Shift from winning 🏆 to learning 📚.
    📣 5. Productive Advocacy | Express ideas clearly and humbly 🙇. | Show reasoning 🧮 & data 📑; admit doubt 😌; invite feedback 👂.
    👥 Example | ❌ “We should hire Bill.” ✅ “I think Bill fits better because of his experience — but I’d like your thoughts.” | Encourage dialogue 💬 and shared judgment ⚖️.
    🔍 6. Productive Inquiry | Listen with curiosity 👂 and without judgment 🚫. | Explain why you ask 💬; ask open questions ❓; check understanding ✅.
    💭 Good Questions | “What led you to that view?” “How do you see my role?” “Can you give an example?” | Build trust 🤝 through genuine curiosity ❤️.
    ⚖️ 7. Balance Both | Only advocacy = forcing 💥; only inquiry = hiding 🤐; both = collaboration 🤝 | Share 🗣️ + listen 👂 = learn together 📚.
    🧮 Advocacy vs. Inquiry Matrix | High + High → 🤝 collaboration & learning; High + Low → 💥 forcing; Low + High → 🕊️ accommodating; Low + Low → 🚪 withdrawing | Balance words ⚖️ and questions ❓ for growth 🌱.
    🧩 8. Handling Impasse | When stuck 😕, state the dilemma openly 🗣️ and ask for help 🆘. | Ask what might change their view 🔄; try new data or a role switch 🔁; co-create solutions 💡.
    🪞 9. Reflection Questions | 1️⃣ What’s my intention? 🎯 2️⃣ Learning or winning? 🧠 3️⃣ What are my assumptions? 💭 4️⃣ What truly matters? ❤️ | Ask these before each important talk 🗣️.

    Final Thought: The real skill in communication isn’t speaking well — it’s listening beautifully. Speak with intention 🎯, listen with curiosity 🔍, and lead with respect 🤝.

  • In modern software projects, most of our effort goes not into coding but into talking: meetings, clarifications, tickets, re-alignments, and documentation eat away precious hours. — GitHub’s Global Code Time Report

    The average developer spends about 480 − 52 = 428 minutes per day communicating and only 52 minutes coding.

    1. The Problem: 90% Communication, 10% Coding

    The root cause is our reliance on daily oral communication:

    • Unstructured communication brings cost.
    • Misunderstandings between people.
    • Massive rework and wasted effort.
    • No single source of truth.

    2. The Solution: Shift From “How” to “What”

    Do we have another way to fix it? BDD, TDD, DDD, or SDD?

    The definition: Spec-Driven Development (SDD) is a methodology that prioritizes creating a clear, structured specification, a single source of truth, before any code is written. This specification becomes the executable contract that drives, validates, and documents the entire engineering process.

    The machines not only write the Java code, they also write the user story.

    Specs become the new source code of communication.

    The process can be broken down into four clear, validated phases:

    1. Specify: A human provides a high-level description of the feature or product, and an AI agent generates a detailed, structured specification that captures the intent, behaviors, and requirements.
    2. Plan: The spec is translated into a technical plan outlining the architectural decisions, research tasks, and overall strategy for implementation.
    3. Tasks: The plan is broken down into small, reviewable, and implementable tasks.
    4. Code & Validate: AI agents generate the production code based on the tasks and specifications, and both humans and automated tests validate the outcome against the original spec.
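    To make the four phases concrete, here is a toy sketch of the pipeline as plain Python functions. The names, signatures, and return values are purely illustrative assumptions of mine, not the API of any real SDD tool:

    ```python
    def specify(idea: str) -> str:
        """Phase 1: an AI agent turns a high-level idea into a structured spec."""
        return f"Intent: {idea}\nBehaviors: ...\nRequirements: ..."

    def plan(spec: str) -> list:
        """Phase 2: translate the spec into a technical plan."""
        return ["choose architecture", "research constraints", "define contracts"]

    def tasks(plan_items: list) -> list:
        """Phase 3: break the plan into small, reviewable tasks."""
        return [f"task: {item}" for item in plan_items]

    def implement(task_list: list) -> dict:
        """Phase 4: generate code, then validate it against the original spec."""
        return {"tasks_done": len(task_list), "validated": True}

    # The four phases chain together, spec first and code last
    result = implement(tasks(plan(specify("humanoid-robot insurance claims"))))
    print(result)  # {'tasks_done': 3, 'validated': True}
    ```

    The point of the sketch: every phase consumes the output of the previous one, so the spec, not a meeting, carries the intent downstream.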

    3. The impact: Structured spec == Less talk, More build

    There are two major business impacts:

    • Faster iteration: developers spend less time in “meeting tennis” and more time on high-value tasks like system design and spec review. With structured specs, teams can hold 40% fewer meetings.
    • Less rework: by forcing a clear definition of requirements upfront, SDD reduces ambiguity, resulting in less rework and higher quality. With AWS Kiro, rework drops: 7× fewer iterations than without specs.

    4. Role Exchange: From Execution to Steering

    In the SDD paradigm, humans move up the value chain:

    Traditional Role | Future with SDD
    Developer | Writes and maintains specs; AI writes the code
    Architect | Defines context boundaries and system designs
    QA Engineer | Validates specs and outcomes via spec-based tests
    Product Manager | Prioritizes spec reuse and spec quality metrics

    The result: developers spend less time typing and more time steering.

    5. Context Engineering: How Specs Get Selected as Context

    New coding tools (Claude Code, Codex, Cursor, Qwen Code, Trae AI) can select the right specification as context automatically.

    If you’d like to go deeper, it is all about context engineering; LangChain has a good blog post explaining it: https://blog.langchain.com/context-engineering-for-agents/

    For example, in a coding CLI tool, running the compress command compacts the conversation history so the context window becomes free again.

    6. A Spec-Driven Tool (Kiro from AWS) That Is Easy to Understand

    7. Creating a New Business Idea in Practice

    An insurance company wants to introduce a new humanoid-robot insurance business model.

    Use Case: In Germany, a Figure 03 humanoid robot cooking in a Miele kitchen lost its camera vision due to smoke from overcooking, causing serious damage to the kitchen.

    Acceptance Criteria: The workflow should demonstrate policy coverage evaluation and calculate the compensation amount provided.

    8. Key Takeaways

    • Shift your mindset: from “how to code” to “what to build.”
    • Treat specs as assets, not overhead.
    • Use one SDD tool in your next sprint — measure time saved.
    • Build a community around shared specs (e.g. GDPR.md, AI_ACT.md).
    • Log improvements — time, rework, clarity.

    9. Total Time Spent on This New Idea

    10. Get Started (Technical Details)

    We are using an open-source stack: spec-kit and the Qwen CLI.

    uv tool install specify-cli
    npm install -g @qwen-code/qwen-code@latest
    
    cd your_project_folder
    specify check # it supports many tools, but you need to install the one you like
    
    specify init .
    qwen # code CLI, https://github.com/QwenLM/qwen-code; you will find several new specify commands injected into the code tool
    

    Then run another four commands inside the coding agent:

    specify.specify
    specify.plan
    specify.tasks
    specify.implement
    

    and we will have turned the new business idea into reality.

    Results Sharing

    1. Project tree with clean architecture

    2. OpenAPI 3.0 contract

    3. Data model design

    4. Project timeline

    5. User story analysis

    6. Business data flow

    Closing Words

    Thanks for reading! The tools aren’t perfect, not yet, but they’ve helped us immensely in transforming ideas into working products faster than ever before. Martin Fowler’s team published a high-level overview of the tools; read it for another architecture team’s view: https://martinfowler.com/articles/exploring-gen-ai/sdd-3-tools.html

  • Summary

    Robot programming is still challenging — even for senior developers and kids! Why? Because it combines the physical world 🌍 with the software world 💻.

    AI hasn’t conquered the physical world yet 🤖🌍 — sensors, motors, and real-world variables still surprise us every day! ⚙️🔋🛞

    In the end, we must face the real-world chaos head-on, squash every bug 🐞, and win the game 🏆!


    📖 Our Story

    • ⏳ We started in October 2024 with zero experience and no equipment.
    • 🎄 After the Christmas break, we bought the devices and map and gained a basic understanding of the system, devices, and game rules.
    • 📋 We received the task in January 2025.
    • 🔧 From April 2025, we began intensive task preparation.
    • 🎯 Before the final competition, we could solve 100% of the task.
    • 👧👧 Two amazing girls scored 100 points on their own
    • 🥈🎉 2nd place in their group!
    • 🤝 They kept their friendship through difficulties, crying and laughing side by side 😢😂💪

    ⚠️ Key Lessons

    • Don’t rely on the color sensor – it’s inconsistent and unreliable under different light conditions.
    • ⚙️ The LEGO SPIKE set doesn’t behave as expected due to physical instability:
      → lighting 💡, wheels 🛞, ground texture 🪵, battery levels 🔋 all affect performance!
    • 🧱 Avoid block programming – it’s a nightmare for:
      1. debugging 🐛
      2. maintenance 🛠️
      3. understanding logic 🧠
    • 🐍 Use Python instead – it’s clearer, more scalable, and more logical ✨
    • 🕐 Speed matters! Knowing is not enough – you must finish fast to win ⏱️
    • 🗺️ Strategy over perfection – focus on solving the task effectively within limited time.
    • 🌟 Curiosity beats competition – stay passionate and explore, especially for young girls 💪👩‍🔬

    Screenshot

    🚀 Actions of 2026 WRO

    • ✅ Use Python for better logic and code management
    • 🔁 Practice with tasks from previous years for broad experience
    • 🤖 Leverage AI tools (like Copilot) to help the team learn independently
    • 💡 Be ambitious, stay curious, and encourage girls to lead boldly! 👧🚀💕
    • 🧑‍🏫👩‍🏫 To become a better coach 🤝, focus on being patient 🧘‍♂️, and connect with other coaches to learn and grow together 📈.

  • The Hard Problem in Database Migrations

    Online App: https://huggingface.co/spaces/neuesql/sqlgptapp

    The Problem

    An analysis of solutions for a simple Oracle-to-PostgreSQL data migration.

    Y = meets the need, X = not at all, P = partially

    Provider | Schema migrate | Function migrate | Stored procedure migrate
    AWS DMS | Y | X | X
    Google DMS | Y | X | X
    Azure DMS | Y | X | X
    Ora2pg | Y | P | P
    jOOQ | Y | P | P

    (DMS = Database Migration Service)

    Migrating functions and stored procedures from Oracle to PostgreSQL is the most complicated task; even Google, AWS, and Azure do not support it at the scale large enterprises demand. The millions in migration and transition costs are always the main blocker to moving the technology forward.

    Objectives

    • Cover all the database features: views, indexes, schemas, stored procedures, functions, etc.
    • Know the limitations when a function has no equivalent on the target database.
    • Automate with a test pipeline.

    Solutions

    The solution is inspired by the OpenAI GPT models: convert the problem from a database domain-specific language compiler into a general NLP problem.

    Features

    • No coding of a SQL compiler or converter in a DSL.
    • Based on large language models (LLMs), like OpenAI GPT, Google T5, or Facebook LLaMA.
    • SQL GPT can be adapted to different databases with different datasets.
    • Supports any data objects: tables, views, indexes, packages, partitions, procedures, functions, triggers, types, sequences, materialized views, database links, schedulers, GIS, etc.
    • Reinforcement learning from human (DBA) feedback for the language model. OpenAI Paper

    Roadmap

    Version 1: SQL GPT verifies the feasibility of this design with the OpenAI GPT model and API.

    Version 2: extend to open-source models like Google’s T5, plus clean datasets, to build for enterprise demand.

    System Architecture (planned)

    • There are several components: SQLCollector, DataGenerator, SQLGPT Service, and more.
    • SQLCollector: web service that receives the source SQL.
    • DataGenerator: generates dummy data for the schema.
    • SQLGPT Service: core service that generates the target SQL.
    • Models: V1 from the OpenAI model; V2 from the Google T5 model.
    • SQLTrainer: trains the model with reinforcement learning from new human feedback.
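    Since there is no DSL compiler, the core of the SQLGPT Service reduces to building a prompt and sending it to an LLM endpoint. A minimal sketch of the prompt-construction step; the function name and prompt wording are my assumptions, not the project’s actual code:

    ```python
    def build_translation_prompt(source_sql: str,
                                 source_dialect: str = "Oracle PL/SQL",
                                 target_dialect: str = "PostgreSQL PL/pgSQL") -> str:
        """Assemble the instruction SQLGPT would send to the LLM (illustrative)."""
        return (
            f"Translate the following {source_dialect} code into semantically "
            f"equivalent {target_dialect}. Return only the SQL.\n\n{source_sql}"
        )

    prompt = build_translation_prompt(
        "SELECT id, client_id FROM customer WHERE rownum <= 100;"
    )
    # The prompt then goes to the completion endpoint (the OpenAI API in V1,
    # a fine-tuned T5 model in V2), and human (DBA) feedback on the answer
    # feeds the SQLTrainer loop.
    print(prompt)
    ```

    The design choice is that all dialect knowledge lives in the model and the prompt, not in hand-written grammar rules.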

    Examples

    Example 1: selecting the top 100 customers, from Oracle SQL to PostgreSQL

    ### Oracle
    -- note: ROWNUM is applied before ORDER BY, so a true top-N needs a subquery
    SELECT id, client_id
    FROM customer
    WHERE rownum <= 100
    ORDER BY create_time DESC;
    
    ### Postgresql
    SELECT id, client_id
    FROM customer
    ORDER BY create_time DESC
    LIMIT 100;
    

    Example 2: transforming a procedure from Oracle PL/SQL into PostgreSQL PL/pgSQL

    ### Oracle
    CREATE OR REPLACE PROCEDURE print_contact(
        in_customer_id NUMBER
    )
        IS
        r_contact contacts%ROWTYPE;
    BEGIN
    
        SELECT *
        INTO r_contact
        FROM contacts
        WHERE customer_id = in_customer_id;
    
        dbms_output.put_line(r_contact.first_name || ' ' ||
                             r_contact.last_name || '<' || r_contact.email || '>');
    
    EXCEPTION
        WHEN OTHERS THEN
            dbms_output.put_line(SQLERRM);
    END;
    
    ### Postgresql
    -- A postgresql PG/SQL Procedure
    create procedure print_contact(IN in_customer_id integer)
        language plpgsql
    as
    $$
    DECLARE
        r_contact contacts%ROWTYPE;
    BEGIN
        -- get contact based on customer id
        SELECT *
        INTO r_contact
        FROM contacts
        WHERE customer_id = in_customer_id;
    
        -- print out contact's information
        RAISE NOTICE '% %<%>', r_contact.first_name, r_contact.last_name, r_contact.email;
    EXCEPTION
        WHEN OTHERS THEN
            RAISE EXCEPTION '%', SQLERRM;
    END;
    $$;
    

    Limitations

    The SQL context is currently limited by the LLM’s maximum token size, and the T5 model needs different datasets to be fed. We are testing Facebook’s LLaMA model next.

    Source Code

    GitHub introduction: https://github.com/neuesql/sqlgpt

    Github T5 model: https://github.com/neuesql/sqltransformer

    T5-Endpoint: https://huggingface.co/neuesql/sqltransformer

  • Make infrastructure as code testable and callable

    Infrastructure as code is not programming, it’s only configuration.

    — as a software developer

    With the big cloud-migration wave, a lot of effort goes into cloud provisioning and configuration. There is a modern name for this: “Infrastructure as Code.” But more than a decade after AWS entered the market, it is still at an early stage from a software engineer’s point of view.

    The biggest market winner is https://www.terraform.io/, which delivers infrastructure easily. But in the same year, 2014, AWS released the boto3 API, https://pypi.org/project/boto3/0.0.1/, which can also provision any AWS service in Python code easily:

    import boto3

    # An S3 "resource" is the high-level boto3 interface
    s3 = boto3.resource('s3')
    bucket = s3.create_bucket(
        Bucket='examplebucket',
        CreateBucketConfiguration={
            'LocationConstraint': 'eu-west-1',
        },
    )
    
    print(bucket)

    What does “simplicity” mean from Terraform’s point of view? I have a different opinion. Is boto3 really more complicated?

    The main disadvantage of Terraform is that its DSL can’t easily be tested with unit tests, mock services, integration tests, or E2E tests the way modern languages like Python and Java can. Many infra engineers and architects keep challenging me: why do we need tests? It sounds like a joke.
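    By contrast, when infrastructure is plain Python (boto3, CDK, Pulumi), standard test tooling applies directly. A minimal sketch using only the standard library’s unittest.mock; the create_bucket helper below is a hypothetical example of mine, not a real library function:

    ```python
    from unittest.mock import MagicMock

    def create_bucket(s3, name, region="eu-west-1"):
        """Hypothetical provisioning helper, written the boto3 way."""
        return s3.create_bucket(
            Bucket=name,
            CreateBucketConfiguration={"LocationConstraint": region},
        )

    # Unit test with a mock: no cloud account, no network, runs in milliseconds
    s3 = MagicMock()
    create_bucket(s3, "examplebucket")
    s3.create_bucket.assert_called_once_with(
        Bucket="examplebucket",
        CreateBucketConfiguration={"LocationConstraint": "eu-west-1"},
    )
    ```

    This is exactly the kind of fast, isolated feedback loop that an HCL-only workflow cannot give you.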

    However, some innovative solutions are emerging:

    1. AWS CDK https://aws.amazon.com/cdk/
    2. Pulumi https://www.pulumi.com/
    3. Terraform CDK https://www.terraform.io/cdktf

    These bring modern programming productivity to infrastructure, making it real code with real quality.

    import pulumi
    from pulumi_aws import s3
    
    # Create an AWS resource (S3 Bucket)
    bucket = s3.Bucket('my-bucket')
    
    # Export the name of the bucket
    pulumi.export('bucket_name',  bucket.id)

    As of today, there is probably a lot of legacy Terraform code or modules in your organization that you can’t drop directly. I got my hands dirty finding a testable way to maintain TF code. And today, SonarQube supports TF code checks for AWS.

    1. Overview of Test Cost

    Cost to Run Test

    2. Unit Test in Terraform

    These are not really unit tests; they are closer to syntax checks and plan explanations.

    terraform fmt -check
    tflint
    terraform validate
    terraform plan

    3. Integration Tests in Terraform

    In Terraform, you can run apply to deploy your code directly as an integration test.

    terraform apply
    terraform destroy  # don't forget to release the resources

    But you can also use an advanced integration-test framework such as Terratest or kitchen-terraform. Here is a test example that provisions a PostgreSQL server:

    
    func TestPostgresqlServer(t *testing.T) {
    	// tfFilePath, uniquePostfix, expectedUser and expectedPassword
    	// are fixtures defined in the full test file
    	terraformOptions := terraform.WithDefaultRetryableErrors(t, &terraform.Options{
    		TerraformDir: tfFilePath,
    		Vars: map[string]interface{}{
    			"postfix":     uniquePostfix,
    			"db_user":     expectedUser,
    			"db_password": expectedPassword,
    		},
    		NoColor: false,
    	})
    	subscriptionID := "xxx"
    
    	defer terraform.Destroy(t, terraformOptions)
    
    	terraform.InitAndApply(t, terraformOptions)
    	expectedServername := "postgresqlserver-" + uniquePostfix // see fixture
    	actualServername := terraform.Output(t, terraformOptions, "servername")
    	rgName := terraform.Output(t, terraformOptions, "rgname")
    	expectedSkuName := terraform.Output(t, terraformOptions, "sku_name")
    	actualServer := azure.GetPostgreSQLServer(t, rgName, actualServername, subscriptionID)
    	actualServerAddress := *actualServer.ServerProperties.FullyQualifiedDomainName
    	actualServerUser := *actualServer.ServerProperties.AdministratorLogin
    
    	// Expectation
    	assert.NotNil(t, actualServer)
    	assert.Equal(t, expectedUser, actualServerUser)
    	assert.Equal(t, expectedServername, actualServername)
    	assert.Equal(t, expectedSkuName, *actualServer.Sku.Name)
    
    	// End-to-end check: actually connect to the provisioned database
    	ConnectDB(t, expectedUser, expectedPassword, actualServerAddress, actualServername)
    }

    With Go’s database/sql library, you can run real SQL for an end-to-end test, like this:

    func ConnectDB(t *testing.T, userName string, expectedPassword string, databaseAddress string, actualServername string) {
    	var connectionString string = fmt.Sprintf("host=%s user=%s password=%s dbname=%s sslmode=require", databaseAddress, userName+"@"+actualServername, expectedPassword, "postgres")
    	print(connectionString)
    	db, err := sql.Open("postgres", connectionString)
    	assert.Nil(t, err, "open db failed")
    	err = db.Ping()
    	assert.Nil(t, err, "connect db failed")
    	fmt.Println("Successfully created connection to database")
    	var currentTime string
    	err = db.QueryRow("select now()").Scan(&currentTime)
    	assert.Nil(t, err, "query failed ")
    	assert.NotEmpty(t, currentTime, "Get Query Time "+currentTime)
    }

    The full GitHub code is here: https://github.com/wuqunfei/tfmodule-azure-resource-postgresql/blob/main/test/mod_test.go#L56

    4. Policy Test in Terraform

    In recent years, IT compliance and security have also become code, applied when you provision your infrastructure. Azure has an example blog post: https://learn.microsoft.com/en-us/azure/developer/terraform/best-practices-compliance-testing

    terraform show -json main.tfplan > main.tfplan.json
    
    docker run --rm -v $PWD:/target -it eerkunt/terraform-compliance -f features -p main.tfplan.json
    
    # https://github.com/terraform-compliance/cli  a lightweight, security focused, BDD test framework against terraform.
    

    5. Terraform Code Quality Check

    Since last year, SonarQube has supported Terraform grammar and security checks. It helps us reduce a lot of manual setup and checking.

    For example, SonarQube flags this code with: omitting “public_network_access_enabled” allows network access from the Internet; make sure it is safe here.

    # public_network_access_enabled = true
    # default is true, need to set false
    public_network_access_enabled = false

    Developers can’t find this easily, because public network access is enabled by default.

    6. Making Terraform code callable

    1. Shell spaghetti: with some shell scripts, GitHub CI/CD, or Jenkins, the provisioning can be automated quickly and easily.

    2. Integrating with automation tools: we can also use the Ansible Terraform module and define a playbook to run automatically (Ansible TF module link). Ansible Tower can expose its API easily to automation tools.

    3. Crossplane Open API definition

    It is a new design from Crossplane, in which every component can be called through an OpenAPI definition.

      versions:
      - name: v1alpha1
        served: true
        referenceable: true
        schema:
          openAPIV3Schema:
            type: object
            properties:
              spec:
                type: object
                properties:
                  parameters:
                    type: object
                    properties:
                      storageGB:
                        type: integer
                    required:
                      - storageGB
                required:
                  - parameters

    Original code is here https://github.com/crossplane/crossplane/blob/master/docs/getting-started/create-configuration.md

    Summary

    Writing infrastructure code is straightforward, but the high-quality software mindset behind it is still missing in much of daily cloud engineering. In this blog, I summarized the existing market options; it may help anyone who wants to improve automation and quality in their daily infrastructure work.

  • Centralizing CI/CD pipeline in a big organization

    Don’t repeat yourself

    https://en.wikipedia.org/wiki/Don%27t_repeat_yourself

    Using GitLab CI/CD or GitHub Actions, you can build a workflow for a single project quickly. There is always a pipeline file like .gitlab-ci.yml or .github/workflows in each individual project. But each developer or team needs to maintain and update it regularly when there are bugs or changes.

    In a big organization, it would be better to have a centralized team (e.g., Cloud Platform, Performance & Reliability Engineering, Engineering Tools) develop standard tooling and infrastructure to solve every development team’s problems.

    https://netflixtechblog.com/full-cycle-developers-at-netflix-a08c31f83249

    Pipelines for different applications are a common demand across software engineering, data engineering, and data science. Centralizing them helps business developers focus on their business logic, while a centralized team supports the most up-to-date stacks in the organization.

    A pipeline is code. So how can we share pipeline code and combine it with business code quickly and smoothly? This article shares three patterns and three anti-patterns, using Jenkins features, to achieve this goal. But it does not limit you: modern cloud CI/CD tools like Azure Pipelines, AWS CodePipeline, or Google Cloud Build work just as well.

    Patterns:

    1. Set boundaries and separate responsibilities. Let specialists do the professional job.
    2. Centralize pipelines in one git repository, with a centralized team to maintain, update, and fix them. CI can combine source code, pipeline, and deployment in one build process.
    3. Give end developers the possibility and flexibility to customize features and add new ideas.

    Anti-Patterns:

    1. Anyone can do anything. Asking data scientists to write a service deployment pipeline is too expensive; conversely, asking DevOps to write an NLP pipeline is just as challenging.
    2. Each project has its own pipeline. How do you fix >100 repositories with the same legacy pipeline simultaneously? Don’t repeat yourself.
    3. Control over innovation. If the centralized team lacks the capacity or passion to take on the responsibility, a mess follows. The organization is better off with a mature approval model that governs innovation instead of killing invention.

    Economy effects

    Consider an organization with 1000 developers, where each team has over 10 projects running in production, as in today’s microservice pattern. A pipeline bug fix takes a specialist 2 hours. A general developer needs 4 hours the first time, to understand the context and solve it; with that experience, the same pipeline in other projects can be fixed in 1 hour each.

    Hours = 1000 (developers) × (1 (first project) × 4 h + 9 (other projects) × 1 h)
          = 1000 × (4 + 9)
          = 13,000 h

    Using the average developer salary in Munich, Germany of €7,000/month:

    Cost = 13,000 h × €7,000 / 22 (days) / 8 (hours)
         ≈ €517,045

    At economies of scale, one specialist fixing a bug once in the centralized pipeline repository saves half a million euros on a 2-hour bug in a 1000-developer organization.
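    The back-of-the-envelope numbers above are easy to check in a few lines of Python:

    ```python
    developers = 1000
    hours = developers * (1 * 4 + 9 * 1)  # first project: 4 h; nine more: 1 h each
    hourly_rate = 7000 / 22 / 8           # €7000/month over 22 days of 8 hours
    cost = hours * hourly_rate

    print(hours)        # 13000
    print(round(cost))  # 517045
    ```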

    Let’s do it in action.

    Use case: build a pipeline that supports a Python Flask web service and deploys it to Azure Kubernetes Service.

    1. Create two repositories, one for source code and one for pipeline code.
      1. https://github.com/wuqunfei/jenkins_ai_pipelines
      2. https://github.com/wuqunfei/ocr_service
    2. Use the Jenkins DSL to create similar pipelines for the same type of workflow with parameters, e.g., a Spring Boot application, a Python web application, etc.
      1. ocr_service is a classic Python Flask web service
    3. Combine pipeline code and source code in the same CI job:
    //github server setting
    String github_token_credential = "git-token-credentials"
    String github_host = "github.com"
    
    //central pipeline repository
    String pipeline_repository = "wuqunfei/jenkins_ai_pipelines"
    String pipeline_jenkins_file = "Jenkinsfile.py.aks.groovy"
    
    //application source code
    String source_code_repository_url = "https://github.com/wuqunfei/ocr_service"
    String source_code_branch = "main"
    
    //Azure ACR and AKS
    String acr_name = "ocr"
    String acr_credential = "acr_credential"
    String aks_kubeconfig_file_credential = "k8s"
    
    
    //Application
    String application_name = "pysimple"
    
    pipelineJob("ocr-service-builder") {
        parameters {
    
            stringParam('github_token_credential', github_token_credential, 'Github token credential id')
    
            stringParam("application_name", application_name, "application_name for docker image")
            stringParam("source_code_repository_url", source_code_repository_url, "Application Source Code HTTP URL")
            stringParam("source_code_branch", source_code_branch, "Application Source Code Branch, default main")
    
    
            stringParam("pipeline_repository", pipeline_repository, "pipeline github project name")
            stringParam("pipeline_jenkins_file", pipeline_jenkins_file, 'pipeline file')
    
    
            stringParam("acr_name", acr_name, "Azure Container Registry name for docker image")
            stringParam("acr_credential", acr_credential, "Azure Container credential(user/pwd) id in jenkins ")
            stringParam("aks_kubeconfig_file_credential",aks_kubeconfig_file_credential, "Azure AKS kubeconfig file credential id in Jenkins" )
    
        }
        definition {
            cpsScm {
                scm {
                    git {
                        remote {
                            github(pipeline_repository, "https", github_host)
                            credentials(github_token_credential)
                        }
                    }
                }
                scriptPath(pipeline_jenkins_file)
            }
        }
    }
    

    Jenkins DSL API https://jenkinsci.github.io/job-dsl-plugin/#path/pipelineJob

    pipeline {
        agent any
        stages {
            stage('Checkout Source Code and Deployment Code') {
                steps {
    
                    git branch: "${params.source_code_branch}", credentialsId: "${params.github_token_credential}", url: "${params.source_code_repository_url}"
                    echo "Checkout source code done ${params.source_code_repository_url}"
    
                }
            }
            stage("Test Code"){
                steps{
                    echo "Test code"
                }
            }
            stage("Build Code"){
                steps{
                    echo "application build done"
                }
            }
            stage("Docker Build"){
                steps{
                    script {
                        dockerImage = docker.build("${params.application_name}:${env.BUILD_ID}")
                    }
                    echo "docker build done"
                }
            }
            stage("Docker Publish ACR"){
                steps{
                    script{
                        docker_register_url =  "https://${params.acr_name}.azurecr.io"
                        docker.withRegistry( docker_register_url, "${params.acr_credential}" ) {
                            dockerImage.push("latest")
                        }
                    }
                    echo "docker push done"
                }
            }
            stage("Kubernetes Deploy"){
                steps{
                    withCredentials([kubeconfigContent(credentialsId: "${params.aks_kubeconfig_file_credential}", variable: 'kubeconfig_file')]) {
                        // dir() and writeFile do not expand '~', so create the directory
                        // via the shell and write the config using the HOME env var
                        sh 'mkdir -p "$HOME/.kube"'
                        writeFile file: "${env.HOME}/.kube/config", text: "$kubeconfig_file"
                        sh 'cat ~/.kube/config'
                        echo "K8s deploy is done"
                    }
                }
            }
            stage("Service Health Check"){
                steps{
                    echo "Service is up"
                }
            }
        }
    }
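    The “Service Health Check” stage above is only a stub. One way to implement it, assuming the deployed service exposes an HTTP health endpoint, is a retry loop that polls until the service answers. A minimal sketch in Python (the function names, retry counts, and the stubbed check are illustrative, not part of the original pipeline):

    ```python
    import time

    def wait_until_healthy(check, retries=5, delay_s=0.0):
        """Call `check` until it returns True or retries are exhausted.

        `check` is any zero-argument callable, e.g. an HTTP GET against the
        service's health endpoint wrapped to return True on HTTP 200.
        """
        for _attempt in range(retries):
            if check():
                return True
            time.sleep(delay_s)  # back off before the next attempt
        return False

    # Example with a stubbed check that succeeds on the third attempt:
    calls = {"n": 0}
    def flaky_check():
        calls["n"] += 1
        return calls["n"] >= 3

    print(wait_until_healthy(flaky_check, retries=5))  # True
    ```

    In the Jenkinsfile the same idea would typically live in an `sh` step that curls the endpoint in a loop and fails the build on timeout.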
    

    The implementation is straightforward, but getting team members and managers to comprehend it takes much more time. One company I worked for took at least two years to mature this idea, with some fantastic architects pushing the notion of a “pipeline-driven organization”: https://www.infoq.com/articles/pipeline-driven-organization/.

    I hope my experience inspires you to apply this in your organization, whether with Jenkins or another cloud CI/CD tool.

    References:

    https://github.com/wuqunfei/jenkins_ai_pipelines

    https://www.digitalocean.com/community/tutorials/how-to-automate-jenkins-job-configuration-using-job-dsl

  • A Big Package of “Architecture” Principles and Manifestos

    Defining a set of guiding principles is an important first step of any strategy.

    — Cloud Strategy, Gregor Hohpe

    1. Agile manifesto

    We are uncovering better ways of developing
    software by doing it and helping others do it.
    Through this work we have come to value:

    1. Individuals and interactions over processes and tools
    2. Working software over comprehensive documentation
    3. Customer collaboration over contract negotiation
    4. Responding to change over following a plan

    That is, while there is value in the items on
    the right, we value the items on the left more.

    https://agilemanifesto.org/iso/en/manifesto.html

    2. 21 principles of enterprise architecture

    Four categories of principles

    • General principles
    • Information principles
    • Application principles
    • Technology principles

    General principles

    1. IT and business alignment
    2. Maximum benefits at the lowest cost and risk
    3. Business continuity
    4. Compliance with standards and policies
    5. Adoption of the best practices for the market

    Information principles

    1. Information treated as an asset
    2. Shared information
    3. Accessible information
    4. Common terminology and data definitions
    5. Information security

    Application principles

    1. Technological independence
    2. Easy-to-use applications
    3. Component reusability and simplicity
    4. Adaptability and flexibility
    5. Convergence with the enterprise architecture
    6. Enterprise architecture also applies to external applications
    7. Low-coupling interfaces
    8. Adherence to functional domains

    Technology principles

    1. Changes based on requirements
    2. Control of technical diversity and suppliers
    3. Interoperability

    https://developer.ibm.com/articles/enterprise-architecture-financial-sector/

    3. The 6 pillars of the AWS Well-Architected Framework

    1. Operational Excellence
    2. Security
    3. Reliability
    4. Performance Efficiency
    5. Cost Optimization
    6. Sustainability

    https://aws.amazon.com/cn/blogs/apn/the-6-pillars-of-the-aws-well-architected-framework/

    4. The Twelve Factors

    1. Codebase
      One codebase tracked in revision control, many deploys
    2. Dependencies
      Explicitly declare and isolate dependencies
    3. Config
      Store config in the environment
    4. Backing services
      Treat backing services as attached resources
    5. Build, release, run
      Strictly separate build and run stages
    6. Processes
      Execute the app as one or more stateless processes
    7. Port binding
      Export services via port binding
    8. Concurrency
      Scale out via the process model
    9. Disposability
      Maximize robustness with fast startup and graceful shutdown
    10. Dev/prod parity
      Keep development, staging, and production as similar as possible
    11. Logs
      Treat logs as event streams
    12. Admin processes
      Run admin/management tasks as one-off processes

    https://12factor.net/
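    Factor 3 (“store config in the environment”) is the one the pipeline above depends on most directly, since the same image must run in staging and production. A minimal sketch in Python, with illustrative variable names and defaults:

    ```python
    import os

    # Factor 3: read configuration from the environment instead of hardcoding it.
    # DATABASE_URL, DEBUG, and PORT are illustrative names, not a real service's.
    def load_config(env=os.environ):
        return {
            "database_url": env.get("DATABASE_URL", "sqlite:///local.db"),
            "debug": env.get("DEBUG", "false").lower() == "true",
            "port": int(env.get("PORT", "8000")),
        }

    # In Kubernetes, these values would come from the pod spec's env/ConfigMap,
    # so the same image behaves differently per deployment target.
    config = load_config({"PORT": "5000", "DEBUG": "true"})
    print(config["port"], config["debug"])  # 5000 True
    ```

    The payoff for the centralized pipeline is that the Docker image built once in the “Docker Build” stage never needs rebuilding per environment.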

    5. Manifesto for software craftsmanship

    As aspiring Software Craftsmen we are raising the bar of professional software development by practicing it and helping others learn the craft. Through this work we have come to value:

    1. Not only working software, but also well-crafted software.
    2. Not only responding to change, but also steadily adding value
    3. Not only individuals and interactions, but also a community of professionals
    4. Not only customer collaboration, but also productive partnerships

    That is, in pursuit of the items on the left we have found the items on the right to be indispensable.

    https://manifesto.softwarecraftsmanship.org/#/en