• 1. Introduction

    I started where most people start: asking Claude.ai to analyze stocks for me. “What’s AAPL’s RSI?” — and it works, kind of. Claude can fetch data through MCP tools, compute a single indicator for a single ticker, and give you a reasonable answer.

    But the moment I tried anything real — “Screen all US stocks for oversold signals” or “Compare RSI and MACD across my 50-ticker watchlist” — it fell apart. The LLM can only handle one ticker at a time. The data comes from third-party APIs with no guarantee of accuracy. Ten tickers already eat 11% of the context window. And every query costs tokens.

    I wanted something different: accurate data I control, analysis across thousands of tickers at once, and a cost I could sustain as a personal project — not a $200/month cloud bill. So I built the data infrastructure first, then used AI as the last step — not to fetch and compute, but to write the C++ DuckDB extension code that made it all possible.


    The Problem: Three Ways to Analyze a Stock

    I want a simple answer: “What are AAPL’s SMA(20), RSI(14), and EMA(12) for the past year?”

    Here are three ways to get it — same question, very different trade-offs.

    Way 1: LLM Tool Prompt

    Using Claude.ai with a financial data MCP server (e.g., FMP — Financial Modeling Prep), you can ask a question in natural language and the LLM fetches the data for you:

    You: "Calculate SMA(20), RSI(14), and EMA(12) for AAPL's
    last 365 days of daily closing prices."

    Claude calls the FMP MCP tool, receives ~250 rows of JSON (~11,375 tokens), then tries to compute the indicators. For one ticker, this works — the API call takes 7.1 seconds, plus ~5 seconds of inference, totaling about 12 seconds and $0.03 in tokens. But for 10 tickers? That’s 71 seconds of API calls, 11% of the context window, and $0.30 in tokens. At 100 tickers, the data doesn’t fit in the context window at all.
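    The context-window math above is easy to sanity-check. A quick sketch in Python — the per-ticker figures (~11,375 tokens and 7.1 s per API call) are the measurements quoted in this article; the 1M-token window size is an assumption about the model:

    ```python
    # Back-of-envelope check of the LLM-tool numbers quoted above.
    # Assumptions: ~11,375 tokens and ~7.1 s of API time per ticker
    # (measured figures from this article), and a 1M-token context window.

    TOKENS_PER_TICKER = 11_375
    SECONDS_PER_CALL = 7.1
    CONTEXT_WINDOW = 1_000_000

    def llm_budget(n_tickers):
        tokens = n_tickers * TOKENS_PER_TICKER
        return {
            "api_seconds": round(n_tickers * SECONDS_PER_CALL, 1),
            "context_used_pct": round(100 * tokens / CONTEXT_WINDOW, 1),
            "fits": tokens <= CONTEXT_WINDOW,
        }

    print(llm_budget(10))   # ~71 s of API calls, ~11% of the window
    print(llm_budget(100))  # ~1.1M tokens -- no longer fits
    ```

    At 10 tickers you are already past 11% of the window; at 100, the raw payload alone exceeds it.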

    Way 2: Python + yfinance + TA-Lib (Script)

    import yfinance as yf
    import talib
    import numpy as np

    # Download one year of daily bars, then compute the three indicators
    df = yf.download("AAPL", start="2025-04-17", end="2026-04-17")
    close = df["Close"].values.astype(np.float64)
    df["sma_20"] = talib.SMA(close, timeperiod=20)
    df["rsi_14"] = talib.RSI(close, timeperiod=14)
    df["ema_12"] = talib.EMA(close, timeperiod=12)

    Fast once running (~405 ms including the network download), but: you need Python installed, the yfinance and TA-Lib dependencies, and a script per analysis. For multiple tickers you write groupby loops. For different time periods you copy-paste code. The data lives in memory — restart and it’s gone.

    Way 3: DuckDB + talib Extension (What I Built)

    https://duckdb.org/community_extensions/extensions/talib

    # Prerequisite: install DuckDB (single binary, no dependencies)
    # https://duckdb.org/docs/installation/
    # macOS, duckdb version v1.5.2
    brew install duckdb
    # or Linux
    curl -fsSL https://github.com/duckdb/duckdb/releases/latest/download/duckdb_cli-linux-amd64.zip -o duckdb.zip && unzip duckdb.zip
    -- That's it. Now start `duckdb` and run:
    INSTALL talib FROM community;
    LOAD talib;
    SELECT date, close,
    ta_sma(close, 20) OVER (ORDER BY date) AS sma_20,
    ta_rsi(close, 14) OVER (ORDER BY date) AS rsi_14,
    ta_ema(close, 12) OVER (ORDER BY date) AS ema_12
    FROM read_parquet('aapl.parquet')
    WHERE date >= '2025-04-17';

    4ms. One binary (duckdb), no Python, no API keys, no virtual environments. Change the ticker? Change the WHERE clause. 500 tickers at once? Add PARTITION BY symbol. Query data on S3? Replace read_parquet with your Iceberg table. Same SQL, any scale.
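    If you want to see what those three functions actually compute, here is a pure-Python sketch of the textbook formulas — illustrative only; TA-Lib’s exact seeding and NaN handling may differ slightly:

    ```python
    # Pure-Python sketch of the three indicators the SQL above computes.
    # Textbook formulas only -- TA-Lib's exact seeding/NaN handling may
    # differ slightly, so treat this as illustration, not a reference.

    def sma(values, period):
        """Simple moving average; None until the window fills."""
        out = []
        for i in range(len(values)):
            if i + 1 < period:
                out.append(None)
            else:
                out.append(sum(values[i + 1 - period : i + 1]) / period)
        return out

    def ema(values, period):
        """Exponential moving average, seeded with the first SMA."""
        k = 2 / (period + 1)
        out, prev = [], None
        for i, v in enumerate(values):
            if i + 1 < period:
                out.append(None)
            elif prev is None:
                prev = sum(values[:period]) / period  # SMA seed
                out.append(prev)
            else:
                prev = v * k + prev * (1 - k)
                out.append(prev)
        return out

    def rsi(values, period):
        """Wilder's RSI: smoothed average gains vs average losses."""
        out = [None] * len(values)
        gains = losses = 0.0
        for i in range(1, len(values)):
            change = values[i] - values[i - 1]
            gain, loss = max(change, 0.0), max(-change, 0.0)
            if i <= period:
                gains += gain / period
                losses += loss / period
            else:
                gains = (gains * (period - 1) + gain) / period
                losses = (losses * (period - 1) + loss) / period
            if i >= period:
                out[i] = 100 - 100 / (1 + gains / losses) if losses else 100.0
        return out

    closes = [10, 11, 12, 11, 13, 14, 13, 15, 16, 15]
    print(sma(closes, 3)[-1])  # average of the last 3 closes
    ```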

    The Trade-offs of These 3 Ways at a Glance

    ┌─────────────────┬───────────┬─────────┬──────────┬────────────────────┐
    │ Approach        │ Time      │ vs Best │ Cost     │ Scales to 100+?    │
    ├─────────────────┼───────────┼─────────┼──────────┼────────────────────┤
    │ Claude.ai + FMP │ 12,100 ms │ 3,025x  │ ~$0.03/q │ No (context limit) │
    │ Python + TA-Lib │ 405 ms    │ 101x    │ free     │ Slow (API + loops) │
    │ DuckDB + talib  │ 4 ms      │ 1x      │ free     │ Yes (one SQL)      │
    └─────────────────┴───────────┴─────────┴──────────┴────────────────────┘

    DuckDB is 101x faster than Python and 3,025x faster than the LLM approach — and it’s the only one that scales. That’s what I built, and why.


    What Is It?

    atm_talib is a DuckDB extension that wraps 100+ TA-Lib technical analysis functions as native SQL functions. RSI, MACD, Bollinger Bands, 49 candlestick patterns — all available directly in SQL: https://duckdb.org/community_extensions/extensions/talib

    Try It: Apple Stock in a Few Lines of SQL

    -- Step 1: Install
    INSTALL talib FROM community;
    LOAD talib;
    -- Step 2: Load 30 days of Apple stock (via CSV or your preferred source)
    CREATE TABLE apple AS SELECT * FROM read_csv('apple_30d.csv');
    -- Step 3: Analyze
    SELECT
    date,
    close,
    ta_sma(close, 7) OVER (ORDER BY date) AS sma_7,
    ta_ema(close, 12) OVER (ORDER BY date) AS ema_12,
    ta_rsi(close, 14) OVER (ORDER BY date) AS rsi_14
    FROM apple
    ORDER BY date;

    Or use the scalar form for full-series computation:

    -- Get MACD signals in one shot
    SELECT t_macd(list(close ORDER BY date), 12, 26, 9) FROM apple;
    -- Detect Doji candlestick patterns
    SELECT t_cdldoji(
    list(open ORDER BY date), list(high ORDER BY date),
    list(low ORDER BY date), list(close ORDER BY date)
    ) FROM apple;

    Two function styles, one purpose:

    Style      Prefix   Best For
    Scalar     t_       Backtests, batch processing, full-series analysis
    Aggregate  ta_      Dashboards, ad-hoc queries, row-level output

    Lesson 1: AI Makes C++ Accessible to Everyone

    I’m a Python and Java developer. I’d never written a DuckDB extension — the closest I’d come was writing a C library with CMake back in university. But with Claude as my pair-programming partner, I went from zero to a working extension in 2 days.

       Day 1                          Day 2
        │                              │
        ▼                              ▼
    ┌─────────────┐            ┌──────────────────┐
    │ Design Spec │            │ Multi-Output Fns │
    └──────┬──────┘            │ (MACD, BBANDS)   │
           │                   └────────┬─────────┘
           ▼                            │
    ┌──────────────┐                    ▼
    │ Scaffold     │           ┌────────────────┐
    │ Extension    │           │ Test Suite     │
    └──────┬───────┘           └────────┬───────┘
           │                            │
           ▼                            ▼
    ┌──────────────────┐       ┌────────────────────┐
    │ Adapter Layer    │       │ Community          │
    │ DuckDB <> TA-Lib │       │ Submission         │
    └──────┬───────────┘       └────────┬───────────┘
           │                            │
           ▼                            ▼
    ┌──────────────────┐       ┌────────────────────┐
    │ X-Macro Pattern  │       │ ✓ Merged into      │
    │ 100+ Functions   │──────>│   DuckDB Community │
    └──────────────────┘       └────────────────────┘
    
    
    

    The key insight: I described what I wanted — “wrap TA-Lib’s TA_SMA as a DuckDB scalar function” — and Claude helped me figure out how. The C++ X-macro pattern that registers all 100+ functions? I wouldn’t have discovered that approach on my own. The result: ~3,000 lines of C++ that compile on Linux and macOS, with no Python build dependencies.

    This is what AI-assisted development unlocks: a Python developer can ship production C++ code. Not by learning C++ for months, but by focusing on the domain problem while the AI handles the language mechanics.


    Lesson 2: Performance, Performance, Performance

    I benchmarked four ways to answer the same question: “What are AAPL’s SMA(20), RSI(14), and EMA(12) for the past year?”

    4-Way Benchmark: DuckDB Local vs S3 Iceberg vs Python vs LLM

      DuckDB Local  ▏ 4 ms
      DuckDB S3     ▏██ 155 ms
      Python+yfinance ▏████ 405 ms
      DuckDB S3 cold ▏██████████████████████████████████ 2,724 ms
      LLM+Inference ▏████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████ 12,100 ms
    
    
    
    Approach                                   Time (ms)    vs Best
    DuckDB + Parquet (local)                   4 ms         1x
    DuckDB + S3 Iceberg (warm connection)      155 ms       36x
    Python + yfinance API + TA-Lib             405 ms       101x
    DuckDB + S3 Iceberg (cold start)           2,724 ms     633x
    LLM Tool (FMP API + inference)             ~12,100 ms   3,025x

    Key insights:

    • Local Parquet is king at 4ms — no network, pure columnar scan
    • S3 Iceberg warm (155ms) beats Python+yfinance (405ms) — and you get access to 14,669 tickers, not just one
    • S3 cold start (2.7s) includes loading 4 extensions + Glue catalog attach — a one-time cost per session
    • LLM tools are 3,025x slower — great for questions, terrible for batch analysis

    At Scale: The Real S3 Advantage

    The single-ticker benchmark doesn’t show S3’s real strength. Here’s what happens when you need all tickers:

    Query                        DuckDB + S3   Python + yfinance   LLM Tool (measured)
    1 ticker, 1 year             155 ms        405 ms              7.1 sec (1 API call, 1.1% context)
    10 tickers                   4.8 sec       0.2 sec             71 sec (10 API calls, 11% context)
    100 tickers                  7.0 sec       N/A                 impossible (1.1M tokens > context)
    14,669 tickers → RSI < 30    0.34 sec      impossible          impossible

    I measured this: each FMP MCP tool call takes 7.1 seconds and returns 250 rows of JSON (~11,375 tokens). At 10 tickers you’ve consumed 11% of a 1M-token context window and spent 71 seconds waiting. At 100 tickers (1.1M tokens), the data doesn’t fit — the LLM can’t even read it, let alone analyze it.

    DuckDB + S3 scans all 14,669 tickers in a single 340ms SQL query. No token limits. No API rate limits.

    When to Use Each

                 DuckDB Local            DuckDB + S3           Python + API       LLM Tool
    Speed        4 ms                    155 ms                405 ms             ~12 sec
    Best for     Backtests, dashboards   Full universe scans   One-off scripts    Ad-hoc questions
    Data         Local files             54M rows on S3        API per request    API per request
    Code         SQL                     SQL                   15+ lines Python   Natural language

    The LLM + DuckDB Combo

    The real power move: use DuckDB to pre-compute indicators, then feed compact results to an LLM:

    -- Filter to only oversold/overbought signals, then feed to LLM
    WITH signals AS (
    SELECT ticker, date, close,
    ta_rsi(close, 14) OVER w AS rsi,
    ta_sma(close, 20) OVER w AS sma_20
    FROM read_parquet('stocks.parquet')
    WINDOW w AS (PARTITION BY ticker ORDER BY date)
    )
    SELECT * FROM signals WHERE rsi < 30 OR rsi > 70;

    Instead of the LLM processing 250 rows of raw JSON, it gets 10 pre-filtered signals. Fewer tokens = lower cost, better analysis.
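    A rough sketch of that saving, assuming ~45.5 tokens per JSON row (derived from the ~11,375 tokens / 250 rows measurement above):

    ```python
    # Rough token-savings estimate for the pre-filter pattern above.
    # Assumption: ~45.5 tokens per row of JSON, derived from the
    # ~11,375 tokens / 250 rows measurement quoted earlier.

    TOKENS_PER_ROW = 11_375 / 250  # ~45.5

    def tokens(n_rows):
        return int(n_rows * TOKENS_PER_ROW)

    raw = tokens(250)      # a year of raw OHLCV fed to the LLM
    filtered = tokens(10)  # only the pre-filtered signals
    print(raw, filtered, f"{raw / filtered:.0f}x fewer tokens")
    ```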


    Lesson 3: 54 Million Rows, $1/Month — A Personal Data Lakehouse

    I have a real setup: 54 million rows of daily OHLCV data for 14,669 tickers stored as Iceberg tables on S3, managed by AWS Glue. Here’s how it works:

    ┌──────────────────────────────────────────────────────────────┐
    │                      Claude.ai Desktop                       │
    │                    (AI Analysis Client)                      │
    │                                                              │
    │  "Find all oversold stocks"      "Show AAPL trend with MACD" │
    │              │                            │                  │
    │              ▼                            ▼                  │
    │  ┌────────────────────────────────────────────────────────┐  │
    │  │                     Claude Skills                      │  │
    │  │                                                        │  │
    │  │  ┌──────────────────┐    ┌──────────────────────────┐  │  │
    │  │  │ Python Scripts   │───>│ DuckDB SQL + talib ext   │  │  │
    │  │  │ (wrapper)        │    │                          │  │  │
    │  │  │ • duckdb python  │    │ • t_sma(), t_rsi()       │  │  │
    │  │  │ • yfinance       │    │ • t_macd(), t_bbands()   │  │  │
    │  │  │ • pandas         │    │ • 49 candlestick fns     │  │  │
    │  │  │                  │    │ • read_parquet/iceberg   │  │  │
    │  │  └──────────────────┘    └────────────┬─────────────┘  │  │
    │  └───────────────────────────────────────┼────────────────┘  │
    └──────────────────────────────────────────┼──────────────────-┘
                                               │
                                               ▼
                              ┌──────────────────────────────────┐
                              │          DuckDB Engine           │
                              │     (in-process, no server)      │
                              │                                  │
                              │  LOAD talib;                     │
                              │  LOAD iceberg;                   │
                              │  LOAD httpfs;                    │
                              │  LOAD aws;                       │
                              └────────────────┬─────────────────┘
                                               │ S3 protocol (HTTPS)
                                               ▼
                              ┌──────────────────────────────────┐
                              │        AWS Data Lakehouse        │
                              │                                  │
                              │  ┌────────────────────────────┐  │
                              │  │     AWS Glue Catalog       │  │
                              │  │  (Iceberg table metadata)  │  │
                              │  │                            │  │
                              │  │  database: atm             │  │
                              │  │  table:    t_bar_1d        │  │
                              │  └─────────────┬──────────────┘  │
                              │                │                 │
                              │                ▼                 │
                              │  ┌────────────────────────────┐  │
                              │  │       AWS S3 Bucket        │  │
                              │  │  (Iceberg Parquet files)   │  │
                              │  │                            │  │
                              │  │  • 54,018,638 rows         │  │
                              │  │  • 14,669 tickers          │  │
                              │  │  • 1980 — present          │  │
                              │  │  • ~$1/month storage       │  │
                              │  └────────────────────────────┘  │
                              └──────────────────────────────────┘

    Real S3 Iceberg Benchmark (from my MacBook)

    -- This is all it takes to query 54M rows on S3 from a MacBook
    LOAD talib; LOAD iceberg; LOAD httpfs; LOAD aws;
    ATTACH '872583282217' AS gc (TYPE iceberg, ...);
    SELECT symbol, t_rsi(list(close ORDER BY date), 14)[-1] AS rsi
    FROM gc.atm.t_bar_1d
    WHERE date >= '2025-01-01'
    GROUP BY symbol
    HAVING rsi < 30; -- Find all oversold stocks across 14,669 tickers

    Query                                       DuckDB + S3 (cold)   DuckDB + S3 (warm)   Python + yfinance
    AAPL 1 year → SMA + RSI + EMA               3.2 sec              0.6 sec              0.4 sec
    AAPL full history (1980–2026) → SMA(200)    1.5 sec              0.15 sec             0.4 sec
    10 tickers → SMA + RSI                      6.6 sec              4.8 sec              0.2 sec
    14,669 tickers → RSI < 30                   6.7 sec              0.34 sec             impossible

    The first few queries show S3 network latency. But the last row is the killer feature: scanning 14,669 tickers to find 274 oversold stocks in 340ms. With yfinance, you’d need 14,669 separate API calls — that would take hours and likely get rate-limited.
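    That “hours” claim is simple arithmetic — a sketch using the measured ~405 ms per yfinance call from the benchmark above, and generously ignoring rate limits:

    ```python
    # Why scanning the universe via per-ticker API calls doesn't scale:
    # at the measured ~405 ms per yfinance request, and ignoring rate
    # limits entirely, 14,669 sequential calls take well over an hour.

    N_TICKERS = 14_669
    SECONDS_PER_CALL = 0.405  # measured single-ticker yfinance time

    total_hours = N_TICKERS * SECONDS_PER_CALL / 3600
    print(f"{total_hours:.1f} hours of sequential API calls")
    ```

    In practice rate limiting, retries, and throttling push this far higher — the 340 ms single-scan SQL query is not a fair fight.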

    Cost Comparison: All US Market Daily OHLCV (54M rows, 14K tickers)

    ┌─────────────────────┬───────────┬───────────┬───────────┬────────────┐
    │ │ Storage │ Compute │ Total/mo │ vs DuckDB │
    ├─────────────────────┼───────────┼───────────┼───────────┼────────────┤
    │ Snowflake │ $23/mo │ $228/mo* │ ~$251/mo │ 251x │
    │ Databricks │ $23/mo │ $150/mo* │ ~$173/mo │ 173x │
    │ BigQuery │ free 10GB │ $30/mo** │ ~$30/mo │ 30x │
    │ AWS RDS (Postgres) │ $50/mo │ $80/mo │ ~$130/mo │ 130x │
    │ DuckDB + S3 Iceberg │ $1/mo │ $0 (local)│ ~$1/mo │ 1x │
    └─────────────────────┴───────────┴───────────┴───────────┴────────────┘
    * Snowflake: X-Small warehouse, ~1hr/day usage
    Databricks: DBU cost for SQL warehouse, ~1hr/day
    ** BigQuery: on-demand pricing, ~6TB scanned/month
    All estimates for personal/light analytics workload

    The cloud data warehouses are built for teams and enterprise scale. For a personal financial analysis setup — where you run queries a few times a day from your laptop — paying $130–$251/month for always-on infrastructure is overkill.

    DuckDB + S3 gives you the same SQL power for $1/month. No server. No cluster. No idle compute bills. Just SQL from your laptop.


    Lesson 4: Open Source Community Wins

    I submitted the extension to the DuckDB Community Extensions registry. It was merged in 2 days.

    INSTALL talib FROM community;
    LOAD talib;
    -- That's it. Works on Linux and macOS with DuckDB v1.5.2

    The DuckDB community is remarkably welcoming. The extension system is well-documented, the review process is fast, and the ecosystem is growing rapidly. For a solo developer, this kind of community support is what makes a project viable.


    Summary

                            ┌──────────────────┐
                            │  DuckDB + TA-Lib │
                            └────────┬─────────┘
                   ┌─────────────────┼─────────────────┐
                   ▼                 ▼                  ▼
        ┌────────────────┐ ┌──────────────────┐ ┌──────────────────┐
        │ AI Development │ │   Performance    │ │ Cost & Community │
        ├────────────────┤ ├──────────────────┤ ├──────────────────┤
        │ Claude as C++  │ │ 4ms local        │ │ $1/mo S3         │
        │ pair programmer│ │ 155ms S3         │ │ 251x cheaper     │
        │                │ │ 3,025x vs LLM    │ │ than Snowflake   │
        │ Python dev     │ │                  │ │                  │
        │ ships native   │ │ 14,669 tickers   │ │ DuckDB community │
        │ extension      │ │ scanned in 340ms │ │ merged in 2 days │
        │                │ │                  │ │                  │
        │ 2 days: zero   │ │ 54M rows from    │ │ Open source      │
        │ to community   │ │ a MacBook        │ │ MIT licensed     │
        └────────────────┘ └──────────────────┘ └──────────────────┘
    
    
    

    Building this extension was one of the most rewarding side projects I’ve done. The combination of:

    • AI pair programming — making unfamiliar languages accessible
    • DuckDB’s architecture — in-process, columnar, extensible
    • Cloud-native storage — S3 + Iceberg for pennies
    • Open community — fast review, broad distribution

    …creates something that wasn’t possible even a year ago: a single developer can build and ship a high-performance analytical tool that rivals what used to require a team and a budget.

    If you do financial analysis, give DuckDB a try (https://duckdb.org/) together with the talib community extension (https://duckdb.org/community_extensions/list_of_extensions):

    INSTALL talib FROM community;
    LOAD talib;

    All 100+ functions. Zero infrastructure. Just SQL.

    GitHub | Function Reference | SQL Cookbook

  • “All human knowledge is uncertain, inexact, and partial.” – Bertrand Russell


    Last week, we had a three-day intensive course — AI for Managers — at ESMT Berlin 📚. I like to call these workshops for management: they aim to solve business problems with machine-learning knowledge. I love ❤️ the process of turning a business problem into an actionable, real business decision.

    But the professor only allowed an “ancient calculator” 🧮️ for getting the real answers in the exam 😅. It was tough 💪, but memorable ⭐. Below, I write up only the sessions where I had new insights.

    #1 Start Simple, Combine Strategically

    You don’t need deep learning or neural networks for most business problems. Classical ML methods (logistic regression, clustering, factor analysis) are:

    • Interpretable — you can explain the results to non-technical stakeholders
    • Fast — run in seconds, iterate quickly
    • Powerful when combined — 1+1+1 = 10

    #2 The Real Skill Is Interpretation

    Running the algorithm is the easy part. The hard part is:

    • Naming factors meaningfully (not just “Factor 1, Factor 2”)
    • Interpreting cluster profiles (turning numbers into customer personas)
    • Translating coefficients into business recommendations

    Machine learning amplifies your judgment, it doesn’t replace it.

    As a manager, you need to make the trade-offs — for example, choosing the cut-off value.

    #3: All Models Are Wrong, But Some Are Useful

    The course drilled this into us from day one. No algorithm gives you “the answer.” Machine learning models offer insights, patterns, and probabilities—but you still need to make the decision. The real skill isn’t running the algorithm. It’s knowing when to trust it, when to question it, and how to combine it with business knowledge.

    What this means in practice in class:

    • Factor Analysis told us there are preference dimensions—but WE named them and interpreted what they meant
    • K-means gave us customer clusters—but WE decided how to target each segment
    • Logistic Regression predicted who would like the product—but WE chose the marketing strategy

    #4: Integration Beats Perfection

    Instead of searching for the “perfect” ML method, we learned to chain multiple simple techniques together.

    Think of it like cooking: you don’t need one magical ingredient. You need the right combination of ingredients, each serving its purpose.

    In the course’s Microvan project:

    • Factor Analysis = Prep work (cleaning and organizing ingredients)
    • Clustering = Understanding your audience (who are we cooking for?)
    • Logistic Regression = The final dish (what will they love?)

    Alone, each method gives limited insight. Together, they create a complete strategy.

    The key insight: most real business problems need MULTIPLE methods working together.

    #5 A Real World Automotive Case

    The Challenge

    Our automotive company was developing a new Microvan. We had:

    • Survey data from 500 potential customers
    • 15 different vehicle feature ratings (style, safety, price, etc.)
    • One crucial question: Who should we target, and what should we emphasize?

    The Problem We Faced

    Looking at 15 different variables was overwhelming. Customers who liked “modern design” also liked “sporty appearance” and “interior space” — everything was correlated. We couldn’t just build a regression model because of the high correlation (when variables are too similar, the model gets confused). So we learned a systematic approach to problem solving.

    How We Solved It

    Phase 1: Understanding What Customers Really Care About (Factor Analysis)

    The Question: “What are the underlying preference dimensions?”

    What We Did: Instead of looking at 15 separate features, we used Factor Analysis to find underlying patterns.

    Key Insight: Customers don’t think about 15 separate features. They think in a few dimensions — Style, Safety, and Value. This simplified everything. Mathematically, we converted each customer’s 15 ratings into 5 “factor scores” — a much cleaner dataset.
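    Mechanically, a factor score is just a weighted sum of standardized ratings. A minimal sketch — the feature names and loadings below are invented for illustration; the real Microvan loadings are not reproduced here:

    ```python
    # Minimal sketch of turning feature ratings into factor scores.
    # The features and loadings are invented for illustration -- the
    # actual Microvan factor loadings are not reproduced in this post.

    def standardize(xs):
        """Z-score a list of ratings (population standard deviation)."""
        mean = sum(xs) / len(xs)
        sd = (sum((x - mean) ** 2 for x in xs) / len(xs)) ** 0.5
        return [(x - mean) / sd for x in xs]

    # Three customers' ratings on four (of 15) hypothetical features.
    ratings = {
        "modern_design": [7, 3, 5],
        "sporty_look":   [8, 2, 5],
        "airbag_rating": [4, 9, 5],
        "price_value":   [3, 8, 6],
    }
    z = {feat: standardize(vals) for feat, vals in ratings.items()}

    # Hypothetical loadings: F1 = "Style", F2 = "Safety/Value".
    loadings = {
        "F1_style":  {"modern_design": 0.9, "sporty_look": 0.9},
        "F2_safety": {"airbag_rating": 0.8, "price_value": 0.7},
    }

    def factor_score(customer, factor):
        """Weighted sum of the customer's standardized ratings."""
        return sum(w * z[feat][customer]
                   for feat, w in loadings[factor].items())

    for c in range(3):
        print(c, round(factor_score(c, "F1_style"), 2),
                 round(factor_score(c, "F2_safety"), 2))
    ```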

    Regression model
    ==========================================

    Call: lm(formula = mvliking ~ F1 + F2 + F3 + F4 + F5, data = scores)

    Coefficients:
                Estimate Std. Error t value             Pr(>|t|)    
    (Intercept)   4.8425     0.1102  43.925 < 0.0000000000000002 ***
    F1            1.0277     0.1104   9.311 < 0.0000000000000002 ***
    F2            0.9989     0.1104   9.049 < 0.0000000000000002 ***
    F3            0.2091     0.1104   1.894               0.0589 .  
    F4           -0.1729     0.1104  -1.566               0.1181    
    F5           -0.5545     0.1104  -5.023          0.000000772 ***
    ---
    Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

    Residual standard error: 2.205 on 394 degrees of freedom
    Multiple R-squared:  0.3365,    Adjusted R-squared:  0.3281 
    F-statistic: 39.97 on 5 and 394 DF,  p-value: < 0.00000000000000022
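    The output above can be read as a scoring rule: predicted liking = intercept + Σ(coefficient × factor score). A small sketch using the fitted coefficients:

    ```python
    # Turning the regression output above into a scoring rule:
    # predicted liking = intercept + sum(coef_i * factor_i).
    # Coefficients are taken from the lm() output shown above.

    COEFS = {"F1": 1.0277, "F2": 0.9989, "F3": 0.2091,
             "F4": -0.1729, "F5": -0.5545}
    INTERCEPT = 4.8425

    def predicted_liking(factors):
        return INTERCEPT + sum(COEFS[f] * v for f, v in factors.items())

    # A customer one standard deviation above average on F1 and F2:
    print(round(predicted_liking({"F1": 1, "F2": 1, "F3": 0,
                                  "F4": 0, "F5": 0}), 2))  # 6.87
    ```

    F1 and F2 dominate; F5 actively hurts liking — which is exactly what drives the segment choice later.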

    Phase 2: Discovering Customer Segments (K-means Clustering)

    The Question: “Are there distinct types of customers?”

    What We Did: Using those factor scores, we ran K-means clustering to find natural customer groups.

    From these data, we built a real persona matrix:

    Cluster centers (z-score means on factors), k = 4

    Cluster     F1      F2      F3      F4      F5
    1          -0.78    0.15    0.44   -0.03   -1.30
    2          -0.89    0.05   -0.51   -0.64    0.37
    3           0.68    1.17    0.21    0.37    0.25
    4           0.63   -1.13    0.07    0.28    0.10

    Based on these inputs, my team identified a clear pattern and chose Cluster 3 as our primary go-to-market target. We had data-driven clarity on exactly who to target and what to say — this is what “explainable AI” looks like.

    All Steps of the Use Case Together

    The Magic of Integration:

    1. Factor Analysis gave us clean, interpretable dimensions
    2. Clustering used those dimensions to find natural customer groups
    3. Logistic Regression used those same dimensions to predict success
    4. Together they created a complete, actionable business strategy

    Each method reinforced the others. The factors made clustering easier. The clusters helped us understand predictions. The predictions validated our customer segmentation.

  • ✈️ When AI Broke Rose’s Heart: A Travel Insurance Story That Changed Everything, The future story of how one woman’s denied claim exposed a hidden bias in artificial intelligence 🧠⚖️ — and the data scientist who fixed it 🛠️


    ⚠️ Disclaimer: This narrative is fictional and intended for educational purposes.

    All insurance outcomes, accuracy figures 📈, and datasets 📊 are illustrative and purpose-built, not drawn from real customers.

    But bias in AI is widespread because legacy data 🗂️ often encodes historical inequities that regulation alone cannot retroactively fix ⚖️. Our goal is to show that bias can be detected 🔎, measured 📏, and mitigated 🧰 — and yes, we can fix it ✅.

    A notebook with the Python code is available for data scientists and engineers to download.

    🛫 The Perfect Trip That Wasn’t

    Rose and Jack 👫 had been planning their dream vacation to New York 🗽 for months. The flights ✈️ were booked. The hotels 🏨 were confirmed. Their suitcases 🧳 were packed with excitement for their first trip together.

    But as fate would have it, their perfect getaway took an unexpected turn ⚠️.

    🗽 Day 3 in New York City
    📍 JFK Airport, Terminal 4

    “Sir, ma’am, I’m sorry…” The airline representative’s voice trailed off as Rose’s heart sank. Their luggage — containing Jack’s camera equipment, Rose’s carefully planned outfits, and precious souvenirs — had vanished somewhere between connections.

    Total loss: $1,847 worth of belongings.

    But they had travel insurance. “At least we’re covered,” Jack said, squeezing Rose’s hand.

    They never imagined that artificial intelligence would soon judge them differently — not based on their claim, but based on who they were.


    📝 Two Identical Claims, Two Different Outcomes

    Jack’s Experience (5 days later)

    📧 Email Notification
    Subject: CLAIM APPROVED ✅

    Dear Jack,
    Your claim for $1,847 has been approved.
    Payment will be processed within 3-5 business days.

    Status: APPROVED IN 5 DAYS

    Rose’s Experience (5 days later)

    📧 Email Notification
    Subject: CLAIM UPDATE ❌

    Dear Rose,
    After careful review, we regret to inform you…
    Your claim has been denied.

    Status: REJECTED

    Same flight. Same lost luggage. Same insurance company. Different genders. Different outcomes.

    Rose was devastated. “Why would they approve Jack’s claim but deny mine?” she asked, tears welling up. “We lost the exact same things.”


    📞 The Call That Changed Everything

    “I need to speak to a manager,” Rose insisted, her voice steady despite the frustration.

    That’s when Laura, the company’s senior data scientist responsible for its AI claims system, took over the analysis. What she discovered would shake the entire organization.

    “This can’t be right…” Laura muttered, her fingers flying across the keyboard as she dove into the data.

    Laura’s Investigation Flow
    ╔══════════════════════════════════════╗
    ║         📞 Rose’s Complaint          ║
    ║    “Same claim, different result”    ║
    ╚══════════════════╦═══════════════════╝
                       ▼
    ╔══════════════════════════════════════╗
    ║       🔍 Check AI Decision Log       ║
    ║       Same confidence scores?        ║
    ╚══════════════════╦═══════════════════╝
                       ▼
    ╔══════════════════════════════════════╗
    ║     ⚠️ Gender Pattern Detected       ║
    ║        Male:   69.7% approval        ║
    ║        Female: 36.1% approval        ║
    ╚══════════════════╦═══════════════════╝
                       ▼
    ╔══════════════════════════════════════╗
    ║            🚨 BIAS ALERT!            ║
    ║      Systematic discrimination       ║
    ╚══════════════════════════════════════╝


    📊 The Shocking Discovery

    Laura pulled up the historical claims data — 2,000 insurance claims from the past year. What she found made her stomach drop:

    The Numbers Don’t Lie (2,000 claims, mock dataset)

    📊 APPROVAL RATES BY GENDER
    ┌─────────────────────────────────────┐
    │  Male Customers:    69.7% approved  │
    │  Female Customers:  36.1% approved  │
    │                                     │
    │  🤯 GAP: 33.6 percentage points     │
    └─────────────────────────────────────┘

    But Why? The Hidden Truth

    The AI wasn’t consciously discriminating — it had learned from biased historical data. Here’s what Laura discovered:

    The AI’s Learning Process (The Problem)
    ╔════════════════════════════════════════╗
    ║  📚 Historical Claims Data             ║
    ║  ├─ 70% from male customers            ║
    ║  ├─ 30% from female customers          ║
    ║  └─ Legacy of human bias               ║
    ║                  │                     ║
    ║                  ▼                     ║
    ║  🤖 AI Model Training                  ║
    ║  ├─ “Learn patterns”                   ║
    ║  ├─ Males = higher approval rate       ║
    ║  └─ Pattern becomes decision rule      ║
    ║                  │                     ║
    ║                  ▼                     ║
    ║  ⚖️ New Claims Processing              ║
    ║  ├─ Apply learned patterns             ║
    ║  ├─ Same claim, different gender       ║
    ║  └─ Different outcome = DISCRIMINATION ║
    ╚════════════════════════════════════════╝

    The heartbreaking reality: Rose wasn’t denied because her claim was invalid. She was denied because she was female — gender had become the dominant factor.


    🛠️ The Fix: Teaching AI to Be Fair

    Laura knew she couldn’t change the past data, but she could fix the future. She turned to Fairlearn, a Microsoft toolkit designed to detect and mitigate AI bias.

    Step 1: Measuring the Bias

    Before she could fix the problem, Laura needed to quantify it. She used a key metric from the Fairlearn toolkit: Demographic Parity Difference. This metric calculates the difference in approval rates between the most and least advantaged groups.

    A value close to zero means everyone has a roughly equal chance of getting their claim approved, regardless of their gender. A high value, however, signals a major problem.

    # Laura's bias detection code
    from fairlearn.metrics import demographic_parity_difference
    
    # Compare actual outcomes vs. AI predictions
    bias_score = demographic_parity_difference(
        y_true=actual_claims,
        y_pred=ai_predictions, 
        sensitive_features=customer_gender
    )
    
    print(f"Initial Bias Score: {bias_score:.3f}")
    # Result: 0.336 — an extremely high score!
    
    

    The result of 0.336 confirmed her fears. It was concrete proof that the system was heavily skewed. To make this clear to her team, she also visualized the disparity.

    [Chart: Initial Gender Bias — approval rates by gender]

    The bar chart showed that males were being approved at a rate of 69.7%, while females were only approved 36.1% of the time. This meant males had a 1.93x higher chance of getting their claim approved. The data was undeniable.
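    The metric itself is simple enough to reproduce by hand. A pure-Python sketch — fairlearn’s demographic_parity_difference is the reference implementation; this just shows the arithmetic on the story’s numbers:

    ```python
    # Demographic parity difference, computed by hand: the gap between
    # the highest and lowest group approval rates. fairlearn's
    # implementation is the reference; this is just the arithmetic.

    def demographic_parity_diff(predictions, groups):
        """Max approval rate minus min approval rate across groups."""
        rates = {}
        for g in set(groups):
            preds = [p for p, gg in zip(predictions, groups) if gg == g]
            rates[g] = sum(preds) / len(preds)
        return max(rates.values()) - min(rates.values())

    # Toy data matching the story: 1,000 male claims approved at 69.7%,
    # 1,000 female claims approved at 36.1%.
    preds = [1] * 697 + [0] * 303 + [1] * 361 + [0] * 639
    groups = ["M"] * 1000 + ["F"] * 1000
    print(round(demographic_parity_diff(preds, groups), 3))  # 0.336
    ```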

    Step 2: Benchmarking Three Mitigation Solutions

    Laura knew there was no one-size-fits-all solution for fairness. She decided to test three different mitigation strategies available in Fairlearn to find the best balance between reducing bias and maintaining the model’s accuracy.

    Here’s a summary of her findings:

    [Chart: Fairness Experiments — bias vs. accuracy for the three methods]

    Key Metrics Interpretation

    Metric                  Baseline (Biased)   After Mitigation (Fair)   Improvement
    DP Difference           0.033               0.020                     40%
    Accuracy                59.0%               57.3%                     -1.7% (minor)
    Male Approval Rate      61.0%               60.0%                     Fairer Outcome
    Female Approval Rate    57.7%               62.0%                     Fairer Outcome ✅

    Trade-offs:

    • Mitigation reduces bias but may slightly reduce accuracy
    • Different constraints optimize for different fairness notions
    • Choose based on your fairness requirements and regulatory needs

    Method 1: Demographic Parity

    This method aims for the most straightforward definition of fairness: equal approval rates for all groups. The goal is to make the selection_rate (the percentage of people approved) the same for both men and women.

    • Goal: Make approval rates identical.
    • Result: While it successfully reduced the demographic_parity_difference to 0.049 (a huge improvement!), it came at a cost. The overall accuracy of the model dropped, meaning it made more incorrect decisions for everyone.
    • Verdict: Not ideal. It achieved fairness by sacrificing too much accuracy.
    # Method 1: Forcing approval rates to be the same
    from sklearn.linear_model import LogisticRegression
    from fairlearn.reductions import ExponentiatedGradient, DemographicParity

    mitigator_dp = ExponentiatedGradient(
        estimator=LogisticRegression(),
        constraints=DemographicParity()  # Goal: Equal selection rates
    )
    # Train with: mitigator_dp.fit(X, y, sensitive_features=gender)
    

    Method 2: Equalized Odds ⭐ WINNER

    This approach is more nuanced. It aims for equal error rates across groups. In this context, it means ensuring that the rates of false positives (approving a fraudulent claim) and false negatives (denying a valid claim) are the same for both men and women.

    This is often the preferred method in scenarios like insurance or lending, where the consequences of errors are high.

    • Goal: Make sure the model makes mistakes at the same rate for everyone.
    • Result: This was the clear winner. It reduced the equalized_odds_difference to just 0.020, a 40% reduction in bias from the original model. Crucially, it did so while maintaining a strong level of accuracy.
    • Verdict: The best of both worlds — significantly fairer without compromising performance.
    # Method 2: Balancing the error rates
    from fairlearn.reductions import EqualizedOdds
    
    fair_model = ExponentiatedGradient(
        estimator=LogisticRegression(),
        constraints=EqualizedOdds()  # Goal: Equal error rates
    )
    # Train with the sensitive feature so the constraint can be enforced
    fair_model.fit(X_train, y_train, sensitive_features=sex_train)
    

    Method 3: Threshold Optimizer

    This is a post-processing technique, meaning it doesn’t retrain the model. Instead, it adjusts the decision threshold (the score needed to approve a claim) for each group separately. It’s a quicker fix but often less robust.

    • Goal: Find different approval thresholds for each group to balance outcomes.
    • Result: It offered a decent improvement, reducing bias by 27% (demographic_parity_difference of 0.045). However, it wasn’t as effective as Equalized Odds.
    • Verdict: A good quick fix, but not the most thorough solution.
    # Method 3: Adjusting the decision threshold after prediction
    from fairlearn.postprocessing import ThresholdOptimizer
    
    threshold_optimizer = ThresholdOptimizer(
        estimator=base_model,
        constraints="demographic_parity"
    )
    # Fit the per-group thresholds on held-out data
    threshold_optimizer.fit(X_train, y_train, sensitive_features=sex_train)
    
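    Fairlearn's metrics module computes these numbers for you, but the demographic parity difference itself is simple enough to verify by hand. A minimal numpy sketch, using made-up predictions and a hypothetical sensitive attribute (not the article's dataset):

    ```python
    import numpy as np

    # Hypothetical predictions and sensitive attribute (illustrative only)
    y_pred = np.array([1, 0, 1, 1, 0, 1, 0, 0, 1, 0])
    sex    = np.array(["M", "M", "M", "M", "M", "F", "F", "F", "F", "F"])

    # Selection rate = fraction of each group that gets approved
    rates = {str(g): float(y_pred[sex == g].mean()) for g in np.unique(sex)}

    # Demographic parity difference = gap between the best- and
    # worst-treated group's selection rates
    dp_difference = round(max(rates.values()) - min(rates.values()), 6)

    print(rates)          # {'F': 0.4, 'M': 0.6}
    print(dp_difference)  # 0.2
    ```

    This is exactly the quantity the mitigation strategies above try to shrink (from 0.033 down to 0.020 in the story).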

    Step 4: 📈 The Results: A Fairer Future

    Before vs. After

    🎯 APPROVAL RATES (After Fix)
    • Male customers: 60.0% approved
    • Female customers: 62.0% approved
    • ✅ Gap: -2.0% (females slightly higher approval – that’s okay!)
    • ✅ Bias reduced by 40%
    • ✅ Accuracy maintained at 57.3%

    The Business Impact

    💰 COST vs. BENEFIT ANALYSIS

    Implementation costs:
    • Development: $8,000
    • Accuracy loss: $17,000
    • Monitoring: $5,000/year
    • Total: ~$30,000

    Benefits:
    • Avoided lawsuits: $500K–$5M
    • Regulatory compliance: ✅
    • Brand protection: $100K
    • Customer trust: Priceless! 💎


    💝 The Happy Ending

    Two weeks later, Rose received an email:

    Email Notification
    Subject: CLAIM RE-EVALUATION 
    
    Dear Rose,
    After system improvements, your claim 
    has been re-evaluated and APPROVED.
    
    Payment of $1,847 is being processed.
    We apologize for the inconvenience.
    
    

    Laura’s fix was deployed company-wide. Within a month:

    • 🎯 40% reduction in gender bias
    • 💼 $0 spent on discrimination lawsuits
    • ❤️ Customer satisfaction scores improved
    • 🏆 Industry recognition for ethical AI

    🎓 Key Takeaways (What You Need to Know)

    For Everyone

    1️⃣ AI learns from historical data
    2️⃣ Historical data contains bias
    3️⃣ AI learns and repeats bias
    4️⃣ This affects real people! 😢
    5️⃣ But we CAN fix it! ✅

    For Business Leaders

    • Bias audits should be mandatory
    • Fairness metrics need tracking
    • Diverse teams build better AI
    • Transparency builds customer trust
    • Ethical AI is good business

    🌟 The Choice: Red or Blue?

    Rose and Jack’s story isn’t just about travel insurance — it’s about the future of artificial intelligence. As AI makes more decisions about our lives, ensuring fairness becomes critical.

    Path A: Ignore Bias 😈
    • Discrimination continues
    • Legal liability grows
    • Customer trust erodes
    • AI becomes a tool of oppression

    Path B: Fix Bias 😇
    • Fair decisions for all
    • Legal compliance achieved
    • Customer trust earned
    • AI becomes a force for good

    Laura chose Path B. Every day, more data scientists are choosing fairness over convenience.

    Key Takeaways

    • Bias is real, measurable, and fixable. It’s not a mysterious force; it’s a data problem we can solve.
    • Fairlearn provides effective, easy-to-use tools for both detecting and mitigating bias.
    • A small trade-off in accuracy can lead to a significant improvement in fairness.
    • Choose the right fairness constraint for your use case. Different scenarios require different definitions of “fair.”

    📚 Learn More

    For the curious minds:


    Note: This narrative is fictional and created for educational purposes. All accuracy numbers and datasets are illustrative and purpose-built, not real customer data. Bias is widespread, and legacy data cannot be rewritten by regulation alone — that’s the core challenge we must solve. The good news: with measurement, audits, better data, and mitigation techniques, we can fix it.

  • The Dance of Communication

    Communication is not just about talking — it’s about connection. In every meaningful exchange, we dance between expressing ourselves and truly hearing others.

    Fred Kofman, in “The Dance of Communication”, reminds us that good communication is less about control and more about awareness, humility, and curiosity.

    Here’s how I’ve learned to apply this idea — from my leadership course at ESMT Berlin.

    🌟 Step 1: Ask Yourself Before You Speak

    Before any conversation begins, pause and reflect. These questions help me center my mindset — not just my message:

    🌟 Focus | 🧠 Reflective Question | 💬 Purpose
    🎯 Intention | “What is my real purpose in this conversation?” | To clarify if you want to learn, solve, or just express.
    🤔 Assumptions | “What assumptions or judgments am I bringing into this talk?” | To reduce bias and stay open-minded.
    ❤️ Attitude | “Am I ready to listen with empathy and respect?” | To remind yourself to stay calm and kind.
    🧩 Outcome | “What outcome would be meaningful for both of us?” | To focus on collaboration, not competition.
    🪞 Self-Awareness | “Am I speaking to learn or to win?” | To balance advocacy 🗣️ and inquiry ❓.



    🧮 Step 2: Advocacy vs. Inquiry — Finding the Balance

    Communication thrives when we share to learn (advocacy) and listen to understand (inquiry).

    1. Speak to learn, not to win 🧘‍♂️
    2. Balance advocacy 🗣️ and inquiry ❓
    3. Lead with humility 🙇, curiosity 🔍, and respect 🤝

    🗣️ Step 3: Practice the Dance — Advocacy and Inquiry in Action

    🌟 Topic | 💡 Core Idea | 🧭 Practice / Example
    🤝 1. The Power of Conversation | Conversations can change beliefs, perceptions, and actions. | Communicate with openness ❤️ and curiosity 🧠, not control 🔒.
    🚫 2. The Problem: Unilateral Control | People think “I’m right” 👑 and steer talks alone. | Focus on shared goals 🎯, not ego. Admit others may be right too 🤔.
    ❌ 3. Symptoms of Control | Speak without reasoning 🗣️; ask rhetorical questions 🎭; hide your true views 🙊 | Be transparent 🪞 with logic, data 📊, and uncertainty ❓.
    🌱 4. Productive Mindset | “We need to learn together — I might be wrong.” | Shift from winning 🏆 to learning 📚.
    📣 5. Productive Advocacy | Express ideas clearly and humbly 🙇. | Show reasoning 🧮 & data 📑; admit doubt 😌; invite feedback 👂.
    👥 Example | ❌ “We should hire Bill.” ✅ “I think Bill fits better because of his experience — but I’d like your thoughts.” | Encourage dialogue 💬 and shared judgment ⚖️.
    🔍 6. Productive Inquiry | Listen with curiosity 👂 and without judgment 🚫. | Explain why you ask 💬; ask open questions ❓; check understanding ✅.
    💭 Good Questions | “What led you to that view?” “How do you see my role?” “Can you give an example?” | Build trust 🤝 through genuine curiosity ❤️.
    ⚖️ 7. Balance Both | Only advocacy = forcing 💥; only inquiry = hiding 🤐; both = collaboration 🤝 | Share 🗣️ + listen 👂 = learn together 📚.
    🧮 Advocacy vs. Inquiry Matrix | High + High → 🤝 collaboration & learning; High + Low → 💥 forcing; Low + High → 🕊️ accommodating; Low + Low → 🚪 withdrawing | Balance words ⚖️ and questions ❓ for growth 🌱.
    🧩 8. Handling Impasse | When stuck 😕, state the dilemma openly 🗣️ and ask for help 🆘. | Ask what might change their view 🔄; try new data or a role switch 🔁; co-create solutions 💡.
    🪞 9. Reflection Questions | 1️⃣ What’s my intention? 🎯 2️⃣ Learning or winning? 🧠 3️⃣ What are my assumptions? 💭 4️⃣ What truly matters? ❤️ | Ask these before each important talk 🗣️.

    Final Thought: The real skill in communication isn’t speaking well — it’s listening beautifully. Speak with intention 🎯, listen with curiosity 🔍, and lead with respect 🤝.

  • In modern software projects, most of our effort goes not into coding but into talking: meetings, clarifications, tickets, re-alignments, and documentation eat away precious hours. — GitHub’s Global Code Time Report

    The average developer spends about 480 − 52 = 428 minutes per day communicating and only 52 minutes coding.

    1. The Problem: 90% Communication, 10% Coding

    The root cause is our reliance on daily oral communication:

    • Unstructured communication brings cost.
    • Misunderstandings between people.
    • Massive rework and wasted effort.
    • No single source of truth.

    2. The Solution: Shift From “How” to “What”

    Do we have another way to fix it? BDD, TDD, DDD, or SDD?

    The definition: Spec-Driven Development (SDD) is a methodology that prioritizes creating a clear, structured specification, a single source of truth, before any code is written. This specification becomes the executable contract that drives, validates, and documents the entire engineering process.

    The machines not only write the Java code, they also write the user story.

    Specs become the new source code of communication.

    The process can be broken down into four clear, validated phases:

    1. Specify: A human provides a high-level description of the feature or product, and an AI agent generates a detailed, structured specification that captures the intent, behaviors, and requirements.
    2. Plan: The spec is translated into a technical plan outlining the architectural decisions, research tasks, and overall strategy for implementation.
    3. Tasks: The plan is broken down into small, reviewable, and implementable tasks.
    4. Code & Validate: AI agents generate the production code based on the tasks and specifications, and both humans and automated tests validate the outcome against the original spec.
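    To make the four phases concrete, here is a toy sketch of the pipeline as plain Python functions. The names, signatures, and return values are purely illustrative assumptions of mine, not the API of any real SDD tool:

    ```python
    def specify(idea: str) -> str:
        """Phase 1: an AI agent turns a high-level idea into a structured spec."""
        return f"Intent: {idea}\nBehaviors: ...\nRequirements: ..."

    def plan(spec: str) -> list:
        """Phase 2: translate the spec into a technical plan."""
        return ["choose architecture", "research constraints", "define contracts"]

    def tasks(plan_items: list) -> list:
        """Phase 3: break the plan into small, reviewable tasks."""
        return [f"task: {item}" for item in plan_items]

    def implement(task_list: list) -> dict:
        """Phase 4: generate code, then validate it against the original spec."""
        return {"tasks_done": len(task_list), "validated": True}

    # The four phases chain together, spec first and code last
    result = implement(tasks(plan(specify("humanoid-robot insurance claims"))))
    print(result)  # {'tasks_done': 3, 'validated': True}
    ```

    The point of the sketch: every phase consumes the output of the previous one, so the spec, not a meeting, carries the intent downstream.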

    3. The impact: Structured spec == Less talk, More build

    There are two major business impacts:

    • Faster iteration: developers spend less time in “meeting tennis” and more time on high-value tasks like system design and spec review. With structured specs, teams can hold 40% fewer meetings.
    • Less rework: by forcing a clear definition of requirements upfront, SDD reduces ambiguity, resulting in less rework and higher quality. With AWS Kiro, rework drops: 7× fewer iterations than without specs.

    4. Role Exchange: From Execution to Steering

    In the SDD paradigm, humans move up the value chain:

    Traditional Role | Future with SDD
    Developer | Writes and maintains specs; AI writes the code
    Architect | Defines context boundaries and system designs
    QA Engineer | Validates specs and outcomes via spec-based tests
    Product Manager | Prioritizes spec reuse and spec quality metrics

    The result: developers spend less time typing and more time steering.

    5. Context Engineering: How Specs Get Selected as Context

    New coding tools (Claude Code, Codex, Cursor, Qwen Code, Trae AI) can select the right specification as context automatically.

    If you’d like to go deeper, it is all about context engineering; LangChain has a good blog post explaining it: https://blog.langchain.com/context-engineering-for-agents/

    For example, in a coding CLI tool, running the compress command compacts the conversation history so the context window becomes free again.

    6. A Spec-Driven Tool (Kiro from AWS) That Is Easy to Understand

    7. Creating a New Business Idea in Practice

    An insurance company wants to introduce a new humanoid-robot insurance business model.

    Use Case: In Germany, a Figure 03 humanoid robot cooking in a Miele kitchen lost its camera vision due to smoke from overcooking, causing serious damage to the kitchen.

    Acceptance Criteria: The workflow should demonstrate policy coverage evaluation and calculate the compensation amount provided.

    8. Key Takeaways

    • Shift your mindset: from “how to code” to “what to build.”
    • Treat specs as assets, not overhead.
    • Use one SDD tool in your next sprint — measure time saved.
    • Build a community around shared specs (e.g. GDPR.md, AI_ACT.md).
    • Log improvements — time, rework, clarity.

    9. Total Time Spent on This New Idea

    10. Get Started (Technical Details)

    We are using an open-source stack: spec-kit and the Qwen CLI.

    uv tool install specify-cli
    npm install -g @qwen-code/qwen-code@latest
    
    cd your_project_folder
    specify check # it supports many tools, but you need to install the one you like
    
    specify init .
    qwen # code CLI, https://github.com/QwenLM/qwen-code; you will find several new specify commands injected into the code tool
    

    Then run another four commands inside the coding agent:

    specify.specify
    specify.plan
    specify.tasks
    specify.implement
    

    and we will have turned the new business idea into reality.

    Results Sharing

    1. Project tree with clean architecture

    2. OpenAPI 3.0 contract

    3. Data model design

    4. Project timeline

    5. User story analysis

    6. Business data flow

    Closing Words

    Thanks for reading! The tools aren’t perfect, not yet, but they’ve helped us immensely in transforming ideas into working products faster than ever before. Martin Fowler’s team published a high-level overview of the tools; read it for another architecture team’s view: https://martinfowler.com/articles/exploring-gen-ai/sdd-3-tools.html

  • Summary

    Robot programming is still challenging — even for senior developers and kids! Why? Because it combines the physical world 🌍 with the software world 💻.

    AI hasn’t conquered the physical world yet 🤖🌍 — sensors, motors, and real-world variables still surprise us every day! ⚙️🔋🛞

    In the end, we must face the real-world chaos head-on, squash every bug 🐞, and win the game 🏆!


    📖 Our Story

    • ⏳ We started in October 2024 with zero experience and no equipment.
    • 🎄 After the Christmas break, we bought the devices and map and gained a basic understanding of the system, devices, and game rules.
    • 📋 We received the task in January 2025.
    • 🔧 From April 2025, we began intensive task preparation.
    • 🎯 Before the final competition, we could solve 100% of the task.
    • 👧👧 Two amazing girls scored 100 points on their own
    • 🥈🎉 2nd place in their group!
    • 🤝 They kept their friendship through difficulties, crying and laughing side by side 😢😂💪

    ⚠️ Key Lessons

    • Don’t rely on the color sensor – it’s inconsistent and unreliable under different light conditions.
    • ⚙️ The LEGO SPIKE set doesn’t behave as expected due to physical instability:
      → lighting 💡, wheels 🛞, ground texture 🪵, battery levels 🔋 all affect performance!
    • 🧱 Avoid block programming – it’s a nightmare for:
      1. debugging 🐛
      2. maintenance 🛠️
      3. understanding logic 🧠
    • 🐍 Use Python instead – it’s clearer, more scalable, and more logical ✨
    • 🕐 Speed matters! Knowing is not enough – you must finish fast to win ⏱️
    • 🗺️ Strategy over perfection – focus on solving the task effectively within limited time.
    • 🌟 Curiosity beats competition – stay passionate and explore, especially for young girls 💪👩‍🔬

    Screenshot

    🚀 Actions of 2026 WRO

    • ✅ Use Python for better logic and code management
    • 🔁 Practice with tasks from previous years for broad experience
    • 🤖 Leverage AI tools (like Copilot) to help the team learn independently
    • 💡 Be ambitious, stay curious, and encourage girls to lead boldly! 👧🚀💕
    • 🧑‍🏫👩‍🏫 To become a better coach 🤝, focus on being patient 🧘‍♂️, and connect with other coaches to learn and grow together 📈.

  • The Hard Problem in Database Migrations

    Online App: https://huggingface.co/spaces/neuesql/sqlgptapp

    The Problem

    An analysis of solutions for a simple Oracle-to-PostgreSQL data migration.

    Y = meets the need, X = not at all, P = partially

    Provider | Schema migrate | Function migrate | Stored procedure migrate
    AWS DMS | Y | X | X
    Google DMS | Y | X | X
    Azure DMS | Y | X | X
    Ora2pg | Y | P | P
    jOOQ | Y | P | P

    (DMS = Database Migration Service)

    Migrating functions and stored procedures from Oracle to PostgreSQL is the most complicated task; even Google, AWS, and Azure do not support it at the scale large enterprises demand. The millions in migration and transition costs are always the main blocker to moving the technology forward.

    Objectives

    • Cover all the database features: views, indexes, schemas, stored procedures, functions, etc.
    • Know the limitations when a function has no equivalent on the target database.
    • Automate with a test pipeline.

    Solutions

    The solution is inspired by the OpenAI GPT models: convert the problem from a database domain-specific language compiler into a general NLP problem.

    Features

    • No coding of a SQL compiler or converter in a DSL.
    • Based on large language models (LLMs), like OpenAI GPT, Google T5, or Facebook LLaMA.
    • SQL GPT can be adapted to different databases with different datasets.
    • Supports any data objects: tables, views, indexes, packages, partitions, procedures, functions, triggers, types, sequences, materialized views, database links, schedulers, GIS, etc.
    • Reinforcement learning from human (DBA) feedback for the language model. OpenAI Paper

    Roadmap

    Version 1: SQL GPT verifies the feasibility of this design with the OpenAI GPT model and API.

    Version 2: extend to open-source models like Google’s T5, plus clean datasets, to build for enterprise demand.

    System Architecture (planned)

    • There are several components: SQLCollector, DataGenerator, SQLGPT Service, and more.
    • SQLCollector: web service that receives the source SQL.
    • DataGenerator: generates dummy data for the schema.
    • SQLGPT Service: core service that generates the target SQL.
    • Models: V1 from the OpenAI model; V2 from the Google T5 model.
    • SQLTrainer: trains the model with reinforcement learning from new human feedback.
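    Since there is no DSL compiler, the core of the SQLGPT Service reduces to building a prompt and sending it to an LLM endpoint. A minimal sketch of the prompt-construction step; the function name and prompt wording are my assumptions, not the project’s actual code:

    ```python
    def build_translation_prompt(source_sql: str,
                                 source_dialect: str = "Oracle PL/SQL",
                                 target_dialect: str = "PostgreSQL PL/pgSQL") -> str:
        """Assemble the instruction SQLGPT would send to the LLM (illustrative)."""
        return (
            f"Translate the following {source_dialect} code into semantically "
            f"equivalent {target_dialect}. Return only the SQL.\n\n{source_sql}"
        )

    prompt = build_translation_prompt(
        "SELECT id, client_id FROM customer WHERE rownum <= 100;"
    )
    # The prompt then goes to the completion endpoint (the OpenAI API in V1,
    # a fine-tuned T5 model in V2), and human (DBA) feedback on the answer
    # feeds the SQLTrainer loop.
    print(prompt)
    ```

    The design choice is that all dialect knowledge lives in the model and the prompt, not in hand-written grammar rules.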

    Examples

    Example 1: selecting the top 100 customers, from Oracle SQL to PostgreSQL

    ### Oracle
    -- note: ROWNUM is applied before ORDER BY, so a true top-N needs a subquery
    SELECT id, client_id
    FROM customer
    WHERE rownum <= 100
    ORDER BY create_time DESC;
    
    ### Postgresql
    SELECT id, client_id
    FROM customer
    ORDER BY create_time DESC
    LIMIT 100;
    

    Example 2: transforming a procedure from Oracle PL/SQL into PostgreSQL PL/pgSQL

    ### Oracle
    CREATE OR REPLACE PROCEDURE print_contact(
        in_customer_id NUMBER
    )
        IS
        r_contact contacts%ROWTYPE;
    BEGIN
    
        SELECT *
        INTO r_contact
        FROM contacts
        WHERE customer_id = in_customer_id;
    
        dbms_output.put_line(r_contact.first_name || ' ' ||
                             r_contact.last_name || '<' || r_contact.email || '>');
    
    EXCEPTION
        WHEN OTHERS THEN
            dbms_output.put_line(SQLERRM);
    END;
    
    ### Postgresql
    -- A postgresql PG/SQL Procedure
    create procedure print_contact(IN in_customer_id integer)
        language plpgsql
    as
    $$
    DECLARE
        r_contact contacts%ROWTYPE;
    BEGIN
        -- get contact based on customer id
        SELECT *
        INTO r_contact
        FROM contacts
        WHERE customer_id = in_customer_id;
    
        -- print out contact's information
        RAISE NOTICE '% %<%>', r_contact.first_name, r_contact.last_name, r_contact.email;
    EXCEPTION
        WHEN OTHERS THEN
            RAISE EXCEPTION '%', SQLERRM;
    END;
    $$;
    

    Limitations

    The SQL context is currently limited by the LLM’s maximum token size, and the T5 model needs different datasets to be fed. We are testing Facebook’s LLaMA model next.

    Source Code

    GitHub introduction: https://github.com/neuesql/sqlgpt

    Github T5 model: https://github.com/neuesql/sqltransformer

    T5-Endpoint: https://huggingface.co/neuesql/sqltransformer

  • Make infrastructure as code testable and callable

    Infrastructure as code is not programming, it’s only configuration.

    — as a software developer

    With the big cloud-migration wave, a lot of effort goes into cloud provisioning and configuration. There is a modern name for this: “Infrastructure as Code.” But more than a decade after AWS entered the market, it is still at an early stage from a software engineer’s point of view.

    The biggest market winner is https://www.terraform.io/, which delivers infrastructure easily. But in the same year, 2014, AWS released the boto3 API, https://pypi.org/project/boto3/0.0.1/, which can also provision any AWS service in Python code easily:

    import boto3

    # An S3 "resource" is the high-level boto3 interface
    s3 = boto3.resource('s3')
    bucket = s3.create_bucket(
        Bucket='examplebucket',
        CreateBucketConfiguration={
            'LocationConstraint': 'eu-west-1',
        },
    )
    
    print(bucket)

    What does “simplicity” mean from Terraform’s point of view? I have a different opinion. Is boto3 really more complicated?

    The main disadvantage of Terraform is that its DSL can’t easily be tested with unit tests, mock services, integration tests, or E2E tests the way modern languages like Python and Java can. Many infra engineers and architects keep challenging me: why do we need tests? It sounds like a joke.
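    By contrast, when infrastructure is plain Python (boto3, CDK, Pulumi), standard test tooling applies directly. A minimal sketch using only the standard library’s unittest.mock; the create_bucket helper below is a hypothetical example of mine, not a real library function:

    ```python
    from unittest.mock import MagicMock

    def create_bucket(s3, name, region="eu-west-1"):
        """Hypothetical provisioning helper, written the boto3 way."""
        return s3.create_bucket(
            Bucket=name,
            CreateBucketConfiguration={"LocationConstraint": region},
        )

    # Unit test with a mock: no cloud account, no network, runs in milliseconds
    s3 = MagicMock()
    create_bucket(s3, "examplebucket")
    s3.create_bucket.assert_called_once_with(
        Bucket="examplebucket",
        CreateBucketConfiguration={"LocationConstraint": "eu-west-1"},
    )
    ```

    This is exactly the kind of fast, isolated feedback loop that an HCL-only workflow cannot give you.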

    However, some innovative solutions are emerging:

    1. AWS CDK https://aws.amazon.com/cdk/
    2. Pulumi https://www.pulumi.com/
    3. Terraform CDK https://www.terraform.io/cdktf

    These bring modern programming productivity to infrastructure, making it real code with real quality.

    import pulumi
    from pulumi_aws import s3
    
    # Create an AWS resource (S3 Bucket)
    bucket = s3.Bucket('my-bucket')
    
    # Export the name of the bucket
    pulumi.export('bucket_name',  bucket.id)

    As of today, there is probably a lot of legacy Terraform code or modules in your organization that you can’t drop directly. I got my hands dirty finding a testable way to maintain TF code. And today, SonarQube supports TF code checks for AWS.

    1. Overview of Test Cost

    Cost to Run Test

    2. Unit Test in Terraform

    These are not really unit tests; they are closer to syntax checks and plan explanations.

    terraform fmt -check
    tflint
    terraform validate
    terraform plan

    3. Integration Tests in Terraform

    In Terraform, you can run apply to deploy your code directly as an integration test.

    terraform apply
    terraform destroy  # don't forget to release the resources

    But you can also use an advanced integration-test framework such as Terratest or kitchen-terraform. Here is a test example that provisions a PostgreSQL server:

    
    func TestPostgresqlServer(t *testing.T) {
    	// tfFilePath, uniquePostfix, expectedUser and expectedPassword
    	// are fixtures defined in the full test file
    	terraformOptions := terraform.WithDefaultRetryableErrors(t, &terraform.Options{
    		TerraformDir: tfFilePath,
    		Vars: map[string]interface{}{
    			"postfix":     uniquePostfix,
    			"db_user":     expectedUser,
    			"db_password": expectedPassword,
    		},
    		NoColor: false,
    	})
    	subscriptionID := "xxx"
    
    	defer terraform.Destroy(t, terraformOptions)
    
    	terraform.InitAndApply(t, terraformOptions)
    	expectedServername := "postgresqlserver-" + uniquePostfix // see fixture
    	actualServername := terraform.Output(t, terraformOptions, "servername")
    	rgName := terraform.Output(t, terraformOptions, "rgname")
    	expectedSkuName := terraform.Output(t, terraformOptions, "sku_name")
    	actualServer := azure.GetPostgreSQLServer(t, rgName, actualServername, subscriptionID)
    	actualServerAddress := *actualServer.ServerProperties.FullyQualifiedDomainName
    	actualServerUser := *actualServer.ServerProperties.AdministratorLogin
    
    	// Expectation
    	assert.NotNil(t, actualServer)
    	assert.Equal(t, expectedUser, actualServerUser)
    	assert.Equal(t, expectedServername, actualServername)
    	assert.Equal(t, expectedSkuName, *actualServer.Sku.Name)
    
    	// End-to-end check: actually connect to the provisioned database
    	ConnectDB(t, expectedUser, expectedPassword, actualServerAddress, actualServername)
    }

    With Go’s database/sql library, you can run real SQL for an end-to-end test, like this:

    func ConnectDB(t *testing.T, userName string, expectedPassword string, databaseAddress string, actualServername string) {
    	var connectionString string = fmt.Sprintf("host=%s user=%s password=%s dbname=%s sslmode=require", databaseAddress, userName+"@"+actualServername, expectedPassword, "postgres")
    	print(connectionString)
    	db, err := sql.Open("postgres", connectionString)
    	assert.Nil(t, err, "open db failed")
    	err = db.Ping()
    	assert.Nil(t, err, "connect db failed")
    	fmt.Println("Successfully created connection to database")
    	var currentTime string
    	err = db.QueryRow("select now()").Scan(&currentTime)
    	assert.Nil(t, err, "query failed ")
    	assert.NotEmpty(t, currentTime, "Get Query Time "+currentTime)
    }

    The full GitHub code is here: https://github.com/wuqunfei/tfmodule-azure-resource-postgresql/blob/main/test/mod_test.go#L56

    4. Policy Test in Terraform

    In recent years, IT compliance and security have also become code, applied when you provision your infrastructure. Azure has an example blog post: https://learn.microsoft.com/en-us/azure/developer/terraform/best-practices-compliance-testing

    terraform show -json main.tfplan > main.tfplan.json
    
    docker run --rm -v $PWD:/target -it eerkunt/terraform-compliance -f features -p main.tfplan.json
    
    # https://github.com/terraform-compliance/cli  a lightweight, security focused, BDD test framework against terraform.
    

    5. Terraform Code Quality Check

    Since last year, SonarQube has supported Terraform grammar and security checks. It helps us reduce a lot of manual setup and checking.

    For example, SonarQube flags this code with: omitting “public_network_access_enabled” allows network access from the Internet; make sure it is safe here.

    # public_network_access_enabled = true
    # default is true, need to set false
    public_network_access_enabled = false

    Developers can’t find this easily, because public network access is enabled by default.

    6. Making Terraform code callable

    1. Shell spaghetti: with some shell scripts, GitHub CI/CD, or Jenkins, the provisioning can be automated quickly and easily.

    2. Integrating with automation tools: we can also use the Ansible Terraform module and define a playbook to run automatically (Ansible TF module link). Ansible Tower can expose its API easily to automation tools.

    3. Crossplane Open API definition

    It is a new design from Crossplane, in which every component can be called through an OpenAPI definition.

      versions:
      - name: v1alpha1
        served: true
        referenceable: true
        schema:
          openAPIV3Schema:
            type: object
            properties:
              spec:
                type: object
                properties:
                  parameters:
                    type: object
                    properties:
                      storageGB:
                        type: integer
                    required:
                      - storageGB
                required:
                  - parameters

    Original code is here https://github.com/crossplane/crossplane/blob/master/docs/getting-started/create-configuration.md

    Summary

    Writing infrastructure code is straightforward, but the high-quality software mindset behind it is still missing in much of daily cloud engineering. In this blog, I summarized the existing market options; it may help anyone who wants to improve automation and quality in their daily infrastructure work.

  • Centralizing CI/CD pipeline in a big organization

    Don’t repeat yourself

    https://en.wikipedia.org/wiki/Don%27t_repeat_yourself

    Using GitLab CI/CD or GitHub Actions, you can build a workflow for a single project quickly. There is always a pipeline file like .gitlab-ci.yml or .github/workflows in each individual project. But each developer or team needs to maintain and update it regularly when there are bugs or changes.

    In a big organization, it would be better to have a centralized team (e.g., Cloud Platform, Performance & Reliability Engineering, Engineering Tools) develop standard tooling and infrastructure to solve every development team’s problems.

    https://netflixtechblog.com/full-cycle-developers-at-netflix-a08c31f83249

    Pipelines for different applications are a common demand across software engineering, data engineering, and data science. Centralizing them helps business developers focus on their business logic, while a centralized team supports the most up-to-date stacks in the organization.

    A pipeline is code. So how can we share pipeline code and combine it with business code quickly and smoothly? This article shares three patterns and three anti-patterns, using Jenkins features, to achieve this goal. But it does not limit you: modern cloud CI/CD tools like Azure Pipelines, AWS CodePipeline, or Google Cloud Build work just as well.

    Patterns:

    1. Set boundaries and separate responsibilities. Let specialists do the professional job.
    2. Centralize pipelines in one git repository, with a centralized team to maintain, update, and fix them. CI can combine source code, pipeline, and deployment in one build process.
    3. Give end developers the possibility and flexibility to customize features and add new ideas.

    Anti-Patterns:

    1. Anyone can do anything. Asking data scientists to write a service deployment pipeline is too expensive; conversely, asking DevOps to write an NLP pipeline is just as challenging.
    2. Each project has its own pipeline. How do you fix >100 repositories with the same legacy pipeline simultaneously? Don’t repeat yourself.
    3. Control over innovation. If the centralized team lacks the capacity or passion to take on the responsibility, a mess follows. The organization is better off with a mature approval model that governs innovation instead of killing invention.

    Economy effects

    Consider an organization with 1000 developers, where each team has over 10 projects running in production, as in today’s microservice pattern. A pipeline bug fix takes a specialist 2 hours. A general developer needs 4 hours the first time, to understand the context and solve it; with that experience, the same pipeline in other projects can be fixed in 1 hour each.

    Hours = 1000 (developers) × (1 (first project) × 4 h + 9 (other projects) × 1 h)
          = 1000 × (4 + 9)
          = 13,000 h

    Using the average developer salary in Munich, Germany of €7,000/month:

    Cost = 13,000 h × €7,000 / 22 (days) / 8 (hours)
         ≈ €517,045

    At economies of scale, one specialist fixing a bug once in the centralized pipeline repository saves half a million euros on a 2-hour bug in a 1000-developer organization.
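    The back-of-the-envelope numbers above are easy to check in a few lines of Python:

    ```python
    developers = 1000
    hours = developers * (1 * 4 + 9 * 1)  # first project: 4 h; nine more: 1 h each
    hourly_rate = 7000 / 22 / 8           # €7000/month over 22 days of 8 hours
    cost = hours * hourly_rate

    print(hours)        # 13000
    print(round(cost))  # 517045
    ```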

    Let’s do it in action.

    Use case: build a pipeline that supports a Python Flask web service and deploys it to Azure Kubernetes Service.

    1. Create two repositories, one for source code and one for pipeline code.
      1. https://github.com/wuqunfei/jenkins_ai_pipelines
      2. https://github.com/wuqunfei/ocr_service
    2. Use the Jenkins DSL to create similar pipelines for the same type of workflow with parameters, e.g., a Spring Boot application, a Python web application, etc.
      1. ocr_service is a classic Python Flask web service
    3. Combine pipeline code and source code in the same CI job:
    //github server setting
    String github_token_credential = "git-token-credentials"
    String github_host = "github.com"
    
    //central pipeline repository
    String pipeline_repository = "wuqunfei/jenkins_ai_pipelines"
    String pipeline_jenkins_file = "Jenkinsfile.py.aks.groovy"
    
    //application source code
    String source_code_repository_url = "https://github.com/wuqunfei/ocr_service"
    String source_code_branch = "main"
    
    //Azure ACR and AKS
    String acr_name = "ocr"
    String acr_credential = "acr_credential"
    String aks_kubeconfig_file_credential = "k8s"
    
    
    //Application
    String application_name = "pysimple"
    
    pipelineJob("ocr-service-builder") {
        parameters {
    
            stringParam('github_token_credential', github_token_credential, 'Github token credential id')
    
            stringParam("application_name", application_name, "application_name for docker image")
            stringParam("source_code_repository_url", source_code_repository_url, "Application Source Code HTTP URL")
            stringParam("source_code_branch", source_code_branch, "Application Source Code Branch, default main")
    
    
            stringParam("pipeline_repository", pipeline_repository, "pipeline github project name")
            stringParam("pipeline_jenkins_file", pipeline_jenkins_file, 'pipeline file')
    
    
            stringParam("acr_name", acr_name, "Azure Container Registry name for docker image")
            stringParam("acr_credential", acr_credential, "Azure Container credential(user/pwd) id in jenkins ")
            stringParam("aks_kubeconfig_file_credential",aks_kubeconfig_file_credential, "Azure AKS kubeconfig file credential id in Jenkins" )
    
        }
        definition {
            cpsScm {
                scm {
                    git {
                        remote {
                            github(pipeline_repository, "https", github_host)
                            credentials(github_token_credential)
                        }
                    }
                }
                scriptPath(pipeline_jenkins_file)
            }
        }
    }
    

    Jenkins DSL API https://jenkinsci.github.io/job-dsl-plugin/#path/pipelineJob

    pipeline {
        agent any
        stages {
            stage('Checkout Source Code and Deployment Code') {
                steps {
    
                    git branch: "${params.source_code_branch}", credentialsId: "${params.github_token_credential}", url: "${params.source_code_repository_url}"
                    echo "Checkout source code done ${params.source_code_repository_url}"
    
                }
            }
            stage("Test Code"){
                steps{
                    echo "Test code"
                }
            }
            stage("Build Code"){
                steps{
                    echo "application build done"
                }
            }
            stage("Docker Build"){
                steps{
                    script {
                        dockerImage = docker.build("${params.application_name}:${env.BUILD_ID}")
                    }
                    echo "docker build done"
                }
            }
            stage("Docker Publish ACR"){
                steps{
                    script{
                        docker_register_url =  "https://${params.acr_name}.azurecr.io"
                        docker.withRegistry( docker_register_url, "${params.acr_credential}" ) {
                            dockerImage.push("latest")
                        }
                    }
                    echo "docker push done"
                }
            }
            stage("Kubernetes Deploy"){
                steps{
                    withCredentials([kubeconfigContent(credentialsId: "${params.aks_kubeconfig_file_credential}", variable: 'kubeconfig_file')]) {
                        // dir() and writeFile do not expand '~', so create the directory
                        // via the shell and write the config using the HOME env var
                        sh 'mkdir -p "$HOME/.kube"'
                        writeFile file: "${env.HOME}/.kube/config", text: "$kubeconfig_file"
                        sh 'cat ~/.kube/config'
                        echo "K8s deploy is done"
                    }
                }
            }
            stage("Service Health Check"){
                steps{
                    echo "Service is up"
                }
            }
        }
    }
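    The “Service Health Check” stage above is only a stub. One way to implement it, assuming the deployed service exposes an HTTP health endpoint, is a retry loop that polls until the service answers. A minimal sketch in Python (the function names, retry counts, and the stubbed check are illustrative, not part of the original pipeline):

    ```python
    import time

    def wait_until_healthy(check, retries=5, delay_s=0.0):
        """Call `check` until it returns True or retries are exhausted.

        `check` is any zero-argument callable, e.g. an HTTP GET against the
        service's health endpoint wrapped to return True on HTTP 200.
        """
        for _attempt in range(retries):
            if check():
                return True
            time.sleep(delay_s)  # back off before the next attempt
        return False

    # Example with a stubbed check that succeeds on the third attempt:
    calls = {"n": 0}
    def flaky_check():
        calls["n"] += 1
        return calls["n"] >= 3

    print(wait_until_healthy(flaky_check, retries=5))  # True
    ```

    In the Jenkinsfile the same idea would typically live in an `sh` step that curls the endpoint in a loop and fails the build on timeout.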
    

    The implementation is straightforward, but getting team members and managers to comprehend it takes much more time. One company I worked for took at least two years to mature this idea, with some fantastic architects pushing the notion of a “pipeline-driven organization”: https://www.infoq.com/articles/pipeline-driven-organization/.

    I hope my experience inspires you to apply this in your organization, whether with Jenkins or another cloud CI/CD tool.

    References:

    https://github.com/wuqunfei/jenkins_ai_pipelines

    https://www.digitalocean.com/community/tutorials/how-to-automate-jenkins-job-configuration-using-job-dsl

  • A Big Package of “Architecture” Principles and Manifestos

    Defining a set of guiding principles is an important first step of any strategy.

    — Cloud Strategy, Gregor Hohpe

    1. Agile manifesto

    We are uncovering better ways of developing
    software by doing it and helping others do it.
    Through this work we have come to value:

    1. Individuals and interactions over processes and tools
    2. Working software over comprehensive documentation
    3. Customer collaboration over contract negotiation
    4. Responding to change over following a plan

    That is, while there is value in the items on
    the right, we value the items on the left more.

    https://agilemanifesto.org/iso/en/manifesto.html

    2. 21 principles of enterprise architecture

    Four categories of principles

    • General principles
    • Information principles
    • Application principles
    • Technology principles

    General principles

    1. IT and business alignment
    2. Maximum benefits at the lowest cost and risk
    3. Business continuity
    4. Compliance with standards and policies
    5. Adoption of the best practices for the market

    Information principles

    1. Information treated as an asset
    2. Shared information
    3. Accessible information
    4. Common terminology and data definitions
    5. Information security

    Application principles

    1. Technological independence
    2. Easy-to-use applications
    3. Component reusability and simplicity
    4. Adaptability and flexibility
    5. Convergence with the enterprise architecture
    6. Enterprise architecture also applies to external applications
    7. Low-coupling interfaces
    8. Adherence to functional domains

    Technology principles

    1. Changes based on requirements
    2. Control of technical diversity and suppliers
    3. Interoperability

    https://developer.ibm.com/articles/enterprise-architecture-financial-sector/

    3. The 6 pillars of the AWS Well-Architected Framework

    1. Operational Excellence
    2. Security
    3. Reliability
    4. Performance Efficiency
    5. Cost Optimization
    6. Sustainability

    https://aws.amazon.com/cn/blogs/apn/the-6-pillars-of-the-aws-well-architected-framework/

    4. The Twelve Factors

    1. Codebase
      One codebase tracked in revision control, many deploys
    2. Dependencies
      Explicitly declare and isolate dependencies
    3. Config
      Store config in the environment
    4. Backing services
      Treat backing services as attached resources
    5. Build, release, run
      Strictly separate build and run stages
    6. Processes
      Execute the app as one or more stateless processes
    7. Port binding
      Export services via port binding
    8. Concurrency
      Scale out via the process model
    9. Disposability
      Maximize robustness with fast startup and graceful shutdown
    10. Dev/prod parity
      Keep development, staging, and production as similar as possible
    11. Logs
      Treat logs as event streams
    12. Admin processes
      Run admin/management tasks as one-off processes

    https://12factor.net/
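    Factor 3 (“store config in the environment”) is the one the pipeline above depends on most directly, since the same image must run in staging and production. A minimal sketch in Python, with illustrative variable names and defaults:

    ```python
    import os

    # Factor 3: read configuration from the environment instead of hardcoding it.
    # DATABASE_URL, DEBUG, and PORT are illustrative names, not a real service's.
    def load_config(env=os.environ):
        return {
            "database_url": env.get("DATABASE_URL", "sqlite:///local.db"),
            "debug": env.get("DEBUG", "false").lower() == "true",
            "port": int(env.get("PORT", "8000")),
        }

    # In Kubernetes, these values would come from the pod spec's env/ConfigMap,
    # so the same image behaves differently per deployment target.
    config = load_config({"PORT": "5000", "DEBUG": "true"})
    print(config["port"], config["debug"])  # 5000 True
    ```

    The payoff for the centralized pipeline is that the Docker image built once in the “Docker Build” stage never needs rebuilding per environment.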

    5. Manifesto for software craftsmanship

    As aspiring Software Craftsmen we are raising the bar of professional software development by practicing it and helping others learn the craft. Through this work we have come to value:

    1. Not only working software, but also well-crafted software.
    2. Not only responding to change, but also steadily adding value
    3. Not only individuals and interactions, but also a community of professionals
    4. Not only customer collaboration, but also productive partnerships

    That is, in pursuit of the items on the left we have found the items on the right to be indispensable.

    https://manifesto.softwarecraftsmanship.org/#/en