Why this role is different
Most engineering jobs ask you to maintain legacy systems or add features to crowded codebases. This isn't that.
You'll be one of the founding engineers shaping two products that sit at the intersection of AI, enterprise software, and infrastructure — at a moment when those three things are being rewritten from scratch. The code you write in your first six months will still be running in production five years from now. The architectural decisions you make will shape how thousands of Korean small businesses experience AI, and how GPU compute is delivered across Southeast Asia.
We don't believe in 200-person engineering orgs where you ship one feature per quarter. We believe in small teams of strong engineers who own entire systems end-to-end, ship fast, and use AI tools to operate like teams 5x their size.
The two products you'll build
CoSAP — AI-Powered Business Intelligence for Korean SMEs
Korean small and medium businesses run on Douzone and ECount — ERP systems that hold decades of accounting and operational data, but expose it through forms and reports built for accountants, not founders. CoSAP changes that. We let business owners ask questions in plain Korean — "왜 이번 달 마진이 떨어졌어?" (why did our margin drop this month?) — and get answers backed by their actual financial data.
Under the hood: a multi-tenant data layer connecting Douzone and ECount, retrieval over financial and operational data using PostgreSQL with pgvector and Qdrant, event streaming with Kafka, workflow orchestration with Kestra, and self-hosted LLM serving with vLLM. The Korean SME market is 7+ million businesses. Almost none of them have access to real business intelligence today.
Fusionflow — GPU & AI Infrastructure Orchestration
GPUs are the most expensive, most contested compute resource on earth right now. We operate Kubernetes-based GPU clusters across multiple data centers and turn them into reliable, multi-tenant compute that customers can actually use. Fusionflow is the orchestration layer: scheduling, isolation, observability, and operations.
You'll work on real distributed systems problems — GPU scheduling under contention, node health and failure recovery, network performance tuning across InfiniBand and RoCE, tenant isolation, and the operational tooling needed to run hundreds of GPUs reliably. We currently operate a 19-node K3s cluster with 160+ GPUs and are scaling significantly through 2026.
What you'll actually do
Concretely — not buzzwords, but the kind of tickets you'll close in your first year:
• Build the next-generation ERP connector layer for CoSAP — designing schemas, sync logic, and conflict resolution for ECount and Douzone data flowing into our analytics layer.
• Improve retrieval quality for natural-language financial queries — chunking strategies, embedding models, hybrid search, and evaluation pipelines.
• Design and operate Fusionflow's GPU scheduling logic — how do we fairly allocate scarce GPU resources across tenants while keeping latency low?
• Harden cluster operations — node health monitoring, automated remediation, network tuning, DDoS mitigation, observability.
• Write production-grade APIs and internal tooling in Python and TypeScript that other engineers, ops staff, and customers depend on.
• Review AI-generated code with the same rigor as human code — catching subtle bugs, hallucinated APIs, security issues, and performance regressions.
• Be on-call for systems you build. Debug real outages. Write the postmortem. Ship the fix.
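To make one of those tickets concrete: the fair-allocation question in the GPU scheduling item can be sketched as weighted fair sharing, where each free GPU goes to the tenant currently furthest below its weighted share. This is an illustrative toy, not Fusionflow's actual scheduler; every name and structure here is invented for the example:

```python
from dataclasses import dataclass, field
import heapq

@dataclass(order=True)
class Tenant:
    # Tenants are ordered by usage-to-weight ratio; the lowest ratio is
    # served first, approximating weighted fair sharing under contention.
    fair_share_key: float
    name: str = field(compare=False)
    weight: float = field(compare=False)
    gpus_used: int = field(compare=False, default=0)

def allocate(tenants, free_gpus):
    """Greedily hand out free GPUs one at a time to the tenant that is
    furthest below its weighted fair share."""
    heap = [Tenant(t["used"] / t["weight"], t["name"], t["weight"], t["used"])
            for t in tenants]
    heapq.heapify(heap)
    grants = {t["name"]: 0 for t in tenants}
    for _ in range(free_gpus):
        t = heapq.heappop(heap)
        grants[t.name] += 1
        t.gpus_used += 1
        t.fair_share_key = t.gpus_used / t.weight
        heapq.heappush(heap, t)
    return grants
```

The real problem adds preemption, gang scheduling, and topology awareness, but the core trade-off (fairness vs. utilization under scarcity) already shows up in this twenty-line version.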
Who we're looking for
In the AI era, raw coding skill is no longer the bottleneck. Almost anyone can produce working code with Claude Code or Cursor. What separates strong engineers from average ones now is judgment, taste, and verification discipline. Here's what we actually screen for, in priority order:
1. Verification instinct — the ability to spot when AI output is wrong
This is the #1 filter. AI tools produce confident, plausible-looking code that is sometimes subtly broken — hallucinated APIs, wrong async behavior, race conditions, security holes, schema mismatches. The best engineers today treat AI output as a draft from a fast, overconfident junior, not as truth. We will test this in the interview.
2. Strong fundamentals in distributed systems
You understand consistency, idempotency, retries, queues, timeouts, and what happens when networks partition. You've worked with Kubernetes, Kafka (or equivalent), and PostgreSQL in production — not just in tutorials. You know why SELECT FOR UPDATE exists and when to use it.
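Two of those primitives, retries with backoff and idempotent consumption, are the kind of thing we expect you to reach for by reflex. A minimal sketch, purely illustrative (the class and exception names are invented for this example, not part of our codebase):

```python
import time

class TransientError(Exception):
    """A failure worth retrying, e.g. a timeout or a dropped connection."""

def with_retries(fn, *, attempts=3, base_delay=0.01):
    """Retry a flaky call with exponential backoff. Only safe if fn is
    idempotent: the caller must guarantee repeated calls have one effect."""
    for attempt in range(attempts):
        try:
            return fn()
        except TransientError:
            if attempt == attempts - 1:
                raise
            time.sleep(base_delay * 2 ** attempt)

class IdempotentConsumer:
    """Deduplicate by message key so a redelivered event is a no-op
    instead of, say, a double charge."""
    def __init__(self):
        self.seen = set()
        self.applied = []

    def handle(self, key, payload):
        if key in self.seen:
            return  # duplicate delivery, already applied
        self.applied.append(payload)
        self.seen.add(key)
```

In production the `seen` set lives in durable storage (and that is where `SELECT FOR UPDATE` earns its keep: locking the dedup row so two workers can't apply the same message concurrently), but the reasoning is the same.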
3. Problem decomposition over prompt engineering
Given an ambiguous request, you ask the right clarifying questions before writing code. You can break "sync ECount data into our analytics layer" into a concrete technical plan covering schema, ingestion, idempotency, error handling, and observability — before opening Claude Code.
4. Production ownership and operational maturity
You've been on-call. You've debugged a real outage at 2 AM. You read logs, traces, and metrics fluently. You write code defensively because you've been burned. You don't say "works on my machine" — you say "let me check what differs in production."
5. Code reading > code writing
Most of your time is spent reading and reviewing, not greenfield writing. You can drop into an unfamiliar codebase, orient yourself in a few hours, and identify what to change. You're comfortable navigating large repos, reading other people's code, and understanding intent.
6. AI tool fluency with healthy skepticism
Daily use of Claude Code, Cursor, or Copilot. Familiarity with MCP, agent workflows, tool calling, and prompt design. Most importantly: you know where these tools fail and you compensate. A candidate who blindly trusts AI output is more dangerous than one who doesn't use it at all.
7. Clear written communication in English
PR descriptions, technical specs, runbooks, prompt design, async messaging with a globally distributed team. Clear writing equals clear thinking. We work asynchronously across Singapore, HCMC, and Korea — written communication is our primary medium.