We are rebuilding every customer and employee touch- point as an AI- native experience. That ranges from booking a flight/bus/train seat to automating sale processes or leave requests. Engineers here get clear goals, full context, and the freedom to ship quickly while keeping reliability and safety paramount.
Major projects you will tackle:
Company- wide automation hub built on n8n, including reusable nodes and guard- rails that let non- technical teams create flows safely.
Omni- channel customer chatbot that handles FAQs, after- sales, and ticket booking for bus, flight, train, and vehicle- rental verticals- across text and voice.
Multimodal expansion that blends text, speech, and images so customers can, for example, upload a ticket photo or speak a booking change request.
Department- specific assistants for HR onboarding, finance invoice queries, operations incident triage, IT help- desk, and more- each powered by shared LLM components and n8n triggers.
You will fine- tune foundation models, fuse retrieval with LLM reasoning, and iterate in Vietnamese, English, and other languages.
What you will do:
Design, implement, and continually improve our multi- agent framework. Build and refine agent components such as short- and long- term memory stores, planning/reflection loops, agent- to- agent messaging, and specialised prompt templates that let multiple agents collaborate on complex tasks.
Craft prompts, refusal rules, and jailbreak tests; automate factuality, safety, and multimodal hallucination checks in CI.
Select and adapt OpenAI, Gemini models, Llama, Mixtral or better- using LoRA or full fine- tuning- so the models speak our brand voice in multiple languages.
Build and optimise retrieval- augmented pipelines, keeping latency below two seconds and hallucinations under five percent.
Pair daily with Conversation Designers, MLOps engineers, Automation engineers, and business stakeholders; publish model cards, data sheets, and rollback plans.
Translate company OKRs into squad- level road- maps and capacity plans.Mentor seniors & engineers in the team
Package models with Docker and serve them through FastAPI or gRPC behind vLLM, Triton Inference Server, or Text- Generation- Inference; add monitoring with Grafana / Prometheus for latency, drift, and GPU cost (for later milestones).