Zorky CRM
@ekaterinovikova

AI / LLM Engineer in IT — CIS and Europe market

AI / LLM Engineer — the youngest but loudest role inside the AI/ML/DS direction, which split off in 2023 on the wave of ChatGPT / GPT-4 / Claude / Llama 2. Focus: integrating large language models (LLMs) into a product — NOT training foundation models from scratch (that's Research), NOT classical ML pipelines (that's ML Engineer / Data Scientist), but applied LLM engineering: prompt engineering + RAG (Retrieval-Augmented Generation) + fine-tuning via LoRA / QLoRA + agent architectures + tool use + evaluation. Role family: AI Engineer (general — LLM integration into one product feature: chatbot / summarization / classification / Q&A), LLM Engineer (LLM-deep — server-side optimisation: vLLM / TGI / SGLang serving, KV cache tuning, quantization, speculative decoding), Generative AI Engineer (broader — includes image / video / audio: Stable Diffusion / Flux / Suno / ElevenLabs), AI Agent Engineer (rising 2025-2026 — autonomous agents with tool calling, multi-step reasoning, LangGraph / AutoGen / CrewAI / Microsoft Semantic Kernel), Senior AI Engineer (multi-product LLM architecture + cost optimisation). Stack 2026: Python (monopoly). LLM providers: OpenAI API (GPT-4 / GPT-4 Turbo / o1 / o3 — reasoning models 2024-2026), Anthropic API (Claude Sonnet 4 / Opus 4.5 — coding + long-context leader), Google Gemini (1.5 Pro / Flash — best multimodal), Mistral (open-weights Large), Cohere (Command R+ — RAG-optimised), Groq + Together AI + Fireworks + Replicate (fast inference of open models). Open LLM 2026: Llama 3 / 3.1 / 3.2 / 3.3 (Meta — open-weights leader), Mistral / Mixtral, Qwen 2.5 (Alibaba — sleeper hit 2024-2026), DeepSeek V3 / R1 (reasoning open-source), Phi 3 (Microsoft small-language-model), Gemma 2 (Google). 
Frameworks: LangChain (dominates — 100K+ stars on GitHub, despite criticism over complexity), LlamaIndex (RAG-focused — data ingestion + indexing mastery), Haystack (deepset), Semantic Kernel (Microsoft), AutoGen (Microsoft multi-agent), CrewAI (multi-agent — rising), LangGraph (LangChain agent extension — DAG-style), DSPy (Stanford — declarative LLM programming, growing). Vector databases: Pinecone (managed leader — premium), Weaviate (open-source — modular), Qdrant (Rust — fast self-hosted), Milvus (enterprise scale), Chroma (lightweight dev), pgvector (Postgres extension — production-friendly), Elasticsearch dense_vector + Redis Stack (hybrid). Embeddings: OpenAI text-embedding-3 + Cohere Embed v3 + sentence-transformers (open-source — BGE / E5 / Stella). Fine-tuning: LoRA + QLoRA (PEFT library by Hugging Face), full fine-tuning rare (cost prohibitive). RLHF / DPO: TRL library (Hugging Face). Serving: vLLM (PagedAttention — record throughput), TGI (Text Generation Inference — Hugging Face), SGLang (programming model for serving + caching), Ollama (local), llama.cpp (CPU + edge). Evaluation: RAGAS, DeepEval, lm-evaluation-harness, Inspect AI (UK AISI), Promptfoo, LangSmith (LangChain trace + eval — best UX). Observability: Helicone, LangSmith, Langfuse (open-source), Arize Phoenix, Datadog LLM Observability. Cost: per-token tracking + prompt caching (Anthropic / OpenAI native + custom Redis layer). According to Zorky CRM, 8 active openings (young niche — the pool will grow), median $14614/mo. Top stack: salesforce, figma, go, databricks, docker. 50.0% remote. AI Engineer — premium $6,500-11,000/mo Senior, at frontier-AI labs (Anthropic / OpenAI / Cohere / Mistral) — $12,000-22,000+, top outliers $30,000-50,000+ for Foundation Model engineering.
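
The per-token cost tracking and custom cache layer mentioned above can be sketched roughly as follows. This is a minimal illustration, not a provider integration: the model names, the prices in `PRICES`, the `fake_model` stand-in, and the in-memory dictionary (used here in place of a Redis layer) are all assumptions. It also caches whole responses at the application level, which is a different mechanism from the native prompt caching offered by providers.

```python
import hashlib
from dataclasses import dataclass

# Hypothetical prices in USD per 1M tokens (input, output); real prices differ and change.
PRICES = {"small-model": (0.50, 1.50), "large-model": (5.00, 15.00)}


@dataclass
class CostTracker:
    """Accumulates spend from the token counts reported for each call."""
    total_usd: float = 0.0
    calls: int = 0

    def record(self, model: str, input_tokens: int, output_tokens: int) -> float:
        in_price, out_price = PRICES[model]
        cost = (input_tokens * in_price + output_tokens * out_price) / 1_000_000
        self.total_usd += cost
        self.calls += 1
        return cost


class CachedClient:
    """Wraps a model-calling function with an exact-match response cache."""

    def __init__(self, call_model, tracker: CostTracker):
        self._call_model = call_model
        self._tracker = tracker
        self._cache = {}  # stand-in for a Redis layer
        self.hits = 0

    def complete(self, model: str, prompt: str) -> str:
        key = hashlib.sha256(f"{model}\n{prompt}".encode()).hexdigest()
        if key in self._cache:
            self.hits += 1  # served from cache, nothing is billed
            return self._cache[key]
        text, in_tok, out_tok = self._call_model(model, prompt)
        self._tracker.record(model, in_tok, out_tok)
        self._cache[key] = text
        return text


def fake_model(model: str, prompt: str):
    """Hypothetical stand-in for a provider call: returns (text, input_tokens, output_tokens)."""
    words = prompt.split()
    return f"echo: {prompt}", len(words), len(words) + 2
```

Calling `complete` twice with the same model and prompt bills only the first call; the second is counted in `hits`.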

Updated: 5/29/2026, 5:40:38 PM
Open over 3 months
Live positions: 8
Median / month: $14,614
Remote: 50%
Top stack: salesforce (2 jobs)

Comparison with other specializations

The AI / ML / Data Science direction contains 6 specializations. The current one (AI / LLM Engineer) is highlighted in blue — compare it with its neighbors by the number of open jobs and median salary.


Demand trend

AI Engineer — the fastest-growing AI/ML sub-niche (split off in 2023 after the ChatGPT boom). The real market is broader than the visible pool — many AI roles classified as general ML Engineer / Backend in our matcher (titles like "Senior Backend Engineer with AI focus"). Drivers 2026: LLM adoption across all products (chatbots / summarisation / Q&A / code-assist), RAG applications for enterprise knowledge bases, agentic workflows (rising 2025-2026), Foundation Model startups race (Anthropic / OpenAI / Cohere / Mistral / Llama-derivatives). Russian AI: Sber.AI (GigaChat) + Yandex (Yandex.GPT) + Tinkoff + OZON — a steady flow. International remote via frontier-AI labs + Y Combinator AI startups — premium segment.

How many new jobs appear each week.

Seniority distribution — trend

How the share of Junior/Middle/Senior/Lead in open jobs shifts week over week. A trend toward Senior usually signals a mature specialization where companies look for ready-made talent; the opposite — a rise in Junior — signals expansion and ground-up team building.

Share of each level in % of all jobs with a stated grade per week.

Salary by level

Junior — typical entry Backend Senior / ML Engineer Middle + LLM portfolio (demonstrable RAG + LangChain experience). Career flow: Backend Senior / ML Middle (2-3 years) + LLM interest → AI Engineer Junior (1-2 years) → Middle (2-3 years) → Senior → either Staff / Principal AI Engineer, AI Tech Lead / Engineering Manager, a move into a frontier-AI lab (Anthropic / OpenAI / Cohere — premium track), or AI startup founder. If grade data is sparse — check the ml-engineer + research pages for broader benchmarks.

Median salary (USD/month) at each grade plus the jump vs the previous one.

Level    Median $/mo    Jump vs prev.    Jobs with salary
Junior   -              -                0
Middle   -              -                0
Senior   -              -                0
Lead     -              -                0

Biggest salary jump — between Senior and Lead (+45.6%).
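
As a rough sketch of how a "jump vs the previous grade" figure can be derived, the function below computes the percentage change between consecutive grade medians and leaves the jump empty when either median is missing. The function name and the sample medians in the usage note are hypothetical and are not Zorky CRM data.

```python
def grade_jumps(medians):
    """medians: list of (level, median_usd_or_None), ordered from lowest to highest grade.

    Returns a list of (level, jump_percent_or_None), where the jump is the
    percentage change relative to the immediately preceding grade.
    """
    result = []
    for i, (level, median) in enumerate(medians):
        prev = medians[i - 1][1] if i > 0 else None
        if median is None or not prev:
            jump = None  # no data for this grade or for the previous one
        else:
            jump = round((median / prev - 1) * 100, 1)
        result.append((level, jump))
    return result
```

For example, with illustrative medians of 2,000 for Junior and 3,000 for Middle, the Middle jump is +50.0%; a grade with no salary data yields no jump for itself or for the grade after it.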

Salary distribution — trend

The median AI Engineer salary is $14,614/mo, a premium segment thanks to a rare skill combination. The distribution is based on a small sample (young niche). $7K+: Senior with production RAG / agent architecture experience. $10K+: Senior at international tech companies or frontier-AI labs (Anthropic / OpenAI / Cohere / Mistral / Hugging Face). $15K+: Foundation Model engineering (DeepMind / Meta AI training teams), with top outliers of $25K-50K for exceptional candidates with published research.

What share of jobs each price band holds week over week.

62% of jobs are in the $5–8K range (the core market). High-end $8K+ segment: 24% — usually US-remote or senior-international roles.

Hiring geography

The leader by AI Engineer job count is EN (5 positions) — but the young-niche pool is small, not fully representative. Russia — Sber.AI + Yandex + Tinkoff + OZON + VK + EPAM AI Practice dominate. Poland — AI-friendly EU hub (Warsaw / Krakow). Germany — Berlin AI cluster (Aleph Alpha + Black Forest Labs). France — Paris (Mistral + Hugging Face HQ). UK — London (DeepMind / Cohere). USA — Bay Area (the largest AI hub). Huge international remote via Anthropic / OpenAI / Cohere / Mistral / Hugging Face / Y Combinator AI startups.

Job distribution by country.

These numbers reflect the distribution across the sources we parse. Poland often looks dominant because of dense NoFluffJobs / JustJoin.it / Pracuj coverage — the Polish IT market is genuinely large, but in our sample its share is overweighted relative to the real volume of all IT jobs in the region. Same caveat for other top countries: this is «where our parsers look», not «the true size of the market».

Remote / Hybrid / Office — trend

50.0% of AI Engineer jobs are remote or hybrid. AI work is fully cloud-based (LLM APIs + vector DBs + cloud GPUs). Outsourcers — almost always remote. International frontier-AI labs — full-remote standard for exceptional candidates. Big Tech AI — hybrid standard. Y Combinator AI startups — premium full-remote.

How the share of each work format shifts week over week.

89% — remote. Specialisation is well-adapted to remote format.

Top in-demand technologies

Top AI Engineer stack 2026: Python deep + Backend basics (async / FastAPI / pytest), LLM providers (OpenAI / Anthropic / Google Gemini / Mistral / Cohere / Groq / Together AI / Fireworks / Replicate), Open LLM (Llama 3.x / Mistral / Qwen 2.5 / DeepSeek V3+R1 / Phi 3 / Gemma 2), LLM frameworks (LangChain mastery + LlamaIndex + Haystack + Semantic Kernel + LangGraph + AutoGen + CrewAI + DSPy), Vector DBs (Pinecone / Weaviate / Qdrant / Milvus / Chroma / pgvector / Elasticsearch dense_vector / Redis Stack), Embeddings (OpenAI text-embedding-3 + Cohere Embed v3 + sentence-transformers BGE/E5/Stella), Fine-tuning (LoRA + QLoRA via PEFT + Unsloth + Axolotl), Serving (vLLM + TGI + SGLang + Ollama + llama.cpp + MLX), Evaluation (RAGAS + DeepEval + lm-evaluation-harness + Inspect AI + Promptfoo + LangSmith), Observability (Helicone + LangSmith + Langfuse + Arize Phoenix + Datadog LLM Observability), Safety (NeMo Guardrails + Llama Guard + OpenAI Moderation).

salesforce: 2 jobs
figma: 2 jobs
go: 2 jobs
databricks: 1 job
docker: 1 job
gcp: 1 job
kubernetes: 1 job
python: 1 job
pytorch: 1 job
snowflake: 1 job

Technology combinations

Common pairs: Python + LangChain + OpenAI API + Pinecone (classic RAG stack), Python + LangChain + Anthropic Claude + Qdrant (Anthropic-focused), LlamaIndex + OpenAI text-embedding-3 + pgvector (lightweight RAG), LangGraph + OpenAI function calling + custom tools (agent stack), vLLM + Llama 3.1 70B + Qdrant (self-hosted production), PEFT + Unsloth + Axolotl + Llama 3.x 8B (fine-tuning stack), DSPy + GPT-4 + RAGAS (declarative LLM programming). Learning roadmap: Python + Backend basics → LLM fundamentals (Karpathy YouTube) → OpenAI API mastery → Anthropic Claude mastery → LangChain → RAG architecture deep → Vector DB mastery → fine-tuning hands-on (LoRA / QLoRA) → vLLM serving → evaluation (RAGAS) → observability (LangSmith) → agent frameworks (LangGraph / AutoGen / CrewAI / DSPy) → pet-project portfolio.

Which pairs of technologies appear together most often in a single job.

python + sql: 52
databricks + spark: 43
databricks + go: 39
go + visio: 31
mlops + python: 30
go + vite: 25
spark + sql: 23
go + spark: 23
python + visio: 22
python + spark: 20
express + go: 20
python + pytorch: 20

Where we see these jobs

AI Engineer jobs: hh.ru (especially Sber.AI / Yandex / Tinkoff active), Habr Career, getmatch, Djinni, LinkedIn (huge international AI Engineer segment — Anthropic / OpenAI / Cohere / Mistral / Hugging Face / Y Combinator AI startups), NoFluffJobs / JustJoin.it (Poland AI-friendly), Telegram (@ai_engineer_ru, @ml_jobs, @aijobs, @jobsforaiml, @prompt_engineering_ru), career pages of EPAM AI Practice / Luxoft AI / Andersen AI, specialised boards aijobs.net + ai-jobs.net + builtin.com/jobs/ai + jobs.lever.co for AI startups, Y Combinator Work at a Startup (largest AI-startup channel), AI-lab direct careers (Anthropic / OpenAI / Cohere / Mistral / Hugging Face / Stability / Replicate / Together AI / Perplexity / Inflection / Adept / xAI).

Telegram channels: 4% (62 jobs)
Job boards and websites: 96% (1,548 jobs)

AI / LLM Engineer vs other directions

AI Engineer overlaps with ML Engineer (production ML, but AI Engineer focuses on LLM-based features), Data Scientist (LLM-aware DS: prompt engineering for analysis), Research Engineer (deeper LLM science: papers + Foundation Model training), Backend (production-grade integration depth), and MLOps (LLM serving infrastructure). For a comparison with ml-engineer / data-scientist / research / mlops, see the specialization comparison chart above.

Volume of open jobs across IT directions.

Backend: 4,770
Full-stack: 3,304
Data Engineer: 2,325
Sales: 1,932
DevOps / SRE: 1,794
AI / ML / DS: 1,610
QA / Testing: 1,571
Architecture: 1,437
Frontend: 1,055

Latest jobs

Latest open AI Engineer jobs — most recent positions in the sample (young niche, sample is small). The full list is in our CRM or via the "see all" link below. For broader view, see the ml-engineer + research pages.

Software Engineer - Generative AI Hub
London · ~$14,614/mo · today
ML Engineer (LLM / Generative AI)
9 days ago
ci/cd, docker, kubernetes, python, pytorch
Sr Technical Led - Generative AI
Bangalore · 13 days ago
aws, azure
Sales AI Engineer
San Francisco, CA • New York, NY • United States · 13 days ago
figma, salesforce
Research Engineer, Multimodal Generative AI (Image/Video)
Kirkland, Washington, US; Seattle, Washington, US · 13 days ago
go
Lead AI Engineering - CDI (F/H)
14ème Arrondissement · 13 days ago
databricks, gcp, git, snowflake
Sales AI Engineer
San Francisco, CA • New York, NY • United States · 15 days ago
figma, salesforce
Research Engineer, Multimodal Generative AI (Image/Video)
Kirkland, Washington, US; Seattle, Washington, US · 15 days ago
go
See all 8 jobs →

What we can offer

If you work with AI / LLM Engineer jobs or you're in this role yourself — we can close a specific task. Pick a format, leave a contact — we reply within 24 hours.

CRM for recruiters
We onboard you onto our CRM. Upload an AI / LLM Engineer job and get a list of matching candidates with full contact data within your plan limits. Auto-matching plus explainability. Per-month contact limits are configurable.
Candidate access
Are you a candidate looking for AI / LLM Engineer work? Buy direct access to employer contact data — N views per month. No middlemen: message the hiring manager directly.
Talent Supply Audit
We'll show how many AI / LLM Engineer specialists are realistically available for your job: by level, geo, format, budget. An honest answer instead of "we have 100 million resumes".
Custom analytics
A personalized quarterly market report on your ICP — salary benchmarks, talent supply, competitor hiring activity. PDF plus raw data.
Are you a candidate looking for work? Upload resume →

Frequently asked questions

The most common questions about AI / LLM Engineer: pay (premium segment for rare-skill), AI Engineer vs ML Engineer vs Data Scientist (3-way comparison), RAG vs Fine-tuning vs Prompt Engineering (decision tree + cost reality), AI Agent Engineer (rising 2025-2026 sub-specialisation), remote, how to become one (2-4 months from Backend Senior — fastest entry into the ML world), Senior skills (LangChain + advanced RAG + agent architectures + cost optimisation + open-source contributions). Answers recompute automatically.

How much does an AI Engineer earn in 2026?

The median AI Engineer salary is $14,614/mo per Zorky CRM data (8 active jobs; this is a young niche, and the pool will grow as LLM adoption spreads across products). AI Engineer is a premium segment thanks to a rare skill combination (Python + LLM expertise + product thinking). Senior with production RAG / agent architecture experience: $7,000-11,000. Senior at US/EU outsource (EPAM AI / Luxoft AI on enterprise GenAI projects): $7,500-12,000. Staff / Principal AI Engineer: $10,000-15,000. International remote (Anthropic / OpenAI / Cohere / Mistral / Hugging Face / Stability / Replicate / Together AI / Perplexity / Y Combinator AI startups): $9,000-18,000+ Senior. Foundation Model engineering (Anthropic / OpenAI / DeepMind / Meta AI training teams): top outliers of $20,000-50,000+ for exceptional candidates with published research.

What does an AI Engineer Junior, Middle, Senior, or Lead earn?

Junior — typical entry: Backend Middle / ML Engineer Middle / Data Scientist Middle + LLM hobby projects (demonstrable RAG + LangChain experience). The Junior → Middle jump — after the first production LLM-feature deployment (chatbot / summarisation / Q&A in a real product with real users). Middle → Senior — multi-feature LLM ownership + cost optimisation (typical mandate: 50-70% reduction in LLM API costs through caching + smaller models + RAG vs fine-tuning trade-offs) + evaluation methodology. Senior → Staff / Principal — org-wide AI strategy + multi-product LLM platform architecture. Career flow: Backend Senior / ML Engineer Middle (2-3 years) + LLM interest → AI Engineer Junior / Middle (1-2 years) → Senior → either Staff / Principal AI Engineer, AI Tech Lead / Engineering Manager, a move into a frontier-AI lab (Anthropic / OpenAI / DeepMind — premium track), or AI startup founder.
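
To make the cost-reduction mandate above concrete, here is a back-of-the-envelope model of how caching plus a smaller model can combine. Every number in it (request volume, tokens per request, prices, cache hit rate, share of traffic sent to the small model) is an illustrative assumption, and the model simplifies by treating cache hits as free.

```python
def monthly_cost(requests, tokens_per_request, usd_per_1m_tokens):
    """Spend for one model at a flat per-token price."""
    return requests * tokens_per_request * usd_per_1m_tokens / 1_000_000


def optimised_cost(requests, tokens_per_request, large_price, small_price,
                   cache_hit_rate, small_share):
    """Spend after caching and routing part of the traffic to a smaller model."""
    billable = requests * (1 - cache_hit_rate)  # assumption: cache hits cost nothing
    small = monthly_cost(billable * small_share, tokens_per_request, small_price)
    large = monthly_cost(billable * (1 - small_share), tokens_per_request, large_price)
    return small + large


# Hypothetical workload: 1M requests/month, 1,000 tokens each.
baseline = monthly_cost(1_000_000, 1_000, 10.0)
optimised = optimised_cost(1_000_000, 1_000, large_price=10.0, small_price=1.0,
                           cache_hit_rate=0.2, small_share=0.5)
reduction = 1 - optimised / baseline
print(f"baseline ${baseline:,.0f}, optimised ${optimised:,.0f}, reduction {reduction:.0%}")
```

Under these assumed inputs the reduction comes out at roughly 56%, which is how a 20% cache hit rate and a 50% small-model share can together land inside the 50-70% range.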

How much do AI Engineers earn in Moscow, St Petersburg, remote?

Moscow Senior AI Engineer — $6,500-10,500/mo (Sber.AI — the largest AI employer in Russia for GigaChat + Kandinsky + banking AI; Yandex — Yandex.GPT + Alisa AI + Search AI + Market recommender AI; Tinkoff — chatbot + voice AI; OZON — search AI + recsys AI; VK — Mail.ru AI + product AI; X5 Group — retail AI; MTS AI). St Petersburg $6,000-9,500 (JetBrains AI Assistant team, EPAM AI SPb). Minsk/Kyiv $5,500-9,000 Senior. Poland €7,000-11,000 gross Senior. Germany €80-120K/yr Senior (Berlin AI cluster — Aleph Alpha / Black Forest Labs). UK £75-130K Senior (London — DeepMind / Cohere). 50.0% remote. Outsourcers with GenAI Practice (EPAM AI / Luxoft AI / Andersen AI / DataArt AI) — almost always remote, $7,500-12,000 Senior on US AI projects. International frontier-AI labs (Anthropic / OpenAI / Cohere / Mistral / Hugging Face / Stability / Replicate / Together AI / Perplexity / Inflection / Adept / xAI) — full-remote $9,000-18,000+ Senior for Russian-speakers with English. Big Tech AI (Google DeepMind / Meta AI / Microsoft AI / Apple ML / Amazon AGI) — $13,000-25,000+ Senior + RSU.

What stack does an AI Engineer most often need?

Top 5: salesforce, figma, go, databricks, docker. Python deep (monopoly — async / typing / FastAPI). LLM provider integration: OpenAI API mastery (chat completions + function calling + structured outputs + assistants + o1/o3 reasoning models), Anthropic API (Claude — best for coding + long-context up to 200K), Google Gemini (multimodal), Mistral / Cohere / Groq / Together AI / Fireworks / Replicate (alternative providers + inference). Each provider has its quirks (rate limits + retry strategies + cost per token + feature support). Open LLM knowledge 2026: Llama 3.x (Meta — open-weights workhorse), Mistral / Mixtral, Qwen 2.5 (Alibaba — sleeper hit), DeepSeek V3 / R1 (reasoning open-source — released Dec 2024, disrupted the market), Phi 3 (Microsoft SLM), Gemma 2 (Google). LLM Frameworks: LangChain mastery — the largest job market (LCEL — LangChain Expression Language, callbacks, agents, retrievers), LlamaIndex (RAG-focused — data ingestion + indexing strategies), Haystack (deepset), Semantic Kernel (Microsoft for .NET shops), LangGraph (DAG-style agent orchestration — LangChain), AutoGen + CrewAI (multi-agent — rising), DSPy (Stanford — declarative LLM programming, growing). Vector DB mastery: Pinecone (managed leader), Weaviate (open-source modular), Qdrant (Rust-based — fast self-hosted), Milvus (enterprise scale), Chroma (lightweight dev / prototyping), pgvector (Postgres extension — production-friendly for existing Postgres shops), Elasticsearch dense_vector + Redis Stack (hybrid full-text + vector). Embeddings: OpenAI text-embedding-3-large/small (industry standard 2024-2026), Cohere Embed v3, sentence-transformers open-source (BGE / E5 / Stella — top on MTEB benchmark), multimodal embeddings (CLIP variants). Fine-tuning: LoRA + QLoRA via PEFT library (Hugging Face — standard 2026), Unsloth (faster LoRA training), Axolotl (config-driven fine-tuning). Full fine-tuning rare (cost prohibitive). 
RLHF / DPO / KTO: TRL library (Hugging Face — Direct Preference Optimization growth 2024+). Serving: vLLM (PagedAttention + continuous batching — record throughput for open models), TGI (Text Generation Inference — Hugging Face), SGLang (programming model for serving + KV caching), Ollama (local development), llama.cpp (CPU + edge — Apple Silicon optimised), MLX (Apple's framework). Evaluation: RAGAS (RAG-specific — faithfulness + answer relevancy + context precision), DeepEval, lm-evaluation-harness (academic benchmarks), Inspect AI (UK AISI safety-focused), Promptfoo (regression testing for prompts), LangSmith (LangChain trace + eval — best UX). Observability + tracing: Helicone (LLM-specific APM), LangSmith, Langfuse (open-source), Arize Phoenix, Datadog LLM Observability (2024+ feature). Cost optimisation: prompt caching (Anthropic native + OpenAI native + custom Redis layer), prompt compression (LLMLingua), model routing (small model for easy queries, large for complex — through providers like Martian / OpenRouter). Safety / guardrails: NeMo Guardrails (NVIDIA), Llama Guard (Meta), OpenAI Moderation API, custom prompt injection defences.
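
The rate limits and retry strategies mentioned above are often handled with exponential backoff and jitter. The sketch below shows one way to do that; `RateLimitError` is a hypothetical stand-in for whatever exception a given provider SDK raises on HTTP 429, and the default delays are assumptions rather than provider recommendations.

```python
import random
import time


class RateLimitError(Exception):
    """Hypothetical stand-in for a provider's rate-limit (HTTP 429) error."""


def call_with_backoff(fn, max_attempts=5, base_delay=1.0, max_delay=30.0,
                      sleep=time.sleep, rng=random.random):
    """Call fn(), retrying on RateLimitError with exponential backoff and full jitter."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise  # out of attempts, surface the error to the caller
            # The wait ceiling doubles each attempt, capped at max_delay;
            # jitter picks a random point below the ceiling to avoid thundering herds.
            ceiling = min(max_delay, base_delay * 2 ** attempt)
            sleep(rng() * ceiling)
```

The `sleep` and `rng` parameters are injectable so the behaviour can be tested without real waiting.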

AI Engineer vs ML Engineer vs Data Scientist — what's the difference?

Data Scientist — focus on statistical modelling + A/B testing + business problem framing. Stack: pandas + sklearn + XGBoost + Jupyter + SQL + Tableau. Pay $4,500-8,000. See Data Scientist. ML Engineer — focus on production ML pipelines + deployment + scaling. Stack: PyTorch + sklearn + Kubernetes + MLflow + cloud-managed ML. Generalist — works with recsys / fraud / ranking / classical ML. Pay $5,500-9,500. See ML Engineer. AI Engineer (this page) — focus on LLM integration in product. Does NOT train foundation models (that's Research), does NOT build classical ML pipelines (that's ML Engineer). Stack: Python + LLM providers (OpenAI / Anthropic / Mistral) + LangChain / LlamaIndex + vector DBs + vLLM serving + RAG patterns + LoRA fine-tuning + LLM evaluation tools. Pay $6,500-11,000. Key distinction 2026: AI Engineer reuses pre-trained LLMs, ML Engineer often trains models from scratch. AI Engineer works with API/prompts/RAG; ML Engineer — with training loops + feature engineering. Career pivots: Backend Senior → AI Engineer Junior — 2-4 months (fast ramp thanks to API-based work + learnable LangChain). ML Engineer Senior → AI Engineer Senior — 1-3 months (already knows stack basics, needs to deepen LLM-specific). DS Senior → AI Engineer Middle — 3-6 months. AI Engineer Senior → Research (Anthropic / OpenAI) — requires deeper math + paper-writing experience + 6-12 months of preparation.

RAG vs Fine-tuning vs Prompt Engineering — what to pick?

Decision tree for LLM customisation 2026: 1) Prompt Engineering — first try. Use case: zero-shot or few-shot ENOUGH for the task. Pros: zero cost (no training), instant iteration, model-agnostic. Cons: limited control, prompt fragility, context-window pressure. 2) RAG (Retrieval-Augmented Generation) — best for knowledge-grounded tasks. Use case: chatbot, Q&A, document search, customer support, code documentation, internal knowledge bases. Architecture: query → embed → vector search → top-k documents → augment prompt with retrieved context → LLM generates answer. Pros: factual grounding (anti-hallucination), up-to-date (just update the index, no retraining), source citation possible, cheaper than fine-tuning. Cons: retrieval quality bottleneck, prompt size grows. RAG stack 2026: ingestion (LlamaIndex / LangChain document loaders) → chunking strategy (semantic vs fixed-size vs recursive) → embeddings (OpenAI text-embedding-3 / Cohere Embed v3 / open-source BGE) → vector DB (Pinecone / Qdrant / Weaviate / pgvector) → retrieval (similarity + reranking via Cohere Rerank / BGE Reranker) → prompt augmentation → LLM generation. Advanced RAG: hybrid search (vector + BM25), HyDE (Hypothetical Document Embeddings), query rewriting, multi-hop retrieval, self-reflection (CRAG / Self-RAG). 3) Fine-tuning — use when prompt engineering + RAG aren't enough. Use case: a) domain-specific style / format (medical / legal — specific phrasing patterns), b) reduce token cost (a smaller fine-tuned model replaces a larger general model), c) tasks requiring deep specialised knowledge (where RAG context window doesn't help). Modern fine-tuning: LoRA / QLoRA (PEFT) — train small adapter (~1% of weights), full fine-tuning rare. Stack: Hugging Face Transformers + PEFT + TRL + Unsloth (faster) + Axolotl (config-driven). Hardware: 24GB GPU (RTX 4090 / A6000) for 7-13B models, 80GB+ (A100/H100) for 70B+. Cost reality 2026: prompt engineering — minutes, $0. 
RAG setup — 1-2 weeks engineering, $100-1,000/mo API costs. LoRA fine-tuning — 1-3 days engineering + $50-500 compute. Full fine-tuning — weeks + $1,000-100,000+. 4) Combination: real production AI features typically use ALL three. Prompt engineering first (system prompts + few-shot examples) → RAG for grounding → optional LoRA fine-tuning for style/format. 5) Agentic patterns (rising 2025-2026) — LLM + tool use + multi-step reasoning. Use LangGraph / AutoGen / CrewAI / DSPy. 6) Function calling / structured output — OpenAI / Anthropic native support, for tool integration without custom parsing. Default 2026: RAG for 80% of production use cases + prompt engineering for prototyping + LoRA fine-tuning only if RAG is explicitly insufficient + agents for complex multi-step workflows.
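
The RAG flow described above (query → embed → vector search → top-k documents → augmented prompt) can be illustrated with a toy, dependency-free sketch. The word-count "embedding" is only a stand-in for a real embedding model, the in-memory list stands in for a vector DB, and the sample documents and prompt wording are assumptions; the final LLM call is left out.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: lowercase word counts (stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def retrieve(query, docs, k=2):
    """Rank documents by similarity to the query and keep the top k."""
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]


def build_prompt(query, context_docs):
    """Augment the user question with numbered context so sources can be cited."""
    context = "\n".join(f"[{i + 1}] {d}" for i, d in enumerate(context_docs))
    return ("Answer using only the context below and cite sources by number.\n\n"
            f"Context:\n{context}\n\nQuestion: {query}")


# Hypothetical knowledge base.
DOCS = [
    "vLLM serves open models with PagedAttention and continuous batching.",
    "pgvector is a Postgres extension for vector similarity search.",
    "LoRA trains small adapter matrices instead of all model weights.",
]
```

A call such as `build_prompt(q, retrieve(q, DOCS, k=1))` produces the text that would then be sent to the LLM. In a production stack the same shape holds, with a real embedding model, a vector DB, and usually a reranking step between retrieval and prompt assembly.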

Can AI Engineers work remotely?

Yes, 50.0% of AI Engineer jobs are full-remote or hybrid. AI Engineer work is fully cloud-based (LLM APIs + vector DBs + cloud GPUs, all remote-accessible). Outsourcers (EPAM AI / Luxoft AI / Andersen AI / DataArt AI): almost always remote on US AI projects. Russian product companies (Sber.AI / Yandex AI / Tinkoff AI / OZON AI): hybrid, or remote after probation. Russian banks (Sber / VTB): hybrid/office due to AI-data compliance. International frontier-AI labs (Anthropic / OpenAI / Cohere / Mistral / Hugging Face / Stability / Replicate / Together AI / Perplexity / Inflection / Adept / xAI): full-remote is standard for exceptional candidates. Big Tech AI (Google DeepMind / Meta AI / Microsoft AI / Apple ML / Amazon AGI): hybrid is standard. Y Combinator AI startups: premium full-remote. Relocation hubs for AI Engineers: USA (Bay Area, the largest AI cluster), UK (London: DeepMind / Cohere), Canada (Toronto: Vector Institute), Germany (Berlin: Aleph Alpha + Black Forest Labs), France (Paris: Mistral + Hugging Face HQ), Israel (Tel Aviv: AI21 Labs), Singapore, UAE (Dubai, tax-friendly for AI). English is a must for international AI remote work (a premium of +40-60%, and the AI community, papers, and docs are almost entirely in English).

How is an AI Agent Engineer (rising 2025-2026) different?

AI Agent Engineer — sub-specialisation within AI Engineering, focused on autonomous agents with tool use + multi-step reasoning + planning. Hot and rising 2025-2026 after: 1) OpenAI Assistants API + GPT-4 / o1 / o3 function calling improvements, 2) Anthropic computer use (Claude can control a screen — released Oct 2024), 3) Devin (Cognition Labs autonomous SWE), 4) AutoGen / CrewAI multi-agent framework maturity, 5) LangGraph (DAG-style agent orchestration). Day-to-day: 1) Design agent workflows (planning + tool selection + execution + verification loops), 2) Tool integration (custom tools + MCP — Model Context Protocol, Anthropic standard 2024+), 3) Memory architecture (short-term + long-term + semantic), 4) Multi-agent orchestration (router agent + specialist agents pattern), 5) Evaluation for agentic workflows (success rate + cost per task + steps to completion), 6) Safety / guardrails (preventing unintended actions, sandboxing). Stack: LangGraph (DAG agents — LangChain), AutoGen (Microsoft multi-agent), CrewAI (multi-agent roleplaying), DSPy (Stanford declarative), Microsoft Semantic Kernel, OpenAI Assistants API, Anthropic Computer Use / MCP, OpenInterpreter (open-source code execution), Pydantic AI (typed agent framework, rising 2024+). Pay: Senior AI Agent Engineer — premium over general AI Engineer +10-20% thanks to rare-skill. Anthropic / OpenAI / Devin (Cognition Labs) / Adept hiring AI Agent specialists — premium $12,000-20,000+ Senior. Y Combinator AI startups in the agent space — the most active hiring sub-niche 2025-2026. Career flow: AI Engineer Senior + a demonstrable agent project → AI Agent Engineer — 3-6 months.
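
The agent workflow described above (planning, tool selection, execution, and a bounded loop as a basic guardrail) can be sketched without any framework. In this illustration the planner is a scripted function standing in for an LLM call, and the tool registry, action format, and function names are all hypothetical rather than the API of LangGraph, AutoGen, or any other library.

```python
def multiply(a, b):
    """Example tool: returns the product as a string observation."""
    return str(a * b)


# Hypothetical tool registry: the agent may only call what is listed here.
TOOLS = {"multiply": multiply}


def scripted_planner(task, history):
    """Stand-in for an LLM: picks the next action from the task and past observations."""
    if not history:
        return {"tool": "multiply", "args": {"a": 6, "b": 7}}
    return {"final": f"The answer is {history[-1]['observation']}"}


def run_agent(task, planner, tools, max_steps=5):
    """Plan -> act -> observe loop with a hard step limit."""
    history = []
    for _ in range(max_steps):
        action = planner(task, history)
        if "final" in action:
            return action["final"], history
        tool = tools.get(action["tool"])
        if tool is None:
            # Report the problem back to the planner instead of crashing.
            observation = f"error: unknown tool {action['tool']!r}"
        else:
            observation = tool(**action["args"])
        history.append({"action": action, "observation": observation})
    raise RuntimeError("agent exceeded max_steps without a final answer")
```

The `history` list doubles as short-term memory and as a trace for evaluation: its length gives steps to completion, one of the agent metrics named above.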

Which companies actively hire AI Engineers?

At the top: Sber.AI, Yandex, Tinkoff. Russian AI (the largest AI projects in Russia): Sber.AI (GigaChat — the largest LLM project in Russia + Kandinsky text-to-image + Salute Voice), Yandex (Yandex.GPT for Alisa + Yandex.Browser AI + Maps AI), Tinkoff (chatbot + transaction categorisation + voice AI), OZON (recsys AI + semantic search), VK (Mail.ru AI assistants + product AI features), X5 Group (retail AI), MTS (AI assistants + telecom AI), JetBrains (AI Assistant for IDE — major AI Engineer hire). Outsourcers with GenAI Practice: EPAM AI Practice (the largest AI outsource in CIS for US GenAI projects), Luxoft AI, Andersen AI, DataArt AI, Itransition. Frontier-AI labs (international remote premium): Anthropic (Claude — safety-focused frontier — premium tier), OpenAI (GPT — capabilities-focused — premium tier), Cohere (enterprise LLM — Canada / UK), Mistral AI (open-weights frontier — France), Hugging Face (open-source ecosystem), Stability AI (image gen), Black Forest Labs (FLUX models — Germany), Replicate (inference platform), Together AI + Fireworks + Groq (fast inference), Perplexity (AI search), Inflection AI, Adept, xAI (Musk), Tencent AI, Alibaba DAMO. AI Agent specialists (rising 2025-2026 hiring tier): Cognition Labs (Devin — autonomous SWE), Adept, OpenInterpreter, Anthropic (computer use team). Big Tech AI: Google DeepMind (Gemini) / Meta AI (Llama) / Microsoft AI (Copilot family) / Apple ML (Apple Intelligence) / Amazon AGI (recently formed 2024). Y Combinator AI startups — the most active hiring category 2025-2026 (>50% of recent YC batches are AI). Russian AI startups rising: Just AI (chatbots), Vector AI, Sber AI Lab startups.

Where to start in AI Engineering in 2026?

Roadmap:
1) Python deep + Backend basics: async / typing / FastAPI / pytest. Without Backend skill you are not an AI Engineer, just a prompt jockey. Book: "Fluent Python" by Ramalho.
2) LLM fundamentals: transformer architecture basics (attention + tokenisation + decoding strategies + context windows); you don't need to implement them from scratch. Course: Andrej Karpathy's "Neural Networks: Zero to Hero" series on YouTube, including "Let's build GPT: from scratch, in code, spelled out" (free, best for intuition). Book: "Hands-On Large Language Models" by Jay Alammar / Maarten Grootendorst (2024, must-read).
3) OpenAI API mastery: chat completions + function calling + structured outputs + streaming + assistants. Build a minimum of 3 real apps. Free credits + cheap experimentation.
4) Anthropic Claude API mastery: best for coding + long-context use cases. Tool use + prompt caching + computer use (if the AI Agent track interests you).
5) LangChain mastery: the largest job-market requirement. LCEL (LangChain Expression Language), chains, agents, retrievers, callbacks. Course: "LangChain & Vector Databases in Production" by Activeloop (free).
6) RAG architecture in depth: document loaders + chunking strategies (semantic / fixed / recursive) + embeddings choice + vector DB selection + retrieval algorithms + reranking. Build a production RAG app with advanced patterns (hybrid search + reranking + HyDE).
7) Vector DB mastery: at least one in depth (Qdrant is best for self-hosted learning) + one managed (Pinecone). Understand the basics of the HNSW / IVF / PQ algorithms.
8) Fine-tuning hands-on: LoRA / QLoRA via the PEFT library + Unsloth (faster). Fine-tune Llama 3.x 8B on your own dataset. Hardware: Google Colab Pro+ ($50/mo) or RunPod / Vast.ai (cheap GPU rental).
9) vLLM serving: deploy an open model with vLLM + benchmark throughput + compare with the OpenAI API.
10) Evaluation mastery: RAGAS + DeepEval + a custom evaluation suite. Build a regression test for prompts.
11) LangSmith / Langfuse observability: instrument a production app + analyse traces + identify bottlenecks.
12) Agent frameworks: LangGraph + AutoGen + CrewAI + DSPy basics. Build a multi-agent demo (research agent + writing agent + critic agent).
13) Pet-project portfolio: a production-grade AI feature (e.g. document Q&A for an open dataset) with RAG + evaluation + observability + cost tracking. Document it as a portfolio (GitHub + blog post + demo video).

RU courses: Karpov.Courses "AI Engineer" track, Otus "LLM Engineering", HSE School of AI, School21 (Sber) AI Engineer track.

International (EN): DeepLearning.AI Generative AI Specialization (Andrew Ng, must-start), Hugging Face NLP Course (free), "AI Engineering" by Chip Huyen (2024, must-read), fast.ai Practical Deep Learning Part 2 (advanced), OpenAI / Anthropic / Cohere official documentation (must read in full), Latent Space podcast (best AI Engineering podcast).

Must-read books: "AI Engineering" by Chip Huyen (2024), "Hands-On Large Language Models" by Alammar / Grootendorst, "Building LLMs for Production" by Bouchard / Peters.

Communities: Hugging Face Discord (largest open AI community), LangChain Discord, LlamaIndex Discord, Twitter AI community (follow @karpathy, @swyx, @hwchase17, @jerryjliu0), r/LocalLLaMA (open-source LLM community), Telegram @ai_engineer_ru.

Typical transitions: Backend Senior + LLM interest → AI Engineer Junior in 2-4 months (the fastest entry into the ML world). ML Engineer Senior → AI Engineer Senior in 1-3 months. Data Scientist → AI Engineer Middle in 3-6 months.
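
Of the chunking strategies named in step 6, the fixed-size variant is the simplest to try first. The sketch below splits on whitespace and overlaps neighbouring chunks so that sentences cut at a boundary still appear whole in one of them. The function name and default sizes are assumptions; real pipelines usually count model tokens rather than words.

```python
def chunk_words(text, chunk_size=200, overlap=40):
    """Split text into chunks of up to chunk_size words, each sharing
    `overlap` words with the previous chunk."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks
```

Semantic and recursive chunking replace the fixed word window with boundaries derived from meaning or document structure, but they typically keep the same overlap idea.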

How many AI Engineer jobs are open across CIS and Europe?

Our sample contains 8 active open AI Engineer positions: a young niche that split off in 2023 after the ChatGPT boom. The real market is much broader, because our matcher classifies most AI Engineer roles as general ML Engineer or Backend (titles are often "Senior Backend Engineer (AI focus)" or "ML Engineer - LLM"). We estimate the true pool of AI-focused dev jobs across CIS + Europe at 500-2,000 active positions at any moment in 2026, counting the fuzzy-classified ones. Geography: EN, 🇷🇺 Russia. Sources: hh.ru (Sber.AI / Yandex / Tinkoff are especially active), Habr Career, getmatch, Djinni, LinkedIn (huge international AI Engineer segment: Anthropic / OpenAI / Cohere / Mistral / Hugging Face / Y Combinator AI startups), NoFluffJobs / JustJoin.it (Poland, AI-friendly), Telegram (@ai_engineer_ru, @ml_jobs, @aijobs, @jobsforaiml, @prompt_engineering_ru), career pages of EPAM AI Practice / Luxoft AI / Andersen AI, specialised boards (aijobs.net, ai-jobs.net, builtin.com/jobs/ai, jobs.lever.co for AI startups), Y Combinator Work at a Startup (largest AI-startup channel; 50%+ of recent batches are AI), and the direct career pages of AI labs (Anthropic / OpenAI / Cohere / Mistral / Hugging Face / Stability / Replicate). Time to close a Senior AI Engineer role: 4-10 weeks in Russia, 6-12 weeks internationally (rare-skill premium). Trend 2026: the AI Engineer pool is growing fastest of all IT categories; we see 3-5× year-over-year growth in job postings that mention LLMs.

What skills does a Senior AI Engineer need?

A Senior AI Engineer owns the full LLM-product engineering cycle, cost optimisation, and technical leadership. Python and backend at Senior level: mastery of async, typing, FastAPI, and pytest. LLM API providers: OpenAI, Anthropic, Google Gemini, Mistral, Cohere; know each one's quirks, rate limits, retry strategies, and cost per token. Open LLMs: Llama 3.x, Mistral, Qwen 2.5, DeepSeek V3 and R1, Phi 3, Gemma 2; know the performance / cost trade-offs and when to use which. LangChain: deep LCEL, callbacks, agents, retrievers, custom chain authoring, debugging through LangSmith, or the equivalent in LlamaIndex / Haystack / Semantic Kernel. RAG architecture: advanced patterns (hybrid search combining BM25 and vector retrieval, reranking, HyDE, query rewriting, multi-hop retrieval, self-reflection patterns such as CRAG and Self-RAG), chunking strategies (semantic, recursive, hierarchical), embedding model selection (OpenAI vs Cohere vs open-source, weighing performance against cost). Vector DBs: deep knowledge of one of Pinecone / Qdrant / Weaviate / Milvus / pgvector, covering index tuning, sharding, replication, and hybrid search implementation. Fine-tuning: LoRA / QLoRA via PEFT, Unsloth, and Axolotl; dataset preparation (instruction tuning and DPO data), hyperparameter selection, evaluation methodology. RLHF / DPO / KTO basics: the TRL library for preference optimisation. Serving: advanced vLLM (PagedAttention, continuous batching, speculative decoding, KV cache optimisation), TGI, SGLang. Agent architectures: LangGraph, AutoGen, CrewAI, and DSPy; MCP (Model Context Protocol) implementation; tool use design patterns; multi-agent orchestration. Evaluation: RAGAS, DeepEval, custom evaluation suite design, LLM-as-judge methodology, regression testing for prompts, A/B testing for LLM features. Observability and tracing: LangSmith, Langfuse, Helicone, Arize Phoenix, Datadog LLM Observability.
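Hybrid search, as listed among the RAG patterns above, produces two ranked lists (one from BM25, one from vector similarity) that have to be merged somehow. A minimal sketch, assuming reciprocal rank fusion (RRF) as the merge step; the source does not name a fusion method, and the document IDs, the function name, and the constant k=60 are illustrative assumptions.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def reciprocal_rank_fusion(rankings: Sequence[Sequence[str]], k: int = 60) -> List[str]:
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    so items ranked well by both retrievers rise to the top.
    """
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by score descending; break ties by ID for deterministic output.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


if __name__ == "__main__":
    bm25_hits = ["doc_a", "doc_b", "doc_c"]    # hypothetical keyword results
    vector_hits = ["doc_c", "doc_a", "doc_d"]  # hypothetical embedding results
    print(reciprocal_rank_fusion([bm25_hits, vector_hits]))
    # ['doc_a', 'doc_c', 'doc_b', 'doc_d']
```

RRF works on ranks only, so it avoids having to normalise BM25 scores against cosine similarities; a reranker can then be applied to the fused top results.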
Cost optimisation: prompt caching strategies (native and custom Redis), prompt compression (LLMLingua), model routing (small models for easy requests, large models for complex ones, via OpenRouter / Martian), batch APIs for non-real-time workloads. Typical Senior mandate: a 50-70% reduction in LLM API costs. Safety / guardrails: NeMo Guardrails, Llama Guard, OpenAI Moderation API integration, prompt injection defence, output filtering. System design for LLM products: design a RAG-based product on a whiteboard at a scale of 100M+ queries/day, with multi-region considerations and latency budgets (target P95 < 2s for chat). Soft skills: writing ADRs for LLM architecture decisions, technical writing (LLM feature design docs), cross-team collaboration (Product / Backend / DS / Research teams), mentoring Middle AI Engineers, paper-reading discipline (NeurIPS / ICLR / EMNLP are a must-follow). English is a MUST for Senior+: the AI community, docs, papers, and conferences are entirely in English. Optional bonus: open-source contributions to LangChain / LlamaIndex / vLLM / Hugging Face sharply increase market value when frontier-AI labs (Anthropic / OpenAI / Cohere / Mistral) are hiring. A Twitter AI presence (regular posting about LLM experiments) is important for getting hired through community connections.
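The model-routing idea above (small model for easy requests, large for complex ones) can be reasoned about with simple per-token arithmetic. The sketch below is illustrative only: the model names, the prices, and the keyword-and-length routing heuristic are all hypothetical assumptions, not real provider pricing or a production router.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelPrice:
    name: str
    usd_per_1m_input: float   # price per 1M input tokens
    usd_per_1m_output: float  # price per 1M output tokens


# Hypothetical prices, chosen only to make the arithmetic easy to follow.
SMALL = ModelPrice("small-model", 0.50, 1.50)
LARGE = ModelPrice("large-model", 5.00, 15.00)


def request_cost(model: ModelPrice, input_tokens: int, output_tokens: int) -> float:
    """Cost of one request in USD."""
    return (
        input_tokens * model.usd_per_1m_input
        + output_tokens * model.usd_per_1m_output
    ) / 1_000_000


def route(prompt: str, max_easy_chars: int = 500) -> ModelPrice:
    """Naive heuristic router (assumption): long or 'hard-looking' prompts go to LARGE."""
    hard_markers = ("step by step", "prove", "refactor")
    lowered = prompt.lower()
    is_hard = len(prompt) > max_easy_chars or any(m in lowered for m in hard_markers)
    return LARGE if is_hard else SMALL


def savings_share(easy_share: float, input_tokens: int, output_tokens: int) -> float:
    """Fraction of spend saved versus sending every request to LARGE."""
    baseline = request_cost(LARGE, input_tokens, output_tokens)
    routed = (
        easy_share * request_cost(SMALL, input_tokens, output_tokens)
        + (1 - easy_share) * baseline
    )
    return 1 - routed / baseline


if __name__ == "__main__":
    print(route("What is the capital of France?").name)
    print(f"{savings_share(0.7, 1000, 500):.0%}")
```

Under these made-up prices (small model 10× cheaper), routing 70% of traffic to the small model saves about 63% of spend, which shows how a reduction in the 50-70% range can come from routing alone; real savings depend on actual prices and on how much quality the small model gives up.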

Similar specializations

Data Engineer | Backend | Analyst / BI

Methodology

  • Data period: the headline figures and body text cover the last 3 months. The charts cover the full available observation period (since the parsers were launched, usually 2-3 months).
  • Data is collected automatically from 1000+ sources — Telegram channels and job boards across CIS and Europe.
  • Only live open jobs with a clear description are counted. Spam and duplicates are filtered out.
  • Salaries are converted to USD/month at the current rate. Outlier values (below 500 or above 50K) are filtered out.
  • Levels are normalized: Mid → Middle, Intern/Trainee → Junior, Principal/Staff/Expert → Lead.
  • The first 2 weeks of data (parser ramp-up period) are not shown in the charts.
  • Data is recomputed every day.
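The level-normalisation and salary-filter rules in the list above can be sketched in a few lines. This is an illustration under stated assumptions, not Zorky's actual pipeline: the function names and the exchange rate passed in are hypothetical, and the outlier bounds are read as 500 and 50K USD/month.

```python
from typing import Optional

# Mapping taken from the methodology list: Mid -> Middle,
# Intern/Trainee -> Junior, Principal/Staff/Expert -> Lead.
LEVEL_ALIASES = {
    "mid": "Middle",
    "intern": "Junior",
    "trainee": "Junior",
    "principal": "Lead",
    "staff": "Lead",
    "expert": "Lead",
}

# Assumed outlier bounds in USD/month.
MIN_USD_MONTH = 500
MAX_USD_MONTH = 50_000


def normalize_level(raw: str) -> str:
    """Map a raw level label to its normalised form; unknown labels pass through."""
    cleaned = raw.strip()
    return LEVEL_ALIASES.get(cleaned.lower(), cleaned)


def to_usd_month(amount: float, rate_to_usd: float) -> Optional[float]:
    """Convert a monthly salary to USD and drop it (None) if it is an outlier.

    rate_to_usd is a caller-supplied exchange rate (hypothetical here).
    """
    usd = amount * rate_to_usd
    if usd < MIN_USD_MONTH or usd > MAX_USD_MONTH:
        return None
    return usd


if __name__ == "__main__":
    print(normalize_level("Staff"))        # Lead
    print(to_usd_month(300_000, 0.0125))   # hypothetical rate; 3750.0
    print(to_usd_month(100, 1.0))          # None (below the lower bound)
```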

Authorship and citation

Analytics prepared by Zorky Research Team. Last updated: May 29, 2026 at 5:40 PM.

Data sources and methodology

Data is collected automatically from 1000+ sources — Telegram job channels and job boards across CIS and Eastern Europe (HH, Habr Career, Djinni, DOU, NoFluffJobs, JustJoin.it, Pracuj.pl and others). Parsing runs 24/7, duplicates are filtered by description and URL, salary outliers are stripped. Detailed methodology — on the "How it works" page.
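Filtering duplicates "by description and URL" can be pictured as keeping a set of seen URLs and a set of seen description fingerprints. The sketch below is an assumption about how such a filter might look; the field names, the whitespace and case normalisation, and the trailing-slash rule are hypothetical and not taken from the actual parser.

```python
import hashlib
import re
from typing import Dict, Iterable, List


def description_key(text: str) -> str:
    """Fingerprint a description after collapsing whitespace and lowercasing."""
    normalized = re.sub(r"\s+", " ", text).strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def dedupe(postings: Iterable[Dict[str, str]]) -> List[Dict[str, str]]:
    """Keep the first posting for each URL and each description fingerprint."""
    seen_urls = set()
    seen_descriptions = set()
    unique: List[Dict[str, str]] = []
    for posting in postings:
        url = posting["url"].strip().rstrip("/")
        key = description_key(posting["description"])
        if url in seen_urls or key in seen_descriptions:
            continue  # duplicate by URL or by description
        seen_urls.add(url)
        seen_descriptions.add(key)
        unique.append(posting)
    return unique


if __name__ == "__main__":
    sample = [
        {"url": "https://example.com/job/1", "description": "Senior AI Engineer, RAG"},
        {"url": "https://example.com/job/1/", "description": "Different text"},
        {"url": "https://example.org/x", "description": "senior  AI engineer,\nRAG"},
    ]
    print(len(dedupe(sample)))  # 1
```

Exact fingerprints only catch reposts that differ in whitespace or case; near-duplicate detection (for example shingling or embedding similarity) would be needed for lightly edited copies.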

Cite this page:
Zorky CRM (2026). AI / LLM Engineer in IT: CIS and Europe market. Accessed: 5/29/2026. URL: https://zorky.tech/en/research/ml