Computer Vision in IT: the CIS and European market
Computer Vision (CV) Engineer is a sister discipline to NLP within the ML family, focused on processing images and video rather than text. It is a mature field (dating from the 1960s) that was rebuilt by deep learning in 2012 (AlexNet), by transformers in 2020-2021 (ViT), by foundation models in 2023-2024 (DINOv2 / SAM / CLIP), and by the generative boom of 2022-2026 (Stable Diffusion / FLUX). Core tasks: image classification, object detection (YOLO + Detectron2 + DETR), semantic / instance / panoptic segmentation (SAM + Mask R-CNN), OCR (PaddleOCR + EasyOCR + legacy Tesseract), face recognition + biometrics, video understanding (action recognition + tracking with ByteTrack / DeepSORT), multimodal (CLIP + GPT-4V + Claude vision + Gemini Vision), 3D vision (NeRF + Gaussian Splatting + structure-from-motion), generative image / video (Stable Diffusion / FLUX / Sora family), edge deployment (TensorRT + ONNX + CoreML + OpenVINO). Role family: Computer Vision Engineer (general: production CV pipelines), Senior CV Engineer (multi-task CV ownership + custom architectures), ML Research CV (academic track: papers at CVPR / ICCV / ECCV, overlaps with Research Engineer / Scientist), 3D Vision Engineer (NeRF / Gaussian Splatting / point clouds, rising since 2024), Generative AI Engineer (Image / Video) (specialization in Stable Diffusion / FLUX / Runway / Pika / Kling, overlaps with ai-engineer), Edge CV Engineer (TensorRT + mobile deployment specialization), Robotics Vision Engineer (SLAM + perception for robots / autonomous systems). Stack 2026: Python (dominant; some edge components are written in C++/Rust), PyTorch + torchvision (the foundation: 90%+ of CV research runs on PyTorch in 2026), OpenCV (classical CV, still heavily used for preprocessing + traditional algorithms), Pillow (image manipulation).
Object detection / segmentation frameworks: Ultralytics YOLO (YOLOv8 / YOLOv9 / YOLOv10 / YOLOv11, the industry standard for detection in 2026), Detectron2 (Meta, research-grade detection + segmentation), MMDetection + MMSegmentation + MMPose (OpenMMLab, Chinese research, huge ecosystem), DETR family (transformer-based detection; RT-DETR + DINO-DETR rising), SAM + SAM 2 (Segment Anything Model, Meta 2023/2024, universal segmentation; SAM 2 adds video). Self-supervised + foundation models: DINOv2 (Meta, best vision backbone 2024), CLIP + OpenCLIP + SigLIP (vision-language). Classification backbones: ViT (Vision Transformer) family, Swin Transformer, ConvNeXt (modern CNN), EfficientNet (legacy but still widely used), timm (PyTorch Image Models by Ross Wightman, a must-have library with 1000+ pretrained models). Data augmentation: albumentations (industry standard, fast + comprehensive), torchvision.transforms.v2 (modern), RandAugment + AugMix (research augmentation recipes). Multimodal LLM: GPT-4V (OpenAI), Claude 3.5 Sonnet vision (Anthropic), Gemini 1.5 Pro / 2.0 (Google), LLaVA + Qwen 2 VL + InternVL (open-source vision-language). OCR: PaddleOCR (Baidu, multilingual leader 2026), EasyOCR, Tesseract (legacy, still widely used), Docling (IBM, modern document understanding since 2024), Surya (multilingual OCR). Face recognition: InsightFace (open-source SOTA 2026: ArcFace + RetinaFace), DeepFace (lightweight library), face_recognition (dlib wrapper). Russian face recognition: NtechLab FaceNGN / VisionLabs Luna / Tevian. Video understanding + tracking: ByteTrack (fastest tracker 2026), DeepSORT (classic), BoT-SORT, StrongSORT. Video models: VideoMAE, X-CLIP, InternVideo. 3D vision: NeRF (Neural Radiance Fields, original 2020), Instant-NGP (NVIDIA, 100× faster), Gaussian Splatting (3DGS) (rising 2023-2026, photorealistic rendering), Nerfstudio (unified framework), Open3D (point cloud processing), PyTorch3D (Meta, 3D research).
Generative image / video: Stable Diffusion family (SDXL + SD3 + SD3.5, Stability AI), FLUX.1 [dev/schnell/pro] (Black Forest Labs, quality leader among open models since 2024), DALL-E 3 (OpenAI), Imagen 3 (Google), Midjourney v6 (commercial), Ideogram (text-in-image leader). Generative video: Runway Gen-3, Pika 1.5, Kling AI (Kuaishou), Sora (OpenAI), AnimateDiff + SVD (Stable Video Diffusion) (open-source). Russian generative: Sber Kandinsky 3. UIs: ComfyUI (node-based, a research favorite), Automatic1111 (legacy web UI). Fine-tuning generative models: LoRA + DreamBooth for Stable Diffusion / FLUX. Edge deployment: ONNX (cross-framework), TensorRT (NVIDIA, fastest inference on NVIDIA hardware), OpenVINO (Intel CPUs / iGPUs), CoreML (Apple Silicon), TFLite (mobile), MediaPipe (Google, real-time on-device pipelines), NVIDIA DeepStream (production video analytics). Inference serving: Triton Inference Server (NVIDIA, multi-framework + dynamic batching), BentoML, TorchServe. Annotation: CVAT (OpenCV, industry standard for detection / segmentation), Label Studio, V7 Darwin (commercial), Roboflow (popular for quick prototyping). Datasets: COCO + ImageNet + OpenImages + LAION-5B + Objects365 + Visual Genome. According to Zorky CRM, there are 33 active vacancies with an explicit CV focus (the real pool is much wider, since many CV roles are classified as general ML Engineer / Backend / Robotics). Median: $6930/month. Top stack: visio, python, data engineer, data processing, go. 69.2% are remote. Senior CV Engineer: $6000-10000/month; at autonomous vehicle companies (Tesla / Waymo / Cruise / Wayve / Pony.ai / Yandex self-driving) a premium of $9000-15000+; in generative AI (Stability / Black Forest Labs / Runway / Pika) $9000-16000+.
Comparison with other specializations
The AI / ML / DS area contains 6 specializations. The current one (Computer Vision) is highlighted in blue; compare it with its neighbors by number of open vacancies and by median salary.
Demand dynamics
CV is a mature ML field (dating from the 1960s), rebuilt by deep learning in 2012, by transformers in 2020, by foundation models in 2023-2024 (DINOv2 / SAM 2), and by the generative boom of 2022-2026 (Stable Diffusion / FLUX / Sora). The pool in our sample is small; the real market is wider because of overlap with general ML / Robotics. Drivers in 2026: autonomous vehicles (Tesla / Waymo / Wayve / Yandex self-driving, the largest single category of CV employers), generative image/video adoption (Stability / Black Forest Labs / Runway / Pika), spatial computing (Apple Vision Pro mass-market launch in 2024), retail visual search (Wildberries / Ozon, large deployments in Russia), medical imaging (SberMedAI + Western AI medical startups), defense + drones (Anduril / Skydio), robotics (Boston Dynamics). Major Russian CV players: Yandex self-driving + Sber.AI Kandinsky + NtechLab / VisionLabs (global face recognition leaders).
How many new vacancies appear each week.
Distribution by level over time
How the share of Junior/Middle/Senior/Lead in open vacancies changes week by week. A shift toward Senior usually signals a "mature" market for the specialization, where companies look for ready-made specialists; the opposite, growth in Junior roles, signals expansion and teams being staffed from scratch.
Share of each level as a % of all vacancies with a stated grade for the week.
Salary by level
Junior, typical entry: MS / PhD in CV / Robotics + portfolio (Kaggle CV competitions / GitHub CV projects). Career flow: Backend Senior / ML Middle (2-3 years) + CV interest + portfolio → CV Engineer Junior (1-2 years) → Middle (2-3 years) → Senior → then one of: 3D Vision Engineer (NeRF / Gaussian Splatting), Edge CV Engineer (TensorRT / CoreML mastery), Generative AI Engineer (image/video), ML Research CV (academic track, papers at CVPR / ICCV / ECCV), or Autonomous Vehicle Perception Engineer. The figures are based on a small sample; for broader benchmarks see the ml-engineer / research pages.
Median salary (USD/month) at each grade + growth over the previous grade.
The biggest pay jump is between Senior and Lead (+45.6%).
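The grade-to-grade growth mentioned above is a plain percentage change between adjacent medians. A minimal sketch of that arithmetic follows; the function name `grade_growth` and the medians in `example` are hypothetical placeholders, not figures from the dataset.

```python
def grade_growth(medians):
    """Return (grade, median, growth_pct) rows; growth is None for the first grade."""
    rows = []
    prev = None
    for grade, median in medians:
        # Percentage change relative to the previous grade's median.
        growth = None if prev is None else round((median - prev) / prev * 100, 1)
        rows.append((grade, median, growth))
        prev = median
    return rows


# Hypothetical medians in USD/month, used only to illustrate the formula.
example = [("Junior", 2500), ("Middle", 4000), ("Senior", 6000), ("Lead", 8000)]

rows = grade_growth(example)
for grade, median, growth in rows:
    label = "n/a" if growth is None else f"{growth:+.1f}%"
    print(f"{grade:<7} ${median:>5} {label}")
```

With real medians from the chart, the largest value in the third column identifies the biggest jump.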
Salary distribution over time
The median CV Engineer salary is $6930/month, a premium segment driven by a rare skill combination. The distribution is based on a small sample. $8K+: Senior with production detection / segmentation + edge deployment expertise. $10K+: Senior at autonomous vehicle companies or in Generative AI (Stability / Black Forest Labs / Runway). $15K+: Senior+ in Big Tech CV (Apple Vision Pro / NVIDIA / Google DeepMind / Meta Reality Labs) or Foundation Model CV teams. $25K+: top outliers (DINOv2 / SAM 2 / Sora teams).
What share of vacancies falls into each salary band, week by week.
62% of vacancies fall in the $5-8K range (this is the core market). The high segment, $8K+, accounts for 24%; these are usually US-remote or senior international roles.
Hiring geography
The leader by number of CV Engineer vacancies is 🇵🇱 Poland (18 positions). Russia: Yandex self-driving + Yandex Market visual search + Sber.AI Kandinsky + SberMedAI + VK + Wildberries + Ozon + NtechLab + VisionLabs + Cognitive Pilot + EPAM AI CV Practice dominate. Poland: a CV-friendly EU hub. Germany: Berlin AI + Munich automotive (BMW / Mercedes AI / Bosch / Continental autonomous teams). UK: London (Wayve autonomous + Tractable). USA: Bay Area + Pittsburgh (autonomous vehicle clusters) + Boston (robotics, MIT region). There is also a huge international remote market through autonomous vehicle companies (Tesla / Waymo / Cruise / Wayve / Pony.ai / Zoox / Aurora) + generative AI (Stability / Black Forest Labs / Runway / Pika / Kling) + Big Tech CV + Y Combinator CV startups.
Distribution of vacancies by country.
These numbers reflect the distribution across the sources we parse. Poland often looks dominant because of dense coverage of NoFluffJobs / JustJoin.it / Pracuj. The Polish IT market is indeed large, but in our sample its share is overstated relative to the real volume of all IT vacancies in the region. The same applies to the other top countries: this is "where our parsers look", not "the true size of the market".
Remote / Hybrid / Office over time
69.2% of CV Engineer vacancies are remote or hybrid. Fully cloud-based CV work is the standard. Outsourcers are almost always remote. Russian banks are hybrid/office. Autonomous vehicle companies: research roles are often remote, while deployment requires on-site presence (Pittsburgh / Mountain View / Palo Alto). Generative AI companies: full remote is the standard. Big Tech CV: hybrid is the standard.
How the share of each work format changes week by week.
89% are remote. The specialization is well suited to the remote format.
Most in-demand technologies
Top CV Engineer stack 2026:
Languages and core libraries: deep Python + basic C++ for edge, PyTorch + torchvision mastery (90%+ of research), OpenCV (classical CV), Pillow (image manipulation).
Object detection / segmentation: Ultralytics YOLO (industry standard in production) + Detectron2 (Meta, research-grade) + MMDetection / MMSegmentation / MMPose (OpenMMLab) + DETR family (RT-DETR rising) + SAM + SAM 2 (universal segmentation).
Self-supervised + foundation models: DINOv2 (Meta, best vision backbone 2024) + CLIP + OpenCLIP + SigLIP (vision-language).
Classification backbones: ViT + Swin Transformer + ConvNeXt + EfficientNet + timm (1000+ pretrained models).
Augmentation: albumentations (industry standard) + torchvision.transforms.v2 + RandAugment / Mixup / CutMix.
Multimodal LLM: GPT-4V + Claude 3.5 vision + Gemini 1.5/2.0 Pro + LLaVA + Qwen 2 VL + Molmo (best open multimodal 2024).
OCR: PaddleOCR (Baidu, multilingual leader) + EasyOCR + Tesseract + Docling (IBM 2024) + Surya + TrOCR.
Face recognition: InsightFace (open-source SOTA) + DeepFace + face_recognition (dlib). Russian: NtechLab FaceNGN + VisionLabs Luna + Tevian + RecFaces.
Video tracking: ByteTrack (fastest) + DeepSORT + BoT-SORT + StrongSORT + MOTRv2. Video models: VideoMAE + X-CLIP + InternVideo.
3D vision: NeRF + Instant-NGP (NVIDIA) + Gaussian Splatting (rising 2023-2026) + Nerfstudio + Open3D + PyTorch3D (Meta) + Kaolin (NVIDIA) + COLMAP (SfM) + Mitsuba 3.
Generative: Stable Diffusion SDXL/SD3/SD3.5 + FLUX.1 (Black Forest Labs) + DALL-E 3 + Imagen 3 + Midjourney v6 + Ideogram. Video: Runway Gen-3 + Pika 1.5 + Kling + Sora + AnimateDiff + SVD. Russian: Sber Kandinsky 3.
Generative UIs: ComfyUI (a research favorite) + Automatic1111.
Edge deployment: ONNX + TensorRT (NVIDIA, fastest) + OpenVINO (Intel) + CoreML (Apple Silicon) + TFLite (mobile) + MediaPipe (Google, on-device) + NVIDIA DeepStream (multi-camera production).
Inference serving: Triton Inference Server + BentoML + TorchServe.
Annotation: CVAT (OpenCV, industry standard) + Label Studio + V7 Darwin + Roboflow.
Datasets: COCO + ImageNet + OpenImages + LAION-5B + Objects365 + Visual Genome + ADE20K + Cityscapes.
Technology combinations
Frequent combinations: Python + PyTorch + torchvision + OpenCV (classical CV foundation), Ultralytics YOLO + albumentations + CVAT (production detection pipeline), Detectron2 + MMDetection + timm (flexible research stack), SAM 2 + ByteTrack + DeepSORT (video understanding + tracking), DINOv2 + CLIP + multimodal LLM (foundation model + vision-language stack), Stable Diffusion + FLUX + ComfyUI + LoRA / DreamBooth (generative image stack), NeRF + Gaussian Splatting + Nerfstudio + COLMAP + Open3D (3D vision stack), PyTorch + ONNX + TensorRT + CoreML (edge deployment stack), NVIDIA DeepStream + Triton + Kubernetes (production video analytics). Learning roadmap: math foundations → Python + ML basics → classical CV with OpenCV → Stanford CS231n → PyTorch + torchvision + timm → Ultralytics YOLO + Detectron2 → SAM + DINOv2 → multimodal LLM → generative track (SD/FLUX/ComfyUI) → 3D vision (NeRF/3DGS) → hands-on edge deployment (TensorRT/CoreML) → annotation tooling (CVAT) → pet-project portfolio (4 demos).
Which pairs of technologies most often appear together in a single vacancy.
Where we see these vacancies
Computer Vision Engineer vacancies: hh.ru (Yandex self-driving / Sber.AI / NtechLab / VisionLabs are especially active), Habr Career, getmatch, Djinni, LinkedIn (a huge international CV segment through autonomous vehicles + generative AI + Big Tech), NoFluffJobs / JustJoin.it (CV-friendly Poland), Telegram (@cv_ru, @ml_jobs, @aijobs, @jobsforaiml, @robotics_ru), career sites of EPAM AI CV / Luxoft CV / Andersen / DataArt, specialized boards aijobs.net + ai-jobs.net + builtin.com/jobs/ai + roboticscareer.org, direct careers pages of autonomous vehicle companies (Tesla / Waymo / Cruise / Wayve / Pony.ai / Zoox / Aurora / Mobileye), direct careers pages of generative AI companies (Stability AI / Black Forest Labs / Runway / Pika / Ideogram / Midjourney), CVPR / ICCV / ECCV conference job boards, Y Combinator Work at a Startup.
Computer Vision vs other areas
Computer Vision Engineer overlaps with ML Engineer (production ML overlap of ~60%), AI Engineer (multimodal LLM overlap), NLP Engineer (vision-language models), Research Engineer (CVPR / ICCV / ECCV papers track), Robotics Engineer (SLAM + perception), Edge ML / MLOps Engineer (deployment overlap). The comparison with ml-engineer / ai-engineer / nlp / data-scientist / research / mlops is in the specialization comparison chart above.
Volume of open vacancies by IT area.
Latest vacancies
The latest open Computer Vision Engineer vacancies are the most recent positions in the sample (a narrow pool of explicit CV roles; the real market is wider because of overlap with ml-engineer / robotics). The full list is in our CRM or behind the "view all" link below. For a broader view, see the ml-engineer + ai-engineer pages.
What we can offer
If you work with Computer Vision vacancies or are in this role yourself, we can take on a specific task. Choose a format and leave your contact details; we reply within 24 hours.
Frequently asked questions
The most frequent questions about Computer Vision Engineers: salaries (a premium segment for rare skills), CV vs ML vs AI Engineer (3-way comparison + 5 distinctions), object detection stack 2026 (YOLO vs Detectron2 vs MMDetection vs DETR vs SAM decision tree), 3D Vision Engineer (a sub-specialization rising since 2024), remote work, how to get into the role (6-12 months from Backend / ML Middle + portfolio), Senior skills (deep PyTorch + OpenCV mastery + detection / segmentation frameworks + edge deployment + one domain). Answers are recalculated automatically.
How much does a Computer Vision Engineer earn in 2026?
The CV Engineer median is $6930/month according to Zorky CRM (33 active vacancies with an explicit CV focus; the real pool is wider because of overlap with general ML / Robotics). CV Engineer is a premium segment thanks to a rare skill combination (deep PyTorch + image processing intuition + production deployment expertise). Senior with a production detection / segmentation pipeline + edge deployment (TensorRT / CoreML): $7000-10500. Senior in US/EU outsourcing (EPAM AI CV / Luxoft CV on enterprise CV projects): $7500-11500. Staff / Principal CV Engineer: $10000-15000. Autonomous vehicle companies (Tesla Autopilot / Waymo / Cruise / Wayve / Pony.ai / Zoox / Yandex self-driving): a premium of $9000-15000+ for Senior, $15000-25000+ for Staff/Principal. Generative AI image/video (Stability AI / Black Forest Labs / Runway / Pika / Kling / Stable Diffusion ecosystem): $9000-16000+ for Senior. Big Tech CV (Google DeepMind / Meta AI / Microsoft / Apple Vision Pro team / NVIDIA): $14000-25000+ for Senior + RSU. Top outliers: Foundation Model CV teams (DINOv2 / SAM 2 / Sora authors) at $25000-50000+. Premiums: 3D vision (NeRF / Gaussian Splatting) +15-25%, edge deployment mastery (TensorRT + CoreML + ONNX) +10-20%, generative image/video specialization +20-30%.
What is the salary of a Junior, Middle, Senior, and Lead CV Engineer?
The figures are based on a small sample; for broader benchmarks see ML Engineer and Research Engineer / Scientist. Junior, typical entry: MS / PhD in CV / Robotics + portfolio (Kaggle CV competitions, GitHub CV projects). The Junior → Middle jump comes after the first production CV deployment (a shipped detection / segmentation / classification feature). Middle → Senior: ownership of a multi-task CV pipeline + edge deployment expertise or a generative AI specialization. Senior → Staff / Principal: org-wide CV strategy + custom architecture design + a research publication track. Career flow: Backend Senior / ML Engineer Middle (2-3 years) + CV interest + portfolio → CV Engineer Junior (1-2 years) → Middle (2-3 years) → Senior → then one of: 3D Vision Engineer, Edge CV Engineer, Generative AI Engineer (image/video), or ML Research CV (academic track, papers at CVPR / ICCV / ECCV).
How much are CV Engineers paid in Moscow, St. Petersburg, and remotely?
Moscow, Senior CV Engineer: $6500-10000/month (Yandex: the largest CV employer in Russia, with self-driving cars + Yandex Market visual search + Alice AR; Sber.AI: Kandinsky 3 generative + face recognition for banking + SberMedAI medical CV; VK: face recognition in VK video calls + content moderation; Wildberries: visual search for clothing / goods; Ozon: visual search; NtechLab: global face recognition leader; VisionLabs: Luna platform; Tevian: face / liveness; Cognitive Pilot: autonomous combine harvesters / transport; SberMedAI: medical imaging; Kaspersky Lab: image-based malware; X5 Group: smart shelf / retail analytics). St. Petersburg: $6000-9500 (JetBrains AI). Minsk/Kyiv: $5500-9000 Senior. Poland: €7000-11000 gross Senior. Germany: €80-120K/year Senior (Berlin AI cluster + automotive: Bosch / Continental / BMW autonomous teams). UK: £75-130K Senior (London, Wayve autonomous). 69.2% are remote. Outsourcers (EPAM AI CV Practice / Luxoft CV / Andersen AI / DataArt CV) are almost always remote, at $7500-11500 for Senior on US CV projects. Autonomous vehicle companies: Tesla Autopilot ($12K-22K Senior) / Waymo / Cruise / Wayve / Pony.ai / Zoox / Aurora Innovation / Mobileye / Argo AI (legacy) / Yandex SDG (Self-Driving Group). Generative AI: Stability AI / Black Forest Labs (FLUX, German team) / Runway / Pika / Kling AI / Ideogram / Midjourney at $9000-16000+ for Senior. Robotics: Boston Dynamics / Skydio / Anduril Industries (defense + autonomy) / Niantic AR. Big Tech CV (Apple Vision Pro team / NVIDIA / Google DeepMind / Meta AI Reality Labs / Microsoft HoloLens): $14000-25000+ for Senior.
Which stack is most often required of a CV Engineer?
Top 5: visio, python, data engineer, data processing, go.
Languages and core libraries: deep Python (dominant; edge components are sometimes C++/Rust). PyTorch + torchvision mastery: 90%+ of CV research runs on PyTorch in 2026 (TensorFlow is legacy). OpenCV mastery: classical CV ops (cv2.findContours / cv2.HoughLines / Canny edge / morphology / homography / camera calibration / stereo matching, a must for preprocessing + traditional algorithms). Pillow (PIL, image manipulation).
Object detection / segmentation frameworks: Ultralytics YOLO (YOLOv8 / v9 / v10 / v11, industry standard 2026: easy training + good docs + Python-friendly), Detectron2 (Meta, research-grade detection + segmentation, more flexible), MMDetection / MMSegmentation / MMPose (OpenMMLab, huge ecosystem, heavy on Chinese research), DETR family (transformer-based; RT-DETR rising in 2026), SAM + SAM 2 (Segment Anything Model + video, Meta, universal segmentation).
Self-supervised + foundation models: DINOv2 (Meta, best vision backbone 2024 for feature extraction), CLIP + OpenCLIP + SigLIP (vision-language alignment), MAE (Masked Autoencoder).
Classification backbones: ViT (Vision Transformer) family (ViT-B/L/H), Swin Transformer, ConvNeXt (modern CNN), EfficientNet + MobileNet (efficient), timm (PyTorch Image Models by Ross Wightman, a must-have library with 1000+ pretrained models).
Data augmentation: albumentations (industry standard, fast + comprehensive, 30+ augmentation types), torchvision.transforms.v2 (modern, PyTorch-native), RandAugment + AugMix + Mixup + CutMix (research recipes).
Multimodal LLM: GPT-4V + Claude 3.5 Sonnet vision + Gemini 1.5/2.0 Pro vision + open-source LLaVA + Qwen 2 VL + InternVL + Idefics2 + Molmo (best open multimodal 2024).
OCR: PaddleOCR (Baidu, multilingual leader 2026, 80+ languages), EasyOCR, Tesseract (legacy, still widely used), Docling (IBM 2024, document understanding), Surya (modern multilingual), TrOCR (Microsoft, transformer-based).
Face recognition: InsightFace (open-source SOTA 2026: ArcFace + RetinaFace + Buffalo models), DeepFace, face_recognition (dlib wrapper). Russian: NtechLab FaceNGN / VisionLabs Luna / Tevian / RecFaces.
Video understanding + tracking: ByteTrack (fastest tracker 2026), DeepSORT, BoT-SORT, StrongSORT, MOTRv2. Video models: VideoMAE, X-CLIP, InternVideo, MViT.
3D vision: NeRF (original 2020) + Instant-NGP (NVIDIA, 100× faster) + Gaussian Splatting (3DGS) (rising 2023-2026, photorealistic rendering, replacing NeRF), Nerfstudio (unified framework), Open3D (point cloud processing), PyTorch3D (Meta), Kaolin (NVIDIA), COLMAP (structure-from-motion), Mitsuba 3 (differentiable rendering).
Generative image / video: Stable Diffusion family (SDXL + SD3 + SD3.5, Stability AI), FLUX.1 (Black Forest Labs, quality leader among open models since 2024), DALL-E 3 / Imagen 3 / Midjourney v6 / Ideogram (text-in-image). Video: Runway Gen-3 / Pika 1.5 / Kling AI / Sora / AnimateDiff / SVD. Generative UIs: ComfyUI (node-based, a research favorite), Automatic1111. Fine-tuning generative models: LoRA + DreamBooth for Stable Diffusion / FLUX.
Edge deployment: ONNX (cross-framework), TensorRT (NVIDIA, fastest), OpenVINO (Intel), CoreML (Apple Silicon), TFLite (mobile), MediaPipe (Google, on-device real-time pipelines), NVIDIA DeepStream (production video analytics, multi-camera).
Inference serving: Triton Inference Server + BentoML + TorchServe.
Annotation: CVAT (OpenCV, industry standard for detection / segmentation), Label Studio + V7 Darwin (commercial) + Roboflow (rapid prototyping).
Datasets: COCO (detection / segmentation benchmark) + ImageNet (classification) + OpenImages + LAION-5B (web-scale generative training) + Objects365 + Visual Genome + ADE20K (segmentation) + Cityscapes (autonomous driving).
Hardware-aware optimization: quantization (INT8 / FP16), pruning, distillation, structured sparsity (for NVIDIA Ampere+ tensor cores).
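To make the INT8 quantization item concrete, here is a minimal pure-Python sketch of symmetric per-tensor quantization. It only illustrates the idea; production toolchains such as TensorRT or ONNX Runtime use their own calibration procedures, and the names `quantize_int8`, `dequantize`, and the sample `weights` are hypothetical.

```python
def quantize_int8(values):
    """Symmetric per-tensor quantization of floats to integers in [-127, 127]."""
    max_abs = max(abs(v) for v in values)
    # One scale for the whole tensor; fall back to 1.0 for an all-zero tensor.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Map the integers back to approximate float values."""
    return [q * scale for q in quantized]


# Hypothetical weights, chosen only to illustrate the rounding error.
weights = [0.5, -1.27, 0.003, 1.0]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
max_error = max(abs(a - b) for a, b in zip(weights, restored))

print("int8 values:", q)
print("scale:", scale)
print("max abs error:", max_error)
```

The worst-case error per value is about half the scale, which is why small weights such as 0.003 can collapse to zero when one large weight dominates the range.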
Computer Vision Engineer vs ML Engineer vs AI Engineer: what's the difference?
All three roles overlap in 2026, but their focus areas differ. ML Engineer: a generalist who owns the production ML stack (recsys / fraud / ranking / classical ML + LLM). They can work with CV data but lack deep specialization. See ML Engineer. AI Engineer / LLM Engineer: focus on LLM integration (text-focused). See AI / LLM Engineer. NLP Engineer: focus on text + speech. See NLP Engineer. Computer Vision Engineer (this page): focus on image / video / 3D data. Stack overlap with ML Engineer is ~60% (PyTorch + cloud ML + deployment), plus ~30% unique (OpenCV + Ultralytics + Detectron2 + SAM + albumentations + edge deployment specifics such as TensorRT / CoreML).
Distinctions:
1) Image processing intuition: a CV Engineer understands camera models / lens distortion / color spaces / histogram analysis / morphological ops, the classical CV foundation. An AI Engineer usually has none of this background.
2) Computational constraints: CV models are heavy (gigabytes), inference latency is critical (autonomous driving has a 30+ FPS mandate), and edge deployment (TensorRT / CoreML / OpenVINO mastery) is exclusive CV territory.
3) Generative image / video specialization: Stable Diffusion / FLUX / ComfyUI workflow expertise is specifically a CV niche (it overlaps with AI Engineer for text-driven generation).
4) 3D vision skills: NeRF / Gaussian Splatting / point clouds / SLAM / camera calibration are exclusive to CV.
5) Domain expertise: CV roles are often tied to a specific industry (autonomous vehicles / medical imaging / robotics / AR/VR / satellite imagery / manufacturing QC). Each domain has its own datasets + rules.
Career pivots: ML Engineer Senior → CV Engineer takes 4-8 months (requires OpenCV + Ultralytics + Detectron2 + albumentations + one domain). CV Engineer Senior → ML Engineer takes 2-4 months (an easy lateral move: add MLOps + LLM basics). CV Engineer Senior → AI Engineer (LLM track) takes 3-6 months.
Hot 2025-2026 sub-specializations: 3D Vision (Gaussian Splatting) / Generative Image-Video (FLUX + Sora-style) / Multimodal (Vision-Language Models: LLaVA / Molmo).
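The 30+ FPS requirement from distinction 2 translates into a per-frame time budget of 1000 / FPS milliseconds. A rough sketch of that check follows, assuming stages run sequentially without pipelining; the stage names and latencies are hypothetical, not measurements.

```python
def frame_budget_ms(fps):
    """Time available per frame, in milliseconds, at a target frame rate."""
    return 1000.0 / fps


def fits_budget(stage_latencies_ms, fps):
    """Return (fits, total_ms) for stages that run one after another."""
    total = sum(stage_latencies_ms.values())
    return total <= frame_budget_ms(fps), total


# Hypothetical per-stage latencies in milliseconds.
stages = {"preprocess": 4.0, "inference": 18.0, "postprocess": 6.0}

ok, total = fits_budget(stages, fps=30)
print(f"budget: {frame_budget_ms(30):.1f} ms, pipeline: {total:.1f} ms, fits: {ok}")
```

At 30 FPS the budget is roughly 33 ms per frame, which is why techniques such as FP16 / INT8 inference and TensorRT export matter for on-vehicle and edge workloads.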
Object detection stack 2026: YOLO vs Detectron2 vs MMDetection vs DETR vs SAM?
Decision tree for object detection / segmentation in 2026:
1) Ultralytics YOLO (YOLOv8 / v9 / v10 / v11): the default choice for object detection in 2026. Pros: easy training (3-5 lines of code), good docs + community, fast inference (real-time on CPU + edge), wide deployment support (native export to ONNX + TensorRT + CoreML + TFLite). Cons: Ultralytics is licensed under AGPL-3.0, so proprietary use requires a commercial license ($$); less academically flexible than Detectron2 / MMDetection. Use case: 90% of production object detection use cases, prototypes, edge deployment.
2) Detectron2 (Meta): research-grade. Pros: clean API, flexible (easy custom architectures), Apache 2.0 license (commercial-friendly), backed by Meta. Cons: slower training than Ultralytics, steeper learning curve. Use case: complex custom architectures, research projects, cases where you need fine control over the training loop.
3) MMDetection / MMSegmentation / MMPose (OpenMMLab, a Chinese consortium): a huge ecosystem. Pros: 100+ pre-implemented architectures (old and new), reference implementations of papers usually appear there first, Apache 2.0. Cons: config-heavy (steep learning curve), a largely Chinese-language community that can be hard for English-only users, dependencies may conflict. Use case: research, comparing many architectures, paper reproduction.
4) DETR family (transformer-based detection): a paradigm shift since 2020. RT-DETR (Real-Time DETR from Baidu, rising since 2024, a competitor to YOLO in the real-time space), DINO-DETR, Co-DETR. Pros: end-to-end (no NMS post-processing), often better accuracy on complex scenes. Cons: slower inference than YOLO, more compute-hungry to train. Use case: highest-accuracy needs, complex scenes (crowded objects).
5) SAM (Segment Anything Model, Meta 2023) + SAM 2 (video version, 2024): universal segmentation. Use case: a) zero-shot segmentation (segment anything without training), b) interactive annotation tool (point / box prompt → mask), c) data labeling acceleration (SAM-assisted annotation is 10× faster). Combine with YOLO / Detectron2: detector → SAM for precise masks.
6) Classical detection (HOG + cascade classifiers in OpenCV): only for very simple cases, edge devices without a GPU, and low-power microcontrollers (still relevant for embedded scenarios).
Default recommendations for 2026:
Production deployment + commercial → Ultralytics YOLO (if AGPL or a commercial license is acceptable) OR Detectron2 (the Apache 2.0 alternative).
Research / paper reproduction → MMDetection.
Highest accuracy / complex scenes → RT-DETR or Co-DETR.
Universal / zero-shot segmentation → SAM 2.
Annotation acceleration → SAM + interactive workflows (CVAT integration).
Edge / mobile → YOLO with TensorRT (NVIDIA) / CoreML (Apple) / TFLite (Android).
A Senior CV Engineer should know which tool to use when.
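The "no NMS post-processing" advantage of the DETR family is easier to appreciate next to what NMS actually does. Below is a minimal pure-Python sketch of greedy non-maximum suppression; it is an illustration only, not the implementation used inside Ultralytics or Detectron2, and the boxes and scores are hypothetical.

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: keep the best-scoring box, drop boxes that overlap it, repeat."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        # Retain only boxes that do not overlap the kept box too strongly.
        order = [i for i in order if iou(boxes[best], boxes[i]) <= iou_threshold]
    return keep


# Hypothetical detections: two near-duplicates and one separate object.
boxes = [(10, 10, 50, 50), (12, 12, 52, 52), (100, 100, 140, 140)]
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores)
print("kept indices:", kept)
```

In crowded scenes the IoU threshold becomes a trade-off between duplicate boxes and suppressed true objects, which is one reason end-to-end detectors can do better there.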
Can a CV Engineer work remotely?
Yes, 69.2% of CV Engineer vacancies are full-remote or hybrid. CV work is fully cloud-based (training on cloud GPUs such as A100 / H100, datasets streamed from S3 / GCS, deployment to Kubernetes). Outsourcers (EPAM AI CV / Luxoft CV / Andersen AI / DataArt CV) are almost always remote on US CV projects. Russian companies (Yandex self-driving / Sber.AI / VK / Wildberries / Ozon / NtechLab / VisionLabs CV teams) are hybrid or remote after probation. Russian banks (Sber AI Banking CV: face recognition / document verification) are hybrid/office for security compliance. Autonomous vehicle companies are a special case: research roles are often remote-friendly, but product deployment / on-vehicle testing requires on-site presence (San Francisco for Cruise / Mountain View for Waymo / Palo Alto for Tesla, etc.). Generative AI companies (Stability / Black Forest Labs / Runway / Pika / Kling): full remote is the standard. International voice-AI / NLP companies that overlap (multimodal teams): full remote. Big Tech CV (Apple Vision Pro / NVIDIA / Google DeepMind / Meta Reality Labs / Microsoft HoloLens): hybrid is the standard. Relocation hubs for CV: USA (Bay Area + Pittsburgh as autonomous vehicle clusters + Boston as the robotics / MIT region), UK (London, Wayve), Canada (Toronto, Vector Institute), Germany (Berlin AI + Munich automotive: BMW / Mercedes AI), France (Paris: Hugging Face + Mistral for vision LLM), Japan (Tokyo: Sony AI + automotive). English is a must for international remote CV work (CVPR / ICCV / ECCV papers and the community are English-language).
How does a 3D Vision Engineer (NeRF / Gaussian Splatting, rising since 2024) differ?
A 3D Vision Engineer is a sub-specialization within CV focused on 3D understanding + neural rendering. It has been rising fast in 2023-2026 following: 1) NeRF going mainstream in 2022+, 2) the Gaussian Splatting paper in Aug 2023 (photorealistic and fast rendering), 3) the Apple Vision Pro launch in 2024 (mass-market spatial computing), 4) autonomous vehicle perception (3D scene understanding is critical), 5) AR/VR content creation (Niantic / Meta Reality Labs).
Day-to-day:
1) NeRF / Gaussian Splatting reconstruction: input video → 3D scene representation. Tools: Instant-NGP (NVIDIA, fast NeRF) / Nerfstudio (unified framework) / gsplat (Gaussian Splatting library) / the Polycam app for capture.
2) Point cloud processing: LiDAR data (autonomous vehicles) or depth cameras (Kinect / RealSense). Tools: Open3D / PCL (Point Cloud Library) / PyTorch3D.
3) Structure-from-motion (SfM): multiple 2D images → 3D scene. Tools: COLMAP (industry-standard SfM).
4) SLAM (Simultaneous Localization and Mapping) for robotics / AR: track the camera pose + build a map. Tools: ORB-SLAM3 / OpenVSLAM / Kimera.
5) Differentiable rendering: learn 3D from 2D supervision. Tools: Mitsuba 3 / nvdiffrast / Kaolin.
6) 3D generation: text → 3D mesh (DreamFusion + Magic3D + Zero-1-to-3) or image → 3D (TripoSR / InstantMesh, open-source since 2024).
7) Mesh processing: texturing / retopology / UV unwrapping for 3D content pipelines.
Stack specifics: PyTorch3D (Meta, 3D research), Kaolin (NVIDIA, 3D DL), Mitsuba 3 (differentiable rendering), Open3D / PCL (point clouds), COLMAP (SfM), Nerfstudio + gsplat (NeRF / 3DGS), Blender Python API (mesh manipulation).
Salaries: a Senior 3D Vision Engineer earns a premium of +15-25% over general CV because the skill set is rare (the hybrid of 3D math + computer graphics + ML is uncommon). $7500-12000 for Senior in Russian tech (Yandex self-driving / Sber). $9000-14000 at spatial computing companies (Apple Vision Pro / Meta Reality Labs). $12000-18000+ at AR startups (Niantic / Snap AR).
Career flow: CV Engineer Senior + computer graphics interest + hands-on NeRF / 3DGS portfolio → 3D Vision Engineer in 6-12 months.
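Much of the 3D work listed above (SfM, SLAM, NeRF-style rendering) builds on the pinhole camera model, which maps a 3D point in camera coordinates to a pixel. A minimal sketch follows, assuming an ideal camera with no lens distortion; the function name `project_point` and the intrinsics are hypothetical values for a 640×480 image.

```python
def project_point(point_cam, fx, fy, cx, cy):
    """Project a 3D point (X, Y, Z) in camera coordinates to pixel (u, v)."""
    x, y, z = point_cam
    if z <= 0:
        raise ValueError("point must be in front of the camera (Z > 0)")
    # Perspective division followed by scaling and shifting into pixel units.
    u = fx * x / z + cx
    v = fy * y / z + cy
    return u, v


# Hypothetical intrinsics: focal lengths in pixels and the principal point.
fx = fy = 500.0
cx, cy = 320.0, 240.0

u, v = project_point((0.2, -0.1, 2.0), fx, fy, cx, cy)
print(f"pixel: ({u:.1f}, {v:.1f})")
```

Tools such as COLMAP estimate these intrinsics together with camera poses; real pipelines also model lens distortion, which this sketch deliberately leaves out.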
Which companies are actively hiring CV Engineers?
Top of the list: Yandex, Sber.AI, VK.
Major Russian CV employers: Yandex is the largest CV employer in Russia (self-driving cars at Yandex SDG / Self-Driving Group, with dozens of CV engineers; Market visual search; camera-based features in Drive; Alice AR features; Yandex Cloud Vision). Sber.AI (Kandinsky 3 generative, the largest Russian text-to-image model; Sber face recognition for banking and transit; SberMedAI medical imaging, the largest medical CV team in Russia; SberDevices smart displays). VK (face recognition in VK Video Calls, content moderation, Mail.ru AI). Wildberries (visual search for clothing and goods, the largest retail CV deployment in Russia). Ozon (visual search + product matching). NtechLab (FaceNGN, a global face recognition leader, Moscow). VisionLabs (Luna face recognition platform for banks, retail and government). Tevian (face / liveness detection for banking). Cognitive Pilot (autonomous combine harvesters / transport / agriculture). X5 Group (smart shelf / customer counting in stores). Kaspersky Lab (image-based malware analysis). Iva Cognitive, RecFaces, Tinkoff (KYC face + document verification).
Outsourcers with a CV practice: EPAM AI CV (the largest CV outsourcer in the CIS for US CV projects), Luxoft CV, Andersen AI, DataArt CV.
Autonomous vehicle companies (international, premium pay): Tesla Autopilot (Bay Area, hybrid only), Waymo (Mountain View), Cruise (San Francisco; GM ended funding for its robotaxi program in December 2024), Wayve (London, the European leader), Pony.ai, Zoox (Amazon), Aurora Innovation, Mobileye (Intel, Israel), Argo AI (Pittsburgh; shut down in 2022, listed for legacy reference).
Generative AI companies (full-remote, premium pay): Stability AI (SDXL / SD3, UK), Black Forest Labs (FLUX, Germany, a rising star since 2024), Runway (Gen-3 video), Pika Labs, Kling AI (Kuaishou, China), Ideogram (text-in-image leader), Midjourney (commercial, closed), OpenAI (DALL-E + Sora teams).
Robotics + AR/VR: Boston Dynamics (perception engineers), Skydio (drones), Anduril Industries (defense + autonomy), Niantic (AR, CV at Pokémon Go scale), Snap AR (Snapchat lens CV), Magic Leap, Apple Vision Pro team (premium pay).
Big Tech CV (top-tier salaries): Google DeepMind (Gemini Vision team) / Meta AI and Reality Labs (Meta AI authored DINO and SAM) / Microsoft HoloLens / Apple ML Vision / Amazon AGI vision / NVIDIA (Omniverse + DGX + research). Y Combinator CV startups also offer premium remote roles.
Where to start in Computer Vision in 2026?
Roadmap:
1) Math foundations: linear algebra, calculus and the basics of projective geometry (transformations / homographies / camera models). «Multiple View Geometry in Computer Vision» by Hartley / Zisserman is the bible of CV math (use it as a reference, no need to read it cover to cover).
2) In-depth Python + ML basics: PyTorch + NumPy + Matplotlib. Build a simple ML classifier (MNIST).
3) OpenCV mastery: classical CV operations. Course: «Computer Vision Course» from PyImageSearch (Adrian Rosebrock, the best entry point for OpenCV), plus the official OpenCV documentation and tutorials. Build pet projects: edge detection / homography-based panorama stitching / face detection with classical Haar cascades.
4) Deep Learning for Vision: Stanford CS231n «Convolutional Neural Networks for Visual Recognition» (Karpathy / Li; free on YouTube with slides; a must-do, the canonical CV deep learning course).
5) PyTorch + torchvision hands-on: train ResNet / EfficientNet on CIFAR-10, fine-tune a pretrained model on your own dataset. Learn to use the timm library (1000+ pretrained models).
6) Object detection: Ultralytics YOLO hands-on (the easiest entry: train YOLO on a custom dataset in a day). Then Detectron2 (more flexible).
7) Segmentation: Mask R-CNN via Detectron2, then SAM (universal segmentation); try interactive segmentation.
8) Annotation tooling: CVAT mastery (the industry standard for CV annotation). Annotate your own dataset (10-50 images) and train a detector on it.
9) Augmentation mastery: the albumentations library (a must for production CV). Understand training stability: strong augmentation improves robustness.
10) Modern transformers for CV: Vision Transformer (ViT) + Swin Transformer + DINOv2 (foundation model). Hugging Face Transformers vision support.
11) Multimodal LLMs: try GPT-4V, Claude vision and open-source LLaVA / Qwen 2 VL for image understanding.
12) Generative CV track (popular 2024-2026): Stable Diffusion + FLUX hands-on, ComfyUI workflow mastery, LoRA / DreamBooth fine-tuning. Course: «Generative AI with Diffusion Models» DeepLearning.AI.
13) 3D Vision track (rising since 2024): NeRF + Gaussian Splatting hands-on with the Nerfstudio framework; capture your own scene on a phone (Polycam app) and reconstruct it as 3DGS.
14) Edge deployment hands-on: convert a PyTorch model → ONNX → TensorRT (on an NVIDIA GPU) and benchmark inference latency. Try CoreML conversion for Apple Silicon.
15) Pet-project portfolio: a) a production-grade detection pipeline (e.g. a fish-counting / vehicle-counting / cell-counting demo); b) a custom Stable Diffusion / FLUX LoRA in your own style; c) a Gaussian Splatting reconstruction (a 3D scene from phone video); d) a mobile CV app (deployed via CoreML / TFLite). Document everything on GitHub with a blog post and a video demo.
Courses in Russia: MIPT DLSchool (CV module, free on YouTube), Karpov.Courses «Computer Vision» track, Otus «Computer Vision», SkillFactory CV, School 21 (Sber) AI Computer Vision track.
International (English): Stanford CS231n (canonical, free on YouTube), fast.ai Practical Deep Learning (CV included), Hugging Face Computer Vision Course (free), PyImageSearch University (Adrian Rosebrock, applied focus).
Must-read books: «Deep Learning for Vision Systems» by Mohamed Elgendy (Manning), «Multiple View Geometry» by Hartley / Zisserman (math reference), «Computer Vision: Algorithms and Applications» by Richard Szeliski (2nd edition free online, encyclopedic).
Communities: r/computervision, PyImageSearch Discord, Hugging Face Discord (vision channels), Telegram @cv_ru, @ml_ru.
Conferences: CVPR (the top venue, June), ICCV (October, odd years, alternating with ECCV), ECCV (autumn, even years), NeurIPS (December, broader ML with a CV track).
Career paths: Senior Backend / Middle ML Engineer + CV interest + portfolio → Junior CV Engineer in 6-12 months. PhD in CV / Robotics → Senior CV Engineer, direct entry.
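Step 14 of the roadmap asks for an inference latency benchmark. Below is a minimal sketch of such a harness using only the standard library. `benchmark` and `fake_inference` are hypothetical names, and `fake_inference` is a CPU-bound stand-in for a real model call (for example an ONNX Runtime or TensorRT invocation), which the source does not specify.

```python
import statistics
import time
from typing import Callable, Dict, List


def benchmark(fn: Callable[[], object], warmup: int = 10, iters: int = 100) -> Dict[str, float]:
    """Time `fn` repeatedly and return latency statistics in milliseconds."""
    for _ in range(warmup):
        fn()  # warm-up runs are discarded (caches, lazy initialization)

    samples_ms: List[float] = []
    for _ in range(iters):
        start = time.perf_counter()
        fn()
        samples_ms.append((time.perf_counter() - start) * 1000.0)

    samples_ms.sort()
    # Simple index-based 95th percentile; adequate for a sketch.
    p95_index = min(len(samples_ms) - 1, int(0.95 * len(samples_ms)))
    mean_ms = statistics.fmean(samples_ms)
    return {
        "mean_ms": mean_ms,
        "p50_ms": statistics.median(samples_ms),
        "p95_ms": samples_ms[p95_index],
        "fps": 1000.0 / mean_ms if mean_ms > 0 else float("inf"),
    }


def fake_inference() -> int:
    """Hypothetical stand-in for a model forward pass."""
    return sum(i * i for i in range(20_000))


stats = benchmark(fake_inference, warmup=5, iters=50)
print({name: round(value, 3) for name, value in stats.items()})
```

Reporting p95 next to the mean matters for real-time budgets, since occasional slow frames are hidden by averages. When timing GPU work, the device usually has to be synchronized before reading the clock (for PyTorch, `torch.cuda.synchronize()`), otherwise only the kernel launch is measured.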
How many CV Engineer vacancies are there in the CIS and Europe?
Our sample contains 33 active open CV Engineer vacancies with an explicit CV focus. The real pool is many times larger, because many CV roles are classified as general ML Engineer / Robotics / AI Engineer (titles such as «ML Engineer for autonomous driving» or «Senior Backend Engineer with CV focus»). We estimate that 300-1500 truly CV-focused developer positions are open in the CIS and Europe at any given moment in 2026, counting the loosely classified ones. Geography: 🇵🇱 Poland, EN, INT.
Sources: hh.ru (Yandex self-driving / Sber.AI / NtechLab / VisionLabs are especially active), Habr Career, getmatch, Djinni, LinkedIn (a huge international CV segment: autonomous vehicles + generative AI + Big Tech CV), NoFluffJobs / JustJoin.it (Poland, CV-friendly), Telegram (@cv_ru, @ml_jobs, @aijobs, @jobsforaiml, @robotics_ru), career sites of EPAM AI CV / Luxoft CV / Andersen / DataArt, specialized boards (aijobs.net, ai-jobs.net, builtin.com/jobs/ai, roboticscareer.org), direct career pages of autonomous vehicle companies (Tesla / Waymo / Cruise / Wayve / Pony.ai / Zoox / Aurora / Mobileye) and generative AI companies (Stability AI / Black Forest Labs / Runway / Pika / Ideogram), CVPR / ICCV / ECCV conference job boards, Y Combinator Work at a Startup.
The international remote segment widens the market further (generative AI companies and robotics startups are full-remote-friendly). Time to fill a Senior CV Engineer role is 6-12 weeks, longer than for a general AI Engineer, because the required combination is rare: deep PyTorch + classical CV + one domain + edge deployment.
What skills does a Senior CV Engineer need?
A Senior CV Engineer owns the full vision engineering cycle and provides technical leadership.
Math foundations: linear algebra, projective geometry (camera models / homographies / epipolar geometry), calculus and optimization theory, at the level of reading CVPR papers without getting stuck on the math.
Python: in-depth, at Senior Backend level (async / typing / FastAPI / pytest). C++ basics (for edge deployment and custom OpenCV kernels) are nice to have.
PyTorch + torchvision, in depth: custom Datasets / Samplers / Losses / training loops, distributed training (DDP for multi-GPU CV training), mixed precision (FP16 / BF16, mandatory in production), gradient accumulation for large models.
OpenCV: classical CV operations (calibration / morphology / Hough transforms / homography / stereo matching / optical flow), C++ API basics for performance-critical paths.
Modern transformers for CV: ViT / Swin / DINOv2, fine-tuning strategies, hybrid CNN-transformer architectures.
Object detection / segmentation: Ultralytics YOLO production deployment, Detectron2 custom architecture authoring, MMDetection / MMSegmentation when needed, RT-DETR, SAM 2 integration into pipelines.
Data augmentation: advanced albumentations (custom transforms + composition strategies), test-time augmentation (TTA), understanding of AutoAugment / RandAugment / CutMix / Mixup.
Foundation models: DINOv2 + CLIP + SigLIP + OpenCLIP, with use cases in feature extraction, zero-shot classification and retrieval. Multimodal LLMs (GPT-4V / Claude vision / LLaVA / Qwen 2 VL / Molmo): knowing when to use one versus training a custom model.
Generative CV (if the track includes it): Stable Diffusion + FLUX in depth, with ControlNet / LoRA / DreamBooth / IP-Adapter / textual inversion fine-tuning. ComfyUI workflow authoring. Video generation basics (AnimateDiff / SVD).
3D Vision (if the track includes it): NeRF + Gaussian Splatting + Nerfstudio + COLMAP SfM + Open3D point clouds. Camera calibration and projective geometry in depth.
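The zero-shot classification use case mentioned above for CLIP-style models reduces to comparing an image embedding against text-prompt embeddings. The sketch below shows only that comparison step, under loud assumptions: the 3-dimensional vectors are made-up toy values standing in for real encoder outputs, the temperature of 0.01 is an assumed value, and `cosine` and `zero_shot_probs` are hypothetical helper names, not part of any library.

```python
import math
from typing import Dict, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def zero_shot_probs(image_emb: Sequence[float],
                    text_embs: Dict[str, Sequence[float]],
                    temperature: float = 0.01) -> Dict[str, float]:
    """Softmax over temperature-scaled cosine similarities to each text prompt."""
    logits = {label: cosine(image_emb, emb) / temperature
              for label, emb in text_embs.items()}
    peak = max(logits.values())  # subtract the max for numerical stability
    exps = {label: math.exp(value - peak) for label, value in logits.items()}
    total = sum(exps.values())
    return {label: value / total for label, value in exps.items()}


# Toy embeddings (assumption): real ones come from an image and a text encoder.
text_embs = {
    "a photo of a cat": [1.0, 0.0, 0.0],
    "a photo of a dog": [0.0, 1.0, 0.0],
    "a photo of a car": [0.0, 0.0, 1.0],
}
image_emb = [0.9, 0.1, 0.0]

probs = zero_shot_probs(image_emb, text_embs)
best = max(probs, key=probs.get)
print(best)  # a photo of a cat
```

The same similarity scores, without the softmax, can be used for retrieval by ranking a gallery of image embeddings against one query embedding.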
Edge deployment: ONNX (cross-framework export + ONNX Runtime optimization), TensorRT (NVIDIA; INT8 quantization, plus plugin development if needed), CoreML (Apple Silicon; Conv2D ops support and Neural Engine specifics), TFLite (Android NNAPI delegate), MediaPipe pipelines, OpenVINO (Intel iGPU).
Hardware-aware optimization: quantization (PTQ + QAT), pruning, distillation (teacher-student), 2:4 structured sparsity (NVIDIA Ampere+, for up to 2× speedup).
Production CV pipeline architecture: the ability to design an end-to-end pipeline on a whiteboard: data ingestion (cameras / drives / cloud streaming) → preprocessing → batched inference → post-processing → tracking / aggregation → downstream actions. Multi-camera systems: synchronization, calibration and fusion for autonomous driving and surveillance.
Inference serving: Triton Inference Server (NVIDIA; the best fit for multi-model serving with dynamic batching), DeepStream (NVIDIA video analytics), BentoML / TorchServe.
Latency budgets: designing for real-time constraints (autonomous: 30+ FPS, AR: 60+ FPS, video moderation: batch is fine) with profile-driven optimization (NVIDIA Nsight + PyTorch Profiler).
Domain expertise: a deep understanding of one or two domains (autonomous driving / medical imaging / retail visual search / robotics perception / AR/VR / satellite imagery / manufacturing QC) is the main premium driver at Senior+ level.
Annotation strategy: designing large-scale annotation workflows (CVAT / Label Studio + SAM-assisted), active learning loops, label quality control.
Evaluation methodology: a deep understanding of COCO metrics (AP / AP50 / AP75 / mAP@[.5:.95]), segmentation metrics (mIoU / Dice / boundary F1), domain-specific metrics (NDS, the nuScenes Detection Score, and mAP for autonomous-driving detection), human-in-the-loop evaluation for generative models.
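To make the detection metrics named above concrete, here is a minimal sketch of box IoU and average precision at a single IoU threshold, for one class in one image. It is an illustration, not the official COCO evaluation: it uses all-point interpolation of the precision-recall curve (COCO samples 101 recall points and averages over classes and thresholds), and the boxes and scores are made-up values. `iou` and `average_precision` are hypothetical helper names.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def average_precision(detections: Sequence[Tuple[float, Box]],
                      ground_truth: Sequence[Box],
                      iou_thr: float = 0.5) -> float:
    """AP at one IoU threshold; detections are (score, box) pairs."""
    if not ground_truth:
        return 0.0
    matched = [False] * len(ground_truth)
    hits: List[bool] = []
    # Greedy matching in descending score order; each ground truth matches once.
    for _, box in sorted(detections, key=lambda d: d[0], reverse=True):
        best_iou, best_idx = 0.0, -1
        for idx, gt in enumerate(ground_truth):
            overlap = iou(box, gt)
            if not matched[idx] and overlap > best_iou:
                best_iou, best_idx = overlap, idx
        is_hit = best_idx >= 0 and best_iou >= iou_thr
        if is_hit:
            matched[best_idx] = True
        hits.append(is_hit)

    # Precision and recall after each ranked detection.
    precisions: List[float] = []
    recalls: List[float] = []
    true_positives = 0
    for rank, is_hit in enumerate(hits, start=1):
        true_positives += is_hit
        precisions.append(true_positives / rank)
        recalls.append(true_positives / len(ground_truth))

    # Make precision non-increasing from right to left, then integrate over recall.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    ap, prev_recall = 0.0, 0.0
    for precision, recall in zip(precisions, recalls):
        ap += (recall - prev_recall) * precision
        prev_recall = recall
    return ap


gt_boxes = [(0.0, 0.0, 10.0, 10.0), (20.0, 20.0, 30.0, 30.0)]
dets = [
    (0.9, (0.0, 0.0, 10.0, 10.0)),    # exact match with the first ground truth
    (0.8, (50.0, 50.0, 60.0, 60.0)),  # false positive
    (0.7, (20.0, 20.0, 30.0, 28.0)),  # IoU 0.8 with the second ground truth
]
ap50 = average_precision(dets, gt_boxes, iou_thr=0.5)
ap85 = average_precision(dets, gt_boxes, iou_thr=0.85)
print(round(ap50, 4), round(ap85, 4))  # 0.8333 0.5
```

Raising the threshold from 0.5 to 0.85 turns the third detection into a miss, which is why metrics averaged over thresholds such as mAP@[.5:.95] reward tight localization.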
Soft skills: writing ADRs for CV architecture decisions, technical writing (CV feature design docs, plus paper drafts on the research track), cross-team collaboration (Product / Backend / Robotics / Hardware teams), mentoring Middle CV Engineers, paper-reading discipline (CVPR / ICCV / ECCV / NeurIPS are must-follow). English is a MUST at Senior+ level: the CV community, docs, papers and conferences are all in English. Optional bonus: open-source contributions to Ultralytics / Detectron2 / MMDetection / timm / albumentations / Nerfstudio sharply raise market value when applying to Big Tech CV, autonomous vehicle and generative AI teams. Papers at CVPR / ICCV / ECCV workshops are a premium for the Research Engineer CV track.
How we count
- Data period: the hero block and the text cover the last 3 months. Charts cover the entire available observation period (since the parsers were launched, usually 2-3 months).
- Data is collected automatically from 1000+ sources: Telegram channels and job sites in the CIS and Europe.
- Only live, open vacancies with a clear description are counted. Spam and duplicates are removed.
- Salaries are converted to USD/month at the current exchange rate. Anomalous values (<500 or >50K) are filtered out.
- Levels are normalized: Mid → Middle, Intern/Trainee → Junior, Principal/Staff/Expert → Lead.
- The first 2 weeks of data (the parser ramp-up period) are not shown in charts.
- Data is recalculated every day.
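The salary and level normalization rules listed above can be sketched as follows. This is an illustration of the stated rules only, not the actual pipeline: the exchange rates are hypothetical placeholders, the handling of yearly salaries is an assumption not stated in the methodology, and `normalize_salary` and `normalize_level` are made-up names.

```python
from typing import Dict, Optional

# Hypothetical exchange rates to USD; a real pipeline would use current rates.
RATES_TO_USD: Dict[str, float] = {"USD": 1.0, "PLN": 0.25}

# Level aliases from the methodology: Mid -> Middle, Intern/Trainee -> Junior,
# Principal/Staff/Expert -> Lead.
LEVEL_ALIASES: Dict[str, str] = {
    "mid": "Middle",
    "intern": "Junior",
    "trainee": "Junior",
    "principal": "Lead",
    "staff": "Lead",
    "expert": "Lead",
}

MIN_USD, MAX_USD = 500.0, 50_000.0  # values outside this range count as anomalies


def normalize_salary(amount: float, currency: str, period: str = "month") -> Optional[float]:
    """Convert to USD per month; return None for unknown currencies or anomalies."""
    rate = RATES_TO_USD.get(currency.upper())
    if rate is None:
        return None
    # Assumption: yearly figures are divided by 12 (not stated in the source).
    monthly = amount * rate / (12.0 if period == "year" else 1.0)
    return monthly if MIN_USD <= monthly <= MAX_USD else None


def normalize_level(raw: str) -> str:
    """Map a raw level label onto the normalized level names."""
    key = raw.strip().lower()
    return LEVEL_ALIASES.get(key, raw.strip().title())


print(normalize_salary(20_000, "PLN"))  # 5000.0
print(normalize_salary(300, "USD"))     # None (below the 500 cutoff)
print(normalize_level("Staff"))         # Lead
```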
Authorship and citation
Analytics prepared by Zorky Research Team. Last updated: May 29, 2026, 17:41.
Data sources and methodology
Data is collected automatically from 1000+ sources: Telegram vacancy channels and job sites in the CIS and Eastern Europe (HH, Habr Career, Djinni, DOU, NoFluffJobs, JustJoin.it, Pracuj.pl and others). Parsing runs around the clock, duplicates are filtered by description and URL, and anomalous salary values are cut off. The detailed methodology is on the «How it works» page.
Zorky CRM (2026). Computer Vision in IT: the CIS and European market. Accessed: 29.05.2026. URL: https://zorky.tech/ru/research/ml