AI Engineer

Micropolis Robotics

Posted on 30 Oct

Experience

1 - 7 Years

Education

Any Graduation

Nationality

Any Nationality

Gender

Not Mentioned

Vacancy

1 Vacancy

Job Description

Roles & Responsibilities

Design, build, and operate a production server-side AI assistant that:
Answers questions grounded in user-scoped data (docs, tables) with citations where applicable
Performs actions by calling internal/external APIs securely on the user's behalf
Supports low-latency, real-time voice chat (streaming STT/TTS + incremental LLM responses)

Implement the tool/agent layer:
Structured tool calling (JSON Schema-based) to integrate business services
Model Context Protocol (MCP) servers/clients where appropriate for tool discovery and execution

Architect retrieval-augmented generation (RAG):
Ingestion for documents and tables, parsing, chunking, embeddings, metadata, and indexing
Hybrid retrieval (sparse+dense), query rewriting, and answer attribution

Deliver performant, cost-efficient inference on open-source models:
Model selection/routing; context management; caching/batching; streaming token delivery
GPU utilization and serving via vLLM/TGI/llama.cpp/Ollama or similar

Build resilient APIs and real-time integrations:
WebSockets/WebRTC/gRPC for streaming voice; REST/GraphQL for control and orchestration

Productionize and operate on server/on-prem:
Containerize with Docker; automate CI/CD; implement logs/metrics/traces (OpenTelemetry)
Evals, A/B tests, safety/guardrails, and human-in-the-loop feedback

Desired Candidate Profile

Required Skills and Experience

  • Proficiency in Python
  • Open-source LLMs and serving:
    • Experience with models like Llama, Mistral/Mixtral, Qwen, etc.
    • Serving stacks: vLLM, Text Generation Inference (TGI), llama.cpp, Ollama
    • Prompt engineering, routing, context/window management
  • RAG and data systems:
    • Vector DBs (like FAISS, Qdrant, Weaviate, Milvus, pgvector) and hybrid search
    • Document/table ingestion and normalization; schema/metadata design
  • Real-time voice:
    • STT/TTS (e.g., Whisper/faster-whisper, Vosk, Coqui TTS, Piper) with streaming pipelines
    • Low-latency streaming via WebSockets/WebRTC/gRPC
  • Tooling for actions:
    • Structured tool/function calling, API design/integration, and service orchestration
    • Familiarity with Model Context Protocol (MCP) concepts and usage
  • Deployment and operations (on-prem first):
    • Docker, Linux, networking, and secure service deployment
    • GPU stacks (CUDA/drivers/containers) and performance tuning
  • Excellent communication, documentation, and cross-functional collaboration

Preferred Skills

  • Agents/frameworks: LangChain, LlamaIndex, Semantic Kernel, or custom tool routers
  • Advanced retrieval: multi-vector stores, RRF/hybrid search, query planning, re-ranking
  • SQL generation and safe execution over tabular data; row-level security; schema mapping
  • Document processing: OCR, table extraction, CSV/Parquet pipelines
  • Serving/perf: Triton, quantization (GGUF/GGML), LoRA/QLoRA with PEFT, KV cache optimizations
  • Evals and observability: Ragas/DeepEval, Langfuse/PromptLayer, OpenTelemetry
