AI/ML Engineer – Data Simulation & Synthetic Data Generation

Full-Time
Hyderabad
On-site

Overview

Key Responsibilities

  • Research, evaluate, and benchmark generative models (diffusion models such as Stable Diffusion, Sora-like video models, GANs) and NeRFs for simulation and synthetic data generation.
  • Build pipelines to replicate images and videos across new environments, lighting conditions, scenes, poses, and object variations.
  • Develop multimodal prompt-based simulation workflows including:
    • Text → Image
    • Image → Image
    • Video → Video transformations
  • Fine-tune models for domain-specific simulation tasks such as:
    • Texture transfer
    • Background replacement
    • Camera simulation
    • Noise injection
    • Motion variation
  • Create automated pipelines to scale image, video, audio, and text simulation across large datasets.
  • Evaluate realism, fidelity, annotation consistency, and domain-adaptation effectiveness of generated synthetic data.
  • Work closely with ML researchers to integrate synthetic data into training loops and improve downstream model performance.
  • Collaborate with backend and data teams to design scalable storage, sampling, and dataset versioning strategies for simulation workflows.
  • Develop metrics and QA processes for simulation quality, drift detection, and dataset reliability.
  • Support training pipelines, experiment tracking, and dataset versioning as simulation infrastructure scales.

Preferred Experience

  • Experience with multimodal generative models for image, video, and text-prompted generation.
  • Familiarity with dataset versioning and experiment tracking tools such as DVC, Weights & Biases (W&B), or MLflow.
  • Understanding of domain adaptation and synthetic-to-real generalization techniques.

Qualifications

  • 3–6 years of experience in applied machine learning or generative AI.
  • Strong Python programming skills with hands-on experience in PyTorch or TensorFlow.
  • Practical experience working with generative models including diffusion models, GANs, video synthesis models, and NeRFs.
  • Familiarity with data augmentation, image/video transformations, and synthetic data generation workflows.
  • Experience building scalable pipelines using FastAPI, Airflow, or custom orchestration frameworks.
  • Understanding of GPU-based training, inference optimization, and model performance tuning.
  • Practical knowledge of Git, Docker, Linux, and cloud platforms such as AWS, GCP, or Azure.

Recruitment Procedure

The recruitment procedure consists of four rounds:

Round 01: High-level Technical Discussion
Discussion with multiple interviewers to evaluate technical expertise and problem-solving approach.

Round 02: Assessment / Coding
Practical coding assessment to evaluate technical and analytical skills.

Round 03: Final Interview
Final technical and managerial evaluation with the leadership team.

Round 04: HR Interview
Discussion regarding company policies, culture, compensation, and onboarding process.