AI/ML Engineer – Data Simulation & Synthetic Data Generation

Full-Time

Hyderabad

On-site

Key Responsibilities

Research, evaluate, and benchmark generative and diffusion-based models (Stable Diffusion, Sora-like models, GANs, NeRFs) for simulation and synthetic data generation.
Build pipelines to replicate images and videos across new environments, lighting conditions, scenes, poses, and object variations.
Develop multimodal, prompt-based simulation workflows, including the following transformations:
- Text → Image
- Image → Image
- Video → Video
Fine-tune models for domain-specific simulation tasks such as:
- Texture transfer
- Background replacement
- Camera simulation
- Noise injection
- Motion variation
Create automated pipelines to scale image, video, audio, and text simulation across large datasets.
Evaluate realism, fidelity, annotation consistency, and domain-adaptation effectiveness of generated synthetic data.
Work closely with ML researchers to integrate synthetic data into training loops and improve downstream model performance.
Collaborate with backend and data teams to design scalable storage, sampling, and dataset versioning strategies for simulation workflows.
Develop metrics and QA processes for simulation quality, drift detection, and dataset reliability.
Support training pipelines, experiment tracking, and dataset versioning as simulation infrastructure scales.

Preferred Qualifications

Experience with multimodal generative models for image, video, and text-prompted generation.
Familiarity with dataset versioning and experiment tracking tools such as DVC, Weights & Biases (W&B), or MLflow.
Understanding of domain adaptation and synthetic-to-real generalization techniques.

Required Qualifications

3–6 years of experience in applied machine learning or generative AI.
Strong Python programming skills with hands-on experience in PyTorch or TensorFlow.
Practical experience working with generative models including diffusion models, GANs, video synthesis models, and NeRFs.
Familiarity with data augmentation, image/video transformations, and synthetic data generation workflows.
Experience building scalable pipelines using FastAPI, Airflow, or custom orchestration frameworks.
Understanding of GPU-based training, inference optimization, and model performance tuning.
Practical knowledge of Git, Docker, Linux, and cloud platforms such as AWS, GCP, or Azure.

The recruitment process consists of four rounds:

Round 01

High-level Technical Discussion

Discussion with multiple interviewers to evaluate technical expertise and problem-solving approach.

Round 02

Assessment / Coding

Practical coding assessment to evaluate technical and analytical skills.

Round 03

Final Interview

Final technical and managerial evaluation with the leadership team.

Round 04

HR Interview

Discussion of company policies, culture, compensation, and the onboarding process.