SDET (Software Development Engineer in Test)

About the Role

We're looking for a passionate SDET to join our Quality Engineering team and ensure the reliability and performance of PloyD's AI Operations platform. You'll design and implement comprehensive test automation frameworks, work closely with development teams, and champion quality throughout the software development lifecycle.

What You'll Do

Design and build test automation frameworks for AI model serving infrastructure
Develop comprehensive test suites for ML inference APIs and model deployment pipelines
Validate model performance, accuracy, and latency across inference serving frameworks (TensorRT-LLM, vLLM, TGI)
Build automated testing for GPU workloads and distributed inference systems
Implement and maintain CI/CD pipelines on GitHub Actions, GitLab CI, and Buildkite
Manage and optimize CI/CD runners (GitHub Actions runners, GitLab runners, Buildkite agents)
Create observability and monitoring tests for the AI Operations platform
Collaborate with ML engineers to define quality metrics for AI systems

What We're Looking For

Required Qualifications

4+ years of experience in test automation and software development
Strong programming skills in Python (required for ML/AI testing)
Experience testing ML/AI systems, model serving, or inference platforms
Hands-on experience with CI/CD platforms (GitHub Actions, GitLab CI, or Buildkite)
Experience managing and optimizing CI/CD runners and build agents
Knowledge of Python testing frameworks (pytest, unittest) and API testing
Understanding of containerization (Docker) and Kubernetes
Experience with at least one major cloud platform (AWS, GCP, or Azure) and with GPU workloads
Strong analytical and problem-solving skills

Preferred Qualifications

Experience with ML frameworks (PyTorch, TensorFlow) and model serving platforms
Knowledge of performance testing for AI inference (latency, throughput, GPU utilization)
Experience with self-hosted runner infrastructure and autoscaling
Familiarity with GitOps workflows and Argo CD
Experience testing distributed systems and microservices at scale
Understanding of LLM inference optimization and quantization techniques
Open source contributions to AI/ML testing tools or CI/CD platforms

Benefits & Perks

Competitive salary and equity package
Comprehensive health, dental, and vision insurance
401(k) with company match
Flexible work arrangements (remote/hybrid)
Professional development budget
Unlimited PTO policy
Latest tech equipment

About PloyD

PloyD is building the future of AI Operations. Our platform makes it easy for enterprises to deploy, monitor, and scale AI models in production. We're a fast-growing startup backed by top-tier investors, and we're looking for talented people to join our mission.