RemoteAtlas
Data Center Facility Telemetry & Controls Engineer

Lambda

Remote, USA · Full-time · $185K - $290K · Posted about 19 hours ago
Software Development

Summary

Lambda is hiring a Data Center Facility Telemetry & Controls Engineer to join their Software Development team. Lambda, The Superintelligence Cloud, is a leader in AI cloud infrastructure serving tens of thousands of customers. Key skills: Python, Terraform, Machine Learning, AI.

About the role

Lambda, The Superintelligence Cloud, is a leader in AI cloud infrastructure serving tens of thousands of customers. Our customers range from AI researchers to enterprises and hyperscalers. Lambda's mission is to make compute as ubiquitous as electricity and give everyone the power of superintelligence. One person, one GPU.

If you'd like to build the world's best AI cloud, join us.


Location: Multiple Sites — US (Remote/Hybrid eligible)

 

Role Overview

The Data Center Facility Telemetry & Controls Engineer is a critical technical role responsible for the design, deployment, integration, and ongoing operation of Building Management Systems (BMS), Data Center Infrastructure Management (DCIM) platforms, and facility telemetry pipelines across Lambda's growing data center portfolio. This engineer ensures that all facility systems (power, cooling, thermal, and environmental) are continuously monitored, alarmed, and controllable in real time, supporting the safe and efficient operation of high-density GPU deployments at rack densities of 136–380 kW per rack.

 

What You’ll Do

BMS / Controls Architecture & Integration

  • Architect and manage BMS integration across colocation and Lambda-owned facilities, covering chillers, CRAHs, CDUs (Coolant Distribution Units), cooling towers, UPS systems, PDUs, and automatic transfer switches.

  • Define standards for BMS point lists, naming conventions, control sequences, and integration protocols (BACnet, Modbus, SNMP, OPC-UA, RESTful APIs).

  • Oversee commissioning and acceptance testing of new BMS deployments and CDU/TCS loop integrations for next-generation liquid-cooled GPU rack systems.

  • Collaborate with colocation partners (Equinix, Digital Realty, and others) to ensure telemetry data flows from provider BMS/EPMS into Lambda's monitoring stack.

DCIM & Telemetry Platform Management

  • Own the DCIM platform strategy and roadmap — evaluating, selecting, and implementing tooling for asset management, capacity planning, environmental monitoring, and power chain visibility.

  • Develop and maintain real-time dashboards for PUE, thermal performance, stranded capacity, and cooling system efficiency across all Lambda sites.

  • Build and maintain telemetry pipelines ingesting data from BMS, PDUs, in-rack sensors, CDUs, and network devices into centralized monitoring and alerting platforms (e.g., Prometheus, Grafana, InfluxDB, or equivalent).

  • Define alarm thresholds and escalation workflows for critical facility events including high coolant temperatures, CDU inlet/outlet anomalies, leak detection, and power exceedances.
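
As a rough illustration of the dashboard and alarm duties listed above, the following Python sketch computes PUE (total facility power divided by IT equipment power) and maps a single telemetry reading to an alarm level. The function names, the `Threshold` type, and the coolant temperature limits are hypothetical and chosen only for illustration; they are not taken from Lambda's systems or from any vendor's specification.

```python
from dataclasses import dataclass


def pue(total_facility_kw: float, it_load_kw: float) -> float:
    """Power Usage Effectiveness: total facility power / IT equipment power."""
    if it_load_kw <= 0:
        raise ValueError("IT load must be positive")
    return total_facility_kw / it_load_kw


@dataclass(frozen=True)
class Threshold:
    """Hypothetical warning and critical limits for one telemetry point."""
    warning: float
    critical: float


def classify(value: float, limits: Threshold) -> str:
    """Return 'ok', 'warning', or 'critical' for a high-side alarm."""
    if value >= limits.critical:
        return "critical"
    if value >= limits.warning:
        return "warning"
    return "ok"


# Illustrative limits only; real setpoints depend on the CDU and loop design.
COOLANT_SUPPLY_C = Threshold(warning=40.0, critical=45.0)

if __name__ == "__main__":
    print(round(pue(1200.0, 1000.0), 2))
    print(classify(42.5, COOLANT_SUPPLY_C))
```

In practice the alarm level would feed an escalation workflow (for example, paging on "critical" and ticketing on "warning"); that routing is outside the scope of this sketch.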

Liquid Cooling Controls & High-Density Operations

  • Develop control strategies and setpoint frameworks for TCS (Thermal Control System) loops supporting direct liquid cooling at densities of 220–380 kW per rack.

  • Evaluate and qualify CDU vendors on controls integration capabilities, telemetry exposure, and remote management interfaces.

  • Define and enforce operational procedures for CDU commissioning, setpoint changes, loop pressure management, and fluid quality monitoring.

  • Support design and construction coordination for liquid cooling infrastructure in new data center buildouts, ensuring BMS and controls readiness at Day 1.

Operational Reliability & Incident Response

  • Establish and maintain facility event management processes, including on-call response protocols for facility telemetry anomalies.

  • Lead root cause analysis for facility system failures and implement corrective actions to prevent recurrence.

  • Partner with the data center operations team to maintain and refine emergency response runbooks tied to BMS alerts and automated controls.

  • Drive continuous improvement in MTTR for facility-related events through better telemetry coverage and automated remediation.
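
To make the MTTR goal above concrete, here is a minimal Python sketch. It assumes MTTR means mean time to resolve, measured as the average interval between when an event is detected and when it is resolved; the posting does not define the term, and the function name and the (detected, resolved) tuple layout are hypothetical.

```python
from datetime import datetime, timedelta


def mttr(events: list[tuple[datetime, datetime]]) -> timedelta:
    """Mean of (resolved - detected) over a list of facility events."""
    if not events:
        raise ValueError("no events to average")
    # Sum the per-event durations, starting from a zero timedelta.
    total = sum((resolved - detected for detected, resolved in events), timedelta())
    return total / len(events)


if __name__ == "__main__":
    start = datetime(2026, 1, 1, 12, 0)
    sample = [
        (start, start + timedelta(minutes=30)),
        (start, start + timedelta(minutes=90)),
    ]
    print(mttr(sample))
```

Tracking this figure per site or per alarm type is one way to see whether added telemetry coverage is shortening response times.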

Vendor & Stakeholder Management

  • Manage BMS integrators, DCIM vendors, and controls subcontractors, from RFP through design, installation, commissioning, and ongoing support.

  • Serve as the primary technical interface with colocation providers on all BMS/EPMS integration topics.

  • Collaborate with Lambda's infrastructure engineering, construction, and procurement teams to align controls requirements with facility buildout timelines.

  • Support due diligence and technical evaluation for new colocation sites and modular data center deployments from a telemetry and controls readiness perspective.

You

Required Experience

  • 7+ years of experience in data center infrastructure engineering, with at least 4 years focused on BMS, DCIM, or controls systems in a hyperscale, colocation, or AI/HPC environment.

  • Hands-on experience designing and integrating BMS for mission-critical facilities including UPS, PDU, CRAH/CRAC, chiller plant, cooling tower, and liquid cooling (CDU/in-row) systems.

  • Strong working knowledge of industrial control protocols: BACnet IP/MS-TP, Modbus TCP/RTU, SNMP, DNP3, and modern API-based integrations.

  • Demonstrated experience with DCIM platforms (Nlyte, Sunbird, Vertiv TRELLIS, or equivalent) including deployment, configuration, and ongoing administration.

  • Experience with real-time telemetry stacks (Prometheus, InfluxDB, Grafana, or similar) applied to infrastructure monitoring use cases.

  • Strong understanding of data center power and cooling systems, including PUE optimization, thermal management, and redundancy architectures (2N, N+1).

Preferred Qualifications

  • Direct experience with direct liquid cooling (DLC) systems, CDU controls integration, and TCS loop management for high-density AI GPU deployments (100+ kW per rack).

  • Familiarity with OCP (Open Compute Project) hardware and telemetry standards.

  • Experience working with major colocation providers (Equinix, Digital Realty, CyrusOne, etc.) on BMS/EPMS integration and data sharing agreements.

  • Exposure to modular or edge data center deployments and associated controls considerations.

  • Background in scripting and automation (Python, Ansible, Terraform) applied to infrastructure management workflows.

  • Experience operating data centers at international scale, including Asia-Pacific or Southeast Asian markets.

  • Relevant certifications: CDCP, CDCE, ETA Data Center Specialist, or vendor-specific BMS/controls certifications.

What We Offer:

  • Opportunity to shape the telemetry and controls architecture for one of the fastest-growing AI infrastructure platforms in the industry.

  • Work with cutting-edge GPU infrastructure at rack densities at the frontier of what the industry has deployed.

  • Collaborative environment with experienced infrastructure, construction, and vendor teams across a rapidly scaling global portfolio.

  • Competitive compensation including salary, equity, and comprehensive benefits.

  • Flexibility in work location with hybrid/remote options depending on facility portfolio needs.

Salary Range Information

The annual salary range for this position has been set based on market data and other factors. However, a salary higher or lower than this range may be appropriate for a candidate whose qualifications differ meaningfully from those listed in the job description.

About Lambda

  • Founded in 2012, with 500+ employees, and growing fast

  • Our investors notably include TWG Global, US Innovative Technology Fund (USIT), Andra Capital, SGW, Andrej Karpathy, ARK Invest, Fincadia Advisors, G Squared, In-Q-Tel (IQT), KHK & Partners, NVIDIA, Pegatron, Supermicro, Wistron, Wiwynn, Gradient Ventures, Mercato Partners, SVB, 1517, and Crescent Cove

  • We have research papers accepted at top machine learning and graphics conferences, including NeurIPS, ICCV, SIGGRAPH, and TOG

  • Our values are publicly available: https://lambda.ai/careers

  • We offer generous cash & equity compensation

  • Health, dental, and vision coverage for you and your dependents

  • Wellness and commuter stipends for select roles

  • 401k Plan with 2% company match (USA employees)

  • Flexible paid time off plan that we all actually use

A Final Note:

You do not need to match all of the listed expectations to apply for this position. We are committed to building a team with a variety of backgrounds, experiences, and skills.

Equal Opportunity Employer

Lambda is an Equal Opportunity employer. Applicants are considered without regard to race, color, religion, creed, national origin, age, sex, gender, marital status, sexual orientation and identity, genetic information, veteran status, citizenship, or any other factors prohibited by local, state, or federal law.
