Posted Jun 1, 2026

Software Development Engineer (Agentic AI & LLM Platforms) – Work Remotely (EST Hours) – Must Be Able to Obtain Public Trust – No 3rd Parties

Must be able to obtain a Public Trust
Must be able to work remote EST hours

Overview
• We are building the next generation of agentic AI to transform how the agency accelerates research, makes decisions, and ships products at scale.
• We are a small, startup-minded team that ships fast and owns what we build end-to-end.
• We are looking for an SDE II who is hungry to contribute to a real production system, not a sandbox.
• You will work across the application and infrastructure layers, implement features that users interact with every day, and be expected to own what you build from design through deployment.
• You will not be handed perfectly scoped tickets.
• You will be expected to ask good questions, figure things out, and move.
• The best person for this role communicates clearly, collaborates without ego, and brings genuine empathy for the users whose work they are making better.
• You are a self-starter with a high bar and a high sense of urgency.
• You play well with others and make the people around you better.

What You Will Do

Build Agentic AI Systems
• Implement and iterate on our agentic workflows: tool-calling, multi-step reasoning, planning, memory, and agent-to-agent (A2A) communication patterns at the application layer
• Build and maintain MCP (Model Context Protocol) client-side integrations: how agents discover, invoke, and compose tools
• Implement tool definitions, input/output schemas, error handling, retry logic, and result formatting for GRACE's growing tool library
• Contribute to multi-agent orchestration patterns that are reliable and debuggable in production, not just in demos

Build LLM-Powered Features
• Implement LLM orchestration logic: prompt construction, context management, model selection, and response parsing across OpenAI GPT, Anthropic Claude, and Google Gemini
• Build and maintain RAG pipeline components: query formulation, result ranking, citation grounding, and hallucination mitigation
• Implement and iterate on prompt engineering patterns and system prompts that drive quality and consistency across model families
• Contribute to context window budget management: truncation, summarization, and pagination logic that makes the right call at runtime
• Build LLM evaluation components: grounding assessment, regression tests, safety checks, and quality metrics
• Write prompts and pipelines with token economics in mind; cost-per-query is a real constraint, not an afterthought

Own the Backend
• Build secure, well-tested backend features end-to-end: from application logic through to the API contract the frontend consumes
• Implement integrations with internal and external data sources and APIs, including Dimensions, Google Search, Slack, SharePoint, and LLM provider APIs
• Contribute to monitoring, logging, and distributed tracing so that failures are diagnosable and regressions are caught before users report them
• Implement fallback, retry, and graceful degradation patterns for AI service dependencies
• Write production-quality code: readable, tested, reviewed, and documented

Contribute to Infrastructure
• Work within Microsoft Azure infrastructure: Azure Functions, Azure API Management, Azure Container Apps, and Azure OpenAI Service
• Contribute to CI/CD pipelines, deployment automation, and release processes
• Work with containerization tools and infrastructure as code; understand the environment your code runs in
• Contribute to application-level SLOs: tool call success rates, response quality, and latency from the user's perspective

Collaborate and Grow
• Participate actively in design reviews, sprint planning, and retrospectives; ask good questions and push back when something does not add up
• Communicate technical decisions clearly to both engineers and non-engineers; no one should have to guess what you built or why
• Work closely with the PM, researcher, designer, and senior engineers to translate ambiguous requirements into clear, actionable implementations
• Bring genuine curiosity and empathy to every feature; understand who is using what you build and why it matters to them
• Ensure strong privacy, security, and compliance in all systems, integrations, and data handling

Basic Qualifications
• Bachelor's or Master's degree in Computer Science, Software Engineering, or a related field, or equivalent practical experience
• 3+ years of professional software engineering experience building and operating production systems
• Proven experience in high-velocity environments where you contributed to shipping real products end-to-end
• Strong proficiency in Python and at least one other backend language; familiarity with modern backend frameworks and async patterns
• Solid understanding of algorithms, data structures, distributed systems, and software design patterns
• Experience building and operating systems on major cloud platforms (AWS, GCP, or Azure)
• Experience with containerization (Docker) and working within CI/CD pipelines
• Clear, direct communicator who gives and receives feedback well, works with empathy, and makes the people around them better

Preferred Qualifications
• Hands-on experience building features on top of LLMs in production: tool-calling, RAG, multi-step reasoning, and context management
• Familiarity with A2A (Agent-to-Agent) communication patterns and multi-agent orchestration frameworks
• Familiarity with MCP at the client/consumer layer: how agents discover and invoke tools via MCP
• Working knowledge of prompt engineering and LLM behavior across model families; you understand why Claude and GPT respond differently to the same prompt
• Experience with LLM evaluation, grounding assessment, or regression testing for AI-powered systems
• Awareness of token economics at the application layer: cost-per-query, context budget management, and prompt efficiency
• Experience on Microsoft Azure: Azure Functions, API Management, Container Apps, or Azure OpenAI Service
• Familiarity with secrets management, least-privilege access, and security-conscious engineering practices
• Experience in startup or early-stage environments: comfort with ambiguity, rapid iteration, and wearing multiple hats
• Experience in healthcare, life sciences, or other regulated domains is a plus but not required

Why This Role
• You will work on a production system that real users depend on every day to do meaningful work.
• You will not be one of hundreds of engineers on a feature nobody uses.
• You will see the impact of what you build quickly, get direct feedback, and have real ownership over your work.