Build civilization-scale infrastructure.
We hire patiently and at depth across all eight research pillars and three internal projects. Apik is a small, hand-selected team that is growing slowly on purpose. If you want to work on the foundations of post-scarcity coordination, with a research bar calibrated against Anthropic-tier and DeepMind-tier work, this is the page that documents how that hiring runs.
The candidate archetypes.
Apik does not run a single recruiting funnel against a single role-shape. The archetypes below are deliberately distinct: the work each of them does is different, the signal we look for in each is different, and the reasons we decline a strong candidate from one archetype are not the reasons we would decline from another. The shapes are the ones that have stabilized across the frontier-research cohort, and the descriptions here are written so a candidate can self-assign before applying.
Holds open problems on a research pillar; publishes in the venues appropriate to that pillar; reads the literature in adjacent pillars deeply enough to argue across them. We look for sustained engagement with a small number of questions over years rather than incidental engagement with many. Strong candidates can name the two or three problems in their pillar that they think the field has gotten wrong, and can argue for the position.
Builds the rigs that research scientists work in and ships the systems research scientists need to validate claims. The role is engineering-grade, not research-grade-with-good-engineering: research engineers own evaluation harnesses, run-tracking, training infrastructure, and the deployment plumbing for product-side experiments. We look for engineers who read papers carefully and ship code that survives review by both researchers and other engineers.
The rare role. Authors and revises policy documents at the standard of the Responsible Development Policy and the Apik manifesto: dense, cited, falsifiable, written to survive contact with regulators and adversarial press. Candidates here usually come from a hybrid background — law-and-AI, economics-and-AI, governance-and-engineering. We look for the ability to hold a contested position in writing without flinching from the counterarguments.
Aegis-shaped. Specifies and mechanically checks safety-critical interfaces — agent action grammars, side-effect policy cores, runtime monitors. Candidates have prior production work in TLA+, Coq, Isabelle, Lean, or a comparable proof assistant; bonus for verification work on privileged-interface systems (seL4-style, Project Everest, AWS automated-reasoning group).
How candidates are evaluated.
Hiring at Apik is slow on purpose. A typical loop runs four to eight weeks from first contact to offer; senior hires can run longer. The slowness is the signal — if the loop is moving fast, the loop is calibrated against the wrong standard. We benchmark the bar against the public hiring documentation and the published practice of the frontier-research cohort: the calibration target is Anthropic-tier and DeepMind-tier hiring, not generic-AI-startup hiring.
We do not run leetcode loops for research-track roles. The signal we have found most predictive — across both research-scientist and research-engineer archetypes — is the ability to write a two-page argument that survives substantive review. Strong candidates can produce an argument-bearing document about a contested research question, and their argument holds up against the obvious counterarguments. The signal is unusually decision-relevant because the company itself runs on documents of this shape: research agendas, policy frameworks, system cards, manifesto-style framings. A candidate who cannot produce documents like this would not have the leverage at Apik that the role requires.
The standard interview has five stages: an initial conversation about research interests and prior work; a written exercise on a question relevant to the role; a technical round with a researcher in an adjacent pillar; a culture and posture round with one of the founders; and reference checks. We share interview material in advance where possible, and we give feedback to declined candidates who ask for it. The bar is high; the process is meant to be respectful regardless of outcome.
Pay, equity, and the transparency posture.
Apik publishes compensation bands for every role at offer time. The bands are calibrated against Anthropic-tier and DeepMind-tier base cash for equivalent roles; equity is meaningful and is structured to reward sustained tenure rather than departure. Where a role's band is wider than usual, as is typical of research-scientist seniority bands, the position within the band is decided by calibration against current Apik teammates at the same seniority, and the calibration is explained at offer time.
We publish bands because the compensation literature is unambiguous about what unpublished bands do: they widen the asymmetry of negotiation in favor of the employer, in ways that compound across cohorts and disproportionately disadvantage candidates from groups under-represented in the field. The transparency posture is the same one that runs through the rest of the lab — the Responsible Development Policy, the system-card schema, the policy review on news posts — and it is meant to be of a piece.
We do not run counter-offer dynamics. We do not refresh equity to retain against an external offer. We do not pay above band to win a candidate away from a competing offer. The principle is that the band is the band, and the band has to be defensible at the start of the loop or it is not the right band.
The fellowship program.
The visiting-researcher program runs three-to-twelve-month engagements for senior researchers and PhDs who have a defined collaboration in mind. The program is funded; the deliverable is a publishable artifact — a paper, a policy framework, a verified subsystem, a demonstrated component — that would not have existed without the collaboration. The program structure is modeled on Anthropic's residency framing, MIRI's research workshops, and the visiting-researcher norms at academic-affiliated centers like Cambridge's CSER, Oxford's GovAI, and the OpenAI Forecasting program.
The scientific case for visiting researchers is direct: rotating outside expertise prevents the inbreeding that a small, focused lab is otherwise structurally prone to. Specific pillars — interpretability, formal methods, mechanism design, the embodiment substrate — are research areas where a three-to-six-month deep collaboration with an external specialist materially changes the outcomes the lab produces. The Responsible Development Policy and the system-card schema both benefit from external review by visiting policy researchers and visiting safety scientists; that review has been built into the document-review processes.
Visiting researchers receive Apik-internal access proportionate to the collaboration scope, are credited as primary authors of their work with Apik as the institutional collaborator, and retain copyright on the artifact unless the collaboration is structured otherwise. Proposals go to careers@apiksystems.com with a one-page brief, the proposed deliverable, and at least one reference who can speak to the candidate's prior work.
How the lab runs day to day.
Apik is remote-first with two on-site weeks per year (one research retreat, one engineering retreat). Most communication is written and asynchronous; the standing meetings are kept few. Slack is for coordination, not for work; the work happens in documents, code, and review threads. We do not run a standup; we do run a weekly research review where anyone — researcher, engineer, visiting fellow — can present current work and have it critiqued by the room. The review is the canonical site of internal feedback and is intentionally substantive.
Reading-group cadence runs at one paper per week per pillar, plus a cross-pillar reading group on the manifesto-relevant literature. Attendance is voluntary; participation in the discussion is expected for full-time research roles. The reading list is curated by pillar leads and is published internally; the reading-group discussions occasionally convert into news posts on this site.
Deep work is the default expected mode. Calendars are unscheduled by default for research-track roles; meetings are scheduled into the week with preference for batching. The norm is that two consecutive uninterrupted days per week are the floor, not the ceiling. The tradeoff is fewer status meetings and slower-feeling coordination; the gain is the kind of sustained engagement that is the operating premise of the lab.
Currently hiring
- AI Safety · Research scientist: Mechanistic interpretability · Remote / on-site · Full-time
Drive the interpretability program — sparse-autoencoder methodology at frontier scale, feature audits of agentic plans, intervention-based causal claims. The role owns the interpretability artifact for system-card use and the research direction we publish under that line.
- Agentic Systems · Research engineer: Long-horizon agent infrastructure · Remote / on-site · Full-time
Build the harnesses and the rigs the agentic-systems research depends on: long-horizon task suites, oversight protocols, multi-agent coordination experiments. Interfaces with the Aegis policy core on the verification side and with Brello AI on the deployment side.
- Project Aegis · Formal-methods engineer: Verified envelopes · Remote / on-site · Full-time
Specify and mechanically check the policy core that mediates agent side effects. Background in TLA+, Coq/Isabelle, or seL4-adjacent verification work. The role lives at the boundary between alignment claims and the engineering tolerance that makes them load-bearing.
- Physical Intelligence · Research engineer: Sensorimotor foundation models · On-site · Full-time
Train and evaluate sensorimotor models on the embodied-research platform. The role is on-site because the data-collection and rig-engineering work cannot be done remotely. Background in robotics, RL, or imitation learning.
- Cross-cutting · Visiting researcher / Fellow · Remote / on-site · 3–12 months
Senior researchers and PhDs spending three to twelve months with Apik on a defined collaboration. The arrangement is funded; the deliverable is a published artifact that would not have existed without the collaboration. Pillars currently open for visiting work: interpretability, formal methods, mechanism design, embodiment.
- Cross-cutting · Engineering: Distributed systems & evaluations · Remote / on-site · Full-time
Owns the evaluation infrastructure across all eight pillars. Builds the harnesses, the run-tracking, the statistical analysis stack, the run-reproducibility tooling. The role is plumbing-shaped and high-leverage — every research result on this site eventually passes through your code.
The application packet.
For research-track roles: a cover letter that makes a specific argument about what you would push on at Apik, two to three pieces of relevant prior work (papers, repositories, posts), and references who can speak to recent work. We do not gate on CV format; a clean academic CV or resume is fine but is not the signal we read for. A cover letter under one page that argues something contestable beats a three-page letter that argues nothing.
For research-engineer roles: a cover letter, a representative code artifact (production code is fine; open-source contributions are fine), and a short description of one engineering decision you made on a recent project that you would defend or revisit. The engineering loop is calibrated to assess judgement under realistic constraints, not algorithmic agility under artificial ones.
For visiting-researcher proposals: a one-page brief on the collaboration, the proposed deliverable, the specific Apik person or pillar you would work with, and a senior reference. Proposals that are pillar-aligned and have a named Apik counterpart in mind are read first.
All applications go to careers@apiksystems.com. We respond to every applicant within two weeks of receipt; declined candidates can request feedback and we will provide it where the loop generated useful signal.