AI-Native Hiring Assessment

Hire engineers who can leverage AI.

AI changed how engineers build, but hiring hasn't caught up. We're closing that gap.

Traditional hiring methods are outdated.

Hiring teams can't tell who leverages AI and who leans on it.

Category        | Traditional OA                    | TenX Assessment
AI policy       | Not allowed                       | Required
What's assessed | Algorithmic puzzles and syntax    | Prompts, code comprehension, solution quality
What you get    | A score and a pass/fail threshold | AI usage profile, team fit, multi-dimensional scoring

Send an invite and receive a report.

  1. Candidate invited

     One-click link. No install, no portal.

  2. Timed project

     60-minute build in a realistic dev environment with AI.

  3. Activity captured

     Every prompt, iteration, and edit logged in real time.

  4. Technical deep dive

     Targeted questions probe comprehension and decision-making.

  5. Report delivered

     AI usage profile, team fit, and scores sent to your team.

THE ASSESSMENT

Real work. Real tools.
Real signal.

Candidates work on projects in a full dev environment with AI.

[Preview] README.md
devcontainer.json
TenX AI v0.3.1
Type /help for commands.
Assessment timer: 60:00. Type /time to check remaining time.
(59:59) What are some options we can explore for the leaderboard's ranking algorithms?
Read(ranking.py)
Working… (Ctrl+C to interrupt)
(59:20) Add a new recommendation engine feature that analyzes user behavior patterns
4 Prompts
2 Iterations
1/3 AI Accepts
checkout.ts
const subtotal = cart.items
  .map(i => priceFor(i) * i.qty)
  .reduce((a, b) => a + b, 0);
// TODO: bug: discount applies AFTER tax
const taxed = subtotal * 1.0825;
const total = taxed - discountFor(userId);
TECHNICAL DEEP-DIVE

Candidates own every decision.

After the build, TenX probes the decisions they made, the AI suggestions they rejected, and the edge cases they considered. Reasoning is graded alongside the code.

Deep-Dive · Live session 0142

See what a TenX report looks like.

AI Usage Profile. Code Comprehension. Solution Quality.

Candidate Report
Implementation
STRONG

The implementation uses full-text search with weighted ranking across all three entity types. Account hits are weighted 3×, ticket hits 2×, note hits 1×. Scores are aggregated per account via CTE structure, deduplication is handled by GROUP BY with MAX aggregation for multi-match entities, and results are ordered by total_score DESC. The query structure matches the rubric's strong-tier example: separate CTEs per entity type, ts_rank weighting, UNION ALL aggregation, and final join to accounts.
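The scoring scheme described above can be sketched in a few lines of Python. This is an illustrative model only — the actual implementation is SQL (ts_rank, per-entity CTEs, UNION ALL), and the hit data here is hypothetical; only the weights (3/2/1) and the MAX-per-account deduplication come from the report.

```python
# Sketch of the weighted cross-entity ranking: per-entity hits are
# deduplicated per account with MAX, weighted (accounts 3x, tickets 2x,
# notes 1x), summed per account, and sorted descending.
ACCOUNT_MATCH_WEIGHT, TICKET_MATCH_WEIGHT, NOTE_MATCH_WEIGHT = 3, 2, 1

# Hypothetical (account_id, raw_rank) pairs, as ts_rank might return them.
account_hits = [("a1", 0.6), ("a2", 0.4)]
ticket_hits = [("a1", 0.5), ("a1", 0.9), ("a2", 0.2)]  # a1 matches two tickets
note_hits = [("a2", 0.8)]

def best_per_account(hits, weight):
    # GROUP BY account with MAX aggregation, then apply the entity weight.
    best = {}
    for account_id, rank in hits:
        best[account_id] = max(best.get(account_id, 0.0), rank)
    return {a: r * weight for a, r in best.items()}

totals = {}
for scores in (
    best_per_account(account_hits, ACCOUNT_MATCH_WEIGHT),
    best_per_account(ticket_hits, TICKET_MATCH_WEIGHT),
    best_per_account(note_hits, NOTE_MATCH_WEIGHT),
):
    for account_id, score in scores.items():
        totals[account_id] = totals.get(account_id, 0.0) + score

# ORDER BY total_score DESC
ranked = sorted(totals.items(), key=lambda kv: kv[1], reverse=True)
```

With these numbers, a1 scores 0.6·3 + 0.9·2 = 3.6 and a2 scores 0.4·3 + 0.2·2 + 0.8·1 = 2.4, so a1 ranks first even though a2 matched more entity types.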

CODE-REVIEW OBSERVATION

Cross-entity coverage: The query searches accounts (full_name, email, company_name), tickets (title, description), and notes (content). All three are weighted via module-level constants (3, 2, 1).

Code Quality
MODERATE

Core search functionality is complete: full-text search across three entity types, deduplication at account level, weighted ranking. CTE structure cleanly separates concerns (account hits, ticket hits, note hits, aggregation). Weights are extracted to named constants. No test files present in the diff. The candidate describes manual validation in FRQ 3 (a dozen representative queries, timing…

CODE EVIDENCE

if not q or not q.strip(): return [] — empty input handled

Candidate Report
Code Comprehension
PARTIAL

The candidate describes the high-level shape of the implementation (full-text search with weighted ranking) but cannot recall specific weight values, mis-states the aggregation choice, and attributes several decisions to "what the AI generated" rather than to their own reasoning.

FRQ EVIDENCE

Turn 1: candidate stated 'I used ts_rank with higher weight on account fields, I think' — could not specify ACCOUNT_MATCH_WEIGHT=3, TICKET_MATCH_WEIGHT=2, NOTE_MATCH_WEIGHT=1 without re-opening the file. Later described aggregation as SUM when the code uses MAX.

Tradeoff Awareness
LIMITED

The candidate does not surface tradeoffs unprompted. When asked directly about aggregation or indexing choices, they give surface-level answers without connecting them to precision/recall, scalability, or maintainability implications.

FRQ EVIDENCE

Turn 2: when probed on MAX vs SUM aggregation, candidate stated 'I think it was just simpler that way' — no articulation of precision vs recall. Did not raise on-the-fly tsvector vs indexed columns or LIMIT placement as design decisions worth discussing.
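The MAX-vs-SUM distinction the probe targets is easy to show with numbers (the ranks here are hypothetical): MAX rewards one strong match, SUM rewards many weak ones, so the aggregation choice can invert which account ranks first.

```python
# Hypothetical ts_rank scores for two accounts' ticket matches.
# MAX keeps only the single best match per account; SUM accumulates
# every match, so an account with many marginal hits can outrank
# an account with one highly relevant hit.
one_strong = [0.9]                 # account A: one highly relevant ticket
many_weak = [0.3, 0.3, 0.3, 0.3]   # account B: four marginal tickets

assert max(one_strong) > max(many_weak)  # MAX ranks A first
assert sum(many_weak) > sum(one_strong)  # SUM ranks B first
```

This is the precision-vs-recall framing the rubric looks for: MAX favors the single most relevant entity, SUM favors breadth of matches, and "it was just simpler that way" does not engage with either.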

Candidate Report
Prompting
ITERATIVE
SPEC-DRIVEN

The candidate skipped upfront planning and moved directly into implementation prompts, course-correcting reactively as errors surfaced rather than scoping the work in advance.

Collaboration
DIRECTING
DELEGATING

The candidate handed off broadly scoped tasks to the AI, accepted its structural and naming choices without pushback, and intervened only when the implementation failed outright.

Validation
ACCEPTING
SKEPTICAL

The candidate merged AI-generated SQL without independent review, relied on the automated test suite as sole verification, and did not probe edge cases or security implications before moving on.

Integration
SHALLOW
DEEP

The candidate engaged AI across planning, implementation, debugging, and documentation, leaning on it for most decisions rather than treating it as one input among several.

BEST FIT

Fits high-velocity product teams where shipping speed outweighs cleanup cost and downstream review catches the misses. Poor fit for regulated environments (fintech, healthcare, security-critical infrastructure) where AI-generated code requires strict validation before merge.

EARLY ACCESS

We're onboarding pilot partners now.

Pilot partners shape the product. Pricing is a conversation, not a checkout.

Accepting a limited number of pilot partners now.