AI-Native Hiring Assessment

Hire engineers who can leverage AI.

AI changed how engineers build, but hiring hasn't caught up. We're closing that gap.

Traditional hiring methods are outdated.

Hiring teams can't tell who leverages AI and who leans on it.

Category        | Traditional OA                    | TenX Assessment
AI policy       | Not allowed                       | Required
What's assessed | Algorithmic puzzles and syntax    | Prompts, code comprehension, solution quality
What you get    | A score and a pass/fail threshold | AI usage profile, team fit, multi-dimensional scoring

Send an invite and receive a report.

  1. Candidate invited

     One-click link. No install, no portal.

  2. Timed project

     60-minute build in a realistic dev environment with AI.

  3. Activity captured

     Every prompt, iteration, and edit logged in real time.

  4. Technical deep dive

     Targeted questions probe comprehension and decision-making.

  5. Report delivered

     AI usage profile, team fit, and scores sent to your team.

THE ASSESSMENT

Real work. Real tools.
Real signal.

Candidates work on projects in a full dev environment with AI.

[Preview] README.md
devcontainer.json
TenX AI v0.3.1
Type /help for commands.
Assessment timer: 60:00. Type /time to check remaining time.
(59:59) What are some options we can explore for the leaderboard's ranking algorithms?
Read(ranking.py)
Working… (Ctrl+C to interrupt)
(59:20) Add a new recommendation engine feature that analyzes user behavior patterns
4 Prompts
2 Iterations
1/3 AI Accepts
checkout.ts
const subtotal = cart.items
  .map(i => priceFor(i) * i.qty)
  .reduce((a, b) => a + b, 0);
// TODO: bug: discount applies AFTER tax
const taxed = subtotal * 1.0825;
const total = taxed - discountFor(userId);
TECHNICAL DEEP-DIVE

Candidates own every decision.

After the build, TenX probes the decisions they made, the AI suggestions they rejected, and the edge cases they considered. Reasoning is graded alongside the code.

Deep-Dive · Live session 0142

See what a TenX report looks like.

AI Usage Profile. Code Comprehension. Solution Quality.

Candidate Report
Implementation
STRONG

The implementation uses full-text search with weighted ranking across all three entity types. Account hits are weighted 3×, ticket hits 2×, note hits 1×. Scores are aggregated per account via CTE structure, deduplication is handled by GROUP BY with MAX aggregation for multi-match entities, and results are ordered by total_score DESC. The query structure matches the rubric's strong-tier example: separate CTEs per entity type, ts_rank weighting, UNION ALL aggregation, and final join to accounts.
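The scoring scheme described above can be sketched in a few lines of Python. This is an illustrative model only — the actual implementation is SQL (ts_rank, per-entity CTEs, UNION ALL), and the hit data here is hypothetical; only the weights (3/2/1) and the MAX-per-account deduplication come from the report.

```python
# Sketch of the weighted cross-entity ranking: per-entity hits are
# deduplicated per account with MAX, weighted (accounts 3x, tickets 2x,
# notes 1x), summed per account, and sorted descending.
ACCOUNT_MATCH_WEIGHT, TICKET_MATCH_WEIGHT, NOTE_MATCH_WEIGHT = 3, 2, 1

# Hypothetical (account_id, raw_rank) pairs, as ts_rank might return them.
account_hits = [("a1", 0.6), ("a2", 0.4)]
ticket_hits = [("a1", 0.5), ("a1", 0.9), ("a2", 0.2)]  # a1 matches two tickets
note_hits = [("a2", 0.8)]

def best_per_account(hits, weight):
    # GROUP BY account with MAX aggregation, then apply the entity weight.
    best = {}
    for account_id, rank in hits:
        best[account_id] = max(best.get(account_id, 0.0), rank)
    return {a: r * weight for a, r in best.items()}

totals = {}
for scores in (
    best_per_account(account_hits, ACCOUNT_MATCH_WEIGHT),
    best_per_account(ticket_hits, TICKET_MATCH_WEIGHT),
    best_per_account(note_hits, NOTE_MATCH_WEIGHT),
):
    for account_id, score in scores.items():
        totals[account_id] = totals.get(account_id, 0.0) + score

# ORDER BY total_score DESC
ranked = sorted(totals.items(), key=lambda kv: kv[1], reverse=True)
```

With these numbers, a1 scores 0.6·3 + 0.9·2 = 3.6 and a2 scores 0.4·3 + 0.2·2 + 0.8·1 = 2.4, so a1 ranks first even though a2 matched more entity types.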

CODE-REVIEW OBSERVATION

Cross-entity coverage: The query searches accounts (full_name, email, company_name), tickets (title, description), and notes (content). All three are weighted via module-level constants (3, 2, 1).

Code Quality
MODERATE

Core search functionality is complete: full-text search across three entity types, deduplication at account level, weighted ranking. CTE structure cleanly separates concerns (account hits, ticket hits, note hits, aggregation). Weights are extracted to named constants. No test files present in the diff. The candidate describes manual validation in FRQ 3 (a dozen representative queries, timing…

CODE EVIDENCE

if not q or not q.strip(): return [] — empty input handled

Candidate Report
Code Comprehension
PARTIAL

The candidate describes the high-level shape of the implementation (full-text search with weighted ranking) but cannot recall specific weight values, mis-states the aggregation choice, and attributes several decisions to "what the AI generated" rather than to their own reasoning.

FRQ EVIDENCE

Turn 1: candidate stated 'I used ts_rank with higher weight on account fields, I think' — could not specify ACCOUNT_MATCH_WEIGHT=3, TICKET_MATCH_WEIGHT=2, NOTE_MATCH_WEIGHT=1 without re-opening the file. Later described aggregation as SUM when the code uses MAX.

Tradeoff Awareness
LIMITED

The candidate does not surface tradeoffs unprompted. When asked directly about aggregation or indexing choices, they give surface-level answers without connecting them to precision/recall, scalability, or maintainability implications.

FRQ EVIDENCE

Turn 2: when probed on MAX vs SUM aggregation, candidate stated 'I think it was just simpler that way' — no articulation of precision vs recall. Did not raise on-the-fly tsvector vs indexed columns or LIMIT placement as design decisions worth discussing.
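The MAX-vs-SUM distinction the probe targets is easy to show with numbers (the ranks here are hypothetical): MAX rewards one strong match, SUM rewards many weak ones, so the aggregation choice can invert which account ranks first.

```python
# Hypothetical ts_rank scores for two accounts' ticket matches.
# MAX keeps only the single best match per account; SUM accumulates
# every match, so an account with many marginal hits can outrank
# an account with one highly relevant hit.
one_strong = [0.9]                 # account A: one highly relevant ticket
many_weak = [0.3, 0.3, 0.3, 0.3]   # account B: four marginal tickets

assert max(one_strong) > max(many_weak)  # MAX ranks A first
assert sum(many_weak) > sum(one_strong)  # SUM ranks B first
```

This is the precision-vs-recall framing the rubric looks for: MAX favors the single most relevant entity, SUM favors breadth of matches, and "it was just simpler that way" does not engage with either.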

Candidate Report
Prompting
ITERATIVE
SPEC-DRIVEN

The candidate skipped upfront planning and moved directly into implementation prompts, course-correcting reactively as errors surfaced rather than scoping the work in advance.

Collaboration
DIRECTING
DELEGATING

The candidate handed off broadly scoped tasks to the AI, accepted its structural and naming choices without pushback, and intervened only when the implementation failed outright.

Validation
ACCEPTING
SKEPTICAL

The candidate merged AI-generated SQL without independent review, relied on the automated test suite as sole verification, and did not probe edge cases or security implications before moving on.

Integration
SHALLOW
DEEP

The candidate engaged AI across planning, implementation, debugging, and documentation, leaning on it for most decisions rather than treating it as one input among several.

BEST FIT

Fits high-velocity product teams where shipping speed outweighs cleanup cost and downstream review catches the misses. Poor fit for regulated environments (fintech, healthcare, security-critical infrastructure) where AI-generated code requires strict validation before merge.

EARLY ACCESS

We're onboarding pilot partners now.

Pilot partners shape the product. Pricing is a conversation, not a checkout.

Accepting a limited number of pilot partners now.