Can we reuse the same prompt across roles?

Yes, for very similar roles. But every role has its specifics ('Python vs Go', 'Berlin vs remote', 'senior vs mid'), so a universal prompt fits any single role only about 70%. Recommendation: keep one master prompt per job family (backend, frontend, marketing) and tweak the mandatory skills per role.

What if the AI score and my judgement systematically diverge?

Run three diagnostics. One - read the reasoning. If the substance is wrong, the prompt is too vague. Two - look at the score distribution against sensitive attributes (gender, age). If the prompt shows a bias, re-instruct. Three - honestly ask whether your own judgement is systematically biased. Sometimes the AI is right.
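The second diagnostic can be approximated with a few lines of Python. This is a minimal sketch, not a feature of the product: `ScoredApplication`, `mean_score_by_group`, `largest_gap` and the 10-point threshold are hypothetical names and values, and it assumes you can export scores together with an audit-only group label. A gap between group means is a reason to re-read the prompt, not proof of bias, especially on small samples.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class ScoredApplication:
    score: float  # screening score, assumed to be on a 0-100 scale
    group: str    # audit-only label, e.g. an age band (hypothetical field)


def mean_score_by_group(apps):
    """Return the mean score per group label."""
    buckets = defaultdict(list)
    for app in apps:
        buckets[app.group].append(app.score)
    return {group: mean(scores) for group, scores in buckets.items()}


def largest_gap(means):
    """Difference between the highest and the lowest group mean."""
    return max(means.values()) - min(means.values())


# Toy data for illustration only.
sample = [
    ScoredApplication(80, "under_40"),
    ScoredApplication(70, "under_40"),
    ScoredApplication(60, "40_plus"),
    ScoredApplication(50, "40_plus"),
]
means = mean_score_by_group(sample)
gap = largest_gap(means)
# Assumed threshold: a gap above 10 points warrants a closer look at the prompt.
needs_review = gap > 10
```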

Do we have to tell candidates which prompt was used?

On request, yes. The GDPR Art. 15 right of access and the AI Act transparency rules cover the 'logic of processing'. You don't have to volunteer the exact prompt wording, but on request you must describe which factors enter the score. Per-role documentation helps (mandatory step 6 in the AI Act checklist).


Setting up AI screening correctly - write the prompt, validate, calibrate

AI pre-sorting stands or falls with the prompt. Here's how to write a good one, test it on 20 applications, and tune it over time - without overriding your HR judgement.


Finn Glas, Co-Founder + Engineering

January 16, 2026 · 4 min read

Key takeaways

A good prompt is three sentences. Longer prompts get weighted unevenly by the LLM and produce less consistent scores.

Validating with 20 applications from the talent pool is the most important step - skip it at your peril.

Re-calibrate every 30-50 applications or whenever the hiring team changes.

The AI score never overrides HR judgement. If the score says 92 and HR says 'wrong person', HR is right.

Step by step

1. Reduce profile to three mandatory skills

Look at the job ad. Which three skills can you not drop without the role becoming a different one? Those three go into the prompt. Everything else is either a plus or irrelevant.

2. Write the prompt in three-sentence structure

Example: 'Look for 3+ years backend experience with Python or Go, sufficient DB experience (Postgres or MySQL), and EU work permit. Pluses: Kubernetes experience, open-source contributions with own code. Frontend experience is not relevant for this role.'

3. Validate against 20 talent-pool applications

Pick 20 applications from your pool, ideally a mix of 'top back then' and 'didn't fit back then'. Have the prompt score them. Compare the top 5: do they match your memory? If yes, continue. If not, go back to step 2.

4. Go live and observe

Flip the switch in role editing. The next 30 applications come with score + reasoning. Spot-check every 5-10 applications: does the reasoning match the CV? If the reasoning hallucinates ('person has 8 years of experience' for a 3-year CV), that signals a prompt problem.

5. Re-calibrate after 50 applications

After 50 applications with the prompt, look at the score distribution. Are 80% of your invitees in the top 20%? If yes, continue. If your top invitees land mid-score, adjust - usually a plus component is missing or a mandatory component is too strict.
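The check in step 5 can be sketched as a small calculation. This is an illustration under assumptions: the function name `share_of_invitees_in_top`, the id-to-score dictionary, and the toy data are hypothetical, and you would feed in your own exported scores and invitation decisions.

```python
import math


def share_of_invitees_in_top(scores, invited_ids, top_fraction=0.2):
    """Fraction of invited candidates whose score is in the top `top_fraction`.

    scores: dict mapping application id -> score
    invited_ids: set of application ids that HR invited to an interview
    """
    if not invited_ids:
        raise ValueError("no invitees to evaluate")
    ranked = sorted(scores, key=scores.get, reverse=True)
    cutoff = max(1, math.ceil(len(ranked) * top_fraction))
    top_ids = set(ranked[:cutoff])
    return len(top_ids & set(invited_ids)) / len(invited_ids)


# Toy data: 10 applications scored 10..100, HR invited three of them.
scores = {f"app{i}": i * 10 for i in range(1, 11)}
invited = {"app10", "app9", "app5"}
share = share_of_invitees_in_top(scores, invited)
# 80% is the threshold named in step 5.
keep_prompt = share >= 0.8
```

In this toy run one invitee ("app5") sits mid-score, so the share falls below the threshold and the prompt would be adjusted.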

What a good screening prompt contains

Three components in this exact order. One - mandatory skills (max 3). 'Look for Python + backend experience + EU work permit'. Two - strong pluses (max 2). 'Pluses: Kubernetes experience and open-source contributions'. Three - what's not relevant. 'Frontend experience is not relevant for this role'.

Why exactly three. A longer prompt isn't evenly weighted - the seventh criterion has statistically half the impact of the third. Listing 12 requirements gives you a score primarily reflecting the first three. Focus on the three that genuinely decide.
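One way to keep prompts inside these limits is to assemble them from the three components instead of writing free text. The following is a minimal sketch, assuming nothing about how the product stores prompts: `ScreeningPrompt` and `render` are hypothetical names, and the wording of the third sentence is one possible phrasing.

```python
from dataclasses import dataclass, field


@dataclass
class ScreeningPrompt:
    mandatory: list                                   # max 3 mandatory skills
    pluses: list = field(default_factory=list)        # max 2 strong pluses
    not_relevant: list = field(default_factory=list)  # explicitly irrelevant

    def __post_init__(self):
        # Enforce the limits from the text before the prompt is ever used.
        if not 1 <= len(self.mandatory) <= 3:
            raise ValueError("use one to three mandatory skills")
        if len(self.pluses) > 2:
            raise ValueError("use at most two pluses")

    def render(self):
        """Build the prompt in the fixed order: mandatory, pluses, not relevant."""
        sentences = ["Look for " + ", ".join(self.mandatory) + "."]
        if self.pluses:
            sentences.append("Pluses: " + ", ".join(self.pluses) + ".")
        if self.not_relevant:
            sentences.append(
                "Not relevant for this role: " + ", ".join(self.not_relevant) + "."
            )
        return " ".join(sentences)


prompt = ScreeningPrompt(
    mandatory=[
        "3+ years backend experience with Python or Go",
        "DB experience (Postgres or MySQL)",
        "EU work permit",
    ],
    pluses=["Kubernetes experience", "open-source contributions with own code"],
    not_relevant=["frontend experience"],
)
text = prompt.render()
```

Adding a fourth mandatory skill raises a `ValueError`, which forces the decision about which three criteria genuinely decide.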

Anti-patterns we see often

One - 'look for a top application impression'. Meaningless to the AI; it doesn't know what 'top impression' means. Stay concrete: 'look for structured CV language and concrete project descriptions'.

Two - 'assess cultural fit'. Forbidden under the German General Equal Treatment Act (AGG), risky under the AI Act, technically unreliable. Ask 'how does this person work with others?' in the structured interview - not in AI screening.

Three - 'reject applications under score 40'. Auto-rejection without human review is forbidden (Art. 22 GDPR). The system may at most 'auto-sort into a lane', never 'send auto-rejection'.
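The difference between 'auto-sort into a lane' and 'send auto-rejection' can be made explicit in code. This is a hypothetical sketch, not the product's implementation: `Lane`, `sort_into_lane` and the threshold of 40 are illustrative. The point is that no outcome of the function ends the process without a human.

```python
from enum import Enum


class Lane(Enum):
    REVIEW_FIRST = "review_first"
    REVIEW_LATER = "review_later"


def sort_into_lane(score, priority_threshold=40):
    """Route an application to a review lane based on its score.

    There is deliberately no 'rejected' lane: a low score only changes
    the order in which a human looks at the application.
    """
    if score >= priority_threshold:
        return Lane.REVIEW_FIRST
    return Lane.REVIEW_LATER


high = sort_into_lane(92)
low = sort_into_lane(12)
```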

Validate before trusting the AI

Before going live, have the AI score 20 historical applications from your talent pool. Compare the scores with your own past assessment: does the top-5 selection match? If yes, the prompt works. If not, adjust - usually a rewording of the mandatory skills suffices.

The 80% match rule of thumb: in our pilot data, the top-20% AI selection matches HR judgement ~80% of the time. If your validation is clearly below that (~50%), the prompt has a problem - usually it is too unspecific or too creative.
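The top-5 comparison can be turned into a number you can track across prompt versions. A minimal sketch under assumptions: `top_k_overlap` is a hypothetical helper, and the HR ratings are whatever ordinal assessment you recorded at the time (here a made-up 1-20 scale).

```python
def top_k_overlap(ai_scores, hr_scores, k=5):
    """Share of the HR top-k that also appears in the score-based top-k.

    Both arguments map application id -> numeric rating (higher is better).
    """
    if len(ai_scores) < k or len(hr_scores) < k:
        raise ValueError("need at least k applications on both sides")
    ai_top = set(sorted(ai_scores, key=ai_scores.get, reverse=True)[:k])
    hr_top = set(sorted(hr_scores, key=hr_scores.get, reverse=True)[:k])
    return len(ai_top & hr_top) / k


# Toy data: 20 historical applications, ids 0..19.
ai_scores = {i: 100 - i * 4 for i in range(20)}
hr_scores = {i: 20 - i for i in range(20)}
# HR rated application 7 higher than application 4 back then.
hr_scores[4], hr_scores[7] = hr_scores[7], hr_scores[4]

overlap = top_k_overlap(ai_scores, hr_scores)
```

With these toy values four of the five HR favourites also appear in the score-based top 5, which is in the range the rule of thumb describes.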



Written by

Finn Glas

Co-Founder + Engineering

Finn is one of the Co-Founders. He owns the engineering side, the infrastructure, and most of the late-night fixes that ship before anyone notices.

finn.glas at aicuflow dot com