Automatic Call Scoring With Custom Rules: How AI Call Scoring Works for Sales Teams
Set up automatic call scoring with custom rules that match how YOUR team sells. No generic templates -- define what a great call looks like in plain English.
Coldread Team
We help small sales teams get enterprise-level call intelligence.
Your team does not sell the same way as every other team. So why would you score calls with the same generic template?
Most call scoring tools hand you a fixed scorecard -- "Engagement: 7.2, Sentiment: Positive, Talk Ratio: 62%" -- and call it a day. That data might look useful in a dashboard, but it does not tell you whether the rep actually asked about budget, disclosed the required compliance statement, or booked the next meeting. It scores what the tool thinks matters, not what you know matters.
Automatic custom call scoring flips this. You define the rules. AI applies them to every call. No manual review, no generic templates, no hoping the tool happens to measure the right things.
Why Generic Call Scoring Does Not Work
Every sales team has its own process, its own qualification criteria, its own definition of a "good call." A recruitment agency cares whether the consultant confirmed salary expectations and availability. An insurance team needs to verify that the advisor stated their FCA disclosure. A real estate office wants to know if the agent booked a viewing.
Generic scoring tools ignore all of this. They measure surface-level signals -- sentiment, talk time, filler words -- because those are the only metrics that apply universally. The result is scores that look scientific but tell you nothing actionable.
Here is what typically happens with template-based scoring:
- A rep gets an "Engagement Score" of 8.1 on a call where they forgot to qualify budget. The score looks great, but the deal is dead.
- A compliance-heavy call gets flagged as "negative sentiment" because the rep spent time on required disclosures. The tool penalises good behaviour.
- Two reps with identical generic scores have wildly different close rates because the score is not measuring what actually drives results.
The problem is not that generic metrics are useless. Talk-to-listen ratio and call duration have their place. The problem is that they are incomplete -- and when they are the only thing being measured, your team optimises for the wrong things.
What Automatic Call Scoring Actually Means
Manual call scoring means a manager listens to a recording, fills out a scorecard, and writes up notes. It works, but the maths breaks down fast. A manager with 8 reps averaging 15 calls per day is looking at 120 daily calls. Even spending just 10 minutes per review, that is 20 hours of work to cover every call. In practice, managers review 1-3% of calls and hope the sample is representative.
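If you want to sanity-check that maths yourself, it is three lines of arithmetic:

```python
# The manual-review maths from the paragraph above.
reps = 8
calls_per_rep = 15
minutes_per_review = 10

daily_calls = reps * calls_per_rep                    # 120 calls per day
review_hours = daily_calls * minutes_per_review / 60  # 20.0 hours -- more than two working days

# Reviewing 1-3% of calls means a sample of roughly 1-4 calls per day.
sample = [round(daily_calls * p) for p in (0.01, 0.03)]
print(daily_calls, review_hours, sample)              # 120 20.0 [1, 4]
```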
Automatic scoring eliminates this bottleneck. Here is how it works:
- The call happens. Your VoIP system (Aircall, Ringover, or similar) records it.
- AI transcribes the call. Speech-to-text converts the recording into a searchable transcript.
- Your custom rules run against the transcript. The AI checks each rule and assigns a score.
- Results are available within minutes, not whenever a manager finds time. Every call is scored shortly after it ends (a simplified sketch of the pipeline follows below).
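For the technically curious, here is a minimal sketch of that pipeline in Python. The `transcribe()` and `ask_llm()` helpers are hypothetical stand-ins for a speech-to-text provider and an LLM call -- this illustrates the mechanism, not Coldread's actual implementation:

```python
# Simplified sketch of an automatic call-scoring pipeline.
# transcribe() and ask_llm() are hypothetical stubs -- wire in
# your own speech-to-text and LLM providers.

def transcribe(recording_path: str) -> str:
    """Step 2: convert the recording into a searchable transcript."""
    raise NotImplementedError("call your speech-to-text provider here")

def ask_llm(prompt: str) -> str:
    """Ask an LLM a yes/no question about the transcript."""
    raise NotImplementedError("call your LLM provider here")

# Step 3: your custom rules, written in plain English.
RULES = [
    "Did the rep ask about the prospect's current provider?",
    "Did the rep confirm budget or price range?",
    "Did the rep book a follow-up meeting with a specific date?",
]

def score_call(recording_path: str) -> dict[str, bool]:
    transcript = transcribe(recording_path)
    return {
        rule: ask_llm(
            f"Transcript:\n{transcript}\n\nQuestion: {rule}\nAnswer Yes or No."
        ).strip().lower().startswith("yes")
        for rule in RULES
    }
# Step 4: the returned dict is the score -- ready minutes after the call.
```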
The critical difference from manual scoring is not just speed -- it is coverage. When every call gets scored, you stop relying on random samples and start seeing real patterns. You can monitor sales calls at scale without adding headcount.
Custom Rules vs Fixed Templates
This is where the gap between tools becomes obvious. Most enterprise platforms like Gong use proprietary scoring models trained on aggregate data. They tell you things like "Deal Risk: Medium" based on patterns across thousands of companies. Useful if you are a 500-person sales org that looks like the average. Less useful if you are a 6-person team with a specific process.
| | Fixed Templates (Enterprise) | Custom Rules (Coldread) |
|---|---|---|
| What gets scored | Predetermined metrics (engagement, sentiment, deal risk) | Whatever you define -- discovery questions, compliance checks, process steps |
| Who defines criteria | The vendor's data science team | You, in plain English |
| Industry adaptation | Generic across all industries | Rules tailored to recruitment, insurance, real estate, automotive, or any vertical |
| Setup complexity | Minimal (but you cannot change what it measures) | Write your rules in natural language -- no technical configuration |
| Example output | "Engagement Score: 7.2" | "Budget discussed: Yes. Next meeting booked: No. Compliance disclosure: Yes." |
| Price range | $100-300+ per user/month | Starting at $29/month for the whole team |
The Coldread vs Gong comparison covers this in detail, but the short version: fixed templates work when your sales motion matches the vendor's model. Custom rules work when you need the scoring to match your actual process -- which is most of the time for small, specialised teams.
How to Define Your Scoring Criteria
Building a custom scorecard starts with one question: what separates your best calls from your worst?
Skip the theoretical frameworks for a moment. Pull up the last five deals your team closed and the last five they lost. Listen to the calls (or read the transcripts) and note the differences. You will almost certainly find patterns:
- Winners asked about the prospect's current solution and why they are looking to change. Losers jumped straight into a demo.
- Winners confirmed budget range early. Losers presented pricing cold at the end.
- Winners agreed on a specific next step with a date. Losers ended with "I'll send you some info."
Those patterns are your scoring criteria. For a deeper framework on building scorecards from scratch, see our guide on call scoring best practices.
Turning Patterns Into Rules
Once you have your criteria, express them as specific, observable behaviours -- things the AI can detect in a transcript:
- "Did the rep ask about the prospect's current provider?" (Yes/No)
- "Did the rep confirm budget or price range?" (Yes/No)
- "Did the rep book a follow-up meeting with a specific date?" (Yes/No)
- "Did the rep handle at least one objection by asking a clarifying question?" (Yes/No)
- "Did the rep deliver the compliance disclosure?" (Yes/No)
Notice these are all binary or nearly binary questions. That is intentional. Complex scales ("Rate the quality of objection handling from 1-10") introduce subjectivity -- even for AI. Simple, specific rules produce consistent scores.
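To make that concrete, here is what a scored call might look like as data. The rules mirror the list above; the Yes/No results are invented for illustration:

```python
# Illustrative result for one call: each plain-English rule resolves to Yes/No.
scorecard = {
    "Asked about current provider": True,
    "Confirmed budget or price range": True,
    "Booked follow-up with a specific date": False,
    "Handled an objection with a clarifying question": True,
    "Delivered the compliance disclosure": True,
}

passed = sum(scorecard.values())
print(f"Score: {passed}/{len(scorecard)}")       # Score: 4/5
for rule, ok in scorecard.items():
    print(f"  {'Yes' if ok else 'No'}: {rule}")
```

Because every rule is binary, the call's score is just a count of passes -- no judgement calls about whether objection handling was a 6 or a 7.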
Start Small, Then Expand
Five rules is enough to start. You can always add more once you have baseline data. Teams that begin with 15+ rules tend to get overwhelmed by the output before they learn anything useful from it.
Examples by Industry
The power of custom rules is that the same tool adapts to completely different sales processes. Here is what scoring criteria look like across Coldread's core industries:
Recruitment
- Did the consultant confirm the candidate's salary expectations?
- Did the consultant ask about notice period and availability?
- Did the consultant describe the role accurately, including location and working arrangements?
- Did the consultant confirm next steps and timeline?
- Did the consultant ask about other active applications?
These rules catch the most common failure mode in recruitment calls: consultants who get excited about a match and rush to submit the candidate without qualifying properly.
Insurance
- Did the advisor state their FCA regulatory disclosure?
- Did the advisor confirm consent to record the call?
- Did the advisor assess the customer's existing cover before recommending?
- Did the advisor explain key policy exclusions?
- Did the advisor ask about vulnerable circumstances?
For insurance teams, scoring is as much about compliance as it is about sales quality. Missing a disclosure on one call is a coaching moment. Missing it on 30% of calls is a regulatory problem. For more on this, see our insurance sales tips guide.
Real Estate
- Did the agent ask about the buyer's timeline and mortgage status?
- Did the agent describe the property's key selling points tailored to the buyer's stated needs?
- Did the agent suggest or book a viewing?
- Did the agent ask about competing properties the buyer is considering?
- Did the agent confirm follow-up arrangements?
Real estate teams live and die by viewings booked. A custom score that tracks "viewing booked: yes/no" tells you more about call quality than any sentiment analysis ever could.
Automotive
- Did the rep ask about the customer's current vehicle and what they are looking for?
- Did the rep mention available financing or trade-in options?
- Did the rep ask for an appointment or test drive?
- Did the rep capture the customer's contact details for follow-up?
Financial Services and Debt Collection
- Did the advisor deliver the required regulatory disclosure?
- Did the agent verify the debtor's identity before discussing the account?
- Did the agent explain payment options clearly?
- Did the agent detect and respond appropriately to signs of vulnerability?
Compliance-heavy industries benefit the most from automatic scoring because the alternative -- manually auditing every call -- is prohibitively expensive. If you are evaluating call monitoring software for small teams, compliance automation should be a top criterion. See our call compliance monitoring guide for detailed requirements.
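As a toy illustration of why full coverage matters here, take the "coaching moment vs regulatory problem" distinction from the insurance section and express it as a simple audit over scored calls. The results and the threshold below are hypothetical:

```python
# Hypothetical results for one rule -- "Did the advisor state the FCA
# disclosure?" -- across every recorded call, not a 1-3% sample.
disclosure_delivered = [True, True, False, True, False, True, True, True, False, True]

miss_rate = 1 - sum(disclosure_delivered) / len(disclosure_delivered)
print(f"Disclosure missed on {miss_rate:.0%} of calls")  # 30% of calls

if miss_rate > 0.05:  # illustrative tolerance -- set your own
    print("This is a pattern, not a one-off coaching moment.")
```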
Setting It Up: Plain-English Rules
Most conversation intelligence platforms require either a fixed model you cannot change or a complex configuration process involving spreadsheets, dropdown menus, and vendor support calls.
Coldread takes a different approach. You write scoring rules in plain English:
"Check if the rep asked about the prospect's current provider"
"Check if the advisor stated the FCA compliance disclosure"
"Check if the rep confirmed a specific next meeting date"
That is the entire setup. No coding, no configuration wizard, no waiting for a customer success manager to build it for you. Write the rule, save it, and every future call gets scored against it.
This matters because scoring criteria are not static. Your process evolves. You launch a new product, enter a new market, hire reps who need different coaching focus areas. With fixed templates, you are stuck. With plain-English rules, you update a sentence and the scoring adapts immediately.
For teams using Aircall, these rules work on top of Coldread's native integration -- see the full breakdown of Aircall AI features and how Coldread extends them.
What Automatic Scoring Reveals
Scoring one call gives you a data point. Scoring every call gives you a dataset. That dataset reveals patterns no amount of manual review can surface -- a short aggregation sketch after the lists below shows how:
Rep-Level Patterns
- Discovery gaps. One rep scores 90% on rapport and value proposition but 30% on discovery questions. They are charming but not qualifying. Without scoring data, you might not catch this because their calls "sound good."
- Compliance drift. A rep who consistently delivered disclosures in month one starts skipping them in month three. Automatic scoring catches the trend before it becomes a regulatory issue.
- Closing weakness. A rep covers every step of the process but never books a concrete next meeting. Their pipeline looks healthy but nothing moves forward.
Team-Level Patterns
- Script effectiveness. Your new call script improves discovery scores by 15% across the team but reduces close rates. The script is thorough but too long -- reps are running out of time before they get to the ask.
- Training impact. After a coaching session on objection handling, scores for that criterion improve by 20% for three weeks, then regress. One session is not enough -- you need reinforcement.
- Seasonal shifts. Discovery scores drop every quarter-end as reps rush to hit targets and skip qualification. Now you can quantify the problem and address it.
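Here is the aggregation sketch promised above. The scored-call records and rule names are invented, and pandas is just one convenient way to slice them:

```python
import pandas as pd

# Hypothetical scored calls: one row per call, one column per rule (1 = passed).
calls = pd.DataFrame([
    {"rep": "Alice", "discovery": 1, "disclosure": 1, "booked_next_step": 0},
    {"rep": "Alice", "discovery": 1, "disclosure": 1, "booked_next_step": 0},
    {"rep": "Alice", "discovery": 1, "disclosure": 0, "booked_next_step": 1},
    {"rep": "Bob",   "discovery": 0, "disclosure": 1, "booked_next_step": 1},
    {"rep": "Bob",   "discovery": 0, "disclosure": 1, "booked_next_step": 1},
])

# Pass rate per rep per rule: Alice qualifies but rarely books the next step;
# Bob books next steps but skips discovery. Neither shows up in a single score.
print(calls.groupby("rep").mean())
```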
These patterns are the foundation of data-driven sales coaching. Instead of guessing what your team needs to work on, you know -- because the scores tell you. If you are building a coaching programme from scratch, our coaching software guide covers how to pair scoring data with structured 1:1s.
Connecting Scores to Revenue
The ultimate test of any scoring system is whether higher scores correlate with better outcomes. After 30 days of scoring, compare:
- Average scores for deals that closed vs deals that were lost
- Rep scores vs quota attainment
- Score trends over time vs pipeline velocity
If your scoring criteria are right, higher scores should predict higher close rates. If they do not, your criteria need adjusting -- you are measuring the wrong things. This feedback loop is what makes custom scoring powerful. For a deeper look at connecting call data to revenue outcomes, see our guide on improving close rates with call analytics.
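That comparison is a few lines of analysis once you join average call scores to deal outcomes. A hypothetical sketch with invented numbers:

```python
import pandas as pd

# Hypothetical deals: average rule pass rate across a deal's calls, and outcome.
deals = pd.DataFrame({
    "avg_score": [0.9, 0.85, 0.8, 0.75, 0.6, 0.55, 0.5, 0.4],
    "won":       [1,   1,    1,   1,    0,   0,    0,   0],
})

print(deals.groupby("won")["avg_score"].mean())  # lost vs won average scores
print(deals["avg_score"].corr(deals["won"]))     # should be clearly positive
```

If that correlation is flat or negative, the scores are not predicting outcomes -- adjust the criteria, not the reps.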
Key Takeaways
- Generic scoring templates measure what the vendor thinks matters. Custom rules measure what you know matters for your specific team and industry.
- Automatic scoring means every call gets evaluated -- not the 1-3% a manager can manually review. Coverage goes from sample-based to comprehensive.
- Plain-English rules eliminate configuration complexity. Write what you want to check in natural language. No spreadsheets, no vendor setup calls.
- Start with 5 rules based on what separates your best calls from your worst. Expand once you have baseline data.
- The real value is in patterns, not individual scores. Rep-level trends, team-wide gaps, and score-to-revenue correlations drive coaching and process decisions.
Custom scoring is not a nice-to-have feature. For small, specialised teams where every call matters, it is the difference between improving your sales calls based on data and improving them based on guesswork.
Try Coldread free -- define your custom scoring rules in plain English and score every call automatically. Plans start at $29/month, no credit card required. See what your team's calls actually look like when measured against your standards, not someone else's.
Use the ROI calculator to estimate how much time automatic scoring saves your team each week.
Related Articles
Affordable Call Monitoring Tools for Small Sales Teams
The best call monitoring tools under $50/mo for small sales teams -- what features matter, what to skip, and how to get AI insights without enterprise pricing.
Read article →
AI Call Analysis: What It Extracts and Why It Matters
What AI call analysis extracts from every sales call, how it differs from manual review, and what to look for in a tool built for phone-first sales teams.
Read article →
AI Call Listening: What Happens When AI Listens to Your Sales Calls
What does AI call listening actually do? How it works, what it catches that humans miss, and how to set it up for your sales team without enterprise pricing.
Read article →