Can Alternative Data Be Used for Creditworthiness Assessment?

May 17, 2021

About 25% of US consumers are considered thin-file because they have fewer than five items in their traditional credit histories. About 7% are no-hits (they have records other than the five items of the traditional framework), and 9% are completely invisible: they have no credit record. Technology and internet companies, looking for a resistance-free entry into the financial services industry to cash in on the opportunity, are particularly interested in those edge cases. With the abundance of digitized behavioral data available to those companies today (social media, search, mobile money, bills, shopping history, rent, and so on), anyone can lend. Facebook alone targets ads based on 98 data points on every user. The number of indicators that FinTech startups in the alternative credit scoring space use varies from several hundred to hundreds of thousands.

However, not all alternative data are born equal.

Alternative data sources vary significantly in their ability to accurately assess creditworthiness, that is, to predict the likelihood of someone defaulting. Moreover, their relevance depends on the goal: one would look at very different pieces of data, with varying levels of trust, when assessing someone previously unknown to the formal financial system than when marginally improving the accuracy of a credit score for someone with limited data.

An honest discussion on the use of alternative data in creditworthiness assessment led us to three important considerations.

What’s the definition?

Alternative data sources constitute a fairly long list that can be grouped roughly into two types — soft data and hard data.

Soft data sources include the hallmarks of social behavior for individuals. For businesses, the closest resemblance would be corporate culture, how a business owner cares for the property, inventory management, etc.
Hard data sources always come down to finances and how one (whether a business or an individual) behaves with money.

For financial institutions deciding whether to trust someone with money, hard data carries far more weight than soft data such as social media data. Experian, for one, describes alternative data in the following terms:

Rental payments
Mobile phone payments
Cable TV payments
Bank account information, such as deposits, withdrawals, or transfers
Small-dollar loans

Every one of those alternative sources has to do with money management. Sharing the results of its first-ever report on lender and borrower perceptions about using alternative data for credit decisions, Experian revealed that 80% of lenders rely on a credit report plus additional information when making a credit decision. But here is what’s more interesting — more than 50% of consumers believe that including items like their utility or mobile phone payment history would have a positive effect on their credit score.

Basically, more than 50% of consumers would want to expand the hard data points that lenders are considering. This is surprisingly aligned with how an institution thinks about alternative data — in terms of ongoing, consistent financial behavior and responsibility. The agency found that if given a choice, many consumers would prefer that alternative credit data sources, such as utility bill payment history (48%), savings/checking account transactions (39%), and mobile phone payment history (38%), be evaluated in their credit history. Every one of the most preferred sources is financial data.

Some professionals single out the following characteristics of a good alternative data source:

Coverage: A new data source will ideally have broad and consistent coverage (e.g., over 90% of US adults use a cell phone, and the market is concentrated so data collection would be easy to achieve; ~40% of US adults pay rent, but this is a low concentration market and so the data are expensive to collect).
Specificity: A data source should ideally contain detailed data elements about an individual — data elements that provide part of a full picture of the borrower (e.g., on-time and late payments over a significant time series, or specific asset or income data); some data sources are based on ‘segment data’ or ‘modeled data’ and are typically less predictive than consumer-specific sources.
Accuracy and timeliness: Data should be accurate and frequently updated; a data source should have a system for ongoing data verification and management.
Predictive power (‘signal’): Most important, data should contain information relevant to the behavior that you’re trying to predict.
Orthogonality: Ideally, the data source should be additive to traditional bureau data; using it will improve the predictive accuracy of any new score by improving the signal-to-noise ratio.
Regulatory compliance: Data sources must comply with existing regulations for consumer credit (e.g., the Fair Credit Reporting Act, the Equal Credit Opportunity Act, and the Gramm-Leach-Bliley Act).

Those characteristics are more likely to be found in hard data.
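
Two of the characteristics above, predictive power and orthogonality, lend themselves to a quick numerical check. The snippet below is only a rough sketch under stated assumptions: the helper names (`pearson`, `screen_source`), the "on-time utility share" feature, and all the values are made up for illustration, and a simple correlation is a stand-in for the far more careful validation a real lender would run.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def screen_source(candidate, bureau_score, defaulted):
    """Return (signal, overlap) for a candidate data source.

    signal  -- |correlation| between the candidate and observed defaults
    overlap -- |correlation| between the candidate and the bureau score;
               a low value hints that the source adds new information
    """
    signal = abs(pearson(candidate, defaulted))
    overlap = abs(pearson(candidate, bureau_score))
    return signal, overlap


# Made-up toy values for six hypothetical borrowers (not real data).
bureau_score = [620, 700, 580, 740, 660, 690]
on_time_utility_share = [0.95, 0.60, 0.90, 0.70, 0.50, 1.00]
defaulted = [0, 1, 0, 0, 1, 0]  # 1 = borrower defaulted

signal, overlap = screen_source(on_time_utility_share, bureau_score, defaulted)
print(f"signal={signal:.2f} overlap={overlap:.2f}")
```

A source worth a closer look would show a relatively high signal together with a relatively low overlap. With six rows this proves nothing; the point is only to show the shape of the check.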

But, what of soft data sources? This leads us to the second important consideration.

What’s the target market and goal?

There is ONLY ONE RIGHT ANSWER to the question of whether alternative data can be used for creditworthiness assessment or not — IT DEPENDS.

Let’s take social media.

Experian asks: Can banks, credit unions, and online lenders look at social media profiles when making a loan decision and garner intel to help them make a credit decision?

Experian answers: In the case of business credit, YES. On the consumer side, NO.

To address the consumer “no” first, there are straightforward reasons that close down any discussion about using the famed social media data in alternative credit scoring frameworks for consumer lending:

The Equal Credit Opportunity Act states that credit must be extended to all creditworthy applicants regardless of race, religion, gender, marital status, age, and other personal characteristics. Social media profiles can reveal every one of those characteristics, making this data unusable.
Social media data can be manipulated.
FCRA requires credit data to be displayable and disputable. Social media can’t address those needs.

The situation is much different for business lending because the state of a business’s social media channels indicates customer-engagement performance, which can ultimately point to (potential) financial performance. Consumers increasingly engage with businesses through social media messaging, even more so since chatbots became ubiquitous. The regulatory standards are also different for businesses: the FCRA does not apply to business lending.

Alternative data could be useful in providing a clearer picture of a small business’s relationship with its current and potential customers. Rankings on major platforms such as TripAdvisor and Yelp drive sales, making social media performance an important and relevant indicator of where the business stands with its customers.

There is another side to the conversation about the goal. The use of alternative data in creditworthiness assessment has varying relevance in edge cases: invisible individuals and businesses, and those with limited records.

The greatest opportunity is in creating identities for those who are deemed invisible. For those with a full record, alternative data can do no more than sharpen the image in a non-deterministic way; that category can already be described accurately enough for institutions using current frameworks.

The catch-22 of credit is that to borrow, you need a score, but to generate a score, you need to have borrowed before. Alternative (financial) data could be seen as a way out of this conundrum for edge cases. For those with full or limited records, financial institutions see traditional scoring frameworks as powerful enough.

Source: Point of View: Alternative Data and the Unbanked

What is next?

Any breakthroughs in the use of alternative data, whether to improve the accuracy of existing frameworks or to redefine the framework itself, are heavily skewed towards enhancing only one function: the loan origination process. Financial institutions, technology and internet companies, and startups are focused on ensuring the fastest, cheapest, most efficient, and most accurate decision-making process and delivery of funds to the borrower.

However, loan origination is only one piece of the puzzle; the other two, more important, pieces are repayment and repeat (up)sell.

Traditionally, industry-defining players like Experian, TransUnion, and the now much-conflicted Equifax calculate credit scores based on a person’s historical financial and repayment data. And there is a reason for it: with the abundance of available options, repayment (or the default rate) is the key measure of a successful framework. More importantly, such a framework does no long-term harm to the borrower’s ability to repay. Past financial behavior has always been, and will remain, a very accurate basis for creditworthiness assessment.

Any model that leverages alternative data to evaluate the likelihood of default becomes useless when the evaluation turns out to be incorrect. Once the borrower defaults, all that will matter in the future is the fact of default, not the shining score built from 40,000 pieces of data.

In conversations about the use of alternative data sources, the focus has to shift towards responsible lending and sustainable recovery of funds. Consumers in the US, for example, are heavy users of credit. Consumer debt, including personal loans, real-estate-secured loans, auto loans, credit cards, and student loans, totals over $12 trillion. That is not the number lending startups should be using as a highlight to justify their existence. That number represents an opportunity for companies operating in the lending space to focus on practices ensuring sustainable and responsible recovery of funds.

For the ~4.5 billion people globally with no credit or repayment data available, a majority of them from low- and middle-income emerging countries, alternative frameworks may become a way into the formal financial system. But those frameworks will need to continuously evolve to take into account the most important stage: repayment history.
