How Costless verifies retail price & receipt data
Costless verifies retail data by combining four independent sources: automated parsers across 56 supermarket chain websites, ML-based OCR of 240,000+ real purchase receipts, GPS-verified price labels from field agents, and OCR of promo banners from flyers, web images and official Telegram and Viber channels. Cross-source verification catches scraping errors and missed promotions, and fraud checks flag roughly 8% of submitted receipts before they reach the verified dataset.
Four independent data streams
Costless ingests four streams of independent retail data. Cross-checking between them is what makes our claims verifiable rather than asserted.
From source to structured data
Online parsers
Each chain parser knows that chain's page structure or API. Parsers run on a fixed twice-weekly cycle, staggered to avoid burst load. They respect site protections and prefer chain-published APIs over scraping. Some chains we don't crawl at all because their terms prohibit it; for those, data comes from receipts, field collection and banners only.
Receipt OCR pipeline
An uploaded receipt is pre-processed with OpenCV (skew, contrast, glare, thermal-paper fade), then text is extracted by Google Cloud Vertex AI vision services as the primary engine, with Tesseract as a fast fallback and an in-house CNN trained on Eastern-European receipt formats for the hard cases. Line items are parsed, products matched against our normalized database via vector embeddings, and totals reconciled against the printed total.
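A minimal sketch of the final reconciliation step, assuming line items have already been parsed into (unit price, quantity) pairs. The function name, the input shape and the one-cent tolerance are illustrative assumptions, not the production pipeline.

```python
from decimal import Decimal


def reconcile_total(line_items, printed_total, tolerance=Decimal("0.01")):
    """Compare the sum of parsed line items with the printed receipt total.

    line_items: iterable of (unit_price, quantity) pairs given as strings,
    so that Decimal parsing avoids binary floating-point rounding.
    Returns (ok, computed_total, difference).
    """
    computed = sum(
        (Decimal(price) * Decimal(qty) for price, qty in line_items),
        Decimal("0"),
    )
    difference = computed - Decimal(printed_total)
    return abs(difference) <= tolerance, computed, difference


# Hypothetical receipt: two items, printed total matches the parsed lines.
items = [("24.90", "2"), ("13.50", "1")]
ok, computed, diff = reconcile_total(items, "63.30")
```

A mismatch larger than the tolerance usually means a line was missed or misread, so a receipt like this would go back for re-parsing rather than straight into the verified dataset.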
End-to-end accuracy across accepted receipt formats is consistently high. Per-stage benchmarks vary by retailer template, paper type and image quality, so we don't publish a single number that would misrepresent the real distribution.
Fraud detection
Roughly 8% of submitted receipts are flagged as fraudulent before reaching the verified dataset. Signals include duplicate detection, image forensics (photos of screens or printouts, edited images), merchant-consistency checks, structural anomalies, and velocity heuristics. Flagged submissions are quarantined and the user is asked to re-photograph clearly.
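Two of these signals, duplicate detection and velocity heuristics, can be sketched as follows. This is a simplified illustration: the class name, the exact-byte SHA-256 comparison and the per-hour limit are assumptions, and a real system would likely also compare perceptual hashes or extracted receipt fields to catch re-photographed duplicates.

```python
import hashlib
from collections import defaultdict, deque


class ReceiptScreen:
    """Toy screen that flags exact duplicate images and upload bursts."""

    def __init__(self, max_per_hour=5):
        self.seen_hashes = set()              # digests of every image seen so far
        self.uploads = defaultdict(deque)     # user_id -> recent upload timestamps
        self.max_per_hour = max_per_hour      # assumed threshold, not a real setting

    def check(self, user_id, image_bytes, uploaded_at):
        """Return a list of flags; uploaded_at is a Unix timestamp in seconds."""
        flags = []

        # Duplicate detection: identical bytes produce an identical digest.
        digest = hashlib.sha256(image_bytes).hexdigest()
        if digest in self.seen_hashes:
            flags.append("duplicate")
        self.seen_hashes.add(digest)

        # Velocity heuristic: count this user's uploads in the last hour.
        window = self.uploads[user_id]
        while window and uploaded_at - window[0] >= 3600:
            window.popleft()
        window.append(uploaded_at)
        if len(window) > self.max_per_hour:
            flags.append("velocity")

        return flags
```

A submission that returns any flag would be quarantined instead of entering the verified dataset.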
Verified field capture
The field-agent app enforces an on-site flow: it reads GPS, reverse-geocodes to an address, and blocks capture unless the agent is at the target store. After an explicit check-in, photos are taken in-app only (never the camera roll), processed on-device by our AI models, and uploaded instantly — so every price ties to a specific store, agent and moment.
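The on-site gate can be illustrated with a plain distance check between the agent's GPS fix and the store's coordinates. This is a sketch under assumptions: the 75-metre radius and the function names are hypothetical, and the real app also reverse-geocodes to an address, which is not shown here.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def may_capture(agent_pos, store_pos, checked_in, radius_m=75.0):
    """Allow capture only after check-in and within radius_m of the store.

    radius_m is an assumed value chosen for illustration.
    """
    return checked_in and haversine_m(*agent_pos, *store_pos) <= radius_m


store = (50.4501, 30.5234)       # hypothetical store location
nearby = (50.4502, 30.5234)      # about 11 m north of the store
far_away = (50.4600, 30.5234)    # about 1.1 km north of the store
```

With these points, `may_capture(nearby, store, checked_in=True)` allows capture, while the far-away position or a missing check-in blocks it.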
Cross-source verification
Any single source can be wrong: a parser can miss a redesigned deal page, a receipt can be photographed badly, a field label can be an outdated shelf tag. So we expect a price to appear in at least two streams before treating it as confirmed. When streams disagree, these rules decide which one wins:
- Receipt vs online — the receipt wins, because it is the price actually charged.
- Deal page vs product page — the deal page wins, because loyalty pricing often isn't shown on product pages.
- Field label vs online — field wins for the day it was captured; online wins longer-term because it refreshes more often.
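The three precedence rules above can be written as a small resolver. This is a sketch only: the `Observation` type, the source labels and the choice to return `None` for pairs the rules do not cover (for example receipt vs field label) are assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date

ONLINE = {"deal_page", "product_page"}  # both count as "online" sources


@dataclass(frozen=True)
class Observation:
    source: str    # "receipt", "deal_page", "product_page" or "field"
    price: str     # kept as text; parse with Decimal before doing arithmetic
    seen_on: date


def resolve(a, b, today):
    """Return the observation that wins a conflict, or None if no rule applies."""
    pair = {a.source, b.source}
    by_source = {a.source: a, b.source: b}

    # Receipt vs online: the receipt is the price actually charged.
    if "receipt" in pair and pair & ONLINE:
        return by_source["receipt"]

    # Deal page vs product page: loyalty pricing may be missing on product pages.
    if pair == ONLINE:
        return by_source["deal_page"]

    # Field label vs online: field wins only on the day it was captured.
    if "field" in pair and pair & ONLINE:
        field = by_source["field"]
        online = next(o for o in (a, b) if o.source in ONLINE)
        return field if field.seen_on == today else online

    return None  # pair not covered by the stated rules
```

Returning `None` for uncovered pairs keeps the sketch from inventing a rule the text does not state; such conflicts would need a separate policy.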
For interoperability we follow industry standards: EAN-13 / EAN-8 / GTIN-14 (GS1) for barcodes, MCC for merchant classification, ISO 4217 for currencies, ISO 3166-1 for countries, and ISO 8601 for timestamps.
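As an example of what following the GS1 barcode standard involves, the check digit shared by EAN-8, EAN-13 and GTIN-14 can be validated as below. The function name is illustrative; the weighting scheme (3, 1, 3, ... counted from the digit next to the check digit) is the standard GS1 calculation.

```python
def gtin_is_valid(code):
    """Validate the GS1 check digit of an EAN-8, EAN-13 or GTIN-14 string."""
    if not (code.isascii() and code.isdigit()) or len(code) not in (8, 13, 14):
        return False
    digits = [int(c) for c in code]
    body, check = digits[:-1], digits[-1]
    # Working right to left over the body, weights alternate 3, 1, 3, ...
    total = sum(d * (3 if i % 2 == 0 else 1) for i, d in enumerate(reversed(body)))
    return (10 - total % 10) % 10 == check
```

For instance, `gtin_is_valid("4006381333931")` accepts a well-formed EAN-13, while changing its last digit makes the check fail. This catches most single-digit OCR misreads before a barcode is matched against the product database.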
Refresh cadence by data type
| Data type | Refresh cadence |
|---|---|
| Purchase receipts | ~22 seconds on average from upload (end-to-end, including OCR) |
| Online prices & deal pages | Twice weekly per chain, on a fixed schedule |
| Promo banners | Per chain publication cycle (weekly for most) |
| Field-collected labels | Per agent session, on demand |
| Currency FX rates | Daily, from apilayer.com |
| Normalized product database | Continuous, as receipts and parser cycles land |
When a parser breaks — a site redesign, a new block, an API change — affected deals show a visible "last refreshed" date rather than silently serving stale data.
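One way to produce such a label is to compare the time of the last successful refresh with the expected cadence. This is a sketch under assumptions: the four-day threshold (slightly longer than a twice-weekly interval), the function name and the label wording are all hypothetical.

```python
from datetime import datetime, timedelta, timezone


def freshness_label(last_refreshed, now, max_age=timedelta(days=4)):
    """Build the label shown next to a deal; mark it when the refresh is overdue.

    max_age is an assumed threshold for a twice-weekly parser cycle.
    """
    stamp = last_refreshed.date().isoformat()  # ISO 8601 calendar date
    if now - last_refreshed > max_age:
        return f"Last refreshed {stamp} (refresh overdue)"
    return f"Last refreshed {stamp}"


last_ok = datetime(2024, 5, 1, 6, 0, tzinfo=timezone.utc)
on_schedule = freshness_label(last_ok, datetime(2024, 5, 3, 6, 0, tzinfo=timezone.utc))
overdue = freshness_label(last_ok, datetime(2024, 5, 10, 6, 0, tzinfo=timezone.utc))
```

Using timezone-aware timestamps on both sides avoids errors when servers and parsers run in different time zones.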
Where we have data — and where we don't
Being honest about limits is part of being a trustworthy data source. Here is what we cover today and what we don't yet.
Consumer markets
Ukraine, Canada, Lithuania, Poland and the United Kingdom — daily deal coverage of major grocery chains, browsable in the deal explorer.
B2B API markets
Our Receipt OCR & verification API serves business partners in the same markets. Fiscal receipt verification against the state registry is currently live in Ukraine.
We don't publish a per-country list of monitored chains: coverage shifts as retailers redesign sites or change terms, and arrangements vary by chain. Shoppers see live coverage in the deal explorer; B2B customers get the full list under NDA.
What we don't yet cover
- Pure online-only retailers — partial; our focus is physical-store retail.
- Restaurant menus and HoReCa pricing — out of scope.
- Bulk and wholesale pricing — out of scope.
- Sparse-data categories — alcohol and pharmacy have regulated visibility per country, and some long-tail categories lack consistent coverage.
How we handle your data
Your receipts stay yours
A receipt can carry personal details — a loyalty number, the last digits of a card, a fiscal number. We don't mask them, because an uploaded receipt is visible only to the person who uploaded it. It is never shown to other users, never embedded in a public listing, and never shared with B2B customers. Only the structured products and prices, in anonymized aggregate, power the price-verification side of the platform.
Data retention
Receipt images and extracted line items are kept for as long as you keep your account. You can delete your account anytime from your profile, which removes your receipt images, your extracted items and your personal information. The full policy lives in our Privacy Policy.
We never sell your data
Costless does not sell user data to any third party, in any jurisdiction, ever. This is a categorical policy. Individual receipts are never shared externally and individual identities are never associated with any public-facing data product.
Children's privacy
Costless is not directed to children under 16. We do not knowingly collect personal information from children under 16; if we learn that we have, we delete it.
Compliance
We operate under GDPR (EU/EEA), UK GDPR and the Data Protection Act 2018, and equivalent national laws including Canada's PIPEDA. Because we don't sell personal data, the CCPA "Do Not Sell" right has nothing to opt out of in our processing. Data Controller: [email protected].
Security review
We run an automated AI-based security review every month — covering authentication, input validation, rate limits, dependency scanning and header hardening. The platform has not been through a formal third-party audit (SOC 2, ISO 27001) at this time.
FAQ
Why is receipt-verified data better than scraped data?
How does Costless detect fraudulent receipts?
How fresh is Costless's price data?
What countries does Costless cover?
How accurate is the receipt OCR?
Does Costless sell my data?
Built on data you can trust
See how we turn verified receipts and live price data into retail price intelligence for brands and retailers.
Explore Insights
Transparent retail-data methodology
Costless is a retail intelligence platform that verifies supermarket prices, deals and receipts across Ukraine, Canada, Lithuania, Poland and the United Kingdom, with a Receipt OCR & verification API for business partners and fiscal receipt verification currently live in Ukraine. Rather than relying on a single scraped feed, Costless combines automated parsers across 56 chains, ML-based OCR of 240,000+ real receipts, GPS-verified field collection and promo-banner OCR. Cross-source verification reconciles these streams, catching scraping errors, missed promotions and fraudulent submissions. Receipt text extraction is powered by Google Cloud Vertex AI vision services alongside an in-house pipeline for layout detection, line-item parsing, product matching and total reconciliation. Costless does not sell user data, retains receipts only while an account is active, and operates under GDPR, UK GDPR, CCPA and national data-protection laws. This methodology page is updated as our coverage and processing evolve.
To independently verify any claim on this page, contact [email protected] or [email protected]. We provide redacted samples and methodology walkthroughs to researchers, journalists and prospective partners under NDA where appropriate.