SecurityTrails Blog · Jan 15 · SecurityTrails team

SecurityTrails Engineering: Major Updates to our Domain Discovery Pipeline

Reading time: 3 minutes

We’ve made major engineering advances in how we locate domains that are normally hard to obtain. We’ve built an autonomous system that lets us process tens of terabytes of data daily, find domains and subdomains from numerous sources, verify them, and add them all to our database.

Not all domains are discovered equally

While generic top-level domains (gTLDs) such as .com and .net are the most widespread, they are far from the only option out there. The use of country code top-level domains (ccTLDs) is on the rise for both legitimate and illegitimate purposes, with new registrations added every day; around 40% of all registered domains are ccTLDs.

Registries are obligated to publish gTLD zone files with daily updates, but ccTLD registries have no similar obligation to release theirs. That’s why we needed to find a way around this gap, so we could provide our customers with complete, reliable lists of all domains, including those under ccTLDs.

Because obtaining full ccTLD coverage is complex, we built a proprietary pipeline that analyzes around 40 sources, ingesting new data as soon as those sources are updated.

How we increased our coverage and discovered new domains

For SecurityTrails to obtain complete data, we have to locate domain names as quickly as possible. This has been our core competency since we started the company—and that’s why we’re happy to announce these enhancements to our pipeline and algorithms for finding domain names faster than ever.

Using our pipeline and a vast number of sources, we gather, verify and track data, keeping it up to date. The number of domains, subdomains and hosts we acquire each day can be viewed on our Feeds page.

We’re always on the lookout for new ccTLD and other data sources to improve both the quality and quantity of the data we acquire.

Verification of domains

Once domains are discovered, they go through a validation process to confirm that they have nameservers and that they respond to queries. After we determine they are real, we begin looking them up regularly as part of our normal process.
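To illustrate the idea, here’s a minimal sketch of that kind of check in Python using the dnspython library. This is an illustrative assumption, not our actual pipeline code: a candidate domain is kept if it is delegated (has NS records) and answers an address query.

```python
# Hypothetical validation step (illustration only, not SecurityTrails'
# production code): a domain is treated as "real" if it is delegated and
# answers an address query. Requires the third-party dnspython package.
import dns.exception
import dns.resolver

def is_live_domain(domain: str, timeout: float = 3.0) -> bool:
    resolver = dns.resolver.Resolver()
    resolver.lifetime = timeout
    try:
        # Step 1: the domain must be delegated, i.e. have nameservers.
        resolver.resolve(domain, "NS")
        # Step 2: it must answer an ordinary address query.
        # (Simplification: some valid domains have no apex A record.)
        resolver.resolve(domain, "A")
        return True
    except dns.exception.DNSException:
        return False

print(is_live_domain("example.com"))  # True for a delegated, resolvable domain
```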

Our pipeline can perform millions of asynchronous resolver requests in a short time span, so new domains appear in near real time after we see their first occurrence.
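Below is a hedged sketch of how bulk asynchronous resolution can look, using asyncio together with dnspython’s async resolver; again, this is an illustrative assumption rather than our production setup.

```python
# Illustrative sketch: resolve a large batch of candidate domains
# concurrently with asyncio and dnspython's async resolver.
import asyncio
import dns.asyncresolver
import dns.exception

async def resolve_one(resolver: dns.asyncresolver.Resolver, domain: str) -> tuple[str, bool]:
    try:
        await resolver.resolve(domain, "A")
        return domain, True
    except dns.exception.DNSException:
        return domain, False

async def resolve_batch(domains: list[str], concurrency: int = 500) -> dict[str, bool]:
    resolver = dns.asyncresolver.Resolver()
    resolver.lifetime = 3.0
    sem = asyncio.Semaphore(concurrency)  # cap the number of in-flight queries

    async def guarded(domain: str) -> tuple[str, bool]:
        async with sem:
            return await resolve_one(resolver, domain)

    results = await asyncio.gather(*(guarded(d) for d in domains))
    return dict(results)

# Example: verify a small batch of candidate domains concurrently.
print(asyncio.run(resolve_batch(["example.com", "example.org", "no-such-domain.invalid"])))
```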

Summary

Here at SecurityTrails, we’re constantly updating our data and improving the ways we collect and process it. Our data can be queried through our Domain API, the SecurityTrails Feeds, and our simple-to-use web app, SurfaceBrowser.
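As an example of programmatic access, the snippet below sketches a Domain API lookup with the Python requests library. The endpoint path and APIKEY header shown are assumptions based on the public v1 API; check the current API documentation before relying on them.

```python
# Minimal sketch of querying the Domain API with the requests library.
# Assumes the v1 endpoint layout and APIKEY header; consult the current
# SecurityTrails API documentation for the authoritative reference.
import requests

API_KEY = "your_api_key_here"  # placeholder: your SecurityTrails API key

def get_domain_details(domain: str) -> dict:
    url = f"https://api.securitytrails.com/v1/domain/{domain}"
    response = requests.get(url, headers={"APIKEY": API_KEY}, timeout=10)
    response.raise_for_status()
    return response.json()

details = get_domain_details("example.com")
print(sorted(details))  # inspect the top-level fields returned for the domain
```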


To follow along and stay up to date with news from the infosec world and all of our newest features, sign up at SecurityTrails and check out our blog!

For more information, or if you’re interested in custom data enrichment, you can always contact us.