SecurityTrails Blog · May 02 · SecurityTrails team

It’s never been easier to make a great product: A chat with Johannes Gilger from Urlscan.io

Photos: Christina Sobiraj 

Since the dawn of the Internet, people dreamed of starting their own online business, though few truly succeeded in that task. With all of the technology and services available today, becoming an entrepreneur seems easier than ever before.

Even if it is easier, technology-wise, to set up a business, it still comes with its own set of obstacles. Starting your own business raises questions about affording the instability of entrepreneurship, keeping a healthy work-life balance, and knowing whether it’s the right moment to start something on your own.

We sat down with Johannes Gilger, creator of urlscan.io, to discuss the answers to those questions, as well as his engineering philosophy, how many servers urlscan.io runs on, and the secret behind his obsessive list-making.

We’ve written before about urlscan.io, a website scanner that analyzes websites and shows all the resources requested when a website is loaded. As big fans for quite some time, we were excited to get the chance to talk to the guy behind it all.

SecurityTrails: You’re from Aachen, Germany. What would you say are the main differences between the cybersecurity industry in Germany and the US?

Johannes Gilger: I would say that in Germany, cybersecurity hasn’t really landed yet. Not enough folks understand the importance of cybersecurity, especially at old and established companies. They haven’t really figured out what the necessary changes look like, not just in cybersecurity but in IT in general, or how to adjust so they can address them properly. The biggest difference is the awareness of the importance of cybersecurity. The US is years ahead compared to Germany. Germany is now catching up, but it’s still somewhat slow.


Tell us about where you studied and how you got into IT Security.

Johannes: Well, I studied here in Aachen at RWTH Aachen University, which is one of the best institutions to study Computer Science in Germany. While I initially struggled with the courses and the math-heavy curriculum, I eventually found my stride, thanks to some good friends and a better approach to what constitutes effective learning. My time at the university taught me a lot about my own motivation, my strengths and weaknesses, and most importantly that it’s always possible to understand complex topics if you set your mind to it.

After I graduated in 2011, I started working as a PhD candidate at the chair for IT Security, which was my first real contact with the security industry. I met a lot of very talented and hands-on people by participating in so-called Security CTF (Capture-the-Flag) events, some of whom would later be my colleagues. After two years, I decided doing a PhD wasn’t for me, and I switched to working in the industry.


You work for CrowdStrike, a US-based company. How has working remotely been for you?

Johannes: First and foremost, working remotely gives you a good work-life balance. Especially if you have kids like me, your quality of life is so much better. You can take half an hour to play with your kids; you can pick them up from kindergarten and continue working after that. Remote work has been really great for me, and working remotely for a US company has been even better because you get to learn about different cultures and people from different backgrounds. People in the US have a completely different approach to life than people in Europe, especially compared to Germany. Americans are not as traditional and set in their path, and they are quicker to adapt to changing circumstances.

In Germany, for example, when you study for something or pick a certain trade, it’s still unusual to deviate from that in life. If you study computer science, then everybody expects that after you finish school, and maybe get a PhD, you’ll work for one of the big companies in Germany, be it a car manufacturer or a software-based company, and that you’ll do that for a very long time or even your whole life.

I work at CrowdStrike Intelligence with a team of very dedicated and talented people. CrowdStrike has been a great experience because it has a blend of all kinds of different folks working there remotely. They all bring their own experiences and hail from different parts of Europe and North America. The company is tackling some interesting problems in the security space, and is expanding at a healthy pace. The diversity of people who work there has been an amazing experience that I don’t think I could ever get from working at a regular German company.


As someone who has a regular 9–5 job and a side project, urlscan.io, how do you manage to balance both with your family and private life?

Johannes: The reason I’m able to balance it is that I work on my side projects at night, every now and again on the weekends, or whenever other people might sit on the couch and watch Netflix. I used to watch a lot of movies and TV shows when I was studying, and at some point I realized that I was wasting so many hours doing something that wasn’t giving me any happiness anymore, so what else was I supposed to do with my free time? In the summer, yes, you can go outside, work in your garden, or go for a walk. But in the winter, there are a lot of evenings where you basically just sit on the couch. Whenever I feel like that, I work on my side projects for one or two hours at night. It’s important that you do it consistently.

The biggest takeaway for me is, if you put in one hour a day to do something, it compounds into this big thing, where you look back, and maybe wonder how you’ve built this with just one hour daily. But all of that really adds up over time.


How did you know it was the right moment for you to start working on your own project and start urlscan.io?

Johannes: For me, the biggest motivation to start working on this was that it had always been my dream to build something that a lot of people use on a daily basis. I wanted to build something that I could send to someone within an hour and say, “Hey, take a look at this new feature and tell me what you think about it.” To have the direct feedback-loop of trying something and getting someone’s opinion on it.

With URLScan, it was me scratching my own itch, thinking, I want to build something that a lot of people on the Internet might benefit from, and also something I would be interested in having myself. With how complex websites are, looking at them and determining which services and technologies they are using can be really tedious. Yes, you can do all of it manually, but it takes a lot of time. If you can automate finding out all you can know about a website and how it operates, that already saves you at least a couple of minutes each time. So that was my first itch that I wanted to scratch. I would definitely say that it surprised me how quickly it has picked up, in terms of popularity and volume.

I started URLScan two-and-a-half years ago, and I built the majority of the website that you see today in one month. Obviously, I’ve put in a lot of cumulative time over the past two and a half years, but it’s been incremental, rather than starting everything from scratch. There are things you have to build over time, but they’re not as spectacular. Building monitoring and all of these management features is absolutely necessary, but it’s not something that people immediately can tell you’ve been working on.


How did you decide that you wanted urlscan.io to be an easy-to-use product that could help and bring value to both amateurs and professionals?

Johannes: One of the challenges I like tackling, and that’s something I do at my day job as well, is the constant theme of “digest this large amount of information so that it makes sense for both amateurs and professionals”. The data should be condensed, and the most important information shown very prominently. There are a lot of web services out there that do a great job, and you don’t really appreciate it until you try to do it yourself. Particularly with URLScan, I realized it’s really tough picking the most important pieces of information to display.


Do you think that it can be hard for inexperienced people to get into the industry, as most tools out there are very advanced?

Sites deemed dangerous by safe browsing

Johannes: I think that’s one of the reasons why big companies will pay huge sums for solutions while not getting enough benefit from them: the products look really complicated and are complex to use. It’s hard to get information out of them and understand what’s really happening. The data is already there, and in some cases it’s mostly a matter of how to search or display it a little bit better. That’s why a lot of security solutions are falling short. They focus so much on the data and forget about the human who should use it. Even professionals have a hard time using these systems. They have to use 10 different systems, and often don’t have time to train on each one, so in a lot of cases they will use a system at only 10% of its capabilities. It can make it harder for people entering the industry to figure out what they should be focusing on.


Urlscan.io has been a favorite website scanning service for many. What are the secrets to your growth from a product perspective?

Johannes: I’m the first to admit that URLScan is not a new idea. There have been services like this around for a long time. There is actually a service, urlquery, that was basically my inspiration. From a functional perspective, it was an okay service, but I thought I could definitely do a better job with the way the data was presented.

In a lot of cases, people focus on creating something new, but in most cases you should create something that already exists, but do it in a better way. Once you take that approach, you will need to listen to your users, iterate quickly, be very self-aware of things you may have done in the past that weren’t the best ideas, and be willing to make necessary changes. Sometimes, you are sure you are building a perfect product or feature, thinking everyone will use it. Then you observe that almost nobody is interested in that product or feature, so you have to swallow your pride and start fresh. In general you have to iterate a lot, you have to fail, and sometimes be ready to throw away the time and love you put into features you need to kill off.


It’s never been easier for someone to start their own project or a company. What does it take to actually make that step, in terms of the right business attitude?

Johannes: The most important thing to remember is, if you want to start something that is not just a fun hobby project, you need to be willing to stick with it for a long time, because success, visibility, and adoption don’t happen overnight. As much as we, as engineers, think that we will just build something superior, tweet about it, and people will magically start using it, that’s not really the case. People are slow to adopt new things. Some people learn about services like URLScan only two years after their release, so it’s not something that happens instantly.

Be prepared to do the boring stuff — make sure the service is running, that it’s monitored, that the code is stable, and to address user management, even if it’s the most boring feature. Don’t be distracted by other people doing something similar, and don’t be discouraged by existing solutions.

All of that plays into the fact that you just have to do your own thing. Don’t be distracted, but also don’t drown in your own plans. Always make sure you don’t bore yourself with it, and as long as you keep doing it and put the hours in, it will get you somewhere.


And what would be some things people need to deliver if they want to start something in terms of technology?

URLs submitted per day

Johannes: For a software engineer, the most dangerous way of thinking is believing that whatever you build has to be complex and advanced, or something that nobody else is doing. Instead, you should build something that will bring value to people and solve the problems they have! If the only solution is to build something crazy advanced with sophisticated machine learning and an amazing database, then fine, do it. But if 95% of the problem you’re trying to solve is just displaying the important information, then you should swallow your pride and do that.

In terms of technology, you focus on the problem of the user, and that informs your technology choice. If it’s just a search, you can put up a website that will search the information and show it in a clear way. If it’s something more complex, like where a user has a lot of things to keep track of, then it will probably be a service that alerts the user via email or dashboard, and automates the steps.


What are your engineering philosophies?

Johannes: The most important thing is, you have to be self-aware, self-conscious, constantly looking at yourself to re-evaluate what you are doing. You try something, observe the output, and then re-evaluate if that was the best approach. It’s the same approach for very technical issues and for personal matters. You might build something and it’s great technology, but then you realize it’s really hard to manage, so the management doesn’t justify using that technology. The same applies to your personal approach to things. You might realize that, in the past year, you did so much planning and it didn’t really get you anywhere, so maybe try to do less planning before building anything. Once you adopt this constant habit of re-evaluation in your life, everything falls into its place.

The other big theme you should be looking for is “compounding”. I try to examine everything I do by asking myself whether it will have a compounding effect and offer a good return on investment. This especially applies to time management, meaning you should only do something if you think it will make you faster and more efficient in the future. Examples of this are “should I learn this new thing right now or stick with what I know?” and “should I engineer this to be very stable right now or are there more pressing issues?”. If you talk about releasing a product or feature, release it earlier, so it has time to compound users and adoption even if it’s not perfect. I could have kept working on urlscan for another year before releasing it, but then it would not have the same user base it has today.


Tracking 233 brands

What technologies did you use to build urlscan.io? How many servers did you use?

Johannes: That is something that is also important in terms of engineering philosophy. If you want to build something, and you are passionate about the product more than the process of building it, then start with what you know. Nobody cares if it’s Python, Ruby, NodeJS, or even PHP, just start with what you know and what works for you.

I started writing URLScan in NodeJS, which some people would have an issue with, but I was very familiar with it and was able to iterate quickly because I was comfortable with it. Since I’m interfacing with Google Chrome to do the website scanning, and all of the Google Chrome APIs are in JavaScript, I’m using JavaScript on the back end as well as the front end. That has helped me to quickly switch between working on different components of the system.

I set it up on one server, and it was running like that for a year and a half. It had a lot of traffic at that point. Most people wouldn’t have expected to see such a big site with so much traffic running on one server, but it was actually fine. I got a second server because the Elasticsearch index that I use for search and aggregation took up a lot of resources. The main reason was less the people using it and more the search engine crawlers crawling the pages. I added the second server last year, and now it’s at three servers. I’m pulling some tricks to optimize things so that it keeps working even with this small set of servers, which is a very fun aspect of this project.

Today, urlscan is handling 200 requests per second during business hours, serving tens of thousands of users a day and 200-300GB of outgoing traffic daily. Recently, we have often seen 100k websites scanned per day. To date, we have scanned more than 15 million websites.
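
Editor's note: to put these figures in perspective, here is a rough back-of-the-envelope conversion of the daily totals into average per-second rates. It is only a sketch under stated assumptions: gigabytes are treated as decimal (10^9 bytes) and traffic is treated as if it were spread evenly over 24 hours, which real traffic is not. The function names are illustrative and have nothing to do with urlscan.io's own code.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def average_mbit_per_s(gigabytes_per_day: float) -> float:
    """Average outgoing rate in Mbit/s.

    Assumption: decimal gigabytes (10**9 bytes) and an even spread over the day.
    """
    bits_per_day = gigabytes_per_day * 10**9 * 8
    return bits_per_day / SECONDS_PER_DAY / 10**6


def per_second(events_per_day: float) -> float:
    """Average number of events per second for a given daily total."""
    return events_per_day / SECONDS_PER_DAY


if __name__ == "__main__":
    # Figures quoted in the interview: 200-300 GB/day and 100k scans/day.
    for gb in (200, 300):
        print(f"{gb} GB/day is about {average_mbit_per_s(gb):.1f} Mbit/s on average")
    print(f"100,000 scans/day is about {per_second(100_000):.2f} scans/s on average")
```

Under those assumptions, the quoted outgoing traffic averages out to a few tens of megabits per second and the scan volume to roughly one scan per second, which helps explain how a handful of servers can carry the load.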


How do people find you and your company? What kind of companies use URLScan?

Johannes: By now, it’s obviously a big tool for many people in the cybersecurity industry. Clients are security companies, like the one I work at, but also the kind of people tasked with security at all kinds of organizations. We have 5500 registered users and around 50-60 thousand daily visitors, so it definitely has an audience.

Most often, people find out about my service through word of mouth — they might be part of a mailing-list where someone shares that they found a malicious website and are using my service for it, see my website, and start using it from there. People also find us through our integrations in commercial and open source threat intelligence, or automation tools.

Another way is through Google search engine traffic. If you are working in a security role, you often have a domain or an IP that you’ll search for in your threat intelligence database. But at some point, you might use a search engine, and a lot of the results will point to URLScan, since it might be one of the few services that has any information on that IP.


What were some of the biggest challenges you ran into while working on urlscan.io?

Johannes: The biggest challenges for me were prioritizing what to do and what not to do, and not making too many plans. I’m an obsessive list maker, which means everything in my life that I want to do has a list, especially when I get a great idea and write it down. After two years of writing stuff down, you have a pretty huge list of features you want to build. And that’s okay, because you can always go back and throw some of it away if you reach the point where it becomes paralyzing and demotivating. You don’t have time to do all 300 items on the list, so what you have to do is break it into smaller pieces and prioritize. You have to pick small chunks off that big list, because otherwise it will get overwhelming. The biggest challenge is keeping yourself motivated by not setting your expectations too high.


What can we expect from urlscan.io in the future? Do you have any new product goals?

Johannes: My plan for urlscan.io, on the community side, is to keep growing it in terms of features, but also in terms of volume — how many scans we support and how much data we can store and make available for searching.

I’m very happy to have found a great sponsor in SecurityTrails. Because SecurityTrails has gladly sponsored URLScan, I’m able to grow it and provide more value to the community. Further down the line, there might be some features that are for more specific use cases or too resource-intensive to be on the community side. In the future, some of these features, or an API, might only be available to paying customers. And what we manage to do with our paying customers will also support the community side of urlscan.io.


What advice would you give to someone who is looking into starting their own company but just needs that last bit of faith?

Johannes: Let me start out by saying that I’m not the most qualified to give advice since urlscan isn’t really a business yet. It’s a good product with a large user-base, but it’s not a profit-generating business, so my own experience is limited to the initial steps of building a product and making it widely known.

First, try to gain confidence by working on a job and realizing that most other folks are “winging it” as well. There are a lot of people who are scraping by and hoping what they built actually works, so just do something and don’t expect it to be perfect. When it’s just you working on it, there will obviously be issues and mistakes, but don’t let that keep you from doing what you envisioned.

Try to adopt a way of thinking in which you can learn anything you put your mind to and are not afraid of starting new things. If you’ve never, for example, done user management, you need to have the mindset that it’s not rocket science, and just research it. The same goes for the business side: you want to start a business, but there are all these things keeping you from it, like taxes, contracts, and payments. None of this is fun, but it’s also something you can learn to do yourself.

Everything you are able to learn yourself, you should. That’s my personal approach — try and do everything yourself, and only then can you determine if it is something you would rather outsource. I also encourage people to try some front-end engineering. Once you’ve done it, you can really tell if that’s something you are good at and want to keep doing, or if it’s something you want to outsource.

Start small and provide value to people, and you will be surprised at how quickly it will be picked up. It’s never been easier to set up a project, product, or business, and go from zero to providing value to people with no risk, aside from the time you’ve put in. You don’t have to buy a server anymore, hire people, or buy any service with a long contract. In most cases, everything you want to learn you can get for free on the internet. Be self-reliant for as long as you can, and grow at your own pace.

Lastly, while scratching your own itch is a great place to start, make sure that what you’re building actually solves someone’s problem. That means that you should understand the problem space, for example by having worked in a particular industry and role. I was able to build features for urlscan because I’m also a user of systems like it, so I know what information a security professional would be looking for. I think there are so many untapped industries which are yearning for someone to come in and solve their problems instead of building yet another To-Do-list or social network for lack of better ideas. I would like to tackle some of these problems in other industries at some point, industries with a more direct real-world impact such as healthcare, food, transportation, industrial automation, supply chain management, just to name a few.


Don’t forget to check out urlscan.io if you haven’t already and follow them on Twitter.

