Have I Been Pwned 2.0 is Now Live!

This has been a very long time coming, but finally, after a marathon effort, the brand new Have I Been Pwned website is now live!

Feb last year is when I made the first commit to the public repo for the rebranded service, and we soft-launched the new brand in March of this year. Over the course of this time, we’ve completely rebuilt the website, changed the functionality of pretty much every web page, added a heap of new features, and today, we’re even launching a merch store 😎

Let me talk you through just some of the highlights, strap yourself in!

The signature feature of HIBP is that big search box on the front page, and now, it’s even better: it has confetti!

Well, not for everyone, only about half the people who use it will see a celebratory response. There’s a reason why this response is intentionally jovial, let me explain:

As Charlotte and I have travelled and spent time with so many different users of the service around the world, a theme has emerged over and over again: HIBP is a bit playful. It’s not a scary place emblazoned with hoodies, padlock icons, and fearmongering about “the dark web”. Instead, we aim to be more consumable to the masses and provide factual, actionable information without the hyperbole. Confetti guns (yes, there are several, and they’re animated) lighten the mood a bit. The alternative is that you get the red response:

There was a very brief moment where we considered a more light-hearted treatment on this page as well, but somehow a bit of sad trombone really didn’t seem appropriate, so we deferred to a more demure response. But now it’s on a timeline you can scroll through in reverse chronological order, with each breach summarising what happened. And if you want more info, we have an all-new page I’ll talk about in a moment.

Just one little thing first: we’ve dropped username and phone number search support from the website. Username searches were introduced in 2014 for the Snapchat incident, and phone number searches in 2021 for the Facebook incident. And that was it. That’s the only time we ever loaded those classes of data, and there are several good reasons why. Firstly, they’re both painful to parse out of a breach compared to email addresses, which we simply use a regex to extract (we’ve open sourced the code that does this). Usernames are a string. Phone numbers are, well, it depends. They’re not just numbers because if you properly internationalise them (like they were in the Facebook incident), they’ve also got a plus at the front, but they’re frequently all over the place in terms of format. And we can’t send notifications because nobody “owns” a username, and phone numbers are very expensive to send SMSs to compared to sending emails. Plus, every other incident in HIBP other than those two has had email addresses, so if we’re asking “have I been pwned?” we can always answer that question without loading those two hard-to-parse fields, which usually aren’t present in most breaches anyway. When the old site offered to accept them in the search box, it created confusion and support overhead: “why wasn’t my number in the [whatever] breach?!”. That’s why it’s gone from the website, but we’ve kept it supported on the API to ensure we don’t break anything… just don’t expect to see more data there.
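To make the parsing point concrete, here is a minimal sketch in Python. It is not the open-sourced HIBP extraction code: the regex is a deliberately simplified assumption, and the function names (`extract_emails`, `normalise_phone`) are made up for illustration. One pattern is enough to pull email addresses out of arbitrary breach text, whereas the same phone number can turn up in several shapes that all need normalising before they can be compared.

```python
import re

# Deliberately simple pattern; an assumption for illustration, not HIBP's regex.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+")


def extract_emails(text: str) -> list[str]:
    """Return the unique email addresses in text, lower-cased, in first-seen order."""
    seen: dict[str, None] = {}
    for match in EMAIL_PATTERN.findall(text):
        seen.setdefault(match.lower(), None)
    return list(seen)


def normalise_phone(raw: str, default_country_code: str) -> str:
    """Naive normalisation to '+<digits>'; real breach data needs far more care."""
    digits = re.sub(r"\D", "", raw)
    if raw.strip().startswith("+"):
        # Already internationalised, just strip spaces and punctuation.
        return "+" + digits
    if digits.startswith("00"):
        # International dialling prefix written as 00 instead of +.
        return "+" + digits[2:]
    # Assume a national number: drop one leading zero, prepend the country code.
    national = digits[1:] if digits.startswith("0") else digits
    return "+" + default_country_code + national
```

With this sketch, `+61 412 345 678`, `0412-345-678` and `0061 412 345 678` all normalise to the same value only because a default country code of `61` was supplied, which is exactly the kind of guesswork an email address never requires.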

The Breach Page

There are many reasons we created this new page, not least of which is that the search results on the front page were getting too busy, and we wanted to palm off the details elsewhere. So, now we have a dedicated page for each breach, for example:

This is largely information we had already (albeit displayed in a much more user-friendly fashion), but what’s unique about the new page is much more targeted advice about what to do after the breach:

I recently wrote about this section and how we plan to identify other partners who are able to provide appropriate services to people who find themselves in a breach. Identity protection providers, for example, make a lot of sense for many data breaches.

Now that we’re live, we’ll also work on fleshing this page out with more breach and user-specific data. For example, if the service supports 2FA, then we’ll call that out specifically rather than rely on the generic advice above. Same with passkeys, and we’ll add a section for that. A recent discussion with the NCSC while we were in the UK was around adding localised data breach guidance, for example, showing folks from the UK the NCSC logo and a link to their resource on the topic (which recommends checking HIBP 🙂).

I’m sure there’s much more we can do here, so if you’ve got any great ideas, drop me a comment below.

The Dashboard

Over the course of many years, we introduced more and more features that required us to know who you were (or at least that you had access to the email address you were using). It began with introducing the concept of a sensitive breach during the Ashley Madison saga of 2015, which meant the only way to see your involvement in that incident was to receive an email to the address before searching. (Sidenote: There are many good reasons why we don’t do that on every breach.) In 2019, when I put an auth layer around the API to tackle abuse (which it did beautifully!) I required email verification first before purchasing a key. And more things followed: a dedicated domain search dashboard, managing your paid subscription and earlier this year, viewing stealer logs for your email address.

We’ve now unified all these different places into one central dashboard:

From a glance at the nav on the left, you can see a lot of familiar features that are pretty self-explanatory. These combine things that are relevant to the masses and those that are more business-oriented. They’re now all behind the one “Sign In” that verifies access to the email address before being shown. In the future, we’ll also add passkey support to avoid needing to send an email first.

The dashboard approach isn’t just about moving existing features under one banner; it will also give us a platform on which to build new features in the future that require email address verification first. For example, we’ve often been asked to provide people with the ability to subscribe their family’s email addresses to notifications, yet have them go to a different address. Many of us play tech support for others, and this would be a genuinely useful feature that makes sense to place at a point where you’ve already verified your email address. So, stay tuned for that one, among many others.

The Domain Search Feature

More time went into this one feature than most of the other ones combined. There’s a lot we’ve tried to do here, starting with a much cleaner list of verified domains:

The search results now give a much cleaner summary and add filtering by both email address and a hotly requested new feature: just the latest breach (it’s in the drop-down):

All these searches now just return JSON from APIs and the whole dashboard acts as a single-page app, so everything is really snappy. The filtering above is done purely client-side against the full JSON of the domain search, an approach we’ve tested with domains of over a quarter million breached email addresses and it’s still been workable (although arguably, you really want that data via the API rather than scrolling through it in a browser window).
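As a rough illustration of why filtering the full result set in memory stays cheap, here is a sketch of the idea. It is written in Python for brevity rather than the site’s TypeScript, and it is not the dashboard’s actual code. The data shape (a mapping of email alias to breach names, plus a separate mapping of breach name to date) and the `filter_results` function are assumptions made up for the example.

```python
from datetime import date


def filter_results(
    results: dict[str, list[str]],
    breach_dates: dict[str, date],
    email_contains: str = "",
    latest_only: bool = False,
) -> dict[str, list[str]]:
    """Filter an in-memory domain search result without another server round trip."""
    needle = email_contains.strip().lower()

    latest = None
    if latest_only:
        # The "latest breach" is the most recent one appearing anywhere in the results.
        names = {name for hits in results.values() for name in hits}
        latest = max(names, key=lambda name: breach_dates[name], default=None)

    filtered: dict[str, list[str]] = {}
    for alias, hits in results.items():
        if needle and needle not in alias.lower():
            continue
        if latest is not None:
            hits = [name for name in hits if name == latest]
        if hits:
            filtered[alias] = hits
    return filtered
```

Each filter is a single pass over data already held by the page, so even a few hundred thousand entries involve no further requests, only a loop.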

Verification of domain ownership has also been completely rewritten and has a much cleaner, simpler interface:

We still have work to do to make the non-email verification methods smoother, but that was the case before, too, so at least we haven’t regressed. That’ll happen shortly, promise!

The API

First things first: there have been no changes to the API itself. This update doesn’t break anything!

There’s a discussion over on the UX rebuild GitHub repo about the right way to do API documentation. The general consensus is OpenAPI and we started going down that route using Scalar. In fact, you can even see the work Stefan did on this here at haveibeenpwned.com/scalar:

It’s very cool, especially the way it documents samples in all sorts of different languages and even has a test runner, which is effectively Postman in the browser. Cool, but we just couldn’t finish it in time. As such, we’ve kept the old documentation for now and just styled it so it looks like the rest of the site (which I reckon is still pretty slick), but we do intend to roll to the Scalar implementation when we’re not under the duress of such a big launch.

The Merch Store

You know what else is awesome? Merch! No, seriously, we’ve had so many requests over the years for HIBP branded merch and now, here we are:

We actually now have a real-life merch store at merch.haveibeenpwned.com! This was probably the worst possible use of our time, considering how much mechanical stuff we had to do to make all the new stuff work, but it was a bit of a passion project for Charlotte, so yeah, now you can actually buy HIBP merch. It’s all done through Teespring (where have I heard that name before?!) and everything listed there is at cost price; we make absolutely zero dollars, it’s just a fun initiative for the community 🙂

We did take a look at their option for stickers too, but they fell well short of what we already had up on our little one-item store on Sticker Mule so for now, that remains the go-to for laptop decorations. Or just go and grab the open source artwork and get your own printed from wherever you please.

The Nerdy Bits

We still run the origin services on Microsoft Azure using a combination of the App Service for the website, “serverless” Functions for most APIs (there are still a few async ones there that are called as a part of browser-based features), SQL Azure “Hyperscale” and storage account features like queues, blobs and tables. Pretty much all the coding there is C# with .NET 9.0 and ASP.NET MVC on .NET Core for the web app. Cloudflare still plays a massive role with a lot of code in workers, data in R2 storage and all their good bits around WAF and caching. We’re also now exclusively using their Turnstile service for anti-automation and have ditched Google’s reCAPTCHA completely. Big yay!

The front end is now latest gen Bootstrap and we’re using SASS for all our CSS and TypeScript for all our JavaScript. Our (other) man in Iceland Ingiber has just done an absolutely outstanding job with the interfaces and exceeded all our expectations by a massive margin. What we have now goes far beyond what we expected when we started this process, and a massive part of that has been Ingiber’s ability to take a simple requirement and turn it into a thing of beauty 😍 I’m very glad that Charlotte, Stefan and I got to spend time with him in Reykjavik last month and share some beers.

We also made some measurable improvements to website performance. For example, I ran a Pingdom website speed test just before taking the old one offline:

And then ran it over the new one:

So we cut out 28% of the page size and 31% of the requests. The load time is much of a muchness (and it’s highly variable at that), but having solid improvements in all the values in the column on the right is a very pleasing result. Consider also the commentary anyone in web dev would have seen over the years about how much bigger web pages have become, and here we are shaving off solid double-digit percentages 11 years later!

Lastly, anything that could remotely be construed as tracking or ad bloat just isn’t there, because we simply don’t do any of that 🙂 In fact, the only real traffic stats we have are based on what Cloudflare sees when the traffic flows through their edge nodes. And that 1Password product placement is, as it’s always been, just text and an image. We don’t even track outbound clicks, that’s up to them if they want to capture that on the landing page we link to. This actually makes discussions such as the ones we’re having with identity theft companies that want product placement much harder, as they’re used to getting the sorts of numbers that invasive tracking produces, but we wouldn’t have it any other way.

The AI

I wanted to make a quick note of this here, as AI seems to be either constantly overblown or denigrated. Either it’s going to solve the world’s problems, or it just produces “slop”. I used ChatGPT in particular really extensively during this rebuild, especially in the final days when time got tight and my brain got fried. Here are some examples where it made a big difference:

I'm using Bootstrap icons from here:

What's a good icon to illustrate a heading called "Index"?

This was right at the eleventh hour when we realised we didn’t have time to implement Scalar properly, and I needed to quickly migrate all the existing API docs to the new template. There are over 2,000 icons on that page, and this approach meant it took about 30 seconds to find the right one, every time.

We killed off some pages on the old site, but before rolling it over, I wanted to know exactly what was there:

Write me a PowerShell script to crawl haveibeenpwned.com and write out each unique URL it finds

And then:

Now write a script to take all the paths it found and see if they exist on stage.haveibeenpwned.com

It found good stuff too, like the security.txt file I’d forgotten to migrate. It also found stuff that never existed, so it’s the usual “trust, but verify” situation.
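For anyone wanting to try the same kind of check, here is a minimal sketch of the two building blocks those prompts describe: collecting same-host paths from a page, then seeing which of them return a 404 on another host. It is written in Python rather than PowerShell, and it is not the script ChatGPT produced. The helper names (`same_host_paths`, `missing_on`, `http_status`) are invented for this example, and a real crawl would also need a queue of pages to visit, rate limiting and error handling.

```python
from html.parser import HTMLParser
from urllib.error import HTTPError
from urllib.parse import urljoin, urlsplit
from urllib.request import Request, urlopen


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag fed to the parser."""

    def __init__(self) -> None:
        super().__init__()
        self.hrefs: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def same_host_paths(page_url: str, html: str) -> set[str]:
    """Return the unique paths linked from html that live on page_url's host."""
    collector = LinkCollector()
    collector.feed(html)
    host = urlsplit(page_url).netloc.lower()
    paths = set()
    for href in collector.hrefs:
        absolute = urlsplit(urljoin(page_url, href))
        if absolute.scheme in ("http", "https") and absolute.netloc.lower() == host:
            paths.add(absolute.path or "/")
    return paths


def http_status(url: str) -> int:
    """Issue a HEAD request and return the status code, including error codes."""
    try:
        with urlopen(Request(url, method="HEAD"), timeout=10) as response:
            return response.status
    except HTTPError as error:
        return error.code


def missing_on(host: str, paths, fetch_status=http_status) -> list[str]:
    """Return the paths that respond with 404 on the given host, sorted."""
    return sorted(p for p in paths if fetch_status(f"https://{host}{p}") == 404)
```

Passing the status lookup in as a parameter keeps the comparison logic testable without touching the network, which also makes the “verify” half of “trust, but verify” a little easier.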

And just a gazillion little things where every time I needed anything from some CSS advice to configuring Cloudflare rules to idiosyncrasies in the .NET Core web app, the right answer was seconds away. I’d say it was right 90% of the time, too, and if you’re not using AI aggressively in your software development work now (and I’m sure there are much better ways, too) I’m pretty confident in saying “you’re doing it wrong”.

The Journey Here

It’s hard to explain how much has gone into this, and that goes well beyond just what you see in front of you on the website today. It’s seemingly little things, like minor revisions to the terms of use and privacy policy, which required many hours of time and thousands of dollars with lawyers (just minor updates to how we process data and a reflection of new services such as the stealer logs).

We pushed out the new site in the wee hours of Sunday morning my time, and almost everything went well:

One or two little glitches that we’ve fixed and pushed quickly, that’s it. I’ve actually waited until now, 2 days after going live, to publish this post just so we could iron out as much stuff as possible first. We’ve pushed more than a dozen new releases already since that time, just to keep iterating and refining quickly. TBH, it’s been a bit intense and has been an enormously time-consuming effort that’s dominated our focus, especially over the last few weeks leading up to launch. And just to drive that point home, I literally got a health alert first thing Monday morning:

Nothing like empirical data to make a point! That last weekend when we went live was especially brutal; I don’t think I’ve devoted that much high-intensity time to a software launch for decades.

Have I Been Pwned has been a passion for a quarter of my life now. What I built in 2013 was never intended to take me this far or last this long, and I’m kinda shocked it did if I’m honest. I feel that what we’ve built with this new site and new brand has elevated this little pet project into a serious service that has a new level of professionalism. But I hope that in reading this, you see that it has maintained everything that has always been great about the service, and I’m so glad to still be here writing about it today in the 205th blog post with that tag. Thanks for reading, now go and enjoy the new website 😊

Edit (a few hours after originally posting): Let me expand on Cloudflare’s Turnstile as it will explain some idiosyncrasies some people have seen:

This is an anti-automation approach that doesn’t involve palming traffic to Google (like reCAPTCHA did), and it can be implemented completely invisibly. There are more invasive implementations of it, but we’re trying to be seamless here. It involves some Cloudflare script running in the browser and providing a challenge, which is then submitted with the HTTP request and verified server side. We’ve had it on HIBP in one form or another since 2023, and it can be awesome… until it’s not. If the challenge fails, what happens next? It depends.
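The “verified server side” step generally means sending the token from the browser to Cloudflare’s documented `siteverify` endpoint along with the site’s secret key, then checking the `success` flag in the JSON that comes back. Here is a minimal sketch of that exchange in Python. The HIBP back end is C#, so this is an illustration of the protocol rather than the site’s code, and the function names are made up.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

SITEVERIFY_URL = "https://challenges.cloudflare.com/turnstile/v0/siteverify"


def build_siteverify_request(secret: str, token: str) -> Request:
    """Build the form-encoded POST carrying the secret key and the browser's token."""
    body = urlencode({"secret": secret, "response": token}).encode("ascii")
    return Request(SITEVERIFY_URL, data=body, method="POST")


def token_accepted(siteverify_json: str) -> bool:
    """Treat the token as valid only when Cloudflare explicitly reports success."""
    result = json.loads(siteverify_json)
    return result.get("success") is True


def verify_token(secret: str, token: str) -> bool:
    """Round trip to Cloudflare; network errors are left to the caller to handle."""
    with urlopen(build_siteverify_request(secret, token), timeout=10) as response:
        return token_accepted(response.read().decode("utf-8"))
```

When `verify_token` returns `False`, the server rejects the request, and what the visitor sees next is decided by the page that sent it.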

On forms where we really need to block the robots (for example, any that send email), a failed Turnstile challenge was originally just showing a red error. It now says this:

Our anti-automation process thinks you’re a bot, which you’re obviously not! Try behaving like a human and clicking the button again and if it still misbehaves, give the page a reload.

We’ve often found a second click or a page reload solves the problem, so hopefully this sends people in the right direction. If it doesn’t, we’ll need to look at more in-your-face implementations of Turnstile that show a widget you need to interact with. To have a go yourself and see it in action, try the dashboard sign in page.

The other place Turnstile features heavily is on the main search page at the root of the site. We don’t want that API being hit by bots, so it’s a must have there. Here, like on the other pages of the new site, we’re asynchronously posting to API endpoints and sending the challenge token along with the request. What we’re doing differently on the front page, however, is that if the challenge fails and returns HTTP 401 when posted to the HIBP endpoint (you’ll also see a response body of “Invalid Turnstile token”), we were meant to be falling back to a full page post. That wasn’t happening in the new site when we first launched it. But it is now 🙂

When the full page post back occurs, Cloudflare will present a managed challenge. This is much more invasive, but it’s also much more reliable and will then serve the same result as you would have seen anyway, albeit via a full page load. We implement the same managed challenge logic on the deep-linked account pages, which you can see here: https://haveibeenpwned.com/account/test@example.com
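The decision described above boils down to a small piece of branching, sketched here in Python for illustration only (the real logic lives in the site’s TypeScript, and the `NextStep` names and `next_step` function are hypothetical): a 401 from the endpoint means the Turnstile token was rejected, and what happens next depends on whether the request came from the front page search or from another form.

```python
from enum import Enum


class NextStep(Enum):
    HANDLE_RESPONSE = "handle_response"        # render whatever the API returned
    FULL_PAGE_POST = "full_page_post"          # let Cloudflare show a managed challenge
    SHOW_RETRY_MESSAGE = "show_retry_message"  # ask the visitor to click again or reload


def next_step(status: int, is_front_page_search: bool) -> NextStep:
    """Decide what the page does after an asynchronous post carrying a Turnstile token."""
    if status != 401:
        # The token was accepted, so any other status is an ordinary API response.
        return NextStep.HANDLE_RESPONSE
    if is_front_page_search:
        # Fall back to a full page post so the managed challenge can run.
        return NextStep.FULL_PAGE_POST
    # Other forms, such as those that send email, show the friendly retry message.
    return NextStep.SHOW_RETRY_MESSAGE
```

Keeping the fallback in one function makes it obvious that only the front page search takes the full page post route when the token is rejected.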

According to the Cloudflare stats, about 82% of all our issued challenges are successfully solved:

Of the 18% that aren’t, many will be due to bots stopped by Turnstile doing exactly what it’s meant to do. It’s likely a single-digit percentage of requests that are real humans being impeded, and we need to look at ways to get that number down, but at least the fallback positions are improved now. If you were having problems, give the site a good refresh, see how you go and leave your feedback in the comments below.
