A Short Introduction to Web Security Standards
XSS, CSRF, CSP, WTF does it all mean? Let’s dive in and find out what each of these acronyms means and how we can use them to make our websites secure.
As a web developer, every day you come across so many acronyms that it can be challenging to remember what they all mean. Have you ever seen acronyms like XSS or CSRF? These are the bad things that an attacker can do to your site. Then there are HTTPS, CSP, CAA, and HSTS: security standards that were developed to prevent nasty hackers from messing with your website.
Of course, you don’t have to be able to cite the relevant RFC, but it’s a good idea to at least have a general sense of what they are. So, let’s dive in and find out what each of these acronyms means.
XSS (Cross-Site Scripting)
XSS stands for Cross-Site Scripting, which is a malicious code injection technique. Let’s imagine that you’re an attacker here. If you want to inject content into some site, what are you going to inject? Probably a script, right? One way to do that is with an inline script tag.
If you can inject some inline scripts on a page, the browser doesn’t know who put them there. It’s just doing its job and when someone loads the page, the script will be executed. This gives the attacker the ability to basically do anything on the page that a legitimate user can do. That’s the basic idea of XSS.
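To make that concrete, here is a minimal Python sketch, not a complete defense. The two render_comment helpers are hypothetical and only stand in for whatever code drops user input into a page. Pasted in as-is, the payload reaches the browser as a live script tag; passed through the standard library’s html.escape, it reaches the browser as inert text.

```python
import html


def render_comment_unsafe(comment: str) -> str:
    # Hypothetical helper: user input is pasted into the page as-is.
    return f"<p>{comment}</p>"


def render_comment_escaped(comment: str) -> str:
    # html.escape turns <, >, & and quotes into HTML entities,
    # so the browser shows the text instead of running it.
    return f"<p>{html.escape(comment)}</p>"


payload = "<script>alert('xss')</script>"
print(render_comment_unsafe(payload))   # contains a real <script> tag
print(render_comment_escaped(payload))  # contains only escaped text
```

Escaping output is only one layer. The rest of this article looks at what the browser itself can do to help.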
CSRF (Cross-Site Request Forgery)
A CSRF attack occurs when a user is tricked into interacting with a page on a malicious site. That interaction generates a request to the target site from an authenticated user. The attacker controls the data in the request and uses it to cause unwanted actions.
In other words, it’s an attack in which the attacker tries to force a victim’s browser to create a request to the target server, without letting the victim know about it.
The danger of CSRF is that it relies on perfectly normal behavior of browsers and of HTTP itself. For example, it’s normal that a site can contain images from another site on its pages. The browser does not know in advance what exactly it is trying to load: a picture or a disguised payload.
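As a rough illustration of that image example, the sketch below builds the kind of tag an attacker might put on their own page. Everything in it is hypothetical: the bank.example host, the /transfer path, and its parameters are invented for illustration. The point is only that, to the browser, this looks like any other image to fetch.

```python
import html
from typing import Mapping
from urllib.parse import urlencode


def forged_image_tag(target_url: str, params: Mapping[str, str]) -> str:
    # The "image" is really a request to the target site in disguise.
    url = f"{target_url}?{urlencode(params)}"
    # Escape the URL so the attribute is valid HTML (& becomes &amp;).
    return f'<img src="{html.escape(url)}" width="1" height="1" alt="">'


# Hypothetical target and parameters, for illustration only.
tag = forged_image_tag(
    "https://bank.example/transfer",
    {"to": "attacker", "amount": "100"},
)
print(tag)
```

When the victim’s browser loads the attacker’s page, it requests that URL from the target site on the victim’s behalf, and the victim never sees anything but a tiny broken image.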
Are we doomed yet? … Maybe not
However, keep your apocalyptic thoughts on a leash. There are some mechanisms that we can use to protect against all that. Let’s look at what technologies we have to prevent these kinds of things from happening.
HTTPS (HyperText Transfer Protocol Secure)
More and more websites are moving to secure connections with Progressive Web Apps, .app domains, SSL certificates, etc. This happens for obvious reasons: if your website uses HTTPS, all communication between your browser and the website is encrypted.
There used to be an idea that moving your site to HTTPS would be a big performance hit, but now it really isn’t a performance concern at all. It’s been shown that it can actually improve performance, which is cool. You even get a little ranking boost on Google for going HTTPS, so it’s always a good idea these days. It’s the way forward for the web.
There are also some general benefits, like preventing your ISP, or anyone else sitting on the network, from tampering with your traffic and adding all kinds of weird junk to the page. Going HTTPS provides a better user experience in general.
CSP (Content Security Policy): what is it and why do we need it?
CSP has been around for well over a decade. It was created at Mozilla, and essentially it’s just a whitelist of the things that web pages are allowed to do.
Traditionally, you render a web page, send it to the browser and it does what the page says, like loading images and scripts. That means the browser will just as happily execute a malicious script that has been injected into the page. A human looking at it can obviously see that it doesn’t look good, but the browser doesn’t have that context, and CSP is how you give the browser that context.
We’re basically saying “I can load my scripts, images, and styles only from these domains”, or “Please send form data from this page only to this endpoint”. If someone has modified my login page and is trying to post user data to another website, the browser will block that.
You can really control all the things that the browser can do. Prior to this mechanism, we were only able to build our page and send it to the browser. With CSP the browser knows what you expect to happen and it can make sure that it happens.
So, at a very high level, CSP is a whitelisting mechanism for the content that your site can load and the things that it can do.
Where do we have to define this policy?
The ideal way to do this is in an HTTP response header, but you can also do it in the HTML as a meta tag. Most web developers are familiar with setting and configuring headers. CSP is just another HTTP response header.
You define a header, write your policy, and send it over with the page that you want that policy to be applied to. The main policy is set on the document, but you can also set a CSP on subresources. Most people, though, just build one universal policy for the site and then deliver it with all of the assets.
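Here is a minimal sketch of what such a policy could look like when assembled in Python. The build_csp helper and the cdn.example and images.example hosts are made up for illustration; the directive names (default-src, script-src, img-src, form-action) and the Content-Security-Policy header name are the real ones.

```python
from typing import Dict, List


def build_csp(directives: Dict[str, List[str]]) -> str:
    # Each directive is "name source source ...";
    # directives are separated by semicolons.
    return "; ".join(
        f"{name} {' '.join(sources)}" for name, sources in directives.items()
    )


policy = build_csp({
    "default-src": ["'self'"],                           # fallback for everything else
    "script-src": ["'self'", "https://cdn.example"],     # where scripts may load from
    "img-src": ["'self'", "https://images.example"],     # where images may load from
    "form-action": ["'self'"],                           # where forms may post to
})

# The policy travels as an ordinary response header.
headers = {"Content-Security-Policy": policy}
print(headers["Content-Security-Policy"])
```

How you attach the header depends on your server or framework; the dictionary above is just a stand-in for that step.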
Wait, how do we do something like an inline script for Google Analytics with CSP enabled?
By default, a CSP that restricts scripts blocks all inline script tags. You have to put the code in a file and then load it from your domain. That way the browser can check the domain against the whitelist, so you have full control over the scripts that are executing on your origin.
So, the first option would be to externalize it into a ga.js file. The second option would be to take the content of the script tag, hash it with sha256, and put that hash into the policy. That tells the browser: if there’s an inline script block on the page, hash it, and if the hash matches, it’s something we expect and it can be executed. If it doesn’t match, it can’t be executed. This is called a CSP hash-source.
Another mechanism is a CSP nonce-source. A nonce is an arbitrary number that is used just once. You put a random value in the policy header and the same random value in a nonce attribute on the script tag, and if those two match, the browser is allowed to execute the script. But that’s more of a super technical edge case of CSP, and we have to keep in mind that it also increases complexity.
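Both mechanisms can be sketched in a few lines of Python using only the standard library. The helper names and the sample script are assumptions for illustration; the 'sha256-…' and 'nonce-…' source formats are the ones CSP defines.

```python
import base64
import hashlib
import secrets


def csp_hash_source(inline_script: str) -> str:
    # Hash the exact text between <script> and </script>,
    # then base64-encode the raw digest.
    digest = hashlib.sha256(inline_script.encode("utf-8")).digest()
    return "'sha256-" + base64.b64encode(digest).decode("ascii") + "'"


def new_csp_nonce() -> str:
    # A fresh, unpredictable value; generate a new one for every response.
    return base64.b64encode(secrets.token_bytes(16)).decode("ascii")


inline_script = "console.log('hello');"  # sample inline script, for illustration

# Option 1: whitelist this exact script by its hash.
hash_policy = f"script-src 'self' {csp_hash_source(inline_script)}"

# Option 2: put the same random value in the header and in the tag.
nonce = new_csp_nonce()
nonce_policy = f"script-src 'self' 'nonce-{nonce}'"
nonce_tag = f'<script nonce="{nonce}">{inline_script}</script>'

print(hash_policy)
print(nonce_policy)
print(nonce_tag)
```

Note that with a hash, changing even one character of the script (including whitespace) changes the hash, so the policy has to be updated together with the script.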
How does the reporting work in CSP?
When the browser blocks something because of your policy, the problem is that you as a host don’t know about it. That’s where the CSP reporting feature comes in handy. After you enable that feature, when CSP does a blocking action, the browser will tell you about it by dispatching an HTTP POST request that has a JSON payload. That JSON payload has a set of defined fields with specified values, so all you need is an endpoint that can take this JSON and then do stuff with it.
There are tools like Sentry that can take this JSON and have some support for CSP.
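If you want to handle reports yourself, the parsing side can be as small as the sketch below. It assumes the older report-uri style payload, where the fields sit under a top-level "csp-report" key; the newer Reporting API delivers reports in a different shape, so treat this as a starting point. The summarize_csp_report name and the sample values are made up for illustration.

```python
import json


def summarize_csp_report(body: bytes) -> str:
    # Assumes the report-uri payload shape: {"csp-report": {...}}.
    report = json.loads(body).get("csp-report", {})
    document = report.get("document-uri", "unknown page")
    directive = report.get("violated-directive", "unknown directive")
    blocked = report.get("blocked-uri", "unknown resource")
    return f"{document}: {directive} blocked {blocked}"


# A hand-written sample report, for illustration only.
sample = json.dumps({
    "csp-report": {
        "document-uri": "https://example.com/login",
        "violated-directive": "script-src",
        "blocked-uri": "https://evil.example/keylogger.js",
    }
}).encode("utf-8")

print(summarize_csp_report(sample))
```

In a real endpoint you would log or store the summary rather than print it, and expect plenty of noise from browser extensions and the like.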
What are some problems in CSP?
It’s really hard to retrofit CSP onto legacy content. Google Tag Manager has been a big blocker for CSP. With its help, marketing people have the ability to inject arbitrary scripts into the pages of a website. This is just insane. What if someone from your marketing team injects malware or a keylogger into your website and someone steals all your users’ credit card numbers?
With things like the Ticketmaster credit card breach and the compromised Browsealoud script, developers should be more careful about the third-party technologies they allow on their websites. A lot of the time programmers are in such a hurry just to get something to work that the security aspect of it ends up taking a backseat.
Another thing that’s problematic is that implementing CSP on your website should be as easy as clicking one button, and it isn’t. So much of the information that comes out of the security community is riddled with acronyms or assumes deep knowledge. You need a certain amount of knowledge to be able to implement good security practices on a website, and that can scare some people away.
HSTS (HTTP Strict Transport Security)
HSTS stands for HTTP Strict Transport Security. Once your browser knows that a website has HSTS enabled, it never even attempts to load a page from that site insecurely; it automatically switches the request to HTTPS before anything goes over the network.
When you open a tab in a browser, type facebook.com and hit Enter, the browser needs a fully formed URL. It will turn that into http://facebook.com, because HTTP is the default scheme used to make that a URL. So, if there is somebody on the network who can take advantage of that and serve you a fake Facebook login page, you’re done.
One of the main features of HSTS is to change that default. If you have a website and you turn on HSTS there, it basically tells the browser to talk to you only over a secure connection. From then on, the browser must use HTTPS. Even if the user manually types HTTP, or follows an HTTP link or bookmark, the browser will force HTTPS.
HSTS is a per-site mechanism; you deliver it through a response header. The problem with HSTS is that it is a TOFU mechanism, which stands for trust on first use. If I’ve never been to Facebook’s website, how do I know that it has HSTS turned on? So, the first time you go to Facebook, the request would go out over HTTP. Only once you land on the HTTPS version does the browser find out that they have HSTS, and from then on it will always use HTTPS.
To get around that there is something called preloading. This is when Facebook can have its domain name basically built into the browser source code. You can look at the Chromium source and there’s literally a huge list of domains that says all of these have HSTS turned on, so the browser knows that before it has even connected.
But the problem here is that this is not a scalable solution. There is a workaround: it is possible to preload an entire top-level domain (TLD). So, instead of baking in whole site names, they only bake in TLDs. If you go to the same list in the Chromium source you’ll see things like .dev. What that means is that every single domain on these TLDs has the HSTS policy enabled.
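To tie the HSTS pieces together, here is a minimal sketch of building the header value. The hsts_header helper is made up for illustration. The Strict-Transport-Security header name and the max-age and includeSubDomains directives come from the HSTS standard; the preload token is, as far as I know, a convention used when submitting a site to the browser preload list rather than part of the standard itself.

```python
def hsts_header(max_age_seconds: int,
                include_subdomains: bool = False,
                preload: bool = False) -> str:
    # max-age tells the browser how long to remember "HTTPS only".
    parts = [f"max-age={max_age_seconds}"]
    if include_subdomains:
        # Apply the same rule to every subdomain as well.
        parts.append("includeSubDomains")
    if preload:
        # Signals that the site wants to be on the preload list.
        parts.append("preload")
    return "; ".join(parts)


ONE_YEAR = 365 * 24 * 60 * 60  # in seconds

headers = {
    "Strict-Transport-Security": hsts_header(
        ONE_YEAR, include_subdomains=True, preload=True
    )
}
print(headers["Strict-Transport-Security"])
```

A common approach is to start with a short max-age while you check that everything works over HTTPS, then raise it, because browsers will hold you to whatever you promised.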
CAA (Certificate Authority Authorization)
Right now if I want to get a certificate for a website, I go to a certificate authority and say “Hey, I’m Owlypixel, please give me a certificate”. It’s the certificate authority’s job to prove or disprove that I am who I say I am. The problem here is that even a complete stranger can apply to any certificate authority for a certificate and try to impersonate you. You have no say in this process.
With CAA you basically say “These are the CAs that I permit to issue certificates for me”. For example, only Let’s Encrypt or Cloudflare are allowed to issue a certificate. You’ve narrowed down the scope of CAs and authorized specific CAs to issue certificates for you.
CAA can be configured by adding one DNS record. A lot of people don’t think that it’s a strong mechanism and they just disregard it. But you’re getting a huge level of protection just for setting a DNS record, which makes it a pretty powerful boost for your overall security.
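For a sense of what that DNS record looks like, the sketch below formats CAA records in the usual zone-file style: a flags value, a tag such as issue, and the CA’s domain in quotes. The caa_record helper and example.com are placeholders. letsencrypt.org is the identifier Let’s Encrypt uses; for any other CA, check its documentation for the right value.

```python
def caa_record(domain: str, ca_domain: str,
               tag: str = "issue", flags: int = 0) -> str:
    # Zone-file style: <name>. IN CAA <flags> <tag> "<value>"
    return f'{domain}. IN CAA {flags} {tag} "{ca_domain}"'


# Permit only Let's Encrypt to issue certificates for example.com.
record = caa_record("example.com", "letsencrypt.org")
print(record)
```

You would add one such record per CA you want to allow, through whatever interface your DNS provider gives you.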
Ok, that’s awesome but where do I start?
Number one should probably be HTTPS, because if the connection between the browser and the website is not secure then we have nothing to build other security on top of. It’s step one in making a website secure.
There’s no actual requirement to do it; it’s just that there are more and more incentives to do it. You get an SEO boost from Google and better performance with things like Brotli compression. It’s not a new thing that the web is moving towards encryption; it’s just that we’re now moving towards encryption faster than we ever have. Not to mention that Chrome will start marking all HTTP sites with the Not Secure mark.
Once you get HTTPS working, you can look at HSTS, CAA and after that CSP. CSP has something like 50 features, but you can use only the ones that you need.
Security is something that every developer needs to know a bit of. It really doesn’t take that much time to learn and implement. Of course, you can get yourself in trouble if you don’t know what you’re doing. I think that this article is a really good start for people who hadn’t heard of these standards before, so that they can tighten up their websites.