How to set up SEO in React deployed on AWS without server-side pre-rendering

William Lindsay
6 min read · Jul 21, 2021


Need to add SEO header tags to your React Single-Page App (SPA) and don’t want to pre-render it on your server and rehydrate it?

I had that exact problem. I had a pre-existing React + TypeScript + NodeJS app deployed to AWS with Serverless, serving the React bundle from a publicly accessible S3 bucket. Adding SEO header tags can’t be that complicated, right? Just add those tags and move on, right? Nope…

The first thing I did on my React SEO journey was to fill in the appropriate tags. I found React Helmet, which made this pretty trivial. I just needed to include its Helmet component in each page component with the appropriate tags (like below) and off I went.

<Helmet>
  <title>{t}</title>
  <meta name="description" content={description} />
  <meta property="og:title" content={t} />
  <meta property="og:type" content="website" />
  <meta property="fb:app_id" content="not-telling" />
  <meta property="og:description" content={description} />
  <meta name="image" property="og:image" content={imageUrl} />
  <meta property="og:image:url" content={imageUrl} />
  <meta property="og:image:secure_url" content={imageUrl} />
  <meta property="og:url" content={`https://allswealth.com/${urlPath}`} />
  <meta property="og:site_name" content="Allswealth" />
  <meta name="twitter:card" content="summary_large_image" />
  <meta name="twitter:image" content={imageUrl} />
  <meta name="twitter:image:alt" content={imageAltText} />
</Helmet>

That was easy, right? I published my changes to my server and… nothing changed… Only the home page was serving the correct SEO content. Note: I debugged this using the link validators for Facebook, Twitter, and LinkedIn to see whether the correct information for each link was shown after being crawled.

So why didn’t this work? It all stems from a core problem with SPAs: the server returns essentially the same near-empty HTML shell for every route, and the tags are only injected once the JavaScript bundle runs in the browser. Search engine crawlers (and social media bots, for that matter) generally don’t execute that JavaScript, so they never see the per-page tags.
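You can see this for yourself without the validators. Here’s a rough sketch of the kind of check I was doing by hand (assumes Node 18+ for the built-in fetch; the URL and bot user agent are just examples): fetch a page the way a crawler would and look for a tag that Helmet is supposed to inject.

const checkSeoTags = async (url: string): Promise<void> => {
  // Pretend to be Facebook's crawler.
  const res = await fetch(url, {
    headers: { 'User-Agent': 'facebookexternalhit/1.1' },
  });
  const html = await res.text();
  // A raw SPA shell contains the bundle's <script> tag but none of the
  // page-specific tags that Helmet only injects at runtime.
  console.log(/property="og:title"/.test(html) ? 'has og:title' : 'missing og:title');
};

checkSeoTags('https://allswealth.com/blog/Retirement/RRSPsvsTFSAs');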

What I thought was going to be a small dev task ended up ballooning pretty fast. I kept Googling and reading about this problem but saw that the common recommendation was to use server-side pre-rendering with client-side rehydration. I was not in the mood to rearchitect my deployment and content serving strategies for what seemed like such a small ask…

After collecting myself, I thought about it a bit more and came up with a set of requirements to solve my problem:

  1. Need to be able to generate a static version for each of my React pages
  2. Need to serve it to bots crawling my website while still serving the normal version to everyone else
  3. Need to be able to automatically add and update the static pages as I have a blog with dynamic content that is core to my SEO strategy

With that in mind, I went exploring and found Prerender.io. They offer a service that caches raw HTML versions of your pages, with an API to boot. It’s a paid service, but the free tier includes 250 cached URLs, which more than covers my current number of pages with plenty of room to grow.

I proceeded to add all of my URLs to Prerender.io and went back to my link validators to see if it worked. It did, but with a catch. I had to use Prerender.io’s cached version of the URLs: instead of using https://allswealth.com/blog/Retirement/RRSPsvsTFSAs, I had to point it to https://dashboard.prerender.io/raw-html?url=https://allswealth.com/blog/Retirement/RRSPsvsTFSAs&adaptiveType=desktop.

The first requirement is done. Next is to point those link validators at my actual URLs rather than the cached ones.

Since I deploy to AWS using Serverless, I already understood that I could use Lambdas with CloudFront to intercept and reroute requests (that’s how I serve my React app from an S3 bucket). I also found a GitHub repo called prerender-cloudfront, which has a YAML file describing two Lambda functions that do exactly what I want: intercept requests originating from web crawlers and redirect them to the Prerender.io cached URLs.

Before we move on to creating the Lambdas, note that Lambda@Edge functions must be created in the US East (N. Virginia) (us-east-1) region, so I had to put them there rather than in the same region as my other Lambdas.

The first Lambda, called SetPrerenderHeader, is defined below and is in charge of intercepting requests. It checks the user agent header of the request against a RegEx of known web crawlers, and it skips certain file types that you likely don’t care about prerendering. If there is a match, it adds a few extra headers to the request to mark it as one that should be sent to Prerender.io. (Swap the X-Prerender-Token value for your own token from Prerender.io if you attempt this yourself.)

exports.handler = (event, context, callback) => {
  const request = event.Records[0].cf.request;
  const headers = request.headers;
  const user_agent = headers['user-agent'];
  const host = headers['host'];
  if (user_agent && host) {
    // Does the user agent match a known search engine or social media crawler?
    var prerender = /googlebot|adsbot\-google|Feedfetcher\-Google|bingbot|yandex|baiduspider|Facebot|facebookexternalhit|twitterbot|rogerbot|linkedinbot|embedly|quora link preview|showyoubot|outbrain|pinterest|slackbot|vkShare|W3C_Validator/i.test(user_agent[0].value);
    // Also honor the legacy _escaped_fragment_ crawler convention.
    prerender = prerender || /_escaped_fragment_/.test(request.querystring);
    // Never prerender static assets, only actual pages.
    prerender = prerender && !/\.(js|css|xml|less|png|jpg|jpeg|gif|pdf|doc|txt|ico|rss|zip|mp3|rar|exe|wmv|doc|avi|ppt|mpg|mpeg|tif|wav|mov|psd|ai|xls|mp4|m4a|swf|dat|dmg|iso|flv|m4v|torrent|ttf|woff|svg|eot)$/i.test(request.uri);
    if (prerender) {
      // Tag the request so the origin-request Lambda knows to reroute it.
      headers['x-prerender-token'] = [{ key: 'X-Prerender-Token', value: 'my-precious' }];
      headers['x-prerender-host'] = [{ key: 'X-Prerender-Host', value: host[0].value }];
      headers['x-prerender-cachebuster'] = [{ key: 'X-Prerender-Cachebuster', value: Date.now().toString() }];
    }
  }
  callback(null, request);
};
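If you want to sanity-check this logic before wiring it into CloudFront, you can invoke the handler locally with a mock viewer-request event. A quick sketch (the file name is just whatever you saved the handler as):

const { handler } = require('./set-prerender-header');

// Minimal CloudFront viewer-request event for a bot hitting a blog page.
const mockEvent = {
  Records: [{
    cf: {
      request: {
        uri: '/blog/Retirement/RRSPsvsTFSAs',
        querystring: '',
        headers: {
          'user-agent': [{ key: 'User-Agent', value: 'facebookexternalhit/1.1' }],
          host: [{ key: 'Host', value: 'allswealth.com' }],
        },
      },
    },
  }],
};

handler(mockEvent, {}, (_err: unknown, request: any) => {
  // Expect x-prerender-token, x-prerender-host, and x-prerender-cachebuster here.
  console.log(request.headers);
});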

Now for the second Lambda, RedirectToPrerender, which does the actual redirection. It looks for a request with the proper headers and changes the origin to Prerender.io.

exports.handler = (event, context, callback) => {
  const request = event.Records[0].cf.request;
  // Only reroute requests that SetPrerenderHeader tagged as bot traffic.
  if (request.headers['x-prerender-token'] && request.headers['x-prerender-host']) {
    // Swap the origin from our S3-backed site to Prerender.io's cache.
    request.origin = {
      custom: {
        domainName: 'service.prerender.io',
        port: 443,
        protocol: 'https',
        readTimeout: 20,
        keepaliveTimeout: 5,
        customHeaders: {},
        sslProtocols: ['TLSv1', 'TLSv1.1'],
        // Prerender.io expects the URL-encoded scheme and host in the path;
        // CloudFront appends the original request URI after it.
        path: '/https%3A%2F%2F' + request.headers['x-prerender-host'][0].value
      }
    };
  }
  callback(null, request);
};
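To see what a bot will end up receiving, you can request the same URL the two Lambdas construct, straight from Prerender.io. A sketch (assumes Node 18+ fetch and your real token in place of my-precious):

const host = 'allswealth.com';
const uri = '/blog/Retirement/RRSPsvsTFSAs';

// Same shape as the rewritten origin: service.prerender.io + encoded host + path.
const target = `https://service.prerender.io/https%3A%2F%2F${host}${uri}`;

fetch(target, { headers: { 'X-Prerender-Token': 'my-precious' } })
  .then((res) => res.text())
  .then((html) => console.log(html.slice(0, 500))); // should be fully rendered HTML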

Lambdas don’t do much on their own, so let’s connect them to CloudFront. To do so, I had to create a new Distribution. The only settings I changed from the defaults were as follows:

1. Set Origin Domain to my main website URL (allswealth.com). It should show up in the dropdown.

2. Under Cache key and origin requests, add our expected X-Prerender-Token, X-Prerender-Host, and X-Prerender-Cachebuster headers to the cache key and adjust the TTLs.

3. Under Function associations, add a function type of Lambda@Edge to both Viewer request and Origin request, pasting the ARN of the SetPrerenderHeader Lambda for Viewer request and the RedirectToPrerender Lambda for Origin request (an ARN looks something like arn:aws:lambda:us-east-1:123456789101:function:Prerender-SetPrerenderHeader-0ZZZp7DPyoC3:1).
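If you’d rather express this in code than click through the console (I set mine up in the console, so treat this as an untested sketch), the same distribution could look roughly like this in AWS CDK v2. The stack name, construct IDs, TTL, and the second ARN are placeholders:

import * as cdk from 'aws-cdk-lib';
import * as cloudfront from 'aws-cdk-lib/aws-cloudfront';
import * as origins from 'aws-cdk-lib/aws-cloudfront-origins';
import * as lambda from 'aws-cdk-lib/aws-lambda';

const app = new cdk.App();
// Lambda@Edge functions (and so this stack) must live in us-east-1.
const stack = new cdk.Stack(app, 'PrerenderStack', { env: { region: 'us-east-1' } });

// Reference the published versions of the two edge Lambdas by ARN.
const setHeaderFn = lambda.Version.fromVersionArn(stack, 'SetPrerenderHeader',
  'arn:aws:lambda:us-east-1:123456789101:function:Prerender-SetPrerenderHeader-0ZZZp7DPyoC3:1');
const redirectFn = lambda.Version.fromVersionArn(stack, 'RedirectToPrerender',
  'arn:aws:lambda:us-east-1:123456789101:function:Prerender-RedirectToPrerender-placeholder:1');

// Step 2: a cache policy that adds our three X-Prerender headers to the cache key.
const cachePolicy = new cloudfront.CachePolicy(stack, 'PrerenderCachePolicy', {
  headerBehavior: cloudfront.CacheHeaderBehavior.allowList(
    'X-Prerender-Token', 'X-Prerender-Host', 'X-Prerender-Cachebuster'),
  defaultTtl: cdk.Duration.seconds(0),
});

new cloudfront.Distribution(stack, 'PrerenderDistribution', {
  defaultBehavior: {
    origin: new origins.HttpOrigin('allswealth.com'), // step 1: the main website URL
    cachePolicy,
    // Step 3: attach the Lambdas to their respective events.
    edgeLambdas: [
      { functionVersion: setHeaderFn, eventType: cloudfront.LambdaEdgeEventType.VIEWER_REQUEST },
      { functionVersion: redirectFn, eventType: cloudfront.LambdaEdgeEventType.ORIGIN_REQUEST },
    ],
  },
});

app.synth();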

Now let’s go back to the link validators and test whether it works.

It works! SEO is now fully up and operational!

Only one last requirement to meet: the ability to dynamically add more cached URLs as I create more blog content. Thankfully, as we saw earlier, Prerender.io has an API we can use, and one of its endpoints, recache, does just that. You give it your personal Prerender token and a URL, and it caches or recaches that URL for you.

I added the function below to my backend and call it whenever my code that creates or updates a blog post runs.

import axios from 'axios';

// ENV is my app's config module (import elided) holding the Prerender.io token.
export const recacheUrl = async (urlPath: string): Promise<void> => {
  const url = `https://allswealth.com/${urlPath}`;
  await axios({
    method: 'post',
    url: 'https://api.prerender.io/recache',
    headers: {
      'Content-type': 'application/json'
    },
    data: {
      prerenderToken: ENV.prerenderIoToken,
      url
    }
  });
};
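For example, a call site might look like this (the category and slug parameters are just illustrative of a blog schema like mine):

// Recache the page right after a post is created or updated.
const onBlogPostSaved = async (category: string, slug: string): Promise<void> => {
  await recacheUrl(`blog/${category}/${slug}`);
};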

That’s it! All of the requirements have been met without requiring a large redesign. This saved me a lot of time and I hope it can help others too.
