We have a few issues with our blog.
Our template is broken. Comments are displayed outside the intended layout.
Our blog is generating a lot of page views and has become quite expensive (more below).
Our blog has become quite popular, receiving around 300k monthly
"visitors" and more than 1.5M "page loads".
This is increasing our expenses significantly, and we are evaluating various caching options:
Using a CDN like Fastly, Netlify, or Cloudflare
Using Varnish
Caching via Varnish could create a bottleneck for our blog and a single point of failure.
In the medium term we could also evaluate what we want to do with our current blog.
There is a Drupal static caching project called Tome that we could use together with Drupal comments from Pantheon.
We could migrate the blog to a static content generator and use a separate system for comments.
Trac: Summary changed from "Caching for the blog" to "Blog status and where to go". The description was updated to the version at the top of this ticket.
ah. i saw this ticket after replying in private by email, but i'll share that analysis here. ;)
TL;DR: I'd still go with Varnish, and ask about the next steps on that.
The single bottleneck issue for Varnish could be a problem, but we do
have multiple locations for our servers and would be able to provide
multiple redundant servers without too many problems if that becomes an
issue. I would certainly advocate for creating at least two
frontends to start with.
As we discussed last week, we already have a (~free) contract with
Fastly, so if we want to go the "CDN" way, it would be a good
option. They say they don't log/track their users, but I'm not sure it
would be a great move in terms of "publicity". I'm also not quite sure I
trust Fastly with doing the right thing here, ultimately, nor do I feel
that putting all our eggs in the same basket is safe. We
also run the chance of blowing our quota there eventually if we throw
everything in Fastly.
I would assume CF is out of the question, and I don't know enough about
Netlify to speak about it...
It would be useful to know a little more about what "page loads" means. The
300k "visitors" and 1.5M "pages" figures are similar to what we see in
the dashboard, but in terms of server resources, actual raw numbers
(megabits per second or total gigabytes, and "hits" per second, as
opposed to pages) would be more useful to evaluate our capacity. What's a
"page" for example? Is that one page load, with all extra resources like
CSS and images? While that's useful for them because it's their primary
driver (because it's Drupal fighting with PHP and the database to create
the page on the fly), for us at the caching layer, we don't care about
the type of content as much. :)
Finally, I looked at Tome briefly. There were various modules like this
in Drupal's history; the one I knew about before today is called "boost"
but it hasn't been ported to D8, it seems. Tome is interesting, as it does
allow the creation of a static site in front of Drupal, and we could
then share it on the mirror system, but then it still means we need to
deal with and pay Pantheon for the hosting, which still seems like an
expensive proposition for basically a glorified text editor. I'm not
sure how "just sending the comment links" would work in practice, but
maybe it can be done too.
Anyways, Tome would take time and effort to set up, and since we are
still considering our long-term options here, I wouldn't advise that
solution just yet. I would rather start working more concretely on how
to set up the Varnish frontends, provided we have confirmation on the
numbers. As a rough guesstimate, 1.5M "pages" is about 23Mbit/s on
average during the month, something we could probably absorb in the
existing infrastructure without too much trouble. But that's assuming
just the 5MB frontpage for every page; having better numbers would help
here tremendously.
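
The guesstimate above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic: it assumes a 30-day month, decimal megabytes (5MB = 5,000,000 bytes), and that every "page" costs as much as the 5MB frontpage. The function name is made up for illustration.

```python
SECONDS_PER_MONTH = 30 * 24 * 60 * 60  # assumption: 30-day month


def average_mbit_per_s(pages_per_month: float, bytes_per_page: float) -> float:
    """Average bandwidth in Mbit/s needed to serve a month of page loads."""
    total_bits = pages_per_month * bytes_per_page * 8
    return total_bits / SECONDS_PER_MONTH / 1e6


# 1.5M "pages" per month, each assumed to weigh as much as the 5MB frontpage.
estimate = average_mbit_per_s(1.5e6, 5e6)
print(f"{estimate:.1f} Mbit/s")
```

This is an average over the whole month. Peak traffic would be higher, which is one more reason to ask for raw hits and bandwidth figures.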
We're taking about 88% of the traffic out of the blog, which should drastically reduce the costs. An 88% reduction should bump us from the peak 435000 visits (the 300k visits per month package, 1000
We're still steady at around an 87-89% hit ratio. We had a small outage on one of the servers on Friday (#32603) but, thanks to their redundant nature, that probably went unnoticed. We are down to 450$/mth in the billing, and the caches haven't been online for a full month yet, so that's likely to go down a little further.
We're currently at 1M pages served for November, according to Pantheon (October = 3M), and 147k visits (October = 435k).
Here's a summary of our status with the blog, a month after the cache went online. Two main problems were identified with the blog:
Broken templates and long-term web development goals
Cost overrun issues
TL;DR:
fix templates in-house or with Giant Rabbit, switch to a static site generator (Lektor?) and an external commenting system (Discourse?) in the mid-to-long term
cost overruns back under control (~500$?), but incomprehensible billing makes this uncertain, need to double-check
= Broken templates and web development
Ever since some change happened on the blog (an upgrade?), HTML templates have been broken, which is particularly visible in the comments section. Comments are not formatted properly and we want that fixed. We considered various providers to outsource this consulting to and, coincidentally, considered moving our hosting elsewhere. We had a quote from Koumbit.org which was privately discussed.
For now, we will try to fix the blog where it is, maybe with the help of an existing Drupal provider (Giant Rabbit) instead of starting a new business relationship. Something that we should consider is that fixing the template might be expensive. Hiro is willing to make another attempt at adapting our styleguide to an updated Bootstrap template.
In the long term, we want to move away from Drupal, towards a static site generator for the content and something like Discourse for the comments in the backend. The latter could be reused for other projects inside Tor, particularly the support and community teams, among others. It was also considered as an option for easier user onboarding for bug reporting when compared to GitLab. The static site generator could be one we are already using, like Lektor. This still has to be discussed further. We might achieve the same level of WYSIWYG with a static site generator, without the time and financial investment of running a giant framework like Drupal.
= Cost issues
The other problem identified in October was the cost overrun. Around August or September, we passed the 300k visits per month mark, which bumped us into another price range with Pantheon (~1k$/mth). Their pricing plan seems to go as follows, in terms of visits/month vs cost/month:
small, 25k: 175$
medium, 50k: 300$
large, 150k: 600$
extra-large, 300k: 1000$
(I'm ignoring the "basic" 50$/mth package because I'm working under the assumption that it's not accessible to us, because this is a high traffic site.)
Before the traffic bumps happened, we were billed at a 500$/mth rate. We were bumped from the "large" to the "extra-large" package first on September 27th, then again on October 29th. Their billing system is ... a bit opaque to me, but it seems we are now billed 500$/mth again. I honestly can't figure out what is going on with the billing at this point. I would love it if Jon or someone else could go over those invoices and figure it out.
But my theory right now is the caching system did its job and brought us back to a "pre-crisis" level of billing, that is, the "large" billing package. Indeed, that is what the "billing" section of the Pantheon dashboard says. There's also this message in the "Workflow" section:
Changed site plan to "plan-performance_large-preferred-monthly-1"
[matt's email address at Pantheon]
Finished 40 minutes ago
So maybe we got someone at Pantheon to intervene for us?
We can clearly see a drop in the traffic on the backend in the Pantheon stats:
October: 435k visits, 3.1M pages served
November: 165k visits, 1M pages served
That's a 62% drop in visits and a 68% drop in pages served. It could still get slightly better in December, as our hit ratio is actually better than this, at 88%.
The reason those ratios don't correspond exactly to each other is that we have different ways to count those statistics. Pantheon uses "visits" and "pages", we use "hits". The distinction is that a "visitor" can hit multiple "pages" in one "visit", and a page is made of multiple "hits". So while we may keep many hits from going to the backend, we may not keep as many "pages" as we want from going there. I suspect it would be very hard to remove the other 115k visits per month to get down to the medium package, and I have not made more efforts to do so.
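
For anyone who wants to double-check those figures, here is a small Python sketch that recomputes the drops from the Pantheon numbers quoted above, along with the distance to the 50k visits listed for the "medium" package. The variable and function names are made up for illustration, and the figures are the rounded ones from the dashboard.

```python
def percent_drop(before: float, after: float) -> float:
    """Relative decrease from `before` to `after`, as a percentage."""
    return (before - after) / before * 100


# Pantheon dashboard figures quoted above (October vs November).
visits_oct, visits_nov = 435_000, 165_000
pages_oct, pages_nov = 3_100_000, 1_000_000

visits_drop = percent_drop(visits_oct, visits_nov)
pages_drop = percent_drop(pages_oct, pages_nov)

# Visits still to shed before reaching the 50k listed for "medium".
MEDIUM_VISITS = 50_000
gap_to_medium = visits_nov - MEDIUM_VISITS

print(f"visits: -{visits_drop:.1f}%, pages: -{pages_drop:.1f}%")
print(f"visits above the medium package: {gap_to_medium}")
```

Both drops come out lower than the 88% hit ratio, which is what we would expect given that "visits", "pages" and "hits" count different things.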
Also, as far as I can tell, this traffic hitting our own TPA infrastructure is not affecting us in any significant way, neither in terms of cost (traffic is not large enough to change billing significantly) nor performance (load is not big enough to affect the servers' overall performance).
So I consider the "cost" crisis to be over, but there might be more tricks we could do to bring the hit ratio up. At this point, I consider the cost tradeoff to not be worth it, however, as long as Pantheon doesn't bump us back to the "extra large" cost grid.