
Top Five Posts from 2016


Most of this blog’s 40k visitors a year come for the epic Elasticsearch posts I wrote years ago. For the most part they still seem to be relevant, even if they are somewhat outdated. Here are my top posts, with some commentary on each of them.


1: Elasticsearch: Five Things I was Doing Wrong

79% of my traffic comes from search engines, and almost 50% of all traffic goes to this one post. It’s actually kinda crazy that such a simple post gets so much of my traffic. I blame the clickbait headline. I have a bunch of long-winded epic posts, and what I should probably be writing is more of these small tidbits as they come up.

2: Three Principles for Multilingual Indexing in Elasticsearch

This is my all-time favorite post. Since 2.x removed the ability to specify an analyzer in a query, it has become a bit outdated, but the overall concepts are still good. I love all the comments this post has attracted; I’ve learned so much from the discussions it generated. We’ve accomplished a lot this past year in adjusting our multilingual indexing (we deployed edge ngrams into an A/B test yesterday), and I’m hoping to write up my latest thinking soon.
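For anyone adapting that post to 2.x, one workaround is to move the analyzer choice from query time to index time: give each language its own field with its own analyzer, then query the language-specific field. A minimal sketch (index, type, and field names here are illustrative, not our production mapping):

    # one analyzed field per language, chosen at index time
    curl -XPUT 'http://localhost:9200/posts' -d '{
      "mappings": {
        "post": {
          "properties": {
            "content_en": { "type": "string", "analyzer": "english" },
            "content_de": { "type": "string", "analyzer": "german" }
          }
        }
      }
    }'

    # query the language-specific field instead of passing an analyzer
    curl -XGET 'http://localhost:9200/posts/_search' -d '{
      "query": { "match": { "content_de": "suchmaschine" } }
    }'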

3 and 4: Scaling Elasticsearch Series

The first two parts of this three-part series are my third and fourth most popular posts. The indexing post is almost twice as popular as the intro and querying posts. Although these posts are almost three years old now, they still describe pretty well how we scale most of our queries. The main reason these posts haven’t been updated is that the methods they describe have worked really well for us.

The original post talks about having 600 million posts in the index and 23m queries a day. We now have 4.3 billion posts and do about 45m queries a day. That’s some good scaling.
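For readers who haven’t gone through the series, one technique that matters a lot at this scale is custom routing, so a search scoped to a single blog hits one shard instead of fanning out across the whole cluster. A minimal sketch, assuming a blog_id routing key (illustrative, not necessarily our exact setup):

    # index a post routed by blog, so all of a blog's posts land on one shard
    curl -XPUT 'http://localhost:9200/posts/post/1?routing=blog_42' -d '{
      "blog_id": "blog_42",
      "content": "scaling elasticsearch"
    }'

    # search with the same routing value to query only that shard
    curl -XGET 'http://localhost:9200/posts/_search?routing=blog_42' -d '{
      "query": { "match": { "content": "scaling" } }
    }'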

Only over the past year have we started to see some problems slowly develop with our global cluster scaling. Currently the cluster runs fine for about a month, and then heap usage creeps upward until it starts to cause problems. The solution is just to do a rolling restart of the cluster (a sketch of the procedure follows the figure below). Not pretty, but it works. Here’s what our average heap usage looks like, broken down by data center, for the past 30 days.

[Figure: average heap usage by data center over the past 30 days]
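The rolling restart itself follows the standard Elasticsearch procedure: disable shard allocation, restart a node, re-enable allocation, wait for green, and repeat. A sketch of the per-node steps against the 1.x/2.x cluster settings API:

    # 1. keep the cluster from reallocating shards while the node is down
    curl -XPUT 'http://localhost:9200/_cluster/settings' -d '{
      "transient": { "cluster.routing.allocation.enable": "none" }
    }'

    # 2. restart the Elasticsearch process on the node and wait for it to rejoin

    # 3. re-enable allocation, then wait until the cluster is green again
    curl -XPUT 'http://localhost:9200/_cluster/settings' -d '{
      "transient": { "cluster.routing.allocation.enable": "all" }
    }'
    curl 'http://localhost:9200/_cluster/health?wait_for_status=green'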

We think a lot of this comes down to memory management bugs in the old Elasticsearch version we have been running for years, and we are hopeful that many of them will be resolved as we transition to 2.x. The other option is just to add more servers, which we haven’t done in a few years. Our typical load is not very high until we reach the point of running out of heap, though, so I haven’t felt very justified in ordering more servers for this cluster yet.
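If you want to watch for the same heap creep on your own cluster, per-node heap usage is exposed through the nodes stats API (field name as in 1.x/2.x):

    # print each node's current heap usage percentage
    curl -s 'http://localhost:9200/_nodes/stats/jvm?pretty' | grep heap_used_percent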

One high point of this cluster is that it taught us how to run a multi-data-center cluster. Every cluster we deploy now spans multiple data centers, and we have successfully survived cases where an entire data center went down. Currently we are in three data centers spread across the US. It’s likely that in 2017 we will start trying to run intercontinental Elasticsearch clusters (Europe and the US). Should be exciting.
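The mechanism that makes a multi-data-center cluster survivable is shard allocation awareness: tag each node with its data center, and Elasticsearch spreads each shard’s copies across them. A hedged elasticsearch.yml sketch in the 1.x/2.x config style (the dc attribute name and values are illustrative):

    # on each node, tag which data center it lives in (attribute name is arbitrary)
    node.dc: us-east

    # spread each shard's primary and replicas across data centers
    cluster.routing.allocation.awareness.attributes: dc

    # optional forced awareness: never pile all copies into the surviving
    # data centers when one goes down
    cluster.routing.allocation.awareness.force.dc.values: us-east,us-west,us-central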

5: Managing Elasticsearch Cluster Restart Time

This post describes how we manage long restart times. 2.x is a bit faster in this regard, but it still takes a while to synchronize, so the post remains relevant to managing a production ES cluster.
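The usual knobs here are the gateway recovery settings, which keep a restarting cluster from rebalancing shards before enough nodes have rejoined, plus a recovery throttle. An elasticsearch.yml sketch with illustrative numbers for a ten-node cluster:

    # wait for most of the cluster before starting recovery...
    gateway.recover_after_nodes: 8
    gateway.recover_after_time: 5m
    # ...but begin immediately once every node is back
    gateway.expected_nodes: 10

    # cap how fast shard data is copied between nodes during recovery
    indices.recovery.max_bytes_per_sec: 100mb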

 

 


