WordPress in the Czech Republic - complex research

✍️ Vláďa Smitka
📅 11. 05. 2015

Let me first introduce our company. We are a small Czech agency specializing in the small business market. We provide a full package of services for our customers: network design and configuration, server installation and administration, development of information systems and websites, and online marketing (SEO, PPC, data analytics). We are also a certified Google Partner. You can meet us as speakers at many conferences in the Czech Republic: WordPress community conferences, WordCamps, Barcamps, marketing workshops, etc. I’m very interested in network security, but much of my work revolves around WordPress development, security and performance.

I’m sorry this is currently the only article available in English. If you have any questions about online business in the Czech Republic, feel free to contact us.

Automated tools analyzed almost 4 GB of source code. The analysis was performed in the first week of April 2015, so the data reflects the state at that time.

The first task of the crawler was to determine the WP version. It also collected information about themes and plugins that can be detected from the HTML source of the homepage, usage of Google Analytics, Facebook components (Like buttons, Fan walls, etc.) and some other details. All sites were checked for backlinks via Majestic SEO and for mentions on social networks (homepage only). I also tried to determine the webhosting provider of each site from its IP address.

I used 3 main methods to identify WP version:

  1. From “generator” meta in the source code
  2. From “/readme.html” file
  3. From RSS feed “/feed”
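
A minimal sketch of these three checks, assuming the homepage, readme.html and feed have already been downloaded as text. The regular expressions and the function name detect_wp_version are illustrative assumptions based on typical WordPress markup; this is not the crawler actually used in the research.

```python
import re

# Typical markup: <meta name="generator" content="WordPress 4.1.1" />
# (assumes the name attribute comes before content, as WordPress emits it)
META_GENERATOR = re.compile(
    r'<meta[^>]+name=["\']generator["\'][^>]+content=["\']WordPress\s+(\d+(?:\.\d+)+)',
    re.IGNORECASE,
)
# Typical readme.html heading: "<br /> Version 4.1"
README_VERSION = re.compile(r'Version\s+(\d+(?:\.\d+)+)', re.IGNORECASE)
# Typical feed element: <generator>http://wordpress.org/?v=4.1.1</generator>
FEED_GENERATOR = re.compile(
    r'<generator>\s*https?://wordpress\.org/\?v=(\d+(?:\.\d+)+)\s*</generator>',
    re.IGNORECASE,
)


def detect_wp_version(homepage_html="", readme_html="", feed_xml=""):
    """Return the first version found by the three methods, or None."""
    checks = (
        (META_GENERATOR, homepage_html),
        (README_VERSION, readme_html),
        (FEED_GENERATOR, feed_xml),
    )
    for pattern, text in checks:
        match = pattern.search(text)
        if match:
            return match.group(1)
    return None
```

Each argument is optional, so the function can be called again as each file is fetched, stopping at the first method that succeeds.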

If these methods failed, I tried to identify the WP version from MD5 hashes of some core static files. I also took advantage of the fact that some plugins append the WP version as a parameter to their static resources (?ver=xy). If the same value appeared in at least 60 % of the resources carrying this parameter, I took it as the WordPress version. Despite these methods, I wasn’t able to determine the exact WP version in about 4,000 cases. The reason is probably the use of security plugins, which hide the WP version.
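
The ?ver= heuristic can be sketched roughly as follows. The 60 % threshold comes from the text; the regular expression and the function name guess_version_from_ver_params are assumptions made for illustration.

```python
import re
from collections import Counter

# Matches ?ver=4.1.1, &ver=4.1.1 and the HTML-escaped forms &amp;ver= / &#038;ver=
VER_PARAM = re.compile(r'[?&;]ver=(\d+(?:\.\d+)*)')


def guess_version_from_ver_params(html, threshold=0.6):
    """Return the dominant ?ver= value if it covers at least `threshold`
    of all resources that carry the parameter, otherwise None."""
    versions = VER_PARAM.findall(html)
    if not versions:
        return None
    value, count = Counter(versions).most_common(1)[0]
    return value if count / len(versions) >= threshold else None
```

Requiring a clear majority keeps a single plugin that ships its own version number from being mistaken for the WordPress version.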

Plugins and themes were detected from links to their CSS and JS files in the wp-content folder (e.g. /wp-content/themes/twentytwelve/style.css = Twenty Twelve theme).
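
A minimal sketch of this kind of detection, assuming the folder name under wp-content/themes or wp-content/plugins identifies the theme or plugin. The pattern and the function name are illustrative, not the code used for the research.

```python
import re

# Captures the kind ("themes" or "plugins") and the folder name that follows it
ASSET_PATH = re.compile(r'/wp-content/(themes|plugins)/([A-Za-z0-9._-]+)/')


def detect_themes_and_plugins(html):
    """Collect theme and plugin folder names referenced in the HTML source."""
    found = {"themes": set(), "plugins": set()}
    for kind, slug in ASSET_PATH.findall(html):
        found[kind].add(slug)
    return found
```

Plugins that load no CSS or JS on the homepage stay invisible to this approach, which is why security plugins needed separate tests.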

Now we have come to the research results. Numbers are slightly rounded for better readability.

WP versions

The main point of the research was to determine which WP versions are actually used.

WordPress versions in the Czech Republic

The good news is that, thanks to easy updates, most sites use the most recent version available at the time of the research, version 4.1. Some sites (8) experiment with the beta of version 4.2 (which had become stable by the time of writing this article, with patch 4.2.1 already released).

On the other hand there are 7 sites which use version 1.5 from 2005, so they haven’t been updated for more than 10 years.

Archaic WP version 1.5

WordPress 2.x is also very old: its last branch, 2.9, dates from 2009. Almost 2,500 sites still use a 2.x version, and I consider them very risky.

If we define versions 4.x as “Updated”, version 3.9 as “Slightly outdated”, the rest of 3.x as “Outdated” and lower versions as “Archaic”, we can see this distribution:

WP versions
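
The four groups defined above can be expressed as a small function. This is only a sketch of the classification rule from the text; the function name is an assumption.

```python
import re


def classify_wp_version(version):
    """Map a version string such as "3.9.2" to one of the four groups."""
    match = re.match(r"(\d+)(?:\.(\d+))?", version)
    if not match:
        raise ValueError(f"not a version string: {version!r}")
    major = int(match.group(1))
    minor = int(match.group(2) or 0)
    if major >= 4:
        return "Updated"
    if (major, minor) == (3, 9):
        return "Slightly outdated"
    if major == 3:
        return "Outdated"
    return "Archaic"
```

Only the first two components matter, so patch releases and beta suffixes fall into the same group as their branch.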

The number of outdated 3.x installations presents a serious security risk. It is more serious than the archaic versions, because many more vulnerable plugins were available for version 3.

A closer look at the still actively maintained minor (patch) versions may also be interesting:

Minor WP versions

It is obvious that autoupdates work well – most of the minor versions are the most recent.

Themes

After a short excursion to the WordPress core, we will focus on the most used Themes.

Many web developers prepare their own themes, either child themes or themes made from scratch. The enormous diversity is therefore not surprising: I found more than 23,000 different themes. Despite this, there are many popular themes used on hundreds of sites.

Sites using the theme  Different themes (rounded)
1                      16000
2                      3600
3                      1300
4                      600
5-10                   1100
11-99                  580
100+                   28

As you can see, most themes are unique (used on only one site).

On the other hand, there are 28 very popular themes used on more than 100 sites each; together they are deployed on 20 % of the explored sites.

Rank  Theme               Number of sites  Category
1     Twenty Ten          1636             default
2     Twenty Twelve       1520             default
3     Twenty Eleven       1477             default
4     Default             844              default
5     Twenty Fourteen     671              default
6     Twenty Thirteen     627              default
7     Avada               395              premium
8     Graphene            382              free
9     Responsive          364              free
10    Enfold              264              premium
11    webadresy           243              private
12    Vantage             243              free
13    Twenty Fifteen      236              default
14    Divi                200              premium
15    mioweb              192              private
16    Hueman              184              free
17    Mystique            174              free
18    myDyTheme2          170              private
19    11-modra-facebook   167              private
20    Customizr           157              free
21    Suffusion           156              free
22    Mantra              145              free
23    adbees              141              private
24    The7                137              premium
25    OptimizePress       132              premium
26    00-ocean            121              private
27    Pinboard            110              free
28    Tempera             104              free

11 % of WordPress sites use default themes.

Plugins

We have already checked core and visual appearance, so the next step is to add some extra functionality and security holes through plugins.

Lots of themes (mainly the premium ones) also bundle plugins. Problems arise when somebody modifies the original (premium) theme and thereby “breaks” the possibility of updating it. If a bundled plugin is vulnerable, it will never be fixed. This is why you should always use a child theme or your own plugin for customization.

I found 160,000 plugins in total (6,500 different kinds) and made a chart of the 50 most used WordPress plugins in the Czech Republic. These 50 different plugins represent almost 50 % of all detected plugins.

Rank  Plugin                                  Count  Category
1     All in One SEO Pack                     18211  SEO
2     Contact Form 7                          16283  Forms
3     Nextgen Gallery                         10552  Gallery
4     Yet Another Related Posts Plugin        2645   Related
5     Slider Revolution                       2512   Slider
6     WPML                                    2448   Localization
7     Google Analytics by Yoast               2378   Analytics
8     WP-PageNavi                             2234   Extend functions
9     Jetpack                                 1796   Multipurpose
10    Google Analyticator                     1756   Analytics
11    WordPress SEO by Yoast                  1665   SEO
12    WP-Polls                                1581   Polls
13    WooCommerce                             1520   Ecommerce
14    qTranslate                              1469   Localization
15    Lightbox Plus Colorbox                  1274   Lightbox
16    Easy FancyBox                           1258   Lightbox
17    WP Super Cache                          1213   Cache
18    W3 Total Cache                          1180   Cache
19    Captcha                                 1060   Extend functions
20    LayerSlider                             1056   Slider
21    Simple Lightbox                         910    Lightbox
22    Visual Composer                         900    Page Builder
23    MailPoet Newsletters                    862    Mailing
24    Responsive Lightbox by dFactory         845    Lightbox
25    Lightbox 2 *                            837    Lightbox
26    Contact Form                            836    Forms
27    Fancybox for WordPress                  825    Lightbox
28    WP jQuery Lightbox                      819    Lightbox
29    WP-Table Reloaded *                     772    Tables
30    Čestina pro WordPress *                 763    Czech translation
31    Meta Slider                             714    Slider
32    Contact form * (new version: cformsII)  701    Forms
33    TablePress                              685    Tables
34    Sociable *                              681    Social networks
35    WP Lightbox 2                           625    Lightbox
36    jQuery Colorbox *                       621    Lightbox
37    Photo Gallery                           573    Gallery
38    WP-PostRatings                          529    Extend functions
39    Gallery                                 516    Gallery
40    bbPress                                 484    Forum
41    WP Google Maps                          443    Maps
42    Events Manager                          442    Events
43    Page Builder by SiteOrigin              390    Page Builder
44    Facebook Like Button by BestWebSoft     387    Social networks
45    Sidebar Login                           383    Extend functions
46    YouTube                                 367    Videos
47    MapPress Easy Google Maps               367    Maps
48    NextCellent Gallery - NextGEN Legacy    365    Gallery
49    Polylang                                351    Localization
50    MailChimp for WordPress                 327    Mailing

* obsolete plugins

You may notice the dominance of the first 3 plugins:

All in One SEO Pack

This plugin enhances some basic on-page SEO factors. Its main purpose is to canonicalize links, set up titles and meta tags (including those for social networks) globally and per post, disable indexing of archives and generate sitemaps. It also connects the site to Google Analytics and verifies site ownership for some other tools (e.g. Google Webmaster Tools). It allows editing of robots.txt, but be careful here: this plugin (still?) disallows access to the wp-includes folder, which is something Google doesn’t like.

I prefer WordPress SEO by Yoast instead.

Contact Form 7

CF7 lets you create contact forms. You can use simple codes to design your own contact form with an unlimited number of fields. Submissions are sent to the email addresses you choose. I often use two extensions: Contact Form 7 Modules (hidden fields) and Contact Form 7 Honeypot (simple but efficient antispam). The main advantage of this plugin for me is its “extension readiness”. It is pretty simple to link it to other systems, e.g. a CRM.

NextGen Gallery

This plugin was virtually the only reasonable gallery solution in older WP versions. In modern WP versions the integrated gallery is very usable. But if you need a more complex gallery solution, this plugin can be a good choice. There are lots of extensions for it, too.

The popularity of this plugin is probably related to the number of older WP versions.

WordPress is often used for simple link-building SEO (I don’t want to use the word “blackhat”) and lead generation pages, so the reason for the popularity of the first two plugins is obvious. I found a network of 1,600 sites made for this purpose and owned by a single company.

I divided the detected plugins into categories to determine the main reasons why users install plugins. The distribution also shows which features users miss in the default installation.

Plugins by type

Users look for advanced contact forms frequently.

There is also no lightbox for displaying pictures in the default configuration. We can see the Fancybox for WordPress plugin in 27th place. A serious vulnerability was found in this plugin recently, and I detected an unpatched version on almost 400 sites!

Almost 50 % of websites using the Fancybox for WordPress plugin are vulnerable.

Users love image sliders, but I hate them. The most popular slider plugins are Slider Revolution and LayerSlider (both premium). A very serious vulnerability was found in the former last year, and thousands of sites were infected. I think the main reasons are its bundling with various premium themes, which lose their update capability once the original theme is edited, and frequent illegal use of this plugin…

The Slider Revolution plugin was detected on more than 2,500 sites. Almost 600 of these sites use a vulnerable version and allow an attacker to get full control over the website.

More than 20 % of sites using Slider Revolution run a version containing a serious vulnerability.

Many users want to translate their site into other languages. WPML is the second most popular premium plugin (it also had security problems recently). There are other popular localization plugins: qTranslate (I don’t like the way it stores translations using comment blocks) and Polylang, which is strong competition for WPML. There is also a promising newcomer among localization plugins: Babble by Automattic.

Another common task is to add Google Analytics tracking code. Lots of modern themes have an option to add these codes, but special plugins are still very popular.

Plugins which recommend related posts are also installed often. These plugins are usually resource-intensive, so it is no surprise that caching plugins are also very popular.

The number of installations of the 2 main caching plugins (WP Super Cache and W3 Total Cache) is comparable. WP Super Cache is my favorite due to its simplicity: it does only page caching, but it does it perfectly. For further optimization I always use an appropriate Object Cache drop-in to reduce DB queries caused by transients, and Autoptimize to minify and combine CSS and JS.

An Object Cache backend is a simple way to enhance performance using an object caching technology available on the server (e.g. APC, XCache, APCu, Memcached, Redis). The administrator needs to select an appropriate drop-in by hand, so I found only 670 sites using one.

Only 1 % of sites use an Object Cache backend.

Some users want to extend basic features with numbered pagination or post rating. Users also want to embed Google Maps, social sharing buttons and videos. Some miss the ability to create tables in the default WP editor (although TinyMCE allows it). We can see quite a large number of outdated installations here: more sites use WP-Table Reloaded than its successor TablePress.

Some users also appreciate the ability to create their own layouts without HTML coding. This is possible thanks to various Page Builders. These plugins are often bundled with premium themes to gain a commercial advantage.

Many sites are built for commercial purposes, so we can find various eCommerce plugins and plugins for sending newsletters.

Security Plugins

Security plugins are a chapter of their own. It is not possible to detect these plugins in the HTML source. Fortunately there are only a few common plugins of this kind, so I wrote tests tailored to them. I detected a security plugin on 6 % of sites.

Security plugins for WP

The most popular security plugin is iThemes Security. This plugin can block access to files which contain sensitive information (like readme.html). It also blocks readme.txt files in plugin folders. I used these files to get more information about sites, so I also checked whether this feature was enabled. It was enabled on 20 % of sites using iThemes Security.

Rank Security plugin Count
1 iThemes Security 1900
2 WordFence 1340
3 All in One WP Security & Firewall 530
4 BulletProof Security 70

Almost all security plugins allow hiding the WP version, so I suppose they caused lots of failures in determining WP version.

Where are they hosted?

I tried to identify the webhoster on the basis of the IP address. I used whois to get the owner of the IP subnet (address range). This method is not 100 % accurate: some bigger webhosters use several subnets registered under different names.
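
Once subnet owners are known (from whois in this research), assigning a site to a webhoster is a lookup of its IP address in a table of address ranges. The sketch below shows only that lookup step; the table contents are hypothetical placeholders using documentation address ranges, and the names are assumptions.

```python
import ipaddress

# Hypothetical subnet-to-owner table; in practice it would be built from whois data
SUBNET_OWNERS = {
    "192.0.2.0/24": "Example Hosting A",
    "198.51.100.0/24": "Example Hosting B",
}


def find_hoster(ip, subnet_owners=SUBNET_OWNERS):
    """Return the owner of the most specific subnet containing `ip`, or None."""
    address = ipaddress.ip_address(ip)
    best = None
    for cidr, owner in subnet_owners.items():
        network = ipaddress.ip_network(cidr)
        if address in network:
            # Prefer the longest prefix when ranges overlap
            if best is None or network.prefixlen > best[0].prefixlen:
                best = (network, owner)
    return best[1] if best else None
```

Several ranges can map to the same owner name, which is one way to merge the differently named subnets of a bigger webhoster.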

I wasn’t able to find all existing Czech WordPress sites, so real numbers can be different.

Chart of the Czech webhosting companies by number of hosted WordPress sites:

Rank  Webhoster      Number of sites
1     Wedos          13970
2     Savana         3590
3     Active24       2940
4     Český Hosting  2270
5     Stable.cz      2110
6     Forpsi         2090
7     Gransy         2000
8     Gigaserver     1480
9     Web4U          1120
10    Hosting90      980
11    cz-hosting     900
12    Ignum          760
13    Tele3          730
14    Pípni          700
15    Angelhosting   680
16    Zoner          580

Wedos is apparently the clear leader. This is probably due to the low cost of its services and strong marketing. The company has also sponsored several WordPress conferences, so its name is connected to this CMS.

There are quite a lot of datacenters in the Czech Republic. It is hard to determine the relationship between an IP address and an exact location, but I tried. You can see approximate numbers in the following table.

Rank  Datacenter                  Number of sites
1     Wedos                       14000
2     Master Internet (4D)        9200
3     Casablanca                  8700
4     VSHosting (TTC/ServerPark)  6700
5     SuperNetwork (TTC)          4700
6     Active24 (Tower)            2800
7     CoolHousing                 2200
8     Forpsi (CZ1)                2100
9     DialTelecom (Nagano)        1600
10    Coprosys (Nagano)           770

HTTP servers

I also tried to detect the HTTP servers used. Most webhosters use Apache exclusively because users expect it. Apache serves 51,000 sites. My personal favorite HTTP server is Nginx because of its performance and straightforward configuration. Nginx serves 11,500 WP sites in the Czech Republic. There aren’t many other players in this field.

HTTP server  Number of sites
Apache       51000
Nginx        11500
IIS          1100
OpenResty    200
lighttpd     50
LiteSpeed    40

I had never heard of OpenResty before; it is an extended Nginx.

I also tried to detect the exact version of HTTP server from headers, but this information is often hidden.
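
Server software usually identifies itself in the Server response header in a form such as "Apache/2.2.22 (Debian)", and PHP in X-Powered-By in a form such as "PHP/5.3.29". A minimal sketch of parsing such values follows; the function name and the pattern are assumptions, and a missing version simply means the server did not disclose it.

```python
import re

# Product token, optionally followed by "/<version>"
PRODUCT = re.compile(r'^\s*([A-Za-z][A-Za-z0-9_-]*)(?:/(\d+(?:\.\d+)*))?')


def parse_product_header(value):
    """Split a header value like "Apache/2.2.22 (Debian)" into
    (lowercased product name, version or None)."""
    match = PRODUCT.match(value or "")
    if not match:
        return None, None
    return match.group(1).lower(), match.group(2)
```

The same function works for both headers, because both use the product/version token format.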

More than 30,000 Apaches didn’t disclose their version, 18,000 use 2.2 and 1,400 use 2.4.

In the case of Nginx the situation is similar:

4,500 didn’t disclose their version, 3,500 use 1.2.1, 1,600 use 1.7.1 and 1,400 use 1.6.2 or 1.6.3.

PHP versions

Many HTTP servers hide their exact version. It is no surprise that they hide PHP versions, too: 34,000 sites didn’t disclose their PHP version.

The PHP version affects performance; newer versions are noticeably faster.

Disclosed versions:

PHP version  Number of sites
PHP/4.3      20
PHP/4.4      60
PHP/5.0      2
PHP/5.1      50
PHP/5.2      7000
PHP/5.3      14300
PHP/5.4      7500
PHP/5.5      2800
PHP/5.6      300

Several sites also experiment with HHVM.

Web charts

Everybody loves “top charts”, so I prepared a few. Keep in mind that values other than “Trust flow” may be artificially inflated by massive purchases of backlinks and fans.

Top 10 sites by Trust flow (Majestic SEO):

www.radegast.cz
www.pamatnik-terezin.cz
www.mediatel.cz
www.cscope.cz
www.corro.cz
www.mirc.cz
www.ancr.cz
www.neternity.cz
www.bonipueri.cz
www.zdravaprsa.cz

Top 10 sites by Citation flow (Majestic SEO):

www.autoskola-praha-ridicak.cz
web.etronic.cz
www.internetprofi.cz
www.hostivarskaprehrada.cz
new.rampusak-stity.cz
www.czech-production.cz
www.profilamas.cz
sd.kralovstilvi.cz
sstepanhon.cz
www.mediatel.cz

Top 10 sites by number of external backlinks (Majestic SEO):

www.geosense.cz
www.profilamas.cz
www.radegast.cz
www.neternity.cz
www.sperky-sw.cz
www.ftonline.cz
www.drosera.cz
www.a2b.cz
www.nsko.cz
www.internetprofi.cz

Top 10 sites by Facebook likes and shares:

www.artex-pokladny.cz
www.hubnutihrou.cz
www.milionaremdoroka.cz
www.revolucnimarketing.cz
www.darujvajicko.cz
www.elitevideoacademy.cz
www.akademieretoriky.cz
www.komunikacikuspechu.cz
www.moje-sebeduvera.cz
www.pragulic.cz

In this list there are many suspicious “infoproduct” sites. I’m not convinced their high ranks came about organically.

Top 10 sites by tweets:

www.luciesvarcova.cz
www.test2014.cz
www.fotoseminar.cz
www.neurra.cz
www.stvanci.cz
www.hubnutihrou.cz
www.companyconsults.cz
www.ceskycmelak.cz
www.cafedu.cz
www.mocslov.cz
 

Top 10 sites by LinkedIn mentions:

www.tqtest.cz
www.hubnutihrou.cz
www.taichiresort.cz
www.superprijem.cz
blog.emailkampane.cz
www.laserfoto.cz
www.kompetenz.cz
www.navykybohatych.cz
www.mediatel.cz
www.inside.cz

Top 10 sites by +1 on GooglePlus:

www.xindlx.cz
www.doperin.cz
www.antelli.cz
www.rodinne-konstelace.cz
www.studiocamo.cz
www.artex-pokladny.cz
www.test2014.cz
www.oezentrum.cz
www.neurra.cz
www.probuzenyslon.cz

Interesting data

I found some interesting data during the analysis, so let’s take a look.

Size of HTML

I monitored the size of HTML response (raw HTML – without JS, CSS, images).

HTML code size

50 % of sites contain less than 28 kB of HTML
80 % of sites contain less than 45 kB of HTML
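
These two figures are percentiles of the per-site HTML size. A minimal sketch of a nearest-rank percentile is shown below; the function name is an assumption, and any sample values passed to it are up to the reader, since the raw research data is not included here.

```python
import math


def percentile(values, p):
    """Nearest-rank percentile: the smallest value such that at least
    p percent of the values are less than or equal to it."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0 < p <= 100:
        raise ValueError("p must be in the range (0, 100]")
    ordered = sorted(values)
    rank = math.ceil(p * len(ordered) / 100)
    return ordered[rank - 1]
```

Calling percentile(sizes_kb, 50) on a list of HTML sizes gives the median, and percentile(sizes_kb, 80) the value below which 80 % of sites fall.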

I found almost 80 sites with more than 0.5 MB of raw HTML code, which makes them virtually unusable.

Google Analytics

I also checked if sites use Google Analytics. Currently there are 3 types of tracking code:

  1. Old tracking code (ga.js) – obsolete
  2. Universal Analytics – the new one
  3. Google Tag Manager – whole system for codes management, it uses Universal Analytics
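
The three variants can be told apart by the script each one loads: ga.js for the old code, analytics.js for Universal Analytics, and gtm.js (or the ns.html noscript iframe) for Google Tag Manager. A minimal detection sketch follows, assuming a plain substring search over the homepage HTML is good enough; the names are illustrative.

```python
# Script URLs loaded by each tracking variant
TRACKING_SIGNATURES = {
    "classic": ("google-analytics.com/ga.js",),
    "universal": ("google-analytics.com/analytics.js",),
    "gtm": ("googletagmanager.com/gtm.js", "googletagmanager.com/ns.html"),
}


def detect_tracking(html):
    """Return the set of tracking variants whose script URL appears in the HTML."""
    return {
        label
        for label, needles in TRACKING_SIGNATURES.items()
        if any(needle in html for needle in needles)
    }
```

A site that loads Universal Analytics only through Tag Manager shows up as "gtm" alone, because analytics.js is then injected at runtime and is absent from the static HTML.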

Google Analytics usage

More than half of the websites don’t use Google Analytics.
27 % of sites use the old GA code.
GTM is used on only 1 % of sites (about 650 sites in total).

Remarketing isn’t related to Google Analytics, but I include it in this report as well. Remarketing codes appear mostly on commercially focused sites.

2 % of sites use remarketing (1,200 sites).

The situation is similar for AdSense:

Almost 9 % of sites use Google AdSense.

3,800 sites use the synchronous variant of the ad code and 1,800 sites the asynchronous one.

Facebook

The influence of Social networks is increasing, so I also tested if sites use Facebook components, e.g. Like and Share buttons or Fan walls.

Facebook plugins usage

38 % of sites use Facebook Social plugins.

HTTPS

Security is a very important aspect of every website, so I was interested in usage of HTTPS with valid certificates.

HTTPS usage

Only 0.24 % of sites use HTTPS with a valid certificate.

What does Google PageSpeed say?

All sites were examined by Google PageSpeed Insights. I obtained data on counts of static resources (CSS, JS, Images) and their sizes.

Several factors count toward the total PageSpeed score: optimization of static content (code minification, possibility of lossless image compression), proper caching, transfer compression, useless redirects and server response time. The most common problems were the size of images and missing caching headers.

Google PageSpeed scores

Half of the sites achieved 75+ points in Google PageSpeed.

The data shows that most sites reach pretty good scores. I examined some sites at both ends of the range to find out why they scored so high or so low. The main reason for low scores (360 sites got 0 points) was huge unoptimized images: Google PageSpeed found that it was possible to save a few megabytes without loss of quality. There were some extreme cases: 2 sites had more than 70 MB of images on their homepages! The worst rated sites naturally had many more problems. High-scoring sites were mostly older and simple ones, where there is virtually no chance to screw something up. Modern professional sites usually got scores around 90.

Almost half of all tested sites contain more than 1 MB of images on their homepage.

Size of images

Before you upload an image to your website, edit it in a graphics editor: lower the resolution and the JPG quality setting. An image larger than the physical size of a screen won’t be appreciated by your visitors at all.

Tip: Personally I recommend IrfanView for quick image editing. Its plugin pack includes RIOT, a good tool for web image optimization (a newer version is also available as a standalone download and contains more tools for PNG).

In addition to the HTML code size measured earlier, we can also calculate the total page size including images, scripts and style sheets. The total size of all homepages was almost 120 GB. Let’s take a look at the homepage size distribution:

Size of homepage

Almost half of tested sites have a homepage size under 400 kB.

Per resource type distribution is also interesting:

Size of static resources by type

The fact that the biggest part belongs to images is no surprise. A much bigger surprise is that 30 % belongs to JavaScript.

The total count of static resources loaded by a site is shown in the following graph:

Number of static resources on HP

Many sites load a huge number of other files, sometimes more than 100. Some of them use various photo galleries so many requests are understandable. On the other hand many requests are caused by plugin over-usage.

Almost half of all tested sites contain more than 30 static resources on their homepage. There are more than 2,000 sites which have more than 100 static resources.

The number of static resources on homepages is visible in the following graphs:

Number of JS files

Number of CSS files

Sites with few requests are usually older simple sites, but many modern sites use plugins to minify and combine static resources (e.g. Autoptimize).

We have discussed the number of resources; now it is time to look at their size.

JS size distribution

CSS size distribution

50 % of sites include less than 60 kB of CSS and 350 kB of JavaScript. 10 % of sites use more than 300 kB of CSS and 4 % of sites use more than 2 MB of JavaScript.

According to the data I acquired, I daresay 25 % of the transferred data is unnecessary.

Conclusion

My research shows that the majority of Czech WordPress sites are outdated and lots of them are vulnerable. Performance is also an issue. WordPress is often used for infomarketing sites due to its simplicity and user friendliness: anybody can make their own site without any knowledge of code. Unfortunately these users do not accept the fact that websites need constant care.

My research also shows the popularity of default themes and reveals the most used plugins. The webhosting company Wedos clearly dominates this market.

Lots of sites can be described as “infomarketing sites with default templates, which use plugins to turn on SEO and to administer contact forms, run at minimum cost and are outdated.” On the other hand there are lots of professional sites focused on security, speed and content quality. Simplicity is the main advantage of this CMS, and with a little effort it is possible to make a professional website.

What about the next steps?

A lot of vulnerable sites were found. Access to many of them was even blocked by my antivirus. I would like to contact the owners and creators and help them repair or improve their sites.

You can follow us on social networks (mostly in Czech):

  • Facebook
  • Twitter
  • Google+
  • Linked In

More articles about WordPress (in Czech).

You may be also interested in slides from my speech at WordCamp Prague about WordPress Security or my slides about WordPress Performance.

Update

The current state 08/2017:


3 comments on “WordPress in the Czech Republic - complex research”

  1. Thanks for going to all the effort to put this fascinating article together. I agree with you that the situation will probably be fairly similar in countries outside of the Czech environment, but it would be interesting to do this in our own country as well.

  2. Interesting research! What tools did you actually use for finding out social mentions? Especially LinkedIn is very hard to come by.
