Occasionally while browsing this site I could see that whitespace was acting up. Especially when browsing on the iPhone, specific words would merge together, the latest example being in a blog post regarding eZ Platform and Gatsby.js. In the first paragraph the words "time to focus" were merged together as shown below:
My initial thought was that this was a presentation issue. After looking at some more examples, I noticed that the issue manifested around links, so I looked into the styles applied. The affected text inherited word-wrap: break-word from its parent, which I thought would surely be the culprit. For a while I tried following this lead, to no avail.
Even with bare-bones styling applied, the iPhone would squish these three words together. By accident I viewed the source code using the traditional view-source functionality (view-source:https://example.com/....) and noticed there were HTML entities between the words: "time&#8239;to&#8239;focus" instead of (what I thought was) "time to focus".
The entity 8239 refers to the Unicode character 'NARROW NO-BREAK SPACE' (U+202F), which is apparently used in Mongolian and French. Below you can see a fraction of the content with spaces highlighted... not so easy to spot the one with the different type of space 🤦
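To make the link between the entity number and the character concrete, here is a small Python sketch (Python is my choice for illustration, it is not part of the site's stack). It decodes the numeric entity and prints what Unicode knows about the result, using only the standard library.

```python
import html
import unicodedata

# Decode the numeric HTML entity seen in the page source.
char = html.unescape("&#8239;")

print(f"U+{ord(char):04X}")         # the code point in hexadecimal
print(unicodedata.name(char))       # the official Unicode character name
print(char.encode("utf-8").hex())   # the bytes used when stored as UTF-8
print(char == " ")                  # whether it equals a regular space
```

The last line prints False: the character renders much like a space, but it is a different code point, which is why a plain visual check does not catch it.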
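For us it should be a regular space, so the most reliable fix was to replace all occurrences of this character in the database. The simple query below (tested only with MySQL) will replace all narrow no-break spaces with regular spaces for an eZ Platform database. Note that the first string literal passed to REPLACE is meant to be the narrow no-break space, which looks just like a regular space on screen: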
UPDATE `ezcontentobject_attribute` SET `data_text` = REPLACE(`data_text`, ' ', ' ') WHERE data_type_string='ezrichtext';
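Before running a replacement like this against real content, it can help to try the same substitution on a sample string and count how many characters would change. The following is a minimal Python sketch of that idea. The helper name `replace_nnbsp` and the sample text are made up for illustration, and it works on a plain string rather than on the database.

```python
from typing import Tuple

NNBSP = "\u202f"  # NARROW NO-BREAK SPACE, written as an escape so it is visible


def replace_nnbsp(text: str) -> Tuple[str, int]:
    """Return the text with every U+202F replaced by a regular space,
    together with the number of replacements made."""
    count = text.count(NNBSP)
    return text.replace(NNBSP, " "), count


# Hypothetical sample mimicking the affected phrase.
sample = "time\u202fto\u202ffocus"
cleaned, replaced = replace_nnbsp(sample)

print(replaced)        # how many narrow no-break spaces were found
print(repr(cleaned))   # the cleaned text, shown with repr() to expose escapes
```

Writing the character as the escape `\u202f` instead of pasting it avoids the ambiguity of a literal that cannot be told apart from a normal space when reading the code.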
Remember to back up your database before running the query, and make sure your use case has no need for this special character. It exists for a reason. After running the query and clearing caches the text looks as it should when using Safari on an iPhone:
Because the browser developer tools display the character itself rather than the entity, it would have been very difficult to spot there that some spaces are not what they seem. The origin of these characters was likely some copy-pasting from Microsoft Word online. It looks like rich text is still hard in the 2020s.
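Since these characters are so hard to see, one option is to make them visible programmatically. The sketch below is an assumption about how one might do that in Python, not something used on this site. The hypothetical helper `reveal_whitespace` swaps every whitespace character other than a plain space, tab, or line break for a readable marker.

```python
ORDINARY_WHITESPACE = " \t\n\r"


def reveal_whitespace(text: str) -> str:
    """Replace unusual whitespace characters with a visible <U+XXXX> marker.

    Relies on str.isspace(), so zero-width characters, which Python does not
    treat as whitespace, are not covered by this check.
    """
    parts = []
    for ch in text:
        if ch.isspace() and ch not in ORDINARY_WHITESPACE:
            parts.append(f"<U+{ord(ch):04X}>")
        else:
            parts.append(ch)
    return "".join(parts)


# Hypothetical pasted content with one narrow no-break space in it.
print(reveal_whitespace("time\u202fto focus"))
```

Running pasted content through a check like this makes the odd space stand out immediately, whereas in rendered text it blends in with its neighbours.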
In the end the remedy for this issue was simple, but it was by accident that I tracked down the different types of whitespace in use. Maybe this post helps someone notice the same.