stephen.news

hypertext, words and more

An Inside Look at The New York Times Publishing Tech Stack

As a WordPress guy and an advocate for online publishing, I’ve always been curious about the New York Times digital publishing “stack.” A lot has changed since the days of the printed word.

Text editors, paragraph blocks, header arrangements, interactive graphs, revision history, and data structures: journalists these days have a lot more to handle than just a story. They’re more akin to data scientists or librarians, handling metadata with caution and organizational finesse, than to the classical depiction of a journalist.

I was surprised to read that the New York Times employs a CMS called Oak, which is built on the backbone of ProseMirror, an unopinionated open-source toolkit for building rich-text editors. I was taken aback by this line:

ProseMirror structures its main elements — paragraphs, headings, lists, images, etc. — as nodes. Many nodes can have child nodes — e.g., a heading_basic node can have child nodes including a heading1 node, a byline node, a timestamp node and image nodes. This leads to the tree-like structure I mentioned above.
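
To make that tree concrete, here is a minimal sketch in plain Python. It is not ProseMirror’s API or anything from Oak: the Node class, the walk helper, and the text values are all hypothetical, and only the node names (heading_basic, heading1, byline, timestamp, image) come from the passage above.

```python
from dataclasses import dataclass, field
from typing import Iterator


@dataclass
class Node:
    """One element of the document tree (hypothetical model, not ProseMirror's API)."""
    type: str
    text: str = ""
    children: list["Node"] = field(default_factory=list)


def walk(node: Node, depth: int = 0) -> Iterator[tuple[int, Node]]:
    """Yield every node with its depth, parents before children."""
    yield depth, node
    for child in node.children:
        yield from walk(child, depth + 1)


# Node names follow the quoted passage; the text values are invented placeholders.
header = Node("heading_basic", children=[
    Node("heading1", text="Example headline"),
    Node("byline", text="By A. Reporter"),
    Node("timestamp", text="2018-01-01"),
    Node("image"),
])
doc = Node("doc", children=[header, Node("paragraph", text="First paragraph.")])

# Print an indented outline of the tree.
for depth, node in walk(doc):
    print("  " * depth + node.type)
```

Running it prints an indented outline, with heading_basic sitting one level under the document root and its four children one level below that.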

It’s neat to see their CMS take on a node-tree structure (an approach that others have taken as well, such as HTML, XML, and JSON). It’s smart. I would have loved to see what their process and CMS looked like before.

Did you notice that the parent nodes resemble another data-structure paradigm, say blocks? In WordPress-land, blocks are coming soon, and once that happens, the data-structure Pandora’s box will have been opened. In the simplest, most modest comparison, Gutenberg will only have parent blocks. No children. But that could all change later down the road.
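
The difference between parent-only blocks and nested nodes can be sketched as a flat list versus a tree. This is only an illustration under my own assumptions: the Block class, the max_depth helper, and the block names are hypothetical and are not WordPress or Gutenberg identifiers.

```python
from dataclasses import dataclass, field


@dataclass
class Block:
    """A hypothetical block: a name plus optional inner blocks."""
    name: str
    inner: list["Block"] = field(default_factory=list)


def max_depth(blocks: list[Block]) -> int:
    """Return how many levels of nesting a list of blocks contains."""
    if not blocks:
        return 0
    return 1 + max(max_depth(b.inner) for b in blocks)


# Flat post: every block is a top-level parent with no children.
flat_post = [Block("heading"), Block("paragraph"), Block("image")]

# Nested post: a header block owns its heading, byline, and image.
nested_post = [
    Block("header", inner=[Block("heading"), Block("byline"), Block("image")]),
    Block("paragraph"),
]

print(max_depth(flat_post))    # 1
print(max_depth(nested_post))  # 2
```

A flat post is just the depth-one case of the same structure, which is why moving from parent-only blocks to child blocks later would be an extension of the model, not a replacement.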

Food for thought: if the New York Times can successfully transition to complex data structures and modern CMS composer capabilities, surely WordPress/Gutenberg could employ similar techniques and follow in that same spirit.