Syntax Highlighting

William Jing

Syntax Highlighting
MDX
Next.js
15
5 min read
Last modified: December 29, 2024

Overview

Syntax highlighting is an essential feature for any technical blog that includes code snippets. Color-coded syntax is reported to improve code comprehension by up to 20%, and it makes code easier to read by clearly distinguishing elements like keywords, strings, and comments. While rebuilding my blog, I spent considerable time researching how to implement syntax highlighting effectively.

There are several mature solutions available, such as Prism, with its efficient regex-based approach, and React Syntax Highlighter, which integrates directly with React. However, implementation can be challenging: documentation is limited, and mixing different solutions leads to conflicts such as inconsistent styling, duplicated dependencies, or incompatibilities between client-side and server-side rendering. Here, I document the approach that worked for me, for future reference.

Background

  1. Framework: Next.js

  2. Content: MDX

  3. Hosting: Cloudflare Pages

Available Options

  1. React Syntax Highlighter: Widely used in the React ecosystem.

  2. Prism: A regex-based implementation that supports server-side and client-side usage.

  3. Shiki: Commonly used with the Unified ecosystem, working server-side.

  4. Rehype Highlight: A plugin for Rehype, part of the Unified processor for HTML.
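To make "regex-based" concrete, here is a minimal sketch of the idea behind a highlighter like Prism. It is a toy, not Prism's real grammar or API: the `RULES` table and the `highlight` and `escapeHtml` helpers are hypothetical names, and the four rules cover only a sliver of JavaScript. The `token TYPE` class names follow the convention Prism uses in its output.

```javascript
// Toy regex-based highlighter: illustrates the idea only, not Prism's API.
// Each rule pairs a token type with a regular expression. Order matters:
// earlier rules win when two rules could match at the same position.
const RULES = [
  ["comment", /\/\/.*/],
  ["string", /'(?:\\.|[^'\\])*'|"(?:\\.|[^"\\])*"/],
  ["keyword", /\b(?:const|let|return|function)\b/],
  ["number", /\b\d+(?:\.\d+)?\b/],
];

// Escape characters that are special in HTML so code is shown, not interpreted.
function escapeHtml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Combine the rules into one regex with a named group per rule, then wrap
// every match in a <span>; text between matches is emitted as plain text.
function highlight(code) {
  const combined = new RegExp(
    RULES.map(([type, re]) => `(?<${type}>${re.source})`).join("|"),
    "g"
  );
  let html = "";
  let last = 0;
  for (const match of code.matchAll(combined)) {
    // Exactly one named group is defined per match: that is the token type.
    const type = Object.keys(match.groups).find(
      (name) => match.groups[name] !== undefined
    );
    html += escapeHtml(code.slice(last, match.index));
    html += `<span class="token ${type}">${escapeHtml(match[0])}</span>`;
    last = match.index + match[0].length;
  }
  return html + escapeHtml(code.slice(last));
}

console.log(highlight("const n = 42; // answer"));
```

Because the function only needs a string in and a string out, the same approach can run on the server or in the browser, which is why Prism supports both.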

Server or Client?

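Syntax highlighting solutions can be categorized as server-side, where code is highlighted at build or request time and the browser receives ready-made markup, or client-side, where highlighting runs in the browser.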

For server-side solutions, the most common approach is to use the Unified ecosystem. Unified is a modular framework for processing content like Markdown or HTML, allowing seamless integration of parsers, transformers, and compilers. For more details, visit the Unified documentation.

Unified Architecture

| ........................ process ........................... |
| .......... parse ... | ... run ... | ... stringify ..........|

          +--------+                     +----------+
Input ->- | Parser | ->- Syntax Tree ->- | Compiler | ->- Output
          +--------+          |          +----------+
                              X
                              |
                       +--------------+
                       | Transformers |
                       +--------------+
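The three stages in the diagram can be sketched without any dependencies. This is a toy pipeline, not the unified API: `parse`, `run`, `stringify`, `processText`, and `shoutHeadings` are hypothetical names, and the "syntax tree" only distinguishes headings from paragraphs. It also skips HTML escaping, which a real compiler would handle.

```javascript
// Toy pipeline with the same three stages as the diagram above.
// Hypothetical helper names; this is not the unified API.

// parse: input text -> syntax tree (one node per non-empty line)
function parse(input) {
  const children = input
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line) =>
      line.startsWith("# ")
        ? { type: "heading", value: line.slice(2) }
        : { type: "paragraph", value: line }
    );
  return { type: "root", children };
}

// run: pass the tree through each transformer in order
function run(tree, transformers) {
  return transformers.reduce((current, transform) => transform(current), tree);
}

// stringify: syntax tree -> output text (HTML in this toy)
function stringify(tree) {
  return tree.children
    .map((node) =>
      node.type === "heading" ? `<h1>${node.value}</h1>` : `<p>${node.value}</p>`
    )
    .join("\n");
}

// Example transformer: upper-case headings and leave other nodes untouched
function shoutHeadings(tree) {
  return {
    ...tree,
    children: tree.children.map((node) =>
      node.type === "heading" ? { ...node, value: node.value.toUpperCase() } : node
    ),
  };
}

// process = parse + run + stringify
function processText(input, transformers = []) {
  return stringify(run(parse(input), transformers));
}

console.log(processText("# Hello\n\nSome text", [shoutHeadings]));
```

In this model, a server-side highlighter such as Rehype Highlight plays the role of a transformer: it rewrites the code nodes in the tree before the compiler turns the tree into HTML.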

Serverless Considerations

In serverless environments like Cloudflare Workers or Pages, certain restrictions apply. For example, injecting pre-rendered HTML with dangerouslySetInnerHTML is discouraged because it can introduce security vulnerabilities like cross-site scripting (XSS). Avoiding it improves security, but it also rules out injecting precompiled HTML directly, so syntax-highlighted content has to be rendered another way.

When using Shiki, note that Cloudflare Workers do not support initializing WebAssembly from binary data. Instead, you must upload the WebAssembly (WASM) file as an asset and import it directly, as explained in the Shiki documentation.

Chosen Path

I chose React Syntax Highlighter with Prism as the underlying highlighter. This decision aligns with my use of Next.js and the need for a client-side solution due to serverless limitations. Prism was selected for its speed and compatibility.

Using React Syntax Highlighter

Prism.js or Highlight.js?

Prism.js and Highlight.js are both popular syntax highlighting libraries, but they differ in key aspects:

  • Prism.js is more modern, lightweight, and customizable. It uses regex-based parsing, supports plugins, and requires manual language specification, making it suitable for performance-critical applications.

  • Highlight.js is known for its automatic language detection, making it more beginner-friendly and easier to integrate when language detection is needed.

In this project, I chose Prism.js for its speed and customizability, accepting that the language must be specified for each code block. Highlight.js's automatic language detection would simplify cases where the language isn't specified, but Highlight.js is less customizable and has a larger bundle size, which could impact performance in resource-constrained environments.
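Since Prism needs the language to be specified, one way to supply it in an MDX setup is to read it from the class name: MDX renders a fenced block as a `code` element with a `language-xxx` class. The sketch below is an assumption about how this could be wired up, not the exact code used on this blog; `languageFromClassName` is a hypothetical helper, and the `"text"` fallback is my own choice for blocks with no language.

```javascript
// Hypothetical helper: extract the language from a class name such as
// "language-js", as produced by MDX for fenced code blocks.
function languageFromClassName(className, fallback = "text") {
  // Match "language-<name>" at the start or after whitespace, so unrelated
  // classes before or after it do not interfere.
  const match = /(?:^|\s)language-([\w+#-]+)/.exec(className ?? "");
  return match ? match[1].toLowerCase() : fallback;
}

console.log(languageFromClassName("language-js"));
console.log(languageFromClassName(undefined));
```

The result can then be passed as the `language` prop of React Syntax Highlighter's Prism build in a custom `code` component.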

Styling Options

Using the style Prop

  • Pros: Avoids CSS conflicts.

  • Cons: Cannot reuse styles from a theme.

Using the className Prop

  • Pros: Allows reuse of styles from a theme.

  • Cons: Potential CSS conflicts when switching themes.

Managing Theme-Specific CSS

When switching between light and dark themes by importing a different CSS file for each, the stylesheets conflict because the previous one is never removed. A neat way to solve this is to load the theme stylesheet dynamically and remove the old one before adding the new one:

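import { useEffect } from 'react';
// Assumption: useTheme comes from next-themes, which exposes resolvedTheme.
import { useTheme } from 'next-themes';

const STYLE_ID = 'prism-theme';

// Hypothetical hook name. Call it from a client component to swap the
// Prism theme stylesheet whenever the resolved theme changes.
export function usePrismTheme() {
  const { resolvedTheme } = useTheme();

  useEffect(() => {
    // Remove the stylesheet left over from the previous theme, if any.
    document.getElementById(STYLE_ID)?.remove();

    // Load the stylesheet that matches the current theme.
    const link = document.createElement('link');
    link.id = STYLE_ID;
    link.rel = 'stylesheet';
    link.href =
      resolvedTheme === 'dark'
        ? '/prism-themes/prism-vsc-dark-plus.css'
        : '/prism-themes/prism-one-light.css';
    document.head.appendChild(link);

    // Clean up when the theme changes again or the component unmounts.
    return () => {
      link.remove();
    };
  }, [resolvedTheme]);
}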

One downside of this approach is that we need to manually manage the CSS files in our project:

mkdir -p public/prism-themes
cp node_modules/prism-themes/themes/prism-vsc-dark-plus.css public/prism-themes/
cp node_modules/prism-themes/themes/prism-one-light.css public/prism-themes/

Syntax Highlighting Examples

CPP

#include <iostream>

int main() {
  std::cout << "Hello, World!" << std::endl;
  return 0;
}

Python

def hello():
    print("Hello, World!")

JavaScript

const hello = () => {
  console.log('Hello, World!');
};

Bash

$ git clone https://github.com/username/repo.git

SQL

SELECT * FROM users;

JSON

{
  "name": "John Doe",
  "age": 30,
  "email": "john.doe@example.com"
}

YAML

name: John Doe
age: 30
email: john.doe@example.com

Markdown

# Hello, World!

This is a test of Markdown syntax highlighting.

HTML

<h1>Hello, World!</h1>

CSS

body {
  background-color: #f0f0f0;
}

Conclusion

Syntax highlighting is a key feature for creating engaging and accessible technical content. Prism.js offers modern features, lightweight performance, and customizability, making it ideal for precise control over syntax rendering. In contrast, Highlight.js provides automatic language detection and ease of use but may be less efficient for performance-critical applications. Understanding these trade-offs helps ensure an informed choice tailored to specific needs.

While the implementation process can be complex due to serverless constraints and compatibility issues, solutions like React Syntax Highlighter and Prism make it achievable. By carefully considering your framework, hosting environment, and customization needs, you can select a strategy that provides both functionality and visual appeal for your blog. Through this exploration, I hope to have clarified the process and empowered others to create better technical blogs.
