ZMarkupParser HTML String to NSAttributedString Tool
Convert HTML String to NSAttributedString with corresponding Key style settings
ZhgChgLi / ZMarkupParser
Features
- Developed purely in Swift, parses HTML Tags using Regex and Tokenization, corrects tag errors (fixes unclosed tags & misaligned tags), converts to an abstract syntax tree, and finally uses the Visitor Pattern to map HTML Tags to abstract styles, resulting in the final NSAttributedString output; does not rely on any Parser Lib.
- Supports HTML Render (to NSAttributedString) / Stripper (removes HTML Tags) / Selector functionality
- Automatically corrects tag errors (fixes unclosed tags & misaligned tags)
<br>
-><br/>
<b>Bold<i>Bold+Italic</b>Italic</i>
-><b>Bold<i>Bold+Italic</i></b><i>Italic</i>
<Congratulation!>
-><Congratulation!>
(treat as String) - Supports custom style specifications e.g.
<b></b>
->weight: .semibold & underline: 1
- Supports custom HTML Tag parsing e.g. parse
<zhgchgli></zhgchgli>
into desired styles - Includes architecture design for easy HTML Tag extension Currently supports basic styles, as well as ul/ol/li lists and hr separators. Future support for other HTML Tags can be quickly added.
- Supports style parsing from
style
HTML Attribute HTML can specify text styles from the style attribute, and this tool also supports style specifications fromstyle
e.g.<b style=”font-size: 20px”></b>
->bold + font size 20 px
- Supports iOS/macOS
- Supports HTML Color Name to UIColor/NSColor
- Test Coverage: 80%+
- Supports parsing of
<img>
images,<ul>
lists,<table>
tables, etc. - Higher performance than
NSAttributedString.DocumentType.html
Performance Benchmark
- Test Environment: 2022/M2/24GB Memory/macOS 13.2/XCode 14.1
- X-axis: Number of HTML characters
- Y-axis: Time taken to render (seconds)
*Additionally, NSAttributedString.DocumentType.html
crashes with strings longer than 54,600+ characters (EXC_BAD_ACCESS).
Demo
You can directly download the project, open ZMarkupParser.xcworkspace
, select the ZMarkupParser-Demo
target, and Build & Run to test the effects.
Installation
Supports SPM/Cocoapods, please refer to the Readme.
Usage
Style Declaration
MarkupStyle/MarkupStyleColor/MarkupStyleParagraphStyle, corresponding to the encapsulation of NSAttributedString.Key.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
var font: MarkupStyleFont
var paragraphStyle: MarkupStyleParagraphStyle
var foregroundColor: MarkupStyleColor? = nil
var backgroundColor: MarkupStyleColor? = nil
var ligature: NSNumber? = nil
var kern: NSNumber? = nil
var tracking: NSNumber? = nil
var strikethroughStyle: NSUnderlineStyle? = nil
var underlineStyle: NSUnderlineStyle? = nil
var strokeColor: MarkupStyleColor? = nil
var strokeWidth: NSNumber? = nil
var shadow: NSShadow? = nil
var textEffect: String? = nil
var attachment: NSTextAttachment? = nil
var link: URL? = nil
var baselineOffset: NSNumber? = nil
var underlineColor: MarkupStyleColor? = nil
var strikethroughColor: MarkupStyleColor? = nil
var obliqueness: NSNumber? = nil
var expansion: NSNumber? = nil
var writingDirection: NSNumber? = nil
var verticalGlyphForm: NSNumber? = nil
...
You can declare the styles you want to apply to the corresponding HTML tags:
1
let myStyle = MarkupStyle(font: MarkupStyleFont(size: 13), backgroundColor: MarkupStyleColor(name: .aquamarine))
HTML Tag
Declare the HTML tags to be rendered and the corresponding Markup Style. The currently predefined HTML tag names are as follows:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
A_HTMLTagName(), // <a></a>
B_HTMLTagName(), // <b></b>
BR_HTMLTagName(), // <br></br>
DIV_HTMLTagName(), // <div></div>
HR_HTMLTagName(), // <hr></hr>
I_HTMLTagName(), // <i></i>
LI_HTMLTagName(), // <li></li>
OL_HTMLTagName(), // <ol></ol>
P_HTMLTagName(), // <p></p>
SPAN_HTMLTagName(), // <span></span>
STRONG_HTMLTagName(), // <strong></strong>
U_HTMLTagName(), // <u></u>
UL_HTMLTagName(), // <ul></ul>
DEL_HTMLTagName(), // <del></del>
IMG_HTMLTagName(handler: ZNSTextAttachmentHandler), // <img> and image downloader
TR_HTMLTagName(), // <tr>
TD_HTMLTagName(), // <td>
TH_HTMLTagName(), // <th>
...and more
...
This way, when parsing the <a>
Tag, it will apply the specified MarkupStyle.
Extend HTMLTagName:
1
let zhgchgli = ExtendTagName("zhgchgli")
HTML Style Attribute
As mentioned earlier, HTML supports specifying styles from the Style Attribute. Here, it is abstracted to specify supported styles and extensions. The currently predefined HTML Style Attributes are as follows:
1
2
3
4
5
6
7
ColorHTMLTagStyleAttribute(), // color
BackgroundColorHTMLTagStyleAttribute(), // background-color
FontSizeHTMLTagStyleAttribute(), // font-size
FontWeightHTMLTagStyleAttribute(), // font-weight
LineHeightHTMLTagStyleAttribute(), // line-height
WordSpacingHTMLTagStyleAttribute(), // word-spacing
...
Extend Style Attribute:
1
2
3
4
5
6
7
8
9
ExtendHTMLTagStyleAttribute(styleName: "text-decoration", render: { value in
var newStyle = MarkupStyle()
if value == "underline" {
newStyle.underline = NSUnderlineStyle.single
} else {
// ...
}
return newStyle
})
Usage
1
2
3
import ZMarkupParser
let parser = ZHTMLParserBuilder.initWithDefault().set(rootStyle: MarkupStyle(font: MarkupStyleFont(size: 13)).build()
initWithDefault
will automatically add predefined HTML Tag Names & default corresponding MarkupStyles as well as predefined Style Attributes.
set(rootStyle:)
can specify the default style for the entire string, or it can be left unspecified.
Customization
1
2
let parser = ZHTMLParserBuilder.initWithDefault().add(ExtendTagName("zhgchgli"), withCustomStyle: MarkupStyle(backgroundColor: MarkupStyleColor(name: .aquamarine))).build() // will use markupstyle you specify to render extend html tag <zhgchgli></zhgchgli>
let parser = ZHTMLParserBuilder.initWithDefault().add(B_HTMLTagName(), withCustomStyle: MarkupStyle(font: MarkupStyleFont(size: 18, weight: .style(.semibold)))).build() // will use markupstyle you specify to render <b></b> instead of default bold markup style
HTML Render
1
2
3
4
5
6
let attributedString = parser.render(htmlString) // NSAttributedString
// work with UITextView
textView.setHtmlString(htmlString)
// work with UILabel
label.setHtmlString(htmlString)
HTML Stripper
1
parser.stripper(htmlString)
Selector HTML String
1
2
3
4
5
6
7
let selector = parser.selector(htmlString) // HTMLSelector e.g. input: <a><b>Test</b>Link</a>
selector.first("a")?.first("b").attributedString // will return Test
selector.filter("a").attributedString // will return Test Link
// render from selector result
let selector = parser.selector(htmlString) // HTMLSelector e.g. input: <a><b>Test</b>Link</a>
parser.render(selector.first("a")?.first("b"))
Async
Additionally, if you need to render long strings, you can use the async method to prevent UI blocking.
1
2
3
parser.render(String) { _ in }...
parser.stripper(String) { _ in }...
parser.selector(String) { _ in }...
Know-how
- The hyperlink style in UITextView depends on linkTextAttributes, so there might be cases where NSAttributedString.key is set but has no effect.
- UILabel does not support specifying URL styles, so there might be cases where NSAttributedString.key is set but has no effect.
- If you need to render complex HTML, you still need to use WKWebView (including JS/tables rendering).
Technical principles and development story: “The Story of Handcrafting an HTML Parser”
Contributions and Issues are welcome and will be promptly addressed
For any questions or feedback, feel free to contact me.
===
===
This article was first published in Traditional Chinese on Medium ➡️ View Here