ZhgChg.Li

Automate Medium Article Backup|Mirror to GitHub Pages with Jekyll Seamlessly

Discover how to efficiently backup and mirror your Medium articles to GitHub Pages using Jekyll, solving content loss risks with automated setup, maintenance, and customization for reliable personal archives.

Automatic Backup of Medium Articles to Github Pages (Jekyll)

Some Notes on Building, Maintaining, Upgrading, and Customizing a Personal Medium Article Backup Mirror Site

Preface

I have been running my Medium account for 6 years, and the total number of articles passed 100 last year. As time goes by and the article count grows, I increasingly worry that Medium might suddenly shut down or that something could happen to my account, wiping out all my work. Some articles are not very valuable, but many record the technical architectures and problem-solving thinking of the time, and I often revisit old posts to refresh my knowledge. In recent years I also started documenting my travels abroad; these posts are personal memories and also draw good traffic. Once this content is lost, it cannot be rewritten.

Developing a Backup Tool Independently

I usually write articles directly on the Medium platform without my own backup. Therefore, during the 2022 Lunar New Year, I spent time developing a tool to download Medium articles and convert them into Markdown files (including article images, embedded code, and other content) — ZMediumToMarkdown

I then extended the tool so that the downloaded Markdown is deployed as a static backup mirror site on Github Pages using Jekyll (Chirpy Theme):

https://zhgchg.li/

At that time, I integrated the whole set into a Github Template Repo for friends with similar needs to quickly deploy and use — ZMediumToJekyll. Since then (2022), I have not updated the version or settings of Jekyll (Chirpy Theme). ZMediumToMarkdown is still maintained, and any format parsing errors found are fixed immediately. It is now quite stable.

The version of Jekyll (Chirpy Theme) used at that time was v5.x, which worked well with all necessary features (e.g., sticky posts, categories, tags, cover images, comments…). The only issue was frequent scrolling problems where the page would sometimes not scroll, but after a few swipes it worked again. This was a flaw in user experience. Attempts to upgrade to v6.x still had the issue, and reporting it to the developers received no response. Additionally, conflicts increased with each version upgrade, so the idea of upgrading was eventually abandoned.

I recently decided to fix these issues with Jekyll (Chirpy Theme), upgrade it, and take the opportunity to improve the quick deployment tool ZMediumToJekyll.

New! medium-to-jekyll-starter 🎉🎉

medium-to-jekyll-starter.github.io

I integrated the latest version v7.x of Jekyll (Chirpy Theme) with my ZMediumToMarkdown Medium article download and conversion tool into a new Github Template Repo — medium-to-jekyll-starter.github.io.

You can directly use this starter Repo to quickly set up your own Medium mirror content backup site, with one-time setup for permanent continuous automatic backup, deployed completely free on Github Pages.

For a step-by-step setup guide, please refer to this article: https://zhgchg.li/posts/medium-to-jekyll/

Results

https://zhgchg.li/

All of the articles above are **automatically downloaded from my Medium, converted to Markdown format, and re-uploaded.**

Here is a sample conversion result of a random article for comparison:

Original content on Medium / Converted result on personal website

After the upgrade, the scrolling freeze issue no longer occurs. This upgrade also added customized dynamic content (displaying Medium follower count).

Some Technical Notes

Jekyll (Chirpy Theme) deployment setup on Github Pages mainly follows the official Start Repo:

Last month, I also referred to this project’s approach and created a new open-source project — Linkyee, an open-source Link Tree personal link page.

https://link.zhgchg.li/

Jekyll Customization Method (1) — Override HTML

Jekyll is a powerful static site generator written in Ruby, and Chirpy is a theme built on top of it. After comparing it with other themes, I found that Chirpy offers the best quality, user experience, and feature coverage.

Jekyll layouts can be overridden. If we add a file to ./_layouts with the same filename as one of the theme's layout files, the engine uses our custom file in place of the theme's original when generating the site.

For example, if I want to add a line of text at the end of each post page, I first copy the original post page file (post.html) and place it in the ./_layouts directory:

Open post.html with an editor, add text or customizations in the appropriate place, and redeploy the site to see the customized results.
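The copy step can also be scripted. The following is a minimal sketch and not part of the theme: the helper name copy_layout_for_override is hypothetical, and the usage comment assumes the theme is installed as a gem named jekyll-theme-chirpy so that its directory can be looked up through RubyGems.

```ruby
require 'fileutils'

# Hypothetical helper: copy one layout file from the theme into the site's
# ./_layouts directory so Jekyll uses the local copy instead of the theme's.
#   theme_dir: root directory of the theme (contains _layouts/)
#   site_dir:  root directory of your site
# Returns the path of the local copy. An existing override is left untouched.
def copy_layout_for_override(theme_dir, site_dir, layout = 'post.html')
  source = File.join(theme_dir, '_layouts', layout)
  raise ArgumentError, "layout not found: #{source}" unless File.file?(source)

  target_dir = File.join(site_dir, '_layouts')
  FileUtils.mkdir_p(target_dir)

  target = File.join(target_dir, layout)
  FileUtils.cp(source, target) unless File.exist?(target)
  target
end

# Usage (assumes the theme gem is named jekyll-theme-chirpy):
#   theme_dir = Gem::Specification.find_by_name('jekyll-theme-chirpy').gem_dir
#   copy_layout_for_override(theme_dir, Dir.pwd)
```

Because the helper never overwrites an existing file, running it again after a theme upgrade keeps your customized post.html; comparing it against the new theme version is still a manual step.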

You can also create a ./_includes directory to store shared page fragments:

Then in post.html, we can directly use {% include buymeacoffee.html %} to include the HTML content from that file for reuse.

The advantage of overriding HTML Layout files is 100% customization—you can freely adjust the page content and layout as you wish; the downside is that during upgrades, conflicts or unexpected results may occur, requiring you to review your customizations again.

Jekyll Customization Method (2) — Plugin

The second method is to use the Hook feature within Plugins to inject custom content during Jekyll’s static site generation phase.

There are many [built-in hook owners and events](https://jekyllrb.com/docs/plugins/hooks/#built-in-hook-owners-and-events); here I only cover the site:pre_render and posts:pre_render hooks I used.

Adding a plugin is also simple; just add a Ruby file in ./_plugins.

posts-lastmod-hook.rb is an existing built-in plugin

I wanted a few “pseudo” dynamic content features: first, displaying the Medium follower count under the profile; second, showing the last updated time of the page content in the footer.

I created zhgchgli-customize.rb under ./_plugins:

#!/usr/bin/env ruby
#
require 'net/http'
require 'nokogiri'
require 'uri'
require 'date'


# Fetch the Medium profile page and return the follower count as displayed
# (a string), or 0 if it cannot be determined. Follows up to `limit` redirects.
def load_medium_followers(url, limit = 10)
  return 0 if limit.zero?

  uri = URI(url)
  response = Net::HTTP.get_response(uri)
  case response
  when Net::HTTPSuccess
    document = Nokogiri::HTML(response.body)

    follower_count_element = document.at('span.pw-follower-count > a')
    follower_count = follower_count_element&.text&.split(' ')&.first

    follower_count || 0
  when Net::HTTPRedirection
    load_medium_followers(response['location'], limit - 1)
  else
    0
  end
end

$medium_url = "https://medium.com/@zhgchgli"
# could also define in _config.yml and retrieve in Jekyll::Hooks.register :site, :pre_render do |site| site.config

$medium_followers = load_medium_followers($medium_url)

# Fall back to a placeholder when the count could not be fetched.
$medium_followers = 1000 if $medium_followers == 0
# Insert thousands separators (works for plain integer counts, e.g. 1000 -> 1,000).
$medium_followers = $medium_followers.to_s.reverse.scan(/\d{1,3}/).join(',').reverse


Jekyll::Hooks.register :site, :pre_render do |site|
  # to_s guards against a missing tagline in _config.yml
  tagline = site.config['tagline'].to_s

  follow_me = <<-HTML
  <a href="#{$medium_url}" target="_blank" style="display: block;text-align: center;font-style: normal;/* text-decoration: underline; */font-size: 1.2em;color: var(--heading-color);">#{$medium_followers}+ Followers on Medium</a>
  HTML

  # Prepend the follower link to the tagline shown under the profile.
  site.config['tagline'] = "#{follow_me}#{tagline}"

  # Only implemented for the en locale; could be extended to all languages.
  meta_data = site.data.dig('locales', 'en', 'meta')

  if meta_data
    gmt_plus_8 = Time.now.getlocal("+08:00")
    formatted_time = gmt_plus_8.strftime("%Y-%m-%d %H:%M:%S")
    site.data['locales']['en']['meta'] += "<br/>Last updated: #{formatted_time} +08:00"
  end
end

  • The principle is to register a Hook before the site renders, injecting an HTML block showing Medium follower count into the tagline profile introduction section in the config.

  • The number of Medium followers is fetched every time the script runs to get the latest count.

  • The logic for the footer last updated time is similar, which is to add the last updated time string to locales->en->meta when generating the site.

  • Additionally, if it’s a Hook before article generation, you can access the Markdown; if it’s a Hook after article generation, you can access the generated HTML.
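The difference between hooks that run before and after a post is rendered can be sketched as follows. This is an illustrative sketch rather than code from my plugin: the helper names and the note text are hypothetical, and it assumes that post.content holds the Markdown source in pre_render while post.output holds the full rendered page in post_render.

```ruby
# Hypothetical helpers; the names and the inserted text are illustrative only.

# Operates on the raw Markdown body of a post (before rendering).
def append_markdown_note(markdown, note)
  "#{markdown.rstrip}\n\n#{note}\n"
end

# Operates on rendered HTML (after rendering): inserts a snippet right
# before the first closing body tag, or returns the HTML unchanged if
# there is none.
def insert_before_body_end(html, snippet)
  html.sub('</body>') { "#{snippet}</body>" }
end

# Register the hooks only when running inside Jekyll (e.g. from ./_plugins).
# In practice you would normally pick one of the two hooks, not both.
if defined?(Jekyll)
  # Before rendering: post.content still holds the Markdown source.
  Jekyll::Hooks.register :posts, :pre_render do |post|
    post.content = append_markdown_note(post.content, '*Mirrored from Medium.*')
  end

  # After rendering: post.output holds the generated HTML page.
  Jekyll::Hooks.register :posts, :post_render do |post|
    post.output = insert_before_body_end(post.output, '<p>Mirrored from Medium.</p>')
  end
end
```

Keeping the string manipulation in plain functions makes it possible to try them in irb without building the whole site.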

After saving, you can first test the result locally with bundle exec jekyll s:

Open 127.0.0.1:4000 in your browser to see the result.

Finally, add a scheduled workflow in the Github Pages Repo Actions to automatically regenerate the site, and it’s done:

In the Actions workflows of the Jekyll (Chirpy Theme) repo, find pages-deploy.yml and add the following under the on: key:

  schedule:
    - cron: "10 1 * * *" # Automatically runs once daily at 01:10 UTC, https://crontab.guru

The advantage of Plugins is that they enable dynamic content (scheduled updates) without affecting the site structure or causing conflicts during upgrades; the downside is limited control over content and display positions.

Jekyll (Chirpy Theme) Deployment Issues on Github Pages after v7.x

Besides the site structure adjustments, the deployment script also changed in v7.x: the original deploy.sh script was removed, and the deployment steps are now defined directly in Github Actions:

# build:
# ...
      - name: Upload site artifact
        uses: actions/upload-pages-artifact@v3
        with:
          path: "_site${{ steps.pages.outputs.base_path }}"

  deploy:
    environment:
      name: github-pages
      url: ${{ steps.deployment.outputs.page_url }}
    runs-on: ubuntu-latest
    needs: build
    steps:
      - name: Deploy to GitHub Pages
        id: deployment
        uses: actions/deploy-pages@v4

But I encountered problems during the deployment process:

The Upload Artifact step failed with the error "Uploaded artifact size of 1737778940 bytes exceeds the allowed size of 1 GB" because my site content is too large. The previous deployment script still worked, so I reverted to the original deploy.sh and commented out the section above.
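To see whether a site is likely to hit this limit before pushing, the size of the build output can be measured locally. The following is a rough sketch under stated assumptions: the helper names are hypothetical, the limit constant is only my reading of the "1 GB" in the error message, and the sum of file sizes is an approximation of the artifact size, which is packaged differently by the upload action.

```ruby
# Assumed limit: the error message only says "1 GB"; the exact byte count
# GitHub applies is not stated here, so adjust this value as needed.
ASSUMED_LIMIT_BYTES = 1_000_000_000

# Sum the sizes of all regular files under dir (including dotfiles).
def directory_size_bytes(dir)
  Dir.glob(File.join(dir, '**', '*'), File::FNM_DOTMATCH)
     .select { |path| File.file?(path) }
     .sum { |path| File.size(path) }
end

# True when the directory's total file size is larger than the given limit.
def exceeds_limit?(dir, limit = ASSUMED_LIMIT_BYTES)
  directory_size_bytes(dir) > limit
end

# Usage, e.g. after `bundle exec jekyll build`:
#   puts directory_size_bytes('_site')
#   warn 'site may be too large for upload-pages-artifact' if exceeds_limit?('_site')
```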

Test Site Steps Keep Failing During Github Pages Deployment

Jekyll (Chirpy Theme) deployment includes a step called Test Site to check if the webpage content is correct, such as verifying links and ensuring no missing HTML tags.

# build:
# ...
      - name: Test site
        run: |
          bundle exec htmlproofer _site \
            --disable-external \
            --no-enforce-https \
            --ignore-empty-alt \
            --ignore-urls "/^http:\/\/127.0.0.1/,/^http:\/\/0.0.0.0/,/^http:\/\/localhost/"

I added --no-enforce-https and --ignore-empty-alt to skip the HTTPS check and the check for images with empty alt attributes. Ignoring these two lets the check pass (since I can’t change the content for now).
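The same checks can presumably also be run through html-proofer's Ruby interface instead of the CLI. This is a sketch under assumptions, not something taken from my workflow: the option keys are assumed to be the snake_case counterparts of the CLI flags above, the helper names are hypothetical, and it requires the html-proofer gem to be installed.

```ruby
# Build an options hash mirroring the CLI flags used in the workflow step.
# The keys are assumed to match html-proofer's Ruby configuration options.
def proofer_options
  {
    disable_external: true,  # --disable-external
    enforce_https: false,    # --no-enforce-https
    ignore_empty_alt: true,  # --ignore-empty-alt
    ignore_urls: [           # --ignore-urls (same patterns as the CLI call)
      %r{^http://127.0.0.1},
      %r{^http://0.0.0.0},
      %r{^http://localhost}
    ]
  }
end

# Run the check against a built site directory.
# The gem is required lazily so the options can be inspected without it.
def run_proofer(site_dir = '_site')
  require 'html-proofer'
  HTMLProofer.check_directory(site_dir, proofer_options).run
end

# Usage, e.g. after `bundle exec jekyll build`:
#   run_proofer('_site')
```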

The CLI command for htmlproofer is not mentioned in the official documentation. After searching for a long time, I finally found the rules in a comment on an issue:

https://github.com/gjtorikian/html-proofer/issues/727#issuecomment-1334430268

Other Article Supplements

Also published on Medium
Author: ZhgChgLi

An iOS, web, and automation developer from Taiwan 🇹🇼 who also loves sharing, traveling, and writing.