article-archiver

This library converts online articles and blog posts into local Markdown, preserving only:

  • article content
  • media assets
  • metadata

Cypress does the heavy lifting of scraping, and Mozilla Readability enhances the extracted content.
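
As a rough illustration of the three things the library aims to preserve, the sketch below models an archived article and renders it as Markdown with a front-matter header. The type names (`ArchivedArticle`, `MediaAsset`), the field names, and the front-matter layout are all assumptions made for illustration; they are not taken from the library and may not match its actual output.

```typescript
// Hypothetical data model: none of these names come from article-archiver itself.
interface MediaAsset {
  originalUrl: string; // where the asset was found online
  localPath: string; // where a local copy would be stored
}

interface ArchivedArticle {
  meta: Record<string, string>; // e.g. title, author, source URL
  markdown: string; // article content already converted to Markdown
  assets: MediaAsset[];
}

// Render the metadata as a simple front-matter header followed by the content.
// Values are JSON-quoted so colons or quotes in a title cannot break the header.
function renderArticle(article: ArchivedArticle): string {
  const header = Object.entries(article.meta)
    .map(([key, value]) => `${key}: ${JSON.stringify(value)}`)
    .join("\n");

  // Point asset references at the local copies instead of the remote URLs.
  let body = article.markdown;
  for (const asset of article.assets) {
    body = body.split(asset.originalUrl).join(asset.localPath);
  }

  return `---\n${header}\n---\n\n${body.trim()}\n`;
}
```

Keeping the metadata in a header and the content in the body means a single file can carry all three preserved parts, with media assets referenced by local path.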


Getting Started

⚠️ This library is under development and is not expected to work until the TODOs below are completed ⚠️

Installation

npm install -g article-archiver

Usage

npx article-archiver <urls>
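
The `<urls>` placeholder suggests one or more article URLs. A minimal sketch of how such arguments could be checked before scraping, assuming each argument is a separate `http` or `https` URL; the `parseUrls` helper is hypothetical and not part of the package:

```typescript
// Hypothetical helper: splits raw command-line arguments into usable and unusable URLs.
function parseUrls(args: string[]): { valid: string[]; invalid: string[] } {
  const valid: string[] = [];
  const invalid: string[] = [];
  for (const arg of args) {
    try {
      const url = new URL(arg); // throws a TypeError on malformed input
      if (url.protocol === "http:" || url.protocol === "https:") {
        valid.push(url.href);
      } else {
        invalid.push(arg); // well-formed, but not a web page address
      }
    } catch {
      invalid.push(arg);
    }
  }
  return { valid, invalid };
}
```

In a Node.js entry point, a helper like this could be fed `process.argv.slice(2)` so that malformed arguments are reported before any browser is launched.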

Architecture

TODO

  • [x] set up Cypress
  • [x] configure Cypress to scrape URLs
  • [x] implement code cleaner and enhancer
  • [ ] implement Readability
  • [ ] wire up scraper to enhancer
  • [ ] set up HTTP server for tmp files
  • [ ] set up website-scraper
  • [ ] wire up archiver to save local assets to tmp folder
  • [ ] set up UTF-8 and Turndown transformers
  • [ ] wire up transformer to merge metadata and write to output
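
Read together, the TODO items outline a pipeline: scrape, enhance, archive assets, then transform to Markdown. The sketch below shows one way those stages might be wired together. It uses injected stand-in functions in place of Cypress, Readability, website-scraper, and Turndown, and every stage name and signature here is an assumption rather than the project's actual design.

```typescript
// Hypothetical stage signatures; the real tools are replaced by injected functions.
interface Stages {
  scrape: (url: string) => Promise<string>; // URL -> raw HTML
  enhance: (html: string) => Promise<string>; // raw HTML -> cleaned article HTML
  archiveAssets: (html: string) => Promise<string>; // rewrite asset links to local copies
  toMarkdown: (html: string) => Promise<string>; // HTML -> Markdown
}

// Run each stage in order for a single URL and return the resulting Markdown.
async function archive(url: string, stages: Stages): Promise<string> {
  const raw = await stages.scrape(url);
  const cleaned = await stages.enhance(raw);
  const localized = await stages.archiveAssets(cleaned);
  return stages.toMarkdown(localized);
}
```

Passing the stages in as functions keeps each unchecked TODO item independently replaceable: a stub can stand in for a stage until the real implementation is wired up.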

Package Details

  • Version: 1.5.0
  • License: MIT
  • Weekly downloads: 2
  • Unpacked size: 56.4 kB
  • Total files: 76
  • Collaborators: chrisodicho