page-grabber

A utility for grabbing data from web pages.

Install

npm install page-grabber --save

or 

yarn add page-grabber

Usage

    import { JSDOM } from "jsdom";
    import createGrabber, { attr$, html, sel$, text } from "page-grabber";
 
    const h = `
        <div id="div1">
            <ul>
                <li><a href="link1">Title1</a><span><b>Content1</b></span></li>
                <li><a href="link2">Titl2</a><span><b>Content2</b></span></li>
            </ul>
        </div>
    `;
    const window = new JSDOM(h).window;
    const grabber = createGrabber(window);
    const res = grabber.grab(sel$("#div1", {
        items: sel$("ul>li", [{
            title: sel$("a", text()),
            link: sel$("a", attr$("href")),
            content: sel$("span", html()),
        }]),
    }));
    for (const item of res.items) {
        console.log("title: ", item.title, "content: " + item.content);
        // title:  Title1 content: <b>Content1</b>
        // title:  Titl2 content: <b>Content2</b>
    }
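
For reference, the shape of `res` in the example above can be written out as plain TypeScript types. This is only a sketch inferred from the schema and the sample HTML, not a type exported by the package; the names `GrabbedItem`, `GrabbedPage`, and `formatItem` are hypothetical.

```typescript
// Hypothetical types describing the result of the Usage example.
// They are inferred from the schema passed to grab(), not taken from page-grabber.
interface GrabbedItem {
    title: string;   // text of the <a> element
    link: string;    // value of the href attribute
    content: string; // inner HTML of the <span> element
}

interface GrabbedPage {
    items: GrabbedItem[];
}

// ASSUMPTION: this is the value the Usage example yields for the sample HTML,
// given that text(), attr$("href") and html() return the element text,
// the attribute value and the inner HTML respectively.
const expected: GrabbedPage = {
    items: [
        { title: "Title1", link: "link1", content: "<b>Content1</b>" },
        { title: "Titl2", link: "link2", content: "<b>Content2</b>" },
    ],
};

// Formats one item the way the console.log call in the example prints it
// (console.log joins its arguments with a single space).
function formatItem(item: GrabbedItem): string {
    return ["title: ", item.title, "content: " + item.content].join(" ");
}

const lines: string[] = expected.items.map(formatItem);
```

Writing the shape down like this makes it easier to see that each key in the schema becomes a key in the result, and that wrapping a schema in `[...]` produces an array with one entry per matched element.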

API

attr(name: string) => string | null - get an attribute by name
attr$(name: string) => string - get an attribute by name, checking that the value is non-empty
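
The two signatures differ in that `attr` may return `null`, while `attr$` always returns a string. The following is a minimal sketch of that contract, assuming the non-empty check throws on a missing or empty value (the README does not say how a failed check is reported). `ElementLike`, `readAttr`, `readAttrChecked`, and `fakeElement` are hypothetical names, not part of page-grabber.

```typescript
// Minimal stand-in for a DOM element, so the sketch runs without a DOM.
interface ElementLike {
    getAttribute(name: string): string | null;
}

// Mirrors the documented shape of attr(name): the attribute value or null.
function readAttr(el: ElementLike, name: string): string | null {
    return el.getAttribute(name);
}

// Mirrors the documented shape of attr$(name): always a string.
// ASSUMPTION: a missing or empty attribute raises an error; the real
// library may report a failed check differently.
function readAttrChecked(el: ElementLike, name: string): string {
    const value = el.getAttribute(name);
    if (value === null || value === "") {
        throw new Error(`Attribute "${name}" is missing or empty`);
    }
    return value;
}

// Builds a fake element backed by a Map of attribute names to values.
function fakeElement(attrs: Map<string, string>): ElementLike {
    return {
        getAttribute: (name: string): string | null => {
            const value = attrs.get(name);
            return value === undefined ? null : value;
        },
    };
}

const anchor = fakeElement(new Map([["href", "link1"], ["title", ""]]));
const href: string = readAttrChecked(anchor, "href");   // present and non-empty
const missing: string | null = readAttr(anchor, "rel"); // absent attribute
```

In this sketch the checked variant lets calling code use the result as a plain `string` without a null check, which matches the return types listed above.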

Test

npm install
npm test

Version

0.4.5

License

ISC

Collaborators

  • arvitaly