duplicate-document-image-finder

0.1.1 • Public • Published

duplicate-document-image-finder

A JavaScript library to find duplicate document images.

How does it work?

It extracts the text from each image using OCR, then uses the Levenshtein distance to measure how similar the texts of two images are.
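
The comparison step can be sketched as follows. This is a minimal illustration of Levenshtein-based text similarity, not the library's actual code: the function names, the normalization by the longer string's length, and the 0.7 threshold are all assumptions made for the example.

```typescript
// Levenshtein distance via dynamic programming, keeping only two rows.
function levenshtein(a: string, b: string): number {
  if (a === b) return 0;
  if (a.length === 0) return b.length;
  if (b.length === 0) return a.length;

  // prev[j] = distance between the first i-1 chars of a and the first j chars of b
  let prev: number[] = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(
        prev[j] + 1,        // deletion
        curr[j - 1] + 1,    // insertion
        prev[j - 1] + cost  // substitution (or match)
      );
    }
    prev = curr;
  }
  return prev[b.length];
}

// Similarity in the range 0..1 (1 means identical).
// Normalizing by the longer length is an assumption of this sketch.
function similarity(a: string, b: string): number {
  const maxLen = Math.max(a.length, b.length);
  if (maxLen === 0) return 1;
  return 1 - levenshtein(a, b) / maxLen;
}

// Hypothetical decision rule; the threshold value is illustrative only.
function isLikelyDuplicate(a: string, b: string, threshold = 0.7): boolean {
  return similarity(a, b) >= threshold;
}
```

Because OCR output is noisy, an edit-distance measure like this tolerates a few misread characters between two scans of the same page, which exact string comparison would not.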

Online demo

Methods and Interfaces

  • find. Finds the duplicate images. You can optionally pass your own OCR results instead of letting the library run OCR.

    async find(images: HTMLImageElement[], textLinesOfImages?: TextLine[][], progressCallback?: any): Promise<HTMLImageElement[]>
  • TextLine. One line of OCR text together with its position and size.

    export interface TextLine{
      x:number;
      y:number;
      width:number;
      height:number;
      text:string;
    }
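
If you supply your own OCR results, the `TextLine[][]` type of `textLinesOfImages` suggests one `TextLine` array per image. The sketch below shows what such data might look like and how the lines of one image could be joined into a single string for comparison. The helper `linesToText`, the sample values, and the top-to-bottom, left-to-right ordering are assumptions for illustration; how the library itself combines lines is not documented here.

```typescript
// Same shape as the TextLine interface above, repeated so the sketch is self-contained.
interface TextLine {
  x: number;
  y: number;
  width: number;
  height: number;
  text: string;
}

// Hypothetical helper: join the lines of one image in reading order
// (top to bottom, then left to right), skipping empty lines.
function linesToText(lines: TextLine[]): string {
  return [...lines]
    .sort((a, b) => a.y - b.y || a.x - b.x)
    .map((line) => line.text.trim())
    .filter((text) => text.length > 0)
    .join("\n");
}

// Illustrative OCR results for two images (values are made up).
const textLinesOfImages: TextLine[][] = [
  [
    { x: 10, y: 40, width: 200, height: 20, text: "Total: 42" },
    { x: 10, y: 10, width: 200, height: 20, text: "Invoice" },
  ],
  [
    { x: 12, y: 11, width: 198, height: 20, text: "Invoice" },
    { x: 12, y: 41, width: 198, height: 20, text: "Total: 42" },
  ],
];

// One string per image, ready to be compared.
const pageTexts: string[] = textLinesOfImages.map(linesToText);
```

An array shaped like `textLinesOfImages` is what would be passed as the second argument of `find`, in the same order as the `images` array.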

Install

Via NPM:

npm install duplicate-document-image-finder

Via CDN:

<script type="module">
  import { DuplicateDocumentImageFinder } from 'https://cdn.jsdelivr.net/npm/duplicate-document-image-finder/dist/duplicate-document-image-finder.js';
</script>

License

MIT

Collaborators

  • xulihang