chunkchunk

0.2.0 • Public • Published

Variable File Chunker

Uses the BuzHash library to chop files into feature-determined chunks: chunk boundaries are chosen from the file's content rather than at fixed offsets. Good for deduplication.
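
To make the idea of feature-determined boundaries concrete, here is a minimal sketch of content-defined chunking. It is not chunkchunk's implementation and it does not use BuzHash: it swaps in a simpler Gear-style rolling hash, and the names `GEAR`, `chunkBuffer`, and `mask` are hypothetical. The `min` default of half of `max` mirrors the default described in this README.

```javascript
// Toy content-defined chunker. Uses a Gear-style rolling hash as a stand-in;
// this is NOT BuzHash and NOT chunkchunk's actual code.

// Deterministic table of 256 pseudo-random 32-bit values (simple LCG).
const GEAR = new Uint32Array(256);
let seed = 0x12345678;
for (let i = 0; i < 256; i++) {
  seed = (Math.imul(seed, 1664525) + 1013904223) >>> 0;
  GEAR[i] = seed;
}

// Split `buf` into chunks of at most `max` bytes. A chunk ends early when it
// is at least `min` bytes long and the rolling hash hits a "feature"
// (its low bits, selected by `mask`, are all zero).
function chunkBuffer(buf, max, min = Math.floor(max / 2), mask = 0x1fff) {
  const chunks = [];
  let start = 0;
  let hash = 0;
  for (let i = 0; i < buf.length; i++) {
    // Older bytes shift out of the 32-bit hash as new bytes are mixed in.
    hash = ((hash << 1) + GEAR[buf[i]]) >>> 0;
    const len = i - start + 1;
    const atFeature = len >= min && (hash & mask) === 0;
    if (atFeature || len >= max) {
      chunks.push(buf.subarray(start, i + 1));
      start = i + 1;
      hash = 0;
    }
  }
  // Whatever is left over becomes the final (possibly short) chunk.
  if (start < buf.length) chunks.push(buf.subarray(start));
  return chunks;
}
```

Because a boundary depends on nearby bytes rather than on the offset in the file, inserting data near the start of a file tends to change only the chunks around the insertion, which is what makes this style of chunking useful for deduplication.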

const ChunkChunk = require('chunkchunk'),
      fs = require('fs'),
      file = fs.openSync('my/file', 'r'); // file descriptor, opened read-only

// Create variable chunks of `my/file` up to a maximum size of 40k,
// with the default `min` chunk size of 50% of that.
const chunkee = new ChunkChunk(file, { max: 40000 });

const chunk = chunkee.nextChunk();
// { hash: 'sha256...', buffer: <Buffer ...> }
const chunks = chunkee.toEnd();
// [ {hash:.., buffer:<..>}, {...}]

// Use the synchronous close; `fs.close` requires a callback.
fs.closeSync(file);
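
As a rough illustration of the deduplication use case, the sketch below keeps one copy of each chunk, keyed by its hash. It assumes only the `{ hash, buffer }` chunk shape shown above; the `dedupe` function and its in-memory `Map` store are hypothetical and not part of chunkchunk.

```javascript
// Hypothetical in-memory dedup store: one buffer per distinct chunk hash.
function dedupe(chunks, store = new Map()) {
  let added = 0;
  for (const { hash, buffer } of chunks) {
    if (!store.has(hash)) {
      store.set(hash, buffer); // first time this content has been seen
      added++;
    }
  }
  return { store, added, skipped: chunks.length - added };
}
```

Passing the same `store` across several files, for example `dedupe(chunkee.toEnd(), store)`, would let chunks shared between files be stored once.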

Example Runs

The following is an example run of npm test chunking a ~120 kB .jpg of Batman, with a max chunk setting of 40k and a min of 20k. The first column is the number of bytes in the chunk. The second column is the base64-encoded SHA-256 of that chunk. The final line is the total number of bytes across all chunks.

[master] ~ npm test

> chunkchunk@0.2.0 test C:\code\experiments\chunkchunk
> node test/test

4 chunks in 0s 30.234033ms
39005   QQitkmhKkmWfrX+p59Nk49kctS1TrMHhpnFka08Bya4=
34113   aOsmGJhJUeWNPPVbsyfqMyRx1F28rwQnvWiwwN/qVDo=
21484   FX98NrA8OlKWzSGJpXIAslixSRU4QJBPBEVEkcc9EXA=
23960   BY970o+31e5szl0TIGuDtfnPbH41tzWq2WYcK0Pn+1c=
118562
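
A listing in the same shape can be reproduced with Node's built-in `crypto` module. This is a sketch, not the package's actual `test/test` script: `sha256Base64` and `report` are hypothetical helpers, and it assumes each chunk's `buffer` holds the raw chunk bytes.

```javascript
const crypto = require('crypto');

// Base64-encoded SHA-256 digest of a buffer (44 characters, ending in '=').
function sha256Base64(buffer) {
  return crypto.createHash('sha256').update(buffer).digest('base64');
}

// One "size<TAB>hash" line per chunk, followed by the total byte count.
function report(chunks) {
  const lines = chunks.map(c => `${c.buffer.length}\t${sha256Base64(c.buffer)}`);
  const total = chunks.reduce((n, c) => n + c.buffer.length, 0);
  return [...lines, String(total)].join('\n');
}
```

Recomputing the digest this way and comparing it with a chunk's `hash` field is also a quick check that a stored chunk has not been corrupted, assuming the library encodes its hashes the same way as the output above.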

TODO

  • [ ] Make ChunkChunk take a string, instead of a file descriptor.
  • [ ] This library screams to be made a Transform Stream.
  • [ ] Add config option for feature 'uniqueness'

Install

npm i chunkchunk

Version

0.2.0

License

ISC

Collaborators

  • xori