Methodius: a utility for analyzing the frequency of text in chunks.
This CLI lets you run that analysis from the command line.
- Node LTS (as of September 2023, Node 18.16.0)
Download this package. Then run
npm install
npm i -g methodius-cli
Get all the details:
methodius -f "great-expectations.txt"
Decide what properties you'd like to see:
(Use -p for each property you want to see.)
methodius -f "great-expectations.txt" -p "uniqueWords" -p "uniqueBigrams" -p letterFrequencies
Do the same on multiple files:
methodius -f "great-expectations.txt" "a-tale-of-two-cities.txt" -p "uniqueWords" -p "uniqueBigrams" -p letterFrequencies
Output multiple files to a directory:
methodius -f "great-expectations.txt" "a-tale-of-two-cities.txt" -p "uniqueWords" -p "uniqueBigrams" -p letterFrequencies -o "dickens/"
Set your own output file:
methodius -f "great-expectations.txt" "a-tale-of-two-cities.txt" -p "uniqueWords" -o uniqueWords.json
Option | Alias | Description | Defaults |
---|---|---|---|
--files | -f | Fully qualified path to a file. Required. | samples/alice.txt |
--topLimit | -l | For any methods, this sets the number of top n-grams to get. Optional. | 15 |
--properties | -p | Which properties to return. Optional. Get the list off of the repo. | 'bigramFrequencies', 'trigramFrequencies', 'letterFrequencies', 'meanWordSize', 'medianWordSize', 'wordFrequencies', 'bigramPositions', 'trigramPositions', 'uniqueWords' |
--topMethods | -s | Which "top" methods to use. Optional. | 'topBigrams', 'topTrigrams', 'topWords' |
--outputFileName | -o | Name of the output file. Optional. | analysis.json or <inputfilename>.analysis.json if multiple files. This could also be a directory: analysis/en/ |
--mergeResults | -m | Merges the results files. Output will be .merged.json. Optional. | false |
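To make the idea behind --topLimit concrete, here is a minimal sketch of counting bigrams and keeping only the most frequent ones. It is not methodius's actual code: the function names bigramFrequencies and topEntries are hypothetical, and it assumes that bigrams are pairs of adjacent letters inside a single lowercased word.

```javascript
// Hypothetical helper, not part of methodius: counts letter bigrams per word.
function bigramFrequencies(text) {
  const counts = new Map();
  // Assumption: a word is a run of letters; everything else is a separator.
  const words = text.toLowerCase().match(/\p{L}+/gu) ?? [];
  for (const word of words) {
    for (let i = 0; i < word.length - 1; i += 1) {
      const bigram = word.slice(i, i + 2);
      counts.set(bigram, (counts.get(bigram) ?? 0) + 1);
    }
  }
  return counts;
}

// Hypothetical helper: keeps the `limit` most frequent entries,
// mirroring the idea behind --topLimit (which defaults to 15).
function topEntries(counts, limit = 15) {
  return [...counts.entries()]
    .sort((a, b) => b[1] - a[1]) // highest count first
    .slice(0, limit);
}

const sample = "the theme of the thesis";
console.log(topEntries(bigramFrequencies(sample), 3));
```

A lower limit keeps the output small on long texts such as novels. How the real tool breaks ties between equally frequent n-grams is not covered here.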
--mergeResults, -m, and methodius-merge analyze all of the results files and create a single file that contains all of the results. How a property is merged depends on the type of its value:
- If a property in a results file is an Object or a Map, the keys are merged. Duplicates are removed.
- If a property in a results file is an array or Set (which would be weird because JSON can't output a Set), the arrays are concatenated. Duplicates are removed.
- If a property in a results file is a number, the numbers are averaged.
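The three rules above can be sketched roughly as follows. This is an illustration, not the tool's implementation: mergeValues and mergeResults are hypothetical names, and keeping the first value seen for a duplicate key is an assumption, since the rules do not say how conflicting values are resolved.

```javascript
// Hypothetical sketch of the merge rules; not methodius's actual code.
function mergeValues(values) {
  const first = values[0];
  if (typeof first === "number") {
    // Numbers: arithmetic mean across all results files.
    return values.reduce((sum, n) => sum + n, 0) / values.length;
  }
  if (Array.isArray(first)) {
    // Arrays: concatenate, then drop duplicates (primitive entries only).
    return [...new Set(values.flat())];
  }
  if (first !== null && typeof first === "object") {
    // Objects: union of keys. Assumption: the first value seen for a key wins.
    const merged = {};
    for (const obj of values) {
      for (const [key, value] of Object.entries(obj)) {
        if (!Object.hasOwn(merged, key)) merged[key] = value;
      }
    }
    return merged;
  }
  return first; // Other types: keep the first value (assumption).
}

// Merges an array of parsed results objects, property by property.
function mergeResults(results) {
  const merged = {};
  const keys = new Set(results.flatMap((r) => Object.keys(r)));
  for (const key of keys) {
    const values = results
      .filter((r) => Object.hasOwn(r, key))
      .map((r) => r[key]);
    merged[key] = mergeValues(values);
  }
  return merged;
}

const a = { meanWordSize: 4, uniqueWords: ["alice", "rabbit"], letterFrequencies: { a: 10, e: 12 } };
const b = { meanWordSize: 6, uniqueWords: ["rabbit", "river"], letterFrequencies: { e: 7, z: 1 } };
console.log(mergeResults([a, b]));
```

Note that averaging a mean across files weights every file equally, regardless of how many words each one contains.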
If you want to merge results after the fact, you can use the methodius-merge command. This takes all of the same arguments as methodius. It exists so that you can pick which files to merge.
methodius-merge -f "alice.analysis.json" "huck-fin.analysis.json" -o "merged.json"