yarn add sk22/im-wordcounter.js
const fs = require('fs');
const counter = require('im-wordcounter');
const whatsapp = require('im-wordcounter/middlewares/whatsapp');
const stream = fs.createReadStream('chat.txt');
const c = counter(whatsapp);
c.count(stream).then(console.log);
count returns a Promise that resolves to an array of objects, like this:
[ { word: 'foo', count: 2 },
{ word: 'bar', count: 1 } ]
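Since the result is a plain array of { word, count } objects, it can be processed with ordinary array methods. The following is a minimal sketch that runs on sample data in that shape. The topWords helper is hypothetical and not part of im-wordcounter.

```javascript
// Sample data in the shape shown above: an array of { word, count } objects.
const results = [
  { word: 'foo', count: 2 },
  { word: 'bar', count: 1 },
];

// topWords is a hypothetical helper, not part of im-wordcounter.
// It returns the n most frequent entries without mutating the input.
function topWords(entries, n) {
  return [...entries].sort((a, b) => b.count - a.count).slice(0, n);
}

// Format each entry as "word: count" for printing.
const lines = topWords(results, 1).map(({ word, count }) => `${word}: ${count}`);
console.log(lines.join('\n'));
```

In real use, the same helper could be applied inside the then callback of count instead of passing console.log directly.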
A counter's count function takes an options object as its second parameter.
The wc object is what is passed to the wordcounter constructor.
For information about its parameters, see Word Counter by Fengyuan Chen on GitHub.
Here's an example of how to alter these options.
count(stream, { sorted: true, wc: { ignorecase: true } })
These are the default values:
{
sorted: true,
wc: {
ignorecase: true,
ignore: ['_'],
report: false,
},
}
Middlewares are functions that convert a line to a string that only contains the actual message sent by the user.
Therefore, a middleware's signature looks like this:
(line: string) => string
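As an illustration of that signature, here is a minimal sketch of a custom middleware. The log format ("<nick> message") and the name nickPrefix are made up for this example and are not part of im-wordcounter.

```javascript
// nickPrefix is a hypothetical middleware for an imaginary log format
// in which each line looks like "<nick> message".
// It removes the leading "<nick> " and returns lines without it unchanged.
const nickPrefix = (line) => line.replace(/^<[^>]+>\s*/, '');

console.log(nickPrefix('<alice> see you tomorrow'));
```

Because the function matches the signature above, it could presumably be passed to counter in the same way as the whatsapp middleware.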
At the moment, the WhatsApp middleware is the only one available.
To make use of it, use the Email chat feature of your phone's WhatsApp app to export the chat history. The middleware will extract the actual messages from the file.
Because line breaks within multi-line messages appear as ordinary line breaks in the export, any line that lacks the prefix (date, sender) is treated in its entirety as message text.
WhatsApp chat:
2/2/17, 09:46 - John Doe: Hello!
2/2/17, 09:48 - Samuel Kaiser: Hello!
2/2/17, 09:48 - Samuel Kaiser: How are you?
Result:
[ { word: 'hello', count: 2 },
{ word: 'are', count: 1 },
{ word: 'how', count: 1 },
{ word: 'you', count: 1 } ]
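The prefix handling described above can be sketched as follows. This is not the library's actual implementation. The regular expression is an assumption based only on the sample lines shown here, and real exports may use other date formats depending on locale. The third chat line is a made-up continuation of a multi-line message.

```javascript
// Assumed prefix shape, based only on the sample above:
// "2/2/17, 09:46 - John Doe: Hello!"
const PREFIX = /^\d{1,2}\/\d{1,2}\/\d{2,4}, \d{1,2}:\d{2} - [^:]+: /;

// stripPrefix is a hypothetical stand-in for the WhatsApp middleware.
// Lines that do not start with the prefix are returned unchanged,
// so continuation lines count as message text in their entirety.
const stripPrefix = (line) => line.replace(PREFIX, '');

const chat = [
  '2/2/17, 09:46 - John Doe: Hello!',
  '2/2/17, 09:48 - Samuel Kaiser: How are you?',
  'I was wondering about that.',
];

const messages = chat.map(stripPrefix);
console.log(messages);
```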