Skip to content

Python module providing parsing, filtering an analysing tools for WhatsApp exported chats

License

Notifications You must be signed in to change notification settings

migzamudior3/Wanalyzer

 
 

Repository files navigation

Wanalyzer

Tests

Wanalyzer is a simple Python module providing tools to parse and analyze exported WhatsApp conversations (via the Export Chat feature of WhatsApp).

Installation

To install Wanalyzer, you can simply use pip install analyser or use the setup.py python script.

And import it to your project with import wanalyzer as wa.

Usage

Parsing

You can choose between:

  • Using the provided conversation parser and send the file path as str to the Conversation constructor.
  • Do the parsing part yourself and send a List[Message] to the Conversation constructor (can be useful if your input formatting is not exactly the same as the WhatsApp exported chat feature).

Message dataclass description

Attribute Type Description
date datetime Time (precise with secs) of the message
author str Author of the message
content str Content of the message
msg_type MessageType Type of the message

Message.MessageType can be TEXT, STICKER, AUDIO, IMAGE, VIDEO, or FILE.

Filtering

You are able to create your own filters of type Callable[[Message], bool]: It have to be a function taking the Message as parameter, and return false if it have to be removed from the analysed conversation.

Some built-in filters as already available:

  • filter__min_size: Remove all messages less than the given size.
  • filter__from_authors: Remove all messages that are not from the given authors.
  • filter__contains_texts: Remove all messages not containing any of the given texts.
  • filter__not_contains_texts: Remove all messages containing any of the given texts.

Statistics

Wanalyzer provide both member based, and time based statistics tools.

Member Statistics

With the Conversation.get_members_stats() method, you can get a List[MemberStat] describing the statistics of each member of the conversation.

The MemberStat dataclass gives informations about the messages sent by the member.

MemberStat dataclass description

Attribute Type
member_name str
member_conv str
messages List[Message]
msg_count_by_type Dict[MessageType, int]

The msg_count_by_type attribute provide the messages count of each type.

Time scale splitting

With the Conversation.get_split_by(scale: TimeScale) method, you can get a Dict[datetime, Conversation] dictionary that split the original conversation into smaller ones, taking the messages sorted by the given TimeScale: YEAR, MONTH, DAY, HOUR, MINUTE or SECOND.

Contributing

Pull requests are opened to add any kind of features or improvements in the code. Please make sure that all the regression tests are passing (using test.sh) and to clean the repository (using clean.sh) before submitting your pull request.

About

Python module providing parsing, filtering an analysing tools for WhatsApp exported chats

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • Python 96.7%
  • Shell 3.3%