Skip to content

mosdeo/FB-Message-Parser

 
 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

37 Commits
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Facebook Message Export Parser (目前 only for 繁體中文)

Facebook 允許使用者使下載自己的 Message 對話紀錄,在下載後的 zip 檔案中以 html 檔案保存。 這樣方便在瀏覽器中點選檢視,但是很難給程式讀取做為資料分析用。所以,有了專門將對話內容 parser 出來的程式。

由於時間格式不同,原本的版本無法 parser 繁體中文的對話紀錄。

  • 原作者時間格式:
dtFormat = '%A, %B %d, %Y at %I:%M%p %Z'
dt.strptime(y.find(class_='meta').string.replace("+01", ""), dtFormat),
  • 可 parser 繁體中文格式
dtFormat = "%Y年%m月%d日 %H:%M"
dt.strptime(y.find(class_='meta').string[0:-7], dtFormat), # 用 [0:-7] 去除掉繁體中文無法 parser 的 UTC

目前這個 fork 暫時只讓繁體中文可以被 parser,但是和原作者的功能會衝突,必須靠自行修改程式碼解決,日後會想辦法讓這些功能相容。

=======

Facebook has a feature that allows users to download a copy of their data as a zip archive containing htm files with their data. The aim of this parser is to take this archive and to extract a user's Facebook Messages from it; to transfer them into a more useful format, as well as performing some analysis to produce interesting data.

Running the Code

The Facebook Export can be downloaded from the Facebook Settings menu.

Run "python fb_parser.py" with the 'messages.htm' file in the same directory to export to JSON as proof of concept.

Dependencies

The code is written in Python 3+. The parser uses BeautifulSoup to do the bulk of the capture from the htm file.

Anaconda Python for scientific computing is a simple and easy way to install all the dependencies for the code, alongside many other useful libraries. It can be downloaded here.

About

Parse your 繁中 Facebook messages!

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • Python 100.0%