This documents how the "notes" field is used in th.csv, the test list for Thailand, to encode metadata about the URL itself or about the test status of the URL.
Most of these conventions are not strictly formatted, though they are not entirely free text either.
The one that is coded consistently is the language code, which can be parsed easily.
Available metadata:
“[<lang_code>]” -- Appended to the end of the notes. Natural language of the page at the URL. <lang_code> is an ISO 639-1 language code (2 characters).
“Regional site” -- Appended to the end of the notes. Indicates that the website at the URL is a “regional site”, where the same site is intended to serve more than one country. Useful when reporting on the characteristics of the website.
“blocked in <date>” or “blocked on <date>” -- Date the URL was issued a block order by a court, or is known to have been blocked according to media or other sources. Currently some dates are in ISO format and some are not. Ideally, all of them should be ISO 8601.
“blocked in <date>, see:” -- Date the URL was issued a block order or is known to have been blocked, followed by a reference.
“last updated on” -- Date on which, as far as a human annotator can verify from the web content, the page was most recently updated.
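As a rough illustration of how the two trailing markers could be read by a script, here is a minimal Python sketch. The function name `parse_trailer` and the regular expressions are hypothetical, not part of any existing test-list tooling. It assumes language codes are two lowercase letters in square brackets at the end of the notes, optionally followed by “Regional site”.

```python
import re

# Assumption: one or more "[xx]" tags sit at the very end of the notes
# (after any "Regional site" marker has been stripped off).
LANG_TAIL = re.compile(r"(?:\s*\[[a-z]{2}\])+$")
LANG = re.compile(r"\[([a-z]{2})\]")
REGIONAL = "Regional site"


def parse_trailer(notes):
    """Split notes into (description, language codes, regional flag)."""
    text = notes.strip()

    # "Regional site" is expected as the last thing in the notes.
    regional = text.endswith(REGIONAL)
    if regional:
        text = text[:-len(REGIONAL)].rstrip()

    # Collect the run of language tags at the end, if there is one.
    langs = []
    match = LANG_TAIL.search(text)
    if match:
        langs = LANG.findall(match.group(0))
        text = text[:match.start()].rstrip()

    return text, langs, regional
```

Notes without any trailing tag are returned unchanged with an empty language list, so rows that do not follow the convention are easy to spot for manual review.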
Examples from the actual th.csv
Thai politics review journal [en]
Asian politics review and analysis [en]
Thai Lawyers for Human Rights (old website) [th] [en]
Issues in Deep South of Thailand [th] [en] [ms]
Telecom Asia, also cover ICT news in Asia. Announced closure on 2019-05-31. No longer updated. [en]
Anti-censorship group. (As of June 2020, the blog was last updated on 13 April 2019)
Asian porn. Found blocked in 2014, see: https://citizenlab.ca/2014/07/information-controls-thailand-2014-coup/ [en]
Human Rights Watch - Thailand page. Was blocked on Nov 2014, after the coup. https://www.blognone.com/node/63330 [en]
Think tank on civil society. Based in Singapore. [en] Regional site
Midnight University, was blocked in 2006. Found anomaly on OONI Explorer (most recent on 2020-06-09, as of 2020-06-11). [th]
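Because the block dates in these examples come in several shapes (“2014”, “Nov 2014”), a script that wants ISO 8601 values has to try more than one format. The following Python sketch shows one possible approach. The function name `blocked_date`, the regular expression, and the list of accepted formats are assumptions chosen to fit the examples here, not an agreed convention. It also assumes English month names (the default C locale).

```python
import re
from datetime import datetime

# Assumption: the date text runs from "blocked in/on" up to the next
# comma, full stop, or opening square bracket.
BLOCKED = re.compile(r"blocked (?:in|on) ([^,.\[]+)")

# (input format, output format) pairs, tried in order.
# Reduced-precision outputs such as "2014-11" and "2014" are valid ISO 8601.
FORMATS = [
    ("%Y-%m-%d", "%Y-%m-%d"),
    ("%d %B %Y", "%Y-%m-%d"),
    ("%b %Y", "%Y-%m"),
    ("%B %Y", "%Y-%m"),
    ("%Y", "%Y"),
]


def blocked_date(notes):
    """Return the block date as an ISO 8601 string, or None if not found."""
    match = BLOCKED.search(notes)
    if match is None:
        return None
    raw = match.group(1).strip()
    for fmt_in, fmt_out in FORMATS:
        try:
            return datetime.strptime(raw, fmt_in).strftime(fmt_out)
        except ValueError:
            continue
    # Text after "blocked in/on" was not a recognizable date.
    return None
```

Returning `None` for anything unrecognized keeps the function from guessing; such rows would still need a human annotator to rewrite the date by hand.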