Create dataset the_new_york_times #342

Open
albertvillanova opened this issue Jan 19, 2022 · 0 comments
Labels
data catalog Gathering data from data sources

  • uid: the_new_york_times
  • type: primary
  • description:
    • name: The New York Times

    • description: The New York Times is an American daily newspaper based in New York City with a worldwide readership. It was founded in 1851 by Henry Jarvis Raymond and George Jones and was initially published by Raymond, Jones & Company.

      The Times has since won 132 Pulitzer Prizes, the most of any newspaper, and has long been regarded within the industry as a national "newspaper of record". It is ranked 18th in the world by circulation and 3rd in the U.S.

      The paper is owned by The New York Times Company, which is publicly traded. It has been governed by the Sulzberger family since 1896, through a dual-class share structure after its shares became publicly traded. A. G. Sulzberger and his father, Arthur Ochs Sulzberger Jr. (the paper's publisher and the company's chairman, respectively), are the fifth and fourth generations of the family to head the paper.

    • homepage: https://www.nytimes.com/

    • validated: True

  • languages:
    • language_names:
      • English
    • language_comments:
    • language_locations:
      • World-Wide
      • United States of America
    • validated: False
  • custodian:
  • availability:
    • procurement:
      • for_download: No - but the current owners/custodians have contact information for data queries
      • download_url:
      • download_email: nytnews@nytimes.com
    • licensing:
    • pii:
      • has_pii: Yes
      • generic_pii_likely: very likely
      • generic_pii_list:
        • names
        • email addresses
        • website account name or handle
      • numeric_pii_likely: somewhat likely
      • numeric_pii_list:
        • telephone numbers
      • sensitive_pii_likely: somewhat likely
      • sensitive_pii_list:
        • racial or ethnic origin
        • political opinions
        • religious or philosophical beliefs
      • no_pii_justification_class:
      • no_pii_justification_text:
    • validated: False
  • source_category:
    • category_type: website
    • category_web: news or magazine website
    • category_media:
    • validated: False
  • media:
    • category:
      • text
      • image
    • text_format:
      • .HTML
    • audiovisual_format:
    • image_format:
      • .JPG
    • database_format:
    • text_is_transcribed: No
    • instance_type: article
    • instance_count: 100K<n<1M
    • instance_size: 100<n<10,000
    • validated: False
  • fname: the_new_york_times.json
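
For illustration, the entry above could be mirrored as a nested mapping and serialized to JSON. This is only a sketch: the actual layout of the_new_york_times.json is not shown in this issue, so the nesting (taken from the bullet indentation), the use of empty strings and lists for blank fields, and the helper name `unvalidated_sections` are all assumptions. The long description text is omitted for brevity.

```python
import json

# Hypothetical in-memory form of the catalog entry. Nesting follows the
# bullet list in the issue; blank fields become "" or [] (an assumption).
entry = {
    "uid": "the_new_york_times",
    "type": "primary",
    "description": {
        "name": "The New York Times",
        "homepage": "https://www.nytimes.com/",
        "validated": True,
    },
    "languages": {
        "language_names": ["English"],
        "language_comments": "",
        "language_locations": ["World-Wide", "United States of America"],
        "validated": False,
    },
    "custodian": {},
    "availability": {
        "procurement": {
            "for_download": "No - but the current owners/custodians "
                            "have contact information for data queries",
            "download_url": "",
            "download_email": "nytnews@nytimes.com",
        },
        "licensing": {},
        "pii": {
            "has_pii": "Yes",
            "generic_pii_likely": "very likely",
            "generic_pii_list": [
                "names",
                "email addresses",
                "website account name or handle",
            ],
            "numeric_pii_likely": "somewhat likely",
            "numeric_pii_list": ["telephone numbers"],
            "sensitive_pii_likely": "somewhat likely",
            "sensitive_pii_list": [
                "racial or ethnic origin",
                "political opinions",
                "religious or philosophical beliefs",
            ],
        },
        "validated": False,
    },
    "source_category": {
        "category_type": "website",
        "category_web": "news or magazine website",
        "category_media": "",
        "validated": False,
    },
    "media": {
        "category": ["text", "image"],
        "text_format": [".HTML"],
        "image_format": [".JPG"],
        "text_is_transcribed": "No",
        "instance_type": "article",
        "instance_count": "100K<n<1M",
        "instance_size": "100<n<10,000",
        "validated": False,
    },
    "fname": "the_new_york_times.json",
}


def unvalidated_sections(entry: dict) -> list[str]:
    """Return the names of top-level sections whose 'validated' flag is False."""
    return sorted(
        name
        for name, section in entry.items()
        if isinstance(section, dict) and section.get("validated") is False
    )


# Serialize to a JSON string; writing it to entry["fname"] is left to the caller.
serialized = json.dumps(entry, indent=2, ensure_ascii=False)
```

Calling `unvalidated_sections(entry)` lists the sections that still need review, which in this entry is everything except `description`.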
@albertvillanova albertvillanova added the data catalog Gathering data from data sources label Jan 19, 2022