Save the community some typing #21

Open
buddha314 opened this issue Aug 9, 2018 · 4 comments

@buddha314

I know this is an abuse of "issues", but it doesn't warrant a full repo. Here is some Python code you can cut and paste:

class Rutweet:
    def __init__(self, external_author_id, author, content,
                region, language, publish_date, harvested_date,
                following, followers, updates, post_type, account_type,
                retweet, account_category, new_june_2018):
    
        self.external_author_id = external_author_id
        self.author = author
        self.content = content
        self.region = region
        self.language = language
        self.publish_date = publish_date
        self.harvested_date = harvested_date
        self.following = following
        self.followers = followers
        self.updates = updates
        self.post_type = post_type
        self.account_type = account_type
        self.retweet = retweet
        self.account_category = account_category
        self.new_june_2018 = new_june_2018

And a quick loader.

import csv

def load_tweets(fn):
    # newline='' is what the csv module expects for file objects
    with open(fn, 'r', newline='') as f:
        # csv.reader handles quoted fields, so commas inside tweet text
        # do not shift the columns the way line.split(',') would.
        # A header row, if present, is yielded like any other row.
        for fields in csv.reader(f):
            # Rutweet takes exactly 15 positional arguments
            yield Rutweet(*fields[:15])
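
One caveat when parsing these files: tweet text often contains commas, so splitting a line on ',' can return more fields than there are columns. Here is a minimal sketch of the difference, using a made-up five-column sample row (not taken from the dataset) and the standard csv module:

```python
import csv
import io

# Made-up sample row for illustration only; the quoted field contains a comma.
sample = '123,someuser,"Hello, world",Unknown,English\n'

# Naive splitting breaks the quoted field into two pieces.
naive = sample.rstrip('\n').split(',')

# csv.reader understands the quoting and keeps the field whole.
parsed = next(csv.reader(io.StringIO(sample)))

print(len(naive), len(parsed))
print(parsed[2])
```

The same idea should carry over to the real files, assuming they follow standard CSV quoting.
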
@Meeds122

If you're using Python 3, emojis will sometimes screw with Unicode decoding of the text files. If you're getting a UnicodeDecodeError, do open(fn, 'r', encoding="latin-1").
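
A rough sketch of that workaround, assuming you want to try UTF-8 first and only fall back to latin-1 when decoding fails. The helper name read_text is hypothetical. Note that latin-1 maps every byte to a character, so it never raises, but any emoji in the file will come out as garbled characters rather than the original symbols.

```python
def read_text(fn):
    """Read a text file as UTF-8, falling back to latin-1 on decode errors."""
    try:
        with open(fn, 'r', encoding='utf-8') as f:
            return f.read()
    except UnicodeDecodeError:
        # latin-1 accepts any byte sequence, so this read cannot fail
        # on decoding, though non-latin-1 text will be garbled.
        with open(fn, 'r', encoding='latin-1') as f:
            return f.read()
```
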

@EvanCarroll

If you want to make this work with my schema, I'll make you a contributor and we can develop on it instead.

I think putting the Python code in a subdirectory organized under python would be a great idea for Python users. But this code is for the older v1 dataset, not the v2 dataset. I've done the same thing for PostgreSQL; you can find my scripts under ./PostgreSQL

@EvanCarroll

@Meeds122 see my note at the bottom of #20

#20 (comment)

@buddha314
Author

@EvanCarroll Great, could you create an issue and assign it to me? I'll try to contribute this week. I have some other things I could add as well.
