concat code #10

Open
nerdcommander opened this issue Jan 11, 2017 · 2 comments

Comments

@nerdcommander
Contributor

bring in the People's Republic of Berkeley data

PRB = pd.read_table('PRB_data.csv', sep = "\t")
PRB

make this into an exercise

bring in PRB data (no major problems) and make it conform to the gapminder at this point

our version...

clean the data to look like the current gapminder

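# NOTE: the separator inside split() was lost when this issue was rendered;
# '_' is assumed here (region values of the form 'continent_country').
PRB['country'] = PRB['region'].str.split('_', n=1).str[1].str.lower()
PRB['continent'] = PRB['region'].str.split('_', n=1).str[0].str.lower()
PRB = PRB.drop(columns='region')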
PRB.columns = PRB.columns.str.lower()
PRB = PRB.rename(columns={'life exp' : 'lifeexp'})
PRB
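
A minimal, self-contained sketch of the cleaning steps above, for anyone who wants to try them without the real file. The toy frame, its values, and the `_` separator in `region` are assumptions made for illustration; the actual PRB file may differ.

```python
import pandas as pd

# Hypothetical stand-in for the PRB data; columns and values are invented.
prb = pd.DataFrame({
    "region": ["Americas_Berkeley", "Americas_Berkeley"],
    "Year": [1952, 1957],
    "Life Exp": [70.0, 71.5],
})

# Split "continent_country" once on the assumed "_" separator.
parts = prb["region"].str.split("_", n=1, expand=True)
prb["continent"] = parts[0].str.lower()
prb["country"] = parts[1].str.lower()

# Drop the original column and normalise the remaining column names.
prb = prb.drop(columns="region")
prb.columns = prb.columns.str.lower()
prb = prb.rename(columns={"life exp": "lifeexp"})

print(prb)
```

Using `expand=True` splits the column only once and returns both pieces as columns, which avoids repeating the `split` call for each new column.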

double check that the gapminder is the same

gapminder.head()

combine the data sets with concat

gapminder_comb = pd.concat([gapminder, PRB])
gapminder_comb

this is markdown

Now that the data frames have been concatenated, notice that the index is funky: the rows from the People's Republic of Berkeley data keep their original labels, so the numbers 0 - 11 are repeated.
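
A small sketch of why the index repeats, using two invented toy frames rather than the real gapminder and PRB data. `pd.concat` keeps each frame's own row labels by default, so the labels from the second frame start over at 0.

```python
import pandas as pd

# Toy frames standing in for gapminder and PRB (values are made up).
first = pd.DataFrame({"country": ["x", "y", "z"], "year": [1952, 1957, 1962]})
second = pd.DataFrame({"country": ["berkeley", "berkeley"], "year": [1952, 1957]})

combined = pd.concat([first, second])

print(combined.index.tolist())   # [0, 1, 2, 0, 1]
print(combined.index.is_unique)  # False

# Label-based lookup now returns every row that carries the label 0.
print(combined.loc[0])
```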


as an exercise fix the index.

our code for fixing index

gapminder_comb = gapminder_comb.reset_index(drop=True)
gapminder_comb
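
As a possible alternative for the exercise solution, `pd.concat` accepts `ignore_index=True`, which renumbers the rows during concatenation instead of afterwards. The sketch below uses invented toy frames and compares both routes.

```python
import pandas as pd

# Toy frames standing in for gapminder and PRB (values are made up).
first = pd.DataFrame({"country": ["x", "y", "z"], "year": [1952, 1957, 1962]})
second = pd.DataFrame({"country": ["berkeley", "berkeley"], "year": [1952, 1957]})

# Route 1: concatenate, then discard the old labels.
via_reset = pd.concat([first, second]).reset_index(drop=True)

# Route 2: ask concat to renumber the rows directly.
via_flag = pd.concat([first, second], ignore_index=True)

print(via_reset.index.tolist())  # [0, 1, 2, 3, 4]
print(via_flag.index.tolist())   # [0, 1, 2, 3, 4]
```

Without `drop=True`, `reset_index` would keep the old labels as a new column named `index`, which is usually not wanted here.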

@kellieotto
Contributor

Inserted into the lesson. Still need the narrative + correct file reference

@nerdcommander
Contributor Author

here is the narrative markdown:

Merging data

Often we have more than one data frame that contains parts of our data set and we want to put them together. This is known as merging the data.


Our advisor now wants us to add a new country called The People's Republic of Berkeley to the gapminder data set that we have cleaned up. Our goal is to get this new data into the same data frame in the same format as the gapminder data and, in this case, we want to concatenate (add) it onto the end of the gapminder data.


Concatenating is a simple form of merging; there are many other useful (and more complicated) ways to merge data. If you are interested in more information, the Pandas documentation is useful.
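
For contrast with concatenation, here is a rough sketch of one of the more complicated merges: a key-based join with `pd.merge`. The two toy frames and their values are invented for illustration and are not part of the lesson data.

```python
import pandas as pd

# Two hypothetical tables that describe the same countries.
life = pd.DataFrame({"country": ["berkeley", "canada"], "lifeexp": [80.0, 81.0]})
pop = pd.DataFrame({"country": ["berkeley", "canada"], "pop": [100, 200]})

# concat stacks rows; merge lines rows up by a shared key column instead.
merged = pd.merge(life, pop, on="country", how="left")

print(merged)
```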
