This dataset compiles the number of occurrences of male and female baby names during specific time periods. It then calculates the probability of a name based on the total count. The data comes directly from government authorities, ensuring its credibility.

ShreyaPatil1199/Gender_By_Name

Gender_By_Name

Objective

The Gender Names Analysis project analyzes a dataset that records how often male and female names occur during specific time periods, and calculates the probability of each name from the total count. The data comes directly from government authorities, which lends it credibility.

The specific dataset used in this analysis is:

  • US: Baby Names from Social Security Card Applications - National Data, 1880 to 2019

Data Description

This dataset shows how popular baby names have been in the United States and how they are distributed by gender. It can be used to explore naming trends, gender-based naming preferences, and the probability of encountering a specific name. The data is sourced from government authorities and covers 1880 to 2019, long enough to trace historical naming trends.

The key attributes in this dataset include:

  • Name: String
  • Gender: M/F (category/string)
  • Count: Integer
  • Probability: Float
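
As a rough illustration of how these attributes fit together, the sketch below builds a tiny table with the same column names and derives the Probability column. It assumes that Probability means a row's count divided by the total count over all rows, which is one reading of "based on the total count". The rows are made up for the example, and `gender_share` is a hypothetical helper, not part of this repository.

```python
import pandas as pd

# Toy rows invented for illustration; they are not taken from the real dataset.
rows = [
    ("Alex", "M", 60),
    ("Alex", "F", 20),
    ("Maria", "F", 15),
    ("John", "M", 5),
]
names = pd.DataFrame(rows, columns=["Name", "Gender", "Count"])

# Assumed definition: a row's count divided by the total count over all rows.
names["Probability"] = names["Count"] / names["Count"].sum()


def gender_share(df: pd.DataFrame, name: str) -> pd.Series:
    """Return the fraction of a name's occurrences recorded under each gender."""
    counts = df.loc[df["Name"] == name].groupby("Gender")["Count"].sum()
    return counts / counts.sum()


print(names)
print(gender_share(names, "Alex"))
```

Under this assumption the Probability column sums to 1 across the table, while `gender_share` answers a different question: how a single name splits between M and F.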

Prerequisites

To run this analysis, you need:

  • Python 3
  • Jupyter Notebook (optional)
  • Pandas
  • Matplotlib (for data visualization)
  • Seaborn (for enhanced data visualization)
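
One way to confirm that the libraries above are installed before opening the notebook is a short check like the following. It is only a sketch: `REQUIRED` and `missing_packages` are names chosen for this example, and the list assumes the usual import names `pandas`, `matplotlib`, and `seaborn`.

```python
from importlib.util import find_spec

# Import names of the libraries listed above (assumed to be the usual ones).
REQUIRED = ["pandas", "matplotlib", "seaborn"]


def missing_packages(packages):
    """Return the packages from `packages` that cannot be found for import."""
    return [pkg for pkg in packages if find_spec(pkg) is None]


missing = missing_packages(REQUIRED)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are available.")
```

Any package reported as missing can then be installed with your usual package manager.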

License

This dataset is licensed under a Creative Commons Attribution 4.0 International (CC BY 4.0) license.

This license allows sharing and adapting the dataset for any purpose, provided that appropriate credit is given. When using this dataset in your projects or analyses, provide attribution as required by the CC BY 4.0 license.

For more details about the license, visit Creative Commons Attribution 4.0 International License.
