Meryl Marie

Attention Sociology Students: Try Data Science!

When I was in college, I took a Statistics 101 class for my Sociology/Anthropology major. We calculated chi-square tests, correlation, and statistical significance to find relationships between variables. We investigated questions inspired by the General Social Survey (GSS) data, which has information on family life, careers, demographics, and more. At the time, it felt overwhelming and confusing. I couldn't remember the difference between the independent and dependent variables and found SPSS hard to work with. But I wrote a paper on how Americans perceive women in politics, and truly enjoyed the difficult, confusing, and exciting process of integrating the quantitative data analysis with academic & qualitative research.


After that experience, though, I soon forgot my stats knowledge. When I entered the data analysis field early in my career, I tried to remember how to apply basic statistics to a business problem. The idea excited me, but I couldn't quite figure it out.


Fast forward to 2020, and I feel like I finally know how to use my sociology degree in the "real world": through data science!


Sociologists, and social scientists in general, are no strangers to investigating large problems and looking for solutions. The research process for sociology is closely related to the data science process. People often assume that machine learning engineers and data scientists only deal with large, clean data sets. Unfortunately, when you set out to solve a problem with data, whether it is a business problem or a social problem, you usually have to put the data in the correct format first. Models can be finicky and require specific parameters to run. Sometimes you think you have the right information, but a single misspelled variable throws the model off and it won't predict anything. That is why a data scientist has to work hard to get the data into the correct format by cleaning, munging, and/or wrangling it before modeling can start. Similarly, quantitative sociologists must take care in setting up their data collection so that the results can be properly investigated using statistical methods.
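
To make the "one misspelled variable" problem concrete, here is a minimal sketch of a cleaning step in plain Python. Everything in it is made up for illustration: the survey rows, the `prty` typo, and the `clean_rows` helper are assumptions, not part of any real dataset or library.

```python
def clean_rows(rows, column_aliases):
    """Standardize column names and string values in a list of row dicts.

    column_aliases maps known misspellings (hypothetical here) to the
    intended column name.
    """
    cleaned = []
    for row in rows:
        tidy = {}
        for key, value in row.items():
            # Trim stray spaces and lowercase the column name
            name = key.strip().lower()
            # Swap a known misspelling for the intended name
            name = column_aliases.get(name, name)
            # Tidy string values the same way
            if isinstance(value, str):
                value = value.strip().lower()
            tidy[name] = value
        cleaned.append(tidy)
    return cleaned


# Made-up survey rows with inconsistent column names and values
raw_rows = [
    {"Age": 34, "Party ": " Democrat"},
    {"age": 51, "prty": "republican "},
]
rows = clean_rows(raw_rows, {"prty": "party"})
print(rows)
```

After cleaning, both rows share the same `age` and `party` keys, so a model or a statistical test sees one variable instead of three near-duplicates.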


My favorite model that we learned in the General Assembly Data Science Immersive program is logistic regression. Similar to a linear regression, which predicts a numeric value based on a number of features, a logistic regression predicts a class or category. Not only can it predict a class such as race or gender (very exciting for a social scientist), it can tell the data scientist which features in the dataset helped it get there. Models like this are often called "white box" models, meaning we can make inferences from the coefficients the model produces: each coefficient shows in which direction, and how strongly, a feature shifts the predicted odds of a class.
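
As a rough illustration of the "white box" idea, the sketch below fits a one-feature logistic regression with plain gradient descent, using only the Python standard library. It is not the method taught in the program or what a library such as scikit-learn does internally. The toy data (a made-up score from 1 to 8 and a made-up yes/no outcome), the `fit_logistic` helper, the learning rate, and the number of epochs are all assumptions chosen for illustration.

```python
import math


def sigmoid(z):
    """Map any real number to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, lr=0.1, epochs=5000):
    """Fit logistic regression weights by batch gradient descent."""
    n_features = len(X[0])
    weights = [0.0] * n_features
    bias = 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for xi, yi in zip(X, y):
            # Predicted probability of class 1 for this row
            p = sigmoid(bias + sum(w * x for w, x in zip(weights, xi)))
            err = p - yi
            for j in range(n_features):
                grad_w[j] += err * xi[j]
            grad_b += err
        # Step against the average gradient of the log loss
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


# Made-up data: one feature (a score from 1 to 8) and a 0/1 outcome
X = [[1], [2], [3], [4], [5], [6], [7], [8]]
y = [0, 0, 0, 1, 0, 1, 1, 1]

weights, bias = fit_logistic(X, y)

# exp(coefficient) is the odds ratio for a one-unit increase in the feature
odds_ratio = math.exp(weights[0])
print("coefficient:", weights[0], "odds ratio:", odds_ratio)
print("P(class 1 | score = 8):", sigmoid(bias + weights[0] * 8))
```

A positive coefficient means higher scores push the prediction toward class 1, and the odds ratio says by how much the odds multiply for each extra point. That readable link between a feature and the outcome is what makes the model easy to interpret.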


If you are like me, and get excited when you solve a puzzle, enjoy reading and learning about data, and miss the thrill of academic research but don't want to BE in academia, I really can't recommend a data science program enough. And while I don't have a career in the industry yet, I am excited for all the possibilities ahead!


Feel free to reach out if you have questions about my experience with data science and sociology:
