• data science

    Killer Data Science Training Sale

    So real quick, anyone who knows me knows that I love love love DataCamp. This week, DataCamp is offering up to 62% off their yearly subscriptions. These are already a good deal, but they are a total no-brainer at this price. I recommend this to anyone who works in a data profession or who wants to pivot into a data profession like Data Science or Machine Learning. I really love the format: very short but informative videos followed up by hands-on simulations. You really learn by doing AND you get the theoretical context so you understand how to do the analysis. They have skill tracks so you can just learn…

  • covid-19,  data science,  portfolio

    County Level Covid-19 Dashboard

    Explore new COVID-19 cases and deaths by US county. You can change the states, counties and time frame. There are three tabs in total: the first two are visualizations, while the third is a data table if you want more detail. See https://github.com/mattjcamp/covid_dashboard for the Python data pipeline used to gather the data for this visualization.

  • covid-19,  data science,  portfolio

    Covid-19 Visualization

    Explore US COVID-19 infection rates and hospitalizations for each state. You can change the states and time frame. There are four tabs in total: the first two are visualizations, while the last two are data tables if you want more detail. See https://github.com/mattjcamp/covid_dashboard for the Python data pipeline used to gather the data for this visualization.

  • covid-19,  data science

    Europe vs United States COVID Rates

    Check out this chart from the Statista blog on how the US and European Union compare on COVID-19 infection rates. You will find more infographics at Statista. This is interesting: the US is still doing far worse than Europe, but the gap appears to be closing. It will be interesting to see how this plays out in the fall, when people in the US will start spending more time indoors.

  • code,  data science

    Setup Your Local Data Warehouse Content

    In last week’s post, I showed you how to get a free SQL database by setting up a MySQL database server on your desktop. Now, it’s time to finish setting up your local data warehouse so that you can use it to practice your SQL chops. We will use data that is posted on data.world to fill out our data warehouse. The first dataset is from the WICHE organization and shows you student enrollments and graduations in K-12 schools. This is a dataset that I uploaded to data.world years ago and I use it in demos like this. We will also download a file with information on US states so we…

  • code,  data science

    Get this Free SQL Database

    For a long time, I’ve wanted to publish guides to the programming languages that I use. Last week, I started by writing up quick notes on SQL, but it occurred to me that you will need a SQL database to follow along. So I found a free SQL database system that you can download right now. Download MAMP: The program that you need is called MAMP and you can download that here. Take care to download the correct version for your operating system. MAMP makes it easy to install these three open-source products: Apache, MySQL, and PHP. MySQL is the free SQL database, but you need all three working set…

  • data science

    How to Get a Data Analyst Job

    Data Analysts use data to answer questions about the state of an organization’s business. Since data is stored in computer systems, a Data Analyst needs to be comfortable with technology. Data Analysts investigate data problems and must be able to communicate their findings. Sound like a good job? Read on to learn how to get a data analyst job. This is a good job for people who are Investigative on the Holland Scale. If you are a problem solver and like to think things through, then this may be the job for you. My Holland Code is ISE (Investigative, Social, Enterprising) and I have found data analysis to be a good…

  • data science

    Linear Models in R

    Regression or linear models are used to show a relationship between two variables. Basically, you take data points for two variables and then you can attempt to create a mathematical model based on these data points. If you succeed, in the future you will be able to guess the value of one variable (the dependent variable) based on the value of another variable (the independent variable). This example from DataCamp shows how to use the lm (for linear model) function to build a linear model that can predict height based on weight. The data used is the bdims dataset which is summarized below: Fit Model To create the linear model,…
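    The idea of fitting a line to two variables and then predicting one from the other can be sketched in a few lines. This is a rough Python stand-in for what R's lm function does with a single predictor, not the post's own code. The function names (fit_line, predict) and the weight and height numbers are made up for illustration and are not taken from the bdims dataset.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares with one predictor; returns (intercept, slope)."""
    x_bar, y_bar = mean(x), mean(y)
    # Sum of squared deviations of x, and of cross-deviations of x and y
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def predict(model, x_new):
    """Guess the dependent variable from a new value of the independent one."""
    intercept, slope = model
    return intercept + slope * x_new


# Hypothetical values chosen for illustration only (not the bdims data)
weights_kg = [50.0, 60.0, 70.0, 80.0, 90.0]
heights_cm = [160.0, 165.0, 170.0, 175.0, 180.0]

model = fit_line(weights_kg, heights_cm)
predicted_height = predict(model, 75.0)
```

    Here weight is the independent variable and height is the dependent variable, matching the height-from-weight setup described above. Real data will not fall exactly on a line, so the fitted line is only a best guess.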

  • code,  data science

    D3 Experiments

    D3 is a JavaScript library used for data visualization. You can use this tool to make your webpage data-driven. Essentially, you bind your data to components on your website, including SVG elements. Here is code that will create a simple D3 bar chart based on an array of numbers. This will create a very simple bar graph using the numbers in the array to determine the height of each bar. In order for this visualization to be displayed on my webpage here, we need to have a div with an id of “plot” and we need to import the D3 library. So this is a very simple example that…

  • data science

    datapointsr

    datapointsr is an R package that I wrote while I was working at College Board. datapointsr makes it a little bit easier to QC and work on statistical summary tables. datapointsr acts as a wrapper for sqldf and reshape, and it also puts statistical tables into a standard format (categories, variables, values). The idea is that it will be easier to build reusable functions later on if I can assume that data will be in this format. I’m moving toward making this work more like the dplyr package, with an emphasis on verbs. You can install datapointsr from GitHub using the devtools package like this: Here is an example of how…