In class we have started to discuss the topic of big data, where large quantities of information are gathered so that everything about a subject is in one place, more or less at least. We were instructed to find a data set online and provide some information about it using a list of questions. The data set that I found was on voter registration statistics for the state of Oregon for 2016. I found the data set on data.gov, and it was made and published by data.Oregon.gov. Even on the source site, it doesn't say how the data was collected. It just says "Monthly voter registration statistics for registered voters in Oregon." This data set is very new: it was created on the 20th of September 2016 and was published on the 21st. Its file format is CSV, so it can be viewed in Excel as a spreadsheet with rows and columns. The data is geographical data, I guess. It shows the number of people affiliated with each party in the entire state, and it can be sorted by county, house district, or congressional district. To make this data useful I would probably try to narrow down the area I am looking at. So instead of an entire state I would probably only want to look at a single district or county, because of the amount of information given. Even a graph of just the first house district would have to cover three counties, each with numerous parties that are supported. I am unsure how to represent all of the data properly without distorting it.
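To show what narrowing the data down could look like, here is a minimal Python sketch that keeps only the rows for one house district or one county and adds up the registrations per party. The column names (COUNTY, HD, PARTY, COUNT), the function name, and the sample rows are all assumptions I made up for illustration. They are not the real headers or figures from the Oregon file, so the names would need to be changed to match the actual CSV.

```python
import csv
import io
from collections import defaultdict

# Made-up sample rows for illustration only. The column names
# (COUNTY, HD, PARTY, COUNT) are assumptions, not the real headers.
SAMPLE_CSV = """COUNTY,HD,PARTY,COUNT
Alpha,1,Democrat,120
Alpha,1,Republican,100
Beta,1,Democrat,80
Beta,2,Republican,90
"""

def totals_by_party(csv_text, column, value):
    """Sum registrations per party for rows where `column` equals `value`."""
    totals = defaultdict(int)
    for row in csv.DictReader(io.StringIO(csv_text)):
        if row[column] == value:
            totals[row["PARTY"]] += int(row["COUNT"])
    return dict(totals)

# Narrow the whole table down to house district 1.
district_totals = totals_by_party(SAMPLE_CSV, "HD", "1")
print(district_totals)
```

The same function works for a single county by calling totals_by_party(SAMPLE_CSV, "COUNTY", "Beta"). With a real file, the text could be read from disk with open() instead of using the sample string.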
Our reading focused on making sure that we can evaluate the usefulness of the data that we are looking at or thinking about acquiring. We also read about knowing what type of source we are getting our data from. I got my data from a secondary source: a website where the data's creator published it in order to get it out there.