Data, data, everywhere
I often talk with fellow educators who work in district offices or technology departments at their schools. They ask for advice on getting their data organized so it can be useful. It is clear that schools are collecting data. Lots and lots of data. Collecting it doesn’t seem to be a problem. The gap is in understanding how to get from collecting data to visualizing it as information.
Most questions I get go along the lines of “What system do you use?”, “What program do you use?”, “How do you connect it all together?”, “Where do you keep it?”, “Who manages it?” While these are all great questions, my response is usually something along the lines of “It’s complicated.” To answer these and similar questions, it’s important to know the context of the situation. People who want to work with data in schools, or who have been designated to, need to understand the process: the data ecosystem, data road map, data pipeline, or whatever term you would like to call it.
What can data be?
Data doesn’t have to be just numbers and test scores. There are tons and tons of different forms of data we collect in educational institutions. We collect demographic information such as addresses, phone numbers, birthdates, citizenship, and so on. We also collect information about relationships such as parent names and contact information, brothers and sisters, whether a child has an IEP or other types of special reports. These are just some of the pieces of data we collect.
On top of that is the data that an individual teacher might collect about a student. This could be assessment information, anecdotal notes, reading levels, attendance, friendships, reading groups, and the list could go on and on. Keep in mind, though, that more data doesn’t always mean better data.
Schools collect lots of data, but then sometimes they don’t know what to do with it. They might not know how to distinguish what is important. There are a few things to keep in mind when deciding what data is going to be useful.
Not all data is good data
According to Harvard Business School’s Data Science Ready Program, data worth using is accurate, valid, and complete. Accurate data is free of errors, so you know it is reliable and comes from a trusted source. Valid data provides the information we are expecting. Complete data does not have missing values. Finding data that is 100% perfect is nearly impossible. There will usually be something wrong with it. However, even if data is not perfect, it can still be used.
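As a rough illustration of checking for completeness and validity, here is a minimal Python sketch. The records, the field names (student_id, birthdate, reading_level), and the helper functions are all made up for this example and do not come from any particular student information system. Accuracy is harder to automate, since it usually means comparing against a trusted source.

```python
from datetime import date

# Hypothetical student records; the field names are illustrative only.
records = [
    {"student_id": "S001", "birthdate": "2012-04-17", "reading_level": 3},
    {"student_id": "S002", "birthdate": "", "reading_level": 4},
    {"student_id": "S003", "birthdate": "2011-13-02", "reading_level": 2},
]

REQUIRED_FIELDS = ("student_id", "birthdate", "reading_level")


def missing_fields(record):
    """Completeness: list the required fields that are absent or blank."""
    return [f for f in REQUIRED_FIELDS if record.get(f) in (None, "")]


def has_valid_birthdate(record):
    """Validity: the birthdate must parse as a real date (YYYY-MM-DD)."""
    try:
        date.fromisoformat(record["birthdate"])
        return True
    except (KeyError, ValueError, TypeError):
        return False


def quality_report(rows):
    """Build one summary entry per record listing the problems found."""
    return [
        {
            "student_id": row.get("student_id"),
            "missing": missing_fields(row),
            "valid_birthdate": has_valid_birthdate(row),
        }
        for row in rows
    ]


report = quality_report(records)
for entry in report:
    print(entry)
```

A report like this does not fix anything by itself. It simply shows where the gaps are, so someone can decide whether a record is still usable.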
Data by itself isn’t very helpful to most educators or people in schools. It doesn’t provide any real information. In order for it to be useful, it needs to be organized. There are lots of ways to arrange data depending on what is being collected and what the intended outcome is. In a recent article, I wrote about different ways to collect and organize the same data for students. Data is regularly ordered by date, time, or location. Data can also be categorized by demographics like age, citizenship, or gender.
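To make the idea of ordering and categorizing concrete, here is a small Python sketch that takes the same records and arranges them two ways: sorted by date, and grouped by age. The attendance records, names, and fields are invented for the example.

```python
from collections import defaultdict
from datetime import date

# Hypothetical attendance records; names and fields are made up.
attendance = [
    {"student": "Ana", "age": 9, "date": date(2024, 3, 5), "present": True},
    {"student": "Ben", "age": 10, "date": date(2024, 3, 4), "present": False},
    {"student": "Ana", "age": 9, "date": date(2024, 3, 4), "present": True},
    {"student": "Cy", "age": 10, "date": date(2024, 3, 5), "present": True},
]

# Ordering: the same records sorted by date, earliest first.
by_date = sorted(attendance, key=lambda r: r["date"])

# Categorizing: the same records grouped by a demographic field (age).
by_age = defaultdict(list)
for record in attendance:
    by_age[record["age"]].append(record)

# Once grouped, a simple summary per category becomes possible.
attendance_rate = {
    age: sum(r["present"] for r in rows) / len(rows)
    for age, rows in by_age.items()
}

print([r["student"] for r in by_date])
print(attendance_rate)
```

Neither arrangement is the right one in general. Which one helps depends on the question being asked of the data.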
Filling in the gap
Let’s take a look at the entire process and the various parts in it. This will help shed some light on how data gets from “data” to “information.” Information is what we need to make informed decisions. There are a number of steps along the way. Depending on what type of data you are working with, some of these steps might not be necessary. However, these are the main parts of the process.
1. Data sourcing & collecting: Where the sources of data are and what raw data is taken from those sources.
2. Data wrangling: Raw data from the data source needs to be put into a form that is usable. This is where data wrangling comes in. This might also include where the data will actually be stored.
3. Data cleaning: (some people combine data wrangling and data cleaning) This is where the raw data gets organized. Missing data is checked and errors are located. The formatting of the data becomes consistent between data sources. The data becomes easier to process.
4. Data analysis: This is where you start to transform and explore the data to help answer your questions about the original data. As data from multiple sources can be complex, those with skills and training, such as data scientists, can help tremendously in this area.
5. Data visualization: This is the final product where you see the charts, reports, and dashboards that help people interpret the data as information. If set up well, the visualizations should reduce the cognitive load needed to understand the data.
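The five steps above can be sketched end to end in a few lines of Python. This is only a toy illustration under stated assumptions: the CSV text, the column names, the campus names, and every function name are hypothetical, and a real school pipeline would pull from actual systems and use proper charting tools rather than a text bar chart.

```python
import csv
import io
from statistics import mean

# Hypothetical raw export; in practice this would come from a real system.
RAW_CSV = """student_id,Reading Level,campus
S001, 3 ,North
S002,,North
S003,5,South
S004,four,South
"""


def collect(text):
    """1. Sourcing and collecting: read the raw rows from the source."""
    return list(csv.DictReader(io.StringIO(text)))


def wrangle(rows):
    """2. Wrangling: put the raw rows into one consistent, usable shape."""
    return [
        {
            "student_id": r["student_id"],
            "reading_level": r["Reading Level"],
            "campus": r["campus"],
        }
        for r in rows
    ]


def clean(rows):
    """3. Cleaning: trim whitespace, convert types, set aside bad values."""
    good, rejected = [], []
    for r in rows:
        value = r["reading_level"].strip()
        if value.isdigit():
            good.append({**r, "reading_level": int(value)})
        else:
            rejected.append(r)
    return good, rejected


def analyze(rows):
    """4. Analysis: average reading level per campus."""
    levels = {}
    for r in rows:
        levels.setdefault(r["campus"], []).append(r["reading_level"])
    return {campus: mean(values) for campus, values in levels.items()}


def visualize(summary):
    """5. Visualization: a plain-text bar chart standing in for a dashboard."""
    return [
        f"{campus:<6} {'#' * round(avg)} {avg}"
        for campus, avg in sorted(summary.items())
    ]


raw = collect(RAW_CSV)
good, rejected = clean(wrangle(raw))
summary = analyze(good)
chart = visualize(summary)
print("\n".join(chart))
print("Rows set aside during cleaning:", [r["student_id"] for r in rejected])
```

Notice that two of the four rows never reach the chart. Keeping track of what was set aside during cleaning is part of knowing how far to trust the final picture.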
Graphs don’t just happen
There is a lot of work that goes into setting up and using data correctly. Collecting data and creating data visualizations are only a small part of the entire process. Therefore, it is important to think carefully about the types of data that are being collected by schools. Remember, just because you have the data doesn’t mean you need to use all of it. Consider what types of questions you have as a school or as a district. Then, seek the help of your school’s data team to assist you in answering these questions.
If you don’t have a data team or someone responsible for data at your school, I would highly recommend looking into the role as a critical step moving forward in ensuring your school is using data to inform instruction. You should also consider the current staff you have. Perhaps someone on your team is eager to be able to do more. Seek out those Data Knights and Data Dreamers. Look for ways to give those people more opportunities to grow by learning and helping your school.
If you are not sure where to get started, feel free to reach out to our team and we’d be happy to give you some guidance and training.