Daily Blog 6/24/2016


Powered almost exclusively by La Mex and Gulzar’s, the EC contingent embarked on the Icelandic Field Studies trip today (6/24/2016)! We left a spectacularly stormy Indiana to fly to JFK, where a subsection of the group spent a couple of hours in NYC looking for a 7.2 volt, 1800 milliamp battery (we finally found one at Tinkersphere). Following a mandatory airline shuffle-and-confusion period, we were finally on our way!

We learned that when an airport is the size of the familiar Dayton airport but handles multiple international flights, bad things happen with luggage times. About an hour and a half of waiting later (strangely, we couldn’t find any water fountains), we found and counted our 12 TSA-inspected bags and were off! Some of us met Lupin for the first time on our way to Bonus, and all of us ooh’ed and aah’ed at the giant hot water pipes. We were able to get all the topographic maps we needed at IÐNÚ. The rest of the day was pretty low-key as the group rested and got used to the 12 a.m. brightness.

That’s all for today, but more tomorrow: we will be at Pingvellir and the Hellisheidi geothermal plant.

Data update


This past week, I worked on getting some database views ready that show us the results for each sampling day. The goal is that each evening in Iceland, we’re able to quickly and effectively QA the data we have and make sure it matches what we should have. This is now set up so that we have views of the last 24 hours of data for both the streaming and readings tables in the database. The streaming view shows the recordtime, latitude, longitude, elevation, sensortype, and sensor value; the readings view displays the time, site, sector, spot, sensortype, and value for the last 24 hours.
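As a rough sketch, the views look something like this (the columns match the description above, but treat the exact SQL and view names as illustrative rather than our definitive definitions):

    -- Streaming data from the last 24 hours, for evening QA
    CREATE OR REPLACE VIEW streaming_last_day AS
        SELECT recordtime, latitude, longitude, elevation, sensortype, value
        FROM streaming
        WHERE recordtime >= now() - interval '24 hours';

    -- Same idea for the readings table
    CREATE OR REPLACE VIEW readings_last_day AS
        SELECT time, site, sector, spot, sensortype, value
        FROM readings
        WHERE time >= now() - interval '24 hours';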

I also finished up the ER diagram for the data model as it now stands in our database with a couple of changes from the last iteration.

I began working on learning enough about Flask this week to start adapting the viz tool to work with the current data model, and I expect to make more headway on that in the coming week. I also started writing up and documenting all our tables and why we chose to structure them the way we did, along with explanations of the relationships between the tables and how we chose them.

Data Progress


This past week, the new data model was implemented after K, C, and I talked about it for the past few weeks. We now have multiple tables in the model, with a table for basically every kind of data point. I have been looking at the Iceland 2014 data to figure out what we can move over into this new model. I’ve learned that there isn’t very much data we’d like to move over from Iceland 2014, although almost all the Nicaragua data is good to move over (in the fall, once we’re back from the Iceland scrum). The distance function came in really handy here, and made me really glad that we finally have all the old data in a single, easily searchable place. We’re not moving the data over into the new model for now, but it’s easy to query it where it is (something that hasn’t been possible for a while).


So for now, we are keeping all the Iceland 2014 and Nicaragua data in the old model on Postgres, and focusing on getting this year’s data working in time.

Coordinate functions


This week I worked on getting the coordinate function working in the field science database. It’s now up and running, which means we have a quick and easy way of finding the distance between two points in our table. This will help a lot with the remaining cleanup, since we’re now able to get a distance value between two points to help us check whether the data we’re looking at is where we think it is.
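Calling it looks something like this (the function name and argument order here are illustrative, not necessarily what’s actually in our database):

    -- Distance between two lat/long points, e.g. Reykjavik and a nearby site
    SELECT distance(64.1466, -21.9426, 64.0084, -21.3427);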

I will work with Charlie this week to start populating the readings tables with all the missing sectors in the Iceland 2014 data, which is easier now that this function lets us contextualize the distance between two consecutive points in the table. It also gives us the ability to see how many of the points in the table are within a given mile radius, which will be really useful for clean-up too; both kinds of queries are sketched below.
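Two sketches of the queries this enables, assuming a distance() function that returns kilometers and the readings columns named in earlier posts (all names illustrative):

    -- Spacing between consecutive points, to spot readings that landed
    -- somewhere they shouldn't have
    SELECT time,
           distance(latitude, longitude,
                    lag(latitude)  OVER (ORDER BY time),
                    lag(longitude) OVER (ORDER BY time)) AS km_from_prev
    FROM readings;

    -- How many points fall within a 1 mile (~1.609 km) radius of a spot
    SELECT count(*)
    FROM readings
    WHERE distance(latitude, longitude, 64.1466, -21.9426) <= 1.609;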

Lat Long Database Function Progress


To help with the bounding box interface and further data clean-up, I’m working on a function in PSQL that takes two lat/long pairs and calculates the distance between them. I was able to get the function working, but before writing the update that adds a column of distance values to the readings table, I want to make sure I know how to read the results my function is giving me right now.

A first-order look at the values my function gives me shows that it isn’t returning km or miles right now. We definitely want this function to populate the table with values that can be easily QA’ed (miles vs. km, anyone?), so I’m now going back over the math of how the function works to get it to return an easily usable value.
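One likely culprit: the raw haversine formula gives the central angle between the two points in radians, and it’s multiplying that angle by the Earth’s mean radius (about 6371 km) that converts it into kilometers. A minimal PL/pgSQL sketch with that step included (the function name is just for illustration):

    CREATE OR REPLACE FUNCTION haversine_km(lat1 float8, lon1 float8,
                                            lat2 float8, lon2 float8)
    RETURNS float8 AS $$
    DECLARE
        a float8;
    BEGIN
        -- Haversine of the central angle between the two points
        a := sin(radians(lat2 - lat1) / 2) ^ 2
             + cos(radians(lat1)) * cos(radians(lat2))
               * sin(radians(lon2 - lon1) / 2) ^ 2;
        -- 2 * asin(sqrt(a)) is the angle in radians; scaling by the
        -- Earth's mean radius (6371 km) yields kilometers
        RETURN 6371 * 2 * asin(sqrt(a));
    END;
    $$ LANGUAGE plpgsql;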

After I have that figured out, Charlie and I can begin the clean-up that has to do with populating sectors.


Databases and lat/long functions


Since my last post was a very long time ago, this is what has happened with the data clean-up over the past weeks:

Eamon and I each cleaned up all the data we could find individually. Since we’d worked separately, the cleaned data sets we each ended up with turned out to be different when we compared them. Neither of us had all the data individually, but combined, our data sets covered all of the Iceland 2014 and Nicaragua 2014 data. Then, with Charlie’s help, we were able to determine which of the data sets needed to be zorched. A significant chunk of them did, because we each had thousands of rows of testing data or data taken in the car while the group was driving.

After much too much time wrestling data in spreadsheets, the readings table is finally in the field science database, where further clean-up relating to sectors and spots can be done. As of right now, most of the Iceland 2014 data has no sector or spot values. With Charlie’s help, I can now populate the sectors. This should be much quicker to do by date, timestamp, and lat/long coordinates now that we have it all in the database; a sketch of the kind of update I have in mind follows.
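Something like the following, where the sector_visits reference table and all of the column names are hypothetical stand-ins for our actual schema:

    -- Hypothetical sketch: fill in missing sectors by matching each
    -- reading's date and rough location against a list of known visits
    UPDATE readings r
    SET sector = s.sector
    FROM sector_visits s      -- hypothetical table of known sector visits
    WHERE r.sector IS NULL
      AND r.time::date = s.visit_date
      AND abs(r.latitude  - s.latitude)  < 0.005   -- roughly 500 m of latitude
      AND abs(r.longitude - s.longitude) < 0.005;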

Next up, I will be working to create/adapt a function that measures the distance between a pair of lat/long coordinates. This should help with further clean-up and with the bounding box interface.

Clean-up done… now what?


I now have a master table with all our data in a consistent format, waiting to be imported into a database. Based on Kristin’s post, it looks like the next thing for me to start doing is learning how to use SQLite. It will definitely be really nice to not have to deal with different versions of CSV data, and I’m excited to learn more about how to implement it.
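From what I’ve read so far, getting the master table in should be as simple as something like this in the sqlite3 shell (the file and table names are placeholders):

    $ sqlite3 fieldscience.db
    sqlite> .mode csv
    sqlite> .import master_table.csv readings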

The ability to look up, later in the day, records that we created earlier in order to associate bench values with them seems particularly cool, in terms of having everything in one place at one time.


Cleaning Up


I’ve continued working on my master table. I’ve been doing two things to the data: cleaning up the format it’s in, and substituting dummy values for those records where we don’t have values. I’ve also been doing a reasonableness check by comparing where our calendars said we were to the data we have. This has let me know what data we can safely get rid of, like the couple of testing streams we have from around hostels.

I’m almost done with cleaning up and should be able to just put all of my nice clean data (all in one format!!) into the fieldscience db.

SIGCSE 2016


I’ve been looking at the poster submission for SIGCSE 2016. SIGCSE (the Special Interest Group on Computer Science Education) is a conference that, as the name indicates, cares about CS education and pedagogy. Since we’ve recently had some experience writing about computer science education because of the paper, it makes sense to submit a poster to SIGCSE. I’ve been bringing together all the bits and pieces of usable material for this project, with the primary source being the XSEDE paper and related bits. We already have a plethora of prose, both used and unused, about CS pedagogy and our take on it; the task now at hand is compiling it into a poster. There’s now a folder in Drive for SIGCSE, and it’s where I’ve been putting all the relevant material.

The poster deadline is in 9 days, so the clock is ticking on this one…


Grooming!


After finalizing the new data model (after staring at the iterations we’ve had since 2013), I started making a “master table” this week. It has all the data from all the different iterations; the data sets aren’t groomed to look the same or follow the same model yet, but they’re all getting put in one place, so that it’s easy to find all the data we do have once I start grooming.

This next week, I plan to do exactly that: put all the data, in its newly organized format according to our model, into the database, so that it’s all consistent and in one place. I have a feeling this will be messy and take a while, but I now know what the format is and where the data is, so I’m hopeful about traversing it better.
