Team Project in Glacier Data Analysis

Fox Glacier, Feb.6.2019

Last year, I was standing on the Fox Glacier in New Zealand. I had never seen such a grand natural site spreading out in front of me, the color of a sapphire. I wandered in the Ice Age, trying to capture every second with my camera. However, such beauty on the Earth is dying.

When I was scrolling down through my photo album and came upon the images of the Fox Glacier, that was the moment I decided to make glacier extent forecasting the topic of my group project.

Fox Glacier by Serena Li, Feb.6.2019

We started the project with simple visualizations and forecasts using only the forecast variable, the glacier extent. The line graph showed a slight downward trend and a clear seasonal pattern over the year. The glacier extent, measured in millions of square km, generally peaks in March and reaches a trough in September.

Glacier extent by month

For our next step, we added a predictor variable, CO2, and fitted a linear model of the extent. The relationship between the two variables is negative: a 1 ppmv increase in CO2 is associated with a decrease of 0.004 million square km in the glacier extent. ppmv stands for parts per million by volume; i.e., 397 ppmv of CO2 means that CO2 constitutes 397 millionths of the total volume of the atmosphere.

Extent=13.1518 - 0.004 * CO2
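As a quick sanity check of the equation above, here is a minimal Python sketch. It is not the code from our project, which relied on R packages. The function name `predicted_extent` is made up for illustration; only the two coefficients come from the fitted model, and 397 ppmv is simply the example value used in the ppmv explanation.

```python
# Coefficients taken from the fitted line: Extent = 13.1518 - 0.004 * CO2
INTERCEPT = 13.1518   # extent in million square km when CO2 is 0 (extrapolation only)
SLOPE = -0.004        # change in extent (million square km) per 1 ppmv of CO2


def predicted_extent(co2_ppmv):
    """Return the extent (million square km) implied by the linear model."""
    return INTERCEPT + SLOPE * co2_ppmv


# Example: the 397 ppmv value mentioned in the text
print(round(predicted_extent(397), 4))  # 11.5638

# Raising CO2 by 1 ppmv lowers the predicted extent by 0.004 million square km
print(round(predicted_extent(398) - predicted_extent(397), 4))  # -0.004
```

The second print is just the slope read back out of the model, which is why the decrease is in million square km rather than in ppmv.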

After visualizing the relationship between them, we finally started forecasting the glacier extent with CO2. We made a couple of ex-ante forecasts, which are forecasts made using only the information that is available in advance (Forecasting: Principles and Practice, 7.6). This is one of our scenario-based forecasts using the ex-ante approach.

Glacier extent forecasts in two different CO2 increasing scenarios

The Average Increase case uses the historical mean value of CO2 to predict 2 years of glacier extent.

The Forecast Increase case is comparatively the more accurate one. To get reasonable future values of CO2, we first forecasted CO2 itself. We then forecasted the glacier extent using the forecasted CO2 values. The blue line in the graph indicates that the predicted 1-year extent will be slightly lower than in the Average Increase case. This makes sense in reality because the predicted future CO2 is higher than the historical mean CO2, leading to a lower extent.
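The two scenarios can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not our project code: the CO2 history is made-up placeholder data, the helper names are hypothetical, and a straight-line extrapolation stands in for whatever model was actually used to forecast CO2. Only the regression coefficients come from the fitted model.

```python
from statistics import mean

# Coefficients from the fitted line: Extent = 13.1518 - 0.004 * CO2
INTERCEPT = 13.1518
SLOPE = -0.004


def predicted_extent(co2_ppmv):
    """Extent (million square km) implied by the linear model."""
    return INTERCEPT + SLOPE * co2_ppmv


def linear_trend_forecast(history, horizon):
    """Extrapolate a least-squares straight line fitted to `history`.

    Assumption: a stand-in for the real CO2 forecasting model.
    """
    n = len(history)
    xs = range(n)
    x_bar = mean(xs)
    y_bar = mean(history)
    slope = (
        sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, history))
        / sum((x - x_bar) ** 2 for x in xs)
    )
    intercept = y_bar - slope * x_bar
    return [intercept + slope * (n + h) for h in range(horizon)]


# Placeholder CO2 history in ppmv (illustrative numbers, NOT our dataset)
co2_history = [390.0, 392.0, 394.0, 396.0, 398.0]
horizon = 3

# Scenario 1: future CO2 stays at its historical mean
average_scenario = [mean(co2_history)] * horizon
# Scenario 2: future CO2 comes from a forecast of CO2 itself
forecast_scenario = linear_trend_forecast(co2_history, horizon)

extent_average = [predicted_extent(c) for c in average_scenario]
extent_forecast = [predicted_extent(c) for c in forecast_scenario]

for step, (a, f) in enumerate(zip(extent_average, extent_forecast), start=1):
    print(f"step {step}: average={a:.4f}  forecast={f:.4f}")
```

With a rising CO2 history, the forecasted CO2 values sit above the historical mean, so the Forecast scenario always lands below the Average scenario. That is the same ordering the blue line shows in the graph.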

Working in a group of three has been easier than I expected. In my experience, both work and communication within smaller groups are more likely to be efficient and on point. After one month of putting our heart and soul into completing our project, it’s time to sit down and summarize what I learned.

1. A Good Topic Matters

Brainstorming Oct.31,2020

When we were each researching topics for our group, I was determined to choose a topic that I am interested in. I worried about whether glaciers could produce enough “topics”, i.e., whether this topic would give us enough room to dive deep. But I decided to trust my intuition and count on my analytical abilities.

One fun story we had was with a dataset containing newborn penguins’ body mass and length over a few years. At first we thought that might be a good branch of our research. However, as the research progressed, I started to realize the tradeoff between what is interesting and what is relevant. Comparing the penguin birth rate adds amusement to the models, but such a detour makes it hard for people to keep their focus on our main track.

2. Team Work Makes Dreams Work

Doing a group project requires coding skills, but more importantly, communication skills. Two things I have been working hard on are keeping each person’s work from overlapping with another’s, and finding a balance between being too pushy and too relaxed.

Imagine you come to a weekly meeting excited to share one week’s work, but then find out someone else did the same job. This is frustrating and feels like a waste of time. We could have talked more precisely about what each of us should do.

In our usual meetings, we updated our progress, discussed the results, and asked for suggestions. In the last ten minutes, we discussed what to do next and distributed the work. Sometimes I asked again over Slack to make sure all three of us were clear about the next steps. When we postponed a meeting due to an emergency, I sometimes checked in with my group in the meantime so that we would not fall behind.

But once in a while I ended up in a dilemma: Was I too pushy in the group?

I am used to fast-paced studying and staying intensely concentrated, but should I also want, or even require, my group to be like me? I gradually came to learn that a good teamwork environment is not all about being task-oriented or building the most complex model. Rather, having all team members feel comfortable and motivated in their position is a higher and harder achievement.

Maintaining an equilibrium between pressure and progress is another tradeoff I learned in teamwork.

“No one can whistle a symphony. It takes a whole orchestra to play it.”

Orchestra concert clipart

Rigid rules and harsh goals will only make the orchestra’s symphony sound like metal scraps. Flexible coordination makes the team go further and grow stronger.

3. Question Your Work

“Wait, what? Why do these two models produce different results???”

Scatter plot
Line graph


One day I found that my monthly scatter plot didn’t agree with an earlier monthly line graph that was generated by tools from another R package. I was confused at first. However, after rounds of struggling, I came to suspect that a mistake or bug might have occurred in that R package. A similar situation happened when we tried some simple exponential smoothing models. The best model claimed no trend in the glacier extent, which is the opposite of our previous conclusion. We still decided to accept the result because different models use different algorithms and have different emphases.

“Using tools from others’ packages means you are fully trusting them and the result,” said my statistics teacher. No matter what tools we use, they shouldn’t stop us from making our own independent judgement.

In data analysis, only data tells the story. There is no right or wrong. We accept all the facts generated by the models, but also remain skeptical of our work.

4. Presentation: What Makes the Audience Stay?

Presentations should meet the audience’s needs. The best presentations result in the audience having a complete understanding of the material presented.

I had long thought that a presentation consisted of a speaker giving a full report of all the work they had done. However, after watching others’ presentations, I found that what the audience learned is the core measure of presentation success.

The Art of Effective Communication (Mike, 2014)

I used to struggle between introducing a complicated model and discussing a simpler one. If I present a complicated model, I am not confident that I can make everyone understand its complexity in the limited time, which could end up disappointing the audience. But if I discuss the simpler model, I am 100% confident that I can explain every parameter and every graph in my own words. My voice would also sound more certain and louder to the listeners. This would ensure that everyone understands our project’s focus, which is and should always be the goal of a presentation.

Data visualizations and short, clear, conclusive sentences are also powerful tools for convincing an audience. A graph with colors and axis units clearly drawn and stated is already self-explanatory.

My understanding of a good presentation is knowing what the audience wants and making them stay, even though a presenter may need to give up the “advanced and fancy” model.

It is pretty amazing to look back on the past month: the nights I spent debugging, the meetings I had in the early mornings, and the presentation I delivered with growing confidence. And I’m glad I have this blog to record those precious moments.



