So What is “Data Science” Anyway, and What Does a Data Scientist Do? Part 1: Backstory

Preface: This blog will be released in several parts, which I’m working on in tandem in hopes of releasing new content weekly. My plan at this point is to follow this structure:

  • Part 1: Backstory – how did I get from being an education-minded Ph.D. graduate to a Data Scientist?
  • Part 2: Inner Workings – what do I do on a daily basis as a Data Scientist? What are the “big picture” aspects of data science that are typical of my position and others?
  • Part 3: Theoretical Background – although my story is unique, what is my statistical and theoretical background that qualified me for data science? Is this common?
  • Part 4: Thinking Forward – how will the Data Scientist position evolve over time, both for the degree program in which I work and at large?

Let me begin, then, with Part 1, providing a bit of backstory for the overarching theme of this blog, starting with my apologies for not posting more often. To be sure, I have been incredibly busy looking for, building towards, and ultimately landing permanent full-time employment since receiving my Ph.D. in 2013.

My emphasis in both my work experience and formal training has been to take statistical methods and apply them to emerging technologies. The focus of my degree was within higher education settings (i.e. colleges and universities), as I was most interested in learning how technologies affect student outcomes.

From 2013 through 2015, I worked in a postdoc position studying MOOCs offered by the University of Illinois, ultimately developing a new instrument for measuring faculty acceptance of emerging technologies and finding in myself a deeper desire to study MOOCs.

I feel it necessary here to point out that my attitude toward MOOCs is not one of being a “true believer”; rather, I identified that they had immense potential to disrupt the traditional understanding of higher education, and felt that my research expertise could be used to objectively measure the usefulness of the technology and the promising practices that ensure quality outcomes for its learners. All of this is to say I don’t consider myself a “champion” of MOOCs per se, as I think there’s still a lot of analysis to be done before declaring a verdict on their usefulness and potential to disrupt.

In 2015 the University of Illinois announced that a new MBA program was under development, one that would feature content delivered by MOOC through a partnership with Coursera. Immediately, I knew I wanted to be a part of that program. It had long been a dream of mine to witness the development and implementation of a technology-based degree program from the ground up. At the time (summer 2015) I was officially on the job market, promised another year in my postdoc but without an official research project on which to focus. I had an incredibly promising lead at one of the top medical schools in the US, with an on-campus interview upcoming, when I was approached by Maryalice Wu at the Center for Innovation in Teaching and Learning (CITL) at Illinois, wondering if I might be interested in a position studying the new online MBA.

However, it was posed as another postdoc, a position title and status I wasn’t really interested in; furthermore, I didn’t think that was the right approach to the new degree program. This degree needed a full-time data professional. As a new degree program, the first of its kind, it would face a variety of questions from stakeholders: How effective is the program? How well do students fare after graduating? How does the program compare to the on-campus and traditional online-delivered programs? Does the degree serve a different population of students? Does the program provide increased access or equity to MBA education? These are the sorts of questions that require full-time attention, formal data structuring, and deep analytics. So I crafted an evaluation and assessment plan for the new degree (iMBA), intended to show that I had thought about this topic and the needs of the degree program, and to outline the human resources that would be necessary to enact the plan.

Note: The attached plan is not an official evaluation or assessment plan adopted by the iMBA degree program. I wanted to share what I had generated for the purpose of helping readers develop their own ideas for thinking about assessment and evaluation and moving forward in their own careers.

It’s also worth noting here that the title of “Data Scientist” wasn’t yet a part of my vocabulary. Most positions to which I had applied carried titles like “Assessment Coordinator” or “Senior Evaluation Specialist.” I had heard the term before, but it seemed confined to the business world and wasn’t a position I necessarily wanted to pursue.

I believe the honest attempt at showing my skills and desire for this sort of position paid off; shortly after sending the plan, I heard that the administration at the University and College of Business had discussed priorities and determined it was of strategic importance to have a data-driven position with the iMBA, at the full-time permanent level. This was in July 2015. I turned down what ended up being a very lucrative offer from the aforementioned medical school and took my chance with the iMBA.

Unfortunately, this ended up being a much more complicated process in the end. If it helps you understand the scope of this complication, note the approval for a full-time position was in July 2015. My official start date as a Data Scientist was July 1, 2016.

Yeah, that long.

University policies and procedures are highly nuanced. I don’t pretend to know the inner workings of the hiring process, but suffice it to say, it took several months before even a glimmer of a job description began to form. The search committee did not convene until early 2016 (I believe; I wasn’t a part of the process nor privy to its operations), the job announcement came sometime in April 2016, interviews followed in May, and negotiations in June.

During this time, I was working more closely with CITL than the College of Business, taking the approach of looking at this through an online course/program lens. In many ways, I felt as if I were a consultant for the College of Business and an employee of CITL. Technically, I was neither. My time had been “bought out” by the College of Business but I was still an employee of the Graduate College. Imagine describing that to your parents!

Our approach was one of generating new measurement instruments, loosely following the plan (and follow-up plans) I had devised and ensuring we had a good “machine” in place in time for January 2016, when the first admitted cohort would enter. To be sure, it was an intense scramble. Even with some latitude to develop instrumentation starting in July 2015, six months is hardly enough time to accomplish what I had laid out. As such, as with many new programs, there was a lot of “learn as we go” on the evaluation side, a process that only now, in December 2016, feels solid.

In retrospect, the road I took was a long and unsure one. While I knew the University had a position approved for me, I had to deal with an inordinate amount of stress over the real possibility that I would not be the one receiving the position. I lost a lot of sleep, created tension at home, and carried a great deal of stress. In the future I expect I will be much more diligent about removing all uncertainty within my control: get a straight answer on how long the process will take.

Fortunately for me, I absolutely LOVE the position I am in. Each day I face new challenges and questions and formulate data-driven ways to address them. We have a highly responsive program staff and a dedicated student body, all of which contribute to the program’s exciting and growing profile.

I’m excited to share what this process looks like with you, and I hope you’ll continue to read as I post. I wanted first to share the journey that led me to data science, and I would love to hear how you were led here. Data science is a unique discipline; people come to it from a variety of backgrounds, and that lends itself to diversity in approach, interpretation, and execution. I truly hope the conversation continues as we move toward a more thorough understanding of our field.

When Incomplete Research and a Hot-Button Issue Collide: MOOCs as Case in Point

Undoubtedly, most who keep up with innovations in higher education have at least heard of massive open online courses (MOOCs) and, based on my conversations with many, have already come to a conclusion about their feasibility and their impact (or non-impact) on higher and online education. What seems to be missing is a significant body of robust research to support or refute these conclusions.

This lack of significant research is understandable. After all, MOOCs have only been in existence for a few years, with most pointing to 2008 as the origin of the term and concept. MOOCs gained additional traction in 2011 and 2012 as courses grew out of online and distance education, with the added feature of being publicly accessible and reaching a wide, often international, audience. Between then and now, however, MOOCs have received a lot of buzz and highly mixed reactions. With that buzz has come a flurry of research aimed at better understanding this phenomenon and its potential impact on the status quo.

For the sake of full disclosure, I want to point out a couple of important considerations:

  • I am currently serving as a postdoctoral research associate at the University of Illinois, where I participate in a project studying MOOC courses at Illinois, through a mixed-methods examination of engagement, expectations, and outcomes, with an eye to the potential issues that may confront the expansion of MOOCs to graduate and professional education; and
  • In terms of opinions, I do not have a strong opinion either way on whether MOOCs are an innovation that will positively impact higher education and access to education. I feel that not enough is yet known, and I am cautious about developing a strong opinion in either direction without further support from research on quality.

What follows is a (hopefully) objective look at a couple of pieces of research that have come to the fore and how the media has treated them, regardless of the whole of their findings or the quality of their methods. It shows how, when the public is eager to know more about an innovation, the earliest-released studies are often used by the media to make an issue of import to higher education seem more heated than it may in fact be.

What the Media is Reporting

If we are to believe what the media shares regarding MOOCs, the outlook is dismal. Perhaps the most recent of these reports comes from the Chronicle of Higher Education on January 16, 2014, entitled “Attitudes on Innovation: How College Leaders and Faculty See the Key Issues Facing Higher Education.” Publicity for the report in the e-mail correspondence details that the findings will show “the validity of the MOOCs business model for the future,” and reports that “60% [of] presidents think MOOCs will negatively impact the future of higher education.” Another recent Chronicle article, entitled “Doubts About MOOCs Continue to Rise, Survey Finds,” focuses on a report from the Babson Survey Research Group that showed “a growing skepticism among academic leaders about the promise of MOOCs,” namely that more of the leaders surveyed indicated concern about MOOCs’ sustainability and tended to think that “credentials for MOOC completion will cause confusion about higher education degrees.” Based on how the media presented these findings, the take-away would be that MOOCs are all but dead in the water. However, a deeper look at the text of the reports suggests that limitations and important contextual considerations were overlooked.

Limits of the Research

Several points in the reports are not prominently featured by the media, or are omitted entirely, whether for the sake of presenting a controversial story or to save space; these elements seem critical to understanding the findings. Some of the information that would be helpful includes:

  • Low response rates: The response rate to the president and faculty survey was only 8-10 percent, which calls the reliability and generalizability of the findings into question.
  • Low scope of MOOC availability: The Babson survey found that only 5.0 percent of the institutions included actually implemented any MOOCs. The surveys presented only attitudinal data, much of which could come from faculty or administrators with no experience of these courses.
  • High emphasis on uncertainty: The reports themselves detail a large amount of uncertainty, or center-leaning opinions toward questions of the feasibility of MOOCs. This uncertainty seems lost when the articles present the findings.
  • Missing the point on online education: The reports present far more than is published in the articles. The Babson survey included additional questions about online education in general and its comparison to face-to-face instruction, while the Chronicle study had interesting findings about perceived changes to higher education in the next 10 years. Neither received coverage in the articles, as the limited findings related to MOOCs seemed to dominate the conversation.
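The concern about low response rates can be made concrete with a quick simulation. The sketch below uses entirely hypothetical numbers (not drawn from either report) to show how, if willingness to respond correlates with opinion, a survey with a response rate near 8-10 percent can report a skeptic share well above the true population value:

```python
import random

random.seed(42)

# Hypothetical population of 10,000 academic leaders:
# 40% are "skeptical" of MOOCs, 60% are not.
population = [True] * 4000 + [False] * 6000

# Assumption for illustration: skeptics feel more strongly about the
# topic and are therefore more likely to return a survey about it.
def responds(is_skeptic):
    propensity = 0.15 if is_skeptic else 0.06  # overall rate lands near 9-10%
    return random.random() < propensity

respondents = [x for x in population if responds(x)]

true_share = sum(population) / len(population)
observed_share = sum(respondents) / len(respondents)
response_rate = len(respondents) / len(population)

print(f"response rate:          {response_rate:.1%}")
print(f"true skeptic share:     {true_share:.1%}")
print(f"observed skeptic share: {observed_share:.1%}")
```

With these assumed response propensities, the survey reports roughly six in ten respondents as skeptics even though only four in ten leaders actually are, which is exactly the kind of nonresponse bias that a low response rate leaves unguarded against.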

Closing Thoughts

I do not intend to be overly critical of the media outlets that cover education, as they engage in valuable conversations and highlight reports worth reading. However, in the case of emerging innovations such as MOOCs, it appears to me that the media has extracted findings from reports with much more mixed results and presented a story that makes this innovation seem infeasible and widely resisted. While it makes for a good story, my experience suggests that there are far more individuals who are on the fence, interested in watching how things develop and in getting involved in developing strong practices, than there are individuals who are completely pro- or completely anti-MOOC. That type of story has its own interest, and is a story worth telling. Mixed results are still results.